HTML minifier for Python frameworks (not only Django, despite the name).

Overview

django-htmlmin

django-htmlmin is an HTML minifier for Python, with full support for HTML 5. It supports Django, Flask and many other Python web frameworks. It also provides a command line tool that can be used for static websites or deployment scripts.

Why minify HTML code?

One of the important points on client side optimization is to minify HTML. With minified HTML code, you reduce the size of the data transferred from the server to the client, which results in faster load times.

Installing

To install django-htmlmin, run this in the terminal:

$ [sudo] pip install django-htmlmin

Using the middleware

All you need to do is add two middleware classes to your MIDDLEWARE_CLASSES:

MIDDLEWARE_CLASSES = (
    # other middleware classes
    'htmlmin.middleware.HtmlMinifyMiddleware',
    'htmlmin.middleware.MarkRequestMiddleware',
)

Note that if you're using Django's caching middleware, MarkRequestMiddleware should go after FetchFromCacheMiddleware, and HtmlMinifyMiddleware should go after UpdateCacheMiddleware:

MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    'htmlmin.middleware.HtmlMinifyMiddleware',
    # other middleware classes
    'django.middleware.cache.FetchFromCacheMiddleware',
    'htmlmin.middleware.MarkRequestMiddleware',
)
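The ordering rule follows from how Django walks the middleware list: request hooks run from top to bottom, response hooks from bottom to top. The sketch below is a plain-Python illustration of that traversal, not Django's own implementation; `request_order` and `response_order` are hypothetical helpers introduced only for this example.

```python
# Illustrative only: models the order in which Django applies middleware.
MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    'htmlmin.middleware.HtmlMinifyMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
    'htmlmin.middleware.MarkRequestMiddleware',
)


def request_order(middleware):
    """Request hooks run from the first entry to the last."""
    return list(middleware)


def response_order(middleware):
    """Response hooks run from the last entry back to the first."""
    return list(reversed(middleware))


order = response_order(MIDDLEWARE_CLASSES)
# On the way out, HtmlMinifyMiddleware sees the response before
# UpdateCacheMiddleware, so the cache should receive minified HTML.
minify_step = order.index('htmlmin.middleware.HtmlMinifyMiddleware')
cache_step = order.index('django.middleware.cache.UpdateCacheMiddleware')
print(minify_step < cache_step)
```

Placing HtmlMinifyMiddleware directly below UpdateCacheMiddleware therefore makes minification the last step before the response is written to the cache.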

You can optionally specify the HTML_MINIFY setting:

HTML_MINIFY = True

The default value for the HTML_MINIFY setting is not DEBUG. You only need to set it to True if you want to minify your HTML code when DEBUG is enabled.
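As a rough sketch of the rule above, the effective value can be modelled as "use HTML_MINIFY if it is set, otherwise fall back to not DEBUG". The `minify_enabled` helper and the `SimpleNamespace` stand-ins for a settings module are assumptions made for illustration; the middleware's actual lookup may be written differently.

```python
from types import SimpleNamespace


def minify_enabled(settings):
    """Return HTML_MINIFY if it is set, otherwise the opposite of DEBUG."""
    return getattr(settings, 'HTML_MINIFY', not settings.DEBUG)


# Stand-ins for a Django settings module (hypothetical values).
production = SimpleNamespace(DEBUG=False)
development = SimpleNamespace(DEBUG=True)
forced = SimpleNamespace(DEBUG=True, HTML_MINIFY=True)

print(minify_enabled(production))   # True
print(minify_enabled(development))  # False
print(minify_enabled(forced))       # True
```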

Excluding some URLs

If you don't want to minify some of your views, for example everything under the /my_app URL, you can tell the middleware not to minify their responses by adding an EXCLUDE_FROM_MINIFYING setting to your settings.py:

EXCLUDE_FROM_MINIFYING = ('^my_app/', '^admin/')

Regex patterns are used for URL exclusion. If you want to exclude all URLs of your app except a specific view, you can use the @minified_response decorator on that view (see the "Using the decorator" section below).
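To make the matching concrete, here is a minimal sketch of how such patterns could be applied to a request path. It assumes the leading slash is stripped before matching, which would explain why the patterns have no leading slash; `is_excluded` is a hypothetical helper, not part of django-htmlmin.

```python
import re

EXCLUDE_FROM_MINIFYING = ('^my_app/', '^admin/')

# Compile once; the patterns are plain regular expressions.
_EXCLUDED = [re.compile(pattern) for pattern in EXCLUDE_FROM_MINIFYING]


def is_excluded(request_path):
    """Return True if the path matches any exclusion pattern.

    Assumption: the leading slash is stripped before matching, which is why
    the patterns above start with '^my_app/' rather than '^/my_app/'.
    """
    path = request_path.lstrip('/')
    return any(regex.match(path) for regex in _EXCLUDED)


print(is_excluded('/my_app/reports/'))  # True
print(is_excluded('/admin/login/'))     # True
print(is_excluded('/blog/my_app/'))     # False
```

Because the patterns are anchored with ^, only paths that begin with the prefix are excluded; a path that merely contains it is still minified.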

Keeping comments

The default behaviour of the middleware is to remove all HTML comments. If you want to keep the comments, set KEEP_COMMENTS_ON_MINIFYING to True:

KEEP_COMMENTS_ON_MINIFYING = True

Conservative whitespace minifying

By default, the minifier tries to remove whitespace intelligently, leaving spaces only where inline text rendering needs them. Sometimes it is preferable not to remove whitespace completely, but only to reduce each run of whitespace to a single space. If you set CONSERVATIVE_WHITESPACE_ON_MINIFYING to True, whitespace is always reduced to a single space and never completely removed:

CONSERVATIVE_WHITESPACE_ON_MINIFYING = True
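The difference between the two modes can be illustrated on a single text fragment. This is a deliberately simplified sketch using regular expressions; it is not the library's algorithm, and both function names are hypothetical.

```python
import re


def collapse_whitespace(text):
    """Conservative: reduce each whitespace run to one space, never to nothing."""
    return re.sub(r'\s+', ' ', text)


def strip_whitespace(text):
    """Aggressive: collapse whitespace runs, then drop leading/trailing space."""
    return collapse_whitespace(text).strip()


fragment = 'hey   \n '
print(repr(collapse_whitespace(fragment)))  # 'hey '
print(repr(strip_whitespace(fragment)))     # 'hey'
```

The conservative variant keeps the trailing space, so text that follows an inline element is not joined to the word before it.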

Using the decorator

django-htmlmin also provides a decorator that you can apply only to the views whose responses you want to minify:

from django.shortcuts import render
from htmlmin.decorators import minified_response

@minified_response
def home(request):
    # render() returns an HttpResponse for the decorator to minify
    return render(request, 'home.html')

Decorator to prevent a response from being minified

You can use the not_minified_response decorator on views if you want to avoid the minification of any specific response, without using the EXCLUDE_FROM_MINIFYING setting:

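from django.shortcuts import render
from htmlmin.decorators import not_minified_response

@not_minified_response
def home(request):
    # this response is left untouched, even with the middleware enabled
    return render(request, 'home.html')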

Using the html_minify function

If you are not working with Django, you can invoke the html_minify function manually:

from htmlmin.minify import html_minify
html = '<html>    <body>Hello world</body>    </html>'
minified_html = html_minify(html)

Here is an example with a Flask view:

from flask import Flask, render_template
from htmlmin.minify import html_minify

app = Flask(__name__)

@app.route('/')
def home():
    rendered_html = render_template('home.html')
    return html_minify(rendered_html)

Keeping comments

By default, html_minify() removes all comments. If you want to keep them, you can pass ignore_comments=False:

from htmlmin.minify import html_minify
html = '<html>  <body>Hello world<!-- comment to keep --></body>  </html>'
minified_html = html_minify(html, ignore_comments=False)

Using the command line tool

If you are not even using Python, you can use the pyminify command line tool to minify HTML files:

$ pyminify index.html > index_minified.html

You can also keep the comments, if you want:

$ pyminify --keep-comments index.html > index_minified_with_comments.html
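For deployment scripts written in Python, a whole directory of static pages can be processed in one pass. The sketch below is one possible approach: `minify_tree` is a hypothetical helper, and it takes the minifier as a parameter so that it does not depend on any particular library.

```python
import os


def minify_tree(root, minify):
    """Minify every .html file under root in place and return the paths touched.

    `minify` is any callable mapping an HTML string to a minified string.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith('.html'):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding='utf-8') as handle:
                original = handle.read()
            with open(path, 'w', encoding='utf-8') as handle:
                handle.write(minify(original))
            changed.append(path)
    return changed
```

With django-htmlmin installed, passing html_minify (imported from htmlmin.minify, as in the examples above) as the second argument applies the same minifier to each file; the directory name is whatever your build step produces.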

Development

Pull requests are very welcome! Make sure your patches are well tested.

Running tests

If you are using a virtualenv, all you need to do is:

$ make test

Community

IRC channel

#cobrateam channel on irc.freenode.net

Changelog

You can see the complete changelog on the GitHub releases page.

LICENSE

Unless otherwise noted, the django-htmlmin source files are distributed under the BSD-style license found in the LICENSE file.

Comments
  • How to cache minified pages?

    Hi, I'd love to use HtmlMinifyMiddleware, but I can't figure out where to put it in my settings.MIDDLEWARE so that it does its job, but is then also cached, so that we don't have to minify again until the page changes or the cache expires. As it is now, when I disable HtmlMinifyMiddleware, my server can handle >60X more requests per second, which suggests that, when it is enabled, HtmlMinifyMiddleware is doing its job for each request. I'd like it to do its job only when the page is rewritten to memcache via UpdateCacheMiddleware/FetchFromCacheMiddleware.

    My settings look a bit like this:

    MIDDLEWARE = (
        'django.middleware.cache.UpdateCacheMiddleware',
        'htmlmin.middleware.HtmlMinifyMiddleware',
        ...  # Other middleware
        'django.middleware.cache.FetchFromCacheMiddleware',
        'django.contrib.redirects.middleware.RedirectFallbackMiddleware',
    )
    

    My reasoning (which could easily be misguided), is that minification should be the last thing to happen to a response body before it gets stored in the cache, hence, it should be the last thing before UpdateCacheMiddleware on the way "out" as a response, therefore the first thing after UpdateCacheMiddleware in the MIDDLEWARE tuple.

    I'm using Python 2.7 and Django 1.4, if that is relevant.

    What am I doing wrong? Also, whatever the answer, it would make a good addition to the installation guide.

    opened by mkoistinen 20
  • &lt; and &gt; get converted to < and > respectively

    I have this snippet of an html page::

    <div class="highlight"><pre><span class="cp">&lt;!DOCTYPE html&gt;</span>
    <span class="nt">&lt;html</span> <span class="na">lang=</span><span class="s">&quot;en&quot;</span> <span class="na">dir=</span><span class="s">&quot;ltr&quot;</span><span class="nt">&gt;</span>
        <span class="nt">&lt;head&gt;</span>
            <span class="nt">&lt;title&gt;</span>My Site!<span class="nt">&lt;/title&gt;</span>
        <span class="nt">&lt;/head&gt;</span>
        <span class="nt">&lt;body&gt;</span>
            <span class="nt">&lt;div</span> <span class="na">id=</span><span class="s">&quot;pagebody&quot;</span><span class="nt">&gt;</span>
                %include
            <span class="nt">&lt;/div&gt;</span>
        <span class="nt">&lt;/body&gt;</span>
    <span class="nt">&lt;/html&gt;</span>
    </pre></div>
    

    When I try to run this through html_minify() I will wind up with the &lt; and &gt; pieces of text being treated as HTML tags. I think the problem with that is rather obvious, but if I didn't make sense, let me know and I'll try to explain better.

    opened by MTecknology 8
  • UnicodeDecodeError

    I think I've found another bug. I have a TextField in a model, and in the admin site, when I input a latin character, for example: á, after I submit the change_form and try to edit that object I get this error.

    Here's the traceback:

    Django Version: 1.3
    Python Version: 2.6.1

    Traceback:
    File "/Users/jonito/mingus/proyectos/galerias-belgrano/bootstrap/lib/python2.6/site-packages/django/core/handlers/base.py" in get_response
        response = middleware_method(request, response)
    File "/Users/jonito/mingus/proyectos/galerias-belgrano/bootstrap/src/django-htmlmin/htmlmin/middleware.py" in process_response
        response.content = html_minify(response.content)
    File "/Users/jonito/mingus/proyectos/galerias-belgrano/bootstrap/src/django-htmlmin/htmlmin/minify.py" in html_minify
        html_code = html_code.replace(script, TAGS_PATTERN % (tag, index))

    Exception Type: UnicodeDecodeError at /admin/chunks/chunk/2/
    Exception Value: ('ascii', '', 94, 95, 'ordinal not in range(128)')

    Bug 
    opened by honi 8
  • Support the new Django 1.10 MIDDLEWARE style

    I get a TypeError when using the django-htmlmin middlewares in MIDDLEWARE in Django 1.10 (they're fine when using the old-style MIDDLEWARE_CLASSES).

    See https://docs.djangoproject.com/en/1.10/topics/http/middleware/#upgrading-pre-django-1-10-style-middleware for upgrading information.

    Bug 
    opened by tremby 7
  • ignore option

    hey, thanks for the module. i just tried your app, and it looks awesome as it is already.

    i'm using the grappelli admin interface and it looks kinda weird with your middleware. the selectors lose their default height and become one-liners. it's very inconvenient when you have a list with lots of elements.

    so i would kindly make a feature request for a decorator or optional ignore settings. in my case i would like to ignore the whole admin interface.

    anyway, thanks for your job.

    Bug 
    opened by fuxter 7
  • error with gzip on nginx

    If I try to add gzip on in nginx.conf, PageSpeed Insights returns:

    The server closed the connection before sending a full response. Ensure that the page loads in a browser and try again

    Bug 
    opened by vladimirmyshkovski 6
  • Too aggressive?

    This seems sort of wrong to me (using the latest release):

    >>> str= "<b>hey </b>you"
    >>> html_minify(str)
    u'<html><head></head><body><b>hey</b>you</body></html>'
    >>> 
    

    which would render like "heyyou", when the intent would pretty clearly be "hey you"

    Bug 
    opened by rosskarchner 6
  • change html_minify to recursive function

    As per discussions in #21, I implemented a minify version using recursive functions to walk and clean the tree. This method is, I think, more robust and happens to be faster.

    opened by hrbonz 6
  • Spaces removed around <a> tags in text

    I have a text of the form:

        Lorem ipsum dolor sit amet....
        <a href="#">Ut enim</a> ad minim veniam
    

    When minified the spaces around the link tags are removed, causing the linked words to cram into the surrounding text. This is with django-htmlmin 0.5.2.

    I had to use non-breaking spaces around the links to work around this issue.

    Bug 
    opened by atodorov 6
  • Add CONSERVATIVE_WHITESPACE_ON_MINIFYING to retain inline text spaces

    I was looking into the issue outlined in #121 and #21 as it's affecting me. Essentially the current system of completely removing whitespace in a text flow has some bugs and it's not easy to fix. So, in an effort to at least get things working I've added a CONSERVATIVE_WHITESPACE_ON_MINIFYING option to turn off eagerly removing whitespace and always leave at least 1 space (I'm also hoping that feature may be useful to someone else for other purposes). The default is to continue working as before for backwards compatibility.

    The slightly longer explanation

    If you try minifying a <i>b</i> it will properly reduce to a <i>b</i> and the display output in a browser will be "a b" (note the space). However, if you try minifying a<i> b</i> it will reduce to a<i>b</i> and then get displayed as "ab" (note, no space). Ideally, the 2nd case should end up being a <i>b</i> to retain that inline space but at the right spot.

    Now, I tried going through the code to see if I could fix that 2nd case, but I'm not sure I see a way to get there without some major restructuring. One issue is that when examining the NavigableString of ' b' in is_inflow, the previous_sibling is None (in the above example). BS4 doesn't treat a tag next to a text block as siblings.

    I did find another project that seemed to do a good job of this here: https://github.com/kangax/html-minifier/blob/51ce10f4daedb1de483ffbcccecc41be1c873da2/src/htmlminifier.js#L65-L83 However, that's in JS using completely different libraries and would require quite a bit of time to rework with BS4. That project also had a "conservative whitespace" option and I thought it might be a good feature to add and also provide a workaround for this issue until a better solution is developed.

    One last point... Any solution essentially assumes that inline/inline-block HTML elements will remain such, but CSS could easily break things if someone were to do something like <div style="display:inline">. Obviously that's a silly thing to do, but this change allows for things like that to work as expected. Sometimes HTML on the page comes from third party libraries and it's not always possible to fix these things to have proper HTML.

    opened by tisdall 5
  • Added support for Python 3 and 2

    I've refactored the unicode literals so that it supports both Python 2 and Python 3. I used 'six' just to keep the codebase a little cleaner, I hope that's okay.

    However, there are some known issues, which I hope you guys know how to fix:

    Even though it passes the tests on Python 2.x, it fails to even run the tests on Python 3. This is because nosedjango does not support Python 3, which causes Travis to stop the builds on Python 3.x. Though our code should (theoretically) work on Python 3 now.

    Any idea how we can fix that? The nosedjango project seems pretty dead.

    opened by drcd 5
  • No wheel available for 0.11.0 on PyPI

    Hello!

    Only the source tarball is available on PyPI for version 0.11.0 (https://pypi.org/project/django-htmlmin/0.11.0/#files). Would it be possible to get a wheel as well?

    opened by Steap 0
  • need to retain space around inline, inline-block, and comment

    Currently django-htmlmin retains space around inline text elements. However, there are other inline/inline-block elements that still render whitespace around them that are currently not handled. Also, comment removal takes spaces outside the comment with it.

    examples:

    >>> from htmlmin.minify import html_minify
    >>> html_minify('''<img src="http://example.com/image.jpg" alt="example">  hi''')
    '<html><head></head><body><img alt="example" src="http://example.com/image.jpg"/>hi</body></html>'
    >>> html_minify('''name: <input type="text">''')
    '<html><head></head><body>name:<input type="text"/></body></html>'
    >>> html_minify('''<p>a <!-- --> b</p>''')
    '<html><head></head><body><p>ab</p></body></html>'
    >>> html_minify('''<em>a <!-- --> b</em>''')  # this works as `em` is a text context
    '<html><head></head><body><em>a  b</em></body></html>'
    >>> html_minify('''<button> click me  </button>  text after''')
    '<html><head></head><body><button>click me</button>text after</body></html>'
    

    Some of these can be fixed by adding attributes to TEXT_FLOW, but they aren't really text elements so naming would probably need adjusting.

    opened by tisdall 0
  • problem using html_minify in django

    Hi, thank you for doing this project!

    I am trying to manually minify a view in a Django project, but I don't have it integrated into the middleware (other views are already taken care of and have a build process). I expected to be able to call html_minify on the output of a call to django.shortcuts.render. Instead I get:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/site-packages/django/core/handlers/exception.py", line 34, in inner
        response = get_response(request)
      File "/usr/local/lib/python3.8/site-packages/debug_toolbar/middleware.py", line 67, in __call__
        panel.generate_stats(request, response)
      File "/usr/local/lib/python3.8/site-packages/debug_toolbar/panels/headers.py", line 53, in generate_stats
        self.response_headers = OrderedDict(sorted(response.items()))
    
    Exception Type: AttributeError at /v/27/
    Exception Value: 'str' object has no attribute 'items'
    

    Maybe there is something in newer Django? Or is there some intermediate step in this possibly unusual workflow?

    opened by chrisamow 0
Releases(0.11.0)