Site-Wide Caching in Django

My last article about caching RSS feeds in a Django project generated a lot of interest. My original goal was to help other people who have tried to cache QuerySet objects and received a funky error message. Many of my visitors offered helpful advice in the comments, making it clear that I was going about caching my feeds the wrong way.

I knew my solution was wrong before I even produced it, but I couldn't get Django's site-wide caching middleware to work in my production environment. Site-wide caching worked wonderfully in my development environment, and I tried all sorts of things to make it work in my production setup. It wasn't until one "Jacob" offered a beautiful pearl of wisdom that things started to make more sense:

This doesn't pertain to feeds, but one rather large gotcha with the cache middleware is that any javascript you are running that plants a cookie will affect the cache key. Google analytics, for instance, has that effect. A workaround is to use a middleware to strip out the offending cookies from the request object before the cache middleware looks at it.

The minute I read that comment, I realized just how logical it was! If Google Analytics, or any other JavaScript used on my site, was setting a cookie, and it changed that cookie on each request, then the caching engine would effectively have a different page to cache for each request! Thank you so much, Jacob, for helping me get past the frustration of not having site-wide caching in my production environment.
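To make that concrete, here is a toy illustration of the effect. It is not Django's actual key-generation code: `toy_cache_key` is a made-up function, the cookie values are invented, and it simply assumes the cache key is derived from the URL path plus the raw Cookie header (which is roughly what happens when a response varies on Cookie).

```python
import hashlib


def toy_cache_key(path, cookie_header):
    """Toy stand-in for a cache key that varies on the Cookie header."""
    digest = hashlib.md5()
    digest.update(path.encode("utf-8"))
    digest.update(cookie_header.encode("utf-8"))
    return digest.hexdigest()


# Two requests for the same page whose analytics cookie changed in between.
first = toy_cache_key("/blog/", "__utmb=1.1.10.1260000000")
second = toy_cache_key("/blog/", "__utmb=1.2.10.1260000000")
print(first == second)  # False: each request looks like a brand new page

# With the offending cookie stripped, both requests map to the same key.
stripped_a = toy_cache_key("/blog/", "")
stripped_b = toy_cache_key("/blog/", "")
print(stripped_a == stripped_b)  # True: the cached page can be reused
```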

How To Set Up Site-Wide Caching

While most of this can be gleaned from the official documentation, I will repeat it here in an effort to provide a complete "HOWTO". For further information, hit up the official caching documentation.

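The first step is to choose a caching backend for your project. Built-in options include:

  • Memcached
  • Database caching
  • Filesystem caching
  • Local-memory caching
  • Dummy caching (for development)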

To specify which backend you want to use, define the CACHE_BACKEND variable in your settings.py. The definition for each backend is different, so check out the official documentation for details.
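As a rough sketch, the setting is a single URI-style string. The host, port, table name, and path below are placeholders rather than values from my own setup, so adjust them to your environment.

```python
# settings.py (sketch): pick exactly one CACHE_BACKEND value.

# Memcached running on the local machine (placeholder host and port).
CACHE_BACKEND = 'memcached://127.0.0.1:11211/'

# Other URI forms, shown commented out for comparison:
# CACHE_BACKEND = 'db://my_cache_table'           # database table (placeholder name)
# CACHE_BACKEND = 'file:///var/tmp/django_cache'  # filesystem directory (placeholder path)
# CACHE_BACKEND = 'locmem:///'                    # local memory
# CACHE_BACKEND = 'dummy:///'                     # no real caching, for development
```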

Next, install a couple of middleware classes, and pay attention to where the classes are supposed to appear in the list:

  • django.middleware.cache.UpdateCacheMiddleware - This should be the first middleware class in your MIDDLEWARE_CLASSES tuple in your settings.py.
  • django.middleware.cache.FetchFromCacheMiddleware - This should be the last middleware class in your MIDDLEWARE_CLASSES tuple in your settings.py.
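In settings.py, that ordering might look like the sketch below. The three classes in the middle are only an assumption about what a typical project already has installed; the important part is that the two cache classes bracket everything else.

```python
# settings.py (sketch): the cache middleware wraps all other middleware.
MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',     # must be first
    # Assumed "typical" middleware; your project's list may differ.
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',  # must be last
)
```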

Finally, you must define the following variables in your settings.py file:

  • CACHE_MIDDLEWARE_SECONDS - The number of seconds each page should be cached.
  • CACHE_MIDDLEWARE_KEY_PREFIX - If the cache is shared across multiple sites using the same Django installation, set this to the name of the site, or some other string that is unique to this Django instance, to prevent key collisions. Use an empty string if you don't care.
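For example, the two settings could be defined like this. Both values are illustrative placeholders, not recommendations.

```python
# settings.py (sketch): example values only.
CACHE_MIDDLEWARE_SECONDS = 600          # cache each page for ten minutes
CACHE_MIDDLEWARE_KEY_PREFIX = 'mysite'  # placeholder site name; '' also works
```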

If you don't use anything like Google Analytics that sets/changes cookies on each request to your site, you should have site-wide caching enabled now. If you only want pages to be cached for users who are not logged in, you may add CACHE_MIDDLEWARE_ANONYMOUS_ONLY = True to your settings.py file--its meaning should be fairly obvious.

If, however, your site-wide caching doesn't appear to work (as it didn't for me for a long time), you can create a special middleware class to strip those dirty cookies from the request, so the caching middleware can do its work.

import re

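class StripCookieMiddleware(object):
    """Ganked from http://2ze.us/Io"""

    # Matches any cookie whose name starts with an underscore (such as the
    # __utm* cookies set by Google Analytics), plus its trailing "; ".
    STRIP_RE = re.compile(r'\b(_[^=]+=.+?(?:; |$))')

    def process_request(self, request):
        # Drop the matching cookies from the raw Cookie header before the
        # cache middleware uses it to build a cache key.
        cookie = self.STRIP_RE.sub('', request.META.get('HTTP_COOKIE', ''))
        request.META['HTTP_COOKIE'] = cookie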

Edit: Thanks to Tal for the regex suggestion!

Once you do that, you need only install the new middleware class. Be sure to install it somewhere between the UpdateCacheMiddleware and FetchFromCacheMiddleware classes, not first or last in the tuple. When all of that is done, your site-wide caching should really work! That is, of course, unless your offending cookies are not found by that STRIP_RE regular expression.
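If you want to check whether STRIP_RE catches your cookies, you can try the pattern outside of Django. The snippet below is a small sketch: `strip_underscore_cookies` is a hypothetical helper, and the cookie headers are made up for illustration.

```python
import re

# Same pattern as STRIP_RE in the middleware above.
STRIP_RE = re.compile(r'\b(_[^=]+=.+?(?:; |$))')


def strip_underscore_cookies(header):
    """Hypothetical helper: return the Cookie header minus '_'-prefixed cookies."""
    return STRIP_RE.sub('', header)


# Made-up header: two Analytics-style cookies plus a session cookie.
sample = '__utma=1.2.3; __utmz=4.5.6; sessionid=abc123'
print(strip_underscore_cookies(sample))  # sessionid=abc123

# A cookie with an underscore in the middle of its name is left alone.
print(strip_underscore_cookies('my_cookie=1'))  # my_cookie=1
```

Only cookies whose names begin with an underscore are removed, so a troublesome cookie with any other kind of name would need its own pattern.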

Thanks again to Jacob and "nf", the original author of the middleware class I used to solve all of my problems! Also, I'd like to thank "JaredKuolt" for the django-staticgenerator on his GitHub account. It made me happy for a while as I was working toward real site-wide caching.

Review: Django 1.0 Web Site Development

Introduction

Several months ago, a UK-based book publisher, Packt Publishing, contacted me to ask if I would be willing to review one of their books about Django. I gladly jumped at the opportunity, and I received a copy of the book a couple of weeks later in the mail. This happened at the beginning of September 2009. It just so happened that I was in the process of being hired on by ScienceLogic right when all of this took place. The subsequent weeks were filled to the brim with visitors, packing, moving, finding an apartment, and commuting to my new job. It was pretty stressful.

Things are finally settling down, so I've taken the time to actually review the book I was asked to review. I should mention right off the bat that this is indeed a solicited review, but I am in no way influenced to write a good or bad review. Packt Publishing simply wants me to offer an honest review of the book, and that is what I intend to do. While reviewing the book, I decided to follow along and write the code the book introduced. I made sure that I was using the official Django 1.0 release instead of using trunk like I tend to do for my own projects.

The title of the book is Django 1.0 Web Site Development, written by Ayman Hourieh, and it's only 250 pages long. Ayman described the audience of the book as such:

This book is for web developers who want to learn how to build a complete site with Web 2.0 features, using the power of a proven and popular development system--Django--but do not necessarily want to learn how a complete framework functions in order to do this. Basic knowledge of Python development is required for this book, but no knowledge of Django is expected.

Ayman introduced Django piece by piece using the end goal of a social bookmarking site, a la del.icio.us and reddit. In the first chapter of the book, Ayman discussed the history of Django and why Python and Django are a good platform upon which to build Web applications. The second chapter offered a brief guide to installing Python and Django, and getting your first project set up. Not much to comment on here.

Digging In

Chapter three is where the reader was introduced to the basic structure of a Django project, and the initial data models were described. Chapter four discussed user registration and management. We made it possible for users to create accounts, log into them, and log out again. As part of those additions, the django.forms framework was introduced.

In chapter five, we made it possible for bookmarks to be tagged. Along with that, we built a tag cloud, restricted access to certain pages, and added a little protection against malicious data input. Next up was the section where things actually started getting interesting for me: enhancing the interface with fancy effects and AJAX. The fancy effects include live searching for bookmarks, being able to edit a bookmark in place (without loading a new page), and auto-completing tags when you submit a bookmark.

This chapter really reminded me just how simple it is to add new, useful features to existing code using Django and Python. I was thoroughly impressed at how easy it was to add the AJAX functionality mentioned above. Auto-completing the tags as you type, while jQuery and friends did most of the work, was very easy to implement. It made me happy.

Chapter seven introduced some code that allowed users to share their bookmarks with others. Along with this, the ability to vote on shared bookmarks was added. Another feature that was added in this chapter was the ability for users to comment on various bookmarks.

The ridiculously amazing Django Administration utility was first introduced in chapter eight. It kinda surprised me that it took 150 pages before this feature was brought to the user's attention. In my opinion, this is one of the most useful selling points when one is considering a Web framework for a project. When I first encountered Django, the admin interface was one of maybe three deciding factors in our company's decision to become a full-on Django shop.

Bring on the Web 2.0

Anyway, in chapter nine, we added a handful of useful "Web 2.0" features. RSS feeds were introduced. We learned about pagination to enhance usability and performance. We also improved the search engine in our project. At this stage, the magical Q objects were mentioned. The power behind the Q objects was discussed very well, in my opinion.

In chapter 10, we were taught how we can create relationships between members on the site. We made it possible for users to become "friends" so they can see the latest bookmarks posted by their friends. We also added an option for users to be able to invite some of their other friends to join the site via email, complete with activation links. Finally, we improved the user interface by providing a little bit of feedback to the user at various points using the messages framework that is part of the django.contrib.auth package in Django 1.0.

More advanced topics, such as internationalization and caching, were discussed in chapter 11. Django's special unit testing features were also introduced in chapter 11. This section actually kinda frustrated me. Caching was discussed immediately before unit testing. In the caching section, we learned how to enable site-wide caching. This actually broke the unit tests. They failed because the caching system was "read only" while running the tests. Anyway, it's probably more or less a moot point.

Chapter 11 also briefly introduced things to pay attention to when you deploy your Django projects into a production environment. This portion was mildly disappointing, but I don't know what else would have made it better. There are so many functional ways to deploy Django projects that you could write books just to describe the minutiae involved in deployment.

The twelfth and final chapter discussed some of the other things that Django has to offer, such as model managers and enhanced functionality in templates using custom template tags and filters. Generic views were mentioned, and some of the other useful things in django.contrib were brought up. Ayman also offered a few ideas of additional functionality that the reader can implement on their own, using the things they learned throughout the book.

Afterthoughts

Overall, I felt that this book did a great job of introducing the power that lies in using Django as your framework of choice. I thought Ayman managed to break things up into logical sections, and that the iterations used to enhance existing functionality (from earlier chapters) were superbly executed. I think that this book, while it does assume some prior Python knowledge, would be a fine choice for those who are curious to dig into Django quickly and easily.

Some of the beefs I have with this book deal mostly with the editing. There were a lot of strange things that I found while reading through the book. However, the biggest sticking point for me has to do with "pluggable" applications. Earlier I mentioned that the built-in Django admin was one of only a few deciding factors in my company's choice to become a Django shop. Django was designed to allow its applications to be very "pluggable."

You may be asking, "What do I mean by 'pluggable'?" Well, say you decide to build a website that includes a blog, so you build a Django project and create an application specific to blogging. Then, at some later time, you need to build another site that also has blog functionality. Do you want to rewrite all of the blogging code for the second site? Or do you want to use the same code that you used in the first site (without copying it)? If you're anything like me and thousands of other developers out there, you would probably rather leverage the work you had already done. Django allows you to do this if you build your Django applications properly.

This book, however, makes no such effort to teach the reader how to turn all of their hard work on the social bookmarking features into something they could reuse over and over with minimal effort in the future. Application-specific templates are placed directly into the global templates directory. Application-specific URLconfs are placed in the root urls.py file. I would have liked to see at least some effort to make the bookmarking application have the potential to be reused.

Finally, the most obvious gripe is that the book is outdated. That's understandable, though! Anything in print media will likely be outdated the second it is printed if the book has anything to do with computers. However, with the understanding that this book was written specifically for Django 1.0 and not Django 1.1 or 1.2 alpha, it does an excellent job of hitting the mark.