Syndication Caching in Django
I've recently been working on some performance enhancements on my site. Apparently some of my latest articles are a little too popular for my shared hosting plan. The surge of traffic to my site took down several sites on the same server as my own.
Part of my response to the fiasco was to implement caching on my site, and it seems to have helped a lot. My RSS feeds are hit almost as hard as the real articles on my site, though, and I noticed that they weren't being cached the way I had expected. I tried a couple of things that I thought would work, but nothing seemed to do the trick.
After doing some brief research into the idea of caching my RSS feeds using Django's built-in caching mechanisms, I came up empty. It occurred to me to implement caching in the feed classes themselves. I tried something like this:
    from django.contrib.syndication.feeds import Feed
    from django.core.cache import cache

    from articles.models import Article


    class LatestEntries(Feed):
        ...

        def items(self):
            articles = cache.get('latest_articles')
            if articles is None:
                articles = Article.objects.active().order_by('-publish_date')[:10]
                cache.set('latest_articles', articles)
            return articles

        ...
This code doesn't work! When I tried to retrieve one of my RSS feeds with this "caching" in place, I got the following traceback:
    Traceback (most recent call last):
      File "/home/wheaties/dev/django/core/servers/basehttp.py", line 280, in run
        self.result = application(self.environ, self.start_response)
      File "/home/wheaties/dev/django/core/servers/basehttp.py", line 674, in __call__
        return self.application(environ, start_response)
      File "/home/wheaties/dev/django/core/handlers/wsgi.py", line 241, in __call__
        response = self.get_response(request)
      File "/home/wheaties/dev/django/core/handlers/base.py", line 143, in get_response
        return self.handle_uncaught_exception(request, resolver, exc_info)
      File "/home/wheaties/dev/django/core/handlers/base.py", line 101, in get_response
        response = callback(request, *callback_args, **callback_kwargs)
      File "/home/wheaties/dev/django/utils/decorators.py", line 36, in __call__
        return self.decorator(self.func)(*args, **kwargs)
      File "/home/wheaties/dev/django/utils/decorators.py", line 86, in _wrapped_view
        response = view_func(request, *args, **kwargs)
      File "/home/wheaties/dev/django/contrib/syndication/views.py", line 215, in feed
        feedgen = f(slug, request).get_feed(param)
      File "/home/wheaties/dev/django/contrib/syndication/feeds.py", line 37, in get_feed
        return super(Feed, self).get_feed(obj, self.request)
      File "/home/wheaties/dev/django/contrib/syndication/views.py", line 134, in get_feed
        for item in self.__get_dynamic_attr('items', obj):
      File "/home/wheaties/dev/django/contrib/syndication/views.py", line 69, in __get_dynamic_attr
        return attr()
      File "/home/wheaties/dev/articles/feeds.py", line 22, in items
        cache.set(key, articles)
      File "/home/wheaties/dev/django/core/cache/backends/filebased.py", line 72, in set
        pickle.dump(value, f, pickle.HIGHEST_PROTOCOL)
    PicklingError: Can't pickle <class 'django.utils.functional.__proxy__'>: attribute lookup django.utils.functional.__proxy__ failed
This error took me by surprise. I tried a few things to get around it before I actually stopped to consider what was causing it. My Article objects are definitely serializable, which is why the error didn't make sense.
Then it hit me: the object I was actually attempting to cache was a QuerySet, not a list or tuple of Article objects. I changed the code to wrap the Article.objects.active() query in list():
    from django.contrib.syndication.feeds import Feed
    from django.core.cache import cache

    from articles.models import Article


    class LatestEntries(Feed):
        ...

        def items(self):
            articles = cache.get('latest_articles')
            if articles is None:
                # list() evaluates the QuerySet, so a plain list of
                # Article objects is what gets stored in the cache.
                articles = list(Article.objects.active().order_by('-publish_date')[:10])
                cache.set('latest_articles', articles)
            return articles

        ...
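The general point, that a lazy object may refuse to serialize while a plain list of its results pickles fine, can be illustrated without Django at all. In the sketch below a generator stands in for the lazy object. This is only an analogy: the names lazy_articles and can_pickle are made up for the illustration, and it says nothing about how a QuerySet is implemented internally.

```python
import pickle


def lazy_articles():
    """Hypothetical stand-in for a lazy result set: yields titles on demand."""
    for title in ("first", "second", "third"):
        yield title


def can_pickle(value):
    """Return True if pickle can serialize value, False otherwise."""
    try:
        pickle.dumps(value, pickle.HIGHEST_PROTOCOL)
    except (pickle.PicklingError, TypeError):
        return False
    return True


lazy = lazy_articles()            # nothing has been evaluated yet
concrete = list(lazy_articles())  # list() forces evaluation into plain data

print(can_pickle(lazy))      # False: generators cannot be pickled
print(can_pickle(concrete))  # True: a list of strings pickles fine
```

The same idea applies to anything handed to a cache backend that pickles its values, such as the file-based backend in the traceback above: store concrete data rather than an object that is still waiting to be evaluated.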
With the list() call in place, the feed finally worked. I would prefer to cache the actual XML version of the RSS feed, but I will settle for a few hundred fewer hits to my database each day by caching the list of articles. If anyone has better suggestions, I'd love to hear about them. Until then, I hope my experience will help others out there who are in danger of taking down other sites on their shared hosting service!
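For what it's worth, here is a rough sketch of what caching the rendered XML might look like. It is not wired into Django's Feed class. SimpleCache is a hypothetical in-memory stand-in with get and set methods shaped like a cache backend's, and cached_feed_xml and render_feed are names invented for the example.

```python
import time


class SimpleCache:
    """Hypothetical in-memory stand-in exposing get/set like a cache backend."""

    def __init__(self):
        self._store = {}

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return default
        return value

    def set(self, key, value, timeout=300):
        self._store[key] = (value, time.monotonic() + timeout)


def cached_feed_xml(cache, key, render, timeout=300):
    """Return the cached XML string for key, rendering it only on a miss."""
    xml = cache.get(key)
    if xml is None:
        xml = render()
        cache.set(key, xml, timeout)
    return xml


render_calls = []


def render_feed():
    """Hypothetical renderer; records each call so cache hits are visible."""
    render_calls.append(1)
    return '<rss version="2.0"><channel><title>Latest</title></channel></rss>'


cache = SimpleCache()
first = cached_feed_xml(cache, "latest_feed_xml", render_feed)
second = cached_feed_xml(cache, "latest_feed_xml", render_feed)
print(first == second, len(render_calls))  # True 1
```

Because the cached value is a plain string, it serializes without any trouble, and a cache hit skips both the database query and the feed rendering. How best to hook something like this into the feed view itself is a separate question that this sketch doesn't try to answer.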