Test-Driven Development With Python

Earlier this year, I was approached by the editor of Software Developer's Journal to write a Python-related article. I was quite flattered by the opportunity, but, being extremely busy at the time with work and family life, I was hesitant to agree. However, after much discussion with my wife and other important people in my life, I decided to go for it.

I had a lot of freedom to choose a topic to write about in the article, along with a relatively short timeline. I think I had two weeks to write the article after finally agreeing to do so, and I was supposed to write some 7-10 pages about my chosen topic.

Having recently been converted to the wonders of test-driven development (TDD), I decided that should be my topic. Several of my friends were also interested in getting into TDD, and they were looking for a good, simple way to get their feet wet. I figured the article would be as good a time as any to write up something to help my friends along.

I set out with a pretty grand plan for the article, but as the article progressed, it became obvious that my plan was a bit too grandiose for a regular magazine article. I scaled back my plans a bit and continued working on the article. I had to scale back again, and I think one more time before I finally had something that was simple enough to not write a book about.

Well, that didn't exactly turn out as planned either. I ended up writing nearly 40 pages of LibreOffice single-spaced, 12pt Times New Roman worth of TDD stuff. Granted, a fair portion of the article's length is comprised of code snippets and command output.

Anyway, I have permission to repost the article here, and I wanted to do so because I feel that the magazine formatting kinda butchered the formatting I had in mind for my article (and understandably so). To help keep the formatting more pristine, I've turned it into a PDF for anyone who's interested in reading it.

So, without much further ado, here's the article! Feel free to download or print the PDF as well.

Quick And Easy Execution Speed Testing

There have been many times when, while programming, I've encountered a problem that probably involves a loop of some sort and thought of two or more possible ways to achieve the same end result. At this point, I usually think about which one will probably be the fastest solution (execution-wise) while still being readable/maintainable. A lot of the time, the essentials of the problem can be tested in a few short lines of code.

A while back, I was perusing some Stack Overflow questions for work, and I stumbled upon what I consider one of the many hidden jewels in Python: the timeit module. Given a bit of code, this little guy will handle executing it in several loops and giving you the best time out of three trials (you can ask it to do more than 3 runs if you want). Once it completes its test, it will offer some very clean and useful output.

For example, today I encountered a piece of code that was making a comma-separated list of an arbitrary number of "%s". The code I saw essentially looked like this:

",".join(["%s"] * 50000)

Even though this code required no optimization, I thought, "Hey, that's neat... I wonder if a list comprehension could possibly be any faster." Here's an example of the contender:

",".join(["%s" for i in xrange(50000)])

I had no idea which would be faster, so timeit to the rescue!! Open up a terminal, type a couple one-line Python commands, and enjoy the results!

$ python -mtimeit 'l = ",".join(["%s"] * 50000)'
1000 loops, best of 3: 1.15 msec per loop
$ python -mtimeit 'l = ",".join(["%s" for i in xrange(50000)])'
100 loops, best of 3: 3.23 msec per loop

Hah, the list comprehension is certainly slower.
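The same comparison can also be run from inside a script. Here's a minimal sketch using timeit.timeit, written for Python 3 (so range stands in for xrange). The repetition count of 20 is an arbitrary choice of mine, and the numbers will differ from machine to machine.

```python
import timeit

# The two contenders from above, as statements for timeit to run.
repeat_stmt = '",".join(["%s"] * 50000)'
comprehension_stmt = '",".join(["%s" for i in range(50000)])'

# number=20 is an arbitrary choice; timeit.timeit returns total seconds.
repeat_time = timeit.timeit(repeat_stmt, number=20)
comprehension_time = timeit.timeit(comprehension_stmt, number=20)

print("list repetition:    %.6f sec for 20 runs" % repeat_time)
print("list comprehension: %.6f sec for 20 runs" % comprehension_time)
```

This is handy when the code under test needs more setup than fits comfortably in a one-liner.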

Now, for other more in-depth tests of performance, you might consider using the cProfile module. As far as I can tell, simple one-liners can't be tested directly from the command line using cProfile--they apparently need to be in a script. You can use something like:

python -mcProfile script.py

...in such situations. Or you can wrap function calls using cProfile.run():

import cProfile

def function_a():
    # something you want to profile
    pass

def function_b():
    # an alternative version of function_a to profile
    pass

if __name__ == '__main__':
    cProfile.run('function_a()')
    cProfile.run('function_b()')

I've used this technique for tests that I'd like to have "hard evidence" for in the future. The output of such a cProfile test looks something like this:

3 function calls in 6.860 CPU seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    6.860    6.860 <string>:1(<module>)
     1    6.860    6.860    6.860    6.860 test_enumerate.py:5(test_enumerate)
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

This is useful when your code is calling other functions or methods and you want to find where your bottlenecks are. Hooray for Python!
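If you'd rather profile just one stretch of code and sort the report yourself, something like the following should work. It's a sketch for Python 3 that pairs cProfile.Profile with the standard pstats module; slow_sum is a made-up stand-in for whatever you actually want to measure.

```python
import cProfile
import io
import pstats


def slow_sum(limit):
    # Deliberately naive loop so the profiler has something to measure.
    total = 0
    for number in range(limit):
        total += number
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(200000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time tends to put the functions worth looking at first near the top of the report.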

What profiling techniques do you use?

On Security and Python's Exec

A recent project at work has renewed my aversion to Python's exec statement--particularly when you want to use it with arbitrary, untrusted code. The project requirements necessitated the use of exec, so I got to do some interesting experiments with it. I've got a few friends who, until I slapped some sense into them, were seemingly big fans of exec (in Django projects, even...). This article is for them and others in the same boat.

Take this example:

#!/usr/bin/env python

import sys

dirname = '/usr/lib/python2.6/site-packages'

print dirname, 'in path?', (dirname in sys.path)

exec """import sys

dirname = '/usr/lib/python2.6/site-packages'
print 'In exec path?', (dirname in sys.path)

sys.path.remove(dirname)

print 'In exec path?', (dirname in sys.path)"""

print dirname, 'in path?', (dirname in sys.path)

Take a second and examine what the script is doing. Done? Great... So, the script first makes sure that a very critical directory is in my PYTHONPATH (strictly speaking, in sys.path, the list Python searches for modules): /usr/lib/python2.6/site-packages. This is the directory where all of the awesome Python packages, like PIL, lxml, and dozens of others, reside. This is where Python will look for such packages when I try to import and use them in my programs.

Next, a little Python snippet is executed using exec. Let's say this snippet comes from an untrusted source (a visitor to your website, for example). The snippet removes that very important directory from my PYTHONPATH. It might seem like it's relatively safe to do within an exec--maybe it doesn't change the PYTHONPATH that I was using before the exec?

Wrong. The output of this script on my personal system says it all:

$ python bad.py
/usr/lib/python2.6/site-packages in path? True
In exec path? True
In exec path? False
/usr/lib/python2.6/site-packages in path? False

From this example, we learn that Python code that is executed using exec runs in the same context as the code that uses exec. This is a critical concept to learn.
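The same thing holds on Python 3, where exec is a function instead of a statement. Here's a small sketch of that; the path entry is made up and only added so there's something harmless to remove.

```python
import sys

# A made-up entry, added only so the demo has something safe to remove.
entry = "/hypothetical/site-packages"
sys.path.append(entry)

snippet = "import sys\nsys.path.remove(%r)" % entry

# In Python 3, exec is a function rather than a statement.
exec(snippet)

removed_for_caller = entry not in sys.path
print("Removed for the caller too?", removed_for_caller)
```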

Some people might say, "Oh, there's an easy way around that. Give exec its own globals dictionary to work with, and all will be well." Wrong again. Here's a modified version of the above script.

#!/usr/bin/env python

import sys

dirname = '/usr/lib/python2.6/site-packages'

print dirname, 'in path?', (dirname in sys.path)

context = {'something': 'This is a special context for the exec'}

exec """import sys

print something
dirname = '/usr/lib/python2.6/site-packages'
print 'In exec path?', (dirname in sys.path)

sys.path.remove(dirname)

print 'In exec path?', (dirname in sys.path)""" in context

print dirname, 'in path?', (dirname in sys.path)

And here's the output:

$ python also_bad.py
/usr/lib/python2.6/site-packages in path? True
This is a special context for the exec
In exec path? True
In exec path? False
/usr/lib/python2.6/site-packages in path? False
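The reason a private globals dictionary doesn't help is that imported modules are shared across the whole process: importing sys inside the exec just hands back the module object everybody else is already using. Here's a quick Python 3 sketch of that; the name sandboxed_sys is mine, purely for illustration.

```python
import sys

context = {}
exec("import sys as sandboxed_sys", context)

# The "sandboxed" name is bound to the very same module object.
same_module = context["sandboxed_sys"] is sys
same_path_list = context["sandboxed_sys"].path is sys.path

print("Same module object?", same_module)
print("Same sys.path list?", same_path_list)
```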

How can you get around this glaring risk in the exec statement? One possible solution is to execute the snippet in its own process. Might not be the best way to handle things. Could be the absolute worst solution. But it's a solution, and it works:

#!/usr/bin/env python

import multiprocessing
import sys

def execute_snippet(snippet):
    exec snippet

dirname = '/usr/lib/python2.6/site-packages'

print dirname, 'in path?', (dirname in sys.path)

snippet = """import sys

dirname = '/usr/lib/python2.6/site-packages'
print 'In exec path?', (dirname in sys.path)

sys.path.remove(dirname)

print 'In exec path?', (dirname in sys.path)"""

proc = multiprocessing.Process(target=execute_snippet, args=(snippet,))
proc.start()
proc.join()

print dirname, 'in path?', (dirname in sys.path)

And here comes the output:

$ python better.py
/usr/lib/python2.6/site-packages in path? True
In exec path? True
In exec path? False
/usr/lib/python2.6/site-packages in path? True

So the PYTHONPATH is only affected by the sys.path.remove within the process that executes the snippet using exec. The process that spawns the subprocess is unaffected, and can continue with life, happily importing all of those wonderful packages from the site-packages directory. Yay.
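For anyone on Python 3, here's a rough equivalent that swaps multiprocessing for the subprocess module and runs the snippet in a brand new interpreter. Treat it as a sketch: the snippet is a harmless stand-in, and a separate process only protects the parent's interpreter state. It does nothing to stop untrusted code from touching files or the network.

```python
import subprocess
import sys

path_length_before = len(sys.path)

# The untrusted snippet drops one entry from its own sys.path.
snippet = "import sys\nsys.path.pop()\nprint('child removed an entry')"

# sys.executable is the interpreter running this script.
result = subprocess.run(
    [sys.executable, "-c", snippet],
    capture_output=True,
    text=True,
)

parent_unchanged = len(sys.path) == path_length_before
print(result.stdout.strip())
print("Parent sys.path unchanged?", parent_unchanged)
```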

With that said, exec isn't always bad. But my personal point of view is basically, "There is probably a better way." Unfortunately for me, that does not hold up in my current situation, and it might not work for your circumstances either. If no one is forcing you to use exec, you might investigate alternatives in all of that free time you've been wondering what to do with.

Python And Execution Context

I recently found myself in a situation where knowing the execution context of a function became necessary. It took me several hours to learn about this functionality, despite many cleverly-crafted Google searches. So, being the generous person I am, I want to share my findings.

My particular use case required that a function behave differently depending on whether it was called in an exec call. Specifics beyond that are not important for this article. Here's an example of how I was able to get my desired behavior.

import inspect

def is_exec():
    caller = inspect.currentframe().f_back
    module = inspect.getmodule(caller)

    if module is None:
        print "I'm being run by exec!"
    else:
        print "I'm being run by %s" % module.__name__

def main():
    is_exec()

    exec "is_exec()"

if __name__ == '__main__':
    main()

The output of such a script would look like this:

$ python is_exec.py
I'm being run by __main__
I'm being run by exec!

It's also interesting to note that when you're using the Python interactive interpreter, calling the is_exec function from the code above will tell you that you are indeed using exec.
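Here's a sketch of the same check for Python 3, reworked to return a boolean instead of printing. The function name called_from_exec is my own; the technique is the one above, relying on inspect.getmodule finding no module for code that was compiled from a string.

```python
import inspect


def called_from_exec():
    # Look one frame up the stack, at whoever called this function.
    caller = inspect.currentframe().f_back
    # Code compiled from a string has no module that inspect can find.
    return inspect.getmodule(caller) is None


namespace = {"called_from_exec": called_from_exec}
exec("answer = called_from_exec()", namespace)
print("Called from exec?", namespace["answer"])
```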

Some may argue that modifying behavior as I needed to is dirty, and that if your system requires such code, you're doing it wrong. Well, you could apply this sort of code to situations that have nothing to do with exec. Perhaps you want to determine which part of your product is using a specific function the most. Perhaps you want to get additional debugging information that isn't immediately obvious.

Just like always, I want to add the disclaimer that there may be other ways to do this and there probably are. However, this is the way that worked for me. I'd still be interested to hear about other solutions you may have encountered for this problem.

On a side note, if you're up for some slightly advanced Python mumbo jumbo, I suggest diving into the inspect documentation.

Site-Wide Caching in Django

My last article about caching RSS feeds in a Django project generated a lot of interest. My original goal was to help other people who have tried to cache QuerySet objects and received a funky error message. Many of my visitors offered helpful advice in the comments, making it clear that I was going about caching my feeds the wrong way.

I knew my solution was wrong before I even produced it, but I couldn't get Django's site-wide caching middleware to work in my production environment. Site-wide caching worked wonderfully in my development environment, and I tried all sorts of things to make it work in my production setup. It wasn't until one "Jacob" offered a beautiful pearl of wisdom that things started to make more sense:

This doesn't pertain to feeds, but one rather large gotcha with the cache middleware is that any javascript you are running that plants a cookie will affect the cache key. Google analytics, for instance, has that effect. A workaround is to use a middleware to strip out the offending cookies from the request object before the cache middleware looks at it.

The minute I read that comment, I realized just how logical it was! If Google Analytics, or any other JavaScript used on my site, was setting a cookie, and it changed that cookie on each request, then the caching engine would effectively have a different page to cache for each request! Thank you so much, Jacob, for helping me get past the frustration of not having site-wide caching in my production environment.
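To make the problem concrete, here's a toy illustration in Python 3. It is not Django's real key algorithm, just a hypothetical function showing that once the cookie header feeds into the cache key, a cookie that changes on every request means no two requests ever share a cache entry.

```python
import hashlib


def toy_cache_key(path, cookie_header):
    # Hypothetical key function: NOT Django's real algorithm, only the idea
    # that the cookie header feeds into the key.
    raw = (path + "|" + cookie_header).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


# Two requests for the same page with a made-up, changing analytics cookie.
first_visit = toy_cache_key("/articles/", "__utma=1.100")
second_visit = toy_cache_key("/articles/", "__utma=1.200")

# With the cookie stripped, every request maps to the same key.
stripped_first = toy_cache_key("/articles/", "")
stripped_second = toy_cache_key("/articles/", "")

print("Keys match with cookies?   ", first_visit == second_visit)
print("Keys match without cookies?", stripped_first == stripped_second)
```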

How To Set Up Site-Wide Caching

While most of this can be gleaned from the official documentation, I will repeat it here in an effort to provide a complete "HOWTO". For further information, hit up the official caching documentation.

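The first step is to choose a caching backend for your project. Built-in options include Memcached, database caching, filesystem caching, local-memory caching, and a dummy backend for development.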

To specify which backend you want to use, define the CACHE_BACKEND variable in your settings.py. The definition for each backend is different, so check out the official documentation for details.

Next, install a couple of middleware classes, and pay attention to where the classes are supposed to appear in the list:

  • django.middleware.cache.UpdateCacheMiddleware - This should be the first middleware class in your MIDDLEWARE_CLASSES tuple in your settings.py.
  • django.middleware.cache.FetchFromCacheMiddleware - This should be the last middleware class in your MIDDLEWARE_CLASSES tuple in your settings.py.

Finally, you must define the following variables in your settings.py file:

  • CACHE_MIDDLEWARE_SECONDS - The number of seconds each page should be cached
  • CACHE_MIDDLEWARE_KEY_PREFIX - If the cache is shared across multiple sites using the same Django installation, set this to the name of the site, or some other string that is unique to this Django instance, to prevent key collisions. Use an empty string if you don't care

If you don't use anything like Google Analytics that sets/changes cookies on each request to your site, you should have site-wide caching enabled now. If you only want pages to be cached for users who are not logged in, you may add CACHE_MIDDLEWARE_ANONYMOUS_ONLY = True to your settings.py file--its meaning should be fairly obvious.
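Put together, the relevant chunk of settings.py might look something like this. It's only a sketch: the local-memory backend, the two middleware classes in the middle, the 600 seconds, and the "mysite" prefix are all placeholder choices of mine, so adjust them to your own project.

```python
# settings.py (excerpt)

# Assumed backend for illustration: the simple local-memory cache.
CACHE_BACKEND = "locmem:///"

MIDDLEWARE_CLASSES = (
    # Must come first.
    "django.middleware.cache.UpdateCacheMiddleware",
    # Whatever other middleware your project uses goes in between.
    "django.middleware.common.CommonMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    # Must come last.
    "django.middleware.cache.FetchFromCacheMiddleware",
)

# Placeholder values: cache each page for ten minutes.
CACHE_MIDDLEWARE_SECONDS = 600
CACHE_MIDDLEWARE_KEY_PREFIX = "mysite"
```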

If, however, your site-wide caching doesn't appear to work (as it didn't for me for a long time), you can create a special middleware class to strip those dirty cookies from the request, so the caching middleware can do its work.

import re

class StripCookieMiddleware(object):
    """Ganked from http://2ze.us/Io"""

    STRIP_RE = re.compile(r'\b(_[^=]+=.+?(?:; |$))')

    def process_request(self, request):
        cookie = self.STRIP_RE.sub('', request.META.get('HTTP_COOKIE', ''))
        request.META['HTTP_COOKIE'] = cookie

Edit: Thanks to Tal for the regex suggestion!

Once you do that, you need only install the new middleware class. Be sure to install it somewhere between the UpdateCacheMiddleware and FetchFromCacheMiddleware classes, not first or last in the tuple. When all of that is done, your site-wide caching should really work! That is, of course, unless your offending cookies are not found by that STRIP_RE regular expression.
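If you want to see what STRIP_RE actually removes, you can try it on a sample header outside of Django. The Cookie header below is made up for the demonstration; the pattern is copied from the middleware above.

```python
import re

# Same pattern as StripCookieMiddleware.STRIP_RE above.
STRIP_RE = re.compile(r'\b(_[^=]+=.+?(?:; |$))')

# A made-up Cookie header: two analytics cookies plus a session cookie.
header = "__utma=123.456; sessionid=abc; __utmz=789"
cleaned = STRIP_RE.sub('', header)

print(repr(cleaned))  # 'sessionid=abc; '
```

Only cookies whose names begin with an underscore get stripped, so an offending cookie with any other kind of name would sail right through.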

Thanks again to Jacob and "nf", the original author of the middleware class I used to solve all of my problems! Also, I'd like to thank "JaredKuolt" for the django-staticgenerator on his github account. It made me happy for a while as I was working toward real site-wide caching.

Syntax Highlighting, ReST, Pygments, and Django

Some of you regulars out there may have noticed an interesting change in the presentation of some of my articles: source code highlighting. I've been interested in doing this for quite some time, but I just never really got around to implementing it until last night.

I found this implementation process to be a bit more complicated than I had anticipated. For my own benefit as well as for anyone else who wants to do the same thing, I thought I'd document my findings in a thorough article for how to add syntax highlighting to an existing Django- and reStructuredText-powered Web site.

The power behind the syntax highlighting comes from Python, reStructuredText, Pygments, Django, and CSS.

Python is a huge player in this feature because reStructuredText (ReST) was built for Python, Pygments is the source highlighter (written in Python), and Django is written in Python (and my site is powered by Django). Some of you may recall that I converted all of my articles to ReST not too long ago because it suited my needs better than Textile, my previous markup processor. At the time, I was not aware that the conversion to ReST would make it all the easier for me to implement the syntax highlighting, but last night I figured out that that conversion probably saved me a lot of frustration. Cascading Style Sheets (CSS) are responsible for making the source code actually look good, while Pygments takes care of assigning classes to various parts of the designated source code and generating the CSS.

So, the first set of requirements, which I will not document in this article, are that you already have a Django site up and running and that you're familiar with ReST syntax. If you have the django.contrib.flatpages application installed already, you can type up some ReST documents there and apply the concepts discussed in this article.

Next, you should ensure that you have Pygments installed. There are a variety of ways to install this. Perhaps the easiest and most platform-independent method is to use easy_install:

$ easy_install pygments

This command should work essentially the same on Windows, Linux, and Macintosh computers. If you don't have easy_install installed, you can get it from its website. If you're using a Debian-based distribution of Linux, such as Ubuntu, you could do something like this:

$ sudo apt-get install python-pygments

...and it should take care of downloading and installing Pygments. Alternatively, you can download it straight from the PyPI page and install it manually.

Now we need to install the Pygments ReST directive. A ReST directive is basically like a special command to the ReST processor. I think this part was the most difficult aspect of the implementation, simply because I didn't know where to find the Pygments directive or how to write my own. Eventually, I ended up downloading the Pygments-1.0.tar.gz file from PyPI, opening the Pygments-1.0/external/rst-directive.py file from the archive, and copying the stuff in there into a new file within my site.

For my own purposes, I made some small adjustments to the directive that comes with the Pygments distribution. I think it would save us all a lot of hassle if I just copied and pasted the directive, as I currently have it, so you can see it first-hand.

"""
    The Pygments reStructuredText directive
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    This fragment is a Docutils_ 0.4 directive that renders source code
    (to HTML only, currently) via Pygments.

    To use it, adjust the options below and copy the code into a module
    that you import on initialization.  The code then automatically
    registers a ``code-block`` directive that you can use instead of
    normal code blocks like this::

        .. code-block:: python

            My code goes here.

    If you want to have different code styles, e.g. one with line numbers
    and one without, add formatters with their names in the VARIANTS dict
    below.  You can invoke them instead of the DEFAULT one by using a
    directive option::

        .. code-block:: python
            :linenos:

            My code goes here.

    Look at the `directive documentation`_ to get all the gory details.

    .. _Docutils: http://docutils.sf.net/
    .. _directive documentation:
       http://docutils.sourceforge.net/docs/howto/rst-directives.html

    :copyright: 2007 by Georg Brandl.
    :license: BSD, see LICENSE for more details.
"""

# Options
# ~~~~~~~

# Set to True if you want inline CSS styles instead of classes
INLINESTYLES = False

from pygments.formatters import HtmlFormatter

# The default formatter
DEFAULT = HtmlFormatter(noclasses=INLINESTYLES)

# Add name -> formatter pairs for every variant you want to use
VARIANTS = {
    'linenos': HtmlFormatter(noclasses=INLINESTYLES, linenos=True),
}


from docutils import nodes
from docutils.parsers.rst import directives

from pygments import highlight
from pygments.lexers import get_lexer_by_name, TextLexer

def pygments_directive(name, arguments, options, content, lineno,
                       content_offset, block_text, state, state_machine):
    try:
        lexer = get_lexer_by_name(arguments[0])
    except ValueError:
        # no lexer found - use the text one instead of an exception
        lexer = TextLexer()
    # take an arbitrary option if more than one is given
    formatter = options and VARIANTS[options.keys()[0]] or DEFAULT
    parsed = highlight(u'\n'.join(content), lexer, formatter)
    parsed = '<div class="codeblock">%s</div>' % parsed
    return [nodes.raw('', parsed, format='html')]

pygments_directive.arguments = (1, 0, 1)
pygments_directive.content = 1
pygments_directive.options = dict([(key, directives.flag) for key in VARIANTS])

directives.register_directive('code-block', pygments_directive)

I won't explain what that code means, because, quite frankly, I'm still a little hazy on the inner workings of ReST directives myself. Suffice it to say that this snippet allows you to easily highlight blocks of code on ReST-powered pages.
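One line is easy enough to unpack, though: the one that picks a formatter. It uses the old "and/or" trick to mean "if any option was given, use the matching variant, otherwise use the default." Here's that logic spelled out as a sketch in plain Python 3. The function name is mine, and placeholder strings stand in for the real HtmlFormatter objects so it runs without Pygments installed.

```python
def pick_formatter(options, variants, default):
    # No options on the directive: fall back to the default formatter.
    if not options:
        return default
    # Otherwise take the first option name, as the directive does.
    first_option = next(iter(options))
    return variants[first_option]


# Placeholder strings stand in for real HtmlFormatter instances.
variants = {"linenos": "formatter with line numbers"}
default = "plain formatter"

print(pick_formatter({}, variants, default))
print(pick_formatter({"linenos": None}, variants, default))
```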

The question now is: where do I put this snippet? As far as I'm aware, this code can be located anywhere so long as it is loaded at one point or another before you start your ReST processing. For the sake of simplicity, I just stuffed it in the __init__.py file of my Django site. This is the __init__.py file that lives in the same directory as manage.py and settings.py. Putting it in that file just makes sure it's loaded each time you start your Django site.

To make Pygments highlight a block of code, all you need to do is something like this:

.. code-block:: python

    print 'Hello world!'

...which would look like...

print 'Hello world!'

If you have a longer block of code and would like line numbers, use the :linenos: option:

.. code-block:: python
    :linenos:

    for i in range(100):
        print i

...which should look like this...

for i in range(100):
    print i

That's all fine and dandy, but it probably doesn't look like the code is highlighted at all just yet (on your site, not mine). It's just been marked up by Pygments to have some pretty CSS styles applied to it. But how do you know which styles mean what?

Luckily enough, Pygments takes care of generating the CSS files for you as well. There are several attractive styles that come with Pygments. I would recommend going to the Pygments demo to see which one suits you best. You can also roll your own styles, but I haven't braved that yet so I'll leave that for another day.

Once you choose a style (I chose native for Code Koala), you can run the following commands:

$ pygmentize -S native -f html > native.css
$ cp native.css /path/to/site/media/css

Obviously, you'd want to replace native with the name of the style you like the most. Finally, add a line to your HTML templates to load the newly created CSS file. In my case, it's something like this:

<link rel="stylesheet" type="text/css" href="/static/styles/native.css" />

Now you should be able to see nicely-formatted source code on your Web pages (assuming you've already got ReST processing your content).

If you haven't been using ReST to generate nicely-formatted pages, you should make sure a couple of things are in place. First, you must have the django.contrib.markup application installed. Second, your templates should be set up to process ReST markup into HTML. Here's a sample templates/flatpages/default.html:

{% extends 'base.html' %}
{% load markup %}

{% block title %}{{ flatpage.title }}{% endblock %}

{% block content %}
<h2>{{ flatpage.title }}</h2>

{{ flatpage.content|restructuredtext }}
{% endblock %}

So that short template should allow you to use ReST markup for your flatpages, and it should also take care of the magic behind the .. code-block:: python directive.

I should also note that Pygments can handle a TON of languages. Check out the Pygments demo for a list of languages it knows how to highlight.

I think that about does it. Hopefully this article will help some other poor chap who is currently in the same situation as I was last night, and hopefully it will save you a lot more time than it took me to figure out all this junk. If it looks like I've missed something, or maybe that something needs further clarification, please comment and I'll see what I can do.

Installing Django on Shared Hosting (Site5)

This article is related to my previously posted article about installing Django, an advanced Web framework for perfectionists, on your own computer. Now we will learn how to install Django on a shared hosting account, using Site5 and fastcgi as an example. Depending on your host, you may or may not have to request additional privileges from the support team in order to execute some of these commands.

Note: Django requires at least Python 2.3. Newer versions of Python are preferred.

Note: This HOWTO assumes familiarity with the UNIX/Linux command line.

Note: If the wget command doesn't work for you (as in you don't have permission to run it), you might try curl [url] -O instead. That's a -O as in upper-case o.

Install Python

Site5 (and many other shared hosting providers that offer SSH access) already has Python installed, but you will want to have your own copy so you can install various tools without affecting other users. So go ahead and download virtual python:

mkdir ~/downloads
cd ~/downloads
wget http://peak.telecommunity.com/dist/virtual-python.py

Virtual Python will make a local copy of the installed Python in your home directory. Now you want to make sure you execute this next command with the newest version of Python available on your host. For example, Site5 offers both Python 2.3.4 and Python 2.4.3. We want to use Python 2.4.3. To verify the version of your Python, execute the following command:

python -V

If that displays Python 2.3.x or anything earlier, try using python2.4 -V or python2.5 -V instead. Whichever command renders the most recent version of Python is the one you should use in place of python in the next command. Since python -V currently displays Python 2.4.3 on my Site5 sandbox, I will execute the following command:

python ~/downloads/virtual-python.py

Again, this is just making a local copy of the Python installation that you used to run the virtual-python.py script. Your local installation is likely in ~/lib/python2.4/ (version could vary).

Make Your Local Python Be Default

To reduce confusion and hassle, let's give our new local installation of Python precedence over the system-wide Python. To do that, open up your ~/.bashrc and make sure it contains a line similar to this:

export PATH=$HOME/bin:$PATH

If you're unfamiliar with UNIX-based text editors such as vi, here is what you would type to use vi to make the appropriate changes:

  • vi ~/.bashrc to edit the file
  • go to the end of the file by using the down arrow key or the j key
  • hit o (the letter) to tell vi you want to start typing stuff on the next line
  • type export PATH=$HOME/bin:$PATH
  • hit the escape key
  • type :x to save the changes and quit. Don't forget the : at the beginning. Alternatively, you can type :wq, which works exactly the same as :x.

Once you've made the appropriate changes to ~/.bashrc, you need to make those changes take effect in your current SSH session:

source ~/.bashrc

Now we should verify that our changes actually took place. Type the following command:

which python

If the output of that command is not something like ~/bin/python or /home/[your username]/bin/python, something probably didn't work. If that's the case, you can try again, or simply remember to use ~/bin/python instead of python throughout the rest of this HOWTO.
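You can also ask Python itself which interpreter is running and whether it's new enough. Save something like this sketch to a file (the name is up to you) and run it with python; it should work on old and new versions of Python alike.

```python
import sys

# Django needs at least Python 2.3, per the note at the top of this HOWTO.
MINIMUM_VERSION = (2, 3)

version_ok = sys.version_info[:2] >= MINIMUM_VERSION

print("Interpreter: %s" % sys.executable)
print("Version: %d.%d" % sys.version_info[:2])
print("New enough for Django? %s" % version_ok)
```

If the interpreter path points at the system-wide Python instead of the copy in your home directory, your PATH change hasn't taken effect yet.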

Install Python's setuptools

Now we should install Python's setuptools to make our lives easier down the road.

cd ~/downloads
wget http://peak.telecommunity.com/dist/ez_setup.py
python ez_setup.py

This gives us access to a script called easy_install, which makes it easy to install many useful Python tools. We will use this a bit later.

Download Django

Let's now download the most recent development version of Django. SSH into your account and execute the following commands (all commands shall be executed on your host).

svn co http://code.djangoproject.com/svn/django/trunk ~/downloads/django-trunk

Now we should make a symlink (or shortcut) to Django and put it somewhere on the Python Path. A sure-fire place is your ~/lib/python2.4/site-packages/ directory (again, that location could vary from host to host):

ln -s ~/downloads/django-trunk/django ~/lib/python2.4/site-packages
ln -s ~/downloads/django-trunk/django/bin/django-admin.py ~/bin

Now verify that Django is installed and working by executing the following command:

python -c "import django; print django.get_version()"

That command should return something like 1.0-final-SVN-8964. If you got something like that, you're good to move onto the next section. If, however, you get something more along the lines of...

Traceback (most recent call last):
    File "<string>", line 1, in ?
ImportError: No module named django

...then your Django installation didn't work. If this is the case, make sure that you have a ~/downloads/django-trunk/django directory, and also verify that ~/lib/python2.4/site-packages actually exists.

Installing Dependencies

In order for your Django projects to become useful, we need to install some other packages: PIL (Python Imaging Library, required if you want to use Django's ImageField), MySQL-python (a MySQL database driver for Python), and flup (a utility for fastcgi-powered sites).

easy_install -f http://www.pythonware.com/products/pil/ Imaging
easy_install mysql-python
easy_install flup

Sometimes, using easy_install to install PIL doesn't go over too well because of your (lack of) permissions. To circumvent this situation, you can always download the actual PIL source code and install it manually.

cd ~/downloads
wget http://effbot.org/downloads/Imaging-1.1.6.tar.gz
tar zxf Imaging-1.1.6.tar.gz
cd Imaging-1.1.6
ln -s ~/downloads/Imaging-1.1.6/PIL ~/lib/python2.4/site-packages

And to verify, you can try this command:

python -c "import PIL"

If that doesn't return anything, you're good to go. If it says something about "ImportError: No module named PIL", it didn't work. In that case, you have to come up with some other way of installing PIL.
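To check everything in one go, a tiny script along these lines should do. It's a sketch that simply tries each import and reports the result; the helper name is_importable is mine, and the names in the loop are the import names for Django, PIL, MySQL-python, and flup.

```python
def is_importable(module_name):
    # Try the import and report success instead of raising.
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


# Import names for Django, PIL, MySQL-python, and flup respectively.
for name in ("django", "PIL", "MySQLdb", "flup"):
    if is_importable(name):
        status = "ok"
    else:
        status = "MISSING"
    print("%-8s %s" % (name, status))
```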

Setting Up A Django Project

Let's attempt to set up a sample Django project.

mkdir -p ~/projects/django
cd ~/projects/django
django-admin.py startproject mysite
cd mysite
mkdir media templates

If that works, then you should be good to do the rest of your Django development on your server. If not, make sure that ~/downloads/django-trunk/django/bin/django-admin.py exists and that it has a functioning symlink (shortcut) in ~/bin. If not, you'll have to make adjustments according to your setup. Your directory structure should look something like:

  • projects
    • django
      • mysite
        • media
        • templates
        • __init__.py
        • manage.py
        • settings.py
        • urls.py
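
To double-check the layout listed above without eyeballing it, something like the following sketch could help. The function name missing_entries and the EXPECTED list are hypothetical, taken only from the listing above, and the code assumes Python 3.

```python
import os

# Entries from the listing above, relative to the project directory (mysite).
EXPECTED = ["media", "templates", "__init__.py", "manage.py", "settings.py", "urls.py"]


def missing_entries(project_dir, expected=EXPECTED):
    """Return the expected entries that do not exist under project_dir."""
    return [name for name in expected
            if not os.path.exists(os.path.join(project_dir, name))]


if __name__ == "__main__":
    project = os.path.expanduser("~/projects/django/mysite")
    print(missing_entries(project) or "layout looks complete")
```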

Making A Django Project Live

Now we need to make your Django project accessible from the Web. On Site5, I generally use either a subdomain or a brand new domain when setting up a Django project. If you plan on having other projects accessible on the same hosting account, I recommend you do the same. Let's assume you set up a subdomain such as mysite.mydomain.com. On Site5, you would go to ~/public_html/mysite for the next few commands. This could differ from host to host, so I won't go into much more detail than that.

Once you're in the proper place, you need to set up a few things: two symlinks, a django.fcgi, and a custom .htaccess file. Let's begin with the symlinks.

ln -s ~/projects/django/mysite/media ~/public_html/mysite/static
ln -s ~/lib/python2.4/site-packages/django/contrib/admin/media ~/public_html/mysite/media

This just makes it so you can keep your media files (CSS, images, JavaScript, etc.) in a different location than your public_html.

Now for the django.fcgi. This file is what tells the webserver to execute your Django project.

#!/home/[your username]/bin/python
import sys, os

# Add a custom Python path.
sys.path.insert(0, "/home/[your username]/projects/django")

# Switch to the directory of your project. (Optional.)
os.chdir("/home/[your username]/projects/django/mysite")

# Set the DJANGO_SETTINGS_MODULE environment variable.
os.environ['DJANGO_SETTINGS_MODULE'] = "mysite.settings"

from django.core.servers.fastcgi import runfastcgi
runfastcgi(method="threaded", daemonize="false")

And finally, the .htaccess file:

RewriteEngine On
RewriteBase /
RewriteRule ^(media/.*)$ - [L]
RewriteRule ^(static/.*)$ - [L]
RewriteCond %{REQUEST_URI} !(django.fcgi)
RewriteRule ^(.*)$ django.fcgi/$1 [L]

The .htaccess file makes it so that requests to http://mysite.mydomain.com/ are properly directed to your Django project. So, now you should have a directory structure that looks something like this:

  • public_html
    • mysite
      • media
      • static
      • .htaccess
      • django.fcgi
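
If the rewrite rules in the .htaccess file feel a bit opaque, the following sketch mimics their effect in plain Python. It is only a simplified emulation (it assumes the rules are evaluated top to bottom against the path without its leading slash, as Apache does in a per-directory context), and the function name route is made up for illustration.

```python
import re


def route(path):
    """Return where a request path ends up after the four rewrite rules.

    `path` is the URL path relative to the site root, without a leading slash.
    """
    # Rules 1 and 2: media/ and static/ are left untouched, so Apache
    # serves those files directly instead of handing them to Django.
    if re.search(r"^(media/.*)$", path) or re.search(r"^(static/.*)$", path):
        return path
    # RewriteCond: skip requests whose URI already mentions django.fcgi,
    # which prevents the last rule from looping on its own output.
    if re.search(r"django.fcgi", "/" + path):
        return path
    # Final rule: everything else is handed to the FastCGI script.
    return "django.fcgi/" + path


if __name__ == "__main__":
    for example in ("media/css/base.css", "static/logo.png", "blog/2008/", ""):
        print(repr(example), "->", repr(route(example)))
```

In other words, only requests for media/ and static/ bypass Django; every other URL is passed to django.fcgi with the original path appended.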

If that looks good, go ahead and make the django.fcgi executable and non-writable by others:

chmod 755 ~/public_html/mysite/django.fcgi

After that, head over to http://mysite.mydomain.com/ (obviously, replace the mydomain accordingly). If you see a page that says you've successfully set up your Django site, you're good to go!

Afterthoughts

I've noticed that I need to "restart" my Django sites on Site5 any time I change the .py files. There are a couple of ways to do this. One involves killing off all of your python processes (killall ~/bin/python), and the other simply updates the timestamp on your django.fcgi (touch ~/public_html/mysite/django.fcgi). I find the former to be more destructive and less reliable than the latter. So, my advice is to use the touch method unless it doesn't work, in which case you can try the killall method.
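
If you want to trigger the touch method from a Python deployment script rather than from the shell, a minimal sketch might look like this. It assumes Python 3, and the path in the usage section is the example path from this article, so adjust it to your own setup.

```python
import os


def touch(path):
    """Set the file's access and modification times to the current time.

    Unlike the shell's `touch`, this raises an error if the file is missing.
    """
    os.utime(path, None)


if __name__ == "__main__":
    touch(os.path.expanduser("~/public_html/mysite/django.fcgi"))
```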

Good luck!

Django's New Comment System

There are a lot of exciting changes happening with Django right now. A lot. Some of these changes cause a lot of things to break across my sites. One such change was the integration of Thejaswi Puthraya's Summer of Code project: an improved comment system.

The first, and most obvious problem, was the change in the URLconf. This took me a while to track down for one reason or another. Here's the situation: originally, the django.contrib.comments application used a URLconf such as:

(r'^comments/', include('django.contrib.comments.urls.comments')),

With the new comment system in place, that old include makes any comments-powered pages blow up. To solve this particular problem, just change it to:

(r'^comments/', include('django.contrib.comments.urls')),

The next thing that caught me dealt with the templates for comments. Now there are actually some default ones, which is nice, but they might interfere with your own templates. I found that all I need in my templates/comments/ directory now is a single simple template called base.html:

{% extends 'base.html' %}

All of the other templates aren't needed unless you do some customized stuff (which I don't bother with).

Finally, and probably the most frustrating of all, I kept getting an error such as:

NoReverseMatch: Reverse for '<function post_comment at 0xb504a1b4>' not found.

I'm not really sure why this problem has arisen, but my solution for it is to remove the entire django/contrib/comments/ directory and bring it back down from SVN. My guess is that some .pyc file lingering from the original comments application is interfering with the new comments application.
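
If the guess about lingering .pyc files is right, clearing them out may be a gentler alternative to deleting the whole directory. The following is only a sketch under that assumption: the function names are hypothetical, the code assumes Python 3, and I have not confirmed that it cures the NoReverseMatch error.

```python
import os


def find_pyc_files(root):
    """Return the sorted paths of all .pyc files below root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".pyc"):
                found.append(os.path.join(dirpath, filename))
    return sorted(found)


def remove_pyc_files(root):
    """Delete every .pyc file below root and return the paths removed."""
    removed = find_pyc_files(root)
    for path in removed:
        os.remove(path)
    return removed
```

You would point remove_pyc_files at your django/contrib/comments/ directory; Python regenerates the compiled files the next time the modules are imported.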

Feel free to post here if you have any other advice or problems!

Step-by-Step: Installing Django

Being the Django and Python zealot that I am, I often find myself trying to convert those around me to this awesome development paradigm. Once I break them, these people often ask me a lot of questions about Django. Over the past few months I've noticed that one of the biggest sticking points for people who are new to Django is actually getting it up and running to begin with. In response, this is the first in a series of articles dedicated to getting Django up and running.

What is Django?

The Django Web site describes Django as "a high-level Python Web framework that encourages rapid development and clean, pragmatic design." Basically, Django is just about the most amazing thing for Web development. I have tinkered with several different Web technologies, but nothing seems to even come close to what Django can do for me.

What is Python?

Python is a programming language used in numerous aspects of computing these days. It has a very simple yet powerful syntax. It's an easy language for beginners to pick up, but it provides adequate levels of power for the more experienced developers out there. If you have never programmed anything before, or you have dabbled with something like BASIC, Python should be fairly straightforward. If you are a programming veteran, but have only worked with languages like C, C++, Java, etc, you might struggle a bit with the syntax of the language. It's not difficult to overcome the differences in a couple hours of hands-on development.

Let's get started.

Installing Python...

Having Python installed is critical--Django does not work without Python. I'm guessing that you're relatively familiar with the procedures for installing software packages on your particular operating system. However, I will share a few notes to point you in the proper direction if you're lost. If nothing else, just head over to the Python download page to download anything you need to install Python. I whole-heartedly recommend using the latest stable version of Python for Django, but you should be able to get by with as early a version as 2.3.

...On Windows

Simply grab the latest version of the Python installer. It is currently version 2.5.2. Once the installer has downloaded successfully, just run through the installation wizard like any other setup program.

...On Mac OS X

Recent Mac OS X computers come with Python pre-installed. To determine whether or not you actually have it, launch the Terminal (Applications > Utilities > Terminal) and type python -c "import sys; print sys.version". If Python is already installed, you will see the version you have installed. If you have a version that is less than 2.3, you should download the newest version. If you don't have Python installed, you will get a "command not found" error. If you're in this boat, just download the latest version of the Python Universal installer and install it.

...On Linux

Most Linux distributions also have Python pre-installed. Just like with Mac OS X, you can check to see by opening up a terminal/konsole session and running the command python -c "import sys; print sys.version". If you have Python installed, you will see its version. If you get an error message when running that command, or you have a version earlier than 2.3, you need to download and install the latest version of Python.
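
If you'd rather have Python make the version comparison for you, here is a small sketch. The MINIMUM value comes from the 2.3 requirement mentioned above, and the function name meets_minimum is made up for illustration.

```python
import sys

# Earliest Python version this article says Django can get by with.
MINIMUM = (2, 3)


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`.

    With no argument, the running interpreter's version is checked.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    print(sys.version)
    print("new enough" if meets_minimum() else "too old, please upgrade")
```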

If you're running a Debian-based distribution (like Ubuntu, sidux, MEPIS, KNOPPIX, etc.), you can probably use sudo apt-get install python to get Python. If you're running an RPM-based distribution, you can probably use something like Yum or YaST to install Python.

A sure-fire way to install Python on any Linux system, however, is to install from source. If you need to do this, you simply:

  1. download the source for the latest version of Python
  2. extract it: tar jxf Python-2.5.2.tar.bz2
  3. go into the newly-extracted directory: cd Python-2.5.2
  4. configure it: ./configure
  5. compile it: make
  6. install it: make install

(I've only installed Python from source one time, so I might be wrong)

Setting Up Your PYTHONPATH...

Generally speaking, if you didn't have Python installed before starting this tutorial, you will need to set up your PYTHONPATH environment variable. This is a variable that lets Python know where to find useful things (like Django).
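
To see what PYTHONPATH actually does, the sketch below starts a fresh interpreter with the variable set and returns that interpreter's module search path (sys.path). It assumes Python 3, and the function name child_sys_path is hypothetical.

```python
import json
import os
import subprocess
import sys


def child_sys_path(extra_dir):
    """Start a fresh interpreter with PYTHONPATH=extra_dir and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    output = subprocess.check_output(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
    )
    return json.loads(output.decode())


if __name__ == "__main__":
    for entry in child_sys_path(os.path.expanduser("~")):
        print(entry)
```

The directory you pass in should show up near the top of the list, which is why Python can then find packages (like Django) that live inside it.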

...On Windows

  • Open up your System Properties (Win+Break or right click on "My Computer" on your desktop and select Properties)
  • Go to the "Advanced" tab
  • Click the "Environment Variables" button
  • If you have permission to change system variables, click the "New" button in the bottom pane. Otherwise, create the PYTHONPATH variable for your user account using the "New" button in the top (User variables for [username]) pane.
  • Set the variable name to PYTHONPATH
  • Set the variable value to C:\Python25\Lib\site-packages (replace C:\Python25\ with whatever it is on your system if needed)
  • Save it

You may also need to add the python executable to your PATH. If you can successfully run python from a command prompt window, you don't need to worry about it.

If you can't run python from a command prompt, follow the procedure above, but use the PATH variable instead of PYTHONPATH. PATH most likely already exists, so you just need to add something like C:\Python25\ to the beginning or end of the existing value, separated from the other entries by a semicolon (again, this might need to change depending on where you installed Python).

...On Mac OS X

Your PYTHONPATH should already be set up for you.

...On Linux

Usually you just need to edit your ~/.bashrc script to set up your PYTHONPATH environment variable. Go ahead and open that up in your preferred text editor and make sure there's something in it like:

export PYTHONPATH=/usr/lib/python2.5/site-packages:$PYTHONPATH

Save any changes necessary and run the following command:

source ~/.bashrc

This will take care of updating your current session with any changes you made to your ~/.bashrc.

Installing Django

Once you have Python and have verified that you have version 2.3 or later, you are ready to install Django. Currently, the latest stable release is 0.96.1, but this is grossly outdated. Django 1.0 is scheduled for release on September 2nd, 2008, so the "unstable" copy of Django is pretty close to what 1.0 will have to offer. There are some incredibly useful improvements in the unstable version that I don't think I could do without anymore, so that's what I'll talk about installing here.

First, you need to have a subversion client. On Windows, the most popular one is called TortoiseSVN. On Mac OS X, I have played with a few, but I think Versions is a pretty decent one. Linux also has several to choose from, but if you're using Linux, you're probably going to use the command line anyway (right?).

For brevity, I will just use the subversion commands necessary to accomplish this task (instead of discussing every GUI for subversion).

The exact location that Django should be installed differs from system to system, but here are some guidelines for typical setups:

  • Windows: C:\Python25\Lib\site-packages
  • Linux: /usr/lib/python2.5/site-packages
  • Mac OS X: /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages

If you want a definite location, run the following command:

python -c "from distutils.sysconfig import get_python_lib; print get_python_lib()"
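
On newer Python versions, where distutils may no longer be available, the standard library's sysconfig module offers a similar lookup. A minimal sketch, assuming Python 3 (the function name site_packages_dir is made up for illustration):

```python
import sysconfig


def site_packages_dir():
    """Return the directory where pure-Python packages get installed."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print(site_packages_dir())
```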

Once you know that location, go there in your command prompt or terminal session. Then execute this command:

svn co http://code.djangoproject.com/svn/django/trunk/django django

You will see loads of output, showing all of the files that you are downloading as you install Django.

As soon as that process completes, you should run python -c "import django" to make sure everything worked properly. If the command doesn't display an ImportError, you're good. Otherwise, you need to try again.

Getting Access to Django Scripts...

Once you can successfully import django, you might want to make sure you can run the django-admin.py script that comes with Django.

...On Windows

This process is very similar to what we did with the PYTHONPATH environment variable earlier.

  • Open your System Properties again
  • Go to the Advanced tab
  • Click the Environment Variables button
  • Find your PATH environment variable (either for your user or system-wide)
  • Make sure that the variable value contains something like C:\Python25\Lib\site-packages\django\bin
  • Save any changes
  • Open a fresh command prompt
  • Try to run django-admin.py. If you're successful, you're ready to get started with Django. Otherwise, you need to fix your path to django/bin or just call the django-admin.py script using an absolute path when needed.

...On Mac OS X

You can run a command similar to this:

sudo ln -s /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/django/bin/django-admin.py /usr/local/bin

...On Linux

If you have "root" privileges on your Linux system, you can execute a command like:

sudo ln -s /usr/lib/python2.5/site-packages/django/bin/django-admin.py /usr/local/bin

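If you don't have "root" privileges, you can set up your own equivalent of /usr/local/bin in your home directory and symlink the script there instead:

mkdir ~/bin
ln -s /usr/lib/python2.5/site-packages/django/bin/django-admin.py ~/bin

Make sure your ~/.bashrc contains something like:

export PATH=$HOME/bin:$PATH

Then update your current session with any changes you made to ~/.bashrc by running this command:

source ~/.bashrc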

And that should do it! Now you should be ready to get started with Django.
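
As a final check on any of the three platforms, you can ask Python whether django-admin.py is actually reachable through your PATH. This is a minimal sketch assuming Python 3; the function name find_script is hypothetical.

```python
import shutil


def find_script(name="django-admin.py"):
    """Return the full path of `name` if it can be found on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_script()
    print(location if location else "django-admin.py is not on your PATH yet")
```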

Feel free to leave a comment if you're having problems installing Django. Good luck!

Check out Installing Django on Shared Hosting.

Make Your Own iPod-Compatible Audio Books Using Linux

I like music a lot. I think I always have, and I probably always will. I like to be able to listen to good music wherever I go whenever I want. Thanks to the wonders of technology, we have a myriad of portable media devices to choose from. I personally chose an iPod nano. It's a wonderful little toy.

Anyway, as much as I like music, sometimes I feel that my time could be better used doing things more productive than just listening to music. Once I realized I felt this way, I began looking into ways to get my audio books onto my iPod. At first I simply transferred over the MP3s that came straight from the CDs. But I soon realized that this wasn't the most effective use of the iPod's audio book capabilities. So the hunt was on for some good Windows software to convert my MP3 audio books into M4B format for the iPod.

Now, I'm a pretty cheap guy when it comes to paying for software (which is probably one of the main reasons I started using Linux way back when). I found a bunch of different "free" tools that claimed to be able to convert my MP3s, but few of them actually worked well enough for me to stand using them. Eventually, I found a (very round-about) routine that allowed me to turn everything into something my iPod could understand as an audio book. I followed this routine to convert several audio books and transfer them to my iPod. I never actually finished listening to any of them completely.

Last night I started fooling around with converting my DVDs into a format my iPod could understand. When I finally got The Bourne Identity converted properly, I tried to throw it onto my iPod from my wife's Mac. It told me that I would have to erase everything (because I used my own PC to transfer my files before), and I said it was ok. I didn't have any of my original .m4b files around anymore, and so I began looking for ways of creating those audio books (in Linux this time).

It wasn't long before I stumbled upon a particularly interesting post on this exact topic. It requires the use of mp3wrap, mplayer, and faac. Pretty simple, really. Here's what you do:

# mp3wrap outputfilename *.mp3
# mplayer -vc null -vo null -ao pcm:nowaveheader:fast:file=outputfilename.pcm outputfilename_MP3WRAP.mp3
# faac -R 44100 -B 16 -C 2 -X -w -q 80 --artist "author" --album "title" --title "title" --track "1" --genre "Spoken Word" --year "year" -o outputfilename.m4b outputfilename.pcm

Nice and easy, huh? Now to decipher it all.

# mp3wrap outputfilename *.mp3

This command will stitch a bunch of MP3 files into a single MP3. This makes it easier to have a "real" audio book on your iPod.

# mplayer -vc null -vo null -ao pcm:nowaveheader:fast:file=outputfilename.pcm outputfilename_MP3WRAP.mp3

This command converts that one big MP3 file to PCM (uncompressed) format. Somewhere in the output of this command, you will see something like AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) which comes in handy for the next command:

# faac -R 44100 -B 16 -C 2 -X -w -q 80 --artist "author" --album "title" --title "title" --track "1" --genre "Spoken Word" --year "year" -o outputfilename.m4b outputfilename.pcm

Finally, this command turns the PCM file into an audio book (m4b) file. The 44100, 16, and 2 right after faac all come from that special line in the output of the mplayer command.
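
To illustrate how those three numbers map from the mplayer output to the faac options, here is a small sketch that parses the "AO:" line. It assumes the line has the same shape as the example shown above; other mplayer builds may format it differently, and the function name faac_input_options is made up for illustration.

```python
import re

# Matches lines such as: AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample)
AO_LINE = re.compile(r"AO: \[\w+\] (\d+)Hz (\d+)ch \S+ \((\d+) bytes per sample\)")


def faac_input_options(mplayer_output):
    """Build the -R/-B/-C options for faac from mplayer's audio output line."""
    match = AO_LINE.search(mplayer_output)
    if match is None:
        raise ValueError("no 'AO:' line found in the mplayer output")
    rate, channels, bytes_per_sample = (int(group) for group in match.groups())
    # faac's -B option wants bits per sample, so convert from bytes.
    return ["-R", str(rate), "-B", str(bytes_per_sample * 8), "-C", str(channels)]


if __name__ == "__main__":
    line = "AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample)"
    print(" ".join(faac_input_options(line)))
```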

As much as I like the command line, I don't like having to remember all of those parameters and options. So I decided to create a utility script (written in Python, of course) to wrap all of these commands into one simple one:

# mp3s2m4b.py BookName mp3s_directory [--quality=0..100] [--artist="artist"] [--album="album"] [--title="title"] [--genre="genre"] [--year=year] [--track=number]

While this might still seem too complex for pleasure, it does reduce a lot of the typing involved with the other three commands. All of the thingies in square brackets (like [--quality=0..100]) are optional. My script runs the commands mentioned previously in order, and suppresses all of the scary output.

I've used my script 4 or 5 different times so far, and it seems to work great. You may download it here.
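
For the curious, the general idea of wrapping the three commands can be sketched as follows. To be clear, this is not the mp3s2m4b.py script itself, just a rough illustration assuming Python 3. The function name build_commands and its parameters are hypothetical, and the sample rate, bit depth, and channel count are hard-coded to the values from the example above.

```python
import glob
import os


def build_commands(book_name, mp3_dir, quality=80, artist="", title="", year=""):
    """Return the mp3wrap, mplayer, and faac command lines as argument lists."""
    mp3s = sorted(glob.glob(os.path.join(mp3_dir, "*.mp3")))
    wrapped = book_name + "_MP3WRAP.mp3"  # mp3wrap appends this suffix itself
    pcm = book_name + ".pcm"
    return [
        ["mp3wrap", book_name] + mp3s,
        ["mplayer", "-vc", "null", "-vo", "null",
         "-ao", "pcm:nowaveheader:fast:file=" + pcm, wrapped],
        ["faac", "-R", "44100", "-B", "16", "-C", "2", "-X", "-w",
         "-q", str(quality), "--artist", artist, "--album", title,
         "--title", title, "--track", "1", "--genre", "Spoken Word",
         "--year", year, "-o", book_name + ".m4b", pcm],
    ]


if __name__ == "__main__":
    for command in build_commands("BookName", "mp3s_directory"):
        print(" ".join(command))
```

Each list could then be handed to subprocess.check_call in order, which stops at the first command that fails.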