Selenium Unit Test Reuse

Yesterday, one of the QA guys at work approached me with a question that turned out to be much more interesting to me than I think he had planned. He's been doing some unit testing using Selenium, exporting his test cases to Python. His question was this: how can I run the same unit tests using multiple browsers and multiple target servers?

I'm pretty sure he expected a simple 3-step answer or something like that. Instead, he got my crazy wide-eyed "ohhh... that's something I want to experiment with!" look. I started rambling on about inheritance, dynamic class creation, and nested for loops. His eyes started to look a little worried. He didn't really appreciate the nerdy lingo that much. I told him to pull up a chair and get comfortable.

Since I already had some other work I needed to pay attention to, I didn't want to spend too much time trying to figure out a good way to solve his problem. After about 20 minutes of devilish chuckles and frantic rustling through Python documentation, I came up with the following code:

from types import ClassType
from selenium import selenium
import unittest

IPS = ['192.168.0.1', '192.168.0.2']
BROWSERS = ['safari', 'chrome']

class SomeUnitTest(object):

    def test_something(self):
        sel = self.selenium
        # test code

def main(base):
    suites = []
    results = unittest.TestResult()

    for iidx, ip in enumerate(IPS):
        for bidx, browser in enumerate(BROWSERS):
            def setUp(self):
                self.verificationErrors = []
                self.selenium = selenium("localhost", 4444, "*%s" % self.browser, "http://%s/" % self.ip)
                self.selenium.start()

            def tearDown(self):
                self.selenium.stop()
                self.assertEqual([], self.verificationErrors)

            ut = ClassType('UT_%i_%i' % (iidx, bidx), (unittest.TestCase, base), {'ip': ip, 'browser': browser})
            ut.setUp = setUp
            ut.tearDown = tearDown

            suites.append(unittest.TestLoader().loadTestsFromTestCase(ut))

    unittest.TestSuite(suites)(results)
    for obj, error in results.errors:
        print 'In: ', obj
        print error

if __name__ == "__main__":
    main(SomeUnitTest)

I know, I know... it's got some dirty rotten tricks in it, and there are probably more efficient ways of doing what I've done. If the code offends you, look up at my previous disclaimer: I had other things I needed to be working on, so I didn't spend much time refining this. One thing I'm almost certain could be done better is not monkey patching the dynamic classes with the setUp and tearDown methods. Also, the output at the end of the test execution could definitely use some love. Oh well. Perhaps another day I'll get around to that.

Basically, you just set the servers you need to test and the browsers you want Selenium to run the tests in. Those are at the top of the script: IPS and BROWSERS. Then a new unittest.TestCase class is created for each combination of IP/server+browser. Finally, each of the test cases is thrown into a TestSuite, and the suite is processed. If there were any errors during the tests, they'll be printed out. We weren't really concerned with printing out other information, but you can certainly make other meaningful feedback appear.

Anyway, I thought that someone out there might very well benefit from my little experiment on my co-worker's question. Feel free to comment on your personal adventures with some variation of the code if you find it useful!

More django-articles Updates

I've spent a little more time lately adding new features to django-articles. There are two major additions in the latest release (2.0.0-pre2).

  • Article attachments
  • Article statuses

That's right folks! You can finally attach files to your articles. This includes attachments to emails that you send, if you have the articles from email feature properly configured. To prove it, I'm going to attach a file to this article (which I'm posting via email).

Next, I've decided that it's worth allowing the user to specify different statuses for their articles. One of the neat things about this feature is that if you are a super user, you're logged in, and you save an article with a status that is designated as "non-live", you will still be able to see it on the site. This is a way for users to preview their work before making it live. Out of the box, there are only two statuses: draft and finished. You're free to add more statuses if you feel so inclined (they're in the database, not hardcoded).

The article status is still separate from the "is_active" flag when saving an article. Any article that is marked as inactive will not appear on the site regardless of the article's "status".

On a slightly less impressive note (although still important), this release includes some basic unit tests. Most of the tests currently revolve around article statuses and making sure that the appropriate articles appear on the site.

2Ze.us Updates

There has been quite a bit of recent activity in my 2ze.us project since I first released it nearly a year ago. My intent was not to become a competitor with bit.ly, is.gd, or anyone else in the URL-shortening arena. I created the site as a way for me to learn more about Google's AppEngine. It didn't take very long to get it up and running, and it seemed to work fairly well.

AppEngine and Extensions

I was able to basically leave the site alone on AppEngine for several months--through about September 2009. In that time, I came up with a Firefox extension to make its use more convenient.

The extension allows you to quickly get a shortened URL for the page you're currently looking at, and a couple of context menu items let you get a short URL for things like specific images on a page. Also included in the extension is a preview for 2ze.us links. The preview can tell you the title and domain of the link's target. It can tell you how much smaller the 2ze.us URL is compared to the full URL. Finally, it displays how many times that particular 2ze.us link has been clicked.

That as all fine and dandy. It was the second Firefox extension I had ever written, and it's still running strong. In June or July of 2009, I started working on a little program to make it easier for me to interact with Twitter the way I wanted to. This was a great opportunity for me to incorporate 2ze.us into the application so any URL I wanted to post to Twitter would automatically be shortened for me, using my own shortener.

Porting to WebFaction And PHP

Anyway, around the end of September 2009, I noticed that there were a lot of problems with 2ze.us. It was slow and sometimes completely unresponsive. Certain URLs would redirect to their full URLs, while others wouldn't. The Firefox extension stopped working nicely. Oh yeah, and AppEngine rolled back to a previous revision of the code without me telling it to. That's when everything just died. It didn't take long for me to decide to migrate my project from AppEngine onto my awesome WebFaction hosting.

At this point, I was faced with a small dilemma: keep the code in Python, or port it to PHP. I opted to port it over to PHP, because I didn't want all of the overhead of a full Django instance for a site that needed to be very zippy. And I was unacquainted with other Python options.

By early October 2009, I had managed to turn the project into a PHP beast, running on Apache. It was a lot more responsive than AppEngine ever let 2ze.us be. There were a few bumps along the road, what with the extension and Twitter client relying on various parts of the site. Eventually it got to a point where I could just let it sit and work.

Chromium Extension

Sometime around the end of December, I decided to write another extension for 2ze.us, only for Google Chrome and Chromium this time. This extension isn't quite as feature-packed as its Firefox brother, but it gets the job done.

Clip2Zeus

Shortly after "completing" the Chromium extension, I had what seemed like a pretty original idea. Who knows if it really is, but I still haven't seen another tool quite like the one that I made as a result of this idea. I thought, "Now, why should I need to install an extension in each Web browser I use on each computer I use? Is there a better way?"

The answer came quickly: a standalone, desktop application. Write one program that handles shortening URLs for you. My laziness told me to make a program that monitors your system clipboard for URLs. If a URL is detected, try to shorten it, and update the clipboard contents in place. Boom. Done. All extensions become useless beyond things like the URL preview (which is very useful, imo).

The next question I asked was, "Do I make it platform-dependent? Should I stick it to the majority of computer users and write my tool for Linux only? For OSX only? For, uh... Windows only?" Again, an easy question to answer. Support them all or don't even bother writing the application.

A week's worth of midnight hacking saw the birth of Clip2Zeus 1.0a. It's a cross-platform compatible desktop application that does exactly what I just mentioned. When it's running and detects a URL on your system clipboard, it will try to shorten it and update it in your clipboard. If you copy a block of text, the application will only modify the URLs in that block of text--meaning the block of text will still be in your clipboard, but it will have shorter URLs.

I use the program every day at work (on OSX). It's been very fun for me to see a short URL any time I copy a nasty URL to my clipboard. Imagine that; I'm a big fan of my own work...

Tornado

Lately, I've noticed that the site was getting kind of slow again. Sometimes it would take several seconds for Clip2Zeus to shorten URLs in my clipboard, when it was normally instantaneous. Every once in a while, Clip2Zeus would completely fail to connect to the website.

One of my friends has asked me a lot of questions about the Tornado framework in the past months. I had read a few things about Tornado when it was open-sourced last year, but I didn't really feel the need to dabble with it. These questions prompted me to tinker a little.

Last night I re-ported 2ze.us to Python, using the Tornado framework this time. So far I'm very impressed with its responsiveness. The framework offers a lot of neat little utilities, and it is very fast (as reported by dozens of other reputable sources).

On top of the speed increase that came with the transition to Tornado, my RAM usage on WebFaction has come down by nearly 100MB. Just by turning off the one Apache-backed website. Now I'm nowhere near my RAM cap! Wahoo!!

Enough rambling. Like I said at the beginning of this article, a lot has been happening with this project in the past year. I didn't even think about all of the time I put into projects related to my simple little side project. Looking back, I'm quite satisfied with how things have unfolded.

Statistics

Here are some simple statistics for 2ze.us. Since March 2009...

  • 5,252 URLs have been shortened using 2ze.us
  • 2ze.us links have been clicked 198,267 times
  • 315,951 URL characters have been turned into 11,532 characters

In April 2009...

  • 217 URLs were shortened
  • 2ze.us links were clicked 617 times

In February 2010...

  • 1,182 URLs were shortened
  • 2ze.us links were clicked 32,830 times

Not too shabby for a side project.

Network Manager, Cisco VPN, And Internet

Those of us out on the eastern side of the United States are currently experiencing quite a snow storm. While this sort of storm would probably have not even made the local news in Rexburg (where my wife and I attended university), everyone is making a big deal about it around here. Part of that big deal included the option, and even recommendation, that we work from home on Friday, using the company VPN to take care of our tasks.

I was pretty excited at the idea of working from home once again (my last job was almost exclusively a work-at-home gig), so I made sure I was able to connect to the VPN a few days ago, after receiving the credentials. It took a few tries to get everything right in Windows, but eventually it started working quite well. Then I tried connecting from Linux, using the awesomeness known as Network Manager.

Since I'm currently on Fedora 12, all I had to do was make sure that I had network-manager-vpnc installed, and I could then configure a connection using the same credentials I used in Windows. I had a successful connection on the very first try, and it was working fabulously. I had access to all of my development machines and all of the tools I use on a daily basis.

It didn't take long, however, for me to notice a big problem: no Internet access. I could get to any machine I dang well pleased on the company network, but nothing on the Internet. Quite frustrating, to say the least.

I decided to leave the investigation as to why I had no Internet access and how to fix it for another night. Here I am now, tinkering with it again. I found out what I needed to change:

  • Right click on the Network Manager icon in the system tray, and select "Edit Connections..."
  • Click on the VPN tab
  • Edit your VPN connection
  • Click on the "IPv4 Settings" tab
  • Click the "Routes..." button
  • Make sure that the "Use this connection only for resources on its network" option is checked
  • Connect to your VPN, and enjoy access to the devices there as well as on the Internet!

Hopefully this saves someone else's sanity (Jeremy?)

Auto-Generating Documentation Using Mercurial, ReST, and Sphinx

I often find myself taking notes about various aspects of my job that I feel I would forget as soon as I moved onto another project. I've gotten into the habit of taking my notes using reStructured Text, which shouldn't come as any surprise to any of my regular visitors. On several occasions, I had some of the other guys in the company ask me for some clarification on some things I had taken notes on. Lucky for me, I had taken some nice notes!

However, these individuals probably wouldn't appreciate reading ReST markup as much as I do, so I decided to do something nice for them. I setup Sphinx to prettify my documentation. I then wrote a small Web server using Python, so people within the company network could access the latest version of my notes without much hassle.

Just like I take notes to remind myself of stuff at work, I want to do that again for this automated ReST->HTML magic--I want to be able to do this in the future! I figured I would make my notes even more public this time, so you all can enjoy similar bliss.

Platform Dependence

I am writing this article with UNIX-like operating systems in mind. Please forgive me if you're a Windows user and some of this is not consistent with what you're seeing. Perhaps one day I'll try to set this sort of thing up on Windows.

Installing Sphinx

The first step that we want to take is installing Sphinx. This is the project that Python itself uses to generate its online documentation. It's pretty dang awesome. Feel free to skip this section if you have already installed Sphinx.

Depending on your environment of choice, you may or may not have a package manager that offers python-sphinx or something along those lines. I personally prefer to install it using pip or easy_install:

$ sudo pip install sphinx

Running that command will likely respond with a bunch of output about downloading Sphinx and various dependencies. When I ran it in my sandbox VM, I saw it install the following packages:

  • pygments
  • jinja2
  • docutils
  • sphinx

It should be a pretty speedy installation.

Installing Mercurial

We'll be using Mercurial to keep track of changes to our ReST documentation. Mercurial is a distributed version control system that is built using Python. It's wonderful! Just like with Sphinx, if you have already installed Mercurial, feel free to skip to the next section.

I personally prefer to install Mercurial using pip or easy_install--it's usually more up-to-date than what you would have in your package repositories. To do that, simply run a command such as the following:

$ sudo pip install mercurial

This will go out and download and install the latest stable Mercurial. You may need python-dev or something like that for your platform in order for that command to work. However, if you're on Windows, I highly recommend TortoiseHg. The installer for TortoiseHg will install a graphical Mercurial client along with the command line tools.

Create A Repository

Now let's create a brand new Mercurial repository to house our notes/documentation. Open a terminal/console/command prompt to the location of your choice on your computer and execute the following commands:

$ hg init mydox
$ cd mydox

Configure Sphinx

The next step is to configure Sphinx for our project. Sphinx makes this very simple:

$ sphinx-quickstart

This is a wizard that will walk you through the configuration process for your project. It's pretty safe to accept the defaults, in my opinion. Here's the output of my wizard:

$ sphinx-quickstart
Welcome to the Sphinx quickstart utility.

Please enter values for the following settings (just press Enter to
accept a default value, if one is given in brackets).

Enter the root path for documentation.
> Root path for the documentation [.]:

You have two options for placing the build directory for Sphinx output.
Either, you use a directory "_build" within the root path, or you separate
"source" and "build" directories within the root path.
> Separate source and build directories (y/N) [n]: y

Inside the root directory, two more directories will be created; "_templates"
for custom HTML templates and "_static" for custom stylesheets and other static
files. You can enter another prefix (such as ".") to replace the underscore.
> Name prefix for templates and static dir [_]:

The project name will occur in several places in the built documentation.
> Project name: My Dox
> Author name(s): Josh VanderLinden

Sphinx has the notion of a "version" and a "release" for the
software. Each version can have multiple releases. For example, for
Python the version is something like 2.5 or 3.0, while the release is
something like 2.5.1 or 3.0a1.  If you don't need this dual structure,
just set both to the same value.
> Project version: 0.0.1
> Project release [0.0.1]:

The file name suffix for source files. Commonly, this is either ".txt"
or ".rst".  Only files with this suffix are considered documents.
> Source file suffix [.rst]:

One document is special in that it is considered the top node of the
"contents tree", that is, it is the root of the hierarchical structure
of the documents. Normally, this is "index", but if your "index"
document is a custom template, you can also set this to another filename.
> Name of your master document (without suffix) [index]:

Please indicate if you want to use one of the following Sphinx extensions:
> autodoc: automatically insert docstrings from modules (y/N) [n]:
> doctest: automatically test code snippets in doctest blocks (y/N) [n]:
> intersphinx: link between Sphinx documentation of different projects (y/N) [n]:
> todo: write "todo" entries that can be shown or hidden on build (y/N) [n]:
> coverage: checks for documentation coverage (y/N) [n]:
> pngmath: include math, rendered as PNG images (y/N) [n]:
> jsmath: include math, rendered in the browser by JSMath (y/N) [n]:
> ifconfig: conditional inclusion of content based on config values (y/N) [n]:

A Makefile and a Windows command file can be generated for you so that you
only have to run e.g. `make html' instead of invoking sphinx-build
directly.
> Create Makefile? (Y/n) [y]:
> Create Windows command file? (Y/n) [y]: n

Finished: An initial directory structure has been created.

You should now populate your master file ./source/index.rst and create other documentation
source files. Use the Makefile to build the docs, like so:
   make builder
where "builder" is one of the supported builders, e.g. html, latex or linkcheck.

If you followed the same steps I did (I separated the source and build directories), you should see three new files in your mydox repository:

  • build/
  • Makefile
  • source/

We'll do our work in the source directory.

Get Some ReST

Now is the time when we start writing some ReST that we want to turn into HTML using Sphinx. Open some file, like first_doc.rst and put some ReST in it. If nothing comes to mind, or you're not familiar with ReST syntax, try the following:

=========================
This Is My First Document
=========================

Yes, this is my first document.  It's lame.  Deal with it.

Save the file (keep in mind that it should be within the source directory if you used the same settings I did). Now it's time to add it to the list of files that Mercurial will pay attention to. While we're at it, let's add the other files that were created by the Sphinx configuration wizard:

$ hg add
adding ../Makefile
adding conf.py
adding first_doc.rst
adding index.rst
$ hg st
A Makefile
A source/conf.py
A source/first_doc.py
A source/index.rst

Don't worry that we don't see all of the directories in the output of hg st--Mercurial tracks files, not directories.

Automate HTML-ization

Here comes the magic in automating the conversion from ReST to HTML: Mercurial hooks. We will use the precommit hook to fire off a command that tells Sphinx to translate our ReST markup into HTML.

Edit your mydox/.hg/hgrc file. If the file does not yet exist, go ahead and create it. Add the following content to it:

[hooks]
precommit.sphinxify = ~/bin/sphinxify_docs.sh

I've opted to call a Bash script instead of using an inline Python call. Now let's create the Bash script, ~/bin/sphinxify_docs.sh:

#!/bin/bash
cd $HOME/mydox
sphinx-build source/ docs/

Notice that I used the $HOME environment variable. This means that I created the mydox directory at /home/myusername/mydox. Adjust that line according to your setup. You'll probably also want to make that script executable:

$ chmod +x ~/bin/sphinxify_docs.sh

Three, Two, One...

You should now be at a stage where you can safely commit changes to your repository and have Sphinx build your HTML documentation. Execute the following command somewhere under your mydox repository:

$ hg ci -m "Initial commit"

If your setup is anything like mine, you should see some output similar to this:

$ hg ci -m "Initial commit"
Making output directory...
Running Sphinx v0.6.4
No builder selected, using default: html
loading pickled environment... not found
building [html]: targets for 2 source files that are out of date
updating environment: 2 added, 0 changed, 0 removed
reading sources... [100%] index
looking for now-outdated files... none found
pickling environment... done
checking consistency... /home/jvanderlinden/mydox/source/first_doc.rst:: WARNING: document isn't included in any toctree
done
preparing documents... done
writing output... [100%] index
writing additional files... genindex search
copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded, 1 warning.
$ hg st
? docs/.buildinfo
? docs/.doctrees/environment.pickle
? docs/.doctrees/first_doc.doctree
? docs/.doctrees/index.doctree
? docs/_sources/first_doc.txt
? docs/_sources/index.txt
? docs/_static/basic.css
? docs/_static/default.css
? docs/_static/doctools.js
? docs/_static/file.png
? docs/_static/jquery.js
? docs/_static/minus.png
? docs/_static/plus.png
? docs/_static/pygments.css
? docs/_static/searchtools.js
? docs/first_doc.html
? docs/genindex.html
? docs/index.html
? docs/objects.inv
? docs/search.html
? docs/searchindex.js

If you see something like that, you're in good shape. Go ahead and take a look at your new mydox/docs/index.html file in the Web browser of your choosing.

Not very exciting, is it? Notice how your first_doc.rst doesn't appear anywhere on that page? That's because we didn't tell Sphinx to put it there. Let's do that now.

Customizing Things

Edit the mydox/source/index.rst file that was created during Sphinx configuration. In the section that starts with .. toctree::, let's tell Sphinx to include everything we ReST-ify:

.. toctree::
   :maxdepth: 2
   :glob:

   *

That should do it. Now, I don't know about you, but I don't really want to include the output HTML, images, CSS, JS, or anything in my documentation repository. It would just take up more space each time we change an .rst file. Let's tell Mercurial to not pay attention to the output HTML--it'll just be static and always up-to-date on our filesystem.

Create a new file called mydox/.hgignore. In this file, put the following content:

syntax: glob
docs/

Save the file, and you should now see something like the following when running hg st:

$ hg st
M source/index.rst
? .hgignore

Let's include the .hgignore file in the list of files that Mercurial will track:

$ hg add .hgignore
$ hg st
M source/index.rst
A .hgignore

Finally, let's commit one more time:

$ hg ci -m "Updating the index to include our .rst files"
Running Sphinx v0.6.4
No builder selected, using default: html
loading pickled environment... done
building [html]: targets for 1 source files that are out of date
updating environment: 0 added, 1 changed, 0 removed
reading sources... [100%] index
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] index
writing additional files... genindex search
copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded.

Tada!! The first_doc.rst should now appear on the index page.

Serving Your Documentation

Who seriously wants to have HTML files that are hard to get to? How can we make it easier to access those HTML files? Perhaps we can create a simple static file Web server? That might sound difficult, but it's really not--not when you have access to Python!

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from BaseHTTPServer import HTTPServer
from SimpleHTTPServer import SimpleHTTPRequestHandler

def main():
    try:
        server = HTTPServer(('', 80), SimpleHTTPRequestHandler)
        server.serve_forever()
    except KeyboardInterrupt:
        server.socket.close()

if __name__ == '__main__':
    main()

I created this simple script and put it in my ~/bin/ directory, also making it executable. Once that's done, you can navigate to your mydox/docs/ directory and run the script. Since I called the script webserver.py, I just do this:

$ cd ~/mydox/docs
$ sudo webserver.py

This makes it possible for you to visit http://localhost/ on your own computer, or to use your computer's IP in place of localhost to access your documentation from a different computer on your network. Pretty slick, if you ask me.

I suppose there's more I could add, but that's all I have time for tonight. Enjoy!

Announcing: Clip2Zeus

Sometime last year, I embarked on a mission to create my own TinyURL or bit.ly. This project had no real purpose other than to help me learn how to use Google's AppEngine. All of the URL-shortening services I had tried up to that point were perfectly satisfactory for my needs, but I wanted to explore a little.

It didn't take long for me to come up with the site that is now 2ze.us. I learned some neat things about AppEngine, and the site worked well enough for my needs (just like the others). Eventually I wrote a Firefox extension to make it easier to use the site. It offers the ability to quickly shorten "any" URL, and it also has a preview utility. This allows you to hover your cursor over a 2ze.us link and learn various bits of information about it--target domain name, the target page's title, number of hits, etc.

Toward the end of 2009, I started writing the same sort of extension for Chrome/Chromium. It offers pretty much the same sort of functionality as its Firefox brother, minus keyboard shortcuts.

Before long, I found myself embarking on another 2zeus-related endeavor. This new project is one that I am actually quite proud of and satisfied with. I wrote a program that will run in the background on your computer. I call it "Clip2Zeus". This program will periodically poll your clipboard, looking for URLs in whatever text you currently have on it. If any URLs are found, the program will run out to 2ze.us and try to shorten them. Once a valid result comes back from 2ze.us, your clipboard is automatically updated with the original URLs replaced by the shortened version.

It doesn't stop there, though. You can control the program using a couple of interfaces. One interface is a Tk GUI, which allows you to set the polling interval or turn off polling altogether. Should you choose to do that, you can click a button in the GUI any time you explicitly want to shorten URLs in your clipboard. There is another command line interface that offers the same sort of functionality.

I've been using this program on several computers for a couple of weeks, and I haven't noticed any memory/performance problems at all. It works just as well on Windows as it does on Linux, and just as well on OSX as it does on Linux. It just sits there silently until you give it a URL. It works with any program that can access the standard clipboard mechanism for whatever OS you're using.

You can download and install it using easy_install or pip. Or you can download it and install it directly from http://pypi.python.org/pypi/Clip2Zeus/

PyPI Download Stats

Every so often I find myself in need of a small ego boost (or reality check). One of the things I've done in the past to satisfy such a need is go to the PyPI and see how many downloads my packages have. Depending on how much time I have or how much effort I want to put into my pride, I may or may not check the download stats for all releases of each package.

A couple of weeks ago, I was in the mood for an ego boost. It was actually an every day thing for nearly a week! So, instead of wasting a lot of time checking download stats for each version of each package I have on PyPI, I wrote a script to do it for me. It uses the XML-RPC API that PyPI offers.

Here she is!

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
Calculates the total number of downloads that a particular PyPI package has
received across all versions tracked by PyPI
"""

from datetime import datetime
import locale
import sys
import xmlrpclib

locale.setlocale(locale.LC_ALL, '')

class PyPIDownloadAggregator(object):

    def __init__(self, package_name, include_hidden=True):
        self.package_name = package_name
        self.include_hidden = include_hidden
        self.proxy = xmlrpclib.Server('http://pypi.python.org/pypi')
        self._downloads = {}

        self.first_upload = None
        self.first_upload_rel = None
        self.last_upload = None
        self.last_upload_rel = None

    @property
    def releases(self):
        """Retrieves the release number for each uploaded release"""

        result = self.proxy.package_releases(self.package_name, self.include_hidden)

        if len(result) == 0:
            # no matching package--search for possibles, and limit to 15 results
            results = self.proxy.search({
                'name': self.package_name,
                'description': self.package_name
            }, 'or')[:15]

            # make sure we only get unique package names
            matches = []
            for match in results:
                name = match['name']
                if name not in matches:
                    matches.append(name)

            # if only one package was found, return it
            if len(matches) == 1:
                self.package_name = matches[0]
                return self.releases

            error = """No such package found: %s

Possible matches include:
%s
""" % (self.package_name, '\n'.join('\t- %s' % n for n in matches))

            sys.exit(error)

        return result

    @property
    def downloads(self, force=False):
        """Calculate the total number of downloads for the package"""

        if len(self._downloads) == 0 or force:
            for release in self.releases:
                urls = self.proxy.release_urls(self.package_name, release)
                self._downloads[release] = 0
                for url in urls:
                    # upload times
                    uptime = datetime.strptime(url['upload_time'].value, "%Y%m%dT%H:%M:%S")
                    if self.first_upload is None or uptime < self.first_upload:
                        self.first_upload = uptime
                        self.first_upload_rel = release

                    if self.last_upload is None or uptime > self.last_upload:
                        self.last_upload = uptime
                        self.last_upload_rel = release

                    self._downloads[release] += url['downloads']

        return self._downloads

    def total(self):
        return sum(self.downloads.values())

    def average(self):
        return self.total() / len(self.downloads)

    def max(self):
        return max(self.downloads.values())

    def min(self):
        return min(self.downloads.values())

    def stats(self):
        """Prints a nicely formatted list of statistics about the package"""

        self.downloads # explicitly call, so we have first/last upload data
        fmt = locale.nl_langinfo(locale.D_T_FMT)
        sep = lambda s: locale.format('%d', s, 3)
        val = lambda dt: dt and dt.strftime(fmt) or '--'

        params = (
            self.package_name,
            val(self.first_upload),
            self.first_upload_rel,
            val(self.last_upload),
            self.last_upload_rel,
            sep(len(self.releases)),
            sep(self.max()),
            sep(self.min()),
            sep(self.average()),
            sep(self.total()),
        )

        print """PyPI Package statistics for: %s

    First Upload: %40s (%s)
    Last Upload:  %40s (%s)
    Number of releases: %34s
    Most downloads:    %35s
    Fewest downloads:  %35s
    Average downloads: %35s
    Total downloads:   %35s
""" % params

def main():
    if len(sys.argv) < 2:
        sys.exit('Please specify at least one package name')

    for pkg in sys.argv[1:]:
        PyPIDownloadAggregator(pkg).stats()

if __name__ == '__main__':
    main()

Usage is pretty simple. All you need to do is call the script (I called it pypi_downloads.py with the name or names of the package(s) you want download stats for:

bash-4.0$ ./pypi_downloads.py clip2zeus
PyPI Package statistics for: Clip2Zeus

    First Upload:             Sun 10 Jan 2010 03:25:30 AM  (0.1)
    Last Upload:              Mon 18 Jan 2010 06:58:42 PM  (0.9d)
    Number of releases:                                 12
    Most downloads:                                     41
    Fewest downloads:                                   21
    Average downloads:                                  28
    Total downloads:                                   342

And there you have it!

Review: Django 1.0 Web Site Development

Introduction

Several months ago, a UK-based book publisher, Packt Publishing contacted me to ask if I would be willing to review one of their books about Django. I gladly jumped at the opportunity, and I received a copy of the book a couple of weeks later in the mail. This happened at the beginning of September 2009. It just so happened that I was in the process of being hired on by ScienceLogic right when all of this took place. The subsequent weeks were filled to the brim with visitors, packing, moving, finding an apartment, and commuting to my new job. It was pretty stressful.

Things are finally settling down, so I've taken the time to actually review the book I was asked to review. I should mention right off the bat that this is indeed a solicited review, but I am in no way influenced to write a good or bad review. Packt Publishing simply wants me to offer an honest review of the book, and that is what I indend to do. While reviewing the book, I decided to follow along and write the code the book introduced. I made sure that I was using the official Django 1.0 release instead of using trunk like I tend to do for my own projects.

The title of the book is Django 1.0 Web Site Development, written by Ayman Hourieh, and it's only 250 pages long. Ayman described the audience of the book as such:

This book is for web developers who want to learn how to build a complete site with Web 2.0 features, using the power of a proven and popular development system--Django--but do not necessarily want to learn how a complete framework functions in order to do this. Basic knowledge of Python development is required for this book, but no knowledge of Django is expected.

Ayman introduced Django piece by piece using the end goal of a social bookmarking site, a la del.icio.us and reddit. In the first chapter of the book, Ayman discussed the history of Django and why Python and Django are a good platform upon which to build Web applications. The second chapter offers a brief guide to installing Python and Django, and getting your first project setup. Not much to comment on here.

Digging In

Chapter three is where the reader was introduced to the basic structure of a Django project, and the initial data models were described. Chapter four discussed user registration and management. We made it possible for users to create accounts, log into them, and log out again. As part of those additions, the django.forms framework was introduced.

In chapter five, we made it possible for bookmarks to be tagged. Along with that, we built a tag cloud, restricted access to certain pages, and added a little protection against malicious data input. Next up was the section where things actually started getting interesting for me: enhancing the interface with fancy effects and AJAX. The fancy effects include live searching for bookmarks, being able to edit a bookmark in place (without loading a new page), and auto-completing tags when you submit a bookmark.

This chapter really reminded me just how simple it is to add new, useful features to existing code using Django and Python. I was thoroughly impressed at how easy it was to add the AJAX functionality mentioned above. Auto-completing the tags as you type, while jQuery and friends did most of the work, was very easy to implement. It made me happy.

Chapter seven introduced some code that allowed users to share their bookmarks with others. Along with this, the ability to vote on shared bookmarks was added. Another feature that was added in this chapter was the ability for users to comment on various bookmarks.

The ridiculously amazing Django Administration utility was first introduced in chapter eight. It kinda surprised me that it took 150 pages before this feature was brought to the user's attention. In my opinion, this is one of the most useful selling points when one is considering a Web framework for a project. When I first encountered Django, the admin interface was one of maybe three deciding factors in our company's decision to become a full-on Django shop.

Bring on the Web 2.0

Anyway, in chapter nine, we added a handful of useful "Web 2.0" features. RSS feeds were introduced. We learned about pagination to enhance usability and performance. We also improved the search engine in our project. At this stage, the magical Q objects were mentioned. The power behind the Q objects was discussed very well, in my opinion.

In chapter 10, we were taught how we can create relationships between members on the site. We made it possible for users to become "friends" so they can see the latest bookmarks posted by their friends. We also added an option for users to be able to invite some of their other friends to join the site via email, complete with activation links. Finally, we improved the user interface by providing a little bit of feedback to the user at various points using the messages framework that is part of the django.contrib.auth package in Django 1.0.

More advanced topics, such as internationalization and caching, were discussed in chapter 11. Django's special unit testing features were also introduced in chapter 11. This section actually kinda frustrated me. Caching was discussed immediately before unit testing. In the caching section, we learned how to enable site-wide caching. This actually broke the unit tests. They failed because the caching system was "read only" while running the tests. Anyway, it's probably more or less a moot point.

Chapter 11 also briefly introduced things to pay attention to when you deploy your Django projects into a production environment. This portion was mildly disappointing, but I don't know what else would have made it better. There are so many functional ways to deploy Django projects that you could write books just to describe the minutia involved in deployment.

The twelfth and final chapter discussed some of the other things that Django has to offer, such as enhanced functionality in templates using custom template tags and filters and model managers. Generic views were mentioned, and some of the other useful things in django.contrib were brought up. Ayman also offered a few ideas of additional functionality that the reader can implement on their own, using the things they learned throughout the book.

Afterthoughts

Overall, I felt that this book did a great job of introducing the power that lies in using Django as your framework of choice. I thought Ayman managed to break things up into logical sections, and that the iterations used to enhance existing functionality (from earlier chapters) were superbly executed. I think that this book, while it does assume some prior Python knowledge, would be a fine choice for those who are curious to dig into Django quickly and easily.

Some of the beefs I have with this book deal mostly with the editing. There were a lot of strange things that I found while reading through the book. However, the biggest sticking point for me has to do with "pluggable" applications. Earlier I mentioned that the built-in Django admin was one of only a few deciding factors in my company's choice to become a Django shop. Django was designed to allow its applications to be very "pluggable."

You may be asking, "What do I mean by 'pluggable'?" Well, say you decide to build a website that includes a blog, so you build a Django project and create an application specific to blogging. Then, at some later time, you need to build another site that also has blog functionality. Do you want to rewrite all of the blogging code for the second site? Or do you want to use the same code that you used in the first site (without copying it)? If you're anything like me and thousands of other developers out there, you would probably rather leverage the work you had already done. Django allows you to do this if you build your Django applications properly.

This book, however, makes no such effort to teach the reader how to turn all of their hard work on the social bookmarking features into something they could reuse over and over with minimal effort in the future. Application-specific templates are placed directly into the global templates directory. Application-specific URLconfs are placed in the root urls.py file. I would have liked to see at least some effort to make the bookmarking application have the potential to be reused.

Finally, the most obvious gripe is that the book is outdated. That's understandable, though! Anything in print media will likely be outdated the second it is printed if the book has anything to do with computers. However, with the understanding that this book was written specifically for Django 1.0 and not Django 1.1 or 1.2 alpha, it does an excellent job at hitting the mark.

Tip: easy_install / pip

With all of the exciting updates to Mercurial recently, I've been on a rampage, updating various boxes everywhere I go. I'm in the habit of using easy_install and/or pip to install most of my Python-related packages. It's pretty easy to install packages that are in well-known locations (like PyPI or on Google Code, for example). It's also pretty easy to update packages using either utility. Both take a -U parameter, which, to my knowledge, tells it to actually check for updates and install the latest version.

That's all fine and dandy, but what happens when you want to install an "unofficial" version of some package? I mean, what if your favorite project all of the sudden includes some feature that you will die unless you can have access to it and the next official version is weeks or months in the future? There are typically a few avenues you can take to satisfy your needs, but I wanted to bring up something that I think not many people are aware of: easy_install and pip can both understand URLs to installable Python packages.

What do I mean by that, you ask? Well, when you get down to the basics of what both utilities do, they just take care of downloading some Python package and installing it with the setup.py file contained therein. In many cases, these utilities will search various package repositories, such as PyPI, to download whatever package you specify. If the package is found, it will be downloaded and extracted.

In most cases, you can do all of that yourself:

$ wget http://pypi.python.org/someproject/somepackage.tar.gz
$ tar zxf somepackage.tar.gz
$ cd somepackage
$ python setup.py install

Both easy_install and pip obviously do a lot of other magic, but that is perhaps the most basic way to understand what they do. To answer that last question, you can help your utility of choice out by specifying the exact URL to the specific package you want it to install for you:

$ easy_install http://pypi.python.org/someproject/somepackage.tar.gz
$ pip install http://pypi.python.org/someproject/somepackage.tar.gz

For me, this feature comes in very handy with projects that are hosted on BitBucket, for example, because you can always get any revision of the project in a tidy .tar.gz file. So when I'm updating Mercurial installations, I can do this to get the latest stable revision:

$ easy_install http://selenic.com/repo/hg-stable/archive/tip.tar.gz

It's pretty slick. Here's a full example:

[user@web ~]$ hg version
Mercurial Distributed SCM (version 1.2.1)

Copyright (C) 2005-2009 Matt Mackall <mpm@selenic.com> and others
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[user@web ~]$ easy_install http://selenic.com/repo/hg-stable/archive/tip.tar.gz
Downloading http://selenic.com/repo/hg-stable/archive/tip.tar.gz
Processing tip.tar.gz
Running Mercurial-stable-branch--8bce1e0d2801/setup.py -q bdist_egg --dist-dir /tmp/easy_install-Gnk2c9/Mercurial-stable-branch--8bce1e0d2801/egg-dist-tmp--2VAce
zip_safe flag not set; analyzing archive contents...
mercurial.help: module references __file__
mercurial.templater: module references __file__
mercurial.extensions: module references __file__
mercurial.i18n: module references __file__
mercurial.lsprof: module references __file__
Removing mercurial unknown from easy-install.pth file
Adding mercurial 1.4.1-4-8bce1e0d2801 to easy-install.pth file
Installing hg script to /home/user/bin

Installed /home/user/lib/python2.5/mercurial-1.4.1_4_8bce1e0d2801-py2.5-linux-i686.egg
Processing dependencies for mercurial==1.4.1-4-8bce1e0d2801
Finished processing dependencies for mercurial==1.4.1-4-8bce1e0d2801
[user@web ~]$ hg version
Mercurial Distributed SCM (version 1.4.1+4-8bce1e0d2801)

Copyright (C) 2005-2009 Matt Mackall <mpm@selenic.com> and others
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Notice the version change from 1.2.1 to 1.4.1+4-8bce1e0d2801. w00t.

Edit: devov pointed out that pip is capable of installing packages directly from its repository. I've never used this functionality, but I'm interested in trying it out sometime! Thanks devov!

Mercurial 1.4.1 Released

I just noticed that Mercurial 1.4.1 was released today. Most of the changes are pretty minor, but I wanted to voice my appreciation for a new extension that is included with this release: schemes.

This extension basically makes your life easier by shortening redundant URLs for you. For example, you can now use the following command to snag my simple Mercurial extensions repo from BitBucket:

hg clone bb://codekoala/hgext

Without hgext.schemes, that command would be something like one of the following commands:

hg clone http://bitbucket.org/codekoala/hgext
hg clone ssh://hg@bitbucket.org/codekoala/hgext

Not the most ground-breaking of extensions, but still pretty slick!