Quick And Easy Execution Speed Testing

Many times while programming, I encounter a problem that probably involves a loop of some sort, and I think of two or more possible ways to achieve the same end result. At this point, I usually think about which one will probably be the fastest solution (execution-wise) while still being readable/maintainable. A lot of the time, the essentials of the problem can be tested in a few short lines of code.

A while back, I was perusing some Stack Overflow questions for work, and I stumbled upon what I consider one of the many hidden jewels in Python: the timeit module. Given a bit of code, this little guy will handle executing it in several loops and giving you the best time out of three trials (you can ask it to do more than 3 runs if you want). Once it completes its test, it will offer some very clean and useful output.

For example, today I encountered a piece of code that was making a comma-separated list of an arbitrary number of "%s". The code I saw essentially looked like this:

",".join(["%s"] * 50000)

Even though this code required no optimization, I thought, "Hey, that's neat... I wonder if a list comprehension could possibly be any faster." Here's an example of the contender:

",".join(["%s" for i in xrange(50000)])

I had no idea which would be faster, so timeit to the rescue!! Open up a terminal, type a couple of one-line Python commands, and enjoy the results!

$ python -mtimeit 'l = ",".join(["%s"] * 50000)'
1000 loops, best of 3: 1.15 msec per loop
$ python -mtimeit 'l = ",".join(["%s" for i in xrange(50000)])'
100 loops, best of 3: 3.23 msec per loop

Hah, the list comprehension is certainly slower: 3.23 msec per loop versus 1.15 msec, so almost three times as slow.
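The same comparison can also be run from inside a script instead of the terminal. Here's a rough sketch of how that might look. The best_time helper and its default loop counts are my own assumptions (they're not part of timeit), and I've used range rather than xrange so the sketch runs on Python 3 as well.

```python
import timeit

# The two contenders from above. range is used instead of xrange so
# this also runs on Python 3 (an assumption, the original used xrange).
MULTIPLY = 'l = ",".join(["%s"] * 50000)'
COMPREHENSION = 'l = ",".join(["%s" for i in range(50000)])'


def best_time(stmt, number=100, repeat=3):
    """Hypothetical helper: best per-loop time in seconds for stmt."""
    # timeit.repeat returns one total time per trial; each trial runs
    # the statement `number` times.
    totals = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(totals) / number


if __name__ == '__main__':
    for label, stmt in (('multiply', MULTIPLY),
                        ('comprehension', COMPREHENSION)):
        print('%s: %.3f msec per loop' % (label, best_time(stmt) * 1000))
```

Taking the minimum of the trials mirrors what the command-line tool reports as "best of 3".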

Now, for other more in-depth tests of performance, you might consider using the cProfile module. As far as I can tell, simple one-liners can't be tested directly from the command line using cProfile--they apparently need to be in a script. You can use something like:

python -mcProfile script.py

...in such situations. Or you can wrap function calls using cProfile.run():

import cProfile

def function_a():
    # something you want to profile
    pass

def function_b():
    # an alternative version of function_a to profile
    pass

if __name__ == '__main__':
    cProfile.run('function_a()')
    cProfile.run('function_b()')

I've used this technique for tests that I'd like to have "hard evidence" for in the future. The output of such a cProfile test looks something like this:

3 function calls in 6.860 CPU seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    6.860    6.860 <string>:1(<module>)
     1    6.860    6.860    6.860    6.860 test_enumerate.py:5(test_enumerate)
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

This is useful when your code is calling other functions or methods and you want to find where your bottlenecks are. Hooray for Python!
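When the profiled code calls a lot of other functions, the default "standard name" ordering can bury the interesting rows. Here's a minimal sketch, assuming Python 3, that sorts the report by cumulative time using the pstats module. The names slow_helper, function_a, and profile_report are made up for illustration.

```python
import cProfile
import io
import pstats


def slow_helper(n):
    # Stand-in for the real work you want to measure.
    return sum(i * i for i in range(n))


def function_a():
    return [slow_helper(2000) for _ in range(50)]


def profile_report(func, limit=5):
    """Profile func() and return the top `limit` rows by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats('cumulative').print_stats(limit)
    return stream.getvalue()


if __name__ == '__main__':
    print(profile_report(function_a))
```

Sorting by cumulative time puts the functions that account for the most total time (including the functions they call) at the top, which is usually where I'd start looking for a bottleneck.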

What profiling techniques do you use?

Whoa! Another Reason To Love Vim

I've been struggling with some misconfigured appliances at work for the past couple of days, and I was getting tired of manually diff-ing things. On a whim, I decided to ask Google if there is a better way. Turns out there is, and it uses what I already know and love: VIM. Here's a command that lets you diff two remote files using vimdiff:

vimdiff scp://user@host//path/to/file scp://user@otherhost//path/to/file

This is going to save me so much time! I hope it is as useful to you all as it is to me.

SVN Commits By User

The other day at work, I found myself needing to see a list of Subversion commits by a specific user. I spent a few minutes looking at the svn log help, but nothing seemed to be designed to show commits by user. It took me a while to find something to do the trick, but this is it:

svn log | sed -n '/username/,/-----$/ p'

Gotta love sed!
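One catch with the sed approach is that it matches the username anywhere on a line, so a commit message that happens to mention the user can also start a match. Here's a rough Python sketch that checks only the author field of each entry instead. It assumes the usual svn log layout (entries separated by a line of 72 dashes, with a "revision | author | date | lines" header), and the function names are my own.

```python
import sys

# Assumption: svn log separates entries with a line of 72 dashes.
SEPARATOR = '-' * 72


def split_entries(log_text):
    """Yield each log entry as a list of lines, without the separators."""
    entry = []
    for line in log_text.splitlines():
        if line.strip() == SEPARATOR:
            if entry:
                yield entry
            entry = []
        else:
            entry.append(line)
    if entry:
        yield entry


def commits_by_user(log_text, username):
    """Return the entries whose header names `username` as the author."""
    matches = []
    for entry in split_entries(log_text):
        # Header looks like: r123 | author | date | N lines
        fields = [field.strip() for field in entry[0].split('|')]
        if len(fields) >= 2 and fields[1] == username:
            matches.append('\n'.join(entry))
    return matches


if __name__ == '__main__':
    for commit in commits_by_user(sys.stdin.read(), sys.argv[1]):
        print(SEPARATOR)
        print(commit)
```

Saved as, say, commits_by_user.py (a hypothetical filename), it could be used like this: svn log | python commits_by_user.py username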

Selenium Unit Test Reuse

Yesterday, one of the QA guys at work approached me with a question that turned out to be much more interesting to me than I think he had planned. He's been doing some unit testing using Selenium, exporting his test cases to Python. His question was this: how can I run the same unit tests using multiple browsers and multiple target servers?

I'm pretty sure he expected a simple 3-step answer or something like that. Instead, he got my crazy wide-eyed "ohhh... that's something I want to experiment with!" look. I started rambling on about inheritance, dynamic class creation, and nested for loops. His eyes started to look a little worried. He didn't really appreciate the nerdy lingo that much. I told him to pull up a chair and get comfortable.

Since I already had some other work I needed to pay attention to, I didn't want to spend too much time trying to figure out a good way to solve his problem. After about 20 minutes of devilish chuckles and frantic rustling through Python documentation, I came up with the following code:

from types import ClassType
from selenium import selenium
import unittest

IPS = ['192.168.0.1', '192.168.0.2']
BROWSERS = ['safari', 'chrome']

class SomeUnitTest(object):

    def test_something(self):
        sel = self.selenium
        # test code

def main(base):
    suites = []
    results = unittest.TestResult()

    for iidx, ip in enumerate(IPS):
        for bidx, browser in enumerate(BROWSERS):
            def setUp(self):
                self.verificationErrors = []
                self.selenium = selenium("localhost", 4444, "*%s" % self.browser, "http://%s/" % self.ip)
                self.selenium.start()

            def tearDown(self):
                self.selenium.stop()
                self.assertEqual([], self.verificationErrors)

            ut = ClassType('UT_%i_%i' % (iidx, bidx), (unittest.TestCase, base), {'ip': ip, 'browser': browser})
            ut.setUp = setUp
            ut.tearDown = tearDown

            suites.append(unittest.TestLoader().loadTestsFromTestCase(ut))

    unittest.TestSuite(suites)(results)
    for obj, error in results.errors:
        print 'In: ', obj
        print error

if __name__ == "__main__":
    main(SomeUnitTest)

I know, I know... it's got some dirty rotten tricks in it, and there are probably more efficient ways of doing what I've done. If the code offends you, look up at my previous disclaimer: I had other things I needed to be working on, so I didn't spend much time refining this. One thing I'm almost certain could be done better is not monkey patching the dynamic classes with the setUp and tearDown methods. Also, the output at the end of the test execution could definitely use some love. Oh well. Perhaps another day I'll get around to that.

Basically, you just set the servers you need to test and the browsers you want Selenium to run the tests in. Those are at the top of the script: IPS and BROWSERS. Then a new unittest.TestCase class is created for each combination of IP/server+browser. Finally, each of the test cases is thrown into a TestSuite, and the suite is processed. If there were any errors during the tests, they'll be printed out. We weren't really concerned with printing out other information, but you can certainly make other meaningful feedback appear.
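For what it's worth, here's one way the monkey patching might be avoided: put setUp and tearDown in a mixin and build each class with type(). This is only a sketch, assuming Python 3. FakeDriver is a hypothetical stand-in for the real Selenium client so the example runs without a Selenium server, and SeleniumFixture and build_suite are names I made up.

```python
import itertools
import unittest

IPS = ['192.168.0.1', '192.168.0.2']
BROWSERS = ['safari', 'chrome']


class FakeDriver(object):
    """Hypothetical stand-in for the Selenium client."""

    def __init__(self, browser, url):
        self.browser = browser
        self.url = url
        self.started = False

    def start(self):
        self.started = True

    def stop(self):
        self.started = False


class SeleniumFixture(object):
    """Mixin providing setUp/tearDown, so nothing needs patching later."""
    ip = None
    browser = None

    def setUp(self):
        self.verificationErrors = []
        self.selenium = FakeDriver('*%s' % self.browser,
                                   'http://%s/' % self.ip)
        self.selenium.start()

    def tearDown(self):
        self.selenium.stop()
        self.assertEqual([], self.verificationErrors)


class SomeUnitTest(object):

    def test_something(self):
        sel = self.selenium
        self.assertTrue(sel.started)  # real test code would go here


def build_suite(base):
    """Create one TestCase class per IP/browser pair and load its tests."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for ip, browser in itertools.product(IPS, BROWSERS):
        name = 'UT_%s_%s' % (ip.replace('.', '_'), browser)
        case = type(name, (SeleniumFixture, base, unittest.TestCase),
                    {'ip': ip, 'browser': browser})
        suite.addTests(loader.loadTestsFromTestCase(case))
    return suite


if __name__ == '__main__':
    unittest.TextTestRunner(verbosity=2).run(build_suite(SomeUnitTest))
```

Using TextTestRunner also takes care of the output, since it reports failures as well as errors along with a summary line.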

Anyway, I thought that someone out there might very well benefit from my little experiment on my co-worker's question. Feel free to comment on your personal adventures with some variation of the code if you find it useful!

More django-articles Updates

I've spent a little more time lately adding new features to django-articles. There are two major additions in the latest release (2.0.0-pre2).

  • Article attachments
  • Article statuses

That's right folks! You can finally attach files to your articles. This includes attachments to emails that you send, if you have the articles from email feature properly configured. To prove it, I'm going to attach a file to this article (which I'm posting via email).

Next, I've decided that it's worth allowing the user to specify different statuses for their articles. One of the neat things about this feature is that if you are a super user, you're logged in, and you save an article with a status that is designated as "non-live", you will still be able to see it on the site. This is a way for users to preview their work before making it live. Out of the box, there are only two statuses: draft and finished. You're free to add more statuses if you feel so inclined (they're in the database, not hardcoded).

The article status is still separate from the "is_active" flag when saving an article. Any article that is marked as inactive will not appear on the site regardless of the article's "status".
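To make the interaction between the two settings concrete, here's the rule as I've described it, boiled down to a tiny function. This is just an illustration of the logic, not the actual django-articles code, and the function and parameter names are made up.

```python
def is_visible(is_active, status_is_live, user_is_superuser):
    """Sketch of the visibility rule described above (hypothetical names).

    user_is_superuser means "a logged-in super user is viewing the site".
    """
    if not is_active:
        # Inactive articles never appear, regardless of status.
        return False
    # Live statuses are visible to everyone; non-live ones only to
    # logged-in super users, as a preview.
    return status_is_live or user_is_superuser


if __name__ == '__main__':
    print(is_visible(True, False, True))    # draft, previewed by super user
    print(is_visible(True, False, False))   # draft, regular visitor
    print(is_visible(False, True, True))    # inactive article
```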

On a slightly less impressive note (although still important), this release includes some basic unit tests. Most of the tests currently revolve around article statuses and making sure that the appropriate articles appear on the site.

GitHub and django-articles

Some of you who prefer to use git for your version control needs and were following the django-articles mirror on GitHub may have noticed some strange activity recently. I noticed today that the GitHub mirror was out of sync with the other mirrors, and I took a bit of time to investigate the problem.

I thought, for some reason, that I might be able to quickly and easily bring it back into sync if I just deleted the repo, recreated it, and pushed my changes to it. That didn't work. This means that all of you who were once following the project there are no longer following it, and I only realized that side effect after I had clicked the delete button. I apologize for this inconvenience.

In the end, it turned out that I had some things misconfigured with git on my box. I have resolved the problems and have brought the mirror back into sync. Please let me know if you run into any problems with it!

Review: Hacking Vim

Introduction

Some of my faithful visitors may have noticed that I have a thing for Vim, one of the oldest and most powerful text editors in the world. In the past 15 or so years that I've been developing, I have spent quite a bit of time in several different text editors. It seemed like I was continually on the quest to find the fastest, most feature-packed editor out there, while still being cross-platform compatible and having it stay out of my way. Speed has always been very important to me.

I have been using Vi and Vim regularly since about 2000, when I began dabbling with Linux. I could certainly hold my ground in either of the two programs, but I was by no means proficient. The more appealing text editors for me offered syntax highlighting and code completion. At the time, I was under the impression that Vi/Vim didn't offer either of these two features. It wasn't until around the middle of last year, however, that I really started putting effort into learning and using Vim. After asking some of my Vim-savvy friends a lot of questions to get me kickstarted, I began to see the power that lies in Vim.

Before long, Vim had replaced all other text editors as my preferred editing environment. I learned that Vim could satisfy just about every single one of my personal qualifications for the perfect editor. I dumped all other editors in favor of Vim, and I even opted to use Vim over a several-hundred-dollar IDE at work.

Anyway. I received a review copy of Kim Schulz' "Hacking Vim: A cookbook to get the most out of the latest Vim editor" a couple of months ago and have been rummaging through it since then. I have learned a ton of fantastic tips from this little book! Since it's a cookbook, you're not expected to read the entire book start to finish. Rather, you can dig right into whatever section interests you and feel right at home.

Brief Overview

Packt Publishing printed this book back in 2007, but all of the tips are still very much up-to-date. The book starts off with the obligatory history lesson (which is actually quite interesting if you're a nerd like me), and the target audience is described as such:

New users join the Vim user community every day and want to use this editor in their daily work, and even though Vim sometimes can be complex to use, they still favor it above other editors. This is a book for these Vim users.

After the history lesson, chapter 2 of the book digs right into personalizing Vim to fit your own preferences. Topics covered include:

  • changing fonts
  • changing color schemes
  • personalizing highlighting
  • customizing the status line
  • toggling menus and toolbars in gvim
  • adding your own menu items and toolbar buttons
  • customizing your work area

Chapter 3 discusses better navigation techniques. Topics covered include:

  • faster navigation in a file
  • faster navigation in the Vim help system
  • faster navigation in multiple buffers
  • in-file searching
  • searching in multiple files or buffers
  • using marks and signs

Chapter 4, titled "Production Boosters", discusses the following:

  • templates using a simple template file
  • templates using abbreviations
  • auto-completion using known words and tag lists
  • auto-completion using omni-completion
  • macros
  • sessions
  • registers and undo branches
  • folding
  • vimdiff
  • opening remote files using Netrw

Chapter 5 introduces some advanced formatting tips. You can learn how to put text into nicely-formatted paragraphs, align text, mark headlines, and create lists. For code, this chapter discusses several different indentation options.

Vim scripting is the topic of chapter 6, and Schulz covers a wide variety of useful tips to get anyone started on scripting Vim to do their bidding. Tips include:

  • creating syntax-coloring scripts
  • how to install and use scripts
  • different types of scripts
  • basic syntax of Vim scripts
  • how to structure Vim scripts
  • debugging a Vim script
  • using other scripting languages (Perl, Python, Ruby)

Appendix A describes how Vim can be used for much more than just text editing. Several different games, including Tetris and a Rubik's Cube, are briefly introduced, along with how to use Vim as a mail client or programmer's IDE. Appendix B suggests miscellaneous configuration script maintenance tips, such as how you can maintain the same configuration script across several different machines.

My Thoughts

I was very impressed with this book. I was afraid that, being published in 2007, it might be a little too out-of-date for my personal tastes. Since the book is about Vim, though, I wasn't overly concerned (the editor has been around for decades, and it doesn't change drastically from release to release anymore).

Just like the last book I reviewed, I found several typos in this book. A lot of the typos were in the first few pages of the actual content, and some were definitely more minor than others. This sort of thing doesn't really detract much from the material covered, but it sure does stand out as a distraction for people who pay attention to details.

Here are some of the things that I truly enjoyed reading and learning about (many of which actually made my jaw drop in awe of Vim):

  • Specifying multiple fonts for GVim, just in case your first choice isn't always available:

    :set guifont=Courier\ New\ 12,Arial\ 10
    
  • Specifying different font faces based on the extension of the file you're editing:

    :autocmd BufEnter *.txt set guifont=Arial\ 12
    
  • Highlighting the line your cursor is currently on, and the column the cursor is in:

    :set cursorline
    :set cursorcolumn
    
  • Limiting the number of suggestions that the spell checker offers:

    :set spellsuggest=5
    
  • Navigating to different words based on whitespace instead of "regular" word separators:

    • W to move to the beginning of the next word
    • B to move to the beginning of the previous word
    • E to move to the end of the current or next word

    I knew about the lowercase variations of these commands, but not the uppercase.

  • Navigating up and down in the same long, wrapped line:

    gk
    gj
    
  • Opening a file that is referenced in the current buffer:

    gf
    

    I learned that this even works on Python imports! Just like the description says, it will work on the import module, not classes or other objects from inside the module. Not quite that intelligent!

  • Incremental searching:

    :set incsearch
    
  • Searching up/down in a buffer for any occurrence of the word under the cursor:

    g#
    g*
    

    I knew about the usual # and *, but those two will only match the same exact word. When they're prefixed with g, they will match any occurrence of the word, be it whole or part of another word. For example, hitting g* while the cursor is over the word foo would match both food and foobar, while * would match neither.

  • Using markers to jump between specific points in different open buffers (mA through mZ)

  • Prepopulating empty files based on their extension:

    :autocmd BufNewFile * silent! 0r $VIMHOME/templates/%:e.tpl
    
  • Formatting a paragraph of text:

    gqap
    
  • Formatting all paragraphs of text in a file:

    gggqG
    
  • Smart indentation:

    :set smartindent
    
  • Enabling paste mode, so smartindent doesn't try to format code that you paste into your buffer:

    :set paste
    
  • Prettifying XML and HTML using Tidy:

    :autocmd FileType xml exe ":silent 1,$!tidy --input-xml true --indent yes -q"
    :autocmd FileType html,htm exe ":silent 1,$!tidy --indent yes -q"
    

Conclusion

All in all, this is a fantastic book. I will be keeping it near my workstation as a quick reference book when I want to do something crazy with Vim. I've already recommended the book to several of my friends and acquaintances, and I will make the same recommendation here. If you are mildly familiar with Vim and at all interested in getting more out of this fabulous editor, I highly recommend picking up a copy of this book.

2Ze.us Updates

There has been quite a bit of recent activity in my 2ze.us project since I first released it nearly a year ago. My intent was not to become a competitor with bit.ly, is.gd, or anyone else in the URL-shortening arena. I created the site as a way for me to learn more about Google's AppEngine. It didn't take very long to get it up and running, and it seemed to work fairly well.

AppEngine and Extensions

I was able to basically leave the site alone on AppEngine for several months--through about September 2009. In that time, I came up with a Firefox extension to make its use more convenient.

The extension allows you to quickly get a shortened URL for the page you're currently looking at, and a couple of context menu items let you get a short URL for things like specific images on a page. Also included in the extension is a preview for 2ze.us links. The preview can tell you the title and domain of the link's target. It can tell you how much smaller the 2ze.us URL is compared to the full URL. Finally, it displays how many times that particular 2ze.us link has been clicked.

That was all fine and dandy. It was the second Firefox extension I had ever written, and it's still running strong. In June or July of 2009, I started working on a little program to make it easier for me to interact with Twitter the way I wanted to. This was a great opportunity for me to incorporate 2ze.us into the application so any URL I wanted to post to Twitter would automatically be shortened for me, using my own shortener.

Porting to WebFaction And PHP

Anyway, around the end of September 2009, I noticed that there were a lot of problems with 2ze.us. It was slow and sometimes completely unresponsive. Certain URLs would redirect to their full URLs, while others wouldn't. The Firefox extension stopped working nicely. Oh yeah, and AppEngine rolled back to a previous revision of the code without me telling it to. That's when everything just died. It didn't take long for me to decide to migrate my project from AppEngine onto my awesome WebFaction hosting.

At this point, I was faced with a small dilemma: keep the code in Python, or port it to PHP. I opted to port it over to PHP, because I didn't want all of the overhead of a full Django instance for a site that needed to be very zippy. And I was unacquainted with other Python options.

By early October 2009, I had managed to turn the project into a PHP beast, running on Apache. It was a lot more responsive than AppEngine ever let 2ze.us be. There were a few bumps along the road, what with the extension and Twitter client relying on various parts of the site. Eventually it got to a point where I could just let it sit and work.

Chromium Extension

Sometime around the end of December, I decided to write another extension for 2ze.us, only for Google Chrome and Chromium this time. This extension isn't quite as feature-packed as its Firefox brother, but it gets the job done.

Clip2Zeus

Shortly after "completing" the Chromium extension, I had what seemed like a pretty original idea. Who knows if it really is, but I still haven't seen another tool quite like the one that I made as a result of this idea. I thought, "Now, why should I need to install an extension in each Web browser I use on each computer I use? Is there a better way?"

The answer came quickly: a standalone, desktop application. Write one program that handles shortening URLs for you. My laziness told me to make a program that monitors your system clipboard for URLs. If a URL is detected, try to shorten it, and update the clipboard contents in place. Boom. Done. All extensions become useless beyond things like the URL preview (which is very useful, imo).

The next question I asked was, "Do I make it platform-dependent? Should I stick it to the majority of computer users and write my tool for Linux only? For OSX only? For, uh... Windows only?" Again, an easy question to answer. Support them all or don't even bother writing the application.

A week's worth of midnight hacking saw the birth of Clip2Zeus 1.0a. It's a cross-platform compatible desktop application that does exactly what I just mentioned. When it's running and detects a URL on your system clipboard, it will try to shorten it and update it in your clipboard. If you copy a block of text, the application will only modify the URLs in that block of text--meaning the block of text will still be in your clipboard, but it will have shorter URLs.
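The "only modify the URLs in that block of text" part is the interesting bit, so here's a rough sketch of the idea. This is not the actual Clip2Zeus code: the regular expression is a deliberately simple assumption about what counts as a URL, and the shortener is passed in as a function so that fake_shorten (which returns a made-up short link) can stand in for the real call to 2ze.us.

```python
import re

# Simplistic assumption: a URL starts with http:// or https:// and runs
# until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r'https?://[^\s<>"]+')


def shorten_urls(text, shorten):
    """Replace each URL in text with shorten(url), leaving the rest alone."""

    def replace(match):
        url = match.group(0)
        short = shorten(url)
        # Keep the original if shortening failed or didn't save anything.
        if short and len(short) < len(url):
            return short
        return url

    return URL_PATTERN.sub(replace, text)


def fake_shorten(url):
    """Hypothetical shortener; the real one would ask the website."""
    return 'http://2ze.us/x'


if __name__ == '__main__':
    clip = 'Read http://example.com/some/very/long/path?page=2 when you can'
    print(shorten_urls(clip, fake_shorten))
```

The clipboard monitoring itself is platform-specific, which is why it's left out of the sketch.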

I use the program every day at work (on OSX). It's been very fun for me to see a short URL any time I copy a nasty URL to my clipboard. Imagine that; I'm a big fan of my own work...

Tornado

Lately, I've noticed that the site was getting kind of slow again. Sometimes it would take several seconds for Clip2Zeus to shorten URLs in my clipboard, when it was normally instantaneous. Every once in a while, Clip2Zeus would completely fail to connect to the website.

One of my friends has asked me a lot of questions about the Tornado framework in the past months. I had read a few things about Tornado when it was open-sourced last year, but I didn't really feel the need to dabble with it. These questions prompted me to tinker a little.

Last night I re-ported 2ze.us to Python, using the Tornado framework this time. So far I'm very impressed with its responsiveness. The framework offers a lot of neat little utilities, and it is very fast (as reported by dozens of other reputable sources).

On top of the speed increase that came with the transition to Tornado, my RAM usage on WebFaction has come down by nearly 100MB. Just by turning off the one Apache-backed website. Now I'm nowhere near my RAM cap! Wahoo!!

Enough rambling. Like I said at the beginning of this article, a lot has been happening with this project in the past year. I didn't even think about all of the time I put into projects related to my simple little side project. Looking back, I'm quite satisfied with how things have unfolded.

Statistics

Here are some simple statistics for 2ze.us. Since March 2009...

  • 5,252 URLs have been shortened using 2ze.us
  • 2ze.us links have been clicked 198,267 times
  • 315,951 URL characters have been turned into 11,532 characters

In April 2009...

  • 217 URLs were shortened
  • 2ze.us links were clicked 617 times

In February 2010...

  • 1,182 URLs were shortened
  • 2ze.us links were clicked 32,830 times

Not too shabby for a side project.
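For the curious, here's some back-of-the-envelope arithmetic on those numbers. The figures come straight from the lists above; the variable names are just mine.

```python
# Totals since March 2009 (taken from the statistics above).
urls_shortened = 5252
total_clicks = 198267
chars_before = 315951
chars_after = 11532

# Fraction of URL characters saved, and average clicks per short URL.
saved_fraction = 1 - chars_after / float(chars_before)
clicks_per_url = total_clicks / float(urls_shortened)

# Growth from April 2009 to February 2010.
url_growth = 1182 / 217.0
click_growth = 32830 / 617.0

if __name__ == '__main__':
    print('%.2f%% of URL characters saved' % (saved_fraction * 100))
    print('%.2f clicks per shortened URL' % clicks_per_url)
    print('%.1fx as many URLs shortened per month' % url_growth)
    print('%.1fx as many clicks per month' % click_growth)
```

That works out to roughly 96% of the characters saved and a little under 38 clicks per shortened URL on average.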