On Security and Python's Exec

A recent project at work has renewed my aversion to Python's exec statement--particularly when you want to use it with arbitrary, untrusted code. The project requirements necessitated the use of exec, so I got to do some interesting experiments with it. I've got a few friends who, until I slapped some sense into them, were seemingly big fans of exec (in Django projects, even...). This article is for them and others in the same boat.

Take this example:

#!/usr/bin/env python

import sys

dirname = '/usr/lib/python2.6/site-packages'

print dirname, 'in path?', (dirname in sys.path)

exec """import sys

dirname = '/usr/lib/python2.6/site-packages'
print 'In exec path?', (dirname in sys.path)

sys.path.remove(dirname)

print 'In exec path?', (dirname in sys.path)"""

print dirname, 'in path?', (dirname in sys.path)

Take a second and examine what the script is doing. Done? Great... So, the script first makes sure that a very critical directory is in my PYTHONPATH: /usr/lib/python2.6/site-packages. This is the directory where all of the awesome Python packages, like PIL, lxml, and dozens of others, reside. This is where Python will look for such packages when I try to import and use them in my programs.

Next, a little Python snippet is executed using exec. Let's say this snippet comes from an untrusted source (a visitor to your website, for example). The snippet removes that very important directory from my PYTHONPATH. It might seem like it's relatively safe to do within an exec--maybe it doesn't change the PYTHONPATH that I was using before the exec?

Wrong. The output of this script on my personal system says it all:

$ python bad.py
/usr/lib/python2.6/site-packages in path? True
In exec path? True
In exec path? False
/usr/lib/python2.6/site-packages in path? False

From this example, we learn that Python code that is executed using exec runs in the same context as the code that uses exec. This is a critical concept to learn.

Some people might say, "Oh, there's an easy way around that. Give exec its own globals dictionary to work with, and all will be well." Wrong again. Here's a modified version of the above script.

#!/usr/bin/env python

import sys

dirname = '/usr/lib/python2.6/site-packages'

print dirname, 'in path?', (dirname in sys.path)

context = {'something': 'This is a special context for the exec'}

exec """import sys

print something
dirname = '/usr/lib/python2.6/site-packages'
print 'In exec path?', (dirname in sys.path)

sys.path.remove(dirname)

print 'In exec path?', (dirname in sys.path)""" in context

print dirname, 'in path?', (dirname in sys.path)

And here's the output:

$ python also_bad.py
/usr/lib/python2.6/site-packages in path? True
This is a special context for the exec
In exec path? True
In exec path? False
/usr/lib/python2.6/site-packages in path? False

How can you get around this glaring risk in the exec statement? One possible solution is to execute the snippet in its own process. Might not be the best way to handle things. Could be the absolute worst solution. But it's a solution, and it works:

#!/usr/bin/env python

import multiprocessing
import sys

def execute_snippet(snippet):
    exec snippet

dirname = '/usr/lib/python2.6/site-packages'

print dirname, 'in path?', (dirname in sys.path)

snippet = """import sys

dirname = '/usr/lib/python2.6/site-packages'
print 'In exec path?', (dirname in sys.path)

sys.path.remove(dirname)

print 'In exec path?', (dirname in sys.path)"""

proc = multiprocessing.Process(target=execute_snippet, args=(snippet,))
proc.start()
proc.join()

print dirname, 'in path?', (dirname in sys.path)

And here comes the output:

$ python better.py
/usr/lib/python2.6/site-packages in path? True
In exec path? True
In exec path? False
/usr/lib/python2.6/site-packages in path? True

So the PYTHONPATH is only affected by the sys.path.remove within the process that executes the snippet using exec. The process that spawns the subprocess is unaffected, and can continue with life, happily importing all of those wonderful packages from the site-packages directory. Yay.

With that said, exec isn't always bad. But my personal point of view is basically, "There is probably a better way." Unfortunately for me, that does not hold up in my current situation, and it might not work for your circumstances too. If no one is forcing you to use exec, you might investigate alternatives in all of that free time you've been wondering what to do with.

Python And Execution Context

I recently found myself in a situation where knowing the execution context of a function became necessary. It took me several hours to learn about this functionality, despite many cleverly-crafted Google searches. So, being the generous person I am, I want to share my findings.

My particular use case required that a function behave differently depending on whether it was called in an exec call. Specifics beyond that are not important for this article. Here's an example of how I was able to get my desired behavior.

import inspect

def is_exec():
    caller = inspect.currentframe().f_back
    module = inspect.getmodule(caller)

    if module is None:
        print "I'm being run by exec!"
    else:
        print "I'm being run by %s" % module.__name__

def main():
    is_exec()

    exec "is_exec()"

if __name__ == '__main__':
    main()

The output of such a script would look like this:

$ python is_exec.py
I'm being run by __main__
I'm being run by exec!

It's also interesting to note that when you're using the Python interactive interpreter, calling the is_exec function from the code above will tell you that you are indeed using exec.

Some may argue that modifying behavior as I needed to is dirty, and that if your system requires such code, you're doing it wrong. Well, you could apply this sort of code to situations that have nothing to do with exec. Perhaps you want to determine which part of your product is using a specific function the most. Perhaps you want to get additional debugging information that isn't immediately obvious.

Just like always, I want to add the disclaimer that there may be other ways to do this and there probably are. However, this is the way that worked for me. I'd still be interested to here about other solutions you may have encountered for this problem.

On a side note, if you're up for some slightly advanced Python mumbo jumbo, I suggest diving into the inspect documentation.

Send E-mails When You Get @replies On Twitter

I just had a buddy of mine ask me to write a script that would send an e-mail to you whenever you get an "@reply" on Twitter. I've recently been doing some work on a Twitter application, so I feel relatively comfortable with the python-twitter project to access Twitter. It didn't take very long to come up with this script, and it appears to work fine for us (using a cronjob to run the script periodically).

I thought others on the Internets might enjoy the script as well, so here it is!

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
A simple script to check your Twitter account for @replies and send you an email
if it finds any new ones since the last time it checked.  It was developed using
python-twitter 0.5 and Python 2.5.  It has been tested on Linux only, but it
should work fine on other platforms as well.  This script is intended to be
executed by a cron manager or scheduled task manager.

Copyright (c) 2009, Josh VanderLinden
All rights reserved.

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this
list of conditions and the following disclaimer in the documentation and/or
other materials provided with the distribution.
- Neither the name of the organization nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""

import twitter
import ConfigParser
import os
import sys
from datetime import datetime
import smtplib
from email.MIMEMultipart import MIMEMultipart
from email.MIMEText import MIMEText
from email.Utils import formatdate

# get the user's "home" directory
DIRNAME = os.path.expanduser('~')
CONFIG = os.path.join(DIRNAME, '.twitter_email_replies.conf')
FORMAT = '%a %b %d %H:%M:%S +0000 %Y'
REPLY_TEMPLATE = """%(author)s said: %(text)s
Posted on %(created_at)s
Go to http://twitter.com/home?status=@%(screen_name)s%%20&in_reply_to_status_id=%(id)s&in_reply_to=%(screen_name)s to post a reply
"""

# sections
AUTH = 'credentials'
EXEC = 'exec_info'
EMAIL = 'email_info'

# make the code a bit "cleaner"
O = lambda s: sys.stdout.write(s + '\n')
E = lambda s: sys.stderr.write(s + '\n')
str2dt = lambda s: datetime.strptime(s, FORMAT)

def get_dict(status):
    my_dict = status.AsDict()
    my_dict['screen_name'] = my_dict['user']['screen_name']
    my_dict['author'] = my_dict['user']['name']
    return my_dict

def main():
    O('Reading configuration from %s' % CONFIG)
    parser = ConfigParser.SafeConfigParser()
    config = parser.read(CONFIG)

    # make sure we have the proper sections
    if not parser.has_section(AUTH): parser.add_section(AUTH)
    if not parser.has_section(EMAIL): parser.add_section(EMAIL)
    if not parser.has_section(EXEC): parser.add_section(EXEC)

    try:
        # get some useful settings from the configuration file
        username = parser.get(AUTH, 'username')
        password = parser.get(AUTH, 'password')

        to_address = parser.get(EMAIL, 'to_address')
        from_address = parser.get(EMAIL, 'from_address')
        smtp_server = parser.get(EMAIL, 'smtp_server')
        smtp_user = parser.get(EMAIL, 'smtp_user')
        smtp_pass = parser.get(EMAIL, 'smtp_pass')

        if '' in [username, password, to_address, from_address, smtp_server]:
            raise Exception('Not configured')
    except Exception:
        E('Please configure your credentials and e-mail information in %s!' % CONFIG)

        # create some placeholders in the configuration file to make it easier
        sections = {
            AUTH: ('username', 'password'),
            EMAIL: ('to_address', 'from_address', 'smtp_server', 'smtp_user', 'smtp_pass')
        }

        for section in sections.keys():
            for opt in sections[section]:
                if not parser.has_option(section, opt):
                    parser.set(section, opt, '')
    else:
        # determine the last time we checked for replies
        try:
            last_check = str2dt(parser.get(EXEC, 'last_run'))
        except ConfigParser.NoOptionError:
            last_check = datetime.utcnow()
        last_check_str = last_check.strftime(FORMAT)

        info = 'Fetching updates for %s since %s' % (username,
                                                       last_check_str)
        O(info)

        # attempt to connect to Twitter
        api = twitter.Api(username=username, password=password)

        # not using the `since` parameter for more backward-compatibility
        timeline = api.GetReplies()
        new_replies = []
        for reply in timeline:
            post_time = str2dt(reply.GetCreatedAt())
            if post_time > last_check:
                new_replies.append(reply)

        count = len(new_replies)
        if count:
            # send out an email for this user
            O('Found %i new replies... sending e-mail to %s' % (count, to_address))
            reply_list = '\n\n'.join([REPLY_TEMPLATE % get_dict(r) for r in new_replies])
            is_are = 'is'
            plural = 'y'
            if count != 1:
                is_are = 'are'
                plural = 'ies'

            params = {
                'is_are': is_are,
                'count': count,
                'replies': plural,
                'username': username,
                'reply_list': reply_list,
                'last_check': last_check_str
            }

            text = """There %(is_are)s %(count)i new @repl%(replies)s for %(username)s on Twitter since %(last_check)s:

%(reply_list)s""" % params

            # compose the e-mail
            msg = MIMEMultipart()
            msg['From'] = from_address
            msg['To'] = to_address
            msg['Date'] = formatdate(localtime=True)
            msg['Subject'] = 'New @Replies for %s' % username
            msg.attach(MIMEText(text))

            # try to send the e-mail message out
            email = smtplib.SMTP(smtp_server)
            if smtp_user and smtp_pass:
                email.login(smtp_user, smtp_pass)
            email.sendmail(from_address,
                           to_address,
                           msg.as_string())
            email.close()

        # save the current time so we know where to pick up next time
        parser.set(EXEC, 'last_run', datetime.utcnow().strftime(FORMAT))

    # write the config
    O('Saving settings...')
    out = open(CONFIG, 'wb')
    parser.write(out)
    out.close()

if __name__ == '__main__':
    main()

Feel free to copy this script and modify it to your desires. Also, please comment if you have issues using it.