Monitor Multiple Remote Files Using Multitail

There comes a time in each of our individual lives that we just learn to love log files. We learn to love utilities like tail and grep as we pore over countless lines of information, seeking out the stuff that really matters. We like to show off our debugging prowess as innocent bystanders look on in absolute wonderment.

While that's all fine and dandy, I'm always on the lookout for utilities to make my log monitoring less painful. A few weeks ago, my supervisor introduced me to a program that he's been using for quite some time: multitail. In essence, it's tail with some really neat features, such as the ability to:

  • "tail" multiple files (or commands, like netstat) independently in the same terminal
  • highlight text using regular expressions
  • search log messages and see only the matching lines
  • merge multiple files into one log window
  • scroll back through the history of a log file
  • apply highlighting "themes"

I've been using multitail for a couple of weeks now (it took me a while to warm up to it after my supervisor introduced it), and I'm quite satisfied with it. One thing I really, really like about multitail is that I can kinda sorta almost monitor multiple remote files. What does that mean, you ask?

Well, my development environment includes at least 5 virtual machines, each of which will be logging different but equally important information. I want to be able to "tail" a specific log file on each of the virtual machines in one window. Now, it took me a while to learn how to do this, which is why I'm sharing the information with you.

And here comes my usual disclaimer: this may not be the most efficient way to do what I want to do, but it's currently working for me. I'm open to other solutions too!

Anyway, I can run a command like the following to monitor multiple remote log files:

multitail -l 'ssh user@host1 "tail -f /path/to/log/file"' -l 'ssh user@host2 "tail -f /path/to/log/file"'

Such a command would ssh into two computers, host1 and host2, and run tail -f /path/to/log/file on each. Multitail allows you to monitor the output of both tail commands in a single window, reducing clutter on your desktop. You can also arrange the files/commands you're "tailing" into various rows and columns. I tend to have a 2x2 grid of log files when I use multitail at work.
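
For instance, here's a sketch of what that 2x2 arrangement might look like (the -s 2 option splits the display into two columns; the hosts and log paths are placeholders):

multitail -s 2 \
    -l 'ssh user@host1 "tail -f /var/log/app.log"' \
    -l 'ssh user@host2 "tail -f /var/log/app.log"' \
    -l 'ssh user@host3 "tail -f /var/log/app.log"' \
    -l 'ssh user@host4 "tail -f /var/log/app.log"'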

I've also started using multitail to monitor the access and error logs for my Django sites on WebFaction. I simply ssh into my account, run an alias for a ridiculous multitail command, and watch as both log files scroll on by.
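
The alias is nothing fancy; something along these lines would do (the log paths here are hypothetical, so point them at wherever your host writes its access and error logs):

alias djlogs='multitail ~/logs/user/access_mysite.log ~/logs/user/error_mysite.log'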

Again, this is just another aspect of my work environment that is fun and useful to me, and I wanted to spread the joy. Multitail may or may not be a utility you like to use, but it suits my current needs and desires quite well. YMMV. And, once again, I'm always on the lookout for other tools to make my work life more interesting and productive!

Auto-Generating Documentation Using Mercurial, ReST, and Sphinx

I often find myself taking notes about various aspects of my job that I feel I would forget as soon as I moved on to another project. I've gotten into the habit of taking my notes using reStructuredText, which shouldn't come as any surprise to any of my regular visitors. On several occasions, I had some of the other guys in the company ask me for some clarification on some things I had taken notes on. Lucky for me, I had taken some nice notes!

However, these individuals probably wouldn't appreciate reading ReST markup as much as I do, so I decided to do something nice for them. I set up Sphinx to prettify my documentation. I then wrote a small Web server using Python, so people within the company network could access the latest version of my notes without much hassle.

Just like I take notes to remind myself of stuff at work, I want to do that again for this automated ReST->HTML magic--I want to be able to do this in the future! I figured I would make my notes even more public this time, so you all can enjoy similar bliss.

Platform Dependence

I am writing this article with UNIX-like operating systems in mind. Please forgive me if you're a Windows user and some of this is not consistent with what you're seeing. Perhaps one day I'll try to set this sort of thing up on Windows.

Installing Sphinx

The first step that we want to take is installing Sphinx. This is the project that Python itself uses to generate its online documentation. It's pretty dang awesome. Feel free to skip this section if you have already installed Sphinx.

Depending on your environment of choice, you may or may not have a package manager that offers python-sphinx or something along those lines. I personally prefer to install it using pip or easy_install:

$ sudo pip install sphinx

Running that command will likely produce a bunch of output about downloading Sphinx and various dependencies. When I ran it in my sandbox VM, I saw it install the following packages:

  • pygments
  • jinja2
  • docutils
  • sphinx

It should be a pretty speedy installation.

Installing Mercurial

We'll be using Mercurial to keep track of changes to our ReST documentation. Mercurial is a distributed version control system that is built using Python. It's wonderful! Just like with Sphinx, if you have already installed Mercurial, feel free to skip to the next section.

I personally prefer to install Mercurial using pip or easy_install--it's usually more up-to-date than what you would have in your package repositories. To do that, simply run a command such as the following:

$ sudo pip install mercurial

This will go out and download and install the latest stable Mercurial. You may need python-dev or something like that for your platform in order for that command to work. However, if you're on Windows, I highly recommend TortoiseHg. The installer for TortoiseHg will install a graphical Mercurial client along with the command line tools.

Create A Repository

Now let's create a brand new Mercurial repository to house our notes/documentation. Open a terminal/console/command prompt to the location of your choice on your computer and execute the following commands:

$ hg init mydox
$ cd mydox

Configure Sphinx

The next step is to configure Sphinx for our project. Sphinx makes this very simple:

$ sphinx-quickstart

This is a wizard that will walk you through the configuration process for your project. It's pretty safe to accept the defaults, in my opinion. Here's the output of my wizard:

$ sphinx-quickstart
Welcome to the Sphinx quickstart utility.

Please enter values for the following settings (just press Enter to
accept a default value, if one is given in brackets).

Enter the root path for documentation.
> Root path for the documentation [.]:

You have two options for placing the build directory for Sphinx output.
Either, you use a directory "_build" within the root path, or you separate
"source" and "build" directories within the root path.
> Separate source and build directories (y/N) [n]: y

Inside the root directory, two more directories will be created; "_templates"
for custom HTML templates and "_static" for custom stylesheets and other static
files. You can enter another prefix (such as ".") to replace the underscore.
> Name prefix for templates and static dir [_]:

The project name will occur in several places in the built documentation.
> Project name: My Dox
> Author name(s): Josh VanderLinden

Sphinx has the notion of a "version" and a "release" for the
software. Each version can have multiple releases. For example, for
Python the version is something like 2.5 or 3.0, while the release is
something like 2.5.1 or 3.0a1.  If you don't need this dual structure,
just set both to the same value.
> Project version: 0.0.1
> Project release [0.0.1]:

The file name suffix for source files. Commonly, this is either ".txt"
or ".rst".  Only files with this suffix are considered documents.
> Source file suffix [.rst]:

One document is special in that it is considered the top node of the
"contents tree", that is, it is the root of the hierarchical structure
of the documents. Normally, this is "index", but if your "index"
document is a custom template, you can also set this to another filename.
> Name of your master document (without suffix) [index]:

Please indicate if you want to use one of the following Sphinx extensions:
> autodoc: automatically insert docstrings from modules (y/N) [n]:
> doctest: automatically test code snippets in doctest blocks (y/N) [n]:
> intersphinx: link between Sphinx documentation of different projects (y/N) [n]:
> todo: write "todo" entries that can be shown or hidden on build (y/N) [n]:
> coverage: checks for documentation coverage (y/N) [n]:
> pngmath: include math, rendered as PNG images (y/N) [n]:
> jsmath: include math, rendered in the browser by JSMath (y/N) [n]:
> ifconfig: conditional inclusion of content based on config values (y/N) [n]:

A Makefile and a Windows command file can be generated for you so that you
only have to run e.g. `make html' instead of invoking sphinx-build
directly.
> Create Makefile? (Y/n) [y]:
> Create Windows command file? (Y/n) [y]: n

Finished: An initial directory structure has been created.

You should now populate your master file ./source/index.rst and create other documentation
source files. Use the Makefile to build the docs, like so:
   make builder
where "builder" is one of the supported builders, e.g. html, latex or linkcheck.

If you followed the same steps I did (I separated the source and build directories), you should see three new entries in your mydox repository:

  • build/
  • Makefile
  • source/

We'll do our work in the source directory.

Get Some ReST

Now is the time when we start writing some ReST that we want to turn into HTML using Sphinx. Open some file, like first_doc.rst, and put some ReST in it. If nothing comes to mind, or you're not familiar with ReST syntax, try the following:

=========================
This Is My First Document
=========================

Yes, this is my first document.  It's lame.  Deal with it.

Save the file (keep in mind that it should be within the source directory if you used the same settings I did). Now it's time to add it to the list of files that Mercurial will pay attention to. While we're at it, let's add the other files that were created by the Sphinx configuration wizard:

$ hg add
adding ../Makefile
adding conf.py
adding first_doc.rst
adding index.rst
$ hg st
A Makefile
A source/conf.py
A source/first_doc.rst
A source/index.rst

Don't worry that we don't see all of the directories in the output of hg st--Mercurial tracks files, not directories.

Automate HTML-ization

Here comes the magic in automating the conversion from ReST to HTML: Mercurial hooks. We will use the precommit hook to fire off a command that tells Sphinx to translate our ReST markup into HTML.

Edit your mydox/.hg/hgrc file. If the file does not yet exist, go ahead and create it. Add the following content to it:

[hooks]
precommit.sphinxify = ~/bin/sphinxify_docs.sh

I've opted to call a Bash script instead of using an inline Python call. Now let's create the Bash script, ~/bin/sphinxify_docs.sh:

#!/bin/bash
cd $HOME/mydox
sphinx-build source/ docs/

Notice that I used the $HOME environment variable. This means that I created the mydox directory at /home/myusername/mydox. Adjust that line according to your setup. You'll probably also want to make that script executable:

$ chmod +x ~/bin/sphinxify_docs.sh

Three, Two, One...

You should now be at a stage where you can safely commit changes to your repository and have Sphinx build your HTML documentation. Execute the following command somewhere under your mydox repository:

$ hg ci -m "Initial commit"

If your setup is anything like mine, you should see some output similar to this:

$ hg ci -m "Initial commit"
Making output directory...
Running Sphinx v0.6.4
No builder selected, using default: html
loading pickled environment... not found
building [html]: targets for 2 source files that are out of date
updating environment: 2 added, 0 changed, 0 removed
reading sources... [100%] index
looking for now-outdated files... none found
pickling environment... done
checking consistency... /home/jvanderlinden/mydox/source/first_doc.rst:: WARNING: document isn't included in any toctree
done
preparing documents... done
writing output... [100%] index
writing additional files... genindex search
copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded, 1 warning.
$ hg st
? docs/.buildinfo
? docs/.doctrees/environment.pickle
? docs/.doctrees/first_doc.doctree
? docs/.doctrees/index.doctree
? docs/_sources/first_doc.txt
? docs/_sources/index.txt
? docs/_static/basic.css
? docs/_static/default.css
? docs/_static/doctools.js
? docs/_static/file.png
? docs/_static/jquery.js
? docs/_static/minus.png
? docs/_static/plus.png
? docs/_static/pygments.css
? docs/_static/searchtools.js
? docs/first_doc.html
? docs/genindex.html
? docs/index.html
? docs/objects.inv
? docs/search.html
? docs/searchindex.js

If you see something like that, you're in good shape. Go ahead and take a look at your new mydox/docs/index.html file in the Web browser of your choosing.

Not very exciting, is it? Notice how your first_doc.rst doesn't appear anywhere on that page? That's because we didn't tell Sphinx to put it there. Let's do that now.

Customizing Things

Edit the mydox/source/index.rst file that was created during Sphinx configuration. In the section that starts with .. toctree::, let's tell Sphinx to include everything we ReST-ify:

.. toctree::
   :maxdepth: 2
   :glob:

   *

That should do it. Now, I don't know about you, but I don't really want to include the output HTML, images, CSS, JS, or anything in my documentation repository. It would just take up more space each time we change an .rst file. Let's tell Mercurial to not pay attention to the output HTML--it'll just be static and always up-to-date on our filesystem.

Create a new file called mydox/.hgignore. In this file, put the following content:

syntax: glob
docs/

Save the file, and you should now see something like the following when running hg st:

$ hg st
M source/index.rst
? .hgignore

Let's include the .hgignore file in the list of files that Mercurial will track:

$ hg add .hgignore
$ hg st
M source/index.rst
A .hgignore

Finally, let's commit one more time:

$ hg ci -m "Updating the index to include our .rst files"
Running Sphinx v0.6.4
No builder selected, using default: html
loading pickled environment... done
building [html]: targets for 1 source files that are out of date
updating environment: 0 added, 1 changed, 0 removed
reading sources... [100%] index
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] index
writing additional files... genindex search
copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded.

Tada!! The first_doc.rst should now appear on the index page.

Serving Your Documentation

Who seriously wants to have HTML files that are hard to get to? How can we make it easier to access those HTML files? Perhaps we can create a simple static file Web server? That might sound difficult, but it's really not--not when you have access to Python!

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from BaseHTTPServer import HTTPServer
from SimpleHTTPServer import SimpleHTTPRequestHandler

def main():
    try:
        # Serve files from the current directory on port 80
        # (binding to a port below 1024 requires root privileges)
        server = HTTPServer(('', 80), SimpleHTTPRequestHandler)
        server.serve_forever()
    except KeyboardInterrupt:
        # Shut down cleanly on Ctrl+C
        server.socket.close()

if __name__ == '__main__':
    main()

I created this simple script and put it in my ~/bin/ directory, also making it executable. Once that's done, you can navigate to your mydox/docs/ directory and run the script. Since I called the script webserver.py, I just do this:

$ cd ~/mydox/docs
$ sudo webserver.py

This makes it possible for you to visit http://localhost/ on your own computer, or to use your computer's IP in place of localhost to access your documentation from a different computer on your network. Pretty slick, if you ask me.
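
If you'd rather not run the server as root, serve on a port above 1024 instead; with Python 2 on your path you don't even need the script:

$ cd ~/mydox/docs
$ python -m SimpleHTTPServer 8080

Then browse to http://localhost:8080/ instead.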

I suppose there's more I could add, but that's all I have time for tonight. Enjoy!

Automatic Config Replication With Mercurial

I've done a lot of neat things since I started my new job earlier this month. I'm really excited about the things I've learned and experimented with, and I would like to share some of the concepts with my visitors.

At work we use a lot of virtual machines in our individual development environments. Most of these virtual machines use very similar configuration settings, but the settings are not a standard part of the installation. That is because we build our virtual machines using the same installation tools that our customers would use. The configuration I'm talking about is just stuff specific to our development environment.

Creating and configuring these virtual machines is one of the first things my mentor showed me how to do on my first day on the job. He commented on how quickly I would probably start learning all of the configuration tasks because we tend to set up our development VMs several times a month. That was all fine and dandy, and I did get a pretty good feel for what needed to go into a development VM that first day.

However, after doing it so many times, I realized how much time I was using just trying to get the VM set up just right. It wasn't hard to configure--it was just time-consuming. It wasn't long before I started thinking of ways to optimize the process.

One of the ideas I came up with, which seems to be serving my purposes perfectly, is that of using Mercurial to quickly and easily get the exact same configuration from one box to another. It also has the added benefit of keeping a history of the changes I make to my configuration as time goes on.

I won't go into exact detail on how I have things setup at work, but I would like to try to describe a similar scenario that should illustrate my goal just as well.

Getting Started

One of the first things I would encourage you to do is follow along. It will make the concept sink in much faster, and you will probably see other applications very quickly. Please note, however, that if you're following along exactly, it could be a very time-consuming process. I will be using 3 virtual machines as I write this, but you could just as easily use 5, 10, or 100,000. Likewise, you could eliminate the virtual machines altogether if you're in an environment with several physical computers.

One virtual machine will act as the "master" server, or the one that will be configured first. The other virtual machines will act as "slave" servers, which will simply receive configuration updates that happen on the master server. We will also modify this behavior to be a bit more interesting toward the end of the article.

Virtual Machines Galore!

First off, I will create some basic virtual machines using the net install version of Debian 5.0.3. I really only need to create 1 VM and then clone it a couple of times. I am willing to furnish my virtual machines to those who are interested in using them. I will install some additional software in the VM to make sure the demo works smoothly. Among the packages that I will install are:

  • Python
  • Mercurial
  • OpenSSH server

Initialize a Repository

Once I have all of that set up in my virtual machines, I will initialize a Mercurial repository on the master server to maintain the configuration files that I am interested in. Let's just use the /etc directory for the time being. There's a pretty good chance that most of our system-wide configuration will all be contained somewhere beneath /etc.

cd /etc
hg init

Now let's have a gander at the files that we can have Mercurial manage for us:

hg st

Wow! That is quite a set of files, isn't it? Thankfully, they should mostly be plain text files. Mercurial is very efficient at managing text files. Let's now add all of the files in /etc to our repository, so they can be tracked and easily pushed out to other systems.

hg add

That command will happily add everything that hg st printed. Obviously, we can get a little more picky about what we do and do not add to our repository, but that's not the goal of this article. Now, this step merely tells Mercurial that it needs to pay attention to changes in these files. The files have not yet been committed to the repo. Let's do that, so we have a backup of our configuration files in their pristine state:

hg ci -m "Initial import"

The -m "Initial import" is just a comment, to describe what happened to warrant a commit to the repository. It is for your use and the use of anyone who has access to your repo.

Clone The Configuration

Now let's try to push the configuration we just committed on the master server to one of the slave servers. Since my virtual machines are all essentially in the same state, there should be no conflicts, right? Try running the following command on the master server:

hg push ssh://root@slave1//etc
root@slave1's password:
remote: abort: There is no Mercurial repository here (.hg not found)!
abort: no suitable response from remote hg!

Blast! We can't simply push the configuration files out to another computer. For that to work, the repository itself would first have to exist on the slave server. Let's try this another way. On the slave server, run this command:

hg clone ssh://root@master//etc /etc
root@master's password:
abort: destination '/etc/' is not empty

Doh! Mercurial won't let us clone the repository from the master server! That's because Mercurial wants to clone to a new directory, with nothing already in it. One way to get around this hairball of a show-stopper is to just copy the repo using conventional UNIX utilities. Execute this command on one of your slave servers:

scp -r root@master:/etc/.hg /etc/

The .hg directory contains all of the repository information, and it's really all we need to snag in order to clone the repository. This might not be the most elegant solution in the world, but it will suffice for the time being. Once the scp command completes, we should have a full copy of the configuration file repository. Run this command to verify:

hg st

If your setup is anything like mine, you'll probably have a few files that are listed as being modified. Chances are that these files will vary from host to host anyway, and they are probably not worth keeping in a version control system. That would just be begging for conflicts.

I wrote an extension for Mercurial that should make this part of my tutorial a little less hacky. On your other slave server, run the following commands:

hg clone http://bitbucket.org/codekoala/hgext /root/hgext
echo "[extensions]" >> /root/.hgrc
echo "neclone = /root/hgext/neclone.py" >> /root/.hgrc

This extension gives you a new Mercurial command called neclone (N. E. Clone, or "not empty clone"). As we saw earlier, Mercurial doesn't let us clone a repository into a directory that is not empty. This extension allows us to do that. It works almost identically to the regular clone command... takes the same options and everything.

Still on your second slave server, run these additional commands:

hg neclone ssh://root@master//etc /etc
cd /etc
hg up -C

The last step is optional, and soon to be included as part of the extension. It will update your working copy to the latest revision in the repository. Beware that it overwrites any uncommitted changes you may have made to files that are tracked by Mercurial.

So now both slave servers should have a clone of the configuration repository from the master server.

Being Picky

Let's start to be a little picky about the files we are tracking in our repository. Some of the files that appear as modified on my slave server after copying the .hg directory from the master server are:

  • adjtime
  • alternatives/pager
  • alternatives/pager.1.gz
  • mailcap
  • network/run/ifstate
  • udev/rules.d/70-persistent-net.rules

I think it's safe to remove these from the repository, to avoid conflicts with other systems. To tell Mercurial to stop tracking a file without actually deleting it from the filesystem, you can use the following command:

hg forget adjtime
hg forget mailcap

And so on. Go ahead and do that for each of the files that appeared to be modified on your slave server immediately after copying the .hg directory. I'm going to add /etc/hostname to the list of files to forget too.
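
If you'd rather not type hg forget once per file, a quick shell loop handles the whole list (this list comes from my VM, so yours will probably differ):

cd /etc
for f in adjtime alternatives/pager alternatives/pager.1.gz mailcap \
         network/run/ifstate udev/rules.d/70-persistent-net.rules hostname
do
    hg forget "$f"
done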

After doing that, each of those files should appear as being marked for removal when you run hg st. Don't worry, this is normal. The files will not be deleted from the filesystem, but they will be deleted from the repository. Go ahead and commit those changes to the repository on your slave server.

hg ci -Am "Removed some files from version control"

Now let's push those changes out to the master server:

hg push
abort: repository default-push not found!

Since we copied the .hg directory directly using scp, our slave won't know where the changes need to go when we run the push command with no explicit destination repository. To fix that, let's create a file in /etc/.hg/ called hgrc on the slave server. In that file, put the following text:

[paths]
default = ssh://root@master//etc

The hg push command should now push directly to the master server. Yay! The problem we face now is that every other slave server in the group is out of date. How can we fix that? We'll use Mercurial hooks.

Automating Config Replication

Mercurial offers some very useful hooks that we can use to automatically push configuration changes out to each of our slave servers. We will use the commit and changegroup hooks to do the magic. Let's create a script that will live on the master server to take care of pushing our changes out to each slave server. Create a new file in /etc/ on the master server called propagate.sh:

#!/bin/bash
# Bring the master's working copy up to date first
hg up
# Then have each slave pull the new changesets and update its working copy
for node in 'slave1' 'slave2'
do
    ssh root@$node "cd /etc; hg pull -u"
done

Let's also make sure this script is executable:

chmod +x /etc/propagate.sh

This script assumes that your /etc/hosts file or your nameserver is configured appropriately to allow slave1 and slave2 to be resolved to IP addresses. The reason we're SSH'ing into each slave server and using hg pull instead of simply using hg push ssh://root@$node//etc is that you can't force an update on a remote server using push. You can, however, request an update when you're using pull.

Obviously, this script is not the most sophisticated of scripts. It might work well for my demonstration, with only a few servers, but once you get beyond that it would be a nightmare to maintain the list of servers the script has to connect to. You can use whatever means you'd like to keep track of the servers you want to replicate your configuration to. I don't want to bother with all of the crap I'd get for suggesting one thing over another, so it's now your call.
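
As one low-tech sketch, you could keep the hostnames in a plain text file (the path here is just a suggestion) and have the script loop over that instead:

#!/bin/bash
hg up
# /etc/propagate-nodes holds one slave hostname per line
for node in $(cat /etc/propagate-nodes)
do
    ssh root@$node "cd /etc; hg pull -u"
done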

Now it's time to configure the Mercurial hook to execute that script when the master server sees a changeset get into its repository. Open up /etc/.hg/hgrc on the master server, or create it if it doesn't exist. Make sure it has at least the following in it:

[hooks]
commit.propagate = /etc/propagate.sh
changegroup.propagate = /etc/propagate.sh

Let's try it out! Run these commands on your master server:

echo "" >> /etc/hosts
hg ci -m "Added a blank line to the hosts file"
root@slave1's password:
remote: Permission denied, please try again.
remote: Permission denied, please try again.
remote: Permission denied (publickey,password).
abort: no suitable response from remote hg!
Connection closed by slave2
warning: commit.propagate hook exited with status 255

Blast! The script failed because it wanted us to type in a password, but it was not in interactive mode. Let's fix that with a little preshared key magic. I won't go into the details about how this works, but the following commands on your master server should get us rolling:

ssh-keygen
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys2
scp -r ~/.ssh root@slave1:~
scp -r ~/.ssh root@slave2:~

Warning

Keep in mind this is not secure and should probably not be how your production machines are configured, especially with the root user.

For simplicity's sake, just accept all of the details and don't set a passphrase. These commands enable us to SSH into our slave servers without using a password. If you get an error such as:

remote: Host key verification failed.
abort: no suitable response from remote hg!

...it just means that the machine initiating the SSH connection (here, the master server running its hook) hasn't yet seen the other machine's host key. Manually SSH from the master server into the slave mentioned in the error. When doing so, you will have to answer "yes" to a question about the authenticity of the host you're logging into.

Testing It Out

It is now time to see if we can make a configuration change on one slave server and have it show up on the other slave server. Let's update the hosts file a little bit by adding the following line on the second slave server:

10.0.0.5        nonexistanthost

Now let's commit the change and push it off to the master server:

hg ci -m "Added a dumb line to the hosts file"
hg push

My system actually told me that it had copied the change out to another host. I know because I saw these lines:

remote: pulling from ssh://root@master//etc
remote: searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files

Now when I look at the first slave server, I should see that new line in my /etc/hosts file. Also, the log on each server should have the same entry that I just made about adding "a dumb line to the hosts file."

Seem Like A Lot of Work?

A lot of what we just did probably seemed like more work than it is worth, right? Well, being a nerd typically comes with a few qualities. One quality which I have observed many a time in my most geeky of friends is that they will spend hours and hours up front on a program or script just so they can save 2 minutes in the future. They work hard to be lazy.

There is a lot of boilerplate configuration that takes place in this particular scenario. I realize that. What I haven't shared with you, though, is how I automated the boilerplate configuration as well as the propagation of configuration. I'm tired of putting this article off, so I will have to leave those details for another article. Sorry!

Why?! There's a Better Way (tm)

There is always a better way. Always. Go ahead and use whatever you feel is the most efficient method for keeping configuration files in sync across several computers. This is just one more option to add to your toolkit. Don't worry, I won't be offended if you don't like it or don't use it. It works perfectly for me and it's free, and I just wanted to share!

Installing Django on Shared Hosting (Site5)

This article is related to my previously posted article about installing Django, an advanced Web framework for perfectionists, on your own computer. Now we will learn how to install Django on a shared hosting account, using Site5 and fastcgi as an example. Depending on your host, you may or may not have to request additional privileges from the support team in order to execute some of these commands.

Note: Django requires at least Python 2.3. Newer versions of Python are preferred.

Note: This HOWTO assumes familiarity with the UNIX/Linux command line.

Note: If the wget command doesn't work for you (as in you don't have permission to run it), you might try curl [url] -O instead. That's a -O as in upper-case o.

Install Python

Site5 (and many other shared hosting providers that offer SSH access) already has Python installed, but you will want to have your own copy so you can install various tools without affecting other users. So go ahead and download virtual python:

mkdir ~/downloads
cd ~/downloads
wget http://peak.telecommunity.com/dist/virtual-python.py

Virtual Python will make a local copy of the installed Python in your home directory. Now you want to make sure you execute this next command with the newest version of Python available on your host. For example, Site5 offers both Python 2.3.4 and Python 2.4.3. We want to use Python 2.4.3. To verify the version of your Python, execute the following command:

python -V

If that displays Python 2.3.x or anything earlier, try using python2.4 -V or python2.5 -V instead. Whichever command renders the most recent version of Python is the one you should use in place of python in the next command. Since python -V currently displays Python 2.4.3 on my Site5 sandbox, I will execute the following command:

python ~/downloads/virtual-python.py

Again, this is just making a local copy of the Python installation that you used to run the virtual-python.py script. Your local installation is likely in ~/lib/python2.4/ (version could vary).

Make Your Local Python Be Default

To reduce confusion and hassle, let's give our new local installation of Python precedence over the system-wide Python. To do that, open up your ~/.bashrc and make sure it contains a line similar to this:

export PATH=$HOME/bin:$PATH

If you're unfamiliar with UNIX-based text editors such as vi, here is what you would type to use vi to make the appropriate changes:

  • vi ~/.bashrc to edit the file
  • go to the end of the file by using the down arrow key or the j key
  • hit o (the letter) to tell vi you want to start typing stuff on the next line
  • type export PATH=$HOME/bin:$PATH
  • hit the escape key
  • type :x to save the changes and quit. Don't forget the : at the beginning. Alternatively, you can type :wq, which works exactly the same as :x.

Once you've made the appropriate changes to ~/.bashrc, you need to make those changes take effect in your current SSH session:

source ~/.bashrc

Now we should verify that our changes actually took place. Type the following command:

which python

If the output of that command is not something like ~/bin/python or /home/[your username]/bin/python, something probably didn't work. If that's the case, you can try again, or simply remember to use ~/bin/python instead of python throughout the rest of this HOWTO.

Install Python's setuptools

Now we should install Python's setuptools to make our lives easier down the road.

cd ~/downloads
wget http://peak.telecommunity.com/dist/ez_setup.py
python ez_setup.py

This gives us access to a script called easy_install, which makes it easy to install many useful Python tools. We will use this a bit later.

Download Django

Let's now download the most recent development version of Django. SSH into your account and execute the following commands (all commands shall be executed on your host).

svn co http://code.djangoproject.com/svn/django/trunk ~/downloads/django-trunk

Now we should make a symlink (or shortcut) to Django and put it somewhere on the Python Path. A sure-fire place is your ~/lib/python2.4/site-packages/ directory (again, that location could vary from host to host):

ln -s ~/downloads/django-trunk/django ~/lib/python2.4/site-packages
ln -s ~/downloads/django-trunk/django/bin/django-admin.py ~/bin

Now verify that Django is installed and working by executing the following command:

python -c "import django; print django.get_version()"

That command should return something like 1.0-final-SVN-8964. If you got something like that, you're good to move onto the next section. If, however, you get something more along the lines of...

Traceback (most recent call last):
    File "<string>", line 1, in ?
ImportError: No module named django

...then your Django installation didn't work. If this is the case, make sure that you have a ~/downloads/django-trunk/django directory, and also verify that ~/lib/python2.4/site-packages actually exists.

Installing Dependencies

In order for your Django projects to become useful, we need to install some other packages: PIL (Python Imaging Library, required if you want to use Django's ImageField), MySQL-python (a MySQL database driver for Python), and flup (a utility for fastcgi-powered sites).

easy_install -f http://www.pythonware.com/products/pil/ Imaging
easy_install mysql-python
easy_install flup

Sometimes, using easy_install to install PIL doesn't go over too well because of your (lack of) permissions. To circumvent this situation, you can always download the actual PIL source code and install it manually.

cd ~/downloads
wget http://effbot.org/downloads/Imaging-1.1.6.tar.gz
tar zxf Imaging-1.1.6.tar.gz
cd Imaging-1.1.6
ln -s ~/downloads/Imaging-1.1.6/PIL ~/lib/python2.4/site-packages

And to verify, you can try this command:

python -c "import PIL"

If that doesn't return anything, you're good to go. If it says something about "ImportError: No module named PIL", it didn't work. In that case, you have to come up with some other way of installing PIL.

Setting Up A Django Project

Let's attempt to set up a sample Django project.

mkdir -p ~/projects/django
cd ~/projects/django
django-admin.py startproject mysite
cd mysite
mkdir media templates

If that works, then you should be good to do the rest of your Django development on your server. If not, make sure that ~/downloads/django-trunk/django/bin/django-admin.py exists and that it has a functioning symlink (shortcut) in ~/bin. If not, you'll have to make adjustments according to your setup. Your directory structure should look something like:

  • projects
    • django
      • mysite
        • media
        • templates
        • __init__.py
        • manage.py
        • settings.py
        • urls.py

Making A Django Project Live

Now we need to make your Django project accessible from the Web. On Site5, I generally use either a subdomain or a brand new domain when setting up a Django project. If you plan on having other projects accessible on the same hosting account, I recommend you do the same. Let's assume you setup a subdomain such as mysite.mydomain.com. On Site5, you would go to ~/public_html/mysite for the next few commands. This could differ from host to host, so I won't go into much more detail than that.

Once you're in the proper place, you need to set up a few things: two symlinks, a django.fcgi, and a custom .htaccess file. Let's begin with the symlinks.

ln -s ~/projects/django/mysite/media ~/public_html/mysite/static
ln -s ~/lib/python2.4/site-packages/django/contrib/admin/media ~/public_html/mysite/media

This just makes it so you can have your media files (CSS, images, JavaScript, etc) in a different location than your public_html.

Now for the django.fcgi. This file is what tells the webserver to execute your Django project.

#!/home/[your username]/bin/python
import sys, os

# Add a custom Python path.
sys.path.insert(0, "/home/[your username]/projects/django")

# Switch to the directory of your project. (Optional.)
os.chdir("/home/[your username]/projects/django/mysite")

# Set the DJANGO_SETTINGS_MODULE environment variable.
os.environ['DJANGO_SETTINGS_MODULE'] = "mysite.settings"

from django.core.servers.fastcgi import runfastcgi
runfastcgi(method="threaded", daemonize="false")

And finally, the .htaccess file:

RewriteEngine On
RewriteBase /
RewriteRule ^(media/.*)$ - [L]
RewriteRule ^(static/.*)$ - [L]
RewriteCond %{REQUEST_URI} !(django.fcgi)
RewriteRule ^(.*)$ django.fcgi/$1 [L]

The .htaccess file makes it so that requests to http://mysite.mydomain.com/ are properly directed to your Django project. So, now you should have a directory structure that looks something like this:

  • public_html
    • mysite
      • media
      • static
      • .htaccess
      • django.fcgi

If that looks good, go ahead and make the django.fcgi executable and non-writable by others:

chmod 755 ~/public_html/mysite/django.fcgi

After that, head over to http://mysite.mydomain.com/ (obviously, replace the mydomain accordingly). If you see a page that says you've successfully setup your Django site, you're good to go!

Afterthoughts

I've noticed that I need to "restart" my Django sites on Site5 any time I change the .py files. There are a couple methods of doing this. One includes killing off all of your python processes (killall ~/bin/python) and the other simply updates the timestamp on your django.fcgi (touch ~/public_html/mysite/django.fcgi). I find the former to be more destructive and unreliable than the latter. So, my advice is to use the touch method unless it doesn't work, in which case you can try the killall method.
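
If you find yourself restarting a site often, a small alias (the name here is arbitrary) keeps the touch method painless:

alias restart-mysite='touch ~/public_html/mysite/django.fcgi'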

Good luck!

How To Compile and Install a 2.6.x Series Linux Kernel

The Linux kernel is the core component in any Linux distribution. Without a kernel, your computer would be essentially useless. It is the piece of software which allows interaction between you, your computer's applications, and your computer's hardware. With such a powerful role in your computing experience, it is important to keep your kernel up-to-date. Each new release provides more hardware support and many performance enhancements. It is also important to keep your kernel up-to-date for security purposes.

Let's upgrade our Linux kernels together. I will walk you through each of the steps I take, from beginning to end, to upgrade my kernel. Just as a warning, I prefer to do the whole process on the command line, so you might want to pull up a terminal, konsole, xterm or whatever you prefer to use for your command line operations.

First you need to download the kernel source code. Many Linux distributions provide specialized editions of the Linux kernel. Typically, you don't want to manually compile and install a custom kernel for these distributions. This does not mean that you can't, it simply means that you might be better off using the "official" kernels for your distribution, which can usually be obtained through your distribution's package manager. You can get the official, 100% free, and complete Linux kernel source code from http://www.kernel.org/. Look for "The latest stable version of the Linux kernel is:" and click the F on the same line (it links to the full source). Currently, the latest stable version is 2.6.20, and that's what I'll be using for this tutorial. Please note that commands which begin with a dollar sign ($) are executed as a regular user and commands beginning with a pound sign (#) are executed as a superuser.

$ cd /home/user/download
$ wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.20.tar.bz2

Now login as the superuser, and navigate to the /usr/src directory. Then extract the kernel source into that directory.

$ su -
# cd /usr/src
# tar jxf /home/user/download/linux-2.6.20.tar.bz2

You probably already have a symlink or shortcut called linux which points to your most recent kernel. If you do, delete the link and create another link to the new source tree. Then go into your kernel source tree.

# rm /usr/src/linux
# ln -s /usr/src/linux-2.6.20 /usr/src/linux
# cd /usr/src/linux

I like to identify each compile of my kernel uniquely, to make sure that I'm using the right one. To do that, you have to modify your Makefile:

# vi Makefile

You will see the following lines, or something similar, at the very top of the file:

VERSION = 2
PATCHLEVEL = 6
SUBLEVEL = 20
EXTRAVERSION =
NAME = Homicidal Dwarf Hamster

Change the EXTRAVERSION property to something you want to use to identify this kernel. I will use -jcv1:

EXTRAVERSION = -jcv1

The rest of the Makefile should be fine. In fact, I discourage editing Makefiles unless you know what you're doing. This next step is totally optional, but I like to do it to save some time. You can copy your existing kernel's configuration file in order to have a very similar kernel configuration. My previous kernel version was 2.6.19.1, so this is the command I use:

# cp /usr/src/linux-2.6.19.1/.config /usr/src/linux/

Then I run make oldconfig or make silentoldconfig to update my older kernel configuration file to be able to handle newer features. If you use oldconfig you are required to specify whether or not you want the new features included in your kernel, whereas silentoldconfig will use the defaults determined by kernel developers (they usually know best), asking for minimal input. Let's update our configuration file and then customize it by running make menuconfig (there are several options here, such as make xconfig and make gconfig, but I prefer the text-based menuconfig; there is another you can run by using make config, which runs through each and every option available--it's scary).

# make silentoldconfig
# make menuconfig

menuconfig is a menu-driven, text-based application which lets you navigate the features offered by the kernel. Each computer is considerably different from the next, so it really does no good to provide a list of things that I tweak. However, it is important to note what some of the symbols are in the menuconfig utility:

  • M = Module. Modules are loaded when they are required and can contribute to the speed of your system
  • * = built into the kernel. These are typically things which are necessary for your machine to function properly, such as support for your root file system.
  • X = exclusively selected. You'll see this when you select what type of processor you have, for example.

One thing to note before we go further is MAKE SURE YOUR KERNEL HAS BUILT-IN SUPPORT FOR YOUR ROOT FILE SYSTEM!!!! My root file system is reiserfs. In my configuration, I made sure that reiserfs was marked with a star. If you don't do this, your kernel won't boot and you will be very frustrated. Trust me.

Your computer is probably quite different than mine, so you might want to just poke around and see if you recognize things that deal with your computer's hardware. Once you are done tweaking your kernel configuration, exit the configuration utility and make sure the configuration is stored in /usr/src/linux/.config

Next we get to build and install the kernel. After that, we have to add an entry to our boot manager so that we can try out our new kernel. The compilation part usually takes just about a half hour on my 2.2GHz Turion64 processor with 1.25GB of RAM. It takes about 6 hours on my 300MHz Pentium II with 32MB of RAM. Let's find out how long it takes for you to compile your kernel!

# time make
...
real    27m29.663s
user    23m34.476s
sys     2m56.575s

Now let's install the modules and install the appropriate files in the boot area:

# make modules_install
# make install

This is the part that always used to mess me up. I use Slackware Linux, which is more UNIX-ish than most distributions. It's actually the oldest surviving Linux distribution to date, but that's another story. For some reason, the make install command doesn't always work with Slackware. There is a process I use to setup my boot directory when I compile a new kernel. I wrote a simple shell script called fixkernelinstall to take care of it for me:

#!/bin/bash
# Configure my computer for a new kernel
# Author: Josh VanderLinden
# Assisted By: Dan Purcell

# if the user didn't supply a kernel number, ask for it
if [ $# -eq 0 ]; then
    echo -n "Kernel: "
    read kernel
else
    kernel=$1
fi

# determine root partition
echo "Determining root partition..."
rootpart=`mount -l | grep ' / ' | cut -f 1 -d\ `
echo "Root partition is $rootpart"

# copy kernel configuration file
cp /usr/src/linux/.config ./config-$kernel

# now rename everything
echo "Renaming files..."
mv System.map System.map-$kernel
mv vmlinuz vmlinuz-$kernel

# if the config file exists and it's a symlink, remove it
if [ -f 'config' -a `stat config | grep -c 'symbolic link'` = '1' ]; then
    echo "Removing link to configuration file"
    rm config
else
    # otherwise it might be important
    echo "Renaming configuration file"
    mv config config.bak
fi

# Link files
echo "Creating symlinks..."
ln -s System.map-$kernel System.map
ln -s config-$kernel config
ln -s vmlinuz-$kernel vmlinuz

# Update lilo
echo "Adding entry to /etc/lilo.conf for $kernel"
echo "image = /boot/vmlinuz-$kernel" >> /etc/lilo.conf
echo "  root = $rootpart" >> /etc/lilo.conf
echo "  label = $kernel" >> /etc/lilo.conf
echo "  read-only" >> /etc/lilo.conf
echo "Linux kernel $kernel has been configured."
echo "Please check your lilo configuration and run lilo before rebooting"

I'm not an expert on shell scripts, so please feel free to offer suggestions for doing things better if you know how. This script uses the kernel version (given by the user) to set up my /boot directory properly. In my case, I run the script as such:

# cd /boot
# fixkernelinstall 2.6.20-jcv1

And the output is something like:

Determining root partition...
Root partition is /dev/hda5
Renaming files...
Renaming configuration file
Creating symlinks...
Adding entry to /etc/lilo.conf for 2.6.20-jcv1
Linux kernel 2.6.20-jcv1 has been configured.
Please check your lilo configuration and run lilo before rebooting

As you can see from the script, I use LILO instead of the arguably more popular GRUB. Either one works for me, but LILO is sufficient for my needs. If you want to use the same kind of script for a GRUB installation, just change the LILO part at the end to something like:

echo "Adding entry to /boot/grub/menu.lst for $kernel"
echo "  title Linux on ($rootpart)" >> /boot/grub/menu.lst
echo "  root (hd0,4)" >> /boot/grub/menu.lst
echo "  kernel /boot/vmlinuz-$kernel root=$rootpart ro vga=normal" >> /boot/grub/menu.lst

Make sure you change the line with root (hd0,4) to fit your setup. With GRUB, you don't have to worry about applying changes to see the menu entry at boot. It's automatically there. With LILO, however, you have to actually apply changes each time you make them. You do this by running the lilo command as the superuser:

# lilo
Added Windows
Added Linux
Added 2.6.20-jcv1 *

The star (*) signifies the default kernel to boot. Make sure that your root partition is correctly specified in your boot loader configuration. My root partition is on /dev/hda5, but yours may be (and probably is) on a different partition. If you fail to specify the correct root partition, your system will not boot that kernel until the configuration is fixed. GRUB makes this a lot easier than LILO.
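
If you aren't sure which partition your root filesystem lives on, you can check with the same trick the fixkernelinstall script uses:

# mount -l | grep ' / '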

And this is the point when you start to cross your fingers and hope that your computer doesn't blow up... We get to reboot our computer and hope that our configuration file plays well with our computer. So, let's do that! See you in a few minutes (hopefully).

# shutdown -r now

So here I am, back on Linux on my freshly-rolled kernel. I hope you are as successful as I have been this time around. Keep in mind that you have to reinstall custom kernel modules if you installed others while you were on your other kernel. For example, I use ndiswrapper to access wireless Internet. I have to recompile and reinstall the ndiswrapper module and device drivers before I can use wireless. Likewise, I have VMWare Server on my laptop, which installed special modules. I have to run vmware-config.pl to reconfigure VMWare Server for my new kernel before I can run any virtual machines.

To summarize, here are the commands that I used in this tutorial. Remember that lines beginning with a dollar sign ($) are executed as a non-privileged user, while lines beginning with the pound sign (#) are executed as the superuser (root).

$ cd /home/user/download
$ wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.20.tar.bz2
$ su -
# cd /usr/src
# tar jxf /home/user/download/linux-2.6.20.tar.bz2
# rm /usr/src/linux
# ln -s /usr/src/linux-2.6.20 /usr/src/linux
# cd /usr/src/linux
# make clean
# vi Makefile (to change EXTRAVERSION to -jcv1)
# cp ../linux-2.6.19.1/.config .
# make silentoldconfig
# make menuconfig (just to ensure settings were good)
# time make
# make modules_install
# make install
# cd /boot
# fixkernelinstall 2.6.20-jcv1
# vi /etc/lilo.conf (to make sure things were good)
# lilo
# shutdown -r now

I hope that you are able to use this tutorial to successfully install or upgrade your kernel. Good luck! Any comments or suggestions are welcome!

Linux Basics

Filesystem

  • /bin - This is where basic Linux commands reside (ls, du, dd, cp, etc).
  • /boot - Your boot images are stored here.
  • /dev - Links to access your machine's devices.
  • /etc - Configuration files and boot scripts.
  • /home - User directories, equivalent to "Documents and Settings" in Windows XP.
  • /lib - System libraries, codecs, etc., similar to Windows/System and Windows/System32.
  • /mnt, /media - Mount points. A mount point is a directory through which the contents of your hard drives, CD/DVD drives, floppy drives, or jump drives are made accessible.
  • /opt - Optional packages and programs. Could be thought of as a "Program Files" directory.
  • /proc - Special dynamic information about your system.
  • /root - System administrator home. Could be thought of as a "Documents and Settings/Administrator" directory.
  • /sbin - Super-user binaries. These programs need super-user (root) privileges to execute.
  • /tmp - Temporary files. Every user usually has read, write, and execute permissions here.
  • /usr - The main place for programs to be installed. Most like "Program Files" in Windows.
  • /var - System logs, mail spools, default web server directory, databases, etc...

Basic Commands

  • cd - Change Directory: moves to a different directory.

    Usage: cd directory, cd .., cd /directory

  • cp - CoPy: Copy a file or directory. If you wish to copy recursively and retain all attributes associated with the file or directory, use the -a option.

    Usage: cp original original.backup, cp -a /home/user/directory /home/user/backup

  • df - Disk usage on Filesystem: Display an overall summary of disk usage on mounted mountpoints. If you want human-readable sizes, use the -h option.

    Usage: df, df -h, df /mnt/mountpoint

  • du - Disk Usage: Display the disk usage of each file (recursively, by default) in the current directory. If you want human-readable sizes (1024 bytes = 1KB, 1024KB = 1MB, etc), use the -h option. If you want a summary of the total disk usage by a directory and everything inside, use the -s option.

    Usage: du, du -s, du -h, du -sh, du -s /directory

  • ln - LiNk: Create a link, or shortcut, to a file or directory. I prefer to do symlinks by using the -s option.

    Usage: ln original link, ln original /directory, ln original /directory/link, ln -s original /directory/link

  • ls - LiSt: lists the contents of a directory.

    Usage: ls, ls .., ls /directory/subdirectory

  • man - View the MANual page for a program or other file. Probably the most useful program ever.

    Usage: man program, man xorg.conf

  • mkdir - MaKe DIRectory: create a new directory/folder.

    Usage: mkdir dirname, mkdir /directory/newdirname

  • mv - MoVe: Move a file or directory to a new location, or rename a file or directory.

    Usage: mv file /directory/newhome, mv file newfilename

  • pwd - Print Working Directory: returns the full path of the directory in which you are working.

    Usage: pwd

  • rm - ReMove: Remove a file or directory. If you want to get rid of a directory and all of its contents, use rm -R or rm -Rf for recursive deletion.

    Usage: rm filename, rm /directory/filename, rm -Rf /directory/dirname

  • rmdir - ReMove DIRectory: remove a directory. The directory must be empty.

    Usage: rmdir dirname, rmdir /directory/dirname

  • whereis - Determine where a certain file exists (if it's in your path)

    Usage: whereis filename

  • whoami - Determine which user you are currently logged in as

    Usage: whoami

Linux Permissions

Linux has a great permission scheme. Since its inception, three basic levels of security have existed: user, group, and everyone. A simple way to change the permissions on a file or directory is to use the chmod, or CHange MODe, command. Changes to the permissions can be either a symbolic representation or an octal number representing the bit pattern for the new permissions. I prefer the symbolic method, myself, but many others prefer to see the octal pattern.

When working with permissions in Linux, always remember the following orders: User, Group, All; Read, Write, Execute. Those are the orders you will put the permissions in. Let's say that we want to make a file readable and writable only to the owner, while no one else will even be able to read the file. Here are some examples:

NOTE: Commands that begin with $ are executed as a regular user. Commands that begin with # are executed by a superuser (root). These two symbols (when they are the very first character in the command) are not entered by the user.

Symbolic:

$ echo "Hi" >> testing
$ chmod a-rwx,u+rw testing

Octal:

$ echo "Hi" >> testing
$ chmod 600 testing

Let's now examine the commands individually.

$ echo "Hi" >> testing

This command will append "Hi" (without the quotes) to the end of the file called testing. The file will be created if it does not already exist, assuming that the user has write permissions in the current directory. If you didn't want to append, you could overwrite anything that may be in the file by using a single > rather than >>.

$ chmod a-rwx,u+rw testing

This command first removes (the - in a-rwx) read (r), write (w), and execute (x) permissions from all users (the a in a-rwx) on the file called testing. Then it adds (the + in u+rw) read and write permissions for the file's owner (the u in u+rw).

$ chmod 600 testing

This command sets the permissions for everyone in one shot. Each digit is the sum of read (4), write (2), and execute (1), which is why I think of the digits in binary:

  • 1 = execute only;
  • 2 = write only;
  • 3 = write and execute, but no read;
  • 4 = read, but no write or execute;
  • 5 = read and execute, but no write;
  • 6 = read and write, but no execute;
  • 7 = read, write, and execute.

A digit is required for each level of permissions (user, group, and all). It is also possible to put a fourth digit in front of those three; it sets the special permission bits: 4 for setuid, 2 for setgid, and 1 for the sticky bit. These show up as an s or t in place of the execute permission when you list the file.

A couple more things about chmod: a directory must have the execute bit set before you can enter it (and the read bit to list its contents). Also, you can recursively apply permissions to a directory and everything underneath it with the -R option:

$ chmod -R a+rx /home/user/share

A couple of commands closely associated with chmod are chgrp (CHange GRouP) and chown (CHange OWNer).

chown will change the user ownership of files or directories, and it can change the group ownership at the same time. This can be done recursively with the -R option. The syntax is: chown [options] user[:group] file1 [file...]

chgrp will change the group ownership of files or directories. You can do this recursively with the -R option. The syntax is: chgrp [options] groupname file1 [file...]
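
For example (the user and group names here are made up):

# chown -R bob:users /home/bob/share
$ chgrp users report.txt

Changing a file's owner generally requires root privileges, which is why the chown line runs as root; changing the group only requires that you own the file and belong to the target group.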

Cronjobs

Cronjobs are similar to scheduled tasks in the Windows world. Scheduled tasks, or cronjobs, are simply programs that you want to run regularly, without having to type in the command every time you want them to run. Most distributions come with a cron daemon of some sort installed by default. Generally speaking, you can edit your cronjobs by typing crontab -e. This will bring up an editor (usually vi by default) in which you edit your cronjob file. Each user can have their own cronjobs (unless the administrator has disabled them, I would assume). Here is an example of a cronjob entry:

47 * * * * /usr/bin/run-parts /etc/cron.hourly 1> /dev/null

You'll notice the 47 with four *'s after it. These five fields are how the daemon knows when to execute a job. Here is what each field represents, in order:

  1. Minute: 0-59
  2. Hour: 0-23 (0 = midnight)
  3. Day of month: 1-31
  4. Month: 1-12
  5. Day of week: 0-6 (0 = Sunday)

So the example above will run at 47 minutes past every hour, every day, every month. You can also do some fancy things, like having a job run every 5 minutes, or at 15 and 45 minutes past the hour. Let's say that we want to grab our mail every 5 minutes. The cronjob entry would look something like:

*/5 * * * * /usr/bin/fetchmail

If we wanted to grab our mail every 2 hours but only on Mondays, we would use the following:

0 */2 * * 1 /usr/bin/fetchmail

To have a job run after 15 and 45 minutes, we could do this:

15,45 * * * * /usr/bin/fetchmail
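
And if you just want to review your current cronjobs without opening the editor, crontab -l will print your cronjob file:

$ crontab -l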

Pretty nifty, eh?

Make

This fancy utility is usually what you use to compile programs from source. The usual sequence of commands for compiling and installing a program from source in Linux is as follows:

$ ./configure
$ make
# make install

Most packages will follow this convention, but some require special procedures. Sometimes you can even get away with skipping make and jumping straight from ./configure to make install. It is always a good idea to read the README and INSTALL files included in source packages; they will generally tell you about anything out of the ordinary when compiling the source. Obviously, there is a lot more to this utility, but I'm not the person to explain it to you.
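
As a concrete (made-up) example, here is what building a typical source tarball looks like from start to finish; the --prefix option tells configure where the program should eventually be installed:

$ tar zxf program-1.0.tar.gz
$ cd program-1.0
$ ./configure --prefix=/usr/local
$ make
# make install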

Package Management

There are several different types of package managers. The most popular these days are .rpm (RedHat Package Manager) and .deb (Debian). There are some other kinds of packages, but they aren't as popular as RPM and DEB. Slackware uses a straight .tgz (gzipped tarball) as its package system. Frugalware uses .fpm, which are bzipped tarballs. In the end, packages are almost always gzipped or bzipped tarballs.

Each package system has its ups and downs. I've personally found RPM-based distributions to be overly slow, especially in their package management. DEB-based distributions seem a lot speedier when put up against RPM-based distros. However, I have found Slackware's TGZ-based system to be the most efficient and the fastest. Both RPMs and DEBs have dependency checking. In other words, the package manager will attempt to locate everything a program depends on in order to function properly before installing or upgrading that program.

A lot of people claim that .tgz packages are inferior to RPM and DEB because of the lack of dependency checking. By default, Slackware does not have dependency checking, but if you know what you're doing, you can get your dependencies a lot easier than you can with RPM or DEB (in my opinion).

RPM packages usually seem quite large compared to other package systems like DEB and TGZ. As far as I have seen, TGZ packages are smaller than both RPM and DEB packages. Here are a few options to help you use the RPM and TGZ package managers (some examples follow the list); I'm not familiar enough with Debian packages to cover them here:

  • RPM:
    • rpm -q or rpm --query: look for a package on your system
    • rpm -i or rpm --install: install a new package on your system
    • rpm -U or rpm --upgrade: upgrade a package which is already installed on your system
    • rpm -e or rpm --erase: remove a package from your system
  • TGZ:
    • pkgtool: a text-based package manager
    • installpkg: install a package onto your system
    • upgradepkg: upgrade a package which is currently installed on your system
    • removepkg: remove a package from your system

You can also get some other programs that are VERY useful for package management. I think the latest craze for RPMs is YUM; I have not had great luck with that utility, but a lot of people really like it. Debian packages have used apt-get for ages now. My favorite add-on for Slackware packages is called swaret. Other distributions use the pacman utility, which is very efficient. Each of these applications has several options and operation procedures.

Secure Shell and Secure Copy

One of my favorite aspects of Linux and other UNIX-derived systems is their secure shell capability (which is usually installed by default). Secure shell, or SSH, is a way for users to log into a remote computer and work on it as though it were right in front of them. Granted, it's all text-based unless you have X11 forwarding set up properly on both machines. But the command line interface (CLI) is extremely powerful--you should not be afraid to learn and use it. If you need to SSH from a Windows machine, you can use PuTTY.

In order to ssh into another computer, you simply type:

$ ssh hostname

or use the computer's IP address:

$ ssh xxx.xxx.xxx.xxx

By default, ssh on Linux machines will use the username of the account that you are running ssh from. Sometimes you need to log in as a different user than the one you're currently using. To do that, use the -l (lowercase L) option or format the destination like an e-mail address:

$ ssh hostname -l differentuser
$ ssh differentuser@hostname

Once your ssh session begins with the remote host, you will be asked to enter the password associated with the account you're attempting to log in as. If you do a lot of ssh'ing between machines, typing your password several times is not only annoying, it could also pose a security risk--wandering eyes might be watching each time you enter it. A great way to get around this is to generate a public/private key pair for your account. Once you do, you keep the private key on the machine you're ssh'ing out of and put the public key on the remote machine.

To generate a public/private key, you can use ssh-keygen:

$ ssh-keygen -t rsa

You will be asked to enter and verify a passphrase for your private key. If your goal is to avoid typing passwords, just hit enter twice to leave the passphrase empty. That's less secure, but it's a lot less hassle if you're only working on machines that no one else has access to. Your keys will usually be stored in ~/.ssh/ (~ is shorthand for your home directory, /home/yourusername).

The next step is to get your public key (the file ending in .pub, usually id_rsa.pub) onto the remote host and append it to the list of authorized keys. With OpenSSH, which is what most Linux distributions ship, that list lives in ~/.ssh/authorized_keys on the remote machine. Copy the key over, log in, and append it:

$ scp ~/.ssh/id_rsa.pub username@xxx.xxx.xxx.xxx:
$ ssh username@xxx.xxx.xxx.xxx
$ cat id_rsa.pub >> ~/.ssh/authorized_keys
$ rm id_rsa.pub

Also make sure that ~/.ssh is mode 700 and authorized_keys is mode 600 on the remote machine, or sshd may refuse to use them.

At this point you should be able to log into your remote host without your password (assuming you skipped the passphrase part of the key generation above).

As for the secure copy utility, you can get an idea of how to use it from the scp command above. This program uses the SSH system to securely copy files between two computers. This is how I use the scp command:

$ scp [-r] user@remote:/path/to/remote/file /local/destination/path/
$ scp [-r] /path/to/local/file user@remote:/remote/destination/path/

If you have set up public key authorization, you will not have to enter your password each time you use scp. Otherwise, you will be asked for a password each time you run scp.
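
For instance, to pull an entire directory of logs from a remote machine into the current directory (the host and paths here are made up):

$ scp -r user@host1:/var/log/myapp ./myapp-logs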

Archiving and Backup

There are many different kinds of compression and archiving tools in Linux. The most common are tarballs, gzipped files, and bzipped files. Below is what each of the three is for, along with some of their options (one more handy option follows the list):

  • tar - multiple files, little or no compression
    • c, --create - create a tarball
    • f, --file - specify the tarball's filename
    • x, --extract, --get - extract the contents of a tarball
    • j, --bzip2 - use bzip2 compression/decompression
    • v, --verbose - show verbose output
    • z, --gzip, --ungzip - use gzip compression/decompression
    • to create a tarball called filename.tar which contains all of the files in /dir/to/archive: $ tar cf filename.tar /dir/to/archive
    • to create a tarball called filename.tar.gz which contains all of the files in /dir/to/archive and gzip it: $ tar zcf filename.tar.gz /dir/to/archive
    • to create a tarball called filename.tar.bz2 which contains all of the files in /dir/to/archive and bzip it: $ tar jcf filename.tar.bz2 /dir/to/archive
    • to extract the contents of a tarball called filename.tar.gz to the current directory: $ tar zxf filename.tar.gz
    • to extract the contents of a tarball called filename.tar.bz2 to the current directory: $ tar jxf filename.tar.bz2
  • gzip - single file compression
    • To gzip a file called filename to make it filename.gz: $ gzip filename
    • To gunzip a file called filename.gz to make it filename: $ gunzip filename.gz
  • bzip2 - single file compression
    • To bzip a file called filename to make it filename.bz2: $ bzip2 filename
    • To bunzip a file called filename.bz2 to make it filename: $ bunzip2 filename.bz2
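
One more tar option worth knowing that isn't in the list above: t, --list, which shows a tarball's contents without extracting anything:

$ tar tzf filename.tar.gz
$ tar tjf filename.tar.bz2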