Simple Weighted Sort in Python

Last night I found myself in need of a simple weighted sort function in Python. I had a list of integers which represented object IDs in my project. Some of the objects needed to be processed before the others while iterating over the list of integers, and I already knew which object IDs those were. The order the rest of the object IDs were processed didn't matter at all. I just wanted the special object IDs to arrive at the beginning of the list, and the remaining object IDs could be in any order.

I was surprised at how simple it was to produce such a weighted sort. Here's an example of what I did:

import random
object_ids = [random.randint(0, 100) for i in range(20)]
special_ids = [random.choice(object_ids) for i in range(5)]
print 'Object IDs:', object_ids
print 'Special IDs:', special_ids

object_ids.sort(key=special_ids.__contains__, reverse=True)
print 'Object IDs:', object_ids

And some sample output:

Object IDs: [13, 97, 67, 5, 77, 58, 24, 99, 29, 20, 29, 75, 100, 31, 79, 5, 27, 11, 6, 1]
Special IDs: [13, 1, 27, 6, 67]
Object IDs: [13, 67, 27, 6, 1, 97, 5, 77, 58, 24, 99, 29, 20, 29, 75, 100, 31, 79, 5, 11]

Notice that each of the "special" IDs have shifted from their original position in the object_ids list to be at the beginning of the list after the sort.

The Python documentation for sort says that the key argument "specifies a function of one argument that is used to extract a comparison key from each list element." I'm using it to check to see if a given element in the list is in my special_ids list. If the element is present in the special_ids list, it will be shifted to the left because of the way the special_ids.__contains__ works.

In sorting, a value of 1 (or other positive integer) out of a comparison function generally means "this belongs to the right of the other element." A value of -1 (or other negative integer) means "this belongs to the left of the other element." A value of 0 means "these two elements are equal" (for the purposes of sorting). I'm assuming it works similarly with the key argument. Please correct me if I'm wrong!

As lqc states in the comments below, the key argument works differently. It creates a new sequence of values which is then sorted. Before lqc jumped in, I was using key=int(i in special_ids) * -2 + 1 to do the sorting, which is pretty dumb. Using key=special_ids.__contains__ is much more appropriate. Thanks lqc!!

This sort of weighted sort might not be just right for your needs, but hopefully it will give you a place to start to build your customized weighted sort!


Comments powered by Disqus