Arduino-Powered Webcam Mount

Earlier this month, I completed yet another journey around the biggest star in our galaxy. Some of my beloved family members thought this would be a good occasion to send me some cash, and I also got a gift card for being plain awesome at work. Even though we really do need a bigger car and whatnot, my wife insisted that I only spend this money on myself and whatever I wanted.

Little did she know the can of worms she just opened up.

I took pretty much all of the money and blew it on stuff for my electronics projects. Up to this point, my projects have all been pretty boring simply because nothing ever moved--it was mostly just lights turning on and off or changing colors. Sure, that's fun, but things really start to get interesting when you actually interact with the physical world. With the birthday money, I was finally able to buy a bunch of servos to begin living out my childhood dream of building robots.

My first project since getting all of my new toys was a motorized webcam mount. My parents bought me a Logitech C910 for my birthday because they were tired of trying to see their grandchildren with the crappy webcam that is built into my laptop. It was a perfect opportunity to use SparkFun's tutorial for some facial tracking (thanks to OpenCV) using their Pan/Tilt Servo Bracket.

It took a little while to get everything setup properly, but SparkFun's tutorial explains perfectly how you can get everything setup if you want to repeat this project.

The problem I had with the SparkFun tutorial, though, is that it basically only gives you a standalone program that does the facial tracking and displays your webcam feed. What good is that? I actually wanted to use this rig to chat with people!! That's when I set out to figure out how to do this.

While the Processing sketch ran absolutely perfect on Windows, it didn't want to work on my Arch Linux system due to some missing dependencies that I didn't know how/care to satisfy. As such, I opted to rewrite the sketch using Python so I could do the facial tracking in Linux.

This is still a work in progress, but here's the current facial tracking program which tells the Arduino where the webcam should be pointing, along with the Arduino sketch.

Now that I could track a face and move my webcam in Linux, I still faced the same problem as before: how can I use my face-tracking, webcam-moving program during a chat with my mom? I had no idea how to accomplish this. I figured I would have to either intercept the webcam feed as it was going to Skype or the Google Talk Plugin, or I'd have to somehow consume the webcam feed and proxy it back out as a V4L2 device that the Google Talk Plugin could then use.

Trying to come up with a way of doing that seemed rather impossible (at least in straight Python), but I eventually stumbled upon a couple little gems.

So the GStreamer tutorial walks you step-by-step through different ways of using a gst-launch utility, and I found this information very useful. I learned that you can use tee to split a webcam feed and do two different things with it. I wondered if it would be possible to split one webcam feed and send it to two other V4L2 devices.

Enter v4l2loopback.

I was able to install this module from Arch's AUR, and using it was super easy (you should be root for this):

modprobe v4l2loopback devices=2

This created two new /dev/video* devices on my system, which happened to be /dev/video4 and /dev/video5 (yeah... been playing with a lot of webcams and whatnot). One device, video4, is for consumption by my face-tracking program. The other, video5, is for VLC, Skype, Google+ Hangouts, etc. After creating those devices, I simply ran the following command as a regular user:

gst-launch-0.10 v4l2src device=/dev/video1 ! \
    'video/x-raw-yuv,width=640,height=480,framerate=30/1' ! \
    tee name=t_vid ! queue ! \
    v4l2sink sync=false device=/dev/video4 t_vid. ! \
    queue ! videorate ! 'video/x-raw-yuv,framerate=30/1' ! \
    v4l2sink device=/dev/video5

There's a whole lot of stuff going on in that command that I honestly do not understand. All I know is that it made it so both my face-tracking Python program AND VLC can consume the same video feed via two different V4L2 devices! A co-worker of mine agreed to have a quick Google+ Hangout with me to test this setup under "real" circumstances (thx man). It worked :D Objective reached!

I had really hoped to find a way to handle this stuff inside Python, but I have to admit that this is a pretty slick setup. A lot of things are still hardcoded, but I do plan on making things a little more generic soon enough.

So here's my little rig (why yes, I did mount it on top of an old Kool-Aid powder thingy lid):

And a video of it in action. Please excuse the subject of the webcam video, I'm not sure where that guy came from or why he's playing with my webcam.