Sunday, January 31, 2010

Initial Data Collection Attempt

Well, I finally got the webcam and the netbook into the car and started doing video capture. The good news is, the suction cup mount on my $10 webcam is perfect for the windshield.


The bad news is, it turns out $10 webcams do not automatically adjust gain. Driving around during the day yields a time-lapse movie of what appears to be the afterlife. Here's a daytime driving still:


I see dead people! Twilight looks a little better:

But many of the stills have a lot of blur in them; perhaps the low-quality CCD has a slow shutter speed.

Maybe if I can find a software shutter speed control then I can both operate during the day and eliminate most of the blur. Otherwise, I'm going to have to pony up for a better camera.

Tuesday, January 26, 2010

Computer Vision 101

Since it's been over a decade since I've done anything computer vision related, I need to acquaint myself with progress in the space. I figure reproducing a current paper should get me in the zone; furthermore, I'm familiar with latent Dirichlet allocation (LDA) as applied to text and I'm aware that it has been applied to images, so I figure that's a good place to start.

Wang and Grimson describe an extension to LDA which incorporates spatial correlation; but there is enough to confuse me already, so I need something involving vanilla LDA (which I'm familiar with). Going back a bit further, Sivic et al. describe an application of vanilla LDA to image object recognition. Since LDA applies to discrete documents (that is, something consisting of "words"), the images need to be decomposed into sets of tokens. The approach taken is:
  1. Extract feature vectors from local patches of the image.
  2. Use vector quantization to reduce the feature vectors to a codebook.
Sivic et al. extract features from interest regions, whereas Wang and Grimson follow the approach of Winn et al. and extract features densely. Interest regions are just another way for me to get confused and screw something up, so I'll defer them for now.

Step #1 for Winn et al. is to decompose the image from RGB to CIELAB. CIELAB is a cool encoding of color space which is supposed to mimic the response of the human visual system, so that makes sense. Poking around for an implementation of a converter, I came across OpenCV, which has tons of routines in it, including color conversion. Writing a program to split an image into CIELAB components is pretty straightforward:
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>
#include "cv.h"
#include "highgui.h"

using namespace cv;

int main (int argc, char** argv)
{
  if (argc < 2)
    {
      fprintf (stderr, "usage: %s image\n", argv[0]);
      return 1;
    }

  string file (argv[1]);
  Mat img = imread (file);
  assert (! img.empty ());
  assert (img.channels () == 3);

  // imread loads color images in BGR channel order.
  cvtColor (img, img, CV_BGR2Lab);

  vector<Mat> color (3);
  split (img, color);
  imwrite (file + ".L.png", color[0]);
  imwrite (file + ".a.png", color[1]);
  imwrite (file + ".b.png", color[2]);

  return 0;
}
(I guessed that a loaded .jpg image would be BGR color order, based upon the documentation for imwrite). So let's see this in action: here's a picture of a police car.

Here's the L (luminosity) channel:

Here's the a* (magenta vs. green) channel:

Here's the b* (yellow vs. blue) channel:

Monday, January 25, 2010

Car Brain!


My New Year's resolution is to give my car a rudimentary visual system, which should give me plenty to blog about.
My initial goal is to train the car to identify police cars, mount 4 cameras to allow for a panoramic view, and have some kind of audible or visual warning. Is identifying a police car feasible? I'm not a computer vision expert, but from searching around, these points appear salient:
  • single object class: the question is binary, "does this picture contain a police car?", which makes the problem easier.
  • pose variability is low: the cameras will be mounted at fixed points on my car, and both my car and any police car will (barring the "Dukes of Hazzard" scenario) have all four wheels on the ground, which should moderate the pose variability.
  • volatile illumination: ideally, the detector would operate under all weather conditions, day or night, so that means significant illumination changes.
Well, we'll see how far I get anyway.

Step #1 was to get a brain for my car, so I purchased a refurbished Dell Mini 110-1030nr for $280 from Amazon and put Ubuntu Netbook Remix on it. I also got one webcam for now.


It is presumably underpowered for the image processing that will be required, but if I get that far that will justify spending more money. First, I need to install it in the car and have it do video capture; from the resulting data I imagine many interesting problems will suggest themselves.