Sunday, February 21, 2010

Better Image Capture

I tried to figure out which webcams would work under Linux and let me control the shutter speed. The existence of setpwc focused me on Logitech webcams. The marketing copy didn't mention shutter speed but did mention something called "RightLight2" technology, which I figured was related (parenthetically, it seems that as a particular technology improves, it becomes less intelligible to technologists; the reason, of course, is that it becomes more intelligible to the non-technologists who outnumber us). Anyway, I bought a Logitech C905, which was about $70. Sadly, it was not compatible with setpwc because it is too new: it uses the uvcvideo driver. However, whatever the default settings are on the C905, the image capture looked good in broad daylight. Plug and play for once ...

The real problem was mounting the camera. My cheap $10 webcam with horrible shutter speed had a perfect suction cup mount. The C905 comes with a clip that I'd hoped would clip to my rear-view mirror but the mirror was too thick on account of fancy auto-dimming logic. So I dismantled the cheap $10 webcam down to the suction cup and then tied the C905 to it. Hey, I paid $10 for a suction cup ...

Now I have some real data: we had to go to a birthday in Culver City today so I did a capture for the trip, which included a police car.

The next step will be some tools to manage monster collections of image training data because I had to manually binary search through the pile of today's capture to find this police car.

Thursday, February 4, 2010

Feature Extraction

I'm still trying to find a Linux-compatible webcam with a shutter speed fast enough to capture motion in direct sunlight. Until then, more armchair computer vision 101 ...

Winn et al. use a standard set of filters for extracting features from images: Gaussian, derivatives of Gaussian (aka DoG), and Laplacian of Gaussian (aka LoG). Here's what they look like:

Laplacian of Gaussian
Y-derivative of Gaussian
X-derivative of Gaussian
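
For reference, these are the standard textbook forms of the three filter families, written with a scale parameter σ. The normalization shown here is an assumption; the post does not say which one Winn et al. use. Each filter factors into 1-D functions of x and y, except the Laplacian, which is a sum of two such products.

```latex
\begin{align*}
g_\sigma(t) &= \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{t^2}{2\sigma^2}\right),
&
G_\sigma(x,y) &= g_\sigma(x)\,g_\sigma(y)
  = \frac{1}{2\pi\sigma^2}\exp\!\left(-\frac{x^2+y^2}{2\sigma^2}\right), \\[4pt]
g_\sigma'(t) &= -\frac{t}{\sigma^2}\,g_\sigma(t),
&
\frac{\partial G_\sigma}{\partial x} &= g_\sigma'(x)\,g_\sigma(y),
\qquad
\frac{\partial G_\sigma}{\partial y} = g_\sigma(x)\,g_\sigma'(y), \\[4pt]
g_\sigma''(t) &= \frac{t^2-\sigma^2}{\sigma^4}\,g_\sigma(t),
&
\nabla^2 G_\sigma &= g_\sigma''(x)\,g_\sigma(y) + g_\sigma(x)\,g_\sigma''(y)
  = \frac{x^2+y^2-2\sigma^2}{\sigma^4}\,G_\sigma(x,y).
\end{align*}
```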

OpenCV's GaussianBlur method is flexible enough to reproduce the Gaussian filters of Winn et al., but to reproduce their use of DoG and LoG I wrote my own OpenCV kernels (although maybe a combination of the GaussianBlur method with the Laplacian method and Sobel method would work?). It turns out separable filters are much faster, and everything above except the LoG is separable; in addition, the LoG can be written as the sum of two separable filters, the second derivatives of Gaussian in x and in y. Thus with something like

#include <cassert>
#include <sstream>
#include <string>
#include "cv.h"
#include "highgui.h"

#include "cyclopsutil.hh"

using namespace cv;
using namespace cyclops;

int main (int argc, char** argv)
{
  assert (argc > 1);

  string file (argv[1]);
  Mat img = imread (file);
  vector<Mat> color (3);

  assert (! img.empty ());
  assert (img.channels () == 3);

  // convert to Lab and write out each channel on its own
  cvtColor (img, img, CV_BGR2Lab);

  split (img, color);

  imwrite (file + ".L.png", color[0]);
  imwrite (file + ".a.png", color[1]);
  imwrite (file + ".b.png", color[2]);

  // sigma = 1, 2, 4, 8; only the L channel (color[0]) is filtered
  for (int sigma = 1; sigma < 16; sigma *= 2)
    {
      string labels[3] = { string ("L"), string ("a"), string ("b") };
      Mat conv = color[0].clone ();
      Mat convtwo = color[0].clone ();
      std::stringstream out;
      out << sigma;

      GaussianBlur (color[0], conv, Size (6 * sigma + 1, 6 * sigma + 1), sigma, sigma, BORDER_REPLICATE);
      imwrite (file + "." + labels[0] + ".gauss." + out.str () + ".png", conv);

      // ddepth = -1 keeps the source depth, so on an 8-bit image
      // negative filter responses saturate to 0
      sepFilter2D (color[0], conv, -1, dygauss_kernelx (sigma), dygauss_kernely (sigma), Point (-1, -1), 0, BORDER_REPLICATE);
      imwrite (file + "." + labels[0] + ".dygauss." + out.str () + ".png", conv);

      sepFilter2D (color[0], conv, -1, dxgauss_kernelx (sigma), dxgauss_kernely (sigma), Point (-1, -1), 0, BORDER_REPLICATE);
      imwrite (file + "." + labels[0] + ".dxgauss." + out.str () + ".png", conv);

      sepFilter2D (color[0], conv, -1, dyygauss_kernelx (sigma), dyygauss_kernely (sigma), Point (-1, -1), 0, BORDER_REPLICATE);
      imwrite (file + "." + labels[0] + ".dyygauss." + out.str () + ".png", conv);

      sepFilter2D (color[0], convtwo, -1, dxxgauss_kernelx (sigma), dxxgauss_kernely (sigma), Point (-1, -1), 0, BORDER_REPLICATE);
      imwrite (file + "." + labels[0] + ".dxxgauss." + out.str () + ".png", convtwo);

      // LoG = d2/dy2 response (conv) + d2/dx2 response (convtwo)
      imwrite (file + "." + labels[0] + ".lapgauss." + out.str () + ".png", conv + convtwo);
    }

  return 0;
}

applied to our awesome cop car picture's luminosity component we have:
Gaussian (sigma = 1)
Gaussian (sigma = 8)
DxGaussian (sigma = 1)
DxGaussian (sigma = 8)
DyGaussian (sigma = 1)
DyGaussian (sigma = 8)
Laplacian of Gaussian (sigma = 1)
Laplacian of Gaussian (sigma = 8)
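
The kernel helpers used in the listing (dxgauss_kernelx, dxxgauss_kernelx and friends) come from cyclopsutil.hh, which isn't shown here. As a rough sketch of what such helpers might compute, here is a plain C++ version with no OpenCV dependency. The names gauss_kernel_1d and log_separability_error are hypothetical, and the 3 * sigma radius is an assumption chosen to match the 6 * sigma + 1 window passed to GaussianBlur. The second function checks numerically that the LoG really is the sum of two separable filters.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the 1-D kernels behind the post's helpers.
// order 0: sampled Gaussian, 1: first derivative, 2: second derivative.
// The radius of 3 * sigma gives a window of 6 * sigma + 1 taps (assumption).
std::vector<double> gauss_kernel_1d(double sigma, int order) {
    assert(sigma > 0.0 && order >= 0 && order <= 2);
    const double pi = 3.14159265358979323846;
    const int radius = static_cast<int>(std::ceil(3.0 * sigma));
    const double s2 = sigma * sigma;
    const double norm = 1.0 / (std::sqrt(2.0 * pi) * sigma);

    std::vector<double> k(2 * radius + 1);
    for (int i = -radius; i <= radius; ++i) {
        const double t = static_cast<double>(i);
        const double g = norm * std::exp(-(t * t) / (2.0 * s2));
        double v = g;                                         // g(t)
        if (order == 1) v = -(t / s2) * g;                    // g'(t)
        if (order == 2) v = ((t * t - s2) / (s2 * s2)) * g;   // g''(t)
        k[i + radius] = v;
    }
    return k;
}

// Largest absolute difference between the directly sampled 2-D LoG and
// the sum of the two separable outer products g''(x) g(y) + g(x) g''(y).
double log_separability_error(double sigma) {
    const double pi = 3.14159265358979323846;
    const std::vector<double> g = gauss_kernel_1d(sigma, 0);
    const std::vector<double> gxx = gauss_kernel_1d(sigma, 2);
    const int n = static_cast<int>(g.size());
    const int radius = n / 2;
    const double s2 = sigma * sigma;

    double worst = 0.0;
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) {
            const double dx = x - radius;
            const double dy = y - radius;
            const double r2 = dx * dx + dy * dy;
            const double direct = ((r2 - 2.0 * s2) / (s2 * s2))
                                * std::exp(-r2 / (2.0 * s2)) / (2.0 * pi * s2);
            const double separable = gxx[x] * g[y] + g[x] * gxx[y];
            worst = std::max(worst, std::fabs(direct - separable));
        }
    }
    return worst;
}
```

Calling log_separability_error for any of the sigmas in the loop should return a value at the level of floating-point rounding. The speed argument is just arithmetic: at sigma = 8 the window is 49 taps wide, so a full 2-D kernel costs 49 * 49 = 2401 multiplies per pixel, one separable filter costs 2 * 49 = 98, and the LoG as two separable filters costs 196.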