Deer Detection with Machine Learning (Part 2)


Last time I wrote about building a simple Java program to take pictures of our backyard to keep track of the murderous deer that infest it. To recap, I want to use machine learning to teach a computer to recognize that this is a deer:

This is a deer.

And teach the computer that this is not a deer:

This is not a deer.

When I last wrote, I was going to have the computer take pictures every 10 seconds via a Kinect, and then create my training data set by finding lots of pictures with deer in them, and lots of pictures that were deer-free. However, what I thought would be a walk in the park ended up being a walk through a scary, deer-absent park of horrors instead.

A Watched Yard Has No Deer…

I have no idea if the deer has a good data plan and reads my blog (antlers make good cell receivers), or if it has some form of ESP and can read my thoughts – in either case, after 7 days of observation I have too few examples of deer for my machine learning approach to work. This is despite the mounting evidence (i.e. deer poop – the jerk) that the deer is in the yard quite often.

A second or two on the internet provided me with a fairly good answer. As it turns out, deer are primarily nocturnal, which means I’ve been watching at the wrong time of day. I find this very strange, since we’ve seen the deer in our yard quite often during the daytime. Luckily, the Kinect has a second camera – one that sees in infrared.

Note: it’s not entirely true that I saw NO deer – I did catch a few glimpses of them. The problem was that there were too few images to use for training. Here’s an example of one of them:


A picture of the deer – if you squint, you can just make out its legs. Obviously, this deer is very camera shy.

Alone in the Dark

To take depth images, the Kinect uses an infrared camera, and a cool infrared laser (which I will call the IR emitter from here on in) that shoots out a pattern of dots. The IR camera uses the dot pattern from the IR emitter to figure out how close or far away objects in the image are.

Using the OpenKinect libfreenect wrappers, it is actually quite easy to take an IR picture. When taking a video, specify the video format as VideoFormat.IR_8BIT. Each frame will have a ByteBuffer where a single byte represents the light intensity of a single pixel. To convert that into grayscale, just set the R, G and B components to the byte value. The following Java function will convert the ByteBuffer properly:

private BufferedImage getIR8BITBufferedImage() {
    int width = sFrameMode.width;
    int height = sFrameMode.height;
 
    // Convert the ByteBuffer into a ByteArrayInputStream for easier access
    byte[] data = new byte[sByteBuffer.remaining()];
    sByteBuffer.get(data);
    ByteArrayInputStream stream = new ByteArrayInputStream(data);
 
    mBufferedImage = new BufferedImage(width, height, 
                BufferedImage.TYPE_BYTE_GRAY);
 
    // Loop through the image data and write it into the BufferedImage
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int g = stream.read();
            mBufferedImage.setRGB(x, y, new Color(g, g, g).getRGB());
        }
    }
    return mBufferedImage;                
}
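
As for the capture side (the bit where you specify VideoFormat.IR_8BIT), it’s essentially the same flow as the RGB takeSnapshot method on the Monitor class I described in Part 1. Here’s a rough sketch of what an IR variant might look like – takeIRSnapshot is just an illustrative name, and it reuses the same Device, VideoFrameHandler and VideoFrame classes from Part 1 (the red LED calls are left out):

public VideoFrame takeIRSnapshot() throws InterruptedException {
    // Ask the Kinect for 8-bit IR frames instead of RGB before starting the stream
    mDevice.setVideoFormat(VideoFormat.IR_8BIT);
    mVideoHandler = new VideoFrameHandler();
    mDevice.startVideo(mVideoHandler);

    // Wait until the handler has received at least one frame
    while (mVideoHandler.getVideoFrame() == null) {
        Thread.sleep(100);
    }

    VideoFrame videoFrame = mVideoHandler.getVideoFrame();
    mDevice.stopVideo();
    return videoFrame;
}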

To test it out, I put the camera in a window and snapped a few IR photos during the daytime. It took some tweaking to get it working really well: it turns out that the dot pattern generated by the IR emitter reflects off the window, causing glare. After covering the IR emitter with a tea towel, and framing the Kinect with a similar material to cut down on room glare, the resulting images looked like this:

An IR photo snapped during the daytime – ooooh, aaaah.

So, how well does it work at night? Here’s the result:


Here’s an image that the Kinect took at night through the window. Notice the subtle variations of black throughout the frame.

The picture is the same whether it’s taken through the window or through the screen. When I uncovered the IR emitter, the result was the glare of the IR dot pattern against the window. Unfortunately, without a source of IR light in the 830nm wavelength illuminating the yard, the IR camera isn’t able to pick up anything. I blame the deer.

Stepping into the Limelight

The solution was to find an infrared illumination source. I managed to find some cheap CCTV IR emitters on eBay. While they are of the 850nm variety, there is enough bleed across other wavelengths for the Kinect IR camera to pick it up (in fact, they glow faintly red, meaning that they bleed all the way into the 700nm range!). Here’s a picture of one of the IR emitters:


One of the IR emitters I bought on eBay. It also has a nifty photo-sensitive cell in it, so it only turns on at night.

Note: many people have problems with these, because the box doesn’t list the power supply requirements. For the record, it’s 12V DC, 500 mA, with a center-positive barrel connector. With a little digging, you can pick a used supply up at your local Value Village.

So, after waiting a week for my IR emitters to arrive, I plugged them in, aimed them out the window, and turned on the camera at night to see what happened.

And…

Still nothing. Just a black image. To double-check that the Kinect could actually see the light cast by the IR emitters, I turned them on in my room and pointed the camera at myself. The image came back brightly lit, even though I was sitting in the dark.

Not one to give up so easily, I took the IR emitters and the Kinect outside, pointed everything straight ahead, and had my DW stand roughly 10 feet in front of the rig. Here’s the result:


My DW about 10 feet away from the Kinect and IR emitters. I’ve actually enhanced the image so you can see what the Kinect picked up.

If you look closely, you can just about see her outline in the image. Note that she isn’t a deer. But there could be one lurking behind her – we’ll never know!

The verdict: the IR camera on the Kinect isn’t sensitive enough to the light being cast by the IR emitters. I was hoping there would be a lot of bleed-through to the 830nm wavelength the Kinect uses, but alas, it just wasn’t happening. That isn’t good, since every night that passes is another night when the deer might attempt to murder me.

So, we’re going to need a bigger boat… er… smaller camera.

Easy as Raspberry Pi

A while ago I bought a little device called a Raspberry Pi. For those of you who don’t know, it’s a neat little computer on a single circuit board that is about the size of a business card.


One of my early model B Raspberry Pi devices in a case, next to a business card. It’s slightly bigger than a business card. Well, bigger and warmer than a business card – and pokier. Plus it needs power.

The Raspberry Pi is a very cool low-power computer that runs various operating systems, and has many different input and output options available for it. One new add-on recently released for the Pi is the NOIR camera module. The NOIR stands for “no IR” – meaning the IR filter has been removed from the camera. Most normal cameras have an IR filter on them so that you only see the wavelengths that you are used to seeing through your own wetware (i.e. your eyeballs). Without an IR filter, the pictures you get back look a little strange.

A Film NOIR

The NOIR camera plugs into the Raspberry Pi board and, with a few tweaks to the OS, is easy to get up and running. To enable the camera, simply run:

sudo raspi-config

One of the options on the screen lets you enable or disable the Raspberry Pi camera module. With that done, all I needed to do was mount the camera and fiddle with the exposure time and white balance settings through Python using the picamera module.

Note: I could have used the raspistill command line program, but it has the annoying habit of turning on the camera’s red LED when it takes a picture, and I wanted to control some other aspects of the camera as well. I created a GitHub repository with the script I used to turn off the LED while taking pictures.

Here is a picture of the rig with the IR camera:


The finished rig pointed out the back window.

Inside the white case is a newer Raspberry Pi model B. I used an SD card case as a case for the NOIR camera. The washcloth underneath is there to stop the glare from the window, and I actually put another one on top as well (imagine olde-timey cameras – the person taking the picture needed to stand under a black cover – same thing here to cut down on glare). The Raspberry Pi also has a wireless USB stick in it, so that it can save the pictures to a NAS device I have on our network.

Night Test

After running through several tests, I waited for nightfall and then I carefully aimed both IR emitters out the window, and snapped a photo (well, many actually – it involved a lot of readjustment for both the camera and the IR emitters). Here’s one of my sample pictures:


An IR photo of me taken at nearly 1 in the morning. Yes, I’m wearing a hat – I’m a redhead. I also felt very silly standing in pitch blackness waving at the camera.

There is enough exposure to clearly see everything out in the back yard! Excellent! No deer can hide now!

The only caveat to this is that I’m badly over-exposing each frame I take in order to get enough IR light to the camera. To give you a sense of the over-exposure – the shutter on the camera remains open for nearly 6 seconds. The result is that any moderate or fast motion will be very blurry. I’ll try a few more experiments with it to see if I can get a better image.

Conclusions

The deer is still out there… somewhere… plotting my murder. Unfortunately, the Kinect IR camera isn’t sensitive enough to the light thrown by my IR emitters. Luckily the Raspberry Pi NOIR camera sees very well in the dark. For now, it’s back to data collection mode both during the day, and at night. Mu-ha-ha-ha!

Check back soon, when I’ll actually start describing the machine learning component I’m hoping to use. I’ll try it out on another shady backyard visitor – one who wears a mask and steals our blackberries!

Deer Detection with Machine Learning (Part 1)


So, I have a deer who visits my yard. The deer wants to murder me and eat all of my vegetables. Well, the deer probably doesn’t really want to murder me, but it probably does want to eat all of the vegetables that we have in our garden. When we originally moved here, there were two deer that would frequent our yard just about every day. They aren’t really a problem – more of an unexpected annoyance when we want to pick apples or take out the compost.

This is the deer. It is looking at me like it wants to murder me.

The deer are not afraid of us either. If we come out into the yard, they will just stand there (or sit there) looking at us like we’re pond scum. I’m not sure why they seem to loathe us so much – it probably has something to do with the deer fence we have around our garden.

Keeping an Eye Out

When I snapped the picture above and posted it to Facebook, I got some interesting responses back (you know who you are, AP!). One of them was about a gentleman who used machine learning to recognize squirrels and fire a water gun at them. I don’t want to fire anything at the deer in my yard; I’m just curious about what exactly they are doing and when they are hanging around.

Given that the deer enter the yard somewhat infrequently, I want some method that will detect when they are present, snap some photos – or even a video – of what they are doing, and then turn off when they leave. Placing a camera to take pictures of the back yard is relatively simple. For the actual deer detection, I decided to use a little machine learning. Essentially, I want the computer to sort through thousands of images or hours of video looking for the deer so that I don’t have to. Think of it as a Gorilla Detector, but for deer (since we all know about the dangers of undetected gorillas).

Collecting Data with the Kinect

I’m going to be using a simple supervised machine learning technique. With supervised machine learning, to teach a computer to recognize when a deer is in a picture, I need to feed it pictures that have deer and pictures that don’t have deer. So, step 1 in the project is to collect pictures of the deer.

Remember how I said up above I didn’t want to sort through thousands of pictures to see if there is a deer in them? Well, unfortunately, that’s what I have to do to get my training examples to teach the computer (lousy deer – yes, I blame you!). But, if all goes well, it shouldn’t take too long.

To take pictures, I decided to use an Xbox 360 Kinect. Why? Well, I’m hoping to use the IR camera to take pictures of the deer at night. Plus, the OpenKinect project has some nice drivers available for many platforms, as well as wrappers for many programming languages. It is relatively easy to capture either a single frame or a video using the open source drivers. And the Kinect is just plain cool (it has an LED you can turn on and off!).

Installing OpenKinect Drivers

Installing the drivers under Ubuntu 14.04 is relatively simple – it was just a matter of using apt-get to install the necessary package:

sudo apt-get install freenect

Then, I plugged in the Kinect. Running dmesg showed me that the kernel successfully recognized the Kinect:

[34089.811775] usb 2-2.3: New USB device found, idVendor=045e, idProduct=02ae
[34089.811782] usb 2-2.3: New USB device strings: Mfr=2, Product=1, SerialNumber=3
[34089.811786] usb 2-2.3: Product: Xbox NUI Camera
[34089.811789] usb 2-2.3: Manufacturer: Microsoft

Excellent! Step one of my ridiculously circuitous plan was complete.

Testing Out the Kinect

The next step was to ensure that I could actually acquire data from the Kinect sensors. I hopped over to the OpenKinect GitHub account, and checked out whether they had any sample programs. Sure enough, their wrapper classes had some examples for grabbing video and pictures. Their Python examples looked simple enough, so I decided to try them out.

First, I cloned their repo:

git clone https://github.com/OpenKinect/libfreenect

Then, I installed some necessary Python packages:

sudo apt-get install python-freenect
sudo apt-get install python-opencv

From there, it was a simple matter of running their demo to grab both an RGB image and an infrared depth image:

cd libfreenect/wrappers/python
./demo_cv_sync.py

Here is an example of the infrared capture:


Infrared depth image of me waving at the camera. Hi!

Enter Java

With the Python examples demonstrating that the Kinect works, the next step was to build a simple image capture program. I decided to write it in Java.

Why Java? Well, for one, I could build a fat JAR for my program containing all of the components necessary to actually run the program (the OpenKinect wrappers are distributed under an Apache 2 License – very much appreciated!). Plus, when it comes time to actually crunch data with the machine learning components, I want something that will execute relatively fast, and Python – while great for prototyping – doesn’t usually offer performance guarantees. So, Java it is.

The first step was to package the OpenKinect Java wrapper into a JAR. Here is where I ran into problems. Building the wrapper was supposed to be as simple as executing a Maven package command. For me, however, the unit tests kept generating segfaults outside of the JVM. Being brave, I just turned off the unit tests and crossed my fingers that the resultant JAR was usable:

cd libfreenect/wrappers/java
mvn -Dmaven.test.skip=true package

Hopefully this won’t make the deer explode. I then copied the resultant JAR to my library path, and updated Gradle to include it in the compile-time dependencies.

Update August 7, 2014: the BoofCV project has the libfreenect wrappers built into it – best of all, they publish their libraries to Maven Central. I’ve updated my source code and Gradle script to use BoofCV instead of my locally built JAR (the deer definitely can’t explode now – and in case you didn’t get the humor, there never was a chance that they could!).

The next step was to write a class that would manage the flow of data from the Kinect. I created a simple Monitor class to create a connection to the device in the constructor:

public Monitor() {
    mContext = Freenect.createContext();
    if (mContext.numDevices() == 0) {
        throw new IllegalStateException("no Kinect sensors detected");
    }
    mDevice = mContext.openDevice(0);
}

It has a single function called takeSnapshot that will actually turn on the device, and take a picture with it:

public VideoFrame takeSnapshot() throws InterruptedException {
    // Ask the Kinect for RGB frames, and turn the LED red while capturing
    mDevice.setVideoFormat(VideoFormat.RGB);
    mVideoHandler = new VideoFrameHandler();
    mDevice.setLed(LedStatus.RED);
    mDevice.startVideo(mVideoHandler);

    // Wait until the handler has received at least one frame
    while (mVideoHandler.getVideoFrame() == null) {
        Thread.sleep(100);
    }

    VideoFrame videoFrame = mVideoHandler.getVideoFrame();
    mDevice.stopVideo();
    mDevice.setLed(LedStatus.OFF);
    return videoFrame;
}

The real magic is performed by the VideoHandler interface. When you call mDevice.startVideo, it needs a class that will handle the video frames the Kinect generates. The only function you are required to implement is onFrameReceived. My VideoFrameHandler class simply stores the last frame of information sent from the Kinect. Any newer frames will overwrite the old ones – this is mostly because I’m lazy. I don’t need a sequence of frames, or even exact pictures at a point in time – any frame within, say, a second of taking a snapshot is good enough for me. This makes my handler very simple. I just store the information that the Kinect sends:

public void onFrameReceived(FrameMode arg0, ByteBuffer arg1, int arg2) {
    mVideoFrame = new VideoFrame(arg0, arg1, arg2);
}
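
Since takeSnapshot also polls the handler through getVideoFrame, here is a sketch of what the whole VideoFrameHandler class boils down to (the volatile modifier is my own addition here, so the polling loop reliably sees a frame written from the wrapper’s callback):

public class VideoFrameHandler implements VideoHandler {
    // Only the most recent frame is kept; newer frames overwrite older ones
    private volatile VideoFrame mVideoFrame = null;

    @Override
    public void onFrameReceived(FrameMode arg0, ByteBuffer arg1, int arg2) {
        mVideoFrame = new VideoFrame(arg0, arg1, arg2);
    }

    public VideoFrame getVideoFrame() {
        return mVideoFrame;
    }
}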

My VideoFrame class is nothing more than a container that stores the FrameMode, ByteBuffer and timestamp (arg2). The ByteBuffer and the FrameMode are the keys to actually displaying the information you get back from the Kinect. The ByteBuffer holds the raw bytes received from the Kinect, while the FrameMode holds information about the image width, height and color depth. Using these pieces of information, it’s relatively easy to reconstruct the actual image. In my case, I created a function in my VideoFrame class that generates a BufferedImage:

public BufferedImage getBufferedImage() {
    int width = sFrameMode.width;
    int height = sFrameMode.height;
    byte[] data = new byte[sByteBuffer.remaining()];
    sByteBuffer.get(data);
 
    mBufferedImage = new BufferedImage(width, height, 
            BufferedImage.TYPE_INT_RGB);
 
    ByteArrayInputStream stream = new ByteArrayInputStream(data);
 
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int r = stream.read();
            int g = stream.read();
            int b = stream.read();
            mBufferedImage.setRGB(x, y, new Color(r, g, b).getRGB());
        }
    }
    return mBufferedImage;
}

The code is quite simple. I’ve already asked the Kinect to generate RGB data from the camera back in the Monitor class. The ByteBuffer therefore has a sequence of R, G, B values in it – a triplet for every pixel in the image.

To make it easy to read, I converted the ByteBuffer into a ByteArrayInputStream so I could simply call read to get the next byte. All that was left was to get the height and width of the image from the FrameMode data, loop over the stream reading R, G, B values, and write them into the BufferedImage. Note, however, that if I had asked the Kinect to generate an image in a different color format, I would have had to do something different in the getBufferedImage function.

Putting it All Together

With my simple monitor class complete, all that was required was an option parser to set things like how many shots to take, and the time delay between them. The result is a command line Java program that will take pictures over given time intervals (the source code is available on my GitHub account). Cue evil laugh here.
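
To give a sense of how the pieces fit together, here’s a stripped-down sketch of the capture loop (the shot count, delay and output directory are hard-coded here purely for illustration – in the real program they come from the option parser):

public static void main(String[] args) throws Exception {
    // In the real program these values come from the command line options
    int numberOfShots = 100;
    long delayMillis = 10 * 1000;    // one snapshot every 10 seconds
    File outputDir = new File("snapshots");
    outputDir.mkdirs();

    Monitor monitor = new Monitor();
    for (int i = 0; i < numberOfShots; i++) {
        VideoFrame frame = monitor.takeSnapshot();
        BufferedImage image = frame.getBufferedImage();

        // Timestamped file names keep the pictures sorted chronologically
        String fileName = new SimpleDateFormat("yyyy-MM-dd_HH-mm-ss")
                .format(new Date()) + ".png";
        ImageIO.write(image, "png", new File(outputDir, fileName));

        Thread.sleep(delayMillis);
    }
}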

With the program in hand, I mounted the Kinect in our yard-facing window and fired it up. I’m going to take a snapshot every 10 seconds, meaning that I’ll grab 6 pictures per minute. Since I also want to make sure that it detects deer, and not just anything strange in the backyard, we’re also going to throw a few shots of insanity in.

Not a deer.

Still not a deer.

Nope, still not a deer.

Conclusions

I don’t have a degree in deer psychology, so I can’t speculate as to why the deer harbors such hatred of me. I can, however, keep an eye on it to make sure it doesn’t build weapons of mass destruction in our backyard while we aren’t looking (we’re part of a block watch program; as such, deer-built WMDs are generally frowned upon). Tune in again in about 2 weeks, when I will have the machine learning component of the project complete, as well as some preliminary data analysis of the backyard images.