Deer Detection with Machine Learning (Part 1)

So, I have a deer who visits my yard. The deer wants to murder me and eat all of my vegetables. Well, the deer probably doesn’t really want to murder me, but it probably does want to eat all of the vegetables that we have in our garden. When we originally moved here, there were two deer that would frequent our yard just about every day. They aren’t really a problem – more of an unexpected annoyance when we want to pick apples or take out the compost.

This is the deer. It is looking at me like it wants to murder me.

The deer are not afraid of us either. If we come out into the yard, they will just stand there (or sit there) looking at us like we’re pond scum. I’m not sure why they seem to loathe us so much – it probably has something to do with the deer fence we have around our garden.

Keeping an Eye Out

When I snapped the picture above and posted it to Facebook, I got some interesting responses back (you know who you are, AP!). One of them was about a gentleman who used machine learning to recognize squirrels and fire a water gun at them. I don’t want to fire anything at the deer in my yard; I’m just curious about what exactly they are doing and when they are hanging around.

Given that the deer enter the yard somewhat infrequently, I want some method that will detect when the deer is present, and snap some photos – or even a video – of what they are doing, and then turn off when they leave. Placing a camera to take pictures of the back yard is relatively simple. For the actual deer detection, I decided to use a little machine learning. Essentially, I want the computer to sort through thousands of images or hours of video looking for the deer so that I don’t have to. Think of it as a Gorilla Detector, but for deer (since we all know about the dangers of undetected gorillas).

Collecting Data with the Kinect

I’m going to be using a simple supervised machine learning technique. With supervised machine learning, to teach a computer to recognize when a deer is in a picture, I need to feed it pictures that contain deer and pictures that don’t. So, step 1 in the project is to collect pictures of the deer.

Remember how I said up above I didn’t want to sort through thousands of pictures to see if there is a deer in them? Well, unfortunately, that’s what I have to do to get my training examples to teach the computer (lousy deer – yes, I blame you!). But, if all goes well, it shouldn’t take too long.

To take pictures, I decided to use an XBOX 360 Kinect. Why? Well, I’m hoping to use the IR camera to take pictures of the deer at night. Plus, the OpenKinect project has some nice drivers available for many platforms, as well as wrappers for many programming languages. It is relatively easy to capture either a single frame or a video using the open source drivers. And the Kinect is just plain cool (it has an LED you can turn on and off!).

Installing OpenKinect Drivers

Installing the necessary drivers under Ubuntu 14.04 was relatively simple – just a matter of using apt-get to install the freenect package:

sudo apt-get install freenect

Then, I plugged in the Kinect. Running dmesg showed me that the kernel successfully recognized the Kinect:

[34089.811775] usb 2-2.3: New USB device found, idVendor=045e, idProduct=02ae
[34089.811782] usb 2-2.3: New USB device strings: Mfr=2, Product=1, SerialNumber=3
[34089.811786] usb 2-2.3: Product: Xbox NUI Camera
[34089.811789] usb 2-2.3: Manufacturer: Microsoft

Excellent! Step one of my ridiculously circuitous plan was complete.

Testing Out the Kinect

The next step was to ensure that I could actually acquire data from the Kinect sensors. I hopped over to the OpenKinect GitHub account, and checked out whether they had any sample programs. Sure enough, their wrapper classes had some examples for grabbing video and pictures. Their Python examples looked simple enough, so I decided to try them out.

First, I cloned their repo:

git clone https://github.com/OpenKinect/libfreenect

Then, I installed some necessary Python packages:

sudo apt-get install python-freenect
sudo apt-get install python-opencv

From there, it was a simple matter of running their demo to grab both an RGB image and an infrared depth image:

cd libfreenect/wrappers/python
./demo_cv_sync.py

Here is an example of the infrared capture:

Infrared depth image of me waving at the camera. Hi!

Enter Java

With the Python examples demonstrating that the Kinect works, the next step was to build a simple image capture program. I decided to write it in Java.

Why Java? Well, for one, I could build a fat JAR for my program containing all of the components necessary to actually run it (the OpenKinect wrappers are distributed under an Apache 2 License – very much appreciated!). Plus, when it comes time to actually crunch data with the machine learning components, I want something that executes relatively fast, and Python – while great for prototyping – doesn’t usually offer the same performance. So, Java it is.

The first step was to package the OpenKinect Java wrapper into a JAR. Here is where I ran into problems. Building the wrapper was supposed to be as simple as executing a Maven package command; for me, however, the unit tests kept generating segfaults outside of the JVM. Being brave, I just turned off the unit tests and crossed my fingers that the resultant JAR was usable:

cd libfreenect/wrappers/java
mvn -Dmaven.test.skip=true package

Hopefully this won’t make the deer explode. I then copied the resultant JAR to my library path, and updated Gradle to include it in the compile time dependencies.
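
At the time, that meant a plain file dependency, something along these lines (a sketch – the libs directory and JAR name are placeholders, not the actual values from my script):

dependencies {
    // Placeholder path and name for the locally built wrapper JAR
    compile files('libs/freenect.jar')
}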

Update August 7, 2014: the BoofCV project has the libfreenect wrappers built into it – best of all, their libraries are published to Maven Central. I’ve updated my source code and Gradle script to use BoofCV instead of my locally built JAR (the deer definitely can’t explode now – and in case you didn’t get the humor, there never was a chance that they could!).

The next step was to write a class that would manage the flow of data from the Kinect. I created a simple Monitor class that opens a connection to the device in its constructor:

public Monitor() {
    // Create a freenect context and make sure at least one Kinect is attached
    mContext = Freenect.createContext();
    if (mContext.numDevices() == 0) {
        throw new IllegalStateException("no Kinect sensors detected");
    }
    // Open the first Kinect found
    mDevice = mContext.openDevice(0);
}

It has a single function called takeSnapshot that will actually turn on the device, and take a picture with it:

public VideoFrame takeSnapshot() throws InterruptedException {
    // Ask for RGB frames and light the LED red while capturing
    mDevice.setVideoFormat(VideoFormat.RGB);
    mVideoHandler = new VideoFrameHandler();
    mDevice.setLed(LedStatus.RED);
    mDevice.startVideo(mVideoHandler);
    // Poll until the handler has received at least one frame
    while (mVideoHandler.getVideoFrame() == null) {
        Thread.sleep(100);
    }
    VideoFrame videoFrame = mVideoHandler.getVideoFrame();
    mDevice.stopVideo();
    mDevice.setLed(LedStatus.OFF);
    return videoFrame;
}

The real magic is performed by the VideoHandler interface. When you call mDevice.startVideo, it needs a class that will handle the video frames the Kinect generates; the only function you are required to implement is onFrameReceived. My VideoFrameHandler class simply stores the last frame of information sent from the Kinect. Any newer frame will overwrite the old one – this is mostly because I’m lazy. I don’t need a sequence of frames, or even exact pictures at a point in time – any frame within, say, a second of taking a snapshot is good enough for me. This makes my handler very simple. I just store the information that the Kinect sends:

public void onFrameReceived(FrameMode mode, ByteBuffer frame, int timestamp) {
    mVideoFrame = new VideoFrame(mode, frame, timestamp);
}
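
For context, here is what the handler looks like wrapped up in its class, with the accessor that takeSnapshot polls (a sketch – the volatile modifier is my own addition, since onFrameReceived is invoked on a different thread than the one polling getVideoFrame):

public class VideoFrameHandler implements VideoHandler {

    // The most recent frame from the Kinect; null until the first one arrives
    private volatile VideoFrame mVideoFrame;

    @Override
    public void onFrameReceived(FrameMode mode, ByteBuffer frame, int timestamp) {
        mVideoFrame = new VideoFrame(mode, frame, timestamp);
    }

    // Polled by takeSnapshot until a frame shows up
    public VideoFrame getVideoFrame() {
        return mVideoFrame;
    }
}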

My VideoFrame class is nothing more than a container class that stores the FrameMode, ByteBuffer and timestamp. The ByteBuffer and the FrameMode are the keys to actually displaying the information you get back from the Kinect: the ByteBuffer holds the raw bytes received from the Kinect, while the FrameMode holds information such as the image width, height and color depth. Using these pieces of information, it’s relatively easy to reconstruct the actual image. In my case, I created a function in my VideoFrame class that generates a BufferedImage:

public BufferedImage getBufferedImage() {
    int width = sFrameMode.width;
    int height = sFrameMode.height;

    // Copy the raw bytes out of the buffer
    byte[] data = new byte[sByteBuffer.remaining()];
    sByteBuffer.get(data);

    mBufferedImage = new BufferedImage(width, height,
            BufferedImage.TYPE_INT_RGB);

    // Wrap the bytes in a stream so each read() returns the next byte
    ByteArrayInputStream stream = new ByteArrayInputStream(data);

    // The Kinect delivers one R, G, B triplet per pixel, row by row
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int r = stream.read();
            int g = stream.read();
            int b = stream.read();
            mBufferedImage.setRGB(x, y, new Color(r, g, b).getRGB());
        }
    }
    return mBufferedImage;
}

The code is quite simple. I’ve already asked the Kinect to generate RGB data from the camera back in the Monitor class. The ByteBuffer therefore has a sequence of R, G, B values in it – a triplet for every pixel in the image.

To make it easy to read, I converted the ByteBuffer into a ByteArrayInputStream so I could simply call read to get the next byte. All that was left was to get the height and width of the image from the FrameMode data, loop over the stream reading R, G, B values, and write them into the BufferedImage. Note, however, that if I had asked the Kinect to generate an image in a different color format, I would have had to decode the bytes differently in getBufferedImage.

Putting it All Together

With my simple Monitor class complete, all that was required was an option parser to set things like how many shots to take and the time delay between them. The result is a command line Java program that will take pictures over given time intervals (the source code is available on my GitHub account). Cue evil laugh here.
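
Stripped of the option parsing, the heart of the program is just a loop (a simplified sketch – shotCount and delayMillis stand in for the parsed options, and saving the image uses javax.imageio.ImageIO and java.io.File):

// Take shotCount snapshots, pausing delayMillis between each one
Monitor monitor = new Monitor();
for (int shot = 0; shot < shotCount; shot++) {
    VideoFrame frame = monitor.takeSnapshot();
    ImageIO.write(frame.getBufferedImage(), "png",
            new File(String.format("capture_%05d.png", shot)));
    Thread.sleep(delayMillis);
}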

With the program in hand, I mounted the Kinect in our yard-facing window and fired it up. I’m going to take a snapshot every 10 seconds, meaning that I’ll grab 6 pictures per minute. Since I also want to make sure that it detects deer – not just anything strange in the backyard – I’m also going to throw in a few shots of insanity.

Not a deer.

Still not a deer.

Nope, still not a deer.

Conclusions

I don’t have a degree in deer psychology, so I can’t speculate as to why the deer harbors such hatred of me. I can, however, keep an eye on it to make sure it doesn’t build weapons of mass destruction in our backyard while we aren’t looking (we’re part of a block watch program, and as such, deer-built WMDs are generally frowned upon). Tune in again in about 2 weeks, when I will have the machine learning component of the project complete, as well as some preliminary analysis of the backyard images.

Switching from Maven to Gradle

During the development of my Chip 8 emulator in Java, I grew increasingly dissatisfied with Maven. While I liked the fact that dependency management was handled in a single spot, the verbose XML format of the POM, combined with the quirks of the Maven lifecycle, became a bit grating. For me, the tipping point came when I was trying to build a fat jar.

I want all of my program’s dependencies packaged in a single jar so that I can run the packaged emulator without having to worry about installing anything else on the classpath. This use case is exactly what the Apache Maven Shade plugin is for: it allows you to construct an “uber-jar” (fat jar) containing all of the other dependencies that your project requires. While it is quite easy to use, I kept getting a lot of warnings during building and packaging that were related to the Shade plugin and the creation of the fat jar. To top that off, configuring the Shade plugin meant adding a lot of XML to the POM, most of which felt like unnecessary cruft (admittedly, I am a Maven novice, so your experiences will probably differ).

Enter Gradle

Gradle is a Groovy-based project automation tool. Like Maven, Gradle offers dependency management, and can hook into the Maven Central Repository and use Maven plugins. However, Gradle can do much, much more. For example, Gradle provides a wrapper that can be checked into source control. The wrapper can then download Gradle automatically and build the project, which means anyone can download and build the project without having Gradle installed and configured beforehand. This is very nice for things like continuous integration, where you really want to minimize the number of tools needed to build and test the project.
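
For example, a fresh checkout of a wrapper-enabled project can be built with nothing more than a JDK installed:

./gradlew build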

To me however, the main advantage of Gradle was the simple declarative language that it uses to define the project. Gradle does away with XML entirely – the result is a simple and concise looking configuration file. I highly value readability and maintainability, so Gradle felt like a breath of fresh air.

Initializing the Project

Given that my project is rather simple to begin with, I decided to write the new Gradle configuration by hand. For the sake of completeness however, Gradle does offer a command that will attempt to convert a Maven project into a Gradle project. To do that, you need a valid POM file. The command is as simple as:

gradle init --type pom

At this point, I should mention that you can get a context-sensitive list of tasks that Gradle can perform at any time by typing:

gradle tasks

Given that I wanted to start a new project from scratch to learn more about Gradle, I just created a simple default project definition. To do so, I ran:

gradle init

This created the following files and directories automatically:

build.gradle
gradle/
gradlew
gradlew.bat
settings.gradle

These are the interesting artifacts that Gradle creates. In more detail:

  • build.gradle is where all of the project definitions and dependencies are placed. I’ll talk about that a little more down below.
  • gradle is a directory that contains the jar and properties files used by the wrapper to bootstrap Gradle itself.
  • gradlew and gradlew.bat are the generated wrappers that can bootstrap Gradle on a new system. The first is a shell script for *NIX-like systems, while the second is a batch file for Windows.
  • settings.gradle allows you to configure multi-project options.

All of the files that Gradle creates can be checked into source control, meaning that you can easily version control your dependencies and settings (just like with Maven’s POM file).

Defining the Project

The build.gradle file is where the main settings for the project go. To start with, I entered some simple housekeeping information:

apply plugin: 'java'
apply plugin: 'eclipse'
 
group = 'com.chip8java'
version = '1.0'
 
description = 'A Chip 8 emulator'
 
sourceCompatibility = 1.7
targetCompatibility = 1.7

The first two lines refer to plugins that define additional behaviours:

  • The java plugin defines behaviours relating to building, testing, and packaging Java projects. For example, typing gradle build will build the project, while gradle test will run the unit tests. Note that the various tasks depend on one another (see the documentation) and track whether their outputs are up to date. This means that if you haven’t changed the source code and you type gradle build, the compiler won’t recompile your code.
  • The eclipse plugin is a very nice plugin for handling Eclipse integration. By typing gradle eclipse, Gradle will build an Eclipse project that you can easily import into your IDE.

The remainder should be fairly self-explanatory.

Dependency Management

Next, I needed to define the dependencies for my project. I added the following to the build.gradle file:

repositories {
    mavenCentral()
}
 
dependencies {
    compile group: 'commons-cli', name: 'commons-cli', version:'20040117.000000'
    testCompile group: 'org.mockito', name: 'mockito-all', version:'1.9.5'
    testCompile group: 'junit', name: 'junit', version:'4.11'
}
 
buildscript {
    repositories {
        mavenCentral()
    }
}

There are three blocks here to note:

  1. The repositories block simply tells Gradle to use the Maven Central Repository for downloading dependencies.
  2. The dependencies block lists all of the dependencies for the project. Each dependency is further broken down into the following fields (a compact shorthand for these fields is shown after the list):
    • The first field denotes which task requires the dependency. For example, compile means that the dependency is needed at compile time, while testCompile means that it is needed to compile and run the unit tests.
    • The group field relates to the groupId of the dependency you wish to add. In my example, I use Mockito for mocking various objects. The group ID for that library is org.mockito.
    • The name field relates to the artifactId of the dependency you wish to add. Again, in the case of Mockito, this is mockito-all.
    • The version field relates to the version of the library you wish to use. For Mockito, I am using 1.9.5.
  3. The buildscript block configures the build script itself – the repositories and dependencies declared there are used by Gradle when running the build, rather than by your project’s code (I will talk about this more in a future post).
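
As an aside, Gradle also accepts a compact group:name:version string form for the same coordinates; the Mockito entry above could equally be written as:

testCompile 'org.mockito:mockito-all:1.9.5'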

Source Locations

My emulator’s files are laid out slightly differently than a typical project’s. In the root of the project directory, my source files are located in the src directory. Usually, src/main/java would be where the main Java sources go, while src/test/java would contain the unit tests. In my project structure, however, I keep my unit tests under the root in the test directory. To run unit tests then, I need to tell Gradle where these sources live. This is accomplished with:

sourceSets {
    main {
        java {
            srcDir 'src'
        }
    }
    test {
        java {
            srcDir 'test'
        }
    }
}

Essentially, sourceSets tells Gradle where my sources live: the main Java files under src, and the test files under test. With this in place, I can run gradle test from the command line and Gradle knows where to find the unit test sources.

Building a Fat Jar

One of the last tasks I want to perform is to build a fat jar with all of my required dependencies stored in it. This is accomplished with the following:

jar {
    manifest {
        attributes 'Main-Class': 'com.chip8java.emulator.Emulator'
    }
 
    doFirst {
        from (configurations.runtime.resolve().collect { it.isDirectory() ? it : zipTree(it) }) {
            exclude 'META-INF/MANIFEST.MF'
            exclude 'META-INF/*.SF'
            exclude 'META-INF/*.DSA'
            exclude 'META-INF/*.RSA'
        }
    }
}

There are two blocks here that are important:

  1. The manifest block with the Main-Class attribute tells Gradle to build a jar file whose entry point is the class com.chip8java.emulator.Emulator. This effectively lets me call the jar with java -jar /path/to/emulator.jar without having to specify the class to run.
  2. The doFirst block is a bit of a workaround to unpack the dependency jars into the fat jar while stripping out their manifests and signature files.

All Together

All of the above sections put together generate a single build.gradle file. Now, to build the project, it is as simple as:

gradle build

To re-run the unit tests when nothing has changed, you must tell Gradle to clean the previous test results and run the tests again:

gradle cleanTest test

And finally, to build the jar:

gradle jar
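
The packaged jar lands under build/libs, named after the project and version, so running the emulator is then just a matter of (the exact jar name here is illustrative):

java -jar build/libs/emulator-1.0.jar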

Conclusion

Switching from Maven to Gradle was quite straightforward. The simplified syntax of the build.gradle file, combined with powerful plugins, means less cruft in your configuration files. Additionally, Gradle can use Maven plugins, automatically generate skeletons for new projects, convert existing Maven projects into Gradle-based projects, and easily generate Eclipse project definitions for you.