Thursday, February 23, 2012

Hardware Hackathon with Arduino and LEDs

Recently I was invited to BetaSpring's Hardware Hackathon. And by invited, I mean I filled out a free online signup form. I was very excited for the hackathon - all my projects have been extremely software based, and I have very little experience playing with hardware.

I won't keep you in suspense: it went extremely well! I even ended up in the Providence Journal (the major Rhode Island paper):

That's not me in the picture, but I did make it into the very first sentence. I believe the reporter captured my quotes perfectly: "Yeah, you all bailed at like 2:30." I feel like I shouldn't be proud of that one, but I am anyway.

The Project
With my limited hardware experience, picking the project was a big deal. I had very few ideas of what to work on. In the end, I decided to make right one of my colossal failures. Some months ago I broke the taillight of my boss's truck while carrying some heavy equipment. My project plan: to resurrect the cracked taillight and bring it back to light.

The plan was to install replacement LEDs in the taillight, powered by an ethernet-enabled microcontroller. I would be able to flash the LEDs from any computer over the network.

The project was a good fit for my skillset. It would force me to learn about electrical wiring (power supplies, resistors, LEDs) and Arduino programming. Although I had never used an Arduino, I was confident I would have no problem with it. I was more apprehensive about lighting up the LEDs, since it would involve soldering and other hardware skills I'm unfamiliar with.

A Taillight, Resurrected
With help from the fabrication lab of AS220, I was able to get the hardware set up on the first day of the hackathon. I used the overnight time to get the Arduino programmed and talking over ethernet.

Here it is, in all its blinking glory:

You can find the Arduino controller code on github.

The Taillight's current job is monitoring the build status at work. When the builds are normal, the light pulses gently. When we have a build error, the Taillight flashes and alerts us developers that we screwed up somewhere (usually it means that I broke BuildBot, but that's a different issue).
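The Arduino side is on github; the monitor side is a small network client. Here's a hypothetical sketch of what that client could look like in Python. The host, port, and one-word commands are all assumptions for illustration, not the actual firmware's interface:

```python
import socket

def send_taillight_command(command, host="192.168.1.50", port=8888):
    """Send a one-word command ("pulse" or "flash") to the taillight.

    The address, port, and command words here are hypothetical; the
    real Arduino sketch defines its own protocol.
    """
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(command.encode() + b"\n")

# A build monitor would call this after polling the CI server:
# send_taillight_command("flash")  # build broken
# send_taillight_command("pulse")  # all clear
```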

I've already shared the half-page article from the ProJo. They also had a summary post on their website.

BetaSpring (the Hackathon sponsor) mentioned me in their writeup blog post.

Thursday, October 27, 2011

Importing a custom Python package in Amazon MapReduce

Amazon's MapReduce makes it difficult to use custom, application-specific modules in Python applications. Without special configuration, MapReduce only loads the two map/reduce files into its streaming jobs. I'm a heavy user of custom modules, so my scripts were failing with import errors and no clear solution.

The solution is relatively simple (and automatable)! To import custom modules, we will need to get them into the working directory of the streaming job. Then, we need to add that directory to our Python path (sys.path).

I'm using Amazon's elastic-mapreduce command line tool to start my job flows. elastic-mapreduce gives you the option to easily push a single file to the working directory using the --cache parameter. However, it only allows you to push one file (and you can't add more than one --cache parameter. I tried). We'll have to use its friend, --cache-archive.

The Plan - quick summary:
  1. Create a package out of the application modules you want to import.
  2. Create an archive from the package
  3. Push the archive to S3
  4. Use the sys module to add the library directory to your job's path
  5. Add the tar to your streaming job with --cache-archive
Broken down:
1. Create a package
Packages in Python are created by putting the modules together in one folder with a file named __init__.py. See the Python docs for more information. Here's my directory structure:

2. Tar the package
Note that I'm using -C to change into my helper_classes directory before creating the archive. This ensures that the files aren't put into a folder inside the archive, but live in the top level instead. Once I've created the tar, my directory structure looks like this:
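The commands look something like this. helper_classes is my directory from above; the my_helpers package name below is a placeholder for illustration:

```shell
# Stand-in package layout (my_helpers is a placeholder name):
#   helper_classes/
#     my_helpers/
#       __init__.py
mkdir -p helper_classes/my_helpers
touch helper_classes/my_helpers/__init__.py

# -C changes into helper_classes/ before archiving, so the package
# lands at the top level of the tar instead of inside a
# helper_classes/ folder
tar -cf helper_classes.tar -C helper_classes .
```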

3. Push the archived package to S3
I'm using s3cmd. How you push the archive to S3 doesn't matter, as long as you get it up there.
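With s3cmd, the push is a one-liner (the bucket name is a placeholder, and this assumes you've already configured your AWS credentials):

```shell
# Bucket name is a placeholder; requires s3cmd configured with
# your AWS credentials.
s3cmd put helper_classes.tar s3://my-bucket/helper_classes.tar
```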

4. Add the package to your job's PATH
To temporarily add the folder with the application modules to our path, use sys.path.append(). This must be done before we attempt to import any of our custom modules. Here's an example:
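At the top of the mapper/reducer, before any custom imports (my_helpers is a placeholder package name, and "helper_classes" must match the directory name you give the unpacked archive in step 5):

```python
import sys

# The archive is unpacked into ./helper_classes in the job's working
# directory (the name comes from the "#" suffix on --cache-archive),
# so add it to the module search path before importing from it.
sys.path.append("./helper_classes")

# Now package imports work (my_helpers is a placeholder name):
# from my_helpers import scoring
```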

5. Add the archive to your streaming job with --cache-archive
NOTE: What you put after "#" will be the directory name! This is where your tar will be unpacked to - it must match the directory name you added to your path.
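Putting it all together, the job flow invocation looks roughly like this. The bucket and script names are placeholders; the important part is the "#helper_classes" suffix, which names the unpack directory that step 4 added to the path:

```shell
# Placeholder bucket/script names; requires the elastic-mapreduce
# CLI configured with your AWS credentials.
elastic-mapreduce --create --stream \
  --input   s3://my-bucket/input/ \
  --output  s3://my-bucket/output/ \
  --mapper  s3://my-bucket/mapper.py \
  --reducer s3://my-bucket/reducer.py \
  --cache-archive s3://my-bucket/helper_classes.tar#helper_classes
```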

That's it! You can see a fully working (and automated) example of this method in my scrabble-bot project. In particular, check out the job scripts and their helper package.

There may be better ways to get helper modules into MapReduce jobs (perhaps using bootstrap actions?), but this one has worked well for me.

Saturday, August 6, 2011

Zeromq benchmarking with large objects

At work, we're in the process of adding zeromq to our architecture. We plan to use it to make our C++ application multi-process (and eventually multi-computer).

We currently move the data between modules using pointers to Protocol Buffer messages. We know that a change to copying this data around using zeromq will take longer; we want to know how much of a slowdown we will see.

Our data starts out as protobuf messages, so it must be serialized before sending. The protos serialize into std::strings. From there, they are copied into zeromq messages and sent on a zeromq socket. The process is reversed on the receive side. 

Here is an example of our sending/receiving benchmarks on a large piece of data we use. Serialized, it is around 45MB in size.

It takes us 168.92ms to pass this one proto message. For comparison, the non-zeromq method (where a DataManager passes pointers around) takes 2.76ms.

Veteran zeromq users may notice that we're copying the serialized strings into the zeromq messages. These steps are the ones in red (and you thought the chart was colorful just because I like colors). This copying can be avoided in some use cases with zeromq's zero-copy functionality. If we were to take these steps out of our chain, the minimum total time would be 111.85ms.
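Our pipeline is C++, but the cost of that copy step is easy to feel in any language. Here's a quick stand-in sketch in Python with a payload of the same ~45MB size, comparing a full buffer copy against a zero-copy view (the exact numbers will vary by machine):

```python
import time

# Stand-in for the ~45 MB serialized protobuf message
payload = b"x" * (45 * 1024 * 1024)

def time_ms(fn, reps=5):
    """Average wall-clock time of fn() in milliseconds."""
    start = time.perf_counter()
    for _ in range(reps):
        fn()
    return (time.perf_counter() - start) / reps * 1000.0

# The copy step: roughly what moving the serialized string into a
# message buffer costs
copy_ms = time_ms(lambda: bytearray(payload))

# A zero-copy view of the same buffer: the kind of step zeromq's
# zero-copy mode replaces the copy with
view_ms = time_ms(lambda: memoryview(payload))

print(f"copy: {copy_ms:.2f} ms, view: {view_ms:.4f} ms")
```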

Using these benchmarks, we can decide which data-passing paths can be replaced with zeromq, and which are time-critical enough to require staying with pointers. A ~0.1 second slowdown is a significant amount of time for our processing, but the benefits zeromq provides (multi-process communication!) outweigh the drawbacks in many cases.

    Thursday, July 28, 2011

    Python scripting to access Android GPS

    One of my long-term ideas has been to record myself playing Ultimate frisbee. I think visualizing a player's movement around the field (as well as the analytics you could get from the movement) would be awesome.

    I thought my old G1 would be the perfect device for the job. It is a small, self-contained unit with a GPS. Using SL4A, it even runs Python; what more could I want? I could poll the GPS for the player position and record their lat/lon at every tick. From that, it would be easy enough to make a graph of movement.

    Android makes getting Location data from the GPS very straightforward, and the Python scripting layer is easy to use.

    This code sets up our GPS listener and records the first event it spits back at us. From there, it's easy to record a bunch of events, parse out the lat/lon, and plot the points with matplotlib.
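    The recording script itself is on github; the parse-and-plot side is just a couple of list comprehensions. A sketch of that step, where the event dictionaries are made-up stand-ins (the real key structure comes from the SL4A API, so treat these keys as assumptions):

```python
# Made-up stand-ins for recorded location events; the nesting and key
# names below are assumptions for illustration, not SL4A's schema.
events = [
    {"data": {"latitude": 41.8268, "longitude": -71.4025}},
    {"data": {"latitude": 41.8271, "longitude": -71.4029}},
    {"data": {"latitude": 41.8275, "longitude": -71.4024}},
]

lats = [e["data"]["latitude"] for e in events]
lons = [e["data"]["longitude"] for e in events]

# Plotting is then one call with matplotlib:
# import matplotlib.pyplot as plt
# plt.plot(lons, lats); plt.show()
```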

    That's a lat/lon graph of me running around a block of houses; ouch! I made some zig-zags (the waves on the top) and even backtracked (the big bump on the bottom) for half a block, but the update frequency is nowhere near good enough to be useful. Even though it is easy to access, the G1's GPS just doesn't update fast enough (or with enough precision) to record a player's movement.

    I had a small amount of hope that the accelerometer would be a viable option, but as this stackoverflow answer (and the accompanying video) shows, this isn't possible. The error introduced by going from acceleration to position is too great for the navigation I want to do.

    Even though I failed in my planned goal, the code to record the GPS is very nice. It's available on github.

    As for next steps, I asked around and got some good ideas. First (and easiest), I plan to use Network location data along with the GPS data on the phone to try to get a more accurate location. I don't think it will improve the update frequency, but it may. Long term, the best way to get my desired data (tracking someone running around a field) probably rests with short-range triangulation: something like putting radio transmitters on the sides of the field and determining the position of a device in the middle, or perhaps using cameras (Kinect style) to triangulate a person's position visually and with high frequency.

    Wednesday, June 15, 2011

    New toy! Rooting a Nook Touch

    So last week I made my first impulse buy in a while; I bought a brand new Nook Simple Touch. Why? I spend way too much time looking at LCD screens, and I would love to shift some of that to e-Ink (going outside just isn't part of the equation). Also, it's running Android - what's not to love?

    The main goals I have are as follows:
    • Read books (easy!)
    • Read Instapaper articles (harder)
    • Get on Google Reader (even harder)
    One solution is to sync all of these things to Calibre and load them manually. But what's the point of having a wireless e-reader if you're plugging it in all the time? And manual operation? Screw that. The only solution was to root it and get some side-loaded apps working to solve my problems.

    The root process is well spelled out here (thanks JesusFreke!). I've already added my clarifying contributions to the wiki; where I can help is what you do with your Nook after rooting. Here's what I did...

    Yes, that's right. Angry Birds, first priority. It's obligatory.

    Now that that's out of the way, some real stuff. All apps are loaded by making an adb connection (this should have been the last step of rooting). So, you run your adb connect, and can immediately load apps with adb install {app}. It's spelled out for the Nook Color here; same process.

    Installing is easy; getting access to the apps once they're on the device is slightly more complicated. For this, you need two apps. The first is ADWLauncher, an Android homescreen replacement. The B&N launcher the Nook ships with just won't cut it. Unfortunately, the only time you can access your home screen is on Nook startup; once you leave it, you can't get back. For that, you need the second app: SoftKeys. It lets you reach your desired home screen whenever you like. You should set it up as explained for the Nook Color here. I've also heard of people having success with Button Savior, but I haven't tried it (yet!).

    Now you have a launcher/home screen, and a home key so you can always get back. Now you can load whichever apps you want. Here's what I've found that works and doesn't work so far...

    • Nook Color Tools
      • very helpful- provides access to system settings. Necessary to uninstall apps.
    • Dolphin browser
      • good browser that seems to work with only minor hiccups
    • Dropbox
      • file sharing! yippee!
    Not working (yet):
    • Instapaper
    • Google Reader

    Here's someone else's list: much more comprehensive.

    Note that two of my original goals are on the not-working list. Some people seem to have had success with both; I'll provide an update when I figure out how to get them working.

    One of the biggest issues I've had is simply finding the apks to load. Has anyone had success with Android Market? Or found a reliable way to download apps? My next plan is to try downloading them to my Nexus S and grabbing them from there.

    By the way, there have been some great resources online recording the progress of rooting the Nook. This thread on xda-devs has been very prolific. Mike Cane has had nearly daily updates on his blog.

    Friday, May 6, 2011

    Dice games confuse me -- testing a strategy in Python

    My friend (let's call him Coleman) told me about a dice game today, and how using probability you could increase your win percentage. I didn't believe him, and I set out to prove him wrong.

    First, let me explain the game. Two people each roll a die. Each can look at their opponent's roll, but not their own. They can have no communication once the dice have been rolled (but can collaborate beforehand). The object of the game is for each to guess their own roll; they both win if each guesses correctly, otherwise they lose.

    I assumed that without any knowledge of what their die is, each person has a 1/6 chance of guessing correctly. This means that the pair have a 1/36 chance of both guessing correctly and winning the game.

    Coleman tried to tell me he could do better. If the two people agree to each guess what their opponent's die is showing, they have a 1/6 chance of winning. This didn't make sense to me - what your opponent rolled has nothing to do with what you rolled, and there is no way that guessing their roll could give you a better outcome.

    Time to test it - Monte Carlo style! I wrote a script that plays the game; it first plays guessing randomly, then guesses "intelligently" by following the strategy above. My gut told me that the two would have the same outcome, but my head was telling my gut to shut up and go have a sandwich.
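    A minimal sketch of such a simulation (the function and variable names here are mine, not necessarily the original script's):

```python
import random

def play(trials, use_strategy):
    """Play the dice game `trials` times; return the win fraction."""
    wins = 0
    for _ in range(trials):
        a, b = random.randint(1, 6), random.randint(1, 6)
        if use_strategy:
            # Each player guesses what their *opponent* rolled
            guess_a, guess_b = b, a
        else:
            # Each player guesses their own die at random
            guess_a, guess_b = random.randint(1, 6), random.randint(1, 6)
        if guess_a == a and guess_b == b:
            wins += 1
    return wins / trials

print("random:  ", play(100_000, use_strategy=False))  # ~1/36
print("strategy:", play(100_000, use_strategy=True))   # ~1/6
```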

    Here's the result of playing randomly:
    We get 2.8% (around 1/36); just as expected!

    Now let's try playing by the strategy. My gut says it will also be 1/36...
    16.7% -- around 1/6! What... how... ???

    So Coleman was right. Here is how I convinced myself that the math was actually true - let's see if it makes sense outside of my head. By agreeing to play by the strategy, you're changing the game. Before, the game was two people each guessing a die roll correctly; this has a 1/36 chance of working. With the strategy, the game becomes: do the two dice show the same number? That happens in 1/6 of games. So you change the game with your strategy, and get a much better win percentage.

    You can check out the Monte Carlo script I used here. I'm gonna go find some dice, and hit Coleman with a sandwich.

    Thursday, May 5, 2011

    Relearning Data Structures with Python

    I realized the other day that I had learned about tons of data structures in class, but I've never used some of them in production. So, I set out to write simple implementations of a hash table and binary tree. After all, once I've built one in practice, I'm more likely to reach for one when it's actually useful.

    The focus in this exercise was clean implementation: keep the code as short as possible without sacrificing readability. I think I've struck a good balance.

    I've found that since I've learned Clojure, I'm more likely to use functional style coding in other languages. For example, here's my hash_function in Python:
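    A sketch along those lines (the exact class and parameter names are my assumptions; the full version is in the repo):

```python
class HashTable:
    @staticmethod
    def hash_function(key, table_size):
        # Sum the character codes of the key, modulo the table size
        return sum(map(ord, key)) % table_size
```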
    Note the use of map. I was even using reduce for a while, but Python's sum led to a cleaner implementation. Also note that I'm using the decorator @staticmethod. Functional style calls for functions that have well defined inputs and outputs without side effects - declaring this method to be static explicitly shows that I am following that principle.

    Compare that to how I would have written this six months ago:
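    A sketch of that older, imperative style (names assumed, same behavior):

```python
class OldHashTable:
    @staticmethod
    def hash_function(key, table_size):
        # Accumulate the character codes in a loop, then take the modulus
        total = 0
        for character in key:
            total += ord(character)
        return total % table_size
```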
    The new, functional way is much cleaner. 

    There are definitely many things I could add for my next steps. Some structure changing methods (dynamically allocated hash table? self balancing tree?) could be pretty cool.

    You can check out the source on github. There is a tester file that shows how to use the class (and even provides some profiling code I used). To enable a tester, call it from the main function in