Concurrency, Threads, Actors and Akka

Concurrency is a topic that’s close to my heart. It all started when I was introduced to Pthreads in my CS 241 Systems Programming class at UIUC. Having the ability to write code that could be doing multiple things simultaneously was (and is) pretty awesome. The assignments in that class, which included writing a parallel sorting function, a parallel version of make and a simple HTTP Server, really helped us appreciate the power of concurrency and threading. My love for concurrency has only grown since then and I’ve written concurrent code in both Java and Python.

This summer I spent some time with Scala Actors and Akka (I got familiar with the concept of actors on Scala 2.8.1 and then upgraded to Scala 2.10.2 for Akka 2.1.4. As mentioned, Scala 2.10 and above will use Akka for actors). I feel that the actor model is a great way to think about and write concurrent programs. By relying only on immutable message passing between actors for communication and coordination (yes, I know that you can pass mutable messages, but please don’t do that), I feel that the likelihood of errors common in thread-based concurrent programs, such as deadlock, livelock and race conditions, is reduced. Moreover, there are certain classes of applications, for example distributed systems applications, that can easily be modeled as a system of entities that communicate by sending messages to each other. Actors would be great for something like this, as the actor code would map pretty neatly to the message flows in the system.
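To make the idea concrete, here is a minimal sketch of the actor pattern. It is written in Python with only the standard library, not in Scala, and it is not Akka's API: the `AccountActor` class, the message types and the `demo` function are all hypothetical names made up for illustration. The actor owns its state, processes one message at a time from a mailbox, and the messages themselves are immutable tuples.

```python
import queue
import threading
from collections import namedtuple

# Immutable messages: namedtuple fields cannot be reassigned after creation.
Deposit = namedtuple("Deposit", ["amount"])
GetBalance = namedtuple("GetBalance", ["reply_to"])
Stop = namedtuple("Stop", [])


class AccountActor(threading.Thread):
    """Hypothetical actor: owns its state and handles one message at a time."""

    def __init__(self):
        super().__init__()
        self.mailbox = queue.Queue()
        self._balance = 0  # only ever touched by this actor's own thread

    def tell(self, message):
        """Asynchronously put a message in the actor's mailbox."""
        self.mailbox.put(message)

    def run(self):
        while True:
            message = self.mailbox.get()
            if isinstance(message, Deposit):
                self._balance += message.amount
            elif isinstance(message, GetBalance):
                message.reply_to.put(self._balance)
            elif isinstance(message, Stop):
                break


def send_deposits(actor, count):
    """Send `count` one-unit deposits to the actor."""
    for _ in range(count):
        actor.tell(Deposit(1))


def demo():
    actor = AccountActor()
    actor.start()
    # Four threads send messages concurrently. No locks are needed because
    # only the actor's thread ever reads or writes the balance.
    senders = [
        threading.Thread(target=send_deposits, args=(actor, 100))
        for _ in range(4)
    ]
    for sender in senders:
        sender.start()
    for sender in senders:
        sender.join()
    # Ask for the balance by passing a reply queue, then shut the actor down.
    reply = queue.Queue()
    actor.tell(GetBalance(reply))
    balance = reply.get()
    actor.tell(Stop())
    actor.join()
    return balance
```

Calling `demo()` returns the balance after all 400 deposits. The mailbox is first-in, first-out and the balance request is sent only after every sender has finished, so the actor has already handled every deposit by the time it replies.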

Akka builds on top of the actor model and adds a bunch of cool features. Some of my favorites include: supervision and monitoring, being able to refer to any actor using a hierarchical path, FSM support, Netty-based remoting, and routers. This blog post is a great example of why I’m a fan of Akka. The author was able to model his problem as an FSM and translate it pretty neatly into code. This example ties in with what I was saying earlier about actors and distributed systems. By decomposing the system into a set of entities and the messages that can be exchanged between them, the author of the blog post was able to come up with an elegant solution for the problem.

For programmers who’ve only ever used the threading model of concurrency in the past, I say give actors a try! You might be able to come up with a more awesome and well-structured solution to your problem. Akka also has a Java API in case you do not want to use Scala.

Akka resources:

  1. Akka Java and Scala guides (these are amazing)
  2. The Akka Team Blog

Testing Async Functions with Jasmine

Sam and I have been using Jasmine as our JavaScript testing library for our Software Engineering project, mainly because I’ve used QUnit in the past and wanted to try something new. It also makes the tests in our group seem “uniform” in that Jasmine tests look a lot like RSpec, which is what the Rails team on our project is using. Our application has a RESTful Rails backend and a frontend written in Backbone.

We have a function in our Backbone view that calls a function on our Backbone model. The Backbone model talks to our RESTful backend and based on the response triggers the “success” or “error” callback functions provided by the view. Here is the pattern that we used to test this function:

 

Running Fabric tasks in parallel based on roles

We’ve been using Fabric to set up and build Gelato on AWS. Each time I use it I’m left with a sense of awe at how amazing it is. Going from having to manually SSH into each machine to do anything to having Fabric build your code on 15 machines in parallel is indescribable.

One thing that we were having trouble with was having Fabric run a task on specific host roles in parallel. To run tasks in parallel you use the @parallel decorator, while to run tasks on hosts by roles you use the @roles decorator. If you want to run tasks in parallel on specific hosts you have to be careful of the order in which you apply these decorators. Here is what worked for us:

P.S. make sure you set the correct Bubble Size if you have a large number of hosts!

Snip: pastie.org clone using Express and Redis

I’ve been wanting to learn and write something using Redis and Node.js for the past few days. While scouring the Internet for knowledge I came across this article that describes how to implement a simple pastie.org clone using Redis and Node.js. I used this article as ‘inspiration’ and modified their sample application.

Instead of using Nerve for routing and returning HTML in the form of a string, I decided to use Express to build the application and Jade for the templates. I also replaced Pygmentize, used for syntax highlighting, with a JavaScript syntax highlighter. Lastly, I swapped the Redis module ‘redis-client’ for the recommended ‘redis’ module. I then rewrote the application using these new tools.

Code is here. 

 

LinkedIn Intern Hackday

WOW.

That is pretty much all I have to say after getting back from the Intern hackday. The sheer level of technical wizardry on display was ridiculous. There was a VNC client implemented in JavaScript, a multiplayer capture the flag game that used Node.js + WebGL, a web application written in Django that allows you to share files by ‘streaming’ them to another user (no storage on an intermediate server) … all in all, absolutely amazing. Sam and I built a web-based AI called ‘Sherlock’. We used a homegrown audio recorder + chunker written in Python, Google + Bing + Qwiki APIs and NowJS for server-client distribution. I was really happy with how our application turned out in the end: it looked great and functioned pretty well. We made the final 15 (out of 45 teams) and I am really proud of what we accomplished over the course of 16 hours.

For the curious, the winners were: 1st prize went to the capture the flag game, 2nd prize was taken by Linked Out (an application that used data available on LinkedIn to predict who will change jobs) and 3rd prize was grabbed by the Django streaming file-sharing application (they called themselves Beamit).

#inday, I shall miss you.

 

On Django

A week or two ago I started learning Django, and wrote my first app, a simple contacts book thingy. Right from the get-go, I was amazed by how natural everything felt in Django, at least to me. The MTV pattern seemed really intuitive and I had no problem diving right in and creating an app. The Django documentation is extremely well written and answered all the questions I had while coding. Even though I was creating my first Django app, I had no problems incorporating generic views, model forms, pagination etc. In order to have database migration support (yes, I kept changing the schema even for a simple app :p) I installed South and everything was smooth sailing from there. The last thing I want to add to the app is search capabilities, and for this I’ve decided to use the Haystack application. I’m pretty sure this is overkill for such a simple app, but I wanted to try out this application and hence decided to throw it in.

After working with Django, I’ve decided to go back and give Rails another go. I’ve almost completely forgotten all the concepts from Rails, and I would love to refresh my memory. 

Mozilla World Series of Hack

Sam and I took part in Mozilla’s World Series of Hack held on 7/22/11 7.30pm to 7/23/11 7.30am. The rules of this contest were slightly different from previous hackathons I’ve attended: contestants were allowed to start work on their projects as early as 7/15. They would then have 12 hours to work on it during the event and then they must demo at 8.00am. 

Sam and I had an idea that we started working on sometime around 7/19, but on the day of the competition we decided to completely change our idea and build something in the 12 hour period.

We wanted to implement a turntable.fm clone in 12 hours. Our concept revolved around a social music experience, where peers (users) connected to hubs (channels) and within a hub they could play music that would be heard by all peers connected to that hub. Each hub would have a ‘now-playing’ queue and songs added by peers would get added to this queue. Everyone present in the hub could also chat with other users present in the hub.

We implemented our idea using node.js, making heavy use of the awesome NowJS library for communication between the server and the clients. All audio playback was handled using html5.

NowJS is a wonderful library, and it was no surprise that a LOT of teams at the event took full advantage of its capabilities. The guys from NowJS were also at the event and were super helpful.

Overall, the event was a great experience, and I learnt a lot in 12 hours. Kudos to Mozilla and the engineers from all the other companies who helped make WSOH kickass.

 

Tweet streamer using Node.js

A day or two ago I decided to learn Node.js and was looking for tutorials and articles on the internet for the same. In my quest for knowledge I came across this tutorial that talks about building a real-time tweet streamer using Node. Unfortunately, the code as presented on the website doesn’t work, I guess due to changes in the Node API since the tutorial. After a bit of researching and digging around in the Node API docs, I managed to produce a version that works.

Be warned though, this application makes a LOT of requests to the Twitter API, which can result in your application being banned from using the Twitter API for an hour. Also, this is the first Node application that I have written, so I might be making a lot of beginner mistakes. Please forgive me for this. The goal of writing this application was to have a working version of what was presented to us in the tutorial.

The version of Node I am using (based on the output of 'node -v') is 'v0.5.0-pre'.

Here is my code.

All your __import__ are belong to us

Everyone knows and loves Python’s import statement. It is used to import external modules into the current module/script we are writing. Here are a few simple examples illustrating how it can be used:
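As a minimal sketch, the usual forms look like this. The modules are standard library ones picked purely for illustration, and the variable names are made up:

```python
import math                     # bind the module itself
import os.path                  # import a submodule; binds the top-level name `os`
from collections import deque  # bind a single name from a module
import json as j                # bind the module under an alias

# Each imported name is now usable in this module.
area = math.pi * 2 ** 2
joined = os.path.join("a", "b")
recent = deque([1, 2, 3], maxlen=2)
encoded = j.dumps({"ok": True})
```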

Internally, an import statement makes a call to the built-in __import__ function. However, there are two cases where the import statement will not work and we have to call __import__ ourselves.

The most common case where we would have to call __import__ directly would be when we want to import specific modules at run-time based on user input. Here is how we might do that:
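A minimal sketch of a run-time import follows. The helper names `load_module` and `call_from_user_choice` are hypothetical, and the "user input" is hard-coded here so the snippet runs on its own:

```python
def load_module(name):
    """Import and return the top-level module called `name`, chosen at run time."""
    return __import__(name)


def call_from_user_choice(module_name, function_name, *args):
    """Hypothetical helper: call a function by name in a module picked at run time."""
    module = load_module(module_name)
    return getattr(module, function_name)(*args)


# In a real script the two strings might come from input() or sys.argv.
result = call_from_user_choice("math", "sqrt", 16)
```

One thing to watch for: with a dotted name such as "os.path", a plain `__import__("os.path")` returns the top-level `os` module rather than the submodule. The standard library's `importlib.import_module` wraps the same machinery and returns the submodule directly, which may be more convenient in that case.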

Yes, I know that we didn’t really have to import at run-time in the previous example. It was just meant to be a simple example 🙂

Another case where we might have to call __import__ is when, for some reason, the module we want to import, or one of its parent modules/folders, has a name containing characters that Python does not allow in identifiers. For example, something like the snippet below would not work:

Notice how the hyphen is not allowed in the module name. Here is how we might work around that:
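A minimal sketch of that workaround, assuming a hypothetical file named `my-module.py` that the snippet creates in a temporary directory so it can run on its own:

```python
import os
import sys
import tempfile

# Create a throwaway module whose file name contains a hyphen.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "my-module.py"), "w") as handle:
    handle.write("def greet():\n    return 'hello from my-module'\n")

# Make the temporary directory importable.
sys.path.insert(0, module_dir)

# The statement form is rejected by the parser before any import happens.
try:
    compile("import my-module", "<example>", "exec")
    statement_is_valid = True
except SyntaxError:
    statement_is_valid = False

# The function form takes the name as a plain string, so the hyphen is fine.
my_module = __import__("my-module")
greeting = my_module.greet()
```

Because the module name is not a valid identifier, we choose our own variable name (`my_module` here) to hold the module that `__import__` returns.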