Running Fabric tasks in parallel based on roles

We’ve been using Fabric to set up and build Gelato on AWS. Each time I use it I’m left with a sense of awe at how amazing it is. Going from having to manually SSH into each machine to do anything to having Fabric build your code on 15 machines in parallel is indescribable.

One thing that we were having trouble with was having Fabric run a task on specific host roles in parallel. To run tasks in parallel you use the @parallel decorator, while to run tasks on hosts by roles you use the @roles decorator. If you want to run tasks in parallel on specific hosts you have to be careful of the order in which you apply these decorators. Here is what worked for us:


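from fabric.api import env, parallel, roles, task

# Map each role name to the hosts that serve it.
env.roledefs = {
    "service_A": ["hostA1", "hostA2", …],
    "service_B": ["hostB1", "hostB2", …],
    "service_C": ["hostC1", "hostC2", …],
}

# This decorator order is the one that worked for us:
# @task outermost, then @parallel, then @roles.
@task
@parallel
@roles("service_A", "service_B", "service_C")
def build():
    pass  # build steps go here (the body is omitted in this snippet)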


P.S. make sure you set the correct Bubble Size if you have a large number of hosts!
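To make the bubble idea concrete, here is a minimal sketch in plain Python. It is not Fabric's implementation (Fabric 1.x runs parallel tasks in separate processes; this sketch uses threads to stay short), and the names `ROLEDEFS`, `hosts_for_roles`, `run_in_bubble` and `fake_build` are made up for illustration. It only shows the two ideas at play: several roles are merged into one host list, and at most `pool_size` hosts are worked on at any one time. In Fabric itself, as far as I know, the limit is set with `@parallel(pool_size=N)` or the `-z N` command line option.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical role definitions, standing in for env.roledefs.
ROLEDEFS = {
    "service_A": ["hostA1", "hostA2"],
    "service_B": ["hostB1", "hostB2"],
    "service_C": ["hostC1", "hostA1"],  # hostA1 belongs to two roles
}


def hosts_for_roles(roledefs, *roles):
    """Merge the hosts of the given roles, dropping duplicates but keeping order."""
    seen = set()
    merged = []
    for role in roles:
        for host in roledefs[role]:
            if host not in seen:
                seen.add(host)
                merged.append(host)
    return merged


def run_in_bubble(task, hosts, pool_size):
    """Call task(host) for every host, with at most pool_size calls in flight."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(task, hosts))
    return dict(zip(hosts, results))


def fake_build(host):
    # Placeholder for the real work (for example, a build command run over SSH).
    return "built on " + host


if __name__ == "__main__":
    hosts = hosts_for_roles(ROLEDEFS, "service_A", "service_B", "service_C")
    for host, result in run_in_bubble(fake_build, hosts, pool_size=2).items():
        print(host, "->", result)
```

With a small `pool_size` your local machine only juggles a few connections at a time, at the cost of a longer overall run.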

Gelato Tech Stack

For our Advanced Distributed Systems (CS 525) final project Onur and I are working on a system we’ve named Gelato. I will have more details about it in a month when we (hopefully) open source our code. Our tech stack for Gelato looks something like this:

We were choosing between Cassandra and HBase, and went with HBase because it has a native Java API and is pretty easy to use on AWS thanks to AWS EMR.

The languages we are using are:

  • Java for the core Gelato system
  • Python to gather performance metrics

Gelato is pretty much the most complex system I’ve built during college and I’m really excited to see how it finally turns out.

2012: Year in review

2012 was a great year for me for 4 reasons:

  1. The world didn’t end.
  2. 4 Hackathon victories: 3rd prize at Greylock, an award at an internal hackday at LinkedIn, 1st prize at the Facebook Hackathon at UIUC and 2nd prize at the Facebook Hackathon finals. We were also one of the finalists for the LinkedIn Intern Hackday 2012.

    I worked with Sam and Onur on all the hackathons apart from the LinkedIn internal hackday, for which I worked with my awesome mentor Jim.

    I also love how our Computer Vision technology evolved over time with each hack: we started with object tracking using colors (we wore colored socks on our hands for dance()), next we were able to pull off object tracking within a bounded region (we were able to track a finger in ABSees) and finally, for StreetCoders, we had a system that didn’t require colored socks or a bounded region!

    I’m extremely proud of what Sam, Onur and I achieved and it was great working with them.

    What didn’t improve though was the quality of our Javascript code :(. Each of our hackathon projects started with us working on the computer vision first. This usually took the most time. Once we were pretty confident that our computer vision components worked locally (no images coming over the network), we started working on the Javascript portions of the code that actually talked to the computer vision servers. It was also at this point that we would realize that we had 10-12 hours left to build the majority of our application. This scramble to the finish line usually ended with Javascript code that was functional but littered with code smells. Our overall tiredness towards the end didn’t help either. Oh well.

  3. I got an opportunity to intern at an awesome company. My Summer at LinkedIn was phenomenal.
  4. There was a new addition to my family: my dog.

Looking forward to a great 2013.

Ciao!

I realized that I haven’t blogged in quite some time. My blog post on Summer 12, which was meant to be a two-part post, stands with only one part on my blog and the other part in my head. I’ll get to it eventually, maybe during Thanksgiving Break.

I think the main reason I haven’t had the time to blog is my 18 credit hour workload this semester. The last time I took 18 hours was during first semester freshman year and the classes were nowhere near as hard 🙂

I’m not complaining though: the level of the classes I’m in this semester is absolutely fantastic. I think my favorite class right now is Distributed Systems (CS 425). It’s something that I’ve been interested in for quite some time and learning about the various algorithms, concepts and existing systems has been excellent. The coding assignments in this class are great too. So far I’ve worked on a distributed log querying system and a group membership system with failure detection and no SPOF. I’m pretty excited for my next assignment which involves building a simplified version of HDFS. My Java skills improved by leaps and bounds this Summer at LinkedIn, and this class is only making me better. I’ve had a chance to work with Futures, Timers and lots of Threads. Designing a system from the ground up and having it work is an extremely rewarding experience. 

Natural Language Processing (CS 498) is pretty awesome too. All the concepts I’ve learnt in that class so far seem intuitive and the math seen so far (mostly probability) isn’t too bad. Plus, the coding assignments are in Python :D.

Machine Learning (CS 446) is definitely my hardest class. The concepts are much more involved than in my other classes and the math is much harder. But the feeling you get when you tweak a few parameters of your algorithm and watch the accuracy jump up 10% cannot be described. Even though it’s a really hard class, it’s also a great class and I would recommend it to anyone. We’ve gone over some really interesting stuff so far (Perceptrons, PAC, SGD, Kernels etc.) and there’s a lot more cool stuff left for the rest of the semester.

My other classes, namely Ethics, Software Engineering and Italian are also really good. 

Overall though, with all the work from these classes, I really don’t have time for anything else apart from school work. In other (maybe related) news, coffee is now my new best friend.

Summer 2012 – Part I: The Internship

This Summer I was a Software Engineering intern at LinkedIn in Mountain View, CA. I worked on the Presentation Infrastructure team under CORE. You can find out more about what I worked on here.

What was really unique about my internship (apart from the amazing people I worked with, the excellent food and the ridiculous perks) is that I got to spend nearly equal amounts of time working on application and infrastructure development. Application development is something that I had “done” before, in that in all the hackathons I’d taken part in I’d essentially built applications. But I’d never gotten a chance to work on applications that function at “LinkedIn-scale” before, and getting an opportunity to do so was a great learning experience. What I’d never done though was infrastructure development, and I really enjoyed working on infrastructure components this Summer. The whole concept of “building applications used to build applications” really appealed to me.

I also got to code a little bit in Scala this Summer. I’d read about Scala and had gone through a few “Hello World”-ish tutorials before but had never actually built anything in it. I have to say, from whatever little work I did in it and from the code I saw others had written, that I really like Scala. The OO + functional form of the language appeals to me. The language has a lot of beautiful concepts (like pattern matching and objects) and, from what I’ve heard, it performs well too. Another concept that I really like is Actors (though I never got to write any code that used Actors this Summer). Side note: one of the reasons that I wanted to learn Scala was to learn a language that uses the Actor system, which I first heard of while looking into Erlang. This Summer got me quite excited about Scala and I will definitely try to learn the language properly, i.e. delve more into advanced Scala topics like the Parallel Collections, writing DSLs using Scala, Actors, Views etc. My experience with Java, Python and OCaml helped me get really good at Scala over the Summer and I want to retain this skill. I also want to build a complete application in Scala; I have this idea to build something similar to Scrapy using Scala + Akka, though I don’t see when I will find the free time to do so.

Overall, my internship was excellent. It was all that I wanted and more. Thanks for an awesome Summer, LinkedIn! 🙂

Reflections on Spring 12

Wow. I guess I’m a senior now. Feels a bit strange to be honest. I’ve grown a lot these past three years, and I can’t wait to see what my last year at UIUC holds. I feel that my next two semesters are going to be as crazy as Spring 12 was. I’m pretty sure I will be taking 18 credit hours each semester, and am also looking to do some research. 

Let’s talk a bit about Spring 12. CS 473: Algorithms was AWESOME. It was a lot of hard work, but the material that I learnt in that class is invaluable. My favorite topics had to be Greedy Algorithms and Dynamic Programming. There is just something beautiful about these two topics that appeals to me. In CS 440: AI, I enjoyed our discussions about Bayesian Networks and HMMs tremendously. I’d come across these terms before but could never fully understand what they meant, or why they were used. 440, and its book, changed that. I also improved my Python skills in this class since I did all the programming assignments in Python. CS 421: Compilers was another great class. I came into the class not caring much about compilers in general, but along the way I developed an appreciation for language design, parsers, interpreters and compilers to such an extent that I’m considering taking more advanced compiler classes before I graduate.

The only thing that disappointed me about Spring 12 was that I didn’t have the time to learn a new language/framework/database etc. Hopefully, I shall make up for that this Summer. 

 

Post Facebook Camp Hackathon, Pre Yahoo Open Hack All Stars 2011

Facebook held its Camp Hackathon at UIUC yesterday, and it was another great experience. Sam and I built a system to remotely control your iTunes music library via text messaging, a web interface and voice. Technologies used were Python, PHP, Twilio and NodeJS (NowJS and Express). I’m a huge fan of NowJS. It’s an excellent product that makes realtime communication in NodeJS so much simpler, and it opens up a world of possibilities in terms of applications that can be built. It’s the fourth time I’ve used this library, and each time its elegance blows me away. The same holds true for Express: robust and easy to use (though we didn’t use it heavily for this project). All the text messaging stuff was handled via Twilio, another service that I am a huge fan of.

Heading out to New York tomorrow for the Yahoo Open Hack All Stars 2011. I’m super excited for this event and can’t wait for the competition to begin!

Snip: pastie.org clone using Express and Redis

I’ve been wanting to learn and write something using Redis and Nodejs for the past few days. While scouring the Internet for knowledge I came across this article that describes how to implement a simple pastie.org clone using Redis and nodejs. I used this article as ‘inspiration’ and modified their sample application.

Instead of using Nerve for routing and returning HTML in the form of a string, I decided to use Express to build the application and Jade for the templates. I also replaced Pygmentize, used for syntax highlighting, with a Javascript syntax highlighter. Lastly, I swapped the Redis module ‘redis-client’ for the recommended ‘redis’ module. I then rewrote the application using these new tools.

Code is here.