gelato – Karan Parikh

We’ve been using Fabric to set up and build Gelato on AWS. Each time I use it I’m left in awe of how good it is. Going from manually SSHing into each machine for every little thing to having Fabric build your code on 15 machines in parallel is hard to describe.

One thing we had trouble with was getting Fabric to run a task in parallel on hosts in specific roles. To run a task in parallel you use the @parallel decorator, and to run a task on hosts by role you use the @roles decorator. If you want both, you have to be careful about the order in which you apply these decorators. Here is what worked for us:

	…

	env.roledefs = {
	    "service_A": ["hostA1", "hostA2", …],
	    "service_B": ["hostB1", "hostB2", …],
	    "service_C": ["hostC1", "hostC2", …],
	    …
	}

	@task
	@parallel
	@roles("service_A", "service_B", "service_C")
	def build():
	    …
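
If you are wondering how the order can matter at all: Python applies stacked decorators from the bottom up, so in the snippet above @roles is applied first and @task last. Here is a minimal sketch of that rule using a stand-in decorator. `tag` and the `applied` attribute are names I made up for illustration; this is not Fabric's code and says nothing about what Fabric does internally.

```python
def tag(name):
    """Return a stand-in decorator that records when it is applied."""
    def decorator(func):
        # Append this decorator's name to the list kept on the function.
        func.applied = getattr(func, "applied", []) + [name]
        return func
    return decorator


@tag("task")
@tag("parallel")
@tag("roles")
def build():
    pass


# The decorator closest to the function is applied first.
print(build.applied)  # ['roles', 'parallel', 'task']
```

So whatever sits at the top of the stack receives the result of everything below it.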

P.S. Make sure you set the correct bubble size (the pool_size argument to @parallel) if you have a large number of hosts!
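
To give a feel for what a bubble size does, here is a rough standard-library sketch: 15 hosts are processed, but never more than 5 at once. Everything in it is an assumption for illustration (the host names, `POOL_SIZE`, the `build` stand-in), and it uses threads where Fabric uses separate processes, so treat it as a picture of the idea rather than of Fabric itself.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

HOSTS = ["host%02d" % i for i in range(15)]  # hypothetical host names
POOL_SIZE = 5                                # hypothetical bubble size

lock = threading.Lock()
active = 0  # builds running right now
peak = 0    # highest number of builds seen running at once


def build(host):
    """Stand-in for a remote build; only tracks concurrency."""
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.05)  # pretend to do the real work
    with lock:
        active -= 1
    return host


# The executor never runs more than POOL_SIZE builds at the same time.
with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    done = list(pool.map(build, HOSTS))

print(len(done), "hosts built, at most", peak, "at once")
```

A smaller bubble is gentler on the machine running the deploy, while a larger one finishes sooner, so the right value depends on how many hosts you have and what that machine can handle.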

For our Advanced Distributed Systems (CS 525) final project, Onur and I are working on a system we’ve named Gelato. I will have more details about it in a month when we (hopefully) open source our code. Our tech stack for Gelato looks something like this:

Apache HBase for persistence of data
Apache ZooKeeper for service discovery, load balancing, group membership, and failure detection
RabbitMQ for internal publish-subscribe
Apache Thrift for RPC
Fabric for deploying our code

We were choosing between Cassandra and HBase and went with HBase because it has a native Java API and is pretty easy to run on AWS thanks to AWS EMR.

The languages we are using are:

Java for the core Gelato system
Python to gather performance metrics

Gelato is pretty much the most complex system I’ve built during college and I’m really excited to see how it finally turns out.

Tag: gelato

Running Fabric tasks in parallel based on roles

Gelato Tech Stack