Paper Review: Paxos Explained from Scratch

“Tutorial Summary: Paxos Explained from Scratch” is an extremely unique and interesting paper. As evident from the title, the paper attempts to explain the Paxos algorithm to the reader. What makes this paper great is that it builds up the Paxos algorithm step-by-step.

The Paxos algorithm is explained in the context of building a replicated state machine. The authors begin with a simple algorithm for consensus. By injecting failures in this simple algorithm we eventually derive the Paxos algorithm in a very natural fashion.

This is the first time I’ve read a bottom-up explanation of Paxos and I thought it was quite easy to understand. Each algorithm they present (building up to the Paxos algorithm) is also accompanied by a pictorial explanation which made concepts even more clear.

Overall, I loved this paper. If you’re looking to refresh your knowledge on the Paxos algorithm I would recommend reading this paper, followed by Paxos Made Simple.


Paul Buchheit on the death of his brother

Paul’s post is incredibly touching and beautifully written. The following lines stood out to me –

On a more practical level, what matters most in our day-to-day lives is that we’re good to ourselves and to each other.

What’s most important is that we are good too each other, and ourselves. If we “win”, but have failed to do that, then we have lost. Winning is nothing.

Ultimately, the people who learn to love what they do who will be the ones who accomplish the most anyway. Those who push only for the sake of some future reward, or to avoid failure, very often burn out, sometimes tragically. Please don’t do that.

Goal Tracking: May Edition

At the beginning of the year I published a post outlining what some of my goals for the year were. In the spirit of being transparent, here is the progress I made on them over the course of May –

  1. 0 hours of volunteering. I need to figure out a way to be more productive on Sundays.
  2. No shyness.
  3. 2 books read – The First Fifteen Lives of Harry August (I enjoyed it immensely. I might write a longer post about this book in the future) and The Simpsons and Their Mathematical Secrets (made me love The Simpsons (and Futurama) even more. I particularly enjoyed the sections on prime numbers and topology. I highly recommend reading this book if you’re even remotely interested in math and/or The Simpsons).
  4. I read 2 chapters from AOSA.
  5. 5 blog posts.
  6. I’m still stuck at one muscle up. Getting to two is proving to be harder than expected.


(ouvert means open in French)

Last week I contributed a small feature to clize. As before, I discovered this project on the GitHub page for trending Python repositories. The author had a list of open issues for the repository which made it easy to see what needed to be worked on and I picked one that caught my fancy.

Once I knew what needed to be done I had to figure out how to implement it. The first thing I did was see how the existing code handled unknown command line arguments. “Oh look, it printed ‘Unknown option’! That seems like a good place to start.” I ran an ack for the phrase “Unknown option” and found the relevant source code files. The next step was to figure out from where the parsed arguments lived inside the program. A well placed print statement that I added quickly solved that mystery.

With this knowledge in hand I began writing some code. The basic algorithm was pretty simple – in case the user enters a command line argument that is not one of the parsed arguments compute the Levenshtein distance between what the user entered and the known arguments and suggest one that has the lowest distance. This was more or less the initial pull request that I submitted. The author provided excellent feedback on my code and after a couple of iterations my commit was merged into the master branch.

Things I learnt along the way –


AOSA: The NoSQL Ecosystem Review

(this is a review of the chapter on The NoSQL Ecosystem in the Architecture of Open Source Applications)

Unlike the other chapters in the book (and as stated in the introduction to the chapter), this portion of the book doesn’t dive deep into the internals of one particular project. Rather, it gives readers an overview of the various algorithms and concepts that serve as the building blocks for NoSQL systems like Voldemort, Cassandra, HBase, etc. I think it does a great job at explaining what is out there once one moves away from the traditional relational model and SQL world. It also references several seminal papers, like the Google BigTable and Amazon Dynamo papers, which I urge people to read if they are interested in understanding more about the topics covered in this chapter.

Speaking as someone who has read numerous papers on distributed and NoSQL systems (as well as studied them in several courses at UIUC) I feel like I didn’t gain a whole lot from reading this chapter. It was still a very enjoyable read and I really liked the sections that talked about fsync, read repair, hinted handoff, and anti-entropy. The section on the differences between range-based and hash-based partitioning was excellent as well. One thing I particularly liked was the author’s use of examples to explain concepts like the relational model, range-based partitioning, hash rings, etc.

If you have zero or very little background in NoSQL systems I would highly recommend reading this chapter.

AOSA: Eclipse Review

(this is a review of the chapter on Eclipse in the Architecture of Open Source Applications)

Eclipse was the second IDE (the first was Turbo C++) I was ever introduced to. We used it during our first undergraduate computer science course to code in Java. I remember being blown away by how powerful it was, and how easy it made learning a new language. Even though I’ve switched to using IntelliJ IDEA for Java/Scala now, Eclipse still holds a special place in my heart.

The chapter on Eclipse is very well written and offers readers a glimpse into why Eclipse is the way it is, and why certain design decisions (like writing their own Java compiler) were made. The component and plugin based architecture used by Eclipse seems to be extremely flexible and easy to add new features to. The compatibility layer provided by the team for each new major release of Eclipse (so that plugins written against the older versions still work in the new version) is a great move by the team to iterate on the internals (and externals in the case of public APIs) while preserving the ecosystem of plugins that already exist. I particularly enjoyed the section on “Rich Client Platform (RCP)” that talks about how people used portions of Eclipse to build other, non-IDE, applications. Incremental builds is one of my favorite features of Eclipse and learning how that worked was quite satisfying as well.

Things I learned about –

  • OSGi
  • Native and emulated widget toolkits

Key takeaways –

  • Having a good API is paramount in others adopting or contributing to your software
  • Components are very powerful