Jet lag, a book, research papers, and a dog

I reached Vadodara on the 23rd and was hit with jet lag that lasted about three days.

I’d bought Mira Jacob’s The Sleepwalker’s Guide to Dancing with the hope that it would last me a majority of my visit to India. I read ~100 pages of the book while in the U.S. On my first night in Vadodara I woke up around 3.30am (because of the afore mentioned jet lag) and finished the remaining ~400 pages in four hours. This is an incredible book, and I can see why it was a nominee for Goodreads’ Best Fiction books of 2014. Simply put, I loved it. And yes, I did appreciate the irony of the title given my battle with temporary-time-difference-induced sleeplessness.

With my reading plans in shambles (OK not completely in shambles since I am not yet done with Borges’ Labyrinths which is also amazing) I decided to work on one of my 2014 goals and read some research papers.

Here is what I read:

    1. Scaling Memcache at Facebook – This paper talks about how Facebook leveraged memcached to build a distributed key-value store. I thought the usage of UDP for making get requests and flow control mechanisms to combat incast was particularly interesting. Overall this is a very good paper and I would highly recommend it to anyone interested in distributed systems and system architecture. My favorite line in the paper? “Simplicity is vital.” (section 9).
    1. MultiQueues: Simpler, Faster, and Better Relaxed Concurrent Priority Queues – this paper introduces MultiQueues, a concurrent priority queue with relaxed semantics, i.e. you are not always guaranteed to get a globally minimal item from the data structure. Most of the papers I read are systems papers, so it was refreshing to read a paper that dealt more with data structures. MultiQueues are conceptually quite simple and their performance (as shown in section 6 of the paper) is impressive.
    1. You’re Doing It Wrong – this article introduces a B-heap, which is a VM page-friendly implementation of a binary heap. Kamp writes really well, and this article is a joy to read. My main takeaway from this article was the reminder that one should also consider I/O and memory access patterns while analyzing algorithms. This idea was introduced to me in CS-232 at UIUC and it is something that I always try to keep in mind while looking at an algorithm or trying to improve performance.

Oh, and here is the dog!

IMG_2360

On time and art

I was looking at one of my favorite works of art recently and realized that it reminds me a lot of the concept of time in distributed systems. The melting clocks are a great illustration for the fact that the notion of time as we understand it makes little sense in a distributed system where each node has its own physical clock that runs independently of the other clocks. In a single node system one can look at timestamps to figure out the ordering of events. However, this concept breaks down for a multi-node system involving multiple clocks due to problems like clock skew. While several types of logical clocks have been created to solve this problem and help come up with causal ordering between events I think it is fascinating that time, something we pretty much take for granted everyday, is something you cannot rely on anymore in a distributed system. This was one of the first problems I was exposed to while studying distributed systems and understanding how one can solve it was extremely intellectually satisfying.

Aside – in the past few years I’ve discovered that I’m a fan of surreal art. Another artist I quite like is Rene Magritte, with The Son of Man, The Human Condition, and The Treachery of Images being my favorite works by him.

On Zab

Apache ZooKeeper has become an indispensable component for many distributed systems: Apache Hadoop, Apache Mesos, Apache Kafka, Apache HBase, etc. all use ZooKeeper in some form or the other. I’ve written code that interacts with ZooKeeper and I’m a big fan of the simple APIs and powerful guarantees it provides.

I had a vague idea of the broadcast protocol that powers ZooKeeper but wasn’t awake of the details. So this weekend I decided to read a short paper that gives an overview of Zab (a more detailed description of Zab can be found in “Zab: High-performance broadcast for primary-backup systems” by Flavio P. Junqueira, Benjamin C. Reed, and Marco Serafini).

The title of the paper is extremely accurate – Zab is a very simple protocol that is intuitive and easy to understand. The paper does a great job of explaining the core concepts of the algorithm to the reader. I particularly liked section 3, which includes a comparison between Zab and Paxos. Section 4 is probably the most important section of the paper and is very well written. The figures illustrating the two main failure scenarios are a nice touch.

Next step – read the detailed Zab paper.

Let’s Go.

One of my goals of 2014 was to learn a new programming language, and for various reasons I decided on Go. While my learning progress has been pretty terrible for most of the year, I was able to make pretty good progress this week.

(Chart generated using the awesome pygal library. Code can be found here.)

Screen Shot 2014-09-28 at 11.09.48 AM

What made me get to 40% this month? Several things, but by far the most important one was A Tour of Go.

The tour is a fantastic resource for a person beginning to learn Go. It does a great job of explaining the important language features and constructs, and lets you try Go without having to install Go on your machine. This ability to try Go without having to install Go locally and setup your environment is incredibly powerful (this is how it works under the hood) and is something that I wish more languages incorporated into their websites. With the ability to compile almost anything down to Javascript one does not even have to build a remote sandbox environment (though that is more powerful and robust in my opinion) in order to introduce newcomers to your language.
I feel that the Tour could be improved by adding three things – syntax highlighting to the code in the console, more exercises, and including solutions to the exercises. Apart from that, I think it is a perfect learning resource.

My thoughts on Go? I think it is an interesting language, and something that I will most certainly delving into more. I’d read some Go code before (mostly Docker code) so I was vaguely aware of what the language looked like. I was initially a bit hesitant about the fact that Go has both high level (e.g. functional programming) and low level (e.g. pointers) features, but I think I quite like this ability to have and use both. The concurrency primitives look excellent, and I can’t wait to write some code using it. Another thing that scared me a little was the lack of classes, but structs + interfaces seem to be powerful enough to deal with that. I’ve read online about Go’s lack of generics being a problem, but I haven’t written enough code in Go to comment on this issue. I really liked the fact that the Go compiler doesn’t allow you to compile code that has unused imports or unused variables.

Next steps – read Go by Example and An Introduction to Programming in Go, and begin writing Go code to implement the two phase commit protocol.

TDD

I read about TDD and it’s associated red-green-refactor cycle while learning Ruby on Rails and I was quite intrigued by the concept. It seemed like a novel way to approach programming. From what I read on the Internet about it, it also seemed to r…

I read about TDD and it’s associated red-green-refactor cycle while learning Ruby on Rails and I was quite intrigued by the concept. It seemed like a novel way to approach programming. From what I read on the Internet about it, it also seemed to reduce bugs in the code and generally leads to better, more human understandable code.

So, when we were given our first programming assignment in CS 241, I decided to approach it with the TDD mantra. Programming this way was a great expirience, once I had come up with enough test cases for my assignment, programming it up wasn’t too hard because I had already captured all the edge/unusual cases. 

All in all, I was satisfied with this approach, and I will most definately be using it again for future assignments when possible.

Namespace error, of sorts

So we had ‘picture day’ in Phil 102 today. I was talking to the instructor about an article I found and wrote my name down as “CS” on the board instead of my slick moniker “KP”. Thankfully, I caught this error in time. Sometimes I wish life had sy…

So we had ‘picture day’ in Phil 102 today. I was talking to the instructor about an article I found and wrote my name down as “CS” on the board instead of my slick moniker “KP”. Thankfully, I caught this error in time. 

Sometimes I wish life had syntax highlighting built in.