(This post is a summary of two papers I have recently read. Papir is the Norwegian word for paper)
Real-Time Twitter Recommendation: Online Motif Detection in Large Dynamic Graphs is a paper that was presented at VLDB 2016. It combines two of my favorite topics, distributed systems and graph theory, into a short (2 pages!) paper. It presents a simplified version of the algorithm that Twitter uses to detect motifs in real-time in a user’s social graph, which is then used to generate recommendations for the user. One thing I liked about this paper is that it presents naive solutions to the problem at hand before diving into the elegant solution that Twitter uses. The paper then presents their solution to the problem, and explains how it works at Twitter scale by graph partitioning, pruning, and offline data structure generation.
Design patterns for container-based distributed systems is a paper by Google that talks about software design patterns that are emerging from software systems that are built around containers. Software like Docker and CoreOS has made working with containers easier, and more and more companies are moving towards a container based ecosystem. Google was one of the first companies to use containers, and this paper contains design and architecture patterns that they have observed in their container based systems. The design patterns presented are grouped under three main categories of which I enjoyed reading about “Multi-node application patterns” the most. This sections deals with design patterns in distributed systems, where each node holds multiple related containers (called “pods” in the paper). It was interesting to read about how distributed system problems like leader election, scatter-gather, etc. can be dealt with language agnostic containers rather than by language specific libraries. I loved this line from the end of the paper, which made me think of containers in an entirely new light:
In all cases, containers provide many of the same benefits as objects in object-oriented systems, such as making it easy to divide implementation among multiple teams and to reuse components in new contexts. In addition, they provide some benefits unique to distributed systems, such as enabling components to be upgraded independently, to be written in a mixture of languages, and for the system a whole to degrade gracefully.
Been thinking about it twice about the recommendation emails that I get now that I read that paper. Thanks for the recommendation! Do you know if they’ve open-sourced the implementation, or if there’s something out there that shows the implementation?
There is no open source implementation that I’m aware of.