Red

(The worst part about jet lag is jet lag. The best part about jet lag is that it makes be very productive for some reason. Last year I read a book and a few research papers. This year I finished reading the “Red Book” while not being able to sleep according to the time zone I’m in)

As I’ve mentioned before, databases hold a special place in my heart. I think they’re incredibly interesting pieces of software. State of the art database systems that exist today are result of decades of research and systems engineering. The “Red Book” does a superb job in explaining how we got here, and where we might be going next.

The book is organized into chapters that deal with different components and related areas of database systems. The authors pick a few research papers that are significant in the chapter under discussion and then offer their commentary on them, as well as explain the content of the paper and talk about other related systems/papers/techniques/algorithms. The authors (Peter Bailis, Joseph M. Hellerstein, and Michael Stonebrakerhave a lot of technical expertise in database systems which makes this book an absolute delight to read. I particularly enjoyed the personal anecdotes and commentaries that sprinkled throughout the book. My favorite chapters in the book were the ones on weak isolation and distribution and query optimization.

While reading this book I made note of all research papers that are referenced in this book that I would like to read next. I will be working on that list over the duration of my vacation.

passwd

Over the weekend I read an interesting paper: How to Memorize a Random 60-Bit String. This was probably the first security paper I’d read in quite some time. I discovered this paper via The Morning Paper, which in my opinion is one of the best resources for people interested in computer science research papers. The title of the paper caught my eye and I decided to read the entire paper (this is typically my strategy with The Morning Paper; I tend to use it for “paper discovery”).

This paper builds on the password generation mechanism introduced by XKCD. It modifies XKCD’s approach to work for 60-bit long passwords and also proposes new schemes for English password generation based on random 60-bit strings. We use a lot of online services in our lives, and creating a secure + easy to remember password (especially if you use the same password for all the services. Please don’t do this.) is essential in order to safeguard ourselves.

While XKCD’s scheme is based on a simple dictionary lookup in order to generate a 4 letter phrase, the methods proposed in the paper are more involved. They all use n-gram language models in order to generate English word passwords that, while random, still seem to have the correct structure of a grammatically correct English sentence fragment or phrase.

I think the most impressive scheme proposed is the one in which a 60-bit string is converted into a poem. Yes, you read that right, a poem. This is done using word information from the CMU pronunciation dictionary + FST + FSA for their accepted poem structure. I thought this was the most interesting part of the paper. It was also the part that took me the longest to read.

The paper ends with a section on experiments showcasing user preference and recall for the difference generation schemes proposed. What is interesting is that even though the poetry scheme has the highest recall percentage it has the second lowest user preference percentage (the XKCD method has the lowest preference).

This paper was a fun, short read. I really enjoyed it!

Paper: Read-Copy Update

With multicore machines now the norm, writing code that scales and performs well in multithreaded scenarios is becoming extremely important. I was thus delighted to discover the paper on Read-Copy Update – a technique to implement data structures with no locking on reads. Modifications (writes or deletes) still use some sort of locking mechanism, but the idea is that the ratio of reads to modifications is so large that by eliminating the overheads of synchronization on the read path we improve the performance of our code significantly.

This paper is extremely well written with lots of code samples. I specifically liked section 7 that compared read-copy update with other locking algorithms. Section 3 is also great because it allows one to answer the question “Can I use read-copy update for my data structure and its expected access pattern?”

Read-copy update is a simple (at least in terms of the general idea; the actual code implementation is tricky) and elegant solution for building high performance concurrent data structures (for certain usage patterns). It is definitely a topic I will be exploring further in the future.