The other day I was working through my (quite massive) backlog of saved Hacker News stories and read this gem: Introduction to PostgreSQL physical storage. As someone who has loved databases forever, and has spent over a year building LinkedIn’s next-generation distributed graph database, I found this post absolutely fascinating.
As the title suggests, the article explains how the data in PostgreSQL tables and databases is actually stored on disk, and how free space is managed so that PostgreSQL can figure out where to store incoming data. The diagram of the page structure is very helpful in understanding how data is laid out within a page. I also really liked the PostgreSQL queries used throughout the article, which explain the topic at hand by examining a real PostgreSQL instance. The author does a great job of explaining concepts in just the right amount of detail, with several links provided for those interested in learning more.
(Relevant sidebar: the PostgreSQL source code documentation is amazing.)