ArtifiShell Intelligence

My summary of “The Art of PostgreSQL”

2020-04-05

I’ve recently finished reading The Art of PostgreSQL (see my review). I took a lot of notes, mostly to help me read in a more focused manner; I thought it’d be nice to polish and publish them.

I first wanted to call this “What I’ve learned from TAOP”, but it also includes things I already knew, so there.

Parts I and II: Preface and Introduction

I didn’t take any notes for these—they’re a very general introduction to the book, a first application example, and why you should use Postgres. Got it.

Part III: Writing SQL Queries

Chapter 4: Business Logic

Chapter 5: A Small Application

Chapter 6: The SQL REPL—An Interactive Setup

This chapter was great, I’ve totally overhauled my psql configuration file after reading it and the whole man page. My favourites:

Chapter 7: SQL is Code

This chapter basically says that all the ususal software engineering practices apply to SQL code as well: use a consistent style, use tests, use meaningful names, and so on. The style guide promoted by the author is roughly

Other advice:

Part IV: SQL Toolbox

Chapter 13: Select, From, Where

Chapter 14: Order By, Limit, No Offset

Chapter 15: Group By, Having, With, Union All

Chapter 16: Understanding Nulls

Chapter 17: Understanding Window Functions

I definitely still don’t understand window functions.

Chapter 18: Understanding Relations and Joins

Part V: Data Types

Chapter 20: Serialization and Deserialization

Storing and retrieving values out of and back into memory isn’t a problem for which you need a database system.

Chapter 21: Some Relational Theory

Chapter 22: PostgreSQL Data Types

Get a sample from a table: tablesample; for example:

select * from table tablesample bernoulli(10);

gets a sample where each row has a chance of 10% to appear in the output.

Boolean

Character and Text

Server Encoding and Client Encoding

Numbers

All combinations of operand types have to be supported, so Postgres has only few numeric types (no unsigned, for example).

Sequences

UUIDs

Date/Time and Time Zones

Intervals

Intervals are aware of month lengths when attached to a date, so use them.

Date/Time Processing and Querying

Network Address Types

There are proper data types for IP addresses: inet for hosts and networks, cidr for networks.

Ranges

Chapter 23: Denormalized Data Types

Arrays

Composite Types

For advanced cases not covered in this book.

XML

You can write XSLT stored procedures in Postgres, see PL/XSLT.

JSON

Enum

Part VI: Data Modeling

Chapter 26: Object Relational Mapping

Database modeling and object modeling are separate and both required.

Chapter 27: Tooling for Database Modeling

Chapter 28: Normalization

This was a very useful chapter. It helped me better understand primary keys, when to use surrogate keys and to think about constraints on what would be natural keys.

Chapter 29: Practical Use Case: Geonames

In this chapter, the author goes through a poorly laid out schema and normalizes it. I used a tool to visualize the schema after the normalization:

Normalized schema

It would have been interesting to see the schema before as well, but I realized that too late.

Chapter 30: Modelization Anti-Patterns

The discussion about UUIDs as primary keys was teased a few times before this chapter, but I’m not convinced by the recommendation; while sequences are easier to handle and guarantee that there won’t be collisions, you wouldn’t want to leak them in externally visible data. This wasn’t addressed in this chapter, though.

Anti-patterns:

Chapter 31: Denormalization

Chapter 32: Not Only SQL

Part VII: Data Manipulation and Concurrency Control

Chapter 34: Another Small Application

Chapter 35: Insert, Update, Delete

Insert

Update

Inserting Some Tweets

There is an XML version of Shakespeare’s works!

Delete

Chapter 36: Isolation and Locking

Modeling for Concurrency

Chapter 37: Computing and Caching in SQL

Chapter 38: Triggers

Chapter 39: Listen and Notify

Chapter 40: Batch Update, MoMA Collection

Part VIII: PostgreSQL Extensions

Chapter 42: What’s a PostgreSQL Extension?

Chapter 43: Auditing Changes with hstore

Chapter 44: Last.fm Million Song Dataset

Really just loads zipped JSON into a table using a Lisp script.

Chapter 45: Using Trigrams For Typos

Chapter 46: Denormalizing Tags with intarray

Chapter 48: How far is the nearest pub?

Chapter 49: Geolocation with PostgreSQL

Chapter 50: Counting Distinct Users with HyperLogLog