
9
An Interview with Yohann Gabory
Yohann Gabory, a Python and Django expert, has published an “Advanced Django” book in France to share his deep understanding of the publication system with Python developers. The book really is a reference on how to use Django to build powerful applications.
As a web backend developer and Django expert, what do you expect from an RDBMS in terms of features and behavior?
Consistency and confidence
Data is what a web application relies on. You can manage bad quality code but you cannot afford to have data loss or corruption.
Someone might say “Hey we do not work for financials, it doesn’t matter if we lose some data sometimes”. What I would answer to this is: if you are ready to lose some data then your data has no value. If your data has no value then there is a big chance that your app has no value either.
So let’s say you care about your customers and so you care about their data. The first thing you must guarantee is confidence. Your users must trust you when you say, “I have saved your data”. They must trust you when you say, “Your data is not corrupted”.
So what is the feature I first expect?
Don’t mess up my database with invalid or corrupted data. Ensure that when my database says something is saved, it really is.
Code in SQL
Of course, this means that each time the coherence of my database is involved I do not rely on my framework or my Python code. I rely on SQL code.
I need my database to be able to handle code within itself (procedures, triggers, check constraints): those are the most basic features I need from a database.
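As a minimal sketch of what relying on the database for coherence can look like, the snippet below declares the rules in the schema and lets the database refuse bad rows, whatever the application code does. The `account` table, its columns and the `try_insert` helper are hypothetical names chosen for illustration, and SQLite is used only so that the example runs without a server; it is not the setup discussed in the interview.

```python
import sqlite3

# Hypothetical schema, for illustration only. The rules live in the table
# definition, not in the Python code that talks to it.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table account (
        id      integer primary key,
        email   text not null unique,
        balance integer not null check (balance >= 0)
    )
    """
)


def try_insert(email, balance):
    """Return True when the database accepts the row, False when it refuses it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "insert into account (email, balance) values (?, ?)",
                (email, balance),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised for check, unique and not null violations alike.
        return False


accepted = try_insert("alice@example.com", 10)
rejected_negative = try_insert("bob@example.com", -5)      # violates the check
rejected_duplicate = try_insert("alice@example.com", 3)    # violates unique
row_count = conn.execute("select count(*) from account").fetchone()[0]
```

Only the first row ends up stored. The same idea carries over to PostgreSQL, where check constraints, triggers and procedures offer considerably more room than this small example uses.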
Flexible when I want, rigid when I ask
As a developer, when first implementing a proof of concept or an MVC, you cannot ask me to know perfectly how I will handle my data in the future. Some information that does not seem very relevant will become mandatory, and something else I thought was mandatory is not after all.
So I need my database to be flexible enough to let me easily change what is mandatory and what is not.
This point is the main reason some developers fly to NoSQL databases. Because they see the schemaless options as a way to not carefully specify their database schema.
At first sight this can seem like a good idea. In fact, this is a terrible one. Because tomorrow you will need consistency and a non-permissive schema. When it happens, you will be on your own, lost in a world of inconsistency, corrupted data and “eventually consistent” records.
I will not talk about writing consistency and relational checks in code because it reminds me of nightmares called race conditions and Heisenbugs.
What I really expect from my RDBMS is to let me begin schemaless and, after some time, let me specify mandatory fields, relational constraints and so on. If you think I’m asking too much, have a look at jsonb or hstore.
What makes you want to use PostgreSQL rather than something else in your Django projects? Are there any difficulties to be aware of when using PostgreSQL?
Django lets you use a lot of different databases. You can use SQLite, MariaDB, PostgreSQL and some others. Of course, you can expect from some databases availability, consistency, isolation, and durability. This allows you to make decent applications. But there is always a time where you need more. Especially some database types that could match Python types. Think about list, dictionary, ranges, timestamp, timezone, date and datetime.
All of this (and more) can be found in PostgreSQL. This is so true that there are now in Django some specific model fields (the Django representation of a column) to handle those great PostgreSQL fields.
When it comes to choosing a database, why would someone want to use something other than the most full-featured?
But don’t think I choose PostgreSQL only for performance, ease of use and powerful features. It’s also a really warm place to code with confidence.
Because Django has a migration management system that can handle pure SQL, I can write advanced SQL functions and triggers directly in my code. Those functions can use the most advanced features of PostgreSQL and stay right in front of me, in my Git, easily editable.
In fact, version after version, Django lets you use your database more and more. You can now use SQL functions like COALESCE, NOW, aggregation functions and more directly in your Django code. And those functions you write are plain SQL.
This also means that version after version your RDBMS choice is more and more important. Do you want to choose a tool that can do half the work you expect from it?
Me neither.
Django comes with an internal ORM that maps an object model to a relational table and allows it to handle “saving” objects and SQL query writing. Django also supports raw SQL. What is your general advice around using the ORM?
Well this is a tough question. Some will say ORM sucks. Some others say mixing SQL and Python code in your application is ugly.
I think they are both right. Of course, an ORM limits you a lot. Of course, writing SQL every time you need to talk to your database is not sustainable in the long run.
When your queries are so simple you can express them with your ORM, why not use it? It will generate a SQL query as good as anybody could write. It will hydrate a Django object you can use right away, in a breeze.
Think about:
    MyModel.objects.get(id=1)
This is equivalent to:
    select mymodel.id, mymodel.other_field, ...
      from mymodel
     where id = 1;
Do you think you could write better SQL?
ORM can manage all of your SQL needs. There is also some advice to avoid the N+1 dilemma. The aggregation system relies on SQL and is fairly decent.
But if you don’t pay attention, it will bite you hard.
The rule of thumb for me is to never forget what your ORM is meant for: translating SQL records into Python objects.
If you think it can handle anything more, like avoiding writing SQL, managing indexes etc… you are wrong.
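To make “translating SQL records into Python objects” concrete, here is a hand-rolled sketch of that single job. It is not how Django implements its ORM: the `MyModel` dataclass, the `get` function and the sample row are assumptions made for illustration, and SQLite stands in for the database so the snippet runs on its own.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class MyModel:
    """Plain Python object mirroring one row of the mymodel table."""
    id: int
    other_field: str


conn = sqlite3.connect(":memory:")
conn.execute("create table mymodel (id integer primary key, other_field text)")
conn.execute("insert into mymodel (id, other_field) values (1, 'hello')")


def get(pk):
    """Run the select by primary key and hydrate the row into a MyModel."""
    row = conn.execute(
        "select mymodel.id, mymodel.other_field from mymodel where id = ?",
        (pk,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no mymodel row with id={pk}")
    return MyModel(*row)


obj = get(1)
```

Everything beyond this mapping, such as which query to run, which indexes exist and how the data is constrained, remains a SQL decision.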
The main Django ORM philosophy is to let you drive the car.
•First, always be able to translate your ORM query into its SQL counterpart. The following trick should help you with this:
    MyModel.objects.filter(...).query.sql_with_params()
•Create SQL functions and use them with the Func object.
•Use manager methods with meticulously crafted raw SQL and use those methods in your code.
So yes, use your ORM. Not the one from Django. Yours!
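The last point of the list, keeping carefully written SQL behind a method name, can be sketched without any framework. The `TrackManager` class, the `track` table and its data are hypothetical; in a Django project the equivalent method would live on a custom manager, which is not shown here, and SQLite is used only to keep the example self-contained.

```python
import sqlite3


class TrackManager:
    """Hypothetical stand-in for a manager: one method per crafted query."""

    def __init__(self, conn):
        self.conn = conn

    def longest_per_genre(self):
        """Return (genre, title, seconds) for the longest track of each genre."""
        sql = """
            select genre, title, seconds
              from track
             where seconds = (select max(t.seconds)
                                from track t
                               where t.genre = track.genre)
          order by genre
        """
        return self.conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("create table track (title text, genre text, seconds integer)")
conn.executemany(
    "insert into track (title, genre, seconds) values (?, ?, ?)",
    [
        ("t1", "Rock", 200),
        ("t2", "Rock", 340),
        ("t3", "Jazz", 410),
        ("t4", "Jazz", 150),
    ],
)

tracks = TrackManager(conn)
longest = tracks.longest_per_genre()
```

The calling code only sees `tracks.longest_per_genre()`, while the SQL stays in one place where it can be read, reviewed and tuned.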
What do you think of supporting several RDBMS solutions in your applications?
Sorry but I have to admit that back in the days I believed in such a tale. Now as a grown-up I know two things. Santa and RDBMS agnosticism do not really exist.
What is true is that a framework like Django lets you choose a database and then stick with it.
The idea of using SQLite in development and PostgreSQL in production leads only to one thing: you will use the features of SQLite everywhere and you will not be able to use the PostgreSQL specific features.
The only way to be purely agnostic is to use only the features that all the proposed RDBMS provide. But think again. Do you want to drive your race car like a tractor?

Part IV
SQL Toolbox
In this chapter, we are going to add to our proficiency in writing SQL queries. The structured query language doesn’t look like any other imperative, functional or even object-oriented programming language.
This chapter contains a long list of SQL techniques from the most basic select clause to advanced lateral joins, each time with practical examples working with a free database that you can install at home.
It is highly recommended that you follow along with a local instance of the database so that you can enter the queries from the book and play with them yourself. A key aspect of this part is that SQL queries aren’t typically written in a text editor with hard thinking; instead they are interactively tried out in pieces and stitched together once the spelling is spot on.
The SQL writing process is mainly about discovery. In SQL you need to explain your problem, unlike in most programming languages where you need to focus on a solution you think is going to solve your problem. That’s quite different and requires looking at your problem in another way and understanding it well enough to be able to express it in detail in a single sentence.
Here’s some good advice I received years and years ago, and it still applies to this day: when you’re struggling to write a SQL query, first write down a single sentence, in your native language, that perfectly describes what you’re trying to achieve. As soon as you can do that, then writing the SQL is going to be easier.
One of the most effective techniques for writing such a sentence is talking out loud, because apparently writing and speaking come from different parts of the brain. It’s the same as when debugging a complex program: it helps a lot to talk about it with a colleague… or a rubber duck.
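As a small illustration of the sentence-first approach, the sketch below writes the sentence down before the query and then translates it clause by clause. The `artist` and `album` tables and their rows are made up for the example, and SQLite is used through Python only so that the snippet runs anywhere; the query itself is plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table artist (
        artistid integer primary key,
        name     text not null
    );
    create table album (
        albumid  integer primary key,
        artistid integer not null references artist (artistid),
        title    text not null
    );
    insert into artist (artistid, name)
         values (1, 'Artist A'), (2, 'Artist B'), (3, 'Artist C');
    insert into album (albumid, artistid, title)
         values (1, 1, 'First'), (2, 1, 'Second'), (3, 1, 'Third'),
                (4, 2, 'Debut'), (5, 2, 'Follow-up'),
                (6, 3, 'Only One');
    """
)

# Step 1: the single sentence, written before any SQL.
sentence = (
    "List each artist with their number of albums, "
    "keeping only artists with at least two albums, most albums first."
)

# Step 2: the query, where each clause maps to a part of the sentence.
query = """
    select artist.name, count(*) as albums   -- each artist, number of albums
      from artist
           join album using (artistid)
  group by artist.name
    having count(*) >= 2                     -- at least two albums
  order by albums desc, artist.name          -- most albums first
"""

rows = conn.execute(query).fetchall()
```

When the sentence is hard to write, the query usually is too, and the missing piece tends to show up while trying to phrase the sentence.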
After having dealt with the basics of the language, where basic really means fundamental, this chapter spends time on more advanced SQL concepts and PostgreSQL, along with how you can benefit from them when writing your applications, making you a more effective developer.