
- •About…
- •About the Book
- •About the Author
- •Acknowledgements
- •About the organisation of the books
- •Structured Query Language
- •A First Use Case
- •Loading the Data Set
- •Application Code and SQL
- •Back to Discovering SQL
- •Computing Weekly Changes
- •Software Architecture
- •Why PostgreSQL?
- •The PostgreSQL Documentation
- •Getting Ready to read this Book
- •Business Logic
- •Every SQL query embeds some business logic
- •Business Logic Applies to Use Cases
- •Correctness
- •Efficiency
- •Stored Procedures — a Data Access API
- •Procedural Code and Stored Procedures
- •Where to Implement Business Logic?
- •A Small Application
- •Readme First Driven Development
- •Chinook Database
- •Top-N Artists by Genre
- •Intro to psql
- •The psqlrc Setup
- •Transactions and psql Behavior
- •Discovering a Schema
- •Interactive Query Editor
- •SQL is Code
- •SQL style guidelines
- •Comments
- •Unit Tests
- •Regression Tests
- •A Closer Look
- •Indexing Strategy
- •Indexing for Queries
- •Choosing Queries to Optimize
- •PostgreSQL Index Access Methods
- •Advanced Indexing
- •Adding Indexes
- •An Interview with Yohann Gabory
- •Get Some Data
- •Structured Query Language
- •Queries, DML, DDL, TCL, DCL
- •Select, From, Where
- •Anatomy of a Select Statement
- •Projection (output): Select
- •Restrictions: Where
- •Order By, Limit, No Offset
- •Ordering with Order By
- •kNN Ordering and GiST indexes
- •Top-N sorts: Limit
- •No Offset, and how to implement pagination
- •Group By, Having, With, Union All
- •Aggregates (aka Map/Reduce): Group By
- •Aggregates Without a Group By
- •Restrict Selected Groups: Having
- •Grouping Sets
- •Common Table Expressions: With
- •Distinct On
- •Result Sets Operations
- •Understanding Nulls
- •Three-Valued Logic
- •Not Null Constraints
- •Outer Joins Introducing Nulls
- •Using Null in Applications
- •Understanding Window Functions
- •Windows and Frames
- •Partitioning into Different Frames
- •Available Window Functions
- •When to Use Window Functions
- •Relations
- •SQL Join Types
- •An Interview with Markus Winand
- •Serialization and Deserialization
- •Some Relational Theory
- •Attribute Values, Data Domains and Data Types
- •Consistency and Data Type Behavior
- •PostgreSQL Data Types
- •Boolean
- •Character and Text
- •Server Encoding and Client Encoding
- •Numbers
- •Floating Point Numbers
- •Sequences and the Serial Pseudo Data Type
- •Universally Unique Identifier: UUID
- •Date/Time and Time Zones
- •Time Intervals
- •Date/Time Processing and Querying
- •Network Address Types
- •Denormalized Data Types
- •Arrays
- •Composite Types
- •Enum
- •PostgreSQL Extensions
- •An interview with Grégoire Hubert
- •Object Relational Mapping
- •Tooling for Database Modeling
- •How to Write a Database Model
- •Generating Random Data
- •Modeling Example
- •Normalization
- •Data Structures and Algorithms
- •Normal Forms
- •Database Anomalies
- •Modeling an Address Field
- •Primary Keys
- •Foreign Keys Constraints
- •Not Null Constraints
- •Check Constraints and Domains
- •Exclusion Constraints
- •Practical Use Case: Geonames
- •Features
- •Countries
- •Modelization Anti-Patterns
- •Entity Attribute Values
- •Multiple Values per Column
- •UUIDs
- •Denormalization
- •Premature Optimization
- •Functional Dependency Trade-Offs
- •Denormalization with PostgreSQL
- •Materialized Views
- •History Tables and Audit Trails
- •Validity Period as a Range
- •Pre-Computed Values
- •Enumerated Types
- •Multiple Values per Attribute
- •The Spare Matrix Model
- •Denormalize with Care
- •Not Only SQL
- •Schemaless Design in PostgreSQL
- •Durability Trade-Offs
- •Another Small Application
- •Insert, Update, Delete
- •Insert Into
- •Insert Into … Select
- •Update
- •Inserting Some Tweets
- •Delete
- •Tuples and Rows
- •Deleting All the Rows: Truncate
- •Isolation and Locking
- •About SSI
- •Putting Concurrency to the Test
- •Computing and Caching in SQL
- •Views
- •Materialized Views
- •Triggers
- •Transactional Event Driven Processing
- •Trigger and Counters Anti-Pattern
- •Fixing the Behavior
- •Event Triggers
- •Listen and Notify
- •PostgreSQL Notifications
- •Notifications and Cache Maintenance
- •Listen and Notify Support in Drivers
- •Batch Update, MoMA Collection
- •Updating the Data
- •Concurrency Patterns
- •On Conflict Do Nothing
- •An Interview with Kris Jenkins
- •Installing and Using PostgreSQL Extensions
- •Finding PostgreSQL Extensions
- •A Short List of Noteworthy Extensions
- •Auditing Changes with hstore
- •Introduction to hstore
- •Comparing hstores
- •Auditing Changes with a Trigger
- •Testing the Audit Trigger
- •From hstore Back to a Regular Record
- •Last.fm Million Song Dataset
- •Using Trigrams For Typos
- •The pg_trgm PostgreSQL Extension
- •Trigrams, Similarity and Searches
- •Complete and Suggest Song Titles
- •Trigram Indexing
- •Denormalizing Tags with intarray
- •Advanced Tag Indexing
- •User-Defined Tags Made Easy
- •The Most Popular Pub Names
- •A Pub Names Database
- •Normalizing the Data
- •Geolocating the Nearest Pub (k-NN search)
- •How far is the nearest pub?
- •The earthdistance PostgreSQL contrib
- •Pubs and Cities
- •The Most Popular Pub Names by City
- •Geolocation with PostgreSQL
- •Geolocation Data Loading
- •Geolocation Metadata
- •Emergency Pub
- •Counting Distinct Users with HyperLogLog
- •HyperLogLog
- •Installing postgresql-hll
- •Counting Unique Tweet Visitors
- •Lossy Unique Count with HLL
- •Getting the Visits into Unique Counts
- •Scheduling Estimates Computations
- •Combining Unique Visitors
- •An Interview with Craig Kerstiens

4
Business Logic
Where to maintain the business logic can be a hard question to answer. Each application may be different, and every development team might have a different viewpoint here, from one extreme (all in the application, usually in a middleware layer) to the other (all in the database server with the help of stored procedures).
My view is that every SQL query embeds some parts of the business logic you are implementing, thus the question changes from this:
• Should we have business logic in the database?
to this:
• How much of our business logic should be maintained in the database?
The main aspects to consider in terms of where to maintain the business logic are the correctness and the efficiency aspects of your code architecture and organisation.
Every SQL query embeds some business logic
Before we dive into more specifics, we need to realize that as soon as you send an SQL query to your RDBMS you are already sending business logic to the database. My argument is that each and every SQL query contains some level of business logic. Let’s consider a few examples.

In the very simplest possible case, you are still expressing some logic in the query. In the Chinook database case, we might want to fetch the list of tracks from a given album:
      select name
        from track
       where albumid = 193
    order by trackid;
What business logic is embedded in that SQL statement?
• The select clause only mentions the name column, and that’s relevant to your application. In the situation in which your application runs this query, the business logic is only interested in the track names.
• The from clause only mentions the track table. Somehow we decided that’s all we need in this example, and that again is strongly tied to the logic being implemented.
• The where clause restricts the data output to the albumid 193, which again is a direct translation of our business logic, with the added information that the album we want now is the 193rd one, and we’re left to wonder how we know about that.
• Finally, the order by clause implements the idea that we want to display the track names in the order they appear on the disk. Not only that, it also incorporates the specific knowledge that the trackid column ordering is the same as the original disk ordering of the tracks.
A variation on the query would be the following:
      select track.name as track, genre.name as genre
        from track
             join genre using(genreid)
       where albumid = 193
    order by trackid;
This time we add a join clause to fetch the genre of each track and choose to return the track name in a column named track and the genre name in a column named genre. Again, there’s only one reason for us to be doing that here: it’s because it makes sense with respect to the business logic being implemented in our application.
Granted, those two examples are very simple queries. It is possible to argue that, barring any computation being done on the data set, we are not actually implementing any business logic. It’s a fair argument, of course. The idea here is that those two very simplistic queries are already responsible for a part of the business logic you want to implement. When used as part of displaying, for example, a per-album listing page, they actually are the whole logic.
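To make the album listing page idea concrete, here is a minimal sketch of the application side. It rests on assumptions that are not in the text: an in-memory SQLite database from Python's standard library stands in for PostgreSQL so that the example runs anywhere, the two tables are a hand-made miniature of the Chinook schema, the rows are placeholders, and the names make_db and album_listing are hypothetical. The point is only that the application code does nothing beyond running the query and displaying what comes back.

```python
import sqlite3


def make_db():
    # Assumption: SQLite stands in for PostgreSQL, with a tiny,
    # made-up subset of the Chinook track and genre tables.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table genre(genreid integer primary key, name text);
        create table track(trackid integer primary key, name text,
                           albumid integer, genreid integer);
        insert into genre values (1, 'Genre X'), (2, 'Genre Y');
        insert into track values
            (10, 'Track A', 193, 1),
            (11, 'Track B', 193, 2),
            (12, 'Track C', 194, 1);
    """)
    return conn


def album_listing(conn, albumid):
    # Same query shape as in the text; the album id is passed as a
    # parameter instead of being written into the SQL text.
    sql = """
          select track.name as track, genre.name as genre
            from track
                 join genre using(genreid)
           where albumid = ?
        order by trackid
    """
    return conn.execute(sql, (albumid,)).fetchall()


if __name__ == "__main__":
    # The whole "listing page": run the query, print the rows.
    for track, genre in album_listing(make_db(), 193):
        print(f"{track} ({genre})")
```

With a PostgreSQL driver the placeholder syntax would differ, but the division of labour would be the same: the selection, the join and the ordering all live in the query.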
Let’s have a look at another query now. It is still meant to be of the same level of complexity (very low), but with some computation being done on top of the data before returning it to the main application’s code:
      select name,
             milliseconds * interval '1 ms' as duration,
             pg_size_pretty(bytes) as bytes
        from track
       where albumid = 193
    order by trackid;
This variation looks more like some sort of business logic is being applied in the query, because the columns we send in the output contain values derived from the server’s raw data set.
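One way to see that those derived columns are logic is to write down what the application would need if the query returned the raw milliseconds and bytes values instead. The following is a rough sketch under stated assumptions: duration and pretty_size are hypothetical helper names, and pretty_size is a simplified formatter that does not claim to reproduce the exact rounding and unit thresholds of PostgreSQL's pg_size_pretty.

```python
from datetime import timedelta


def duration(milliseconds):
    # Application-side counterpart of: milliseconds * interval '1 ms'
    return timedelta(milliseconds=milliseconds)


def pretty_size(nbytes):
    # Simplified, hypothetical formatter: divide by 1024 until the
    # value fits the unit. Not a faithful copy of pg_size_pretty.
    size = float(nbytes)
    for unit in ("bytes", "kB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{size:.0f} {unit}"
        size /= 1024


if __name__ == "__main__":
    print(duration(343719), pretty_size(5 * 1024 * 1024))
```

Whichever side does the work, the conversion has to be written and maintained somewhere; the query in the text simply does it next to the data.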
Business Logic Applies to Use Cases
Up to now, we have been approaching the question from the wrong angle. Looking at a query and trying to decide whether it’s implementing business logic rather than something else (data access, I would presume) is quite impossible without a business case to solve, also known as a use case or maybe even a user story, depending on which methodology you are following.
In the following example, we are going to first define a business case we want to implement, and then have a look at the SQL statement that we would use to solve it.
Our case is a simple one again: display the list of albums from a given artist, each with its total duration.
Let’s write a query for that:
      select album.title as album,
             sum(milliseconds) * interval '1 ms' as duration
        from album
             join artist using(artistid)
             left join track using(albumid)
       where artist.name = 'Red Hot Chili Peppers'
    group by album
    order by album;
The output is:
         album         │           duration
═══════════════════════╪══════════════════════════════
 Blood Sugar Sex Magik │ @ 1 hour 13 mins 57.073 secs
 By The Way            │ @ 1 hour 8 mins 49.951 secs
 Californication       │ @ 56 mins 25.461 secs
(3 rows)
What we see here is a direct translation from the business case (or user story if you prefer that term) into a SQL query. The SQL implementation uses joins and computations that are specific to both the data model and the use case we are solving.
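For readers who want to run something similar without a PostgreSQL server at hand, here is a minimal sketch of the single-query approach. The assumptions are mine and not from the text: SQLite (Python standard library) replaces PostgreSQL, the schema is a cut-down imitation of Chinook with placeholder rows, and make_db and albums_with_duration are hypothetical names. SQLite has no interval type, so the sum is converted to a timedelta in Python, which is a deviation from the query shown earlier.

```python
import sqlite3
from datetime import timedelta


def make_db():
    # Assumption: made-up miniature of the artist, album and track tables.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table artist(artistid integer primary key, name text);
        create table album(albumid integer primary key, title text,
                           artistid integer);
        create table track(trackid integer primary key, name text,
                           albumid integer, milliseconds integer);
        insert into artist values (1, 'Artist One'), (2, 'Artist Two');
        insert into album values
            (1, 'Album A', 1), (2, 'Album B', 1), (3, 'Album C', 2);
        insert into track values
            (1, 't1', 1, 60000), (2, 't2', 1, 90500), (3, 't3', 3, 1000);
    """)
    return conn


def albums_with_duration(conn, artist_name):
    # One round trip: the joins, the grouping and the sum all happen
    # in the database. The left join keeps albums that have no track.
    sql = """
          select album.title as album, sum(milliseconds) as ms
            from album
                 join artist using(artistid)
                 left join track using(albumid)
           where artist.name = ?
        group by album.title
        order by album.title
    """
    return [(title, None if ms is None else timedelta(milliseconds=ms))
            for title, ms in conn.execute(sql, (artist_name,))]


if __name__ == "__main__":
    for title, total in albums_with_duration(make_db(), "Artist One"):
        print(title, total)
```

Note how the album without any track still shows up, with no duration: that behaviour comes from the left join, which is part of the use case being expressed in SQL.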
Another implementation could be done with several queries and the computation in the application’s main code:
1. Fetch the list of albums for the selected artist
2. For each album, fetch the duration of every track in the album
3. In the application, sum up the durations per album
Here’s a very quick way to write such an application. It is important to include it here because you might recognize patterns to be found in your own applications, and I want to explain why those patterns should be avoided:
    #! /usr/bin/env python3
    # -*- coding: utf-8 -*-

    import psycopg2
    import psycopg2.extras
    import sys
    from datetime import timedelta

    DEBUGSQL = False
    PGCONNSTRING = "user=cdstore dbname=appdev application_name=cdstore"


    class Model(object):
        # Subclasses set the table name and the list of columns to fetch.
        tablename = None
        columns = None

        @classmethod
        def buildsql(cls, pgconn, **kwargs):
            # Build the query text by string interpolation of the column
            # list, the table name and the key = value filters in kwargs.
            if cls.tablename and kwargs:
                cols = ", ".join(['"%s"' % c for c in cls.columns])
                qtab = '"%s"' % cls.tablename
                sql = "select %s from %s where " % (cols, qtab)
                for key in kwargs.keys():
                    sql += "\"%s\" = '%s'" % (key, kwargs[key])
                if DEBUGSQL:
                    print(sql)
                return sql

        @classmethod
        def fetchone(cls, pgconn, **kwargs):
            # Return a single instance of cls, or None when no row matches.
            if cls.tablename and kwargs:
                sql = cls.buildsql(pgconn, **kwargs)
                curs = pgconn.cursor(cursor_factory=psycopg2.extras.DictCursor)
                curs.execute(sql)
                result = curs.fetchone()
                if result is not None:
                    return cls(*result)

        @classmethod
        def fetchall(cls, pgconn, **kwargs):
            # Return a list of cls instances, or None when no row matches.
            if cls.tablename and kwargs:
                sql = cls.buildsql(pgconn, **kwargs)
                curs = pgconn.cursor(cursor_factory=psycopg2.extras.DictCursor)
                curs.execute(sql)
                resultset = curs.fetchall()
                if resultset:
                    return [cls(*result) for result in resultset]


    class Artist(Model):
        tablename = "artist"
        columns = ["artistid", "name"]

        def __init__(self, id, name):
            self.id = id
            self.name = name


    class Album(Model):
        tablename = "album"
        columns = ["albumid", "title"]

        def __init__(self, id, title):
            self.id = id
            self.title = title
            self.duration = None


    class Track(Model):
        tablename = "track"
        columns = ["trackid", "name", "milliseconds", "bytes", "unitprice"]
