Chapter 23 Denormalized Data Types
In the first case, using jsonb is a great enabler in terms of your application’s capabilities to process the documents it manages, including searching and filtering using the content of the document. See jsonb Indexing in the PostgreSQL documentation for more information about the jsonb_path_ops operator class, which can be used as in the following example and provides a very good general-purpose index for the @> operator as used in the previous query:
create index on js using gin (extra jsonb_path_ops);
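As a minimal sketch of the kind of query such an index can serve, assuming the js table and its extra jsonb column from the index definition above; the "color" key and its value are hypothetical and only there for illustration:

```sql
-- containment search: keep the documents whose extra column
-- contains the given key/value pair ("color" is a hypothetical key)
select *
  from js
 where extra @> '{"color": "red"}';
```

Prefixing the query with explain shows whether the planner chooses the GIN index. Note that jsonb_path_ops does not support the key-exists operators (?, ?| and ?&); the default jsonb_ops operator class does.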
Now, it is possible to use jsonb as a flexible way to maintain your data model. It is then possible to think of PostgreSQL as a schemaless service and have a heterogeneous set of documents all in a single relation.
This trade-off sounds interesting from a model design and maintenance perspective, but is very costly when it comes to daily queries and application development: you never really know what you’re going to find in the jsonb columns, so you need to be very careful about your SQL statements as you might easily miss rows you wanted to target, for example.
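One way rows can go missing is sketched below, assuming the same js table and a hypothetical "status" key that only some documents carry:

```sql
-- documents without a "status" key yield NULL for extra->>'status',
-- so the comparison evaluates to NULL and those rows are filtered out
select count(*)
  from js
 where extra->>'status' <> 'archived';

-- "is distinct from" treats NULL as a regular value, so documents
-- without the key are counted too
select count(*)
  from js
 where extra->>'status' is distinct from 'archived';
```

The two counts differ as soon as some documents lack the key, which is easy to overlook when the set of documents is heterogeneous.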
A good trade-off is to design a model in which some static columns are created and managed traditionally, and an extra column of jsonb type is added for those things you didn’t know about yet, and that would be used only sometimes, maybe for debugging reasons or special cases.
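Such a model could look like the following sketch. The event table, its columns and the sample values are all hypothetical; only the idea of a trailing jsonb column comes from the text:

```sql
-- hypothetical table: static, typed columns for what is known up front,
-- plus a jsonb column for the occasional extra detail
create table event
 (
   id         serial primary key,
   created_at timestamptz not null default now(),
   kind       text not null,
   extra      jsonb
 );

insert into event(kind, extra)
     values ('login',  '{"user_agent": "curl/8.0"}'),
            ('logout', null);
```

Everyday queries then rely on the typed columns only, and extra is consulted for the special cases.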
This works well until the application’s code is querying the extra column in every situation because some important data is found only there. At this point, it’s worth promoting parts of the extra field content into proper PostgreSQL attributes in your relational schema.
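A promotion of that kind might be sketched as follows, assuming the js table used earlier in this chapter and a hypothetical "status" key that has become important to the application:

```sql
begin;

-- new regular column for the promoted key ("status" is hypothetical)
alter table js add column status text;

-- copy the value out of the document, then remove the key from it
update js
   set status = extra->>'status',
       extra  = extra - 'status'
 where extra ? 'status';

commit;
```

Once the data lives in a regular column, it can get its own data type, constraints and indexes like any other attribute.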
Enum
This data type has been added to PostgreSQL in order to make it easier to support migrations from MySQL. Proper relational design would use a reference table and a foreign key instead:
create table color(id serial primary key, name text);

create table cars
 (
   brand   text,
   model   text,
   color   integer references color(id)
 );

insert into color(name)
     values ('blue'), ('red'),
            ('gray'), ('black');

insert into cars(brand, model, color)
     select brand, model, color.id
       from (
             values('ferari', 'testarosa', 'red'),
                   ('aston martin', 'db2', 'blue'),
                   ('bentley', 'mulsanne', 'gray'),
                   ('ford', 'T', 'black')
            )
            as data(brand, model, color)
            join color on color.name = data.color;
In this setup the table color lists available colors to choose from, and the cars table registers availability of a model from a brand in a given color. It’s possible to make an enum type instead:
create type color_t as enum('blue', 'red', 'gray', 'black');

drop table if exists cars;
create table cars
 (
   brand   text,
   model   text,
   color   color_t
 );

insert into cars(brand, model, color)
     values ('ferari', 'testarosa', 'red'),
            ('aston martin', 'db2', 'blue'),
            ('bentley', 'mulsanne', 'gray'),
            ('ford', 'T', 'black');
Be aware that in MySQL there’s no create type statement for enum types, so each column using an enum is assigned its own data type. As you now have a separate anonymous data type per column, good luck maintaining a globally consistent state if you need it.
Using the enum PostgreSQL facility is mostly a matter of taste. After all, join operations against small reference tables are well supported by the PostgreSQL SQL engine.
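For comparison, here is a sketch of what reading the data looks like with the reference table design from the first listing, assuming cars.color is the integer foreign key to color.id; the 'white' color is a hypothetical addition:

```sql
-- resolve the color name through the reference table
select cars.brand, cars.model, color.name as color
  from cars
       join color on color.id = cars.color
 order by cars.brand;

-- adding a color is a plain insert, no change to any type definition
insert into color(name) values ('white');
```

The foreign key still guarantees that every car references a known color, which is the same guarantee the enum type provides.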
