
Chapter 31 Denormalization | 274

When dealing with a short list of items, the normalized way to do that is to handle the catalog of accepted values in a dedicated table and reference this table everywhere your schema uses that catalog of values.

When using more than join_collapse_limit or from_collapse_limit relations in SQL queries, the PostgreSQL optimizer might be defeated… so in some schemas, using an ENUM data type rather than a reference table can be beneficial.
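As a minimal sketch of the two options, here is the same catalog of colors modeled first as a reference table and then as an ENUM. The table, column, and type names are hypothetical and chosen only for illustration.

```sql
-- Normalized: the catalog of accepted values lives in its own table,
-- and every query that needs the color name has to join against it.
create table color
 (
   id   integer generated by default as identity primary key,
   name text not null unique
 );

create table car_normalized
 (
   id       bigint generated by default as identity primary key,
   brand    text not null,
   color_id integer not null references color(id)
 );

-- Denormalized: the accepted values are part of the data type,
-- so reading the color requires no extra relation in the query.
create type color_t as enum ('blue', 'red', 'gray', 'black');

create table car_enum
 (
   id    bigint generated by default as identity primary key,
   brand text not null,
   color color_t not null
 );
```

With the ENUM variant, each such catalog removes one relation from the join list, which matters when a query is already close to join_collapse_limit.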

Multiple Values per Attribute

In the CSV anti-pattern database model, we saw all the disadvantages of using multiple values per attribute in general, with a text-based schema and a separator used in the attribute values.

Managing several values per attribute, in the same row, can help reduce how many rows your application must manage. The normalized alternative has a side table for the entries, with a reference to the main table’s primary key.

Given PostgreSQL array support for searching and indexing, it is more efficient at times to manage the list of entries as an array attribute in our main table. This is particularly effective when the application often has to delete entries and all referenced data.
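A short sketch of the array approach, assuming a hypothetical tagged_article table whose tags would otherwise live in a side table:

```sql
-- The list of entries is kept in the main row as an array attribute.
create table tagged_article
 (
   id    bigint generated by default as identity primary key,
   title text not null,
   tags  text[] not null default '{}'
 );

-- A GIN index supports searching inside the array values.
create index tagged_article_tags_idx on tagged_article using gin (tags);

-- Rows whose tags contain both values; @> is the array "contains" operator.
select id, title
  from tagged_article
 where tags @> array['postgresql', 'sql'];

-- Deleting the row removes its entries at the same time:
-- there is no side table to clean up.
delete from tagged_article where id = 1;
```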

In some cases, multiple attributes each containing multiple values are needed. PostgreSQL arrays of composite type instances might then be considered. Cases when that model beats the normalized schema are rare, though, and managing this complexity isn’t free.

The Sparse Matrix Model

In cases where your application manages lots of optional attributes per row, most of them never being used, they can be denormalized into a single extra JSONB column that holds all those attributes in one document.

When this extra jsonb attribute is restricted to values never referenced anywhere else in the schema, and when the application only needs this extra data as a whole, then jsonb is a very good trade-off compared with a normalized schema.
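For illustration, a sketch of such an extra column, assuming a hypothetical product table where only name and price are always present:

```sql
create table product
 (
   id    bigint generated by default as identity primary key,
   name  text not null,
   price numeric not null,
   -- Optional, rarely used attributes, all kept in a single document.
   extra jsonb not null default '{}'
 );

insert into product(name, price, extra)
     values ('desk lamp', 24.90, '{"voltage": 230, "bulb": "E27"}'),
            ('notebook', 3.50, '{}');

-- The application reads the optional attributes as a whole.
select name, extra
  from product
 where name = 'desk lamp';
```

Nothing else in this sketch references the keys stored in extra, which is the condition under which the trade-off works well.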

Partitioning

Partitioning refers to splitting a table with too many rows into a set of tables each containing a part of those rows. Several kinds of partitioning are available, such as list or range partitioning. Starting in PostgreSQL 10, table partitioning is supported directly.

While partitioning isn’t denormalization as such, the limits of the PostgreSQL 10 implementation make it valuable to include the technique in this section. Quoting the PostgreSQL documentation:

There is no facility available to create the matching indexes on all partitions automatically. Indexes must be added to each partition with separate commands. This also means that there is no way to create a primary key, unique constraint, or exclusion constraint spanning all partitions; it is only possible to constrain each leaf partition individually.

Since primary keys are not supported on partitioned tables, foreign keys referencing partitioned tables are not supported, nor are foreign key references from a partitioned table to some other table.

Using the ON CONFLICT clause with partitioned tables will cause an error, because unique or exclusion constraints can only be created on individual partitions. There is no support for enforcing uniqueness (or an exclusion constraint) across an entire partitioning hierarchy.

An UPDATE that causes a row to move from one partition to another fails, because the new value of the row fails to satisfy the implicit partition constraint of the original partition.

Row triggers, if necessary, must be defined on individual partitions, not the partitioned table.

So when using partitioning in PostgreSQL 10, we lose the ability to reach even the first normal form, for lack of a primary key covering the whole table. Then we lose the ability to maintain a reference to the partitioned table with a foreign key.
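To make those limits concrete, here is a minimal sketch of declarative range partitioning as available in PostgreSQL 10. The measurement table, its columns, and the partition names are hypothetical.

```sql
-- The partitioned table itself holds no rows and, in PostgreSQL 10,
-- cannot carry a primary key or a unique constraint.
create table measurement
 (
   logdate  date not null,
   city_id  integer not null,
   peaktemp integer
 )
 partition by range (logdate);

create table measurement_y2024m01
       partition of measurement
       for values from ('2024-01-01') to ('2024-02-01');

create table measurement_y2024m02
       partition of measurement
       for values from ('2024-02-01') to ('2024-03-01');

-- Indexes must be created on each partition with separate commands.
create index on measurement_y2024m01 (logdate);
create index on measurement_y2024m02 (logdate);

-- Uniqueness can only be enforced within one partition at a time.
alter table measurement_y2024m01 add unique (city_id, logdate);
alter table measurement_y2024m02 add unique (city_id, logdate);
```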

Before partitioning any table in PostgreSQL, including PostgreSQL 10, as with any other denormalization technique (covered here or not), please do your homework: check that it’s really not possible to sustain the application’s workload with a normalized model.

Other Denormalization Tools

PostgreSQL extensions such as hstore, ltree, intarray or pg_trgm offer another set of interesting trade-offs to implement specific use cases.

For example, ltree can be used to implement nested category catalogs and to reference articles precisely in this catalog.
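A small sketch of that idea, assuming hypothetical category and catalog_entry tables and label paths such as science.astronomy:

```sql
create extension if not exists ltree;

-- Each category is identified by its full path in the hierarchy.
create table category
 (
   path ltree primary key,
   name text not null
 );

-- Each entry references one precise node of the catalog.
create table catalog_entry
 (
   id       bigint generated by default as identity primary key,
   title    text not null,
   category ltree not null references category(path)
 );

-- A GiST index supports the ltree hierarchy operators.
create index on catalog_entry using gist (category);

-- Every entry filed under 'science' at any depth;
-- <@ means "is a descendant of, or equal to".
select title
  from catalog_entry
 where category <@ 'science';
```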

Denormalize with Care

It’s been mentioned already, and it is worth saying it again. Only denormalize your application’s schema when you know what you’re doing, and when you’ve double-checked that there’s no other possibility for implementing your application and business cases with the required level of performance.

First, query optimization techniques (mainly rewriting a query until it’s obvious to PostgreSQL how best to execute it) can go a long way. In production, query rewrites commonly bring durations down from minutes to milliseconds, in particular for queries written by ORMs or other naive tooling.

Second, denormalization is an optimization technique meant to leverage trade-offs. Allow me to quote Rob Pike again, as he establishes his first rule of programming in Notes on Programming in C as the following:

Rule 1. You can’t tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don’t try to second guess and put in a speed hack until you’ve proven that’s where the bottleneck is.

The rule works as well for a database model as it does for a program. The database model may be even trickier, because we usually only measure the time spent running queries, and not the time it takes to:

Understand the database model

Understand how to use the database model to solve a new business case

Write the SQL queries necessary to the application code

Validate data quality

So again, only put all those nice properties at risk by denormalizing the schema when there’s no other choice.
