Chapter 6 Modularization
Modules
The programming language C++, which was first released in 1985, is now about 35 years old. The foundation of C++ is still the procedural language C, which was released in 1972. To this day, C++ is backward compatible with C. This also means that C++ has dragged along the legacy of C until today. Especially with the latest developments in the direction of modern C++ (i.e., the standards C++11, C++14, C++17, and now C++20), the legacy of C appears more and more anachronistic and fits less and less with a modern programming style. Nowadays, the old-fashioned and weak #include system is simply no longer an appropriate way to implement modularity in C++.
Newer programming languages, like D or Rust, often have a built-in module system. Java was retrofitted with the module system Jigsaw with the release of version 9 in 2017. So, it was high time that C++ also got a module system: modules.
The Drawbacks of #include
What are the disadvantages of the old #include system with header files? Well, these are relatively easy to understand when we think about what an #include really is. Every #include results in a simple text replacement by the preprocessor of the compiler, i.e., an #include directive leads to a simple copy-and-paste operation of the contents of the included file, as depicted in Figure 6-14.
![](/html/75672/2303/html_3iAzqyDIia.Sav3/htmlconvd-3gbB7Q293x1.jpg)
Figure 6-14. #include causes the file contents to be included in the including file
First of all, a major drawback to this approach is that the compilation time, especially in large projects, suffers greatly. If a header file is included in many translation units, the compiler must perform these copy-and-paste operations again and again. And the time-consuming part is not the text substitution alone, but mainly the subsequent generation of the so-called Abstract Syntax Tree (AST) by the compiler. It then often turns out that hundreds or even thousands of lines of code that have been included can be optimized away because they are not needed.
Furthermore, there are always two physical files, header and source file, to maintain the interface and the implementation of the same module. This basically results in consistency issues and many violations of the DRY principle.
But really unpleasant issues can be caused by multiple definitions of identical symbols and types in different header files, also known as ODR violations, and by accidental code changes, e.g., redefinitions of symbols through macros. Imagine that two different header files, both defining a constant named PI in the global namespace, are included in the same translation unit. Even multiple inclusions of the same header file in the same translation unit must be prevented through certain measures, e.g., with the help of an idiom called the include guard macro; otherwise, conflicts with multiply defined symbols and types will occur.
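The include guard idiom mentioned above can be sketched in a single file. The following is only an illustration under stated assumptions: the header name constants.h, the guard macro CONSTANTS_H, and the function circleArea are hypothetical, and the header text is pasted twice by hand to mimic what the preprocessor would produce if the header were included two times.

```cpp
// Sketch: the text of a hypothetical header "constants.h" appears twice,
// mimicking what the preprocessor produces when the header is pulled in
// two times (e.g., directly and indirectly through another header).

// --- first inclusion of constants.h ---
#ifndef CONSTANTS_H
#define CONSTANTS_H
constexpr double PI = 3.141592653589793;
#endif // CONSTANTS_H

// --- second inclusion of constants.h ---
#ifndef CONSTANTS_H  // the guard macro is already defined, so this copy is skipped
#define CONSTANTS_H
constexpr double PI = 3.141592653589793;
#endif // CONSTANTS_H

// Without the guard, the second definition of PI would be rejected
// by the compiler as a redefinition.
double circleArea(double radius) {
    return PI * radius * radius;
}
```

Note that the guard only protects against repeated inclusion of the same header. Two different headers that both define PI in the global namespace would still clash.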
![](/html/75672/2303/html_3iAzqyDIia.Sav3/htmlconvd-3gbB7Q294x1.jpg)
ODR VIOLATION
ODR is the abbreviation for an important rule in C++ development: The One Definition Rule. The ODR is defined in the current ISO C++ Standard in Section 6.3. It states that no translation unit should contain more than one definition of any variable, function, class type, enumeration type, template, default argument for a parameter (for a function in a given scope), or default template argument.
A simple example of an ODR violation: A translation unit (.cpp file) includes two headers, both defining a class with an identical name. The compiler would then terminate with an error message (e.g., “class type redefinition”).
Some violations of the ODR must be diagnosed by the compiler. For other violations of this rule, the compiler may remain silent. These possibly undetected ODR violations can lead to very subtle side effects and errors in the running program.
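The difference between declarations, which may be repeated, and definitions, which may not, can be illustrated with a small sketch. The names add, Point, and sumOfCoordinates are hypothetical and chosen only for this illustration.

```cpp
// Declarations may appear several times in one translation unit.
int add(int a, int b);
int add(int a, int b);
struct Point;
struct Point;

// Each entity, however, gets exactly one definition per translation unit.
int add(int a, int b) { return a + b; }
struct Point { int x; int y; };

// A second definition at this point, e.g., another `struct Point { ... };`,
// would be an ODR violation that the compiler has to diagnose.

int sumOfCoordinates(const Point& p) {
    return add(p.x, p.y);
}
```

The violations that may go undetected are typically those that span several translation units, because each translation unit on its own looks valid to the compiler.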
Modules to the Rescue
With modules, which are one of the major new features of the C++20 standard, the separation of header files and implementation files, and thus many of the aforementioned problems, as well as C-style macros and the C preprocessor, should be a thing of the past. Ultimately, the aim of modules is to significantly speed up the compilation of the software and to make it easier for the software designer to build distributable components.
A NOTE ABOUT FILE EXTENSIONS
In the following sections, I will use *.mpp as the file extension for module files and *.bmi as the extension for so-called Built Module Interface (BMI) files. In fact, these file extensions are not standardized and may vary between compilers. For example, if you’re using the Microsoft Visual Studio C++ compiler, the module interface files end with *.ixx, and the BMI files generated by the compiler have the extension *.ifc. For Clang/LLVM compilers, the file extension for the module file is *.cppm and the BMI file ends with *.pcm.
![](/html/75672/2303/html_3iAzqyDIia.Sav3/htmlconvd-3gbB7Q295x1.jpg)
With modules, the situation presented in Figure 6-14 would change as depicted in Figure 6-15.
Figure 6-15. Module import
So, the solution is to do without the header files. Instead, one translation unit directly accesses the other translation units that it wants to use. Of course, this cannot be done simply by throwing away the header file and using the implementation file directly as it is. You may have noticed while looking at Figure 6-15 that the file extension of the two artifacts to be imported in the client.cpp file has changed from *.cpp to *.mpp. Migration to modules does not come for free: there are a lot of things that must be changed and taken into consideration. And sometimes you might not be able to do it for various reasons, e.g., if you are confronted with a third-party library that you cannot change.
Under the Hood
Before we go into a bit more detail, let’s look at what happens “under the hood” when a C++ module is imported and how this basically differs from header file inclusion.
As depicted in Figure 6-16, a module import is of course not a copy-and-paste operation as with the content of header files. If the compiler encounters a module file (in this case the file named mathLibrary.mpp) imported by a translation unit (main.cpp), the module file is first translated into a Built Module Interface (BMI) file and an object file.
![](/html/75672/2303/html_3iAzqyDIia.Sav3/htmlconvd-3gbB7Q296x1.jpg)
Figure 6-16. A module file is first translated to a Built Module Interface (BMI) file and an object file
The BMI is a file on the filesystem that contains the metadata for the module and describes the exported interface of mathLibrary.mpp. The compiler also produces an object file (mathLibrary.o), which is required by the linker to link the module into an executable.
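To make the two artifacts from Figure 6-16 more concrete, here is a minimal sketch of what mathLibrary.mpp and main.cpp might contain. The contents of these files are not given in the text, so the functions add and doubleIt are purely hypothetical. Building the example requires a C++20 compiler with module support, and the exact build steps differ between compilers.

```cpp
// mathLibrary.mpp: module interface unit (the file extension varies by compiler)
export module mathLibrary;

// Not exported: visible only inside the module, not to importers.
int doubleIt(int value) {
    return 2 * value;
}

// Exported: part of the interface that the BMI describes.
export int add(int a, int b) {
    return a + b;
}
```

```cpp
// main.cpp: an ordinary translation unit that imports the module
#include <iostream>
import mathLibrary;

int main() {
    std::cout << add(2, 3) << '\n';  // uses the exported function
    // doubleIt(4);                  // would not compile: doubleIt is not exported
    return 0;
}
```

Only the declaration marked with export is reachable from main.cpp. The non-exported helper stays an implementation detail of the module, which is something a header file cannot guarantee.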
So basically, when using modules, there is an additional processing step that is required to generate the intermediate artifacts BMI file and object file. This is also an essential difference compared to using header files: When using header file inclusions, we do not have any additional time-consuming generation step. The big advantage, however, is that this step only has to be performed once, no matter how many translation units are importing the module. For example, using “import <iostream>” instead of “#include <iostream>” everywhere in your program avoids compiling the thousands of lines of code from the <iostream> header over and over again.
But this also means that we have a strict chronological order. Importing a module creates a sequence, i.e., the compiler has to process the module first to obtain the BMI file before compiling the translation units that import the module.
One of the most important aspects of increasing build performance, especially when building large projects, is parallelization. Especially in a CI/CD (Continuous Integration/Continuous Deployment) environment where a continuous build chain is used to build the project very, very frequently, a single build has to run very fast. The development team needs fast feedback on whether the build