Chapter 4 Basics of Clean C++

04 public:
05   const std::string& getRegistrationCode() const;
06   void setRegistrationCode(const std::string& registrationCode);
07   // ...
08
09 private:
10   std::string registrationCode_;
11   // ...
12 };

As opposed to the setter on line 6, the getRegistrationCode member function on line 5 cannot modify member variables of the Car class. The following implementation of getRegistrationCode will cause a compiler error, because the function tries to assign a new string to registrationCode_:

const std::string& Car::getRegistrationCode() const {
  std::string toBeReturned = registrationCode_;
  registrationCode_ = "foo"; // Compile-time error!
  return toBeReturned;
}

About Old C-Style in C++ Projects

If you take a look at relatively new C++ programs (for example, on GitHub or SourceForge), you will be surprised at how many of these allegedly “new” programs still contain countless lines of old C code. Well, C is still largely a subset of the C++ language. This means that the language elements of C are still available. Unfortunately, many of these old C constructs have significant drawbacks when it comes to writing clean, safe, and modern code. And there are clearly better alternatives.

Therefore, a basic piece of advice is to quit using those old and error-prone C constructs wherever better C++ alternatives exist. And there are many of these possibilities. Nowadays you can nearly completely do without C programming in modern C++.

Choose C++ Strings and Streams over Old C-Style char*

A so-called C++ string is part of the C++ Standard Library and is of type std::string, std::wstring, std::u8string, std::u16string, or std::u32string (all defined in the <string> header). In fact, all are type aliases of the std::basic_string<T> class template and are (simplified) defined this way:

using string = basic_string<char>;
using wstring = basic_string<wchar_t>;
using u8string = basic_string<char8_t>;
using u16string = basic_string<char16_t>;
using u32string = basic_string<char32_t>;

Note  To simplify things, from now on I will only speak about C++ strings in general, by which I mean all the previously mentioned, different string types.

To create such a string, an object of one of these string types must be instantiated, for example, with the initialization constructor:

std::string name("Stephan");

Compared to this, a so-called C-style string is simply an array of characters (type char or wchar_t) that ends with a so-called zero terminator (sometimes also called a null terminator). A zero terminator is a special character ('\0', ASCII code 0) used to indicate the end of the string. A C-style string can be defined this way:

char name[] = "Stephan";

In this case, the zero terminator is automatically added at the end of the string, that is, the array holds eight characters: the seven characters of the name plus the terminator. An important point is that we have to keep in mind that we’re still dealing with an array of characters. This means, for instance, that it has a fixed size. You can change the content of the array using the index operator, but no characters can be added to the end of the array. And if the zero terminator at the end is accidentally overwritten, functions that rely on it will read past the end of the array, which is undefined behavior.

The character array is often used with the help of a pointer pointing to the first element, for example, when it is passed as a function argument:

char* pointerToName = name;

void function(char* pointerToCharacterArray) {
  // ...
}

However, in many C++ programs as well as in textbooks, C strings are still frequently used. Are there any good reasons to use C-style strings in C++ nowadays?

Yes, there are some situations where you can still use C-style strings. I will present a few of these exceptions later. Apart from that, the vast majority of strings in a modern and clean C++ program should be implemented using C++ strings. Objects of type std::string, as well as all the other C++ string types, provide numerous advantages compared to old C-style strings:

• C++ string objects manage their memory by themselves, so you can copy, create, and destroy them easily. That means that they free you from managing the lifetime of the string’s data, which can be a tricky and daunting task using C-style character arrays.

• They are mutable. The string can be manipulated easily in various ways: adding strings or single characters, concatenating strings, replacing parts of the string, etc.

• C++ strings provide a convenient iterator interface. As with all other Standard Library container types, std::string and std::wstring allow you to iterate over their elements (i.e., over their characters). This also means that all suitable algorithms that are defined in the <algorithm> header can be applied to the string.

• C++ strings work perfectly together with C++ I/O streams (e.g., ostream, stringstream, fstream, etc.) so you can take advantage of all those useful stream facilities easily.

• Since C++11, the Standard Library uses move semantics extensively. Many algorithms and containers are now move-optimized. This also applies to C++ strings. For example, an instance of a std::string can simply be returned as the return value of a function. The workarounds that were once necessary to return large string objects efficiently from a function, that is, without costly copying of the string’s data, such as pointers or references, are no longer required.

Note  Apart from a few exceptions, strings in modern C++ programs should be represented by C++ strings taken from the Standard Library.

So, what are the few exceptions that justify the use of old C-style strings?

On the one hand, there are string constants, that is, immutable strings. If you just need a fixed array of fixed characters, then std::string provides little advantage. For instance, you can define such a string constant this way:

const char* const PUBLISHER = "Apress Media LLC";

In this case, neither the value being pointed to nor the pointer itself can be modified (see the section about const correctness).

Another reason to work with C strings is compatibility with libraries that offer C-style APIs. Many third-party libraries have low-level interfaces to ensure backward compatibility and to keep their area of application as broad as possible. Strings are often expected as C-style strings by such an API. However, even in this case, the use of the C-style strings should be locally limited to the handling of this interface. Follow rule CPL.3 of the C++ Core Guidelines [Cppcore20]: If you must use C for interfaces, use C++ in the calling code using such interfaces.

Avoid Using printf(), sprintf(), gets(), etc.

printf(), which is part of the C library to perform input/output operations (defined in the <cstdio> header), prints formatted data to standard output (stdout). Some developers still use a lot of printfs for tracing/logging purposes in their C++ code. They often argue that printf is ... no ... it must be much faster than C++ I/O streams, since the whole C++ overhead is missing.

First, I/O is a bottleneck anyway, no matter if you’re using printf() or std::cout. Writing anything to standard output is generally slow, orders of magnitude slower than most of the other operations in a program. Under certain circumstances, std::cout can be slightly slower than printf(), but in relation to the general cost of an I/O operation, those few microseconds are usually negligible. At this point I would also like to remind everyone to be careful with (premature) optimizations (remember the section “Be Careful with Optimizations” in Chapter 3).

Second, printf() is fundamentally type-unsafe and thus prone to error. The function expects a sequence of non-typed arguments that are related to a C string filled with format specifiers, which is the first argument. Functions that cannot be used safely should never be used, because this can lead to subtle bugs, undefined behavior (see the section about undefined behavior in Chapter 5), and security vulnerabilities.

TEXT FORMATTING LIBRARY [C++20]

With the new C++20 standard, a text formatting library is available that offers a safer, faster, and more extensible alternative to the outdated and potentially dangerous printf family of functions. The header file of this library is <format>. The style of formatting looks very similar to string formatting in the Python programming language.

Unfortunately, it is also one of the new libraries that’s currently not supported by any C++ compiler while I’m writing this book. A temporary and good alternative, which was also the template for the new C++20 library, is the open source library called {fmt} (https://github.com/fmtlib/fmt), which provides, among other features, a C++20-compatible implementation of std::format.

Here are some usage examples:

#include "fmt/format.h" // Can be replaced by <format> when available.
#include <numbers>
#include <iostream>

int main() {
  // Note: replace namespace fmt:: by std:: once the compiler supports <format>.
  const auto theAnswer = fmt::format("The answer is {}.", 42);
  std::cout << theAnswer << "\n";

  // Many different format specifiers are possible.
  const auto formattedNumbers =
    fmt::format("Decimal: {:f}, Scientific: {:e}, Hexadecimal: {:X}", 3.1415, 0.123, 255);
  std::cout << formattedNumbers << "\n";

  // Arguments can be reordered in the created string by using an index {n:}:
  const auto reorderedArguments =
    fmt::format("Decimal: {1:f}, Scientific: {2:e}, Hexadecimal: {0:X}", 255, 3.1415, 0.123);
  std::cout << reorderedArguments << "\n";

  // The number of decimal places can be specified as follows:
  const auto piWith22DecimalPlaces = fmt::format("PI = {:.22f}", std::numbers::pi);
  std::cout << piWith22DecimalPlaces << "\n";

  return 0;
}

The output of this small demo program is as follows:

The answer is 42.

Decimal: 3.141500, Scientific: 1.230000e-01, Hexadecimal: FF

Decimal: 3.141500, Scientific: 1.230000e-01, Hexadecimal: FF

PI = 3.1415926535897931159980

Third, unlike printf, C++ I/O streams allow complex objects to be easily streamed by providing a custom insertion operator (operator<<). Suppose we have a class called Invoice (defined in a header file named Invoice.h) that looks like Listing 4-29.

Listing 4-29.  An Excerpt from the Invoice.h File with Line Numbers

01 #ifndef INVOICE_H_
02 #define INVOICE_H_
03
04 #include <chrono>
05 #include <memory>
06 #include <ostream>
07 #include <string>
08 #include <vector>
09
10 #include "Customer.h"
11 #include "InvoiceLineItem.h"
12 #include "Money.h"
13 #include "UniqueIdentifier.h"
14
15 using InvoiceLineItemPtr = std::shared_ptr<InvoiceLineItem>;
16 using InvoiceLineItems = std::vector<InvoiceLineItemPtr>;
17
18 using InvoiceRecipient = Customer;
19 using InvoiceRecipientPtr = std::shared_ptr<InvoiceRecipient>;
20
21 using DateTime = std::chrono::system_clock::time_point;
22
23 class Invoice {
24 public:
25   explicit Invoice(const UniqueIdentifier& invoiceNumber);
26   void setRecipient(const InvoiceRecipientPtr& recipient);
27   void setDateTimeOfInvoicing(const DateTime& dateTimeOfInvoicing);
28   Money getSum() const;
29   Money getSumWithoutTax() const;
30   void addLineItem(const InvoiceLineItemPtr& lineItem);
31   // ...possibly more member functions here...
32
33 private:
34   friend std::ostream& operator<<(std::ostream& outstream, const Invoice& invoice);
35   std::string getDateTimeOfInvoicingAsString() const;
36
37   UniqueIdentifier invoiceNumber;
38   DateTime dateTimeOfInvoicing;
39   InvoiceRecipientPtr recipient;
40   InvoiceLineItems invoiceLineItems;
41 };
42 // ...

The class depends on an invoice recipient (which in this case is an alias for the Customer defined in the Customer.h header; see line 18), and it uses an identifier (type UniqueIdentifier) representing an invoice number that is guaranteed to be unique among all invoice numbers. Furthermore, the invoice uses a data type that can represent money amounts (see the section entitled “Money Class” in Chapter 9 about design patterns), and it depends on another data type that represents a single invoice line item. The latter is used to manage a list of invoice items inside the invoice using a std::vector (see lines 16 and 40). To represent the time of invoicing, we use the data type time_point from the Chrono library (defined in the <chrono> header), which has been available since C++11.

Now let’s imagine that we also want to stream the entire invoice with all its data to standard output. Wouldn’t it be pretty simple and convenient if we could write something like this:

std::cout << instanceOfInvoice;

Well, that’s possible with C++. The insertion operator (<<) for output streams can be overloaded for any class. We just have to add an operator<< function to our class declaration in the header. It is important to make this function a friend of the class in our case (see line 34) because it accesses private member variables directly. See Listing 4-30.

Listing 4-30.  The Insertion Operator for the Invoice Class

// ...
std::ostream& operator<<(std::ostream& outstream, const Invoice& invoice) {
  outstream << "Invoice No.: " << invoice.invoiceNumber << "\n";
  outstream << "Recipient: " << *(invoice.recipient) << "\n";
  outstream << "Date/time: " << invoice.getDateTimeOfInvoicingAsString() << "\n";
  outstream << "Items:" << "\n";
  for (const auto& item : invoice.invoiceLineItems) {
    outstream << " " << *item << "\n";
  }
  outstream << "Amount invoiced: " << invoice.getSum() << std::endl;
  return outstream;
}
// ...

All structural components of the Invoice class are written into an output stream inside the function. This is possible because the UniqueIdentifier, InvoiceRecipient, and InvoiceLineItem classes also have their own insertion operator functions (not shown here) for output streams. To print all line items in the vector, a C++11 range-based for loop is used. And to get a textual representation of the date of invoicing, we use an internal helper method named getDateTimeOfInvoicingAsString() that returns a well-formatted date/time string.

Tip  Avoid using printf() and other unsafe C functions, such as sprintf(), puts(), scanf(), sscanf(), etc.

Choose Standard Library Containers over Simple C-Style Arrays

Instead of using C-style arrays, you should use the std::array<TYPE, N> template, which has been available since C++11 (in the <array> header). Instances of std::array<TYPE, N> are fixed-size sequence containers and are as efficient as ordinary C-style arrays.

The problems with C-style arrays are more or less the same as with C-style strings (see the previous section). C arrays are bad because they are passed around as raw pointers to their first element. This is potentially dangerous, because there are no bounds checks that protect users of that array from accessing nonexistent elements. Arrays built with std::array are safer, because they don’t decay to pointers (see the section entitled “Strategies to Avoid Regular Pointers,” earlier in this chapter).

An advantage of using std::array is that it knows its size (number of elements). When working with arrays, the size of the array is important information that is often required. Ordinary C-style arrays don’t know their own size. Thus, the size of the array must often be handled as an additional piece of information, for example, in an additional variable. For example, the size must be passed as an additional argument to function calls like in the following example.

const std::size_t arraySize = 10;
MyArrayType cStyleArray[arraySize];

void function(MyArrayType const* pArray, const std::size_t arraySize) {
  // ...
}

Strictly speaking, in this case the array and its size don’t form a cohesive unit (see the section entitled “Strong Cohesion” in Chapter 3). Furthermore, we already know from a previous section about parameters and return values that the number of function arguments should be as small as possible.

In contrast, instances of std::array carry their size and any instance can be queried about it. Thus, the parameter lists of functions or methods don’t require additional parameters about the array’s size:

#include <array>

using MyTypeArray = std::array<MyArrayType, 10>;

void function(const MyTypeArray& array) {
  const std::size_t arraySize = array.size();
  // ...
}

Another noteworthy advantage of std::array is that it has a Standard Library compatible interface. The class template provides public member functions so it looks like every other container in the Standard Library. For example, users of an array can get an iterator pointing to the beginning and the end of the sequence using std::array::begin() and std::array::end(), respectively. This also means that algorithms from the <algorithm> header can be applied to the array (see the section about algorithms in the following chapter).

#include <array>

#include <algorithm>

using MyTypeArray = std::array<MyArrayType, 10>;
MyTypeArray array;

void doSomethingWithEachElement(const MyArrayType& element) {
  // ...
}

std::for_each(std::cbegin(array), std::cend(array), doSomethingWithEachElement);

NON-MEMBER STD::BEGIN() AND STD::END() [C++11/14]

Every C++ Standard Library container has a begin() and cbegin() and an end() and cend() member function to retrieve iterators and const-iterators for that container. Apart from some exceptions, many containers also provide corresponding const and non-const reverse iterators (rbegin()/rend() and crbegin()/crend()).

C++11 has introduced free non-member functions for that purpose: std::begin(<container>) and std::end(<container>). With C++14, the still missing functions std::cbegin(<container>), std::cend(<container>), std::rbegin(<container>), std::rend(<container>), std::crbegin(<container>), and std::crend(<container>) have been added. Instead of using the member functions, it is now recommended to use these non-member functions (all defined in the <iterator> header) to get iterators and const-iterators for a container, like so:

#include <vector>

std::vector<AnyType> aVector;

auto iter = std::begin(aVector); // ...instead of 'auto iter = aVector.begin();'

The reason is that those free functions allow a more flexible and generic programming style. For instance, many user-defined containers don’t have a begin() and end() member function, which makes them impossible to use with the Standard Library algorithms (see the section about algorithms in Chapter 5) or any other user-defined template function that requires iterators. The non-member functions to retrieve iterators are extensible in the sense that they can be overloaded for any type of sequence, including old C-style arrays. In other words, non-Standard-Library-compatible (custom) containers can be retrofitted with iterator capabilities.

For instance, assume that you have to deal with a C-style array of integers, like this one:

int fibonacci[] = { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144 };

This type of array can now be retrofitted with a Standard Library-compliant iterator interface. For C-style arrays, such functions are already provided in the Standard Library, so you do not have to program them yourself. They look more or less like this:
