- •Thinking in C++ 2nd edition Volume 2: Standard Libraries & Advanced Topics
- •Preface
- •What’s new in the second edition
- •What’s in Volume 2 of this book
- •How to get Volume 2
- •Prerequisites
- •Learning C++
- •Goals
- •Chapters
- •Exercises
- •Exercise solutions
- •Source code
- •Language standards
- •Language support
- •The book’s CD ROM
- •Seminars, CD Roms & consulting
- •Errors
- •Acknowledgements
- •Library overview
- •1: Strings
- •What’s in a string
- •Creating and initializing C++ strings
- •Initialization limitations
- •Operating on strings
- •Appending, inserting and concatenating strings
- •Replacing string characters
- •Concatenation using non-member overloaded operators
- •Searching in strings
- •Finding in reverse
- •Finding first/last of a set
- •Removing characters from strings
- •Stripping HTML tags
- •Comparing strings
- •Using iterators
- •Iterating in reverse
- •Strings and character traits
- •A string application
- •Summary
- •Exercises
- •2: Iostreams
- •Why iostreams?
- •True wrapping
- •Iostreams to the rescue
- •Sneak preview of operator overloading
- •Inserters and extractors
- •Manipulators
- •Common usage
- •Line-oriented input
- •Overloaded versions of get( )
- •Reading raw bytes
- •Error handling
- •File iostreams
- •Open modes
- •Iostream buffering
- •Seeking in iostreams
- •Creating read/write files
- •User-allocated storage
- •Output strstreams
- •Automatic storage allocation
- •Proving movement
- •A better way
- •Output stream formatting
- •Internal formatting data
- •Format fields
- •Width, fill and precision
- •An exhaustive example
- •Formatting manipulators
- •Manipulators with arguments
- •Creating manipulators
- •Effectors
- •Iostream examples
- •Code generation
- •Maintaining class library source
- •Detecting compiler errors
- •A simple datalogger
- •Generating test data
- •Verifying & viewing the data
- •Counting editor
- •Breaking up big files
- •Summary
- •Exercises
- •3: Templates in depth
- •Nontype template arguments
- •Typedefing a typename
- •Using typename instead of class
- •Function templates
- •A string conversion system
- •A memory allocation system
- •Type induction in function templates
- •Taking the address of a generated function template
- •Local classes in templates
- •Applying a function to an STL sequence
- •Template-templates
- •Member function templates
- •Why virtual member template functions are disallowed
- •Nested template classes
- •Template specializations
- •A practical example
- •Pointer specialization
- •Partial ordering of function templates
- •Design & efficiency
- •Preventing template bloat
- •Explicit instantiation
- •Explicit specification of template functions
- •Controlling template instantiation
- •Template programming idioms
- •Summary
- •Containers and iterators
- •STL reference documentation
- •The Standard Template Library
- •The basic concepts
- •Containers of strings
- •Inheriting from STL containers
- •A plethora of iterators
- •Iterators in reversible containers
- •Iterator categories
- •Input: read-only, one pass
- •Output: write-only, one pass
- •Forward: multiple read/write
- •Bidirectional: operator--
- •Random-access: like a pointer
- •Is this really important?
- •Predefined iterators
- •IO stream iterators
- •Manipulating raw storage
- •Basic sequences: vector, list & deque
- •Basic sequence operations
- •vector
- •Cost of overflowing allocated storage
- •Inserting and erasing elements
- •deque
- •Converting between sequences
- •Cost of overflowing allocated storage
- •Checked random-access
- •list
- •Special list operations
- •list vs. set
- •Swapping all basic sequences
- •Robustness of lists
- •Performance comparison
- •A completely reusable tokenizer
- •stack
- •queue
- •Priority queues
- •Holding bits
- •bitset<n>
- •vector<bool>
- •Associative containers
- •Generators and fillers for associative containers
- •The magic of maps
- •A command-line argument tool
- •Multimaps and duplicate keys
- •Multisets
- •Combining STL containers
- •Creating your own containers
- •Summary
- •Exercises
- •5: STL Algorithms
- •Function objects
- •Classification of function objects
- •Automatic creation of function objects
- •Binders
- •Function pointer adapters
- •SGI extensions
- •A catalog of STL algorithms
- •Support tools for example creation
- •Filling & generating
- •Example
- •Counting
- •Example
- •Manipulating sequences
- •Example
- •Searching & replacing
- •Example
- •Comparing ranges
- •Example
- •Removing elements
- •Example
- •Sorting and operations on sorted ranges
- •Sorting
- •Example
- •Locating elements in sorted ranges
- •Example
- •Merging sorted ranges
- •Example
- •Set operations on sorted ranges
- •Example
- •Heap operations
- •Applying an operation to each element in a range
- •Examples
- •Numeric algorithms
- •Example
- •General utilities
- •Creating your own STL-style algorithms
- •Summary
- •Exercises
- •Perspective
- •Duplicate subobjects
- •Ambiguous upcasting
- •virtual base classes
- •The "most derived" class and virtual base initialization
- •"Tying off" virtual bases with a default constructor
- •Overhead
- •Upcasting
- •Persistence
- •MI-based persistence
- •Improved persistence
- •Avoiding MI
- •Mixin types
- •Repairing an interface
- •Summary
- •Exercises
- •7: Exception handling
- •Error handling in C
- •Throwing an exception
- •Catching an exception
- •The try block
- •Exception handlers
- •Termination vs. resumption
- •The exception specification
- •Better exception specifications?
- •Catching any exception
- •Rethrowing an exception
- •Uncaught exceptions
- •Function-level try blocks
- •Cleaning up
- •Constructors
- •Making everything an object
- •Exception matching
- •Standard exceptions
- •Programming with exceptions
- •When to avoid exceptions
- •Not for asynchronous events
- •Not for ordinary error conditions
- •Not for flow-of-control
- •You’re not forced to use exceptions
- •New exceptions, old code
- •Typical uses of exceptions
- •Always use exception specifications
- •Start with standard exceptions
- •Nest your own exceptions
- •Use exception hierarchies
- •Multiple inheritance
- •Catch by reference, not by value
- •Throw exceptions in constructors
- •Don’t cause exceptions in destructors
- •Avoid naked pointers
- •Overhead
- •Summary
- •Exercises
- •8: Run-time type identification
- •The “Shape” example
- •What is RTTI?
- •Two syntaxes for RTTI
- •Syntax specifics
- •Producing the proper type name
- •Nonpolymorphic types
- •Casting to intermediate levels
- •void pointers
- •Using RTTI with templates
- •References
- •Exceptions
- •Multiple inheritance
- •Sensible uses for RTTI
- •Revisiting the trash recycler
- •Mechanism & overhead of RTTI
- •Creating your own RTTI
- •Explicit cast syntax
- •Summary
- •Exercises
- •9: Building stable systems
- •Shared objects & reference counting
- •Reference-counted class hierarchies
- •Finding memory leaks
- •An extended canonical form
- •Exercises
- •10: Design patterns
- •The pattern concept
- •The singleton
- •Variations on singleton
- •Classifying patterns
- •Features, idioms, patterns
- •Basic complexity hiding
- •Factories: encapsulating object creation
- •Polymorphic factories
- •Abstract factories
- •Virtual constructors
- •Destructor operation
- •Callbacks
- •Observer
- •The “interface” idiom
- •The “inner class” idiom
- •The observer example
- •Multiple dispatching
- •Visitor, a type of multiple dispatching
- •Efficiency
- •Flyweight
- •The composite
- •Evolving a design: the trash recycler
- •Improving the design
- •“Make more objects”
- •A pattern for prototyping creation
- •Trash subclasses
- •Parsing Trash from an external file
- •Recycling with prototyping
- •Abstracting usage
- •Applying double dispatching
- •Implementing the double dispatch
- •Applying the visitor pattern
- •More coupling?
- •RTTI considered harmful?
- •Summary
- •Exercises
- •11: Tools & topics
- •The code extractor
- •Debugging
- •Trace macros
- •Trace file
- •Abstract base class for debugging
- •Tracking new/delete & malloc/free
- •CGI programming in C++
- •Encoding data for CGI
- •The CGI parser
- •Testing the CGI parser
- •Using POST
- •Handling mailing lists
- •Maintaining your list
- •Mailing to your list
- •A general information-extraction CGI program
- •Parsing the data files
- •Summary
- •Exercises
- •General C++
- •My own list of books
- •Depth & dark corners
- •Design Patterns
- •Index
formed code. Notice that the operation that broke deque in DequeCoreDump.cpp is perfectly fine with a list.
Performance comparison
To get a better feel for the differences between the sequence containers, it’s illuminating to race them against each other while performing various operations.
//: C04:SequencePerformance.cpp
//Comparing the performance of the basic
//sequence containers for various operations #include <vector>
#include <queue> #include <list> #include <iostream> #include <string> #include <typeinfo> #include <ctime> #include <cstdlib> using namespace std;
class FixedSize { int x[20];
//Automatic generation of default constructor,
//copy-constructor and operator=
}fs;
template<class Cont> struct InsertBack {
void operator()(Cont& c, long count) { for(long i = 0; i < count; i++)
c.push_back(fs);
}
char* testName() { return "InsertBack"; }
};
template<class Cont> struct InsertFront {
void operator()(Cont& c, long count) { long cnt = count * 10;
for(long i = 0; i < cnt; i++) c.push_front(fs);
Chapter 15: Multiple Inheritance
193
}
char* testName() { return "InsertFront"; }
};
template<class Cont> struct InsertMiddle {
void operator()(Cont& c, long count) { typename Cont::iterator it;
long cnt = count / 10;
for(long i = 0; i < cnt; i++) {
//Must get the iterator every time to keep
//from causing an access violation with
//vector. Increment it to put it in the
//middle of the container:
it = c.begin(); it++; c.insert(it, fs);
}
}
char* testName() { return "InsertMiddle"; }
};
template<class Cont>
struct RandomAccess { // Not for list void operator()(Cont& c, long count) {
int sz = c.size();
long cnt = count * 100; for(long i = 0; i < cnt; i++)
c[rand() % sz];
}
char* testName() { return "RandomAccess"; }
};
template<class Cont> struct Traversal {
void operator()(Cont& c, long count) { long cnt = count / 100;
for(long i = 0; i < cnt; i++) {
typename Cont::iterator it = c.begin(), end = c.end();
while(it != end) it++;
}
}
Chapter 15: Multiple Inheritance
194
char* testName() { return "Traversal"; }
};
template<class Cont> struct Swap {
void operator()(Cont& c, long count) { int middle = c.size() / 2;
typename Cont::iterator it = c.begin(), mid = c.begin();
it++; // Put it in the middle for(int x = 0; x < middle + 1; x++)
mid++;
long cnt = count * 10; for(long i = 0; i < cnt; i++)
swap(*it, *mid);
}
char* testName() { return "Swap"; }
};
template<class Cont> struct RemoveMiddle {
void operator()(Cont& c, long count) { long cnt = count / 10;
if(cnt > c.size()) {
cout << "RemoveMiddle: not enough elements" << endl;
return;
}
for(long i = 0; i < cnt; i++) {
typename Cont::iterator it = c.begin(); it++;
c.erase(it);
}
}
char* testName() { return "RemoveMiddle"; }
};
template<class Cont> struct RemoveBack {
void operator()(Cont& c, long count) { long cnt = count * 10;
if(cnt > c.size()) {
cout << "RemoveBack: not enough elements"
Chapter 15: Multiple Inheritance
195
<< endl; return;
}
for(long i = 0; i < cnt; i++) c.pop_back();
}
char* testName() { return "RemoveBack"; }
};
template<class Op, class Container>
void measureTime(Op f, Container& c, long count){ string id(typeid(f).name());
bool Deque = id.find("deque") != string::npos; bool List = id.find("list") != string::npos; bool Vector = id.find("vector") !=string::npos; string cont = Deque ? "deque" : List ? "list"
: Vector? "vector" : "unknown";
cout << f.testName() << " for " << cont << ": "; // Standard C library CPU ticks:
clock_t ticks = clock(); f(c, count); // Run the test ticks = clock() - ticks; cout << ticks << endl;
}
typedef deque<FixedSize> DF; typedef list<FixedSize> LF; typedef vector<FixedSize> VF;
int main(int argc, char* argv[]) { srand(time(0));
long count = 1000;
if(argc >= 2) count = atoi(argv[1]); DF deq;
LF lst;
VF vec, vecres;
vecres.reserve(count); // Preallocate storage measureTime(InsertBack<VF>(), vec, count); measureTime(InsertBack<VF>(), vecres, count); measureTime(InsertBack<DF>(), deq, count); measureTime(InsertBack<LF>(), lst, count);
// Can't push_front() with a vector:
//! measureTime(InsertFront<VF>(), vec, count);
Chapter 15: Multiple Inheritance
196
measureTime(InsertFront<DF>(), deq, count); measureTime(InsertFront<LF>(), lst, count); measureTime(InsertMiddle<VF>(), vec, count); measureTime(InsertMiddle<DF>(), deq, count); measureTime(InsertMiddle<LF>(), lst, count); measureTime(RandomAccess<VF>(), vec, count); measureTime(RandomAccess<DF>(), deq, count); // Can't operator[] with a list:
//! measureTime(RandomAccess<LF>(), lst, count); measureTime(Traversal<VF>(), vec, count); measureTime(Traversal<DF>(), deq, count); measureTime(Traversal<LF>(), lst, count); measureTime(Swap<VF>(), vec, count); measureTime(Swap<DF>(), deq, count); measureTime(Swap<LF>(), lst, count); measureTime(RemoveMiddle<VF>(), vec, count); measureTime(RemoveMiddle<DF>(), deq, count); measureTime(RemoveMiddle<LF>(), lst, count); vec.resize(vec.size() * 10); // Make it bigger measureTime(RemoveBack<VF>(), vec, count); measureTime(RemoveBack<DF>(), deq, count); measureTime(RemoveBack<LF>(), lst, count);
} ///:~
This example makes heavy use of templates to eliminate redundancy, save space, guarantee identical code and improve clarity. Each test is represented by a class that is templatized on the container it will operate on. The test itself is inside the operator( ) which, in each case, takes a reference to the container and a repeat count – this count is not always used exactly as it is, but sometimes increased or decreased to prevent the test from being too short or too long. The repeat count is just a factor, and all tests are compared using the same value.
Each test class also has a member function that returns its name, so that it can easily be printed. You might think that this should be accomplished using run-time type identification, but since the actual name of the class involves a template expansion, this turns out to be the more direct approach.
The measureTime( ) function template takes as its first template argument the operation that it’s going to test – which is itself a class template selected from the group defined previously in the listing. The template argument Op will not only contain the name of the class, but also (decorated into it) the type of the container it’s working with. The RTTI typeid( ) operation allows the name of the class to be extracted as a char*, which can then be used to create a string called id. This string can be searched using string::find( ) to look for deque, list or vector. The bool variable that corresponds to the matching string becomes true, and this is used to properly initialize the string cont so the container name can be accurately printed, along with the test name.
Chapter 15: Multiple Inheritance
197
Once the type of test and the container being tested has been printed out, the actual test is quite simple. The Standard C library function clock( ) is used to capture the starting and ending CPU ticks (this is typically more fine-grained than trying to measure seconds). Since f is an object of type Op, which is a class that has an operator( ), the line:
f(c, count);
is actually calling the operator( ) for the object f.
In main( ), you can see that each different type of test is run on each type of container, except for the containers that don’t support the particular operation being tested (these are commented out).
When you run the program, you’ll get comparative performance numbers for your particular compiler and your particular operating system and platform. Although this is only intended to give you a feel for the various performance features relative to the other sequences, it is not a bad way to get a quick-and-dirty idea of the behavior of your library, and also to compare one library with another.
set
The set produces a container that will accept only one of each thing you place in it; it also sorts the elements (sorting isn’t intrinsic to the conceptual definition of a set, but the STL set stores its elements in a balanced binary tree to provide rapid lookups, thus producing sorted results when you traverse it). The first two examples in this chapter used sets.
Consider the problem of creating an index for a book. You might like to start with all the words in the book, but you only want one instance of each word and you want them sorted. Of course, a set is perfect for this, and solves the problem effortlessly. However, there’s also the problem of punctuation and any other non-alpha characters, which must be stripped off to generate proper words. One solution to this problem is to use the Standard C library function strtok( ), which produces tokens (in our case, words) given a set of delimiters to strip out:
//: C04:WordList.cpp
// Display a list of words used in a document #include "../require.h"
#include <string> #include <cstring> #include <set> #include <iostream> #include <fstream> using namespace std;
const char* delimiters =
" \t;()\"<>:{}[]+-=&*#.,/\\~";
Chapter 15: Multiple Inheritance
198
int main(int argc, char* argv[]) { requireArgs(argc, 1);
ifstream in(argv[1]); assure(in, argv[1]); set<string> wordlist; string line; while(getline(in, line)) {
// Capture individual words:
char* s = // Cast probably won’t crash: strtok((char*)line.c_str(), delimiters);
while(s) {
// Automatic type conversion: wordlist.insert(s);
s = strtok(0, delimiters);
}
}
// Output results: copy(wordlist.begin(), wordlist.end(),
ostream_iterator<string>(cout, "\n"));
} ///:~
strtok( ) takes the starting address of a character buffer (the first argument) and looks for delimiters (the second argument). It replaces the delimiter with a zero, and returns the address of the beginning of the token. If you call it subsequent times with a first argument of zero it will continue extracting tokens from the rest of the string until it finds the end. In this case, the delimiters are those that delimit the keywords and identifiers of C++, so it extracts these keywords and identifiers. Each word is turned into a string and placed into the wordlist vector, which eventually contains the whole file, broken up into words.
You don’t have to use a set just to get a sorted sequence. You can use the sort( ) function (along with a multitude of other functions in the STL) on different STL containers. However, it’s likely that set will be faster.
Eliminating strtok( )
Some programmers consider strtok( ) to be the poorest design in the Standard C library because it uses a static buffer to hold its data between function calls. This means:
1.You can’t use strtok( ) in two places at the same time
2.You can’t use strtok( ) in a multithreaded program
3.You can’t use strtok( ) in a library that might be used in a multithreaded program
4.strtok( ) modifies the input sequence, which can produce unexpected side effects
Chapter 15: Multiple Inheritance
199
5.strtok( ) depends on reading in “lines”, which means you need a buffer big enough for the longest line. This produces both wastefully-sized buffers, and lines longer than the “longest” line. This can also introduce security holes. (Notice that the buffer size problem was eliminated in WordList.cpp by using string input, but this required a cast so that strtok( ) could modify the data in the string – a dangerous approach for general-purpose programming).
For all these reasons it seems like a good idea to find an alternative for strtok( ). The following example will use an istreambuf_iterator (introduced earlier) to move the characters from one place (which happens to be an istream) to another (which happens to be a string), depending on whether the Standard C library function isalpha( ) is true:
//: C04:WordList2.cpp
// Eliminating strtok() from Wordlist.cpp #include "../require.h"
#include <string> #include <cstring> #include <set> #include <iostream> #include <fstream> #include <iterator> using namespace std;
int main(int argc, char* argv[]) { requireArgs(argc, 1);
ifstream in(argv[1]); assure(in, argv[1]);
istreambuf_iterator<char> p(in), end; set<string> wordlist;
while (p != end) { string word;
insert_iterator<string> ii(word, word.begin());
//Find the first alpha character: while(!isalpha(*p) && p != end)
p++;
//Copy until the first non-alpha character: while (isalpha(*p) && p != end)
*ii++ = *p++;
if (word.size() != 0) wordlist.insert(word);
}
// Output results: copy(wordlist.begin(), wordlist.end(),
Chapter 15: Multiple Inheritance
200
ostream_iterator<string>(cout, "\n")); } ///:~
This example was suggested by Nathan Myers, who invented the istreambuf_iterator and its relatives. This iterator extracts information character-by-character from a stream. Although the istreambuf_iterator template argument might suggest to you that you could extract, for example, ints instead of char, that’s not the case. The argument must be of some character type – a regular char or a wide character.
After the file is open, an can be extracted from it. words.
istreambuf_iterator called p is attached to the istream so characters The set<string> called wordlist will be used to hold the resulting
The while loop reads words until the end of the input stream is found. This is detected using the default constructor for istreambuf_iterator which produces the past-the-end iterator object end. Thus, if you want to test to make sure you’re not at the end of the stream, you simply say p != end.
The second type of iterator that’s used here is the insert_iterator, which creates an iterator that knows how to insert objects into a container. Here, the “container” is the string called word which, for the purposes of insert_iterator, behaves like a container. The constructor for insert_iterator requires the container and an iterator indicating where it should start inserting the characters. You could also use a back_insert_iterator, which requires that the container have a push_back( ) (string does).
After the while loop sets everything up, it begins by looking for the first alpha character, incrementing start until that character is found. Then it copies characters from one iterator to the other, stopping when a non-alpha character is found. Each word, assuming it is nonempty, is added to wordlist.
StreamTokenizer:
a more flexible solution
The above program parses its input into strings of words containing only alpha characters, but that’s still a special case compared to the generality of strtok( ). What we’d like now is an actual replacement for strtok( ) so we’re never tempted to use it. WordList2.cpp can be modified to create a class called StreamTokenizer that delivers a new token as a string whenever you call next( ), according to the delimiters you give it upon construction (very similar to strtok( )):
//: C04:StreamTokenizer.h
// C++ Replacement for Standard C strtok() #ifndef STREAMTOKENIZER_H
#define STREAMTOKENIZER_H #include <string> #include <iostream>
Chapter 15: Multiple Inheritance
201
#include <iterator>
class StreamTokenizer {
typedef std::istreambuf_iterator<char> It; It p, end;
std::string delimiters; bool isDelimiter(char c) {
return
delimiters.find(c) != std::string::npos;
}
public: StreamTokenizer(std::istream& is,
std::string delim = " \t\n;()\"<>:{}[]+-=&*#" ".,/\\~!0123456789") : p(is), end(It()), delimiters(delim) {}
std::string next(); // Get next token
};
#endif STREAMTOKENIZER_H ///:~
The default delimiters for the StreamTokenizer constructor extract words with only alpha characters, as before, but now you can choose different delimiters to parse different tokens. The implementation of next( ) looks similar to Wordlist2.cpp:
//: C04:StreamTokenizer.cpp {O} #include "StreamTokenizer.h" using namespace std;
string StreamTokenizer::next() { string result;
if(p != end) { insert_iterator<string>
ii(result, result.begin()); while(isDelimiter(*p) && p != end)
p++;
while (!isDelimiter(*p) && p != end) *ii++ = *p++;
}
return result; } ///:~
The first non-delimiter is found, then characters are copied until a delimiter is found, and the resulting string is returned. Here’s a test:
//: C04:TokenizeTest.cpp //{L} StreamTokenizer
Chapter 15: Multiple Inheritance
202