The C++ Programming Language
4th Edition (2013)
Bjarne Stroustrup
Part I: Introduction
1. Notes to the Reader
C++ is not (strictly) an object-oriented programming language (§1.2.1). It aims to support several popular programming styles and techniques:
- Procedural programming
- Data abstraction
- Object-oriented programming
- Generic programming
C++11 adds rvalue references which make functional programming techniques (passing around temporary objects instead of mutating objects with identity) more efficient (§1.2.1).
The static type system in C++ is key to good performance (§1.2.2).
C and C++ are distinct languages: C++ is not a proper superset of C (§1.2.3). The languages have evolved independently, borrowing features from each other along the way. However:
The definition of C++ has been revised to ensure that a construct that is both legal C and legal C++ has the same meaning in both languages.
C++ was designed to be efficient (§1.2.4). Except for dynamic memory
allocation (new
and delete
), run-time type information (typeid
and
dynamic_cast
), and exceptions (try
and throw
), the language requires no
run-time support.
I'm skipping the rest of Part I (a brief tour of C++) and jumping ahead to the details.
Part II: Basic Facilities
6. Types and Declarations
Standards-compliant code isn't necessarily good code (§6.1). It's not necessarily portable, either, as many features are implementation-defined (due to target platform hardware differences), and most real-world code uses facilities defined by the system (e.g., POSIX), not the language.
Use compile-time static assertions (static_assert()
) to verify assumptions
about the implementation. Avoid unspecified and undefined behavior. Use static
analysis tools.
C++ runs in hosted or freestanding environments (§6.1.1). Implementations targeting freestanding environments allow language features (like exceptions) to be disabled and include fewer library facilities (e.g., no STL).
C++ is strongly-typed (§6.2):
Every name (identifier) in a C++ program has a type associated with it.
The type system is based on the fundamental data storage units provided by most computers (§6.2.1):
The assumption is that a computer provides bytes for holding characters, words for holding and computing integer values, some entity most suitable for floating-point computation, and addresses for referring to those entities.
Of the fundamental types, use only bool
, char
, int
, and double
, unless
you have specialized space or performance requirements.
char
is equivalent to either signed char
or unsigned char
(implementation-defined), and at least 8 bits wide (§6.2.3).
The C++ standard supports ones-complement hardware. As such, the value -128
may not be representable in an 8-bit signed char
on all platforms.
char
, signed char
, and unsigned char
are distinct types; you
can't mix pointers between the types (§6.2.3.1).
C++ defines character literals ('a'
) as char
, unlike C which defines them
as int
(§6.2.3.2).
Integer types come in three flavors (plain, signed
, unsigned
) and four
sizes (short
, plain, long
, and long long
) (§6.2.4). The plain integer
types (int
, long
, etc) are always signed; the signed
types are synonyms
for the plain types. The unsigned
types are best used to represent bit
arrays; any other use (to increase numerical range or enforce positive
values) is fraught due to implict type conversion.
The type of an integer literal depends on its form (decimal, octal, or hex),
value, and suffix (§6.2.4.2). There's a long list of rules, but the take home
message is to be explicit (using suffixes) whenever you want something besides
an int
:
#include <iostream>
int main() {
std::cout << (0xBEEF << 16) << std::endl;
}
prints -1091633152
on a machine with 32-bit int
s, while:
#include <iostream>
int main() {
std::cout << (0xBEEFU << 16) << std::endl;
}
prints 3203334144
.
Floating point literals are of type double
unless a type suffix is supplied
(§6.2.5.1).
Programmers can define suffix operators for user-defined types (§6.2.6),
opening the possiblity for literals like "foo bar"s
and 123_km
-- who
knew?
void
is a fundamental type useful only in building more complex types; there
are no void
objects (§6.2.7).
Stroustrup wants you to write portable code (§6.2.8):
People who claim they don’t care about portability usually do so because they use only a single system and feel they can afford the attitude that “the language is what my compiler implements.” This is a narrow and shortsighted view.
On the sizes of the fundamental types (§6.2.8):
Sizes of C++ objects are expressed in terms of multiples of the size of a char, so by definition the size of a char is 1. ... [I]t is guaranteed that a char has at least 8 bits, a short at least 16 bits, and a long at least 32 bits.
The fundamental types can be intermixed in expressions; the compiler converts values as necessary, but cannot always preserve the value, as when converting a wider type to a narrower one, or a floating type to an integer:
#include <iostream>
int main()
{
double d = 123456.789;
int i = d;
short s = d;
char c = d;
bool b = d;
std::cout << std::fixed
<< "d: " << d << '\n'
<< "i: " << i << '\n'
<< "s: " << s << '\n'
<< "c: " << (int)c << '\n'
<< "b: " << b << '\n';
}
produces:
d: 123456.789000
i: 123456
s: -7616
c: 64
b: 1
Caveat programmor: value-destroying conversions are perfectly legal and don't produce any compiler warnings. Per Stroustrup (§6.2.8):
Conversions that are not value-preserving are best avoided.
Two non-fundamental but useful types defined in <cstddef>
are:
size_t
, capable of representing the size (in bytes) of any objectptrdiff_t
, capable of representing the difference between any two pointers (i.e., the number of elements in an array)
An object may occupy more memory than its type suggests; the extra memory is
padding due to the alignment requirements of the hardware (§6.2.9). The
alignof()
operator and alignas()
type specifier can be used to get or set,
respectively, the alignment of an object.
Declarations specify the type of an entity; definitions reserve memory for it (§6.3):
There must always be exactly one definition for each name in a C++ program. However, there can be many declarations.
Declarations can be quite complex, including optional prefix specifiers
(static
, virtual
), suffix function specifiers (const
, noexcept
), and
initializer or function body, but all declarations include a base type and a
declarator. The declarator defines the name and may include declarator
operators (*
, &
, []
, ()
) to declare pointers, references, arrays, and
functions, respectively. The postfix declarator operators ([]
and ()
) bind
tighter than the prefix operators (*
and &
), so parentheses are required
to declare types like pointer-to-function, thus:
void (*f)();
void *g();
declares f
as pointer-to-function and g
as a function returning
void
-pointer.
While multiple names can be declared in a single declaration, declarator operators bind to exactly one name (6.3.2). Thus:
char* p, q;
declares p
as pointer-to-char
, and q
as a plain char
.
Such declarations with multiple names and nontrivial declarators make a program harder to read and should be avoided.
Some names are reserved for the implementation (§6.3.3):
- non-local names beginning with underscore (
_start
) - any name beginning with an underscore and uppercase letter (
_Bool
) - any name containing a double underscore (
__func__
)
Stroustrup offers some advice on naming (one of two hard things in computer science):
Names from a large scope ought to have relatively long and reasonably obvious names. However, code is clearer if names used only in a small scope have short, conventional names such as x, i, and p. ... Use all capitals for macros and never for non-macros.
Four syntactic styles of object initialization are possible (§6.3.5):
T a1 {v};
: list initializationT a2 = {v};
: list or aggregate initializationT a2 = v;
: assignment-style initializationT a3(v);
: function-style initialization
List initializtion is new in C++11. According to Stroustrup, this is the
preferred style as it can be used in every context and avoids narrowing
conversions (e.g., long
to int
and double
to float
). Personally,
I find it rather ugly.
You probably don't want list initialization when using auto
for type
deduction, though; the deduced type would be std::initializer_list<T>
.
The empty list initializer ({}
) initializes an object to the type's default
value.
Initialization is not required (§6.3.5.1), but:
The only really good case for an uninitialized variable is a large input buffer.
The value of an uninitialized variable depends on the type of the variable and its storage duration:
Global, namespace, local static
, and static
member variables are
initialized to {}
of the appropriate type. For the built-in types, this
happens by virtue of the fact that the .bss
segment is zeroed by the program
loader. For user-defined types, the compiler inserts calls to the type's
default constructor before the object is first referenced.
Variables of built-in type with automatic or dynamic storage duration are not default initialized. Automatic and dynamic variables of user-defined type are default initialized by calling the default constructor.
In declarations empty parentheses define a function, so to explicitly default
initialize an object use {}
rather than ()
(§6.3.5.2):
T x{}; // declares, defines, and initializes a T
T y(); // declares a function returning T