《C++ Primer——The Basics》notes1

08-23

Understanding the details of how a language provides these features is the first step toward understanding the language.

Among the most fundamental of these common features are

Built-in types such as integers, characters, and so forth
Variables, which let us give names to the objects we use
Expressions and statements to manipulate values of these types
Control structures, such as if or while, that allow us to conditionally or repeatedly execute a set of actions
Functions that let us define callable units of computation

Most programming languages supplement these basic features in two ways: They let programmers extend the language by defining their own types, and they provide library routines that define useful functions and types not otherwise built into the language. Perhaps the most important feature in C++ is the class, which lets programmers define their own types. In C++ such types are sometimes called "class types" to distinguish them from the types that are built into the language.

Chapter 2. Variables and Basic Types

Arithmetic Types

In C++ a byte has at least as many bits as are needed to hold a character in the machine's basic character set. On most machines a byte contains 8 bits and a word is either 32 or 64 bits, that is, 4 or 8 bytes.

To give meaning to memory at a given address, we must know the type of the value stored there. The type determines how many bits are used and how to interpret those bits.

Unlike the other integer types, there are three distinct basic character types: char, signed char, and unsigned char. In particular, char is not the same type as signed char. Although there are three character types, there are only two representations: signed and unsigned. The (plain) char type uses one of these representations. Which of the other two character representations is equivalent to char depends on the compiler.

A few rules of thumb can be useful in deciding which type to use:

Use an unsigned type when you know that the values cannot be negative.
Use int for integer arithmetic. short is usually too small and, in practice, long often has the same size as int. If your data values are larger than the minimum guaranteed size of an int, then use long long.
Do not use plain char or bool in arithmetic expressions. Use them only to hold characters or truth values. Computations using char are especially problematic because char is signed on some machines and unsigned on others. If you need a tiny integer, explicitly specify either signed char or unsigned char.
Use double for floating-point computations; float usually does not have enough precision, and the cost of double-precision calculations versus single-precision is negligible. In fact, on some machines, double-precision operations are faster than single. The precision offered by long double usually is unnecessary and often entails considerable run-time cost.

Type conversions

Type conversions happen automatically when we use an object of one type where an object of another type is expected.

bool b = 42;            // b is true
int i = b;              // i has a value of 1
i = 3.14;               // i has a value of 3
double pi = i;          // pi has a value of 3.0
unsigned char c = -1;   // assuming 8-bit chars, c has value 255
signed char c2 = 256;   // assuming 8-bit chars, the value of c2 is undefined

What happens depends on the range of the values that the types permit:

When we assign one of the nonbool arithmetic types to a bool object, the result is false if the value is 0 and true otherwise.
When we assign a bool to one of the other arithmetic types, the resulting value is 1 if the bool is true and 0 if the bool is false.
When we assign a floating-point value to an object of integral type, the value is truncated. The value that is stored is the part before the decimal point.
When we assign an integral value to an object of floating-point type, the fractional part is zero. Precision may be lost if the integer has more bits than the floating-point object can accommodate.
If we assign an out-of-range value to an object of unsigned type, the result is the remainder of the value modulo the number of values the target type can hold. For example, an 8-bit unsigned char can hold values from 0 through 255, inclusive. If we assign a value outside this range, the compiler assigns the remainder of that value modulo 256. Therefore, assigning –1 to an 8-bit unsigned char gives that object the value 255.
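If we assign an out-of-range value to an object of signed type, the book describes the result as undefined: the program might appear to work, it might crash, or it might produce garbage values. Strictly speaking, the standard makes the result of this conversion implementation-defined before C++20 and defines it as wrapping modulo 2^n from C++20 on; either way, portable code should not depend on it.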

PS: Programs usually should avoid implementation-defined behavior, such as assuming that the size of an int is a fixed and known value. Such programs are said to be nonportable. When the program is moved to another machine, code that relied on implementation-defined behavior may fail. Tracking down these sorts of problems in previously working programs is, mildly put, unpleasant.

The compiler applies these same type conversions when we use a value of one arithmetic type where a value of another arithmetic type is expected.

It is essential to remember that when an expression mixes signed and unsigned values, the signed values are automatically converted to unsigned.

for (int i = 10; i >= 0; --i)
    std::cout << i << std::endl;

// WRONG: u can never be less than 0; the condition will always succeed
for (unsigned u = 10; u >= 0; --u)
    std::cout << u << std::endl;

unsigned u = 11; // start the loop one past the first element we want to print
while (u > 0) {
    --u; // decrement first, so that the last iteration will print 0
    std::cout << u << std::endl;
}

unsigned u = 10, u2 = 42;
std::cout << u2 - u << std::endl; // 32
std::cout << u - u2 << std::endl; // 4294967264 (wraps around, assuming 32-bit unsigned)

int i = 10, i2 = 42;
std::cout << i2 - i << std::endl; // 32
std::cout << i - i2 << std::endl; // -32
std::cout << i - u << std::endl; // 0
std::cout << u - i << std::endl; // 0

Literals

to be continued...

Variables

Initialization is not assignment. Initialization happens when a variable is given a value when it is created. Assignment obliterates an object's current value and replaces that value with a new one.

// four alternative ways to define units_sold and initialize it to 0
int units_sold = 0;
int units_sold = {0};
int units_sold{0};
int units_sold(0);

Braced lists of initializers can now be used whenever we initialize an object and in some cases when we assign a new value to an object.

The compiler will not let us list initialize variables of built-in type if the initializer might lead to the loss of information:

long double ld = 3.1415926536;
int a{ld}, b = {ld}; // error: narrowing conversion required
int c(ld), d = ld;   // ok: but value will be truncated

When we define a variable without an initializer, the variable is default initialized. Such variables are given the "default" value. What that default value is depends on the type of the variable and may also depend on where the variable is defined. The value of an object of built-in type that is not explicitly initialized depends on where it is defined. Variables defined outside any function body are initialized to zero. With one exception, which we cover in § 6.1.1 (p. 205), variables of built-in type defined inside a function are uninitialized. The value of an uninitialized variable of built-in type is undefined (§ 2.1.2, p. 36). It is an error to copy or otherwise try to access the value of a variable whose value is undefined.

Objects of class type that we do not explicitly initialize have a value that is defined by the class. Some classes require that every object be explicitly initialized. The compiler will complain if we try to create an object of such a class with no initializer.

#include <string>
std::string global_str; // global_str has a value of ""
int global_int;         // global_int has a value of 0
int main()
{
    int local_int;         // local_int has an undefined value
    std::string local_str; // local_str has a value of ""
}

We recommend initializing every object of built-in type. It is not always necessary, but it is easier and safer to provide an initializer until you can be certain it is safe to omit the initializer.

Variable Declarations and Definitions

To support separate compilation, C++ distinguishes between declarations and definitions. A declaration makes a name known to the program. A file that wants to use a name defined elsewhere includes a declaration for that name. A definition creates the associated entity.

A variable declaration specifies the type and name of a variable. A variable definition is a declaration. In addition to specifying the name and type, a definition also allocates storage and may provide the variable with an initial value.

To obtain a declaration that is not also a definition, we add the extern keyword and may not provide an explicit initializer:

extern int i; // declares but does not define i
int j;        // declares and defines j

C++ is a statically typed language, which means that types are checked at compile time. The process by which types are checked is referred to as type checking. A consequence of static checking is that the type of every entity we use must be known to the compiler. As one example, we must declare the type of a variable before we can use that variable.

Identifiers

There are a number of generally accepted conventions for naming variables. Following them can improve the readability of a program.

An identifier should give some indication of its meaning.
Variable names normally are lowercase—index, not Index or INDEX.
Like Sales_item, classes we define usually begin with an uppercase letter.
Identifiers with multiple words should visually distinguish each word, for example, student_loan or studentLoan, not studentloan.

Scope of a Name

Advice: It is usually a good idea to define an object near the point at which the object is first used. Doing so improves readability by making it easy to find the definition of the variable. More importantly, it is often easier to give the variable a useful initial value when the variable is defined close to where it is first used.

Scopes can contain other scopes. The contained (or nested) scope is referred to as an inner scope; the containing scope is the outer scope.

Once a name has been declared in a scope, that name can be used by scopes nested inside that scope. Names declared in the outer scope can also be redefined in an inner scope:

#include <iostream>
// Program for illustration purposes only: It is bad style for a function
// to use a global variable and also define a local variable with the same name
int reused = 42; // reused has global scope
int main()
{
    int unique = 0; // unique has block scope
    // output #1: uses global reused; prints 42 0
    std::cout << reused << " " << unique << std::endl;
    int reused = 0; // new, local object named reused hides global reused
    // output #2: uses local reused; prints 0 0
    std::cout << reused << " " << unique << std::endl;
    // output #3: explicitly requests the global reused; prints 42 0
    std::cout << ::reused << " " << unique << std::endl;
    return 0;
}

Output #1 appears before the local definition of reused. Therefore, this output statement uses the name reused that is defined in the global scope. This statement prints 42 0. Output #2 occurs after the local definition of reused. The local reused is now in scope. Thus, this second output statement uses the local object named reused rather than the global one and prints 0 0. Output #3 uses the scope operator (§ 1.2, p. 8) to override the default scoping rules. The global scope has no name. Hence, when the scope operator has an empty left-hand side, it is a request to fetch the name on the right-hand side from the global scope. Thus, this expression uses the global reused and prints 42 0.

Warning: It is almost always a bad idea to define a local variable with the same name as a global variable that the function uses or might use.

Compound Types —— References

A compound type is a type that is defined in terms of another type.

A reference defines an alternative name for an object. A reference is not an object. Instead, a reference is just another name for an already existing object.

int ival = 1024;
int &refVal = ival; // refVal refers to (is another name for) ival
int &refVal2;       // error: a reference must be initialized

Ordinarily, when we initialize a variable, the value of the initializer is copied into the object we are creating. When we define a reference, instead of copying the initializer's value, we bind the reference to its initializer. Once initialized, a reference remains bound to its initial object. There is no way to rebind a reference to refer to a different object. Because there is no way to rebind a reference, references must be initialized.

Because references are not objects, we may not define a reference to a reference. We can define multiple references in a single definition. Each identifier that is a reference must be preceded by the & symbol:

int i = 1024, i2 = 2048;  // i and i2 are both ints
int &r = i, r2 = i2;      // r is a reference bound to i; r2 is an int
int i3 = 1024, &ri = i3;  // i3 is an int; ri is a reference bound to i3
int &r3 = i3, &r4 = i2;   // both r3 and r4 are references

With two exceptions that we'll cover in § 2.4.1 (p. 61) and § 15.2.3 (p. 601), the type of a reference and the object to which the reference refers must match exactly. Moreover, for reasons we'll explore in § 2.4.1, a reference may be bound only to an object, not to a literal or to the result of a more general expression:

int &refVal4 = 10;   // error: initializer must be an object
double dval = 3.14;
int &refVal5 = dval; // error: initializer must be an int object

Compound Types —— Pointers

A pointer is a compound type that "points to" another type. We define a pointer type by writing a declarator of the form *d, where d is the name being defined. The * must be repeated for each pointer variable. Because references are not objects, they don't have addresses. Hence, we may not define a pointer to a reference.

The value (i.e., the address) stored in a pointer can be in one of four states:

It can point to an object.
It can point to the location just immediately past the end of an object.
It can be a null pointer, indicating that it is not bound to any object.
It can be invalid; values other than the preceding three are invalid.

Dereferencing a pointer yields the object to which the pointer points. We can assign to that object by assigning to the result of the dereference.

A null pointer does not point to any object. There are several ways to obtain a null pointer:

int *p1 = nullptr; // equivalent to int *p1 = 0;
int *p2 = 0;       // directly initializes p2 from the literal constant 0
// must #include <cstdlib>
int *p3 = NULL;    // equivalent to int *p3 = 0;

Modern C++ programs generally should avoid using NULL and use nullptr instead. It is illegal to assign an int variable to a pointer, even if the variable's value happens to be 0.

int *pi = nullptr;
int zero = 0;
pi = zero; // error: cannot assign an int to a pointer

The type void* is a special pointer type that can hold the address of any object. Like any other pointer, a void* pointer holds an address, but the type of the object at that address is unknown:

double obj = 3.14, *pd = &obj;
// ok: void* can hold the address value of any data pointer type
void *pv = &obj; // obj can be an object of any type
pv = pd;         // pv can hold a pointer to any type

There are only a limited number of things we can do with a void* pointer: We can compare it to another pointer, we can pass it to or return it from a function, and we can assign it to another void* pointer. We cannot use a void* to operate on the object it addresses: we don't know that object's type, and the type determines what operations we can perform on the object. Generally, we use a void* pointer to deal with memory as memory, rather than using the pointer to access the object stored in that memory.

Key Differences between Pointers and References

Both pointers and references give indirect access to other objects. However, there are important differences in how they do so. The most important is that a reference is not an object. Once we have defined a reference, there is no way to make that reference refer to a different object. When we use a reference, we always get the object to which the reference was initially bound. Unlike a reference, a pointer is an object in its own right. Pointers can be assigned and copied; a single pointer can point to several different objects over its lifetime. Unlike a reference, a pointer need not be initialized at the time it is defined. Like other built-in types, pointers defined at block scope have undefined value if they are not initialized.

Compound Type Declarations

Defining Multiple Variables

int* p;       // legal but might be misleading
int* p1, p2;  // p1 is a pointer to int; p2 is an int
int *p1, *p2; // both p1 and p2 are pointers to int

// simple style
int* p1; // p1 is a pointer to int
int* p2; // p2 is a pointer to int

Pointers to Pointers

int ival = 1024;
int *pi = &ival;  // pi points to an int
int **ppi = &pi;  // ppi points to a pointer to an int

References to Pointers

int i = 42;
int *p;      // p is a pointer to int
int *&r = p; // r is a reference to the pointer p
r = &i;      // r refers to a pointer; assigning &i to r makes p point to i
*r = 0;      // dereferencing r yields i, the object to which p points; changes i to 0