C Plus Plus Tutorial


Computer Science is no more about computers than astronomy is about telescopes. -- Edsger Dijkstra

For the Bootcamp on C++ conducted by Assela Pathirana

ATTENTION
These pages are designed to be read online. Please DO NOT print them. If you need a copy, just save this page to your computer, so that you can read it while off-line.

In 2010, the new version of Visual C++ Express Edition (2010) was released. If you have a computer running Windows XP, Vista or Windows 7, you can install this new version and avoid much of the configuration hassle. The new version comes bundled with the SDK.


Introduction

This tutorial is designed for you to learn the C/C++ language in a rapid-paced way, generally over a period of four days of intense work. Even though you will have some time with an instructor, you will have to do a lot of practice as homework in order to gain sufficient confidence as a programmer. Most of the material is ideal for self-study.

Acknowledgment

A number of sections of this tutorial were adapted with permission from Juan Soulie's excellent C++ Language Tutorial published on cplusplus.com. In many of the lessons, changes from the original tutorial are minor. However, there are some important differences in a few sections. Therefore, you are advised to follow the present tutorial during the course. On the other hand, Juan Soulie's site will probably be updated more frequently, so for general look-up and help that site will be better.

My colleague Billah helped to convert the original tutorial to this format. Thanks, Billah.

How to use this material effectively

It is important to keep in mind that C++ and Visual C++ 2008 Express Edition are not the same thing. C++ is the language. Visual C++ 2008 Express Edition is the compiler used to build a program with that language. There are many other alternatives to Visual C++ 2008 Express Edition as compilers. These can do the task of compiling a C++ program equally well.

Let's face it, our mission here is not to become professional programmers. The objective is rather to become familiar with programming as a tool. Think of programming as a challenge similar to learning to ride a bicycle. You need some instructions and perhaps a bit of 'pep-talk', but what is mostly needed is to go out there and do it! And then practice a lot. It may not be possible for everybody to become a racing cyclist, but anyone can learn how to ride a bike for leisure or commuting. In my decade-long experience in teaching programming as a useful tool, I never came across someone who could not master it with adequate practice.

I will help you set up an environment where you can write and test programs. There are many good compilers out there for C++ and some of the best ones are free and open source. However, the platform we use to try out these examples is Visual C++ 2008 Express Edition, which is a free version of a commercial product. Please keep in mind that there are many other ways of doing the same thing.

To install Visual C++ 2008 Express Edition yourself, please follow the instructions given in this page.

Follow this link for instructions on how to install and configure Visual C++ 2008 Express Edition.

Some Commonsense Rules You Should Follow When Using Advanced Stuff with Computers

  • Never have spaces or special characters in filenames, directory names etc. They can drive you crazy! Instead use underscore or hyphen to separate words.
    1. Instead of "my program" use "my_program",
    2. Don't save your programs on the desktop -- "c:\Documents and Settings\<yourname>\Desktop" [see the spaces?], instead save them in a data directory like -- d:\my_work\ [no spaces in path!]
  • In the world of programming uppercase and lowercase letters are not the same. Take care when you type things -- especially data files.


Structure of a program

Probably the best way to start learning a programming language is by writing a program. Therefore, here is our first program:

// my first program in C++
 
#include <iostream>
using namespace std;
 
int main ()
{
  cout << "Hello World!\n";
  return 0;
}
Hello World!

The first panel shows the source code for our first program. The second one shows the result of the program once compiled and executed.

Compiling and Executing

Vanishing Windows
If the console window closes before you can see the result, add the following line immediately above the return 0; statement.
cin.get();

In order to compile your program in Visual C++ 2005 Express Edition, write the program in an empty C++ file. (Check this section for information on how to do this.)

To run the program click Debug, Start without Debugging or Debug, Start Debugging.

A guided-tour

The previous program is the typical first program that programming apprentices write, and its result is the printing of the sentence "Hello World!" on the screen. It is one of the simplest programs that can be written in C++, but it already contains the fundamental components that every C++ program has. We are going to look line by line at the code we have just written:

// my first program in C++
This is a comment line. All lines beginning with two slash signs (//) are considered comments and do not have any effect on the behavior of the program. The programmer can use them to include short explanations or observations within the source code itself. In this case, the line is a brief description of what our program is.
#include <iostream>
Lines beginning with a pound sign (#) are directives for the preprocessor. They are not regular code lines with expressions but indications for the compiler's preprocessor. In this case the directive #include <iostream> tells the preprocessor to include the iostream standard file. This specific file (iostream) includes the declarations of the basic standard input-output library in C++, and it is included because its functionality is going to be used later in the program.
using namespace std;
All the elements of the standard C++ library are declared within what is called a namespace, the namespace with the name std. So in order to access its functionality we declare with this expression that we will be using these entities. This line is very frequent in C++ programs that use the standard library, and in fact it will be included in most of the source codes included in these tutorials.
int main ()
This line corresponds to the beginning of the definition of the main function. The main function is the point where all C++ programs start their execution, independently of its location within the source code. It does not matter whether there are other functions with other names defined before or after it: the instructions contained within this function's definition will always be the first ones to be executed in any C++ program. For that same reason, it is essential that all C++ programs have a main function.
The word main is followed in the code by a pair of parentheses (()). That is because it is a function declaration: in C++, what differentiates a function declaration from other types of expressions are these parentheses that follow its name. Optionally, these parentheses may enclose a list of parameters.
Right after these parentheses we can find the body of the main function enclosed in braces ({}). What is contained within these braces is what the function does when it is executed.
cout << "Hello World!\n";
This line is a C++ statement. A statement is a simple or compound expression that can actually produce some effect. In fact, this statement performs the only action that generates a visible effect in our first program.
cout represents the standard output stream in C++, and the meaning of the entire statement is to insert a sequence of characters (in this case the Hello World! sequence of characters) into the standard output stream (which usually is the screen).
cout is declared in the iostream standard file within the std namespace, which is why we needed to include that specific file and to declare that we were going to use this specific namespace earlier in our code. The character \n instructs C++ to end the current output line. Notice that the statement ends with a semicolon character (;). This character is used to mark the end of the statement and in fact it must be included at the end of all expression statements in all C++ programs (one of the most common syntax errors is indeed to forget the semicolon after a statement).
return 0;
The return statement causes the main function to finish. return may be followed by a return code (in our example it is followed by the return code 0). A return code of 0 for the main function is generally interpreted as meaning that the program worked as expected, without any errors during its execution. This is the most usual way to end a C++ program.

You may have noticed that not all the lines of this program perform actions when the code is executed. There were lines containing only comments (those beginning with //). There were lines with directives for the compiler's preprocessor (those beginning with #). Then there were lines that began the declaration of a function (in this case, the main function) and, finally, lines with statements (like the insertion into cout), which were all included within the block delimited by the braces ({}) of the main function.

The program has been structured in different lines in order to be more readable, but in C++, we do not have strict rules on how to separate instructions in different lines. For example, instead of

int main ()
{
  cout << "Hello World";
  return 0;
}

we could have written:

int main () { cout << "Hello World"; return 0; }

all in just one line, and this would have had exactly the same meaning as the previous code.

In C++, the separation between statements is specified with an ending semicolon (;) at the end of each one, so the separation in different code lines does not matter at all for this purpose. We can write many statements per line or write a single statement that takes many code lines. The division of code in different lines serves only to make it more legible and schematic for the humans that may read it.

Let us add an additional instruction to our first program:

// my second program in C++
 
#include <iostream>
 
using namespace std;
 
int main ()
{
  cout << "Hello World!\n";
  cout << "I'm a C++ program";
  return 0;
}
Hello World!
I'm a C++ program

In this case, we performed two insertions into cout in two different statements. Once again, the separation in different lines of code has been done just to give greater readability to the program, since main could equally well have been defined this way:

int main () { cout << "Hello World!\n"; cout << "I'm a C++ program"; return 0; }

We were also free to divide the code into more lines if we considered it more convenient:

int main ()
{
  cout <<

    "Hello World!\n";
   cout
     << "I'm a C++ program";
   return 0;
}

And the result would again have been exactly the same as in the previous examples.

Preprocessor directives (those that begin with #) are out of this general rule since they are not statements. They are lines read and discarded by the preprocessor and do not produce any code by themselves. Preprocessor directives must be specified on their own line and do not have to end with a semicolon (;).

Comments

Make it a habit to comment your code liberally. That makes your code readable for human beings. These days, writing code that human beings can read is perhaps more important than writing code that computers can understand!

Comments are parts of the source code disregarded by the compiler. They simply do nothing. Their purpose is only to allow the programmer to insert notes or descriptions embedded within the source code.

C++ supports two ways to insert comments:

// line comment
 
/* block comment */

The first of them, known as line comment, discards everything from where the pair of slash signs (//) is found up to the end of that same line. The second one, known as block comment, discards everything between the /* characters and the first appearance of the */ characters, with the possibility of including more than one line.
We are going to add comments to our second program:

/* my second program in C++
   with more comments */
 
#include <iostream>
 
using namespace std;
 
int main ()
{
  cout << "Hello World!\n";     // prints Hello World!
  cout << "I'm a C++ program"; // prints I'm a C++ program
 
  return 0;
}
Hello World! 
I'm a C++ program

If you include comments within the source code of your programs without using the comment characters combinations //, /* or */, the compiler will take them as if they were C++ expressions, most likely causing one or several error messages when you compile it.

Variables. Data Types.

The usefulness of the "Hello World" programs shown in the previous section is quite questionable. We had to write several lines of code, compile them, and then execute the resulting program just to obtain a simple sentence written on the screen as result. It certainly would have been much faster to type the output sentence by ourselves. However, programming is not limited only to printing simple texts on the screen. In order to go a little further on and to become able to write programs that perform useful tasks that really save us work we need to introduce the concept of variable.

Suppose I ask you to keep the number 5 in your memory, and then I ask you to also memorize the number 2 at the same time. You have just stored two different values in your memory. Now, if I ask you to add 1 to the first number I said, you should be retaining the numbers 6 (that is 5+1) and 2 in your memory. These are values that we could now, for example, subtract to obtain 4 as the result.

The whole process that you have just done with your mental memory is a simile of what a computer can do with two variables. The same process can be expressed in C++ with the following instruction set:

a = 5;
b = 2;
a = a + 1;
result = a - b;

Obviously, this is a very simple example since we have only used two small integer values, but consider that your computer can store millions of numbers like these at the same time and conduct sophisticated mathematical operations with them.

Therefore, we can define a variable as a portion of memory to store a determined value.

Each variable needs an identifier that distinguishes it from the others, for example, in the previous code the variable identifiers were a, b and result, but we could have called the variables any names we wanted to invent, as long as they were valid identifiers.

Identifiers

A valid identifier is a sequence of one or more letters, digits or underscore characters (_). Neither spaces nor punctuation marks or symbols can be part of an identifier. In addition, an identifier must begin with a letter or an underscore; in no case can it begin with a digit. Identifiers that begin with an underscore are usually reserved for compiler-specific keywords or external identifiers, so it is best to avoid them in your own code.

Another rule that you have to consider when inventing your own identifiers is that they cannot match any keyword of the C++ language or your compiler's specific ones since they could be confused with these. The standard reserved keywords are:

asm, auto, bool, break, case, catch, char, class, const, const_cast, continue, default, delete, do, double, dynamic_cast, else, enum, explicit, export, extern, false, float, for, friend, goto, if, inline, int, long, mutable, namespace, new, operator, private, protected, public, register, reinterpret_cast, return, short, signed, sizeof, static, static_cast, struct, switch, template, this, throw, true, try, typedef, typeid, typename, union, unsigned, using, virtual, void, volatile, wchar_t, while

Additionally, alternative representations for some operators cannot be used as identifiers since they are reserved words under some circumstances:

and, and_eq, bitand, bitor, compl, not, not_eq, or, or_eq, xor, xor_eq

Your compiler may also include some additional specific reserved keywords.

Very important: The C++ language is a "case sensitive" language. That means that an identifier written in capital letters is not equivalent to another one with the same name but written in small letters. Thus, for example, the RESULT variable is not the same as the result variable or the Result variable. These are three different variable identifiers.

Fundamental data types

When programming, we store the variables in our computer's memory, but the computer has to know what we want to store in them, since storing a simple number does not take the same amount of memory as storing a single letter or a large number, and they are not going to be interpreted the same way.

The memory in our computers is organized in bytes. A byte is the minimum amount of memory that we can manage in C++. A byte can store a relatively small amount of data: one single character or a small integer (generally an integer between 0 and 255). In addition, the computer can manipulate more complex data types that come from grouping several bytes, such as long numbers or non-integer numbers.

Next you have a summary of the basic fundamental data types in C++, as well as the range of values that can be represented with each one:

Name               Description                                  Size*    Range*
char               Character or small integer                   1 byte   signed: -128 to 127
                                                                         unsigned: 0 to 255
short int (short)  Short integer                                2 bytes  signed: -32768 to 32767
                                                                         unsigned: 0 to 65535
int                Integer                                      4 bytes  signed: -2147483648 to 2147483647
                                                                         unsigned: 0 to 4294967295
long int (long)    Long integer                                 4 bytes  signed: -2147483648 to 2147483647
                                                                         unsigned: 0 to 4294967295
bool               Boolean value (true or false)                1 byte   true or false
float              Floating point number                        4 bytes  3.4e +/- 38 (7 digits)
double             Double precision floating point number       8 bytes  1.7e +/- 308 (15 digits)
long double        Long double precision floating point number  8 bytes  1.7e +/- 308 (15 digits)
wchar_t            Wide character                               2 bytes  1 wide character

* The values of the columns Size and Range depend on the architecture of the system where the program is compiled and executed. The values shown above are those found on most 32-bit systems. For other systems, the general specification is that int has the natural size suggested by the system architecture (one word) and the four integer types char, short, int and long must each be at least as large as the one preceding it. The same applies to the floating point types float, double and long double, where each one must provide at least as much precision as the preceding one.

Declaration of variables

In order to use a variable in C++, we must first declare it specifying which data type we want it to be. The syntax to declare a new variable is to write the specifier of the desired data type (like int, bool, float...) followed by a valid variable identifier. For example:

int a;
float mynumber;

These are two valid declarations of variables. The first one declares a variable of type int with the identifier a. The second one declares a variable of type float with the identifier mynumber. Once declared, the variables a and mynumber can be used within the rest of their scope in the program.

If you are going to declare more than one variable of the same type, you can declare all of them in a single statement by separating their identifiers with commas. For example:

int a, b, c;

This declares three variables (a, b and c), all of them of type int, and has exactly the same meaning as:

int a;
int b;
int c;

The integer data types char, short, long and int can be either signed or unsigned depending on the range of numbers needed to be represented. Signed types can represent both positive and negative values, whereas unsigned types can only represent positive values (and zero). This can be specified by using either the specifier signed or the specifier unsigned before the type name. For example:

unsigned short int NumberOfSisters;

signed int MyAccountBalance;

By default, if we do not specify either signed or unsigned most compiler settings will assume the type to be signed, therefore instead of the second declaration above we could have written:

int MyAccountBalance;

with exactly the same meaning (with or without the keyword signed).

An exception to this general rule is the char type, which exists by itself and is considered a different fundamental data type from signed char and unsigned char; it is intended to store characters. You should use either signed or unsigned if you intend to store numerical values in a char-sized variable.

short and long can be used alone as type specifiers. In this case, they refer to their respective integer fundamental types: short is equivalent to short int and long is equivalent to long int. The following two variable declarations are equivalent:

short Year;

short int Year;

Finally, signed and unsigned may also be used as standalone type specifiers, meaning the same as signed int and unsigned int respectively. The following two declarations are equivalent:

unsigned NextYear;

unsigned int NextYear;

To see what variable declarations look like in action within a program, we are going to see the C++ code of the example about your mental memory proposed at the beginning of this section:

// operating with variables

#include <iostream>
using namespace std;


int main ()
{
  // declaring variables:
  int a, b;
  int result;

  // process:
  a = 5;
  b = 2;
  a = a + 1;
  result = a - b;

  // print out the result:

  cout << result;

  // terminate the program:
  return 0;
}
4

Do not worry if anything other than the variable declarations themselves looks a bit strange to you. You will see the rest in detail in coming sections.

Scope of variables

All the variables that we intend to use in a program must have been declared with their type specifier at an earlier point in the code, as we did in the previous code at the beginning of the body of the function main when we declared that a, b, and result were of type int.

A variable can be either of global or local scope. A global variable is a variable declared in the main body of the source code, outside all functions, while a local variable is one declared within the body of a function or a block.

[Figure 2-imgvars1.gif: global and local variables in a program]

Global variables can be referred to from anywhere in the code, even inside functions, as long as the reference comes after the declaration.

The scope of local variables is limited to the block enclosed in braces ({}) where they are declared. For example, if they are declared at the beginning of the body of a function (like in function main) their scope is between their declaration point and the end of that function. In the example above, this means that if another function existed in addition to main, the local variables declared in main could not be accessed from the other function and vice versa.

Initialization of variables

When declaring a regular local variable, its value is by default undetermined. But you may want a variable to store a concrete value at the same moment that it is declared. In order to do that, you can initialize the variable. There are two ways to do this in C++:

The first one, known as C-like, is done by appending an equal sign followed by the value to which the variable will be initialized:

type identifier = initial_value ;

For example, if we want to declare an int variable called a initialized with a value of 0 at the moment in which it is declared, we could write:

int a = 0;

The other way to initialize variables, known as constructor initialization, is done by enclosing the initial value between parentheses (()):

type identifier (initial_value) ;

For example:

int a (0);

Both ways of initializing variables are valid and equivalent in C++.

// initialization of variables
 
#include <iostream>
using namespace std;
 
int main ()
{
  int a=5;               // initial value = 5
 
  int b(2);              // initial value = 2
  int result;            // initial value undetermined
 
  a = a + 3;
  result = a - b;
  cout << result;
 
  return 0;
}
6

Introduction to strings

Variables that can store non-numerical values that are longer than one single character are known as strings.

The C++ language library provides support for strings through the standard string class. This is not a fundamental type, but it behaves in a similar way as fundamental types do in its most basic usage.

A first difference with fundamental data types is that in order to declare and use objects (variables) of this type we need to include an additional header file in our source code: <string> and have access to the std namespace (which we already had in all our previous programs thanks to the using namespace statement).

// my first string

#include <iostream>
#include <string>
using namespace std;

int main ()
{
  string mystring = "This is a string";
  cout << mystring;
  return 0;
}

This is a string

As you may see in the previous example, strings can be initialized with any valid string literal just like numerical type variables can be initialized to any valid numerical literal. Both initialization formats are valid with strings:

string mystring = "This is a string";
string mystring ("This is a string");

Strings can also perform all the other basic operations that fundamental data types can, like being declared without an initial value and being assigned values during execution:

// my first string
#include <iostream>
#include <string>
using namespace std;


int main ()
{
  string mystring;
  mystring = "This is the initial string content";
  cout << mystring << endl;
  mystring = "This is a different string content";
  cout << mystring << endl;
  return 0;
}


This is the initial string content
This is a different string content

For more details on strings, you can have a look at this link.

Constants

Constants are expressions with a fixed value.

Literals

Literals are used to express particular values within the source code of a program. We have already used these previously to give concrete values to variables or to express messages we wanted our programs to print out, for example, when we wrote:

a = 5;

the 5 in this piece of code was a literal constant.

Literal constants can be divided into Integer Numerals, Floating-Point Numerals, Characters, Strings and Boolean Values.

Integer Numerals

1776
707
-273

They are numerical constants that identify integer decimal values. Notice that to express a numerical constant we do not have to write quotes (") nor any special character. There is no doubt that it is a constant: whenever we write 1776 in a program, we will be referring to the value 1776.

In addition to decimal numbers (those that all of us use every day), C++ allows octal numbers (base 8) and hexadecimal numbers (base 16) as literal constants. To express an octal number we have to precede it with a 0 (zero character). To express a hexadecimal number we have to precede it with the characters 0x (zero, x). For example, the following literal constants are all equivalent to each other:

75         // decimal

0113       // octal
0x4b       // hexadecimal

All of these represent the same number: 75 (seventy-five) expressed as a base-10 numeral, octal numeral and hexadecimal numeral, respectively.

Literal constants, like variables, are considered to have a specific data type. By default, integer literals are of type int. However, we can force a literal to be unsigned by appending the u character to it, or long by appending l:

75         // int

75u        // unsigned int
75l        // long
75ul       // unsigned long

In both cases, the suffix can be specified using either upper or lowercase letters.

Floating Point Numbers

They express numbers with decimals and/or exponents. They can include either a decimal point, an e character (which means "times ten raised to the power X", where X is the integer value that follows the e character), or both a decimal point and an e character:

3.14159    // 3.14159

6.02e23    // 6.02 x 10^23
1.6e-19    // 1.6 x 10^-19
3.0        // 3.0

These are four valid numbers with decimals expressed in C++. The first number is pi, the second one is Avogadro's number, the third is the magnitude of the electric charge of an electron (an extremely small number), all of them approximated, and the last one is the number three expressed as a floating-point numeric literal.

The default type for floating point literals is double. If you explicitly want to express a float or long double numerical literal, you can use the f or l suffixes respectively:

3.14159L   // long double

6.02e23f   // float

Any of the letters that can be part of a floating-point numerical constant (e, f, l) can be written using either lower or uppercase letters without any difference in their meanings.

Character and string literals

There also exist non-numerical constants, like:

'z'
'p'

"Hello world"
"How do you do?"

The first two expressions represent single character constants, and the following two represent string literals composed of several characters. Notice that to represent a single character we enclose it between single quotes (') and to express a string (which generally consists of more than one character) we enclose it between double quotes (").

When writing both single character and string literals, it is necessary to put the quotation marks surrounding them to distinguish them from possible variable identifiers or reserved keywords. Notice the difference between these two expressions:

x
'x'

x alone would refer to a variable whose identifier is x, whereas 'x' (enclosed within single quotation marks) would refer to the character constant 'x'.

Character and string literals have certain peculiarities, like the escape codes. These are special characters that are difficult or impossible to express otherwise in the source code of a program, like newline (\n) or tab (\t). All of them are preceded by a backslash (\). Here you have a list of some of such escape codes:

\n newline
\r carriage return
\t tab
\v vertical tab
\b backspace
\f form feed (page feed)
\a alert (beep)
\' single quote (')
\" double quote (")
\? question mark (?)
\\ backslash (\)

For example:

'\n'
'\t'
"Left \t Right"
"one\ntwo\nthree"

Additionally, you can express any character by its numerical ASCII code by writing a backslash character (\) followed by the ASCII code expressed as an octal (base-8) or hexadecimal (base-16) number. In the first case (octal) the digits must immediately follow the backslash (for example \23 or \40), in the second case (hexadecimal), an x character must be written before the digits themselves (for example \x20 or \x4A).

String literals can extend to more than a single line of code by putting a backslash sign (\) at the end of each unfinished line.

"string expressed in \
two lines" 

You can also concatenate several string constants separating them by one or several blank spaces, tabulators, newline or any other valid blank character:

"this forms" "a single" "string" "of characters"

Finally, if we want the string literal to be explicitly made of wide characters (wchar_t), instead of narrow characters (char), we can precede the constant with the L prefix:

L"This is a wide character string"

Wide characters are used mainly to represent non-English or exotic character sets.

Boolean literals

In C++, true and false are represented by 1 (non-zero) and 0, respectively.

There are only two valid Boolean values: true and false. These can be expressed in C++ as values of type bool by using the Boolean literals true and false.

Defined constants (#define)

You can define your own names for constants that you use very often without having to resort to memory-consuming variables, simply by using the #define preprocessor directive. Its format is:

#define identifier value

For example:

#define PI 3.14159265

#define NEWLINE '\n'

This defines two new constants: PI and NEWLINE. Once they are defined, you can use them in the rest of the code as if they were any other regular constant, for example:

// defined constants: calculate circumference

#include <iostream>
using namespace std;


#define PI 3.14159
#define NEWLINE '\n'

int main ()
{
  double r=5.0;               // radius
  double circle;

  circle = 2 * PI * r;
  cout << circle;
  cout << NEWLINE;

  return 0;
}

31.4159

In fact the only thing that the preprocessor does when it encounters #define directives is to literally replace any occurrence of their identifier (in the previous example, these were PI and NEWLINE) by the code to which they have been defined (3.14159 and '\n' respectively).

Remember NOT TO end #define directives with a semicolon. If you do, you will get very strange errors.

The #define directive is not a C++ statement but a directive for the preprocessor; therefore it assumes the entire line as the directive and does not require a semicolon (;) at its end. If you append a semicolon character (;) at the end, it will also be appended in all occurrences within the body of the program that the preprocessor replaces.

Declared constants (const)

With the const prefix you can declare constants with a specific type in the same way as you would do with a variable:

const int pathwidth = 100;
const char tabulator = '\t';
const int zipcode = 12440;

The type must always be stated explicitly. Some older compilers accepted a declaration such as const zipcode = 12440; and assumed type int, but standard C++ does not allow this.

Operators

Once we know of the existence of variables and constants, we can begin to operate with them. For that purpose, C++ provides operators. Unlike other languages whose operators are mainly keywords, operators in C++ are mostly made of signs that are not part of the alphabet but are available on all keyboards. This makes C++ code shorter and more international, since it relies less on English words, but requires a little learning effort in the beginning.

You do not have to memorize all the content of this page. Most details are only provided to serve as a later reference in case you need it.

Assignment (=)

C/C++ and many other languages use the '=' sign as the assignment operator, not as an equality sign. The equality test in C/C++ is '==' (two equal signs typed one after the other). Keep this in mind to avoid some time-consuming mistakes.

The assignment operator assigns a value to a variable.

a = 5;

This statement assigns the integer value 5 to the variable a. The part at the left of the assignment operator (=) is known as the lvalue (left value) and the right one as the rvalue (right value). The lvalue has to be a variable whereas the rvalue can be either a constant, a variable, the result of an operation or any combination of these.
The most important rule when assigning is the right-to-left rule: The assignment operation always takes place from right to left, and never the other way:

a = b;

This statement assigns to variable a (the lvalue) the value contained in variable b (the rvalue). The value that was stored until this moment in a is not considered at all in this operation, and in fact that value is lost.

Consider also that we are only assigning the value of b to a at the moment of the assignment operation. Therefore a later change of b will not affect the new value of a.

For example, let us have a look at the following code - I have included the evolution of the content stored in the variables as comments:

// assignment operator
 
#include <iostream>
using namespace std;
 
int main ()
{
  int a, b;         // a:?,  b:?
 
  a = 10;           // a:10, b:?
  b = 4;            // a:10, b:4
  a = b;            // a:4,  b:4
  b = 7;            // a:4,  b:7
 
  cout << "a:";
  cout << a;
  cout << " b:";
  cout << b;

  return 0;
}

a:4 b:7

This code will give us as result that the value contained in a is 4 and the one contained in b is 7. Notice how a was not affected by the final modification of b, even though we assigned b to a earlier (that is because of the right-to-left rule).

A property that C++ has over other programming languages is that the assignment operation can be used as the rvalue (or part of an rvalue) for another assignment operation. For example:

a = 2 + (b = 5);

is equivalent to:

b = 5;
a = 2 + b;

that means: first assign 5 to variable b and then assign to a the value 2 plus the result of the previous assignment of b (i.e. 5), leaving a with a final value of 7.

Writing code like a = 2 + (b = 5); works, but is poor style. Instead, write two statements: b = 5; a = 2 + b;

The following expression is also valid in C++:

a = b = c = 5;

It assigns 5 to all three variables: a, b and c.

Arithmetic operators ( +, -, *, /, % )

The five arithmetical operations supported by the C++ language are:

+ addition
- subtraction
* multiplication
/ division
% modulo

Operations of addition, subtraction, multiplication and division correspond literally to their respective mathematical operators. The only one that you might not be so used to seeing is modulo, whose operator is the percentage sign (%). Modulo is the operation that gives the remainder of a division of two values. For example, if we write:

a = 11 % 3;

the variable a will contain the value 2, since 2 is the remainder from dividing 11 by 3.

Compound assignment (+=, -=, *=, /=, %=, >>=, <<=, &=, ^=, |=)

When we want to modify the value of a variable by performing an operation on the value currently stored in that variable we can use compound assignment operators:

expression              is equivalent to
value += increase;      value = value + increase;
a -= 5;                 a = a - 5;
a /= b;                 a = a / b;
price *= units + 1;     price = price * (units + 1);

and the same for all other operators. For example:

// compound assignment operators

#include <iostream>
using namespace std;

int main ()
{
  int a, b=3;
  a = b;
  a+=2;             // equivalent to a=a+2

  cout << a;
  return 0;
}
5

Increase and decrease (++, --)

Shortening even more some expressions, the increase operator (++) and the decrease operator (--) increase or reduce by one the value stored in a variable. They are equivalent to +=1 and to -=1, respectively. Thus:

c++;
c+=1;
c=c+1;

are all equivalent in their functionality: the three of them increase by one the value of c.

In the early C compilers, the three previous expressions probably produced different executable code depending on which one was used. Nowadays, this type of code optimization is generally done automatically by the compiler, thus the three expressions should produce exactly the same executable code.

A characteristic of this operator is that it can be used both as a prefix and as a suffix. That means that it can be written either before the variable identifier (++a) or after it (a++). In simple expressions like a++ or ++a both have exactly the same meaning. The difference appears when the result of the increase or decrease operation is used as a value in an outer expression. When the increase operator is used as a prefix (++a), the value is increased before the result of the expression is evaluated, and therefore the increased value is used in the outer expression. When it is used as a suffix (a++), the value stored in a is increased after being evaluated, and therefore the value stored before the increase is used in the outer expression. Notice the difference:

Example 1                        Example 2
B=3;                             B=3;
A=++B;                           A=B++;
// A contains 4, B contains 4    // A contains 3, B contains 4

In Example 1, B is increased before its value is copied to A, while in Example 2 the value of B is copied to A and then B is increased.

Relational and equality operators ( ==, !=, >, <, >=, <= )

In order to evaluate a comparison between two expressions we can use the relational and equality operators. The result of a relational operation is a Boolean value that can only be true or false.

We may want to compare two expressions, for example, to know if they are equal or if one is greater than the other. Here is a list of the relational and equality operators that can be used in C++:

In C++ = and == are not the same thing! a=b, assigns the value of b to a, while a==b tests whether a and b are equal! Remember this to avoid very nasty errors.

== Equal to
!= Not equal to
> Greater than
< Less than
>= Greater than or equal to
<= Less than or equal to

Here are some examples:

(7 == 5)     // evaluates to false.

(5 > 4)      // evaluates to true.
(3 != 2)     // evaluates to true.
(6 >= 6)     // evaluates to true.
(5 < 5)      // evaluates to false.

Of course, instead of using only numeric constants, we can use any valid expression, including variables. Suppose that a=2, b=3 and c=6,

(a == 5)     // evaluates to false since a is not equal to 5.
(a*b >= c)   // evaluates to true since (2*3 >= 6) is true. 
(b+4 > a*c)  // evaluates to false since (3+4 > 2*6) is false. 

((b=2) == a) // evaluates to true. 

Be careful! The operator = (one equal sign) is not the same as the operator == (two equal signs), the first one is an assignment operator (assigns the value at its right to the variable at its left) and the other one (==) is the equality operator that compares whether both expressions in the two sides of it are equal to each other. Thus, in the last expression ((b=2) == a), we first assigned the value 2 to b and then we compared it to a, that also stores the value 2, so the result of the operation is true.

Logical operators ( !, &&, || )

The operator ! is the C++ operator to perform the Boolean operation NOT. It has only one operand, located at its right, and the only thing it does is invert its value, producing false if its operand is true and true if its operand is false. For example:

!(5 == 5)    // evaluates to false because the expression at its right (5 == 5) is true. 
 
 !(6 <= 4)    // evaluates to true because (6 <= 4) would be false. 
 !true        // evaluates to false
 !false       // evaluates to true. 

The logical operators && and || are used when evaluating two expressions to obtain a single Boolean result. The operator && corresponds to the Boolean logical operation AND. This operation yields true if both of its operands are true, and false otherwise. The following panel shows the result of operator && evaluating the expression a && b:

&& OPERATOR

a b a && b
true true true
true false false
false true false
false false false

The operator || corresponds to the Boolean logical operation OR. This operation yields true if either one of its two operands is true, and is false only when both operands are false. Here are the possible results of a || b:

|| OPERATOR

a b a || b
true true true
true false true
false true true
false false false

For example:

( (5 == 5) && (3 > 6) )  // evaluates to false ( true && false ).

( (5 == 5) || (3 > 6) )  // evaluates to true ( true || false ).

Conditional operator ( ? )

The conditional operator evaluates an expression returning a value if that expression is true and a different one if the expression is evaluated as false. Its format is:

condition ? result1 : result2

If condition is true the expression will return result1, if it is not it will return result2.

7==5 ? 4 : 3     // returns 3, since 7 is not equal to 5.

7==5+2 ? 4 : 3   // returns 4, since 7 is equal to 5+2.
5>3 ? a : b      // returns the value of a, since 5 is greater than 3.
a>b ? a : b      // returns whichever is greater, a or b.

// conditional operator

#include <iostream>

using namespace std;

int main ()
{
  int a,b,c;

  a=2;
  b=7;
  c = (a>b) ? a : b;

  cout << c;

  return 0;
}

7

In this example a was 2 and b was 7, so the expression being evaluated (a>b) was not true, thus the first value specified after the question mark was discarded in favor of the second value (the one after the colon) which was b, with a value of 7.

Comma operator ( , )

The comma operator (,) is used to separate two or more expressions that are included where only one expression is expected. When the set of expressions has to be evaluated for a value, only the rightmost expression is considered.

For example, the following code:

a = (b=3, b+2);

Would first assign the value 3 to b, and then assign b+2 to variable a. So, at the end, variable a would contain the value 5 while variable b would contain value 3.

Bitwise Operators ( &, |, ^, ~, <<, >> )

Bitwise operators work on the bit patterns that represent the values stored in their operands.

operator asm equivalent description
& AND Bitwise AND
| OR Bitwise Inclusive OR
^ XOR Bitwise Exclusive OR
~ NOT Unary complement (bit inversion)
<< SHL Shift Left
>> SHR Shift Right

Explicit type casting operator

Type casting operators allow you to convert a datum of a given type to another. There are several ways to do this in C++. The simplest one, which has been inherited from the C language, is to precede the expression to be converted by the new type enclosed between parentheses (()):

int i;
float f = 3.14;
i = (int) f;

The previous code converts the float number 3.14 to an integer value (3); the fractional part is lost. Here, the typecasting operator was (int). Another way to do the same thing in C++ is using the functional notation: preceding the expression to be converted by the type and enclosing the expression between parentheses:

i = int ( f );

Both ways of type casting are valid in C++.

sizeof()

This operator accepts one parameter, which can be either a type or a variable itself and returns the size in bytes of that type or object:

a = sizeof (char);

This will assign the value 1 to a because char is a one-byte long type.
The value returned by sizeof is a constant, so it is always determined before program execution.

Other operators

Later in these tutorials, we will see a few more operators, like the ones referring to pointers or the specifics for object-oriented programming. Each one is treated in its respective section.

Precedence of operators

Don't Bother

Do not rely on intricate rules of precedence to write complicated expressions. Instead, always use parentheses. They make your code easier to read and maintain, and less error-prone.

When writing complex expressions with several operators, we may have some doubts about which operation is evaluated first and which later. For example, in this expression:

a = 5 + 7 % 2

we may doubt if it really means:

a = 5 + (7 % 2)    // with a result of 6, or
a = (5 + 7) % 2    // with a result of 0

The correct answer is the first of the two expressions, with a result of 6. If you are interested in finding out how this happened and want a thorough idea of the precedence of operators, read this page. However, give everybody (including you) a break -- use parentheses and do not depend too much on these rules!!

Exercises

Prime Numbers

Task
Write a C++ program to find and print prime numbers.
Hint
  • Don't worry about calculation efficiency too much.
  • For each number, check if it is a multiple of any number between 2 and one less than that number. If yes, it is not a prime. Otherwise it is.

Basic Input/Output

Until now, the example programs of previous sections provided very little interaction with the user, if any at all. Using the standard input and output library, we will be able to interact with the user by printing messages on the screen and getting the user's input from the keyboard.

C++ uses a convenient abstraction called streams to perform input and output operations in sequential media such as the screen or the keyboard. A stream is an object into which a program can insert characters or from which it can extract them. We do not really need to care about many specifications of the physical media associated with the stream - we only need to know it will accept or provide characters sequentially.

The standard C++ library includes the header file iostream, where the standard input and output stream objects are declared.

Standard Output (cout)

By default, the standard output of a program is the screen, and the C++ stream object defined to access it is cout.

cout is used in conjunction with the insertion operator, which is written as << (two "less than" signs).

cout << "Output sentence"; // prints Output sentence on screen

cout << 120;               // prints number 120 on screen
cout << x;                 // prints the content of x on screen

The << operator inserts the data that follows it into the stream preceding it. In the examples above it inserted the constant string Output sentence, the numerical constant 120 and variable x into the standard output stream cout. Notice that the sentence in the first instruction is enclosed between double quotes (") because it is a constant string of characters. Whenever we want to use constant strings of characters we must enclose them between double quotes (") so that they can be clearly distinguished from variable names. For example, these two sentences have very different results:

cout << "Hello";  // prints Hello

cout << Hello;    // prints the content of Hello variable

The insertion operator (<<) may be used more than once in a single statement:

cout << "Hello, " << "I am " << "a C++ statement";
 
 

This last statement would print the message Hello, I am a C++ statement on the screen. The utility of repeating the insertion operator (<<) is demonstrated when we want to print out a combination of variables and constants or more than one variable:

cout << "Hello, I am " << age << " years old and my zipcode is " << zipcode;

If we assume the age variable to contain the value 24 and the zipcode variable to contain 90064 the output of the previous statement would be:

Hello, I am 24 years old and my zipcode is 90064 

It is important to notice that cout does not add a line break after its output unless we explicitly indicate it, therefore, the following statements:

cout << "This is a sentence.";
cout << "This is another sentence.";
 

will be shown on the screen one following the other without any line break between them:

This is a sentence.This is another sentence.

even though we had written them in two different insertions into cout. In order to perform a line break on the output we must explicitly insert a new-line character into cout. In C++ a new-line character can be specified as \n (backslash, n):

cout << "First sentence.\n";
cout << "Second sentence.\nThird sentence.";
 
 

This produces the following output:

First sentence.
Second sentence.
Third sentence.

Additionally, to add a new-line, you may also use the endl manipulator. For example:

cout << "First sentence." << endl;
cout << "Second sentence." << endl; 

would print out:

First sentence.
Second sentence.

The endl manipulator produces a newline character, exactly as the insertion of '\n' does, but it also has an additional behavior: it flushes the stream's buffer, forcing any pending output to be written immediately. In short interactive programs like the ones in this tutorial you will not notice any difference between the \n escape character and the endl manipulator. In programs that write a lot of output, the extra flushes can make endl noticeably slower.

Standard Input (cin)

The standard input device is usually the keyboard. Handling the standard input in C++ is done by applying the overloaded operator of extraction (>>) on the cin stream. The operator must be followed by the variable that will store the data that is going to be extracted from the stream. For example:

int age;
cin >> age;

The first statement declares a variable of type int called age, and the second one waits for an input from cin (the keyboard) in order to store it in this integer variable.

cin can only process the input from the keyboard once the RETURN key has been pressed. Therefore, even if you request a single character, the extraction from cin will not process the input until the user presses RETURN after the character has been introduced.

You must always consider the type of the variable that you are using as a container with cin extractions. If you request an integer you will get an integer, if you request a character you will get a character and if you request a string of characters you will get a string of characters.

// i/o example

#include <iostream>
using namespace std;

int main ()
{
  int i;
  cout << "Please enter an integer value: ";
  cin >> i;
  cout << "The value you entered is " << i;
  cout << " and its double is " << i*2 << ".\n";
  return 0;
}

Please enter an integer value: 702
The value you entered is 702 and its double is 1404.

The user of a program may be one of the factors that generate errors even in the simplest programs that use cin (like the one we have just seen). If you request an integer value and the user enters a name (which generally is a string of characters), the extraction fails and your program may misbehave, since that is not what it was expecting. So when you use the data input provided by cin extractions you will have to trust that the user of your program will be cooperative and will not enter a name or something similar when an integer value is requested. A little ahead, when we see the stringstream class, we will see a possible solution for the errors that can be caused by this type of user input.

You can also use cin to request more than one datum input from the user:

cin >> a >> b;

is equivalent to:

cin >> a;
cin >> b;

In both cases the user must give two data, one for variable a and another one for variable b that may be separated by any valid blank separator: a space, a tab character or a newline.

cin and strings

We can use cin to get strings with the extraction operator (>>) as we do with fundamental data type variables:

cin >> mystring;

However, as mentioned above, cin extraction stops reading as soon as it finds any blank space character, so in this case we will be able to get just one word for each extraction. This behavior may or may not be what we want; for example if we want to get a sentence from the user, this extraction operation would not be useful.

In order to get entire lines, we can use the function getline, which is the recommended way to get user input with cin:

// cin with strings
#include <iostream>
#include <string>

using namespace std;

int main ()
{
  string mystr;
  cout << "What's your name? ";
  getline (cin, mystr);
  cout << "Hello " << mystr << ".\n";
  cout << "What is your favorite team? ";
  getline (cin, mystr);
  cout << "I like " << mystr << " too!\n";
  return 0;
}

What's your name? James Cook
Hello James Cook.
What is your favorite team? The Cookers
I like The Cookers too!

Notice how in both calls to getline we used the same string identifier (mystr). What the program does in the second call is simply to replace the previous content by the new one that is introduced.

A handy tool for type conversion -- stringstream

The standard header file <sstream> defines a class called stringstream that allows a string-based object to be treated as a stream. This way we can perform extraction or insertion operations from/to strings, which is especially useful to convert strings to numerical values and vice versa. For example, if we want to extract an integer from a string we can write:

string mystr ("1204");

int myint;
stringstream(mystr) >> myint;

This declares a string object with a value of "1204", and an int object. Then we use stringstream's constructor to construct an object of this type from the string object. Because we can use stringstream objects as if they were streams, we can extract an integer from it as we would have done on cin by applying the extractor operator (>>) on it followed by a variable of type int.

After this piece of code, the variable myint will contain the numerical value 1204.

// stringstreams

#include <iostream>
#include <string>
#include <sstream>
using namespace std;

int main ()
{
  string mystr;
  float price=0;
  int quantity=0;

  cout << "Enter price: ";
  getline (cin,mystr);
  stringstream(mystr) >> price;
  cout << "Enter quantity: ";
  getline (cin,mystr);
  stringstream(mystr) >> quantity;
  cout << "Total price: " << price*quantity << endl;
  return 0;
}

Enter price: 22.25
Enter quantity: 7
Total price: 155.75

In this example, we acquire numeric values from the standard input indirectly. Instead of extracting numeric values directly from the standard input, we get lines from the standard input (cin) into a string object (mystr), and then we extract the numeric values from this string into the variables price (of type float) and quantity (of type int).

Using this method, instead of direct extractions of integer values, we have more control over what happens with the input of numeric values from the user, since we are separating the process of obtaining input from the user (we now simply ask for lines) from the interpretation of that input. Therefore, this method is usually preferred to get numerical values from the user in all programs that are intensive in user input.


Control Structures

A program is usually not limited to a linear sequence of instructions. During its process it may bifurcate, repeat code or take decisions. For that purpose, C++ provides control structures that serve to specify what has to be done by our program, when and under which circumstances.

With the introduction of control structures we are going to have to introduce a new concept: the compound-statement or block. A block is a group of statements which are separated by semicolons (;) like all C++ statements, but grouped together in a block enclosed in braces: { }:

{ statement1; statement2; statement3; }

Most of the control structures that we will see in this section require a generic statement as part of their syntax. A statement can be either a simple statement (a simple instruction ending with a semicolon) or a compound statement (several instructions grouped in a block), like the one just described. If we want the statement to be a simple statement, we do not need to enclose it in braces ({}). But if we want the statement to be a compound statement, it must be enclosed between braces ({}), forming a block.

Conditional structure: if and else

The if keyword is used to execute a statement or block only if a condition is fulfilled. Its form is:

if (condition) statement

Where condition is the expression that is being evaluated. If this condition is true, statement is executed. If it is false, statement is ignored (not executed) and the program continues right after this conditional structure.
For example, the following code fragment prints x is 100 only if the value stored in the x variable is indeed 100:

if (x == 100)
  cout << "x is 100";
 
 

If we want more than a single statement to be executed in case that the condition is true we can specify a block using braces { }:

if (x == 100)
{
  cout << "x is ";
  cout << x;
}
 

We can additionally specify what we want to happen if the condition is not fulfilled by using the keyword else. Its form used in conjunction with if is:

if (condition) statement1 else statement2

For example:

if (x == 100)
  cout << "x is 100";
else
  cout << "x is not 100";
 

prints on the screen x is 100 if indeed x has a value of 100, but if it has not -and only if not- it prints out x is not 100.

The if + else structures can be concatenated with the intention of verifying a range of values. The following example shows its use telling if the value currently stored in x is positive, negative or neither (i.e. zero):

if (x > 0)
  cout << "x is positive";
else if (x < 0)
  cout << "x is negative";
else
  cout << "x is 0";
 
 

Remember that in case that we want more than a single statement to be executed, we must group them in a block by enclosing them in braces { }.

Iteration structures (loops)

The purpose of loops is to repeat a statement a certain number of times or while a condition is fulfilled.

The while loop

Its format is:

while (expression) statement

and its functionality is simply to repeat statement while the condition set in expression is true.
For example, we are going to make a program to countdown using a while-loop:

// custom countdown using while

#include <iostream>

using namespace std;

int main ()
{
  int n;
  cout << "Enter the starting number > ";
  cin >> n;

  while (n>0) {
    cout << n << ", ";
    --n;
  }

  cout << "FIRE!\n";
  return 0;
}

Enter the starting number > 8
8, 7, 6, 5, 4, 3, 2, 1, FIRE!

When the program starts the user is prompted to insert a starting number for the countdown. Then the while loop begins: if the value entered by the user fulfills the condition n>0 (that n is greater than zero), the block that follows the condition will be executed and repeated while the condition (n>0) remains true.

The whole process of the previous program can be interpreted according to the following script (beginning in main):

  1. User assigns a value to n
  2. The while condition is checked (n>0). At this point there are two possibilities:
    * condition is true: statement is executed (to step 3)
    * condition is false: ignore statement and continue after it (to step 5)
  3. Execute statement:
    cout << n << ", ";
    --n;
    (prints the value of n on the screen and decreases n by 1)
  4. End of block. Return automatically to step 2
  5. Continue the program right after the block: print FIRE! and end program.

When creating a while-loop, we must always consider that it has to end at some point, therefore we must provide within the block some method to force the condition to become false at some point, otherwise the loop will continue looping forever. In this case we have included --n; that decreases the value of the variable that is being evaluated in the condition (n) by one - this will eventually make the condition (n>0) become false after a certain number of loop iterations: to be more specific, when n becomes 0, that is where our while-loop and our countdown end.

Of course this is such a simple action for our computer that the whole countdown is performed instantly without any practical delay between numbers.

The do-while loop

Its format is:

do statement while (condition);

Its functionality is exactly the same as the while loop, except that condition in the do-while loop is evaluated after the execution of statement instead of before, guaranteeing at least one execution of statement even if condition is never fulfilled. For example, the following example program echoes any number you enter until you enter 0.

// number echoer

#include <iostream>

using namespace std;

int main ()
{
  unsigned long n;
  do {
    cout << "Enter number (0 to end): ";
    cin >> n;
    cout << "You entered: " << n << "\n";
  } while (n != 0);
  return 0;
}

Enter number (0 to end): 12345
You entered: 12345
Enter number (0 to end): 160277
You entered: 160277
Enter number (0 to end): 0
You entered: 0

The do-while loop is usually used when the condition that has to determine the end of the loop is determined within the loop statement itself, like in the previous case, where the user input within the block is what is used to determine if the loop has to end. In fact, if you never enter the value 0 in the previous example, you will be prompted for more numbers forever.

The for loop

Its format is:

for (initialization; condition; increase) statement;

and its main function is to repeat statement while condition remains true, like the while loop. But in addition, the for loop provides specific locations to contain an initialization statement and an increase statement. So this loop is specially designed to perform a repetitive action with a counter which is initialized and increased on each iteration.

It works in the following way:

  1. initialization is executed. Generally it is an initial value setting for a counter variable. This is executed only once.
  2. condition is checked. If it is true the loop continues, otherwise the loop ends and statement is skipped (not executed).
  3. statement is executed. As usual, it can be either a single statement or a block enclosed in braces { }.
  4. finally, whatever is specified in the increase field is executed and the loop gets back to step 2.

Here is an example of countdown using a for loop:

// countdown using a for loop
#include <iostream>
using namespace std;
int main ()
{
  for (int n=10; n>0; n--) {
    cout << n << ", ";
  }
  cout << "FIRE!\n";
  return 0;
}

10, 9, 8, 7, 6, 5, 4, 3, 2, 1, FIRE!

The initialization and increase fields are optional. They can remain empty, but in all cases the semicolon signs between them must be written. For example we could write: for (;n<10;) if we wanted to specify no initialization and no increase; or for (;n<10;n++) if we wanted to include an increase field but no initialization (maybe because the variable was already initialized before).

Optionally, using the comma operator (,) we can specify more than one expression in any of the fields included in a for loop, like in initialization, for example. The comma operator (,) is an expression separator, it serves to separate more than one expression where only one is generally expected. For example, suppose that we wanted to initialize more than one variable in our loop:

for ( n=0, i=100 ; n!=i ; n++, i-- )
{
   // whatever here...

}

This loop will execute 50 times if neither n nor i is modified within the loop:

Image:6-imgloop1.gif

n starts with a value of 0, and i with 100; the condition is n!=i (that n is not equal to i). Because n is increased by one and i is decreased by one on each iteration, the loop's condition will become false after the 50th iteration, when both n and i are equal to 50.

Jump statements.

The break statement

Using break we can leave a loop even if the condition for its end is not fulfilled. It can be used to end an infinite loop, or to force a loop to end before its natural end. For example, we are going to stop the countdown before its natural end (maybe because of an engine check failure?):

// break loop example

#include <iostream>
using namespace std;

int main ()
{
  int n;
  for (n=10; n>0; n--)
  {
    cout << n << ", ";
    if (n==3)
    {
      cout << "countdown aborted!";
      break;
    }
  }
  return 0;
}

10, 9, 8, 7, 6, 5, 4, 3, countdown aborted!

The continue statement

The continue statement causes the program to skip the rest of the loop in the current iteration as if the end of the statement block had been reached, causing it to jump to the start of the following iteration. For example, we are going to skip the number 5 in our countdown:

// continue loop example
#include <iostream>
using namespace std;

int main ()
{
  for (int n=10; n>0; n--) {
    if (n==5) continue;
    cout << n << ", ";
  }
  cout << "FIRE!\n";
  return 0;
}

10, 9, 8, 7, 6, 4, 3, 2, 1, FIRE!

The goto statement

Frequent use of goto statement is poor programming practice.

goto makes an absolute jump to another point in the program. You should use this feature with caution, since its execution causes an unconditional jump that ignores any type of nesting limitations.
The destination point is identified by a label, which is then used as an argument for the goto statement. A label is made of a valid identifier followed by a colon (:).

Generally speaking, this instruction has no concrete use in structured or object oriented programming aside from those that low-level programming fans may find for it. For example, here is our countdown loop using goto:

// goto loop example

#include <iostream>

using namespace std;

int main ()
{
  int n=10;
  loop:
  cout << n << ", ";
  n--;
  if (n>0) goto loop;
  cout << "FIRE!\n";
  return 0;
}

10, 9, 8, 7, 6, 5, 4, 3, 2, 1, FIRE!

The exit function

exit is a function declared in the cstdlib header (<cstdlib>).

The purpose of exit is to terminate the current program with a specific exit code. Its prototype is:

void exit (int exitcode);

The exitcode is used by some operating systems and may be used by calling programs. By convention, an exit code of 0 means that the program finished normally and any other value means that some error or unexpected result happened.

The selective structure: switch.

The syntax of the switch statement is a bit peculiar. Its objective is to check several possible constant values for an expression, something similar to what we did at the beginning of this section with the concatenation of several if and else if instructions. Its form is the following:

switch (expression)
{
  case constant1:
     group of statements 1;
     break;
  case constant2:
     group of statements 2;
     break;
  .
  .
  .
  default:
     default group of statements
}

It works in the following way: switch evaluates expression and checks if it is equal to constant1. If it is, it executes group of statements 1 until it finds the break statement. When it finds this break statement, the program jumps to the end of the switch selective structure.

If expression was not equal to constant1 it will be checked against constant2. If it is equal to this, it will execute group of statements 2 until a break keyword is found, and then will jump to the end of the switch selective structure.

Finally, if the value of expression did not match any of the previously specified constants (you can include as many case labels as values you want to check), the program will execute the statements included after the default: label, if it exists (since it is optional).

Any switch statement can be written as an if-else ladder. However, the switch statement is often more elegant and easier to understand.

Both of the following code fragments have the same behavior:

switch example:

switch (x) {
  case 1:
    cout << "x is 1";
    break;
  case 2:
    cout << "x is 2";
    break;
  default:
    cout << "value of x unknown";
}

if-else equivalent:

if (x == 1) {
  cout << "x is 1";
}
else if (x == 2) {
  cout << "x is 2";
}
else {
  cout << "value of x unknown";
}

The switch statement is a bit peculiar within the C++ language because it uses labels instead of blocks. This forces us to put break statements after the group of statements that we want to be executed for a specific condition. Otherwise the remaining statements, including those corresponding to other labels, will also be executed until the end of the switch selective block or a break statement is reached.

For example, if we did not include a break statement after the first group for case one, the program would not automatically jump to the end of the switch selective block; it would continue executing the rest of the statements until it reached either a break instruction or the end of the switch selective block. This makes it unnecessary to include braces { } surrounding the statements for each of the cases, and it can also be useful to execute the same block of instructions for different possible values of the expression being evaluated. For example:

switch (x) {
  case 1:
  case 2:
  case 3:
    cout << "x is 1, 2 or 3";
    break;
  default:
    cout << "x is not 1, 2 nor 3";
}
 
 

Notice that switch can only be used to compare an expression against constants. Therefore we cannot put variables as labels (for example case n: where n is a variable) or ranges (case (1..3):) because they are not valid C++ constants.

If you need to check ranges or values that are not constants, use a concatenation of if and else if statements.

Functions

Using functions we can structure our programs in a more modular way, accessing all the potential that structured programming can offer to us in C++.

A function is a group of statements that is executed when it is called from some point of the program. The following is its format:

type name ( parameter1, parameter2, ...) { statements }

where:

  • type is the data type specifier of the data returned by the function.
  • name is the identifier by which it will be possible to call the function.
  • parameters (as many as needed): Each parameter consists of a data type specifier followed by an identifier, like any regular variable declaration (for example: int x), and acts within the function as a regular local variable. Parameters allow arguments to be passed to the function when it is called. The different parameters are separated by commas.
  • statements is the function's body. It is a block of statements surrounded by braces { }.

Here you have the first function example:

// function example
#include <iostream>
using namespace std;


int addition (int a, int b)
{
  int r;
  r=a+b;
  return (r);
}

int main ()
{
  int z;
  z = addition (5,3);
  cout << "The result is " << z;
  return 0;
}

The result is 8

In order to examine this code, first of all remember something said at the beginning of this tutorial: a C++ program always begins its execution with the main function. So we will begin there.

We can see how the main function begins by declaring the variable z of type int. Right after that, we see a call to a function called addition. Paying attention we will be able to see the similarity between the structure of the call to the function and the declaration of the function itself some code lines above:

Image:7-imgfunc1.gif

The parameters and arguments have a clear correspondence. Within the main function we called addition, passing two values, 5 and 3, that correspond to the int a and int b parameters declared for function addition.

At the point at which the function is called from within main, control is lost by main and passed to function addition. The values of both arguments passed in the call (5 and 3) are copied to the local variables int a and int b within the function.

Function addition declares another local variable (int r), and by means of the expression r=a+b, it assigns to r the result of a plus b. Because the actual parameters passed for a and b are 5 and 3 respectively, the result is 8.

The following line of code:

return (r);

finalizes function addition, and returns control back to the function that called it in the first place (in this case, main). At this moment the program follows its regular course from the same point at which it was interrupted by the call to addition. In addition, the return statement in function addition specified a value: the content of variable r (return (r);), which at that moment was 8. This value becomes the value of the function call.

Image:7-imgfunc2.gif

Since the value returned by a function is the value given to the function call itself when it is evaluated, the variable z will be set to the value returned by addition (5, 3), that is 8. To explain it another way, you can imagine that the call to a function (addition (5,3)) is literally replaced by the value it returns (8).

The following line of code in main is:

cout << "The result is " << z;

That, as you may already expect, produces the printing of the result on the screen.

Scope of variables

The scope of variables declared within a function or any other inner block is only their own function or their own block; they cannot be used outside of it. For example, in the previous example it would have been impossible to use the variables a, b or r directly in function main since they were variables local to function addition. Also, it would have been impossible to use the variable z directly within function addition, since this was a variable local to the function main.

Image:7-imgvars1.gif

Therefore, the scope of local variables is limited to the same block level in which they are declared. Nevertheless, we also have the possibility to declare global variables; these are visible from any point of the code, inside and outside all functions. In order to declare global variables you simply have to declare the variable outside any function or block; that means, directly in the body of the program.

And here is another example about functions:

// function example

#include <iostream>
using namespace std;

int subtraction (int a, int b)
{
  int r;
  r=a-b;
  return (r);
}


int main ()
{
  int x=5, y=3, z;
  z = subtraction (7,2);
  cout << "The first result is " << z << '\n';
  cout << "The second result is " << subtraction (7,2) << '\n';
  cout << "The third result is " << subtraction (x,y) << '\n';
  z= 4 + subtraction (x,y);
  cout << "The fourth result is " << z << '\n';
  return 0;
}

The first result is 5
The second result is 5
The third result is 2
The fourth result is 6

In this case we have created a function called subtraction. The only thing that this function does is subtract the second parameter from the first and return the result.

Nevertheless, if we examine function main we will see that we have made several calls to function subtraction. We have used some different calling methods so that you see other ways or moments when a function can be called.

In order to fully understand these examples you must consider once again that a call to a function could be replaced by the value that the function call itself is going to return. For example, the first case (that you should already know because it is the same pattern that we have used in previous examples):

z = subtraction (7,2);
cout << "The first result is " << z;

If we replace the function call by the value it returns (i.e., 5), we would have:

z = 5;
cout << "The first result is " << z;

Likewise,

cout << "The second result is " << subtraction (7,2);

has the same result as the previous call, but in this case we made the call to subtraction directly as an insertion parameter for cout. Simply consider that the result is the same as if we had written:

cout << "The second result is " << 5;

since 5 is the value returned by subtraction (7,2).

In the case of:

cout << "The third result is " << subtraction (x,y);

The only new thing that we introduced is that the parameters of subtraction are variables instead of constants. That is perfectly valid. In this case the values passed to function subtraction are the values of x and y, that are 5 and 3 respectively, giving 2 as result.

The fourth case is more of the same. Simply note that instead of:

z = 4 + subtraction (x,y);

we could have written:

z = subtraction (x,y) + 4;

with exactly the same result. I have switched places so you can see that the semicolon sign (;) goes at the end of the whole statement. It does not necessarily have to go right after the function call. The explanation might be once again that you imagine that a function can be replaced by its returned value:

z = 4 + 2;
z = 2 + 4;

Functions with no type. The use of void.

If you remember the syntax of a function declaration:

type name ( argument1, argument2 ...) statement

you will see that the declaration begins with a type, that is the type of the function itself (i.e., the type of the datum that will be returned by the function with the return statement). But what if we want to return no value?

Imagine that we want to make a function just to show a message on the screen. We do not need it to return any value. In this case we should use the void type specifier for the function. This is a special specifier that indicates absence of type.

// void function example

#include <iostream>
using namespace std;

void printmessage ()
{
  cout << "I'm a function!";
}

int main ()
{
  printmessage ();
  return 0;
}

I'm a function!

void can also be used in the function's parameter list to explicitly specify that we want the function to take no actual parameters when it is called. For example, function printmessage could have been declared as:

void printmessage (void)
{
  cout << "I'm a function!";
}

Specifying void in the parameter list is optional, however. In C++, a parameter list can simply be left blank if we want a function with no parameters.

What you must always remember is that the format for calling a function includes specifying its name and enclosing its parameters between parentheses. The non-existence of parameters does not exempt us from the obligation to write the parentheses. For that reason the call to printmessage is:

printmessage ();

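The parentheses clearly indicate that this is a call to a function and not the name of a variable or some other C++ statement. The following statement would have been incorrect as a call: most compilers accept it, perhaps with a warning, but it only names the function and never executes it: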

printmessage;




Recursivity.

Recursivity is the property that functions have of being able to call themselves. It is useful for many tasks, like sorting or calculating the factorial of numbers. For example, to obtain the factorial of a number (n!) the mathematical formula would be:

n! = n * (n-1) * (n-2) * (n-3) ... * 1

more concretely, 5! (factorial of 5) would be:

5! = 5 * 4 * 3 * 2 * 1 = 120

and a recursive function to calculate this in C++ could be:

// factorial calculator
#include <iostream>
using namespace std;


long factorial (long a)
{
  if (a > 1)
    return (a * factorial (a-1));
  else
    return (1);
}


int main ()
{
  long number;
  cout << "Please type a number: ";
  cin >> number;
  cout << number << "! = " << factorial (number);
  return 0;
}

Please type a number: 9
9! = 362880

Notice how in function factorial we included a call to itself, but only if the argument passed was greater than 1, since otherwise the function would perform an infinite recursive loop in which, once it arrived at 0, it would continue multiplying by all the negative numbers (probably provoking a stack overflow error at runtime).

This function has a limitation because of the data type we used in its design (long) for more simplicity. The results will not be valid once n! no longer fits in a long: with a 32-bit long the largest correct result is 12!, and with a 64-bit long it is 20!, depending on the system on which you compile it.

Declaring functions.

Until now, we have defined all of the functions before the first appearance of calls to them in the source code. These calls were generally in function main which we have always left at the end of the source code. If you try to repeat some of the examples of functions described so far, but placing the function main before any of the other functions that were called from within it, you will most likely obtain compiling errors. The reason is that to be able to call a function it must have been declared in some earlier point of the code, like we have done in all our examples.

But there is an alternative way to avoid writing the whole code of a function before it can be used in main or in some other function. This can be achieved by declaring just a prototype of the function before it is used, instead of the entire definition. This declaration is shorter than the entire definition, but significant enough for the compiler to determine its return type and the types of its parameters.

Its form is:

type name ( argument_type1, argument_type2, ...);

It is identical to a function definition, except that it does not include the body of the function itself (i.e., the function statements that in normal definitions are enclosed in braces { }) and instead of that we end the prototype declaration with a mandatory semicolon (;).

The parameter enumeration does not need to include the identifiers, but only the type specifiers. The inclusion of a name for each parameter as in the function definition is optional in the prototype declaration. For example, we can declare a function called protofunction with two int parameters with any of the following declarations:

int protofunction (int first, int second);

int protofunction (int, int);

Anyway, including a name for each parameter makes the prototype more legible.

// declaring functions prototypes
#include <iostream>
using namespace std;


void odd (int a);
void even (int a);

int main ()
{
  int i;
  do {
    cout << "Type a number (0 to exit): ";
    cin >> i;
    odd (i);
  } while (i!=0);
  return 0;
}


void odd (int a)
{
  if ((a%2)!=0) cout << "Number is odd.\n";
  else even (a);
}

void even (int a)
{
  if ((a%2)==0) cout << "Number is even.\n";
  else odd (a);
}

Type a number (0 to exit): 9
Number is odd.
Type a number (0 to exit): 6
Number is even.
Type a number (0 to exit): 1030
Number is even.
Type a number (0 to exit): 0
Number is even.

This example is indeed not an example of efficiency. I am sure that at this point you can already make a program with the same result, but using only half of the code lines that have been used in this example. Anyway this example illustrates how prototyping works. Moreover, in this concrete example the prototyping of at least one of the two functions is necessary in order to compile the code without errors.

The first things that we see are the declarations of functions odd and even:

void odd (int a);
void even (int a); 

This allows these functions to be used before they are defined, for example, in main, which now is located where some people find it to be a more logical place for the start of a program: the beginning of the source code.

Anyway, the reason why this program needs at least one of the functions to be declared before it is defined is that in odd there is a call to even and in even there is a call to odd. If neither of the two functions had been previously declared, a compilation error would happen, since either odd would not be visible from even (because it has still not been declared), or even would not be visible from odd (for the same reason).

Having the prototypes of all functions together in the same place within the source code is found practical by some programmers, and this can be easily achieved by declaring all function prototypes at the beginning of a program.

Arrays and Vectors

When you write new code in C++, use vectors instead of arrays. They are almost always easier to work with.

In old C, the way to store a number of values of the same type (say integers) is to use an array, which can be thought of as a line of slots that we can fill in with values. In general, arrays have a fixed size that is determined at compile time (there are ways to avoid this problem!). C++ has vectors, arrays on steroids: think of these as expandable bags. You can load them with as many values as you like! If you write code in C++, you can get away without using arrays at all. But in old code written in C, arrays appear quite often. So we shall cover arrays first.

Arrays

An array is a series of elements of the same type placed in contiguous memory locations that can be individually referenced by adding an index to a unique identifier.

That means that, for example, we can store 5 values of type int in an array without having to declare 5 different variables, each one with a different identifier. Instead of that, using an array we can store 5 different values of the same type, int for example, with a unique identifier.

For example, an array to contain 5 integer values of type int called billy could be represented like this:

Image:9-imgarra1.gif

where each blank panel represents an element of the array, which in this case are integer values of type int. These elements are numbered from 0 to 4 since in arrays the first index is always 0, independently of its length.

Like a regular variable, an array must be declared before it is used. A typical declaration for an array in C++ is:

type name [elements];

where type is a valid type (like int, float...), name is a valid identifier and the elements field (which is always enclosed in square brackets []), specifies how many of these elements the array has to contain.

Therefore, in order to declare an array called billy as the one shown in the above diagram it is as simple as:

int billy [5];

NOTE: The elements field within brackets [], which represents the number of elements the array is going to hold, must be a constant value, since arrays are blocks of non-dynamic memory whose size must be determined before execution. In order to create arrays with a variable length, dynamic memory is needed, which is explained later in these tutorials.

Initializing arrays.

When declaring a regular array of local scope (within a function, for example), if we do not specify otherwise, its elements will not be initialized to any value by default, so their content will be undetermined until we store some value in them. The elements of global and static arrays, on the other hand, are automatically initialized with their default values, which for all fundamental types means they are filled with zeros.

In both cases, local and global, when we declare an array, we have the possibility to assign initial values to each one of its elements by enclosing the values in braces { }. For example:

int billy [5] = { 16, 2, 77, 40, 12071 }; 

This declaration would have created an array like this:

Image:9-imgarra3.gif

The number of values between braces { } must not be larger than the number of elements that we declare for the array between square brackets [ ]. For example, in the example of array billy we have declared that it has 5 elements and in the list of initial values within braces { } we have specified 5 values, one for each element.

When an initialization of values is provided for an array, C++ allows the possibility of leaving the square brackets empty [ ]. In this case, the compiler will assume a size for the array that matches the number of values included between braces { }:

int billy [] = { 16, 2, 77, 40, 12071 };

After this declaration, array billy would be 5 ints long, since we have provided 5 initialization values.

Accessing the values of an array.

In any point of a program in which an array is visible, we can access the value of any of its elements individually as if it was a normal variable, thus being able to both read and modify its value. The format is as simple as:

name[index]

Following the previous examples in which billy had 5 elements and each of those elements was of type int, the name which we can use to refer to each element is the following:

Image:9-imgarra2.gif

For example, to store the value 75 in the third element of billy, we could write the following statement:

billy[2] = 75;

and, for example, to pass the value of the third element of billy to a variable called a, we could write:

a = billy[2];

Therefore, the expression billy[2] is for all purposes like a variable of type int.

Notice that the third element of billy is specified billy[2], since the first one is billy[0], the second one is billy[1], and therefore, the third one is billy[2]. For the same reason, its last element is billy[4]. Therefore, if we write billy[5], we would be accessing the sixth element of billy and therefore exceeding the size of the array.

In C++ it is syntactically correct to exceed the valid range of indices for an array. This can create problems, since accessing out-of-range elements does not cause compilation errors but can cause runtime errors. The reason why this is allowed will be seen further ahead when we begin to use pointers.

At this point it is important to be able to clearly distinguish between the two uses that brackets [ ] have related to arrays. They perform two different tasks: one is to specify the size of arrays when they are declared; and the second one is to specify indices for concrete array elements. Do not confuse these two possible uses of brackets [ ] with arrays.

int billy[5];         // declaration of a new array

billy[2] = 75;        // access to an element of the array.

If you read carefully, you will see that a type specifier always precedes a variable or array declaration, while it never precedes an access.

Some other valid operations with arrays:

billy[0] = a;
billy[a] = 75;
b = billy [a+2];
billy[billy[a]] = billy[2] + 5;

// arrays example
#include <iostream>
using namespace std;

int billy [] = {16, 2, 77, 40, 12071};

int n, result=0;

int main ()
{
  for ( n=0 ; n<5 ; n++ )
  {
    result += billy[n];
  }
  cout << result;
  return 0;
}

12206

Multidimensional arrays

Multidimensional arrays can be described as "arrays of arrays". For example, a bidimensional array can be imagined as a bidimensional table made of elements, all of them of a same uniform data type.

Image:9-imgarra5.gif

jimmy represents a bidimensional array of 3 by 5 elements of type int. The way to declare this array in C++ would be:

int jimmy [3][5];

and, for example, the way to reference the second element vertically and fourth horizontally in an expression would be:

jimmy[1][3]

Image:9-imgarra6.gif

(remember that array indices always begin at zero).

Multidimensional arrays are not limited to two indices (i.e., two dimensions). They can contain as many indices as needed. But be careful! The amount of memory needed for an array rapidly increases with each dimension. For example:

char century [100][365][24][60][60];

declares an array with a char element for each second in a century, that is more than 3 billion chars. So this declaration would consume more than 3 gigabytes of memory!

Arrays as parameters

At some moment we may need to pass an array to a function as a parameter. In C++ it is not possible to pass a complete block of memory by value as a parameter to a function, but we are allowed to pass its address. In practice this has almost the same effect and it is a much faster and more efficient operation.

In order to accept arrays as parameters the only thing that we have to do when declaring the function is to specify in its parameters the element type of the array, an identifier and a pair of empty brackets []. For example, the following function:

void procedure (int arg[])

accepts a parameter of type "array of int" called arg. In order to pass to this function an array declared as:

int myarray [40];

it would be enough to write a call like this:

procedure (myarray);

Here you have a complete example:

// arrays as parameters
#include <iostream>
using namespace std;


void printarray (int arg[], int length) {
  for (int n=0; n<length; n++)
    cout << arg[n] << " ";
  cout << "\n";
}


int main ()
{
  int firstarray[] = {5, 10, 15};
  int secondarray[] = {2, 4, 6, 8, 10};
  printarray (firstarray,3);
  printarray (secondarray,5);
  return 0;
}

5 10 15
2 4 6 8 10

As you can see, the first parameter (int arg[]) accepts any array whose elements are of type int, whatever its length. For that reason we have included a second parameter that tells the function the length of each array that we pass to it as its first parameter. This allows the for loop that prints out the array to know the range to iterate in the passed array without going out of range.

In a function declaration it is also possible to include multidimensional arrays. The format for a tridimensional array parameter is:

base_type[][depth][depth]

for example, a function with a multidimensional array as argument could be:

void procedure (int myarray[][3][4])

Notice that the first brackets [] are left blank while the following ones are not. This is so because the compiler must be able to determine within the function which is the depth of each additional dimension.

Arrays, both simple and multidimensional, passed as function parameters are quite a common source of errors for novice programmers. I recommend reading the chapter about Pointers for a better understanding of how arrays operate.

Vectors -- Arrays made easy

Arrays are simple as long as their dimensions are fixed. What if the size of the array is determined by the data? While there are ways to overcome this issue, they tend to be somewhat complicated. An alternative is to use the vector structure that is available in C++. Vectors are way easier to use than traditional arrays. Let's start with an example.

/** Vector demonstration I
**/
#include <vector>
#include <iostream>
#include <string>
using namespace std;
 
int main ()
{
	vector<string> animals;
	do{
		string animal;
		cout << "An animal (Just Enter to end):";
		getline(cin, animal);
		if(animal==""){
			break;
		}
		animals.push_back(animal);
	}while(true);
	cout << "I got the following:\n";
	for(unsigned int i=0;i<animals.size();i++){
		cout << animals[i]<<'\n';
	}
 
}
An animal (Just Enter to end):rabbit
An animal (Just Enter to end):fox
An animal (Just Enter to end):chicken
An animal (Just Enter to end):
I got the following:
rabbit
fox
chicken

Almost always, vectors are better substitutes for arrays. Learn to use them effectively. However, one situation where you may have to use arrays is when dealing with C language (prior to C++) code libraries.

Let's go through this code:

#include <vector>
The directive needed (to include the vector header) if you want to use vectors.
vector<string> animals;
animals is a vector with string elements. (You can define vectors with any type of element, e.g. vector<int> age;.)
animals.push_back(animal);
Adds the string stored in the variable animal to the vector animals. (The vector grows by one element.)
animals.size()
The size of the vector. (i.e. number of elements.)
animals[i]
Elements of vectors can be accessed using the same notation that we use for arrays. (Alternatively you can use animals.at(i).)

Vectors of Vectors

It is possible to define vectors of vectors (of vectors ...) as follows.

vector < vector <int> > matrix; //defines a vector of vector of integers.

Following is an example of vector of vectors in use.

/* vectors of vectors */
#include <vector>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
 
int main ()
{
	vector < vector <double> > matrix; //matrix is a vector of, vector of doubles.
	vector <double> line;              //line is a vector of doubles
	matrix.push_back(line);			   //add a 'line' to the 'matrix'
	cout << "Enter your matrix. One number at a time.\n";
	cout << "'Enter' to break current line.\n";
	cout << "x'Enter' to end entering.\n";
 
	int j=0;             // index of the row currently being filled
	while(true){
		string mystr;
		getline (cin,mystr); //read what was entered. 
		if(mystr==""){       // if it is blank (just Enter)
			j++;                   // increase the row count. 
			matrix.push_back(line);// add a row to the matrix.
			cout << "Enter next row.\n";
			continue;	     // no need to waste time, start next iteration. 
		}
		if(mystr=="x"){	     // if it is "x"
			break;           // we are out!
		}
		// now we assume its a number. 
		//In reality we need a bit of error handling.
		//But let's keep things simple here. 
		double tmp;
		stringstream(mystr) >> tmp; //convert mystr to a double and store in tmp
		matrix[j].push_back(tmp);   //add that double (tmp) to row j of matrix.
	}
	cout << "Done entering!\n You entered the following matrix.\n";
	for (unsigned int j=0;j<matrix.size();j++){// for each row in matrix
		for(unsigned int i=0;i<matrix[j].size();i++){// for each place in jth row.
			cout << matrix[j][i] <<'\t';
		}
		cout << '\n';
	}
}

A typical run of this program would look like the following:

 
Enter your matrix. One number at a time.
'Enter' to break current line.
x'Enter' to end entering.
2
3

Enter next row.
4
5
6
7
8
9

Enter next row.
1

Enter next row.
25
26
x
Done entering!
 You entered the following matrix.
2       3
4       5       6       7       8       9
1
25      26

Vectors running wild!

If you reuse a vector, you first need to explicitly remove its old contents. Remember to empty your bags before refilling them!

Think of vectors as bags of unlimited space. They are very convenient because you don't have to know in advance how many items you are going to fill them with; they just keep on growing! However, this same property can lead to problems if you don't pay attention. One of the common mistakes made by new programmers is forgetting to empty the vector (bag) before putting in a new set of items (refilling the bag!).

Let's suppose you write a program where you use an array or a vector to store a number of items, and that within the program you do it several times. With arrays, when you do the following:

int vals[5];
...
vals[i]=5;

we explicitly say: replace slot number i of vals with the value 5. But with vectors

vector <int> vals;
...
vals.push_back(5);

what we say is: add the value 5 to the bag vals. Notice that if we don't need the old values, we have to erase them explicitly!

You can empty a whole vector with:

vals.clear();

Exercises

Problem 1

There is a large data set in the file moddata.txt with approximately 17 million lines. The format of data is as follows:

1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 1
11 1
12 0
13 -9999
14 0
15 0

The first column is a sequential number (starting at 1 and incremented by one on each line until the end of the file). The second column is a real number indicating a value. A value of -9999 indicates 'missing data'. Write a small program that computes the average of the value field, excluding missing data. The data can be downloaded from here: https://dimos.ihe.nl/public/pbcpp/moddata.zip

Problem 2

Population Densities of West-Asia

The image on the right shows the population density distribution for West Asia. The raw data used to create this image is stored in the file waspop.txt in the following format.

ncols         840
nrows         840
xllcorner     25
yllcorner     10
cellsize      0.04166666667
NODATA_value  -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999  ... ... 
.... ...
....

The full dataset is here: waspop.zip

There are 840 rows and 840 columns. The cellsize of the grid (data spacing) is 0.04166667 degrees (long/lat), i.e. 2.5 arc minutes. We need to write a program to (a) Process the data to a new resolution of 10 arc minutes (i.e. 0.1666667 degrees). (b) Try to generalize the program so that it can get a user input for the resolution multiplier (e.g. 4 in above case) and reprocess the data.

Note
-9999 indicates missing data. When more than 50% of the constituent cells of a new (larger) cell are missing data, that cell is set to -9999. Otherwise, we compute the average of the non-missing data and assign it to the new cell.
Note
In this type of problem, it is always useful to test your program with a smaller, manageable dataset. Such a fake dataset is given below.
ncols         17
nrows         17
xllcorner     25
yllcorner     10
cellsize      0.1
NODATA_value  -9999
5 5 5 5 6 7 8 9 9 1 9 9 9 9 9 9 9
5 5 5 5 6 7 8 9 9 1 9 9 9 9 9 9 9
1 5 1 5 6 7 8 9 9 1 9 9 9 9 9 9 9
5 5 5 1 6 7 8 9 9 9 9 9 9 9 9 9 9
5 5 5 5 6 7 8 9 9 9 9 9 9 9 9 9 9
5 5 5 5 6 7 8 9 9 9 9 2 9 9 9 9 9
5 5 5 5 6 6 8 9 9 9 9 9 9 9 9 9 9
5 5 5 2 6 7 3 9 4 9 9 9 3 9 4 9 9
5 5 5 5 6 7 8 9 9 3 9 9 9 9 9 9 9
5 5 5 5 6 7 8 9 9 9 2 9 2 9 9 9 9
5 5 5 5 6 7 8 9 9 9 9 9 9 9 3 9 9
5 5 5 5 6 7 8 9 9 9 9 9 9 8 9 9 9
5 5 5 5 6 7 1 9 1 9 9 9 9 9 9 9 9
5 5 5 5 6 7 8 9 9 9 9 9 9 9 9 9 9
5 5 5 5 6 7 8 9 9 9 9 3 9 1 9 9 9
5 5 5 5 6 7 1 9 9 9 9 9 9 9 9 9 9
1 5 5 5 6 7 8 9 9 9 9 9 9 9 9 9 1

Answer

#include <iostream>
#include <fstream>
#include <vector>
#include <cmath>
#include <cstdlib>
#include <string>
 
using namespace std;
string head="../myprograms/lesson8/";
float missfrac=.5; // if more than missfrac values are missing, then mark the target cell as missing. 
int readdata(string file) ;
void writedata();
//here we define a global vectors to hold the original data and resampled data;
vector< vector <float> > ordata, rsdata;
// some global variables to hold the header information;
int num_cols, num_rows,  nodat;
float xllc, yllc, cells;
 
int main(){
	cout << "Your data file should be in the folder : " << head<<"\n";
	string tmp="waspop.asc";
	cout << "Data file name should be: "  <<  tmp <<"\n Press any key to continue...\n";
	cin.get();
	string filename=head+tmp;
	int out=readdata(filename);
	if(out<0){
		cout << "Error in reading data. Check your data file : "<< filename<<", and its location. I quit!";
		exit(1);
	}
	writedata();
 
}
 
int readdata(string file) {
	ifstream strm;
	strm.open(file.c_str());
	if(strm.is_open()){
		//read data. 
		//first read the header
		/*
		ncols         840
		nrows         840
		xllcorner     25
		yllcorner     10
		cellsize      0.04166666667
		NODATA_value  -9999
		*/
		char dumb[500];
		strm >> dumb; strm >> num_cols;
		strm >> dumb; strm >> num_rows; 
		strm >> dumb; strm >> xllc;
		strm >> dumb; strm >> yllc;
		strm >> dumb; strm >> cells;
		strm >> dumb; strm >> nodat;
		cout << " (num_cols, num_rows,  nodat; float xllc, yllc, cells) \n= \n("
			<< num_cols<< ",   " << num_rows<< ",   " <<  nodat<< ",   " 
			<< xllc<< ",   " << yllc<< ",   " << cells << ")\n";
		//now we know the number of columns and rows. So, start reading data
		for (int ii=0;ii<num_rows;ii++){
			vector <float> tmp;
			for(int jj=0;jj<num_cols;jj++){
				float val;
				strm >> val;
				tmp.push_back(val);
			}
			ordata.push_back(tmp);
		}
		strm.close(); // done reading, so release the file before returning
		return 1;
	}else{
		cerr << "Could not open the file " << file.c_str() <<"\n";
		return -1;
	}
}
 
void writedata(){
 
	//now ask a deltan value
	cout << "Enter a grid average value (integer >1) :";
	int deltan;
	cin >> deltan;
	cout << "Now enter a name for the output file:";
	string outfile;
	cin >> outfile;
	outfile=head+outfile;
	ofstream fstr;
	fstr.open(outfile.c_str());
	if( ! (fstr.is_open() ) ){
		cout << "I can not open your file : " << outfile << ". Please check the path.\n I quit!.";
		return; // nothing can be written, so stop here
	}
			/*
		ncols         840
		nrows         840
		xllcorner     25
		yllcorner     10
		cellsize      0.04166666667
		NODATA_value  -9999
		*/
	fstr << "ncols " << floor((float)num_cols/deltan)<< "\n";
	fstr << "nrows " << floor((float)num_rows/deltan)<< "\n";
 
		int lastc=(int)floor((float)num_cols/deltan)*deltan;
	    int lastr=(int)floor((float)num_rows/deltan)*deltan;
		for (int ii=0; ii<lastr; ii=ii+deltan) {
			for (int jj=0;jj<lastc;jj+=deltan) {
				//now we are at (ii, jj)  value
				/* need to average the following sub-matrix. 
				(ii,jj)		(ii+1,jj),		.....								(ii+deltan-1,jj)
				(ii,jj+1)	(ii+1,jj+1), .....							 (ii+deltan-1,jj+1)
				....					 ...	...
				....					...
				(ii,jj+deltan-1) (ii+1,jj+deltan-1), .....  (ii+deltan-1,jj+deltan-1)
				*/
				int ct=0; double sum=0;
 
				for(int m=ii;m<ii+deltan;m++){ 
					for(int n=jj;n<jj+deltan;n++){
						if( ! (ordata[m][n]==nodat) ){
						ct++;
						sum+=ordata[m][n];
						}
					}
				}
					double avg;
					if(ct>=missfrac*deltan*deltan){
						avg=sum/ct;
					}else{
						avg=nodat; // use the NODATA value read from the header
					}
					fstr << avg << " ";
				//close the outer for loops
				}
			fstr << "\n";
			}
		fstr.close();
 
		}

Problem 3

(For advanced users) We are going to write a program that can solve a set of simultaneous linear equations. What the program does is

  • Read a file that has equation specifications
  • Solve them and write the answers to a file.

Let's assume the input format is as follows:

n
c11 c12 c13 ... c1n
c21 c22 c23 ... c2n
...
...
cn1 cn2 cn3 ... cnn
rh1
rh2
.
.
rhn

We use a C++ library called ALGLIB to solve this problem.

  1. First download ALGLIB (http://www.alglib.net/download.php C++ version).
  2. We use the rmatrixinverse function of the ALGLIB library to invert the matrix c11..cnn. Here's an example of how to do matrix inversion: [1]. Note that ALGLIB defines a new data type, real_2d_array, and that type is used in rmatrixinverse.
  3. Write a program that reads the above file into C++. Read the matrix c11..cnn into a real_2d_array and rh1..rhn into a real_1d_array of appropriate size. Hint: You might want to read the section on "Working with vectors and matrices" in the ALGLIB manual [2].
  4. Invert the matrix, multiply it by rh1..rhn, and write the answer to a file.

Character Sequences

As you may already know, the C++ Standard Library implements a powerful string class, which is very useful for handling and manipulating strings of characters. However, because strings are in fact sequences of characters, we can also represent them as plain arrays of char elements.

For example, the following array:

char jenny [20];

is an array that can store up to 20 elements of type char. It can be represented as:

Image:10-imgstri1.gif

Therefore, in this array, in theory, we can store sequences of characters up to 20 characters long. But we can also store shorter sequences. For example, jenny could store at some point in a program either the sequence "Hello" or the sequence "Merry Christmas", since both are shorter than 20 characters.

Therefore, since the array of characters can store shorter sequences than its total length, a special character is used to signal the end of the valid sequence: the null character, whose literal constant can be written as '\0' (backslash, zero).

Our array of 20 elements of type char, called jenny, can be represented storing the character sequences "Hello" and "Merry Christmas" as:

Image:10-imgstri2.gif

Notice how after the valid content a null character ('\0') has been included in order to indicate the end of the sequence. The panels in gray color represent char elements with undetermined values.

Initialization of null-terminated character sequences

Because arrays of characters are ordinary arrays, they follow all the same rules. For example, if we want to initialize an array of characters with some predetermined sequence of characters, we can do it just like any other array:

char myword[] = { 'H', 'e', 'l', 'l', 'o', '\0' }; 

In this case we would have declared an array of 6 elements of type char initialized with the characters that form the word "Hello" plus a null character '\0' at the end.

But arrays of char elements have an additional method to initialize their values: using string literals.

In the expressions we have used in some examples in previous chapters, constants that represent entire strings of characters have already shown up several times. These are specified by enclosing the text between double quotes ("). For example:

"the result is: "

is a constant string literal that we have probably used already.

Double quoted strings (") are literal constants whose type is in fact a null-terminated array of characters. So string literals enclosed between double quotes always have a null character ('\0') automatically appended at the end.

Therefore we can initialize the array of char elements called myword with a null-terminated sequence of characters by either one of these two methods:

char myword [] = { 'H', 'e', 'l', 'l', 'o', '\0' };
char myword [] = "Hello";

In both cases the array of characters myword is declared with a size of 6 elements of type char: the 5 characters that compose the word "Hello" plus a final null character ('\0') that specifies the end of the sequence. In the second case, when using double quotes ("), the null character is appended automatically.

Please notice that we are talking about initializing an array of characters at the moment it is declared, not about assigning values to it once it has been declared. In fact, because null-terminated arrays of characters are regular arrays, they have the same restrictions as any other array, so we cannot copy blocks of data with an assignment operation.

Assuming mystext is a char[] variable, expressions within a source code like:

mystext = "Hello";
mystext[] = "Hello";

would not be valid, and neither would:

mystext = { 'H', 'e', 'l', 'l', 'o', '\0' };

The reason for this may become clearer once you know a bit more about pointers: the name of an array behaves much like a constant pointer to a block of memory, so it cannot be made to refer to something else.

Using null-terminated sequences of characters

Null-terminated sequences of characters are the natural way of treating strings in C++, so they can be used as such in many procedures. In fact, regular string literals are themselves null-terminated arrays of (constant) char elements and can also be used in most cases.

For example, cin and cout support null-terminated sequences as valid containers for sequences of characters, so they can be used directly to extract strings of characters from cin or to insert them into cout. For example:

// null-terminated sequences of characters

#include <iostream>
using namespace std;

int main ()
{
  char question[] = "Please, enter your first name: ";
  char greeting[] = "Hello, ";
  char yourname [80];
  cout << question;
  cin >> yourname;
  cout << greeting << yourname << "!";
  return 0;
}

Please, enter your first name: John
Hello, John!

As you can see, we have declared three arrays of char elements. The first two were initialized with string literal constants, while the third one was left uninitialized. In every case, we have to specify the size of the array: for the first two (question and greeting) the size was implicitly defined by the length of the literal constant they were initialized with, while for yourname we have explicitly specified a size of 80 chars.

Finally, sequences of characters stored in char arrays can easily be converted into string objects just by using the assignment operator:

string mystring;
char myntcs[]="some text";
mystring = myntcs;

Pointers

The concept of pointers is a simple one; however, in the old C language pointers are perhaps the single most common cause of program failure in code written by novice programmers. A good way to start, eh? Well, pointers can be troublemakers, but the good news is that, unlike C, C++ offers many ways to avoid using them. The rule of the game of pointers (at least for the 'rest of us') is: don't use them unless absolutely necessary.

Having said that, using pointers for essential tasks is not difficult, and the rules that govern how they work are pretty simple. The rules can be combined to get complex results, but the individual rules remain simple.

Before we go any further, let's define what a 'pointer' is.

Pointer
A pointer is a programming language data type whose value refers directly to (or “points to”) another value stored elsewhere in the computer memory using its address.

A good analogy is the relationship between a web page and its URL (address). For example, http://en.wikipedia.org/wiki/Banana is the URL of the page describing bananas on Wikipedia. If we know this address, we can reach the Banana Page. However, http://en.wikipedia.org/wiki/Banana is not the Banana Page, but only the address of that page.

Similarly, a pointer in computer jargon is a reference to (or the address of) something.

Pointers and Pointees

Remember this: addresses are pointers. The stuff referred to by those addresses (the Banana Page in the case of http://en.wikipedia.org/wiki/Banana, or your home in the case of your postal address) is the pointee.

In C/C++ we denote this as follows:

/* Pointer/ Pointee demo */
#include<iostream>
using namespace std;
int main(){
	int *k; // the pointee of k is an integer. 
	int y;  // y is an integer
	y=0;    // set y to zero
	k=&y;   // point k to the address of y (y is pointee of k now)
	*k=40;  // set the value of pointee of k (ah, ha!) to 40. 
	cout << y <<"\n"; // print y. 
	cout << (*k) <<"\n"; // print pointee of k
}

The notation * can be read as pointee of and & as address of.

What k=&y; *k=40; does is equivalent to y=40, in a very roundabout way! This code shows one of the major issues of pointer usage: where there are pointers, there are hidden links. If we don't keep track of these links, we are inviting trouble!

If you are interested in a more in-depth coverage of the pointers in general make a detour to this link. The rest of this article covers only two important aspects of pointers.

Using Pointers to get values out of functions

The most straightforward way to get the result of a function is its return value. But how do you get several values out of a function? One way is to use Data Structures, a topic we are yet to cover. Another is to use pointers. See the following example:

#include <iostream>
using namespace std;
void add(int *a){ // pointee of a is an integer
	(*a)+=5;		  // add 5 to pointee of a
}
 
int main ()
{
	int val=0;      // val is zero
	int *k;         // k's pointee is an integer.
	k=&val;  //now k points to the address of val
	add(k);      // pass k, i.e. address of val
	cout << val;    // val has changed! 
}

Identical results can be obtained with a reference parameter:

#include <iostream>
using namespace std;
void add(int& a){ // whatever value passed in the place of a is taken 'by reference'
	a+=5;		  // add 5 to a (pointing to the same 'memory location' as val in the main program)
}
 
int main ()
{
	int val=0;      // val is zero
	add(val);      // pass val
	cout << val;    // val has changed! 
}

The following section is a formal explanation of what is happening.

Passing by value or by reference

In all the simple functions we have seen in the section Functions and thereafter, the arguments have been passed by value. This means that when calling a function with parameters, what we passed to the function were copies of their values, never the variables themselves. For example, in the following code:

// function example
#include <iostream>
using namespace std;
 
 
int addition (int a, int b)
{
  int r;
  r=a+b;
  return (r);
}
 
int main ()
{
  int x=5, y=3, z;
  z = addition (x,y);
  cout << "The result is " << z;
  return 0;
}

What we did in this case was to call the function addition, passing the values of x and y, i.e. 5 and 3 respectively, but not the variables x and y themselves.

Image:8-imgfunc1.gif

This way, when the function addition is called, the value of its local variables a and b become 5 and 3 respectively, but any modification to either a or b within the function addition will not have any effect in the values of x and y outside it, because variables x and y were not themselves passed to the function, but only copies of their values at the moment the function was called.

But there might be some cases where you need to manipulate from inside a function the value of an external variable. For that purpose we can use arguments passed by reference, as in the function duplicate of the following example:

// passing parameters by reference

#include <iostream>
using namespace std;

void duplicate (int& a, int& b, int& c)
{
  a*=2;
  b*=2;
  c*=2;
}


int main ()
{
  int x=1, y=3, z=7;
  duplicate (x, y, z);
  cout << "x=" << x << ", y=" << y << ", z=" << z;
  return 0;
}

x=2, y=6, z=14

The first thing that should call your attention is that in the declaration of duplicate the type of each parameter was followed by an ampersand sign (&). This ampersand is what specifies that their corresponding arguments are to be passed by reference instead of by value.

When a variable is passed by reference we are not passing a copy of its value, but we are somehow passing the variable itself to the function and any modification that we do to the local variables will have an effect in their counterpart variables passed as arguments in the call to the function.

Image:8-imgfunc3.gif

To explain it in another way, we associate a, b and c with the arguments passed on the function call (x, y and z) and any change that we do on a within the function will affect the value of x outside it. Any change that we do on b will affect y, and the same with c and z.

That is why our program's output, that shows the values stored in x, y and z after the call to duplicate, shows the values of all the three variables of main doubled.

If when declaring the following function:

void duplicate (int& a, int& b, int& c)

we had declared it this way:

void duplicate (int a, int b, int c)

i.e., without the ampersand signs (&), we would have not passed the variables by reference, but a copy of their values instead, and therefore, the output on screen of our program would have been the values of x, y and z without having been modified.

Passing by reference is also an effective way to allow a function to return more than one value. For example, here is a function that returns the previous and next numbers of the first parameter passed.

// more than one returning value

#include <iostream>
using namespace std;

void prevnext (int x, int& prev, int& next)
{
  prev = x-1;
  next = x+1;
}


int main ()
{
  int x=100, y, z;
  prevnext (x, y, z);
  cout << "Previous=" << y << ", Next=" << z;
  return 0;
}

Previous=99, Next=101

Passing functions as pointers

Note
This section can be skipped without much harm!

We can pass whole functions as arguments to other functions using pointers. See the following example.

#include <iostream>
using namespace std;
int oper(int one, int two, int (*myfunc)(int,int)){ // pointee of myfunc is a function 
						    //and takes two integer arguments
						    // and return an integer. 
	return (*myfunc)(one,two);		    // &x - 'pointer of x is..'
						    // *x - 'pointee of x is..'
 
}
 
int bigger(int a, int b){
	if(a>b){
		return a;
	}else{
		return b;
	}
}
 
int smaller(int a, int b){
	if(a>b){
		return b;
	}else{
		return a;
	}
}
 
int main ()
{
	int a=5, b=10;
	cout << oper(a,b,&bigger)<<'\n';
	cout << oper(a,b,&smaller)<<'\n';
}

Data Structures

We have already learned how groups of sequential data can be used in C++. But this is somewhat restrictive, since on many occasions what we want to store are not mere sequences of elements of the same data type, but sets of different elements with different data types.

Data structures

A data structure is a group of data elements grouped together under one name. These data elements, known as members, can have different types and different lengths. Data structures are declared in C++ using the following syntax:

struct structure_name {
member_type1 member_name1;
member_type2 member_name2;
member_type3 member_name3;
.
.
} object_names;

where structure_name is a name for the structure type and object_names is an optional list of valid identifiers for objects that have the type of this structure. Within braces { } there is a list of the data members, each one specified with a type and a valid identifier as its name.

The first thing we have to know is that a data structure creates a new type: once a data structure is declared, a new type with the identifier specified as structure_name is created and can be used in the rest of the program as if it were any other type. For example:

struct product {
  int weight;
  float price;
} ;

product apple;
product banana, melon;

We have first declared a structure type called product with two members: weight and price, each of a different fundamental type. We have then used this name of the structure type (product) to declare three objects of that type: apple, banana and melon as we would have done with any fundamental data type.

Once declared, product has become a new valid type name like the fundamental ones int, char or short and from that point on we are able to declare objects (variables) of this compound new type, like we have done with apple, banana and melon.

Right at the end of the struct declaration, and before the ending semicolon, we can use the optional field object_names to directly declare objects of the structure type. For example, we can also declare the structure objects apple, banana and melon at the moment we define the data structure type this way:

struct product {
  int weight;
  float price;
} apple, banana, melon;

It is important to clearly differentiate between what is the structure type name, and what is an object (variable) that has this structure type. We can instantiate many objects (i.e. variables, like apple, banana and melon) from a single structure type (product).

Once we have declared our three objects of a determined structure type (apple, banana and melon) we can operate directly with their members. To do that we use a dot (.) inserted between the object name and the member name. For example, we could operate with any of these elements as if they were standard variables of their respective types:

apple.weight
apple.price
banana.weight
banana.price
melon.weight
melon.price

Each one of these has the data type corresponding to the member they refer to: apple.weight, banana.weight and melon.weight are of type int, while apple.price, banana.price and melon.price are of type float.

Let's see a real example where you can see how a structure type can be used in the same way as fundamental types:

// example about structures

#include <iostream>
#include <string>
#include <sstream>
using namespace std;

struct movies_t {
  string title;
  int year;
} mine, yours;


void printmovie (movies_t movie);

int main ()
{
  string mystr;

  mine.title = "2001 A Space Odyssey";
  mine.year = 1968;

  cout << "Enter title: ";
  getline (cin,yours.title);
  cout << "Enter year: ";
  getline (cin,mystr);
  stringstream(mystr) >> yours.year;

  cout << "My favorite movie is:\n ";
  printmovie (mine);
  cout << "And yours is:\n ";
  printmovie (yours);
  return 0;
}


void printmovie (movies_t movie)
{
  cout << movie.title;
  cout << " (" << movie.year << ")\n";
}
 
Enter title: Alien
Enter year: 1979

My favorite movie is:
 2001 A Space Odyssey (1968)
And yours is:
 Alien (1979)

The example shows how we can use the members of an object as regular variables. For example, the member yours.year is a valid variable of type int, and mine.title is a valid variable of type string.

The objects mine and yours can also be treated as valid variables of type movies_t, for example we have passed them to the function printmovie as we would have done with regular variables. Therefore, one of the most important advantages of data structures is that we can either refer to their members individually or to the entire structure as a block with only one identifier.

Data structures are a feature that can be used to represent databases, especially if we consider the possibility of building arrays of them:

// array of structures

#include <iostream>
#include <string>
#include <sstream>
using namespace std;

#define N_MOVIES 3

struct movies_t {
  string title;
  int year;
} films [N_MOVIES];

void printmovie (movies_t movie);

int main ()
{
  string mystr;
  int n;

  for (n=0; n<N_MOVIES; n++)
  {
    cout << "Enter title: ";
    getline (cin,films[n].title);
    cout << "Enter year: ";
    getline (cin,mystr);
    stringstream(mystr) >> films[n].year;
  }

  cout << "\nYou have entered these movies:\n";
  for (n=0; n<N_MOVIES; n++)
    printmovie (films[n]);
  return 0;
}


void printmovie (movies_t movie)
{
  cout << movie.title;
  cout << " (" << movie.year << ")\n";
}
 
Enter title: Blade Runner
Enter year: 1982
Enter title: Matrix
Enter year: 1999
Enter title: Taxi Driver
Enter year: 1976
 
You have entered these movies:
Blade Runner (1982)
Matrix (1999)
Taxi Driver (1976)


Input/Output with files

C++ provides the following classes to perform output and input of characters to/from files:

  • ofstream: Stream class to write on files
  • ifstream: Stream class to read from files
  • fstream: Stream class to both read and write from/to files.

These classes are derived directly or indirectly from the classes istream and ostream. We have already used objects whose types were these classes: cin is an object of class istream and cout is an object of class ostream. Therefore, we have already been using classes that are related to our file streams. In fact, we can use our file streams the same way we already use cin and cout, with the only difference that we have to associate these streams with physical files. Let's see an example:

// basic file operations

#include <iostream>
#include <fstream>
using namespace std;

int main () {
  ofstream myfile;
  myfile.open ("example.txt");
  myfile << "Writing this to a file.\n";
  myfile.close();
  return 0;
}

[file example.txt]
Writing this to a file.

This code creates a file called example.txt and inserts a sentence into it in the same way we are used to doing with cout, but using the file stream myfile instead.

But let's go step by step:

Open a file

The first operation generally performed on an object of one of these classes is to associate it with a real file. This procedure is known as opening a file. An open file is represented within a program by a stream object (an instantiation of one of these classes; in the previous example this was myfile), and any input or output operation performed on this stream object will be applied to the physical file associated with it.

In order to open a file with a stream object we use its member function open():

open (filename, mode);

Where filename is a null-terminated character sequence of type const char * (the same type that string literals have) representing the name of the file to be opened, and mode is an optional parameter with a combination of the following flags:

flag         meaning
ios::in      Open for input operations.
ios::out     Open for output operations.
ios::binary  Open in binary mode.
ios::ate     Set the initial position at the end of the file. If this flag is not set, the initial position is the beginning of the file.
ios::app     All output operations are performed at the end of the file, appending the content to the current content of the file. This flag can only be used in streams open for output-only operations.
ios::trunc   If the file opened for output operations already existed before, its previous content is deleted and replaced by the new one.

All these flags can be combined using the bitwise operator OR (|). For example, if we want to open the file example.bin in binary mode to add data we could do it by the following call to member function open():

ofstream myfile;
myfile.open ("example.bin", ios::out | ios::app | ios::binary); 

Each one of the open() member functions of the classes ofstream, ifstream and fstream has a default mode that is used if the file is opened without a second argument:

class      default mode parameter
ofstream   ios::out
ifstream   ios::in
fstream    ios::in | ios::out

For the ifstream and ofstream classes, ios::in and ios::out are automatically and respectively assumed, even if a mode that does not include them is passed as the second argument to the open() member function.

The default value is only applied if the function is called without specifying any value for the mode parameter. If the function is called with any value in that parameter the default mode is overridden, not combined.

File streams opened in binary mode perform input and output operations independently of any format considerations. Non-binary files are known as text files, and some translations may occur due to formatting of some special characters (like newline and carriage return characters).

Since the first task that is performed on a file stream object is generally to open a file, these three classes include a constructor that automatically calls the open() member function and has the exact same parameters as this member. Therefore, we could also have declared the previous myfile object and conducted the same opening operation by writing:

ofstream myfile ("example.bin", ios::out | ios::app | ios::binary);

This combines object construction and stream opening in a single statement. Both forms of opening a file are valid and equivalent.

To check whether a file stream has successfully opened a file, call its member function is_open(), which takes no arguments. It returns a bool value of true if the stream object is associated with an open file, or false otherwise:

if (myfile.is_open()) { /* ok, proceed with output */ }
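
To see how the mode flags and is_open() work together, the following sketch may help. The helpers write_line and count_lines are hypothetical names introduced only for this illustration; the first opens a file with either ios::app or ios::trunc, and the second counts the lines that ended up in the file.

```cpp
#include <fstream>
#include <string>
using namespace std;

// Hypothetical helper: writes one line to the file, either adding it to the
// existing content (ios::app) or replacing that content (ios::trunc).
// Returns false if the file could not be opened.
bool write_line (const char * filename, const string & text, bool append)
{
  ofstream out;
  if (append)
    out.open (filename, ios::out | ios::app);
  else
    out.open (filename, ios::out | ios::trunc);
  if (!out.is_open()) return false;
  out << text << "\n";
  out.close();
  return true;
}

// Hypothetical helper: counts the lines of a text file,
// or returns -1 if the file cannot be opened.
int count_lines (const char * filename)
{
  ifstream in (filename);
  if (!in.is_open()) return -1;
  string line;
  int lines = 0;
  while (getline (in, line)) lines++;
  return lines;
}
```

Calling write_line twice with append set to true leaves two lines in the file, while a later call with append set to false leaves only the newest line.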

Closing a file

When we are finished with our input and output operations on a file we shall close it so that its resources become available again. In order to do that we have to call the stream's member function close(). This member function takes no parameters, and what it does is to flush the associated buffers and close the file:

myfile.close();

Once this member function is called, the stream object can be used to open another file, and the file is available again to be opened by other processes.

If a stream object is destroyed while still associated with an open file, its destructor automatically calls the member function close().
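
As a small illustration of reusing a stream object after close(), consider the sketch below. The function write_two_files and its parameter names are assumptions made for this example only.

```cpp
#include <fstream>
#include <string>
using namespace std;

// Hypothetical helper: writes the same text to two files using a single
// stream object. Returns false if either file cannot be opened.
bool write_two_files (const char * first, const char * second, const string & text)
{
  ofstream out;

  out.open (first);
  if (!out.is_open()) return false;
  out << text;
  out.close();            // flushes the buffer and releases the first file

  out.open (second);      // the same object can now open another file
  if (!out.is_open()) return false;
  out << text;
  out.close();
  return true;
}
```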

Text files

Text file streams are those where we do not include the ios::binary flag in their opening mode. These files are designed to store text and thus all values that we input or output from/to them can suffer some formatting transformations, which do not necessarily correspond to their literal binary value.

Data output operations on text files are performed in the same way we operated with cout:

// writing on a text file
#include <iostream>
#include <fstream>
using namespace std;

int main () {
  ofstream myfile ("example.txt");
  if (myfile.is_open())
  {
    myfile << "This is a line.\n";
    myfile << "This is another line.\n";
    myfile.close();
  }
  else cout << "Unable to open file";
  return 0;
}

[file example.txt]
This is a line.
This is another line.

Data input from a file can also be performed in the same way that we did with cin:

// reading a text file
#include <iostream>
#include <fstream>
#include <string>
using namespace std;


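int main () {
  string line;
  ifstream myfile ("example.txt");
  if (myfile.is_open())
  {
    // getline returns the stream, which tests as false once a read fails
    while ( getline (myfile,line) )
    {
      cout << line << endl;
    }
    myfile.close();
  }
  else cout << "Unable to open file";

  return 0;
}

This is a line.
This is another line.

This last example reads a text file and prints out its content on the screen. Notice how the call to getline is itself the condition of the while loop: getline returns the stream, and a stream used as a condition is true only while the last read succeeded. The loop therefore finishes as soon as there is nothing left to read. File streams also have a member function called eof() that returns true once the end of the file has been reached. However, eof() only becomes true after a read has already run into the end of the file, so a loop of the form while (!myfile.eof()) can process one extra, empty line when the file ends with a newline.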
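
A read loop that tests getline directly can also collect the lines instead of printing them. The sketch below assumes a hypothetical helper named read_lines that returns the content of a file as a vector of strings.

```cpp
#include <fstream>
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper: returns every line of the file, in order.
// A file that cannot be opened yields an empty vector.
vector<string> read_lines (const char * filename)
{
  vector<string> lines;
  ifstream myfile (filename);
  string line;
  while (getline (myfile, line))   // stops at the end of the file or on an error
    lines.push_back (line);
  return lines;
}
```

For the example.txt written earlier, read_lines ("example.txt") would return two strings, one per line.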

Reading Numbers

Suppose you have a text file with numbers, like the example shown below:

23 23.343 23
54 22.3   33
...
...

Then if you need to read these values as numbers, it can be done as follows:

#include <iostream>
#include <fstream>
using namespace std;

int main () {
  ifstream myfile ("example.txt");
  int num1;
  float num2;
  int num3;
  if (myfile.is_open())
  {
    // the loop ends when a row can no longer be read completely
    while (myfile >> num1 >> num2 >> num3)
    {
      cout << num1 << "," << num2 << "," << num3 << '\n';
    }
    myfile.close();
  }
  else cout << "Unable to open file";

  return 0;
}
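
Once the values arrive as numbers, they can be used in calculations directly. As a rough sketch, the hypothetical helper sum_middle_column below adds up the floating-point column of a file laid out like the one above (an integer, a floating-point number and an integer per row).

```cpp
#include <fstream>
using namespace std;

// Hypothetical helper: adds up the middle column of a file whose rows look
// like "int float int". The sum is stored in total and the number of
// complete rows read is returned.
int sum_middle_column (const char * filename, double & total)
{
  ifstream myfile (filename);
  int num1, num3;
  double num2;
  int rows = 0;
  total = 0.0;
  while (myfile >> num1 >> num2 >> num3)   // false at end of file or on bad input
  {
    total += num2;
    rows++;
  }
  return rows;
}
```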

Checking state flags

In addition to eof(), which checks if the end of file has been reached, other member functions exist to check the state of a stream (all of them return a bool value):

bad()
Returns true if a reading or writing operation fails. For example in the case that we try to write to a file that is not open for writing or if the device where we try to write has no space left.
fail()
Returns true in the same cases as bad(), but also in the case that a format error happens, like when an alphabetical character is extracted when we are trying to read an integer number.
eof()
Returns true if a file open for reading has reached the end.
good()
It is the most generic state flag: it returns false in the same cases in which calling any of the previous functions would return true.

In order to reset the state flags checked by any of these member functions we have just seen we can use the member function clear(), which takes no parameters.
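
The sketch below shows one way these flags and clear() might be combined to skip over bad input. It reads from an istringstream rather than a file, only so that the example needs no input file; string streams carry the same state flags. The function name sum_integers is an assumption made for this illustration.

```cpp
#include <sstream>
#include <string>
using namespace std;

// Hypothetical helper: adds up the integers found in a text and skips
// words that cannot be read as a number.
int sum_integers (const string & text)
{
  istringstream in (text);
  int sum = 0;
  int value;
  while (true)
  {
    if (in >> value) { sum += value; continue; }
    if (in.eof() || in.bad()) break;   // nothing left to read, or a serious error
    in.clear();                        // format error: reset the flags ...
    string junk;
    in >> junk;                        // ... and discard the offending word
  }
  return sum;
}
```

Without the call to clear(), the stream would stay in the failed state after the first format error and every later extraction would fail as well.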

Binary files

Binary files are very useful for reading and writing large datasets efficiently. However, this topic is beyond the scope of this lesson.

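A program can also receive information from the command line that starts it. Look at the following code, which always greets the same name:

#include <stdio.h>

int main(void){

	printf("hello %s\n", "john");
}

The next version takes the name from the first command-line argument. argc holds the number of arguments and argv holds the arguments themselves, with argv[0] being the name of the program:

#include <stdio.h>

int main(int argc, char *argv[]){

	printf("hello %s\n", argv[1]);
}

If the program is started without an argument, argv[1] is a null pointer and must not be printed. The final version checks argc first and prints a usage message instead:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){

	if(argc<2){
		printf("Usage: %s <argument>\n", argv[0]);
		exit(0);
	}

	printf("hello %s\n", argv[1]);
}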
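
The same check can be written in C++ style. The following sketch assumes a hypothetical helper called make_greeting that receives argc and argv from main and returns either the greeting or the usage message as a string.

```cpp
#include <string>
using namespace std;

// Hypothetical helper: builds the greeting from the command-line arguments,
// or a usage message if no argument was supplied.
string make_greeting (int argc, const char * const argv[])
{
  if (argc < 2)
    return string ("Usage: ") + argv[0] + " <name>";
  string greeting = "hello";
  for (int i = 1; i < argc; i++)   // argv[0] is the program name itself
    greeting = greeting + " " + argv[i];
  return greeting;
}
```

From main (int argc, char *argv[]) it could be used as cout << make_greeting (argc, argv) << '\n';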

Storage Specifiers

In a previous section we touched upon the scope of variables (storage) in the C/C++ language. There is quite a bit more to it, and this section briefly covers the rest of the important parts of this subject.

Technically, the specifier that comes before the type specifier (e.g. int) of a variable is called the storage class specifier. The storage class influences the nature of the memory allocated to the variable.

Storage class auto

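This is the most widely used storage class in any program, but also the one that is invisible to the reader! That is because, when no storage class is specified for a variable of limited scope, the compiler always assumes the class to be auto. For example, the following two declarations are identical in C and in C++ compilers that predate the C++11 standard. (C++11 gave the keyword auto a different meaning, automatic type deduction, so newer C++ compilers reject the second form.)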

int counter;
auto int counter;

The auto class tells the compiler that this is a "vanilla" variable: it comes into scope when program control enters the block in which it is defined and goes out of scope when control leaves that block.

Here's an example:

#include <iostream>
using namespace std;
 
void greet(){
    auto const char * name="becky"; /* we normally do not write auto here because it is always implied
                            when there is no storage class specified. Compilers that follow
                            C++11 or later reject auto in this position, so simply omit it there. */
    cout << "Hello : " << name;
}
 
int main(){
    greet();
}

The variable name is of storage class auto. It is local to the scope of the function greet().

Now, we can get identical behavior with the following variant:

#include <iostream>
using namespace std;
 
const char * name="becky"; /* This variable has file scope. So, it CAN NOT have auto storage class */
 
void greet(){
    cout << "Hello : " << name;
}
 
int main(){
    greet();
}

Note that the variable name is now a global variable. More specifically, we say that it has file scope, meaning that any function within the file in which it is defined has access to this variable. (Note: Global variables cannot be auto. If you specify auto in this variant, the compiler will complain.)

Summary
You can only apply the auto storage class specifier to names of variables declared in a block or to names of function parameters. However, these variables are always auto, so the specification of the storage class (the word auto) in source code is redundant.

Storage class static

Look at the following code:

#include <iostream>
using namespace std;
 
 
void counter(){
    int ct=0;
    cout << "Counting " << ct << '\n';
    ct=ct+1; /* this line is meaningless in the current context*/
 
}
 
int main(){
    for(int i=0;i<10;i++){
        counter();
    }
}

The function counter does the following:

  • Defines the variable ct
  • Prints the current value of ct
  • Increases ct by one.

We call the counter function ten times. If you run this program, the output will be something like the following:

Counting 0
Counting 0
Counting 0
Counting 0
Counting 0
Counting 0
Counting 0
Counting 0
Counting 0
Counting 0

You already knew this would happen! The reason is that the variable ct is of the auto class and is in the scope of the function counter, so it ceases to exist as soon as the program leaves counter. Each time the program enters counter, a fresh ct is created and set to 0. ct is then increased by one, but the new value is immediately lost, making the third line in the function rather meaningless!

Now, look at the following variant:

#include <iostream>
using namespace std;
 
 
void counter(){
    static int ct=0;
    cout << "Counting " << ct << '\n';
    ct=ct+1; /* now this line matters: the new value survives until the next call */
 
}
 
int main(){
    for(int i=0;i<10;i++){
        counter();
    }
}

Now the output will be as follows:

Counting 0
Counting 1
Counting 2
Counting 3
Counting 4
Counting 5
Counting 6
Counting 7
Counting 8
Counting 9

Now you know what the static storage class does: it causes the value of a variable to persist between calls. However, it is important to remember that this DOES NOT make the variable ct global. It is still a local variable of the function counter(), but it is persistent in value. The following program makes this abundantly clear:

#include <iostream>
using namespace std;
 
void counter1(){
    static int ct=0;
    cout << "counter1 " << ct << '\n';
    ct=ct+1;
 
}
 
void counter2(){
    static int ct=0;
    cout << "counter2 " << ct << '\n';
    ct=ct+100;
 
}
 
int main(){
    for(int i=0;i<10;i++){
        counter1();
        counter2();
    }
}

The output will be:

counter1 0
counter2 0
counter1 1
counter2 100
counter1 2
counter2 200
counter1 3
counter2 300
counter1 4
counter2 400
counter1 5
counter2 500
counter1 6
counter2 600
counter1 7
counter2 700
counter1 8
counter2 800
counter1 9
counter2 900

Both functions counter1() and counter2() have a static variable named ct. Each is local to its own function, but persists between calls to that function.

Summary
Objects declared with the static storage class specifier have static storage duration, which means that memory for these objects is allocated when the program begins running and is freed when the program terminates. Static storage duration for a variable is different from file or global scope: a variable can have static duration but local scope.
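
A typical use of a static local variable is a function that hands out consecutive numbers. The function next_id below is a hypothetical example written for this summary, not something taken from the programs above.

```cpp
// Hypothetical helper: returns 1, 2, 3, ... on successive calls.
int next_id ()
{
  static int last = 0;   // initialized once, keeps its value between calls
  last = last + 1;
  return last;
}
```

No other function can read or change last, yet its value is still there the next time next_id() runs.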

Storage class extern

#include <iostream>
using namespace std;
 
int ct=0;
 
void counter1(){
    extern int ct; /* this declaration is useless in this context */
    cout << "counter1 " << 
                ct << '\n';
    ct=ct+1;
 
}
 
void counter2(){
    extern  int ct; /* this declaration is useless in this context */
    cout << "counter2 " << 
                ct << '\n';
    ct=ct+100;
 
}
 
int main(){
    for(int i=0;i<10;i++){
        counter1();
        counter2();
    }
}

If you run this program, the output will be as follows:

counter1 0
counter2 1
counter1 101
counter2 102
counter1 202
counter2 203
counter1 303
counter2 304
counter1 404
counter2 405
counter1 505
counter2 506
counter1 606
counter2 607
counter1 707
counter2 708
counter1 808
counter2 809
counter1 909
counter2 910

The meaning of the storage class extern is pretty clear from this: "Somebody else defines this variable, I just use it!".

However, in the context of the above program, this is rather useless. For example the program will run equally well and give the exact same output without the two extern specifiers, like the variant shown below:

#include <iostream>
 
int ct=0;
using namespace std;
void counter1(){
    cout << "counter1 " << ct << '\n';
    ct=ct+1;
}
 
void counter2(){
    cout << "counter2 " << ct << '\n';
    ct=ct+100;
}
 
int main(){
    for(int i=0;i<10;i++){
        counter1();
        counter2();
    }
}

Then the question is: why do we need the extern specifier? Can't we just define global variables and use them everywhere?

We need it when we use more than one file to write a computer program. Many real-world programs consist of a number of code files that are compiled together to make the executable. Imagine this situation:

part1.cpp

#include <iostream>
using namespace std;
int ct;

void counter1(){
    cout << "counter1 "
          << ct << '\n';
    ct=ct+1;

}

part2.cpp

#include <iostream>
using namespace std;

void counter2(){
    cout << "counter2 "
          << ct << '\n';
    ct=ct+100;

}

part3.cpp

#include <iostream>
using namespace std;
void counter1();
void counter2();

int main(){
    ct=0;
    for(int i=0;i<10;i++){
        counter1();
        counter2();
    }
}

Try compiling the above code to a single executable. The compiler complains about ct not being defined. Indeed, this is the case, for ct is defined only in part1.cpp, but not in part2.cpp and part3.cpp. If we try to add the definition of ct to the other files as well (see below):

part1.cpp

#include <iostream>
using namespace std;
int ct;

void counter1(){
    cout << "counter1 "
          << ct << '\n';
    ct=ct+1;

}

part2.cpp

#include <iostream>
using namespace std;
int ct;

void counter2(){
    cout << "counter2 "
          << ct << '\n';
    ct=ct+100;

}

part3.cpp

#include <iostream>
using namespace std;
void counter1();
void counter2();

int ct;

int main(){
    ct=0;
    for(int i=0;i<10;i++){
        counter1();
        counter2();
    }
}

Then the linker will complain about "multiply defined symbols". Again it is correct, for we have defined the same global variable in three places.

This is where the extern storage class comes in handy. The following code works without a problem.

part1.cpp

#include <iostream>
using namespace std;
int ct;

void counter1(){
    cout << "counter1 "
          << ct << '\n';
    ct=ct+1;

}

part2.cpp

#include <iostream>
using namespace std;
extern int ct;

void counter2(){
    cout << "counter2 "
          << ct << '\n';
    ct=ct+100;

}

part3.cpp

#include <iostream>
using namespace std;
void counter1();
void counter2();

extern int ct;

int main(){
    ct=0;
    for(int i=0;i<10;i++){
        counter1();
        counter2();
    }
}

Why? Because we have defined ct once as a global variable (in file part1.cpp), and in all other files we indicate that it is an externally defined variable by using the storage class extern.

Header files

It is customary to collect all the function prototypes and variable declarations in header files. In a large project, this makes a big difference in maintainability. Let's organize the above program that way:

part1.cpp

#include "part.h"
void counter1(){
    cout << "counter1 "
          << ct << '\n';
    ct=ct+1;

}

part2.cpp

#include "part.h"
void counter2(){
    cout << "counter2 "
          << ct << '\n';
    ct=ct+100;

}

part3.cpp

#include "part.h"

int main(){
    ct=0;
    for(int i=0;i<10;i++){
        counter1();
        counter2();
    }
}

part.h

#include <iostream>
using namespace std;
void counter1();
void counter2();
extern int ct;

This looks nice, but it will not work! The reason is clear: a variable has to be defined exactly once without the extern storage specifier. We then use extern elsewhere to say that we use that variable. The #include statement simply replaces itself with the content of the header file part.h. That means that all three files contain the following, and no file contains the actual definition:

extern int ct;

So, we have a problem, for we need the definition without the extern specifier in exactly one location! The following scheme solves this problem elegantly.


part1.cpp

#define EXT
#include "part.h"
void counter1(){
    cout << "counter1 "
          << ct << '\n';
    ct=ct+1;

}

part2.cpp

#define EXT extern
#include "part.h"
void counter2(){
    cout << "counter2 "
          << ct << '\n';
    ct=ct+100;

}

part3.cpp

#define EXT extern
#include "part.h"

int main(){
    ct=0;
    for(int i=0;i<10;i++){
        counter1();
        counter2();
    }
}

part.h

#include <iostream>
using namespace std;
void counter1();
void counter2();
EXT int ct;

In simple terms, what is happening here is that in file part1.cpp we define the preprocessor macro EXT to be nothing. So, when we include the header file, the definition of ct becomes:

int ct;

In the other two files, we define EXT to be "extern". So the declaration becomes:

extern int ct;
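
Another arrangement that is often seen keeps extern int ct; in the header and places the single definition in one source file, so no EXT macro is needed. The sketch below shows the pieces in one listing, with comments marking where each file would begin; in a real project every .cpp file would start with #include "part.h" instead of sharing one listing. The guard name PART_H is an assumption chosen for this example.

```cpp
// ---- part.h (shared by every source file) ----
#ifndef PART_H          // include guard: the header body is processed only once
#define PART_H
extern int ct;          // declaration only: "ct is defined somewhere else"
void counter1();
void counter2();
#endif

// ---- part1.cpp (the one file that owns the variable) ----
int ct = 0;             // the single definition of ct
void counter1() { ct = ct + 1; }

// ---- part2.cpp ----
void counter2() { ct = ct + 100; }
```

A declaration with extern followed later by the definition is allowed, which is why the file that defines ct can still include the header.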

If you want to know more about the C/C++ preprocessor, read the section on preprocessor directives that follows.

Preprocessor directives

Preprocessor directives are lines included in the code of our programs that are not program statements but directives for the preprocessor. These lines are always preceded by a pound sign (#). The preprocessor is executed before the actual compilation of code begins, therefore the preprocessor digests all these directives before any code is generated by the statements.

These preprocessor directives extend only across a single line of code. As soon as a newline character is found, the preprocessor directive is considered to end. No semicolon (;) is expected at the end of a preprocessor directive. The only way a preprocessor directive can extend through more than one line is by preceding the newline character at the end of the line by a backslash (\).

Macro definitions (#define, #undef)

To define preprocessor macros we can use #define. Its format is:

#define identifier replacement

When the preprocessor encounters this directive, it replaces any occurrence of identifier in the rest of the code by replacement. This replacement can be an expression, a statement, a block or simply anything. The preprocessor does not understand C++, it simply replaces any occurrence of identifier by replacement.

#define TABLE_SIZE 100

int table1[TABLE_SIZE];
int table2[TABLE_SIZE];

After the preprocessor has replaced TABLE_SIZE, the code becomes equivalent to:

int table1[100];
int table2[100];

We already know this use of #define to define constants from previous tutorials, but #define can also work with parameters to define function macros:

#define getmax(a,b) ((a)>(b)?(a):(b))

This replaces any occurrence of getmax followed by two arguments with the replacement expression, substituting each argument for the corresponding parameter, much as you would expect if it were a function:

// function macro
#include <iostream>
using namespace std;

#define getmax(a,b) ((a)>(b)?(a):(b))

int main()
{
  int x=5, y;
  y= getmax(x,2);
  cout << y << endl;
  cout << getmax(7,x) << endl;
  return 0;
}

5
7
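
The parentheses around each parameter and around the whole replacement in getmax are there for a reason: the preprocessor pastes text, so operator precedence can change the meaning of an unparenthesized macro. The two macros below are hypothetical and exist only to show the difference.

```cpp
#define DOUBLE_BAD(x)  x * 2        // no parentheses
#define DOUBLE_GOOD(x) ((x) * 2)    // parameter and result parenthesized

int bad_result ()  { return DOUBLE_BAD (1 + 2); }   // expands to 1 + 2 * 2
int good_result () { return DOUBLE_GOOD (1 + 2); }  // expands to ((1 + 2) * 2)
```

Here bad_result() yields 5 because the multiplication is applied to 2 alone, whereas good_result() yields the intended 6.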

Defined macros are not affected by block structure. A macro lasts until it is undefined with the #undef preprocessor directive:

#define TABLE_SIZE 100
int table1[TABLE_SIZE];
#undef TABLE_SIZE
#define TABLE_SIZE 200
int table2[TABLE_SIZE];

This would generate the same code as:

int table1[100];

int table2[200];

Function macro definitions accept two special operators (# and ##) in the replacement sequence.

If the operator # is used before a parameter in the replacement sequence, that parameter is replaced by a string literal (as if it were enclosed between double quotes):

#define str(x) #x
cout << str(test);

This would be translated into:

cout << "test";
 

The operator ## concatenates two arguments leaving no blank spaces between them:

#define glue(a,b) a ## b
glue(c,out) << "test";
 

This would also be translated into:

cout << "test";
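
Both operators can be tried out in a compilable form as follows. The macro names STR and GLUE and the variables value1 and value2 are assumptions made for this sketch.

```cpp
#include <string>
using namespace std;

#define STR(x) #x            // turns the argument into a string literal
#define GLUE(a,b) a ## b     // pastes two tokens into a single token

int value1 = 10;
int value2 = 20;

string stringized () { return STR (some text); }                    // "some text"
int glued_sum ()     { return GLUE (value, 1) + GLUE (value, 2); }  // value1 + value2
```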
 
 

Because preprocessor replacements happen before any C++ syntax check, macro definitions can be a tricky feature. Be careful: code that relies heavily on complicated macros may be obscure to other programmers, since the syntax it uses is on many occasions different from the ordinary syntax programmers expect in C++.

Conditional inclusions (#ifdef, #ifndef, #if, #endif, #else and #elif)

These directives allow parts of a program's code to be included or discarded depending on whether a certain condition is met.

#ifdef allows a section of a program to be compiled only if the macro that is specified as the parameter has been defined, no matter what its value is. For example:

#ifdef TABLE_SIZE
int table[TABLE_SIZE];
#endif 

In this case, the line of code int table[TABLE_SIZE]; is only compiled if TABLE_SIZE was previously defined with #define, independently of its value. If it was not defined, that line will not be included in the program compilation.

#ifndef serves for the exact opposite: the code between #ifndef and #endif directives is only compiled if the specified identifier has not been previously defined. For example:

#ifndef TABLE_SIZE

#define TABLE_SIZE 100
#endif
int table[TABLE_SIZE];

In this case, if when arriving at this piece of code, the TABLE_SIZE macro has not been defined yet, it would be defined to a value of 100. If it already existed it would keep its previous value since the #define directive would not be executed.

The #if, #else and #elif (i.e., "else if") directives serve to specify some condition to be met in order for the portion of code they surround to be compiled. The condition that follows #if or #elif can only evaluate constant expressions, including macro expressions. For example:

#if TABLE_SIZE>200

#undef TABLE_SIZE
#define TABLE_SIZE 200
 
#elif TABLE_SIZE<50
#undef TABLE_SIZE
#define TABLE_SIZE 50
 
#else
#undef TABLE_SIZE

#define TABLE_SIZE 100
#endif
 
int table[TABLE_SIZE];

Notice how the whole structure of #if, #elif and #else chained directives ends with #endif.

The behavior of #ifdef and #ifndef can also be achieved by using the special operators defined and !defined respectively in any #if or #elif directive:

#if !defined TABLE_SIZE
#define TABLE_SIZE 100
#elif defined ARRAY_SIZE
#undef TABLE_SIZE
#define TABLE_SIZE ARRAY_SIZE
#endif
int table[TABLE_SIZE];
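
These directives are often combined to give a setting a default value and then keep it within limits at compile time. The following sketch assumes a hypothetical macro BUFFER_LIMIT, which could also be supplied from the compiler command line (for example with -DBUFFER_LIMIT=150 on GCC or Clang).

```cpp
#ifndef BUFFER_LIMIT
#define BUFFER_LIMIT 500        // default used when nothing was configured
#endif

#if BUFFER_LIMIT > 200          // clamp oversized requests at compile time
#undef BUFFER_LIMIT
#define BUFFER_LIMIT 200
#endif

int buffer[BUFFER_LIMIT];

// Returns the number of elements the array actually received.
int buffer_length ()
{
  return static_cast<int> (sizeof (buffer) / sizeof (buffer[0]));
}
```

With no external setting, the default of 500 is clamped and buffer_length() reports 200 elements.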

Line control (#line)

When we compile a program and an error happens during the compilation process, the compiler shows an error message with references to the name of the file where the error happened and a line number, so that it is easier to find the code generating the error.

The #line directive allows us to control both things: the line numbers within the code files as well as the file name that we want to appear when an error takes place. Its format is:

#line number "filename"

Where number is the new line number that will be assigned to the next code line. The line numbers of successive lines will be increased one by one from this point on.

"filename" is an optional parameter that allows us to redefine the file name that will be shown. For example:

#line 20 "assigning variable"
int a?;

This code will generate an error that will be shown as error in file "assigning variable", line 20.

Error directive (#error)

This directive aborts the compilation process when it is found, generating a compilation error with the message specified as its parameter:

#ifndef __cplusplus
#error A C++ compiler is required!
#endif

This example aborts the compilation process if the macro name __cplusplus is not defined (this macro name is defined by default in all C++ compilers).

Source file inclusion (#include)

This directive has also been used assiduously in other sections of this tutorial. When the preprocessor finds an #include directive it replaces it by the entire content of the specified file. There are two ways to specify a file to be included:

#include "file"

#include <file>

The only difference between both expressions is the places (directories) where the compiler is going to look for the file. In the first case where the file name is specified between double-quotes, the file is searched first in the same directory that includes the file containing the directive. In case that it is not there, the compiler searches the file in the default directories where it is configured to look for the standard header files.
If the file name is enclosed between angle-brackets <> the file is searched directly where the compiler is configured to look for the standard header files. Therefore, standard header files are usually included in angle-brackets, while other specific header files are included using quotes.

Pragma directive (#pragma)

This directive is used to specify diverse options to the compiler. These options are specific for the platform and the compiler you use. Consult the manual or the reference of your compiler for more information on the possible parameters that you can define with #pragma.

If the compiler does not support a specific argument for #pragma, it is ignored - no error is generated.

Predefined macro names

The following macro names are defined at any time:

macro value
__LINE__ Integer value representing the current line in the source code file being compiled.
__FILE__ A string literal containing the presumed name of the source file being compiled.
__DATE__ A string literal in the form "Mmm dd yyyy" containing the date in which the compilation process began.
__TIME__ A string literal in the form "hh:mm:ss" containing the time at which the compilation process began.
__cplusplus An integer value. All C++ compilers have this constant defined to some value. If the compiler is fully compliant with the C++ standard, its value is equal to or greater than 199711L, depending on the version of the standard it complies with.

For example:

// standard macro names
#include <iostream>
using namespace std;

int main()
{
  cout << "This is the line number " << __LINE__;
  cout << " of file " << __FILE__ << ".\n";
  cout << "Its compilation began " << __DATE__;
  cout << " at " << __TIME__ << ".\n";
  cout << "The compiler gives a __cplusplus value of " << __cplusplus;
  return 0;
}
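
A common use of __FILE__ and __LINE__ is to tag diagnostic messages with the place they come from. The sketch below assumes a hypothetical macro WHERE and a helper function where_string; a macro is needed because __LINE__ must be expanded at the place of use, not inside the helper.

```cpp
#include <sstream>
#include <string>
using namespace std;

// Hypothetical helper: formats a file name and a line number as "file:line".
string where_string (const char * file, int line)
{
  ostringstream out;
  out << file << ":" << line;
  return out.str();
}

// Hypothetical macro: expands to "file:line" for the line where it is written.
#define WHERE() where_string (__FILE__, __LINE__)

string first_mark ()  { return WHERE (); }
string second_mark () { return WHERE (); }
```

Because the two functions are written on different lines, they return different strings even though both use the same macro.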

Writing Good Code

Keeping things organized

Putting it all together I -- C++ and EPAnet

Using the EPANET toolkit interface, it is quite straightforward to run EPANET from your C++ program. The following is an example. To run it:

  1. Create a new project and save the file given below.
  2. Download the EPANET toolkit and save it somewhere.
  3. Set the paths to epanet2.h and epanet2.dll.
  4. Download the input file given below and save it as net1.inp.

(A working Visual C++ project set up for the EPANET 2 toolkit is available as File:Vcproj epanet2.zip.)

/********************************************************************
*** Sample code to call epanet2.dll 
*** Based on epanet example 2
*** Author: Assela Pathirana    2006DEC22
*** *********************************************
*** Modification history:
*** Name           Date     Description 
*** Assela Pathirana 20080529  Added vector return
*********************************************************************/
#define _CRT_SECURE_NO_DEPRECATE
#include "epanet2.h"
#include <cstdio>
#include <iostream>
#include <vector>

using namespace std;
vector <float> PressureCalc(char *MyNode, int N, float D[]);

int main(void){
	char node[3]="22";
	float demands[5]={0,500,1000,2000,4000};
	int num=5;

	vector<float> result;
	result=PressureCalc(node, num, demands );

	/* Print the pressure obtained for each demand */
	for (size_t i=0; i<result.size(); i++)
		cout << "Demand " << demands[i] << " -> pressure " << result[i] << '\n';
	return 0;
}
 
vector <float> PressureCalc(char *MyNode, int N, float D[])
{
	int   i, nodeindex;
	long  t;
	float pressure;
	int ret;
	vector <float> pressures;
	/* Open the EPANET toolkit & hydraulics solver */
	ret=ENopen("Net1.inp", "Net1.rpt", "");
	printf("epanet returned:%i\n",ret);
	ret=ENopenH();
	printf("epanet returned:%i\n",ret);
	/* Get the index of the node of interest */
	ret=ENgetnodeindex(MyNode, &nodeindex);
	printf("epanet returned:%i\n",ret);
	printf("            Epanet node index for %s is %i\n",MyNode,nodeindex);
	/* Iterate over all demands */
	for (i=0; i<N; i++)  {
		/* Set nodal demand, initialize hydraulics, make a */
		/* single period run, and retrieve pressure */
		ret=ENsetnodevalue(nodeindex, EN_BASEDEMAND, D[i]);
		printf("epanet returned:%i\n",ret);
		ret=ENinitH(0);
		printf("epanet returned:%i\n",ret);
		ret=ENrunH(&t);
		printf("epanet returned:%i\n",ret);
		ret=ENgetnodevalue(nodeindex, EN_PRESSURE, &pressure);
		printf("epanet returned:%i\n",ret);
		printf("            Epanet pressure for %s is %10.5f\n", MyNode,pressure);
		pressures.push_back(pressure);
	}
	/* Close hydraulics solver & toolkit */
	ret=ENcloseH();
	printf("epanet returned:%i\n",ret);
	ret=ENclose();
	printf("epanet returned:%i\n",ret);
	return pressures;
}

[file net1.inp]
[TITLE]
 EPANET Example Network 1
A simple example of modeling chlorine decay. Both bulk and
wall reactions are included. 

[JUNCTIONS]
;ID              	Elev        	Demand      	Pattern         
 10              	710         	0           	                	;
 11              	710         	150         	                	;
 12              	700         	150         	                	;
 13              	695         	100         	                	;
 21              	700         	150         	                	;
 22              	695         	200         	                	;
 23              	690         	150         	                	;
 31              	700         	100         	                	;
 32              	710         	100         	                	;

[RESERVOIRS]
;ID              	Head        	Pattern         
 9               	800         	                	;

[TANKS]
;ID              	Elevation   	InitLevel   	MinLevel    	MaxLevel    	Diameter    	MinVol      	VolCurve
 2               	850         	120         	100         	150         	50.5        	0           	                	;

[PIPES]
;ID              	Node1           	Node2           	Length      	Diameter    	Roughness   	MinorLoss   	Status
 10              	10              	11              	10530       	18          	100         	0           	Open  	;
 11              	11              	12              	5280        	14          	100         	0           	Open  	;
 12              	12              	13              	5280        	10          	100         	0           	Open  	;
 21              	21              	22              	5280        	10          	100         	0           	Open  	;
 22              	22              	23              	5280        	12          	100         	0           	Open  	;
 31              	31              	32              	5280        	6           	100         	0           	Open  	;
 110             	2               	12              	200         	18          	100         	0           	Open  	;
 111             	11              	21              	5280        	10          	100         	0           	Open  	;
 112             	12              	22              	5280        	12          	100         	0           	Open  	;
 113             	13              	23              	5280        	8           	100         	0           	Open  	;
 121             	21              	31              	5280        	8           	100         	0           	Open  	;
 122             	22              	32              	5280        	6           	100         	0           	Open  	;

[PUMPS]
;ID              	Node1           	Node2           	Parameters
 9               	9               	10              	HEAD 1	;

[VALVES]
;ID              	Node1           	Node2           	Diameter    	Type	Setting     	MinorLoss   

[TAGS]

[DEMANDS]
;Junction        	Demand      	Pattern         	Category

[STATUS]
;ID              	Status/Setting

[PATTERNS]
;ID              	Multipliers
;Demand Pattern
 1               	1.0         	1.2         	1.4         	1.6         	1.4         	1.2         
 1               	1.0         	0.8         	0.6         	0.4         	0.6         	0.8         

[CURVES]
;ID              	X-Value     	Y-Value
;PUMP: Pump Curve for Pump 9
 1               	1500        	250         

[CONTROLS]
 LINK 9 OPEN IF NODE 2 BELOW 110
 LINK 9 CLOSED IF NODE 2 ABOVE 140


[RULES]

[ENERGY]
 Global Efficiency  	75
 Global Price       	0.0
 Demand Charge      	0.0

[EMITTERS]
;Junction        	Coefficient

[QUALITY]
;Node            	InitQual
 10              	0.5
 11              	0.5
 12              	0.5
 13              	0.5
 21              	0.5
 22              	0.5
 23              	0.5
 31              	0.5
 32              	0.5
 9               	1.0
 2               	1.0

[SOURCES]
;Node            	Type        	Quality     	Pattern

[REACTIONS]
;Type     	Pipe/Tank       	Coefficient


[REACTIONS]
 Order Bulk            	1
 Order Tank            	1
 Order Wall            	1
 Global Bulk           	-.5
 Global Wall           	-1
 Limiting Potential    	0.0
 Roughness Correlation 	0.0

[MIXING]
;Tank            	Model

[TIMES]
 Duration           	24:00 
 Hydraulic Timestep 	1:00 
 Quality Timestep   	0:05 
 Pattern Timestep   	2:00 
 Pattern Start      	0:00 
 Report Timestep    	1:00 
 Report Start       	0:00 
 Start ClockTime    	12 am
 Statistic          	None

[REPORT]
 Status             	Yes
 Summary            	No
 Page               	0

[OPTIONS]
 Units              	GPM
 Headloss           	H-W
 Specific Gravity   	1.0
 Viscosity          	1.0
 Trials             	40
 Accuracy           	0.001
 Unbalanced         	Continue 10
 Pattern            	1
 Demand Multiplier  	1.0
 Emitter Exponent   	0.5
 Quality            	Chlorine mg/L
 Diffusivity        	1.0
 Tolerance          	0.01

[COORDINATES]
;Node            	X-Coord         	Y-Coord
 10              	20.00           	70.00           
 11              	30.00           	70.00           
 12              	50.00           	70.00           
 13              	70.00           	70.00           
 21              	30.00           	40.00           
 22              	50.00           	40.00           
 23              	70.00           	40.00           
 31              	30.00           	10.00           
 32              	50.00           	10.00           
 9               	10.00           	70.00           
 2               	50.00           	90.00           

[VERTICES]
;Link            	X-Coord         	Y-Coord

[LABELS]
;X-Coord           Y-Coord          Label & Anchor Node
 6.99             73.63            "Source"                 
 13.48            68.13            "Pump"                 
 43.85            91.21            "Tank"                 

[BACKDROP]
 DIMENSIONS     	7.00            	6.00            	73.00           	94.00           
 UNITS          	None
 FILE           	
 OFFSET         	0.00            	0.00            

[END]

Putting it all together II -- Optimization and EPAnet

This section contains some advanced material. Don't be discouraged if you don't understand all of it, or if you find it difficult to complete without somebody else's help.

Problem

The problem network.

The figure on the left shows a water supply network with a reservoir, seven pipe segments conveying water and five junctions that have different demands. We want to compute the most economical pipe diameter for each segment while maintaining a minimum pressure head of 10 m at all junctions. You can open the file File:Cbcpp pipe network1.inp in the Epanet 2 software to view the network.

Our plan is to use a Genetic Algorithm to optimize the pipe diameters. For this we need to connect the Epanet software to a genetic algorithm code.


Plan

We will use the Epanet Toolkit, a programming interface designed to run Epanet 2.0 programmatically without the standard graphical interface. For the Genetic Algorithm part, we'll use Evolving Objects (EO), a free and open source package for evolutionary computation.

Evolving Objects can be downloaded from the eodev website. The Epanet toolkit can be downloaded from the US-EPA website.

We shall attack our problem in a piecemeal fashion, in the steps given below:

  1. Get EO to solve a small GA problem.
  2. Replace the cost function with our own.
  3. Use epanet_toolkit to run Epanet on our water supply network and use the results to evaluate a cost function.

Usually this step-by-step approach is less error-prone than attempting the whole task at once.

Prerequisites

  1. You should have completed the C++ programming primer preceding this section.
  2. You should have some exposure to the EPAnet 2 software (the graphical version is adequate).
  3. You should have a basic understanding of Genetic Algorithms. (If not, follow these links and spend some time on them: [3],[4],[5].)

In this lesson, we push the abilities we have gained to the limit! We link two code libraries, namely Evolving Objects, a versatile genetic algorithm code, and the EPAnet toolkit, a pipe network calculation model, by writing some relatively simple code, and so create a running program. The problem we have selected to solve is described in the Problem section above. Please note that the problem itself is of secondary importance here; what we are trying to do is to hone the skills we have gained and build the confidence to attack bigger coding projects.

All the files needed for this exercise can be downloaded from here: File:Bpcpp-epanet-gaall.zip. Download and extract this file. It should create five folders.

include
lib
run
data
code


Running a GA

Create a new folder called projects under the same folder that has the sub-folders code, data, etc. This is where we shall keep the Visual Studio Express Edition related (platform-specific) files. Open Visual C++ and create a new empty project EPGA in that folder. Add the following files from the code folder to the project. (Left click View->Solution Explorer, then right click on the Source Files sub-folder, select Add->Existing Item and select the files.)

 
FirstRealGA.cpp
real_value.h

Your project should appear like the figure on the right:


When you try to compile the project, you will get an error message on the lines:

#include <eo>
#include <es.h>

indicating that the compiler cannot find 'eo' and es.h. To remedy this, we should add the path to the include folder ("..\..\include") to the project. (Note: Depending on your particular project structure, the path may be something like "..\..\..\include" or "..\include". Try them.) This can be done by adding it to: Edit-><project>Properties->C/C++->General->Additional Include Directories.

At this stage FirstRealGA.cpp should compile successfully, but will cause errors at the linking stage. The error messages will look like

FirstRealGA.obj : error LNK2019: unresolved external symbol

The reason for this is

  1. The compiler knows about eo library (by #include <eo>), but
  2. the real library objects of eo needed for linking are missing.

To rectify this, we should give the linker access to the necessary libraries. Before doing this we have to make a detour. First save your project/solution and close it.

Examine the folder lib. It has the following four files.

eo.lib  eoes.lib  eoga.lib  eoutils.lib
Adding linker dependencies.

Now open your project again. Add the above four files as dependencies for the linker (Edit-><project>Properties->Linker->Input->Additional Dependencies).

Then let the linker know where these are (Edit-><project>Properties->Linker->General->Additional Library Directories). Add something like ..\..\lib.

Technical note
  1. Depending on the version of Visual C++ you are using, you might have to make the following modification to get the project to compile properly: the Properties->C/C++->Code Generation->Runtime Library entry should be Multi-threaded (/MT).
  2. If you still cannot get it to link properly, replace your entire EPGA directory (in the projects folder) with the one given here: File:Epga project.zip


At this stage, you should be able to compile the project successfully. Debug->Start without Debugging should run the program, albeit with not-so-meaningful-at-the-moment results.

EO In-Action

Now is a good time to have an idea about how our GA code works. Don't worry if you can not understand everything -- what is important is to have a general idea of how things work!

The actual code-in-action is very short, indeed. I have changed the comments a little bit.

/* Many instructions to this program can be given on the command line. 
     The following code understands what you have specified as command line arguments. 
  */
  eoParser parser(argc, argv);  // for user-parameter reading
  eoState state;    // keeps all things allocated
 
  typedef eoReal<eoMinimizingFitness> EOT;
 
  // The evaluation fn - encapsulated into an eval counter for output 
  eoEvalFuncPtr<EOT, double, const std::vector<double>&> 
               mainEval( real_value );
  eoEvalFuncCounter<EOT> eval(mainEval);
  // the genotype - through a genotype initializer
  eoRealInitBounded<EOT>& init = make_genotype(parser, state, EOT());
  // Build the variation operator (any seq/prop construct)
  eoGenOp<EOT>& op = make_op(parser, state, init);
  // initialize the population - and evaluate
  // yes, this is representation independent once you have an eoInit
  eoPop<EOT>& pop   = make_pop(parser, state, init);
  // stopping criteria
  eoContinue<EOT> & term = make_continue(parser, state, eval);
  // output
  eoCheckPoint<EOT> & checkpoint = make_checkpoint(parser, state, eval, term);
  // algorithm (need the operator!)
  eoAlgo<EOT>& ea = make_algo_scalar(parser, state, eval, checkpoint, op);
  make_help(parser); //print help, if something is missing or user gives /h 
  // evaluate initial population AFTER help and status in case it takes time
  apply<EOT>(eval, pop);
  // print it out
  cout << "Initial Population\n";
  pop.sortedPrintOn(cout);
  cout << endl;
 
  run_ea(ea, pop); // run the ea
 
  cout << "Final Population\n";
  pop.sortedPrintOn(cout);
  cout << endl;

You can get away without understanding a single line of the code above! The only critical part is the following function, specified in the header file real_value.h. To adapt these versatile algorithms to a problem of our choosing, it is enough to change this header file.

Fitness function

double real_value(const std::vector<double>& _ind)
{
  double sum = 0;
  for (unsigned i = 0; i < _ind.size(); i++)
      sum += _ind[i] * _ind[i];
  return sqrt(sum);
}

The eo library expects this function to be present and uses it to evaluate the fitness of the individuals in the population.

What happens here?

  1. Our genotype vector has (say) n items. The fitness function simply squares each of these, adds them up and returns the square root. No rocket science here! Because EOT is defined with eoMinimizingFitness, the rule is: the smaller the values in an individual, the better its fitness.
  2. The GA code does the rest.

Try running the algorithm. It will go on decreasing the cost function and exit at the 100th generation. The default is 100 generations; in a moment, we'll get into the details of how to change the default behavior.
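To see what the GA is minimizing, the same calculation can be tried outside EO. The following is a minimal stand-alone sketch; the name norm_cost is made up for this illustration and is not part of EO.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-alone copy of the calculation in real_value(): the square root of
// the sum of squares of all genes. (norm_cost is a hypothetical name.)
double norm_cost(const std::vector<double>& ind)
{
    double sum = 0;
    for (std::size_t i = 0; i < ind.size(); i++)
        sum += ind[i] * ind[i];
    return std::sqrt(sum);
}
```

For the genotype (3, 4) this returns 5, and for (0.3, 0.4) it returns about 0.5, so an individual with smaller values gets a lower cost, which a minimizing GA treats as fitter.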


Changing GA parameters

Examine the file FirstRealGA.cpp. In that file, the following section defines various parameters for the Genetic Algorithm.

  const unsigned int SEED = 1; // seed for random number generator
  const unsigned int VEC_SIZE = 40; // Number of object variables in genotypes(tstep*ninterv)
  const unsigned int POP_SIZE = 50; // Size of population
  const unsigned int T_SIZE = 3; // size for tournament selection
  const unsigned int MAX_GEN = 500; // Maximum number of generation before STOP
  const float CROSS_RATE = (float)0.8; // Crossover rate
  const double EPSILON = 0.01;  // range for real uniform mutation
  const float MUT_RATE = (float)0.5;   // mutation rate

Try changing these values. Each time you have to recompile the program before running it.
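Of these parameters, T_SIZE is perhaps the least self-explanatory. In tournament selection, T_SIZE individuals are drawn at random from the population and the best of them is selected as a parent. The sketch below shows only the "pick the best of the drawn candidates" step for a minimizing problem. It illustrates the general idea and is not EO's implementation; the function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the candidate with the lowest cost.
// 'cost' holds one cost per individual; 'candidates' holds the indices that
// were drawn for this tournament (T_SIZE of them) and must not be empty.
std::size_t tournament_winner(const std::vector<double>& cost,
                              const std::vector<std::size_t>& candidates)
{
    std::size_t best = candidates[0];
    for (std::size_t k = 1; k < candidates.size(); k++)
        if (cost[candidates[k]] < cost[best])
            best = candidates[k];
    return best;
}
```

A larger T_SIZE makes it more likely that one of the best individuals is among the candidates, so good individuals are selected more often.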

Writing our own cost function

Now let's turn to our original problem. We need to minimize the diameters of seven pipes while maintaining a reasonable (>10 m) pressure at all supply nodes. Let's forget the pressure part for a moment and concentrate on minimizing the diameter. (The answer here is obvious: if pressure is no concern, then the 'best' diameter is zero! But let's live with this silly notion for the moment.)

Make the following changes:

  • First we change the args.txt file to have
--vecSize=7 # we have seven pipes, each diameter can be represented by a single real value. 
--initBounds=7[0,500]             # -B : Bounds for variables -- pipe diameter within 0-500mm
--objectBounds=7[0,500]             # -B : Bounds for variables -- pipe diameter within 0-500mm
  • Change the cost function real_value.h to the following:
#include <vector>
#include <iostream>
using namespace std;
double real_value(const std::vector<double>& _ind)
{
  // The GA passes the pipe diameters in _ind
  double length=1000; // all pipes are given the same length
  double factor=1.; // some factor so that cost=length*diameter*factor (let's keep things simple!)
  double cost = 0;
  for (unsigned i = 0; i < _ind.size(); i++){
      cost+=_ind[i]*length*factor;
  }
  return cost/10000;
}

If everything is all right, you will have some large initial cost, and a very small final cost.
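A quick hand calculation shows what values to expect. The sketch below repeats the cost formula in a stand-alone function (diameter_cost is a hypothetical name used only here) so that it can be checked without running the GA.

```cpp
#include <cstddef>
#include <vector>

// cost = sum(diameter * length * factor) / 10000, as in real_value() above.
double diameter_cost(const std::vector<double>& dia,
                     double length = 1000, double factor = 1.0)
{
    double cost = 0;
    for (std::size_t i = 0; i < dia.size(); i++)
        cost += dia[i] * length * factor;
    return cost / 10000;
}
```

With seven pipes all at the upper bound of 500 mm the cost is 7 × 500 × 1000 / 10000 = 350; with all diameters at 250 mm it is 175. The GA should drive the value toward zero.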


Now that we have customized the GA to represent our silly optimization problem, it is but a small step to do the real job!

In comes EPANet 2

Download the epanet toolkit from here: File:Epanet tools min.zip. The provided EPANET toolkit (in the epanet_tools folder) includes a help file, TOOLKIT.HLP. This is going to be our standard reference for the various calls to the epanet program via the toolkit.

Let's do most of the changes in the real_value.h file and try to keep changes in FirstRealGA.cpp to a minimum.

We will focus on several toolkit functions:

ENopen and ENclose

Declaration
int ENopen( char* f1, char* f2, char* f3)
Description
Opens the Toolkit to analyze a particular distribution system.
Declaration
int ENclose(void)
Description
Closes down the Toolkit system (including all files being processed).

ENsolveH

Declaration
int ENsolveH( void )
Description
Runs a complete hydraulic simulation with results for all time periods written to the binary Hydraulics file.

ENgetnodevalue

Declaration
int  ENgetnodevalue( int index, int paramcode, float* value )
Description
Retrieves the value of a specific node parameter.

ENsetlinkvalue

Declaration
int  ENsetlinkvalue( int index, int paramcode, float value )
Description
Sets the value of a parameter for a specific link.

Call EPANet 2

Make the following changes in FirstRealGA.cpp.

From
try
    {
        main_function(argc, argv);
    }
    catch(exception& e)
To
try
    {
       epanet_init(); // 
        main_function(argc, argv);
      epanet_close();
    }
    catch(exception& e)

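Then write two functions epanet_init() and epanet_close() in the code. The first opens the toolkit with ENopen and the second closes it with ENclose. (Calls to ENopenH, ENinitH and ENcloseH are needed only if you drive the hydraulic solver step by step with ENrunH, as in the earlier example. ENsolveH, which we use here, runs a complete hydraulic simulation in a single call.)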

Now let's do the major modifications in real_value.h

As the first stage:

#include <vector>
#include <iostream>
#include "epanet2.h"
using namespace std;
double dia_cost_factor=1.; // some factor so that cost=length*diameter*factor (let's keep things simple!)
 
/** A function that computes the cost. This is what the GA uses to evaluate its populations */
double real_value(const std::vector<double>& _ind)
{
  // The GA passes the pipe diameters in _ind
  double length=1000; /* All pipe lengths are equal */
 
  double cost = 0;
  for (unsigned i = 0; i < _ind.size(); i++){
      cost+=_ind[i]*length*dia_cost_factor;
  }
  return cost/10000;
}
 
/* We open the epanet system with the necessary input file. 
A bit of hard coding here. But, let's live with that for the moment. */
void epanet_init(){
	int ret;
	char file[500]="../../data/network.inp";
	char rept[500]="../../data/network.rep";
 
	ret=ENopen(file,rept,"");
	cout << "At opening Epanet returned : "<<ret<<'\n';
 
}
/* Close the epanet system */
void epanet_close(){
	int ret;
	ret=ENclose();
	cout << "At closing Epanet returned : "<<ret<<'\n';
 
}


To run the above you will have to

  1. Add ..\..\epanet_tools to additional include directories.
  2. Make sure that epanet2.dll is in the place where the program runs. (Change Properties->Configuration Properties->Debugging->Working Directory to the directory you have epanet2.dll in, e.g. ../../epanet_tools/.)


Run the application at this stage to make sure that the two epanet calls return zero (error free call signal).
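Checking every return value by eye gets tedious. One option, sketched below, is a small helper that turns a non-zero return value into a C++ exception, which the existing try/catch block in FirstRealGA.cpp would then report. The helper name check_epanet is hypothetical, and treating every non-zero value as fatal is a simplifying assumption; consult TOOLKIT.HLP for the meaning of individual codes.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Throws std::runtime_error if 'ret' is non-zero.
// 'where' names the toolkit call that produced the value.
// (Assumption: any non-zero value is treated as a failure.)
void check_epanet(int ret, const std::string& where)
{
    if (ret != 0) {
        std::ostringstream msg;
        msg << where << " returned " << ret;
        throw std::runtime_error(msg.str());
    }
}
```

It could be used as check_epanet(ENopen(file, rept, ""), "ENopen"); in place of printing the return value.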

Then add a function pressure_cost to real_value.h to compute the 'cost' of pressure deficiency. (Something like the one below)

/* Returns the pressure cost (penalty for pressure violations at demand nodes) based on epanet runs.
Prerequisites: The epanet system should be initialized before calling this function for the first time.
npipes, pipes, nnodes, nodes and pressue_cost_factor are global variables that you have to define. */
double pressure_cost(vector<double> _ind){
	int ret;
	double cost=0;
	// set the diameter of each pipe to the value proposed by the GA
	for(unsigned int i=0;i<npipes;i++){
		int index=-1;
		ret=ENgetlinkindex(pipes[i],&index);
		//cout << "At ENgetlinkindex Epanet returned : "<<ret<<'\n';
		ret=ENsetlinkvalue(index,EN_DIAMETER,(float)_ind[i]);
		//cout << "At ENsetlinkvalue Epanet returned : "<<ret<<'\n';
	}
	// now run the simulation
	ret=ENsolveH();
	//cout << "At ENsolveH Epanet returned : "<<ret<<'\n';
	// read the pressure values
	for(unsigned int i=0;i<nnodes;i++){
		int index=-1;
		float value;
		ret=ENgetnodeindex(nodes[i],&index);
		//cout << "At ENgetnodeindex Epanet returned : "<<ret<<'\n';
		ret=ENgetnodevalue(index,EN_PRESSURE,&value);
		//cout << "At ENgetnodevalue Epanet returned : "<<ret<<'\n';
		if(value<10){
			cost+=pressue_cost_factor*(10-value); // if p<10m, add a proportional penalty
		}
	}
	return cost;
}

The value of the variable pressue_cost_factor (note the spelling used in the code) should be weighed carefully against that of dia_cost_factor.
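To get a feel for the balance between the two factors, the penalty part can be separated from the Epanet calls. The following sketch assumes the pressures have already been retrieved into a vector; the function name and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Sum of factor * (min_pressure - p) over all nodes whose pressure p is
// below min_pressure. Nodes at or above min_pressure add nothing.
double pressure_penalty(const std::vector<float>& pressures,
                        double min_pressure, double factor)
{
    double cost = 0;
    for (std::size_t i = 0; i < pressures.size(); i++)
        if (pressures[i] < min_pressure)
            cost += factor * (min_pressure - pressures[i]);
    return cost;
}
```

For pressures of 12, 8, 10 and 4.5 m, a minimum of 10 m and a factor of 1000000, the penalty is (2 + 5.5) × 1000000 = 7500000. With a pipe length of 1000 and dia_cost_factor at 1, a deficit of 1 m at a single node is therefore penalized as heavily as 1000 mm of extra pipe diameter.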

Finally modify real_value function so that it will call the above pressure_cost function and add the cost to the total cost.

At this stage you have a complete, living, breathing program that joins the power of Evolving Objects with Epanet.

A touch of sophistication -- let's get rid of hard coding

This section is here mostly for the sake of completeness. Skip it if you don't have the time or inclination.

We can change the behavior of the GA without recompiling the code, thanks to the sophisticated design of the EO library. However, we have given up some of this flexibility in the way we have designed our cost function. We have hard coded a number of items:

  1. Name of the network file.
  2. Number of pipes in the network.
  3. Number of nodes.
  4. IDs of pipes and nodes.

It is possible to make our program quite flexible in these aspects as well, but it needs a bit of work. Let's see how it can be done. The first stage is to change real_value.h so that, instead of hard-coded values, it uses values stored in variables. Then, in the main program (RealEA.cpp), we add the code necessary to read these from a text file supplied by the user.

Let's define the text file format as follows:

<network file name>
<file name for the report to write into>
<no of pipes>
<PIPE_ID1>
...
<no of nodes>
<NODE_ID1>
...

That means something like the file below

..\..\data\network.inp
..\..\data\network.rpt
7
P1
P2
P3
P4
P5
P6
P7
5
J1
J2
J3
J4
J5

Then the modified program is as follows:

real_value.h
#include <vector>
#include <string>
#include <cstring>   // strcpy
#include <cstdlib>   // exit
#include <iostream>
#include "epanet2.h"
#include <fstream>
using namespace std;
#define MAX_PATH_LEN 500
#define MAX_LABEL_LEN 25
double pressure_cost(vector<double> _ind);
int npipes=7;
 
int nnodes=5;
 
char file[MAX_PATH_LEN];
char rept[MAX_PATH_LEN];
vector<string> nodes; // notice, now we use vectors instead of traditional arrays. 
vector<string> pipes;
 
  double dia_cost_factor=1.; // some factor so that cost=length*diameter*factor (let's keep things simple!)
  double pressue_cost_factor=1000000; // multiplier to 'map' pressure deficiency to cost:
                                      // cost=(pressure deficiency)*pressue_cost_factor
 
 
  /** read the text file specified by filename argument and obtain epanet related parameters */
  void parse_epanet_para(char* filename){
	  cout << "I read epanet related data from "<<filename<<"\n"; // inform the user
	  //open the file
	  ifstream myfile (filename);
	   if(!myfile.is_open()){ // this is important. 
		   cout << "I can not open the file:"<<filename <<" I quit!!\n";
		   exit(1);
	   }
	   myfile >> file; //read the name of the file
	   myfile >> rept; //read the name of the (new) report file
	   myfile >> npipes; //number of pipes
	   for(int i=0;i<npipes;i++){ // read those pipe ids
		   string tmp; // a string grows as needed, so a long id cannot overflow a buffer
		   myfile >>  tmp;
		   pipes.push_back(tmp);
	   }
		myfile >> nnodes; //number of junctions
	   for(int i=0;i<nnodes;i++){//those ids
		   string tmp;
		   myfile >>  tmp;
		   nodes.push_back(tmp);
	   }
  }
 
double real_value(const std::vector<double>& _ind)
{
  // check for sanity 
	if(_ind.size()!=npipes){
		//raise hell
		cout << "Bloody murder!\n";
		cout << "Number of pipes and chromosome size mismatch!\n";
		exit(5);
	}
  // The GA passes the pipe diameters in _ind
  double length=1000; // all pipe lengths are equal
 
  double cost = 0;
 
  cost=pressure_cost(_ind);
  for (unsigned i = 0; i < _ind.size(); i++){
      cost+=_ind[i]*length*dia_cost_factor;
  }
  return cost/10000;
}
 
double pressure_cost(vector<double> _ind){
	int ret;
	double cost=0;
	// set the diameter of each pipe to the value proposed by the GA
	for(int i=0;i<npipes;i++){
		int index=-1;
		char tmp[MAX_LABEL_LEN]; // copy the C++ string into a C-style char array, because
		strcpy(tmp,pipes[i].c_str()); // epanet is written in C and does not accept C++ strings.
		ret=ENgetlinkindex(tmp,&index);
		//cout << "At ENgetlinkindex Epanet returned : "<<ret<<'\n';
		ret=ENsetlinkvalue(index,EN_DIAMETER,(float)_ind[i]);
		//cout << "At ENsetlinkvalue Epanet returned : "<<ret<<'\n';
	}
	// now run the simulation
	ret=ENsolveH();
	//cout << "At ENsolveH Epanet returned : "<<ret<<'\n';
	// read the pressure values
	for(int i=0;i<nnodes;i++){
		int index=-1;
		float value;
		char tmp[MAX_LABEL_LEN]; // convert C++ string to a C-style char array
		strcpy(tmp,nodes[i].c_str());
		ret=ENgetnodeindex(tmp,&index);
		//cout << "At ENgetnodeindex Epanet returned : "<<ret<<'\n';
		ret=ENgetnodevalue(index,EN_PRESSURE,&value);
		//cout << "At ENgetnodevalue Epanet returned : "<<ret<<'\n';
		if(value<10){
			cost+=pressue_cost_factor*(10-value); // if p<10m, add a proportional penalty
		}
	}
	return cost;
}
 
void epanet_init(){
	int ret;
 
	ret=ENopen(file,rept,"");
	cout << "At opening Epanet returned : "<<ret<<'\n';
 
}
 
void epanet_close(){
	int ret;
	ret=ENclose();
	cout << "At closing Epanet returned : "<<ret<<'\n';
 
}
RealEA.cpp
#define _CRT_SECURE_NO_DEPRECATE
//above is to get rid of deprecation warnings of Microsoft compiler. Needed because we use strcpy() function. 
#include <iostream>
#include <es/make_real.h>
#include "real_value.h"
#include <apply.h>
 
using namespace std;
typedef eoReal<eoMinimizingFitness> EOT;
void print_values(eoPop<EOT> pop);
int main_function(int argc, char* argv[]);
 
/** Notice that we have moved everything that was previously in main() to 
main_function. 
Now before GA related stuff is handled (by main_function), 
We process the command argument list. Unlike the previous case, now the first argument, i.e. 
the filename of the EPAnet related parameters, is mandatory. 
Then we copy the rest of the arguments in argv to a new array argv_ and pass it to main_function. 
From the GA viewpoint, nothing has changed. It receives an argument array. If there are no arguments in it, the GA will 
run with default parameters. Otherwise it will parse the argument array. 
The first command line argument is passed separately to the parse_epanet_para function.
*/
int main(int argc,char *argv[]){
       /* argv[0] is always the name of the program. So, to run properly the program should have 
       length of argv (i.e. argc) >=2 
       If this is not the case, provide some help. */
	if(argc<2){// no arguments provided at the command line
		cout << "Usage: " << argv[0] << " <epanet_related_datafile> <EO related arguments ...>\n";
		cout << "Format of epanet_related_datafile as follows.\n";
		cout << "<network file name>\n";
		cout << "<file name for the report to write into>\n";
		cout << "<no of pipes>\n";
		cout << "<PIPE_ID1>\n";
		cout << "...\n";
		cout << "<no of nodes>\n";
		cout << "<NODE_ID1>\n";
		cout << "...\n";
		exit(1);
	}
	char* filename=argv[1]; // separately copy argv[1] (first argument) to variable filename
	char* argv_[MAX_PATH_LEN]; 
	argv_[0]=argv[0];       // argv[0] is the calling program name, copy this as is. 
	for(int i=1;i<argc-1;i++){ // then copy the rest (argv[2], argv[3], ...) of arguments to new array
		argv_[i]=argv[i+1];
	}
	argc--; // argc should be one less than before
	//now parse the parameter file straight away! 
	parse_epanet_para(filename);
	return main_function(argc,argv_); // now call main_function with new argc, argv_ pair. 
}
int main_function(int argc, char* argv[])
{
 
  try
  {
  // first initialize the Epanet
  epanet_init();
 
  eoParser parser(argc, argv);  // for user-parameter reading
  eoState state;   
  eoEvalFuncPtr<EOT, double, const std::vector<double>&> 
               mainEval( real_value );
  eoEvalFuncCounter<EOT> eval(mainEval);
  eoRealInitBounded<EOT>& init = make_genotype(parser, state, EOT());
  // Build the variation operator (any seq/prop construct)
  eoGenOp<EOT>& op = make_op(parser, state, init);
  // initialize the population - and evaluate
  // yes, this is representation independent once you have an eoInit
  eoPop<EOT>& pop   = make_pop(parser, state, init);
  // stopping criteria
  eoContinue<EOT> & term = make_continue(parser, state, eval);
  // output
  eoCheckPoint<EOT> & checkpoint = make_checkpoint(parser, state, eval, term);
  // algorithm (need the operator!)
  eoAlgo<EOT>& ea = make_algo_scalar(parser, state, eval, checkpoint, op);
  // to be called AFTER all parameters have been read!!!
  make_help(parser);
 
  // evaluate initial population AFTER help and status in case it takes time
  apply<EOT>(eval, pop);
  // print it out
  cout << "Initial Population\n";
  pop.sortedPrintOn(cout);
  cout << endl;
 
  cin.get();
  run_ea(ea, pop); // run the ea
 
  cout << "Final Population\n";
  pop.sortedPrintOn(cout);
  cout << endl;
 
  // close Epanet
  epanet_close();
 
  }
  catch(exception& e)
  {
    cout << e.what() << endl;
    return 1; // report failure to the caller
  }
	return 0; // success
}
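The argument handling in main() can be isolated into a small function, which makes it easier to verify. The sketch below is one possible way to do it, using a std::vector instead of a fixed-size array so that no upper limit on the number of arguments is needed. The function name is hypothetical.

```cpp
#include <vector>

// Builds the argument list for main_function(): keeps argv[0] (program name),
// drops argv[1] (the Epanet parameter file) and copies everything after it.
// Assumes argc >= 2, which main() has already checked.
std::vector<char*> drop_first_argument(int argc, char* argv[])
{
    std::vector<char*> out;
    out.push_back(argv[0]);
    for (int i = 2; i < argc; i++)
        out.push_back(argv[i]);
    return out;
}
```

If args is the returned vector, main() could then call main_function((int)args.size(), &args[0]).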