C++ has two different data types that are used to store text. The first is the char data type, which stores a single character. The second is the string data type, which stores a sequence of characters. In this chapter we will start with the char data type before learning about strings.
A character value is a single letter or digit or other symbol enclosed in single quotes, like 'a' or '5' or '%'. To store a character value, we need to use the data type char:
Exactly one character must be written in the single quotes for a char. An empty char '' is illegal, and placing multiple characters in single quotes, like 'hello', does not produce a valid char either (compilers typically warn about it). (We need to use strings for 0 or 2+ characters.)
The processor in your computer does not really work with characters. At a hardware level, everything is just a number (or to be more precise, a sequence of 1s and 0s). So to store and work with something like a character, it needs to be converted to a number.
Each character in C++ has a corresponding number, which is called its ASCII value. For example, the ASCII value for the letter 'A' is 65, and the ASCII value for the letter 'a' is 97. You can see the ASCII values for all characters in an ASCII table. The standard ASCII table has 128 entries (\(2^7\)), numbered 0 to 127, and extended versions of the table use up to 256 entries (\(2^8\)), which is the number of different values 8 bits can represent. So, in C++, a char occupies 1 byte (8 bits on virtually all modern systems) in memory. On most systems char is a signed type, which means it can store the values -128 to 127.
Because chars are stored as numbers, it is possible to do math with them. The ASCII value for 'A' is 65. If you add one to that you get 66, which is the ASCII value for 'B'. It is even possible to assign char variables numeric values:
Just because you can do something does not mean you should. Using numeric values like 67 instead of char literals like 'C' is bad practice. No one reading your code should have to remember what character has the ASCII code 94.
The numeric aspect of chars explains one confusing aspect of working with them. Although you can compare chars using relational operators, the results are not always what you might expect. Examine the following. (Remember 1 is true and 0 is false.)
As long as you are comparing two upper-case letters or two lower-case letters, it is safe to assume that < or > will do a logical alphabetical order comparison. But when the two characters are of different kinds (say, an upper-case letter and a lower-case letter, or a letter and a digit), the result only reflects their positions in the ASCII table, so you can't rely on it to match what you would expect.
The ASCII character set covers the characters used in English, and extended versions of it add the accented letters used in many European languages. But to represent characters from other languages, symbols, and things like emojis, we need a bigger table of characters. Unicode is a standard for representing characters in alphabets like Cyrillic and Greek, non-alphabetic writing systems like Chinese, and various symbols. You can read more about it at the Unicode website (https://unicode.org/).
C++ provides a data type wchar_t (wide character type) for storing Unicode values. We will not cover it in this book, but pretty much anything you can do with a char you can do with a wchar_t. Do a search for "C++ wide character" to learn more if you are interested.