Introduction to Number Systems and Computer Data Representation Methods
CSCI 1500 -
Introduction to Programming & Problem Solving
Data Representation Methods
for Computer Storage of Data
Representing Characters
American Standard Code for Information Interchange
(ASCII)
Most common coding
scheme for representing characters: ASCII code
1.
Originally,
a 7-bit code to represent upper- and lowercase letters (e.g., ‘a’ is 01100001),
punctuation symbols, digits, and some control character information (carriage
return, etc.).
2.
Has
been extended to an 8-bit code - can represent 256 different characters.
3.
Information
stored using ASCII code sometimes referred to as text format.
The following table gives ASCII code representations for
some characters.
ASCII Code for Some Characters
Key ASCII Code Key ASCII Code Key ASCII Code
+ 00101011 A 01000001 a 01100001
, 00101100 B 01000010 b 01100010
- 00101101 C 01000011 c 01100011
. 00101110 D 01000100 d 01100100
/ 00101111 E 01000101 e 01100101
0 00110000 F 01000110 f 01100110
1 00110001 G 01000111 g 01100111
2 00110010 H 01001000 h 01101000
3 00110011 I 01001001 I 01101001
4 00110100 J 01001010 j 01101010
5 00110101 K 01001011 k 01101011
6 00110110 L 01001100 l 01101100
7 00110111 M 01001101 m 01101101
8 00111000 N 01001110 n 01101110
9 00111001 O 01001111 o 01101111
NOTE: Other coding schemes are used for
representing character data; e.g., EBCDIC is another coding scheme sometimes
used on larger computers; files created by specific software packages use their
own coding scheme for storing information; Unicode is a 16-bit code developed
by a group of manufacturers.
Representing Numbers
Numbers are typically
represented in a binary format, not ASCII.
This is a more efficient use of storage. For example, in text mode, five bytes of storage would be
required to store the number 32000 (one byte for each character). But, using the typical coding scheme for
integers, this number can be stored using only two bytes.
The binary number system is a way of
representing numbers using only 0’s and 1’s.
As noted before, it is useful for representing the states of the
circuits that make up RAM - “off” and “on” circuit states could be represented
in binary as 0’s and 1’s.
The hexadecimal (hex) number system is also
useful for this discussion. There is a
close relationship between the binary and hex systems. We will see that any collection of four
binary digits can be uniquely represented as one hex digit. So, hex digits are a convenient way to
represent long strings of binary digits (e.g., 32 bits could be represented
using only 8 hex digits).
Binary and Hexadecimal
Number Systems
The decimal number system is
a base 10 number system. There are 10
digits used (0-9) and the digits of any decimal number, going from right to
left, represents the number of each successively higher power of ten that is
part of the original decimal number.
For example: The decimal number
12,345 means 5 1’s, 4 10’s, 3 100’s, 2 1000’s, and 1 10,000. Note:
,
and
.
The binary number system is a base 2 number
system. There are only two digits used
(0 and 1) and the digits of any binary number, going from right to left,
represents the number of each successively higher power of two that is part of
the original binary number. For
example: The binary number 11001 means
one 1, no 2’s, no 4’s, one 8, and one 16 (adds up to 25, decimal). Note:
![]()
![]()
and
.
The hexadecimal number system is a base 16
number system. There are 16 digits used
(0-9, and A,B,C,D,E,F), where A-F are digits used to represent the decimal
values 10-15. Each digit of a hex
number, going from right to left, represents the number of each successively
higher power of sixteen that is part of the original hex number. For example: The hex number 2ADF means F (decimal 15) 1’s, D (decimal 13)
16’s, A (decimal 10) 256’s, and two 4096’s (adds up to 10,975, decimal). Note:
and
.
The following table
shows the equivalent representations of decimal 0-15 in both binary and hex.
Decimal Binary Hexadecimal
0 0000 0
1 0001 1
2 0010 2
3 0011 3
4 0100 4
5 0101 5
6 0110 6
Decimal Binary Hexadecimal
7 0111 7
8 1000 8
9 1001 9
10 1010 A
11 1011 B
12 1100 C
13 1101 D
14 1110 E
15 1111 F
Representing Binary Data in Hex Form
As mentioned earlier,
it is common to represent binary numbers in groups of four bits, since it is
then easy to refer to the binary number in hex format.
Example: Represent the following 32-bit binary number
in hex form.
Binary number: 01011101001011110110101101111000
Binary number in
groups of 4 bits: 0101 1101
0010 1111 0110
1011 0111 1000
Binary number in hex
format: 5 D 2 F
6 B 7 8
Number System Conversions
Converting from binary to decimal:
To convert a binary
number to its decimal equivalent, use the positional meaning of each digit of
the number, as discussed earlier.
Example: Convert
0101 1101 to decimal.
0101 1101 = 1 + 0*2 + 1*4 + 1*8 + 1*16 + 0*32 +
1*64 + 0*128 = 93
Converting from decimal to binary:
To convert a decimal
number to its binary equivalent, repeatedly divide the decimal number by two,
until the quotient is zero - keep track of the remainder digit at each division
step because the remainder digits are
the binary digits of the binary number representation going from right-to-left.
Example: Convert 567 to binary. Note:
In the following steps, “rem” stands for remainder.
567/2 =
283 rem 1
283/2 =
141 rem 1
141/2
= 70 rem 1
70/2 =
35 rem 0
35/2 =
17 rem 1
17/2 =
8 rem 1
8/2 =
4 rem 0
4/2 =
2 rem 0
2/2 =
1 rem 0
1/2 =
0 rem 1
So, 567 =
10 0011 0111
Another, easier, way
to convert from decimal to binary, is to first convert from decimal to
hexadecimal, and then from hex to binary, as outlined next.
Converting from decimal to hexadecimal:
To convert from
decimal to its equivalent hexadecimal number, repeatedly divide the decimal
number by 16, until the quotient is zero - again, keep track of the remainder
digit at each division step because the remainder digits are the hex digits of
the hex number representation going from right-to-left.
Example: Convert 765 to hex
765/16 =
47 rem 13 (hex digit D)
47/16 =
2 rem 15 (hex digit F)
2/16 =
0 rem 2
So, 765 =
2FD hex.
Now, to convert a
decimal number to binary by using the decimal’s hex equivalent, convert each
hex digit to its equivalent four binary digits.
Example: Use the above hex equivalent of the decimal
765 to convert 765 to binary.
765 = 2FD
hex = 0010 1111 1101 binary
Since the
hex digits convert to binary as 2 = 0010, F = 1111, and D = 1101.
Converting from hexadecimal to decimal:
To convert a
hexadecimal number to its decimal equivalent, again use the positional meaning
of each digit of the hex number, as discussed earlier.
Example: Convert
the hex number D2F6 to decimal.
D2F6 = 6 +
F*16 + 2*162 + D*163 = 6 + 15*16 + 2*256 + 13*4096 =
54006
Binary Addition
To add two binary
numbers, add bitwise and carry if necessary.
This is similar to addition in
decimal, but in general, there is lots of carrying with binary addition.
Example: Add 0010 1100 1110 and 0011 0111 1010.
Carry bit: 0111 1111 110
0010 1100
1110
+ 0011
0111 1010
0110 0100
1000
Storing Integers
Negatives and Two’s Complement
For all binary numbers
we have just discussed, we discussed only the representation of positive
integer quantities. Next we will
discuss a way to represent negative integers, also.
A common method used
for the representation of negative integers is two’s complement representation.
First, positive integers are represented in two’s complement in the
binary form that we have already discussed (although the most significant,
left-most bit will not used as a digit of significance).
To form the two’s
complement representation of a negative of a positive integer, do this:
1.
Form
the one’s complement of the integer - change all 1 bits to 0’s and all 0 bits
to 1’s.
2.
Add
1 to the result.
Example: Find the two’s complement negative of
decimal 61 (use 8 bits to represent the number).
1.
Decimal
61 = 0011 1101 binary.
2.
Form
complement: 1100 0010
3.
Add
1 to the result: 1100 0010 + 1 = 1100
0011
Comments:
1.
When
representing integers as two’s complement integers, the leftmost bit represents the sign of the number - it is not a digit
of significance (i.e., it is not used to determine the value of the
integer). If the leftmost bit is a 1,
then the number is negative and if the leftmost bit is 0, then the number is
non-negative. Two’s complement numbers
are some times referred to as signed
binary numbers. Binary numbers for
which all digits are significant are some times referred to as unsigned binary numbers. What this means is you need to be aware of
the context of the application since the same binary number can represent
different quantities depending on whether it is signed or unsigned.
Example: There are eight different binary
representations that can be formed using three bits. Here are all of the decimal equivalent values for these eight
binary representations assuming that the representations are 1) signed (two’s
complement) values and 2) unsigned values.
Binary
number Unsigned value Signed (two’s complement) value
000 0 0
001 1 1
010 2 2
011 3 3
100 4 -4
101 5 -3
110 6 -2
111 7 -1
2.
As
can be seen from the above example, when using n bits to represent an unsigned
value, the range of possible values are 0 to 2n -1; when using n
bits to represent a signed value,
the range of possible values are -2n-1 to 2n-1
-1. A common word size for storing
an integer value in the C++/C programming language is 16-bits. This means that
the range of values that can be stored in one 16-bit unsigned integer word
would be 0 to 216 -1 (0 to 65,535), and which can be
stored in one 16-bit signed integer word would be -215 to 215 -1
(-32,768 to 32,767).
3.
Why
is the two’s complement representation used to represent negative
integers? Note what happens when you
add a positive binary number (8-bit) to its two’s complement:
Example: Add 0011 1101
+ 1100 0011
1 0000 0000
The result has the low
eight bits all zero, with a one carry bit.
If the carry bit is dropped, the result of the addition is zero. This means that the positive binary number added
to its two’s complement results in zero - that implies that the two’s
complement of a number is its negative, as we previously asserted.
4.
Only
two circuits are required to implement the four arithmetic operations of
integer arithmetic in a computer - addition and negation circuits. Subtraction is implemented as “adding the
opposite”. Multiplication is repeated
addition; division is repeated subtraction (6/2 = number of times 2 can be
subtracted from 6 before the result becomes negative).
Example: For a computer with a 16-bit integer word
size, how would the computer do this subtraction problem: Find
1234 - 456
1. The computer would first convert the
decimals to binary (will use 16 bits here):
1234 =
0000 0100 1101 0010
456
= 0000 0001 1100 1000
2. Next, the computer would form the two’s
complement of 456:
-456 =
1111 1110 0011 1000
3. Then, the computer would add 1234 and
(-456):
1234 =
0000 0100 1101 0010
+ -456 = 1111 1110 0011 1000
778
= 0000 0011 0000 1010
5.
Overflow - it is possible to perform
an arithmetic operation and obtain a result out of the range of the integer
representation you are using. The
result will be one of the possible signed two’s complement values, but will not
be the correct result.
Example: Using 3-bit
two’s complement representations, 2 + 3 = 010 + 011 = 101 = -3 (not 5).
This type of error is
called overflow. To avoid, the programmer can use double
precision values - these are two’s complement representations of the integers
that use larger size words. (For
example, in C, there is a way of storing integers in 4-byte words or 2-byte
words.)
Storing Floating Point Numbers
Fractions in Binary
Recall the digit
location meaning of the digits in a decimal fraction:
1.234
means 1 + 2 (1/10) + 3(1/100) + 4(1/1000) =
+
+
+![]()
Each successive digit
to the right of the decimal represents the number of a power of 10 to the
negative of its position number that is part of the entire fraction.
For binary fractions,
the idea is similar. Each successive
digit to the right of the radix point (like the decimal point, but specifying
the beginning of fractional digits in a different base number) represents the
number of a power of two to the negative of its position number that is part of
the entire fraction. For example:
= 1/2 = 0.1 (binary)
= 1/4 = 0.01 (binary)
= 1/8 = 0.001 (binary)
= 1/16 = 0.0001 (binary)
= 1/32 = 0.00001 (binary)
So, 1.1011 binary = 1 + 1/2 + 1/8 + 1/16 = 27/16 decimal = 1.6875
To add binary
fractions, align the radix points and add bitwise, as shown before with
non-fractional binary numbers.
Example: Add 1.1011 and 1.1001
Carry bit: 11 011
1.1011
+
1.1001
11.0100
That is, 1.6875
(1.1011 binary) + 1.5625 (1.1001 binary) = 3.25 (11.0100 binary).
The last section gave
the common method for storing signed integer numbers. In the next section, we
discuss how floating point numbers (i.e., real numbers - numbers containing a
fractional part) can be represented in a binary format.
Floating-Point Numbers -
IEEE Single-Precision Standard
A floating-point
number is a number with a decimal point, such as 8.3194, -90.3314,
, the last number
being given in scientific notation.
Scientific notation consists of a mantissa,
a base, and an exponent. Consider the
example scientific notation number
. The mantissa is the
first number (6.34 in the example) - it must be a number whose absolute value
is between one and ten (can equal one, cannot equal ten). The base refers to the base of the number
multiplied by the mantissa - for scientific notation, the base is 10. The exponent refers to the power applied to
the base (20 in the example).
Floating point representations in computers are
implementation-dependent. One commonly
used standard is the IEEE
single-precision standard. It
provides a method of representing any floating point number by expressing it in
a binary “scientific notation” form; i.e., a floating point number represented
using this standard is first expressed as a mantissa times a power of two - the
mantissa will be a number whose absolute value is between one and two (can
equal one, cannot equal two).
IEEE
Single-Precision Standard
The following describes the format of a floating
point number representation according to the standard. Note:
In this description, the “first” bit is the left-most bit.
·
A
floating point value is represented in one 32-bit word.
·
The
first bit s is the sign bit (0 means
+, 1 means -).
·
The
following eight bits are interpreted as an unsigned
integer e (range of values is 0 to
255), which is used in representing the exponent. If e is not zero, then
exponent equals e - 127 and if e is zero, then exponent equals -126.
·
The
remaining 23 bits
are interpreted as the value m =
in base 2; i.e., m =
. The value m is used in representing the
mantissa. If e is not zero, then the mantissa is 1 + m; if e equals zero, then
the mantissa is equal to m.
So, to summarize, a given number is represented as
the following:
·
If
, the number is represented as ![]()
·
If
e = 0, the number is represented as ![]()
Example:
Given that the following 32-bit binary word represents a floating-point
number using the IEEE single-precision standard, determine what that number is in
decimal.
1 10101100
1001 0100 0000 0000 0000 000
The sign bit is 1 so the number is negative.
The value of e,
from the next 8 bits, is 1010 1100.
This represents the unsigned integer l72 decimal (=AC hex = 12+10(16)
decimal = 172). So the exponent of the
number is 172-127 = 45 (decimal).
The value of m,
from the next 23 bits, is 1001 0100 0000 0000 0000 000. This represents the decimal value =
= 0.578125. So, the mantissa of the number is 1.578125.
So, the decimal value of the number is then: -![]()
Round-Off Errors
Not all decimal floating point numbers can be represented exactly in
this binary format.
Example: Suppose we want to
store the floating point value 1.00000001 using the standard just described. The fractional part of this number,
is too small to be represented since
. If you try to store
this value there will be a round-off error introduced.
Note: Round-off error also
occurs for any rational numbers that cannot be expressed exactly in decimal,
much less than in just binary. For
example: 1/3
Solution to the round-off
problem: Use more space to store the
floating point value. There is an IEEE
Double Precision Standard that defines a similar method for storing floating
point numbers that uses 8 bytes instead of 4 bytes. Depending on the programming language used, there can be even
higher precision floating point formats used.
Representing Other Types of
Data
New applications
involve data representations of pictures, sounds, and video. Methods of representation of these types of
data are not standardized yet.
Ultimately, the data is again stored in some binary format.
References
Computer
Science: An Overview, Fifth Edition, by J. G. Brookshear (Addison-Wesley, 1997)
An
Introduction to Computer Science Using C, by R. Eggen and M. Eggen (PWS Publishing Company,
1994).