Introduction to Number Systems and Computer Data Representation Methods

 

CSCI 1500 - Introduction to Programming & Problem Solving

 

Data Representation Methods for Computer Storage of Data

 

Representing Characters

            American Standard Code for Information Interchange (ASCII)

            Most common coding scheme for representing characters:  ASCII code

1.      Originally, a 7-bit code to represent upper- and lowercase letters (e.g., ‘a’ is 01100001), punctuation symbols, digits, and some control character information (carriage return, etc.).

2.      Has been extended to an 8-bit code - can represent 256 different characters.

3.      Information stored using ASCII code sometimes referred to as text format.

 

            The following table gives ASCII code representations for some characters.

 

                             ASCII Code for Some Characters

            Key      ASCII Code                Key      ASCII Code                Key      ASCII Code

            +          00101011                    A         01000001                    a          01100001

            ,           00101100                    B          01000010                    b          01100010

            -           00101101                    C         01000011                    c          01100011

            .           00101110                    D         01000100                    d          01100100

            /           00101111                    E          01000101                    e          01100101

            0          00110000                    F          01000110                    f           01100110

            1          00110001                    G         01000111                    g          01100111

            2          00110010                    H         01001000                    h          01101000

            3          00110011                    I           01001001                    I           01101001

            4          00110100                    J           01001010                    j           01101010

            5          00110101                    K         01001011                    k          01101011

            6          00110110                    L          01001100                    l           01101100

            7          00110111                    M         01001101                    m         01101101

            8          00111000                    N         01001110                    n          01101110

            9          00111001                    O         01001111                    o          01101111

                                   

            NOTE:  Other coding schemes are used for representing character data; e.g., EBCDIC is another coding scheme sometimes used on larger computers; files created by specific software packages use their own coding scheme for storing information; Unicode is a 16-bit code developed by a group of manufacturers.

 

Representing Numbers

            Numbers are typically represented in a binary format, not ASCII.  This is a more efficient use of storage.  For example, in text mode, five bytes of storage would be required to store the number 32000 (one byte for each character).  But, using the typical coding scheme for integers, this number can be stored using only two bytes.

 

            The binary number system is a way of representing numbers using only 0’s and 1’s.  As noted before, it is useful for representing the states of the circuits that make up RAM - “off” and “on” circuit states could be represented in binary as 0’s and 1’s.

 

            The hexadecimal (hex) number system is also useful for this discussion.  There is a close relationship between the binary and hex systems.  We will see that any collection of four binary digits can be uniquely represented as one hex digit.  So, hex digits are a convenient way to represent long strings of binary digits (e.g., 32 bits could be represented using only 8 hex digits).

 

Binary and Hexadecimal Number Systems

            The decimal number system is a base 10 number system.  There are 10 digits used (0-9) and the digits of any decimal number, going from right to left, represents the number of each successively higher power of ten that is part of the original decimal number.  For example:  The decimal number 12,345 means 5 1’s, 4 10’s, 3 100’s, 2 1000’s, and 1 10,000. Note: , and .

 

            The binary number system is a base 2 number system.  There are only two digits used (0 and 1) and the digits of any binary number, going from right to left, represents the number of each successively higher power of two that is part of the original binary number.  For example:  The binary number 11001 means one 1, no 2’s, no 4’s, one 8, and one 16 (adds up to 25, decimal).  Note:    and .

 

            The hexadecimal number system is a base 16 number system.  There are 16 digits used (0-9, and A,B,C,D,E,F), where A-F are digits used to represent the decimal values 10-15.  Each digit of a hex number, going from right to left, represents the number of each successively higher power of sixteen that is part of the original hex number.  For example:  The hex number 2ADF means F (decimal 15) 1’s, D (decimal 13) 16’s, A (decimal 10) 256’s, and two 4096’s (adds up to 10,975, decimal).  Note:    and .

 

            The following table shows the equivalent representations of decimal 0-15 in both binary and hex.

            Decimal            Binary        Hexadecimal

               0                   0000                0

               1                   0001                1

               2                   0010                2

               3                   0011                3

               4                   0100                4

               5                   0101                5

               6                   0110                6

            Decimal            Binary        Hexadecimal

               7                   0111                7

               8                   1000                8

               9                   1001                9

               10                 1010                A

               11                 1011                B

               12                 1100                C

               13                 1101                D

               14                 1110                E

               15                 1111                F

 

            Representing Binary Data in Hex Form

            As mentioned earlier, it is common to represent binary numbers in groups of four bits, since it is then easy to refer to the binary number in hex format. 

 

            Example:  Represent the following 32-bit binary number in hex form.

            Binary number:  01011101001011110110101101111000

            Binary number in groups of 4 bits:  0101  1101  0010  1111  0110  1011  0111  1000

            Binary number in hex format:                 5       D        2       F       6        B       7        8

 

Number System Conversions

            Converting from binary to decimal:

            To convert a binary number to its decimal equivalent, use the positional meaning of each digit of the number, as discussed earlier.

            Example:  Convert  0101 1101  to decimal.

                        0101  1101 = 1 + 0*2 + 1*4 + 1*8 + 1*16 + 0*32 + 1*64 + 0*128 = 93

 

            Converting from decimal to binary:

            To convert a decimal number to its binary equivalent, repeatedly divide the decimal number by two, until the quotient is zero - keep track of the remainder digit at each division step because the  remainder digits are the binary digits of the binary number representation going from right-to-left.

            Example:  Convert 567 to binary.  Note:  In the following steps, “rem” stands for remainder.

                        567/2 = 283 rem 1

                        283/2 = 141 rem 1

                        141/2 =   70 rem 1

                          70/2 =   35 rem 0

                          35/2 =   17 rem 1

                          17/2 =     8 rem 1

                            8/2 =     4 rem 0

                            4/2 =     2 rem 0

                            2/2 =     1 rem 0

                            1/2 =     0 rem 1

                        So, 567 = 10 0011 0111

 

            Another, easier, way to convert from decimal to binary, is to first convert from decimal to hexadecimal, and then from hex to binary, as outlined next.

            Converting from decimal to hexadecimal:

            To convert from decimal to its equivalent hexadecimal number, repeatedly divide the decimal number by 16, until the quotient is zero - again, keep track of the remainder digit at each division step because the remainder digits are the hex digits of the hex number representation going from right-to-left.

            Example:  Convert 765 to hex

                        765/16 = 47 rem 13 (hex digit D)

                          47/16 =  2 rem 15 (hex digit F)

                            2/16 =  0 rem 2

                        So, 765 = 2FD hex.

 

            Now, to convert a decimal number to binary by using the decimal’s hex equivalent, convert each hex digit to its equivalent four binary digits.

            Example:  Use the above hex equivalent of the decimal 765 to convert 765 to binary.

                        765 = 2FD hex = 0010  1111  1101 binary

                        Since the hex digits convert to binary as 2 = 0010, F = 1111, and D = 1101.

 

            Converting from hexadecimal to decimal:

            To convert a hexadecimal number to its decimal equivalent, again use the positional meaning of each digit of the hex number, as discussed earlier.

            Example:  Convert  the hex number D2F6  to decimal.

                        D2F6 = 6 + F*16 + 2*162 + D*163 = 6 + 15*16 + 2*256 + 13*4096 = 54006

 

            Binary Addition

            To add two binary numbers, add bitwise and carry if necessary.  This is  similar to addition in decimal, but in general, there is lots of carrying with binary addition.

            Example:  Add 0010 1100 1110 and 0011 0111 1010.

 

                  Carry bit:    0111  1111  110

                                    0010  1100  1110

                        +          0011  0111  1010

                                    0110  0100  1000

 

Storing Integers

            Negatives and Two’s Complement

            For all binary numbers we have just discussed, we discussed only the representation of positive integer quantities.  Next we will discuss a way to represent negative integers, also.

 

            A common method used for the representation of negative integers is two’s complement representation.  First, positive integers are represented in two’s complement in the binary form that we have already discussed (although the most significant, left-most bit will not used as a digit of significance).

 

            To form the two’s complement representation of a negative of a positive integer, do this:

1.      Form the one’s complement of the integer - change all 1 bits to 0’s and all 0 bits to 1’s.

2.      Add 1 to the result.

 

            Example:  Find the two’s complement negative of decimal 61 (use 8 bits to represent the                                    number).

1.      Decimal 61 = 0011 1101 binary.

2.      Form complement:  1100 0010

3.      Add 1 to the result:  1100 0010 + 1 = 1100 0011

 

            Comments:

1.      When representing integers as two’s complement integers, the leftmost bit represents the sign of the number - it is not a digit of significance (i.e., it is not used to determine the value of the integer).  If the leftmost bit is a 1, then the number is negative and if the leftmost bit is 0, then the number is non-negative.  Two’s complement numbers are some times referred to as signed binary numbers.  Binary numbers for which all digits are significant are some times referred to as unsigned binary numbers.  What this means is you need to be aware of the context of the application since the same binary number can represent different quantities depending on whether it is signed or unsigned.

 

            Example:  There are eight different binary representations that can be formed using three bits.  Here are all of the decimal equivalent values for these eight binary representations assuming that the representations are 1) signed (two’s complement) values and 2) unsigned values.

                        Binary number              Unsigned value Signed (two’s complement) value

                                    000                              0                                  0

                                    001                              1                                  1

                                    010                              2                                  2

                                    011                              3                                  3

                                    100                              4                                  -4

                                    101                              5                                  -3

                                    110                              6                                  -2

                                    111                              7                                  -1

 

2.      As can be seen from the above example, when using n bits to represent an unsigned value, the range of possible values are 0 to 2n -1; when using n bits to represent a signed value, the range of possible values are  -2n-1  to  2n-1 -1.  A common word size for storing an integer value in the C++/C programming language is 16-bits. This means that the range of values that can be stored in one 16-bit unsigned integer word would be 0 to  216 -1 (0 to 65,535), and which can be stored in one 16-bit signed integer word would be -215  to 215 -1 (-32,768 to 32,767).

 

3.      Why is the two’s complement representation used to represent negative integers?  Note what happens when you add a positive binary number (8-bit) to its two’s complement:

                        Example:   Add 0011  1101

                                                        +  1100  0011

                                                        1  0000  0000

            The result has the low eight bits all zero, with a one carry bit.  If the carry bit is dropped, the result of the addition is zero.  This means that the positive binary number added to its two’s complement results in zero - that implies that the two’s complement of a number is its negative, as we previously asserted.

 

4.      Only two circuits are required to implement the four arithmetic operations of integer arithmetic in a computer - addition and negation circuits.  Subtraction is implemented as “adding the opposite”.  Multiplication is repeated addition; division is repeated subtraction (6/2 = number of times 2 can be subtracted from 6 before the result becomes negative).

            Example:  For a computer with a 16-bit integer word size, how would the computer do this subtraction problem:              Find 1234 - 456

                        1.  The computer would first convert the decimals to binary (will use 16 bits here):

                                    1234  =  0000 0100 1101 0010

                                      456  =  0000 0001 1100 1000

                        2.  Next, the computer would form the two’s complement of 456:

                                    -456  =  1111 1110 0011 1000

                        3.  Then, the computer would add 1234 and (-456):

                                    1234  =  0000 0100 1101 0010

                             +     -456  =              1111 1110 0011 1000

                                     778  =   0000 0011 0000 1010

 

5.      Overflow - it is possible to perform an arithmetic operation and obtain a result out of the range of the integer representation you are using.  The result will be one of the possible signed two’s complement values, but will not be the correct result.

 

            Example: Using 3-bit two’s complement representations, 2 + 3 = 010 + 011 = 101 = -3 (not 5).

 

            This type of error is called overflow.  To avoid, the programmer can use double precision values - these are two’s complement representations of the integers that use larger size words.  (For example, in C, there is a way of storing integers in 4-byte words or 2-byte words.)

 

Storing Floating Point Numbers

          Fractions in Binary

            Recall the digit location meaning of the digits in a decimal fraction: 

                        1.234 means 1 + 2 (1/10) + 3(1/100) + 4(1/1000) =   

                                                +++

            Each successive digit to the right of the decimal represents the number of a power of 10 to the negative of its position number that is part of the entire fraction.

 

            For binary fractions, the idea is similar.  Each successive digit to the right of the radix point (like the decimal point, but specifying the beginning of fractional digits in a different base number) represents the number of a power of two to the negative of its position number that is part of the entire fraction.  For example:

                        = 1/2 = 0.1 (binary)

                        = 1/4 = 0.01 (binary)

                        = 1/8 = 0.001 (binary)

                        = 1/16 = 0.0001 (binary)

                        = 1/32 = 0.00001 (binary)

 

            So,  1.1011 binary = 1 + 1/2 + 1/8 + 1/16  = 27/16 decimal = 1.6875

 

            To add binary fractions, align the radix points and add bitwise, as shown before with non-fractional binary numbers.

            Example:  Add 1.1011 and 1.1001

                  Carry bit:    11 011

                                      1.1011

                        +            1.1001

                                    11.0100

            That is, 1.6875 (1.1011 binary) + 1.5625 (1.1001 binary) = 3.25 (11.0100 binary).

 

            The last section gave the common method for storing signed integer numbers. In the next section, we discuss how floating point numbers (i.e., real numbers - numbers containing a fractional part) can be represented in a binary format.

 

Floating-Point Numbers - IEEE Single-Precision Standard

A floating-point number is a number with a decimal point, such as 8.3194, -90.3314,  , the last number being given in scientific notation.  Scientific notation consists of a mantissa, a base, and an exponent.  Consider the example scientific notation number .  The mantissa is the first number (6.34 in the example) - it must be a number whose absolute value is between one and ten (can equal one, cannot equal ten).  The base refers to the base of the number multiplied by the mantissa - for scientific notation, the base is 10.  The exponent refers to the power applied to the base (20 in the example).

 

Floating point representations in computers are implementation-dependent.  One commonly used standard is the IEEE single-precision standard.  It provides a method of representing any floating point number by expressing it in a binary “scientific notation” form; i.e., a floating point number represented using this standard is first expressed as a mantissa times a power of two - the mantissa will be a number whose absolute value is between one and two (can equal one, cannot equal two).

IEEE Single-Precision Standard

The following describes the format of a floating point number representation according to the standard.  Note:  In this description, the “first” bit is the left-most bit.

·        A floating point value is represented in one 32-bit word.

·        The first bit s is the sign bit (0 means +, 1 means -).

·        The following eight bits are interpreted as an unsigned integer e (range of values is 0 to 255), which is used in representing the exponent.  If e is not zero, then exponent equals e - 127 and if e is zero, then exponent equals -126.

·        The remaining 23 bits  are interpreted as the value m =  in base 2; i.e., m = .  The value m is used in representing the mantissa.  If e is not zero, then the mantissa is 1 + m; if e equals zero, then the mantissa is equal to m.

 

So, to summarize, a given number is represented as the following:

 

·        If , the number is represented as

·        If e = 0, the number is represented as

 

Example:  Given that the following 32-bit binary word represents a floating-point number using the IEEE single-precision standard, determine what that number is in decimal.

 

                                                1  10101100  1001 0100 0000 0000 0000 000

 

The sign bit is 1 so the number is negative.

The value of e, from the next 8 bits, is 1010 1100.  This represents the unsigned integer l72 decimal (=AC hex = 12+10(16) decimal = 172).  So the exponent of the number is 172-127 = 45 (decimal).

 

The value of m, from the next 23 bits, is 1001 0100 0000 0000 0000 000.  This represents the decimal value =  = 0.578125.  So, the mantissa of the number is 1.578125.

 

So, the decimal value of the number is then:  -

 

Round-Off Errors

Not all decimal floating point numbers can be represented exactly in this binary format.

Example:  Suppose we want to store the floating point value 1.00000001 using the standard just described.  The fractional part of this number, is too small to be represented since .  If you try to store this value there will be a round-off error introduced.

Note:  Round-off error also occurs for any rational numbers that cannot be expressed exactly in decimal, much less than in just binary.  For example:  1/3

 

Solution to the round-off problem:  Use more space to store the floating point value.  There is an IEEE Double Precision Standard that defines a similar method for storing floating point numbers that uses 8 bytes instead of 4 bytes.  Depending on the programming language used, there can be even higher precision floating point formats used. 

 

Representing Other Types of Data

            New applications involve data representations of pictures, sounds, and video.  Methods of representation of these types of data are not standardized yet.  Ultimately, the data is again stored in some binary format.

 

References

 

Computer Science: An Overview, Fifth Edition, by J. G. Brookshear (Addison-Wesley, 1997)

 

An Introduction to Computer Science Using C, by R. Eggen and M. Eggen (PWS Publishing Company, 1994).