IT Zone
How to Encode and Decode Information
Learn to convert between various codes for digital data transmission

C.T. Bhunia

Digital data is represented, stored and exchanged in the form of codes written as strings of 0’s and 1’s, known as bits. Encoding and decoding is the art of transforming one representation into another and back again. In the accounts section of an office, for example, employee codes are used for salary disbursement. The code is a unique number assigned to the employee’s name; decoding is retrieving the employee’s details from the code.
In computer engineering, encoding and decoding are used to transform numerical data from one system to another for a specific use. There are different kinds of binary codes, such as weighted and non-weighted codes, reflected codes, sequential codes, alphanumeric codes, and error detection and correction codes.
Weighted code
Converting decimal data into binary and vice versa is essential for computing, as the computer recognises only binary data, derived from the ‘on’ and ‘off’ states of the transistors built into the system. Binary codes are codes represented in the binary system with modification from the original ones. There are two such codes: weighted binary codes and non-weighted codes.
In addition to converting the number system, it is often desirable to retain the decimal character of a number even when it is encoded in binary digits, or bits (1 or 0). For example, consider the decimal number 56, where we represent the individual decimal digits 5 and 6 as 0101 and 0110, respectively:
56 = 0101 0110
Such a representation is known as binary-coded decimal (BCD). It is a weighted code (see Table I).
From Table I, you can find the code of 783 as 0111 1000 0011. In fact, each digit of the decimal number is coded as a 4-tuple of binary data, say, a3a2a1a0, as follows:
First decimal digit   Second decimal digit   Last decimal digit
a3a2a1a0              a3a2a1a0               a3a2a1a0
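The digit-by-digit encoding described above can be sketched in a few lines of Python. This is only an illustration; the function names `to_bcd` and `from_bcd` are hypothetical and do not come from the article.

```python
def to_bcd(number: int) -> str:
    """Encode a non-negative decimal number as BCD, one 4-bit group per digit."""
    return " ".join(format(int(digit), "04b") for digit in str(number))


def from_bcd(code: str) -> int:
    """Decode a BCD string (spaces optional) back to a decimal number."""
    bits = code.replace(" ", "")
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return int("".join(str(int(group, 2)) for group in groups))


print(to_bcd(56))    # 0101 0110
print(to_bcd(783))   # 0111 1000 0011
```

Note that each decimal digit is converted on its own, so the result is generally different from the plain binary form of the whole number.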
BCD is also known as 8421 code, where 8, 4, 2 and 1, respectively, represent the weights of the fourth, third, second and first positions of the tuple (counted from the right).

Table I
Weighted BCD Codes for Decimal Digits
Decimal digit   BCD code (weights 8 4 2 1)
0               0 0 0 0
1               0 0 0 1
2               0 0 1 0
3               0 0 1 1
4               0 1 0 0
5               0 1 0 1
6               0 1 1 0
7               0 1 1 1
8               1 0 0 0
9               1 0 0 1

70 • September 2010 • Electronics For You
As examples of other weighted codes, Table II shows the 2421, 74-2-1 and 4221 codes.
Weighted binary codes are those that obey the positional weighting principle: each position of the number represents a specific weight.
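The positional weighting principle can be illustrated with a short Python sketch that sums the weights of the positions holding a 1. The helper name `weighted_value` is hypothetical; the code words used below are taken from the weighted codes discussed in this article.

```python
def weighted_value(code: str, weights: tuple[int, ...]) -> int:
    """Return the decimal value of a code word under the given position weights."""
    return sum(weight for bit, weight in zip(code, weights) if bit == "1")


print(weighted_value("0101", (8, 4, 2, 1)))     # 5 in 8421 (BCD)
print(weighted_value("1011", (2, 4, 2, 1)))     # 5 in 2421
print(weighted_value("1010", (7, 4, -2, -1)))   # 5 in 74-2-1
print(weighted_value("1001", (4, 2, 2, 1)))     # 5 in 4221
```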
Some codes are known as reflective codes. A code is said to be reflective when the code for 9 is the complement of the code for 0, the code for 8 is the complement of the code for 1, and so on. Codes 2421, 5211 and excess-3 are reflective, whereas the 8421 code is not reflective.
Similarly, some weighted codes
are sequential. A code is said to be
sequential when two subsequent
codes, seen as numbers in binary representation, differ by one. This greatly
aids in mathematical manipulation of
data. The 8421 and Excess-3 codes are
sequential, whereas the 2421 and 5211
codes are not sequential.
Some features of weighted codes
Except for the 8421 and 74-2-1 codes, in all other weighted codes some of the decimal digits may be coded in more than one form. For example, decimal 4 in 2421 code may be represented as 1010 or 0100. The 2421 and 4221 codes are known as self-complementing codes, and the self-complementing property is applied to select one code out of the different options. The property is stated as follows: if ‘D’ is the given decimal digit, you may code it using any option, but then the decimal digit 9-D must be coded using the option whose code is the bit-wise complement of the code of D. For example, if decimal 4 in 2421 is coded as 0100, then 9-4=5 must be coded as 1011. See that 1011 is the bit-wise complement of 0100.
The codes in Table II have been chosen by applying the self-complementing property. A necessary condition for the self-complementing property is that the sum of the weights in the code is 9. The codes 8421 and 74-2-1 are not self-complementing.
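The self-complementing property can be checked mechanically. The sketch below is an illustration only: the names `CODES`, `complement` and `is_self_complementing` are hypothetical, and the code words are the ones listed for digits 0 to 9 in Table II (plus plain 8421).

```python
# Code words for the decimal digits 0..9, as listed in Table II (and plain 8421).
CODES = {
    "8421": [format(d, "04b") for d in range(10)],
    "2421": ["0000", "0001", "0010", "0011", "0100",
             "1011", "1100", "1101", "1110", "1111"],
    "74-2-1": ["0000", "0111", "0110", "0101", "0100",
               "1010", "1001", "1000", "1111", "1110"],
    "4221": ["0000", "0001", "0010", "0011", "0110",
             "1001", "1100", "1101", "1110", "1111"],
}


def complement(code: str) -> str:
    """Bit-wise complement of a code word given as a string of 0's and 1's."""
    return "".join("1" if bit == "0" else "0" for bit in code)


def is_self_complementing(table: list[str]) -> bool:
    """True if the code of 9-D is the complement of the code of D for every digit D."""
    return all(complement(table[d]) == table[9 - d] for d in range(10))


for name, table in CODES.items():
    print(name, is_self_complementing(table))
```

With these tables the check passes for 2421 and 4221 and fails for 8421 and 74-2-1, in line with the weight-sum condition stated above.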
Non-weighted code
In many cases there is a requirement for non-weighted codes (see Table III), particularly from the design point of view. Non-weighted codes are codes that are not positionally weighted; that is, the positions within the binary number are not assigned fixed values. Examples of non-weighted codes are excess-3 code, 1-to-2 code and BCDP (BCD with parity).
Excess-3 is a non-weighted code
used to express decimal numbers. The
code derives its name from the fact that
each binary code is the corresponding
8421 code plus 0011(3).
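As a minimal sketch of this rule, the Python function below adds 3 to each decimal digit and writes the result as a 4-bit group. The function name `to_excess3` is hypothetical.

```python
def to_excess3(number: int) -> str:
    """Encode each decimal digit as the 4-bit binary form of (digit + 3)."""
    return " ".join(format(int(digit) + 3, "04b") for digit in str(number))


print(to_excess3(0))    # 0011
print(to_excess3(9))    # 1100
print(to_excess3(56))   # 1000 1001
```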
Some features of non-weighted code
Excess-3 code is just BCD with an added weight of 3. If ‘d’ is the decimal digit, d+3 is coded in 8421 to get the excess-3 code. This code is the most important non-weighted code in digital logic design. The 1-to-2 code has a unique feature: no code word has fewer than one or more than two 1’s. BCDP (with odd parity) is designed so that every code word has an odd number of 1’s. This is done through the choice of the parity bit, which is the right-most bit of the code.

Table II
Weighted Codes
Decimal digit   2421 code   74-2-1 code   4221 code
Weights         2 4 2 1     7 4 -2 -1     4 2 2 1
0               0000        0000          0000
1               0001        0111          0001
2               0010        0110          0010
3               0011        0101          0011
4               0100        0100          0110
5               1011        1010          1001
6               1100        1001          1100
7               1101        1000          1101
8               1110        1111          1110
9               1111        1110          1111

Table III
Non-weighted Codes
Decimal digit   Excess-3 code (XS-3 code)   1-to-2 code   BCDP (with odd parity)
0               0011                        0001          00001
1               0100                        0010          00010
2               0101                        0011          00100
3               0110                        0100          00111
4               0111                        0101          01000
5               1000                        0110          01011
6               1001                        1000          01101
7               1010                        1001          01110
8               1011                        1010          10000
9               1100                        1100          10011

Gray code/binary reflected code
Gray code is a very important non-weighted code. An N-bit gray code is a sequence of all the N-bit binary numbers, ordered in such a way that each binary number differs from its predecessor and successor by exactly one bit (and the first and last also differ by one bit). For example, the one-bit gray code is 0, 1 and a three-bit gray code is 000, 001, 011, 010, 110, 111, 101, 100 (the last element differs from 000 by one bit). The sequence 00, 11, 01, 10 is not a two-bit gray code, as the first and second elements differ by two bits.
Table IV shows a 4-bit gray code in comparison with the binary number. In the decimal column, as you move from number N to number N+1 between 00 and 09, only one digit changes. But when you move from 09 to 10, two digit positions change. Now look at the reflected decimal column: as you move from N to N+1, there is always a change in only one digit position.
In binary representation, as we move from N to N+1, many bits may change. For example, as you go from 3 to 4, as many as three bit positions change. But look at the reflected binary or gray code: as you move from any N to N+1, only one bit position ever changes. This is the defining property of gray code, or reflected binary code. Gray code is also known as a variable weighted code and is cyclic.
The gray code is called ‘reflected binary’ because, apart from the leading bit, the first eight values match the last eight values in reverse order. The gray code belongs to a class of codes called ‘minimum change codes,’ in which only one bit in the code changes when moving from one code to the next. It is a reflective digital code with the special property that the codes of any two subsequent numbers differ by only one bit. It is also called ‘unit-distance code.’
Gray code has an important application in digital control systems.
Table IV
Gray/Reflected Code
Decimal   Reflected decimal   Binary b3b2b1b0   Reflected binary/gray code g3g2g1g0
00        00                  0000              0000
01        01                  0001              0001
02        02                  0010              0011
03        03                  0011              0010
04        04                  0100              0110
05        05                  0101              0111
06        06                  0110              0101
07        07                  0111              0100
08        08                  1000              1100
09        09                  1001              1101
10        19                  1010              1111
11        18                  1011              1110
12        17                  1100              1010
13        16                  1101              1011
14        15                  1110              1001
15        14                  1111              1000
16        13
17        12
18        11
19        10
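The mirrored structure of the gray column in Table IV suggests one way of generating a gray sequence of any width: write the shorter sequence forwards and then backwards, prefix 0 to the first half and 1 to the second half, and repeat. The Python sketch below follows that reflect-and-prefix idea and also checks the unit-distance definition of a gray code. The function names `reflected_gray` and `is_gray_sequence` are hypothetical, and `n_bits` is assumed to be at least 1.

```python
def reflected_gray(n_bits: int) -> list[str]:
    """Build an n-bit gray sequence by repeated reflect-and-prefix (n_bits >= 1)."""
    codes = ["0", "1"]
    for _ in range(n_bits - 1):
        # Forward copy gets a leading 0, mirrored copy gets a leading 1.
        codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
    return codes


def is_gray_sequence(codes: list[str]) -> bool:
    """Check that codes lists every n-bit word once and neighbours differ by one bit."""
    width = len(codes[0])
    if sorted(codes) != [format(i, f"0{width}b") for i in range(2 ** width)]:
        return False

    def distance(a: str, b: str) -> int:
        return sum(x != y for x, y in zip(a, b))

    # The modulo wraps around so that the last and first words are compared too.
    return all(distance(codes[i], codes[(i + 1) % len(codes)]) == 1
               for i in range(len(codes)))


print(reflected_gray(3))
print(is_gray_sequence(reflected_gray(4)))        # True
print(is_gray_sequence(["00", "11", "01", "10"]))  # False
```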
As an example of gray code in digital control, take a fan regulator which is digitally controlled. You want to switch from position 3 to position 4. If you follow the normal binary system, you need to change three bit positions to switch from 3 (0011) to 4 (0100). But how? In practice the three positions change one after another, as shown in Table V.

Table V
Illustration of Property of Gray Code
Present number                Action                                      Next number
0011 (=3) Starting position   Change in the first position from right     0010 (=2)
0010 (=2)                     Change in the second position from right    0000 (=0)
0000 (=0)                     Change in the third position from right     0100 (=4) Final position
Comment: So for a change from regulating position 3 to 4, the fan will pass through 3, 2, 0 and 4. This will cause a large oscillation in the circuit, which is not a good design.

In gray code, only one position changes, so there is no such oscillation.
The code is called reflected because it can be generated in the following manner: Take the gray code 0, 1. Write it forwards, then backwards: 0, 1, 1, 0. Then prepend 0’s to the first half and 1’s to the second half: 00, 01, 11, 10. Continuing, write 00, 01, 11, 10, 10, 11, 01, 00 and prepend again to obtain 000, 001, 011, 010, 110, 111, 101, 100. Each iteration therefore doubles the number of codes.
Algorithm for binary-to-gray conversion
This is very simple:
1. The most significant bit of the binary number is the most significant bit of the gray code.
2. Add (using modulo 2, i.e., ignoring carry) the current bit of the binary number to the next less significant bit of the binary number to obtain the next gray code bit.
3. Repeat step 2 until all bits of the binary number have been added modulo 2. The resultant number is the gray code equivalent of the binary number.
Assume binary number b3 b2 b1 b0 = 0011. Conversion into gray is done as follows (ignoring ‘carry’):
g3 = b3 = 0
g2 = b3 + b2 = 0 + 0 = 0
g1 = b2 + b1 = 0 + 1 = 1
g0 = b1 + b0 = 1 + 1 = 0
Thus the equivalent gray code is 0010.
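The binary-to-gray steps can be sketched in Python as follows. The function name `binary_to_gray` is hypothetical, and the modulo-2 addition of two bits is written as an exclusive OR.

```python
def binary_to_gray(bits: str) -> str:
    """Convert a binary string (most significant bit first) to gray code."""
    gray = bits[0]                        # step 1: copy the most significant bit
    for left, right in zip(bits, bits[1:]):
        # step 2: add each bit to its right-hand neighbour, modulo 2
        gray += str(int(left) ^ int(right))
    return gray


print(binary_to_gray("0011"))   # 0010
print(binary_to_gray("1010"))   # 1111
```

For integers, the same result is commonly obtained in one step as `n ^ (n >> 1)`.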
Gray-to-binary conversion
Assume gray code g3 g2 g1 g0 = 0010. It can be converted into binary by adding the gray bits modulo 2 (that is, taking the remainder after division by 2), starting from the most significant bit:
b3 = g3 mod 2 = 0
b2 = (g3 + g2) mod 2 = 0
b1 = (g3 + g2 + g1) mod 2 = 1
b0 = (g3 + g2 + g1 + g0) mod 2 = 1
Thus the equivalent binary is 0011.
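The reverse conversion keeps a running modulo-2 sum of the gray bits, which the short Python sketch below writes as a running exclusive OR. The function name `gray_to_binary` is hypothetical.

```python
def gray_to_binary(gray: str) -> str:
    """Convert a gray-code string (most significant bit first) to binary."""
    binary = ""
    running = 0
    for bit in gray:
        running ^= int(bit)      # modulo-2 sum of all gray bits seen so far
        binary += str(running)
    return binary


print(gray_to_binary("0010"))   # 0011
print(gray_to_binary("1000"))   # 1111
```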
Excess-3 gray code
In many applications, it is desirable to use a code that is both a BCD code and a unit-distance code. Excess-3 gray code is such a code. The values for 0 and 9 differ in only one bit, and so do all values for successive numbers. Outputs from linear devices or angular encoders may be coded in excess-3 gray code to obtain multi-digit BCD numbers.
The code obtained from excess-3 code by applying the binary-to-gray conversion rule is shown below:
Decimal   Excess-3 gray code
0         0010
1         0110
2         0111
3         0101
4         0100
5         1100
6         1101
7         1111
8         1110
9         1010
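The table above can be reproduced by taking the excess-3 value of each digit and converting it to gray code. The Python sketch below is an illustration; the function name `excess3_gray` is hypothetical and the integer shortcut `x ^ (x >> 1)` is used for the binary-to-gray step.

```python
def excess3_gray(digit: int) -> str:
    """Return the 4-bit excess-3 gray code of a decimal digit (0 to 9)."""
    excess3 = digit + 3                            # excess-3 value of the digit
    return format(excess3 ^ (excess3 >> 1), "04b")  # binary-to-gray conversion


for digit in range(10):
    print(digit, excess3_gray(digit))
```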
Error correction and detection codes
When a binary message made of strings of 0’s and 1’s is transmitted from the source to the destination, the message may be corrupted by noise during transmission. The corrupted message becomes erroneous when a transmitted 0 is received as 1 or a transmitted 1 is received as 0.
The error-correction code (ECC) and the error-detection code (EDC) are used to deal with such errors. In EDCs and ECCs, redundant check bits are appended to the original message bits to design the codes. EDCs detect the presence of errors. ECCs detect and correct the errors.
Typically, transmission errors are of two types: random and burst. An error is called random if the erroneous bits are randomly distributed over the code. A burst error occurs when the erroneous bits are clustered together. For the transmitted byte 01010101, examples of random and burst errors are given below (bit locations are counted from the right, starting at 1):
01110111 ---- Random error (errors are distributed, in the second and sixth bit locations),
and
01101101 ---- Burst error (errors are clustered in the fourth, fifth and sixth bit locations).
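The error locations in these two examples can be found by comparing the transmitted and received bytes bit by bit. The Python sketch below is an illustration only; the function name `error_positions` is hypothetical, and positions are counted from the right starting at 1, which is the convention assumed here to match the locations quoted above.

```python
def error_positions(sent: str, received: str) -> list[int]:
    """Return the bit locations (1 = right-most bit) where the two words differ."""
    length = len(sent)
    positions = [length - i for i in range(length) if sent[i] != received[i]]
    return sorted(positions)


print(error_positions("01010101", "01110111"))   # [2, 6]    scattered: random error
print(error_positions("01010101", "01101101"))   # [4, 5, 6] adjacent: burst error
```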
Several EDCs and ECCs are used to address both random and burst errors. R.W. Hamming introduced one-bit error-correcting codes in 1950. (7,4) and (13,8) are examples of one-bit ECCs. P. Elias developed convolutional codes in 1955. In 1959, R.C. Bose and D.K. Chaudhuri proposed multiple error-correcting codes. These are very powerful codes, known as generalised Hamming codes. A. Hocquenghem independently designed the codes proposed by Bose and Chaudhuri. That is why these codes are known as BCH codes. In 1960, I.S. Reed and G. Solomon designed powerful block codes particularly suited to burst errors, known as Reed-Solomon codes. In 1966, G.D. Forney introduced the concept of concatenated codes. In 1967, A.J. Viterbi introduced an important decoding algorithm for convolutional codes, known as the Viterbi algorithm. Turbo code, low-density parity-check code, combined turbo code, punctured turbo code, cyclic redundancy code (CRC) and Golay code are the other important codes.
Error detection and correction codes begin with parity codes. Parity codes are EDCs: the parity bit is used for error detection. In parity codes, a parity bit (either odd or even) is appended to the original message bits; the parity bit is the redundant check bit. An even parity bit ensures an even number of 1’s in the code (message plus parity bit). An odd parity bit ensures an odd number of 1’s in the code. If an original message of seven bits is 1110001, its codes with parity bit are: even parity [11100010] and odd parity [11100011], where the last (right-most) bit is the parity bit.
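A minimal Python sketch of parity coding is given below. The function names `add_parity` and `parity_ok` are hypothetical; the parity bit is appended as the right-most bit, as in the example above.

```python
def add_parity(message: str, odd: bool = False) -> str:
    """Append an even (default) or odd parity bit to a string of 0's and 1's."""
    bit = message.count("1") % 2      # makes the total number of 1's even
    if odd:
        bit ^= 1                      # flip it to make the total odd instead
    return message + str(bit)


def parity_ok(code: str, odd: bool = False) -> bool:
    """Check whether a received code (message plus parity bit) has the right parity."""
    return code.count("1") % 2 == (1 if odd else 0)


print(add_parity("1110001"))             # 11100010
print(add_parity("1110001", odd=True))   # 11100011
print(parity_ok("11100010"))             # True
print(parity_ok("10100010"))             # False: one bit flipped, error detected
print(parity_ok("10000010"))             # True: two bits flipped, error missed
```

The last line shows the limitation of a single parity bit: an even number of bit errors leaves the parity unchanged and goes undetected.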
Hamming code is an ECC. Richard Hamming, a theorist with Bell
Telephone Laboratories in the 1940s,
developed the Hamming code method
of error correction in 1949. The key to
the Hamming code is the use of extra
parity bits to allow identification of the
errors. Hamming code (7,4) can detect
and correct one-bit error, whereas (13,8)
code can detect up to two simultaneous
bit errors, and correct single-bit errors.
A code with this ability to reconstruct the original message in the presence of errors is known as an error-correcting code. By contrast, the simple parity code discussed above cannot correct errors, and can detect only an odd number of errors.
An ECC always has more check bits than an EDC and hence requires more bandwidth. In simple parity there is just one check bit, whereas in the (7,4) and (13,8) codes there are 3 and 5 check bits, respectively. Code capability and complexity in system design are the other parameters for selecting a code for a particular application.
Encoding of (7,4)
Hamming code (7,4) encodes four bits of data into seven-bit blocks called ‘code words.’ The extra three bits are parity bits. Each of the three parity bits maintains even parity over three of the four data bits, and no two parity bits cover the same three data bits. If the four data bits in (7,4) are d1, d2, d3 and d4, the Hamming code parity bits p1, p2 and p3 are calculated as:
p1 = d2 + d3 + d4
p2 = d1 + d3 + d4
p3 = d1 + d2 + d4
where ‘+’ means the bit-wise exclusive OR operation, i.e., sum ignoring carry. For example, data 1010 gives parity bits p1 p2 p3 = 101, and writing the parity bits followed by the data bits gives the code word 1011010.
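The three parity equations can be sketched in Python as below. The function name `hamming74_encode` is hypothetical, and the bit order (parity bits first, then data bits) is inferred from the worked example 1010 → 1011010.

```python
def hamming74_encode(data: str) -> str:
    """Encode four data bits d1 d2 d3 d4 as the code word p1 p2 p3 d1 d2 d3 d4."""
    d1, d2, d3, d4 = (int(bit) for bit in data)
    p1 = d2 ^ d3 ^ d4      # '+' in the equations is exclusive OR
    p2 = d1 ^ d3 ^ d4
    p3 = d1 ^ d2 ^ d4
    return f"{p1}{p2}{p3}{data}"


print(hamming74_encode("1010"))   # 1011010
```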
Decoding of (7,4)
In a world without errors, decoding a Hamming code word would be very easy: just leave out the parity bits. In the example code word 1011010, the parity bits are 101, and when you leave these out, you get the data bits 1010.
But what if you receive a code
word with an error and one or more
of the parity bits are wrong? Suppose
the received code word is 1011011. The
first step is to check the parity bits to
determine whether there is an error.
Calculate parity bits with received
bits as:
p1 = d2 + d3 + d4 = 0 + 1 + 1 = 0
p2 = d1 + d3 + d4 = 1 + 1 + 1 = 1
p3 = d1 + d2 + d4 = 1 + 0 + 1 = 0
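In this case, every parity bit is wrong. According to the received data bits, p1, p2 and p3 should have been 010, but you received 101. Compare the received parity bits with these calculated parity bits to get the mismatch pattern 111, which means that all three parity checks fail. The only bit covered by all three parity equations is d4, which is the seventh bit of the code word (the pattern 111 also has a decimal value of 7). Now reverse the bit at the seventh position of the received code to get 1011010, and then leave out the parity bits 101 to get the corrected data 1010.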
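The decoding steps can be sketched in Python as follows. This is a sketch under stated assumptions: the names `recomputed_parity`, `FAULTY_BIT` and `hamming74_decode` are hypothetical, and the lookup table from mismatch pattern to faulty bit is derived here from the three parity equations (it is not spelled out in the article). A single mismatch points at the parity bit itself; two or three mismatches point at the data bit shared by the failing equations.

```python
def recomputed_parity(data_bits: list[int]) -> list[int]:
    """Recalculate p1, p2, p3 from the four data bits d1..d4."""
    d1, d2, d3, d4 = data_bits
    return [d2 ^ d3 ^ d4, d1 ^ d3 ^ d4, d1 ^ d2 ^ d4]


# Mismatch pattern (p1, p2, p3) -> 0-based index of the faulty bit
# in the code word p1 p2 p3 d1 d2 d3 d4 (derived from the parity equations).
FAULTY_BIT = {
    (1, 0, 0): 0, (0, 1, 0): 1, (0, 0, 1): 2,   # a parity bit itself is wrong
    (0, 1, 1): 3,                               # d1 appears in p2 and p3
    (1, 0, 1): 4,                               # d2 appears in p1 and p3
    (1, 1, 0): 5,                               # d3 appears in p1 and p2
    (1, 1, 1): 6,                               # d4 appears in all three
}


def hamming74_decode(word: str) -> str:
    """Correct at most one bit error and return the four data bits."""
    bits = [int(bit) for bit in word]
    received_parity, data_bits = bits[:3], bits[3:]
    mismatch = tuple(r ^ c for r, c in zip(received_parity, recomputed_parity(data_bits)))
    if any(mismatch):
        bits[FAULTY_BIT[mismatch]] ^= 1          # reverse the faulty bit
    return "".join(str(bit) for bit in bits[3:])


print(hamming74_decode("1011011"))   # 1010
```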
Generalised Hamming code for single bit-error correction
The illustration of the (7,4) Hamming code paves the way for explaining the generalised Hamming code. In the generalised code, coding is done as below:
1. All bit positions that are powers
of 2 will be locations for parity bits in
code words. Thus in code words, locations of parity bits are 1, 2, 4, 8, 16, 32,
64, etc.
2. All other bit positions are for the
given original data to be encoded. That
means data locations are 3, 5, 6, 7, 9, 10,
11, 12, 13, 14, 15, 17, etc.
3. Each parity bit calculates the parity for some of the bits in the code word. The position of the parity bit determines the sequence of bits that it alternately checks and skips as follows:
(i) Location 1: check 1 bit, skip 1 bit, check 1 bit, skip 1 bit, etc (1, 3, 5, 7, 9, 11, 13, 15,...)
(ii) Location 2: check 2 bits, skip 2 bits, check 2 bits, skip 2 bits, etc (2, 3, 6, 7, 10, 11, 14, 15,...)
(iii) Location 4: check 4 bits, skip 4 bits, check 4 bits, skip 4 bits, etc (4, 5, 6, 7, 12, 13, 14, 15, 20, 21, 22, 23,...)
(iv) Location 8: check 8 bits, skip 8 bits, check 8 bits, skip 8 bits, etc (8-15, 24-31, 40-47,...)
(v) Location 16: check 16 bits, skip 16 bits, check 16 bits, skip 16 bits, etc (16-31, 48-63, 80-95,...)
(vi) Location 32: check 32 bits, skip 32 bits, check 32 bits, skip 32 bits, etc (32-63, 96-127, 160-191,...)
4. Set a parity bit to 1 if the total number of 1’s in the positions that it checks is odd. Set a parity bit to 0 if the total number of 1’s in the positions that it checks is even. This means even parity is ensured in the code word.
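The check-and-skip pattern of step 3 has a compact equivalent: the parity bit at location p checks exactly those positions whose binary representation contains p. The Python sketch below uses that observation; the function name `checked_positions` is hypothetical.

```python
def checked_positions(parity_location: int, word_length: int) -> list[int]:
    """List the positions (1-based) checked by the parity bit at the given location.

    parity_location is assumed to be a power of 2 (1, 2, 4, 8, ...).
    """
    return [position for position in range(1, word_length + 1)
            if position & parity_location]


print(checked_positions(1, 15))   # [1, 3, 5, 7, 9, 11, 13, 15]
print(checked_positions(2, 15))   # [2, 3, 6, 7, 10, 11, 14, 15]
print(checked_positions(4, 23))   # [4, 5, 6, 7, 12, 13, 14, 15, 20, 21, 22, 23]
```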
We illustrate with an example. Say, the original byte of data is 10011010. As per the Hamming code, the code word will be: _ _ 1 _ 0 0 1 _ 1 0 1 0, where ‘_’ marks the location of each parity bit. The parity bits are calculated as follows:
• Location-1 check bit is 0, by the even parity rule applied to locations 1, 3, 5, 7, 9, 11.
• Location-2 check bit is 1, by the even parity rule applied to locations 2, 3, 6, 7, 10, 11.
• Location-4 check bit is 1, by the even parity rule applied to locations 4, 5, 6, 7, 12.
• Location-8 check bit is 0, by the even parity rule applied to locations 8, 9, 10, 11, 12.
• Thus the code word becomes: 011100101010.
Suppose the received code is 011100101110, in which the tenth bit is in error. The receiver recalculates the parity bits from the received data and compares them with the received parity bits to find out which bit is in error, in order to correct it. The method is to note every parity bit that fails this comparison. In the example, parity bits 2 and 8 are incorrect.
Adding their locations gives 2 + 8 = 10, so bit position 10 is the location of the incorrect bit. We complement the tenth bit to get back the correct code word.
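The generalised procedure can be sketched in Python as below. The function names `hamming_encode` and `hamming_correct` are hypothetical, and the sketch assumes at most one bit error in the received word.

```python
def hamming_encode(data: str) -> str:
    """Place data bits at non-power-of-2 positions and fill in even-parity bits."""
    parity_count = 0
    while (1 << parity_count) < len(data) + parity_count + 1:
        parity_count += 1
    length = len(data) + parity_count
    code = [0] * (length + 1)                  # index 0 unused: positions are 1-based
    data_bits = iter(int(bit) for bit in data)
    for position in range(1, length + 1):
        if position & (position - 1):          # not a power of 2: data location
            code[position] = next(data_bits)
    for i in range(parity_count):
        p = 1 << i
        # Even parity over every position whose binary form contains p.
        code[p] = sum(code[pos] for pos in range(1, length + 1) if pos & p) % 2
    return "".join(str(bit) for bit in code[1:])


def hamming_correct(word: str) -> tuple[str, int]:
    """Return the corrected code word and the error position (0 if none found)."""
    code = [0] + [int(bit) for bit in word]
    length = len(word)
    error_position = 0
    p = 1
    while p <= length:
        if sum(code[pos] for pos in range(1, length + 1) if pos & p) % 2:
            error_position += p                # this parity check fails
        p <<= 1
    if 0 < error_position <= length:
        code[error_position] ^= 1              # complement the faulty bit
    return "".join(str(bit) for bit in code[1:]), error_position


print(hamming_encode("10011010"))        # 011100101010
print(hamming_correct("011100101110"))   # ('011100101010', 10)
```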
Repetition code
Coding theory is the study of how to add redundancy, or additional bits, to the original data so that they can be used to detect and correct errors induced by the communication channel. In a repetition code, the bit to be transmitted is sent more than once; in fact, the transmitter sends each data bit an odd number of times. In general, if each bit is repeated 2K + 1 times (K is a positive integer), the code can tolerate up to K errors.
In the triple repetition code, ‘0’ and ‘1’ are coded, respectively, as ‘000’ and ‘111’. At the receiver, the majority rule is applied to decide the bit. If the three bits received are not identical, an error has occurred. If the channel is nearly clean, most likely only one bit will change in each triple. Therefore 001, 010 and 100 each correspond to a 0 bit, while 110, 101 and 011 correspond to a 1 bit.
Such codes cannot correct all errors. For example, if ‘111’ is sent, the channel introduces two bit errors and the receiver gets ‘001,’ the system detects the error, but concludes that the original bit was 0, which is incorrect. If we increase the number of times we duplicate each bit to four, we can detect all two-bit errors but can’t correct them; at five, we can correct all two-bit errors, but not detect all three-bit errors.
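A minimal Python sketch of a repetition code with majority-rule decoding is given below. The function names `repeat_encode` and `repeat_decode` are hypothetical; the parameter `k` corresponds to K above, so each bit is sent 2K + 1 times.

```python
def repeat_encode(bits: str, k: int = 1) -> str:
    """Repeat every bit 2k + 1 times (k = 1 gives the triple repetition code)."""
    return "".join(bit * (2 * k + 1) for bit in bits)


def repeat_decode(code: str, k: int = 1) -> str:
    """Decode by majority rule over each group of 2k + 1 received bits."""
    group = 2 * k + 1
    return "".join("1" if code[i:i + group].count("1") > k else "0"
                   for i in range(0, len(code), group))


print(repeat_encode("01"))             # 000111
print(repeat_decode("001110"))         # 01  (one error per triple is corrected)
print(repeat_decode("001"))            # 0   (wrong if 111 was sent with two errors)
print(repeat_decode("11000", k=2))     # 0   (two errors in five are corrected)
```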
The author is a regular contributor to EFY