Forward Error Correction Identification

WBahn

Joined Mar 31, 2012
32,939
No, you don't need to repost, I modified my program to be able to read either format. Otherwise I wouldn't have been able to spot the new typo.

So here's a quick summary. I first verified that it could possibly be an error correcting code at all by measuring the minimum distance between any two codewords. It turned out to be three, meaning that it should be possible to correct a single bit error. I assumed that the code was parity-bit based and so I proceeded to determine which bits in the codewords were affected by single bit changes in the data. I did that analysis for all of the codewords provided (originally just 55) and identified which bits in the codeword were always changed by changing the bit in a given position in the data, which bits were never changed, and which bits were sometimes changed. I was hoping that there wouldn't be any that were sometimes changed, but there were. So I tried to see if that was sufficient to identify which bit was in error and it wasn't. So I examined the bits that resulting in sometime changes and determined that they were outliers and always involved all data bits other than the one being changed being identically zero. So I then asked what the codeword for 00 would be if it started out as something else and was changed to 00 and it came up 555. I then reran the analysis with that codeword and now I had no bits that were only affected sometimes. So, with the table of which codeword bits are affected by which changes in the data bits, I could determine the encoding, decoding, and error correction algorithms, as well as what fraction of codewords represent unrecoverable errors. Then I used the codeword for 00 to determine how the final codeword is created (because the analysis up to this point only dealt with how codewords change in response to changes in the data, not what the actual codeword for a given data value is). This was actually pretty obvious, since the basic codeword for 00 is 000 but the actual codeword is 555, so you just take the basic codeword and XOR it with 555. I then did an analysis to explore my hypothesis as to why they used the XORed with 555, since it makes no difference from an error correction standpoint.
 

Thread Starter

mossman

Joined Aug 26, 2010
131
WBahn, was wondering if you have had time to put together a brief (if that's possible) explanation on how you got the codes. We just got word today that our contract is ending and I'd like to finish up this project. I would greatly appreciate it.
 

WBahn

Joined Mar 31, 2012
32,939
I actually had a large blog entry almost done but when I tried to save it, I got this "you must reload the last page" error and ended up loosing several hours worth of work. That greatly frustrated me and I put it aside.

I thought that this was just something that you were interested in as a curiosity. Didn't know that I was doing free work on your contract. ;-P

I'll try to put something up this weekend, but no promises.
 

Thread Starter

mossman

Joined Aug 26, 2010
131
I thought that this was just something that you were interested in as a curiosity. Didn't know that I was doing free work on your contract. ;-P
Guess I conveniently left that part out. It's a special project I'm working on and the FEC was the missing part of the puzzle. Not getting paid for doing it (other than my regular salary), just trying to provide as much information as I can about the signal. If it helps, this project has the potential to save lives. Thanks again.
 

WBahn

Joined Mar 31, 2012
32,939
Okay, so let's see what I can get posted. I think I will first post an analysis that would have applied had I started with a complete and correct codebook, since that is the most reasonable case for your report.

BTW: While I'm certainly not asking for any kind of remuneration, I would appreciate having my work acknowledged in your report. If you are willing to do that, I can PM you the information you would need.

First, some terminology (that I will try to use consistently). We have 8-bit "data values" (or "messages") that we map to 12-bit "codewords". Those 256 codewords are a subset of the 4096 possible 12-bit "vectors".

So the first step of the analysis was just to see if the codebook could actually be an error correcting code at all and, if so, what the maximum strength of the code is. To do this, each codeword was examined and the minimum Hamming distance (number of bit positions in which the codewords differ) from that codeword to any other codeword was determined. For most codewords, the minimum distance to another codeword was 3, but for some it was 5. What this meant was that the codebook is that of a valid error-correcting code having the ability to correct all 1-bit errors.

The next step was to map out which bits in the codewords are affected by bit changes in a message. This was done by taking each message and, one at a time, changing each bit in the message and keeping track of which bits in the codeword changed. This, in turn, was done by simply performing a bitwise-XOR between the original message's codeword and the codeword for the message that results by changing the message bit in question (this is known as the "error vector"). Since there were a total of 256 messages, there were 256 error vectors for each bit position. These were compared and it was determined that all 256 error vectors associated with a given bit position in the message were identical, meaning that a change in a given message bit always affected (i.e., flipped) a fixed subset of the codeword bits and never affected any of the others. The mapping is shown below:

Rich (BB code):
mask: X------- affects: X------- XXX-
mask: -X------ affects: -X------ -XXX
mask: --X----- affects: --X----- X-X-
mask: ---X---- affects: ---X---- -X-X
mask: ----X--- affects: ----X--- X-XX
mask: -----X-- affects: -----X-- XX--
mask: ------X- affects: ------X- -XX-
mask: -------X affects: -------X --XX
From this we can see that there is a one-to-one mapping between the message bits and the first eight codeword bits, meaning that each of the first eight codeword bits depends on exactly one of the message bits (and a different message bit in each case).

The remaining four codeword bits act as parity bits over different subsets of the message. If we number the message bits M[7:0] and the codeword bits C[7:0] followed by P[3:0], as follows

Rich (BB code):
      MMMMMMMM          CCCCCCCC PPPP
      76543210          76543210 3210
mask: X------- affects: X------- XXX-
mask: -X------ affects: -X------ -XXX
mask: --X----- affects: --X----- X-X-
mask: ---X---- affects: ---X---- -X-X
mask: ----X--- affects: ----X--- X-XX
mask: -----X-- affects: -----X-- XX--
mask: ------X- affects: ------X- -XX-
mask: -------X affects: -------X --XX
Thus we can see that C7, for instance, is only affected by changes in M7, while P3 is affected by changes in M7, M5, M3, and M2. Doing the same for all of the bits in the codeword, the equations to produce a "basic codeword" is

Rich (BB code):
C[7:0] = Mx[7:0]
P3 = XOR{M7,M5,M3,M2}
P2 = XOR{M7,M6,M4,M2,M1}
P1 = XOR{M7,M6,M5,M3,M1,M0}
P0 = XOR{M6,M4,M3,M0}
At this point, we know how changes in the message result in changes]/B] in the actual codeword, but we don't have an actual mapping from a message to a specific codeword. Fortunately, if we arbitrarily define just one reference message/codeword pair, then all of the remaining message/codeword paris are defined because we can progressively change the reference message to any other message while tracking the cumulative changes to the reference codeword to find the corresponding codeword for that message.

For simplicty, we can use the fact that the basic codeword for the message 0x00 is 0x000 while the actual codeword, in this codebook, is 0x555. Thus we can translate between basic codewords and actual codewords by XORing one with 0x555 to get the other.

So now we know how to generate the codewords in this codebook. Let's turn our attention to taking a received vector and recovering the message from it (under the assumption that it contains more than a single bit error in the codeword).

For clarity, let's take a received 12-bit vector and immediately XOR it with 0x555 in order to, in the absence of errors, convert it into a basic codeword. If there are no errors, then the received data, M[7:0], is simply the first eight bits of the codeword, namely C[7:0]. But we are assuming that their might be a single bit error somewhere in the 12-bits of this basic codeword. To find and correct it, we will initially assume that the error, if it exists, is in the parity bits and that the message-bearing bits are all correct. We will then take these message bearing bits and construct the correct set of parity bits for that message and compare this to the received parity bits. We can do this by generating the error vector between the received codeword and the "correct" codeword for the given message bits (remember, the error vector is simply the bitwise-XOR of the two). Since the two vectors have the same message bits, we are guaranteed that these will produce zeros.

The parity bits, on the other hand, could produce any of 16 outcomes. If there are no mismatches (the error vector is all zeros), then the received codeword contains no errors; the alternative is that there are sufficient errors of just the right type to have converted the original codeword into another codeword, which requires at least three errors. If this happens, then this error would be undetectible (at this level of data stream processing), but we are workiing under the assumption that there are only single-bit errors (and acknowledge that we need to deal with the possibiity of multi-bit errors elsewhere).

If there is a single mismatch in the parity bits, then the conclusion is that the single error is that particular parity bit and, once again, the assumption that the message-bearing bits in the receive vector are correct is valid.

If there is more than one mismatch in the parity bits, then either there were multiple errors among the parity bits, which violates our single-bit-error assumption, or there was a single bit error in the message portion of the codeword. We can determine which message bit could have caused the particular pattern of parity bit mismatches by looking at the table of which parity bits are affected by each data bit.

A complete table of parity bit error vectors and the resulting interpretation can therefore be built up.

Rich (BB code):
PPPP
3210 Conclusion
---- -----------------
0000 No error
0001 P0 is in error
0010 P1 is in error
0011 M0 is in error
0100 P2 is in error
0101 M4 is in error
0110 M1 is in error
0111 M6 is in error
1000 P3 is in error
1001 Unrecoverable
1010 M5 is in error
1011 M3 is in error
1100 M2 is in error
1101 Unrecoverable
1110 M7 is in error
1111 Unrecoverable
The three patterns that are unrecoverable represent vectors that can only be reached by at least two bit errors. Recall that some of the codewords had a minimum Hamming distance of five to the nearest codeword; therefore there are vectors that can be received that are a Hamming distance of two away from any codeword -- but those can only be received if the transmitted codeword experience at least two bit errors, which takes it beyond the error correcting capabilities of this code.

This error correcting code can be expressed in matrix form using concepts from Algebraic Coding Theory, but that is beyond the scope of this analysis.
 
Last edited:

Thread Starter

mossman

Joined Aug 26, 2010
131
Thanks. I really appreciate the time you took to put this explanation together. I will need some time to absorb everything :eek:. Unfortunately, we just got word last Friday that our contract is ending at the end of this month (Merry Xmas!). Therefore, I have no choice but to wrap up the projects I am currently working on and will not have the time to incorporate the FEC section as I hoped. It's been very educational though, and again, I can't thank you enough for spending your personal time to do the analysis and provide such a detailed explanation. You've gone above and beyond anything I could have expected.
 

WBahn

Joined Mar 31, 2012
32,939
Not a problem at all. I did it because it was an enjoyable distraction (now, just what that says about me is an open question -- and one that we will not be asking my wife's opinion on).

As time permits, I may add a couple more posts on stuff, such as how I detected the wrong codewords or why the codeword is XOR'ed with 0x555 and whether that value is the best choice that could have been made.
 
Top