# Filter out some packets - on a PIC, TTL serial

#### bug13

Joined Feb 13, 2012
1,924
Hi guys,
My setup is as shown in the attached drawing.

I have a PIC A and a PIC B. My code runs on A, and I have no control over B. They are connected via TTL serial.

I need to send a stream of data (in packets) from A to B. B simply repeats what it receives from A back to A, and also generates its own data (in packets).

My question is: how can I filter out the data that B repeats back to A? I am only interested in the data generated within B.

The data stream is very slow, about 16 kbps.

Thanks guys!!

#### WBahn

Joined Mar 31, 2012
26,398
A lot of it depends on your packet format. Can you embed a header in the packets going from A to B that will survive and be in the packets echoed back from B? If so, then you can filter based on the header. Are you guaranteed that the packets will come back from B in the same order that they went out from A? If so, then you could copy the sent packets into a queue, match them up against the packets coming back, and drop the ones that match as you pop them out of the queue. There are probably other ways as well, but it depends on the specifics of your system.

#### bug13

Hi WBahn,

Thank you for the quick reply.
As for method 1: I can embed a header of some sort in the packets, but I am pretty sure B won't recognize the packets if I do, because there is a format/protocol the packets need to follow in order for B to recognize them.
As for method 2: I am pretty sure it is not guaranteed that the packets come back in the same order they went out, as B has 4 buffer banks to store received packets. And there is a good chance that data generated in B comes back before the packets echoed by B.

#### WBahn

Okay, so ask yourself if, as a smart human being, there is any way that YOU could look at the data stream coming from B and tell which packets were echoes of packets originally sent from A. If you can't tell the difference, then there is no way for A to tell the difference.

#### bug13

OK, after some thinking, there is one way that may work.

Make a copy of the few latest TX packets from A to B and keep them in memory, adding a timestamp when each copy is made. Scan every packet coming out of B; if it matches any packet stored in memory, filter it out and clear the matching stored copy.
Also, expire any stored packet after some time, say 1 s.

Here is how I think I should implement it.

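Code:

    #include <stdbool.h>
    #include <stdint.h>

    #define MAX_PACKET_SIZE 32  // placeholder value; size it for the longest packet

    // one saved copy of a packet sent from A to B
    typedef struct
    {
        uint8_t  data[MAX_PACKET_SIZE];
        uint8_t  len;
        bool     isEmpty;
        uint16_t timeStamp;
    } ShadowBuffer;

    // helpers still to be written
    void getOnePacketAndStoreItSomewhere(void);
    bool txAvailable(void);  // hypothetical stand-in for "TX available"
    void filterPacket(void);
    void expirePacket(void);

    // rx interrupt (compiler-specific syntax, kept as posted)
    void interrupt ON_RX(void)
    {
        getOnePacketAndStoreItSomewhere();
    }

    void main(void)
    {
        // loop
        while (1)
        {
            // make a copy of the latest packets going from A to B
            if (txAvailable())
            {
                // do other TX stuff
            }
            // match the RX packet against the shadow buffer; if it matches, filter it out
            filterPacket();
            // if a packet has lived longer than a predefined time, expire it
            expirePacket();
        }
    }

Any loopholes?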

Thanks.
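
The store / match / expire steps described above could look roughly like the following. This is only a sketch under assumptions: the slot count, packet size, tick period and all `shadow_*` names are hypothetical, and a 1 ms tick counter is assumed as the time source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_PACKET_SIZE  32     /* assumed; size for the longest real packet */
#define SHADOW_SLOTS     8      /* assumed; depends on timeout and packet rate */
#define SHADOW_TTL_TICKS 1000u  /* assumed 1 ms tick, so about 1 s */

typedef struct {
    uint8_t  data[MAX_PACKET_SIZE];
    uint8_t  len;
    bool     isEmpty;
    uint16_t timeStamp;
} ShadowSlot;

static ShadowSlot shadow[SHADOW_SLOTS];

/* Mark every slot free; call once at start-up. */
void shadow_init(void)
{
    for (uint8_t i = 0; i < SHADOW_SLOTS; i++) {
        shadow[i].isEmpty = true;
    }
}

/* Remember a packet just sent from A to B. Returns false if no slot is free. */
bool shadow_store(const uint8_t *pkt, uint8_t len, uint16_t now)
{
    assert(len <= MAX_PACKET_SIZE);
    for (uint8_t i = 0; i < SHADOW_SLOTS; i++) {
        if (shadow[i].isEmpty) {
            memcpy(shadow[i].data, pkt, len);
            shadow[i].len = len;
            shadow[i].timeStamp = now;
            shadow[i].isEmpty = false;
            return true;
        }
    }
    return false;
}

/* Returns true if the received packet matches a stored copy (an echo),
 * and frees that slot so each sent packet cancels only one echo. */
bool shadow_is_echo(const uint8_t *pkt, uint8_t len)
{
    for (uint8_t i = 0; i < SHADOW_SLOTS; i++) {
        if (!shadow[i].isEmpty && shadow[i].len == len &&
            memcmp(shadow[i].data, pkt, len) == 0) {
            shadow[i].isEmpty = true;
            return true;
        }
    }
    return false;
}

/* Free slots older than the TTL. The cast to uint16_t keeps the age
 * correct when the 16-bit tick counter wraps around. */
void shadow_expire(uint16_t now)
{
    for (uint8_t i = 0; i < SHADOW_SLOTS; i++) {
        if (!shadow[i].isEmpty &&
            (uint16_t)(now - shadow[i].timeStamp) >= SHADOW_TTL_TICKS) {
            shadow[i].isEmpty = true;
        }
    }
}
```

In the main loop, A would call `shadow_store()` on every transmit, `shadow_is_echo()` on every complete received packet, and `shadow_expire()` periodically. Note that a packet generated by B that happens to be byte-for-byte identical to a recently sent packet would also be dropped by this scheme.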

#### WBahn

I haven't looked at the code, but the description sounds reasonable. One thing to consider is the relationship between the buffer length and your timeout value. If you have a T second timeout value, you want to be sure that you can store as many packets as you can transmit in T seconds.
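
The buffer-versus-timeout relationship can be put into numbers. A rough sketch, assuming 10 bits on the wire per byte (8N1 framing, which the thread does not confirm) and an illustrative shortest packet of 18 bytes; `shadow_slots_needed` and its parameters are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case number of packets A can transmit within one timeout window,
 * rounded up. Every argument is an assumption to fill in for the real link. */
uint32_t shadow_slots_needed(uint32_t bits_per_second,
                             uint32_t bits_per_byte_on_wire,
                             uint32_t min_packet_bytes,
                             uint32_t timeout_ms)
{
    assert(bits_per_byte_on_wire > 0 && min_packet_bytes > 0);

    /* bits sent in the window, scaled by 1000 because the timeout is in ms */
    uint64_t num = (uint64_t)bits_per_second * timeout_ms;
    /* bits per shortest packet, with the same 1000x scaling */
    uint64_t den = (uint64_t)bits_per_byte_on_wire * min_packet_bytes * 1000u;

    return (uint32_t)((num + den - 1u) / den);  /* ceiling division */
}
```

With those assumed figures, `shadow_slots_needed(16000, 10, 18, 1000)` gives 89 slots for a 1 s timeout at 16 kbps. That is an upper bound for back-to-back transmission; a shorter timeout or a lower real packet rate shrinks the buffer proportionally.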

#### bug13

Hi WBahn,

I wasn't aware of the relationship between the buffer length and the timeout value. Thank you for pointing this out.

#### John P

Joined Oct 14, 2008
1,861
An alternative approach: would it be possible to insert some unique marker in the packets sent from A to B, so that when that packet was echoed back to A, A could recognize it as "one of mine"? But maybe the packets generated by unit B are unpredictable and this wouldn't work. But even if it works imperfectly (an occasional packet generated by B would be discarded as an echo) could it still be "good enough"?

I'm suggesting this because it's simpler just to accept or reject each packet as it comes in rather than buffering outgoing packets and checking all incoming data against the stored data, and having to check timestamps and delete packets from the buffer.

#### WBahn

He indicated earlier that he really can't change the packet format of the packets going to B because B wouldn't be able to understand them.

The real question here is whether it is better to accept echoes that should have been discarded or to discard packets as echoes that really weren't. More specifically, is one or both of those NOT acceptable?

#### John P

If unit B rejects packets that have extraneous data in them, then my idea is no good. But I wasn't necessarily saying that the added material should be a header; it could be a few bytes at the end of the data block within the packet (5 bytes saying "bug13" maybe). I'm probably asking too much, where I want unit B to accept the modified packet and ignore the added data, but to echo the packet exactly as it was received.

But the fact that echoed packets may not even come back in the same order they were received in makes this buffering scheme pretty complex. It does seem like the only solution, though.

#### WBahn

The saving grace is that the link is very slow, so there is a lot of time to spend searching the buffer. Plus, because it is so slow, the number of packets that need to be buffered is going to be pretty small.

#### bug13

Hi WBahn

In order to save memory, can I calculate a CRC-16 of a packet and save it in the buffer as a reference instead of saving the whole packet?

My question is: is a CRC-16 unique for every different packet? Or do I need some other kind of checksum that produces a unique number for every different packet?

Thanks!

#### WBahn

The answer to whether there is any kind of checksum that will produce a unique number for every different packet is no (unless the checksum length is at least as long as the packet length). Think about it: if you have 16-bit packets and an 8-bit checksum, then you have 65536 possible packets but only 256 possible checksum values. So, on average, there will be 256 packets that map to a given checksum value. These are called hash collisions.

So the question isn't whether it will be possible for hash collisions to occur (because it is), but whether the odds of a hash collision are sufficiently low that you are willing to accept the risk and pay the penalty if and when it does happen. If you use a 16-bit hash (such as CRC-16), then you will have 65536 possible checksums. So you would expect, on average, for there to be one collision in any set of (roughly) 256 packets (this is an example of the Birthday Paradox that goes as the order of the square root of N). If the number of packets you need to keep is much less than this, then this might be acceptable. But you probably want at least a 32-bit hash (and a CRC is not, for this purpose, a very good hash).
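
As one possible way to follow the "at least a 32-bit hash" advice, here is a sketch using 32-bit FNV-1a. The choice of FNV-1a is mine, not something named in the thread, and `fnv1a32` is a hypothetical helper name; any well-mixed 32-bit hash would serve the same purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a packet's bytes. The result can be stored in the
 * shadow buffer in place of the full packet copy. */
uint32_t fnv1a32(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint32_t h = UINT32_C(2166136261);  /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= UINT32_C(16777619);        /* FNV prime; wraps modulo 2^32 */
    }
    return h;
}
```

Storing 4 bytes per packet instead of the whole packet still leaves a small chance that a genuine packet from B is dropped: with k stored hashes and a well-mixed 32-bit hash, an unrelated packet falsely matches with a probability of roughly k / 2^32.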

#### Picbuster

Joined Dec 2, 2013
1,025
I was professionally involved in protocol conversion and data block recognition (you call it filtering).
This is very difficult. You need to know:
Speed
Sync or async
Data format (length of bytes); you are used to 8 bits, but I have sometimes worked with 12 bits (sync bitstream).
Type of protocol: X.25, IP, TCP/IP; these could all be transported over an async or sync line.
(SOH/STX/EOT/NACK/ACK etc.)
Is a challenge and response used?
Encryption?

To be able to find out what is going on, you need a really fat, high-speed CPU at the data input.

#### bug13

• My MCU is running at 48 MHz; if you are not familiar with PIC, think of it as 12 million instructions per second.
• Async.
• Not sure what you mean by data format; here is what I think you want: the basic data unit is a byte (8 bits), and the packet length is variable. It depends on the packet; the length of the packet is embedded in the packet itself.
• I think it's a proprietary protocol (I have not seen it anywhere else), e.g. a packet looks like "BC FF 81 FF 3CF2" in plain ASCII text. A packet always starts with 0x0D and ends with 0x0A.
• No encryption.
• Not sure what you mean by challenge and response.

#### Picbuster

OK, it looks like 8-bit bytes, async, starting with CR and ending with LF.
But it could have a checksum, and it may repeat a message on a NACK character or when nothing is received within x block transfer times.
Assume you have N strings: which would you like to filter, and why?
Who or what is generating this data?
You should have some information.
Challenge and response is used to exclusively identify the receiver.
In your case it could send a challenge (which looks like a random pattern between 0D and 0A, but is not) and wait for the correct answer.

#### bug13

• It repeats anything that follows its data format and goes into its RX, and spits it out from its TX.
• The reason I want to filter out messages is that I am only interested in the messages the device generates, not its repeated messages. (The device also generates its own messages in the same format.)
• The repeated messages are generated by many external devices. (Same format as the messages generated within the device.)

#### Picbuster

Well, you should make a printout and identify the blocks you need.
Then look for a unique identifier and use this as a trigger.
Then program:
loop
Read block from CR up to and including LF.
Send block to destination.
Scan for identifier; when true, send block to snoop port.
end loop

#### bug13

The problem is the repeated messages could be exactly the same as the messages generated internally by the device.