Unbreakable encryption using a (home made) hardware dongle.

Discussion in 'Math' started by THE_RB, Aug 13, 2012.

  1. THE_RB

    Thread Starter AAC Fanatic!

    Feb 11, 2008
    5,435
    1,305
    Hi, this is an idea I have had for a while and I would like to get some feedback from the "great minds" of the forum (ie people who I have discussed RNGs and encryption etc with in the past).

    The idea would be to use a cheap PIC 18F micro that has +10k ROM and +2k RAM, as a very good Psuedo RNG (PRNG) like my Black RNG, which produces a very large stream of random data where the process in the PIC cannot be reverse engineered by receiving any of that data.

    The PIC is plugged into a USB serial port only when encryption or decription of a file is required, and only sends/receives serial (so there is no access to re-program the PIC).

    Basic principle to encrypt a file;
    1. typed in text password -> seeded into blank PC cache

    2. PC runs its PRNG -> fills PC cache with random data
    3. a 32bit checksum (of the cache) is sent to PIC, PIC uses checksum to reseed its cache
    4. PIC runs it's PRNG -> fills PIC cache with random data
    5. a 32bit checksum is sent back to the PC, PC uses that checksum to reseed its cache
    (goto 2 and repeat)

    The process ends when an amount of random data has been produced that is the same size as the file to be encrypted. Encryption occurs on the fly, so each byte of the file is encrypted in turn.

    This means there is never one "state" of either RNG which represents the whole encryption, and could possibly be "captured" by a memory capturing trojan etc on the PC.

    Likewise the remote PIC cannot have it's state captured, any serial snoop trojan etc could only ever capture the 32bit checksums to and from the PIC.

    Ideally, if a third party does not have access to the PIC the encrypted data is uncreakable.

    Also, if two identical PICs exist they can be used by two parties to decrypt each other's encrypted files.

    Any weaknesses? The best I have at this point is that if the typed in password was known (say captured with a trojan keylogger) then that message encrypted with that password could possibly be decrypted by someone who could fully reverse engineer the PC side software AND had a recording of all the serial checksums.

    However if the text password was different then all checksums and checksum replies would be different, so knowlege of a previous password or previous checksums would be useless. So ideally if a different password was used on each file or transmission then they still remain unbreakable.

    I'm open to suggestions. :)
     
  2. bretm

    Member

    Feb 6, 2012
    152
    24
    32 bits can be brute-forced. Send each 32-bit value to the PIC, record the 32-bit response, and store the 16GB lookup table. Now replace the pic by a simple lookup with lots of storage.

    Or just guess at the initial seed. Only 2 billion guesses on average before its cracked. 32 bits is too small.

    Even if you use a larger seed size, the pic is just acting as a mixing function. And it's a random mixing function which means lots of collisions and funnels. There are better schemes to use.
     
  3. Wendy

    Moderator

    Mar 24, 2008
    20,765
    2,536
    It is funny, I've been thinking of making a true random number generator. Something simple, that come out with a 1 or 0 in absolute random fashion. From there you can make anything.

    At that point you could use a truly large SD card or thumb drive, which would be easy to duplicate (so both parties have the key). 32GB ought to be pretty uncrackable, and would last a very long time as keys go. Use the section only once, and I don't think it can be cracked.

    Funny how ideas can percolate from different angles.
     
  4. THE_RB

    Thread Starter AAC Fanatic!

    Feb 11, 2008
    5,435
    1,305
    Bretm, thanks for the reply. I'm not sure if you are familiar with my "black RNG" but it has an enormous amount of states, in addition to data deletion, making it not possible to reverse engineer the input or states by receiving the output.

    Even on a cheap PIC 18f series the cache could be seeded with a 10k random sequence stored in ROM (80000bits or 10^24000 combinations!) and the RNG cache itself would probably be 1000bytes (8000bits) in length and scrambled X amount ot times to finally produce a 32bit checksum from the 8000 bit cache.

    The result of course is that the same 32bit question sent to the PIC would result in a different 32bit result every time, with internal permutations well over 10^24000 * 10^2400.

    Hi Bill, it's good to see your interest. I have some work done on a simple PC software that does what you suggest, which is to hash encrypt the user data with a RNG file, and record an index (so the same piece of RNG data cannot be accidentally re-used). As you say this is unbreakable, unless someone has the RNG data file.

    I also have a couple of versions of software which generate RNG data of equivalent entropy to natural RNG data, which was the point of my Black RNG; it allows a small amount of natural RNG data to be used to generate large amounts of RNG data that cannot be reverse engineered. :)

    I have also looked at having a large RNG "key" on a dongle, but there are some security issues. For data to be read from the dongle in one or more files, it is opened by the PC using Win32 fopen() which can be hacked by trojan software on the PC, so if someone puts a trojan on your PC (say via the internet) the next time your system looks at the RNG files on the dongle the data can be covertly captured and stored in a hidden file on the PC, then sent back to the hacker on the internet. So your dongle is defeated.

    What I am proposing above is that the dongle be intelligent, and acts as vital a part of the encryption process. Because the comms to the dongle is only checksums back and forth the data and processes inside the dongle cannot possibly be hacked by interception or recording of the checksums. Somebody must physically posess the dongle, or decryption cannot take place. :)
     
    Last edited: Aug 14, 2012
  5. nsaspook

    AAC Fanatic!

    Aug 27, 2009
    2,907
    2,167
    What is your synchronization process (dongle sequence key exchange or ‘vector’) on the decoding dongle? You have a secret passkey ("secret" key encryption) known only to the sender and receiver that the send side uses to generates a sets of random data that is used to encipher the plaintext to ciphertext. The receiver types in the same passkey to generate the same random data stream (random data stream to decrypt ciphertext to plaintext) for the decoding process so the receive dongle must know where in the random data generation process the send dongle is to work. Will the same key and plaintext generate the same ciphertext everytime (codebook mode)?

    http://www.tccsecure.com/whitepapers/images/whtchrt.gif

    http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf
     
  6. WBahn

    Moderator

    Mar 31, 2012
    17,737
    4,789
    It sounds like you are relying on security through obscurity; namely, you have to keep the inner workings of your dongle a secret. That's pretty much a non-starter and has been for decades.

    The philosphy you need to adopt is that you devise your algorithms and your hardware design and then you contract with your worst enemy to actually implement them and you buy the end result from the bad guys and, assuming you can verify that what they implemented was your algorithms and your hardware, without change, then you can still carry out secure communications with your allies even if the bad guys get a complete copy of your ciphertext and have working copies of your hardware and know exactly how the algorithms work.

    If your system can't meet that test, then no one is going to be interested in it.

    To see why you don't want your hardware and algorithms to be part of the secret that has to be maintained, consider what would happen if your idea caught on: Entity X (be it some corporation or government or whatever) wants to use you device and starts out by giving Alice and Bob each a dongle so that they can carry on secret communications. Now you want to give Charlie and Denise dongles so that they can also carry on secret communications. Are these two dongles the same as the ones given to Alice and Bob, or are they unique to Charlie and Denise? Either way, eventually, you want to be able to have each of the 10000 people in your organization to be able to securely communicate with everyone else. If everyone has the same dongle, then you have 10000 people that have these dongles and if even one of them is compromised (falls into the wrong hands) you have a serious problem. How long do you think it will be before one of them is lost? Based on anecdotal information we collected from people in the secure comms arena, my guess is that you are talking about days if not hours. The alternative in the other extreme is for every pair of people to have a dongle that is unique to them (and which, if compromised, sheds no light on how any of the other dongles work). The good news is that, if a dongle is lost, the only compromise is for the communications carried out between the pair of people that used that particular dongle. The bad news is that now each person has to have 9999 and you have to make and distribute a million dongles (half a million unique pairs).

    This is the problem with any shared-secret (i.e., symmetric) communication system -- the classic key distribution problem. It's bad enough if your keys are a couple hundred bits of data, but if your keys (the stuff that has to be kept secret) is your hardware and your algorithms, you are sunk before you even begin.
     
  7. Papabravo

    Expert

    Feb 24, 2006
    10,140
    1,789
    I was impressed with your page for the Black RNG. FYI, there is a broken link to John Walker's page. For some background on the issues surrounding random numbers I've always liked Knuth. He spent 177 pages in Volume 2 of The Art of Computer Programming laying out a fairly convincing case that the best you could hope to achieve with computation was a pseudo-random sequence with a finite period. He also devoted some pages to testing for randomness.

    http://www.amazon.com/Art-Computer-...&ie=UTF8&qid=1345087856&sr=1-8&keywords=Knuth

    The sole hope for a true random number generator lies in natural processes such as beta decay of radioactive isotopes -- yikes!! Pseudo-random will just have to do for this cowboy. I note that the Black RNG comes very close to this level.

    If you are not familiar with MD5 or SHA-1 hash functions, and the competition to select the next great cryptographic hash function, the following is actually an interesting read.

    http://en.wikipedia.org/wiki/Cryptographic_hash_function
    http://en.wikipedia.org/wiki/NIST_hash_function_competition

    I was intrigued with the competition's goal to raise the level of computing power required for a brute force crack. I hope you find some useful information here or in the links.
     
    Last edited: Aug 16, 2012
  8. nsaspook

    AAC Fanatic!

    Aug 27, 2009
    2,907
    2,167
    Security through obscurity works well in some situations just not when it means trusting an obscure algorithm or method. Our nuclear weapons release control EAM systems are almost entirely composed of systems that use (unbreakable) one time pad tokens for confirmation of orders. So the trust in the system is not in the actual message (normally top-secret and encoded by a separate crypto-system) that can even be sent in the clear during an emergency but in the challenge and response of the authenticity of it's sender by data that can only be known to the select few that have the code books and response tokens. The distribution of the media (or dongle as very well descried by WBahn to lots of people who can't be trusted) is usually the problem for obscurity methods, a problem that our government has solved by having lots of armed men with big scary guns around anyone with access to keys.

    I can't really get a good idea of his system because it's so incomplete as a idea and mainly talks about a randomization process instead of an actual crypto system.

    PS: This is what those EAM messages look like: http://www.dod.gov/pubs/foi/operation_and_plans/NuclearChemicalBiologicalMatters/120.pdf :D
     
    Last edited: Aug 16, 2012
  9. WBahn

    Moderator

    Mar 31, 2012
    17,737
    4,789
    This doesn't sound like security through obscurity, to me. But that really hinges one how you choose to partition the concept of the system/algorithm/protocols from the key material. The one-time pad element would certainly be just part of the key material, as would the challenge-response tokens. The codebook, on the otherhand, could probably be argued either way.

    Also, it should be pointed out that just because the designers/users of a system choose to keep elements such as algorithms, implementations, and protocols secret does not make the system dependent on security through obscurity. That is determined by whether the security fails if those elements are compromised. An example would be an organization that used PKI choosing to keep both keys secret instead of just the private keys.

    But I definitely agree that security through obscurity is reasonably achievable on a sufficiently small scale. I can come up with a algorithm based on even pretty simple non-linear equations and carry on quite a bit of traffic with someone and even if the NSA got all of my ciphertext they would probably be unable to decode it provided the total volume of intercepted material remained below some critical threshold (which, unfortunately, I would be unlikely to know and therefore likely to exceed it without realizing it).

    In the extreme, the same is true even with the simplest monoalphabetic ciphers. For instance, if the only cipher material the bad guy ever got was JQVY, the fact that I used an easily breakable cipher does him no good since the message could literally be any sequence of four unique characters, of which there are 358,800 (assuming a 26-character alphabet). If I had used a one-time pad, there would only be 456,976 possibilities, meaning that I still have 78.5% of the message space available.
     
  10. nsaspook

    AAC Fanatic!

    Aug 27, 2009
    2,907
    2,167
    Security through obscurity by it's self is bad but when added to known good elements it can be effective in adding security if the original element is compromised. Lets say you use SSH and instead of using the standard port you edit the config file to use something else (a secret port number). A zero-day SSH exploit comes out and it looks for systems to crack using the standard port, after many systems are hacked the exploit becomes public, you scan your logs and see many attempts on the standard port but none on the port you moved your SSH daemon to. You have gained security through the obscurity of moving the port.
     
  11. THE_RB

    Thread Starter AAC Fanatic!

    Feb 11, 2008
    5,435
    1,305
    Hi NSAspook glad to have you onboard. The PIC dongle is a large PRNG using 8000 bit encryption, seeded from the PICs internal lookup table of 80000 bits. So yes, if the dongle is fed the same sequence (from the start) of 32bit checksum "questions", it will return the same sequence of 32bit checksum "answers".

    So to (I hope) answer your question, if two users have the same data file and same typed password, their two PCs will send the same checksums to the two dongles, and the two dongles would return identical checksums.

    Basically I was hoping to capitalise of the big strength of my Black PRNG (massive number of states and no data reversing) and the big weakness of it (with no natural entropy introduced it will produce the same PRNG sequence).

    That was exactly my design goal and I think (at least at this stage) it is successful. Not only would data be secure if the bad guys had all the source code for the PC side and PIC dongle side, but it would still be secure EVEN if they hacked my PC and had my text passwords AND a copy of all the checksums to/from the dongle. :)

    Re your other question, no, all dongles have a 80000 bit "key" in the PIC rom, and possibly other variables. That gives a minimum of 10^24000 combinations.

    I appreciate you bringing up the flaw of distributing hardware dongles, but this is not a general use thing. Mainly it was to be a project for me, so I know that anything I secure will remain secure and only my dongle will unlock it. And if I make one for me I don't mind releasing the software etc so other people can make one. At this point it's just in the idea stage. :)

    To PapaBravo, thanks for the links I will read those shortly. :) I'll check the link to John Walkers tester ENT.EXE, his page may be gone which would be a shame.

    I'm interested in your input and don't mind clarifying the procedure I had in mind. What would you like to know?
     
  12. WBahn

    Moderator

    Mar 31, 2012
    17,737
    4,789
    I'm missing something here. If I have all of the source code for the PC and the PIC dongle and I have all of the text passwords, then what am I missing that allows the intended recipient to decrypt the data but that prevents me from doing the same?

    If you release the software and other stuff so that other people can make their own, then what, if anything, is different about their dongle than yours?
     
  13. nsaspook

    AAC Fanatic!

    Aug 27, 2009
    2,907
    2,167
    How is the data file used (or not used) to generate the passed PC checksums. If a set password key is used with different files will the PC and dongle always return the same sequence of checksums as it generates the required amount of random data (keystream) for your (unknown,stateless?) encryption/masking function?

    This can be a weakness (static electronic codebook mode) that can be exploited with a long lived key, large amounts of known ciphertext and plaintext (with a large amount of repeated data strings) if there is not a 'salt or vector' (CBC) used to change the start position used within of the sequence space of your PRNG for a set key in a reversible way on the receive end. Identical strings of data in the same file position in different files should not have the same ciphertext with the same key if the files differ overall. The key must be somehow be different for each different plaintext so the same keystream (generated from the KEY, the PC and the PIC) can't be reused.

    This is a (tweaked for simplicity) demonstration of the weakness in general using image blobs and small pixel blocks.
    http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=3&cad=rja&ved=0CFkQFjAC&url=http%3A%2F%2Fwww.turbocrypt.com%2Fvpics%2F9a8f098c615a425eab6d17c804dd67ae%2Fwhitepapers%2Fbackup_attack.pdf&ei=pxgtUKi3I4mDiwKyl4GgAw&usg=AFQjCNHVIi8fv35E0xJO9J1qQgGAMjR6KQ&sig2=k2LzQz2lKue1Piu5d60IzA
     
    Last edited: Aug 16, 2012
  14. bretm

    Member

    Feb 6, 2012
    152
    24
    Maybe not possible, but also completely unnecessary. Just going by your original post, the PIC receives 32-bit input and returns 32-bit output. That's all I need to know. I can "borrow" it for a while, feed it all possible inputs and record the outputs. Now I can implement a clone that just returns the appropriate output for the given input. I didn't need to reverse engineer your algorithm at all, and yet I have a device that behaves identically as far as external behavior is concerned.
     
  15. WBahn

    Moderator

    Mar 31, 2012
    17,737
    4,789
    I must still be missing something. Let me restate your process based on my present understanding and then perhaps you can see where I am going astray from what you intend.

    Initial Conditions:

    You have a file of B bytes that you wish to encrypt.

    You have a PC with:
    a) a PRNG on it.
    b) a block of memory used as this PC cache.
    You have a PIC dongle with:
    c) your Black PRNG on it.
    d) a block of memory used as the PIC cache.

    For simplicity's sake, let's say that the PC cache is 2kB and that the PIC cache is 1kB. Let's further say that the file we want to encrypt is 16kB.

    Is this close to the process you are talking about:

    Initialize System:

    Seed the PC PRNG with 0 (or some other static value).
    Seed the PIC PRNG with 0 (or some other static value).

    Your Step 1:

    Store the text password in the PC cache with the rest of it being blank.

    Your Step 2:

    Use the PC PRNG, possibly combined with the present contents of the PC cache, to fill the PC cache with random data (presumably overwriting the text password that was initially stored there).

    This 2KB of data is then XOR'ed with the first 2KB of the data file (or in some other way used to encrypt the first 2kB of the file).

    Your Step 3:

    The 32-bit checksum covering the 2KB PC cache is then transmitted to the PIC dongle. Since 'reseeding' generally (to my knowledge) means starting over, the state of the PIC's PRNG is reinitialized in such a way that its subsequent outputs are a deterministic result of the 32-bit checksum value it received.

    Your Step 4:

    The PIC dongle then uses a procedure, perhaps similar to the one the PC used, to generate a 1kB randomized PIC cache. Other than being used to determine the 32-bit checksum used in the next step, this block of random data is not used for anything.

    Your Step 5:

    The 32-bit checksum covering the 1KB PIC cache is then transmitted to the PC. Again, since 'reseeding' generally (to my knowledge) means starting over, the state of the PC's PRNG is reinitialized in such a way that its subsequent outputs are a deterministic result of the 32-bit checksum value it received.

    You then go back to Step 2 and repeat until the entire file is encrypted.

    Does that help you see where my understanding is departing from what you are trying to describe?

    If this is a reasonably accurate description of the process, then it would seem that the PIC dongle is indistinguishable from a 16GB look-up table in which a given 32-bit checksum is used to look up the corresponding 32-bit value.

    Now, if you don't actually 'reseed' things and let the prior state carry forward, that would be a different matter. This is basically the ECB vs CBC mode issue mentioned by others.
     
  16. bertus

    Administrator

    Apr 5, 2008
    15,647
    2,346
  17. THE_RB

    Thread Starter AAC Fanatic!

    Feb 11, 2008
    5,435
    1,305
    Hi WBahn, the PIC dongle contains a 10kbyte (80kbit) random "key". That key would be different for each user. Obviously if the PIC source was distributed, people making their own PIC dongle would add in their own 10kbyte key.

    Also there is no need to supply PC source code, only a working PC software application. People would use their own PIC dongle and use the standard PC application.

    Thanks for raising a very important point! Here is my original take; The PC hashes it's cache with the user typed password (NOT the data file). Then the PC RNG initiates the scrambling, sending the first checksum, which will have been based on the typed password.

    So you are correct, if the same typed password was used on two different data files, the same sequence of checksums would occur, and the same final data key (the actual hash used to encrypt the data).

    OK, as a possible solution; the PC could generate a 128bit random key, and use that AND the typed password to start the process. That means the same typed password and same dongle would produce a totally different series of checksums, and different encryption. But of course the 128bit random key would need to be stored and transmitted, with the encrypted file. I think that might be acceptable since knowing that 128bit "starter" key won't help anybody decrypt the data.


    Thanks for bringing up a possible security issue, but I had looked into this. OK; 1. if they get the dongle then security is breached, they can decrypt all my files. That's already about a worst case scenario for a dongle based system!
    2. If they want to borrow the dongle and make a clone and return the dongle without me knowing (spy style) then it is not going to be fast! The first 32bit checksum into the PIC will produce one specific 32bit result, as you are saying. Now allowing for 1mS serial comms time and 1mS PIC RNG processing time, to get all the 32bit results requires 4 billion iterations (4*10^9) at 2mS each. That is 8 million seconds or 92 days!

    More importantly the second checksum from the PC to PIC is not in any known sequence and the second checksum will generate a different result BASED on what the first checksum was! So to get all possible no2 checksums requires 32*32bit worth of tests!

    So JUST to record the first 2 checksums from the dongle would require 3.7*10^11 days or 1 billion years. To get just 3 checksums would take 4 billion billion years! Now imagine a normal encryption uses hundreds of checksums. ;)

    To WBahn; re the process used. I'm sorry I did not explain it fully in the first post. You got a lot right but with a few important differences;

    1. PC cache is initially seeded with the typed user password. (And now probably a 128bit random "starter" key as discussed above). Then the PC RNG is run, for many iterations until the cache data is no longer reversable.

    2. On being intiialised, the PIC does a similar RNG process, where its cache is seeded with its own 80kbit ROM key.

    Then a loop happens;
    2. PC cache produces a 32bit checksum, this is transmitted to PIC. The PIC then partially seeds the checksum through its cache (I said reseeding but always meant a partial reseeding). This partial reseeding does not delete existing cache entropy, it just hashes the seed into (on top of) the cache in some way.

    3. the PIC runs its RNG, until the cache is mangled enough to not be reversable. Then it makes a checksum and sends that back to the PC.

    4. The PC does the same as the PIC did (in 2).
    5. The PC does the same as the PIC did (in 3).
    (repeat)

    So you can see the two RNGs scramble each other repeatedly. This forms a large dual (bifurcated?) PRNG that will always produce the same result (given the same dongle, same typed password and same 128bit starter key).

    Once the process has run X amount of loops, then a small amount of datakey bytes can be extracted each cycle. The data key bytes will be taken from the PC cache and won't be reversable to either the cache contents or the checksums.

    Then the datakey bytes are used in a simple XOR hash against the original data file, to encrypt it.

    Thanks Bertus for mentioning smartcards, I just went and had a read. :) They seem to have a similar functionality but I don't know how many times the encryption cycles, possibly just the once. From a first glance my proposed system offers much greater number of states, number of cycles and of course the buffer of using a serial USART so the PIC cannot be tampered (like in a contact style smart card or RF smart card with RF programming capability).
     
  18. Wendy

    Moderator

    Mar 24, 2008
    20,765
    2,536
    Does your device generate a random number from scratch, or does it take it from some other device?

    Trojans and other problems are usually only a problem if you connect your computer to a network, this is not a given per se. If you are really paranoid you encrypt on a secure machine, move the message over via a medium, then send on an unsecured machine. C64 anyone? :D
     
  19. THE_RB

    Thread Starter AAC Fanatic!

    Feb 11, 2008
    5,435
    1,305
    The total random state is generated over time, based on;
    1. (initial) typed password
    2. (initial) random generated "starter key"
    3. closed loop feedback from the PIC RNG during the process

    Possibly, but these days PDFs and many other processes can open executable files, and even on a "secure machine" someone may have had access to it at some point, to load a keylogger software or even plug in a hardware keylogger to your keyboard lead etc.

    To simplify the goal; if someone doesn't have your actual physical dongle then they can't decrypt your file, even if they can hack your PC, or even if they can get your file AND your password.
     
  20. nsaspook

    AAC Fanatic!

    Aug 27, 2009
    2,907
    2,167
    How is the PIC initialized after sequencing checksums from one file with it's key to the next file with a different key? How does it know to erase past history so the send/receive PICs can synchronize the new keystream with new plaintext as is needed with a "synchronous stream cipher"?
     
Loading...