OK, so I have a question about throughput.

So let's say I have a microcontroller. This microcontroller has been programmed to take in input blocks of 128 bits each, 32 bits at a time, so it takes 4 clock cycles to clock in 128 bits over a 32-bit input bus. The number of input blocks is variable: it can be 1 block or 2,000 blocks. But the output is fixed at 64 bits, no matter how big the input is, so it takes 2 clock cycles to output 64 bits over a 32-bit output bus.

Now let's say I input one 128-bit block, and it takes 60 clock cycles for the microcontroller to process a 128-bit block and produce a 64-bit output. Let's also say the system clock frequency is 20 MHz. How do I calculate the throughput?

Is it: (1 / 20 MHz) × 60 clock cycles = 3 µs (3 microseconds = time to process one block)

1 s / 3 µs = number of blocks that could be processed in 1 second ≈ 333,333

And each block is 128 bits, so 128 bits × 333,333 blocks per second ≈ 42,666,667 total bits per second

So throughput ≈ 42.67 Mbit per second
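To sanity-check the arithmetic, here's a minimal sketch in Python using the numbers from my question (variable names are just illustrative):

```python
clock_hz = 20e6          # 20 MHz system clock
cycles_per_block = 60    # processing cycles for one 128-bit block

time_per_block = cycles_per_block / clock_hz   # 60 / 20e6 = 3e-6 s (3 us)
blocks_per_sec = 1 / time_per_block            # ~333,333 blocks per second

# Input-side throughput: 128 bits consumed per block
input_mbps = 128 * blocks_per_sec / 1e6        # ~42.67 Mbit/s

# Output-side throughput: 64 bits produced per block
output_mbps = 64 * blocks_per_sec / 1e6        # ~21.33 Mbit/s

print(f"{input_mbps:.2f} Mbit/s in, {output_mbps:.2f} Mbit/s out")
```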

Is this how it would be calculated?

Or, instead of using the 128 input bits in the calculation, do I have to use the 64-bit output instead?

E.g.:

64 bits × 333,333 blocks per second = 21,333,312 bits per second => throughput ≈ 21.33 Mbit per second

What about latency? Do I have to factor in the time it takes to clock in the input block and clock out the output? So instead of using 60 total clock cycles, would I use 4 + 60 + 2 = 66 total clock cycles?
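If the I/O cycles do count (i.e. input, processing, and output can't overlap), the same sketch with 66 cycles per block would give (this assumes the three phases are strictly sequential, which depends on the actual hardware):

```python
clock_hz = 20e6
cycles_per_block = 4 + 60 + 2   # clock in + process + clock out = 66

latency = cycles_per_block / clock_hz   # 3.3e-6 s = 3.3 us per block, end to end
blocks_per_sec = 1 / latency            # ~303,030 blocks per second

input_mbps = 128 * blocks_per_sec / 1e6   # ~38.79 Mbit/s (input side)
output_mbps = 64 * blocks_per_sec / 1e6   # ~19.39 Mbit/s (output side)
print(f"latency {latency * 1e6:.1f} us, {input_mbps:.2f} Mbit/s in")
```

If instead the bus transfers can overlap with processing (pipelined I/O), throughput would stay at the 60-cycle figure and only the per-block latency would grow to 66 cycles, but that depends on how the part is actually built.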

Am I close?

Thanks..