find repeated number

MrChips · Apr 5, 2020

I am gonna milk this one as long as I can. There are still many weeks of isolation ahead of us.
I am going to supply the solution in drips and dribbles.

The solution is very similar to the concept of hashing.

jpanhalt · Apr 5, 2020

How many Tcy for an array of a specific size (defined at your pleasure)?

WBahn · Apr 5, 2020

Gajyamadake said:
I haven't given up yet I am watching this thread very closely I am interested to see how do you solve the problem first in your head and then how do you make flow chart for that.

So does that mean that you aren't interested in putting in any more effort on your own? Instead, you are just going to wait for someone else to do all the work and give you the solution?

That's a shame, we were only a couple of steps away from a working solution in https://forum.allaboutcircuits.com/threads/flowchart-find-repeated-number.168406/post-1495131

WBahn · Apr 5, 2020

MrChips said:
The solution which I will present does not require sorting or comparisons.
It is simple and fast. I cannot think of any other solution that would be faster.

Can anyone guess what is the solution?

I've already described a linear-time algorithm to solve the problem. It'll be interesting to see how much similarity there is between yours and mine.

A couple of other linear time algorithms would be:

IF we know and have a reasonable range on the values of the numbers (and assuming they are integers), then we can just use the number as an index to an array and increment the value at that index in a counter array.

If we don't have suitable constraints on the values, then we can perform some kind of hash on the value to get a suitable index and do the same thing, except that we need to store the actual value in a list of some kind associated with each index.

jpanhalt · Apr 5, 2020

MrChips said:
The solution which I will present does not require sorting or comparisons.
It is simple and fast. I cannot think of any other solution that would be faster.

Can anyone guess what is the solution?

I don't need to guess, post #31 is faster. The gauntlet is down.

WBahn · Apr 5, 2020

jpanhalt said:
I don't need to guess, post #31 is faster. The gauntlet is down.

Perhaps you could walk through a toy example for that algorithm to illustrate how it works.

For instance, let's say I have an array with four values: {2, 19, 4, 19}

Your algorithm says that I need four registers and then I use the values in the array as indices into the array, right. So what do I do when the value is 19?

MrChips · Apr 5, 2020

I arrived at this solution after discussing with @dl324 about putting balls in a bag.

jpanhalt · Apr 5, 2020

WBahn said:
Perhaps you could walk through a toy example for that algorithm to illustrate how it works.

For instance, let's say I have an array with four values: {2, 19, 4, 19}

Your algorithm says that I need four registers and then I use the values in the array as indices into the array, right. So what do I do when the value is 19?

My algorithm worked in simulation with MPLab 8.92. If there is a need to explain it, I will do that later. Basically the act of entering a number creates a marker that can be efficiently (btfsc/s) tested by subsequent numbers for duplication.

Since the original question involved just a few numbers, I simply used relative jumps in RAM to do that. Any memory can be used. With much bigger sets, I would reduce the number of registers/instruction memory used, if required, by addressing individual bits,* much like one does when drawing on a GLCD.

Of course, since the TS seems to be using 16F PIC's, that is what I wrote it for. If one goes outside that hardware limitation and uses an MCU that can address individual bits with a variable (e.g., bsf REGA, x and btfsc/s REGA,x are not specifically allowed when x is a variable), then that changes the implied rules. If that is to be allowed, why not use a PC with gigabits of RAM, which is way out of my league.

*Effectively using %8 to reduce the number of registers needed.

iimagine · Apr 5, 2020

If memory/storage capacity isnt a problem and the range of number is finite, then the fastest and easiest is to store each number representing its own index: For example
Numbers = [2, 6, 199, 3201, 4.....]
data = []
for (i = 0; i < sizeof(Numbers) - 1; i++)
x = Numbers;
if x > sizeof(data);
resize(data, x);
data[x] = data[x]++;

Of course the array have to be resize each time x is greater then its length.

jpanhalt · Apr 5, 2020

iimagine said:
If memory/storage capacity isnt a problem and the range of number is finite, then the fastest and easiest is to store each number representing its own index: For example
Numbers = [2, 6, 199, 3201, 4.....]
data = []
for (i = 0; i < sizeof(Numbers) - 1; i++)
x = Numbers;
if x > sizeof(data);
resize(data, x);
data[x] = data[x]++;

Of course the array have to be resize each time x is greater then its length.

I just "store" the index. It's quicker for later access. How many instructions does that code decompile to?

MrChips · Apr 5, 2020

@WBahn has described the same solution in post #84.

I thought about this when we described the problem as balls in a bag.
Take balls out of the bag one at a time.
Put each ball in a bag and write down on the outside of the bag the number marked on the ball.
This process is the same as the hash process described in a previous post.

How many bags would we need?
If there are 100 balls, then we will need a maximum of 100 bags.
This is how the hash function works. For example, the hash function when given a value "12345", for example, will return a numerical hash index which is going to be the number of our bag. So the solution is simply that the hash index is the same as the number on the ball. In other words, the hash function hash(x) = x.

Our bags will be numbered sequentially from 0 to the max number.
Suppose that we are restricted to numbers from 0-65535. Then we will need 65536 bags. Period. End of story.

Suppose we are told that the numbers are in the range 1 to 100. Then we need 100 bags, regardless of the total number of balls. You take one ball at a time and put it into its proper bag marked with the same number on the outside of the bag. Given boundaries on the input we can place restrictions on the model.

The point here is, you take a problem that may appear abstract mathematically and find a real world working solution to the problem, such as putting numbered balls into numbered bags.

Here is the engineering problem solving methodology.

1. Define the problem. List the specifications, limitations and restrictions.
2. Create a working solution after researching all possibilities.
3. Draw a flowchart.
4. Implement the solution.
5. Test the solution

MrChips · Apr 5, 2020

If you still cannot do this on your own, here is a code snippet of the solution:

for (i = 0; i < n; i++ ) ++count( x(i) );

Correction:
The corrected statement is:
for (i = 0; i < n; i++ ) ++count[ x(i) ];

jpanhalt · Apr 5, 2020

MrChips said:
@WBahn has described the same solution in post #84.

I thought about this when we described the problem as balls in a bag.
Take balls out of the bag one at a time.
Put each ball in a bag and write down on the outside of the bag the number marked on the ball.
This process is the same as the hash process described in a previous post.

How many bags would we need?
If there are 100 balls, then we will need a maximum of 100 bags.
This is how the hash function works. For example, the hash function when given a value "12345", for example, will return a numerical hash index which is going to be the number of our bag. So the solution is simply that the hash index is the same as the number on the ball. In other words, the hash function hash(x) = x.

Our bags will be numbered sequentially from 0 to the max number.
Suppose that we are restricted to numbers from 0-65535. Then we will need 65536 bags. Period. End of story.

Suppose we are told that the numbers are in the range 1 to 100. Then we need 100 bags, regardless of the total number of balls. You take one ball at a time and put it into its proper bag marked with the same number on the outside of the bag. Given boundaries on the input we can place restrictions on the model.

The point here is, you take a problem that may appear abstract mathematically and find a real world working solution to the problem, such as putting numbered balls into numbered bags.

Here is the engineering problem solving methodology.

1. Define the problem. List the specifications, limitations and restrictions.
2. Create a working solution after researching all possibilities.
3. Draw a flowchart.
4. Implement the solution.
5. Test the solution

That's it? I call that sorting since you are taking balls that are not in particular order and putting them into sequentially numbered bags.

Moreover, that is exactly what the algorithm and code in my post does. If you looked at it, it takes very few instructions in Assembly for the actual process. The "numbers" are RAM offsets..

The difference is that rather than "move" the balls to the bags, I simply "mark" the target bags and leave the balls where they were. It takes less code and fewer Tcy to read a mark (a set bit) than to move the contents and test for zero. It also allows for easy compression/packing to preserve RAM.

Moving the "ball" to the bag and determining whether the bag is empty is not only slower, the ball takes up space in the bag. My code with trivial modification (setting up the FSR to read/write program memory instead of RAM) could use an unused bit in, say, an existing, non-packed ascii table. Slightly more changes would allow me to store eight "balls" in a single register. That is similar to using only 8 registers to store 64 bits needed for a column in a 64x128 GLCD.

I do agree with you, it seems like a pretty efficient way in terms of speed. Your method wastes time and space by actually moving the balls.

WBahn · Apr 6, 2020

MrChips said:
If you still cannot do this on your own, here is a code snippet of the solution:

for (i = 0; i < n; i++ ) ++count( x(i) );

What is the function count() returning?

Why not count[x(i)]++ ?

For this approach to work, you have to use a hash function that is collision free. That's pretty restrictive -- in fact it largely defeats the purpose of using a hash function in most applications.

It's also quite restricting in this application, as well. Even with 16-bit unsigned integers for data values that's an array with 64k entries. But if you simply allow 32-bit unsigned integers for data now you need an array with 4G entries, all of which need to be initialized no matter how few the number of entries in the data array. That's a pretty hefty constant-time penalty to overcome. If the data are random integers on a 64-bit system, you are simply SOL unless you hash them down to a manageable space, but now you have to content with collisions, which means you need to maintain some kind of a list to be able to determine the actual values that hashed to the same index.

But all of this is WAY beyond where the TS is at.

Gajyamadake · Apr 6, 2020

WBahn said:
But all of this is WAY beyond where the TS is at.

I feel I am trying to solve a more difficult question than my ability. I made many efforts to solve it but it is not so easy as you say, first solve the problem inside the brain

jpanhalt · Apr 6, 2020

Anything can be obfuscated and jargon only adds to the clutter. Try the logic I posted a few times along with the code in post #31. Despite denials by some, it is a simple sorting scheme appropriate for the problem you presented. If you want to extend it to 64-bit numbers, that is more than a bit different.

MrChips · Apr 6, 2020

I have corrected the code snippet as follows:

for (i = 0; i < n; i++ ) ++count[ x(i) ];

I explained in the post that the hash function is simply the value of the number itself. There is no actual hash function involved. There is no sorting involved.

I also explained that if the values can be any 16-bit value then you would need 65536 bags (i.e. an array of 65536 entries).
I placed limits on the problem. For example, values will be from the integer set 1 to 100. This reduces the storage requirements.

All of the discussion was aimed directly at the TS. The TS is struggling with two concepts, (1) problem solving, and (2) drawing flowcharts. TS is doing things backwards. TS wants to write code and then attempts to match the flowchart to the code. This is backwards. Flowchart structure is methodical and consistent. Similarly, coding is methodical and consistent once the flowchart has been drawn.

I am trying to demonstrate to the TS that before you can draw the flowchart, you need to formulate a solution. In helps if you can create a real-life scenario, such as balls in a bag, in order to allow you to visualize and work through the steps of a solution. The only jargon that was added was the concept of "hashing" because this is the general solution to the problem with large data sets and large value ranges.

jpanhalt · Apr 6, 2020

So far as I can see it, the big revelation turned into a rehash of what was presented long ago to the TS.

We seem to have distinctly different understandings of what "sort" means:

In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order.

https://en.wikipedia.org/wiki/Sorting_algorithm

Anyway, it appears the TS has lost interest.

MrChips · Apr 6, 2020

jpanhalt said:
We seem to have distinctly different understandings of what "sort" means:

True, putting numbered balls into its respective containers would qualify as sorting in the human perspective.
When I think of sorting as a computer algorithm, I think of putting elements in an ordered list using multiple nested iterations.

WBahn · Apr 6, 2020

Gajyamadake said:
I feel I am trying to solve a more difficult question than my ability. I made many efforts to solve it but it is not so easy as you say, first solve the problem inside the brain

Which is why I am trying to walk you incrementally to a workable solution.

Go back and do your best to accomplish the next step that was asked in: https://forum.allaboutcircuits.com/threads/flowchart-find-repeated-number.168406/post-1495131

Do it using C code, or pseudocode, or a flowchart.

Currently we have:

Code:

#include <stdio.h>

int main(void)
{
    int data[11] = {5, 3, 7, 1, 6, 3, 9, 12 , 21, 13, 16};
    sizeOfData = 11
    for ( int  i = 0;  i < sizeOfData;  i++)
          printf( "data = %d \n", data[i]);

    return 0;
}

Here's a hint: you want to use a nested loop (loop within a loop), you only need to add a single line of code that will be nearly identical to one of the lines you already have, and the output statement will be changed to something like:

printf(" "(%d, %d)\n", data, data[j]);

Thread starter	Similar threads	Forum	Replies	Date
C	IRLZ44N IR2110 74HC04 Half Bridge design on PCB struggling to find issue with maybe wiring or incorrect connection.	PCB Layout , EDA & Simulations	1	Tuesday at 8:14 PM
	Trying to find this exact plug	Datasheets, Manuals & Parts Identification	3	Jun 30, 2026
	CCNP in Seattle can't find work	Jobs & Career Advising	4	Jun 1, 2026
V	How to find CPU Load for microcontrollers	Microcontrollers	14	May 6, 2026
	program to find maximum time repeated number in array	Programming & Languages	16	Mar 3, 2020

find repeated number

Join our Engineering Community! Sign-in with:

find repeated number

MrChips

jpanhalt

WBahn

WBahn

jpanhalt

WBahn

MrChips

jpanhalt

iimagine

jpanhalt

MrChips

MrChips

jpanhalt

WBahn

Gajyamadake

jpanhalt

MrChips

jpanhalt

MrChips

WBahn

You May Also Like

STMicro Unwraps Low-Power Image Sensors for Always-On Designs

Rewiring the Spine: The Tech Restoring Movement After Spinal Injury

Geehy Debuts First Motor Control MCU on Its New G32F0 Platform

Arm Axion Heads Google’s 8th-Gen TPUs as Cloud Pivots to Agentic AI