# Shortest strings having divisibility properties in both directions.

Discussion in 'Math' started by WBahn, Jan 7, 2018.

1. ### WBahn Thread Starter Moderator

Mar 31, 2012
22,620
6,729
So here's a pretty simple problem that, on the surface, seems pretty complicated.

Imagine a string of 0's and 1's (call it 's') and the reverse of the string (call it 'r'). For instance

s = 1100
r = 0011

Now, interpret 's' as a base-2 value and interpret 'r' as a base-3 value.

s = 1100 (base-2 = 12)
r = 0011 (base-3 = 4)

If 's' is a string such that 's' is divisibly by 3 but 'r' is NOT divisible by 2, then we put 's' into a collection of strings and call that collection the language L.

In this case, 's' is not in L.

The problem is to determine whether or not any string 's' will be in L. If not, to prove it convincingly. If so, find the shortest string that is in L.

So that's the task you are being given here.

------------------------------------------------------------------------------------
For those interested in where this rather odd problem came from, read on.

I'm getting my lecture materials put together for this upcoming semester and have been trying to come up with interesting problems that are sufficiently simple that they can be solved by hand using the concepts of finite automata. In particular, I'm trying to come up with problems that involve the use of most of the concepts we are covering in order to solve (at least via a finite automaton).

For those not familiar with the term "finite automaton", it is essentially a super simple finite state machine. You have an input string of symbols and, on each clock cycle, you read one symbol and transition to the next state based only on your current state and the current input symbol. Each state outputs either a 0 or a 1. These are all suppressed by a global signal that enable the output once the string has been completely read as well as halting the machine. If the final output is a 1 then the machine "accepts" the input string otherwise it doesn't.

In playing around with a couple of ideas, I thought of strings that, when read in one direction and interpreted as being in one number base are divisible by one thing and, when read in the other direction and interpreted as being a number in a different number base, are divisible by something else. They idea is for the students to construct small machines that perform incremental parts of the task and then combined them.

Originally I wanted to have the problem be to find strings that were appropriately divisibly in both directions, but in trying to build up the equivalence classes, I noticed a pattern in that while all of the strings that were base-2 values divisible 3 were, when reversed, base-3 values divisible by 2, the converse was not true -- there were base-3 strings divisible by 2 for which the reversed strings were not base-2 strings divisible by 3. If this turned out to always be the case, then the language degenerates.

But then the idea hit me to incorporate a few other concepts. The first is taking the complement of a language, and the second is constructing a language to prove or disprove a hypothesis.

The hypothesis here is the following: Every binary value that is divisible by three will, if reversed and interpreted as a base-3 value, be divisibly by 2.

To prove that this hypothesis is wrong, it is sufficient to come up with a single value that violates the hypothesis. Any approach used to find that value is technically legitimate since the correctness of the counterexample can be proven directly from it. But to prove that the hypothesis is correct, you must provide a convincing mathematical argument that no such value exists.

It turns out that this problem, using finite automata, is actually quite easy to solve. In about twenty minutes I had constructed a machine that proved that such values DO exist and that could be used to generate them.

Think of the kind of program you would write in a high level language to do this -- and it would probably not be capable of proving that no values exists should that be the case. With that in mind, I find it somewhat remarkable that a six-state automaton is all that was needed and the pattern of values that are accepted is quite simple.

Even more powerfully, if it had turned out that no such values exist, then the machine would have had but a single state and that would have constituted a rigorous proof of the hypothesis.

cmartinez likes this.
2. ### bogosort Active Member

Sep 24, 2011
111
38
Neat problem, will be a good learning opportunity for your students.

Though, wouldn't you only need 5 states? I wrote up a DFA with a 3-state transition table to find the words divisible by 3, reversed and interpreted those in base-3, and fed them into a 2-state DFA to accept the non-multiples of 2. Of the first 512 bit strings, it found 45 words in L, the smallest being 21 (10101).

3. ### Tesla23 Senior Member

May 10, 2009
355
77
10101

I'm don't know if it is the shortest, but I'm pretty sure it has the fewest number of 1s in it.

L consists of all binary strings that are divisible by 3 such the number of '1's in the representation is 0 or 1 mod 3 (so 1, 3, 4, 6, 7, 9... '1's). Divisibility by 2 mod 3 is easily tested by adding the digits - if the sum (mod 3) is divisible by 2 then so is the number, and as the digits are 0/1, we only have to count the '1's (the algebra is the same as the test for divisibility by 9 mod 10).

The only binary strings with one '1' are 1,2,4,8..., none of which are divisible by 3. 10101 has 3 '1's and is divisible by 3.

Divisibility by 3 in binary is easily tested by treating the number as if mod(4) by pairing the digits. If the sum of the base 4 digits is divisible by 3 then so is the number (as above, the algebra is the same as the test for divisibility by 9 mod 10), so 010101 is a simple number to construct with 3 '1's that is divisible by 3, leave off the leading '0' and voila.. 10101 is in L.

Last edited: Jan 9, 2018
4. ### WBahn Thread Starter Moderator

Mar 31, 2012
22,620
6,729
Glad you liked it! I was concerned that it was going to be way too esoteric for anyone to even play with.

The idea is for them to construct a single DFA. The problem was specifically constructed so that the subpieces could be done with two- and three-state machines.

Using two DFAs is practical in this case because the smallest string is sufficiently small so that a brute force filtering of the first N strings is almost guaranteed to find it -- though if you are cranking them by hand it might not depending on how quickly you give up. But if the first string had happened to be 64+ bits long, then even using a computer (with the resources available for something like this) wouldn't have found it. I'd LOVE to figure out a problem where the shortest string in the language is at least ten symbols (more than that and the manual work becomes just too onerous).

An alternative to finding a problem that requires a machine needing a lot of states is to task them with finding the lexicographically first string having something like 10 or 20 symbols that is in the language. Still doable via computerized search, but most students won't write a computer program unless it is required (which has me shaking my head since these are people either about to get or who already have a degree in Computer Science). Some problems would lend themselves to asking for the first 30 symbol string, but many would be hard pressed to justify more than about 10.

If the problem had had no solution (if L was empty), then the two DFA approach would not have provided a convincing proof -- at least as I'm seeing things right now.

However, by combining the two DFA via intersection, you end up with DFA that, if the language is empty, will contain no reachable accept states and that can be minimized to a single nonaccepting state. That is a compelling and rigorous proof. In this case, you end up with a six-state machine that is extremely simply and that therefore allows you to, by inspection, find that 10101 is the shortest string in the language. It's also easy to see the pattern for generating other strings. In this case, the number of leading zeros is unrestricted, so the first 32-bit string would consist of twenty-seven 0's followed by 10101.

I'm also going to have them construct a regular expression for this language, which is also something I don't think you can do (at least easily) from the two separate machines since I don't know how you can take the intersection of two regular expressions. But I suspect a number of people will try, most by using the intersection operator in their expression.

As an aside, do you know of any good resource for manipulating regular expressions? It's barely touched on in any of the texts I have and I have to admit that even manipulating two pretty simple regular expressions to show that they are equivalent (without showing that their corresponding minimal DFAs are equivalent) is extremely challenging for me, even when I can verbally justify why they are the same. I'm hoping to find a good table of identities.

5. ### WBahn Thread Starter Moderator

Mar 31, 2012
22,620
6,729
Yep, it's the shortest.

For the problem given here, it is a perfectly valid approach and is very akin to the type of reasoning I used as I was playing around with it, if only to sanity check that my machines made sense. The only shortcoming is not proving that it is the shortest string, but I suspect that the analysis could be continues to prove that, as well.

This is the kind of reasoning that I strongly encourage the students to engage in, even though the course is about finite automata. Whether they look for these kinds of constraints at the beginning or the end, it would help them understand the structure of the problem, some of the other avenues available to them, and most importantly give them the ability to check their final machines to ask if they really do make sense -- it is SO easy to make a simple mistake and mess up the machine, but most students won't even attempt to verify the correctness of their solutions.

The other thing that it would hopefully impress upon them is that analyses like this tend to be extremely problem specific. If I change the base or change the divisor it will likely be much harder to do an analysis like this, particularly to confidently find the shortest string. But the approach using finite automata is very general and you can even write a program to implement the algorithm that generates the machine and then either finds the first string in the language or declares that the problem has no solution. I think that level of understanding is an important goal for a course like this one.

6. ### Tesla23 Senior Member

May 10, 2009
355
77
Yep - the division tests jumped out to me as I read the post so it was easy to work out, but as you say - change the divisors - and I don't have the time.

7. ### atferrari AAC Fanatic!

Jan 6, 2004
2,943
1,089
Sorry William,

I hardly could follow the whole thing but when coming to the above, I lost you. Yes, I am rather far from all this in my daily activity.

8. ### WBahn Thread Starter Moderator

Mar 31, 2012
22,620
6,729
No problem.

"Lexical" means the words of a language and "graphic" means the writing of something (at least in this context).

Lexicographical means the ordering the words of a language. In every day use we say "alphabetical", but that confuses people when the alphabet contains symbols other than letters (and, really, it shouldn't -- you have an alphabet and that alphabet has a defined order).

There are two types of lexicographical orderings. One is what we normally think of in that all words, no matter how long, that begin with 'a' will be listed before any words that start with 'b'. This is sometimes referred to as "dictionary order". The others lists short words before longer words (and uses dictionary order within groups of words of the same length). This is usually referred to as "shortlex" order.

atferrari and panic mode like this.
9. ### bogosort Active Member

Sep 24, 2011
111
38
Doesn't this run into the halting problem? If a DFA recognizes all words in L, then L is regular and decidable. But if a DFA accepts no words in L, then L is not a regular language; it may not even be recursively enumerable. How then can a non-accepting DFA be proof that L is empty?

This is one of those implementation details that needs some thought when doing this in code. In my first attempt, I used unsigned 64-bit integer containers for everything, which of course quickly overflowed when treating the reversed bit strings as base-3 numbers. The process of making the code robust -- and figuring out its various limitations -- is a worthy experience in itself.

Wish I could help, but my experience with regular expressions doesn't go much beyond their practical use. Formally proving that two arbitrary regexes are equivalent seems like a hard (NP-hard?) problem. Good luck!

10. ### WBahn Thread Starter Moderator

Mar 31, 2012
22,620
6,729
All DFAs halt and, hence, are deciders and all regular languages are therefore decidable.

The empty language is a regular language since a DFA can be constructed that accepts the empty language.

The intersection of two regular languages is also regular.

The problem of whether the language accepted by a DFA is empty is a decidable problem.

In this case, we construct a machine that accepts every string that is a solution to the problem. If the resulting DFA turns out to accept the empty language, then that means that there ARE no strings that are a solution to the problem.

11. ### bogosort Active Member

Sep 24, 2011
111
38
This last point gets to the heart of my confusion. If the resulting DFA accepts the empty language, is that really sufficient to say that the problem has no solution? What if the problem was undecidable in the first place? Or what if the problem is decidable, but the valid words in the language are larger than the number of states in the DFA? The DFA, unable to recognize the language, will wrongly accept the empty language.

12. ### WBahn Thread Starter Moderator

Mar 31, 2012
22,620
6,729
The resulting DFA accepts EXACTLY every string that is a solution to the problem.

We have a language, L, that, by definition, is the set of all strings that satisfy the problem requirements. Each and every string that solves the problem is in the language while each and every string in the language solves the problem.

We don't know yet if it is regular, but IF we can construct a DFA that recognizes it, then it is, by definition, a regular language.

In this case we can construct this DFA, call it M, incrementally. We'll index the rest of our out machines Mx and the language accepted by Mx will be called Lx.

First we construct a DFA, call it M1, that accepts all strings over the alphabet {0,1} that represent base-2 value divisible by 3.

We can design this DFA using the properties of modular arithmetic. It will have at most three states (we will go ahead and accept the empty string -- but we will need to consider the implications of this more carefully later).

Next we construct a DFA, call it M2, that accepts all strings over the alphabet {0,1} that represent base-3 value divisible by 2.

But we don't need M2, we need a machine that is different in two regards. First, it need to read the string in the reverse order, and second it needs to accept the strings that are NOT divisible by 2.

So first we will reverse the machine to form a new NFA, M3, by having the start state make epsilon transitions to the existing accept states (and making them nonaccept states), reversing the direction of all of the arrows, and making the original start state the single accept state (regardless of whether it was or wasn't an accept state originally).

We then convert this to a DFA, M4.

We create M5 by taking the complement of M4, which is done by simply swapping the accept status of every state.

Now we have two machines, M1 and M5. A string is in our language if and only if it is both languages, which means our desired language L is the intersection of L1 and L5 and for a string to be in L it must be accepted by BOTH machine M1 and M5.

But we can construct a new machine, M, that does exactly this -- accepts strings if and only if they are accepted by both M1 and M5. The result of this construct is our final machine.

Thus, the strings accepted by M are EXACTLY the strings that are solutions to the problem. Any string accepted by M is a solution to the problem and any string not accepted by M is not a solution to the problem.

If it turns out that M doesn't accept any strings, then that means that no string exists that is a solution to the problem because, if it were, it would accepted by M.

Now, what about the empty string?

Since we defined our language such that the empty string was considered divisible by whatever the divisor was, it is in L1 through L4 but is NOT in L5. Since it is not in both L1 and L5, it will not be in L.

As for the number of words in the language being larger than the number of states, that is almost always the case. Most (not all) regular languages are infinite in size to begin with, including ALL of the ones discussed here. But any finite automaton has to have a finite number of states -- that's what makes it a "finite" automaton. All this means is that there must be at least one loop in the machine.

13. ### bogosort Active Member

Sep 24, 2011
111
38
Suppose that we encode the size of a compiler's machine code output in bytes by the number of 0s in a bit string. We'll also say that if the number of bytes is even, the bit string ends in a 1.

Let L1 be the set of bit strings that encode even-numbered compiler outputs. Clearly L1 is regular and we can build a two-state DFA that recognizes it.

Now suppose that we encode some compiler optimization metric in the number of 0s in a bit string, for instance, the number of memory accesses performed by the resulting machine code. We'll say that if the number of memory accesses is optimal (minimal), then the bit string ends in a 1.

Let L2 be the set of such bit strings. Again, we build a 2-state DFA. But L2 is not decidable, and so the DFA can never recognize it.

I don't believe that this is sufficient to say that L2 does not exist, or that the optimization problem does not have a solution (as that would imply solving the halting problem). It seems to me that DFAs can prove that a language is regular, but they cannot say anything about a non regular language. Am I wrong in this?

Yes, total lapse of reason on my part there. Anyway, most of my experience with finite automata is entirely practical (e.g., building FSMs for embedded systems), but I'm increasingly fascinated by the theoretical aspects, so I appreciate your patient dialogue.

14. ### WBahn Thread Starter Moderator

Mar 31, 2012
22,620
6,729
Both L1 and L2 are regular. Remember, a language is defined PURELY by the set of strings it contains. In the case of L2, the string 00001 and 00000001 are both in the language. The MEANING of those strings (i.e., the semantics) is a completely different issue.

I think your confusion comes from thinking that somehow the strings that get fed into the machine have some bearing on the language it accepts. In the case of L2, we may or may never feed it a string ending in a 1 into it. So what? If we did, it would be accepted because all strings ending in a 1 are in the language.

Think of it this way -- the DFA accepts all strings that satisfy the definition of the language but the strings in the language have a separate and external correlation to optimally compiled programs. The correlation is not rigid. If I feed a string ending in a 1 into this machine, it WILL accept it. That does NOT mean that a compiler has optimally compiled a program -- it just means that I happen to feed a string that ended with a 1 into the machine.

That's completely different than what I'm talking about. In this case, the language that the machine accepts is exactly the language consisting of all strings that satisfy the problem and that the strings satisfy the problem is rigid -- it is the STRING itself that is a solution to the problem.

Also, I'm not relying on whether or not anyone ever happens to feed a string that's a solution to problem into the machine to determine whether or not the problem has a solution. If the language the corresponds to the DFA has ANY string in it, then each of them is a solution to the problem. But if the language that corresponds to the DFA is the empty language, the no string that is a solution to the problem exists because, if it did, the machine would accept it because it would have to be in the language that is accepted by the machine. So if the machine is not capable of accepting any strings at all, it must therefore mean that no strings that are solutions to the problem exist.

Perhaps a simpler example would make it clearer.

Using the alphabet {a,b}, let's say that

L1 = {w | w the number of a's is odd}
L2 = {w | w the number of a's is even}

Now someone claims that there are no strings that have both and even and an odd number of a's and we are tasked with proving or disproving this claim.

So we build DFAs for L1 and L2. In their minimal form they are both two-state machines. We then construct a machine that is the intersection of these two which accepts strings if and only if they are in both L1 and L2. The result, by the general algorithm for combining DFA, is a four-state machine with a single accept state. However, this state is unreachable and if the machine is minimized it consists of a single nonaccepting state. Therefore the machines that accepts all strings that are in both L1 and L2 is incapable of accepting any strings, therefore there are no strings that are in both L1 and L2.

15. ### bogosort Active Member

Sep 24, 2011
111
38
Yes, thanks, I was trying to associate a decidability problem with a set of strings, but mucked things up.

Anyway, I think I now finally understand why a non-accepting DFA is proof that a language is empty (and hence the problem coupled to it has no solutions). It seems obvious in retrospect, but I needed your help sorting some of it out.

Consider: L = { w in {0,1} | the number of 0s equals the number of 1s }

DFAs can't count, so no DFA can recognize words in L. Obviously L exists, but since there's no way to describe it using a DFA, we can't actually build one that will accept or reject words in L -- any states we included would be arbitrary and meaningless. So DFAs have literally nothing to say about non regular languages.

On the other hand, if we can describe some language with a DFA, and if it consequently rejects every string, then we know with certainty that the language must be empty. After all, its entire purpose is to pick out the language.

So yes, now I totally agree that had the DFA of your student problem produced the null set, it would have been rigorous proof that the problem had no solution.

Again, thanks for the patience.

16. ### WBahn Thread Starter Moderator

Mar 31, 2012
22,620
6,729
More than my pleasure!

17. ### crutschow Expert

Mar 14, 2008
19,122
5,344
It's interesting to me that I enjoy solving electronics problems that have a practical purpose, but have never had much interest in solving problems just for the sake of it being a problem, such as your string question, or crossword puzzles, or jigsaw puzzles, or ....
I likely would not do well in your class.

18. ### WBahn Thread Starter Moderator

Mar 31, 2012
22,620
6,729
The thing to keep in mind is that problems like this are not done for their own sake. When you were first learning to analyze electronic circuits, did every problem you solve have some practical purpose? Why did you have to solve problems that thousands of people before you had already solved? If the goal was simply to get a circuit having a practical purpose, why weren't you just shown how to look up the cataloged results of other people's work instead of reinventing the wheel. Perhaps the goal was something else and most of them intended to teach you how to think so as to be able to eventually solve new electronics problems that did a practical purpose?

It is the same with this problem. The intent is not to find a string that has the properties indicated. It is to start to learn how to cast problems into a form such that they CAN be solved -- or, perhaps even more importantly, into a form that allows us to rigorously prove that no solution exists. This let's us discover some very interesting and ultimately useful things about the capabilities and fundamental limitations of algorithms and computation.

19. ### bogosort Active Member

Sep 24, 2011
111
38
Exactly. The theory of computation is fundamentally a theory of what is knowable in the universe (I like Scott Aaronson's description of theoretical computer science as quantitative epistemology). I don't think it's an exaggeration to say that one of the most profound insights in human history is the relatively recent understanding that problems can be categorized by how difficult they are to solve. While any particular problem is unique in detail, the theory of computation allows us to distill the problem to its computational essence. The great insight is that, so distilled, vast amounts of apparently different problems are shown to be equivalent , and so if you know how to solve a problem in one class, you know how to solve any problem in that class.

WBahn's student exercise is an excellent case in point: there is a class of problems that can be described by a regular language. The theory of computation tells us that any such problem can be solved (or proved to have no solution) with a DFA. While the details of WBahn's problem may have seemed as relevant as a crossword puzzle, there are quite literally an infinite number of very practical problems that are exactly equivalent. Being able to reason about such things in a cogent, systematic manner will serve his students well in their careers.

It's no accident that researchers in other sciences -- notably physics and biology -- have been bringing the results of theoretical computer science into their fields. It from bit, indeed.

20. ### WBahn Thread Starter Moderator

Mar 31, 2012
22,620
6,729
So I gave some thought to trying to come up with a problem that you might consider had a practical purpose but that could be solved (or proven unsolvable) using this kind of reasoning.

Let's say that you have a computer program that someone wrote and this program takes an input from source and decides if that input is good or bad. Every possible input you can feed into the machine should either produce a good or a bad output. The company that owns this program comes to you and says they want a finite state machine (that reads the input one symbol at a time and, upon reading the last symbol (or, more correctly, upon detecting that there are no more symbols in the input buffer) it has to make the same good/bad declaration) that can replace this computer. They are willing to pay serious money -- perhaps billions of dollars -- for both the hardware and the time to develop the machine. All they care about is whether or not you can, given enough time and enough resources, eventually deliver a machine that can replace their computer program. Of course, if you don't deliver, then the penalties are so steep that you will only take the contract if you know that the problem is solvable, even if it means hiring a huge team of people to spend their career working on it.

Do you take the contract? Can you guarantee your potential customer that, given enough time and resources, you can accomplish this?

This is one of the big understandings that comes out of the Theory of Computation (more commonly known as Finite Automata). A finite state machine (a.k.a, Finite Automaton, FA) has limits on its computation capabilities, and those limits are greater than what are placed on more powerful machines. A computer is generally modeled as a Turing Machine (TM), but a more reasonable model is a Linear Bounded Automaton (LBA). In either case, however, it can be proven that there exist problems that can be solved by the latter (the TM or the LBA) that fundamentally cannot be solved by an FA. But ANY problem that can be solved by an FA can be solved by an LBA or a TM.

Well, that leaves the door open for you because you now know that at least SOME problems that can be solved by a TM can be solved by an FA and if the computer program you are being asked to implement as an FA is one of them, then you become super wealthy. So you ask the company if you can examine the program to decide if it is one that can be implement as an FA. They agree, but only on the condition that you have to prove to them whether it can or cannot. As before, they will pay you very handsomely if you can do that but you get penalized ultra heavily if you can't make the proof. Now remember, at this point they aren't saying you have to prove that it can be; they don't care what the outcome of the analysis is, as long as it is a firm yes/no outcome.

Do you accept this contract? Can you guarantee them that, given enough time and resources, you will be able to tell them whether or not their program can be implemented as an FA?

This is another one of the big understandings that comes out of the Theory of Computation. By first doing some very nonintuitive mathematical proofs that show things like the size of the set of natural numbers is exactly the same size as the size of the even natural numbers or of the set of rational numbers, but not of the set of real numbers, you can rigorously proof that it is impossible to construct and algorithm that can answer the question of whether or not that program can be implemented by an FA.

This is very valuable because not only does it let you know that you should NOT take that contract, but it also prevents you from wasting a lot of time and money trying to solve a problem that has no solution.

In fact, it can be proven that, given an arbitrary Turing machine, it can not always even determined whether or not that machine will ever halt when processing an arbitrarily chosen finite length input. This has profound implications for the field of software verifiability, the Holy Grail of Computer Science.

And all of it stems directly out of being able to solve questions like this string question.