Algorithm for determining L1 is a subset of L2.

bogosort

Joined Sep 24, 2011
696
An algorithm does NOT by definition have a finite number of steps.
This is wrong. An essential property of any algorithm is that it completes in a finite number of steps.

The "algorithm" described by your flow chart cannot work for languages that have infinite number of words, and is therefore generally unsuitable to determine when L1 is a subset of L2. This is precisely the reason why the set theoretic gobbledygook is necessary, because it allows us to make definitive statements about sets, whether or not they are finite in size.

Barring that, there's likely something already here_ your question's already been done by some developer:

Title: Algorithms in C, 3rd Ed. [Parts 1-4, Fundamentals, Data Structures, Sorting, Searching]
Author: Robert Sedgewick
ISBN: 0-201-31452-5
I happen to have the first edition of that book on my shelf and, unless it's scope has changed drastically, it's focus is on practical algorithms. Such a book will be of no use to the OP's current inquiries, which are squarely in the theoretical realm.
 

Thread Starter

zulfi100

Joined Jun 7, 2012
656
That is not quite correct. Regex's represent automata theory- a means by identifying a pattern of data using a pattern mask that is a tokenize form allowing expansion.

An algorithm does NOT by definition have a finite number of steps. It may have a specific number of sub-processes, but how many times those processes are stepped through is not always known. Recursion is an example, and something very widely used in automata theory for regexp.

Instead of gobbledegook, why don't you simply make a flow-chart. It is the MOST logical representation of an algorithm to meet your need. Determining if one set of tokens is a subset of another set of tokens is trivial, conceptually speaking. I'm not sure why you're making it so difficult.

Something like this--

View attachment 190768

It might not be perfect, I slapped it together, on the way out the door.

Barring that, there's likely something already here_ your question's already been done by some developer:

Title: Algorithms in C, 3rd Ed. [Parts 1-4, Fundamentals, Data Structures, Sorting, Searching]
Author: Robert Sedgewick
ISBN: 0-201-31452-5
 

Thread Starter

zulfi100

Joined Jun 7, 2012
656
You said that \( A \subset B \) means that all members of A are also members of B. You then found the expression ∀x(x ∈ A → x ∈ B) on Wikipedia. Can you see that both the plain language version and the Wikipedia version mean the same thing?

How can we make an algorithm from this? As a first iteration, we could try:
Code:
1. remove an element x from A
2. check if x is an element of B
  2a. if x is in B, go back to step 1
  2b. otherwise, stop:  A is not a subset of B
If the algorithm halts without ever reaching step 2b, then A is a subset of B.

But an algorithm, by definition, must have a finite number of steps. If A and B are infinite sets, then what we've described is not an algorithm.

Now, the real goal is to show that, if L1 and L2 are regular languages, then there exists an algorithm that can determine if L1 is a subset of L2. Since regular languages can indeed be infinite, we cannot use such a naive algorithm and must think a little more abstractly. Do you know about the closure properties of regular languages?
Yes I know the closure property.
Thanks for the algorithm I have modified it in the context of comparing Strings of L1 and L2.

Zulfi.
 

WBahn

Joined Mar 31, 2012
30,077
An algorithm does NOT by definition have a finite number of steps. It may have a specific number of sub-processes, but how many times those processes are stepped through is not always known. Recursion is an example, and something very widely used in automata theory for regexp.
Not knowing how many times it might be executed and having a finite number of steps are not mutually exclusive.

For instance, a finite automaton always decides whether a string is in the language or not in a finite number of steps -- namely a number of steps equal to the length of the input string. We have no idea how long the input string might be, but we do know that any input string is finite in length, and thus the number of steps is finite.

This was one of the revolutionary shifts that came about around the turn of the 20th century. Before that we had just an informal notion of what an algorithm was, but thanks in large part to David Hilbert's problems for the 20th century (most specifically his 10th problem), in trying to phrase his problems very precisely he introduced the notion that an algorithm must SOLVE a problem in a FINITE number of steps. This notional shift is what led Church and Turing to explore the consequences of this view and essentially led to the field of computational science.
 

WBahn

Joined Mar 31, 2012
30,077
Yes I know the closure property.
Thanks for the algorithm I have modified it in the context of comparing Strings of L1 and L2.

Zulfi.
So if you have a DFA (M1) that accepts L1 and a DFA (M2) that accepts L2, how can you combine these machines to get a new one (M3) that can conclusively answer the question of whether L1 is a subset of L2.
 

Thread Starter

zulfi100

Joined Jun 7, 2012
656
You're missing the point.

The fact that "IF A ⊆ B THEN (A-B) = ∅" does not logically mean that "IF (A-B) = ∅ THEN A ⊆ B" is true. In some situations it is, but in others it is not. So no proof that is based on this premise is a valid proof.

Now, if you can show that "IF and ONLY IF A ⊆ B THEN (A-B) = ∅", then you can make the second assertion and use it as a premise in your proof.
Q. What you mean by second assertion?

Zulfi.
 

Thread Starter

zulfi100

Joined Jun 7, 2012
656
So if you have a DFA (M1) that accepts L1 and a DFA (M2) that accepts L2, how can you combine these machines to get a new one (M3) that can conclusively answer the question of whether L1 is a subset of L2.
I dont thing that combining would do any help, because we have to find subset, we dont know what language is accepted by L1 and what language is accepted by L2 and how this knowledge can help us in finding subset?

Zulfi.
 

WBahn

Joined Mar 31, 2012
30,077
I dont thing that combining would do any help, because we have to find subset, we dont know what language is accepted by L1 and what language is accepted by L2 and how this knowledge can help us in finding subset?

Zulfi.
Draw a Venn diagram of L1 and L2 in which L1 is a subset of L2.

What do you see MUST be true about the intersection of L1 and ~L2 ?
 

Thread Starter

zulfi100

Joined Jun 7, 2012
656
First assertion: IF A ⊆ B THEN (A-B) = ∅
Second assertion: IF (A-B) = ∅ THEN A ⊆ B

The second assertion does not follow from the first.
Thanks a lot. I was thinking about this but can't frame it mathematically.

But it looks as if the first one is derived from the second one and vice versa.

Zulfi.
 

WBahn

Joined Mar 31, 2012
30,077
Thanks a lot. I was thinking about this but can't frame it mathematically.
It IS framed mathematically. The problem is that you are having a hard time comprehending what the math means -- something that nearly every struggles with over and over.

It usually helps to use examples that illustrate what the math means in ways that are easy for us to comprehend based on what is obvious to us given our experiences.

But it looks as if the first one is derived from the second one and vice versa.
This is a common fallacy and I've already given you examples to illustrate it. Let's try another one.

First assertion: If (I poured water on the ground) then (the ground is wet).
Second assertion: If (the ground is wet) then (I poured water on the ground).

Let's stipulate that the first assertion is always true. This does NOT lead to the conclusion that the second assertion is true -- the ground might be wet because it just rained or any of a host of other reasons.

So, in general, we have to be careful to avoid basing proofs and conclusions on the belief that "If (condition A) then (condition B)" infers "If (condition B) then (condition A)".
 
Top