points inside a shape

cmartinez

Joined Jan 17, 2007
8,253
you talked about reversing things if the angle was more than 180° for a given vertex, yet here you don't mention anything about that at all.
Actually, it's there in an implicit way. When you multiply two sides represented as vectors that have less than 180° the result is of a different sign than those that have more than 180° ... In fact, you could get away by just summing the angles at each vertex of the polygon (but for those angles that are larger than 180° we'd subtract the angle's value from 180°, thereby obtaining a negative value), and in the end get what you want, which is a sign representing the drawing orientation of the polygon.

what about points for which the normal vector to the nearest boundary vector doesn't intersect the boundary segment at all? How do you deal with those?
You just assume that all sides are of infinite length, and define them as linear equations, for the moment. The 90° intersection to the point being analyzed will probably lie outside the line's limits, but that is of not consequence. I don't think this method is too computationally demanding. By defining each side as a linear equation, you have m as the slope of that line, and therefore -1/m is the slope of the perpendicular intersecting line. Calculating the intersecting point for that side for the point being analyzed is child's play. What might be a little more demanding for the CPU would be calculating the distances between that point and each and every line (but it just involves a bunch of squares and square roots) and then finding the smallest value. Then again... I can't think of a more efficient way to do it.

EDIT: I also think that vector multiplication is far easier to perform, computationally speaking, than finding out the value of each angle at each vertex through trigonometric functions.
 
Last edited:

WBahn

Joined Mar 31, 2012
30,060
You just assume that all sides are of infinite length, and define them as linear equations, for the moment. The 90° intersection to the point being analyzed will probably lie outside the line's limits, but that is of not consequence. I don't think this method is too computationally demanding. By defining each side as a linear equation, you have m as the slope of that line, and therefore -1/m is the slope of the perpendicular intersecting line. Calculating the intersecting point for that side for the point being analyzed is child's play. What might be a little more demanding for the CPU would be calculating the distances between that point and each and every line (but it just involves a bunch of squares and square roots) and then finding the smallest value. Then again... I can't think of a more efficient way to do it.
Calculating 1/m (i.e., using division), is computationally expensive. And having to calculate distances from every point to every line is computationally expensive.

Why note only consider the border line segments that have the potential to intersect the line and then only do operations that involve addition and multiplication to determine if they intersect?
 

WBahn

Joined Mar 31, 2012
30,060
Here's one scenario that is bothering me:

Border.png

Let's assume that all other border segments are much further away from these two points than either A or B.

Both points are closer to the vector A than the vector B, but are on opposite sides of it. So when we multiply (which I've been assuming that you mean the vector cross product) between the normal to the vector A will get results of opposite sign, indicating that one is inside the shape and the other is outside the shape, when clearly they are both either inside or both outside.
 

cmartinez

Joined Jan 17, 2007
8,253
Here's one scenario that is bothering me:

View attachment 88785

Let's assume that all other border segments are much further away from these two points than either A or B.

Both points are closer to the vector A than the vector B, but are on opposite sides of it. So when we multiply (which I've been assuming that you mean the vector cross product) between the normal to the vector A will get results of opposite sign, indicating that one is inside the shape and the other is outside the shape, when clearly they are both either inside or both outside.
You're right... it seems that I made a mistake when I said:
The 90° intersection to the point being analyzed will probably lie
outside the line's limits, but that is of not consequence.
So let's add this rule. If the closest intersection to a given line lies outside its boundaries, then discard it and use the second closest line. In the case you've mentioned it would be vector B, for point C, and for point D it would be some other line beyond A and B. But if no line exists whose perpendicular intersect to that point lies within its boundaries, then the point is definitely outside the shape.
But I'm sure that if the point lies inside the polygon, there will always be a perpendicular intersecting point that is within the closest line's limits.

When I interpreted your "computationally expensive" expression, I thought that you were referring to trigonometric functions or recursive calculations.... yes, division and square operations (no need for square roots here, now that I think of it) are computationally expensive when you compare them with simple addition, subtraction or bit-shifting, for instance.
The reason why I don't think we'd even need square roots for this application is that we're not actually trying to find the exact value of the distance from the point to the side, but just the smallest value of a group. And if we don't calculate the square roots is not relevant.

I guess the only way to test the truth of what I'm saying is to actually write the code and test it in the CAD software of your choice. o_O
 

WBahn

Joined Mar 31, 2012
30,060
You're right... it seems that I made a mistake when I said:


So let's add this rule. If the closest intersection to a given line lies outside its boundaries, then discard it and use the second closest line. In the case you've mentioned it would be vector B, for point C, and for point D it would be some other line beyond A and B. But if no line exists whose perpendicular intersect to that point lies within its boundaries, then the point is definitely outside the shape.
Now it seems like we are completely defeating the point. Not only do we have to determine the distance to every border segment, but we have to check whether or not they actually intersect (starting with the nearest one and working out until we find one that does).

But I'm sure that if the point lies inside the polygon, there will always be a perpendicular intersecting point that is within the closest line's limits.
Really? What about this one:

Border2.png

Which border segment has a perpendicular to the red dot that intersects the line within the its limits?

What special rule is going to get added now?

When I interpreted your "computationally expensive" expression, I thought that you were referring to trigonometric functions or recursive calculations.... yes, division and square operations (no need for square roots here, now that I think of it) are computationally expensive when you compare them with simple addition, subtraction or bit-shifting, for instance.
The reason why I don't think we'd even need square roots for this application is that we're not actually trying to find the exact value of the distance from the point to the side, but just the smallest value of a group. And if we don't calculate the square roots is not relevant.

I guess the only way to test the truth of what I'm saying is to actually write the code and test it in the CAD software of your choice. o_O
Again, let's look at the method I have been recommending (with no claim that it is THE most efficient method).

Border3.png

Sort the segments according to the lowest y-coordinate of either of their end points. At the same time, determine the furthest right x-coordinate of any segment (the yellow line). Set a constant equal to the x-coordinate of that point plus some arbitrary amount: XMAX = (max x-coordinate) + 1.

The order of the segments might be the alphabetical order shown above (there are two sets of four segments, {C,D,E,F} and {M,N,O,P} that have the same lowest y-coordinate, so let's assume that those are ordered randomly.

We now want to find if the red dot is inside the polygon. So we construct the horizontal green segment that starts at the point and ends at any point to the right of the yellow line. That segment is trivial to construct with almost no computational cost. If the point has coordinates <xp,yp>, the other end of the segment has coordinates <XMAX,yp>.

We now filter our set of border segments to find those that potentially might intersect the green line segment. We first truncate the set beginning with the first line segment whose lowest y-coordinate is above the green segment, which is segment J. Since we can do a binary search for this, the cost is O(log(n)) where n is the number of border segments. Furthermore, each test is a simple comparison on one value, so the cost per step is cheap.

This leaves us with the list {A,B,C,D,E,F,G,H,I}.

We now walk through this list eliminating all points for which the second coordinate is also below the line segment. Even if we have to walk through this entire list, it is again a simple comparison.

This will leave us with just {H,I}.

We know eliminate any points for which BOTH x-coordinates are to the left of the candidate point. This involves up to two comparisons.

This will leave us with just {I}.

All three of these steps can easily be combined, though it means doing a linear search through the ordered border segments, but we pretty much had to do that in the second step anyway.

Once we have our final set of border segments, we just need to count how many of them actually intersect the green line. The first step here is to count every point for which BOTH x-coordinates are to the right of the candidate point since we KNOW that those must intersect the green segment. That removes segment I from the set and at this point we are done. If our count is odd, then we know that the point is within the shape.

If there are still any coordinate points left, which will almost always be a small set since it requires that one endpoint lie to the left and the other to the right, we can perform an intersection check against each segment. Each of these would normally require four cross-products, each requiring two multiplications and an addition. But our filtering has made it so that we only have to perform one of these -- namely the one that determines if the candidate point is to the left of the boarder segment. If it is, then the border segment intersects the green segment and if it's not then it doesn't.

We can make things even simpler because we don't actually need to count the intersections and then determine if the total is odd or even, we simply initialize a Boolean variable, say "within" to FALSE and toggle it everytime we identify a boarder segment that intersects the green line.

The only thing that isn't covered here is handling the special case when one (or both) of the candidate point coordinates exactly matches one of the coordinate values of one of the border segments.


We
 

cmartinez

Joined Jan 17, 2007
8,253
Now it seems like we are completely defeating the point. Not only do we have to determine the distance to every border segment, but we have to check whether or not they actually intersect (starting with the nearest one and working out until we find one that does).



Really? What about this one:

View attachment 88791

Which border segment has a perpendicular to the red dot that intersects the line within the its limits?

What special rule is going to get added now?



Again, let's look at the method I have been recommending (with no claim that it is THE most efficient method).

View attachment 88792

Sort the segments according to the lowest y-coordinate of either of their end points. At the same time, determine the furthest right x-coordinate of any segment (the yellow line). Set a constant equal to the x-coordinate of that point plus some arbitrary amount: XMAX = (max x-coordinate) + 1.

The order of the segments might be the alphabetical order shown above (there are two sets of four segments, {C,D,E,F} and {M,N,O,P} that have the same lowest y-coordinate, so let's assume that those are ordered randomly.

We now want to find if the red dot is inside the polygon. So we construct the horizontal green segment that starts at the point and ends at any point to the right of the yellow line. That segment is trivial to construct with almost no computational cost. If the point has coordinates <xp,yp>, the other end of the segment has coordinates <XMAX,yp>.

We now filter our set of border segments to find those that potentially might intersect the green line segment. We first truncate the set beginning with the first line segment whose lowest y-coordinate is above the green segment, which is segment J. Since we can do a binary search for this, the cost is O(log(n)) where n is the number of border segments. Furthermore, each test is a simple comparison on one value, so the cost per step is cheap.

This leaves us with the list {A,B,C,D,E,F,G,H,I}.

We now walk through this list eliminating all points for which the second coordinate is also below the line segment. Even if we have to walk through this entire list, it is again a simple comparison.

This will leave us with just {H,I}.

We know eliminate any points for which BOTH x-coordinates are to the left of the candidate point. This involves up to two comparisons.

This will leave us with just {I}.

All three of these steps can easily be combined, though it means doing a linear search through the ordered border segments, but we pretty much had to do that in the second step anyway.

Once we have our final set of border segments, we just need to count how many of them actually intersect the green line. The first step here is to count every point for which BOTH x-coordinates are to the right of the candidate point since we KNOW that those must intersect the green segment. That removes segment I from the set and at this point we are done. If our count is odd, then we know that the point is within the shape.

If there are still any coordinate points left, which will almost always be a small set since it requires that one endpoint lie to the left and the other to the right, we can perform an intersection check against each segment. Each of these would normally require four cross-products, each requiring two multiplications and an addition. But our filtering has made it so that we only have to perform one of these -- namely the one that determines if the candidate point is to the left of the boarder segment. If it is, then the border segment intersects the green segment and if it's not then it doesn't.

We can make things even simpler because we don't actually need to count the intersections and then determine if the total is odd or even, we simply initialize a Boolean variable, say "within" to FALSE and toggle it everytime we identify a boarder segment that intersects the green line.

The only thing that isn't covered here is handling the special case when one (or both) of the candidate point coordinates exactly matches one of the coordinate values of one of the border segments.


We
I... HATE... YOU.... :D

Seriously now, I'll study everything you've just said and get back to you tomorrow... In the meantime, nighty night...
 

WBahn

Joined Mar 31, 2012
30,060
I... HATE... YOU.... :D

Seriously now, I'll study everything you've just said and get back to you tomorrow... In the meantime, nighty night...
LOL!

Getting our ideas critiqued and challenged is one of the best ways to improve them, though not always the most enjoyable.
 

cmartinez

Joined Jan 17, 2007
8,253
Alright... I'm back. Newly rested and refreshed.

Here's the thing. I just know my algorithm works because I wrote it many years ago and I used it extensively for a while. But it seems that... ahem... I've misplaced it, and I just can't find it anywhere. So now I'm trying to reconstruct it.
As far as I remember, the first part involved finding out the drawing direction of the polygon. And it seems that that, at least, is working. Now I'm going to have to meditate more deeply, because calculating if the point was inside or outside the shape followed a step or two, and had no ugly special exceptions in its logic. Probably the mistake I'm making is using the faces of the polygon as references, instead of its vertex, or both. I'll just have to think and see.

When I'm done, I'll post it again for you to evaluate... see if you can poke holes in it... but be aware that I'll make it water-tight and bullet-proof :D. And then we can compare its efficiency with your algorithm.
 

cmartinez

Joined Jan 17, 2007
8,253
I haven't forgotten about this little challenge W, it's just that I'm overwhelmed with work at this point and hadn't had the time to think it through... but let me tell you, I think that the key lies into choosing the closest vertex to the point in question, draw a line (vector) from the point to the vertex, and from then infer which side of the fence the point's at.
I'm almost certain that the calculation involved is not that complicated.
 

WBahn

Joined Mar 31, 2012
30,060
I haven't forgotten about this little challenge W, it's just that I'm overwhelmed with work at this point and hadn't had the time to think it through... but let me tell you, I think that the key lies into choosing the closest vertex to the point in question, draw a line (vector) from the point to the vertex, and from then infer which side of the fence the point's at.
I'm almost certain that the calculation involved is not that complicated.
Whenever you get around to it, it'll be here waiting for you. I've already got a counter-example in mind for the description you've given here, though I just have a mental picture of it and so it may or may not be relevant.
 

BR-549

Joined Sep 22, 2013
4,928
Assuming you can detect the difference between points and boundaries, why not lay a x-y axis on the image area.

Record the point and boundary data that lay on and intersect the axis.

Rotate the image a little, depending on resolution.

Record the point and boundary data that lay and intersect the axis.

Rotate the image a little, depending on resolution.

Until 90 degrees.

Now I have no idea of what I'm talking about. But I would think that with the graphic power we have.....this would not be difficult.
 

cmartinez

Joined Jan 17, 2007
8,253
Assuming you can detect the difference between points and boundaries, why not lay a x-y axis on the image area.

Record the point and boundary data that lay on and intersect the axis.

Rotate the image a little, depending on resolution.

Record the point and boundary data that lay and intersect the axis.

Rotate the image a little, depending on resolution.

Until 90 degrees.

Now I have no idea of what I'm talking about. But I would think that with the graphic power we have.....this would not be difficult.
maybe if you were to explain it graphically, as you say....
 

BR-549

Joined Sep 22, 2013
4,928
When the axis is laid on the image, it creates the first column of four arrays. x,y,-x,-y.

The scale of the axis, will depend on resolution. As will the rotation steps(array columns).

After a 90 degree sweep, you will have four raw data arrays.

The first thing to do is to clear the false positives. As you go towards the center, you will get false positives, because of the changing length of arc and the pixel diameter.

When you look at the RAW data columns..............a point on the outside of the image will look like a point.

As you go to the center of the image, the points will appear as arcs. The real point will be at center of arc. The length of the arc is related to the distance to the origin. This is how you check if an arc has one or two points.

If you know the pixel diameter and the distance to the origin, this noise can be cleared and accurate points at the center area can be known. This pixeldiameterdistanceorigin constant will be applied to every data point. Not boundary points.

Now you have four quadrant arrays of the image. By noting the boundaries in the columns, you can determine the innies from the outies.

Does that work?
 
Last edited:

WBahn

Joined Mar 31, 2012
30,060
When the axis is laid on the image, it creates the first column of four arrays. x,y,-x,-y.

The scale of the axis, will depend on resolution. As will the rotation steps(array columns).

After a 90 degree sweep, you will have four raw data arrays.

The first thing to do is to clear the false positives. As you go towards the center, you will get false positives, because of the changing length of arc and the pixel diameter.

When you look at the RAW data columns..............a point on the outside of the image will look like a point.

As you go to the center of the image, the points will appear as arcs. The real point will be at center of arc. The length of the arc is related to the distance to the origin. This is how you check if an arc has one or two points.

If you know the pixel diameter and the distance to the origin, this noise can be cleared and accurate points at the center area can be known. This pixeldiameterdistanceorigin constant will be applied to every data point. Not boundary points.

Now you have four quadrant arrays of the image. By noting the boundaries in the columns, you can determine the innies from the outies.

Does that work?
Frankly, it sounds like pure babble. But perhaps if you worked through a toy example it might become clear what you are trying to do -- and also let us consider if there are any obvious counter-examples that would give it grief.
 

WBahn

Joined Mar 31, 2012
30,060
We just can't picture what you are trying to do. A few sketches and a toy example would go a long way toward that goal.
 

BR-549

Joined Sep 22, 2013
4,928
Ok. Sorry, I don't know the right terminology for this topic. Take the TS's first image. Assuming the data points are the same size and that the boundary is continuous, AND assuming you can detect the difference between boundary points and data points,......pick a point of origin within the area of interest. Lay a x,y axis on it.

At a sample rate(sample length on axis) that depends on point size(or density), take the samples under the axis.

This will give you four lists of values, the x list, y list, -x list and -y list. These are the values that are underneath or intersecting the axis. This value can only be one of three things.

Nothing, Point or Boundary. This list will be the first column of values in the four arrays.

Now, find the farthest distance data point from the origin. Inside or outside....the farthest one.

Adjust rotation step size so that it takes two clicks to see the outlier(farthest data point) but not three.

Now go back to the first sample.

Now rotate the Image one click, not the Axis. Re-sample. This will give us four arrays of columns(a column is a click) to 90 degrees.

If you assign 0 to Nothing and 1 to Point and 2 to Boundary, and print out the array, no matter what the image is, you will see a pattern of distortion. It will appear that there is always more data points in the center(top of printout). Because this distortion was caused by rotation, it should be easy to quantify and remove.

I would think that this could be done on the pixel scale. I would think that a quarter turn is faster than a full scan of an image.

Is this still babble to everyone?
 

cmartinez

Joined Jan 17, 2007
8,253
....I would think that this could be done on the pixel scale.
Maybe the thing here is that we're talking about continuous analytical data... that is, mathematical shapes. While it seems to me that you're talking about raster images, or discrete data.
 

WBahn

Joined Mar 31, 2012
30,060
Ok. Sorry, I don't know the right terminology for this topic. Take the TS's first image. Assuming the data points are the same size and that the boundary is continuous, AND assuming you can detect the difference between boundary points and data points,......pick a point of origin within the area of interest. Lay a x,y axis on it.
You've got a problem right off the bat. The entire goal is to be able to determine whether a point is inside or outside the area of interest, so given that, how do you pick a point of origin within the area of interest?

At a sample rate(sample length on axis) that depends on point size(or density), take the samples under the axis.
The points are mathematical points that have no physical extent. So how do you set this sample rate? Once set, what does it mean to "take samples under the axis"?

Again, a picture of a toy problem that you walk through would be extremely helpful to figure out what you are talking about.

This will give you four lists of values, the x list, y list, -x list and -y list. These are the values that are underneath or intersecting the axis. This value can only be one of three things.

Nothing, Point or Boundary. This list will be the first column of values in the four arrays.

Now, find the farthest distance data point from the origin. Inside or outside....the farthest one.

Adjust rotation step size so that it takes two clicks to see the outlier(farthest data point) but not three.

Now go back to the first sample.

Now rotate the Image one click, not the Axis. Re-sample. This will give us four arrays of columns(a column is a click) to 90 degrees.

If you assign 0 to Nothing and 1 to Point and 2 to Boundary, and print out the array, no matter what the image is, you will see a pattern of distortion. It will appear that there is always more data points in the center(top of printout). Because this distortion was caused by rotation, it should be easy to quantify and remove.

I would think that this could be done on the pixel scale. I would think that a quarter turn is faster than a full scan of an image.

Is this still babble to everyone?
Pretty much.

Even if I follow everything you are talking about, it sounds extremely computationally intensive. You are rotating the image repeatedly, which is not cheap.
 
Top