Neural networks for image recognition: Where to start?

Thread Starter

strantor

Joined Oct 3, 2010
6,781
Yes, I know I can google this, and trust me, I have. But after a series of frustrating recent experiences following bad advice and bad tutorials (not related to neural networks) found in google results, I am looking for a more trustworthy recommendation relevant to the actual task at hand.

I have discussed this project in another recent thread but that was a different question, so I will re-hash it here with context applicable to this question.

I am writing a Python script to scan images, identify shipping containers in the image, and OCR the text on them.

Here are some details about the application:
  • The images will be of the doors of shipping containers, which could be any color, and the photos could be taken at any time of day or night and at any angle (within reason).
    • I am not interested in sides, tops, backs, etc. only doors.
  • There will be other text and may be other shipping containers in the image, and I only want to OCR the text if it is on the door of a shipping container, and specifically the container that is in the center of the frame. It may be next to other containers of different color or the same color.
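As a rough illustration of the "center of the frame" requirement: a minimal sketch, assuming some detector (none is specified in this thread) has already returned candidate door boxes as (x, y, width, height) tuples in pixels. The function name and the sample numbers are hypothetical.

```python
from math import hypot


def pick_center_box(boxes, frame_w, frame_h):
    """Return the (x, y, w, h) box whose center is closest to the frame center.

    Returns None when there are no candidate boxes.
    """
    if not boxes:
        return None
    cx, cy = frame_w / 2, frame_h / 2

    def distance_to_center(box):
        x, y, w, h = box
        return hypot(x + w / 2 - cx, y + h / 2 - cy)

    return min(boxes, key=distance_to_center)


# Hypothetical detections in a 1920x1080 frame: left, middle, right container.
candidates = [(50, 400, 300, 500), (800, 350, 420, 560), (1500, 380, 350, 520)]
chosen = pick_center_box(candidates, frame_w=1920, frame_h=1080)
print(chosen)
```

Distance between box center and frame center is only one possible rule; picking the largest box that overlaps the frame center would be another reasonable choice.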
Until now, I have been using color filters to identify the containers in the images and it looks something like this:

1.jpg


With the outline of the white shape (and more importantly the 4 corner point coordinate sets) I can de-skew and be ready for OCR:



3.jpg
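The corner-point step can be sketched roughly as below. This is not the actual script from this project, just a minimal pure-Python illustration of putting four corner points into a consistent order and estimating the size of the flattened output. The function names and the sample coordinates are made up, and the sum/difference ordering trick assumes the door is not rotated by much more than about 45 degrees.

```python
from math import hypot


def order_corners(pts):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    Uses the common heuristic: top-left has the smallest x + y,
    bottom-right the largest x + y, top-right the smallest y - x,
    bottom-left the largest y - x.
    """
    if len(pts) != 4:
        raise ValueError("expected exactly four corner points")
    tl = min(pts, key=lambda p: p[0] + p[1])
    br = max(pts, key=lambda p: p[0] + p[1])
    tr = min(pts, key=lambda p: p[1] - p[0])
    bl = max(pts, key=lambda p: p[1] - p[0])
    return [tl, tr, br, bl]


def target_size(ordered):
    """Estimate output (width, height) from the longer of each pair of opposite edges."""
    tl, tr, br, bl = ordered
    width = max(hypot(tr[0] - tl[0], tr[1] - tl[1]),
                hypot(br[0] - bl[0], br[1] - bl[1]))
    height = max(hypot(bl[0] - tl[0], bl[1] - tl[1]),
                 hypot(br[0] - tr[0], br[1] - tr[1]))
    return round(width), round(height)


# Hypothetical corner points, in arbitrary order, of a door seen at an angle.
corners = [(620, 140), (110, 180), (640, 560), (100, 600)]
ordered = order_corners(corners)
size = target_size(ordered)
print(ordered, size)
```

With OpenCV, the ordered corners and a destination rectangle of that size could then be passed to cv2.getPerspectiveTransform, and the resulting matrix to cv2.warpPerspective, to produce the de-skewed image.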



The problem with this approach is that the color filter requires tweaking to get a result like shown above. More often it looks something like this (or worse, usually worse):

20220909_155625___.jpg



I have tried many ways to automate the tweaking process but it just gets more kludgey the further I go down this path.
I sought help on the OpenCV forum and was told (paraphrased) that anything other than a neural network is going to be a scabby piece of crap. And considering how much effort I've put into this to still only have a scabby piece of crap, I'm inclined to believe that.

So despite the mental anguish I've already invested I think I need to start over and build this around a neural network, but I have no idea where to start.
I've watched a few tensorflow tutorial videos and it seems "simple" in terms of the number of lines of code used in the examples but none of the lines made any sense to me and I'm almost quasi-fluent in Python. More alarmingly, the explanation of the most basic concepts behind the code passed my ears like the nonsensical ramblings of a spambot. I wonder if I'm simply not intelligent enough to do this.

Is tensorflow the way to go? How about pytorch? Keras?
Most of what I find when I google this topic is for tensorflow, so I think there will be more examples to work with if I go that route, but I already know I'll be in over my head the moment I poke my big toe into it.
I've received the recommendation to use easyOCR for the OCR part once I get back around to that, and easyOCR uses pytorch, so that is a minor incentive to go that direction.
But what I'm really asking here is:
  • Is there any neural network solution uniquely suited to the application I've described?
  • Is there any neural network solution uniquely suited to implementation by a noob of (questionably) middling intelligence?
  • Is there one that satisfies both?
 

Tesla23

Joined May 10, 2009
542
This is waaay outside my areas of expertise, but in response to your other thread I did look at what was available to read container numbers using image recognition - there are commercial solutions available much in the same way as number plate recognition (but I guess you know that):
https://www.google.com.au/search?q=container+number+recognition+camera

Looking at github there are several code examples - but these could simply be student assignments (but something that sort of works could be a starting point):
https://www.google.com/search?q=container+number+recognition+github
 

Thread Starter

strantor

Joined Oct 3, 2010
6,781
Tesla23 said:
This is waaay outside my areas of expertise, but in response to your other thread I did look at what was available to read container numbers using image recognition - there are commercial solutions available much in the same way as number plate recognition (but I guess you know that):
https://www.google.com.au/search?q=container+number+recognition+camera
I did look into commercial options, but perhaps not hard enough. Most of what is available is a physical product, a camera, which is not what I need. I need a software solution that can be deployed with any camera (incl. drone camera) and integrated with an existing Python system. There are a few of these container-specific software OCR "engines" (SDK or API) available with per-camera licensing, but most have very little information available, which leads me to suspect they want to sell you a "solution" (read: software engineering service) rather than something you can tinker with and tailor (yourself) to your own needs.

Here is one exception to that, and they even have a web engine that you can use to test it out, without even giving them an email address (two thumbs up for that!) but I tried running a few of my wide-angle pictures through it and it doesn't get them correctly identified. If I pass them my python-deskewed versions of those same pictures it does just fine. So it's a half-solution, and if I'm going to request funding for something I expect it to be a full solution.

I will keep looking though; I'm nearing the breaking point where I will be able to throw up my hands in defeat (and pay for a software engineering service) without losing sleep over it.

Tesla23 said:
Looking at github there are several code examples - but these could simply be student assignments (but something that sort of works could be a starting point):
https://www.google.com/search?q=container+number+recognition+github
Thanks. I had a glance through a handful of your search results just now and saw tensorflow come up more than once. Another thing I saw come up more than once: Tesseract. That's a red flag according to what I was told on the OpenCV forum:

okay, OCR… chuck tesseract. that is ancient crusty trash. better stuff existed 10-20 years ago. it’s only popular because it was the first open source OCR and now the name stuck in everyone’s head and nobody even considers anything modern. try “easyocr” or any other library.
And that kind of outdated, simplistic thing is exactly the problem I've been stumbling over time and again with this project and why I'm hesitant to continue trying to emulate examples in search results. I can't tell you how many hours I've wasted following along with tutorials that it later became obvious to me would only have ever worked with the specific images and specific circumstances that the tutorial was constructed around. Looking back on it now, it's laughable. 90% of all tutorials on this subject should start with the disclaimer "This will not work on any kind of real-world image, whatsoever." Things like:
-Detect a white paper receipt on a black table
-Detect an airplane silhouetted against a perfect blue sky, using a sample airplane cropped from the very same image
-OCR black text typed on a white background in MS Paint
I see no examples of real-world applications like detecting a blue sign at an angle against a blue sky, or OCR-ing spray-stenciled text on a wooden pallet with surface imperfections. And of course, nothing remotely close to OCR-ing a shipping container at a 30-degree angle.

I will dig more thoroughly through some of the search results in the morning, now that I'm slightly more educated about what sucks, see if anything looks promising. Thank you for your help.
 

Thread Starter

strantor

Joined Oct 3, 2010
6,781
Hi!
Any updates on this topic? I have almost the same problem and would love to hear other perspectives.
It was overtaken by more pressing tasks. I will come back around to it in the next month or two. Before I had to move it to the back burner I had installed an OAK-D-POE camera which has 3 camera modules: two for stereo depth perception and a 3rd for high resolution imaging. It has AI inside it, with many applications already set up. I recommend you look into it. Many OAK camera modules to choose from, exciting stuff.
 

MrSalts

Joined Apr 2, 2020
2,767
Looking at edges/transitions is much easier and eliminates the need for color filtering. And the built-in error checking system helps confirm you're right.
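If the "built-in error checking system" here means the check digit that ISO 6346 defines for container numbers (an assumption; the post does not name it), the check can be sketched as below. The function names are made up for the example, the format check is deliberately loose (any four letters followed by seven digits), and the sample number is the commonly cited example CSQU3054383.

```python
import re


def _letter_values():
    """Build the ISO 6346 letter table: A=10 upward, skipping multiples of 11."""
    values, v = {}, 10
    for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if v % 11 == 0:  # 11, 22 and 33 are skipped
            v += 1
        values[ch] = v
        v += 1
    return values


LETTER_VALUES = _letter_values()


def check_digit(first_ten):
    """Compute the check digit from the first 10 characters (4 letters + 6 digits)."""
    total = 0
    for position, ch in enumerate(first_ten):
        value = int(ch) if ch.isdigit() else LETTER_VALUES[ch]
        total += value * (2 ** position)  # weights 1, 2, 4, ... 512
    return total % 11 % 10  # a remainder of 10 is written as 0


def is_valid_container_number(text):
    """Return True if OCR text looks like a container number with a matching check digit."""
    code = re.sub(r"[^A-Z0-9]", "", text.upper())  # drop spaces and punctuation
    if not re.fullmatch(r"[A-Z]{4}\d{7}", code):
        return False
    return check_digit(code[:10]) == int(code[10])


print(check_digit("CSQU305438"))                   # 3
print(is_valid_container_number("CSQU 305438 3"))  # True
print(is_valid_container_number("CSOU3054383"))    # False (Q misread as O)
```

A check like this will not find the container in the image, but it can reject most single-character OCR misreads once a candidate string has been read.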
 

miawaars

Joined Dec 25, 2022
2
strantor said:
It was overtaken by more pressing tasks. I will come back around to it in the next month or two. Before I had to move it to the back burner I had installed an OAK-D-POE camera which has 3 camera modules: two for stereo depth perception and a 3rd for high resolution imaging. It has AI inside it, with many applications already set up. I recommend you look into it. Many OAK camera modules to choose from, exciting stuff.
Thank you! :)
 

Thread Starter

strantor

Joined Oct 3, 2010
6,781
MrSalts said:
Looking at edges/transitions is much easier and eliminates the need for color filtering. And the built-in error checking system helps confirm you're right.
I had trouble with edge detection because there are so many lines on a shipping container. On the corrugated sides, every fold is an edge. For a given image you can set thresholds such that the primitive shape of the container is detected without detecting the corrugations, but those thresholds won't work for the next image. Whatever the final solution is, it needs to be adaptive. I think the stereo camera will provide much better information for edge detection because the edge detector will be looking at a depth map rather than a color palette, so it should do well at finding the container and ignoring the corrugations.
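The reasoning behind the depth-map idea can be shown with a toy example. This is only a sketch with made-up numbers, not output from the OAK camera: one row of depth readings in metres, where the corrugations ripple by a few centimetres and the container boundary is a jump of several metres. The function name and the 0.3 m threshold are assumptions chosen for illustration.

```python
def depth_edges(row, min_jump_m=0.3):
    """Return indexes i where the depth changes by more than min_jump_m between i and i + 1."""
    return [i for i in range(len(row) - 1)
            if abs(row[i + 1] - row[i]) > min_jump_m]


# Hypothetical depth row (metres): background, corrugated container face, background.
row = [9.0, 9.1, 4.02, 4.05, 4.02, 4.05, 4.02, 4.05, 8.7, 8.8]
edges = depth_edges(row)
print(edges)  # [1, 7]
```

Only the two container boundaries exceed the threshold, while the 3 cm corrugation ripple is ignored. Real depth maps are noisy and have holes, so some smoothing and invalid-pixel handling would be needed before a rule this simple could be trusted.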
 