Yes, I know I can google this, and trust me, I have. But after a series of frustrating recent experiences following bad advice and bad tutorials (not related to neural networks) found in google results, I am looking for a more trustworthy recommendation relevant to the actual task at hand.
I have discussed this project in another recent thread but that was a different question, so I will re-hash it here with context applicable to this question.
I am writing a Python script to scan images, identify shipping containers in the image, and OCR the text on them.
Here are some details about the application:
- The images will be of doors of shipping containers. The containers could be any color, and the photos could be taken at any time of day or night and at any angle (within reason).
- I am not interested in sides, tops, backs, etc. only doors.
- There will be other text and may be other shipping containers in the image, and I only want to OCR the text if it is on the door of a shipping container, and specifically the container that is in the center of the frame. It may be next to other containers of different color or the same color.
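As a rough illustration of the "center of the frame" requirement in the list above: assuming some detector (none has been chosen yet) returns door bounding boxes as `(x_min, y_min, x_max, y_max)` tuples, the door nearest the image center could be picked like this. The function name `pick_center_box` and the box format are hypothetical, not part of any library.

```python
from math import hypot


def pick_center_box(boxes, image_width, image_height):
    """Return the box whose center is closest to the image center, or None.

    `boxes` is a list of (x_min, y_min, x_max, y_max) tuples in pixels.
    This format is an assumption; adapt it to whatever the detector emits.
    """
    if not boxes:
        return None

    image_cx = image_width / 2
    image_cy = image_height / 2

    def distance_to_center(box):
        x_min, y_min, x_max, y_max = box
        box_cx = (x_min + x_max) / 2
        box_cy = (y_min + y_max) / 2
        return hypot(box_cx - image_cx, box_cy - image_cy)

    return min(boxes, key=distance_to_center)


# Hypothetical detections in a 1000x600 photo: three doors side by side.
boxes = [(10, 100, 300, 500), (350, 80, 650, 520), (700, 120, 990, 510)]
center_door = pick_center_box(boxes, 1000, 600)
print(center_door)
```

Only the selected box would then be passed on to de-skewing and OCR, so text on neighboring containers is ignored regardless of their color.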
With the outline of the white shape (and more importantly the 4 corner point coordinate sets) I can de-skew and be ready for OCR:
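For context, the step from 4 corner points to a de-skewed image usually starts by putting the corners in a consistent order and choosing an output size. The sketch below shows only that bookkeeping in plain Python. The helper names are hypothetical, and the sum/difference ordering trick assumes the door is only moderately rotated (well under 45 degrees).

```python
from math import hypot


def order_corners(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    In image coordinates y grows downward, so the top-left corner has the
    smallest x + y and the bottom-right the largest; the top-right corner has
    the largest x - y and the bottom-left the smallest.
    """
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    top_right = max(points, key=lambda p: p[0] - p[1])
    bottom_left = min(points, key=lambda p: p[0] - p[1])
    return [top_left, top_right, bottom_right, bottom_left]


def target_size(ordered):
    """Pick an output (width, height) from the longer of each pair of edges."""
    tl, tr, br, bl = ordered
    width = max(hypot(tr[0] - tl[0], tr[1] - tl[1]),
                hypot(br[0] - bl[0], br[1] - bl[1]))
    height = max(hypot(bl[0] - tl[0], bl[1] - tl[1]),
                 hypot(br[0] - tr[0], br[1] - tr[1]))
    return round(width), round(height)


# Hypothetical corner points, in arbitrary order as a contour finder might give.
corners = [(420, 90), (100, 120), (110, 480), (400, 520)]
ordered = order_corners(corners)
width, height = target_size(ordered)

# Destination rectangle matching the same corner order.
destination = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
print(ordered, (width, height))
```

With OpenCV, `ordered` and `destination` (converted to float32 arrays) could then be given to `cv2.getPerspectiveTransform`, and the resulting matrix to `cv2.warpPerspective`, to produce the flattened door image.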
The problem with this approach is that the color filter requires tweaking to get a result like shown above. More often it looks something like this (or worse, usually worse):
I have tried many ways to automate the tweaking process but it just gets more kludgey the further I go down this path.
I sought help on the OpenCV forum and was told (paraphrased) that anything other than a neural network is going to be a scabby piece of crap. And considering how much effort I've put into this to still only have a scabby piece of crap, I'm inclined to believe that.
So despite the mental anguish I've already invested I think I need to start over and build this around a neural network, but I have no idea where to start.
I've watched a few tensorflow tutorial videos and it seems "simple" in terms of the number of lines of code used in the examples but none of the lines made any sense to me and I'm almost quasi-fluent in Python. More alarmingly, the explanation of the most basic concepts behind the code passed my ears like the nonsensical ramblings of a spambot. I wonder if I'm simply not intelligent enough to do this.
Is tensorflow the way to go? How about pytorch? Keras?
Most of what I find when I google this topic is for tensorflow, so I think there will be more examples to work with if I go that route, but I already know I'll be in over my head the moment I poke my big toe into it.
I've received the recommendation to use easyOCR for the OCR part once I get back around to that, and easyOCR uses pytorch, so that is a minor incentive to go that direction.
But what I'm really asking here is:
- Is there any neural network solution uniquely suited to the application I've described?
- Is there any neural network solution uniquely suited to implementation by a noob of (questionably) middling intelligence?
- Is there one that satisfies both?