About Hyeseon

Hyeseon's walkthrough

OCR code and model recognition rate improvement


About the Project

The client was using EasyOCR to read semiconductor wafer IDs, but because the program relied on a generic model, the recognition rate was mediocre. The client asked me to improve the ID recognition rate, and I worked on several aspects of the program toward that goal.

Improvement of the basic code

The existing code used the image without any specific preprocessing. This is far from optimal, because appropriate preprocessing can substantially raise the recognition rate. I therefore modified the code to first binarize the image with adaptive Gaussian thresholding, then remove residual noise with a closing operation (dilation followed by erosion), and finally apply an extra erosion to thicken the text strokes.

Synthetic training corpus

These improvements significantly increased the model's accuracy, but because the dataset was very domain-specific, they alone were not enough to reach the accuracy the client required. Real-world wafer-ID images were hard to obtain for both the client and me; however, the IDs used a very distinctive font, so I could craft a synthetic corpus instead. Based on the text recognition data generator used for the CRAFT model, I built a corpus generator that also accepts a bitmap font, and used it to generate roughly 6,000 training images.

Finally, the generated data was combined with the real data at a 3:1 ratio to re-train the model. In the end, I achieved over 97% accuracy on the given data.
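Mixing the two datasets at a fixed ratio can be done with a simple sampler like the one below. I am assuming the ratio is synthetic:real and that the scarcer real samples are oversampled to hit it; the project's actual mixing strategy may differ:

```python
import random

def mix_datasets(synthetic, real, ratio=3, seed=0):
    """Combine samples at roughly `ratio`:1 (synthetic:real), shuffled.

    Real samples are repeated cyclically if there are too few of them
    to satisfy the target ratio (oversampling is an assumption here).
    """
    rng = random.Random(seed)
    n_real = max(1, len(synthetic) // ratio)
    real_pool = [real[i % len(real)] for i in range(n_real)]
    combined = list(synthetic) + real_pool
    rng.shuffle(combined)
    return combined
```

The combined list can then be fed to EasyOCR's fine-tuning pipeline as a single training set.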

Figures

Fig. 1 - An example of training data for the project.

Contribution and Information

  • Period of Contribution: 6/2023
  • Total Participants: 1 (myself; the client was the only other party involved)
  • My Role: entire development
  • Used Tech Stacks: Python (PyTorch, EasyOCR, PIL, OpenCV2)

Due to the NDA, I cannot share the project in its entirety. However, the code for the synthetic data generator is available on my GitHub, although it differs somewhat from the version used in this project.