This repo contains code written in MXNet for OCR tasks. It uses a CNN-LSTM-CTC architecture for text recognition.
In addition, a bucketing module is used in the code to handle input images of variable length.
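To illustrate the idea behind bucketing, here is a minimal sketch in plain Python. It is not the code used in this repo: the bucket widths and the helper names `assign_bucket` and `group_by_bucket` are hypothetical, chosen only to show how images of different widths can be grouped so that each batch shares one sequence length.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical bucket widths in pixels (an assumption for this sketch);
# the values actually used are defined in the training script.
BUCKET_WIDTHS = [64, 96, 128, 160, 192]


def assign_bucket(width, buckets=BUCKET_WIDTHS):
    """Return the smallest bucket width that fits `width`, or None if none fits."""
    i = bisect_left(buckets, width)
    return buckets[i] if i < len(buckets) else None


def group_by_bucket(image_widths, buckets=BUCKET_WIDTHS):
    """Map each bucket width to the indices of the images assigned to it."""
    groups = defaultdict(list)
    for idx, width in enumerate(image_widths):
        key = assign_bucket(width, buckets)
        if key is not None:  # images wider than the largest bucket are skipped
            groups[key].append(idx)
    return dict(groups)
```

Each image would then be padded to the width of its bucket, so that one unrolled network per bucket can process a whole batch.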
network
The network in this project is based on "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition", B. Shi, X. Bai, C. Yao, TPAMI.
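As a rough illustration of how the per-frame output of a CTC-trained network is turned into a label sequence, the sketch below implements greedy (best-path) CTC decoding in plain Python. The function name `ctc_greedy_decode` and the choice of index 0 for the blank label are assumptions made for this example and may not match the convention used in this repo.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse consecutive repeated labels, then drop blanks.

    `frame_labels` holds the most likely label index for each time step.
    `blank` is the index of the CTC blank label (assumed to be 0 here).
    """
    decoded = []
    prev = None
    for label in frame_labels:
        # Keep a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded
```

A blank between two identical labels keeps both of them, which is how CTC can represent doubled characters.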
Download the ICDAR2013 cropped word dataset from http://rrc.cvc.uab.es/?ch=2&com=downloads and put it in the right folder, which should be consistent with the path in text_deep_ocr_bucketing.py.
make_list.py generates the .lst file, which records image paths and their corresponding labels, and idx2char.json and char2idx.json, which record the correspondence between indices and characters. Run this in your terminal:
mkdir model
python text_deep_ocr_bucketing.py
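For readers unfamiliar with the index and character mappings that make_list.py records, the following sketch shows one way such mappings could be built. It is an assumption-laden illustration, not the contents of make_list.py: the helper names `build_char_maps` and `encode_label` are hypothetical, and starting indices at 1 (leaving 0 free, for example for a CTC blank) is only a guess at a reasonable convention.

```python
import json


def build_char_maps(labels, first_index=1):
    """Build char -> index and index -> char dictionaries from label strings.

    `first_index=1` is an assumption that leaves index 0 unused.
    """
    chars = sorted(set("".join(labels)))
    char2idx = {c: i for i, c in enumerate(chars, start=first_index)}
    idx2char = {i: c for c, i in char2idx.items()}
    return char2idx, idx2char


def encode_label(text, char2idx):
    """Convert a label string into a list of integer indices."""
    return [char2idx[c] for c in text]


char2idx, idx2char = build_char_maps(["ab", "ba", "c"])
# JSON object keys are always strings, so integer keys in idx2char
# come back as strings after a round trip through json.
idx2char_json = json.dumps(idx2char)
```

When such a mapping is loaded back from JSON, the index keys need to be converted to integers again before use.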
results
If you use the default settings, you should reach about 40% accuracy on the validation set. To improve performance in future work:
use the Synthetic 90k dataset to pretrain the model, then fine-tune on ICDAR2013