Inference - MindOCR Models

1. Overview of MindOCR Inference

graph LR;
    A[ckpt] -- export.py --> B[MindIR] -- converter_lite --> C[MindSpore Lite MindIR];
    C --input --> D[infer.py] -- outputs --> eval_rec.py/eval_det.py;
    H[images] --input --> D[infer.py];

2. MindOCR Model Inference Methods

2.1 Text Detection

This section uses the DBNet ResNet-50 en model from the appendix table as an example to walk through the inference workflow:

  • Download the mindir file in the appendix table;
  • On Ascend310/310P, use the converter_lite tool to convert the downloaded file into a MindIR file that MindSpore Lite can load:

Create config.txt and specify the model input shape:

[ascend_context]
input_format=NCHW
input_shape=x:[1,3,736,1280]

Run the following command:

converter_lite \
     --saveType=MINDIR \
     --NoFusion=false \
     --fmk=MINDIR \
     --device=Ascend \
     --modelFile=dbnet_resnet50-c3a4aa24-fbf95c82.mindir \
     --outputFile=dbnet_resnet50 \
     --configFile=config.txt

After the command completes, the dbnet_resnet50.mindir model file is generated.

Learn more in the Model Conversion Tutorial

Learn more about converter_lite
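The conversion step above can also be scripted, which is convenient when several models need converting. The following is a minimal sketch, not part of MindOCR: the helper names write_ascend_config, build_converter_cmd, and convert are hypothetical, the flags are exactly those shown in the command above, and it assumes converter_lite is on PATH when the command is actually run.

```python
import shlex
import subprocess
from pathlib import Path


def write_ascend_config(path, input_shape, input_name="x"):
    """Write the [ascend_context] section shown above and return its text."""
    dims = ",".join(str(d) for d in input_shape)
    text = (
        "[ascend_context]\n"
        "input_format=NCHW\n"
        f"input_shape={input_name}:[{dims}]\n"
    )
    Path(path).write_text(text)
    return text


def build_converter_cmd(model_file, output_file, config_file):
    """Assemble the converter_lite call using only the flags from this document."""
    return [
        "converter_lite",
        "--saveType=MINDIR",
        "--NoFusion=false",
        "--fmk=MINDIR",
        "--device=Ascend",
        f"--modelFile={model_file}",
        f"--outputFile={output_file}",
        f"--configFile={config_file}",
    ]


def convert(model_file, output_file, config_file, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    cmd = build_converter_cmd(model_file, output_file, config_file)
    print(shlex.join(cmd))
    if not dry_run:
        # Assumption: converter_lite is installed and on PATH on the Ascend host.
        subprocess.run(cmd, check=True)
    return cmd
```

For the detection example, calling write_ascend_config("config.txt", (1, 3, 736, 1280)) followed by convert("dbnet_resnet50-c3a4aa24-fbf95c82.mindir", "dbnet_resnet50", "config.txt") prints the same command as above without running it. Pass dry_run=False on a machine where the tool is available.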

  • Run inference with the /deploy/py_infer/infer.py script and the converted dbnet_resnet50.mindir file:
python infer.py \
     --input_images_dir=/path/to/ic15/ch4_test_images \
     --det_model_path=/path/to/mindir/dbnet_resnet50.mindir \
     --det_model_name_or_config=en_ms_det_dbnet_resnet50 \
     --res_save_dir=/path/to/dbnet_resnet50_results

After execution completes, the prediction file det_results.txt is generated in the directory specified by --res_save_dir.

To visualize the results, pass the --vis_det_save_dir parameter when running inference:

Visualization of text detection results

Learn more about infer.py inference parameters

  • Evaluate the results with the following command:
python deploy/eval_utils/eval_det.py \
     --gt_path=/path/to/ic15/test_det_gt.txt \
     --pred_path=/path/to/dbnet_resnet50_results/det_results.txt

The result is: {'recall': 0.8348579682233991, 'precision': 0.8657014478282576, 'f-score': 0.85}
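As a quick consistency check on the numbers above, the F-score can be recomputed from precision and recall. This sketch assumes the standard definition (the harmonic mean of precision and recall); the function name f_score is illustrative and is not taken from eval_det.py.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation result above.
result = {"recall": 0.8348579682233991, "precision": 0.8657014478282576}
print(round(f_score(result["precision"], result["recall"]), 2))  # prints 0.85
```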

2.2 Text Recognition

This section uses the CRNN ResNet34_vd en model from the appendix table as an example to walk through the inference workflow:

  • Download the mindir file in the appendix table;
  • On Ascend310/310P, use the converter_lite tool to convert the downloaded file into a MindIR file that MindSpore Lite can load:

Create config.txt and specify the model input shape:

[ascend_context]
input_format=NCHW
input_shape=x:[1,3,32,100]
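Before converting, it can help to sanity-check config.txt, since a malformed input_shape only surfaces later as a conversion error. A minimal sketch using Python's standard configparser follows; the helper name read_input_shape is hypothetical, and the sketch assumes the single-input name:[dims] form shown above.

```python
import configparser
import math


def read_input_shape(config_text):
    """Parse `input_shape=name:[d0,d1,...]` from the [ascend_context] section."""
    # Only "=" is treated as a delimiter so the ":" inside the value is kept.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(config_text)
    section = parser["ascend_context"]
    name, _, dims = section["input_shape"].partition(":")
    shape = tuple(int(d) for d in dims.strip("[] ").split(","))
    return section["input_format"], name, shape


CONFIG = """[ascend_context]
input_format=NCHW
input_shape=x:[1,3,32,100]
"""

layout, name, shape = read_input_shape(CONFIG)
batch, channels, height, width = shape  # NCHW order
print(layout, name, shape, math.prod(shape))
```

Here the input named x is a single 3-channel image of height 32 and width 100, so each inference call receives 9600 values.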

Run the following command:

converter_lite \
     --saveType=MINDIR \
     --NoFusion=false \
     --fmk=MINDIR \
     --device=Ascend \
     --modelFile=crnn_resnet34-83f37f07-eb10a0c9.mindir \
     --outputFile=crnn_resnet34vd \
     --configFile=config.txt

After the command completes, the crnn_resnet34vd.mindir model file is generated.

Learn more in the Model Conversion Tutorial

Learn more about converter_lite

  • Run inference with the /deploy/py_infer/infer.py script and the converted crnn_resnet34vd.mindir file:
python infer.py \
     --input_images_dir=/path/to/ic15/ch4_test_word_images \
     --rec_model_path=/path/to/mindir/crnn_resnet34vd.mindir \
     --rec_model_name_or_config=../../configs/rec/crnn/crnn_resnet34.yaml \
     --res_save_dir=/path/to/rec_infer_results

After execution completes, the prediction file rec_results.txt is generated in the directory specified by --res_save_dir.

Learn more about infer.py inference parameters

  • Evaluate the results with the following command:
python deploy/eval_utils/eval_rec.py \
     --gt_path=/path/to/ic15/rec_gt.txt \
     --pred_path=/path/to/rec_infer_results/rec_results.txt

The result is: {'acc': 0.6966779232025146, 'norm_edit_distance': 0.8627135157585144}
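To make the two reported metrics concrete, here is a minimal sketch of how such values are commonly computed. It assumes acc is the exact-match rate and norm_edit_distance is the mean of 1 - edit_distance / max(len(pred), len(label)), so higher is better. This is an assumption for illustration: eval_rec.py may normalize text differently (for example case folding or character filtering), and the function names below are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def rec_metrics(preds, labels):
    """Return exact-match accuracy and mean normalized edit similarity."""
    if not preds or len(preds) != len(labels):
        raise ValueError("preds and labels must be non-empty and equally long")
    correct = 0
    similarity = 0.0
    for pred, label in zip(preds, labels):
        correct += pred == label
        longest = max(len(pred), len(label))
        if longest == 0:
            similarity += 1.0  # two empty strings match perfectly
        else:
            similarity += 1 - edit_distance(pred, label) / longest
    n = len(preds)
    return {"acc": correct / n, "norm_edit_distance": similarity / n}


# Toy example: one exact match, one prediction with a single wrong character.
print(rec_metrics(["hello", "wor1d"], ["hello", "world"]))
```

Under this definition a prediction that is wrong by one character still contributes most of its score to norm_edit_distance, which is why that value is typically higher than acc.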

3. Appendix - MindOCR Model Support List

MindOCR inference supports models exported from trained ckpt files. This section lists the models that have been adapted.

3.1 Text Detection

| Model | Backbone | Language | Dataset | F-score (%) | FPS | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DBNet | MobileNetV3 | en | IC15 | 76.96 | 26.19 | yaml | mindir |
| DBNet | ResNet-18 | en | IC15 | 81.73 | 24.04 | yaml | mindir |
| DBNet | ResNet-50 | en | IC15 | 85.00 | 21.69 | yaml | mindir |
| DBNet | ResNet-50 | ch + en | 12 Datasets | 83.41 | 21.69 | yaml | mindir |
| DBNet++ | ResNet-50 | en | IC15 | 86.79 | 8.46 | yaml | mindir |
| DBNet++ | ResNet-50 | ch + en | 12 Datasets | 84.30 | 8.46 | yaml | mindir |
| EAST | ResNet-50 | en | IC15 | 86.86 | 6.72 | yaml | mindir |
| EAST | MobileNetV3 | en | IC15 | 75.32 | 26.77 | yaml | mindir |
| PSENet | ResNet-152 | en | IC15 | 82.50 | 2.31 | yaml | mindir |
| PSENet | ResNet-50 | en | IC15 | 81.37 | 1.00 | yaml | mindir |
| PSENet | MobileNetV3 | en | IC15 | 68.41 | 1.06 | yaml | mindir |
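The table trades accuracy against speed, so one way to read it is to fix a minimum F-score and pick the fastest model that meets it. The sketch below does this for the IC15 en rows; the values are copied from the table above, and the function name fastest_with_min_f_score is illustrative only.

```python
# (model, backbone, f_score_percent, fps) for the IC15 "en" rows in the table above.
DET_MODELS = [
    ("DBNet", "MobileNetV3", 76.96, 26.19),
    ("DBNet", "ResNet-18", 81.73, 24.04),
    ("DBNet", "ResNet-50", 85.00, 21.69),
    ("DBNet++", "ResNet-50", 86.79, 8.46),
    ("EAST", "ResNet-50", 86.86, 6.72),
    ("EAST", "MobileNetV3", 75.32, 26.77),
    ("PSENet", "ResNet-152", 82.50, 2.31),
    ("PSENet", "ResNet-50", 81.37, 1.00),
    ("PSENet", "MobileNetV3", 68.41, 1.06),
]


def fastest_with_min_f_score(models, min_f_score):
    """Return the highest-FPS row whose F-score meets the threshold, or None."""
    candidates = [m for m in models if m[2] >= min_f_score]
    return max(candidates, key=lambda m: m[3]) if candidates else None


print(fastest_with_min_f_score(DET_MODELS, 85.0))
# prints ('DBNet', 'ResNet-50', 85.0, 21.69)
```

With a threshold of 85%, the result is DBNet ResNet-50, the same model used in the walkthrough in section 2.1.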

3.2 Text Recognition

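| Model | Backbone | Dict File | Dataset | Acc (%) | FPS | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CRNN | VGG7 | Default | IC15 | 66.01 | 465.64 | yaml | mindir |
| CRNN | ResNet34_vd | Default | IC15 | 69.67 | 397.29 | yaml | mindir |
| CRNN | ResNet34_vd | ch_dict.txt | / | / | / | yaml | mindir |
| Rare | ResNet34_vd | Default | IC15 | 69.47 | 273.23 | yaml | mindir |
| Rare | ResNet34_vd | ch_dict.txt | / | / | / | yaml | mindir |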