Improve accuracy of page detectors #5

@Witiko

The low accuracies in the experimental notebook from #3 are worrying:

| Page Detector | random | siamese | imagehash | vgg16 | annotated |
|---|---|---|---|---|---|
| Accuracy | 0.03% | 3.95% | 6.58% | 61.84% | 100.00% |

The vgg16 page detector currently uses features produced by the last hidden layer of a VGG16 model trained on ImageNet. Fine-tuning may not be an option, since our dataset is ill-suited for classification (too many document pages/classes, too few examples of each class). However, since our dataset differs significantly from ImageNet, we may have better luck using an earlier hidden layer of VGG16.
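
If the earlier layer is convolutional, its output is a spatial feature map rather than a flat vector, so it would need pooling before pages can be compared. The following is a minimal sketch of that step. It assumes that feature maps have already been extracted and are given as nested lists of shape height x width x channels, and that pages are matched by highest cosine similarity. Neither assumption is confirmed by this issue, and the function names are hypothetical.

```python
import math


def global_average_pool(feature_map):
    """Collapse an H x W x C feature map (nested lists) into a C-dimensional vector."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    for row in feature_map:
        for cell in row:
            for channel, value in enumerate(cell):
                sums[channel] += value
    return [total / (height * width) for total in sums]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def detect_page(screen_map, page_maps):
    """Return the key of the page whose pooled features best match the screen.

    Assumption: page_maps is a dict {page_id: feature_map}.
    """
    screen_vector = global_average_pool(screen_map)
    return max(
        page_maps,
        key=lambda page: cosine_similarity(
            screen_vector, global_average_pool(page_maps[page])
        ),
    )
```

Average pooling discards spatial layout, which may or may not be desirable for document pages. Flattening the map or pooling over a coarse grid are alternatives worth comparing in the notebook.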

The siamese page detector uses position-dependent samples to normalize the input screen images, which may explain why its performance degrades on new document pages and screen images (86% accuracy on the training set versus 3.95% on the test set).

If we manage to improve the siamese page detector, we may benefit from ensembling siamese with vgg16.
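
One simple way to ensemble two detectors is late fusion: rescale each detector's per-page scores to a common range and take a weighted average. The sketch below assumes that both detectors can return a score for every candidate page, that higher means more similar, and that both score dicts share the same keys. The function names and the `vgg16_weight` parameter are hypothetical and not part of the existing code.

```python
def min_max_normalize(scores):
    """Rescale a {page: score} mapping to [0, 1] so detectors are comparable."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All pages tie; this detector carries no information for this screen.
        return {page: 0.0 for page in scores}
    return {page: (score - low) / (high - low) for page, score in scores.items()}


def ensemble_detect(siamese_scores, vgg16_scores, vgg16_weight=0.5):
    """Return the page with the highest weighted average of normalized scores."""
    siamese = min_max_normalize(siamese_scores)
    vgg16 = min_max_normalize(vgg16_scores)
    combined = {
        page: (1.0 - vgg16_weight) * siamese[page] + vgg16_weight * vgg16[page]
        for page in siamese
    }
    return max(combined, key=combined.get)
```

Given the current accuracy gap, a weight that favors vgg16 would probably be a reasonable starting point. The weight could then be tuned on held-out screens.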
