Publication details

Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition

KOHÚT, J. HRADIŠ, M.

Original title

Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition

Type

paper in proceedings indexed in WoS or Scopus

Language

English

Original abstract

In many machine learning tasks, a large general dataset and a small specialized dataset are available. In such situations, various domain adaptation methods can be used to adapt a general model to the target dataset. We show that in the case of neural networks trained for handwriting recognition using CTC, simple finetuning with data augmentation works surprisingly well in such scenarios and that it is resistant to overfitting even for very small target domain datasets. We evaluated the behavior of finetuning with respect to augmentation, training data size, and quality of the pre-trained network, both in writer-dependent and writer-independent settings. On a large real-world dataset, finetuning provided an average relative CER improvement of 25 % with 16 text lines for new writers and 50 % for 256 text lines.
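The abstract reports its results as relative improvements in character error rate (CER). As a rough illustration of that arithmetic, the sketch below assumes the common definition of CER (total character-level edit distance divided by the total number of reference characters); the record does not describe the paper's exact evaluation protocol, and the function names `edit_distance`, `cer`, and `relative_improvement` are hypothetical helpers introduced here.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # delete a reference char
                            curr[j - 1] + 1,       # insert a hypothesis char
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Assumed CER definition: summed edit distance / summed reference length."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_errors = sum(edit_distance(r, h)
                       for r, h in zip(references, hypotheses))
    return total_errors / total_chars


def relative_improvement(cer_before: float, cer_after: float) -> float:
    """Fraction of the original error removed, e.g. 0.25 for 25 %."""
    return (cer_before - cer_after) / cer_before
```

Under this definition, a drop in CER from 8 % to 6 % would be a 25 % relative improvement. These numbers are purely illustrative and are not results from the paper.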

Keywords

Handwritten text recognition, OCR, Data augmentation, Finetuning.

Authors

KOHÚT, J.; HRADIŠ, M.

Published

19 August 2023

Publisher

Springer Nature Switzerland AG

Place

San José

ISBN

978-3-031-41684-2

Book

Document Analysis and Recognition - ICDAR 2023

Series

Lecture Notes in Computer Science

ISSN

0302-9743

Periodical

Lecture Notes in Computer Science

Volume

14190

Issue

1

Country

Federal Republic of Germany

Pages from

269

Pages to

286

Number of pages

18

URL

BibTeX

@inproceedings{BUT185151,
  author="Jan {Kohút} and Michal {Hradiš}",
  title="Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition",
  booktitle="Document Analysis and Recognition - ICDAR 2023",
  year="2023",
  series="Lecture Notes in Computer Science",
  journal="Lecture Notes in Computer Science",
  volume="14190",
  number="1",
  pages="269--286",
  publisher="Springer Nature Switzerland AG",
  address="San José",
  doi="10.1007/978-3-031-41685-9_17",
  isbn="978-3-031-41684-2",
  issn="0302-9743",
  url="https://pero.fit.vutbr.cz/publications"
}