Abstract. Despite decades of research, offline handwriting recognition
(HWR) of degraded historical documents remains a challenging problem,
which, if solved, could greatly improve the searchability of online cultural
heritage archives. HWR models are often limited by the accuracy of
the preceding steps of text detection and segmentation. Motivated by
this, we present a deep learning model that jointly learns text detection,
segmentation, and recognition using mostly images without detection
or segmentation annotations. Our Start, Follow, Read (SFR) model is
composed of a Region Proposal Network that finds the start position of
text lines, a novel line follower network that incrementally follows and
preprocesses lines of (possibly curved) text into dewarped images, and a
CNN-LSTM network that recognizes the text in those images. SFR exceeds
the performance
of the winner of the ICDAR2017 handwriting recognition competition,
even when not using the provided competition region annotations.
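
The three-stage structure described above can be illustrated with a minimal
sketch. It is not the SFR implementation: the abstract does not specify any
interfaces, so every name here (`follow_line`, `start_follow_read`,
`find_starts`, `predict_turn`, `read_line`) is hypothetical, and the learned
networks are replaced by plain callables. The sketch only shows how a start
pose could be extended step by step into a path that a reader then consumes.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical pose representation: (x, y, angle in radians).
Pose = Tuple[float, float, float]


def follow_line(start: Pose,
                predict_turn: Callable[[Pose], float],
                steps: int,
                step_size: float) -> List[Pose]:
    """Incrementally extend a path from a start pose.

    predict_turn stands in for a learned line follower: given the
    current pose, it returns a change in angle for the next step.
    """
    x, y, angle = start
    path = [start]
    for _ in range(steps):
        angle += predict_turn((x, y, angle))
        x += step_size * math.cos(angle)
        y += step_size * math.sin(angle)
        path.append((x, y, angle))
    return path


def start_follow_read(page,
                      find_starts: Callable,
                      predict_turn: Callable[[Pose], float],
                      read_line: Callable,
                      steps: int = 3,
                      step_size: float = 2.0) -> List[str]:
    """Chain the three stages: start, follow, read (all stand-ins)."""
    texts = []
    for pose in find_starts(page):                    # stage 1: start poses
        path = follow_line(pose, predict_turn, steps, step_size)  # stage 2
        texts.append(read_line(page, path))           # stage 3: recognition
    return texts


# Toy usage: a "page" with one line whose follower never turns.
page = {"starts": [(0.0, 0.0, 0.0)], "text": "abc"}
result = start_follow_read(
    page,
    find_starts=lambda p: p["starts"],
    predict_turn=lambda pose: 0.0,
    read_line=lambda p, path: p["text"],
)
```

In a real system, the path would drive the dewarping of the line image before
recognition; here `read_line` simply receives the path as a placeholder.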