Deep learning OCR post-correction
Evaluation and post-correction of OCR of digitised historical newspapers
A tool that cleans up OCR-generated text using both individual words and their context.
Ochre is experimental software for cleaning up text that contains OCR mistakes. It was developed to investigate whether character-based language models can be used to remove OCR mistakes. Ochre also provides functionality for analyzing the kinds of OCR mistakes in a corpus, which lets researchers compare different OCR post-correction methods and find out which kinds of mistakes each method corrects well.
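To make the idea of analyzing kinds of OCR mistakes concrete, here is a minimal sketch that aligns an OCR string with its gold-standard transcription and counts the character-level differences. It is not ochre's API: the function name `count_ocr_errors` and the sample strings are hypothetical, and the alignment uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for whatever alignment method ochre applies.

```python
from collections import Counter
from difflib import SequenceMatcher


def count_ocr_errors(ocr_text: str, gold_text: str) -> Counter:
    """Count the edits needed to turn OCR text into the gold-standard text.

    Hypothetical helper for illustration only; not part of ochre.
    """
    counts = Counter()
    # autojunk=False disables difflib's heuristic that ignores frequent
    # characters in long sequences, so every character is aligned.
    matcher = SequenceMatcher(None, ocr_text, gold_text, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # matching stretches are not mistakes
        # Key each mistake by (operation, OCR fragment, gold fragment).
        counts[(tag, ocr_text[i1:i2], gold_text[j1:j2])] += 1
    return counts


# Invented sample pair: two single-character confusions.
errors = count_ocr_errors("Tbe qnick brown fox", "The quick brown fox")
for (tag, ocr, gold), n in errors.most_common():
    print(f"{tag}: {ocr!r} -> {gold!r} ({n}x)")
```

Running the same count over the output of two post-correction methods, each compared against the same gold standard, would give a rough view of which confusions each method leaves behind.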