Graph Attention Networks for Efficient Text Line Detection on Receipt-Layout Documents

Workshop Conference

Date

August 14, 2022

Source / Conference / Filed

KDD, Document Intelligence Workshop

Authors

David Montero Martin
Mukul Kumar
David Jiménez
Javier Yebes

Abstract

Detecting text lines from OCR output is an essential step in many information-extraction pipelines, particularly for unstructured documents such as purchase receipts, where knowing which detections share a line is crucial for matching key-value pairs. Existing models, however, are limited to structured documents and do not generalize well to unstructured ones. To address this issue, we have developed a GNN-based line detection model optimized for receipt-layout documents. Experiments show that the proposed method outperforms other approaches in accuracy, processing time, and resource consumption.