I am looking for a standard format to represent text strings and their associated bounding box coordinates as produced by an OCR system. Would TEI accommodate this?
Youssef Eldakar
Bibliotheca Alexandrina
There are certainly ways to do this. You may be interested in the ongoing work on how to mark up image facsimiles. A page, still subject to change, is available on the TEI wiki:
http://www.tei-c.org.uk/wiki/index.php/LegacyFacsimileMarkup
Otherwise, there have been some discussions of this on the TEI-L mailing list; see http://www.tei-c.org/Contact/ for information on how to join.
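As a rough illustration of the facsimile approach, here is a minimal Python sketch that turns OCR words and their bounding boxes into a TEI fragment. It assumes the element and attribute names found in the current TEI P5 Guidelines (facsimile, surface, zone with ulx/uly/lrx/lry, and the facs pointer), which may differ from what the wiki page describes. The function name ocr_to_tei, the input tuple shape, and the file name page1.png are hypothetical, and the output omits the teiHeader, so it is not a complete valid TEI document.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize TEI elements without a prefix (default namespace).
ET.register_namespace("", TEI_NS)


def tei(name):
    """Return the namespace-qualified tag for a TEI element name."""
    return f"{{{TEI_NS}}}{name}"


def ocr_to_tei(page_image, words):
    """Build a TEI fragment from OCR output.

    words is a list of (text, ulx, uly, lrx, lry) tuples; this input
    shape is an assumption made for the example, not an OCR standard.
    """
    root = ET.Element(tei("TEI"))

    # <facsimile>/<surface> describe the page image and its regions.
    facsimile = ET.SubElement(root, tei("facsimile"))
    surface = ET.SubElement(facsimile, tei("surface"))
    ET.SubElement(surface, tei("graphic"), {"url": page_image})

    # <text>/<body>/<p> hold the transcribed words.
    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    para = ET.SubElement(body, tei("p"))

    for i, (token, ulx, uly, lrx, lry) in enumerate(words, start=1):
        zone_id = f"zone{i}"
        # One <zone> per word, carrying the bounding box coordinates.
        ET.SubElement(surface, tei("zone"), {
            f"{{{XML_NS}}}id": zone_id,
            "ulx": str(ulx), "uly": str(uly),
            "lrx": str(lrx), "lry": str(lry),
        })
        # Each <w> points back to its zone through the facs attribute.
        w = ET.SubElement(para, tei("w"), {"facs": f"#{zone_id}"})
        w.text = token
        w.tail = " "

    return root


if __name__ == "__main__":
    sample = [("Hello", 10, 20, 110, 50), ("world", 120, 20, 230, 50)]
    print(ET.tostring(ocr_to_tei("page1.png", sample), encoding="unicode"))
```

Keeping the coordinates in zone elements and linking to them from the text keeps the transcription readable on its own, while still letting each word be traced back to its region on the page image.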
Hope that helps,
-James