Multi-Oriented and multi-scaled text character analysis and recognition in graphical documents and their applications to document image retrieval

dc.contributor
Universitat Autònoma de Barcelona. Departament de Ciències de la Computació
dc.contributor.author
Pratim Roy, Partha
dc.date
info:eu-repo/date/embargoEnd/2011-07-12
dc.date.accessioned
2011-07-11T13:00:55Z
dc.date.available
2011-07-12T05:45:04Z
dc.date.issued
2010-11-03
dc.identifier.isbn
9788469403815
dc.identifier.uri
http://hdl.handle.net/10803/32107
dc.description.abstract
With the advance of research in Document Image Analysis and Recognition (DIAR), an important line of work has emerged on the indexing and retrieval of graphics-rich documents. It aims at finding relevant documents by segmenting and recognizing the text and graphics components embedded in non-standard layouts, where commercial OCR systems cannot be applied because of the layout complexity. This thesis focuses on approaches to text information extraction in graphical documents and on the retrieval of such documents using that text information. Automatic text recognition in graphical documents (maps, engineering drawings, etc.) involves many challenges because text characters are usually printed in multiple orientations and at multiple scales alongside different graphical objects. Text characters are used to annotate curved graphical lines and hence often follow curvilinear paths too. For OCR of such documents, individual text lines and their corresponding words and characters need to be extracted. For recognition of multi-font, multi-scale and multi-oriented characters, we propose a character shape feature descriptor that uses angular information from contour pixels to handle the invariance requirements. To improve the efficiency of OCR, an approach to segmenting multi-oriented touching strings into individual characters is also discussed. Convex-hull-based background information is used to segment a touching string into candidate primitive segments, which are then merged by dynamic programming to obtain the optimal segmentation. To overcome the problem of text touching or overlapping graphical lines, a character spotting approach using SIFT and skeleton information is included. We then propose a novel method to extract individual curvilinear text lines using the foreground and background information of the text characters; a water reservoir concept is used to exploit the background information.
We have also formulated methodologies for graphical document retrieval applications using query words and seals. The retrieval approaches rely on the recognition results of individual components in the document. Given a query text, the system extracts positional knowledge from the query word and uses it to generate hypothetical locations in the document. Documents are also indexed based on the automatic detection of seals in documents with cluttered backgrounds. A seal is characterized by scale- and rotation-invariant spatial feature descriptors computed from labelled text characters, and a concept based on the Generalized Hough Transform is used to locate the seal in documents. Keywords: Document Image Analysis, Graphics Recognition, Dynamic Programming, Generalized Hough Transform, Character Recognition, Touching Character Segmentation, Text/Graphics Separation, Curve-Line Separation, Word Retrieval, Seal Detection and Recognition.
eng
dc.format.extent
280 p.
dc.format.mimetype
application/pdf
dc.language.iso
eng
dc.publisher
Universitat Autònoma de Barcelona
dc.rights.license
info:eu-repo/semantics/embargoAccess
dc.rights.license
WARNING. Access to the contents of this doctoral thesis and its use must respect the rights of the author. It may be used for personal consultation or study, as well as in research and teaching activities or materials, under the terms established in Art. 32 of the Revised Text of the Intellectual Property Law (RDL 1/1996). Other uses require the prior and express authorization of the author. In any case, when using its contents, the author's full name and the title of the doctoral thesis must be clearly indicated. Its reproduction or other forms of exploitation for profit are not authorized, nor is its public communication from a site other than the TDX service. Presenting its content in a window or frame external to TDX (framing) is also not authorized. This reservation of rights applies both to the contents of the thesis and to its abstracts and indexes.
dc.source
TDX (Tesis Doctorals en Xarxa)
dc.subject
Document image processing
dc.subject
Graphics recognition
dc.subject
Text/graphics separation
dc.subject.other
Technologies
dc.title
Multi-Oriented and multi-scaled text character analysis and recognition in graphical documents and their applications to document image retrieval
dc.type
info:eu-repo/semantics/doctoralThesis
dc.type
info:eu-repo/semantics/publishedVersion
dc.subject.udc
60
cat
dc.contributor.director
Lladós Canet, Josep
dc.contributor.codirector
Pal, Umapada
dc.identifier.dl
B-29347-2011


Documents

pr1de1.pdf

2.800Mb PDF
