A new visual signature for content-based indexing of low resolution documents

Md Nor, Danial and Abd. Wahab, M. Helmy and M. Jenu, M. Zarar and Ogier, Jean-Marc (2012) A new visual signature for content-based indexing of low resolution documents. Journal of Information Retrieval and Knowledge Management, 2. pp. 88-95.

Abstract

This paper proposes a new visual signature for content-based indexing of low-resolution documents. Camera-Based Document Analysis and Recognition (CBDAR) is an established field that deals with textual information in scene images taken by low-cost handheld devices such as digital cameras and cell phones. Many applications, such as text translation, text reading for visually impaired and blind people, information retrieval from media documents, and e-learning, can be built using techniques developed in the CBDAR domain. The proposed approach to extracting textual information is composed of three steps: image segmentation, text localization and extraction, and Optical Character Recognition (OCR). First, in pre-processing, the resolution of each image is checked and the image is re-sampled to a common resolution (720 × 540). The resulting image is then converted to grayscale and binarized using Otsu's segmentation method for further processing. In addition, the proper segmentation of foreground objects is checked by looking at the mean horizontal run length of both black and white pixels. In the post-processing step, the text localizer validates the candidate text regions proposed by the text detector. We employ a connected-component approach for text localization. The extracted text is then recognized using ABBYY FineReader for OCR. Apart from OCR, we create novel feature vectors from the textual information for Content-Based Image Retrieval (CBIR).
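
The binarization, run-length check, and connected-component steps named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation, which the abstract does not give. It assumes dark text on a light background, 4-connectivity, and a grayscale image supplied as a list of rows of integers in 0..255. All function names are hypothetical.

```python
from collections import deque

def otsu_threshold(gray):
    """Return the Otsu threshold t; pixels <= t form the darker class."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break             # no pixels left in the light class
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor)
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(gray, t):
    """Assumption: dark pixels are foreground (1), light pixels background (0)."""
    return [[1 if v <= t else 0 for v in row] for row in gray]

def mean_horizontal_run_length(binary, value):
    """Average length of horizontal runs of `value` across all rows."""
    total, runs = 0, 0
    for row in binary:
        prev = None
        for v in row:
            if v == value:
                total += 1
                if prev != value:
                    runs += 1   # a new run starts here
            prev = v
    return total / runs if runs else 0.0

def connected_components(binary):
    """Return bounding boxes (x_min, y_min, x_max, y_max) of 4-connected foreground regions."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes
```

In a pipeline of this kind, the bounding boxes would serve as candidate text regions, and the mean run lengths of black and white pixels would give a rough check that the foreground was segmented sensibly. How the paper thresholds or combines these values is not stated in the abstract.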

Item Type: Article
Uncontrolled Keywords: Image Segmentation; Text Extraction; OCR; CBIR
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Electrical and Electronic Engineering > Department of Electronic Engineering
Depositing User: Mr. Abdul Rahim Mat Radzuan
Date Deposited: 08 Jun 2022 02:05
Last Modified: 08 Jun 2022 02:05
URI: http://eprints.uthm.edu.my/id/eprint/7097
