UOCR: A Ligature Based Approach for an Urdu OCR System

Tofik Ali; Tauseef Ahmad; Mohd Imran

Conference proceeding

PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, pp.388-394

01/01/2016

Computer Science

Computer Science, Theory & Methods

Engineering

Engineering, Electrical & Electronic

Science & Technology

Technology

Abstract

An Optical Character Recognition (OCR) system translates text images into digitally editable text files. OCR systems exist for many languages, but developing one for Urdu is challenging because the script is cursive and context-sensitive. This is the main reason so little work has been done on Urdu OCR systems, and previous Urdu OCR systems are not very efficient at converting text images into editable text files. In this paper, we propose UOCR, a system for converting image files of printed Nastalique Urdu script into digitally editable text files. It takes a printed Nastalique Urdu image file as input and, after performing binarization, segmentation, feature extraction and classification on the document, produces digitally editable Urdu text files as output. We use the SIFT and SURF methods to compute descriptors in our feature extraction approach. Our proposed UOCR system is very efficient in conversion.
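The first two stages named in the abstract, binarization and segmentation, can be illustrated with a minimal sketch. The abstract does not say which algorithms the authors used, so everything here is an assumption: Otsu thresholding stands in for binarization, and 8-connected component labelling stands in for ligature segmentation (only a rough proxy, since dots and diacritics form separate components). The function names and the tiny `page` image are hypothetical, and SIFT/SURF descriptor computation and classification are left out.

```python
from collections import deque


def otsu_threshold(gray):
    """Return the grey level t that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue  # no pixels at or below t yet
        w1 = total - w0
        if w1 == 0:
            break  # every pixel is already in the dark class
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray):
    """Mark dark pixels as ink (1), assuming dark text on a light background."""
    t = otsu_threshold(gray)
    return [[1 if v <= t else 0 for v in row] for row in gray]


def connected_components(binary):
    """Return bounding boxes (left, top, right, bottom) of 8-connected ink
    blobs, ordered right to left to follow Urdu reading direction."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    boxes.sort(key=lambda b: -b[2])  # rightmost component first
    return boxes


# Hypothetical 4x9 greyscale "page" with two dark blobs on a white background.
page = [
    [255, 255, 255, 255, 255, 255, 255, 255, 255],
    [255,  20,  30, 255, 255, 255,  10,  15, 255],
    [255,  25, 255, 255, 255, 255, 255,  12, 255],
    [255, 255, 255, 255, 255, 255, 255, 255, 255],
]
boxes = connected_components(binarize(page))
print(boxes)
```

In a fuller pipeline, each bounding box would be cropped from the page and passed to a descriptor extractor and a classifier; those stages depend on details the abstract does not give.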

Details

Title: UOCR: A Ligature Based Approach for an Urdu OCR System
Creators - without role: Tofik Ali - Aligarh Muslim University; Tauseef Ahmad - Aligarh Muslim University; Mohd Imran - Aligarh Muslim University
Contributors - without role: M N Hoda
Publication Details: PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, pp.388-394
Publisher: IEEE
Number of pages: 7
Identifiers: 9919298108331
Academic Unit: Northern Borders University
Language: English
Resource Type: Conference proceeding