CHAPTER 2
LITERATURE SURVEY
2.1 INTRODUCTION
Recognition strategies heavily depend on the nature of the data to be recognized. In the cursive case, the problem is made complex by the fact that the writing is fundamentally ambiguous: the letters in the word are generally linked together, poorly written, and may even be missing. In contrast, hand-printed word recognition is closer to printed word recognition, since the individual letters composing the word are usually much easier to isolate and to identify. As a consequence, methods working on a letter basis (i.e., based on character segmentation and recognition) are well suited to hand-printed word recognition, while cursive scripts require more specific and/or sophisticated techniques. The inherent ambiguity must then be compensated by the use of contextual information.
Recognition techniques can be classified according to two criteria: the way preprocessing is performed on the data and the type of decision algorithm. Depending on the type of preprocessing stage, various kinds of decision methods have been used, such as statistical methods, neural networks, structural matching (on trees, chains, etc.), and stochastic processing (Markov chains, etc.). Many recent methods mix several techniques to provide better reliability and to compensate for the great variability of characters and numerals. Analytical strategies deal with several levels of representation corresponding to increasing levels of abstraction (usually the feature level, the grapheme or pseudo-letter level, and the word level). Words are not considered as a whole but as sequences of smaller units, which must be easily related to characters to make recognition independent of a specific vocabulary. The main features of the image often used are:
- Colour
- Texture
- Shape
- Edge
- Text
- Temporal details, etc.
A computer technology sub-field with the potential to be useful in a plurality of settings is the automated recognition of textual information. This field has been referred to generally as Optical Character Recognition (OCR). In general, an OCR machine reads machine-printed or handwritten characters and tries to determine which character, from a fixed set of machine-printed or handwritten characters, each one is intended to represent. The task of recognizing characters can be broadly separated into two categories: the recognition of machine-printed data and the recognition of handwritten data. Machine-printed characters are uniform in size, position, and pitch for any given font. In contrast, handwritten characters are non-uniform; they can be written in many different styles and sizes by different writers, and even by the same writer. Therefore, reading machine-printed writing is a much simpler task than reading handwriting, and it has been accomplished and marketed with considerable success. The aim here is to demonstrate a framework that gives good recognition accuracy for English handwritten character input, by developing a new system built on distinct methodologies attained through the survey and the effective approaches reviewed below:
Optical character recognition, abbreviated as OCR, is the process of converting images of handwritten, typewritten, or printed text (usually captured by a scanner) into machine-editable text or a computer-processable format, such as ASCII code. Computer systems equipped with OCR improve the speed of input operations, reduce data entry errors, and reduce the storage space required by paper documents, thus enabling compact storage, fast retrieval, scanning corrections, and other file manipulations. OCR has applications in postal code recognition, automatic data entry into large administrative systems, banking, automatic cartography, 3D object recognition, digital libraries, invoice and receipt processing, reading devices for the blind, and personal digital assistants. OCR includes essential problems of pattern recognition. Accuracy, flexibility, and speed are the three main features that characterize a good OCR system. OCR aims at enabling computers to recognize optical symbols without human intervention. This is accomplished by searching for a match between the features extracted from a given symbol's image and a library of image models.
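The matching step just described can be illustrated with a minimal sketch. The 3x3 binary "glyphs" below are hypothetical toy data, and the nearest-template rule stands in for the richer feature matching a real OCR system performs:

```python
# Minimal illustration of OCR-style template matching: an input glyph image
# is compared against a library of stored templates, and the label of the
# closest template (fewest differing pixels) is returned.
# The 3x3 binary "glyphs" below are hypothetical toy data.

def pixel_distance(a, b):
    """Number of pixels on which two equal-sized binary images differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def recognize(glyph, templates):
    """Return the label of the template nearest to the input glyph."""
    return min(templates, key=lambda label: pixel_distance(glyph, templates[label]))

templates = {
    "I": [(0, 1, 0), (0, 1, 0), (0, 1, 0)],
    "L": [(1, 0, 0), (1, 0, 0), (1, 1, 1)],
    "O": [(1, 1, 1), (1, 0, 1), (1, 1, 1)],
}

# A noisy "L" (one corrupted pixel) is still matched correctly.
noisy_l = [(1, 0, 0), (1, 0, 1), (1, 1, 1)]
print(recognize(noisy_l, templates))  # -> L
```

Because the distance degrades gracefully with noise, a slightly corrupted glyph still maps to the right label; this tolerance is exactly what distinguishes recognition from exact matching.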
2.2 SURVEY ON CHARACTER RECOGNITION SYSTEM WITH IMAGE PROCESSING TECHNIQUES
(Lu, Huang, & Sui, 2018) noted that, owing to document degradations such as uneven illumination, image contrast variation, blur caused by humidity, and bleed-through, degraded document image binarization is still an enormous challenge, and proposed a new binarization method for degraded document images. The proposed algorithm focuses on the differences in image grayscale contrast in different areas. A quadtree is used to divide areas adaptively, and various contrast enhancements are selected to adjust local grayscale contrast in areas with different contrasts. Finally, the local threshold is taken as the mean of the foreground and background gray values, which are determined from the frequency of the gray values. The proposed algorithm was tested on the datasets from the Document Image Binarization Contest. Compared with five other classical algorithms, the images binarized using the proposed algorithm achieved the highest F-measure and peak signal-to-noise ratio and obtained the highest correct rate of recognition.
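The core thresholding idea, placing the local threshold at the mean of the estimated foreground and background gray values, can be sketched as follows. A simple two-means iteration stands in for the frequency-based estimation described in the paper; this is not the authors' exact algorithm, only the general principle, and the pixel values are hypothetical:

```python
# Simplified sketch: within a local block, estimate the background and
# foreground mean gray values and place the threshold at their mean.
# A two-means iteration approximates the frequency-based estimation in the
# paper (NOT the authors' exact algorithm, only the general principle).

def block_threshold(pixels, iterations=10):
    """Threshold a block into dark (text) and bright (background) pixels,
    returning the mean of the two class means."""
    t = sum(pixels) / len(pixels)          # initial guess: global mean
    for _ in range(iterations):
        dark = [p for p in pixels if p <= t]
        bright = [p for p in pixels if p > t]
        if not dark or not bright:
            break
        t = (sum(dark) / len(dark) + sum(bright) / len(bright)) / 2
    return t

def binarize(pixels, t):
    return [0 if p <= t else 255 for p in pixels]

# Toy block: dark strokes around 40, paper background around 200.
block = [35, 42, 38, 45, 198, 205, 201, 210, 40, 199]
t = block_threshold(block)
print(binarize(block, t))  # dark pixels -> 0, background -> 255
```

Running this per quadtree leaf rather than globally is what makes the method adaptive to contrast variation across the page.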
(Jia, Shi, He, Wang, & Xiao, 2018) presented an effective approach for the local threshold binarization of degraded document images. Structural symmetric pixels (SSPs) are used to calculate the local threshold in a neighborhood, and the voting result of multiple thresholds determines whether a pixel belongs to the foreground or not. To extract SSP candidates with large magnitudes and to distinguish faint characters from bleed-through background, an adaptive global threshold selection algorithm is proposed. To further extract pixels with opposite orientations, an iterative stroke width estimation algorithm is applied to ensure the proper size of the neighborhood used in orientation judgment. Finally, a multiple-threshold vote based framework is presented to deal with some inaccurate detections of SSPs. Experimental results on seven public document image binarization datasets show that the method is accurate and robust under multiple evaluation measures.
(Vo, Kim, Yang, & Lee, 2018) proposed the binarization of degraded document images using document analysis. A supervised binarization method is proposed, in which a hierarchical deep supervised network (DSN) architecture is learned to predict text pixels at different feature levels. With higher-level features, the network can differentiate text pixels from background noise, so that the severe degradations occurring in document images can be managed; foreground maps predicted at lower-level features, meanwhile, offer higher visual quality at boundary areas. Compared with those of traditional algorithms, the binary images generated by this architecture have a cleaner background and better-preserved strokes. The approach achieves state-of-the-art results over the widely used DIBCO datasets, revealing the robustness of the presented method.
(Chen & Wang, 2017) presented a new framework for binarizing broken and degraded document images and restoring their quality. Document image binarization refers to the conversion of a document image into a binary image; for broken and severely degraded document images, binarization is a very challenging process. In this approach, the non-local means method is extended and used to remove noise from the input document image in the pre-processing step. The method then binarizes the document image using a quick adaptive thresholding approach. To obtain more pleasing binarization results, the binarized document image is finally post-processed. Three measures are taken in the post-processing step: de-speckling, preserving stroke connectivity, and improving the quality of text regions. Experimental results show significant improvement in the binarization of broken and degraded document images collected from various sources, including degraded and broken books, magazines, and document files.
(S. R. Narang, Jindal, & Kumar, 2020) presented a survey of OCR with a special focus on OCR for ancient text documents. The paper helps novice researchers by providing a comprehensive study of the various phases, namely segmentation, feature extraction, and classification, required for an OCR system, especially for ancient documents. It has been observed that limited work has been done on the recognition of ancient documents, especially for the Devanagari script. The article also presents future directions for upcoming researchers in the field of ancient text recognition.
(S. Narang, Jindal, & Kumar, 2019) presented a recognition system for handwritten Devanagari ancient manuscripts using statistical feature extraction techniques. In the feature extraction phase, intersection points, open endpoints, centroid, horizontal peak extent, and vertical peak extent features are extracted. For classification, Convolutional Neural Network, Neural Network, Multilayer Perceptron, RBF-SVM, and random forest techniques are considered. Various feature extraction and classification techniques are compared for the recognition of basic characters segmented from Devanagari ancient manuscripts. The approach achieved 88.95% recognition accuracy using a combination of all features and all classifiers considered in this work, combined by a simple majority voting scheme.
(Bannigidad & Gudada, 2019) classified historical Kannada handwritten document images of different dynasties based on their age and type, using the Vijayanagara dynasty (1460 AD), Mysore Wadiyar dynasty (1936 AD), Vijayanagara dynasty (1400 AD), and Hoysala dynasty (1340 AD) for experimentation. The average classification accuracy across dynasties is 92.3% for the K-NN classifier and 96.7% for the SVM classifier; the SVM classifier thus shows better classification performance than the K-NN classifier for historical Kannada handwritten document images. The experimental outcomes are checked against manual results and other methods in the literature, which shows the thoroughness of the proposed technique.
(S. R. Narang, Jindal, Ahuja, & Kumar, 2020) presented improved recognition results for Devanagari ancient characters using the scale-invariant feature transform (SIFT) and Gabor filter feature extraction techniques. A support vector machine (SVM) classifier is used for the classification task. For the experiments, a database of 5484 samples of Devanagari characters was collected from various ancient manuscripts kept in libraries and museums. SIFT- and Gabor-filter-based features are used to extract the properties of handwritten Devanagari ancient characters for recognition. Principal component analysis is used to reduce the length of the feature vector, which reduces the training time of the model and improves recognition accuracy. A recognition accuracy of 91.39% has been achieved using the proposed system with tenfold cross-validation and a poly-SVM classifier.
(Boudraa, Hidouci, & Michelucci, 2020) presented an original skew angle detection and correction technique. The morphological skeleton is introduced to considerably diminish the amount of data by eliminating redundant pixels and preserving only the central curves of the image components. Next, the proposed method uses a Progressive Probabilistic Hough Transform (PPHT) to find image lines. Finally, a specific procedure is applied to measure the global skew angle of the document image from these identified lines. Experimental results demonstrate the accuracy and effectiveness of the approach for skew angle detection on three popular datasets covering many types of documents in diverse writing systems (Chinese, Greek, and English) and different styles (horizontal or vertical orientations, figures and tables, multi-column page layouts).
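The final step, estimating a global skew angle from the detected lines, can be sketched as follows. The segments below are hypothetical detector output in (x1, y1, x2, y2) form, and the plain median stands in for whatever aggregation the authors use:

```python
# Sketch of the final step: once (probabilistic) Hough line detection has
# produced line segments, the global page skew can be taken as a robust
# aggregate (here, the median) of the segment angles, so that a few
# spurious near-vertical strokes do not corrupt the estimate.
import math

def segment_angle_deg(x1, y1, x2, y2):
    """Angle of a segment in degrees, normalized into (-90, 90]."""
    a = math.degrees(math.atan2(y2 - y1, x2 - x1))
    if a <= -90:
        a += 180
    if a > 90:
        a -= 180
    return a

def estimate_skew(segments):
    """Median angle over detected segments (simple robust estimate)."""
    angles = sorted(segment_angle_deg(*s) for s in segments)
    n = len(angles)
    mid = n // 2
    return angles[mid] if n % 2 else (angles[mid - 1] + angles[mid]) / 2

# Three near-horizontal text lines skewed by about 2 degrees, plus one
# near-vertical outlier stroke that the median ignores.
segments = [(0, 0, 100, 3.5), (0, 20, 100, 23.4), (0, 40, 100, 43.6),
            (10, 0, 12, 50)]
print(round(estimate_skew(segments), 1))  # -> 2.0
```

Correction would then rotate the page by the negative of this angle.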
(Rahiche, Hedjam, Al-maadeed, & Cheriet, 2020) addressed the dating of historical documents, whose discoloration and changes in the optical properties of writing materials are a natural phenomenon that occurs as they age. They present a new content-independent and non-destructive approach based on multispectral imaging combined with a ranking classification technique to track the spectral responses of iron-gall ink at different wavelengths over time. The proposed approach was evaluated on multispectral images of real handwritten letters dating from the 17th to the 20th century. Experimental results demonstrate the effectiveness of multispectral imaging for dating document images.
(Sabeenian, Paramasivam, Anand, & Dinesh, 2019) proposed a character recognition approach using a Convolutional Neural Network (CNN) comprising convolution layers, pooling layers, activation layers, a fully connected layer, and a softmax classifier. A character-set database was created using scanned images of palm-leaf manuscripts; it comprises 15 classes, each containing around 1000 different samples. The recognition accuracy of the CNN classifier is found to be around 96.1% to 100%. The prediction rate is high due to the large number of features extracted across the CNN layers. A comparison of the proposed method with other machine learning algorithms is also presented.
(Devi & Maheswari, 2019) presented soft-computing-enabled digital acquisition and character extraction from stone inscription images. A new stone inscription image enhancement system is proposed by combining Modified Fuzzy Entropy-based Adaptive Thresholding (MFEAT), with a Gaussian degree of membership function, and an iterative bilateral filter (IBF). Since stone color varies, the images are first normalized and stretched by linear contrast stretching, followed by foreground extraction by MFEAT; the resulting binarized image includes some noise, so the IBF is used to remove it while preserving the character edges. The proposed fuzzy system helps handle the uncertainty between character and background pixels. The results were tested on images under various light illuminations and achieved a good PSNR compared with other binarization techniques.
(Alajlan, El Rube, Kamel, & Freeman, 2007) proposed the triangle-area representation (TAR) for the retrieval of nonrigid shapes with closed contours. The representation utilizes the areas of triangles formed by the boundary points to find the convexity or concavity of each point at different scales (triangle side lengths). It effectively captures both local and global shape characteristics, is invariant to translation, rotation, and scaling, and is robust against noise and moderate amounts of occlusion. In the matching stage, a Dynamic Space Warping (DSW) algorithm is used to search efficiently for the optimal (least-cost) correspondence between two shapes; a distance is then estimated based on the optimal correspondence. The performance of the method is demonstrated using four standard tests on two well-known shape databases.
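The per-point triangle-area measure can be sketched in a few lines. The unit-square contour below is a hypothetical toy shape; a real TAR descriptor evaluates this at many scales along a densely sampled contour:

```python
# Minimal sketch of the triangle-area representation (TAR): for each point
# on a closed contour, the signed area of the triangle formed with its two
# neighbors at distance `scale` indicates local convexity (positive) or
# concavity (negative) for a counter-clockwise contour.

def signed_area(p, q, r):
    """Signed area of triangle pqr (shoelace formula / cross product)."""
    return ((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2

def tar(contour, scale=1):
    n = len(contour)
    return [signed_area(contour[(i - scale) % n], contour[i],
                        contour[(i + scale) % n]) for i in range(n)]

# Counter-clockwise unit square: every corner is convex.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(tar(square))  # all positive -> all corners convex
```

Because the area is computed from relative point positions, the signature is unchanged by translation and rotation, and dividing by a scale-dependent factor makes it scale-invariant as well.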
(Alemdar & Ersoy, 2017) addressed multi-resident activity tracking and recognition in smart environments, solving the concurrent activity recognition problem in smart homes equipped with interaction-based sensors and housing multiple residents using two different approaches. The first is a factorial hidden Markov model with two separate chains corresponding to the two residents; the second is nonlinear Bayesian tracking, which decomposes the observation space according to the number of residents. Both experiments were performed on the multi-resident Activity Recognition with Ambient Sensing datasets, and the advantages and disadvantages of each approach are discussed in terms of run-time complexity, flexibility, and generalizability.
(Angadi & Kodabagi, 2014) presented the segmentation of text lines, words, and characters from Kannada text in low-resolution display board images. The proposed method uses projection profile features and on-pixel distribution statistics for the segmentation of text lines. The method also detects text lines containing consonant modifiers and merges them with the corresponding text lines, and efficiently separates overlapped text lines as well. The character extraction process computes character boundaries using vertical profile features for extracting character images from every text line. Further, the word segmentation process uses k-means clustering to group inter-character gaps into character and word cluster spaces, which are used to compute thresholds for extracting words. The method also takes care of variations in character and word gaps, and is tolerant to font variability, spacing variations between characters and words, the absence of a free segmentation path due to consonant and vowel modifiers, noise, and other degradations.
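The gap-clustering step above can be sketched with a simple 1-D 2-means. The gap widths below are hypothetical pixel measurements; the paper's full k-means operates on gaps measured from real display board images:

```python
# Sketch of the word-segmentation idea: inter-character gaps are clustered
# into two groups (small = within-word, large = between-word), and gaps in
# the large-gap cluster mark word boundaries.

def two_means_1d(values, iterations=20):
    """Cluster 1-D values into two groups; return (low_mean, high_mean)."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not a or not b:
            break
        lo, hi = sum(a) / len(a), sum(b) / len(b)
    return lo, hi

def word_boundaries(gaps):
    """Indices of gaps wide enough to be word (not character) separators."""
    lo, hi = two_means_1d(gaps)
    threshold = (lo + hi) / 2
    return [i for i, g in enumerate(gaps) if g > threshold]

gaps = [3, 4, 2, 15, 3, 5, 18, 4]   # two large gaps -> two word breaks
print(word_boundaries(gaps))        # -> [3, 6]
```

Deriving the threshold from the data itself, rather than fixing it, is what gives the method its tolerance to font and spacing variations.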
(Anupama, Rupa, & Reddy, 2013) proposed multiple histogram projections using morphological operators to extract image features. Horizontal projection is performed on the text image, and line segments are identified by the peaks in the horizontal projection; a threshold is applied to divide the text image into segments, and false lines are eliminated using another threshold. Vertical histogram projections are then used to decompose the line segments into words using a threshold, and further into characters. Experimentally, this approach performs very well, with a detection rate (DR) of 98% and a recognition accuracy (RA) of 98%.
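The horizontal-projection step common to this and several other surveyed methods can be sketched as follows; the toy binary page is hypothetical, and real systems add thresholds to suppress false lines as described above:

```python
# Sketch of line segmentation by horizontal projection: each row's count of
# foreground (ink) pixels forms the profile; runs of rows whose count
# exceeds a threshold are taken as text lines, the gaps between them as
# line breaks.

def horizontal_profile(image):
    """image: list of rows, each a list of 0 (background) / 1 (ink)."""
    return [sum(row) for row in image]

def line_segments(image, threshold=0):
    """Return (start_row, end_row) pairs of rows forming text lines."""
    profile = horizontal_profile(image)
    lines, start = [], None
    for y, count in enumerate(profile):
        if count > threshold and start is None:
            start = y
        elif count <= threshold and start is not None:
            lines.append((start, y - 1))
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines

# Toy page: two text lines (rows 1-2 and 4-5) separated by blank rows.
page = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 0, 1, 1],
        [1, 1, 1, 0],
        [0, 0, 0, 0]]
print(line_segments(page))  # -> [(1, 2), (4, 5)]
```

The same idea applied column-wise within each line gives the vertical projections used for word and character decomposition.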
(Arica & Vural, 2003) proposed Beam Angle Statistics (BAS) as a perceptual shape descriptor. At each boundary point, beams are the lines connecting the point to the rest of the points on the boundary, and the angle between a pair of beams is calculated to extract the topological structure of the boundary. A shape descriptor is then defined using the third-order statistics of all the beam angles in a set of neighborhood systems. It is shown that BAS is invariant to translation, rotation, and scale, and is insensitive to distortions.
(Asaari, Tebal, & Rosdi) presented a geometric feature representation for an infrared finger image recognition system. The geometric representation is based on the fusion of two types of features: finger widths and fingertip angles. The extracted finger widths and fingertip angles are transformed into the frequency domain using the Discrete Fourier Transform (DFT) to make them robust to shifting and rotation variations. The feature sets obtained from the DFT are fused by concatenating them into a single row vector called the width and fingertip angle (WFTA) feature. To ensure an orthogonal relationship between the WFTA components, Principal Component Analysis is adopted at the matching stage.
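The shift robustness provided by the DFT step comes from a standard property: the magnitude spectrum of a sequence is unchanged by a circular shift. A minimal sketch with hypothetical width measurements:

```python
# Sketch of the DFT step: the magnitude spectrum of a feature sequence is
# unchanged by a circular shift of the sequence, which is what makes the
# width/fingertip-angle features robust to positional shifts.
import cmath

def dft_magnitudes(x):
    """Naive DFT; returns the magnitude of each frequency bin."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]

widths = [5.0, 7.0, 9.0, 8.0, 6.0, 4.0]   # hypothetical finger widths
shifted = widths[2:] + widths[:2]         # same finger, shifted start point

m1 = dft_magnitudes(widths)
m2 = dft_magnitudes(shifted)
print(all(abs(a - b) < 1e-9 for a, b in zip(m1, m2)))  # -> True
```

A shift only multiplies each frequency bin by a unit-magnitude phase factor, so taking magnitudes discards exactly the shift-dependent part of the feature.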
(Bansal & Sinha, 1999) presented a description of the shapes of Devanagari characters and its application to their recognition. It exploits certain features of the script both to reduce the search space and to create a reference with respect to which correspondence can be established during the matching process. The description prototypes are constructed from real-life script after segmentation, so that the aberrations introduced during the inevitable segmentation process are accounted for in the description. The approach has been tested on printed Devanagari text, achieving approximately 70% success without any post-processing and 88% correct recognition with the help of a word dictionary.
(Li, Li, Pan, Chu, & Roddick, 2015) addressed recognizing optical characters from document images in which text is mixed with figures, which has wide applications such as automatic document reading. Segmenting the text region from such mixed documents is a crucial step in this system. The segmentation procedure includes two stages: the first extracts the texture features of each block using Gabor filters, and the second classifies the texture features for segmentation using a kernel self-optimization Fisher classifier. Experiments are presented to verify the performance of the proposed method.
(Lozano-Monasor, López, Vigo-Bustos, & Fernández-Caballero, 2017) presented a facial expression recognition system to recognize the emotions of aging adults. The six basic emotions (happiness, sadness, anger, fear, disgust, and surprise), as well as a neutral state, are distinguished. Active shape models are applied for feature extraction; the Cohn–Kanade, JAFFE, and MMI databases are used for training; and support vector machines (ν-SVM) are employed for facial expression classification. The six basic emotions are regrouped into three categories (happiness, negative emotion, and surprise), with sadness, anger, fear, and disgust grouped into a single expression. The three new categories are found to be relevant for detecting abnormalities in an aging adult's mood state.
(Mathivanan, Ganesamoorthy, & Maran, 2014) presented a writer identification system with four processing steps: preprocessing, segmentation, feature extraction, and writer identification using a neural network. In the preprocessing phase, the handwritten text is subjected to slant removal in preparation for segmentation and feature extraction, after which the text image undergoes noise removal and gray-level conversion. The preprocessed image is then segmented using a morphological watershed algorithm, where the text lines are segmented into single words and then into single letters. Features are extracted from the segmented image by the Daubechies 5/3 integer wavelet transform to reduce training complexity; this transform is lossless and reversible. The extracted features are given as input to a two-layer neural network for writer identification, with a target image selected for each training process; the outputs obtained from training against several different targets help in text identification. The system performs multilingual text analysis and provides simple and efficient text segmentation.
(Murthy, Kumar, Kumar, & Ranganath, 2004) addressed ancient inscriptions in epigraphy. Inscriptions are of great importance, as they give a picture of an ancient civilization, and their contribution to understanding ancient history is remarkable; segmenting the characters from the text lines of such scripts is therefore required. The complexity of epigraphical character segmentation lies in the spacing between text lines and between adjacent characters, and the text lines are sometimes skewed. The proposed method establishes nearest neighbors using a clustering method to segment each text line and character, and also handles skewed documents.
(Nikolaou, Makridis, Gatos, Stamatopoulos, & Papamarkos, 2010) proposed a method for segmenting document pages in the digitization of historical machine-printed sources. Such documents often suffer from low quality and local skew, exhibit several degradations due to the quality of the old printing matrix or ink diffusion, and have complex and dense layouts. To face these problems, the following innovative aspects are introduced:
(i) The use of a novel Adaptive Run Length Smoothing Algorithm (ARLSA) to face the problem of complex and dense document layout
(ii) The detection of noisy areas and punctuation marks that are usual in historical machine-printed documents
(iii) The detection of possible obstacles formed from background areas to separate neighboring text columns or text lines, and
(iv) The use of skeleton segmentation paths to isolate possibly connected characters.
Comparative experiments using several historical machine-printed documents prove the efficiency of the proposed technique.
(Deshpande, Malik, & Arora, 2008) proposed the classification and recognition of handwritten Devnagari characters using regular expressions and a minimum edit distance method. Operations such as string searching, manipulation, validation, and formatting appear in all applications that deal with textual data, and character recognition presents sequence-analysis scenarios that are ideally suited to the application of regular expression algorithms. The proposed system describes the use of regular expressions in this problem domain and demonstrates how their effective use can facilitate more efficient and more effective character recognition.
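The minimum edit distance measure used to score how closely a recognized string matches a dictionary word can be sketched directly (the example words are hypothetical; the paper applies this to Devnagari character sequences):

```python
# Sketch of minimum edit distance (Levenshtein distance): the number of
# insertions, deletions, and substitutions needed to turn one string into
# the other, computed with the classic dynamic-programming recurrence.

def edit_distance(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

# A misrecognized word at distance 1 from the intended dictionary entry is
# still its nearest match, which is how the measure repairs OCR errors.
print(edit_distance("devnagari", "devnagart"))  # -> 1
print(edit_distance("kitten", "sitting"))       # -> 3
```

In a recognizer, the dictionary word with the smallest distance to the raw OCR output is taken as the corrected result.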
(Van Phan, Zhu, & Nakagawa, 2011) presented preprocessing and character segmentation of digitized Nom document pages toward their digital archiving. Nom is an ideographic script used to represent Vietnamese from the 10th century to the 20th century. Because of the various complex layouts, an efficient method based on connected component analysis is proposed for extracting characters from images. The area Voronoi diagram is then employed to represent the neighborhood and boundary of connected components; based on this representation, each character can be considered a group of adjacent Voronoi regions. To improve segmentation performance, the recursive X-Y cut method is used to segment separated regions. The method was evaluated on several pages in different layouts, and the results confirm that it is effective for character segmentation in Nom documents.
(Raja & John, 2013) presented Tamil character recognition. The Tamil script is used to write the Tamil language in India, Sri Lanka, Singapore, and parts of Malaysia, as well as minority languages such as Badaga. An efficient method is proposed for recognizing Tamil characters based on extracted features such as horizontal lines, vertical lines, loops, and curves. The extracted features are fed to a Back-Propagation Neural Network (BPNN) classifier, a Support Vector Machine (SVM) classifier, and a Decision Tree (DT) classifier. The DT classifier achieved a very good recognition rate on the Tamil character database and shows improved performance compared with the BPNN and SVM classifiers.
(Bajaj, Dey, & Chaudhury, 2002) proposed the recognition of handwritten Devnagari numerals. The basic objective of the present work is to provide an efficient and reliable technique for the recognition of handwritten numerals. Three different types of features have been used for the classification of numerals. A multi-classifier connectionist architecture has been proposed for increasing the reliability of the recognition results. Experimental results show that the technique is effective and reliable.
(Saba, Rehman, & Zahrani, 2014) presented character segmentation of overlapped cursive words without using any slant correction technique. A new concept of a core-zone is introduced for segmenting such overlapped cursive script: following core-zone detection, character boundaries are detected in the core-zone area only. However, due to the inherent nature of cursive words, a few characters are over-segmented; a threshold is selected heuristically to overcome this problem. For a fair comparison, overlapped words were extracted from the CEDAR benchmark database. The experiments exhibit promising results and high speed.
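The core-zone idea can be sketched as locating the horizontal band that carries most of the ink, excluding ascenders and descenders. The density-ratio rule and the toy word image below are illustrative assumptions, not the authors' exact detector:

```python
# Sketch of core-zone detection: the core zone of a cursive word is taken
# as the band of rows whose ink density exceeds a fraction of the maximum
# row density; character boundaries would then be searched inside this
# band only.

def core_zone(image, ratio=0.5):
    """image: rows of 0/1 pixels; returns (top_row, bottom_row) of the core."""
    density = [sum(row) for row in image]
    cutoff = ratio * max(density)
    core = [y for y, d in enumerate(density) if d >= cutoff]
    return core[0], core[-1]

# Toy word: sparse ascender (row 0), dense body (rows 1-3), descender (row 4).
word = [[0, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 0],
        [1, 0, 1, 1, 1, 1],
        [0, 1, 1, 1, 0, 1],
        [0, 0, 0, 1, 0, 0]]
print(core_zone(word))  # -> (1, 3)
```

Restricting boundary detection to this band is what lets the method cope with overlapping ascenders and descenders without slant correction.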
(Shadkami & Bonnier, 2010) proposed a document image analysis approach that segments and classifies the regions of a document image, such as text, graphics, and background, through their boundaries. The well-known watershed segmentation is adapted to obtain a very fast and efficient classification. Finally, the algorithm is compared with three others by running all the algorithms on a set of document images and comparing their results with a ground-truth segmentation designed by hand.
(Shaheen, Sheltami, Al-Kharoubi, & Shakshuki, 2019) proposed image encryption techniques to ensure the confidentiality of data. Digital images differ from text data in that they carry more data, higher redundancy, and strong correlation between pixels. Many encryption techniques have been proposed for wireless sensor networks (WSN); since sensor nodes have limited memory, energy, and processing capabilities, any proposed technique must consider these limitations. Confidentiality in storing and transmitting images is needed in fields such as medicine, the military, online personal albums, confidential communications, and video conferencing. However, most existing techniques do not apply to digital images due to image structure and size, so traditional cryptosystems cannot be applied directly in WSN. In this paper, two digital image transformation techniques, (1) the discrete cosine transform and (2) the discrete wavelet transform, are used to propose digital image encryption techniques for WSN.
(Silva & Kariyawasam, 2014) addressed handwritten character segmentation for Sri Lankan scripts: unlike English letters, Sinhala letters are round in shape, and straight lines are almost nonexistent. Unlike printed characters, handwritten characters may sometimes touch each other, and they also vary in writing style. Segmentation of a document image is one of the basic and critical tasks, with a great impact on the character recognition process, and the complications present in handwritten documents make it challenging. The method suggested in this paper uses Self-Organizing Feature Maps (SOFM) for the segmentation of touching character pairs.
(Sridevi & Subashini, 2012) presented two methods, for line segmentation and character segmentation. Document image segmentation is one of the critical phases in a handwritten character recognition system, since the correct segmentation of individual characters decides the accuracy of recognition; it decomposes the sequence of characters into individual characters by segmenting text lines and then words. Ancient Tamil script documents consist of vowels, consonants, and various modifiers, so a proper segmentation algorithm is required. The first method uses a projection profile and PSO for line segmentation; the second combines connected components with nearest-neighbor methods to segment the characters. Experimental results show that these methods give better results when compared with other methods.
(Kavitha, Shivakumara, Kumar, & Lu, 2016) noted that text segmentation from degraded historical Indus script images helps an Optical Character Recognizer (OCR) achieve good recognition rates for such scripts; however, it is challenging due to the complex backgrounds in these images. Sobel and Laplacian operations are combined to enhance degraded low-contrast pixels. The proposed method then generates skeletons of the text components in the enhanced images to reduce the computational burden, which in turn helps in studying component structures efficiently. The cursiveness of components is studied based on branch information to remove false text components. A nearest-neighbor criterion groups components lying on the same line into clusters, which are then classified into text and non-text clusters based on the characteristics of text components. The method is evaluated on a large dataset containing a variety of images, and the results are compared with existing methods to show that it is effective in terms of recall and precision.
(Li, Bai, Wang, & Xiao, 2010) proposed a new conditional random field approach in which contextual features are introduced into text segmentation. Local visual information and contextual label information are integrated into a conditional random field through several components: some focus on visual image information to predict the category of image sites, while others focus on contextual label information to determine patterns within the label field. Integrating contextual label information in a conditional random field can effectively resolve local ambiguities and improve text segmentation performance against complex backgrounds. Comparative results demonstrate that the proposed method outperforms other methods for text segmentation from a complex background.
Neumann and Matas (2015) posed the character detection and segmentation problem as an efficient sequential selection from the set of Extremal Regions (ERs). The ER detector is robust against blur, low contrast, and variations in illumination, colour, and texture. In the first stage, the probability of each ER being a character is estimated using features calculated in constant time by a novel algorithm, and only ERs with locally maximal probability are selected for the second stage, where classification accuracy is improved using computationally more expensive features. A highly efficient clustering algorithm then groups ERs into text lines, and an OCR classifier trained on synthetic fonts is exploited to label character regions. The most probable character sequence is selected in the last stage, when the context of each character is known. The method was evaluated on three public datasets: on the ICDAR 2013 dataset it achieves state-of-the-art results in text localization, while on the more challenging SVT dataset it significantly outperforms the state of the art, demonstrating that the pipeline can incorporate additional prior knowledge about the detected text.
2.3 SURVEY ON OPTICAL CHARACTER RECOGNITION (OCR) ENABLED TECHNIQUES
Negi, Bhagvati, and Krishna (2001) addressed the complex orthography of the Telugu language, which has a large number of distinct character shapes (estimated to be of the order of 10,000) composed of simple and compound characters formed from 16 vowels (achchus) and 36 consonants (hallus). They present an efficient and practical approach to Telugu Optical Character Recognition (OCR) that limits the number of templates to be recognized to just 370, avoiding both classifier design for thousands of shapes and very complex glyph segmentation. A compositional approach using connected components and fringe-distance template matching was tested and gave a raw OCR accuracy of about 92%. Several experiments across varying fonts and resolutions showed the approach to be satisfactory.
Cheung, Bennamoun, and Bergmann (2001) observed that Optical Character Recognition (OCR) of cursive scripts is a difficult task because their segmentation suffers from serious problems. Their paper proposes an Arabic OCR system that uses a recognition-based segmentation technique to overcome classical segmentation problems. A newly developed Arabic word segmentation algorithm is also introduced to separate horizontally overlapping Arabic words and subwords, and a feedback loop controls the combination of character fragments for recognition. The implemented system achieved 90% recognition accuracy at a rate of 20 characters per second.
Choudhary (2014) reviewed the enormous effort devoted to cursive handwriting recognition and the various techniques developed for handwriting segmentation and recognition. The review presents segmentation strategies for automated recognition of off-line, unconstrained cursive handwriting from static surfaces, covers advanced techniques, and compares reported results in the domain of handwritten word segmentation.
Choudhary, Rishi, and Ahlawat (2013) stated that segmentation of words into characters becomes very difficult due to the cursive and unconstrained nature of handwritten script. Their paper proposes a new vertical segmentation algorithm in which the segmentation points are located after thinning the word image to obtain a stroke width of a single pixel. Knowledge of the shape and geometry of English characters is used in the segmentation process to detect ligatures. The proposed approach was tested on a local benchmark database and achieved high segmentation accuracy.
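The vertical-segmentation idea can be sketched as follows: after thinning, a column crossed by at most one stroke pixel is a candidate ligature cut. The one-pixel threshold is an assumption that only makes sense for thinned images, and the shape/geometry heuristics the paper uses to validate candidates are omitted:

```python
def candidate_cuts(image, max_ink=1):
    """Columns crossed by few enough ink pixels to be potential ligatures."""
    h, w = len(image), len(image[0])
    cols = []
    for c in range(w):
        ink = sum(image[r][c] for r in range(h))
        if 0 < ink <= max_ink:   # a thin ligature, not a blank gap
            cols.append(c)
    return cols

# Two dense "characters" joined by a one-pixel ligature in column 2.
word = [
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
]
print(candidate_cuts(word))  # [2]
```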
Daliri and Torre (2008) proposed an algorithm for shape recognition and retrieval that matches pairs of shape contours. For two shapes, the cost of their matching is evaluated using the shape context, and dynamic programming finds the best matching between the point sets. Dynamic programming not only recovers the best matching but also identifies occlusions, i.e., points in the two shapes that cannot be properly matched. Given the correspondence between the two point sets, the contours are aligned using Procrustes analysis. After alignment, each contour is transformed into a string of symbols, and a modified version of the edit distance is used to compute the similarity between the strings. Finally, recognition and retrieval are performed by a simple nearest-neighbour procedure. The algorithm has been tested on a large set of shape databases, giving recognition and retrieval performance superior to most previously proposed approaches.
Sharma and Jhajj (2010) addressed the problem of recognition of isolated handwritten characters in the Gurmukhi script. The process consists of two stages. The first, feature extraction, analyzes the set of isolated characters and selects a set of features that can uniquely identify each character; the performance of the recognition system depends heavily on the features used. An SVM classifier then discriminates two classes of feature vectors by generating hyper-surfaces in the feature space that are optimal in the sense that the hyper-surface obtained by the SVM optimization is guaranteed to have the maximum distance to the nearest support vectors. The SVM operates on kernel evaluations of the feature vectors. An annotated sample image database of isolated handwritten Gurmukhi characters was prepared and used for training and testing the system.
Gaurav and Ramesh (2012) presented a geometry-based feature extraction technique applicable to segmentation-based word recognition systems. The system extracts geometric features of the character contour, based on the basic line types that form the character skeleton, and outputs a feature vector. Feature vectors generated from a training set were then used to train a neural-network-based pattern recognition engine so that the system could be benchmarked.
Hsu, Chen, and Chung (2012) divided vehicle license plate recognition (LPR) into three major application categories and proposed a solution whose parameter settings are adjustable for each: access control (AC), law enforcement (LE), and road patrol (RP). Each application is characterized by variables with different scopes of variation and thus requires different settings. The proposed solution consists of three modules for plate detection, character segmentation, and recognition. Edge clustering is formulated for solving plate detection for the first time, and the maximally stable extremal region (MSER) detector is applied to character segmentation in a novel way. A bilayer classifier, improved with an additional null class, is experimentally shown to be better than previous methods for character recognition. To assess performance, the application-oriented license plate (AOLP) database was composed and made available to the research community. Experiments show that the proposed solution outperforms many previous ones, and that LPR is better solved by solutions with settings oriented towards specific applications.
Iqbal et al. (2008) presented Radial Sector Coding (RSC) for translation-, rotation-, and scale-invariant character recognition. Translation invariance is obtained using the Center of Mass (CoM), and scale invariance is achieved by normalizing the character features. To obtain the most challenging property, rotation invariance, RSC searches for a rotation-invariant Line of Reference (LoR) by exploiting the symmetry property of symmetric characters and the Axis of Reference (AoR) of non-symmetric characters. RSC uses the LoR to generate invariant topological features for different characters, which are then used as inputs to a multilayer feed-forward artificial neural network (ANN). The approach was tested on two widely used English fonts, Arial and Tahoma, and achieved 98.6% recognition performance on average.
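The translation and scale normalisation steps can be sketched as below, under the assumption that a character is given as a set of (x, y) pixel coordinates; the RSC sector coding itself and the LoR/AoR search are not reproduced here:

```python
def center_of_mass(points):
    """Mean (x, y) of the character's foreground pixels."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def normalize(points):
    """Translate the CoM to the origin, then scale the farthest point to 1."""
    cx, cy = center_of_mass(points)
    shifted = [(x - cx, y - cy) for x, y in points]
    m = max((x * x + y * y) ** 0.5 for x, y in shifted)
    return [(x / m, y / m) for x, y in shifted]

glyph = [(2, 2), (4, 2), (3, 5)]
print(center_of_mass(glyph))  # (3.0, 3.0)
print(normalize(glyph))       # CoM at origin, max radius 1
```

After this normalisation, the same glyph drawn at any position or size maps to the same point set, which is exactly what makes the subsequent features translation- and scale-invariant.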
Mahmud, Raihan, and Rahman (2003) proposed an optical character recognition (OCR) system for Bengali characters. Recognition is performed for both isolated and continuous printed multi-font Bengali text. Preprocessing steps include segmentation at various levels, noise removal, and scaling. A Freeman chain code is calculated from the scaled character and further processed to obtain a discriminating set of feature vectors for the recognizer. Unknown samples are classified using a feed-forward neural-network-based recognition scheme. Experimental results show a success rate of approximately 98% for isolated characters and 96% for continuous text.
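The Freeman chain code mentioned above encodes a stroke as the sequence of directions between successive boundary pixels. A minimal sketch, assuming the stroke is already traced into an (x, y) pixel path:

```python
# 8-directional Freeman codes: 0 = east, counting counter-clockwise.
DIRS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
        (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(path):
    """Freeman chain code of a pixel path given as (x, y) coordinates."""
    codes = []
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        codes.append(DIRS[(x1 - x0, y1 - y0)])
    return codes

# An L-shaped stroke: two steps east, then two steps north.
print(chain_code([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]))  # [0, 0, 2, 2]
```

Histograms or transition counts over such code sequences are one common way to turn them into fixed-length feature vectors for a classifier.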
Kapoor and Verma (2014) stated that segmentation of handwritten text in the Devanagari script is an uphill task. The presence of a header line, overlapped characters in the middle zone, and half characters makes the segmentation process more difficult, and interline space and noise can make line fragmentation difficult as well. Without separating touching characters it is difficult to identify them, so fragmentation is necessary for the touching characters in a word. In the proposed technique, the first step is preprocessing of a word; the joint points are then identified, bounding boxes are formed around all vertical and horizontal lines, and finally the touching characters are fragmented based on their height and width.
Khan and Mohammad (2008) proposed a novel character segmentation method for unconstrained handwritten words. The developed segmentation algorithm over-segments in some cases owing to the inherent nature of cursive words, but the over-segmentation is minimal. To increase the efficiency of the algorithm, an artificial neural network is trained with a significant number of manually validated segmentation points for cursive words; the trained network then rejects incorrect segmentation points efficiently and at high speed. The benchmark IAM database is used for fair comparison, and the experimental results are encouraging.
Lee and Bang (2019) presented a novel image search scheme that extracts image features using combined invariant features and a colour descriptor to retrieve specific images by query-by-example. The proposed method can be executed in real time on an iPhone and can easily identify a natural-colour image by its invariant visual features. The scheme was evaluated through simulations measuring average precision and F-score on image databases commonly used for image retrieval. The experimental results reveal improvements in retrieval effectiveness of more than 7.35% and 18.09% compared with the open-source OpenSURF and the MPEG-7 colour and texture descriptors, respectively. The main contribution of the paper is that the approach achieves high accuracy and stability by combining an improved SURF with a colour descriptor when searching for natural images.
Lehal and Singh (1999) described a feature extraction and hybrid classification scheme for machine recognition of Gurmukhi characters, using a binary decision tree and a nearest-neighbour classifier. Classification is carried out in three stages. In the first stage, characters are grouped into three sets depending on their zonal position (upper, middle, and lower zone). In the second stage, the characters in the middle-zone set are further distributed into smaller subsets by a binary decision tree using a set of robust, font-independent features. In the third stage, the nearest-neighbour classifier is applied using special features that distinguish the characters within each subset. A significant point of this scheme, in contrast to conventional single-stage classifiers where each character image is tested against all prototypes, is that a character image is tested against only certain subsets of classes at each stage, which enhances computational efficiency.
Madhavaraj, Ramakrishnan, Kumar, and Bhat (2014) noted that Optical Character Recognition (OCR) accuracy drops mainly due to the merging or breaking of characters, and presented the first algorithm to segment merged Kannada characters using a hypothesis for selecting the positions at which to cut. The method searches for the best segmentation positions by taking into account the support vector machine classifier's recognition score and the validity of the aspect ratio (width-to-height ratio) of the segments between every pair of cut positions. The hypothesis for selecting cut positions is based on the fact that a concave surface exists above and below the touching portion. These concave surfaces are found by tracing the valleys in the top contour of the image, and likewise for the image rotated upside-down; the cut positions are then derived as closely matching valleys of the original and rotated images. The proposed segmentation algorithm handles different font styles, shapes, and sizes better than existing vertical-projection-profile-based segmentation.
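The valley-tracing step can be sketched as follows: the top contour records, for each column, the row of the first ink pixel seen from above, and a valley is a column where that contour dips strictly deeper than both neighbours. This is a simplified reading of the paper's procedure; the matching against the upside-down image and the SVM scoring are omitted:

```python
def top_contour(image):
    """Row index of the first ink pixel in each column (None if blank)."""
    h, w = len(image), len(image[0])
    return [next((r for r in range(h) if image[r][c]), None) for c in range(w)]

def valleys(contour):
    """Columns where the contour dips strictly deeper than both neighbours."""
    out = []
    for c in range(1, len(contour) - 1):
        trio = contour[c - 1 : c + 2]
        if None not in trio and trio[1] > trio[0] and trio[1] > trio[2]:
            out.append(c)
    return out

# Two humps touching near the baseline: the valley sits over the join.
merged = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
]
print(valleys(top_contour(merged)))  # [2]
```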
Naveena and Aradhya (2012) addressed character segmentation for unconstrained handwritten Kannada scripts, a key stage of many OCR systems. The proposed algorithm is based on thinning, branch points, and mixture models. The expectation-maximization (EM) algorithm is used to learn a mixture of Gaussians; the cluster mean points are used to estimate direction, and branch points serve as reference points for segmenting characters. Evaluated on Kannada words, the method has shown encouraging results.
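The EM step can be illustrated with a one-dimensional two-component Gaussian mixture; the equal-weight assumption, the crude initialisation, and the 1-D data are simplifications for exposition, whereas the paper fits mixtures over image coordinates:

```python
import math

def em_two_gaussians(data, iters=50):
    """EM for a 1-D mixture of two Gaussians (equal weights for brevity)."""
    mu = [min(data), max(data)]            # crude initialisation
    sigma = [1.0, 1.0]
    for _ in range(iters):
        # E-step: responsibility of component 0 for each point
        resp = []
        for x in data:
            p = [math.exp(-((x - mu[k]) ** 2) / (2 * sigma[k] ** 2)) / sigma[k]
                 for k in range(2)]
            resp.append(p[0] / (p[0] + p[1]))
        # M-step: re-estimate means and standard deviations
        for k, rk in ((0, resp), (1, [1 - r for r in resp])):
            total = sum(rk)
            mu[k] = sum(r * x for r, x in zip(rk, data)) / total
            sigma[k] = max(1e-3, (sum(r * (x - mu[k]) ** 2
                                      for r, x in zip(rk, data)) / total) ** 0.5)
    return mu

# Two clusters around 1.0 and 5.0; EM recovers means near those values.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
print(sorted(em_two_gaussians(data)))
```

The recovered means play the role of the cluster mean points from which the segmentation direction is estimated.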
Pal and Tripathy (2009) presented a novel scheme for the recognition of multi-oriented and multi-sized isolated characters of printed script. For recognition, the distances of the outer contour points from the centroid of each character are first calculated, and these contour distances are then arranged in a particular order to obtain a size- and rotation-invariant feature. Next, based on the arranged contour distances, features are derived for the different character classes. Finally, these derived features are compared statistically with the features of the input character for recognition. The scheme was tested on printed Bangla and Devanagari multi-oriented characters with encouraging results.
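A hedged illustration of the size- and rotation-invariant contour-distance feature: distances from the centroid to the contour points, sorted and scaled by the largest distance. Plain sorting is a simplification of the paper's "particular order":

```python
def contour_signature(points):
    """Centroid-to-contour distances, sorted and scaled so the largest is 1."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    d = sorted(((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 for x, y in points)
    return [v / d[-1] for v in d]  # scale invariance: largest distance -> 1

# The same rectangle outline sampled upright and rotated by 90 degrees
# yields the same signature.
rect   = [(0, 0), (2, 0), (4, 0), (4, 1), (4, 2), (2, 2), (0, 2), (0, 1)]
rect90 = [(0, 0), (0, 2), (0, 4), (1, 4), (2, 4), (2, 2), (2, 0), (1, 0)]
print(contour_signature(rect) == contour_signature(rect90))  # True
```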
Singh and Budhiraja (2011) observed that research on feature extraction and classification for Indian scripts is limited. Several OCR systems have been developed for handwritten isolated Gurmukhi text; a great deal of research has been done on printed Gurmukhi text, but very little on handwritten Gurmukhi text, so handwritten Gurmukhi character recognition needs more attention from researchers.
Pujari, Naidu, and Jinaga (2002) presented a robust character recognizer for Telugu texts that exploits the inherent characteristics of the Telugu script. The proposed method uses wavelet multi-resolution analysis to extract features and associative memory models to accomplish the recognition task. The system learns the style and font from the document itself and then recognizes the remaining characters in the document. The major contributions of the study can be outlined as follows: it is a robust OCR system for printed Telugu text; it avoids an explicit feature extraction process by a clever selection of the wavelet basis function, which extracts the invariant features of the characters; and it uses a Hopfield-based Dynamic Neural Network (DNN) for learning and recognition, which overcomes the inherent difficulties of memory limitation and spurious states in the Hopfield network. The DNN has been demonstrated to be efficient for associative memory recall; although it is normally not suitable for image processing applications, the multi-resolution analysis reduces the image sizes enough to make the DNN applicable to this domain.
Sagar, Shobha, and Kumar (2008) discussed the development of a Kannada optical character recognition (OCR) system, detailing its preprocessing, segmentation, character recognition, and post-processing modules. Since almost all Kannada characters have sharply curved, non-cursive shapes, segmentation, character recognition, and post-processing are not easy for the Kannada script. The post-processing stage uses a dictionary-based approach to improve the OCR output. The paper also discusses syntactical analysis of the Kannada script, i.e., the analysis of grammatical errors in the language.
Kunte and Samuel (2007) addressed improving OCR performance over a wide range of applications through neural networks. Conventional neural networks are not suitable for classification problems involving large pattern sets because of the large computational time required and the difficulty of determining the network structure. They presented an OCR system for recognizing the complete set of printed Kannada characters, which number more than 600. Two-stage multi-network neural classifiers are used to cope with the large-set character classification problem, and wavelets, which have been used progressively in pattern recognition, are employed to extract the features. An encouraging recognition rate of about 91% was obtained at the character level.
Sardar and Wahab (2010) stressed the real importance of the Urdu language. A huge amount of valuable Urdu literature, from philosophy to the sciences, is in a vanishing and unusable form because it has not yet been digitized. More importantly, many native speakers of Urdu, especially in Pakistan, can read and write only Urdu, and very little data is available for them on the internet or in digitized form. Because of the script's complexity, only rare and partial research and implementation work has been done, and no complete OCR for the Urdu language exists so far. Moreover, most research on Urdu OCR has been specific to particular scripts, fonts, and text environments, which is another obstacle in the way of building a complete OCR. The work therefore researches, and partially implements, an online and offline OCR system that is independent of Urdu scripts and fonts.
Singh, Sarkar, Bhateja, and Nasipuri (2018) argued that in a multi-script environment a complete script identification module is essential before the actual document digitization by the OCR engine. In a multi-script country like India, document digitization cannot serve its full purpose unless multi-script document images can be converted into machine-readable form, yet developing a script-invariant OCR engine is almost impossible. They proposed a novel handwritten script recognition model considering all 12 officially recognized scripts of India. The classification task is performed at word level using a tree-based approach in which Matra-based scripts are first separated from non-Matra scripts using a distance-Hough transform (DHT) algorithm. Next, the Matra-based and non-Matra-based scripts are individually identified using modified log-Gabor filter features applied at multiple scales and orientations. Encouraging outcomes establish the efficacy of the tree-based approach to the classification of handwritten Indic scripts.
Xie et al. (2019) proposed a weakly supervised segmentation system with recognition-guided attention for high-precision historical document segmentation under strict intersection-over-union (IoU) requirements. They formulate the character segmentation problem from the perspective of Bayesian decision theory and progressively propose boundary box segmentation (BBS), recognition-guided BBS (Rg-BBS), and recognition-guided attention BBS (Rg-ABBS) to search for the segmentation path. Furthermore, a novel judgment-gate mechanism is proposed to train a high-performance character recognizer in an incremental, weakly supervised manner. The Rg-ABBS method substantially reduces time consumption while maintaining sufficiently high segmentation precision by incorporating both character recognition knowledge and line-level annotation. Experiments show that the proposed Rg-ABBS system significantly outperforms traditional segmentation methods as well as deep-learning-based instance segmentation and detection methods under strict IoU requirements.
Vamvakas, Gatos, Stamatopoulos, and Perantonis (2008) presented an OCR methodology for recognizing historical documents, either printed or handwritten, without any knowledge of the font. The methodology consists of three steps: the first two create a database for training from a set of documents, while the third performs the recognition of new document images. First, a preprocessing step that includes image binarization and enhancement takes place. Second, a top-down segmentation approach detects text lines, words, and characters, and a clustering scheme groups characters of similar shape; this is a semi-automatic procedure, since the user can intervene at any time to correct clustering errors and assign ASCII labels, and a database for recognition is created from its output. Finally, for every new document image, the same segmentation approach is applied, and recognition is based on the character database produced in the previous step.
Bansal and Sinha (2001) described a complete OCR for printed Hindi text in the Devanagari script, reporting performance of approximately 93% at the character level on various printed documents. They presented a two-level partitioning scheme and a search algorithm for correcting optically read Devanagari characters in the text recognition system, extending the concept of a uniform mismatch penalty to different penalties for various kinds of mismatches, with preference given to the mappings that are known OCR confusions.
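The non-uniform penalty idea can be sketched as a weighted edit distance in which known OCR confusion pairs substitute cheaply while unrelated characters pay the full cost; the confusion table and the 0.25 cost below are illustrative assumptions, not the authors' values:

```python
CONFUSIONS = {("O", "0"), ("l", "1")}  # hypothetical OCR confusion pairs

def sub_cost(x, y):
    """Substitution cost: free for a match, cheap for a known confusion."""
    if x == y:
        return 0.0
    if (x, y) in CONFUSIONS or (y, x) in CONFUSIONS:
        return 0.25          # preferred mapping: a known OCR confusion
    return 1.0

def weighted_distance(a, b):
    """Edit distance with non-uniform substitution penalties."""
    d = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = float(i)
    for j in range(len(b) + 1):
        d[0][j] = float(j)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1.0,       # deletion
                          d[i][j - 1] + 1.0,       # insertion
                          d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]))
    return d[len(a)][len(b)]

print(weighted_distance("cOOl", "c00l"))  # 0.5: two cheap O/0 confusions
print(weighted_distance("cat", "dog"))    # 3.0: three full substitutions
```

When matching a misrecognized word against a lexicon, the cheap confusion substitutions make the intended word score better than unrelated candidates.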
Belagali and Angadi (2016) proposed an OCR system for complex handwritten Kannada characters; recognizing handwritten text from an image is one of the major challenges faced by Kannada OCR systems. The input text image is preprocessed and converted into a binary image. Segmentation then extracts single characters from the image using connected-component labeling. Hu's invariant moments and horizontal and vertical profile features are obtained from a zoned image. A probabilistic neural network (PNN) classifier is used for character recognition, and the recognized output is editable in the Baraha editor. An accuracy of 94.69% was achieved in character recognition of the domain-specific input.
Ramanan, Ramanan, and Charles (2015) addressed multiclass recognition of Tamil characters using binary support vector machines (SVMs) organized in a hybrid decision tree. OCR for printed Tamil text is considered a challenging problem due to the large number of characters (247) with complicated structures, the similarity between characters, and the variety of font styles. The proposed decision tree is a binary rooted directed acyclic graph (DAG) succeeded by unbalanced decision trees (UDTs): the DAG implements one-versus-one (OVO) SVMs, whereas the UDTs implement one-versus-all (OVA) SVMs. Each node of the hybrid tree exploits an optimal feature subset for classifying Tamil characters; the features used are basic, density, histogram of oriented gradients (HOG), and transition features. Experiments carried out on a dataset of 12,400 samples showed a recognition rate of 98.80% with the hybrid DAG-UDT SVM approach using an RBF kernel.
Chen et al. (2011) presented a text detection algorithm that employs edge-enhanced Maximally Stable Extremal Regions as basic letter candidates. These candidates are filtered using geometric and stroke-width information to exclude non-text objects; letters are then paired to identify text lines, which are subsequently separated into words. The system was evaluated on the ICDAR competition dataset and the authors' mobile document database, and the experimental results demonstrate its excellent performance.
Soumya and Kumar (2015) proposed a system to assist an epigraphist in reading and deciphering inscriptions. The automation steps include preprocessing, segmentation, feature extraction, and recognition. Preprocessing involves enhancement of degraded ancient document images through spatial filtering methods, followed by binarization of the enhanced image. Segmentation is carried out using drop-fall and water-reservoir approaches to obtain sampled characters. Gabor and zonal features are then extracted from the sampled characters and stored as feature vectors for training. An artificial neural network (ANN) is trained with these feature vectors and later used for classification of new test characters; finally, the classified characters are mapped to their modern forms. The system showed good results when tested on nearly 150 samples of ancient Kannada epigraphs from the Ashoka and Hoysala periods, with average recognition accuracies of 80.2% for the Ashoka period and 75.6% for the Hoysala period.
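The zonal features used above can be sketched as ink densities over a grid of zones; the 2x2 grid and the toy binary character are illustrative assumptions, and the Gabor features are not reproduced here:

```python
def zone_densities(image, zones=(2, 2)):
    """Ink density of each zone after splitting the image into a grid."""
    h, w = len(image), len(image[0])
    zr, zc = zones
    feats = []
    for i in range(zr):
        for j in range(zc):
            r0, r1 = i * h // zr, (i + 1) * h // zr
            c0, c1 = j * w // zc, (j + 1) * w // zc
            ink = sum(image[r][c] for r in range(r0, r1) for c in range(c0, c1))
            feats.append(ink / ((r1 - r0) * (c1 - c0)))
    return feats

# A diagonal two-block character: ink fills the top-left and bottom-right zones.
char = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
print(zone_densities(char))  # [1.0, 0.0, 0.0, 1.0]
```

The resulting fixed-length vector is what gets concatenated with the other features and fed to the ANN for training.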
2.4 SUMMARY
The conversion of handwritten ancient characters is important for making historically significant documents, such as manuscripts, machine-editable so that they can be easily accessed and preserved. Independent work is ongoing in Optical Character Recognition, i.e., the processing of printed or computer-generated documents, and in handwritten character recognition, i.e., the processing of handwritten, manually created documents. Many OCR systems are available for handling printed and handwritten documents of modern form with reasonable levels of accuracy; however, there are few reported efforts at developing OCR systems for ancient Indian scripts. This chapter discussed several efficient approaches to recognizing and preserving ancient historical character documents, and analyzed the different character recognition systems enabled by image-processing-based OCR.
References:
- Alajlan, N., El Rube, I., Kamel, M. S., & Freeman, G. (2007). Shape retrieval using triangle-area representation and dynamic space warping. Pattern Recognition, 40(7), 1911-1920.
- Alemdar, H., & Ersoy, C. (2017). Multi-resident activity tracking and recognition in smart environments. Journal of Ambient Intelligence and Humanized Computing, 8(4), 513-529.
- Angadi, S. A., & Kodabagi, M. (2014). A robust segmentation technique for line, word and character extraction from Kannada text in low resolution display board images. International Journal of Image and Graphics, 14(01n02), 1450003.
- Anupama, N., Rupa, C., & Reddy, E. S. (2013). Character segmentation for Telugu image document using multiple histogram projections. Global Journal of Computer Science and Technology.
- Arica, N., & Vural, F. T. Y. (2003). BAS: a perceptual shape descriptor based on the beam angle statistics. Pattern Recognition Letters, 24(9-10), 1627-1639.
- Asaari, M. S. M., Tebal, N., & Rosdi, B. A. A Single Finger Geometry Recognition Based On Widths And Fingertip Angles (WFTA).
- Bajaj, R., Dey, L., & Chaudhury, S. (2002). Devnagari numeral recognition by combining decision of multiple connectionist classifiers. Sādhanā, 27(1), 59-72.
- Bannigidad, P., & Gudada, C. (2019). Age-type identification and recognition of historical Kannada handwritten document images using HOG feature descriptors. In Computing, Communication and Signal Processing (pp. 1001-1010). Springer.
- Bansal, V., & Sinha, M. K. (2001, September). A complete OCR for printed Hindi text in Devanagari script. In Proceedings of Sixth International Conference on Document Analysis and Recognition (pp. 0800-0800). IEEE Computer Society.
- Bansal, V., & Sinha, R. (1999). On how to describe shapes of Devanagari characters and use them for recognition. Paper presented at the Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR’99 (Cat. No. PR00318).
- Belagali, N., & Angadi, S. A. (2016). OCR for handwritten Kannada language script. J. Recent Trends Eng. Res. (IJRTER), 2(08), 190-197.
- Boudraa, O., Hidouci, W. K., & Michelucci, D. (2020). Using skeleton and Hough transform variant to correct skew in historical documents. Mathematics and Computers in Simulation, 167, 389-403.
- Chen, H., Tsai, S. S., Schroth, G., Chen, D. M., Grzeszczuk, R., & Girod, B. (2011, September). Robust text detection in natural images with edge-enhanced maximally stable extremal regions. In 2011 18th IEEE International Conference on Image Processing (pp. 2609-2612). IEEE.
- Chen, Y., & Wang, L. (2017). Broken and degraded document images binarization. Neurocomputing, 237, 272-280.
- Cheung, A., Bennamoun, M., & Bergmann, N. W. (2001). An Arabic optical character recognition system using recognition-based segmentation. Pattern Recognition, 34(2), 215-233.
- Choudhary, A. (2014). A review of various character segmentation techniques for cursive handwritten words recognition. J. Inf. Comput. Technol, 4(6), 559-564.
- Choudhary, A., Rishi, R., & Ahlawat, S. (2013). A new character segmentation approach for off-line cursive handwritten words. Procedia Computer Science, 17, 88-95.
- Daliri, M. R., & Torre, V. (2008). Robust symbolic representation for shape recognition and retrieval. Pattern Recognition, 41(5), 1782-1798.
- Deshpande, P. S., Malik, L. G., & Arora, S. (2008). Fine Classification & Recognition of Hand Written Devnagari Characters with Regular Expressions & Minimum Edit Distance Method. JCP, 3(5), 11-17.
- Devi, K. D., & Maheswari, P. U. (2019). Digital acquisition and character extraction from stone inscription images using modified fuzzy entropy-based adaptive thresholding. Soft Computing, 23(8), 2611-2626.
- Gaurav, D. D., & Ramesh, R. (2012). A feature extraction technique based on character geometry for character recognition. arXiv preprint arXiv:1202.3884.
- Hsu, G.-S., Chen, J.-C., & Chung, Y.-Z. (2012). Application-oriented license plate recognition. IEEE Transactions on Vehicular Technology, 62(2), 552-561.
- Iqbal, A., ABM, M., Tahsin, A., Sattar, M. A., Islam, M. M., & Murase, K. (2008). A novel algorithm for translation, rotation and scale invariant character recognition. Paper presented at SCIS & ISIS 2008.
- Jia, F., Shi, C., He, K., Wang, C., & Xiao, B. (2018). Degraded document image binarization using structural symmetry of strokes. Pattern Recognition, 74, 225-240.
- Kapoor, S., & Verma, V. (2014). Fragmentation of handwritten touching characters in Devanagari script. International Journal of Information Technology, Modeling and Computing (IJITMC), 2, 11-21.
- Kavitha, A. S., Shivakumara, P., Kumar, G. H., & Lu, T. (2016). Text segmentation in degraded historical document images. Egyptian Informatics Journal, 17(2), 189-197.
- Khan, A. R., & Mohammad, Z. (2008). A simple segmentation approach for unconstrained cursive handwritten words in conjunction with the neural network. International Journal of Image Processing, 2(3), 29-35.
- Kunte, R. S., & Samuel, R. S. (2007, December). An OCR system for printed Kannada text using two-stage Multi-network classification approach employing Wavelet features. In International Conference on Computational Intelligence and Multimedia Applications (ICCIMA 2007) (Vol. 2, pp. 349-353). IEEE.
- Lee, Y.-H., & Bang, S.-I. (2019). Improved image retrieval and classification with combined invariant features and color descriptor. Journal of Ambient Intelligence and Humanized Computing, 10(6), 2255-2264.
- Lehal, G. S., & Singh, C. (1999). Feature extraction and classification for OCR of Gurmukhi script. Vivek, 12(2), 2-12.
- Li, J.-B., Li, M., Pan, J.-S., Chu, S.-C., & Roddick, J. F. (2015). Gabor-based kernel self-optimization Fisher discriminant for optical character segmentation from text-image-mixed document. Optik, 126(21), 3119-3124.
- Li, M., Bai, M., Wang, C., & Xiao, B. (2010). Conditional random field for text segmentation from images with complex background. Pattern Recognition Letters, 31(14), 2295-2308.
- Lozano-Monasor, E., López, M. T., Vigo-Bustos, F., & Fernández-Caballero, A. (2017). Facial expression recognition in ageing adults: from lab to ambient assisted living. Journal of Ambient Intelligence and Humanized Computing, 8(4), 567-578.
- Lu, D., Huang, X., & Sui, L. (2018). Binarization of degraded document images based on contrast enhancement. International Journal on Document Analysis and Recognition (IJDAR), 21(1-2), 123-135.
- Madhavaraj, A., Ramakrishnan, A., Kumar, H. S., & Bhat, N. (2014). Improved recognition of aged Kannada documents by effective segmentation of merged characters. Paper presented at the 2014 International Conference on Signal Processing and Communications (SPCOM).
- Mahmud, J. U., Raihan, M. F., & Rahman, C. M. (2003). A complete OCR system for continuous Bengali characters. Paper presented at the TENCON 2003 Conference on Convergent Technologies for the Asia-Pacific Region.
- Mathivanan, P., Ganesamoorthy, B., & Maran, P. (2014). Watershed algorithm based segmentation for handwritten text identification. ICTACT Journal on Image and Video Processing, 4(3), 767.
- Murthy, K. S., Kumar, G. H., Kumar, P., & Ranganath, P. (2004). Nearest neighbor clustering based approach for line and character segmentation in epigraphical scripts. Paper presented at the International Conference on Cognitive Systems, New Delhi.
- Narang, S., Jindal, M., & Kumar, M. (2019). Devanagari ancient documents recognition using statistical feature extraction techniques. Sādhanā, 44(6), 141.
- Narang, S. R., Jindal, M., Ahuja, S., & Kumar, M. (2020). On the recognition of Devanagari ancient handwritten characters using SIFT and Gabor features. Soft Computing.
- Narang, S. R., Jindal, M., & Kumar, M. (2020). Ancient text recognition: a review. Artificial Intelligence Review, 1-42.
- Naveena, C., & Aradhya, V. M. (2012). Handwritten character segmentation for Kannada scripts. Paper presented at the 2012 World Congress on Information and Communication Technologies.
- Negi, A., Bhagvati, C., & Krishna, B. (2001). An OCR system for Telugu. Paper presented at the Proceedings of Sixth International Conference on Document Analysis and Recognition.
- Neumann, L., & Matas, J. (2015). Real-time lexicon-free scene text localization and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(9), 1872-1885.
- Nikolaou, N., Makridis, M., Gatos, B., Stamatopoulos, N., & Papamarkos, N. (2010). Segmentation of historical machine-printed documents using adaptive run length smoothing and skeleton segmentation paths. Image and Vision Computing, 28(4), 590-604.
- Pal, U., & Tripathy, N. (2009). A contour distance-based approach for multi-oriented and multi-sized character recognition. Sādhanā, 34(5), 755.
- Pujari, A. K., Naidu, C. D., & Jinaga, B. (2002). An Adaptive Character Recognizer for Telugu Scripts Using Multiresolution Analysis, Associative Memory. Paper presented at the ICVGIP.
- Rahiche, A., Hedjam, R., Al-maadeed, S., & Cheriet, M. (2020). Historical documents dating using multispectral imaging and ordinal classification. Journal of Cultural Heritage.
- Raja, S., & John, M. (2013). A novel Tamil character recognition using decision tree classifier. IETE Journal of Research, 59(5), 569-575.
- Ramanan, M., Ramanan, A., & Charles, E. Y. A. (2015, August). A hybrid decision tree for printed Tamil character recognition using SVMs. In 2015 Fifteenth International Conference on Advances in ICT for Emerging Regions (ICTer) (pp. 176-181). IEEE.
- Saba, T., Rehman, A., & Zahrani, S. (2014). Character segmentation in overlapped script using benchmark database. Computers, automatic control, signal processing and systems science, 140-143.
- Sabeenian, R., Paramasivam, M., Anand, R., & Dinesh, P. (2019). Palm-leaf manuscript character recognition and classification using convolutional neural networks. In Computing and Network Sustainability (pp. 397-404). Springer.
- Sagar, B. M., Shobha, G., & Kumar, P. R. (2008, December). Complete Kannada Optical Character Recognition with syntactical analysis of the script. In 2008 International Conference on Computing, Communication and Networking (pp. 1-4). IEEE.
- Sardar, S., & Wahab, A. (2010, June). Optical character recognition system for Urdu. In 2010 International Conference on Information and Emerging Technologies (pp. 1-5). IEEE.
- Shadkami, P., & Bonnier, N. (2010, December). Watershed based document image analysis. In International Conference on Advanced Concepts for Intelligent Vision Systems (pp. 114-124). Springer, Berlin, Heidelberg.
- Shaheen, A. M., Sheltami, T. R., Al-Kharoubi, T. M., & Shakshuki, E. (2019). Digital image encryption techniques for wireless sensor networks using image transformation methods: DCT and DWT. Journal of Ambient Intelligence and Humanized Computing, 10(12), 4733-4750.
- Sharma, D., & Jhajj, P. (2010). Recognition of isolated handwritten characters in Gurmukhi script. International Journal of Computer Applications, 4(8), 9-17.
- Silva, C., & Kariyawasam, C. (2014). Segmenting Sinhala handwritten characters. International Journal of Conceptions on Computing and Information Technology, 2(4), 22-26.
- Singh, P., & Budhiraja, S. (2011). Feature extraction and classification techniques in OCR systems for handwritten Gurmukhi Script–a survey. International Journal of Engineering Research and Applications (IJERA), 1(4), 1736-1739.
- Singh, P. K., Sarkar, R., Bhateja, V., & Nasipuri, M. (2018). A comprehensive handwritten Indic script recognition system: a tree-based approach. Journal of Ambient Intelligence and Humanized Computing, 1-18.
- Soumya, A., & Kumar, G. H. (2015). Recognition of historical records using Gabor and zonal features. Signal and Image Processing: An International Journal, 6(4), 57-69.
- Sridevi, N., & Subashini, P. (2012). Segmentation of text lines and characters in ancient Tamil script documents using computational intelligence techniques. International Journal of Computer Applications, 52(14).
- Vamvakas, G., Gatos, B., Stamatopoulos, N., & Perantonis, S. J. (2008, September). A complete optical character recognition methodology for historical documents. In 2008 The Eighth IAPR International Workshop on Document Analysis Systems (pp. 525-532). IEEE.
- Van Phan, T., Zhu, B., & Nakagawa, M. (2011). Development of Nom character segmentation for collecting patterns from historical document pages. Paper presented at the Proceedings of the 2011 Workshop on Historical Document Imaging and Processing.
- Vo, Q. N., Kim, S. H., Yang, H. J., & Lee, G. (2018). Binarization of degraded document images based on hierarchical deep supervised network. Pattern Recognition, 74, 568-586.
- Xie, Z., Huang, Y., Jin, L., Liu, Y., Zhu, Y., Gao, L., & Zhang, X. (2019). Weakly supervised precise segmentation for historical document images. Neurocomputing, 350, 271-281.