Improved Gabor Filtering Application in the Identification of Handwriting

In this paper, the handwriting image is regarded as a texture image. Its textural features are extracted by improved multi-channel 2-D Gabor filtering and, after processing, added to the feature database as the basis for handwriting identification. The method is content independent and therefore broadly applicable. Optimizing the parameters of the Gabor filters increases both the speed and the accuracy of identification. Experiments carried out on the VC++ 6.0 platform demonstrate the effectiveness of the algorithm.


Introduction
Handwriting identification is an effective means of distinguishing among identities. Compared with other means of identification it has significant advantages: handwriting is characterized by uniqueness, stability, collectability and non-invasiveness (Yang, Zi-Hua, 2004, pp. 67-79). The extraction of handwriting features is the core of a handwriting identification system. According to the object and the characteristics extracted, methods can be divided into two kinds, text-independent and text-dependent. Most current methods depend on the characters being identified (the most typical being signature verification) and therefore lack broad applicability. Handwriting identification is intended to capture a person's writing style, without much concern for the specific content of the writing. This paper puts forward a text-independent handwriting identification method based on the improved two-dimensional multi-channel Gabor transform. By means of Gabor filters, the method extracts the texture characteristics of the handwriting image at different frequencies and directions quickly and accurately.

The preprocessing of handwriting image
Because the handwriting image itself contains much noise, image preprocessing must be carried out to obtain a unified texture image and to prepare for feature extraction. The main preprocessing steps are as follows:
(1) Remove the background grid of the paper and convert the handwriting image to gray scale. First, the color depth of the handwriting bitmap is converted to 8 bits with the octree color quantization algorithm to facilitate computer processing. To eliminate the influence of paper type, background color, stains, grid lines and other mixed colors on the handwriting, the author designed a color-picking method that takes the color of the handwriting text (expressed in RGB); a threshold is then set, and colors that differ greatly from the text color are reset to the background color. Since the ink color of the handwriting has nothing to do with the writing style, the image can be converted to gray scale by the weighted average method to reduce the complexity of the system (He, Bin, 2002).
(2) Remove the noise and binarize the image. The random noise introduced when the image was scanned is eliminated by a filtering algorithm, and the handwriting image is then binarized so that it contains only the gray levels 0 and 255.
(3) Character segmentation and normalization. Project the handwriting image in the horizontal direction. The trough between two adjacent peaks of the projection curve corresponds to the gap between two lines, and the distance between two troughs corresponds to the character height of a line. The width of every character and the space between characters can be obtained by vertical projection of each line. A threshold set according to the regional average gray level is used to remove handwriting that is too sparse (random graffiti) or too crowded (stains). The size of each character is then calculated, together with its horizontal and vertical scaling factors relative to the standard size, so that the characters can be normalized conveniently while the spaces between words and the gaps between lines are removed.
[Computer and Information Science, May 2008, p. 91]
(4) Text block splicing. After the above preprocessing steps, the normalized handwriting image is spliced into blocks of 256*256 pixels. Splicing the characters also yields a unified texture image when the handwriting image itself contains only a few characters (for example, a signature or a criminal password).
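The line segmentation in step (3) can be sketched as follows. This is a minimal illustration, assuming a binarized image stored as a list of rows with 0 for ink and 255 for paper; the function names and the `min_ink` threshold are hypothetical and are not taken from the paper.

```python
def horizontal_projection(binary):
    """Count ink pixels (value 0) in every row of a binarized image."""
    return [sum(1 for px in row if px == 0) for row in binary]


def split_lines(binary, min_ink=1):
    """Return (top, bottom) row ranges whose projection reaches min_ink.

    Runs of rows below min_ink are treated as the troughs between lines.
    min_ink is an assumed threshold, not a value given in the paper.
    """
    profile = horizontal_projection(binary)
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y - 1))   # a trough ends the line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile) - 1))
    return lines


# Toy image, 7 rows by 6 columns: two text lines separated by a blank gap.
W, B = 255, 0
image = [
    [W, W, W, W, W, W],
    [W, B, B, W, B, W],
    [W, B, W, W, B, W],
    [W, W, W, W, W, W],
    [W, W, W, W, W, W],
    [B, B, W, B, B, W],
    [W, B, W, W, B, W],
]
line_ranges = split_lines(image)
```

Applying the same projection to the columns of each detected line would give the character widths and inter-character spaces in the same way.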

Extraction of handwriting features by Gabor filters
Texture analysis is an important image analysis method in the frequency domain. Texture is the regular distribution of gray levels among the elements of an image; it reflects the shape and mutual relationship of the texture elements. A handwriting image may be regarded as being composed of a group of texture units, and statistics over the texture units of the whole image reflect the writer's handwriting characteristics. The Gabor transform used in texture analysis overcomes the lack of joint time-frequency localization of the Fourier transform. It has been recognized as one of the best signal representation methods in communication and signal processing, particularly for image representation, and has been applied in many areas of image processing (Shen, Cong, 2002, pp. 20-25).
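The form of the Gabor filter commonly used in the extraction of image texture features is (Wang, Yun-Hong, 2001, pp. 229-234):

$$h_e(x, y) = g(x, y, \sigma)\cos[2\pi f(x\cos\theta + y\sin\theta)]$$
$$h_o(x, y) = g(x, y, \sigma)\sin[2\pi f(x\cos\theta + y\sin\theta)]$$

where $h_e$ and $h_o$ denote the even and odd Gabor filters, and $f$, $\theta$ and $\sigma$ are the three important parameters of the Gabor filter: spatial frequency, phase and space constant. $g(x, y, \sigma)$ is the Gaussian function:

$$g(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$$

The frequency-domain form of this Gabor filter is:

$$H_e(u, v) = \frac{H_1(u, v) + H_2(u, v)}{2}, \qquad H_o(u, v) = \frac{H_1(u, v) - H_2(u, v)}{2j}$$
$$H_1(u, v) = \exp\{-2\pi^2\sigma^2[(u - f\cos\theta)^2 + (v - f\sin\theta)^2]\}$$
$$H_2(u, v) = \exp\{-2\pi^2\sigma^2[(u + f\cos\theta)^2 + (v + f\sin\theta)^2]\}$$

The purpose of the Gabor transform is to analyse the handwriting image at multiple scales and in multiple directions, exploiting the multi-resolution property of the two-dimensional Gabor wavelet. A number of filter coefficients are extracted from the sub-planes as statistical features. By choosing a group of different filter parameters, output images on different channels are obtained, which is equivalent to expanding the original image on different basis functions (Liu, Cheng-Lin, 1997, pp. 56-63).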
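As an illustration of the even and odd filters, the sketch below samples a Gaussian envelope multiplied by a cosine or sine carrier on a square grid. It is only a sketch under stated assumptions: the unit-area Gaussian normalization, the kernel half-size, the use of cycles per pixel for f, and the name `gabor_pair` are all choices made here, not details given in the paper.

```python
import math


def gabor_pair(f, theta, sigma, half_size):
    """Return (even, odd) kernels on a (2*half_size + 1)^2 grid.

    Assumed units: f in cycles per pixel, theta in radians, sigma in pixels.
    """
    even, odd = [], []
    for y in range(-half_size, half_size + 1):
        row_e, row_o = [], []
        for x in range(-half_size, half_size + 1):
            # Isotropic Gaussian envelope (unit-area normalization assumed).
            g = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            g /= 2.0 * math.pi * sigma ** 2
            # Carrier phase along the direction theta.
            phase = 2.0 * math.pi * f * (x * math.cos(theta) + y * math.sin(theta))
            row_e.append(g * math.cos(phase))   # even (cosine) filter
            row_o.append(g * math.sin(phase))   # odd (sine) filter
        even.append(row_e)
        odd.append(row_o)
    return even, odd


even, odd = gabor_pair(f=0.125, theta=math.pi / 4, sigma=4.0, half_size=8)
```

The even kernel is symmetric about its center and the odd kernel is antisymmetric, so the odd kernel sums to zero over the grid.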
Suppose the texture image is described by $f(x, y)$ and its Fourier transform by $F(u, v)$, and the filter by $h(x, y)$ with Fourier transform $H(u, v)$. We record the complex response of the filter output image as $q(x, y)$, that is:

$$q(x, y) = f(x, y) * h(x, y) = \mathcal{F}^{-1}[F(u, v)\,H(u, v)]$$

where $*$ denotes convolution. The mean and the variance of the power matrix of each channel are taken as the handwriting texture features.
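The per-channel features can be sketched as below. The text does not define the power matrix, so this example assumes it is the squared magnitude of the response, that is the even response squared plus the odd response squared; it also uses direct spatial convolution over the fully overlapped region instead of the Fourier-domain product. The function names and the toy kernels are hypothetical.

```python
def convolve_valid(image, kernel):
    """Direct 2-D convolution, keeping only fully overlapped positions."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(len(image) - kh + 1):
        row = []
        for x in range(len(image[0]) - kw + 1):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    # Flip the kernel: this is convolution, not correlation.
                    acc += image[y + j][x + i] * kernel[kh - 1 - j][kw - 1 - i]
            row.append(acc)
        out.append(row)
    return out


def channel_features(image, even, odd):
    """Mean and variance of the assumed channel power q_e^2 + q_o^2."""
    qe = convolve_valid(image, even)
    qo = convolve_valid(image, odd)
    power = [e * e + o * o
             for row_e, row_o in zip(qe, qo)
             for e, o in zip(row_e, row_o)]
    mean = sum(power) / len(power)
    var = sum((p - mean) ** 2 for p in power) / len(power)
    return mean, var


# Toy check with trivial kernels (an identity "even" kernel and a zero
# "odd" kernel), chosen only so that the result is easy to verify by hand.
image = [[1, 2, 3, 4] for _ in range(4)]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
mean_power, var_power = channel_features(image, identity, zero)
```

With real Gabor kernels in place of the toy ones, each channel contributes one mean and one variance to the feature vector.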

The choice of filter parameters
A Gabor filter is mainly determined by three parameters: the direction $\theta$, the radial center frequency $f$ and the space constant $\sigma$. It is a band-pass filter centered at $(f, \theta)$: the wavelength of the filter is determined by $f$, the direction of the Gabor kernel function by $\theta$, and $\sigma$ is the standard deviation of the Gaussian envelope in the x and y directions. The value of $\sigma$ is inversely proportional to the filter center frequency; this paper sets $\sigma = 2\pi / f$. By selecting different parameters, a group of Gabor filters is obtained that forms a non-orthogonal basis, and expanding the signal on this basis yields frequency-domain information at different frequencies and phases.
Because the Gabor filter is conjugate symmetric in the frequency domain, the direction parameter only needs to vary within 0°~180°. Four values of the phase parameter $\theta$ were chosen in the literature [4]: $0$, $\pi/4$, $\pi/2$ and $3\pi/4$. The center frequency of the filter and the extracted features are related to the scale of the texture. For an image of size $N \times N$, the empirical choice of center frequency is $f \le N/2$. The lower the center frequency, the larger the scale of the texture analysis and the more weakly it reflects changes in handwriting characteristics.
After preprocessing, the unified handwriting texture image is obtained. For each phase $\theta$ the center frequency $f$ of the Gabor filter is chosen as 4, 8, 16, 32, 64 and 128, giving 24 Gabor filtering channels for the 4 direction angles, each with its own frequency and direction. Filtering the handwriting image yields 24 groups of output coefficients. Taking the mean (M) and the variance (S) of the power matrix of each channel output as feature values gives a 48-dimensional feature vector.
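The channel count and feature dimension stated above can be checked with a short enumeration. The frequencies and angles are those given in the text; the names `channels` and `feature_vector` and the placeholder statistics are illustrative only.

```python
import math
from itertools import product

FREQUENCIES = [4, 8, 16, 32, 64, 128]                       # center frequencies f
ORIENTATIONS = [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4]  # angles theta

# One channel per (frequency, orientation) pair.
channels = [(f, theta) for theta, f in product(ORIENTATIONS, FREQUENCIES)]


def feature_vector(channel_stats):
    """Flatten per-channel (mean, variance) pairs into one vector."""
    return [value for mean_var in channel_stats for value in mean_var]


# Placeholder statistics, one (M, S) pair per channel, used only to show
# the resulting dimension; real values come from the filtered images.
placeholder_stats = [(0.0, 0.0)] * len(channels)
vector = feature_vector(placeholder_stats)
```

For the 256*256 blocks used in the paper, every listed frequency satisfies the empirical bound f ≤ N/2 = 128.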

The improvement of multi-channel Gabor filter
The non-orthogonality of the Gabor filters means that the filtered images contain redundant information, which affects both speed and discrimination capability. Based on the above experiments, this article proposes the following strategy for selecting the best multi-channel Gabor filter group, and experiments prove the feasibility of the method.
Suppose there are H sample images belonging to C categories, and record the sample images as $\{I_t\},\ t = 1, 2, \ldots, H$. The Gabor features of each sample image can then be written as $x_{ij}^{t}$, where $i$ denotes the category of the sample image, $j$ the channel of the filter and $t$ the index of the sample; $\{x_{ij}^{t}\}$ is the output of all filters. Assume that only the j-th feature is preserved, and calculate for it the within-class dispersion $S_W$ and the between-class dispersion $S_B$ over the categories (Bian, Zhao-Qi, 2002). The formulas are:

$$S_W^{(j)} = \sum_{i=1}^{C} \sum_{t \in i} \left(x_{ij}^{t} - m_{ij}\right)^2, \qquad S_B^{(j)} = \sum_{i=1}^{C} \left(m_{ij} - m_j\right)^2$$

where $m_{ij}$ is the mean of the j-th feature value over all the samples belonging to category $i$, and $m_j$ is the average of the j-th feature value over all samples.
The criterion

$$R_j = S_B^{(j)} / S_W^{(j)}$$

is used to judge the classification ability of each feature value for the different kinds of handwriting image, and the channels corresponding to the several largest values of $R_j$ are chosen to filter the handwriting image.
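The selection strategy can be illustrated with a ratio of between-class to within-class scatter computed separately for every feature. This is a sketch under assumptions: the unweighted scatter definitions, the function names and the toy data are chosen here for illustration, and the code ranks individual features without mapping them back to filter channels.

```python
def fisher_ratio(values_by_class):
    """Between-class over within-class scatter for a single feature."""
    all_values = [v for vals in values_by_class for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    s_b = s_w = 0.0
    for vals in values_by_class:
        class_mean = sum(vals) / len(vals)
        s_b += (class_mean - grand_mean) ** 2              # between classes
        s_w += sum((v - class_mean) ** 2 for v in vals)    # within the class
    return s_b / s_w if s_w > 0 else float("inf")


def select_features(samples_by_class, keep):
    """Indices of the `keep` features with the largest ratio, plus all ratios."""
    n_features = len(samples_by_class[0][0])
    ratios = [
        fisher_ratio([[s[j] for s in cls] for cls in samples_by_class])
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: ratios[j], reverse=True)
    return ranked[:keep], ratios


# Toy data (invented for illustration): 2 writers, 3 samples each, 3 features.
toy = [
    [[1.0, 5.0, 0.2], [1.2, 4.0, 0.1], [0.8, 6.0, 0.3]],
    [[3.0, 5.5, 0.2], [3.2, 4.5, 0.3], [2.8, 6.5, 0.1]],
]
best, ratios = select_features(toy, keep=1)
```

In the toy data only the first feature separates the two writers well, so it receives by far the largest ratio and is the one retained.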
After the best Gabor filter group has been chosen, the handwriting image is band-pass filtered with it, and the mean and variance of the output images are used to identify the handwriting. This optimization reduces both the number of Gabor filter channels and the time needed to compare features, and it also enhances the reliability of identification. For the experiments in this article, 10 handwriting samples were collected from each of 25 individuals, and a group composed of the best 12 multi-channel Gabor filters was determined. The parameters of these filters are shown in Table 1.

The classification of handwriting image and experimental analysis
For simplicity, a weighted Euclidean distance classifier is used. The feature vector of the unknown handwriting is compared with those of the trained samples, and the input handwriting is assigned to category k only when its feature vector has the minimum weighted Euclidean distance (WED) to category k. The WED is calculated with the formula:

$$WED(k) = \sum_{i=1}^{N} \frac{\left(f_i - f_i^{(k)}\right)^2}{\left(\delta_i^{(k)}\right)^2}$$

where $f_i$ is the i-th feature of the unknown handwriting, $f_i^{(k)}$ and $(\delta_i^{(k)})^2$ denote the mean and the variance of the i-th feature of category k, and N is the total number of features of each sample.
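A minimal sketch of such a classifier follows, assuming each category is summarized by the per-feature mean and variance of its training vectors and that each squared difference is divided by the variance. The helper names, the small `eps` guard against zero variance and the toy vectors are assumptions for illustration.

```python
def train_class(samples):
    """Per-feature mean and variance over one writer's training vectors."""
    n = len(samples)
    columns = list(zip(*samples))
    means = [sum(col) / n for col in columns]
    variances = [
        sum((v - m) ** 2 for v in col) / n
        for col, m in zip(columns, means)
    ]
    return means, variances


def wed(x, means, variances, eps=1e-12):
    """Weighted Euclidean distance; eps (assumed) avoids division by zero."""
    return sum((xi - m) ** 2 / (v + eps)
               for xi, m, v in zip(x, means, variances))


def rank_writers(x, models):
    """Writer labels sorted from the nearest category to the farthest."""
    return sorted(models, key=lambda k: wed(x, *models[k]))


# Toy two-feature example (values invented for illustration).
models = {
    "Y1": train_class([[1.0, 10.0], [1.2, 12.0], [0.8, 8.0]]),
    "Y2": train_class([[2.0, 10.0], [2.2, 11.0], [1.8, 9.0]]),
}
ranking = rank_writers([1.1, 9.0], models)
```

Returning the full ranking rather than only the nearest category makes it easy to list the few most likely writers as candidates.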
In the experiment, the author sorts the checked materials according to the distance to the features of each category in the database, and chooses as candidates the writers whose handwriting images are nearest to the sample. In one experiment, 5 handwriting samples were collected from each of 12 individuals, four for training and one for identification. Because the algorithm calculates texture features of the overall image, the characters in the handwriting texture images can be completely different. A 256*256 image block with unified texture is taken from each handwriting image, and the Gabor filtering features of each category are obtained through the best multi-channel Gabor filter. The distance between the checked image and each classified texture image in the database is computed, the results are sorted, and the position of each person's handwriting image in the ranking is observed.
The author encodes the checked materials and the corresponding samples: for example, the first person's handwriting is classified as Y1 and the second person's as Y2, while the first person's checked material is named J1, the second person's J2, and so on. The three most likely categories are obtained according to the distance. Table 2 gives the handwriting identification results based on the multi-channel Gabor filter and on the best multi-channel Gabor filter respectively.
When the checked materials of the 12 candidates are matched against the classified samples with the multi-channel Gabor filtering algorithm, seven checked samples are ranked first (58.3%) and ten are ranked within the first two (83.3%). With the best multi-channel Gabor filtering algorithm proposed in this article, the results for the same materials are ten (83.3%) and eleven (91.7%), and the time used for feature extraction and classification is about half that of the former. This shows that the improved feature extraction algorithm is effective. The method is also a useful reference for other image processing problems, such as face recognition and vehicle license plate recognition.
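The reported percentages follow directly from the counts out of 12 checked samples, as the short calculation below shows. The counts are those stated in the text; the dictionary layout and the name `hit_rate` are illustrative.

```python
def hit_rate(hits, total):
    """Percentage of checked samples whose writer is within the top ranks."""
    return round(100.0 * hits / total, 1)


TOTAL = 12  # checked samples, one per writer
results = {
    "multi-channel": {"top1": 7, "top2": 10},
    "best multi-channel": {"top1": 10, "top2": 11},
}
rates = {
    name: {rank: hit_rate(count, TOTAL) for rank, count in counts.items()}
    for name, counts in results.items()
}
```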