IET Image Processing, Vol. 1, Issue 1, pp. 67-84, 2007
An integrated system for the binarisation of normal and degraded printed documents, for the purpose of visualisation and recognition of text characters, is proposed. In degraded documents, where considerable background noise or variation in contrast and illumination exists, there are many pixels that cannot be easily classified as foreground or background pixels. For this reason, it is necessary to perform document binarisation by combining and taking into account the results of a set of binarisation techniques, especially for document pixels that have high vagueness. The proposed binarisation technique takes advantage of the benefits of a set of selected binarisation algorithms by combining their results using a Kohonen self-organising map neural network. In order to further improve the binarisation results, significant improvements are proposed for two of the most powerful document binarisation techniques used, that is, for the adaptive logical level technique and for the improvement of integrated function algorithm. The proposed binarisation technique is extensively tested with a variety of degraded documents. Several experimental and comparative results, demonstrating the performance of the proposed technique, are presented.
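The idea of combining several binarisation results with a Kohonen self-organising map can be illustrated with a minimal sketch. It is not the implementation described in the paper: the abstract does not specify the network size, the input features, the training schedule or the selected algorithms, so everything below is an assumption made for illustration. Each pixel is described by the vector of 0/1 decisions (1 taken as foreground) produced by the individual techniques, a one-dimensional map with two neurons is trained on these vectors, and the neuron with the higher mean weight is labelled as foreground. The function names `best_unit`, `train_som` and `combine_binarisations`, the diagonal initialisation and the linear decay schedules are all hypothetical.

```python
import math


def best_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )


def train_som(samples, epochs=20):
    """Train a two-neuron, one-dimensional Kohonen map on the sample vectors."""
    dim = len(samples[0])
    # Assumed deterministic start: both neurons on the diagonal of the unit cube.
    weights = [[0.25] * dim, [0.75] * dim]
    total = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        for x in samples:
            frac = step / total
            lr = 0.5 * (1.0 - frac) + 0.01    # learning rate decays towards 0.01
            sigma = 1.0 * (1.0 - frac) + 0.1  # neighbourhood width shrinks towards 0.1
            winner = best_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood on the 1-D grid of neuron indices.
                h = math.exp(-((j - winner) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def combine_binarisations(binary_maps, epochs=20):
    """Fuse equally sized 0/1 maps (lists of rows) into a single 0/1 map."""
    rows, cols = len(binary_maps[0]), len(binary_maps[0][0])
    # One sample per pixel: the decisions of all techniques at that pixel.
    samples = [
        [float(m[r][c]) for m in binary_maps]
        for r in range(rows)
        for c in range(cols)
    ]
    weights = train_som(samples, epochs)
    # Assumed labelling rule: the neuron with the higher mean weight is foreground.
    means = [sum(w) / len(w) for w in weights]
    foreground = max(range(len(weights)), key=lambda j: means[j])
    labels = [1 if best_unit(weights, x) == foreground else 0 for x in samples]
    return [labels[r * cols:(r + 1) * cols] for r in range(rows)]


if __name__ == "__main__":
    # Three hypothetical binarisation outputs that disagree on two pixels.
    a = [[1, 1, 0, 0], [1, 1, 0, 0]]
    b = [[1, 1, 0, 0], [1, 0, 0, 0]]
    c = [[1, 1, 1, 0], [1, 1, 0, 0]]
    for row in combine_binarisations([a, b, c]):
        print(row)
```

In this sketch the pixels on which all techniques agree dominate the training, so the two neurons settle near the all-background and all-foreground corners, and the vague pixels are assigned to whichever neuron lies closer to their decision vector. A real system would presumably use the outputs of actual binarisation algorithms as input maps and a larger map with a more careful labelling step.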