Web Exclusive: The Good and Bad of Scanned Images
A high-quality image is important to any application that uses scanning technology, including IcoMap (see the full review in the August issue of POB).
A scanned document consists of a series of dots (called pixels) that represent the lines and text on the original. The resolution of a scanned image is the number of pixels per inch. Thus, a 24 x 36" drawing scanned at 400 dpi (dots per inch) measures 9,600 x 14,400 pixels, for a total of 138,240,000 pixels in the file. In a pure black and white image, each pixel is either on or off and can be represented by a single bit. In a color scan, each pixel is represented by 24 bits, which allows it to be any of 16,777,216 colors.
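The arithmetic above can be checked with a short sketch. The function names are illustrative only, and the byte counts assume an uncompressed image; real scan formats usually compress, so actual files will be smaller.

```python
def pixel_count(width_in, height_in, dpi):
    """Total pixels in a scan of a width_in x height_in inch sheet at the given dpi."""
    return (width_in * dpi) * (height_in * dpi)


def uncompressed_bytes(pixels, bits_per_pixel):
    """Raw storage needed before any compression (an assumption; real files compress)."""
    return (pixels * bits_per_pixel + 7) // 8  # round up to whole bytes


pixels = pixel_count(24, 36, 400)          # 9,600 x 14,400 pixels
bw_bytes = uncompressed_bytes(pixels, 1)   # pure black and white: 1 bit per pixel
color_bytes = uncompressed_bytes(pixels, 24)  # color: 24 bits per pixel
color_values = 2 ** 24                     # distinct colors a 24-bit pixel can take

print(f"{pixels:,} pixels")
print(f"{bw_bytes:,} bytes as black and white")
print(f"{color_bytes:,} bytes as 24-bit color")
print(f"{color_values:,} possible colors")
```

Before compression, the color scan of the same sheet needs 24 times the storage of the black and white scan.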
The quality of a scanned image depends on four basic elements: the quality of the source document, the precision of the scanner, the skill of the operator and the resolution at which the document was scanned.
Scanning software has various parameters that can be used to improve the quality of the scanned image. Experienced operators can usually get the best image with little trial and error, although this is not always possible. Since each original has unique qualities, batch scanning of a large number of documents won't produce the best results.
The best way to obtain a usable scanned image is to start with a clean drawing with sharp lines and lettering. A white background is preferred. Any dirt or other visible variations from the background will result in speckles on the scanned image. Lines and lettering should be of an even darkness and width. Even the smallest speck can result in conversion problems. Although there are various options to remove speckles, a clean original is a better solution.
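As a rough illustration of what a speckle-removal option does, the sketch below deletes any isolated group of black pixels smaller than a chosen size. It is a generic technique, not the method used by any particular scanning package, and the function name, the `min_size` parameter and the list-of-lists image layout (1 = black, 0 = white) are all assumptions made for the example.

```python
from collections import deque


def remove_speckles(image, min_size=3):
    """Return a copy of a black and white image with small black blobs erased.

    image is a list of rows; each pixel is 1 (black) or 0 (white).
    Blobs are groups of black pixels that touch, including diagonally.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    cleaned = [row[:] for row in image]
    seen = [[False] * cols for _ in range(rows)]

    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Collect every black pixel connected to this one.
            blob = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Erase the blob if it is too small to be a real line or letter.
            if len(blob) < min_size:
                for y, x in blob:
                    cleaned[y][x] = 0
    return cleaned


scan = [
    [1, 1, 1, 0, 0],   # a short line segment
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],   # a single stray speck
]
cleaned = remove_speckles(scan, min_size=2)
```

A filter like this cannot tell a speck of dirt from a small legitimate mark such as a decimal point, which is one reason a clean original is the better solution.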
Text should be uniform, with even spacing between the characters. Characters should not touch each other or other elements in the drawing. Cursive or other fancy lettering is hard to recognize. Hand lettering, if neat and uniform, can be recognized with a fair degree of success, but if rather crude, like mine, success is limited. Broken text, which is sometimes used to show adjoiner information, cannot be recognized.
Documents should be scanned at 300 dpi or greater. UCLID Software (Madison, Wis.) recommends documents be scanned at 400 dpi. I found that if documents were clean and sharp, 300 dpi gave good results.
While Mathematical Content Recognition (MCR) can recognize text at any angle, optical character recognition (OCR) and intelligent character recognition (ICR) rely on the text being horizontal, so documents should be placed as square as possible on the scanner for good text recognition.
It is important to scan documents as pure black and white. For the purpose of text recognition, color offers no value. Black and white scans produce smaller files, and if a scan is not pure black and white, IcoMap will try to convert it to one. I found this conversion to be time consuming and not very effective.
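To see why converting after the fact can disappoint, consider the simplest possible conversion: a fixed brightness cutoff. This is only a sketch of the general idea and makes no claim about how IcoMap performs its conversion; the function name, the cutoff of 128 and the 0 to 255 grayscale values are assumptions for the example.

```python
def to_black_and_white(gray, threshold=128):
    """Convert grayscale rows (0 = black, 255 = white) to 1 (black) / 0 (white)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


# One row of a scan: a dark ink line, a faint pencil line, clean paper, a gray smudge.
gray_row = [[20, 150, 250, 100]]
bw_row = to_black_and_white(gray_row)
```

With this cutoff the faint pencil line (150) disappears while the smudge (100) becomes solid black, so information is lost in both directions. Scanning as black and white in the first place lets the operator tune the scanner settings against the original instead.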