Digitizing Artifacts

Deciding what to digitize.
Before beginning a digitization project you should think about the following issues:
Availability: Can you get permission to digitize? You need permission from the owner of the artifact and any copyright or other rights holder.
Condition and size: How fragile is it? Is it too big or awkward to fit on a scanner or hold for a photograph?
Historical significance: Will it be useful to others studying an historic location, event, person or time period? What impact did it have?
Uniqueness: Is there already a digitized version available? Will you offer it in a higher resolution format with fewer restrictions? Is your copy in better condition with fewer visible blemishes? Sometimes poor condition can be a plus: a book with a broken binding is easy to scan flat.

Scanning works best for flat objects and books with bindings that lie flat. You can achieve high resolutions that show details difficult to see at actual size. Scanners with a greater depth of field (the distance from the scanner glass at which the image is still sharp) can scan books and artifacts that do not lie flat or are three-dimensional. Be sure to clean the glass regularly. I have been using an Epson Perfection 1650 scanner, which has good resolution and scanning speed and is fairly quiet when used in a library. I also regularly use a discontinued HP Scanjet 4600/4670. Its innovative design allows you to place the flat scanning stage on top of artifacts and see through the glass stage to position each scan accurately. It has a very narrow depth of field and poorly designed scanning software but cannot be matched (at consumer prices) for scanning large flat objects. Third-party scanning software such as VueScan allows you to use older scanners on newer operating systems with a consistent interface.

Large format sheet-feed scanners allow you to scan large drawings easily. I have been using a Colortrac SmartLF SC 25 (25" width by any length) to scan railroad valuation maps and other large documents without the need to stitch them together from individual 8.5" x 11" scans. The scanner is light enough that I can move it easily to the location of the drawing collection. The SmartWorks Pro software is highly recommended for full control of the scanning process.

Scanning resolution, measured in dots per inch (dpi), is the number of pixels recorded per inch of the original. Computer screens display roughly 75-100 pixels per inch. Magnifying or printing an image requires a higher scanning resolution. For archival quality, you should scan at 400-600 dpi and save the files in TIFF format. Original photos and negatives require higher resolutions to capture their finer detail. Store copies in multiple locations (hard disk, CD, DVD, tape backup, etc.). For limited-budget projects, you should decide whether a high-dpi but compressed format is acceptable for your purposes. Lossy formats such as JPEG have high compression ratios (smaller file sizes) but lose some information every time you edit and re-save them. Lossless compression such as TIFF with LZW is preferred.
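To get a feel for what a given resolution means in pixels and storage, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name scan_size is made up for this example, and it assumes 24-bit color (3 bytes per pixel) with no compression.

```python
def scan_size(width_in, height_in, dpi, bytes_per_pixel=3):
    """Return (width_px, height_px, uncompressed_bytes) for one scan.

    Assumes 24-bit color (3 bytes per pixel) unless told otherwise.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


# Compare a letter-size page at screen, print, and archival resolutions.
for dpi in (100, 300, 600):
    w, h, size = scan_size(8.5, 11, dpi)
    print(f"{dpi} dpi: {w} x {h} pixels, about {size / 1_000_000:.1f} MB uncompressed")
```

At 600 dpi a letter-size page comes to 5100 x 6600 pixels, roughly 101 MB (decimal megabytes) before compression, which is why storage and file format deserve some thought before a large project begins.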

Digital cameras can digitize three-dimensional objects but, except for the most expensive cameras, do not have the resolution of a scanner. Cameras in the 3-5 megapixel range may be useful for some projects. Use a tripod and experiment with focusing distance and lighting. Quality film cameras have high resolution but require a separate high-resolution scan of the film to digitize the image. You will need a film scanner or a flatbed scanner with a transparency adaptor. Inexpensive scanners, even with a transparency adaptor, generally do not scan small negatives and slides at a very high resolution.
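One way to compare a camera with a scanner is to work out how many pixels per inch the camera delivers when the subject fills the frame. The sketch below is a rough estimate only: the function name effective_ppi is hypothetical, and the 2592 x 1944 sensor size is an assumed figure for a typical 5 megapixel camera, not a measurement of any particular model.

```python
def effective_ppi(sensor_px, subject_in):
    """Pixels per inch across a subject that just fills the frame.

    sensor_px and subject_in are (width, height) pairs. Both are sorted
    into long and short sides so orientation does not matter.
    """
    long_px, short_px = max(sensor_px), min(sensor_px)
    long_in, short_in = max(subject_in), min(subject_in)
    # The tighter of the two sides limits the usable resolution.
    return min(long_px / long_in, short_px / short_in)


# Assumed 5 megapixel sensor photographing a letter-size page.
ppi = effective_ppi((2592, 1944), (8.5, 11))
print(f"about {ppi:.0f} pixels per inch")
```

Under these assumptions the page is captured at about 229 pixels per inch, well short of a 400-600 dpi archival scan, though the figure improves when the subject is smaller.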

Photos in printed documents (screened)
The printing process uses single color inks. Shades of gray or color are produced by printing larger or smaller dots, which the printer creates using a screen process. The dots sometimes cause a moiré pattern when the scan is viewed on a computer monitor. You can de-screen (blur) a printed photo using your scanning software or you can do it afterward in a graphics program. The latter is preferred because you keep an unblurred version, and you can separate any text area from the photo and de-screen only the photo. I use a Gaussian Blur filter with a 1 to 3 pixel radius, depending on the size of the screen, and then apply the Unsharp Mask filter with a similar radius.
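The blur-then-sharpen idea can be sketched in plain Python on a small grayscale grid. This is a teaching sketch, not what any graphics program does internally: the function names are made up, the radius is treated as the Gaussian standard deviation (an assumption), and a real project would use an image editor or an imaging library on actual files.

```python
import math


def gaussian_kernel(radius):
    """Normalized 1-D Gaussian weights; radius (> 0) is used as sigma (an assumption)."""
    sigma = float(radius)
    half = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_line(values, kernel):
    """Blur one row or column, repeating the edge pixel past the borders."""
    half = len(kernel) // 2
    last = len(values) - 1
    return [
        sum(w * values[min(max(i + k - half, 0), last)] for k, w in enumerate(kernel))
        for i in range(len(values))
    ]


def gaussian_blur(image, radius):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(radius)
    rows = [blur_line(row, kernel) for row in image]
    cols = [blur_line(col, kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def unsharp_mask(image, radius, amount=1.0):
    """Sharpen by adding back the difference between the image and its blur."""
    blurred = gaussian_blur(image, radius)
    return [
        [min(255.0, max(0.0, p + amount * (p - b))) for p, b in zip(row, blurred_row)]
        for row, blurred_row in zip(image, blurred)
    ]


def descreen(image, radius=2):
    """Blur away the screen dots, then sharpen with a similar radius."""
    return unsharp_mask(gaussian_blur(image, radius), radius)


# Hypothetical test pattern: a one-pixel checkerboard standing in for a fine screen.
halftone = [[255.0 if (x + y) % 2 else 0.0 for x in range(32)] for y in range(32)]
smoothed = descreen(halftone, radius=2)
print(round(smoothed[16][16], 1))
```

Away from the borders the checkerboard comes out as a nearly even mid-gray, which is the goal of de-screening: the dot pattern disappears while the overall tone is kept. The sharpening pass matters more on real photos, where it restores some of the edge contrast the blur removed.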

Oversize originals
If you overlap the scans, you can digitize a document of any size. Try to keep the document straight during each successive scan. You can straighten a scanned image by using the Photoshop Measure tool to find the angle to enter under Rotate Canvas, Arbitrary, or you can eyeball it with any free rotation tool. You can piece the sections together manually or use software such as Panavue Image Assembler.
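Two bits of arithmetic help when planning overlapped scans: how many passes are needed to cover the document, and how far a scan is rotated. The following Python sketch shows both. The function names and the example sizes (a 24" x 36" map, an 8.5" x 11.7" scanner bed, 1" of overlap) are assumptions chosen for illustration.

```python
import math


def passes_needed(doc_len_in, bed_len_in, overlap_in):
    """Number of overlapping scans needed to cover one dimension of a document."""
    if overlap_in >= bed_len_in:
        raise ValueError("overlap must be smaller than the scanner bed")
    if doc_len_in <= bed_len_in:
        return 1
    step = bed_len_in - overlap_in  # new area gained by each additional scan
    return 1 + math.ceil((doc_len_in - bed_len_in) / step)


def skew_degrees(x1, y1, x2, y2):
    """Angle, in degrees, of a line between two points on an edge that should be level."""
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


# Assumed example: a 24" x 36" map on an 8.5" x 11.7" bed with 1" of overlap.
across = passes_needed(24, 8.5, 1)
down = passes_needed(36, 11.7, 1)
print(f"{across} x {down} = {across * down} scans")

# Two points picked along an edge that drifts 10 pixels over 1000 pixels.
print(f"skew: {skew_degrees(0, 0, 1000, 10):.2f} degrees")
```

The sign of the angle depends on whether the image coordinates run downward, as they do in most editors, so check the direction on a copy before rotating the whole set.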

Optical Character Recognition (OCR)
Text is scanned as a graphic image along with any picture content. In order to edit or search the text it must be converted to typed characters. You can re-type the text or use OCR software. Depending on the quality of the original text, you will find the conversion is 90-99% accurate, so you will need to do some proofreading. When you post an OCR version of your document, it will be indexed by search engines and your browser's find-on-page search will work. A text version of the page also takes up less space and loads faster than the page image. You should decide whether to correct typos that appear in the original document so that searches will succeed. If you do, be sure to indicate that you have done so. If you also provide the original image, researchers can compare your converted text to the original.
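To see what an accuracy figure means for proofreading, you can estimate the number of errors on a page and measure how closely raw OCR output matches a corrected version. The Python sketch below uses only the standard library; the function names, the 2,000 character page, and the sample sentences are invented for illustration, and the matching method is one reasonable way to score accuracy, not the method any OCR package uses.

```python
from difflib import SequenceMatcher


def expected_errors(characters, accuracy):
    """Rough count of wrong characters on a page at a given accuracy (0 to 1)."""
    return round(characters * (1 - accuracy))


def character_accuracy(ocr_text, corrected_text):
    """Fraction of the corrected text's characters that the OCR output reproduced."""
    if not corrected_text:
        return 1.0
    matcher = SequenceMatcher(None, corrected_text, ocr_text, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(corrected_text)


# Assumed page of 2,000 characters at both ends of the 90-99% range.
for accuracy in (0.90, 0.99):
    print(f"{accuracy:.0%} accurate: about {expected_errors(2000, accuracy)} errors")

# Invented sample: two typical OCR misreadings ("in" read as "m", "i" read as "l").
corrected = "The train arrived at the station"
raw_ocr = "The tram arrived at the statlon"
print(f"measured accuracy: {character_accuracy(raw_ocr, corrected):.1%}")
```

Even at 99% accuracy a full page can contain a couple of dozen errors, so proofreading time should be budgeted for every page you intend to post as text.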

Add identifying information to an image using a graphics editor (always preserve an unedited archival copy first). Image editors allow you to overlay editable text, arrows and other graphics. If you save a copy in the editor's native format, you will be able to return to it and continue editing the information. You can create simple animations using the animated GIF format. You can create presentations by importing images (frames) into a presentation, video or animation program.

For more information:

An excellent site for tips and detailed information on the scanning process is Wayne Fulton's scantips.com website and book.

There are a number of digitization projects on the web that are good examples and include information on the project itself.

Library catalogs and searchable digital collections:

updated 11/26/13
