Here at U of A Library, we have a Digitization program that makes digital copies of physical materials, with research, teaching, and long-term access in mind. All digitized material is available for public viewing at no cost.
While we do handle some one-off requests, much of our work is in large-scale digitization projects. These are big logistical undertakings! Here’s a bit of information about how we do it, interspersed with some cool things we’ve digitized over the years.
Selection
Most digitization projects come to us as part of a collection: a group of related materials, which can range from a few dozen discrete items to thousands of volumes.
Before any digitization happens, we need to make sure that the collection is suitable for our program. We ask questions, including:
- Does it actually fit within the scope of our program?
- Who will benefit from the digitization of this collection?
- What is the enduring value to research, teaching, and learning?
- Has anyone else already digitized this collection?
Selection is always weighed against finite resources. So we also have many practical considerations, including:
- Copyright, privacy, and ethical considerations
- Cost analysis, in money and labour
- Physical constraints: format, size, condition, and quantity
Metadata Preparation
Once a project is selected, the next step is metadata. Nothing can be digitized without metadata (aka information that describes what an item is), such as a title, creator, or even just a unique identifier. Without these critical pieces of descriptive information, a digital item can get lost in a sea of information and become nearly irretrievable. If people can’t find what they are looking for, what is the point of digitizing it?
Metadata is always a balance of two competing factors: describing items in enough detail that they’re discoverable by the highest number of people, but not in so much detail that it takes all day!
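For the technically curious, here is a minimal sketch of that "bare minimum" idea in Python. It is only an illustration, not our actual workflow: the `ItemRecord` class, its three fields, and the `is_findable` function are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ItemRecord:
    """Hypothetical, pared-down description of one item to be digitized."""
    identifier: Optional[str] = None
    title: Optional[str] = None
    creator: Optional[str] = None


def is_findable(record: ItemRecord) -> bool:
    """Return True if at least one descriptive field is filled in.

    Assumption for this sketch: a title, a creator, or even just a
    unique identifier is enough to keep an item from getting lost.
    """
    fields = (record.identifier, record.title, record.creator)
    # Blank or whitespace-only values count as missing.
    return any(bool(value and value.strip()) for value in fields)
```

A record with only an identifier passes this check, while a completely blank record does not. Real metadata standards ask for much more, which is where the balancing act comes in.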
Scanning (aka Taking the Pictures)
We work with external partners to scan our materials, transforming them from physical to digital formats.
They use specialized equipment appropriate for the format, condition, and scale of these projects. For example, a machine called a “Scribe” uses two overhead cameras to photograph facing pages simultaneously. Scribes are much faster than flatbed photocopiers or scanners, and their shape minimizes the risk of “cracking” those book spines!
Processing & Deriving
Once scanned, all items undergo quality assurance and post-processing work before the digital version is put up online.
If the item is mostly text, we run OCR (Optical Character Recognition), which means a machine “reads” each page and generates searchable, copyable text. This lets you search the “full text” of an item, which can really help you find the right sections!
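To give a rough sense of why OCR text matters, here is a small Python sketch of a full-text search. It assumes the OCR output for an item is available as a list of plain-text strings, one per page. The `search_full_text` function is a hypothetical helper written for this post, not a description of how our online platform searches.

```python
from typing import List


def search_full_text(pages: List[str], query: str) -> List[int]:
    """Return the 1-based page numbers whose OCR text contains the query.

    Matching is case-insensitive. An empty query matches nothing.
    """
    needle = query.strip().casefold()
    if not needle:
        return []
    return [
        page_number
        for page_number, text in enumerate(pages, start=1)
        if needle in text.casefold()
    ]
```

Without the OCR step there would be no text to search at all, only pictures of pages.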
Multiple file formats are derived, so that users can pick the file format that works best for them. Choices include plain text, image, PDF, and more.
Long Term Preservation
Whereas paper can often last a long time – as long as it remains cool and dry – digital files need regular assessment, maintenance, and format migration to ensure long-term accessibility.
Once we digitize a collection, we’re also committing to long-term Digital Preservation.
Helping People Use Our Collections
Once a collection is up online, it’s our continual responsibility to help people use it. We get lots of users from both on and off campus. We help them navigate the online platform, find specific things, and determine the terms & conditions for reuse.
Our collections have been used by people looking for their family histories (the Henderson’s Directories are a perpetual favourite), authors who publish books about Western Canada, podcast researchers, journalists and documentary producers from Sweden to Japan… We love hearing user stories!