How to Archive PDF Files
By Carey Stumm
Because electronic record formats change so rapidly and many file formats can be opened only by certain programs, archivists have been reluctant to rely on proprietary software to archive electronic documents. Although the PDF format was invented by Adobe, the specifications for reading and editing a PDF document are available to third-party developers, so users do not have to rely on Adobe to open the documents. For this reason, archivists have begun to rely on the PDF/A file format for archiving electronic records.
Understand the difference between a regular PDF and PDF/A. According to the International Organization for Standardization, PDF/A is a file format based on PDF, which "provides a mechanism for representing electronic documents in a manner that preserves their visual appearance over time, independent of the tools and systems used for creating, storing or rendering the files." A PDF/A file is a permanent PDF that is formatted such that it can be retrieved in the same form over a long period of time.
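One rough way to tell the two apart programmatically: a PDF/A file declares its conformance level in its embedded XMP metadata, using the `pdfaid:part` and `pdfaid:conformance` properties. The Python sketch below assumes that metadata is stored uncompressed and in a single-byte-compatible encoding, so it can be found with a plain byte search. The function name `declared_pdfa_level` is made up for this illustration. A declaration only shows what the file claims to be; confirming real conformance takes a dedicated PDF/A validator.

```python
import re

# Match either the element form (<pdfaid:part>1</pdfaid:part>)
# or the attribute form (pdfaid:part="1") of the XMP properties.
PDFA_PART = re.compile(rb"""pdfaid:part\s*(?:=\s*["']|>)\s*(\d)""")
PDFA_CONF = re.compile(rb"""pdfaid:conformance\s*(?:=\s*["']|>)\s*([A-Za-z])""")


def declared_pdfa_level(data: bytes):
    """Return a label such as 'PDF/A-1b', or None if no claim is found.

    Heuristic only: this reads the file's own declaration and does not
    validate the document against the standard.
    """
    part = PDFA_PART.search(data)
    if part is None:
        return None
    level = part.group(1).decode("ascii")
    conformance = PDFA_CONF.search(data)
    if conformance is not None:
        level += conformance.group(1).decode("ascii").lower()
    return "PDF/A-" + level
```

To try it on a real file, read the file in binary mode (for example `open(path, "rb").read()`) and pass the resulting bytes to the function.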
Open the PDF document you would like to archive in Adobe Acrobat 8.0 or later, or in a third-party PDF editor. See Resources 1 for a full list of PDF-creation software.
Review which elements the PDF contains. To be compatible with the PDF/A format, a PDF has to be fully self-contained, meaning that no outside elements are required to view the document. Fonts, images, graphics, and color information need to be embedded in the document. In Adobe Distiller, click Settings in the menu bar and select Edit PDF Settings. Here, you will have the option to embed fonts, images, and color. Video and audio elements cannot be included in a PDF/A file.
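If you would like a quick first look before opening Distiller, the review step can be roughly approximated in Python by counting a few raw PDF name tokens. This is a crude sketch, not a conformance check: it assumes the relevant dictionaries are not hidden inside compressed object streams, and the function name `scan_pdf_bytes` is invented for this example. Font counts will not match one-to-one, since a single typeface can involve several font objects, but fonts with no embedded font programs at all are a warning sign.

```python
import re

# Font dictionaries are marked with /Type /Font.
FONT_OBJECT = re.compile(rb"/Type\s*/Font\b")
# Embedded font programs appear as /FontFile, /FontFile2 or /FontFile3.
EMBEDDED_FONT = re.compile(rb"/FontFile[23]?\b")
# Annotation subtypes that carry audio or video content.
MEDIA = re.compile(rb"/Subtype\s*/(Movie|Sound|Screen|RichMedia)\b")


def scan_pdf_bytes(data: bytes) -> dict:
    """Count font and media markers in the raw bytes of a PDF.

    Heuristic only: tokens inside compressed streams are not seen.
    """
    media_types = {match.decode("ascii") for match in MEDIA.findall(data)}
    return {
        "font_objects": len(FONT_OBJECT.findall(data)),
        "embedded_font_programs": len(EMBEDDED_FONT.findall(data)),
        "media_annotations": sorted(media_types),
    }
```

A non-empty `media_annotations` list suggests the file contains audio or video that would have to be removed before conversion.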
Add metadata to the document so that it will be searchable. PDF metadata includes keywords, the version of PDF used, and the name of the program that created the PDF document. To enter the metadata in Adobe Acrobat, click Advanced in the top toolbar. From the drop-down menu, click Document Metadata.
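Behind the Document Metadata dialog, this information is stored in the PDF as an XMP (XML) record. The Python sketch below builds only the RDF core of such a record for the three fields named above, plus the PDF/A identification properties. It is an illustration under assumptions: the function name `build_xmp_core` and the default values are made up, a complete XMP packet also needs its outer wrapper, and a PDF tool is still required to embed the result in a file.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "pdf": "http://ns.adobe.com/pdf/1.3/",
    "pdfaid": "http://www.aiim.org/pdfa/ns/id/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def qname(prefix: str, name: str) -> str:
    """Return the namespaced tag name that ElementTree expects."""
    return "{%s}%s" % (NS[prefix], name)


def build_xmp_core(keywords, pdf_version, producer, part="1", conformance="B"):
    """Build the RDF core of an XMP record as a string (assumed defaults)."""
    rdf = ET.Element(qname("rdf", "RDF"))
    description = ET.SubElement(
        rdf, qname("rdf", "Description"), {qname("rdf", "about"): ""}
    )
    fields = [
        ("pdf", "Keywords", ", ".join(keywords)),
        ("pdf", "PDFVersion", pdf_version),
        ("pdf", "Producer", producer),
        ("pdfaid", "part", part),
        ("pdfaid", "conformance", conformance),
    ]
    for prefix, name, value in fields:
        ET.SubElement(description, qname(prefix, name)).text = value
    return ET.tostring(rdf, encoding="unicode")
```

Calling `build_xmp_core(["archive", "museum"], "1.4", "Example Producer")` returns the XML as text, which makes it easy to inspect what a metadata editor would write.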
Save the document to an archive. An archive can be a secure external server, a hard drive, or a gold DVD. The medium you choose for storing your archival PDF will depend on what is most secure in your office or household.
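Whichever medium you choose, it helps to confirm that the archived copy is identical to the original. The Python sketch below copies a file into an archive folder, compares SHA-256 checksums, and writes the checksum to a small text file beside the copy so the file can be rechecked later. The function names and the `.sha256` sidecar naming are assumptions made for this example, not part of any PDF/A requirement.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_copy(source, archive_dir):
    """Copy a file into archive_dir and verify the copy by checksum."""
    source = Path(source)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    checksum = sha256_of(source)
    if sha256_of(target) != checksum:
        raise IOError("Archived copy does not match the original: %s" % target)
    # Record the checksum next to the copy for later integrity checks.
    sidecar = target.with_name(target.name + ".sha256")
    sidecar.write_text("%s  %s\n" % (checksum, target.name))
    return target, checksum
```

Running `sha256_of` on the archived file at a later date and comparing the result with the stored value shows whether the file has changed or degraded on its storage medium.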
- PDF/A files can be larger than regular PDF files because everything is embedded.
Carey Stumm is an archivist at a history museum in New York City and a professor of museum studies in a university graduate program. She has been a grant writer for museums for six years and has written about media preservation, art, and transportation history. Stumm has a master's degree in library and information science.