How to Convert PDF to Excel Open Source
by John Gugie
PDF (Portable Document Format) is a document file format developed by Adobe that can include text, graphics and images. PDF files cannot be readily edited in their self-contained form. To copy or edit the data contained within a PDF file, the file must first be converted into another document format, such as the format used by Microsoft Excel, a spreadsheet program. Adobe Acrobat can export PDF data to other formats.
Download and install Adobe Acrobat from the official Adobe website. The installer is a very large file: expect the download to take considerable time (about 30 minutes on a DSL connection) and the installation to take several more minutes.
Open Adobe Acrobat. Click "File" on the menu at the top, choose "Open," browse to the location of the PDF file on the computer's hard drive and click the "Open" button.
Click "File" on the menu at the top of the window, choose "Export" and then choose "XML 1.0" from the drop-down menu. In the save window, click "Desktop" and then click "Save" at the bottom.
Open Microsoft Excel and open the file-browsing window. Click "Desktop" on the left side of the screen, browse to the location of the XML file created in Step 3 and click the "Open" button. When the prompt pops up asking how to import the XML file, choose "Open as an XML table" and click the "OK" button. Click "OK" when another prompt pops up and asks if you want to create a schema.
Delete every column except "TD" (table data); the other columns contain data that is not usable in Excel. You may now edit the PDF data. When you are finished, save the file: click the Office icon on the menu at the top, choose "Save" and then click "OK."
- Only table data will transfer over to Excel from PDF files. The rest of the PDF content, including most of the graphics, will not be used by Excel.
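The effect of keeping only the "TD" column can also be sketched outside Excel. The Python example below is a minimal illustration, not Acrobat's or Excel's own method: it assumes the exported XML wraps each table row in a "TR" element with "TD" cells inside it, and the sample document, the function names and the CSV output step are all hypothetical. Real exports may use a different layout, so check the file before relying on this.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for an exported file. The TR/TD layout
# is an assumption, not a documented Acrobat schema.
SAMPLE_XML = """<Document>
  <P>Quarterly report</P>
  <Table>
    <TR><TD>Item</TD><TD>Qty</TD></TR>
    <TR><TD>Widget</TD><TD>4</TD></TR>
  </Table>
</Document>"""


def extract_table_rows(xml_text):
    """Return the text of every TD cell, grouped by its parent TR row."""
    root = ET.fromstring(xml_text)
    rows = []
    for tr in root.iter("TR"):
        # itertext() also gathers text from tags nested inside a cell.
        cells = ["".join(td.itertext()).strip() for td in tr.findall("TD")]
        if cells:
            rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text, a format Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_table_rows(SAMPLE_XML)))
```

Everything outside the "TD" cells, such as the paragraph in the sample, is dropped, which mirrors the note above that only table data carries over.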
Items you will need
- PDF file
- Adobe Acrobat
- Microsoft Excel