How to Convert a PDF to HTML With Ubuntu

By Ben Lingenfelter

There are several ways to convert a PDF file to HTML on Ubuntu. Keep in mind that the finished product will probably not look as good as the original. The Portable Document Format fixes the exact position of text and images on each page, while HTML lets them reflow, so complex layouts in particular rarely survive intact. Here are a few methods to try.

3 Methods

Step 1

The easiest way is to go to the Adobe Web site and upload your PDF. Probably due to the rash of software being marketed to do this very thing, Adobe offers it for free. All you have to do is fill in a few blanks, click a button, and off you go.

http://www.adobe.com/products/acrobat/access_onlinetools.html

Step 2

Another way is to use a handy tool called ImageMagick. It's easy to find in Synaptic. Install it, open your PDF with it from the "Open With" menu, and then "Save As" HTML. The one drawback is that it can only handle one page at a time.

Step 3

The final way is to use a small command-line program called pdftohtml. It is part of the poppler-utils package, so first use the terminal to make sure that package is installed:

sudo aptitude install poppler-utils
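If you are not sure whether the package is already in place, a quick check like the one below can tell you. This is only a sketch: the helper name have_cmd is made up for illustration, and it simply asks the shell whether a pdftohtml command can be found on your PATH.

```shell
#!/bin/sh
# have_cmd NAME: succeed if a command called NAME is available.
# (have_cmd is a hypothetical helper name, not part of poppler-utils.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd pdftohtml; then
    echo "pdftohtml is installed"
else
    echo "pdftohtml not found; install poppler-utils first"
fi
```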

Once the package is installed, navigate in the terminal to the directory that contains your PDF file. Then type the following, replacing [filename] with the name of your file:

pdftohtml -c [filename].pdf [filename].html
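If you have a whole folder of PDF files, typing that command for each one gets tedious. The sketch below wraps the same command in a loop. The function name convert_all and the PDFTOHTML variable are assumptions made for this example, not part of pdftohtml itself, and the loop assumes file names that end in a lowercase .pdf.

```shell
#!/bin/sh
# Command used for the conversion. PDFTOHTML is a hypothetical override
# for this sketch; by default it is the pdftohtml program itself.
PDFTOHTML="${PDFTOHTML:-pdftohtml}"

# convert_all DIR: run "pdftohtml -c" on every *.pdf file in DIR
# (the current directory if DIR is omitted).
convert_all() {
    dir="${1:-.}"
    for pdf in "$dir"/*.pdf; do
        # If no PDF matched, the pattern is left as-is; skip it.
        [ -e "$pdf" ] || continue
        # Same command as above, with the .html name derived from the .pdf name.
        "$PDFTOHTML" -c "$pdf" "${pdf%.pdf}.html"
    done
}
```

After loading these lines into your shell, running convert_all with no argument should process the current directory, and convert_all followed by a path should process that folder instead.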

The finished product isn't much different from what the Adobe Web site gives you, but by using it you'll be supporting open source software!
