How to Convert HTML to Plain Text
By Katelyn Kelley
HTML (Hypertext Markup Language) is used to create Web pages. It is text-based but includes "tags" that define how text on a Web page is displayed. Because the Web browser hides these tags, you can copy the visible text from the browser window and paste it into any application that accepts text, such as the free editor included with Windows (Notepad) or Mac OS X (TextEdit). Some Web browsers can also save HTML pages as text without the tags.
Save As Text with Web Browser
Open the HTML document or Web page you want to convert to text in your Web browser.
Click the "File" menu and select "Save As" (or the "Page" menu, then "Save As," in Internet Explorer).
Choose "Text Pages" from the drop-down format menu and choose a destination for the text file.
Click "Save," then exit the browser and locate the text file you saved. You can open it in any application that reads text files, such as Notepad in Windows. Some text editors might display the saved file with no carriage returns or line breaks. If that happens, try the next method.
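If the saved file shows up as one long line, a likely cause (an assumption here, not something the browser reports) is that the file uses line endings the editor does not recognize; older versions of Notepad, for example, expect a carriage return plus line feed at the end of each line. The short Python sketch below rewrites text with Windows-style line endings. The function name is hypothetical and chosen only for illustration.

```python
def to_windows_line_endings(text: str) -> str:
    """Convert any mix of CRLF, CR, or LF line endings to CRLF."""
    # Collapse everything to LF first so existing CRLF pairs are not doubled.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Then expand every LF to the CR+LF pair.
    return normalized.replace("\n", "\r\n")
```

When reading and writing the file with Python's open(), pass newline="" so that Python does not translate the line endings a second time.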
Copy and Paste to Notepad (Windows) or TextEdit (Macintosh)
Open the HTML document or Web page you want to convert to text in your Web browser.
Click and drag over the text you want to convert or press Ctrl+A (Command+A on Macintosh) to select the entire page.
Click the "Edit" menu, then "Copy."
Open the Notepad application (Windows) or the TextEdit application (Macintosh) and click "Paste" under the "Edit" menu. You will now have the text contents of the Web page in the text editor window.
Click the "File" menu, then "Save As" to save the new document as a text file, and you will have successfully converted the Web page HTML to text.
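If you have many pages to convert, the same result can be approximated with a short script. The sketch below uses the HTMLParser class from Python's standard library to keep the text and discard the tags. The class name, the function name and the list of tags treated as line breaks are assumptions made for this example, and the output will not match a browser's copy-and-paste exactly.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content and ignores script, style, and title contents."""

    SKIPPED = {"script", "style", "title"}
    # Tags that usually start a new line on the rendered page (illustrative list).
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()  # character references such as &amp; are decoded by default
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the text of an HTML string, one non-empty line per block."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)
```

For example, html_to_text("<h1>Hello</h1><p>World &amp; <b>more</b></p>") returns two lines, "Hello" and "World & more". You could then write the result to a .txt file, which corresponds to the "Save As" step above.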
Tips
- A full-featured Web editing tool such as Adobe Dreamweaver can open an HTML page in a text-only view and has a command under the "Edit" menu that copies the text without the tags. You can then paste the text into another application.
- You can also copy and paste Web page content into word-processing software such as Microsoft Word. After pasting, click the "File" menu and select "Save As" ("Office" menu, then "Save As" in Word 2007) and choose "Text Only" as the format if you want to keep the document as plain text. Otherwise, choose the word processor's standard format (.doc for Word) and you will have a document that used to be HTML but is now text without the HTML codes.
Warnings
- The Safari Web browser does not have an option to save a Web page as a text-only document. Use the copy-and-paste method to convert the HTML to text.
Writer Bio
Katelyn Kelley worked in information technology as a computing and communications consultant and web manager for 15 years before becoming a freelance writer in 2003. She specializes in instructional and technical writing in the areas of computers, gaming and crafts. Kelley holds a Bachelor of Arts in mathematics and computer science from Boston College.