While HTML is an excellent medium for distributing and consuming information on the web, it is not ideal for printing or archiving. PDF is better suited to those purposes: PDF documents have a well-defined page layout, and all of their images are embedded in the file. If you would like to convert HTML pages to PDF on Linux, follow this guide.
You can use a command-line utility called wkhtmltopdf to convert any HTML web page or URL to a PDF file. wkhtmltopdf uses the WebKit browser rendering engine to perform the conversion.
You can install wkhtmltopdf on Debian/Ubuntu from the distribution repositories using apt-get.
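The install step could look like the sketch below. It assumes a Debian/Ubuntu system where you have sudo rights and where the package is named wkhtmltopdf; the helper name install_wkhtmltopdf is only for illustration.

```shell
# install_wkhtmltopdf is an illustrative helper name, not something shipped with the package.
# It skips the installation when a wkhtmltopdf command is already available.
install_wkhtmltopdf() {
    if command -v wkhtmltopdf >/dev/null 2>&1; then
        echo "wkhtmltopdf is already installed"
        return 0
    fi
    # Assumes a Debian/Ubuntu system with sudo rights.
    sudo apt-get install wkhtmltopdf
}
```

After running install_wkhtmltopdf, the command wkhtmltopdf --version should confirm that the tool is available.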
Be aware that the version of wkhtmltopdf installed via apt-get has reduced functionality. First, it cannot run without an X11 server. Second, it cannot add hyperlinks or a table of contents to the converted PDF file.
To convert HTML to PDF, pass wkhtmltopdf the input URL or HTML file, followed by the name of the PDF file to create.
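A minimal sketch of that invocation follows. The wrapper name html2pdf is hypothetical and only adds an argument check; the actual conversion is the single wkhtmltopdf call at the end.

```shell
# html2pdf is an illustrative wrapper name; wkhtmltopdf itself does the work.
# $1 is a URL or a local HTML file, $2 is the PDF file to create.
html2pdf() {
    if [ "$#" -ne 2 ]; then
        echo "usage: html2pdf <url-or-html-file> <output.pdf>" >&2
        return 1
    fi
    # wkhtmltopdf takes the input first and the output file second.
    wkhtmltopdf "$1" "$2"
}
```

For example, html2pdf http://www.example.com example.pdf would save that page as example.pdf in the current directory (the URL and file name here are placeholders).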
If you would like to use wkhtmltopdf without X11 while enjoying its full feature set, you need a static binary of wkhtmltopdf, which is built against a patched version of Qt. You can download these binaries from the official website.
Note that if you want to capture web pages hosted on an HTTPS site, you need to install openssl first, and then run wkhtmltopdf.
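The HTTPS case might be handled along the lines of the sketch below. The helper name capture_https is hypothetical, and the check is only a rough proxy: it verifies that an openssl command is on the PATH, which does not guarantee that wkhtmltopdf can load the SSL libraries it needs.

```shell
# capture_https is an illustrative helper name.
# $1 is an https URL, $2 is the PDF file to create.
capture_https() {
    # Rough check only: looks for the openssl command, not the SSL libraries themselves.
    if ! command -v openssl >/dev/null 2>&1; then
        echo "openssl not found; install it first, e.g. sudo apt-get install openssl" >&2
        return 1
    fi
    wkhtmltopdf "$1" "$2"
}
```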
If wkhtmltopdf does not work for some reason, an alternative way to convert HTML web pages to PDF files is to use the Google Chrome browser. If you don't have Google Chrome installed, install it first.
In Google Chrome, go to the URL of the web page you would like to convert to PDF. Then choose the "Print a page" menu, and change "Destination" to "Save as PDF". Once you click the print button, the web page will be saved as a local PDF file at the location you choose.
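If you prefer to stay on the command line, recent versions of Chrome can also do this without opening a window. The following is a sketch under two assumptions: that your Chrome build supports headless mode, and that the binary is called google-chrome. The helper name chrome_to_pdf and the CHROME_BIN variable are made up for this example.

```shell
# chrome_to_pdf is an illustrative helper name.
# CHROME_BIN is a hypothetical override for systems where the browser binary
# has a different name (for example chromium); it defaults to google-chrome.
# $1 is the URL to capture, $2 is the PDF file to create.
chrome_to_pdf() {
    if [ "$#" -ne 2 ]; then
        echo "usage: chrome_to_pdf <url> <output.pdf>" >&2
        return 1
    fi
    # --headless runs Chrome without opening a window;
    # --print-to-pdf writes the rendered page to the given file.
    "${CHROME_BIN:-google-chrome}" --headless --print-to-pdf="$2" "$1"
}
```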