Convert .chm to .html | .pdf
Suppose you want to edit and republish a CHM file into another format. To do so, you first need to extract the original HTML files from the CHM archive.
First we must install the libchm-bin package, which contains the CHM library's bundled helper application, extract_chmLib:
$ sudo apt-get install libchm-bin
To convert .chm files into .html files, we use the following command:
$ extract_chmLib book.chm output_dir
where book.chm is the path to your .chm file and output_dir is a new directory that will be created to hold the HTML files extracted from it.
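As a minimal sketch, the extraction step can be wrapped in a small shell function that checks for the tool first and then reports how many HTML pages were extracted. The function name chm_to_html is an assumption made for this example; it is not part of libchm-bin.

```shell
# Hypothetical helper: the name chm_to_html is an assumption for this sketch.
# Usage: chm_to_html book.chm output_dir
chm_to_html() {
    chm_file=$1
    out_dir=$2
    # Fail early if the extractor from libchm-bin is not installed.
    if ! command -v extract_chmLib >/dev/null 2>&1; then
        echo "extract_chmLib not found; install libchm-bin first" >&2
        return 1
    fi
    if [ ! -f "$chm_file" ]; then
        echo "no such file: $chm_file" >&2
        return 1
    fi
    # Hide the extractor's own progress messages.
    extract_chmLib "$chm_file" "$out_dir" >/dev/null || return 1
    # Print the number of extracted HTML pages (.htm and .html).
    count=$(find "$out_dir" -type f \( -name '*.htm' -o -name '*.html' \) | wc -l)
    echo $((count))
}
```

Calling chm_to_html book.chm output_dir prints the page count, which is a quick way to confirm that the archive really contained HTML before moving on to the PDF step.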
To convert .chm files into .pdf files, we must first install htmldoc, a program for writing documentation in HTML and producing indexed HTML, PostScript, or PDF output (with tables of contents):
$ sudo apt-get install htmldoc
To start htmldoc, type the following command in a terminal:
$ htmldoc
The htmldoc window opens. In it, you must:
- go to the directory that contains the HTML files
- select all the files to include in the .pdf
- go to the Output tab and type the name of the PDF file
- go to the PDF tab to select the PDF version and the desired compression level for text and images
- click Generate to convert them to PDF
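The GUI steps above can also be approximated from the command line. The sketch below is only one possible setup: it assumes the extracted pages sit directly in one directory and that alphabetical order is an acceptable page order (the CHM's own table of contents may order them differently). The function name html_to_pdf is an assumption made for this example; --webpage, -t, --compression and -f are htmldoc command-line options.

```shell
# Hypothetical helper: the name html_to_pdf is an assumption for this sketch.
# Usage: html_to_pdf output_dir book.pdf
html_to_pdf() {
    html_dir=$1
    pdf_file=$2
    if ! command -v htmldoc >/dev/null 2>&1; then
        echo "htmldoc not found; install it first" >&2
        return 1
    fi
    # Collect the pages (.htm and .html) in alphabetical order.
    set -- "$html_dir"/*.htm*
    if [ ! -e "$1" ]; then
        echo "no HTML files in $html_dir" >&2
        return 1
    fi
    # --webpage: treat the inputs as plain pages rather than a structured book
    # -t pdf14: PDF version; --compression=9: strongest compression
    htmldoc --webpage -t pdf14 --compression=9 -f "$pdf_file" "$@"
}
```

For example, html_to_pdf output_dir book.pdf corresponds to selecting every file in output_dir, naming the output book.pdf, and pressing Generate.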
Convert .pdf to .html | .xml
The poppler-utils package (possibly installed by default in Ubuntu Lucid Lynx) contains a program called pdftohtml that converts a .pdf file to an .html file. If it is not already installed, we must install it:
$ sudo apt-get install poppler-utils
The command below gives you a simple HTML file without any PNG files, so you won’t be able to see any embedded graphics. It’s a great utility if you just want to extract the text from a PDF file.
$ pdftohtml file.pdf file.html
If you want to see graphics, you’ll need to use the -c (as in “complex”) option:
$ pdftohtml -c file.pdf file.html
This option produces individual HTML files, one for each page of the PDF file, with the PNG references mixed in. The graphics in the original PDF file show up in a browser, and the text can be cut and pasted. The total size of the HTML and PNG files generated with the -c option tends to be roughly equivalent to that of the original PDF.
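For the .xml output mentioned in the heading, pdftohtml has an -xml option. The sketch below wraps the three modes in one function, assuming the installed pdftohtml is the poppler-utils build; the function name pdf_convert and the mode names are assumptions made for this example. The output name is left for pdftohtml to derive from the input file.

```shell
# Hypothetical helper: pdf_convert and its mode names are assumptions.
# Usage: pdf_convert simple|complex|xml file.pdf
pdf_convert() {
    mode=$1
    pdf_file=$2
    case $mode in
        simple)  opt= ;;       # plain HTML
        complex) opt=-c ;;     # page-by-page HTML with graphics
        xml)     opt=-xml ;;   # XML output for post-processing
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac
    # $opt is deliberately unquoted so that an empty value adds no argument.
    pdftohtml $opt "$pdf_file"
}
```

For example, pdf_convert xml file.pdf runs pdftohtml -xml file.pdf.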
More information can be found through:
$ man pdftohtml
help.ubuntu (about multimedia formats)