From: David R. <dr...@ro...> - 2009-11-23 22:40:25
I have also been saving Web pages to PDF to include in my Gramps media library. [Yes, I am aware of copyright issues - but I don't intend to publish my site on the open Web.]

I use Mac OS X, which offers a 'Print to PDF' option for all applications. It works reasonably well, but there have been times when it didn't work quite right - I think the problem was a scrolling text box inside a page. However, the Mac OS option merely tries to give you a PDF image of a printed page - no active features on the page will work.

The Firefox extension which I use when I can is 'PDF Download', as it seems to make the best attempt at producing a PDF file that represents the original page - for example, most clickable links are still clickable in the PDF version. (Acrobat Reader will launch your default browser for the link's target.)

However, there are times when a page will not convert and you get the 'Server busy' message. I think that this happens when you are accessing a site that requires a login and which then tracks the session with a cookie. PDF Download in your browser sends a request to an external 'pdfdownload' server, quoting the URL of the page to be converted. The pdfdownload server tries to refetch the page using your URL - but it doesn't have your cookie - so the target web server demands a fresh login and the whole process stalls.

You can explore this effect yourself using two different browsers. Find an interesting page in, say, Firefox, then copy the URL into, say, Safari, and see if Safari retrieves the same page. For sites that use session-tracking cookies, a fresh login will be demanded.

When PDF Download doesn't work, I find that the 'pdfit' Firefox extension works well, so that has been my second choice - it converts in your browser, so cookies are not an issue. It can also give a PNG image version of the page.

David Rowe
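
For anyone who wants to see the cookie effect described above without two browsers, here is a minimal Python sketch. It is only an illustration under stated assumptions: the local test server, the cookie value `session=abc123`, the `/record` path and the page texts are all hypothetical, and it does not reproduce how the pdfdownload service actually works. It simply fetches the same URL twice, once with the session cookie (like your logged-in browser) and once without (like an external server that only knows the URL).

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical session cookie that a logged-in browser would hold.
SESSION_COOKIE = "session=abc123"


class CookieGatedHandler(BaseHTTPRequestHandler):
    """Toy site: serves the real page only if the session cookie is sent."""

    def do_GET(self):
        if self.headers.get("Cookie") == SESSION_COOKIE:
            body = b"census record page"
        else:
            body = b"please log in"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def fetch(url, cookie=None):
    """Fetch url, optionally sending a Cookie header; return the body text."""
    request = urllib.request.Request(url)
    if cookie is not None:
        request.add_header("Cookie", cookie)
    # An empty ProxyHandler bypasses any system proxy for this local test.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return response.read().decode()


def demo():
    """Return (page seen with the cookie, page seen without it)."""
    server = HTTPServer(("127.0.0.1", 0), CookieGatedHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/record"
    try:
        with_cookie = fetch(url, SESSION_COOKIE)   # the logged-in browser
        without_cookie = fetch(url)                # a third party with only the URL
    finally:
        server.shutdown()
        server.server_close()
    return with_cookie, without_cookie


if __name__ == "__main__":
    print(demo())
```

The second fetch uses exactly the same URL but gets the login prompt instead of the page, which matches the behaviour described for an external converter. A converter running inside the browser avoids this because it works on the page the browser has already retrieved.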