I know this question has probably been asked a gazillion times, but here we go again.
I teach students using a digital jam board, and to give them a PDF of each class I simply export it. The problem is that the PDFs are always over 100 MB, which makes archiving them very resource intensive. I stumbled upon the DjVu format and converted a PDF through the any2djvu site with the "coloured scanned document, 300 dpi" setting. Uploading the document directly from my system took a long time, so I used a file sharing service and gave any2djvu the link instead. I kid you not, a 103 MB PDF was converted into a 1.73 MB DjVu with raster text and very good quality, which surprised me a lot. I tried a couple of other PDFs, and although the ratio was not quite 100:2, it was still phenomenal. So I now want to convert all of my class PDFs to DjVu, which is very time consuming to do manually through the website.
I looked through various forums and Reddit threads, but every GUI program I tried gave an error during conversion, and I don't know why; many of them simply stopped after 20 or so pages. As far as I could search, there is no user-friendly guide to setting up the DjVu tools for the command line.
I am by no means very tech savvy, but I can do basic command-line tasks such as converting images with ImageMagick and downloading videos from YouTube, and I know some C++, Java, and Python. Could someone please give me a step-by-step guide for converting the PDFs the same way the site does when I select the "coloured scanned document, 300 dpi" setting, and also help me automate the whole process?
If I were you, I'd first find out why your PDFs are so huge. If I understand you correctly, you produce the PDFs yourself.
1. Quickly convert a PDF to HTML plus its assets, using pdftohtml from poppler.
2. See which parts are enormous.
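The two steps above could be sketched in Python roughly as follows. This is only a sketch: it assumes pdftohtml (from poppler) is on your PATH and that it writes its HTML pages and extracted images next to the output name it is given. The function names `explode_pdf` and `largest_files` and the `page` output prefix are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def largest_files(directory, top=10):
    """Return (size_in_bytes, path) pairs for the biggest files, largest first."""
    files = [p for p in Path(directory).rglob("*") if p.is_file()]
    sized = [(p.stat().st_size, p) for p in files]
    return sorted(sized, key=lambda pair: pair[0], reverse=True)[:top]


def explode_pdf(pdf, out_dir):
    """Run pdftohtml so the pieces of the PDF end up as separate files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: pdftohtml puts the HTML and the extracted images
    # alongside the output name "page" inside out_dir.
    subprocess.run(["pdftohtml", str(pdf), str(out_dir / "page")], check=True)


# Usage (hypothetical script name): python inspect_pdf.py lecture.pdf lecture_parts
if __name__ == "__main__" and len(sys.argv) == 3:
    explode_pdf(sys.argv[1], sys.argv[2])
    for size, path in largest_files(sys.argv[2]):
        print(f"{size / 1e6:8.2f} MB  {path.name}")
```

If a handful of huge images dominate the listing, the export settings of the jam board are probably the first thing to look at.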
As for the conversion, it's basically a two-stage process:
1. PDF to a set of images (see pdftoppm)
2. images to DjVu with djvu-utils
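A minimal Python sketch of that two-stage pipeline, automated over a folder of PDFs, might look like this. It assumes pdftoppm (poppler) and the DjVuLibre tools c44 and djvm are installed and on your PATH; the function names and the `page` prefix are invented for the example. Note that c44 is a plain wavelet encoder for colour images, so the result may well be larger than what any2djvu produces with its "coloured scanned document" setting.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def pdftoppm_cmd(pdf, prefix, dpi=300):
    # Stage 1: render every page to <prefix>-<n>.ppm at the given resolution.
    return ["pdftoppm", "-r", str(dpi), str(pdf), str(prefix)]


def c44_cmd(ppm, djvu, dpi=300):
    # Stage 2a: encode one page image as a single-page DjVu file.
    return ["c44", "-dpi", str(dpi), str(ppm), str(djvu)]


def djvm_cmd(out, pages):
    # Stage 2b: bundle the single-page files into one multi-page document.
    return ["djvm", "-c", str(out), *[str(p) for p in pages]]


def page_number(path):
    # pdftoppm zero-pads page numbers (page-01.ppm, page-001.ppm, ...).
    return int(Path(path).stem.rsplit("-", 1)[1])


def convert(pdf, out, dpi=300):
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        subprocess.run(pdftoppm_cmd(pdf, tmp / "page", dpi), check=True)
        pages = []
        for ppm in sorted(tmp.glob("page-*.ppm"), key=page_number):
            djvu = ppm.with_suffix(".djvu")
            subprocess.run(c44_cmd(ppm, djvu, dpi), check=True)
            ppm.unlink()  # raw 300 dpi pages are large, free the space early
            pages.append(djvu)
        subprocess.run(djvm_cmd(out, pages), check=True)


# Usage (hypothetical script name): python pdf_to_djvu.py folder_with_pdfs
if __name__ == "__main__" and len(sys.argv) == 2:
    for pdf in sorted(Path(sys.argv[1]).glob("*.pdf")):
        convert(pdf, pdf.with_suffix(".djvu"))
        print("converted", pdf.name)
```

Each PDF is rendered into a temporary folder that is removed afterwards, so only the final .djvu file is left next to the original.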
Hi. In my experience, the largest PDF I could upload to any2djvu was about 30 MB. Above that, the conversion was still running more than 1 hour later. Have you tried to upload your PDF to pdf2djvu.com? I have not tried sending a big PDF to that site, just a suggestion.
Try https://github.com/FriedrichFroebel/pdf2djvu
(no dot at the end!)
Last edit: Janusz 2023-09-05
Thanks, will give it a try and ask if I encounter any problems.
Last edit: Partham 2023-09-05
Last edit: Ildar Mulyukov 2023-09-08