pdftohtml-general Mailing List for pdftohtml
Status: Beta
Brought to you by:
meshko
| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2002 | | 20 | 45 | 46 | 8 | 30 | 20 | 6 | 10 | 33 | 8 | 7 |
| 2003 | 22 | 22 | 25 | 6 | 14 | 24 | 30 | 48 | 25 | 43 | 59 | 74 |
| 2004 | 65 | 28 | 45 | 35 | 58 | 102 | 48 | 31 | 16 | 34 | 58 | 51 |
| 2005 | 7 | 23 | 39 | 36 | 272 | 39 | 45 | 95 | 73 | 100 | 83 | 10 |
| 2006 | | | | | | | | 1 | 1 | 2 | 1 | |
| 2007 | | | | | | | 1 | | | 3 | | |
| 2008 | 1 | 1 | | | | | 1 | | | | | |
| 2009 | 1 | | | | | | | | 1 | | | |
| 2011 | | | | | | | | | | 1 | | |
| 2014 | | | | | 1 | | | | | | | |
| 2017 | 3 | | | | | | | | | | | |
| 2021 | | | 1 | | | | | | | | | |

|
From: Evan P. <epa...@gm...> - 2021-03-09 15:44:04
|
I downloaded the package and ran tar -xzf pdftohtml-0.40a.tar.gz. Then I changed to the src directory and ran sudo make, then changed to the top-level pdftohtml directory and ran "sudo make" again:

panagio@My-MacBook:/Users/epanagio/Sites/play/pdftohtml> sudo make
Password:
cd goo; /Applications/Xcode.app/Contents/Developer/usr/bin/make
make[1]: `libGoo.a' is up to date.
cd fofi; /Applications/Xcode.app/Contents/Developer/usr/bin/make
make[1]: Nothing to be done for `all'.
cd splash; /Applications/Xcode.app/Contents/Developer/usr/bin/make
make[1]: Nothing to be done for `all'.
cd xpdf; /Applications/Xcode.app/Contents/Developer/usr/bin/make
make[1]: `libXpdf.a' is up to date.
cd src; /Applications/Xcode.app/Contents/Developer/usr/bin/make
make[1]: Nothing to be done for `all'.

I searched the list but couldn't find any helpful entries. What am I supposed to do next? HELP!!!
Evan |
|
From: Anita L. <ajl...@gm...> - 2017-01-12 19:19:00
|
Sorry, I realize now that this is not a GUI.
On 01/12/2017 02:09 PM, Anita Lewis wrote:
> I'm running Linux, but I found pdf2html for Windows here:
> http://www.softpedia.com/get/Office-tools/PDF/PDF2HTML.shtml
>
> I hope that helps. |
|
From: Anita L. <ajl...@gm...> - 2017-01-12 19:09:54
|
I'm running Linux, but I found pdf2html for Windows here:
http://www.softpedia.com/get/Office-tools/PDF/PDF2HTML.shtml
I hope that helps.
On 01/12/2017 01:20 PM, Александр К-ш wrote:
> Hello! Sorry for my bad English. Please send me the Pdftohtml GUI. All the
> links I know of are broken. I have Windows XP Home, 32-bit. Thank you in
> advance. |
|
From: Александр К-ш <rus...@gm...> - 2017-01-12 18:20:40
|
Hello! Sorry for my bad English. Please send me the Pdftohtml GUI. All the links I know of are broken. I have Windows XP Home, 32-bit. Thank you in advance. |
|
From: Regis G. <reg...@gm...> - 2014-05-14 19:43:56
|
I downloaded pdftohtml-0.40a.tar.gz and unpacked it into
the directory pdftohtml-0.40a.
Then I ran make all.
make compiles a lot of .cc files without any critical error, but
produces no executable.
There is no README and no explanation.
Could you tell me how to produce the pdftohtml executable?
Sincerely yours.
|
|
From: Alec T. <ale...@gm...> - 2011-10-12 11:45:47
|
Good afternoon,
Do you have any recommendations and/or sample code for comparing textual and geometric layout information across pages? Basically I'm trying to recognise patterns within documents (e.g. page numbers, headers and footers, titles, column information, etc.) using the capabilities of the PDFtoHTML tool (probably its XML output?). I would like to write some regexes which recognise the patterns and store them in a boost::bimap.
Thanks for all suggestions,
Alec Taylor |
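One possible starting point, sketched in Python rather than with boost::bimap: the idea is to treat any text that recurs at the same coordinates on most pages as a header, footer, or page number. This is only a sketch under assumptions. It assumes the -xml output consists of <page> elements containing <text> elements with integer top and left attributes; SAMPLE_XML, normalise and repeated_elements are hypothetical names made up for the illustration, not part of pdftohtml.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample, shaped like the assumed `pdftohtml -xml` output.
SAMPLE_XML = """<pdf2xml>
<page number="1" height="1188" width="918">
<text top="40" left="81" width="120" height="15" font="0">Annual Report</text>
<text top="300" left="81" width="400" height="15" font="1">Introduction</text>
<text top="1120" left="450" width="10" height="15" font="0">1</text>
</page>
<page number="2" height="1188" width="918">
<text top="40" left="81" width="120" height="15" font="0">Annual Report</text>
<text top="300" left="81" width="400" height="15" font="1">Methods</text>
<text top="1120" left="450" width="10" height="15" font="0">2</text>
</page>
<page number="3" height="1188" width="918">
<text top="40" left="81" width="120" height="15" font="0">Annual Report</text>
<text top="300" left="81" width="400" height="15" font="1">Results</text>
<text top="1120" left="450" width="10" height="15" font="0">3</text>
</page>
</pdf2xml>"""


def normalise(text):
    """Replace every run of digits with '#', so '1', '2' and '3' compare equal."""
    return re.sub(r"\d+", "#", text.strip())


def repeated_elements(xml_string, min_fraction=0.8):
    """Return {(top, left, normalised text): [page numbers]} for text that
    appears at the same position on at least min_fraction of all pages."""
    root = ET.fromstring(xml_string)
    pages = root.findall("page")
    seen = defaultdict(set)
    for page in pages:
        number = int(page.get("number"))
        for node in page.iter("text"):
            # itertext() also collects text nested inside child tags such as <b>.
            content = "".join(node.itertext())
            key = (int(node.get("top")), int(node.get("left")), normalise(content))
            seen[key].add(number)
    threshold = min_fraction * len(pages)
    return {key: sorted(nums) for key, nums in seen.items() if len(nums) >= threshold}
```

Calling repeated_elements(SAMPLE_XML) keeps the running title and the page-number slot and drops the body text, because each body line occurs on only one page. Real documents would probably need a tolerance on the coordinates instead of exact matches.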
|
From: Brian R. <br...@bi...> - 2009-09-16 08:57:28
|
Hi, is the pdftohtml project entirely dead, or has it moved elsewhere? Has another project replaced it? Thanks for any info. Regards, Brian |
|
From: Holger B. <hol...@bl...> - 2009-01-10 13:10:56
|
When in XML mode, do not output font spans. (Done in HtmlFonts.cc but apparently forgotten in HtmlOutputDev.cc.) Otherwise the XML is not valid when, for example, a table is encountered that has different fonts in its cells. Example: the page with Table E-29, "Simplified Mnemonics", of Freescale Semiconductor, "Programming Environments Manual for 32-Bit Implementations of the PowerPC Architecture, Rev. 3". (I have also seen this in a completely unrelated document, so it is not rare. Another report of this bug, by somebody else, is at: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=415764 ) See the attached patch (against version 0.40).
--
Holger Blasum GnuPG 1024D/ACDFC3B769DC1ED66B47 Phone (cell) +49-174-7313590 |
|
From: bob 0. <bo...@gm...> - 2008-07-22 02:10:12
|
Hi there,
The document shown on screen is too big to fit, and I am wondering how we can scale it down for default display purposes.
pdftohtml -zoom 0.25 r*.pdf 1.html
The above command isn't really changing anything at all. I also tried 2.0 etc. Any suggestions?
Regards, Bob |
|
From: Ika O. <ik...@gm...> - 2008-02-14 17:03:35
|
Hello,
I have been trying for 3 days now to compile pdftohtml on my Windows XP box. I tried MinGW and Visual C++ 2008 Express Edition, and it failed each time. (I don't want to use Cygwin, because the Cygwin library then has to be shipped alongside the built .exe.) Is there documentation, or are there web links, where I could find information about how to build pdftohtml in a win32 environment? (I don't need a link to a binary download; I want to make some changes to the source code and then build it.)
Sincerely, Ludovic.
(Sorry for my poor English) |
|
From: Alexey K. <kho...@is...> - 2008-01-21 12:38:41
|
Hi all,
I have tried pdftohtml-0.40, built on Ubuntu-7.04 and on cygwin, with the http://refspecs.linux-foundation.org/X11/xlib.pdf document. In comparison to pdftohtml-0.39 I see two problems.
1. The italic and bold markup is lost during translation.
2. The result of the "-xml" conversion contains a lot of identical fontspecs, which pdftohtml-0.39 merged successfully.
--
Regards, Alexey Khoroshilov
Linux Verification Center, ISPRAS
web: http://linuxtesting.org
e-mail: kho...@li... |
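Until the second problem is fixed in pdftohtml itself, the duplicates could be merged in a post-processing step. The following Python sketch is a workaround under assumptions, not the way pdftohtml-0.39 did it. It assumes each <fontspec> carries id, size, family and color attributes and that each <text> element refers to a fontspec through its font attribute; merge_fontspecs and SAMPLE are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: fontspec 2 repeats fontspec 0 exactly.
SAMPLE = """<pdf2xml>
<page number="1">
<fontspec id="0" size="11" family="Times" color="#000000"/>
<fontspec id="1" size="14" family="Helvetica" color="#000000"/>
<text top="100" left="80" width="200" height="12" font="0">Body</text>
<text top="60" left="80" width="200" height="16" font="1">Heading</text>
</page>
<page number="2">
<fontspec id="2" size="11" family="Times" color="#000000"/>
<text top="100" left="80" width="200" height="12" font="2">More body</text>
</page>
</pdf2xml>"""


def merge_fontspecs(root):
    """Drop fontspecs that repeat an earlier (size, family, color) triple and
    point the affected <text> elements at the first matching id.
    Modifies the tree in place and returns the number of distinct fonts."""
    canonical = {}  # (size, family, color) -> id of the first fontspec seen
    remap = {}      # id of a removed duplicate -> id of the fontspec kept
    # Take a snapshot of the elements first, because the tree is modified below.
    for parent in list(root.iter()):
        for spec in parent.findall("fontspec"):
            key = (spec.get("size"), spec.get("family"), spec.get("color"))
            if key in canonical:
                remap[spec.get("id")] = canonical[key]
                parent.remove(spec)
            else:
                canonical[key] = spec.get("id")
    for node in root.iter("text"):
        font = node.get("font")
        if font in remap:
            node.set("font", remap[font])
    return len(canonical)
```

For a converted file this would be used as tree = ET.parse(path), then merge_fontspecs(tree.getroot()), then tree.write(path). Since bold and italic are reported as lost anyway, the sketch compares only size, family and color.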
|
From: Grant S. <gjp...@gm...> - 2007-10-24 21:51:26
|
Hey there,
I just downloaded pdftohtml0.39, read the readme and ran make inside the pdftohtml0.39 directory. This all seemed to work fine; I got no errors while make ran. However, when I try to use the pdftohtml command I get:
-bash: pdftohtml: command not found
Any ideas? |
|
From: Mikhail K. <me...@cs...> - 2007-10-16 16:36:56
|
Try getting rid of HtmlLink:: there.
> Trying to install pdftohtml on Fedora 7
>
> Ran 'make'
>
> Here are my errors:
>
> HtmlLinks.h:22: error: extra qualification `HtmlLink::´ on member `isEqualDest´
> make[1]: *** [HtmlOutputDev.o] Error 1
> make[1]: Leaving directory `/home/rob/downloads/pdftohtml-0.39/src'
> make: *** [all] Error 2
>
> I really need this to work.
>
> Thanks
>
> --
> Rob Robson
> Chillicothe, Ohio
> http://www.rob-robson.com |
|
From: Rob R. <ro...@ro...> - 2007-10-16 03:05:34
|
Trying to install pdftohtml on Fedora 7

Ran 'make'

Here are my errors:

HtmlLinks.h:22: error: extra qualification `HtmlLink::´ on member `isEqualDest´
make[1]: *** [HtmlOutputDev.o] Error 1
make[1]: Leaving directory `/home/rob/downloads/pdftohtml-0.39/src'
make: *** [all] Error 2

I really need this to work.

Thanks
--
Rob Robson
Chillicothe, Ohio
http://www.rob-robson.com |
|
From: Thomas W. <tho...@we...> - 2007-07-21 15:55:53
|
Hi,
I am a French free-software activist and a member of the APRIL (http://www.april.org) organisation. I discovered PDFTOHTML a few days ago and I am very impressed by this work. Here is my problem: I tried to convert a 250-page PDF file, and everything worked well until page 73. From page 73 to page 250, the background images are no longer displayed. Can anybody tell me on which forum I can ask questions?
Thanks in advance, Thomas |
|
From: TYSON B. <tys...@gm...> - 2006-11-09 00:00:58
|
Hello everyone,
Has anyone here tried the new 0.40 version of pdftohtml? If you have, did you find any problems, such as documents that come out messed up, or do you have any suggestions? |
|
From: Mikhail K. <me...@cs...> - 2006-10-04 13:54:06
|
This might mean that the PDF does not contain the proper text at all. Unfortunately a lot of PDF generators create files with font subsets and assign arbitrary codes to the letters used. pdftohtml can't handle that...
> This is a resend. Since writing it, I found your archives, and I now
> see why you don't allow non-member posting. I assume you'll get through
> all that spam to find the real posts some time in the next couple
> years... in the mean time, I'll post this as a member.
>
> --
> Hi,
>
> Please excuse me if there is an archive for this list; I couldn't find
> one or links to one on http://pdftohtml.sourceforge.net/.
>
> I'm using pdftohtml for the first time, and having looked through the
> man page, and tried many different configurations of command line
> options, I'm getting nothing like the pdf document.
>
> The html is fine, index and links and all, but the content of the pages
> looks like the following (I have a screenshot I could send, if that
> would help):
>
> !
> " #
> $ $%
> ! !
> !!&$"
> $! &$"
> $! &
> $
> " !
> $ $ '
> $$
>
> Every once in a while (almost once per page, but not quite), there is a
> line, and sometimes a paragraph, of text from the pdf. The number of
> pages is correct.
>
> I tried with -enc UTF-8, but it looks like there isn't a switch for
> input encoding, if I felt adventurous enough to play with that.
>
> Anyway, I'm assuming there is something straightforward that I'm
> missing, but I'm not sure what, and I haven't found this discussed.
>
> btw, I'm running Ubuntu 6.06.
>
> --
> Kent Rasmussen
> SIL Eastern Congo Group Linguist
> 020 608593/4/5 x130
> 0733-710235(office)
> 0722-620510(office)
> 0735-539687(Personal) |
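A quick way to check whether a conversion suffers from this problem is to look at how much of the extracted text is made of letters and digits. The Python sketch below is only a rough heuristic. The function names and the 0.5 threshold are assumptions chosen for illustration, and it can only flag the symptom; it cannot recover the text.

```python
def alnum_ratio(text):
    """Fraction of non-whitespace characters that are letters or digits.
    Returns None when the text has no visible characters at all."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return None
    return sum(ch.isalnum() for ch in visible) / len(visible)


def looks_garbled(text, threshold=0.5):
    """True when most visible characters are punctuation or symbols, which is
    what output from a PDF with arbitrarily coded font subsets tends to look like."""
    ratio = alnum_ratio(text)
    return ratio is not None and ratio < threshold
```

Applied page by page to the converted text, looks_garbled would flag lines like the quoted sample of stray punctuation, while an ordinary sentence passes. If most pages are flagged, the PDF probably needs OCR or a different generator rather than different pdftohtml options.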
|
From: Kent R. <ken...@si...> - 2006-10-04 06:55:35
|
This is a resend. Since writing it, I found your archives, and I now see why you don't allow non-member posting. I assume you'll get through all that spam to find the real posts some time in the next couple years... in the meantime, I'll post this as a member.

--
Hi,

Please excuse me if there is an archive for this list; I couldn't find one or links to one on http://pdftohtml.sourceforge.net/.

I'm using pdftohtml for the first time, and having looked through the man page, and tried many different configurations of command line options, I'm getting nothing like the pdf document.

The html is fine, index and links and all, but the content of the pages looks like the following (I have a screenshot I could send, if that would help):

!
" #
$ $%
! !
!!&$"
$! &$"
$! &
$
" !
$ $ '
$$

Every once in a while (almost once per page, but not quite), there is a line, and sometimes a paragraph, of text from the pdf. The number of pages is correct.

I tried with -enc UTF-8, but it looks like there isn't a switch for input encoding, if I felt adventurous enough to play with that.

Anyway, I'm assuming there is something straightforward that I'm missing, but I'm not sure what, and I haven't found this discussed.

btw, I'm running Ubuntu 6.06.

--
Kent Rasmussen
SIL Eastern Congo Group Linguist
020 608593/4/5 x130
0733-710235(office)
0722-620510(office)
0735-539687(Personal) |
|
From: magdy e. <mag...@ya...> - 2006-09-08 14:00:57
|
How can I get Arabic characters (for example in an ISO encoding) into the files pdftohtml produces for use in Greenstone? I see that pdftohtml treats Arabic characters as images. Is this correct? |
|
From: Matthew Yee-K. <li...@ye...> - 2006-08-26 08:13:15
|
Hi there!
I am using pdftohtml to parse the following PDF:
http://www.asia-elearning.net/content/act2005eg/data/txt1.pdf#search=%22SCORM%202004%20handbook%22

I initially had some problems generating the HTML on my Debian unstable box, thus:

Error: Couldn't find cidToUnicode file for the 'Adobe-Japan1' collection
Error: Unknown character collection 'Adobe-Japan1'
Error: Unknown font tag 'C2_0'

I was able to fix these problems by installing the Debian build of the program and some other packages: cmap-adobe-japan1 and xpdf-japanese. My self-built binary still throws the errors, but the Debian-built one works fine. How can I get my self-built version to find these fonts?

The system that I really need to run pdftohtml on is running Red Hat EL 3. I installed the program from a pdftohtml-0.36-1.1.el3.dag rpm, but it had the same problems with the PDF. How can I get the rpm version to find these fonts? I have installed xpdf-japanese, but maybe I need more than this?

Thanks!
matthew |
|
From: Jason <su...@bl...> - 2005-12-01 21:17:23
|
> I have changed it to allow posting from members only. Thank you! |
|
From: Mikhail K. <me...@cs...> - 2005-12-01 19:20:40
|
You are right, this list became ridiculous. I have changed it to allow posting from members only. I personally hate it when lists are closed like that, but I don't think we have a choice here.
> I agree. This list needs cleaning up. I've seen one legit post in the time
> I've been subscribed.
>
>> I've been watching this list a couple of weeks now and all I have seen
>> this far is spam, auto-generated failure messages and utter nonsense like
>> binary like code. I am very excited about OCR and I can think of a couple
>> of places where I could use the technology at work combined with
>> PdfToHtml. But this mailing list is a complete waste of time. I'm not sure
>> if any humans are reading this, but I'll give it a go anyway. If no
>> developing or related content will be part of this mailing list neither
>> will I. I will give this one more week.
>>
>> Regards, Jon. |
|
From: Jason <su...@bl...> - 2005-12-01 16:45:32
|
I agree. This list needs cleaning up. I've seen one legit post in the time I've been subscribed.
> I've been watching this list a couple of weeks now and all I have seen
> this far is spam, auto-generated failure messages and utter nonsense like
> binary like code. I am very excited about OCR and I can think of a couple
> of places where I could use the technology at work combined with
> PdfToHtml. But this mailing list is a complete waste of time. I'm not sure
> if any humans are reading this, but I'll give it a go anyway. If no
> developing or related content will be part of this mailing list neither
> will I. I will give this one more week.
>
> Regards, Jon. |
|