Quote: "I tried to index RTF files with English and Russian texts in them (in table) but DocFetcher shows Russian text incorrectly. Can this be fixed?"
Hello Vitaly,
Can you please post an RTF file with English and Russian text in it, where the problem occurs? This would make it much easier to fix this bug.
You can add files to this page with the "Add a file" link below (you will probably need to create a SourceForge.net account first).
I couldn't find how to add files, so here's a link to the test file: http://www.multiupload.com/GCUFE1YPE1
Inside this zip file you can find the RTF file with English and Russian text in tables, and also a screenshot of how this file is shown on my computer in DocFetcher.
BTW, is it possible to fix the way tables are shown so they look like tables in DocFetcher?
Thank you very much for the file, I'll look into it. I can't promise anything, though, I'll just try to fix it.
I'm not sure... Theoretically, it should be possible, but it will probably take a lot of time to implement this.
Please let me know if you manage to fix it.
On the top right of this page, there is a 'Monitor' button. You can use this to receive e-mail notification whenever something on this page changes. If I manage to fix this problem, I'll post it on this page, so you'll get notified.
I can confirm the problem with Russian RTF. If you use AbiWord, you can see two RTF file types in the 'Save as' dialog: rtf and old rtf. DocFetcher correctly shows RTF saved with AbiWord, but incorrectly shows old RTF saved with AbiWord.
This is readable file: http://rapidshare.com/files/324045798/RTF.rtf.html
This is unreadable file: http://rapidshare.com/files/324046306/Old-RTF.rtf.html
This is my config: http://rapidshare.com/files/324047193/user.properties.html
Sorry, I don't see the "Add file" link...
Sorry, one more thing: my OS is Xubuntu.
Hello,
Bug confirmed. The newer RTF document looks okay, but the old one shows weird characters. (Btw, I checked this with DocFetcher's built-in "Parser Testbox", you can press F11 to open it.)
As I said before, I hope I can fix this, but other features have higher priority right now, so this will take a while.
This is how I attach files on SourceForge.net:
- Log into SourceForge.net
- At the bottom of the page, there's a clickable line "Attached File"
- When I click on it, it expands, showing a clickable "Add a file" line
Out of curiosity: Is it not possible for you to convert your RTF files to other formats, e.g. OpenOffice or Microsoft Word?
Best regards
q:-) <= qforce
I have about 1000 or more "old" RTFs in many folders on my hard drive, so it's not realistic to re-save all of them in doc or odt format.
I have nearly 3000 old RTF files... And I can't re-save them in other formats. And my text editors (W - PolyEdit, X - Ted) save my files in old RTF... And I need help. )
Is anyone here Russian? Help me translate. What exactly gforce... Also... there is a program that indexes everything, although it would have to be stolen: it costs two and a half thousand bucks))) But I have the password to the archives. It just has to be compiled, and I don't know how, there isn't even a script. They say Java is needed...
The new version of DocFetcher can't show old RTF files either... It's a pity, because without such a program I can't delete Windows from my hard drive )
Sorry, I looked into the problem before making the 1.0.1 release, but I couldn't solve it.
The problem is that I couldn't find a way to determine the encoding of the RTF file.
Anyway, RTF is a relatively old file format and there aren't many RTF Java libraries out there. So maybe it's better if you search for a program to convert your RTF files to a newer format. For example, you could try this program: http://www.docfrac.net/
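For readers wondering what "determining the encoding" involves: an RTF file may declare its code page in the header with the \ansicpgN control word, fonts may carry a \fcharsetN hint (204 is the Russian character set), and non-ASCII bytes are commonly written as \'hh hex escapes. The Java sketch below is not DocFetcher's fix and is not taken from any RTF library. It is a minimal illustration using a hypothetical class, RtfCharsetSniffer, and it assumes the file carries one of these hints, falling back to windows-1252 otherwise (that fallback is an assumption). It ignores per-font charset switching and \uN Unicode escapes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of DocFetcher. */
final class RtfCharsetSniffer {

    // \ansicpgN in the header names the ANSI code page of the document.
    private static final Pattern ANSI_CODE_PAGE = Pattern.compile("\\\\ansicpg(\\d+)");
    // \fcharsetN appears in font table entries; 204 is the Russian character set.
    private static final Pattern FONT_CHARSET = Pattern.compile("\\\\fcharset(\\d+)");
    // \'hh encodes a single byte in the document's code page.
    private static final Pattern HEX_ESCAPE = Pattern.compile("\\\\'([0-9a-fA-F]{2})");

    private RtfCharsetSniffer() {}

    /** Guesses the charset from header hints; windows-1252 is an assumed default. */
    static Charset detectCharset(String rtf) {
        Matcher codePage = ANSI_CODE_PAGE.matcher(rtf);
        if (codePage.find()) {
            String name = "windows-" + codePage.group(1);
            if (Charset.isSupported(name)) {
                return Charset.forName(name);
            }
        }
        // No usable \ansicpg: look for a Russian font charset as a weaker hint.
        Matcher fontCharset = FONT_CHARSET.matcher(rtf);
        while (fontCharset.find()) {
            if (fontCharset.group(1).equals("204")) {
                return Charset.forName("windows-1251");
            }
        }
        return Charset.forName("windows-1252");
    }

    /** Replaces runs of \'hh escapes with the text they encode in the given charset. */
    static String decodeHexEscapes(String rtf, Charset charset) {
        StringBuilder out = new StringBuilder();
        ByteArrayOutputStream pending = new ByteArrayOutputStream();
        Matcher m = HEX_ESCAPE.matcher(rtf);
        int pos = 0;
        while (m.find()) {
            if (m.start() > pos) {
                // Literal text between escapes: flush buffered bytes first.
                out.append(new String(pending.toByteArray(), charset));
                pending.reset();
                out.append(rtf, pos, m.start());
            }
            pending.write(Integer.parseInt(m.group(1), 16));
            pos = m.end();
        }
        out.append(new String(pending.toByteArray(), charset));
        out.append(rtf.substring(pos));
        return out.toString();
    }
}
```

For example, RtfCharsetSniffer.decodeHexEscapes("\\'cf\\'f0\\'e8\\'e2\\'e5\\'f2", Charset.forName("windows-1251")) yields the word "Привет". Bytes are buffered across consecutive escapes so that multi-byte code pages would also decode as a unit.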
Docfrac can't work with old RTF either. Now I'm looking for a converter that works... But I want to say to qforce: I know one search program which works with all formats. It's dtSearch - http://www.dtsearch.com
It's proprietary. But I have a password to the archives; there are Windows and Linux versions.
Win - ftp://ftp.dtsearch.com/pub/dtSearchDesktop764.exe pass: DTSEARCH7
*nix - source - ftp://ftp.dtsearch.com/pub/dtSearchLinux764.sfx
glibc - ftp://ftp.dtsearch.com/pub/dtSearchLinux_glibc22_764.sfx pass: DTLINUX
I can't compile it. But maybe this program will help qforce make his DocFetcher better.
Don't worry, this problem is still on my todo list ;-)
DtSearch won't be of much help here because I don't have access to their source code.
However, I've found AbiWord, a word processor which handles all RTF files correctly. I will try to find the relevant code in AbiWord and integrate it into DocFetcher. This might take several months, though...
Thank you!
In the meantime I convert my RTF files to PDF with this program: http://www.softinterface.com/Convert-Doc/Convert-Doc.htm DocFetcher works with PDF very well.
And I will wait for our turn in the todo list. This program doesn't convert from PDF back to RTF correctly, and I don't want to do that with this or any other program, because I need the dates when the files were created... Sorry for my terrible English...
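On the point about losing file dates during conversion: one possible workaround is to copy the timestamps from the original file onto the converted one afterwards. The Java sketch below uses the standard java.nio.file API. The class name TimestampCopier is hypothetical, it is not a feature of DocFetcher or of the converter mentioned above, and the creation time in particular may not be changeable on every file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributeView;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper for illustration only. */
final class TimestampCopier {

    private TimestampCopier() {}

    /** Copies modified, accessed and (where supported) created times to the converted file. */
    static void copyTimestamps(Path original, Path converted) throws IOException {
        BasicFileAttributes source = Files.readAttributes(original, BasicFileAttributes.class);
        BasicFileAttributeView target =
                Files.getFileAttributeView(converted, BasicFileAttributeView.class);
        // Some platforms may leave the creation time unchanged.
        target.setTimes(source.lastModifiedTime(), source.lastAccessTime(), source.creationTime());
    }
}
```

A batch run would call copyTimestamps once per converted file, pairing each PDF with the RTF it came from.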
DocFetcher 1.0.3 can't read old RTF either (((((((((
I know, I know, sorry...
The problem is rather difficult and I don't have time right now. This will take a few months...
This problem is being worked on over here: http://sourceforge.net/tracker/index.php?func=detail&aid=3107897&group_id=150969&atid=779500
Should be fixed in version 1.1 beta 1, which is available here: http://sourceforge.net/projects/docfetcher/files/docfetcher/