From: Budd, S. <s....@ic...> - 2002-06-10 09:04:40
|
Star Office can save directly to good HTML format or to MS Office 2000/XP format. The choice is made within the "Save As" option. You should be able to index directly from the saved document if you choose the "Web page" format.

-----Original Message-----
From: Gilles Detillieux [mailto:gr...@sc...]
Sent: Saturday, June 08, 2002 4:31 AM
To: cba...@eu...
Cc: htd...@li...
Subject: Re: [htdig-dev] openoffice parser

According to EuropeanServers - Christophe BAEGERT:
> we use htdig on word documents, but now we've switched to OpenOffice.org, and
> we haven't any parser. Does it exist or is it planned ?

I haven't heard of or found leads to an OpenOffice.org to HTML document converter. However, the OpenOffice.org web site states that these documents are XML, so it should be pretty easy for someone familiar with basic HTML, and with Perl, awk or sed scripting, to whip up a rudimentary XML to HTML converter specific to these documents, so it could pick out the elements you want and surround them with appropriate HTML tags so htdig indexes them using the word types you want (i.e. titles, meta keywords & description, hyperlinks and their descriptions, plain text). Even simpler, you could probably feed the XML straight into htdig's HTML parser and it would at least index most/all the text as plain text. Not having actually seen any OpenOffice.org documents, though, it's hard for me to speculate on exactly how easy or difficult the task might be.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada) |
|
From: Neal R. <ne...@ri...> - 2002-06-09 19:53:26
|
Hey, is there a method somewhere to remove a document from an index? I thought I found one but can't locate it again. This will hopefully be a method added to the libhtdig API, and is useful in the context of large indexes, where rebuilding a huge index to remove one document is inefficient.

As a note, I will try to submit a patch and set of makefiles to do a native WIN32 port of HtDig & libhtdig this week. The way this will work is that, using a WIN32 system with cygwin and MSVC installed, a separate set of makefiles is used to build a fully native WIN32 set of binaries (no cygwin DLL needed). Cygwin is needed since these makefiles use GNU make instead of Windows make, but MSVC's command-line compiler is used. A ./configure is not necessary, but a setup script will replace/modify the existing db/db_config.h & include/htconfig.h. Any feedback on how you would like this to work better, let me know!

There are free native Windows compilers out there. Borland & Watcom have them for download. Watcom's is Open Source as well: http://www.openwatcom.org

Thanks!

--
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485 |
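The deletion Neal is after can be illustrated in miniature. This is a hypothetical sketch using Python's `dbm` module against a toy key-value file; it is not htdig's actual doc_db record format nor the libhtdig API (which did not exist yet at the time of writing), only the general idea that a Berkeley-DB-style file supports deleting one record in place rather than rebuilding the whole index.

```python
import dbm
import os
import tempfile

# Toy stand-in for an index database. htdig's real doc_db is a Berkeley DB
# file holding binary DocumentRef records -- this layout is purely illustrative.
path = os.path.join(tempfile.mkdtemp(), "toy_index")

with dbm.open(path, "n") as db:
    db["doc1"] = "contents of first document"
    db["doc2"] = "contents of second document"
    db["doc3"] = "contents of third document"

# Removing one document is a single key deletion against the existing file,
# rather than re-crawling everything and rebuilding the database from scratch.
with dbm.open(path, "w") as db:
    del db["doc2"]

with dbm.open(path, "r") as db:
    print(sorted(db.keys()))  # -> [b'doc1', b'doc3']
```

A real implementation would also have to purge the document's words from the word database, which is what makes single-document removal in htdig more involved than this sketch suggests.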
|
From: Geoff H. <ghu...@ws...> - 2002-06-09 17:26:33
|
On Sunday, June 9, 2002, at 03:19 AM, Scott Gifford wrote:
>> * Handle noindex_start & noindex_end as string lists.
> I posted a patch to allow up to 11 noindex_start/noindex_end pairs
> back in March. It's a bit kludgey, but I find it useful.

Yes, but this isn't really what people are looking for. When we say "string lists," we mean something like this:

noindex_start: <style <noindex <htdig-noindex <foo <bar ...

While your patch is fine and I'm sure others get some use out of it too, I don't think we're going to accept it. At least, that's my $0.02.

-Geoff |
|
From: Geoff H. <ghu...@us...> - 2002-06-09 07:14:19
|
STATUS of ht://Dig branch 3-2-x
RELEASES:
3.2.0b4: In progress
3.2.0b3: Released: 22 Feb 2001.
3.2.0b2: Released: 11 Apr 2000.
3.2.0b1: Released: 4 Feb 2000.
SHOWSTOPPERS:
KNOWN BUGS:
* Odd behavior with $(MODIFIED) and scores not working with
wordlist_compress set but work fine without wordlist_compress.
(the date is definitely stored correctly, even with compression on
so this must be some sort of weird htsearch bug)
* Not all htsearch input parameters are handled properly: PR#648. Use a
consistent mapping of input -> config -> template for all inputs where
it makes sense to do so (everything but "config" and "words"?).
* If exact isn't specified in the search_algorithms, $(WORDS) is not set
correctly: PR#650. (The documentation for 3.2.0b1 is updated, but can
we fix this?)
* META descriptions are somehow added to the database as FLAG_TITLE,
not FLAG_DESCRIPTION. (PR#859)
PENDING PATCHES (available but need work):
* Additional support for Win32.
* Memory improvements to htmerge. (Backed out b/c htword API changed.)
* MySQL patches to 3.1.x to be forward-ported and cleaned up.
(Should really only attempt to use SQL for doc_db and related, not word_db)
NEEDED FEATURES:
* Field-restricted searching.
* Return all URLs.
* Handle noindex_start & noindex_end as string lists.
* Handle local_urls through file:// handler, for mime.types support.
* Handle directory redirects in RetrieveLocal.
* Merge with mifluz
TESTING:
* httools programs:
(htload a test file, check a few characteristics, htdump and compare)
* Turn on URL parser test as part of test suite.
* htsearch phrase support tests
* Tests for new config file parser
* Duplicate document detection while indexing
* Major revisions to ExternalParser.cc, including fork/exec instead of popen,
argument handling for parser/converter, allowing binary output from an
external converter.
* ExternalTransport needs testing of changes similar to ExternalParser.
DOCUMENTATION:
* List of supported platforms/compilers is ancient.
* Add thorough documentation on htsearch restrict/exclude behavior
(including '|' and regex).
* Document all of htsearch's mappings of input parameters to config attributes
to template variables. (Relates to PR#648.) Also make sure these config
attributes are all documented in defaults.cc, even if they're only set by
input parameters and never in the config file.
* Split attrs.html into categories for faster loading.
* require.html is not updated to list new features and disk space
requirements of 3.2.x (e.g. phrase searching, regex matching,
external parsers and transport methods, database compression.)
* TODO.html has not been updated for current TODO list and completions.
OTHER ISSUES:
* Can htsearch actually search while an index is being created?
(Does Loic's new database code make this work?)
* The code needs a security audit, esp. htsearch
* URL.cc tries to parse malformed URLs (which causes further problems)
(It should probably just set everything to empty) This relates to
PR#348.
|
|
From: Gilles D. <gr...@sc...> - 2002-06-08 03:30:47
|
According to EuropeanServers - Christophe BAEGERT:
> we use htdig on word documents, but now we've switched to OpenOffice.org, and
> we haven't any parser. Does it exist or is it planned ?

I haven't heard of or found leads to an OpenOffice.org to HTML document converter. However, the OpenOffice.org web site states that these documents are XML, so it should be pretty easy for someone familiar with basic HTML, and with Perl, awk or sed scripting, to whip up a rudimentary XML to HTML converter specific to these documents, so it could pick out the elements you want and surround them with appropriate HTML tags so htdig indexes them using the word types you want (i.e. titles, meta keywords & description, hyperlinks and their descriptions, plain text). Even simpler, you could probably feed the XML straight into htdig's HTML parser and it would at least index most/all the text as plain text. Not having actually seen any OpenOffice.org documents, though, it's hard for me to speculate on exactly how easy or difficult the task might be.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada) |
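The rudimentary converter Gilles describes could be sketched roughly like this. The element names and namespace URIs below are invented stand-ins for OpenOffice.org's actual XML vocabulary (which, as the message notes, neither poster had examined), and `oo_xml_to_html` is a hypothetical name:

```python
import xml.etree.ElementTree as ET

def oo_xml_to_html(xml_text):
    """Very rough XML-to-HTML converter sketch for indexing purposes.

    Picks out heading and paragraph elements and wraps their text in
    HTML tags so an indexer like htdig can weight titles and body text
    differently. The tag names matched here are assumptions.
    """
    root = ET.fromstring(xml_text)
    parts = ["<html><body>"]
    for elem in root.iter():
        tag = elem.tag.split("}")[-1]      # drop any XML namespace prefix
        text = (elem.text or "").strip()
        if not text:
            continue
        if tag == "h":                      # heading -> treat as a title
            parts.append("<h1>%s</h1>" % text)
        elif tag == "p":                    # paragraph -> plain body text
            parts.append("<p>%s</p>" % text)
    parts.append("</body></html>")
    return "\n".join(parts)

# Toy document loosely mimicking an XML word-processor format.
doc = """<office:document xmlns:office="urn:example-oo" xmlns:text="urn:example-oo-text">
  <office:body>
    <text:h>Quarterly Report</text:h>
    <text:p>Sales were up this quarter.</text:p>
  </office:body>
</office:document>"""

print(oo_xml_to_html(doc))
```

A real converter would also need to handle the ZIP packaging of OpenOffice.org files and emit meta keywords/description tags, but the skeleton above matches the approach the message proposes.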
|
From: Martin V. <mv...@PD...> - 2002-06-07 05:40:53
|
Geoff Hutchison wrote:
> > *what are the webserver requirements?
>
> Hardware? Software? For indexing, ht://Dig can index any webserver that
> understands HTTP (i.e. all of them), though there have been reports of
> strange quirks with Lotus Notes webservers. For running results, you
> simply need the htsearch CGI and a CGI-webserver (i.e. just about
> everything). UNIX-based servers are preferred, but there are
> users who run ht://Dig on Windows as well--though it's flakier.

There also is a VMS port of ht://Dig 3.1.6 (by yours truly) which I know is run by some DEC^3^HCompaq^6^HHP Customer Support Center to index the VMS documentation and source listings. The overall feedback I got was very positive, although some SWISH-E fan criticized the slowness of indexing - well, ht://Dig offers a lot more than SWISH, or does it? It works perfectly under the Apache port, and (using GET queries [1]) with Purveyor (an old but rock-solid commercial web server).

cu,
Martin

[1] no POST because Purveyor doesn't support stdin

--
So long, and thanks       | Martin Vorlaender  | VMS & WNT programmer
for all the books...      | work: mv...@pd...
In Memoriam Douglas Adams | http://www.pdv-systeme.de/users/martinv/
1952-2001                 | home: ma...@ra... |
|
From: Geoff H. <ghu...@ws...> - 2002-06-06 20:49:53
|
> *what is the maximum indexing speed?
> *what's the peak query rate?

What's your hardware? How large are the documents that you're indexing? How fast is your network, or will you be indexing the local server rather than across the network? For queries, will you be running other services on the server while queries are performed? How many documents are likely to be returned? Are you talking about queries on a single server or a server farm? No offense, but I can't give you any reasonable number here. Suffice to say, ht://Dig is quite fast and well within the realm of commercial products. Personally, I'd be wary of someone giving you an actual number from your query alone.

> *what are the rough disk space requirements for the software (not the
> index, I mean for gcc/g++, ht://dig, and any other extra software
> downloads I may need)?

For the software? Depends on what sort of server you're running. Most UNIX servers have gcc/g++ already installed. The size of the ht://Dig binaries varies a bit by platform, but is probably in the realm of 2-3MB.

> *what are the webserver requirements?

Hardware? Software? For indexing, ht://Dig can index any webserver that understands HTTP (i.e. all of them), though there have been reports of strange quirks with Lotus Notes webservers. For running results, you simply need the htsearch CGI and a CGI-webserver (i.e. just about everything). UNIX-based servers are preferred, but there are users who run ht://Dig on Windows as well--though it's flakier.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/ |
|
From: <fi...@de...> - 2002-06-06 17:38:11
|
I am rather embarrassed to report that my email was not functioning, and the logs show that you emailed me with the data I requested earlier in the week. I would like to thank you and apologize profusely. Please be so good as to send another copy of the following data:

*what is the maximum indexing speed?
*what's the peak query rate?
*what are the rough disk space requirements for the software (not the index, I mean for gcc/g++, ht://dig, and any other extra software downloads I may need)?
*what are the webserver requirements?

Thank you again very much for your time and help, it is greatly appreciated.

Andy Fischer
Fi...@de...
Maxim Incorporated Products
120 San Gabriel Drive
Sunnyvale, CA, 94086 |
|
From: <fi...@de...> - 2002-06-05 21:16:59
|
I'm doing research on search engines for my company website, and I have a few questions regarding ht://dig. I cannot complete my report without these bits of data, so I would greatly appreciate it if anyone could answer them quickly.

*what is the maximum indexing speed?
*what's the peak query rate?
*what are the rough disk space requirements for the software (not the index, I mean for gcc/g++, ht://dig, and any other extra software downloads I may need)?
*what are the webserver requirements?

Thank you very much for your help.

Andy Fischer
Fi...@de...
Maxim Incorporated Products
120 San Gabriel Drive
Sunnyvale, CA, 94086 |
|
From: Deal, D. <DD...@aa...> - 2002-06-05 17:29:43
|
Dear htdig devs, Currently, AARP uses ht://dig on its Web site. According to our lead tech guy, htdig does NOT allow for phrase searches without using the Boolean search - for example, placing quotations around "EAP alive in Congress" and finding that phrase in articles on the AARP web site. However, some of the sites that you have listed as htdig users DO allow phrase searches. How do they do it? How could we put phrase searches on the AARP web site without using the Boolean search? Please help, David Deal 202-434-3479 |
|
From: Gilles D. <gr...@sc...> - 2002-06-04 21:50:55
|
According to Paul Lyon:
> On Tue, 4 Jun 2002, Gilles Detillieux wrote:
> > According to Paul Lyon:
> > > Folks,
> > >
> > > I recently reported a bug about getting an error message from
> > > htsearch about the file size of the words.db file not being
> > > a multiple of the page size. This turns out to have been caused by a
> > > version mismatch between htsearch and htmerge, in turn caused by my ...
> > Did these happen to be bug numbers 551984 and 551990? If so, I'll annotate
> > and close them.
> Correct. I didn't give the bug numbers, partly as I had forgotten them,
> and partly because the bug reports had disappeared from the sourceforge
> bug list (otherwise I would have added something to same.)

By default, the SourceForge bug tracker only displays Open bug reports, although you can select other report statuses. These reports are frequently marked Pending or Closed after responding to them, unless there are compelling reasons to leave them open. I had closed these already, when I thought I had given the appropriate resolution to them. When you responded to my response, you left it closed (which I think is all you can do unless you have a SourceForge account) and I never got around to reopening it. So, I'll just leave it closed but add the note.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada) |
|
From: Gilles D. <gr...@sc...> - 2002-06-04 20:48:44
|
According to Paul Lyon:
> Folks,
>
> I recently reported a bug about getting an error message from
> htsearch about the file size of the words.db file not being
> a multiple of the page size. This turns out to have been caused by a
> version mismatch between htsearch and htmerge, in turn caused by my
> failing to specify the "prefix" properly (to "/usr/local" instead of the
> default "/opt/www") when running configure. I have recently built the
> latest 3.2.0b4 snapshot with --prefix=/usr/local to configure, and then
> rebuilt two search databases. Now htsearch seems to work fine with
> these.
>
> Sorry to file a bogus bug report. Please disregard same.
>
> Ciao,
>
> Paul

Did these happen to be bug numbers 551984 and 551990? If so, I'll annotate and close them.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada) |
|
From: Paul L. <pd...@la...> - 2002-05-31 15:47:46
|
Folks, I recently reported a bug about getting an error message from htsearch about the file size of the words.db file not being a multiple of the page size. This turns out to have been caused by a version mismatch between htsearch and htmerge, in turn caused by my failing to specify the "prefix" properly (to "/usr/local" instead of the default "/opt/www") when running configure. I have recently built the latest 3.2.0b4 snapshot with --prefix=/usr/local to configure, and then rebuilt two search databases. Now htsearch seems to work fine with these. Sorry to file a bogus bug report. Please disregard same. Ciao, Paul -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Paul Lyon | "Without true justice Liberal Arts Computer Lab | there can be no peace." University of Texas at Austin | Lucretia Coffin Mott email: pd...@la... | 'phone: 512-232-7272 *** fax: 512-471-1061 | =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= |
|
From: EuropeanServers - C. B. <cba...@eu...> - 2002-05-30 11:47:16
|
Hi, we use htdig on word documents, but now we've switched to OpenOffice.org, and we haven't any parser. Does it exist or is it planned ? Regards, -- Christophe BAEGERT cba...@eu... >>>>>>>>>>>>> http://www.europeanservers.net <<<<<<<<<<<<< --------------- Ultra fast internet servers -------------- |
|
From: Torsten N. <tn...@in...> - 2002-05-30 10:52:56
|
Geoff Hutchison wrote:
>
> On Tue, 28 May 2002, Elaine Fortin wrote:
>
> > We want to store faxes and be able to search them with htdig.
> ...
> > In order to be able to search the content, would we have to run them
> > through an OCR program, or is there something else that can translate them?
>
> You'd have to have some sort of OCR in there.
>
> A fax TIFF file is pure graphic--there's very little text content. (TIFF
> files in general can have some useful text info, but I think you're
> looking for the text in the fax, not text that the fax program may or may
> not store in the TIFF.)

OCR software programs need to be "trained" to successfully recognize any textual contents in a graphic. Textual graphics need to be properly aligned in order for the OCR software to successfully recognize the text content as such.

That means: the OCR program needs to know about the font used in the graphics files *plus* there should be little (less than 3°) alignment offset, or else any (even the best commercially available) OCR software program will produce nearly completely unreadable output.

In the case of facsimiles to be indexed by ht://Dig via translating graphics to text content with any given OCR software, one has to take into account that

(a) facsimiles are in most cases *not* correctly enough aligned to be
    analyzed by an OCR program (hand-faxed sheets will normally have
    offsets of 3°+)
(b) facsimiles cannot be controlled with regards to character-based
    training of the OCR software (they will especially never be sent
    using specially designed OCR fonts)
(c) facsimiles (especially hand-transmitted ones) will in many cases
    contain valuable information added in handwriting
(d) facsimiles will contain "useless" information that cannot be skipped
    by text-indexing software like ht://Dig (since there is no way of
    inserting the respective control statements for the analyzing
    software)

All this makes facsimiles (and most scanned texts) nearly unfit for automatic processing with OCR and indexing programs.

cheers,
Torsten

--
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14            Tel: +49-4101-403605
D-25474 Ellerbek            Fax: +49-4101-403606
E-Mail: in...@in...         Internet: http://www.inwise.de
|
|
From: Budd, S. <s....@ic...> - 2002-05-30 09:51:24
|
Hello. Question: what is the cause of the slowing down of a dig? When the dig starts I get more than 200 pages/minute and nearly 90% CPU usage, but near the end I am seeing 20 pages/minute and at most 10% CPU usage. If I fetch the pages from the site being dug at the end, the page comes in very fast (100Mb line), so access to the system is OK. I have set the db cache to 1.3 GByte but htdig seems to only use about 260 MByte. I am putting the databases into /tmp, i.e. into memory; still plenty of space in memory.

du -k /tmp
8 /tmp/.pcmcia
8 /tmp/.X11-unix
8 /tmp/.X11-pipe
8 /tmp/root
8 /tmp/kshroot
8 /tmp/ppp
1351528 /tmp/college
1351624 /tmp

Here is a top:

last pid: 3903; load averages: 0.05, 0.06, 0.07   10:43:05
45 processes: 44 sleeping, 1 on cpu
CPU states: 99.2% idle, 0.2% user, 0.6% kernel, 0.0% iowait, 0.0% swap
Memory: 2048M real, 351M free, 1613M swap in use, 2029M swap free
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
3607 ppp 1 18 4 265M 263M sleep 762:28 5.17% htdig
3900 ppp 1 58 0 2584K 1704K cpu 0:05 0.15% top
306 root 12 49 0 3472K 2456K sleep 1:54 0.00% mibiisa

A -v log:
...............
185822:170323:11:http://www.cc.ic.ac.uk/college/onlinedocs/SASOnlineDocV8/sasdoc/sashtml/gref/zeration.htm: ******* size = 3599
185823:172065:11:http://www.cc.ic.ac.uk/college/onlinedocs/SASOnlineDocV8/sasdoc/sashtml/accdb/z1258182.htm: ******* size = 3228
185824:172066:11:http://www.cc.ic.ac.uk/college/onlinedocs/SASOnlineDocV8/sasdoc/sashtml/accdb/z-dbnull.htm: ******* size = 4636 |
|
From: Gilles D. <gr...@sc...> - 2002-05-29 20:53:52
|
According to Geoff Hutchison:
> On 29 May 2002, Chris Albone wrote:
> > I'm using a somewhat old snapshot of 3.2b4 (from november last yr
> > actually).
>
> I'd obviously suggest grabbing a new snapshot, for a number of reasons.

Well, none of those reasons relate to the problem reported, as there haven't been any htsearch changes since November. They do affect htdig reliability, though.

> > We're having hassles with getting htsearch to use the
> > wrapper.html file. We've added the relevant entry into the config file,
>
> If you read the documentation:
> <http://www.htdig.org/attrs.html#search_results_wrapper>
>
> you'll see that there are a few things to check which you haven't
> mentioned. Is the filename readable/accessible by the webserver user? Does
> it have the $(HTSEARCH_RESULTS) variable?

That part is critical. If the wrapper file doesn't have this pseudo-variable in it, or if the file isn't readable by the user ID under which htsearch runs, then htsearch will fall back on the search_results_header and search_results_footer.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada) |
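A minimal wrapper file along the lines Gilles describes might look like the sketch below. The only thing taken from the thread is that htsearch substitutes its rendered result list for the $(HTSEARCH_RESULTS) pseudo-variable; all surrounding markup here is invented for illustration:

```html
<html>
<head><title>Search results</title></head>
<body>
<!-- htsearch replaces the pseudo-variable below with the result list.
     If it is missing, or the file is unreadable by the htsearch user,
     the wrapper is ignored and the header/footer templates are used. -->
$(HTSEARCH_RESULTS)
</body>
</html>
```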
|
From: Geoff H. <ghu...@ws...> - 2002-05-29 20:10:27
|
On Wed, 29 May 2002, RAHARD Matthieu wrote: > It works fine. Thanks for this excellent software. Thanks for the report! > For your information, I use the version below: > ht://dig: 3.1.6 > AIX 4.3.3 ML09 > gcc/g++: 2.9 I'm not quite sure what you mean by gcc "2.9" since there is no such version. There's 2.8.x and 2.95.x and the various egcs 2.91, 2.92, etc. versions. Could you give us the result of "gcc -v"? Thanks, -- -Geoff Hutchison Williams Students Online http://wso.williams.edu/ |
|
From: Geoff H. <ghu...@ws...> - 2002-05-29 20:08:52
|
On 29 May 2002, Chris Albone wrote: > I'm using a somewhat old snapshot of 3.2b4 (from november last yr > actually). I'd obviously suggest grabbing a new snapshot, for a number of reasons. > We're having hassles with getting htsearch to use the > wrapper.html file. We've added the relevant entry into the config file, If you read the documentation: <http://www.htdig.org/attrs.html#search_results_wrapper> you'll see that there are a few things to check which you haven't mentioned. Is the filename readable/accessible by the webserver user? Does it have the $(HTSEARCH_RESULTS) variable? -- -Geoff Hutchison Williams Students Online http://wso.williams.edu/ |
|
From: RAHARD M. <mr...@br...> - 2002-05-29 11:25:42
|
Hello, I have installed ht://Dig on an IBM RS/6000 under AIX 4.3.3 running an Apache server. It was compiled using gcc/g++. I did not have to change any lines in the code; it compiled and it works. It works fine. Thanks for this excellent software. For your information, I use the versions below:

ht://dig: 3.1.6
AIX 4.3.3 ML09
gcc/g++: 2.9 |
|
From: Chris A. <yv...@gm...> - 2002-05-29 09:54:59
|
Hi All, I'm using a somewhat old snapshot of 3.2b4 (from november last yr actually). We're having hassles with getting htsearch to use the wrapper.html file. We've added the relevant entry into the config file, but we seem to have no joy. Is this a known problem? Should I upgrade? (Going back will kill the phrase searching that i need..) peace chris -- Chris Albone wf: +61 2 9351 7774 Systems Manager, mf: 0414 597 571 Faculty of Medicine fx: +61 2 9351 7778 University of Sydney #include <witty_comment.h> |
|
From: Geoff H. <ghu...@ws...> - 2002-05-28 22:23:33
|
On Tue, 28 May 2002, Elaine Fortin wrote: > We want to store faxes and be able to search them with htdig. ... > In order to be able to search the content, would we have to run them > through an OCR program, or is there something else that can translate them? You'd have to have some sort of OCR in there. A fax TIFF file is pure graphic--there's very little text content. (TIFF files in general can have some useful text info, but I think you're looking for the text in the fax, not text that the fax program may or may not store in the TIFF.) -- -Geoff Hutchison Williams Students Online http://wso.williams.edu/ |
|
From: Elaine F. <ela...@ha...> - 2002-05-28 18:38:17
|
Hi, We want to store faxes and be able to search them with htdig. We are investigating using a fax2email service for this, which stores them as .tif files, and emails them to us as attachments. In order to be able to search the content, would we have to run them through an OCR program, or is there something else that can translate them? Thanks, Elaine Fortin |
|
From: Sankaranarayanan M.G. <san...@ce...> - 2002-05-27 04:14:37
|
Hi All, Greetings. I am wondering whether anyone has tried integrating Optical Character Recognition with htdig. I tried a couple of Optical Character Recognition programs but I couldn't get satisfactory results. If anyone has used any OCRs before and found them to be satisfactory, please let me know about it. Thanks, Sankaranarayanan M.G. |
|
From: Geoff H. <ghu...@ws...> - 2002-05-26 15:38:15
|
On Sat, 25 May 2002, David Gibbs wrote: > I just updated cvs and ran 'make install', and got the following error ... > Ideas? Sorry, please disregard. Moderator error. :-( -- -Geoff Hutchison Williams Students Online http://wso.williams.edu/ |