From: Lachlan A. <lh...@us...> - 2003-02-02 12:32:36
Greetings Andy/all,

It seems this relates to an Undocumented Feature(tm). Words beginning with 'exact:' or 'hidden:' are treated differently. As a side effect, it introduces the bug of not splitting words at colons. Could someone who knows what exact: and hidden: mean please explain what they are for (and/or document them officially)? I don't want to break anything while trying to fix the bug.

On a related note, does anyone have any ideas for the syntax of "field restricted" searches? I was thinking of something like "title:word" to search for "word" in the title field, or "heading:word" etc. Was the plan to allow user-defined fields in meta-data to be searched? That would be hard!

Cheers,
Lachlan

On Friday 31 January 2003 07:49, And...@wi... wrote:
> ...found that it would not find on double colons...
>     getopt::std
> retrieved nothing, as did:
>     "getopt::std"
> while:
>     getopt std
> was the same as:
>     "getopt std"
>     "getopt :: std"
>     "getopt :: std ::"
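For what it's worth, a minimal sketch of the "field:word" splitting proposed above might look like the following. The rule that a doubled colon is not a field separator is my assumption, added so that terms like `getopt::std` from Andy's report would keep working; the field names themselves are only illustrative, not a committed design.

```cpp
#include <string>
#include <utility>

// Split "title:word" into ("title", "word"); leave terms without a
// field prefix -- or with "::" as in getopt::std -- untouched, returning
// an empty field name in that case.
static std::pair<std::string, std::string>
splitFieldTerm(const std::string &term)
{
    std::string::size_type colon = term.find(':');
    // No colon, leading colon, trailing colon, or "::" => no field prefix.
    if (colon == std::string::npos || colon == 0 ||
        colon + 1 >= term.size() || term[colon + 1] == ':')
        return std::make_pair(std::string(), term);
    return std::make_pair(term.substr(0, colon), term.substr(colon + 1));
}
```

A real implementation would also have to decide how this interacts with valid_punctuation, which is exactly the bug under discussion.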
From: Lachlan A. <lh...@us...> - 2003-02-02 08:50:11
Greetings all,

Is there a list of tasks which *must* be completed before the release of 3.2.0b4/5? If the "STATUS" file is that list, can I suggest that some things be classed as "not essential" (at least defaults.xml, and preferably most of it)? If STATUS isn't the list, could one be drawn up?

Sorry for sounding impatient, and I know that everyone is busy, but it is two months after the initial target release date, and I don't feel much closer...

Thanks!
Lachlan
From: Lachlan A. <lh...@us...> - 2003-02-02 08:34:46
On Saturday 01 February 2003 11:25, Neal Richter wrote:
> test it with
> wordlist_compress_zlib & wordlist_compress disabled and re-run
> htpurge. I have my
> doubts about whether this is caused by zlib or is a logical error
> independent of page-compression scheme.

The test dig I ran didn't give the error, but the document set was marginally different. I'll get back to you once I have concrete results.

> the SleepyCat book on BDB says. (New Riders ISBN 0735710643)
> I'd be happy to send you a copy of the book if you really want to
> dig into BDB. I could use some help there in the future increasing
> index efficiency ;-)

Thanks for the offer, but that sounds like more of a time commitment than I can make at the moment :( If I can find some more time to work on the database side of things, I'll let you know.

Thanks again,
Lachlan
From: Lachlan A. <lac...@ip...> - 2003-02-02 08:19:52
On Sunday 02 February 2003 11:35, Neal Richter wrote:
> very few of our
> inline functions end up getting inlined. I've done a fair bit of
> work with gprof and looking at what functions are actually inlined
> in the assembly. At least with -O2 I'm not seeing some of them
> making it.

What version of gcc are you using? I've just compiled using 3.1, and it seems to be inlining. For example, the copy constructor gives the output below, which clearly shows line 229 of htString.h being inlined.

        .align 2
        .p2align 4,,15
.globl _ZN6StringC2ERKS_
        .type   _ZN6StringC2ERKS_,@function
_ZN6StringC2ERKS_:
.LFB7:
        .loc 1 67 0
        pushl   %ebp
.LCFI34:
        movl    %esp, %ebp
.LCFI35:
        pushl   %ebx
.LCFI36:
        pushl   %edx
        call    .L31
.L31:
        popl    %ebx
        addl    $_GLOBAL_OFFSET_TABLE_+[.-.L31], %ebx
        movl    8(%ebp), %edx
        movl    12(%ebp), %ecx
.LBB10:
        movl    _ZTV6String@GOT(%ebx), %eax
        addl    $8, %eax
        .loc 1 68 0
        movl    $0, 4(%edx)
        .loc 1 67 0
        movl    %eax, (%edx)
        .file 38 "htString.h"
        .loc 38 229 0
.LBB11:
.LBB12:
        movl    4(%ecx), %eax
        .loc 1 68 0
.LBE12:
.LBE11:
        movl    $0, 8(%edx)
        .loc 1 69 0
        movl    $0, 12(%edx)
        .loc 38 229 0
        testl   %eax, %eax
        jle     .L25
        pushl   %eax
        pushl   %eax
        movl    12(%ecx), %eax
        pushl   %eax
        pushl   %edx
.LCFI37:
        call    _ZN6String4copyEPKcii@PLT
        addl    $16, %esp
        .loc 1 73 0
.L25:
.LBE10:
        movl    -4(%ebp), %ebx
        movl    %ebp, %esp
        popl    %ebp
        ret
.LFE7:
.Lfe7:
        .size   _ZN6StringC2ERKS_,.Lfe7-_ZN6StringC2ERKS_

This isn't from DDD, but using the command

        /bin/sh ../libtool --mode=compile g++ -DHAVE_CONFIG_H -I. -I. -I../include -DDEFAULT_CONFIG_FILE=\"/home/lha/devel/htdig/install//conf/htdig.conf\" -I../include -I../htlib -I../htnet -I../htcommon -I../htword -I../db -I../db -I/usr/include -g -O2 -Wall -fno-rtti -fno-exceptions -S String.cc

which is what my makefile generates, but with '-c' replaced by '-S'.

> My next set of changes are going to be enhancements to WordDB, &
> WordKey.

These sound like good improvements. However, when I do a dig of my local computer, it seems rather disk bound, with CPU usage hovering around 50%, of which a lot is system time. I haven't yet checked how much of this is reading the input docs and how much is database access, but my guess is that implementing your more compact database format will give more bang for your development buck. It has the added advantages that it uses less disk space, and that it speeds up searching (assuming that is also disk bound). However, I am really keen to get 3.2.0b5 out and I personally won't be working on optimisations until then. Thoughts?

> I need to get more familiar with how gcc does some of these
> optimizations.. but at the same time I'm inclined not to rely on it
> too much.

Agreed. We should eliminate unnecessary function calls, but *some* will still be needed, and we might as well get those inlined.

Cheers,
Lachlan
From: Geoff H. <ghu...@us...> - 2003-02-02 08:16:24
STATUS of ht://Dig branch 3-2-x
RELEASES:
3.2.0b5: Next release, tentatively 1 Feb 2003.
3.2.0b4: "In progress" -- snapshots called "3.2.0b4" until prerelease.
3.2.0b3: Released: 22 Feb 2001.
3.2.0b2: Released: 11 Apr 2000.
3.2.0b1: Released: 4 Feb 2000.
(Please note that everything added here should have a tracker PR# so
we can be sure they're fixed. Geoff is currently trying to add PR#s for
what's currently here.)
SHOWSTOPPERS:
* Mifluz database errors are a severe problem (PR#428295)
-- Does Neal's new zlib patch solve this for now?
KNOWN BUGS:
* Odd behavior with $(MODIFIED) and scores not working with
wordlist_compress set but work fine without wordlist_compress.
(the date is definitely stored correctly, even with compression on
so this must be some sort of weird htsearch bug) PR#618737.
* META descriptions are somehow added to the database as FLAG_TITLE,
not FLAG_DESCRIPTION. (PR#618738)
PENDING PATCHES (available but need work):
* Additional support for Win32.
* Memory improvements to htmerge. (Backed out b/c htword API changed.)
* Mifluz merge.
NEEDED FEATURES:
* Field-restricted searching. (e.g. PR#460833)
* Quim's new htsearch/qtest query parser framework.
* File/Database locking. PR#405764.
TESTING:
* httools programs:
(htload a test file, check a few characteristics, htdump and compare)
* Tests for new config file parser
* Duplicate document detection while indexing
* Major revisions to ExternalParser.cc, including fork/exec instead of popen,
argument handling for parser/converter, allowing binary output from an
external converter.
* ExternalTransport needs testing of changes similar to ExternalParser.
DOCUMENTATION:
* List of supported platforms/compilers is ancient. (PR#405279)
* Add thorough documentation on htsearch restrict/exclude behavior
(including '|' and regex).
* Document all of htsearch's mappings of input parameters to config attributes
to template variables. (Relates to PR#405278.)
Should we make sure these config attributes are all documented in
defaults.cc, even if they're only set by input parameters and never
in the config file?
* Split attrs.html into categories for faster loading.
* Turn defaults.cc into an XML file for generating documentation and
defaults.cc.
* require.html is not updated to list new features and disk space
requirements of 3.2.x (e.g. regex matching, database compression.)
PRs# 405280 #405281.
* TODO.html has not been updated for current TODO list and
completions.
* Htfuzzy could use more documentation on what each fuzzy algorithm
does. PR#405714.
* Document the list of all installed files and default
locations. PR#405715.
OTHER ISSUES:
* Can htsearch actually search while an index is being created?
* The code needs a security audit, esp. htsearch. PR#405765.
From: Neal R. <ne...@ri...> - 2003-02-02 00:40:08
> Greetings Neal,
>
> That is a very sensible change, but I have two questions.
> First, in the change
>     - if (s.length() > 0)
>     - copy(s.Data, s.length(), s.length());
>     + if (slen != 0)
>     + copy(s.Data, slen, slen);
> was there a reason to replace the '>' by '!='? It is defensive
> programming not to copy negative-length strings.

Hmm. I'll change that back. I definitely meant to keep it as ">". I was cut and pasting code from an older snapshot where I made the changes. Good spot.

> Second, do you know why gprof sees the calls to length() at all?
> Shouldn't an inline function be optimised out? If so, the
> optimisation becomes removing a few pointer dereferences (which
> should also be optimised out by a sensible compiler). That said, I
> agree that it is tidier not to rely on an optimising compiler.

At least according to my examinations so far, very few of our inline functions end up getting inlined. I've done a fair bit of work with gprof and looking at what functions are actually inlined in the assembly. At least with -O2 I'm not seeing some of them making it.

I'll look up some more information on inlining. It's only a suggestion to the compiler, and we need to be checking the assembly (DDD is a nice GUI debugger that allows assembly viewing during debugging) to see when it actually works.

My next set of changes are going to be enhancements to WordDB & WordKey. The inlines in WordDB aren't making it, and I found a way to eliminate many function calls in WordKey. There are also a bunch of places where functions are called in the comparison part of for loops. We need to eliminate these.

I need to get more familiar with how gcc does some of these optimizations.. but at the same time I'm inclined not to rely on it too much. FYI there is a nice article on how Intel's new Linux compiler does some of its optimizations in the new issue of Linux Journal.

Question: is it me or are the last few issues of Dr. Dobb's and Linux Journal looking very 'thin'?

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
From: Gabriele B. <an...@ti...> - 2003-02-01 13:16:44
Ciao guys,
I'm sending you a small patch for those of you who want to try the cookies
import facility. Please test it and let me know if I can commit it in the
next week.
Also, let me know if you want me to backport the code to the 3.1.x's
branch by submitting another patch.
Then, as always, Gilles (I know I am terrible but you should know me
already!) please find me a suitable description for the
'cookies_input_file' attribute to be put in the defaults.cc file (and
defaults.xml too!).
Ciao and thanks
-Gabriele
From: Gabriele B. <an...@ti...> - 2003-02-01 08:32:53
Ciao Neal and everyone!

>Hmm. XSLT is usable from Python. Python is very easy to learn.

Yep, you just preceded me. I was thinking of Python as well (even though I am a newbie with it). I have done it in PHP as well, but we need an XSLT engine (I think I used Sablotron's one). Is it easy in Python? Does it have XSLT parser capabilities?

Thank you
-Gabriele

--
Gabriele Bartolini - Web Programmer - ht://Dig & IWA Member - ht://Check maintainer
Current Location: Prato, Tuscany, Italia
an...@ti... | http://www.prato.linux.it/~gbartolini | ICQ#129221447
From: Lachlan A. <lac...@ip...> - 2003-02-01 04:59:46
Greetings Neal,

That is a very sensible change, but I have two questions.
First, in the change

    - if (s.length() > 0)
    - copy(s.Data, s.length(), s.length());
    + if (slen != 0)
    + copy(s.Data, slen, slen);

was there a reason to replace the '>' by '!='? It is defensive programming not to copy negative-length strings.

Second, do you know why gprof sees the calls to length() at all? Shouldn't an inline function be optimised out? If so, the optimisation becomes removing a few pointer dereferences (which should also be optimised out by a sensible compiler). That said, I agree that it is tidier not to rely on an optimising compiler.

Cheers,
Lachlan

On Saturday 01 February 2003 12:20, Neal Richter wrote:
> I've posted a patch to String.cc with some simple changes which make
> a huge difference in efficiency.
> Please take a look and tell me if you object to any change.
> I'll commit it next week if no one sees a problem.
From: onofrio <on...@sp...> - 2003-02-01 03:03:29
http://htdig.spearservice.it
Italy

Thanks,
Onofrio
From: Neal R. <ne...@ri...> - 2003-02-01 01:25:17
I've posted a patch to String.cc with some simple changes which make a huge difference in efficiency.

    http://ai.rightnow.com/htdig/String.cc.patch

Basically it involves trivial changes to eliminate redundant calls to s.length() in the same function. I used gprof to find this, and the change makes a massive difference in the number of total calls to String.length() during a large spidering run.

Please take a look and tell me if you object to any change. You'll need to download the patch and apply it to your htlib/String.cc

I'll commit it next week if no one sees a problem. Thanks.

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
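The shape of the change being discussed can be sketched in isolation. The `Str` struct below is a stand-in, not ht://Dig's real String class; the point is only the pattern of hoisting a repeated `length()` call into a local, which matters most when the compiler declines to inline the accessor.

```cpp
// Stand-in for ht://Dig's String class -- not the real interface, just
// enough to illustrate the pattern from the patch.
struct Str {
    const char *data;
    int len;
    int calls;                       // counts accessor invocations
    int length() { ++calls; return len; }
};

// Before the patch (shape only): s.length() evaluated on every use.
static int copiedBytesBefore(Str &s)
{
    if (s.length() > 0)
        return s.length();           // second call for the same value
    return 0;
}

// After the patch: the length is fetched once and reused.
static int copiedBytesAfter(Str &s)
{
    int slen = s.length();           // single call, result cached
    if (slen > 0)                    // '>' kept: guards negative lengths
        return slen;
    return 0;
}
```

If `length()` is reliably inlined the two versions compile to the same code; the patch pays off precisely because, as the gprof data showed, it often isn't.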
From: Neal R. <ne...@ri...> - 2003-02-01 00:31:00
Yep it is enabled by default.. If your error is repeatable, can you test it with wordlist_compress_zlib & wordlist_compress disabled and re-run htpurge? I'd like to see if the error still appears. I have my doubts about whether this is caused by zlib or is a logical error independent of page-compression scheme. Keep me posted!

BerkeleyDB integrity check might be a nice feature.. I'll see what the SleepyCat book on BDB says. (New Riders ISBN 0735710643) I'd be happy to send you a copy of the book if you really want to dig into BDB. I could use some help there in the future increasing index efficiency ;-)

Thanks!

On Thu, 30 Jan 2003, Lachlan Andrew wrote:
> Greetings Neal,
>
> Isn't wordlist_compress_zlib turned on by default? I've had a few
> minor problems recently without changing it. The one I've got log
> files for is that htpurge displays about six lots of diagnostics of
> the form
>
> pg->type: 0
> ************************************
> ************************************
> ************************************
> page size:8192
> 00-07: Log sequence number. file : 0
> 00-07: Log sequence number. offset: 0
> 08-11: Current page number. : 143319
> 12-15: Previous page number. : 0
> 16-19: Next page number. : 132296
> 20-21: Number of item pairs on the page. : 0
> 22-23: High free byte page offset. : 8192
> 24: Btree tree level. : 0
> 25: Page type. : 0
> entry offsets:
>   0: 0 0 0 0 0 0 0 0 d7 2f 2 0 0 0 0 0 c8 4 2 0
>  20: 0 0 0 20 0 0 fc 1f fc 1f ec 1f e8 1f d8 1f d4 1f c4 1f
>  40: c0 1f b0 1f ac 1f 9c 1f 98 1f 88 1f 84 1f 74 1f 70 1f 60 1f
>  60: 5c 1f 4c 1f 48 1f 38 1f 34 1f 24 1f 20 1f 10 1f c 1f fc 1e
>  80: f8 1e e8 1e e4 1e d4 1e d0 1e c0 1e bc 1e ac 1e a8 1e 98 1e
> 100: 94 1e 84 1e 80 1e 70 1e 6c 1e 5c 1e 58 1e 48 1e 44 1e 34 1e
> ...
>
> while it is discarding words. I assumed this is caused by a
> recoverable error. The only difference in the entries are the
> current/next pages, and bytes 8, 9, 16, 17, 18 of the "entry
> offsets". The first "next page number" is 0, and subsequent ones are
> the previous values of "current page number" (as if it is reading
> backwards through a chain). If you like, I can send the whole 6MB
> log file, and/or any configuration files you want.
>
> A while ago (but I *think* with your fix in place), I also had a
> problem with htdig crashing at one point, but I've lost the
> details. When I've had problems, it has normally been on 10+ hour
> digs. Any tips for isolating them? I've been thinking of doing an
> integrity check every 100 database writes or so, but I don't know the
> code well enough yet...
>
> Thanks for your feedback, and very much for writing the _zlib fix!
>
> Cheers,
> Lachlan
>
> On Tuesday 28 January 2003 09:38, Neal Richter wrote:
> > What DB errors are you speaking of? Turning on
> > wordlist_compress_zlib should be a workaround for the DB errors I
> > know about.
> >
> > > Am I correct in believing that the hold-up is basically
> > > database errors?

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
From: Neal R. <ne...@ri...> - 2003-02-01 00:27:12
On Tue Feb 12 06:17:45 2002 Geoff committed a patch from Jamie Anstice (SLI Systems).

    http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/htdig/htdig/htlib/Dictionary.cc

Jamie/Geoff: Could you give me a little background on this change? I'm wondering if we couldn't add a String to the Dictionary class and use that instead of doing a malloc/strcpy every time.. this function is called jillions of times.

I'm also curious as to why not use Knuth's golden ratio hash function; it's a well studied and known-good hash. I can make the change and test it.. I'm just curious about the rationale for both the original hash function and the change. Thanks!

    /* Knuth's golden ratio hash function. key=word to hash on, kfl=length of
     * the word, bkts is number of buckets in hash table */
    #define LONGMASK (~(1L << ((sizeof(long) * 8) - 1)))

    // ----- vhhash1 ---------------------------------------------------------
    static long vhhash1(char *key, int kfl, long bkts)
    {
        char *kptr;
        long lkey = 0L;
        double frac;

        kptr = key;
        while (kfl--) {
            lkey = ((lkey<<7)&~0x7fL) ^ ((lkey>>25)&0x7f) ^ ((long)*kptr++);
        }
        lkey &= LONGMASK;
        frac = 0.6180339887 * lkey;
        lkey = (long)frac;
        return((long) ((frac-lkey) * bkts));
    }

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
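As a quick sanity check before swapping a hash like this in, the function compiles stand-alone (with `const` added to the key pointer). The assertions below only confirm determinism and that the result stays inside the bucket range; they say nothing about distribution quality, which is the property Neal is really asking about.

```cpp
// Knuth's golden ratio hash as quoted above; 0.6180339887 is
// (sqrt(5) - 1) / 2, the multiplicative-hash constant.
#define LONGMASK (~(1L << ((sizeof(long) * 8) - 1)))

static long vhhash1(const char *key, int kfl, long bkts)
{
    const char *kptr = key;
    long lkey = 0L;
    double frac;

    while (kfl--)
        lkey = ((lkey << 7) & ~0x7fL) ^ ((lkey >> 25) & 0x7f)
             ^ ((long)*kptr++);
    lkey &= LONGMASK;                     // clear the sign bit
    frac = 0.6180339887 * lkey;
    lkey = (long)frac;                    // integer part
    return (long)((frac - lkey) * bkts);  // fractional part scaled to [0, bkts)
}
```

Because the final step takes only the fractional part of `frac`, the result is always in [0, bkts), so no separate modulo is needed.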
From: Neal R. <ne...@ri...> - 2003-01-31 23:57:02
I'll admit I haven't been following this thread too closely... so bear with
me if this is something already covered ;-)
Hmm. XSLT is usable from Python. Python is very easy to learn.
It may be easier to use Python + XSLT to parse the defaults.xml into
attrs.html for the site.
I've used XSLT from C & PHP code before.
Basically with XSLT you give it a template that describes what XML
containers/attributes to replace/wrap with what HTML code.
We used it to create a web-navigable HTML document out of a flat XML file
via PHP-XSLT.
Here's a Tag in an XML file
<contact_email label="Contact E-mail">ne...@xx...</contact_email>
Here's a snippet of an example .xsl file. It is used by PHP with a call
to xslt_process() with various parameters, including the filename & .xsl
file. The output of the function call is HTML.
//This calls a "template-function" on a specific XML tag.
<xsl:apply-templates select="contact_email" />
//This is the "template-function" for contact_email & others
<xsl:template match="ref_no|status|created_by|contact_email|interface|contact_id|company_id">
<tr>
<td><xsl:value-of select="@label" /></td>
<td><xsl:value-of select="." /></td>
</tr>
</xsl:template>
This is what shows up in the HTML:
<tr>
<td>Contact E-mail</td>
<td>ne...@xx...</td>
</tr>
Eh? Help any?
On Fri, 31 Jan 2003, Ted Stresen-Reuter wrote:
> Well, I think there are two issues to consider:
>
> 1) Rewriting some Perl code that takes the defaults.xml document and
> converts it into multiple attribute.html-type documents ("splitting")
> 2) Deciding on a format or "look and feel" for the documentation
>
> My plan was to spend this past weekend trying to learn enough Perl
> based on my PHP knowledge to be able to tackle number 1, but I'm afraid
> it's looking like too steep of a hill to climb in one weekend
> (especially considering it's Friday and I'm no further along than I was
> Saturday morning :-( ).
>
> At the same time, I'm exploring the capabilities of XSLT, for this and
> other projects, as a solution to creating individual HTML files (one
> for each attribute). The goal is to produce _something_ that splits
> defaults.xml into its component pieces, even if it only ends up in the
> Contributions section.
>
> Alas, XSLT is really cool and certainly up to the task, but it too will
> require a few more weeks of study before I'm able to contribute
> anything worthwhile.
>
> Converting the documentation to XHTML and changing the look and feel
> (and improving the usability)...
>
> I could put together a style sheet (something I already know how to do
> and do with quite a bit of regularity) if you want, at least I'll be
> able to contribute _something_ while I work on the other piece...
>
> Let me know...
>
> Ted Stresen-Reuter
>
> On Friday, January 31, 2003, at 03:21 AM, Gabriele Bartolini wrote:
>
> > Ciao Ted, Budd and Gilles,
> >
> >> HyperCard and AppleScript and Lingo), I'll see if I can write the
> >> code for generating the documentation in Perl, but I would like to
> >> see Brian's scripts...
> >
> > I was trying to take a look at the situation of the XHTML porting of
> > the documentation. Any news? I can volunteer. I was thinking about
> > using a utility such as 'tidy' to help me move forward, and also
> > try to check it from the accessibility point of view. First (sorry,
> > it is my fault), I'd remove the 'bold' effect when going over a link
> > (when I put it in, it was almost one of my first experiences with CSS!),
> > which is really inaccessible!
> >
> > I'll try and put the 'lang' attribute in every page and the content
> > type as well, and leave an empty 'body' in every page (I know some old
> > versions may show the default color - usually white - but it doesn't
> > prevent it from being viewed). Other changes will be made consequently.
> >
> > Sounds good?
> >
> > Let me know. The only case which could have some problems is the
> > automatically generated configuration file (please update me for
> > this!).
> >
> > Ciao ciao
> > -Gabriele
> >
> > --
> > Gabriele Bartolini - Web Programmer - ht://Dig & IWA Member -
> > ht://Check maintainer
> > Current Location: Prato, Tuscany, Italia
> > an...@ti... | http://www.prato.linux.it/~gbartolini | ICQ#129221447
> >
> >
>
>
>
> -------------------------------------------------------
> This SF.NET email is sponsored by:
> SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
> http://www.vasoftware.com
> _______________________________________________
> htdig-dev mailing list
> htd...@li...
> https://lists.sourceforge.net/lists/listinfo/htdig-dev
>
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
From: Ted Stresen-R. <ted...@ma...> - 2003-01-31 15:10:45
Well, I think there are two issues to consider:
1) Rewriting some Perl code that takes the defaults.xml document and
converts it into multiple attribute.html-type documents ("splitting")
2) Deciding on a format or "look and feel" for the documentation
My plan was to spend this past weekend trying to learn enough Perl
based on my PHP knowledge to be able to tackle number 1, but I'm afraid
it's looking like too steep of a hill to climb in one weekend
(especially considering it's Friday and I'm no further along than I was
Saturday morning :-( ).
At the same time, I'm exploring the capabilities of XSLT, for this and
other projects, as a solution to creating individual HTML files (one
for each attribute). The goal is to produce _something_ that splits
defaults.xml into its component pieces, even if it only ends up in the
Contributions section.
Alas, XSLT is really cool and certainly up to the task, but it too will
require a few more weeks of study before I'm able to contribute
anything worthwhile.
Converting the documentation to XHTML and changing the look and feel
(and improving the usability)...
I could put together a style sheet (something I already know how to do
and do with quite a bit of regularity) if you want, at least I'll be
able to contribute _something_ while I work on the other piece...
Let me know...
Ted Stresen-Reuter
On Friday, January 31, 2003, at 03:21 AM, Gabriele Bartolini wrote:
> Ciao Ted, Budd and Gilles,
>
>> HyperCard and AppleScript and Lingo), I'll see if I can write the
>> code for generating the documentation in Perl, but I would like to
>> see Brian's scripts...
>
> I was trying to take a look at the situation of the XHTML porting of
> the documentation. Any news? I can volunteer. I was thinking about
> using a utility such as 'tidy' to help me move forward, and also
> try to check it from the accessibility point of view. First (sorry,
> it is my fault), I'd remove the 'bold' effect when going over a link
> (when I put it in, it was almost one of my first experiences with CSS!),
> which is really inaccessible!
>
> I'll try and put the 'lang' attribute in every page and the content
> type as well, and leave an empty 'body' in every page (I know some old
> versions may show the default color - usually white - but it doesn't
> prevent it from being viewed). Other changes will be made consequently.
>
> Sounds good?
>
> Let me know. The only case which could have some problems is the
> automatically generated configuration file (please update me for
> this!).
>
> Ciao ciao
> -Gabriele
>
> --
> Gabriele Bartolini - Web Programmer - ht://Dig & IWA Member -
> ht://Check maintainer
> Current Location: Prato, Tuscany, Italia
> an...@ti... | http://www.prato.linux.it/~gbartolini | ICQ#129221447
>
>
From: Gabriele B. <an...@ti...> - 2003-01-31 09:22:50
Ciao Ted, Budd and Gilles,

>HyperCard and AppleScript and Lingo), I'll see if I can write the code for
>generating the documentation in Perl, but I would like to see Brian's
>scripts...

I was trying to take a look at the situation of the XHTML porting of the documentation. Any news? I can volunteer. I was thinking about using a utility such as 'tidy' to help me move forward, and also try to check it from the accessibility point of view. First (sorry, it is my fault), I'd remove the 'bold' effect when going over a link (when I put it in, it was almost one of my first experiences with CSS!), which is really inaccessible!

I'll try and put the 'lang' attribute in every page and the content type as well, and leave an empty 'body' in every page (I know some old versions may show the default color - usually white - but it doesn't prevent it from being viewed). Other changes will be made consequently.

Sounds good?

Let me know. The only case which could have some problems is the automatically generated configuration file (please update me on this!).

Ciao ciao
-Gabriele

--
Gabriele Bartolini - Web Programmer - ht://Dig & IWA Member - ht://Check maintainer
Current Location: Prato, Tuscany, Italia
an...@ti... | http://www.prato.linux.it/~gbartolini | ICQ#129221447
From: Lachlan A. <lh...@us...> - 2003-01-31 05:32:47
On Friday 31 January 2003 16:25, Lachlan Andrew wrote:
> 2. Have you changed either since last rebuilding the database?
> (That behaviour is expected if either contained ":" when the
> database was built, but not when it is searched.)

Errr... Make that 'if neither contained ":" when the database was built, but valid_punctuation does when it is searched', or something like that...

L
From: Lachlan A. <lac...@ip...> - 2003-01-31 05:29:24
Greetings Andy,

Thanks for your bug report.

1. What are your settings for valid_punctuation and extra_word_characters?
2. Have you changed either since last rebuilding the database? (That behaviour is expected if either contained ":" when the database was built, but not when it is searched.)
3. Did it work for earlier versions?
4. Does it find "getoptstd"?
5. Have you tested it with single colons?

Cheers,
Lachlan

On Friday 31 January 2003 07:49, And...@wi... wrote:
> After compiling the latest (01/26) snapshot I was trying some
> searches and found that it would not find on double colons (as in a
> perl module use stmt)
>     getopt::std
> retrieved nothing,
> nor did:
>     "getopt::std"
> while:
>     getopt std
> was the same as:
>     "getopt std"
>     "getopt :: std"
>     "getopt :: std ::"
From: Gabriele B. <an...@ti...> - 2003-01-30 21:00:08
Ciao guys,
after having tested the new configure process on FreeBSD as well, I
decided to commit the changes regarding the use of the new available
autotools (autoconf/automake/libtool).
They'll be available in next sunday's snapshot; I hope there won't be
any problems. This way, it is possible to compile the new cookie import
feature (it is not a big change, as the 'core' of the cookies jar is
basically the same), but not yet to use it. I count to hack the HTTP class
code tomorrow in order to enable it since sunday.
Ciao ciao
-Gabriele
--
Gabriele Bartolini - Web Programmer - ht://Dig & IWA Member - ht://Check
maintainer
Current Location: Prato, Tuscany, Italia
an...@ti... | http://www.prato.linux.it/~gbartolini | ICQ#129221447
From: <And...@wi...> - 2003-01-30 20:49:52
Hi.

After compiling the latest (01/26) snapshot I was trying some searches and found that it would not find on double colons (as in a perl module use stmt)

    getopt::std

retrieved nothing, nor did:

    "getopt::std"

while:

    getopt std

was the same as:

    "getopt std"
    "getopt :: std"
    "getopt :: std ::"

a

Andy Bach, Sys. Mangler
Internet: and...@wi...
VOICE: (608) 261-5738 FAX 264-5030

" ... even if you're mediocre/decent at perl [the cmecf] code is pretty confusing in certain areas ..." CB
From: Lachlan A. <lh...@us...> - 2003-01-30 01:48:29
|
Greetings Neal,
Isn't wordlist_compress_zlib turned on by default? I've had a few=20
minor problems recently without changing it. The one I've got log=20
files for is that htpurge displays about six lots of diagnostics of=20
the form
pg->type: 0
************************************
************************************
************************************
page size:8192
00-07: Log sequence number. file : 0
00-07: Log sequence number. offset: 0
08-11: Current page number. : 143319
12-15: Previous page number. : 0
16-19: Next page number. : 132296
20-21: Number of item pairs on the page. : 0
22-23: High free byte page offset. : 8192
24: Btree tree level. : 0
25: Page type. : 0
entry offsets:
0: 0 0 0 0 0 0 0 0 d7 2f 2 0 0 0 0 0 c8 4 2 0
20: 0 0 0 20 0 0 fc 1f fc 1f ec 1f e8 1f d8 1f d4 1f c4 1f
40: c0 1f b0 1f ac 1f 9c 1f 98 1f 88 1f 84 1f 74 1f 70 1f 60 1f
60: 5c 1f 4c 1f 48 1f 38 1f 34 1f 24 1f 20 1f 10 1f c 1f fc 1e
80: f8 1e e8 1e e4 1e d4 1e d0 1e c0 1e bc 1e ac 1e a8 1e 98 1e
100: 94 1e 84 1e 80 1e 70 1e 6c 1e 5c 1e 58 1e 48 1e 44 1e 34 1e
...
...
...
while it is discarding words. I assumed this is caused by a
recoverable error. The only differences between the entries are the
current/next pages, and bytes 8, 9, 16, 17, 18 of the "entry
offsets". The first "next page number" is 0, and subsequent ones are
the previous values of "current page number" (as if it is reading
backwards through a chain). If you like, I can send the whole 6MB
log file, and/or any configuration files you want.
A while ago (but I *think* with your fix in place), I also had a
problem with htdig crashing at one point, but I've lost the
details. When I've had problems, it has normally been on 10+ hour
digs. Any tips for isolating them? I've been thinking of doing an
integrity check every 100 database writes or so, but I don't know the
code well enough yet...
Thanks for your feedback, and very much for writing the _zlib fix!
Cheers,
Lachlan
On Tuesday 28 January 2003 09:38, Neal Richter wrote:
>     What DB errors are you speaking of? Turning on
> wordlist_compress_zlib should be a workaround for the DB errors I
> know about.
> > Am I correct in believing that the hold-up is basically
> > database errors?
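[Editor's note: the workaround Neal refers to is a single boolean attribute in the htdig configuration file. A sketch of how it might be set; the attribute name is from this thread, and the boolean syntax shown is an assumption:]

```
# Enable zlib compression of the word database (workaround for the
# known DB errors discussed in this thread):
wordlist_compress_zlib: true
```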
|
|
From: Neal R. <ne...@ri...> - 2003-01-30 00:17:24
|
you may have to resort to a simple loop similar to this:
    j = 0;
    alen = a.length();  // cache the length for efficiency
    for (i = 0; i < alen; i++)
    {
        b[j] = a[i];
        j++;
        if (isspace(a[i]) != 0)
        {
            while ( ((i+1) < alen) && (isspace(a[i+1]) != 0) )
                i++;  // chew the extra whitespace
        }
    }
    b[j] = '\0';  // don't forget to terminate the result
Make sure and test this if you use it.. I'm not guaranteeing it to be
100% correct. ;-)
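[Editor's note: a self-contained plain-C sketch of the same idea; the function name and signature are illustrative, not from the htdig source. It keeps the first character of each whitespace run and drops the rest:]

```c
#include <ctype.h>
#include <stddef.h>

/* Copy src into dst, collapsing each run of whitespace down to its
   first character.  dst must hold at least strlen(src) + 1 bytes. */
static void collapse_spaces(const char *src, char *dst)
{
    size_t j = 0;
    for (size_t i = 0; src[i] != '\0'; i++)
    {
        dst[j++] = src[i];
        if (isspace((unsigned char)src[i]))
        {
            /* chew the extra whitespace */
            while (src[i + 1] != '\0' && isspace((unsigned char)src[i + 1]))
                i++;
        }
    }
    dst[j] = '\0';
}
```

For example, calling collapse_spaces("getopt   ::  std", out) leaves "getopt :: std" in out.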
Thanks.
On Tue, 28 Jan 2003, Jessica Biola wrote:
> Can anyone give me a clue as to how to remove multiple
> spaces from a String?
>
> Also, is there a better way to do search and replace
> other than Stringname.replace('a','b')? I'd really
> like to use regular expressions on the first matching
> comparison value.
>
> -Jes
>
> p.s. I'm using htdig-3.2.0b4 and I'm in Display.cc
>
> __________________________________________________
> Do you Yahoo!?
> Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
> http://mailplus.yahoo.com
>
>
> -------------------------------------------------------
> This SF.NET email is sponsored by:
> SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
> http://www.vasoftware.com
> _______________________________________________
> htdig-dev mailing list
> htd...@li...
> https://lists.sourceforge.net/lists/listinfo/htdig-dev
>
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
|
|
From: Gabriele B. <an...@ti...> - 2003-01-29 23:35:55
|
Ciao guys,
thanks for pointing out some info regarding both Mac OS X and Solaris.
However, I have not been able to give them a look tonight, and now it is
late here in Italy, and I'd better go to bed before doing something wrong.
I finished testing the new configure and makefiles on the compile farm,
with different results.
Everything is OK on Linux on all platforms (i686, Alpha, Sparc);
MacOS X 10.1 still has that problem with shared libraries (as it did
before), whereas Solaris on a Sparc R220 doesn't go.
The FreeBSD compile farm was not available today (I dunno why).
So ... Geoff and Gilles, can I commit the changes to configure (both
the main one and the one in the 'db' dir), makefiles, aclocals and
libtool stuff?
I think it would be important for people to test and configure them,
as sooner or later we should move to the new autotools and libtool versions.
Ciao ciao
-Gabriele
|
|
From: Lachlan A. <lh...@us...> - 2003-01-29 22:13:41
|
Greetings Jes,
You could try the classes HtRegexReplace and HtRegexReplaceList.
For examples of usage, take a look in htcommon/HtURLRewriter.cc.
(I haven't used these classes, but they seem suitable...)
Thanks for your help with the development :)
Lachlan
On Wednesday 29 January 2003 04:47, Jessica Biola wrote:
> Can anyone give me a clue as to how to remove multiple
> spaces from a String?
> Also, is there a better way to do search and replace
> other than Stringname.replace('a','b')? I'd really
> like to use regular expressions on the first matching
> comparison value.
> p.s. I'm using htdig-3.2.0b4 and I'm in Display.cc
|
|
From: <And...@wi...> - 2003-01-29 21:08:35
|
Tested the 1/26 htdig snapshot on Solaris 2.7 x86 - w/ the
./configure --with-ssl=/usr/local/ssl it found ssl but did not correctly
enable SSL. I had to go into include/htconfig.h and change/define:
/* Define if you have the <ssl.h> header file. */
/* #undef HAVE_SSL_H */
#define HAVE_SSL_H 1
/* Define if you have the ssl library (-lssl). */
/* #undef HAVE_LIBSSL */
#define HAVE_LIBSSL 1
I think we figured it was the 1st that really mattered. Otherwise, htdig
attempts an http connection to https ports and, obviously, gets nothing
useful back.
Plus these compile warnings (gcc version 2.95.3):
conf_lexer.cxx: In function `int yylex()':
conf_lexer.cxx:704: warning: label `find_rule' defined but not used
conf_lexer.cxx: At top level:
conf_lexer.cxx:1790: warning: `void * yy_flex_realloc(void *, unsigned int)' defined but not used
HtHTTP.cc: In function `static bool HtHTTP::isParsable(const char *)':
HtHTTP.cc:827: warning: choosing `String::operator char *()' over `String::operator const char *() const'
HtHTTP.cc:827: warning: for conversion from `String' to `const char *'
HtHTTP.cc:827: warning: because conversion sequence for the argument is better
HtFile.cc: In function `static class String HtFile::File2Mime(const char *)':
HtFile.cc:146: warning: choosing `String::operator char *()' over `String::operator const char *() const'
HtFile.cc:146: warning: for conversion from `String' to `bool'
HtFile.cc:146: warning: because conversion sequence for the argument is better
HtFile.cc: In method `enum Transport::DocStatus HtFile::Request()':
HtFile.cc:183: warning: choosing `String::operator char *()' over `String::operator const char *() const'
HtFile.cc:183: warning: for conversion from `String' to `const char *'
HtFile.cc:183: warning: because conversion sequence for the argument is better
HtFile.cc:200: warning: choosing `String::operator char *()' over `String::operator const char *() const'
HtFile.cc:200: warning: for conversion from `String' to `const char *'
HtFile.cc:200: warning: because conversion sequence for the argument is better
HtFile.cc:264: warning: choosing `String::operator char *()' over `String::operator const char *() const'
HtFile.cc:264: warning: for conversion from `String' to `const char *'
HtFile.cc:264: warning: because conversion sequence for the argument is better
HtFile.cc:277: warning: choosing `String::operator char *()' over `String::operator const char *() const'
HtFile.cc:277: warning: for conversion from `String' to `const char *'
HtFile.cc:277: warning: because conversion sequence for the argument is better
HtFile.cc:284: warning: choosing `String::operator char *()' over `String::operator const char *() const'
HtFile.cc:284: warning: for conversion from `String' to `const char *'
HtFile.cc:284: warning: because conversion sequence for the argument is better
Synonym.cc: In method `int Synonym::createDB(const HtConfiguration &)':
Synonym.cc:84: warning: choosing `String::operator char *()' over `String::operator const char *() const'
Synonym.cc:84: warning: for conversion from `String' to `const char *'
Synonym.cc:84: warning: because conversion sequence for the argument is better
Document.cc: In method `enum Transport::DocStatus Document::RetrieveLocal(HtDateTime, StringList *)':
Document.cc:618: warning: choosing `String::operator char *()' over `String::operator const char *() const'
Document.cc:618: warning: for conversion from `String' to `bool'
Document.cc:618: warning: because conversion sequence for the argument is better
Retriever.cc: In method `void Retriever::got_word(const char *, int, int)':
Retriever.cc:1284: warning: comparison between signed and unsigned
htsearch.cc: In function `void doFuzzy(WeightWord *, List &, List &)':
htsearch.cc:689: warning: `class String * word' might be used uninitialized in this function
parser.cc: In method `void Parser::perform_push()':
parser.cc:325: warning: choosing `String::operator char *()' over `String::operator const char *() const'
parser.cc:325: warning: for conversion from `String' to `bool'
parser.cc:325: warning: because conversion sequence for the argument is better
a
Andy Bach, Sys. Mangler
Internet: and...@wi... VOICE: (608) 261-5738 FAX 264-5030
" ... even if you're mediocre/decent at perl [the cmecf] code is pretty
confusing in certain areas ..." CB
|