From: Geoff H. <ghu...@us...> - 2003-01-12 08:15:22
|
STATUS of ht://Dig branch 3-2-x
RELEASES:
3.2.0b5: Next release, tentatively 1 Feb 2003.
3.2.0b4: "In progress" -- snapshots called "3.2.0b4" until prerelease.
3.2.0b3: Released: 22 Feb 2001.
3.2.0b2: Released: 11 Apr 2000.
3.2.0b1: Released: 4 Feb 2000.
(Please note that everything added here should have a tracker PR# so
we can be sure it gets fixed. Geoff is currently trying to add PR#s for
what's currently here.)
SHOWSTOPPERS:
* Mifluz database errors are a severe problem (PR#428295)
-- Does Neal's new zlib patch solve this for now?
KNOWN BUGS:
* Odd behavior: $(MODIFIED) and scores do not work with
  wordlist_compress set, but work fine without wordlist_compress.
  (The date is definitely stored correctly, even with compression on,
  so this must be some sort of weird htsearch bug.) PR#618737.
* META descriptions are somehow added to the database as FLAG_TITLE,
not FLAG_DESCRIPTION. (PR#618738)
PENDING PATCHES (available but need work):
* Additional support for Win32.
* Memory improvements to htmerge. (Backed out b/c htword API changed.)
* Mifluz merge.
NEEDED FEATURES:
* Field-restricted searching. (e.g. PR#460833)
* Handle noindex_start & noindex_end as string lists.
* Quim's new htsearch/qtest query parser framework.
* File/Database locking. PR#405764.
TESTING:
* httools programs:
(htload a test file, check a few characteristics, htdump and compare)
* Tests for new config file parser
* Duplicate document detection while indexing
* Major revisions to ExternalParser.cc, including fork/exec instead of popen,
argument handling for parser/converter, allowing binary output from an
external converter.
* ExternalTransport needs testing of changes similar to ExternalParser.
DOCUMENTATION:
* List of supported platforms/compilers is ancient. (PR#405279)
* Add thorough documentation on htsearch restrict/exclude behavior
(including '|' and regex).
* Document all of htsearch's mappings of input parameters to config attributes
to template variables. (Relates to PR#405278.)
Should we make sure these config attributes are all documented in
defaults.cc, even if they're only set by input parameters and never
in the config file?
* Split attrs.html into categories for faster loading.
* Turn defaults.cc into an XML file for generating documentation and
defaults.cc.
* require.html has not been updated to list new features and disk space
  requirements of 3.2.x (e.g. regex matching, database compression).
  PR#405280, PR#405281.
* TODO.html has not been updated for current TODO list and
completions.
* Htfuzzy could use more documentation on what each fuzzy algorithm
does. PR#405714.
* Document the list of all installed files and default
locations. PR#405715.
OTHER ISSUES:
* Can htsearch actually search while an index is being created?
* The code needs a security audit, esp. htsearch. PR#405765.
|
|
From: Gilles D. <gr...@sc...> - 2003-01-10 17:20:34
|
According to Geoff Hutchison:
> It's probably my fault if this didn't end up in the CVS. I'll take a
> look later this morning. But AFAIK, this is the latest version of the
> defaults.xml "builder."

Brian had also posted earlier patches and scripts, which were archived here:

ftp://ftp.ccsf.org/htdig-patches/3.2.0b4/DefaultsXML-20021013.README
ftp://ftp.ccsf.org/htdig-patches/3.2.0b4/DefaultsXML-20021013.tar.gz

This archive seems to contain some scripts that aren't in his later
posting. I don't know if you need both to make a complete set, or whether
the earlier set was obsoleted by the later posting. Brian, can you clarify?

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
|
From: Geoff H. <ghu...@ws...> - 2003-01-10 15:38:49
|
It's probably my fault if this didn't end up in the CVS. I'll take a look
later this morning. But AFAIK, this is the latest version of the
defaults.xml "builder."

-Geoff

Begin forwarded message:

> From: Brian White <bw...@st...>
> Date: Sun Nov 10, 2002 08:05:48 PM US/Central
> To: lh...@ee...
> Cc: htd...@li...
> Subject: Re: [htdig-dev] defaults.cc
>
> At 11:43 11/11/2002, Lachlan Andrew wrote:
>> Greetings Brian,
>>
>> The latest version of defaults.cc that I
>> have is at <http://www.ee.mu.oz.au/staff/lha/defaults.cc>.
>> It includes all of my earlier changes, plus some
>> forward-ported 3.1.6 attributes.
>
> Well, I ran it through my defaults.xml builder and
> ( much to my chagrin! ) it didn't work straight up.
>
> I have sheepishly attached a slightly adjusted
> version of the builder.....
>
>> On Mon, 11 Nov 2002 10:35, Brian White wrote:
>>> At 17:44 8/11/2002, Gabriele Bartolini wrote:
>>>> we'd need to change the Configuration.[h,cc] files
>>> Those changes are all in the patch I supplied.
>>
>> While we're changing Configuration.cc, what do people
>> think about issuing a warning if an attribute is not
>> found, rather than silently using the "default_value"
>> argument? That would remind developers to add the
>> attribute to defaults.xml and avoid having default
>> values scattered throughout the code...
>
> On the face of it, I think this is a good idea, but
> it would need to be carefully done...
>
> Brian
|
From: Ted Stresen-R. <ted...@ma...> - 2003-01-10 14:34:12
|
I agree too. It just makes it easier to mirror AND the documentation could
be accessed without having to have a web server (via the file system and a
browser). Perhaps the only real advantage of using PHP is XSLT, but even
that can be a one-time experience...

I continue to look for Brian White's scripts to see if I could add the
code necessary to generate the documentation, but I'm unable to find them.
Are they in the CVS archives? If so, what are they called and where are
they located?

I've got a pretty good handle on coding in PHP and, from what I can tell,
Perl is not that much different. So, if possible (coming from someone who
has had to code in ASP, PHP, JSP, ColdFusion, who likes to write
JavaScript and who's written some VBA, in addition to LOTS of HyperCard
and AppleScript and Lingo), I'll see if I can write the code for
generating the documentation in Perl. But I would like to see Brian's
scripts...

Ted Stresen-Reuter

On Friday, January 10, 2003, at 04:02 AM, Budd, Sinclair wrote:

> Just to butt in.. I have to agree wholeheartedly with what Gilles is
> saying below.
> Off-line generation: good idea. Generation at each site: difficult.

And earlier, Gilles said:

>> However, if we don't use PHP for on-the-fly generation of docs, but
>> just for building static HTML files, does this provide a big advantage
>> over Brian White's scripts?
>>
>> --
>> Gilles R. Detillieux              E-mail: <gr...@sc...>
>> Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
>> Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
|
From: Budd, S. <s....@ic...> - 2003-01-10 10:03:43
|
Just to butt in.. I have to agree wholeheartedly with what Gilles is
saying below. Off-line generation: good idea. Generation at each site:
difficult.

> 2) There's a learning curve associated with maintaining PHP files (one I
> haven't personally climbed yet). It's probably safe to say that most/all
> developers know HTML, and are able to maintain the docs as they are.
> Even defaults.cc/defaults.xml are pretty easy to get up to speed on,
> and the conversion programs for these don't need a whole lot of ongoing
> maintenance. Going with PHP for all docs might complicate things and
> reduce the number of developers available to maintain the docs.
>
> 3) We're trying to minimize the number of dependencies ht://Dig has.
> As-is, it needs a few libraries, autoconf/automake, and Perl. Adding PHP
> to the list could conceivably complicate matters for those installing
> the package, and consequently increase the number of requests for help
> on the mailing lists. Using PHP just to generate the static HTML files
> for the attributes docs should minimize this problem, requiring only
> active developers to install PHP on their systems. If we do things
> right, end-users should not have to worry about doing this.
>
> However, if we don't use PHP for on-the-fly generation of docs, but just
> for building static HTML files, does this provide a big advantage over
> Brian White's scripts?
>
> --
> Gilles R. Detillieux              E-mail: <gr...@sc...>
> Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
> Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
|
From: Ted Stresen-R. <ted...@ma...> - 2003-01-09 18:59:35
|
All good arguments for automating the production of flat files and, for
the time being, leaving the 404 error page as is. Furthermore, perhaps the
best solution is to add the creation of the HTML documentation to Brian
White's scripts. I'll take a look at them (if someone can point me to
their current location) and see if I could possibly contribute on that
level (I'm not too optimistic, though, as I don't have any significant
shell programming experience). If that just seems like too much of a
challenge for me for right now, I'll resort to creating the PHP scripts
that do the same thing.

Ted

On Thursday, January 9, 2003, at 12:01 PM, Gilles Detillieux wrote:

> According to Ted Stresen-Reuter:
>> I've reviewed, briefly, both files mentioned. I'm not sure what their
>> purpose is in terms of the whole project (I've been following the
>> discussion, but not closely - seems like these are default values for
>> every attribute based on the name), but it seems that the smartest
>> thing to do would be to write some PHP code that parses the
>> defaults.xml document and extracts the documentation from there. I
>> could set this up to be done on the fly (for each request) or as a PHP
>> file that reads the defaults.xml file and outputs a series of HTML
>> files that could then be uploaded to the web site (similar to a
>> project I did for BBEdit: http://www.tedmasterweb.com/glossary/ )
>
> If we do go with PHP, I'd favour the latter approach, i.e. generating
> all the HTML files at once for the web site. More below...
>
>> How much control do we have on the htdig site (can we set options in
>> .htaccess, or better yet, modify the httpd.conf file directly)? Is
>> anyone opposed to moving the online documentation to PHP or to using
>> PHP on htdig.org? Can we customize the 404 error page? Can we use
>> .htaccess?
>
> www.htdig.org is currently hosted on vhost.sourceforge.net. We
> definitely don't have access to httpd.conf on this site. We almost
> certainly have some options available via .htaccess, but I'd expect
> that it would be a fairly restrictive set that we can actually override.
>
> I've never actually looked into what options are set by default and
> what we can override, because fortunately it's never been an issue.
> I'd kind of like to keep it that way, so that we keep our options more
> open as far as where we can be hosted (as well as mirrored) in the
> future. If we start requiring features of our web service provider,
> like PHP and certain settable options, it may limit our options in the
> future, as well as possibly burdening our mirror sites. However, I
> believe some of the sites currently hosted by sf.net are PHP-based, so
> I expect it would be feasible to make the switch, other concerns
> notwithstanding.
>
>> Finally, could someone provide a gloss on the doctype declaration in
>> defaults.xml? I understand the basic structure of XML documents (and
>> have written a few SMIL docs by hand) but I could use a little
>> clarification on this one point. Oh, and, one last thing, why is there
>> both a defaults.cc and defaults.xml document? I can understand the
>> purpose of the .cc document, but what is the defaults.xml doc for? Is
>> it just a pretty way of presenting the .cc doc?
>
> I think Lachlan answered the last few questions, and I'll leave it to
> someone who knows XML to explain the doctype declaration.
>
>> One more thing... I tend to try to write XHTML 1.1 Strict compliant
>> HTML. Does anyone have a problem with this? For those who are
>> unfamiliar with it, XHTML is an XML-compliant version of HTML 4.
>
> No objection here! We want to make htsearch's output XHTML compliant
> very soon, hopefully in time for the b5 release, and it would be great
> to make all the HTML docs XHTML compliant too.
>
> ... and later...
>
>> Lachlan et al:
>>
>> Thank you for the excellent reply. I will use defaults.xml as the
>> master document for generating the individual attributes pages.
>>
>> Geoff? Gilles? Any opinion on moving the documentation to dynamically
>> generated files using PHP? If so, I would suggest we proceed as
>> follows:
> ...
>> In addition to creating a dynamic interface into the documentation, I
>> would maintain the current design so the documentation could be
>> browsed as well.
>>
>> If this is something the htdig development team would like to see,
>> I'll do it and post my efforts to my own web site for review.
>
> I can see how going with PHP on the fly like this would simplify some
> things. However, my concerns are the following:
>
> 1) Right now, with HTML-only documentation, it's very easy to host the
> site wherever we want, mirror it anywhere, and include the HTML docs in
> with the source, allowing the end users to put the docs up on their own
> site easily, or just browse the HTML files directly off their hard
> drive. Requiring a PHP-enabled web server to host the docs would
> complicate things for a lot of people.
>
> 2) There's a learning curve associated with maintaining PHP files (one
> I haven't personally climbed yet). It's probably safe to say that
> most/all developers know HTML, and are able to maintain the docs as
> they are. Even defaults.cc/defaults.xml are pretty easy to get up to
> speed on, and the conversion programs for these don't need a whole lot
> of ongoing maintenance. Going with PHP for all docs might complicate
> things and reduce the number of developers available to maintain the
> docs.
>
> 3) We're trying to minimize the number of dependencies ht://Dig has.
> As-is, it needs a few libraries, autoconf/automake, and Perl. Adding
> PHP to the list could conceivably complicate matters for those
> installing the package, and consequently increase the number of
> requests for help on the mailing lists. Using PHP just to generate the
> static HTML files for the attributes docs should minimize this problem,
> requiring only active developers to install PHP on their systems. If we
> do things right, end-users should not have to worry about doing this.
>
> However, if we don't use PHP for on-the-fly generation of docs, but
> just for building static HTML files, does this provide a big advantage
> over Brian White's scripts?
>
> --
> Gilles R. Detillieux              E-mail: <gr...@sc...>
> Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
> Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

------------------------------------------------------------------------
Homepage: http://www.tedmasterweb.com/
My JavaScript Window Management Tool: http://www.tedmasterweb.com/wmo/
|
From: Gilles D. <gr...@sc...> - 2003-01-09 18:03:06
|
According to Ted Stresen-Reuter:
> I've reviewed, briefly, both files mentioned. I'm not sure what their
> purpose is in terms of the whole project (I've been following the
> discussion, but not closely - seems like these are default values for
> every attribute based on the name), but it seems that the smartest thing
> to do would be to write some PHP code that parses the defaults.xml
> document and extracts the documentation from there. I could set this up
> to be done on the fly (for each request) or as a PHP file that reads the
> defaults.xml file and outputs a series of HTML files that could then be
> uploaded to the web site (similar to a project I did for BBEdit:
> http://www.tedmasterweb.com/glossary/ )

If we do go with PHP, I'd favour the latter approach, i.e. generating all
the HTML files at once for the web site. More below...

> How much control do we have on the htdig site (can we set options in
> .htaccess, or better yet, modify the httpd.conf file directly)? Is
> anyone opposed to moving the online documentation to PHP or to using
> PHP on htdig.org? Can we customize the 404 error page? Can we use
> .htaccess?

www.htdig.org is currently hosted on vhost.sourceforge.net. We definitely
don't have access to httpd.conf on this site. We almost certainly have
some options available via .htaccess, but I'd expect that it would be a
fairly restrictive set that we can actually override.

I've never actually looked into what options are set by default and what
we can override, because fortunately it's never been an issue. I'd kind
of like to keep it that way, so that we keep our options more open as far
as where we can be hosted (as well as mirrored) in the future. If we
start requiring features of our web service provider, like PHP and
certain settable options, it may limit our options in the future, as well
as possibly burdening our mirror sites. However, I believe some of the
sites currently hosted by sf.net are PHP-based, so I expect it would be
feasible to make the switch, other concerns notwithstanding.

> Finally, could someone provide a gloss on the doctype declaration in
> defaults.xml? I understand the basic structure of XML documents (and
> have written a few SMIL docs by hand) but I could use a little
> clarification on this one point. Oh, and, one last thing, why is there
> both a defaults.cc and defaults.xml document? I can understand the
> purpose of the .cc document, but what is the defaults.xml doc for? Is
> it just a pretty way of presenting the .cc doc?

I think Lachlan answered the last few questions, and I'll leave it to
someone who knows XML to explain the doctype declaration.

> One more thing... I tend to try to write XHTML 1.1 Strict compliant
> HTML. Does anyone have a problem with this? For those who are
> unfamiliar with it, XHTML is an XML-compliant version of HTML 4.

No objection here! We want to make htsearch's output XHTML compliant very
soon, hopefully in time for the b5 release, and it would be great to make
all the HTML docs XHTML compliant too.

... and later...

> Lachlan et al:
>
> Thank you for the excellent reply. I will use defaults.xml as the
> master document for generating the individual attributes pages.
>
> Geoff? Gilles? Any opinion on moving the documentation to dynamically
> generated files using PHP? If so, I would suggest we proceed as follows:
...
> In addition to creating a dynamic interface into the documentation, I
> would maintain the current design so the documentation could be browsed
> as well.
>
> If this is something the htdig development team would like to see, I'll
> do it and post my efforts to my own web site for review.

I can see how going with PHP on the fly like this would simplify some
things. However, my concerns are the following:

1) Right now, with HTML-only documentation, it's very easy to host the
site wherever we want, mirror it anywhere, and include the HTML docs in
with the source, allowing the end users to put the docs up on their own
site easily, or just browse the HTML files directly off their hard drive.
Requiring a PHP-enabled web server to host the docs would complicate
things for a lot of people.

2) There's a learning curve associated with maintaining PHP files (one I
haven't personally climbed yet). It's probably safe to say that most/all
developers know HTML, and are able to maintain the docs as they are.
Even defaults.cc/defaults.xml are pretty easy to get up to speed on, and
the conversion programs for these don't need a whole lot of ongoing
maintenance. Going with PHP for all docs might complicate things and
reduce the number of developers available to maintain the docs.

3) We're trying to minimize the number of dependencies ht://Dig has.
As-is, it needs a few libraries, autoconf/automake, and Perl. Adding PHP
to the list could conceivably complicate matters for those installing the
package, and consequently increase the number of requests for help on the
mailing lists. Using PHP just to generate the static HTML files for the
attributes docs should minimize this problem, requiring only active
developers to install PHP on their systems. If we do things right,
end-users should not have to worry about doing this.

However, if we don't use PHP for on-the-fly generation of docs, but just
for building static HTML files, does this provide a big advantage over
Brian White's scripts?

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
|
From: Ted Stresen-R. <ted...@ma...> - 2003-01-09 14:04:15
|
Lachlan et al:

Thank you for the excellent reply. I will use defaults.xml as the master
document for generating the individual attributes pages.

Geoff? Gilles? Any opinion on moving the documentation to dynamically
generated files using PHP? If so, I would suggest we proceed as follows:

- In general terms, recreate the php.net documentation interface (which
  allows you to do such things as http://www.php.net/mysql_fetch_assoc,
  and be taken directly to the documentation for that function). This is
  accomplished by using a custom 404 error page (written in PHP, in their
  case). Specifically, the 404 page carries out the following steps:

  1) Parses the REQUEST_URI looking for a single entry following the
     host name.

  2) IF A SINGLE ITEM IS FOUND FOLLOWING THE HOST NAME...
     - The 404 page looks for a match between the item and an entry in
       defaults.xml.
     - IF A MATCH IS FOUND IN DEFAULTS.XML...
       - The user is redirected to a new attributes page (just one
         document) with a query string with the attribute name as the
         parameter. This varies from php.net's method, but I find it
         easier to manage one file rather than hundreds...
       - The new attributes page looks up the query string parameter in
         defaults.xml and extracts the documentation, and we're done.
     - IF A MATCH IS NOT FOUND IN DEFAULTS.XML...
       - PHP executes a query against the htdig system using the single
         item as a search term and the results are displayed, with a
         message indicating that although the page they were seeking
         could not be found, many others exist that may be of use.

  3) IF MORE THAN A SINGLE ITEM IS FOUND FOLLOWING THE HOST NAME...
     - We have a choice: present a basic Page Not Found message, or
       execute a query against htdig using everything following the host
       name as the search terms. I tend to vote for the first option, as
       the state of AI hasn't quite reached a point where the second
       option is really helpful.

In addition to creating a dynamic interface into the documentation, I
would maintain the current design so the documentation could be browsed
as well.

If this is something the htdig development team would like to see, I'll
do it and post my efforts to my own web site for review.

Ted Stresen-Reuter
http://www.tedmasterweb.com/
also, check out http://dev.susansexton.com/htdig/

On Thursday, January 9, 2003, at 07:00 AM, Lachlan Andrew wrote:

> On Thursday 09 January 2003 10:30, Ted Stresen-Reuter wrote:
>> I've reviewed, briefly, both files mentioned. I'm not sure what their
>> purpose is in terms of the whole project. [deleted] Why is there
>> both a defaults.cc and defaults.xml document?
>
> Their purpose is to be a central place where the options, default
> values and documentation are all given, both for the executables to
> use and for people to read. So far, defaults.cc has been that
> central place, but this has the drawback of bloating the executable
> with all of the descriptions, which aren't used. Brian White has
> written some scripts to produce both a bare-bones defaults.cc and
> attrs.html et al. from defaults.xml (and to generate defaults.xml
> from the current defaults.cc). Once these have passed the test of
> time, defaults.xml will become the authoritative file.

------------------------------------------------------------------------
Homepage: http://www.tedmasterweb.com/
My JavaScript Window Management Tool: http://www.tedmasterweb.com/wmo/
|
From: Lachlan A. <lac...@ip...> - 2003-01-09 13:03:40
|
On Thursday 09 January 2003 10:30, Ted Stresen-Reuter wrote:
> I've reviewed, briefly, both files mentioned. I'm not sure what their
> purpose is in terms of the whole project. [deleted] Why is there
> both a defaults.cc and defaults.xml document?

Their purpose is to be a central place where the options, default values
and documentation are all given, both for the executables to use and for
people to read. So far, defaults.cc has been that central place, but this
has the drawback of bloating the executable with all of the descriptions,
which aren't used. Brian White has written some scripts to produce both a
bare-bones defaults.cc and attrs.html et al. from defaults.xml (and to
generate defaults.xml from the current defaults.cc). Once these have
passed the test of time, defaults.xml will become the authoritative file.
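To make that round trip concrete, here is a minimal sketch of the kind of
entry involved. The element names, fields and the attribute shown are
invented for illustration -- they are not the project's actual schema or
values:

    <!-- defaults.xml: each attribute defined once, with its docs -->
    <attribute name="example_attribute" type="string"
               default="some value" category="Indexing">
      <description>
        Human-readable documentation, used to generate attrs.html
        but stripped from the generated defaults.cc.
      </description>
    </attribute>

From an entry like this, the generator would emit the attrs.html
documentation plus a bare-bones C++ table entry, something like:

    // generated defaults.cc entry: no description text, so the
    // executable is not bloated with documentation strings
    { "example_attribute", "some value" },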
|
From: Ted Stresen-R. <ted...@ma...> - 2003-01-08 23:30:41
|
Ok... see comments/questions below...

On Wednesday, January 8, 2003, at 10:10 AM, Geoff Hutchison wrote:

>> This seems like something I could do. Could you please provide more
>> detail on what exactly needs to be done? For example, do we want to
>> have each configure option in its own HTML file a la php.net's
>> documentation
>
> I guess that's certainly a possibility--I think the original idea was
> that we'd keep an attrs.html for backwards-URL compatibility, but we'd
> have a few "category" pages. Certainly we could do both--have a
> "category" index and then separate pages for each attribute.
>
> But I think in some cases, it's useful to show examples in the
> documentation of using multiple related attributes together.

I'll plan on doing both.

>> Also, what version of htdig et al are we preparing this documentation
>> for?
>
> 3.2.x. The defaults.cc file (or defaults.xml file, if you've
> been following that) has a "category" field like "File Layout" or
> "Indexing" that should be a separate page.

I've reviewed, briefly, both files mentioned. I'm not sure what their
purpose is in terms of the whole project (I've been following the
discussion, but not closely - seems like these are default values for
every attribute based on the name), but it seems that the smartest thing
to do would be to write some PHP code that parses the defaults.xml
document and extracts the documentation from there. I could set this up
to be done on the fly (for each request) or as a PHP file that reads the
defaults.xml file and outputs a series of HTML files that could then be
uploaded to the web site (similar to a project I did for BBEdit:
http://www.tedmasterweb.com/glossary/ )

How much control do we have on the htdig site (can we set options in
.htaccess, or better yet, modify the httpd.conf file directly)? Is anyone
opposed to moving the online documentation to PHP or to using PHP on
htdig.org? Can we customize the 404 error page? Can we use .htaccess?

Finally, could someone provide a gloss on the doctype declaration in
defaults.xml? I understand the basic structure of XML documents (and have
written a few SMIL docs by hand) but I could use a little clarification
on this one point. Oh, and, one last thing, why is there both a
defaults.cc and defaults.xml document? I can understand the purpose of
the .cc document, but what is the defaults.xml doc for? Is it just a
pretty way of presenting the .cc doc?

One more thing... I tend to try to write XHTML 1.1 Strict compliant HTML.
Does anyone have a problem with this? For those who are unfamiliar with
it, XHTML is an XML-compliant version of HTML 4.

Ted

------------------------------------------------------------------------
Homepage: http://www.tedmasterweb.com/
My JavaScript Window Management Tool: http://www.tedmasterweb.com/wmo/
|
From: Budd, S. <s....@ic...> - 2003-01-08 17:04:31
|
hello

I am having trouble indexing an ssl protected site
( Apache/2.0.43 (Unix) mod_ssl/2.0.43 OpenSSL/0.9.6g DAV/2 ).

I can successfully dig a "Basic Authenticated" page on the site, but when
it tries to dig an https page the output shown below occurs.

When I access the ssl protected page with Mozilla or IE5.5, the browser
first declares the certificate not recognized and asks for "acceptance"
of the certificate. After accepting it, the browser asks for the username
and password. After supplying the one specified in the htdig
configuration file, we can view the page.

Could someone help me with the configuration of htdig, or specify the web
server ssl configuration that htdig can handle?

Thanks

Sinclair Budd
s....@ic...

pick: bartok.cc.ic.ac.uk, # servers = 2
 > bartok.cc.ic.ac.uk supports HTTP persistent connections (infinite)
pick: bartok.cc.ic.ac.uk, # servers = 2
 > bartok.cc.ic.ac.uk with a traditional HTTP connection
5:7:1:https://bartok.cc.ic.ac.uk/sslprotectedpage.html:
Making HTTP request on https://bartok.cc.ic.ac.uk/sslprotectedpage.html
Header line: HTTP/1.1 400 Bad Request
Header line: Date: Wed, 08 Jan 2003 14:20:43 GMT
Header line: Server: Apache/2.0.43 (Unix) mod_ssl/2.0.43 OpenSSL/0.9.6g DAV/2
Header line: Content-Length: 546
Header line: Connection: close
Header line: Content-Type: text/html; charset=iso-8859-1
No modification time returned: assuming now
Retrieving document /sslprotectedpage.html on host: bartok.cc.ic.ac.uk:443
Http version : HTTP/1.1
Server : HTTP/1.1
Status Code : 400
Reason : Bad Request
Access Time : Wed, 08 Jan 2003 14:20:43 gmt
Modification Time : Wed, 08 Jan 2003 14:20:43 gmt
Content-type : text/html; charset=iso-8859-1
Connection : close
Request time: 0 secs
not found
pick: bartok.cc.ic.ac.uk, # servers = 2
 > bartok.cc.ic.ac.uk supports HTTP persistent connections (infinite)
pick: bartok.cc.ic.ac.uk, # servers = 2
 > bartok.cc.ic.ac.uk with a traditional HTTP connection
htdig: Run complete
htdig: 2 servers seen:
htdig:   bartok.cc.ic.ac.uk:443 2 documents
htdig:   bartok.cc.ic.ac.uk:80 4 documents
htdig: Errors to take note of:
Not found: http://bartok.cc.ic.ac.uk/missingpage.html Ref: http://bartok.cc.ic.ac.uk/
Not found: https://bartok.cc.ic.ac.uk/ Ref: http://bartok.cc.ic.ac.uk/
Not found: https://bartok.cc.ic.ac.uk/sslprotectedpage.html Ref: http://bartok.cc.ic.ac.uk/

HTTP statistics
===============
 Persistent connections    : Yes
 HEAD call before GET      : No
 Connections opened        : 5
 Connections closed        : 5
 Changes of server         : 1
 HTTP Requests             : 8
 HTTP KBytes requested     : 3.72266
 HTTP Average request time : 0 secs
 HTTP Average speed        : Infinity KBytes/secs

ht://dig End Time: Wed Jan 8 14:20:43 2003
bartok$
|
From: Geoff H. <ghu...@ws...> - 2003-01-08 16:15:42
|
> This seems like something I could do. Could you please provide more
> detail on what exactly needs to be done? For example, do we want to
> have each configure option in its own HTML file a la php.net's
> documentation

I guess that's certainly a possibility--I think the original idea was
that we'd keep an attrs.html for backwards-URL compatibility, but we'd
have a few "category" pages. Certainly we could do both--have a
"category" index and then separate pages for each attribute.

But I think in some cases, it's useful to show examples in the
documentation of using multiple related attributes together.

> Also, what version of htdig et al are we preparing this documentation
> for?

3.2.x. The defaults.cc file (or defaults.xml file, if you've been
following that) has a "category" field like "File Layout" or "Indexing"
that should be a separate page.

-Geoff
|
From: Ted Stresen-R. <ted...@ma...> - 2003-01-08 14:08:37
|
This seems like something I could do. Could you please provide more
detail on what exactly needs to be done? For example, do we want to have
each configure option in its own HTML file a la php.net's documentation,
or are we in any position to create something a little more dynamic
(again, a la php.net's documentation)? I am able to dedicate a weekend to
working on this, and it doesn't seem like it should take that long.

Also, what version of htdig et al are we preparing this documentation for?

Ted Stresen-Reuter

On Sunday, January 5, 2003, at 02:14 AM, Geoff Hutchison wrote:

> * Split attrs.html into categories for faster loading.

------------------------------------------------------------------------
Homepage: http://www.tedmasterweb.com/
My JavaScript Window Management Tool: http://www.tedmasterweb.com/wmo/
|
From: Geoff H. <ghu...@ws...> - 2003-01-06 03:35:52
|
> Has this idea been considered and rejected before, or just not got
> around to yet?

I'd put lots of things in that category. This sounds like a good idea
to me.

-Geoff
|
From: Lachlan A. <lh...@us...> - 2003-01-05 13:15:40
|
Greetings all,

I'm thinking of rewriting the String class as a copy-on-write class.
(There seems to be a lot of excess malloc'ing, which is making it hard
to use mpatrol to track down the mifluz errors...) Has this idea been
considered and rejected before, or just not got around to yet?

Cheers,
Lachlan
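For readers unfamiliar with the idea: a copy-on-write string shares one
heap buffer among all copies and only clones it when a writer mutates a
shared buffer, which eliminates most of that excess malloc'ing. A minimal
sketch of the technique follows (illustrative only -- the names are
hypothetical, not ht://Dig's actual String API):

    #include <cstring>
    #include <cstddef>

    // One shared heap block per distinct string value.
    struct StringRep {
        int    refs;   // how many strings share this buffer
        size_t len;
        char  *data;
    };

    class CowString {
    public:
        CowString(const char *s) : rep(new StringRep) {
            rep->refs = 1;
            rep->len  = std::strlen(s);
            rep->data = new char[rep->len + 1];
            std::memcpy(rep->data, s, rep->len + 1);
        }
        // Copying only bumps a count -- no malloc, no memcpy.
        CowString(const CowString &other) : rep(other.rep) { ++rep->refs; }
        CowString &operator=(const CowString &other) {
            if (rep != other.rep) {
                release();
                rep = other.rep;
                ++rep->refs;
            }
            return *this;
        }
        ~CowString() { release(); }

        const char *get() const { return rep->data; }
        // Writers detach first, so other sharers never see the change.
        void set(size_t i, char c) { detach(); rep->data[i] = c; }

    private:
        void release() {
            if (--rep->refs == 0) {
                delete [] rep->data;
                delete rep;
            }
        }
        void detach() {
            if (rep->refs > 1) {       // shared: make a private copy
                StringRep *nrep = new StringRep;
                nrep->refs = 1;
                nrep->len  = rep->len;
                nrep->data = new char[rep->len + 1];
                std::memcpy(nrep->data, rep->data, rep->len + 1);
                --rep->refs;
                rep = nrep;
            }
        }
        StringRep *rep;
    };

In a single-threaded indexer this is safe as-is; with threads, the
reference count would need atomic handling.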
|
From: Geoff H. <ghu...@us...> - 2003-01-05 08:14:13
|
STATUS of ht://Dig branch 3-2-x
RELEASES:
3.2.0b5: Next release, tentatively 1 Feb 2003.
3.2.0b4: "In progress" -- snapshots called "3.2.0b4" until prerelease.
3.2.0b3: Released: 22 Feb 2001.
3.2.0b2: Released: 11 Apr 2000.
3.2.0b1: Released: 4 Feb 2000.
(Please note that everything added here should have a tracker PR# so
we can be sure it gets fixed. Geoff is currently trying to add PR#s for
what's currently here.)
SHOWSTOPPERS:
* Mifluz database errors are a severe problem (PR#428295)
-- Does Neal's new zlib patch solve this for now?
KNOWN BUGS:
* Odd behavior: $(MODIFIED) and scores do not work with
  wordlist_compress set, but work fine without wordlist_compress.
  (The date is definitely stored correctly, even with compression on,
  so this must be some sort of weird htsearch bug.) PR#618737.
* META descriptions are somehow added to the database as FLAG_TITLE,
not FLAG_DESCRIPTION. (PR#618738)
PENDING PATCHES (available but need work):
* Additional support for Win32.
* Memory improvements to htmerge. (Backed out b/c htword API changed.)
* Mifluz merge.
NEEDED FEATURES:
* Field-restricted searching. (e.g. PR#460833)
* Handle noindex_start & noindex_end as string lists.
* Quim's new htsearch/qtest query parser framework.
* File/Database locking. PR#405764.
TESTING:
* httools programs:
(htload a test file, check a few characteristics, htdump and compare)
* Tests for new config file parser
* Duplicate document detection while indexing
* Major revisions to ExternalParser.cc, including fork/exec instead of popen,
argument handling for parser/converter, allowing binary output from an
external converter.
* ExternalTransport needs testing of changes similar to ExternalParser.
DOCUMENTATION:
* List of supported platforms/compilers is ancient. (PR#405279)
* Add thorough documentation on htsearch restrict/exclude behavior
(including '|' and regex).
* Document all of htsearch's mappings of input parameters to config attributes
to template variables. (Relates to PR#405278.)
Should we make sure these config attributes are all documented in
defaults.cc, even if they're only set by input parameters and never
in the config file?
* Split attrs.html into categories for faster loading.
* Turn defaults.cc into an XML file for generating documentation and
defaults.cc.
* require.html has not been updated to list new features and disk space
  requirements of 3.2.x (e.g. regex matching, database compression).
  PR#405280, PR#405281.
* TODO.html has not been updated for current TODO list and
completions.
* Htfuzzy could use more documentation on what each fuzzy algorithm
does. PR#405714.
* Document the list of all installed files and default
locations. PR#405715.
OTHER ISSUES:
* Can htsearch actually search while an index is being created?
* The code needs a security audit, esp. htsearch. PR#405765.
|
|
From: Lachlan A. <lh...@us...> - 2003-01-04 09:03:28
|
Greetings,

That was a trial forward-port from 3.1.6. It is just defensive
programming not to try using a string of negative length.

(Sorry about the lack of ChangeLog entry. I've since discovered that
ChangeLog entries *aren't* generated as part of the CVS checkin process.
*sheepish grin* )

On Saturday 04 January 2003 02:05, Budd, Sinclair wrote:
> I notice that there is a fix in 3.2.0b4-20021229 String.cc but no
> mention in the changelog. What is the rationale for the change?
>
> < if (s && len > 0)
> ---
> > if (s && len != 0)
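To illustrate the failure mode this guards against (a hypothetical helper
for illustration, not the actual String.cc code): with a signed length, a
negative value passes a "!= 0" test and then wraps to an enormous size_t
when handed to memcpy.

    #include <cstring>

    // Hypothetical copy helper with a signed int length.
    void copy_data(char *dest, const char *s, int len) {
        if (s && len > 0)               // "len != 0" would let -1 through,
            std::memcpy(dest, s, len);  // and -1 becomes (size_t)-1 here:
    }                                   // a wild multi-gigabyte copy.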
|
From: Budd, S. <s....@ic...> - 2003-01-03 15:06:32
|
Hello

I notice that there is a fix in 3.2.0b4-20021229 String.cc but no mention
in the changelog. What is the rationale for the change?

.................
./.version
1c1
< 3.2.0b4-20021229
---
> 3.2.0b4-20021201

./htlib/String.cc
12c12
< // $Id: String.cc,v 1.33 2002/12/23 15:54:11 lha Exp $
---
> // $Id: String.cc,v 1.32 2002/02/01 22:49:34 ghutchis Exp $
62c62
< if (s && len > 0)
---
> if (s && len != 0)
71c71
< if (s.length() > 0)
---
> if (s.length() != 0)
|
From: Geoff H. <ghu...@ws...> - 2003-01-02 16:14:33
|
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

---------- Forwarded message ----------
Date: Sun, 29 Dec 2002 17:25:22 -0800 (PST)
From: Andrew Daviel <an...@da...>
To: Geoff Hutchison <ghu...@us...>
Subject: geographic searching and ht//Dig versions

Hi

A long time ago (well, about 10 months I think) I was working on a
geographic-enabled version of ht://Dig which picks up location from
metadata and allows a "sort by closest" search. I think at the time you
(or someone) expressed interest in this. It is online at
http://geotags.com/ with a few (hundred) pages indexed (try "disney" or
"restaurant").

I was trying to get back on this again and was trying to index a
moderately large site that someone has added metadata to, and was having
some trouble with htdig hanging (the log output with -vv stops, but htdig
continues to eat CPU). I am currently using a modified version of
3.2.0b3. I am also using (as I recall, the same version of) htdig for
regular search within a domain. I was wondering if I would be better off
using the production version 3.1.6, and if I might submit the patches to
the general effort.

For the domain search (at www.triumf.ca) I wanted to allow a search on
author name. We have quite a few scientific preprints in PostScript and
PDF; the metadata in PDF (and in many HTML editors) typically includes
author name, keywords, subject and title. (To tell the truth, we have a
lot of pages without even a title, or with a title of "frrt56.tex", but
that's an education problem...) Adding subject, title and keywords to the
general index is reasonable, but there is a distinct difference between
someone having written a paper and just being mentioned in it, so I added
an author entry.

For the geographic search there are 3 metadata values I collect -
position, region and placename. So, to allow htdig to collect all these,
I added 4 entries to the DOC structure. To control indexing I added a
config boolean "require_geo". I was intending to have another config
value "require_region" to restrict indexing to one geographic region.

Currently, if require_geo is true, then region metadata acts like
robots=follow,noindex while position metadata acts like robots=all. This
is somewhat messy but works more-or-less. Since the overwhelming majority
of web pages do not have position data, I do not want to visit them all
to check, but I want to allow lists of links to position-enabled pages
and to follow links that might have them. I think I currently follow all
links up to max_hop_count and then follow links forever if a region is
present; I might change this, as it is probably better to require a
region on index pages and honour max_hop_count.

If require_geo is false or not present, htdig runs normally. It will
index position and region data if it finds it and can include it in
search results, but normally it won't find any. So I think it is
back-compatible with regular htdig.

regards
Andrew Daviel
Vancouver Webpages
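For context, the "sort by closest" ranking described above needs a
great-circle distance between the searcher's position and each document's
position metadata. A standard way to compute that is the haversine
formula, sketched below for illustration; Andrew's actual patch may
compute this differently:

    #include <cmath>

    // Great-circle (haversine) distance in km between two
    // latitude/longitude points given in degrees.
    double distance_km(double lat1, double lon1,
                       double lat2, double lon2) {
        const double R   = 6371.0;         // mean Earth radius, km
        const double rad = M_PI / 180.0;   // degrees -> radians
        double dlat = (lat2 - lat1) * rad;
        double dlon = (lon2 - lon1) * rad;
        double a = sin(dlat / 2) * sin(dlat / 2)
                 + cos(lat1 * rad) * cos(lat2 * rad)
                   * sin(dlon / 2) * sin(dlon / 2);
        return 2.0 * R * atan2(sqrt(a), sqrt(1.0 - a));
    }

Sorting results by ascending distance_km() then gives "closest first".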
|
From: Geoff H. <ghu...@us...> - 2002-12-29 08:13:49
|
STATUS of ht://Dig branch 3-2-x
RELEASES:
3.2.0b5: Next release, tentatively 1 Dec 2002.
3.2.0b4: "In progress" -- snapshots called "3.2.0b4" until prerelease.
3.2.0b3: Released: 22 Feb 2001.
3.2.0b2: Released: 11 Apr 2000.
3.2.0b1: Released: 4 Feb 2000.
(Please note that everything added here should have a tracker PR# so
we can be sure it gets fixed. Geoff is currently trying to add PR#s for
what's currently here.)
SHOWSTOPPERS:
* Mifluz database errors are a severe problem (PR#428295)
-- Does Neal's new zlib patch solve this for now?
KNOWN BUGS:
* Odd behavior: $(MODIFIED) and scores do not work with
  wordlist_compress set, but work fine without wordlist_compress.
  (The date is definitely stored correctly, even with compression on,
  so this must be some sort of weird htsearch bug.) PR#618737.
* Not all htsearch input parameters are handled properly: PR#405278. Use a
  consistent mapping of input -> config -> template for all inputs where
  it makes sense to do so (everything but "config" and "words"?).
* META descriptions are somehow added to the database as FLAG_TITLE,
not FLAG_DESCRIPTION. (PR#618738)
PENDING PATCHES (available but need work):
* Additional support for Win32.
* Memory improvements to htmerge. (Backed out b/c htword API changed.)
* Mifluz merge.
NEEDED FEATURES:
* Field-restricted searching. (e.g. PR#460833)
* Return all URLs. (PR#618743)
* Handle noindex_start & noindex_end as string lists.
* Quim's new htsearch/qtest query parser framework.
* File/Database locking. PR#405764.
TESTING:
* httools programs:
(htload a test file, check a few characteristics, htdump and compare)
* Turn on URL parser test as part of test suite.
* htsearch phrase support tests
* Tests for new config file parser
* Duplicate document detection while indexing
* Major revisions to ExternalParser.cc, including fork/exec instead of popen,
argument handling for parser/converter, allowing binary output from an
external converter.
* ExternalTransport needs testing of changes similar to ExternalParser.
DOCUMENTATION:
* List of supported platforms/compilers is ancient. (PR#405279)
* Add thorough documentation on htsearch restrict/exclude behavior
(including '|' and regex).
* Document all of htsearch's mappings of input parameters to config attributes
to template variables. (Relates to PR#405278.)
Should we make sure these config attributes are all documented in
defaults.cc, even if they're only set by input parameters and never
in the config file?
* Split attrs.html into categories for faster loading.
* Turn defaults.cc into an XML file for generating documentation and
defaults.cc.
* require.html has not been updated to list new features and disk space
  requirements of 3.2.x (e.g. phrase searching, regex matching,
  external parsers and transport methods, database compression).
  PR#405280, PR#405281.
* TODO.html has not been updated for current TODO list and
completions.
* Htfuzzy could use more documentation on what each fuzzy algorithm
does. PR#405714.
* Document the list of all installed files and default
locations. PR#405715.
OTHER ISSUES:
* Can htsearch actually search while an index is being created?
* The code needs a security audit, esp. htsearch. PR#405765.
|
|
From: Gabriele B. <g.b...@co...> - 2002-12-24 07:17:51
|
I'm sending you a script file that Gilles originally sent to me (I
modified it). It's very simple.

Merry XMas to everyone, and Lachlan, I hope I can get to see you down
under (I'll contact you later privately).

Ciao ciao
-Gabriele

On Tue, 2002-12-24 at 06:40, Lachlan Andrew wrote:
> On Fri, 20 Dec 2002 02:13, Geoff Hutchison wrote:
>> Send me an e-mail with your SourceForge account and I can
>> turn it on.
>
> Thanks Geoff :)
>
> My next question is how should I add Changelog entries?
> (The timestamp at the start of each entry seems to be
> generated automatically...)
>
> MERRY CHRISTMAS ALL!!
>
> Lachlan
>
> --
> Lachlan Andrew   Phone: +613 8344-3816   Fax: +613 8344-6678
> Dept of Electrical and Electronic Engg   CRICOS Provider Code
> University of Melbourne, Victoria, 3010 AUSTRALIA   00116K

--
Gabriele Bartolini - Web Programmer
Comune di Prato - Prato - Tuscany - Italy
g.b...@co... | http://www.comune.prato.it

> find bin/laden -name osama -exec rm {} ;
|
From: Geoff H. <ghu...@ws...> - 2002-12-24 06:09:20
|
On Tue, 24 Dec 2002, Lachlan Andrew wrote:
> My next question is how should I add Changelog entries?
> (The timestamp at the start of each entry seems to be
> generated automatically...)

I use emacs as an editor, which makes ChangeLog entries quite easy. If
you're in a particular file, type "add-change-log-entry" (tab completion
works nicely, as does binding it to a key) and make sure it's picking the
right ChangeLog file. It'll do most of the formatting for you.

Otherwise, cut-and-paste works OK too. ;-)

Cheers,
-Geoff
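For anyone who hasn't seen one, a ChangeLog entry in the standard GNU
format (which add-change-log-entry produces) looks roughly like the
following -- the date, name, address, file and function here are all
placeholders:

    2003-01-04  A. Developer  <dev@example.org>

            * htlib/SomeFile.cc (some_function): Describe what changed
            and why, in complete sentences.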
|
From: Lachlan A. <lh...@ee...> - 2002-12-24 05:41:21
|
On Fri, 20 Dec 2002 02:13, Geoff Hutchison wrote:
> Send me an e-mail with your SourceForge account and I can
> turn it on.

Thanks Geoff :)

My next question is how should I add Changelog entries?
(The timestamp at the start of each entry seems to be
generated automatically...)

MERRY CHRISTMAS ALL!!

Lachlan

--
Lachlan Andrew   Phone: +613 8344-3816   Fax: +613 8344-6678
Dept of Electrical and Electronic Engg   CRICOS Provider Code
University of Melbourne, Victoria, 3010 AUSTRALIA   00116K
|
From: David W. <dav...@il...> - 2002-12-23 03:40:40
|
Hi,

Further to my last email about adding an Australian mirror: can we please
get our name listed as 'Ilisys dedicated hosting'? (That's an uppercase I
followed by an l (ilisys).)

Regards,

Dave Wilcox
Technical Support Officer
tel: +61 8 9226 5622   fax: +61 8 9226 5633   1-800-999-618
Ilisys Internet®   http://www.ilisys.com.au/
NEW! Ilisys discussion forums   http://forums.ilisys.com.au
Special - This month only, 10% off everything. Merry christmas!!
|
From: David W. <dav...@il...> - 2002-12-23 03:35:20
|
Hi,

We have set up an Australian mirror of htdig. The mirror can be viewed at
http://htdig.ilisys.com.au/ . Can we get added to your list of mirrors,
please?

Regards,

Dave Wilcox
Technical Support Officer
tel: +61 8 9226 5622   fax: +61 8 9226 5633   1-800-999-618
Ilisys Internet®   http://www.ilisys.com.au/
NEW! Ilisys discussion forums   http://forums.ilisys.com.au
Special - This month only, 10% off everything. Merry christmas!!