From: Christopher M. <chr...@mc...> - 2002-03-04 22:32:54
|
First off, I want to say thanks to the htdig developers. Your work has
added tremendous value to our site and if there was a way I could get our
university to give you big gobs of cash, I would. :-)

For our search engine, I wanted things like restrict and config file values
to be dynamically generated by an SQL query via PHP. I had a look at the PHP
wrapper for htdig, and found it a bit involved to set up.

So, what I did instead was modify the Display.cc file in htsearch so that
it generated properly formatted query strings. All this meant was changing
';' for '&' in the Display::createURL method (16 of them to change).

Then to insert the htdig query in a PHP page, I set script_name in the
config file accordingly and then did this:

  <?
  <code to grab restrict, config and other stuff from PostgreSQL>

  $words=urlencode($words);

  include("http://server_name.com/cgi-bin/htsearch_php?restrict=${restrict};config=${config};method=${method};sort=${sort};matchesperpage=${matchesperpage};words=${words};page=${page}");
  ?>

Anyway, I found this to be *much* easier than using the PHP wrapper
example, so hopefully this will be of use to someone else.

Maybe it would be nice to have a config option in 3.2 where it could
compile a php friendly version? Just a thought.

Thanks again.

Cheers,

Chris

--
Christopher Murtagh
Webmaster / Sysadmin
Web Communications Group
McGill University
Montreal, Quebec
Canada

Tel.: (514) 398-3122
Fax: (514) 398-2017
|
From: Gilles D. <gr...@sc...> - 2002-03-16 00:04:28
|
According to Christopher Murtagh:
> So, what I did instead was modify the Display.cc file in htsearch so that
> it generated properly formatted query strings. All this meant was changing
> ';' for '&' in the Display::createURL method (16 of them to change).
...
> Maybe it would be nice to have a config option in 3.2 where it could
> compile a php friendly version? Just a thought.

Are you using the latest version of PHP? I thought I had read previously
on this list that it now can parse CGI parameters separated by semicolons.
Maybe I'm remembering wrong? I know the latest CGI.pm for Perl does.
HTML 4.0 is hardly a new standard, and we've actually been slow to bring
ht://Dig into compliance with it, so I don't know why other web application
developers are so slow to get with the program. See FAQ 5.21 if you
haven't already.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
|
From: Christopher M. <chr...@mc...> - 2002-04-22 20:05:21
|
On Fri, 15 Mar 2002, Gilles Detillieux wrote:
>According to Christopher Murtagh:
>> So, what I did instead was modify the Display.cc file in htsearch so that
>> it generated properly formatted query strings. All this meant was changing
>> ';' for '&' in the Display::createURL method (16 of them to change).
>...
>> Maybe it would be nice to have a config option in 3.2 where it could
>> compile a php friendly version? Just a thought.
>
>Are you using the latest version of PHP? I thought I had read previously
>on this list that it now can parse CGI parameters separated by semicolons.
>Maybe I'm remembering wrong? I know the latest CGI.pm for Perl does.
>HTML 4.0 is hardly a new standard, and we've actually been slow to bring
>ht://Dig into compliance with it, so I don't know why other web application
>developers are so slow to get with the program. See FAQ 5.21 if you
>haven't already.

Hi Gilles,

Thanks for the note... sorry it took so long for me to reply. Your reply
got filtered badly and I just saw it today!

Sorry I didn't see the FAQ, I didn't realize that this was a problem of PHP
not being compliant. FWIW, I tried the following with PHP version 4.1.2
(the latest):

  <HTML>
  <?
  print"foo: $foo<BR>";
  print"foobar: $foobar<BR>";
  ?>
  </HTML>

with this URL:

  http://server/file.php?foo=hey&foobar=there

Output is as expected:

  foo: hey
  foobar: there

However, this URL:

  http://server/file.php?foo=hey;foobar=there

produces:

  foo: hey;foobar=there
  foobar:

Perhaps there is a compile-time option in PHP? Has anyone else solved this
issue?

Cheers,

Chris
|
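The output reported above is what one would expect from a parser that splits the query string on '&' only. The following is a minimal sketch of that behaviour, written in Python for illustration rather than taken from PHP's own parser; `parse_query` and its `separators` argument are hypothetical names, not part of PHP or ht://Dig.

```python
import re
from urllib.parse import unquote_plus


def parse_query(query, separators="&"):
    """Split a query string on any single character listed in `separators`.

    Hypothetical helper for illustration; it models only the splitting step.
    """
    pattern = "[" + re.escape(separators) + "]"
    params = {}
    for pair in re.split(pattern, query):
        if not pair:
            continue  # skip empty pieces such as those produced by "a=1&&b=2"
        name, _, value = pair.partition("=")
        params[unquote_plus(name)] = unquote_plus(value)
    return params


# Splitting on "&" only: the semicolon stays inside the value of foo.
print(parse_query("foo=hey;foobar=there"))
# Splitting on "&" or ";": both parameters are recovered.
print(parse_query("foo=hey;foobar=there", separators="&;"))
```

With the default separator the whole string "hey;foobar=there" ends up as the value of foo and foobar is never set, which matches the result seen with the semicolon URL.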
From: Ted Stresen-R. <bow...@ho...> - 2002-04-23 00:02:20
|
1) You can set the argument separator for PHP in the php.ini file as in the
following:

  arg_separator.output = ";"
  arg_separator.input = ";"

(be sure to use the quotes as the semi-colon is seen as an "end of line"
marker in the php.ini file)

2) Depending on your configuration, you may be able to do (1) on a
directory-by-directory basis using the .htaccess file (php.net has more info
on how to do this, but this article may be more helpful:
http://www.modwest.com/help/kb.phtml?cat=1&qid=44).

3) You can also use ini_set("arg_separator.input", ";") (and ini_get()) to
set the argument separator right in a PHP page, but note that if you are
trying to set these values after a page has been returned to the browser
from htsearch, you may be too late as the page has already been parsed by
both htsearch and the web server (and PHP).

  http://www.modwest.com/help/kb.phtml?cat=5&qid=98

For susansexton.com, we just went ahead and changed it in the php.ini file
so that we would be compliant and not worry about it anymore...

For details on how we set up PHP to work with htdig, check out
http://dev.susansexton.com/htdig/

Ted Stresen-Reuter
(trying really hard to provide quality feedback ;-)
|
From: Gilles D. <gr...@sc...> - 2002-04-23 16:11:23
|
According to Ted Stresen-Reuter:
> 1) You can set the argument separator for php in the php.ini file as in the
> following:
> arg_separator.output = ";"
> arg_separator.input = ";"
>
> (be sure to use the quotes as the semi-colon is seen as an "end of line"
> marker in the php.ini file)

If you do this, will PHP still recognize the "&" as a valid argument
separator as well? Ideally, you'd want a way of configuring it to
allow either.

If your PHP page is called from the action attribute of an HTML <form> tag,
using the GET method, it will get its parameters separated by ampersands,
as this is the CGI standard. However, if the same PHP page is called from
an HTML <a href=...> tag, it should be able to recognize the semicolon as
separator, as the HTML 4.0 standard recommends. As far as I can tell, the
susansexton.com web site's results page does allow both, but it keeps
crashing my Netscape 4.x browser so I'm not completely certain of this.

On the other hand, if PHP can only allow one separator and not the other,
then I can see the value in leaving it as the ampersand separator, but then
patching htsearch to use "&", or better still, "&amp;" (for HTML 4.0
compliance) as the parameter separator for page button URLs. If this is
the case, it would be a good argument for adding a new config attribute to
htsearch for this purpose, as "fix your wrapper" wouldn't be a solution in
this case.

...
> Ted Stresen-Reuter
> (trying really hard to provide quality feedback ;-)

Thanks, Ted. I'm really out of my element when it comes to PHP questions,
so I appreciate your fielding this one.
|
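To make the "&" versus "&amp;" distinction concrete: a literal '&' inside an HTML attribute value has to be written as '&amp;', while ';' needs no escaping at all. The sketch below is in Python for illustration; `href_for` is a hypothetical helper and the htsearch path is only an example, not anything htsearch itself generates.

```python
from html import escape


def href_for(base, query):
    """Return an <a> tag whose href value is escaped for use in HTML."""
    url = base + "?" + query
    return '<a href="' + escape(url, quote=True) + '">next</a>'


# An ampersand separator must be escaped inside the attribute value.
print(href_for("/cgi-bin/htsearch", "words=foo&page=2"))
# A semicolon separator can be written into the attribute unchanged.
print(href_for("/cgi-bin/htsearch", "words=foo;page=2"))
```

The browser turns '&amp;' back into '&' before requesting the URL, so the script on the receiving end still sees an ordinary ampersand-separated query string in the first case.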
From: Christopher M. <chr...@mc...> - 2002-04-23 16:48:11
|
On Tue, 23 Apr 2002, Gilles Detillieux wrote:
>According to Ted Stresen-Reuter:
>> 1) You can set the argument separator for php in the php.ini file as in the
>> following:
>> arg_separator.output = ";"
>> arg_separator.input = ";"
>
>If you do this, will PHP still recognize the "&" as a valid argument
>separator as well? Ideally, you'd want a way of configuring it to
>allow either.

Hi Ted, Gilles,

Thanks very much for the feedback. I have tested this and am happy to say
that yes it will work for both. In fact, I also tried this:

  arg_separator.output = ",&;"
  arg_separator.input = ",&;"

and this URL:

  http://server/page.php?foo=hey,foobar=there

worked as expected.

So, rather than the hack I did to htsearch, all I need to do is change
the config line in my php.ini file (well, I did another hack as well, but
I didn't need to change the ';'). :-)

So, now my search engine can fetch restrict values from Postgres that is
aware of multiple sites that span across different servers, and I've
hacked htsearch so that it doesn't put the restrict in the page URL
(because it is generated by PHP/Postgres). I then simply have this code:

  if($words){

    $words = urlencode($words);
    $config = preg_replace ("/[^a-z]/","",$config);
    $method = preg_replace ("/[^a-z]/","",$method);
    $sort = preg_replace ("/[^a-z]/","",$sort);
    $matchesperpage = preg_replace ("/[^0-9]/","",$matchesperpage);
    $SearchArea = preg_replace ("/[^0-9]/","",$SearchArea);
    $page = preg_replace ("/[^0-9]/","",$page);

    print"<!--\n"; // this is to remove the content-type generated for
                   // CGI compliance; also needed in
                   // the php.conf file is the end of the comment,
                   // because I'm using passthru rather than exec
                   // (faster and no parsing arrays)

    passthru("/path_to/cgi-bin/htsearch_php -c /path_to/htdig/conf/php.conf 'restrict=${restrict};config=${config};etc...'");
  }

Thanks again for all the feedback!

Cheers,

Chris
|
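The pattern in the code above is to strip every character outside a small whitelist from each request value before it reaches the htsearch command line. A rough Python rendering of the same idea follows, as a sketch only: `keep_only` and `build_htsearch_argv` are hypothetical names, the paths are the placeholders from the message, and `restrict` is assumed to come from the database rather than from the request.

```python
import re
from urllib.parse import quote_plus


def keep_only(allowed_class, value):
    """Drop every character not matched by the given regex character class."""
    return re.sub("[^" + allowed_class + "]", "", value)


def build_htsearch_argv(params, restrict):
    """Return an argument list for htsearch built from untrusted request values.

    `restrict` is assumed to be trusted (looked up in the database).
    """
    clean = {
        "restrict": restrict,
        "config": keep_only("a-z", params.get("config", "")),
        "method": keep_only("a-z", params.get("method", "")),
        "sort": keep_only("a-z", params.get("sort", "")),
        "matchesperpage": keep_only("0-9", params.get("matchesperpage", "")),
        "page": keep_only("0-9", params.get("page", "")),
        "words": quote_plus(params.get("words", "")),
    }
    query = ";".join(name + "=" + value for name, value in clean.items())
    # Placeholder paths copied from the message above. Handing this list to
    # subprocess.run() without a shell would sidestep shell quoting entirely.
    return [
        "/path_to/cgi-bin/htsearch_php",
        "-c", "/path_to/htdig/conf/php.conf",
        query,
    ]


argv = build_htsearch_argv(
    {"config": "php; rm -rf /", "page": "2x", "words": "jazz student"},
    restrict="www.mcgill.ca/music-departments/",
)
print(argv[-1])
```

Building an argument list instead of a single shell string is a design choice of this sketch, not something the original code does; the original relies on the whitelist plus single quotes around the query.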
From: Gilles D. <gr...@sc...> - 2002-04-23 18:49:16
|
According to Christopher Murtagh:
> On Tue, 23 Apr 2002, Gilles Detillieux wrote:
> >According to Ted Stresen-Reuter:
> >> 1) You can set the argument separator for php in the php.ini file as in the
> >> following:
> >> arg_separator.output = ";"
> >> arg_separator.input = ";"
> >
> >If you do this, will PHP still recognize the "&" as a valid argument
> >separator as well? Ideally, you'd want a way of configuring it to
> >allow either.
>
> Hi Ted, Gilles,
>
> Thanks very much for the feedback. I have tested this and am happy to say
> that yes it will work for both. In fact, I also tried this:
>
> arg_separator.output = ",&;"
> arg_separator.input = ",&;"
>
> and this URL:
>
> http://server/page.php?foo=hey,foobar=there
>
> worked as expected.

OK, so arg_separator.input works as a multiple-choice thingy, where any
one single character in the list works as separator (but & works whether
in the list or not). What do multiple characters in
arg_separator.output mean, though? If it only uses one, which one?
If it picks the one that was used for input, if it's in the list, that
would be ideal. If it uses the whole string as a separator, though,
that's not what you want. How do you test how arg_separator.output works?

If I can get to the bottom of what the ideal settings for these are, I'd
like to add a note about this to FAQ 5.21.

> So, rather than the hack I did to htsearch, all I need to do is change
> the config line in my php.ini file (well, I did another hack as well, but
> I didn't need to change the ';'). :-)
>
> So, now my search engine can fetch restrict values from Postgres that is
> aware of multiple sites that span across different servers, and I've
> hacked htsearch so that it doesn't put the restrict in the page URL
> (because it is generated by PHP/Postgres). I then simply have this code:

Interesting. Normally, htsearch puts in the page URL any parameter that
it received as a CGI or URL parameter initially, so it doesn't propagate
parameters that aren't previously passed to it, but it doesn't allow for
the case where you'd want to pass it a parameter but not have it passed
back. I don't know if this would be a commonly needed feature - I don't
recall any earlier requests for it, or complaints about having to hack
it out of the code.

> if($words){
>
> $words = urlencode($words);
> $config = preg_replace ("/[^a-z]/","",$config);
> $method = preg_replace ("/[^a-z]/","",$method);
> $sort = preg_replace ("/[^a-z]/","",$sort);
> $matchesperpage = preg_replace ("/[^0-9]/","",$matchesperpage);
> $SearchArea = preg_replace ("/[^0-9]/","",$SearchArea);
> $page = preg_replace ("/[^0-9]/","",$page);
>
> print"<!--\n"; // this is to remove the content-type generated for
> // CGI compliance, also needed in
> // the php.conf file is the end of the comment
> // because I'm using passhthru rather than exec
> // (faster and no parsing arrays)

In 3.1.6, you can turn off the content-type header that htsearch outputs.
See http://www.htdig.org/attrs.html#search_results_contenttype
|
From: Christopher M. <chr...@mc...> - 2002-04-23 21:10:52
|
On Tue, 23 Apr 2002, Gilles Detillieux wrote:
>OK, so arg_separator.input works as a multiple-choice thingy, where any
>one single character in the list works as separator (but & works whether
>in the list or not). What do multiple characters in
>arg_separator.output mean, though?

Good question, I'm not sure how to test this. I tried changing it to a
bunch of different options, but I'm not sure where PHP would actually
*use* arg_separator.output. If you make a FORM with the GET method, it is
the browser that generates the query string (Mozilla and every browser I
know of uses '&').

>How do you test how arg_separator.output works?

Dunno.

>If I can get to the bottom of what the ideal settings for these are, I'd
>like to add a note about this to FAQ 5.21.

I would say that for the most part, the arg_separator.input should be
";&".

>> So, now my search engine can fetch restrict values from Postgres that is
>> aware of multiple sites that span across different servers, and I've
>> hacked htsearch so that it doesn't put the restrict in the page URL
>> (because it is generated by PHP/Postgres). I then simply have this code:
>
>Interesting. Normally, htsearch puts in the page URL any parameter that
>it received as a CGI or URL parameter initially, so it doesn't propagate
>parameters that aren't previously passed to it, but it doesn't allow for
>the case where you'd want to pass it a parameter but not have it passed
>back. I don't know if this would be a commonly needed feature - I don't
>recall any earlier requests for it, or complaints about having to hack
>it out of the code.

I don't know how different my site is from anyone else's. I did have to
pass an additional parameter as well. The allow_in_form config was really
handy there, and I was just glad that the code was clean enough to be able
to go and make the changes without any problems. :-)

>In 3.1.6, you can turn off the content-type header that htsearch outputs.
>See http://www.htdig.org/attrs.html#search_results_contenttype

Very cool. Thanks!

BTW, to see how we are using ht://Dig with Postgres and PHP, check out this
URL:

  http://www.mcgill.ca/music-departments/jazz/

and perform a search on 'student' at the bottom of the page. The result
page is color-coded for the site you came from, the title of the search
site and search section comes from Postgres as well, and the restrict value
will be (in this case) www.mcgill.ca/music-departments/. If you refine the
search to the entire faculty, it will include a number of
www.mcgill.ca/music-foo type dirs, but could also include
www.music.mcgill.ca/foo.

ht://Dig rocks. Thanks for all the work.

Cheers,

Chris
|
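On the open question of what several characters in arg_separator.output would mean: the PHP manual describes the input setting as a list in which every character counts as a separator, and the output setting as the separator PHP places between arguments in URLs it generates itself. That wording suggests the output value is inserted as one whole string. The Python sketch below only models that reading and has not been checked against PHP; `build_query` is a hypothetical helper, not a PHP function.

```python
from urllib.parse import quote_plus


def build_query(params, separator="&"):
    """Join name=value pairs, using `separator` as one whole string."""
    return separator.join(
        quote_plus(str(name)) + "=" + quote_plus(str(value))
        for name, value in params.items()
    )


# A single-character output separator gives a clean query string.
print(build_query({"foo": "hey", "foobar": "there"}, separator=";"))
# A multi-character value would be inserted whole between parameters.
print(build_query({"foo": "hey", "foobar": "there"}, separator=",&;"))
```

If that assumption holds, a sensible combination would be a multi-character input list such as ";&" together with a single-character output value.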