From: G. T. Stresen-R. <ted...@ma...> - 2006-08-09 08:37:33
ASP, PHP, JSP, cgi, and any other server-side scripting language sites can be indexed just fine by htdig. It is true, however, that a developer _may_ find a way to script a site such that it cannot be properly indexed by htdig (or such that the indexing/searching process would not return optimal results).

It should also be noted that if the content to be searched lives in a database and the only way to access the content is via a search form, a workaround would have to be implemented in order for htdig to be able to index the content. This is because htdig indexes a site like a spider: it crawls every link looking for content. If there are no links to the content, the content will not be included in the index (nor in the search results, obviously).

I can promise you that at least one major firm in the U.S. uses htdig as the search engine for their intranet, which is done in PHP and Fusebox (which does not produce Search Engine Friendly URLs; in other words, there are plenty of query string parameters). The content lives in a database and access is restricted by groups (you have to be a member of the group to see the content). The search results are filtered: users only see content for which they have security clearance. This is accomplished via a PHP wrapper.

I should note that I am NOT a developer (but I have provided an installer package for Mac OS X and have used htdig since 1998).

HTH

Ted Stresen-Reuter
http://www.tedmasterweb.com/modules/mydownloads/viewcat.php?cid=1
http://www.clevernet.biz

On Aug 1, 2006, at 2:44 AM, JL Martin wrote:

> Hi,
>
> My hosting company (Telus) claims that htDig cannot index websites that make use of ASP (they use version 3.1.6 and it worked in the past).
>
> Here's an excerpt of their email .....
>
> ....To summarize, the results that HtDig would receive from a site using a server side scripting language could easily be ambiguous if not meaningless. There really is no solution for this, as we cannot predict what a developer might chose to code or how they might choose the make the site work. The best that can be done is to realize that HtDig is a good tool for static (html) websites. Unfortunately, as a result of it's design and implementation, HtDig does not handle dynamically composed websites very well or in a consistant manner.
>
> Any truth to this? Would upgrading to v3.2 resolve this (scripting) issue?
>
> JL Martin
> -------------------------------------------------------------------------
> ht://Dig Developer mailing list:
> htd...@li...
> List information (subscribe/unsubscribe, etc.)
> https://lists.sourceforge.net/lists/listinfo/htdig-dev
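
To illustrate the point in the reply about htdig crawling links like a spider, here is a minimal Python sketch of link-following reachability. It is not htdig's code and does not reproduce its behavior in detail; the page paths, the `LinkExtractor` class, and the `crawl` helper are all hypothetical. It only shows why a page reachable solely through a search form is never visited, and why a plain page of links (one possible workaround) makes it visible.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags; forms are ignored."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(site, start):
    """Breadth-first walk over `site` (dict of path -> HTML),
    following only <a href> links, the way a spider would."""
    seen = {start}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        parser = LinkExtractor()
        parser.feed(site[path])
        for link in parser.links:
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


# Hypothetical site: the article is reachable only via the search form.
site = {
    "/index.php": '<a href="/about.php">About</a>'
                  '<form action="/search.php"><input name="q"></form>',
    "/about.php": "<p>About us</p>",
    "/article.php?id=7": "<p>Database-backed content</p>",
}
indexed_before = crawl(site, "/index.php")

# Assumed workaround: a page of plain links generated from the database,
# linked from somewhere the crawler already reaches.
site["/all-articles.php"] = '<a href="/article.php?id=7">Article 7</a>'
site["/index.php"] += '<a href="/all-articles.php">All articles</a>'
indexed_after = crawl(site, "/index.php")

print(sorted(indexed_before))
print(sorted(indexed_after))
```

Note that the query string in `/article.php?id=7` is no obstacle in this sketch: what matters is whether some crawled page links to the URL, not whether the URL is "search engine friendly".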