HouseSpider

1 page searched, not spidering

  1. nobody

    2003-01-14 06:43:54 UTC
    My site: http://www.learningcommunity.capellauniversity.edu/~rdanevich/

    HTML applet tag code:
    <applet code="HouseSpider.class" archive="HouseSpider.jar, buttons.bevel.jar" width="90%" height="200">
    <param name="URLStart" value="index.html"> <!-- Optional Parameter, already at base -->
    <param name="URLExclude" value=""> <!-- Which folder do you want to exclude? Enter here -->
    <param name="URLHelp" value="searchhelp.htm">
    <param name="bgcolour" value="FFFFFF">
    <param name="fgcolour" value="666666">
    <param name="bgtextcolour" value="FFFFFF">
    <param name="textcolour" value="666666">
    <param name="Debug" value="3">
    <param name="Target" value="_blank">
    </applet>

    Even with an absolute link specified in the URLStart parameter I get the same error.

    JavaConsole Debug message:
    URLStart: http://www.learningcommunity.capellauniversity.edu/~rdanevich/index.html
    URLExclude: http://www.learningcommunity.capellauniversity.edu/~rdanevich/index.html
    URLHelp: http://www.learningcommunity.capellauniversity.edu/~rdanevich/searchhelp.htm
    Reading http://www.learningcommunity.capellauniversity.edu/~rdanevich/HouseSpider.index.zip
    Compressed index-file could not be read.
    Reading http://www.learningcommunity.capellauniversity.edu/~rdanevich/HouseSpider.index
    Index-file could not be read, doing spider-search.
    Loading http://www.learningcommunity.capellauniversity.edu/~rdanevich/index.html ...
    Loading http://www.learningcommunity.capellauniversity.edu/~rdanevich/index.html ... done.
    Listing search words and operators ...
    (Operator codes: 0 = and, 1 = or, 11 = end of or, 2 = exclude)
    bank - 0
    Listing done.
    Parsing http://www.learningcommunity.capellauniversity.edu/~rdanevich/index.html ...
    Warning, found no body tag in: http://www.learningcommunity.capellauniversity.edu/~rdanevich/index.html
    Parsing http://www.learningcommunity.capellauniversity.edu/~rdanevich/index.html ... done.
  2. 2003-01-14 07:05:33 UTC
    You have found a "bug" that shows up only in browsers, not in the appletviewer. You have

    <param name="URLStart" value="index.html"> <!-- Optional Parameter, already at base -->
    <param name="URLExclude" value=""> <!-- Which folder do you want to exclude? Enter here -->

    but still the output in the java console is:

    URLStart: http://www.learningcommunity.capellauniversity.edu/~rdanevich/index.html
    URLExclude: http://www.learningcommunity.capellauniversity.edu/~rdanevich/index.html

    Since URLExclude = URLStart, you get only one page. I'll try to fix this in the upcoming release of HouseSpider, but in the meantime you can solve it with the following workaround:

    <!-- param name="URLExclude" value="" --> <!-- Which folder do you want to exclude? Enter here -->

    Hans
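
One plausible reading of the console output above is that the empty URLExclude value gets resolved against the page URL, and an empty relative reference resolves to the page URL itself. The sketch below only illustrates that reading and a possible guard. It is not taken from the HouseSpider source: `ParamGuard`, `resolveUnguarded`, and `resolveOptional` are hypothetical names, and the example.com address is a placeholder.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, not part of HouseSpider.
class ParamGuard {

    // Resolves a parameter value against the page URL with no checks.
    // An empty value inherits everything from the page URL.
    static String resolveUnguarded(String pageUrl, String paramValue) {
        try {
            return new URL(new URL(pageUrl), paramValue).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad URL: " + paramValue, e);
        }
    }

    // Treats a missing or blank parameter as "not set" and returns null,
    // so the caller can skip the exclude rule entirely.
    static String resolveOptional(String pageUrl, String paramValue) {
        if (paramValue == null || paramValue.trim().isEmpty()) {
            return null;
        }
        return resolveUnguarded(pageUrl, paramValue.trim());
    }

    public static void main(String[] args) {
        String page = "http://www.example.com/~user/index.html"; // placeholder
        System.out.println("unguarded: " + resolveUnguarded(page, ""));
        System.out.println("guarded:   " + resolveOptional(page, ""));
    }
}
```

Commenting out the param tag, as in the workaround, has the same effect as the guard: the applet never sees an empty value to resolve.
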
  3. nobody

    2003-01-14 08:32:38 UTC
    That seems to work, but it only seems to search a maximum of 14 pages. Perhaps that is all I have "linked" together, but I thought this applet did a directory search? I have more than 14 web pages. I also don't understand where it's grabbing the page title/description from. It's almost as if it comes from the first line of text.

    Perhaps I'll try the indexing process tomorrow. I'm not a Java programmer and barely understand it. However, may I suggest you look at this Java search applet code: http://www.babbage.demon.co.uk/java.html
    It seemed to do a good directory search and even displayed some lines of text before and after the search keyword. However, the hyperlink function didn't work.
  4. 2003-01-14 09:27:33 UTC
    1) The applet only follows links - it doesn't do a directory search. How should that be done? In addition, it only follows links to your own pages - not to, for example, http://www.learningcommunity.capellauniversity.edu/~dgichana/coffeeproject/personalpage/webpages/home.htm which isn't in a subdirectory of http://www.learningcommunity.capellauniversity.edu/~rdanevich/

    2) The title is taken from the title tag of your pages. Look at the top bar of your browser (or view the source) and you will see that HouseSpider is using the title. May I suggest using better titles?

    3) I know http://www.babbage.demon.co.uk/java.html very well - quote from the page:
    The applet now compiles under Java 1.1.
    Thanks to Hans Fr. Nordhaug (http://sourceforge.net/users/hansfn/) for the code changes.

    Yes, that is me.

    4) Indexing improves the speed a lot, but doesn't find more pages.

    Hans
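
Points 1 and 2 above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not HouseSpider's actual code: `SpiderRules`, `inScope`, and `titleOf` are hypothetical names, the scope test is a plain prefix comparison against the directory of the start page, and the title is pulled out with a simple regular expression rather than a real HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of HouseSpider.
class SpiderRules {

    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // A link is followed only if it lies under the directory of the start page.
    // Assumes startUrl contains a path, e.g. ends in ".../index.html".
    static boolean inScope(String startUrl, String linkUrl) {
        String baseDir = startUrl.substring(0, startUrl.lastIndexOf('/') + 1);
        return linkUrl.startsWith(baseDir);
    }

    // Returns the trimmed text of the first title element,
    // or the fallback when the page has no title.
    static String titleOf(String html, String fallback) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : fallback;
    }

    public static void main(String[] args) {
        String start = "http://www.learningcommunity.capellauniversity.edu/~rdanevich/index.html";
        String own = "http://www.learningcommunity.capellauniversity.edu/~rdanevich/searchhelp.htm";
        String other = "http://www.learningcommunity.capellauniversity.edu/~dgichana/coffeeproject/personalpage/webpages/home.htm";
        System.out.println(inScope(start, own));
        System.out.println(inScope(start, other));
        System.out.println(titleOf("<html><head><title>Home</title></head></html>", "(untitled)"));
    }
}
```

Under this rule a page is found only if some chain of in-scope links leads to it from the start page, which is why unlinked files in the same directory never show up.
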
  5. nobody

    2003-01-14 14:51:42 UTC
    Thanks, I checked my site map and confirmed that there are only 14 linked pages.

    Did I read that Dr. Shenk adjusted some code so that his version does a directory search? I checked his site but couldn't find any altered code in the <applet> section of the web page where his search engine is.
  6. 2003-01-14 20:36:09 UTC
    No, "Dr. Shenk", as you call him, wrote a script that generates a sitemap. URLStart then points to this sitemap.

    Hans
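
The sitemap approach can be approximated with a small generator that writes one link per HTML file in a directory. The script mentioned above is not shown in this thread, so what follows is only a rough sketch of the idea: `SitemapBuilder`, `htmlFilesIn`, `buildSitemap`, and the output name `sitemap.html` are all hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the script referred to in this thread.
class SitemapBuilder {

    // Lists the .htm/.html files directly inside dir, sorted by name.
    static List<String> htmlFilesIn(File dir) {
        String[] names = dir.list();
        List<String> pages = new ArrayList<String>();
        if (names == null) {
            return pages; // not a directory or not readable
        }
        Arrays.sort(names);
        for (String name : names) {
            String lower = name.toLowerCase();
            if (lower.endsWith(".html") || lower.endsWith(".htm")) {
                pages.add(name);
            }
        }
        return pages;
    }

    // Builds a page with one relative link per file name.
    // File names are inserted as-is, without HTML escaping.
    static String buildSitemap(List<String> pages) {
        StringBuilder html = new StringBuilder();
        html.append("<html><head><title>Site map</title></head><body>\n");
        for (String page : pages) {
            html.append("<a href=\"").append(page).append("\">")
                .append(page).append("</a><br>\n");
        }
        html.append("</body></html>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        // Redirect the output to a file such as sitemap.html.
        System.out.print(buildSitemap(htmlFilesIn(dir)));
    }
}
```

With the generated page uploaded next to the other pages, setting the URLStart parameter to that file lets the spider reach every listed page by following links, with no directory search needed.
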