I've been fighting a small headache trying to get the search engine to add the sub-files to the index. I finally broke down and wrote a small .html page that links to every other page I want searchable. This finally got HouseSpider to see the pages, but now I'm getting "Warning, ignoring: file:///Z:\NDProject\TEST\AA\AA.html - unsupported file format/protocol." I'll include my entire setup below. Please, any assistance would be awesome.
The website is stored locally and will eventually be moved to a network drive to be accessed by multiple users on a closed, secure intranet.
<applet code="HouseSpider.class" archive="HouseSpider.jar,buttons.bevel.jar" width="90%" height="200">
<param name="URLStart" value="file:///Z:\NDProject\TEST\Main.html">
<param name="URLHelp" value="http://housespider.sourceforge.net/doc/ver44/">
<param name="bgcolour" value="FFFFFF">
<param name="fgcolour" value="000000">
<param name="bgtextcolour" value="FFFFFF">
<param name="textcolour" value="000000">
</applet>
The jars are located in the TEST folder, alongside the "index" page.
java -cp HouseSpider.jar;buttons.bevel.jar;il8n.jar HouseSpider URLStart="file:///Z:\NDProject\TEST\contents.html" Debug="3"
This is HouseSpider v4.7.
Doing a local (file) search.
URLStart: file:/Z:/NDProject/TEST/contents.html
URLHelp: http://housespider.sourceforge.net/doc/ver47/info.html
URLExclude: Not used.
FileExclude: Not used.
Search started.
Searchstring: houseindex
Loading file:/Z:/NDProject/TEST/contents.html … done.
Writing: HouseSpider.index
Parsing file:/Z:/NDProject/TEST/contents.html.
Done parsing file:/Z:/NDProject/TEST/contents.html.
Warning, ignoring: file:///Z:\NDProject\TEST\911\911.html - unsupported file format/protocol.
Warning, ignoring: file:///Z:\NDProject\TEST\Police\Police.html - unsupported file format/protocol.
Warning, ignoring: file:///Z:\NDProject\TEST\AA\AA.html - unsupported file format/protocol.
Warning, ignoring: file:///Z:\NDProject\TEST\AAA\AAA.html - unsupported file format/protocol.
Warning, ignoring: file:///Z:\NDProject\TEST\Abandoned\Abandoned.html - unsupported file format/protocol.
Warning, ignoring: file:///Z:\NDProject\TEST\Aircraft/Aircraft.html - unsupported file format/protocol.
Done - indexed 1 pages in total.
Index file successfully created!
Writing: HouseSpider.log
Search finished.
Search duration: 63 (ms).
Thanks in advance for the help!!
Yes, HouseSpider is a web spider/crawler, so you need to link to all the pages that you want indexed. It's not a general indexer. In addition, it expects to index web pages, so the file protocol isn't supported. Why not use relative URLs in your index file? In other words, in contents.html link to 911/911.html (or 911\911.html) instead of file:///Z:\NDProject\TEST\911\911.html. (This is untested; it's been some years since I used HouseSpider myself.)
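As a rough, equally untested sketch, contents.html with relative links might look something like the following. The folder and file names are taken from the warnings in your log; the page title and link text are only placeholders I made up, and I am assuming contents.html sits in Z:\NDProject\TEST next to the jars.

```html
<!-- contents.html, assumed to live in Z:\NDProject\TEST -->
<html>
<head><title>Contents</title></head>
<body>
<!-- Relative links are resolved against the folder holding this file,
     so no file:/// prefix or drive letter is needed. -->
<a href="911/911.html">911</a><br>
<a href="Police/Police.html">Police</a><br>
<a href="AA/AA.html">AA</a><br>
<a href="AAA/AAA.html">AAA</a><br>
<a href="Abandoned/Abandoned.html">Abandoned</a><br>
<a href="Aircraft/Aircraft.html">Aircraft</a>
</body>
</html>
```

I'd go with forward slashes, since that is the standard separator in URLs. A side benefit is that relative links should keep working when the whole TEST folder is copied to the network drive, because nothing in the page refers to Z: any more.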
Let me know how it goes.