WebLech URL Spider

Code

Programming Languages: Java

License: MIT License

Repositories

browse code, statistics, last commit on 2004-06-13

cvs -d:pserver:anonymous@weblech.cvs.sourceforge.net:/cvsroot/weblech login

cvs -z3 -d:pserver:anonymous@weblech.cvs.sourceforge.net:/cvsroot/weblech co -P modulename

What's happening?

  • A Better Spider

    http://www.codeplex.com/spidernet.

    2007-07-20 14:45:19 UTC by cawoodm

  • spider OutOfMemoryError

    When I use the spider to mirror a website, with the max depth set to 10 and max threads set to 8 in the configuration, the console prints the error below after about 20 minutes: Exception in thread "Spider-Thread-2" java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Unknown Source) at java.io.ByteArrayOutputStream.write(Unknown Source) at...

    2007-07-20 06:33:42 UTC by qzheng3

  • make command didn't make

    OS: Ubuntu 7.04, latest version of make. I ran both sudo make and make without sudo, with the same error below: gcc openwebspider-0.7.c -o openwebspider `mysql_config --cflags --libs` -lpthread -ldl -rdynamic -Wall -O2 /bin/sh: mysql_config: not found openwebspider-0.7.c:68:27: error: mysql/mysql.h: No such file or directory In file included from functions.h:31, from...

    2007-06-19 01:32:49 UTC by johndoe32102002

  • following Java-Links

    Some homepages I'm working with use links like onClick="location.href='URL'" in combination with frames to open submenus and some pages simultaneously. Because I have no direct FTP access, I tried to mirror the page, but the Lech didn't follow the links. I tried to work around this by entering all URLs as startLocation, but it seems this didn't work either. I'm looking forward to using...

    2006-07-13 15:34:09 UTC by nobody

  • missing pages

    I tested weblech on a site, and it missed some pages. For example, in the index file there are links like this: But weblech did not visit the biodep.php page, for instance.

    2006-06-20 13:02:37 UTC by nobody

  • Followup: RE: Java heap space?

    Shaun, I'm guessing you were running Weblech on quite a large set of files? Which version of Weblech are you running, and how long did it take to produce this error? If this was due to a large set of files, you've got two options: i) run Weblech on smaller sets of files over multiple runs; ii) increase the heap space available to Java. Java's maximum heap size can be set...

    2006-04-02 12:12:39 UTC by tom_hey

  • Java heap space?

    I'm getting this error. Please advise: Exception in thread "Spider-Thread-2" java.lang.OutOfMemoryError: Java heap space Exception in thread "Spider-Thread-3" java.lang.OutOfMemoryError: Java heap space.

    2006-03-19 15:31:53 UTC by shaunmurray

  • GUI / Swing Interface...

    I recommend a JNLP application that allows simple configuration (most people have simple requirements). However, the basic need is a GUI front end, so people don't HAVE to learn the syntax. Todd PS: If you'd like, I can do it for you; email me at musheno@users.sourceforge.net (I will require credit).

    2005-11-26 01:57:55 UTC by musheno

  • Spider the whole internet???

    I am a new user of weblech, and I find it great and very helpful. I have some questions about using it: 1. I want to use weblech to spider the whole internet; what should I do? 2. When downloading a web page, can it identify the last-modified time of that page, so that weblech does not need to download the page again when the time hasn't changed? I need someone to help. Thanks.

    2005-10-11 10:23:02 UTC by happier5281

  • Comment: Application does not exit when multiple Threads are used

    This is a real bummer. I kept waiting and waiting. A simple Control-C stopped the process and allowed a clean exit.

    2005-07-13 02:48:55 UTC by scruf
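
The heap-space posts above suggest increasing the memory available to Java. As a minimal sketch, and not part of WebLech (the class name HeapReport and its methods are made up for illustration), the following prints the heap limit the running JVM actually has, which may help confirm whether a larger setting took effect:

```java
public class HeapReport {
    /** Converts a byte count to whole mebibytes (rounds down). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Formats the JVM's heap limit and current usage on one line. */
    static String describeHeap() {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() is the heap currently reserved; freeMemory() is the unused part of it.
        long used = rt.totalMemory() - rt.freeMemory();
        return String.format("max=%d MiB, used=%d MiB", toMiB(rt.maxMemory()), toMiB(used));
    }

    public static void main(String[] args) {
        System.out.println(describeHeap());
    }
}
```

Running it as `java -Xmx512m HeapReport` uses the standard HotSpot `-Xmx` option to raise the limit; the 512m value is only an example. The same option can presumably be passed to whatever java command launches the spider.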
