Hi all,
I have a problem with query strings when digging a server. The server hands out URLs
with the same query parameters, but one parameter (_last) keeps changing its value, so the same page
is indexed many times under different URLs. A second problem is that the digging process never ends, or ends with a
very high number of URLs.
I know this is a problem with this particular server, but I am not able to change its structure,
because it is an external server.
So I thought I could solve the problem by changing htdig.
I tried to eliminate the query parameter '_last' (see below) in the method 'push' of class Server (Server.cc),
but that was not really successful, because the digging process then ended too early.
Can someone give me an idea of where in htdig I should eliminate this bad query parameter?
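To make clear what I mean by eliminating the parameter, here is a minimal sketch in plain C++. It is not htdig code: the function name strip_query_param is made up for illustration, it works on std::string rather than on htdig's own URL and String classes, and it ignores '#' fragments. It only shows the kind of normalization I would like applied to each URL before htdig decides whether it has already seen it.

```cpp
#include <string>

// Hypothetical helper, not part of htdig: returns `url` with every
// query parameter called `name` removed. All other parameters keep
// their original order and values.
std::string strip_query_param(const std::string& url, const std::string& name)
{
    const std::string::size_type qpos = url.find('?');
    if (qpos == std::string::npos)
        return url;                          // no query string at all

    const std::string base  = url.substr(0, qpos);
    const std::string query = url.substr(qpos + 1);

    std::string kept;                        // parameters we keep, joined by '&'
    std::string::size_type start = 0;
    while (start <= query.size()) {
        std::string::size_type end = query.find('&', start);
        if (end == std::string::npos)
            end = query.size();

        const std::string param = query.substr(start, end - start);
        // The key is everything before '=' (or the whole token if there is no '=').
        const std::string key = param.substr(0, param.find('='));

        if (!param.empty() && key != name) {
            if (!kept.empty())
                kept += '&';
            kept += param;
        }
        start = end + 1;
    }

    // Drop the '?' as well if nothing is left after it.
    return kept.empty() ? base : base + '?' + kept;
}
```

Called as strip_query_param(url, "_last"), this would turn the two bilder/leer.htm URLs in the log below that differ only in _last into one and the same URL.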
Here is an example of digging this server (I stopped it manually!):
ht://dig Start Time: Thu Oct 25 10:15:46 2001
New server: www.xxxx.de, 80
- Persistent connections: enabled
- HEAD before GET: disabled
- Timeout: 30
- Connection space: 0
- Max Documents: 10000
- TCP retries: 1
- TCP wait time: 5
0:2:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/arzt/pass/req.htm?_usr=&_pwd=&_last=01298101328&_ses=2177219424.01298101139: --- size = 3347
1:3:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/patient/frameset_patient.htm?_usr=&_pwd=: + size = 289
2:5:0:http://www.xxxx.de/: ++ size = 1663
3:8:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/bilder/leer.htm?_usr=&_pwd=&_last=01298101328&_ses=2177219424.01298101139: size = 317
4:7:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/bilder/leer.htm?_usr=&_pwd=: + size = 276
5:6:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/hauptframeset.htm?_usr=&_pwd=: + size = 278
6:11:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/detail_mittelpunkt_patient.htm?_usr=&_pwd=&_last=01298101210&_ses=2177219424.01298101139: size = 587
7:12:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/navi_service_patient.htm?_usr=&_pwd=&_last=01298101240&_ses=2177219424.01298101139: +++-- size = 4279
8:16:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/frameset_service_patient.htm?_usr=&_pwd=&_last=01298101306&_ses=2177219424.01298101139: +++ size = 1144
9:20:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/frameset_service_patient.htm?_usr=&_pwd=&_last=01298101220&_ses=2177219424.01298101139: *** size = 1144
10:19:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/service_patient.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: +-+-*--- size = 11294
11:18:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/navi_service_patient.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: ***-- size = 4279
12:23:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/banner_service.htm?_usr=&_pwd=&_last=01298101242&_ses=2177219424.01298101139: size = 462
13:24:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/patient/frameset_praxis_patient.htm?_usr=&_pwd=&_last=01298101302&_ses=2177219424.01298101139: +++ size = 1137
14:17:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/banner_service.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: size = 462
15:27:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/patient/praxis_patient.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: *--- size = 20050
16:26:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/patient/navi_praxis_patient.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: +++-- size = 4289
17:25:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/banner_praxis.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: size = 460
18:31:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/arzt/frameset_arzt.htm?_usr=&_pwd=&_last=01298101242&_ses=2177219424.01298101139: ++++ size = 2309
19:36:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/arzt/frameset_arzt.htm?_usr=&_pwd=&_last=01298101326&_ses=2177219424.01298101139: **** size = 2309
20:35:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/bilder/leer.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: size = 317
21:34:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/arzt/pass/req.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: --- size = 3347
22:33:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/navi_home.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: ++-- size = 3732
23:32:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/banner_home.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: size = 456
24:39:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/detail_nebenwirkungen.htm?_usr=&_pwd=&_last=01298101324&_ses=2177219424.01298101139: size = 594
25:40:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/frameset_service_patient.htm?_usr=&_pwd=&_last=01298101321&_ses=2177219424.01298101139: +++ size = 1144
26:44:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/patient/frameset_praxis_patient.htm?_usr=&_pwd=&_last=01298101324&_ses=2177219424.01298101139: +++ size = 1137
27:43:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/service_patient.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: +-+-*--- size = 11294
28:47:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/patient/praxis_patient.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: +--- size = 20050
...