From: stack <st...@ar...> - 2005-11-11 18:28:18
kau...@cs... wrote:

>On 11/11/2005, "stack" <st...@ar...> wrote:
>
>>The below should be fixed by upgrade to nutchwax 0.4.1, if you haven't
>>already.
>>
>>St.Ack
>

Sounds like still work to do. Thanks for the detailed report Kaisa (I've
pasted below into new encoding issue and will try and figure what's going
on).

St.Ack

>Yes I noticed that errors in my archive were of the same type as reported
>later by other people. After installing nutchwax 0.4.1 the archive looks
>better now, thanks very much.
>
>I still have some isolated cases where a file is inside archive but wera
>shows 'not found'. Here are some examples of problem urls
>
>Perhaps '&' right after '?' is too much
>http://www.helsinki2005.fi/index.php?&Lang=eng
>http://www.helsinki2005.fi/index.php?&Name=xteams
>http://www.helsinki2005.fi/index.php?&Name=tickets
>
>http://www.noc.fi/mp/db/tiedotteet/foo/IMG?num=18157&FIELD=kuva0_kk&R=897830
>http://www.noc.fi/taustasivut/artikkeliarkisto/?num=17075&JKNUM=17075
>http://www.slu.fi/mp/db/tiedotteet/foo/IMG?num=29512&FIELD=kuva0_pieni&R=064184
>
>Other urls with '&' work fine, but these with 'Name=something&' do
>not.
>http://www.helsinki2005.fi/index.php?Name=newsitem&item=322
>http://www.helsinki2005.fi/index.php?Name=newsitem&item=405
>http://www.helsinki2005.fi/index.php?Name=tickets&lang=eng
>
>When I make in wera a query
>url:http://www.helsinki2005.fi/index.php?Name=tickets&lang=eng
>it reports 102 hits, the first one being
>http://www.helsinki2005.fi/index.php?Name=tickets_1
>but wera only wants to display hits 1-10 and 11-16.
>
>For some reason all images with a '%' character in url still refuse to
>come out. This could apply to html file urls as well if there were any
>in the archive.
>http://www.helsinki2005.fi/files/pics/1079364264_mascot%20medium.gif
>
>I'm not sure which part of nutchwax&wera combination causes it.
>
>Kaisa
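
For anyone trying to reproduce the 'not found' cases reported above, here is a rough Python sketch of two ways a URL can change shape between indexing and lookup. It is only an illustration using the standard urllib.parse module: it does not show what nutchwax or wera actually do, and the helper names (roundtrip_query, lookup_variants) are made up for this example. The guess being illustrated is that one side normalizes the URL and the other does not, so the lookup key no longer matches the stored key.

```python
from urllib.parse import (
    parse_qsl, quote, unquote, urlencode, urlsplit, urlunsplit,
)


def roundtrip_query(url):
    """Parse the query string and re-encode it (hypothetical normalizer).

    parse_qsl() skips empty fields by default, so a query that starts
    with '&' comes back without the leading '&'.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query)
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def lookup_variants(url):
    """Return forms of one URL that a lookup might end up using.

    Assumption for illustration: the archive stores one of these forms
    while the viewer asks for another.
    """
    return {
        "as_is": url,
        # Percent-escapes decoded once, e.g. '%20' becomes a space.
        "decoded": unquote(url),
        # Escaped a second time: '%' itself becomes '%25'.
        "double_encoded": quote(url, safe=":/?&="),
    }


if __name__ == "__main__":
    for u in (
        "http://www.helsinki2005.fi/index.php?&Lang=eng",
        "http://www.helsinki2005.fi/index.php?Name=newsitem&item=322",
    ):
        print(u, "->", roundtrip_query(u))

    img = "http://www.helsinki2005.fi/files/pics/1079364264_mascot%20medium.gif"
    for name, form in lookup_variants(img).items():
        print(name, form)
```

Comparing the printed forms against the keys actually stored in the index might help narrow down which component rewrites the URL. The round trip leaves a plain 'Name=newsitem&item=322' query unchanged, so this sketch does not explain the 'Name=something&' failures.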