From: Antoni M. <ant...@gm...> - 2009-06-10 22:50:22
Hi,

This is not the same. droste.zip contains itself, and thus when unzipping it
with a normal unzip utility you'll get two perfectly normal files, a jpg and a
zip; only when you apply the same crawler to the resulting zip do you get into
an infinite loop. AFAIU your zip processing class doesn't do recursion
(crawling into a zip inside a zip), so it won't exhibit the problem with
droste.zip. Our ZipSubCrawler, though, doesn't check the decompression ratio,
which still makes it vulnerable when someone applies an extractor to a file
extracted from 42.zip. We'll have to look at the 42.zip issue separately.

Antoni Mylka
ant...@gm...

Jukka Zitting wrote:
> Hi,
>
> 2009/6/10 Antoni Mylka <ant...@gm...>:
>> I've implemented a solution to the issue http://tinyurl.com/sf2786554.
>> The AbstractArchiverSubCrawler will check whether the same pattern repeats
>> in the URI more than 10 times. If it does, the offending entry will not
>> be returned as a FileDataObject, but as a plain DataObject, so the
>> crawler handler will not subcrawl it again. This makes us immune
>> to droste.zip, and (probably) other zip bombs of this kind.
>
> Interesting, thanks for sharing this!
>
> In Apache Tika we've recently solved the same problem in a slightly
> different manner; see https://issues.apache.org/jira/browse/TIKA-216
> for the details.
>
> BR,
>
> Jukka Zitting
>
> _______________________________________________
> Aperture-devel mailing list
> Ape...@li...
> https://lists.sourceforge.net/lists/listinfo/aperture-devel
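
P.S. For anyone following along, the URI-repetition check described above can
be sketched roughly as follows. This is only an illustration of the idea, not
the actual AbstractArchiverSubCrawler code; the class name RecursionCheck and
the MAX_REPEATS constant are made up for the example:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the idea: if any path segment of an archive entry's URI repeats
// more than a fixed number of times, the entry is probably part of a
// self-containing archive such as droste.zip, so the crawler should stop
// descending into it. Illustrative names only, not Aperture API.
public class RecursionCheck {
    private static final int MAX_REPEATS = 10;

    /** Returns true if any path segment occurs more than MAX_REPEATS times. */
    public static boolean looksRecursive(String uri) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String segment : uri.split("/")) {
            if (segment.length() == 0) {
                continue; // skip empty segments from leading or double slashes
            }
            Integer seen = counts.get(segment);
            int n = (seen == null ? 0 : seen.intValue()) + 1;
            counts.put(segment, n);
            if (n > MAX_REPEATS) {
                return true; // same name nested too deeply: likely a bomb
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A normal entry URI: no segment repeats.
        System.out.println(looksRecursive("zip:file:/tmp/archive.zip/photo.jpg"));
        // droste.zip nested inside itself 11 times: flagged.
        StringBuilder bomb = new StringBuilder("zip:file:/tmp");
        for (int i = 0; i < 11; i++) {
            bomb.append("/droste.zip");
        }
        System.out.println(looksRecursive(bomb.toString()));
        // prints: false, then true
    }
}
```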
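
And for the 42.zip side, a decompression-ratio guard could look roughly like
this. Again purely a sketch, assuming a streaming extractor that reports byte
counts as it goes; RatioGuard and its thresholds are invented for the example
and are not part of Aperture or Tika:

```java
// Guards against high-ratio "zip bombs" such as 42.zip by capping the ratio
// of decompressed to compressed bytes. The caller feeds in byte counts as
// extraction progresses and aborts once update() returns false.
// Illustrative names and thresholds only.
public class RatioGuard {
    private static final long MAX_RATIO = 100;       // allow up to 100:1
    private static final long MIN_COMPRESSED = 1024; // too little data to judge

    private long compressedBytes = 0;
    private long decompressedBytes = 0;

    /** Records progress; returns false once the ratio looks like a bomb. */
    public boolean update(long compressedDelta, long decompressedDelta) {
        compressedBytes += compressedDelta;
        decompressedBytes += decompressedDelta;
        if (compressedBytes < MIN_COMPRESSED) {
            return true; // not enough compressed input seen yet
        }
        return decompressedBytes / compressedBytes <= MAX_RATIO;
    }

    public static void main(String[] args) {
        RatioGuard guard = new RatioGuard();
        // 2 KB compressed expanding to 50 KB is a harmless 25:1 ratio.
        System.out.println(guard.update(2048, 50 * 1024));
        // The same entry then expanding by another 1 GB gets flagged.
        System.out.println(guard.update(0, 1L << 30));
        // prints: true, then false
    }
}
```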