|
From: Conor E. <co...@ls...> - 2007-04-03 16:07:20
|
file-4.20 has problems processing a file containing only a large number of
line feed characters (2.7 million in the case we are seeing). file-4.20
takes a long time (11 minutes on my machine) to process this file.
file-4.19 processes the same file instantly.
Anyone else seeing this behavior after upgrading file?
-Conor Edberg
|
|
From: Mark M. <Mar...@ij...> - 2007-04-03 17:32:26
|
Conor,
> file-4.20 has problems processing a file containing only a large number of
> line feed characters (2.7 million in the case we are seeing). file-4.20
> takes a long time (11 minutes on my machine) to process this file.
> file-4.19 processes the same file instantly.
> Anyone else seeing this behavior after upgrading file?
Must be something else in that file, 2.7M LFs alone in a file
makes no difference to file-4.20 here. Can you provide a sample?
$ perl -e 'for (1..2700) {print "\n" x 1000}' >0.lis
$ time file 0.lis
0.lis: ASCII text
real 0m0.253s
user 0m0.245s
sys 0m0.008s
Mark
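A sketch of the same reproduction with a wall-clock limit, so that an affected host gives up after a few seconds instead of spinning for minutes. It assumes the `timeout` utility from GNU coreutils is available; the helper names `make_lf_file` and `run_limited` are made up for illustration and are not part of file(1).

```shell
# Write COUNT line feed characters (and nothing else) to PATH.
# Stands in for the perl one-liner above.
make_lf_file() {
    head -c "$1" /dev/zero | tr '\0' '\n' > "$2"
}

# Run a command under a wall-clock limit of SECS seconds.
# timeout(1) exits with status 124 when the limit is reached.
run_limited() {
    secs=$1
    shift
    timeout "$secs" "$@"
}
```

For example, `make_lf_file 2700000 0.lis` followed by `run_limited 10 file 0.lis` should return almost at once on an unaffected system, and end with status 124 on one that shows the slowdown.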
|
|
From: Mark M. <Mar...@ij...> - 2007-04-03 18:36:52
|
Henrik, Conor,
> > $ perl -e 'for (1..2700) {print "\n" x 1000}' >0.lis
> > $ time file 0.lis
> Here 0.lis uses 100% CPU on Debian, using package or self
> compiled. But no problems on Solaris.
> I guess someone should report this bug.
Please do so.
The author's address is chr...@as... .
(no problems here on FreeBSD 6.2 either).
Mark
|
|
From: Conor E. <co...@ls...> - 2007-04-03 20:47:30
|
--On Tuesday, April 03, 2007 20:36:41 +0200 Mark Martinec <Mar...@ij...> wrote:
> Henrik, Conor,
>
>> I guess someone should report this bug.
>
> Please do so.
> The author's address is chr...@as... .
>
> (no problems here on FreeBSD 6.2 either).
>
> Mark
I sent an email to the author about this along with a link to this thread.
-Conor
|
|
From: Sven S. <sch...@gm...> - 2007-04-03 18:59:15
|
Hi Henrik,
On Tue, Apr 03, 2007 at 08:43:23PM +0300, Henrik Krohns told us:
> > Must be something else in that file, 2.7M LFs alone in a file
> > makes no difference to file-4.20 here. Can you provide a sample?
> >
> > $ perl -e 'for (1..2700) {print "\n" x 1000}' >0.lis
> > $ time file 0.lis
> > 0.lis: ASCII text
> >
> > real 0m0.253s
> > user 0m0.245s
> > sys 0m0.008s
>
> Here 0.lis uses 100% CPU on Debian, using package or self
> compiled. But no problems on Solaris.
tested here on a Fedora Core 6 box, just upgraded to the rawhide rpm
of file:
$ file --version
file-4.20
magic file from /usr/share/file/magic
$ time file 0.lis
*waiting, not holding breath*
0.lis: ASCII text
real 33m32.123s
user 29m56.064s
sys 0m14.149s
$
while running, it chewed up all available cpu power on my shiny
little AMD Duron home box :-) the old version (4.19-something)
worked without problems.
So, might be something linux-related!?
Sven
> I guess someone should report this bug.
>
> Cheers,
> Henrik
--
Linux zion.homelinux.com 2.6.19-1.2911.6.5.fc6xen #1 SMP Sun Mar 4 16:59:41 EST 2007 i686 athlon i386 GNU/Linux
20:03:19 up 8 days, 18:34, 2 users, load average: 1.12, 0.72, 0.57
|
|
From: Vincent Li <vl...@vc...> - 2007-04-03 19:22:45
|
On Tue, 3 Apr 2007, Sven Schuster wrote:
>
> Hi Henrik,
>
> On Tue, Apr 03, 2007 at 08:43:23PM +0300, Henrik Krohns told us:
>>> Must be something else in that file, 2.7M LFs alone in a file
>>> makes no difference to file-4.20 here. Can you provide a sample?
>>>
>>> $ perl -e 'for (1..2700) {print "\n" x 1000}' >0.lis
>>> $ time file 0.lis
>>> 0.lis: ASCII text
>>>
>>> real 0m0.253s
>>> user 0m0.245s
>>> sys 0m0.008s
>>
>> Here 0.lis uses 100% CPU on Debian, using package or self
>> compiled. But no problems on Solaris.
>
> tested here on a Fedora Core 6 box, just upgraded to the rawhide rpm
> of file:
>
> $ file --version
> file-4.20
> magic file from /usr/share/file/magic
> $ time file 0.lis
> *waiting, not holding breath*
> 0.lis: ASCII text
>
> real 33m32.123s
> user 29m56.064s
> sys 0m14.149s
> $
>
>
> while running, it chewed up all available cpu power on my shiny
> little AMD Duron home box :-) the old version (4.19-something)
> worked without problems.
>
> So, might be something linux-related!?
>
Same here, Yellow Dog Linux release 4.1 (Sagitta) on Power Mac G5 dual
core.
Vincent Li
http://bl0g.blogdns.com
|
|
From: Vincent Li <vl...@vc...> - 2007-04-03 19:33:41
|
On Tue, 3 Apr 2007, Vincent Li wrote:
> On Tue, 3 Apr 2007, Sven Schuster wrote:
>
>>
>> Hi Henrik,
>>
>> On Tue, Apr 03, 2007 at 08:43:23PM +0300, Henrik Krohns told us:
>>>> Must be something else in that file, 2.7M LFs alone in a file
>>>> makes no difference to file-4.20 here. Can you provide a sample?
>>>>
>>>> $ perl -e 'for (1..2700) {print "\n" x 1000}' >0.lis
>>>> $ time file 0.lis
>>>> 0.lis: ASCII text
>>>>
>>>> real 0m0.253s
>>>> user 0m0.245s
>>>> sys 0m0.008s
>>>
>>> Here 0.lis uses 100% CPU on Debian, using package or self
>>> compiled. But no problems on Solaris.
>>
>> tested here on a Fedora Core 6 box, just upgraded to the rawhide rpm
>> of file:
>>
>> $ file --version
>> file-4.20
>> magic file from /usr/share/file/magic
>> $ time file 0.lis
>> *waiting, not holding breath*
>> 0.lis: ASCII text
>>
>> real 33m32.123s
>> user 29m56.064s
>> sys 0m14.149s
>> $
>>
>>
>> while running, it chewed up all available cpu power on my shiny
>> little AMD Duron home box :-) the old version (4.19-something)
>> worked without problems.
>>
>> So, might be something linux-related!?
>>
>
> Same here, Yellow Dog Linux release 4.1 (Sagitta) on Power Mac G5 dual
> core.
>
Is there any way for amavisd to avoid the file-4.20 bug before someone uses
this as a DoS attack?
Vincent
|
|
From: Mark M. <Mar...@ij...> - 2007-04-03 20:10:30
|
Vincent,
> Is there any way for amavisd to avoid the file-4.20 bug before someone uses
> this as a DoS attack?
I don't think there is; all mail parts are submitted to file(1) for assessment.
The best path is to get file(1) fixed as soon as possible.
Better a DoS than a free shell (nonprivileged) on a remote mailer.
Mark
|
|
From: Conor E. <co...@ls...> - 2007-04-04 15:11:08
|
>
> So I guess the quick fix is to:
>
> cd /usr/share/file
> (comment out all lines containing regex in magic)
> file -C
>
> Cheers,
> Henrik
>
A bit more detail here: it has been narrowed down to two regex lines in
magic.
-Conor
> ------------ Forwarded Message ------------
> Date: Wednesday, April 04, 2007 10:33:10 -0400
> From: Christos Zoulas <chr...@zo...>
> To: fi...@mx...
> Cc:
> Subject: Re: Possible DoS in file 4.20
>
> On Apr 4, 10:54am, ki...@gl... (Kimmo Suominen) wrote:
> -- Subject: Possible DoS in file 4.20
>
> Well, the regex stuff on later versions of linux seems to be the
> culprit:
>
> A profiling run of file on sle9-sp3:
>
> perl -e 'for (1..2700) {print "\n" x 10}' >0.lis
>
> Shows the top line being:
> 98.03 14.93 14.93 4 3.73 3.80 re_search_internal
>
> The profiling tree looks like:
>
> -----------------------------------------------
>               14.93    0.29       4/4           regexec [11]
> [10]   99.9   14.93    0.29       4         re_search_internal [10]
>                0.00    0.28   53800/53800       re_string_reconstruct [12]
>                0.00    0.00      22/22          extend_buffers [24]
>                0.00    0.00   53800/107598      re_string_context_at [27]
>                0.00    0.00   53800/53800       match_ctx_clean [28]
>                0.00    0.00       6/621         cfree [36]
>                0.00    0.00       6/6           build_trtable [82]
>                0.00    0.00       4/60          memset [51]
>                0.00    0.00       4/8           re_string_construct_common [78]
>                0.00    0.00       4/30          re_string_realloc_buffers [57]
>                0.00    0.00       4/8           re_string_destruct [79]
>
> It fails on all recent glibc based linux distributions. On RH8.0
> it works fine, and on all the BSD's it works fine. So it is the
> new gnu regex code. If you comment out the following two lines
> (thanks to co...@ls... for narrowing it down):
>
># OS/2 batch files are REXX. the second regex is a bit generic, oh well
># the matched commands seem to be common in REXX and uncommon elsewhere
># 100 regex/c =^\\s*call\\s+rxfuncadd.*sysloadfu OS/2 REXX batch file text
># 100 regex/c =^\\s*say\ ['"] OS/2 REXX batch file text
>
> it works fine again. I will comment out the two lines for the next
> file release, but the bug is in the gnu regex code.
>
> christos
>
> ---------- End Forwarded Message ----------
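The quick fix quoted at the top of this message can be sketched as a small filter. The function name `comment_out_regex_entries` is invented for illustration, and the magic file location is the one mentioned in this thread; it may differ on other systems. It is safer to try this on a copy first.

```shell
# Print FILE with a '#' put in front of every line that does not already
# start with '#' and contains the text "regex". Other lines pass unchanged.
comment_out_regex_entries() {
    sed '/^#/!s/^.*regex/#&/' "$1"
}
```

It could be applied as `comment_out_regex_entries /usr/share/file/magic > magic.new`, after which magic.new would replace the original and `file -C` would be rerun as in the quick fix. Going by Christos's message, commenting out only the two OS/2 REXX entries should be enough.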
|
|
From: Mark M. <Mar...@ij...> - 2007-04-04 17:57:05
|
Conor,
> A bit more detail here, it has been narrowed down to 2 regex lines in
> magic.
Thanks!
Here are two replacement lines to the magic file
that limit the RE evaluation time on Linux,
and produce comparable results.
100 regex/c =^[\ \t]{0,999}call[\ \t]{1,999}rxfu OS/2 REXX batch file text
100 regex/c =^[[:space:]]{0,999}say\ ['"] OS/2 REXX batch file text
The following is what I sent to Christos, and will be accepted for the
next version:
| > Well, the regex stuff on later versions of linux seems to be the
| > culprit:
|
| Perhaps. Still, using more than one wildcard in a regular expression
| on a possibly long string is asking for trouble - it depends on
| how well the optimizer in a regex library does its job, and if
| unlucky, one could end up in a deep recursion or with O(2+) searches.
| It needn't be a bug and can still result in unacceptable behaviour.
|
| It is prudent to limit the allowed range of matches in a regular
| expression as much as possible and practical.
|
| The following replacement entries in a magic file are functionally
| pretty close to existing expressions, but are safe to use,
| even on Linux:
|
| 100 regex/c =^[\ \t]{0,999}call[\ \t]{1,999}rxfu OS/2 REXX batch file text
| 100 regex/c =^[[:space:]]{0,999}say\ ['"] OS/2 REXX batch file text
|
| I had to shrink the first one, as the length of a RE is severely limited
| by a magic file.
|
| I also avoided the \s, which is not documented in POSIX regex,
| and is possibly also a reason why the problem does not occur on Solaris.
Mark
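A rough way to try the two bounded expressions outside file(1), using `grep -E -i` as a stand-in for the magic `regex/c` test. This is only an approximation: grep matches line by line, `[[:blank:]]` (space or tab) replaces the magic-file spelling `[\ \t]`, and the function name `looks_like_rexx` is invented here. Some regex implementations may reject repetition counts as large as 999, in which case the bound would need lowering.

```shell
# Bounded forms of the two REXX expressions from the message above.
call_re='^[[:blank:]]{0,999}call[[:blank:]]{1,999}rxfu'
say_re="^[[:space:]]{0,999}say ['\"]"

# Succeed (exit status 0) if any line on stdin matches either expression,
# ignoring case like the /c flag on the magic entries.
looks_like_rexx() {
    grep -E -i -q -e "$call_re" -e "$say_re"
}
```

Usage would look like `printf "  say 'hello'\n" | looks_like_rexx && echo "OS/2 REXX batch file text"`.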
|