From: Matthieu C. <cho...@gm...> - 2007-04-30 16:17:18
Hi Kazutoshi,

I know that detecting binary files through a Reader is not 100% accurate:
if the encoding is wrong, a text file can be detected as binary. But it works
in most of the cases I have seen, and with a little tuning of how many
decoding errors are accepted, it is better than nothing.

If you have a better method to detect binary files, tell me; otherwise
please do not remove this method.
Matthieu
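
To make the two positions in this thread concrete, here is a minimal Java sketch. It is not the jEdit code: the class and method names (BinaryCheckSketch, decodesCleanly, looksBinary) and the thresholds are hypothetical, chosen only for illustration. The first method shows that a decoding error depends on the chosen encoding; the second inspects raw bytes from an InputStream and tolerates a configurable number of suspicious bytes. Note that a byte-level check has its own limits, since UTF-16 text, for example, legitimately contains NUL bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of jEdit.
final class BinaryCheckSketch {

    // Returns true if the bytes decode without error in the given charset.
    // A false result only says "wrong for this encoding", not "binary".
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Reads at most maxBytes raw bytes and reports "binary" once more than
    // maxSuspicious of them are control bytes that rarely occur in text.
    // Both limits are assumptions a user might want to tune.
    static boolean looksBinary(InputStream in, int maxBytes, int maxSuspicious)
            throws IOException {
        int suspicious = 0;
        for (int i = 0; i < maxBytes; i++) {
            int b = in.read();
            if (b == -1) {
                break; // end of stream
            }
            boolean control = b < 0x20
                    && b != '\t' && b != '\n' && b != '\r' && b != '\f';
            if (control && ++suspicious > maxSuspicious) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // "café" followed by a newline, encoded as ISO-8859-1.
        byte[] latin1Text = {'c', 'a', 'f', (byte) 0xE9, '\n'};

        // Same bytes, two encodings: malformed as UTF-8, fine as ISO-8859-1.
        System.out.println(decodesCleanly(latin1Text, StandardCharsets.UTF_8));      // false
        System.out.println(decodesCleanly(latin1Text, StandardCharsets.ISO_8859_1)); // true

        // The byte-level check does not depend on the decoder at all.
        System.out.println(looksBinary(new ByteArrayInputStream(latin1Text), 1024, 2)); // false
    }
}
```

The first two calls in main show why an encoding error alone cannot mean "binary": the same five bytes fail under one charset and succeed under another.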
2007/4/30, Alan Ezust <ala...@gm...>:
>
>
>
> On 4/30/07, Kazutoshi Satoda <k_s...@f2...> wrote:
> >
> > Alan Ezust wrote:
> > > 11:41:55 AM [error] FileVFS: java.nio.charset.MalformedInputException: Input length = 1
> > (snip)
> > > 11:41:55 AM [error] FileVFS: at org.gjt.sp.jedit.MiscUtilities.isBinary(MiscUtilities.java:721)
> > > 11:41:55 AM [error] FileVFS: at org.gjt.sp.jedit.io.VFSFile.isBinary(VFSFile.java:304)
> > > 11:41:55 AM [error] FileVFS: at org.gjt.sp.jedit.io.VFS.listFiles(VFS.java:1122)
> > >
> > >
> > > I found a quick and dirty workaround: adding a try { } catch
> > > (MalformedInputException mie) { return true; } around the
> > > reader.read() in isBinary().
> > > But could you please examine that code (line 721 of MiscUtilities.java)
> > > and see whether I am missing a case?
> >
> > An encoding error doesn't indicate that a file is binary. It indicates
> > that the file can't be decoded with the specified encoding. That may
> > include cases where the file is binary, but many cases are the result
> > of a wrong encoding. So I think you can't return true because of an
> > encoding error, at least from that scope.
> >
> > The exception is caught at VFS.listFiles(). But adding such files to the
> > list may be wrong, because the same exception will be thrown when the
> > file is actually loaded. I added them because I had another encoding
> > detection, based on the full contents of a file, which works only at
> > actual loading time. But that code is not mature enough to release yet.
> > Logging the exception as ERROR might be wrong too; DEBUG may be enough
> > there.
> >
> > I think MiscUtilities.isBinary(Reader) should be deprecated. Binary
> > detection can't be done on a Reader: the same file can look like binary
> > in one encoding but look like text in another. It should be
> > MiscUtilities.isBinary(InputStream), and it should be widely
> > customizable by the user. Sorry for saying that without a patch.
> > --
> > k_satoda
> >
>
> I'm not sure what you mean by "widely customizable by the user". Why would
> the user need to customize how isBinary works?
>
> Nor do I fully understand why it has to be an InputStream, or why Reader
> was chosen in the first place. Matthieu, do you have comments about this?
>
> Is this something you want to / plan to fix, or shall I open a ticket in
> the tracker?
>
>
>
> --
> -----------------------------------------------
> jEdit Developers' List
> jEd...@li...
> https://lists.sourceforge.net/lists/listinfo/jedit-devel
>
>