From: Christiaan F. <chr...@ad...> - 2007-06-28 14:23:04
A word of warning to everyone using the MagicMimeTypeIdentifier: I have added a fall-back heuristic so that plain text files without a known plain text file extension and without a UTF Byte Order Mark are also classified as text/plain.

We have come across several AutoFocus users who want to search generated plain text files and who were not able to use AutoFocus for this, which is why I added it.

The heuristic is very simple: just test whether the first 100 bytes of a file are readable/printable ASCII characters. UTF-8 files without a BOM using multi-byte encoding be damned! ;)

So far I have never seen a false positive, i.e. a non-plain text file that was classified as text/plain because of this heuristic (based on years of experience with older non-Aperture versions of AutoFocus).

The consequence of this heuristic is that you may now see a *lot* more plain text files than before, e.g. source code files, .classpath and .project files, CVS and SVN control files, etc.

I hope this does not ruin anything for anyone, but if it does, I can always put the code in a separate utility method that is not called automatically.

Regards,

Chris
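
For readers who want a concrete picture of the heuristic described above, here is a minimal sketch in Java. It is not the actual MagicMimeTypeIdentifier code. The class name PlainTextHeuristic, the method looksLikePlainText, the choice to accept tab, line feed and carriage return as "readable", and the treatment of empty input as non-text are all assumptions made for illustration.

```java
// Hypothetical illustration only; not taken from the Aperture sources.
final class PlainTextHeuristic {

    // Number of leading bytes to inspect, as stated in the message.
    static final int SAMPLE_SIZE = 100;

    // Returns true if the first SAMPLE_SIZE bytes (or fewer, for short input)
    // are all printable ASCII or common whitespace.
    static boolean looksLikePlainText(byte[] data) {
        int limit = Math.min(data.length, SAMPLE_SIZE);
        if (limit == 0) {
            return false; // assumption: an empty file is not reported as text
        }
        for (int i = 0; i < limit; i++) {
            int b = data[i] & 0xFF; // treat the byte as unsigned
            boolean printable = b >= 0x20 && b <= 0x7E;
            // assumption: tab, LF and CR count as readable characters
            boolean whitespace = b == '\t' || b == '\n' || b == '\r';
            if (!printable && !whitespace) {
                return false; // control byte or non-ASCII byte found
            }
        }
        return true;
    }
}
```

Under these assumptions, a UTF-8 file whose first 100 bytes contain a multi-byte sequence (every byte of which is 0x80 or above) is rejected, which matches the caveat in the message. Anything after the first 100 bytes is never looked at.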