[Aperture-devel] warning about MIME type detection


A word of warning to everyone using the MagicMimeTypeIdentifier: I have 
added a fall-back heuristic so that plain text files without a known 
plain text file extension and without a UTF Byte Order Mark are also 
classified as text/plain.

We have come across several AutoFocus users who wanted to search 
generated plain text files but could not use AutoFocus for this, which 
is why I added it.

The heuristic is very simple: just test whether the first 100 bytes of a 
file are readable/printable ASCII characters. UTF-8 files without a BOM 
using multi-byte encoding be damned! ;) So far I haven't ever seen a 
false positive, i.e. a non-plain text file that was classified as 
text/plain because of this heuristic (based on years of experience with 
older non-Aperture versions of AutoFocus).
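
As a rough illustration, the check described above might look like the 
following Java sketch. This is not the actual MagicMimeTypeIdentifier 
code: the class name PlainTextHeuristic and the method 
looksLikePlainText are hypothetical, and two details are assumptions 
that the message does not specify, namely that "printable" means bytes 
0x20 to 0x7E plus tab, line feed and carriage return, and that an empty 
file is not classified as text.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Aperture.
public class PlainTextHeuristic {

    // Number of leading bytes to inspect, as stated in the message.
    static final int PROBE_LENGTH = 100;

    // Returns true if the first PROBE_LENGTH bytes (or fewer, for short
    // streams) are all printable ASCII or common whitespace.
    static boolean looksLikePlainText(InputStream in) throws IOException {
        byte[] buffer = new byte[PROBE_LENGTH];
        int total = 0;
        // read() may return fewer bytes than requested, so loop until
        // the buffer is full or the stream ends.
        while (total < PROBE_LENGTH) {
            int n = in.read(buffer, total, PROBE_LENGTH - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        if (total == 0) {
            return false; // assumption: empty input is not text/plain
        }
        for (int i = 0; i < total; i++) {
            int b = buffer[i] & 0xFF; // treat the byte as unsigned
            boolean printable = b >= 0x20 && b <= 0x7E;
            boolean whitespace = b == '\t' || b == '\n' || b == '\r';
            if (!printable && !whitespace) {
                // Any byte >= 0x80 ends up here, which is why BOM-less
                // UTF-8 with multi-byte characters is rejected.
                return false;
            }
        }
        return true;
    }
}
```

With these assumptions, a file whose first 100 bytes contain a NUL 
byte or any byte of a multi-byte UTF-8 sequence is not classified as 
text/plain by this check.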

The consequence of this heuristic is that you may now see a *lot* more 
plain text files than before, e.g. source code files, .classpath and 
.project files, CVS and SVN control files, etc. I hope this does not 
ruin anything for anyone, but if it does, I can always put the code in a 
separate utility method that is not called automatically.

Regards,

Chris
--