From: David G. <go...@us...> - 2002-07-01 18:22:36
Thanks for your reply, Martin.

> I'd reorder this: (try command line). Try ASCII first, then UTF-8. If
> ASCII passes, it most likely is ASCII. If not, and UTF-8 passes, it
> most likely is UTF-8. Then try the locale's encoding.

Out of curiosity, is there any point in trying both ASCII and UTF-8? UTF-8 is a strict superset of ASCII, so shouldn't checking UTF-8 alone be enough for both? If we don't care what the original encoding was (we just want Unicode text to process), does explicitly checking for ASCII buy us anything?

--
David Goodger <go...@us...>
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
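
To make the question above concrete, here is a minimal sketch of the fallback order under discussion (ASCII, then UTF-8, then the locale's encoding). It is not Docutils code: the function name `decode_input` and the `candidates` parameter are hypothetical, chosen only for illustration, and the command-line step from the quoted suggestion is left out.

```python
import locale


def decode_input(data, candidates=None):
    """Decode bytes with the first candidate encoding that succeeds.

    Hypothetical helper for illustration. Returns a tuple of
    (decoded text, name of the encoding that worked).
    """
    if candidates is None:
        # Assumed default order: ASCII, UTF-8, then the locale's encoding.
        candidates = ["ascii", "utf-8", locale.getpreferredencoding(False)]
    for name in candidates:
        try:
            return data.decode(name), name
        except (UnicodeDecodeError, LookupError):
            # Wrong encoding for this data, or an unknown codec name:
            # move on to the next candidate.
            continue
    raise ValueError("none of the candidate encodings could decode the input")


# Pure ASCII input decodes to the same text either way;
# only the reported encoding name differs.
text_a, enc_a = decode_input(b"plain text", ["ascii", "utf-8"])
text_u, enc_u = decode_input(b"plain text", ["utf-8"])
assert text_a == text_u
```

With this sketch, dropping the ASCII step changes nothing about the decoded text. The separate ASCII check only matters if the caller wants to know, or report, which encoding the input was in.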