From: Waylan L. <wa...@gm...> - 2009-02-26 19:57:28
On Thu, Feb 26, 2009 at 2:17 PM, tchomby <tc...@go...> wrote:
[snip]
>
> Fortunately this happened in few enough files that I was able to find
> and remove the offending characters manually. Still, it would be good
> to be able to read and write text from files in a robust way.
>

Keep in mind that you just explained the various sources of your files earlier in your message. That particular situation is unique to you. Someone else will have a different situation. It is impossible for the markdown library to anticipate every possible situation. Therefore, the most "robust way" is to leave the encoding/decoding to the end user - the only person in a position to properly address that specific situation.

In other words, you are in a better position to know and/or determine the encoding of your files than we are, and no code we write could reliably guess it. Therefore, Python-Markdown's policy is to work only in Unicode. Any encoding and/or decoding is handled by the end user. That is the most "robust way" to handle it.

As an aside, I should note that there is one exception: we do handle some encoding/decoding for the command line stuff. However, even then, it is rather dumb and requires the user to specify the encoding for anything except utf-8 (which it expects by default).

--
----
\X/ /-\ `/ |_ /-\ |\|
Waylan Limberg
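
To make the "decode on the way in, encode on the way out" policy concrete, here is a minimal sketch of what the end user's side might look like. The helper names (`read_unicode`, `write_unicode`, `convert_file`) are hypothetical and not part of Python-Markdown. The converter is passed in as a callable, so the sketch only assumes it takes Unicode text and returns Unicode text.

```python
import io


def read_unicode(path, encoding):
    # The caller states the encoding explicitly; nothing is guessed here.
    with io.open(path, "r", encoding=encoding) as f:
        return f.read()


def write_unicode(path, text, encoding="utf-8"):
    # Encode only at the boundary, when the text leaves the program.
    with io.open(path, "w", encoding=encoding) as f:
        f.write(text)


def convert_file(src, dst, src_encoding, convert, dst_encoding="utf-8"):
    # Hypothetical helper: decode -> convert (Unicode in, Unicode out) -> encode.
    text = read_unicode(src, src_encoding)
    html = convert(text)
    write_unicode(dst, html, dst_encoding)
    return html
```

With Python-Markdown installed, the converter could be `markdown.markdown`, for example `convert_file("notes.txt", "notes.html", "latin-1", markdown.markdown)`. The file names and the `latin-1` choice are placeholders. Only you know what your files actually use.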