Character sets

  • Petri


    I'm having trouble with Comoblog displaying Scandinavian characters, mainly ä and ö.

    They come out as: ä and ö respectively.

    This only happens when I post from my mobile phone. Posted by e-mail everything looks OK. The contents seem OK if I check the posts from the mailserver with my e-mail client before they are read into comoblog, but once they are read in to the DB they become corrupted. Any ideas?

    • Can you please send such an email to and also cc I'll see if I can spot where the corruption is occurring.

      We have an RFE raised to investigate such charset problems - this isn't the first we have heard about it.

      • Widell

        Any progress?

        • Sorry, I've been offline without power for the last few days due to a major storm hitting my region. I'll try and look at it soon.

    • Petri

      Sure. Just sent an e-mail from my phone.
      Thanks for replying and especially thanks for all the great work on this project!

    • RoarinPenguin

      I also had similar issues displaying Italian accented characters like à, ì, ò, è, é...

    • hejkristian

      I have a similar problem when I'm trying to put <br> in the e-mails from my phone to make a "break row" between the pictures and the text.

      They show up as <br> in the post :1 but when I edit the post in the admin area and save it, it works OK.

      thanx for all the good and fast answers!


      • This occurs because the batch script converts all 'html' style characters like greater-than and less-than symbols into their HTML equivalents. This has to be done so if someone types in a less than symbol in their blog post it won't corrupt the display. The fact that it works when you re-edit via the admin pages is actually a bug ;-)

        The obvious solution is to have your phone insert a new line character instead of you manually putting in <br>, but I assume your phone can't do this. If this is the case the only solution will be to write a post-processing module to convert your <br> tags into real new-line characters before the cleansing occurs.

        If you want me to write such a module let me know, shouldn't be too hard.
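
A rough sketch of what such a module could do, written in Python for illustration rather than as Comoblog code. The function name `preprocess_body` and the `BR_TAG` pattern are made up for this example; the only point is the ordering: swap the typed <br> tags for newlines first, then escape whatever markup is left.

```python
import html
import re

# Hypothetical pattern: matches <br>, <br/>, <BR />, etc. as typed on the phone.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def preprocess_body(raw_body: str) -> str:
    """Turn typed <br> tags into newlines, then escape the remaining markup."""
    # Step 1: replace the tags while they are still plain "<br>" text.
    with_newlines = BR_TAG.sub("\n", raw_body)
    # Step 2: escape &, < and > so stray symbols cannot break the page.
    return html.escape(with_newlines, quote=False)


print(preprocess_body("picture<br>text where a < b"))
```

If the two steps ran in the opposite order, the tags would already have been turned into escaped text and there would be nothing left for the pattern to match.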


    • After further investigation it appears that the PEAR libraries that we use to MIME-decode the email messages don't support UTF-8 very well - hence all these problems.

      Unfortunately the latest version of PEAR doesn't resolve the issue so we are going to have to raise it with their developers to see if they have a fix planned.

      I'll raise a bug with the PEAR bug tracker and see what happens.

      For the time being, if you are able to change your mail client to send Unicode instead of UTF-8 then that most likely will resolve the issue (does for me anyway)
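
For anyone wondering what the MIME decoding step has to get right, here is a small sketch using Python's standard `email` package, not the PEAR code Comoblog uses. The function name `decode_text_part` is hypothetical. The idea is to undo the transfer encoding first and then decode the resulting bytes with the charset the message itself declares.

```python
from email import message_from_bytes


def decode_text_part(raw_message: bytes) -> str:
    """Return the first text/plain body, decoded with its declared charset."""
    msg = message_from_bytes(raw_message)
    part = msg
    if msg.is_multipart():
        # Picture messages are usually multipart; look for the text part.
        for candidate in msg.walk():
            if candidate.get_content_type() == "text/plain":
                part = candidate
                break
    # decode=True undoes base64 / quoted-printable and returns raw bytes.
    payload = part.get_payload(decode=True)
    if payload is None:
        return ""
    # Fall back to us-ascii when the message declares no charset.
    charset = part.get_content_charset() or "us-ascii"
    return payload.decode(charset, errors="replace")


sample = (
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"=C3=A4 and =C3=B6\n"
)
print(decode_text_part(sample).strip())
```

A decoder that skips the charset step and treats every byte as one character is enough to produce the kind of corruption reported in this thread.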


    • colione

      Yeah, I noticed that too. However, when I post from my SonyEricsson z800i from Sweden, the encoding is correct and I don't get these problems, but when I try to post from my Sagem from Ireland the encoding gets bad.

    • I can confirm that the PEAR libraries we are using don't support UTF-8. Essentially the PEAR libraries are only able to handle single-byte (8-bit) characters, whereas UTF-8 encodes characters such as ä and ö as multi-byte sequences, hence the corruption in the output.

      Unfortunately, this means that I have no solution to this problem at the current time. I may be able to patch the PEAR libraries myself, but I'm afraid I don't have much UTF-8 knowledge so it may take some time to work it all out.
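
To illustrate the kind of corruption that one-byte-per-character handling causes, here is a short Python sketch. It is not the PEAR code, and the helper names `misread_as_latin1` and `repair` are invented for this example. It assumes the bytes are being read as Latin-1, which is a guess about what happens in this case.

```python
def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then wrongly read every byte as one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Reverse the misreading, assuming nothing else altered the bytes."""
    return garbled.encode("latin-1").decode("utf-8")


# ä is a single character but takes two bytes in UTF-8.
print(len("ä".encode("utf-8")))
print(misread_as_latin1("ä and ö"))
print(repair(misread_as_latin1("ä and ö")))
```

Each accented letter turns into two unrelated characters when misread this way, and the round trip in `repair` only works while both bytes are still intact.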

    • colione

      If I send you an email from my Sony Ericsson that gives the correct encoding and one from my Sagem that gives errors, would that help you get closer to the problem?

      • It would be useful to at least confirm that my diagnosis of the problem is correct.

        Please send to devblog at

    • colione

      I can send you one later on today from Ireland. But you'll have to wait until the 26th of January to get one from Sweden, because I won't be able to get back to Sweden before then.. :)