From: Shane <sh...@lo...> - 2004-07-02 11:00:39
On Jul 1, 2004, at 10:17 PM, Malcolm Lawrence wrote:

> "Slash doesn't need to deal with any character set, it only needs to
> deal with one - Unicode. The modern browser should/will make
> conversions to/from Unicode for both inbound and outbound data. If
> slash is rewritten to support Unicode instead of whatever it is now
> (ISO-8859-1 probably?), then that is all there is to it as far as the
> database is concerned. Even templates could then be in many languages,
> even within a template if so desired."
>
> Rightio. Any other voices like to chime in about slash and Unicode?

Question #1: Unicode

Admittedly, I know next to nothing about i18n and Unicode, so forgive
me...

I'm looking at the src to
http://blogalization.org/community/weblog.php

The "oddish" characters are "<D9><82><D9><88><D9>" and the encoding is

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

Now, comparing that to http://slashcode.com/index.shtml or even
http://slashdot.org/index.shtml doesn't get you too far, because
neither has any of that metadata.

So, my question is, what happens if you stick

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

in the standard header-html template? Does that "make" it Unicode
compliant?

If I write a story up, how would I define the same paragraph of the
story bodytext for different languages in Unicode? Or would you do it
once in English, and then somehow via Unicode do it again in XX
language? Then when slashd renders the story, it writes the entire
thing out, but the browser, seeing the UTF-8 in the header, only
renders part of it (similar to

    <style>
    .english { display: none; }
    </style>
    <html>
    <div id="story">
    <div id="storytext">
    <p class="english">blah blah blah</p>
    <p class="french">blahsky blahsky blahsky</p>
    </div></div>

). Yes, the syntax isn't correct, but you get the idea - does the
browser automatically know what to display and what not to display?

Question #2: Opnames

What about form op names?
I don't recall where someone brought this up, but it's definitely a
problem with slash, because sometimes the op values are directly tied
to the submit buttons. It's better if the text of the submit buttons is
tied to a getdata call, so the text of the button can be pulled from a
data template, of which there could be multiple versions. Someone may
have even submitted a patch for this - if they did, I don't think it
was used. (fyi, I started to code all my plugins this way)

Question #3: Templates and languages

Part 1: Naming conventions

Templates, whether they are in a theme, or in a plugin, are always
stored as

    name\;page\;section

How would one include alternative language templates in either a
theme, or a language, so that when install-plugin is run, all those
templates, regardless of the LANG variable in the template, would be
installed automatically?

Part 2: Unicode

I cannot see anyone in the future using one template stuffed with a ton
of Unicode text in it. You'd have to verify that Template::Toolkit
handles Unicode, for one thing. But for another, management of the
templates by the OSDN coders would seem to be a nightmare. They'd have
Unicode text in the standard templates which, I'd assume, most of their
staff could _not_ read.

A better solution would seem to be to move as *much text as possible*
out of the standard templates and put it into data templates, so most
templates just handle code. Anything that is displayed to the user
(that isn't text pulled from the db) would be pulled from a data
template, which could have lang=fr or lang=gb to match what the user
has defined, and the matching one would be shown.
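To make the data-template idea a bit more concrete, here is a rough
sketch in Python (so not Slash's Perl, and not its real getData code).
DATA_TEMPLATES, get_data and render_button are all made-up names for
illustration, and the fallback to a default language is my assumption.
The point is only that the op value stays fixed while the button label
is looked up by the user's language.

```python
import html

# Hypothetical stand-in for Slash data templates: display text keyed by
# (template name, language code). None of these names come from Slash.
DATA_TEMPLATES = {
    ("submit_button", "en"): "Submit",
    ("submit_button", "fr"): "Envoyer",
    ("preview_button", "en"): "Preview",
}

# Assumed fallback language when no translation exists.
DEFAULT_LANG = "en"


def get_data(name, lang):
    """Return the text for `name` in `lang`, falling back to DEFAULT_LANG."""
    for candidate in (lang, DEFAULT_LANG):
        text = DATA_TEMPLATES.get((name, candidate))
        if text is not None:
            return text
    raise KeyError(f"no data template named {name!r}")


def render_button(op, name, lang):
    """Build a submit button: fixed op value, localized (escaped) label."""
    label = html.escape(get_data(name, lang))
    return f'<button type="submit" name="op" value="{op}">{label}</button>'
```

With something like this, the form handler only ever sees op=submit,
whatever language the label was shown in, and a missing translation
falls back to the default text instead of breaking the form.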
Question #4: Constants

mysql> select name from vars where name like "%char%";
+---------------------------+
| name                      |
+---------------------------+
| charrefs_bad_entity       |
| charrefs_bad_numeric      |
| comment_nonstartwordchars |
| draconian_charrefs        |
| draconian_charset         |
| draconian_charset_convert |
| nick_chars                |
+---------------------------+
7 rows in set (0.00 sec)

I think that only begins to cover the variables you would need to
analyze to see whether they need to change to handle Unicode.

Shane

PS - some i18n discussions on slashcode:
http://www.slashcode.com/search.pl?query=i18n&op=stories&author=&tid=&section=

PPS - Don't forget to peruse the i18n list's archives:
http://sourceforge.net/mailarchive/forum.php?forum_id=7482