languagesys-discussion Mailing List for LanguageSys

Status: Inactive

Brought to you by: mc_breit

languagesys-discussion — The languagesys-discussion list is meant to discuss and develop the guidelines


Re: [LSys-Discuss] Classification and further definition of the project it's goals

From: Florian B. <fl...@ph...> - 2005-10-16 19:48:36

What do you think about a language-specific serialization method?
I think we should implement serialization within the parser, so that we
can cache the parsed XML and don't have to parse it every time. The
serialization should be specific to the programming language used, so
that we keep the load as low as possible. This also means that the
serialized data is created and maintained by the parser.
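
To make the caching idea concrete, here is a minimal sketch in Python, assuming pickle as the language-specific serialization. The file layout (<string id="..."> elements), the ".cache" suffix and the function names are assumptions made up for this illustration; nothing like this has been agreed on the list.

```python
import os
import pickle
import xml.etree.ElementTree as ET


def parse_language_file(xml_path):
    """Parse a language file into a plain dict of {id: text}.

    The <string id="..."> layout is a hypothetical format for this sketch.
    """
    root = ET.parse(xml_path).getroot()
    return {entry.get("id"): (entry.text or "") for entry in root.iter("string")}


def load_cached(xml_path):
    """Return the parsed data, re-parsing only when the cache is missing or stale."""
    cache_path = xml_path + ".cache"  # hypothetical naming convention
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(xml_path)):
        # The cache is at least as new as the XML file: skip the XML parser.
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    data = parse_language_file(xml_path)
    # The parser owns the cache: it writes the serialized form itself.
    with open(cache_path, "wb") as fh:
        pickle.dump(data, fh)
    return data
```

The first call to load_cached() parses the XML and writes the cache; later calls read the pickle until the XML file is modified. Since pickle files should only be loaded from trusted locations, the cache would have to live somewhere only the parser can write to.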

Re: [LSys-Discuss] Classification and further definition of the project it's goals

From: <eg...@sw...> - 2005-10-16 19:31:50

Hi,
I think that the list is not bad.
For the first version I think it's enough.

Gregor Wegberg


[LSys-Discuss] Classification and further definition of the project it's goals

From: Florian B. <fl...@ph...> - 2005-10-09 16:19:32

Hi folks!

Since we wanted to start developing the rule sets/guidelines for our
I18N project, we need to define its boundaries with other projects,
especially in the area of L10N.

The main focus of our work should be T9N, or rather M17N, but of course
we have to take all the other aspects of internationalization into
consideration to cooperate well with other projects like Pango.
Therefore I suggest that we collect feature proposals for the time
being. If a cross-platform solution to a problem already exists, we can
still factor that feature out or cross it off the list.

Here are my feature proposals:
o Language codes as recommended by RFC 3066 (e.g. en-US).
o The possibility to transform POSIX locales (e.g. en_US) into RFC
3066-style language codes.
o A mapping of the relationships between different languages,
especially to find alternative languages in case a "language of choice"
is not available.
o Loading and using language files. XML would be a good choice for
this. In my opinion LDML isn't the best fit here because of its size
(at least in its original version as available at unicode.org), but
maybe we can support LDML too (this could be added later, when there is
a need for it).
o Handling of language-specific data via unique IDs (direct
identification, as in INI files) _and_ string combination (like GNU
Gettext, but more like substitution of language data, not just
identification by itself).
o Generic number formatters, also identified by an ID, that can be used
in many different cases. For instance: the number format for "currency"
is "%1.%2,%3", where the "." after %1 is the thousands separator, the
"," after %2 is the decimal separator and %3 is the rest. This should be
implemented in a way that allows for many kinds of number
transformations, such as plain numbers with thousands separators,
currencies or anything else.
o Arguments for language data: for example, "Hallo %1!" (de-DE) becomes
"Hello %1!" (en), where %1 is a defined argument, for example a number
or a string, that replaces the placeholder. It should also be possible
to cover different cases ("0 numbers", "1 number", "2 numbers", ...) via
patterns (spoken example: %1 <= 7: use "foo"; %1 < 100: use "bar";
otherwise: use "faz").
o All data should be managed in UTF-8, so that there is no trouble with
all the different character sets. There should be a way to register
something like "filters" to get character sets other than UTF-8 for
_output_. The input (e.g. from the language files) should always be
UTF-8. (Later it could be used, for example, via an ANSI filter for a
terminal, or integrated and rendered via Pango.)
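
To illustrate three of the proposals above (POSIX locale conversion, alternative languages, and arguments with case patterns), here is a rough sketch in Python. All function names, the "related" table and the default language "en" are assumptions made up for this example; they are not an agreed design, and the conversion only covers the simple language_REGION form.

```python
import re


def posix_to_rfc3066(locale_name):
    """Convert a POSIX locale such as 'en_US.UTF-8' to an RFC 3066-style code."""
    # Drop the optional codeset ('.UTF-8') and modifier ('@euro') parts.
    base = locale_name.split(".", 1)[0].split("@", 1)[0]
    if base in ("C", "POSIX"):
        return None  # these locales carry no language information
    language, _, region = base.partition("_")
    return f"{language.lower()}-{region.upper()}" if region else language.lower()


def fallback_chain(code, related=None, default="en"):
    """List the codes to try in order: the code, its parents, related ones, the default."""
    related = related or {}  # hypothetical table: primary language -> alternatives
    parts = code.split("-")
    # 'de-AT' -> ['de-AT', 'de']
    chain = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    chain.extend(related.get(parts[0], []))
    if default not in chain:
        chain.append(default)
    return chain


def substitute(template, *args):
    """Replace %1, %2, ... in template with the given arguments."""
    def repl(match):
        index = int(match.group(1))
        # Leave placeholders without a matching argument untouched.
        return str(args[index - 1]) if 1 <= index <= len(args) else match.group(0)
    return re.sub(r"%(\d+)", repl, template)


def choose_pattern(value, cases, otherwise):
    """Return the text of the first case whose condition accepts value."""
    for condition, text in cases:
        if condition(value):
            return text
    return otherwise


# The spoken example from the list: %1 <= 7 -> "foo", %1 < 100 -> "bar", else "faz".
EXAMPLE_CASES = [(lambda n: n <= 7, "foo"), (lambda n: n < 100, "bar")]
```

With these definitions, posix_to_rfc3066("de_DE.UTF-8@euro") gives "de-DE", fallback_chain("de-AT", {"de": ["nl"]}) gives ["de-AT", "de", "nl", "en"], and choose_pattern(50, EXAMPLE_CASES, "faz") gives "bar". The order of the cases matters, because the first matching condition wins.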

So, are there any further proposals for the basic set of features we
want to provide at the start?
Anything that we should not do, or that we should do differently?

Regards,
Florian Breit

P.S.: Sorry that I'm late, but I had so many things to do over the last
few days and so little time.
