There is something I didn't notice at first.
Currently all available encodings are plugged in by the Poco::TextEncoding
class and are available automatically. Here is the code.
If we put encodings into separate libraries we'll need to plug them in
explicitly. The availability of the various encodings is relied upon in
Poco/XML, in the ParserEngine::handleUnknownEncoding function. The options are:
1. TextEncoding imports all available encodings (current behavior). Then
all encoding libraries are loaded implicitly and we end up exactly where we
are now with regard to code bloat.
2. We change the TextEncoding and ParserEngine interfaces to allow
clients to turn on support for specific encodings explicitly.
Which one would you prefer? ;)
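To make option 2 a bit more concrete, here is a rough sketch of what
explicit plug-in could look like. Every name in it (Encoding,
EncodingRegistry, registerCyrillicEncodings) is made up for illustration;
it is not the existing Poco::TextEncoding interface, only a stand-in for
the idea that a client asks for an encoding group before looking it up.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an encoding class; NOT the real Poco interface.
class Encoding {
public:
    virtual ~Encoding() = default;
    virtual const char* canonicalName() const = 0;
};

class Windows1251Encoding : public Encoding {
public:
    const char* canonicalName() const override { return "windows-1251"; }
};

class KOI8REncoding : public Encoding {
public:
    const char* canonicalName() const override { return "koi8-r"; }
};

// Hypothetical registry: it knows only what the client has plugged in.
class EncodingRegistry {
public:
    void add(std::shared_ptr<Encoding> enc) {
        const std::string name = enc->canonicalName();
        encodings_[name] = std::move(enc);
    }

    // Returns nullptr if the encoding was never registered.
    std::shared_ptr<Encoding> find(const std::string& name) const {
        auto it = encodings_.find(name);
        if (it == encodings_.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<Encoding>> encodings_;
};

// Would live in the separate Cyrillic library; the client calls it explicitly.
inline void registerCyrillicEncodings(EncodingRegistry& registry) {
    registry.add(std::make_shared<Windows1251Encoding>());
    registry.add(std::make_shared<KOI8REncoding>());
}

// Usage: nothing is available until the client opts in.
inline void sketchUsage() {
    EncodingRegistry registry;
    assert(registry.find("koi8-r") == nullptr);
    registerCyrillicEncodings(registry);
    assert(registry.find("koi8-r") != nullptr);
}
```

With something along these lines, a program that never calls
registerCyrillicEncodings() has no reference to the Cyrillic library, so
the library is not pulled in. The price is that code relying on an
encoding being present (such as the XML parser's unknown-encoding
handling) has to cope with a failed lookup.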
Guenter Obiltschnig wrote:
> Hi Sergey,
> we shouldn't make such a big change in 1.3.3. This is definitely
> something for a 1.4 release, as it involves backwards-incompatible
> changes. We should leave the additional encodings out of 1.3.3 and make
> these changes in the trunk only, for inclusion in the 1.4 release (a 1.4
> branch will be available soon).
> Apart from that, your proposal sounds good.
> On May 18, 2008, at 22:52 , Sergey Skorokhodov wrote:
>> Hi all,
>> First, some info for those who may have no idea who I am and why I'm
>> writing to this list. ;)
>> My name is Sergey, I'm a C++ developer from Moscow, Russia. A few weeks
>> ago Alex "recruited" me to help with the poco-1.3.3 release. My initial
>> task was to backport some of the recent trunk improvements and test the
>> result on the platforms within my reach. I proposed adding S. Yatskevich's
>> Cyrillic encodings patch and discussed it with Alex and Guenter. Guenter
>> suggested providing additional encodings as separate libraries, in the
>> same way as the DBMS driver libraries in Poco/Data. Being a newcomer, I'm
>> a bit nervous about making any changes to Poco, so please comment on how
>> I think it should be done.
>> The following encodings are available now.
>> - Western (Latin 1, Latin 9 and Windows-1252)
>> - Cyrillic (IBM866 aka CP866, Windows-1251, ISO_8859-5, KOI8R, KOI8U)
>> The Cyrillic encodings may seem too numerous, but that is the harsh
>> reality: one needs them all to feel comfortable with Russian, Ukrainian
>> and Belorussian. They may also be enough to work with Serbian and
>> Bulgarian, but I'm not quite sure. We actually need even more for
>> non-Slavonic Cyrillic languages, but I hope we can leave that out of
>> scope for now.
>> Sorry for this "illuminating" paragraph, just in case :).
>> I believe that the Unicode stuff should be left where it is, in
>> Foundation, as it is required in any program. What about ASCII? On the
>> one hand, it is ubiquitous. On the other hand, it is in a way built into
>> the language, and all additional encoding support can be moved where it
>> logically belongs, i.e. to the Western encodings group.
>> All encodings libraries are moved into a separate Foundation directory
>> 'Encodings' with the following structure.
>> I suggest that all these libraries are built from the main Foundation
>> solution in MSVC on Windows and as separate targets in the main Foundation
>> makefile on POSIX. After all, the goal is to prevent bloating of loadable
>> modules rather than to exclude some encodings from being built. I also
>> suggest that all the headers are placed in the include/Poco directory
>> after installation, as they are an integral part of Foundation. Moreover,
>> this should minimize changes to existing code, if any are needed.
>> That's it; please comment.