Diff of /ansi_streams.xml [dbf28d] .. [66e977]

a/ansi_streams.xml b/ansi_streams.xml
...
...
19
  <function>print</function>, etc.</para>
19
  <function>print</function>, etc.</para>
20
20
21
 </section>
21
 </section>
22
22
23
 <section xml:id="ansi.streams.io">
23
 <section xml:id="ansi.streams.io">
24
  <title>Input/Output model</title>
24
  <title>Stream element types</title>
25
25
26
  <para>&ECL; distinguishes between two kinds of streams: character streams
26
  <para>&ECL; distinguishes between two kinds of streams: character streams and byte streams. <emphasis>Character streams</emphasis> only accept and produce characters, which are written or read either individually, with <function>write-char</function> or <function>read-char</function>, or in chunks, with <function>write-sequence</function> or any of the Lisp printer functions. Character operations are conditioned by the external format, as described in <xref linkend="ansi.streams.formats"/>.</para>
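  <para>A minimal sketch of the character operations named above, using standard Common Lisp string streams so that no file or external format is involved. The function name <function>echo-through-character-streams</function> is hypothetical and exists only for this illustration.</para>

```lisp
;; Hypothetical helper: writes TEXT to a character stream, first one
;; character with WRITE-CHAR and then the rest in a chunk with
;; WRITE-SEQUENCE, and reads it back one character at a time.
;; Assumes TEXT is a non-empty string.
(defun echo-through-character-streams (text)
  (let ((written (with-output-to-string (out)
                   ;; Individual character output.
                   (write-char (char text 0) out)
                   ;; Chunked character output.
                   (write-sequence (subseq text 1) out))))
    (with-input-from-string (in written)
      ;; READ-CHAR returns NIL at end of file because of the
      ;; EOF-ERROR-P and EOF-VALUE arguments.
      (loop for ch = (read-char in nil nil)
            while ch
            collect ch))))
```

  <para>Calling <code>(echo-through-character-streams "abc")</code> returns the list of the three characters of the string, in order.</para>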
27
  and byte streams. In the first kind one is only allowed to write
28
  characters, either individually, with <function>write-char</function>, or
29
  in chunks, with <function>write-sequence</function> or any of the Lisp
30
  printer functions. The implementation of character streams in &ECL; has the
31
  following shortcomings:
32
  <itemizedlist>
33
   <listitem><para>No support for external formats. Reading and writing is
34
   performed using the 8-bit code of the character.</para></listitem>
35
   <listitem><para>No support for Unicode characters. The code of large
36
   characters is simply truncated.</para></listitem>
37
  </itemizedlist></para>
38
27
39
  <para>The other kind are binary streams. Here input and output is performed
28
  <para>The other kind are binary streams. Here input and output is performed
40
  in chunks of bits. Binary streams are created with the function
29
  in chunks of bits. Binary streams are created with the function
41
  <function>open</function> passing as argument a subtype of
30
  <function>open</function> passing as argument a subtype of
42
  <type>integer</type>. We distinguish two cases
31
  <type>integer</type>. We distinguish two cases
...
...
53
  needs some extra information which tells how many bits in the last byte are
42
  needs some extra information which tells how many bits in the last byte are
54
  significant for the content. This information is stored as a single-byte
43
  significant for the content. This information is stored as a single-byte
55
  header at the beginning of the file.</para>
44
  header at the beginning of the file.</para>
56
 </section>
45
 </section>
57
46
47
 <section xml:id="ansi.streams.formats">
48
   <title>Stream external formats</title>
49
50
   <para>An <emphasis>external format</emphasis> is an encoding for characters that maps character codes to a sequence of bytes, in a one-to-one or one-to-many fashion. External formats are also known as "character encodings" in the programming world and are essential for reading and writing text in different languages and alphabets.</para>
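   <para>The mapping from characters to bytes can be observed by writing text through a character stream and reading the file back as a binary stream. The following is a sketch under stated assumptions: the helper <function>encoded-bytes</function> and the scratch file name are hypothetical, and the <symbol>:utf-8</symbol> format requires a build with Unicode support.</para>

```lisp
;; Hypothetical helper: write STRING to PATH using EXTERNAL-FORMAT and
;; return the bytes that ended up in the file as a list of integers.
;; PATH is an assumed scratch file name; it is deleted afterwards.
(defun encoded-bytes (string external-format
                      &optional (path "external-format-demo.tmp"))
  ;; Write the text through a character stream with the requested encoding.
  (with-open-file (out path :direction :output
                            :if-exists :supersede
                            :external-format external-format)
    (write-string string out))
  ;; Reopen the same file as a binary stream to inspect the raw bytes.
  (prog1
      (with-open-file (in path :element-type '(unsigned-byte 8))
        (let ((bytes (make-array (file-length in)
                                 :element-type '(unsigned-byte 8))))
          (read-sequence bytes in)
          (coerce bytes 'list)))
    (delete-file path)))
```

   <para>For instance, <code>(encoded-bytes (string (code-char 233)) :utf-8)</code> writes the single character with code 233. <acronym>UTF-8</acronym> encodes that character as the two bytes 195 and 169, which is the one-to-many case mentioned above.</para>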
51
52
   <para>&ECL; supports a wide range of <emphasis>external formats</emphasis>, covering the usual codepages from the Windows and Unix world as well as the more recent <acronym>UTF-8</acronym>, <acronym>UCS-2</acronym> and <acronym>UCS-4</acronym> formats, all of them in big-endian and little-endian variants and with different encodings for the newline character.</para>
53
54
   <para>However, the set of supported external formats depends on the size of the space of character codes. When &ECL; is built with Unicode support (the default option), it can represent all known characters from all known codepages, and thus all external formats are supported. When &ECL; is built with the restricted character set, it can only use one codepage (the one provided by the C library), with a few variants for the representation of end-of-line characters.</para>
55
56
   <para>In &ECL;, an external format designator is defined recursively as either a symbol or a list of external format designators. The grammar is as follows:
57
<screen>external-format-designator := 
58
   symbol |
59
   ( {external-format-designator}+ )
60
</screen>
61
and the known symbols are listed in <xref linkend="table.external-formats"/>.</para>
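   <para>The recursive grammar above can be transcribed into a small predicate. This is only a sketch of the syntax: <function>external-format-designator-p</function> is a hypothetical name, it does not check that a symbol is a known format, and treating <symbol>nil</symbol> as invalid is an assumption made here because the list form requires at least one element.</para>

```lisp
;; Hypothetical predicate mirroring the grammar:
;;   external-format-designator := symbol | ( {external-format-designator}+ )
;; It validates the shape only, not whether the symbols name real formats.
(defun external-format-designator-p (object)
  (cond
    ;; Assumption: NIL is rejected, since it is also the empty list and
    ;; the grammar requires one or more elements in a list.
    ((null object) nil)
    ;; Any other symbol is accepted syntactically.
    ((symbolp object) t)
    ;; A list must be proper (no dotted tail) and every element must
    ;; itself be a designator.
    ((consp object)
     (and (null (cdr (last object)))
          (every #'external-format-designator-p object)))
    ;; Strings, numbers and other objects are not designators.
    (t nil)))
```

   <para>With this definition, <code>:utf-8</code>, <code>'(:utf-8 :crlf)</code> and the nested form <code>'((:ucs-2 :little-endian) :lf)</code> are all accepted, while the string <code>"utf-8"</code> and the empty list are rejected.</para>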
62
63
<table xml:id="table.external-formats">
64
  <title>Stream external formats</title>
65
  <tgroup cols="3">
66
    <thead>
67
      <row>
68
  <entry>Symbols</entry>
69
  <entry>Codepage or encoding</entry>
70
  <entry>Unicode required</entry>
71
      </row>
72
    </thead>
73
    <tbody>
74
      <row>
75
  <entry><symbol>:cr</symbol></entry>
76
  <entry><code>#\Newline</code> is Carriage Return</entry>
77
  <entry>No</entry>
78
      </row>
79
80
      <row>
81
  <entry><symbol>:crlf</symbol></entry>
82
  <entry><code>#\Newline</code> is Carriage Return followed by Linefeed</entry>
83
  <entry>No</entry>
84
      </row>
85
86
      <row>
87
  <entry><symbol>:lf</symbol></entry>
88
  <entry><code>#\Newline</code> is Linefeed</entry>
89
  <entry>No</entry>
90
      </row>
91
92
      <row>
93
  <entry><symbol>:little-endian</symbol></entry>
94
  <entry>Modify <acronym>UCS</acronym> to use little endian encoding.</entry>
95
  <entry>No</entry>
96
      </row>
97
98
      <row>
99
  <entry><symbol>:big-endian</symbol></entry>
100
  <entry>Modify <acronym>UCS</acronym> to use big endian encoding.</entry>
101
  <entry>No</entry>
102
      </row>
103
104
      <row>
105
  <entry><symbol>:utf-8</symbol> <symbol>ext:utf8</symbol></entry>
106
  <entry>Unicode <acronym>UTF-8</acronym></entry>
107
  <entry>Yes</entry>
108
      </row>
109
      <row>
110
  <entry><symbol>:ucs-2</symbol> <symbol>ext:ucs2</symbol> <symbol>ext:utf-16</symbol> <symbol>ext:utf16</symbol>
111
<symbol>ext:unicode</symbol></entry>
112
  <entry><acronym>UCS-2</acronym> encoding with <acronym>BOM</acronym>.</entry>
113
  <entry>Yes</entry>
114
      </row>
115
      <row>
116
  <entry><symbol>:ucs-2le</symbol> <symbol>ext:ucs2le</symbol> <symbol>ext:utf-16le</symbol></entry>
117
  <entry><acronym>UCS-2</acronym> with little-endian encoding</entry>
118
  <entry>Yes</entry>
119
      </row>
120
      <row>
121
  <entry><symbol>:ucs-2be</symbol> <symbol>ext:ucs2be</symbol> <symbol>ext:utf-16be</symbol></entry>
122
  <entry><acronym>UCS-2</acronym> with big-endian encoding</entry>
123
  <entry>Yes</entry>
124
      </row>
125
      <row>
126
  <entry><symbol>:ucs-4</symbol> <symbol>ext:ucs4</symbol> <symbol>ext:utf-32</symbol> <symbol>ext:utf32</symbol></entry>
127
  <entry><acronym>UCS-4</acronym> encoding with <acronym>BOM</acronym>.</entry>
128
  <entry>Yes</entry>
129
      </row>
130
      <row>
131
  <entry><symbol>:ucs-4le</symbol> <symbol>ext:ucs4le</symbol> <symbol>ext:utf-32le</symbol></entry>
132
  <entry><acronym>UCS-4</acronym> with little-endian encoding</entry>
133
  <entry>Yes</entry>
134
      </row>
135
      <row>
136
  <entry><symbol>:ucs-4be</symbol> <symbol>ext:ucs4be</symbol> <symbol>ext:utf-32be</symbol></entry>
137
  <entry><acronym>UCS-4</acronym> with big-endian encoding</entry>
138
  <entry>Yes</entry>
139
      </row>
140
      <row>
141
  <entry><symbol>ext:iso-8859-1</symbol> <symbol>ext:iso8859-1</symbol> <symbol>ext:latin-1</symbol> <symbol>ext:cp819</symbol>  <symbol>ext:ibm819</symbol></entry>
142
  <entry>Latin-1 encoding</entry>
143
  <entry>Yes</entry>
144
      </row>
145
      <row>
146
  <entry><symbol>ext:iso-8859-2</symbol> <symbol>ext:iso8859-2</symbol> <symbol>ext:latin-2</symbol> <symbol>ext:latin2</symbol></entry>
147
  <entry>Latin-2 encoding</entry>
148
  <entry>Yes</entry>
149
      </row>
150
      <row>
151
  <entry><symbol>ext:iso-8859-3</symbol> <symbol>ext:iso8859-3</symbol> <symbol>ext:latin-3</symbol> <symbol>ext:latin3</symbol></entry>
152
  <entry>Latin-3 encoding</entry>
153
  <entry>Yes</entry>
154
      </row>
155
      <row>
156
  <entry><symbol>ext:iso-8859-4</symbol> <symbol>ext:iso8859-4</symbol> <symbol>ext:latin-4</symbol> <symbol>ext:latin4</symbol></entry>
157
  <entry>Latin-4 encoding</entry>
158
  <entry>Yes</entry>
159
      </row>
160
      <row>
161
  <entry><symbol>ext:iso-8859-5</symbol> <symbol>ext:cyrillic</symbol></entry>
162
  <entry>Cyrillic encoding</entry>
163
  <entry>Yes</entry>
164
      </row>
165
      <row>
166
  <entry><symbol>ext:iso-8859-6</symbol> <symbol>ext:arabic</symbol> <symbol>ext:asmo-708</symbol> <symbol>ext:ecma-114</symbol></entry>
167
  <entry>Arabic encoding</entry>
168
  <entry>Yes</entry>
169
      </row>
170
      <row>
171
  <entry><symbol>ext:iso-8859-7</symbol> <symbol>ext:greek8</symbol> <symbol>ext:greek</symbol> <symbol>ext:ecma-118</symbol></entry>
172
  <entry>Greek encoding</entry>
173
  <entry>Yes</entry>
174
      </row>
175
      <row>
176
  <entry><symbol>ext:iso-8859-8</symbol> <symbol>ext:hebrew</symbol></entry>
177
  <entry>Hebrew encoding</entry>
178
  <entry>Yes</entry>
179
      </row>
180
      <row>
181
  <entry><symbol>ext:iso-8859-9</symbol> <symbol>ext:latin-5</symbol> <symbol>ext:latin5</symbol></entry>
182
  <entry>Latin-5 encoding</entry>
183
  <entry>Yes</entry>
184
      </row>
185
      <row>
186
  <entry><symbol>ext:iso-8859-10</symbol> <symbol>ext:iso8859-10</symbol> <symbol>ext:latin-6</symbol> <symbol>ext:latin6</symbol></entry>
187
  <entry>Latin-6 encoding</entry>
188
  <entry>Yes</entry>
189
      </row>
190
      <row>
191
  <entry><symbol>ext:iso-8859-13</symbol> <symbol>ext:iso8859-13</symbol> <symbol>ext:latin-7</symbol> <symbol>ext:latin7</symbol></entry>
192
  <entry>Latin-7 encoding</entry>
193
  <entry>Yes</entry>
194
      </row>
195
      <row>
196
  <entry><symbol>ext:iso-8859-14</symbol> <symbol>ext:iso8859-14</symbol> <symbol>ext:latin-8</symbol> <symbol>ext:latin8</symbol></entry>
197
  <entry>Latin-8 encoding</entry>
198
  <entry>Yes</entry>
199
      </row>
200
      <row>
201
  <entry><symbol>ext:iso-8859-15</symbol> <symbol>ext:iso8859-15</symbol> <symbol>ext:latin-9</symbol> <symbol>ext:latin9</symbol></entry>
202
  <entry>Latin-9 encoding</entry>
203
  <entry>Yes</entry>
204
      </row>
205
206
      <row>
207
  <entry><symbol>ext:dos-cp437</symbol> <symbol>ext:ibm-437</symbol></entry>
208
  <entry>IBM CP 437</entry>
209
  <entry>Yes</entry>
210
      </row>
211
      <row>
212
  <entry><symbol>ext:dos-cp850</symbol> <symbol>ext:ibm-850</symbol> <symbol>ext:cp850</symbol></entry>
213
  <entry>IBM CP 850</entry>
214
  <entry>Yes</entry>
215
      </row>
216
      <row>
217
  <entry><symbol>ext:dos-cp852</symbol> <symbol>ext:ibm-852</symbol></entry>
218
  <entry>IBM CP 852</entry>
219
  <entry>Yes</entry>
220
      </row>
221
      <row>
222
  <entry><symbol>ext:dos-cp855</symbol> <symbol>ext:ibm-855</symbol></entry>
223
  <entry>IBM CP 855</entry>
224
  <entry>Yes</entry>
225
      </row>
226
      <row>
227
  <entry><symbol>ext:dos-cp860</symbol> <symbol>ext:ibm-860</symbol></entry>
228
  <entry>IBM CP 860</entry>
229
  <entry>Yes</entry>
230
      </row>
231
      <row>
232
  <entry><symbol>ext:dos-cp861</symbol> <symbol>ext:ibm-861</symbol></entry>
233
  <entry>IBM CP 861</entry>
234
  <entry>Yes</entry>
235
      </row>
236
      <row>
237
  <entry><symbol>ext:dos-cp862</symbol> <symbol>ext:ibm-862</symbol> <symbol>ext:cp862</symbol></entry>
238
  <entry>IBM CP 862</entry>
239
  <entry>Yes</entry>
240
      </row>
241
      <row>
242
  <entry><symbol>ext:dos-cp863</symbol> <symbol>ext:ibm-863</symbol></entry>
243
  <entry>IBM CP 863</entry>
244
  <entry>Yes</entry>
245
      </row>
246
      <row>
247
  <entry><symbol>ext:dos-cp864</symbol> <symbol>ext:ibm-864</symbol></entry>
248
  <entry>IBM CP 864</entry>
249
  <entry>Yes</entry>
250
      </row>
251
      <row>
252
  <entry><symbol>ext:dos-cp865</symbol> <symbol>ext:ibm-865</symbol></entry>
253
  <entry>IBM CP 865</entry>
254
  <entry>Yes</entry>
255
      </row>
256
      <row>
257
  <entry><symbol>ext:dos-cp866</symbol> <symbol>ext:ibm-866</symbol> <symbol>ext:cp866</symbol></entry>
258
  <entry>IBM CP 866</entry>
259
  <entry>Yes</entry>
260
      </row>
261
      <row>
262
  <entry><symbol>ext:dos-cp869</symbol> <symbol>ext:ibm-869</symbol></entry>
263
  <entry>IBM CP 869</entry>
264
  <entry>Yes</entry>
265
      </row>
266
267
      <row>
268
  <entry><symbol>ext:windows-cp932</symbol> <symbol>ext:windows-932</symbol> <symbol>ext:cp932</symbol></entry>
269
  <entry>Windows CP 932</entry>
270
  <entry>Yes</entry>
271
      </row>
272
      <row>
273
  <entry><symbol>ext:windows-cp936</symbol> <symbol>ext:windows-936</symbol> <symbol>ext:cp936</symbol></entry>
274
  <entry>Windows CP 936</entry>
275
  <entry>Yes</entry>
276
      </row>
277
      <row>
278
  <entry><symbol>ext:windows-cp949</symbol> <symbol>ext:windows-949</symbol> <symbol>ext:cp949</symbol></entry>
279
  <entry>Windows CP 949</entry>
280
  <entry>Yes</entry>
281
      </row>
282
      <row>
283
  <entry><symbol>ext:windows-cp950</symbol> <symbol>ext:windows-950</symbol> <symbol>ext:cp950</symbol></entry>
284
  <entry>Windows CP 950</entry>
285
  <entry>Yes</entry>
286
      </row>
287
288
      <row>
289
  <entry><symbol>ext:windows-cp1250</symbol> <symbol>ext:windows-1250</symbol> <symbol>ext:ms-ee</symbol></entry>
290
  <entry>Windows CP 1250</entry>
291
  <entry>Yes</entry>
292
      </row>
293
294
      <row>
295
  <entry><symbol>ext:windows-cp1251</symbol> <symbol>ext:windows-1251</symbol> <symbol>ext:ms-cyrl</symbol></entry>
296
  <entry>Windows CP 1251</entry>
297
  <entry>Yes</entry>
298
      </row>
299
300
      <row>
301
  <entry><symbol>ext:windows-cp1252</symbol> <symbol>ext:windows-1252</symbol> <symbol>ext:ms-ansi</symbol></entry>
302
  <entry>Windows CP 1252</entry>
303
  <entry>Yes</entry>
304
      </row>
305
306
      <row>
307
  <entry><symbol>ext:windows-cp1253</symbol> <symbol>ext:windows-1253</symbol> <symbol>ext:ms-greek</symbol></entry>
308
  <entry>Windows CP 1253</entry>
309
  <entry>Yes</entry>
310
      </row>
311
312
      <row>
313
  <entry><symbol>ext:windows-cp1254</symbol> <symbol>ext:windows-1254</symbol> <symbol>ext:ms-turk</symbol></entry>
314
  <entry>Windows CP 1254</entry>
315
  <entry>Yes</entry>
316
      </row>
317
318
      <row>
319
  <entry><symbol>ext:windows-cp1255</symbol> <symbol>ext:windows-1255</symbol> <symbol>ext:ms-hebr</symbol></entry>
320
  <entry>Windows CP 1255</entry>
321
  <entry>Yes</entry>
322
      </row>
323
324
      <row>
325
  <entry><symbol>ext:windows-cp1256</symbol> <symbol>ext:windows-1256</symbol> <symbol>ext:ms-arab</symbol></entry>
326
  <entry>Windows CP 1256</entry>
327
  <entry>Yes</entry>
328
      </row>
329
330
      <row>
331
  <entry><symbol>ext:windows-cp1257</symbol> <symbol>ext:windows-1257</symbol> <symbol>ext:winbaltrim</symbol></entry>
332
  <entry>Windows CP 1257</entry>
333
  <entry>Yes</entry>
334
      </row>
335
336
      <row>
337
  <entry><symbol>ext:windows-cp1258</symbol> <symbol>ext:windows-1258</symbol></entry>
338
  <entry>Windows CP 1258</entry>
339
  <entry>Yes</entry>
340
      </row>
341
    </tbody>
342
  </tgroup>
343
</table>
344
345
 </section>
346
58
 <xi:include href="ref_c_streams.xml" xpointer="ansi.streams.c-dict" xmlns:xi="http://www.w3.org/2001/XInclude"/>
347
 <xi:include href="ref_c_streams.xml" xpointer="ansi.streams.c-dict" xmlns:xi="http://www.w3.org/2001/XInclude"/>
59
348
60
</chapter>
349
</chapter>
61
</book>
350
</book>