|
From: Greg C <gm...@ya...> - 2001-08-09 22:34:52
|
Here are my thoughts.
The OS-specific EOL convention can be placed in a constant string, which
is then used to match the actual EOL. Runtime interfaces can then be used
to select a specific convention.
For example (warning: untested code follows):
.
.
.
feature {ANY}
eol : STRING
unix_eol : STRING is "%N"
ms_eol : STRING is "%R%N"
mac_eol : STRING is "%R"
feature make is do eol := ms_eol end -- set a default
set_unix_conventions is do eol := unix_eol end
-- and so forth.
The line reading code could even be adaptive, setting the EOL type the
first time it encounters one.
Note that you don't want to do this "once" since a program could easily
have to process files with different conventions at the same time.
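A minimal sketch of the adaptive idea, in Python rather than Eiffel so that it can be run as-is. The class name `LineSplitter` and its methods are hypothetical, not part of any library discussed here. Each instance keeps its own EOL setting, so two files with different conventions can be processed at the same time:

```python
class LineSplitter:
    """Splits text into lines; the EOL convention is per-instance state."""

    UNIX_EOL = "\n"
    MS_EOL = "\r\n"
    MAC_EOL = "\r"

    def __init__(self, eol=None):
        # None means "not decided yet": detect on the first EOL encountered.
        self.eol = eol

    def detect(self, text):
        """Return the first EOL sequence found in text, or None."""
        for i, ch in enumerate(text):
            if ch == "\n":
                return self.UNIX_EOL
            if ch == "\r":
                if text[i + 1:i + 2] == "\n":
                    return self.MS_EOL
                return self.MAC_EOL
        return None

    def split(self, text):
        """Split text on this instance's EOL, detecting it if needed."""
        if self.eol is None:
            self.eol = self.detect(text)
        if self.eol is None:
            return [text]
        lines = text.split(self.eol)
        # A trailing EOL terminates the last line; it does not start a new one.
        if lines and lines[-1] == "":
            lines.pop()
        return lines
```

Because the setting lives in the object and not in a shared "once" value, one splitter can settle on %R%N while another settles on %N.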
If you can seek on a file, you don't necessarily have to do a manual
ungetc.
Alternatively, if you follow the read_line convention (in SE) of not
placing the EOL characters at the end of the string, you may not need to
do an ungetc or a seek to handle %R%N vs %R. Just stop processing whenever
you see the %R, remember what you saw on the last call, and then skip the
%N on the next call if it happens to be there. (The risk here is incorrect
behavior if the data expected the %R%N to actually denote two different
lines, but I think it's acceptable.)
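The "remember what you saw, skip the %N next time" approach could look roughly like this. It is a sketch in Python, not Eiffel, and `LazyLineReader` is a made-up name. It assumes a stream object offering `read(1)` that returns an empty string at end of input:

```python
import io


class LazyLineReader:
    """Stops at %R or %N and defers skipping the %N of a %R%N pair."""

    def __init__(self, stream):
        self.stream = stream        # any object with read(1)
        self.last_was_cr = False    # what we saw at the end of the last call

    def read_line(self):
        """Return the next line without its EOL, or None at end of input."""
        ch = self.stream.read(1)
        # Skip the %N half of a %R%N pair left over from the previous call.
        if self.last_was_cr and ch == "\n":
            ch = self.stream.read(1)
        self.last_was_cr = False
        if ch == "":
            return None
        chars = []
        while ch != "":
            if ch == "\r":
                self.last_was_cr = True
                break
            if ch == "\n":
                break
            chars.append(ch)
            ch = self.stream.read(1)
        return "".join(chars)
```

No seek or ungetc is needed, at the cost mentioned above: a %R immediately followed by %N is always read as a single line break, never as two.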
Some programs are stupid enough to get it backwards, so it helps if the
algorithm works both ways.
On the other hand, files are generally consistent with how they use EOL
delimiters, once you figure out what it thinks an EOL is.
As for the %R being a "soft return" I'd say that this is a convention that
does not have to be immediately addressed. It would be far better to get
something that works correctly on the major platforms in a native mode,
and that has a good chance of working with files across the major
platforms. Anyone working with files that are too anomalous will have to
consign themselves to writing cleanup filters.
Greg
=====
http://www.geocities.com/gmc444/gregs.html
|
|
From: Christian C. <chc...@cl...> - 2001-08-11 08:22:10
|
Greg C wrote :
>
> Here are my thoughts.
>
> The OS-specific EOL convention can be placed in a constant string, which
> is then used to match the actual EOL. Runtime interfaces can then be used
> to select a specific convention.
>
> for example (warning untested code follows):
> .
> .
> .
> feature {ANY}
> eol : STRING
> unix_eol : STRING is "%N"
> ms_eol : STRING is "%R%N"
> mac_eol : STRING is "%R"
I fully agree with this.
Some time ago I posted such a proposal on a NICE list because I think
it's much better and easier to write something like:
io.put_string("Hello world !" + eol)
than
io.put_string("Hello world !%N")
or
io.put_string("Hello world !")
io.put_new_line()
I think that %N and %R are C like stuff that should not be used in high
level programming. They are also very disturbing because you don't know
if your program will be portable to Windows or other platforms if you
use a hardcoded "%N".
And put_new_line like features are awful too because it means that
every time you add a "put_string" or an "append_string" like feature to
a class you should also add a special case "put_new_line" or
"append_new_line" feature to the class.
It's also much shorter to type " + eol" than "io.put_new_line" on a new
line or something like this.
So I think that not having a standard new line feature (called "eol" or
"nl") is a very bad design mistake in Eiffel.
And it strikes you as soon as the "hello world" program...
My opinion is that GOBO should have such a feature used everywhere
instead of "%N" and "%R".
> feature make is do eol := ms_eol end -- set a default
>
> set_unix_conventions is do eol := unix_eol end
>
> -- and so forth.
>
> The line reading code could even be adaptive, setting the EOL type the
> first time it encounters one.
>
> Note that you don't want to do this "once" since a program could easily
> have to process files with different conventions at the same time.
If it's not a once feature, then each object has a memory
overhead, unless we accept writing something like "io.eol" instead
of "eol".
A solution to this problem is perhaps to undefine or redefine an inherited
eol feature from ANY in the FILE or IO class used to process files with
different conventions.
There are probably expert people knowing Eiffel much better than me who
could tell how to do it (if possible).
Regards,
Christian.
|
|
From: Berend de B. <be...@po...> - 2001-08-11 18:52:05
|
Christian Couder <chc...@cl...> writes:
> I think that %N and %R are C like stuff that should not be used in high
> level programming. They are also very disturbing because you don't know
> if your program will be portable to Windows or other platforms if you
> use a hardcoded "%N".
You perfectly know that it will work. In C you only have to use %N.
All other stuff discussed here is because VE is a native code compiler
that treats a %N as just a %N. And BM hasn't specified it. All other
compilers use ANSI C, so a %N works just very, very fine.
So guys, let me repeat:
1. Use eposix and use %N. Works with all compilers, including VE. Using
eposix has its disadvantages of course, since compiling to the JVM, if
you ever want to do that, is not possible.
2. Or use the new portable Gobo routines. Eric has done some fine work
here, but the disadvantage is that %N will not work.
--
Groetjes,
Berend. (-:
|
|
From: Eric B. <er...@go...> - 2001-08-11 20:18:41
|
Christian Couder wrote:
>
> And put_new_line like features are awful too because it means that
> every time you add a "put_string" or an "append_string" like feature to
> a class you should also add a special case "put_new_line" or
> "append_new_line" feature to the class.
What about:
put_line ("Hello world!")
instead of:
put_string ("Hello world!")
put_new_line
? It looks symmetrical to the `read_line' routine
to me: the string doesn't include the line-separator
but it is actually there in the file.
> It's also much shorter to type " + eol" than "io.put_new_line" on a new
> line or something like this.
But it creates an extra string object at run-time.
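To illustrate both points (one `put_line` defined in a common ancestor, and no temporary string from a concatenation), here is a hedged Python sketch. `OutputStream` and `MemoryStream` are hypothetical names, not Gobo classes. Descendants only supply `put_string`:

```python
import io


class OutputStream:
    """Hypothetical ancestor: descendants only implement put_string."""

    eol = "\n"

    def put_string(self, s):
        raise NotImplementedError

    def put_line(self, s):
        # Two writes, so no temporary "s + eol" string is built.
        self.put_string(s)
        self.put_string(self.eol)


class MemoryStream(OutputStream):
    """Example descendant that collects output in memory."""

    def __init__(self):
        self.buffer = io.StringIO()

    def put_string(self, s):
        self.buffer.write(s)
```

Every descendant gets `put_line` for free, which mirrors how `read_line` returns the string without its separator.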
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|
|
From: Christian C. <chc...@cl...> - 2001-08-12 05:52:09
|
Eric Bezault wrote:
>
> Christian Couder wrote:
> >
> > And put_new_line like features are awful too because it means that
> > every time you add a "put_string" or an "append_string" like feature to
> > a class you should also add a special case "put_new_line" or
> > "append_new_line" feature to the class.
>
> What about:
>
> put_line ("Hello world!")
>
> instead of:
>
> put_string ("Hello world!")
> put_new_line
>
> ? It looks symmetrical to the `read_line' routine
> to me: the string doesn't include the line-separator
> but it is actually there in the file.
It solves the typing problem because it's shorter, but it doesn't solve
the problem that you will in many cases have to create 2 features in
many classes instead of only one (for example: append_string and
append_line, put_string and put_line). And some people will probably
confuse the 2 features.
> > It's also much shorter to type " + eol" than "io.put_new_line" on a new
> > line or something like this.
>
> But it creates an extra string object at run-time.
If eol is a once string then it's created only once in each program.
Regards,
Christian.
|
|
From: Eric B. <er...@go...> - 2001-08-12 06:05:47
|
Christian Couder wrote:
>
> It solves the typing problem because it's shorter, but it doesn't solve
> the problem that you will in many cases have to create 2 features in
> many classes instead of only one
The routine should be defined once and for all in the
ancestor class.
> > > It's also much shorter to type " + eol" than "io.put_new_line" on a new
> > > line or something like this.
> >
> > But it creates an extra string object at run-time.
>
> If eol is a once string then it's created only once in each program.
STRING.infix "+" creates a new string at each call.
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|
|
From: Christian C. <chc...@cl...> - 2001-08-12 09:06:01
|
Eric Bezault wrote :
>
> Christian Couder wrote:
> >
> > It solves the typing problem because it's shorter, but it doesn't solve
> > the problem that you will in many cases have to create 2 features in
> > many classes instead of only one
>
> The routine should be define once and for all in the
> ancestor class.
Ok, so I agree with you that put_line features are a good thing, but I
don't think they solve the whole problem.
For example if you want to set a multi-line message in a message box,
you can use one of the following :
1)
msg_box.set_msg(error_header + eol + error_msg + eol + help_msg)
2)
msg_box.set_msg(error_header + "%N" + error_msg + "%N" + help_msg)
3)
my_msg.append_line(error_header)
my_msg.append_line(error_msg)
my_msg.append_string(help_msg)
msg_box.set_msg(my_msg)
From this example you can see that 1) is both very short and likely to
be portable, while 2) is not likely to be portable and 3) is not short.
Regards,
Christian.
|
|
From: Eric B. <er...@go...> - 2001-08-12 11:50:19
|
Christian Couder wrote:
>
> Ok, so I agree with you that put_line features are a good thing, but I
> don't think they solve the whole problem.
>
> For example if you want to set a multi-line message in a message box,
> you can use one of the following :
>
> 1)
>
> msg_box.set_msg(error_header + eol + error_msg + eol + help_msg)
>
> 2)
>
> msg_box.set_msg(error_header + "%N" + error_msg + "%N" + help_msg)
>
> 3)
>
> my_msg.append_line(error_header)
> my_msg.append_line(error_msg)
> my_msg.append_string(help_msg)
> msg_box.set_msg(my_msg)
>
> From this example you can see that 1) is both very short and likely to
> be portable, while 2) is not likely to be portable and 3) is not short.
But I would rather use 3) anyway: I don't want to have to
create a zillion objects just to display a message.
And apparently your 'eol' is part of ANY, so I think that
this should rather be discussed in NICE and be adopted by
the Eiffel compilers since they have better ways to handle
platform-dependent functionalities than library writers.
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|
|
From: Christian C. <chc...@cl...> - 2001-08-12 15:41:37
|
Eric Bezault wrote:
>
> And apparently your 'eol' is part of ANY, so I think that
> this should rather be discussed in NICE and be adopted by
> the Eiffel compilers since they have better ways to handle
> platform-dependent functionalities than library writers.
Ok, perhaps I will suggest this when the class ANY is reviewed.
Regards,
Christian.
|
|
From: Eric B. <er...@go...> - 2001-08-13 05:26:36
|
Christian Couder wrote:
>
> msg_box.set_msg(error_header + eol + error_msg + eol + help_msg)
This assumes that the line separators for files and message
boxes on the underlying platform are the same. This needs to
be proven...
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|
|
From: Christian C. <chc...@cl...> - 2001-08-12 06:17:21
|
Christian Couder wrote:
>
> Eric Bezault wrote:
> >
> > Christian Couder wrote:
> > >
> > > It's also much shorter to type " + eol" than "io.put_new_line" on a new
> > > line or something like this.
> >
> > But it creates an extra string object at run-time.
>
> If eol is a once string then it's created only once in each program.
Replying to myself, sorry, I didn't understand that you were talking
about the string created by the +.
Yes it is slower to use a +, but if you really want speed you can always
use
put_string("Hello world !")
put_string(eol)
or define your own put_line feature doing just this :-)
Regards,
Christian.
|
|
From: Eric B. <er...@go...> - 2001-08-12 06:43:47
|
Christian Couder wrote:
>
> Yes it is slower to use a +, but if you really want speed you can always
> use
>
> put_string("Hello world !")
> put_string(eol)
Considering that 'eol' is "%R%N" under Windows, I don't think
that this will work with what some of the Eiffel compilers
currently provide us since it is likely that you will end
up with:
Hello world!%R%R%N
in your file, the %N (second character in 'eol') being
automatically converted to %R%N. To avoid this I would
have to use binary files instead of text files to
implement the KL_*_FILE classes, but I don't think that
SmallEiffel has a binary file class in its lib_std, and
it would not work with the std files (which are text
files) anyway.
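The doubling effect can be reproduced outside Eiffel. The sketch below uses Python's `io.TextIOWrapper` with `newline="\r\n"`, which translates every "\n" written into "\r\n" on any platform. It stands in for a Windows text-mode file and is only an illustration, not what any Eiffel runtime actually does internally. The helper name `write_text_mode` is made up:

```python
import io


def write_text_mode(text):
    """Return the bytes a Windows-style text-mode writer would produce."""
    raw = io.BytesIO()
    # newline="\r\n" asks the wrapper to translate each "\n" written
    # into "\r\n", regardless of the platform the script runs on.
    writer = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    writer.write(text)
    writer.flush()
    return raw.getvalue()


print(write_text_mode("Hello world!\n"))    # b'Hello world!\r\n'
print(write_text_mode("Hello world!\r\n"))  # b'Hello world!\r\r\n'
```

Writing an explicit %R%N through a translating text file yields %R%R%N, which is the problem described above.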
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|
|
From: Christian C. <chc...@cl...> - 2001-08-12 07:56:30
|
Eric Bezault wrote:
>
> Considering that 'eol' is "%R%N" under Windows, I don't think
> that this will work with what some of the Eiffel compilers
> currently provide us since it is likely that you will end
> up with:
>
> Hello world!%R%R%N
>
> in your file, the %N (second character in 'eol') being
> automatically converted to %R%N.
I don't use Windows any more so I cannot easily test.
But perhaps you are right, perhaps some compilers blindly convert all %N
to %R%N without checking if there is already a %R before.
It would be strange anyway because what would happen if you read a whole
text file into a string and then just write the string back to another
file ?
Would the "%R%N" be converted to a "%N" when reading the file and then
to "%R%N" again when writing the file ?
Could someone test ?
If what you say is true, perhaps it could still be possible to define
the meaning of "eol" depending both on the target platform and the
compiler.
So that for example eol would always be "%N" if SmallEiffel is used, and
it would be "%N" on Unix or "%R%N" on Windows if another compiler is
used.
Regards,
Christian.
|
|
From: Eric B. <er...@go...> - 2001-08-12 11:49:24
|
Christian Couder wrote:
>
> Would the "%R%N" be converted to a "%N" when reading the file and then
> to "%R%N" again when writing the file ?
Yes. As far as I know they all do that under Windows
except Visual Eiffel.
> If what you say is true, perhaps it could still be possible to define
> the meaning of "eol" depending both on the target platform and the
> compiler.
> So that for example eol would always be "%N" if SmallEiffel is used, and
> it would be "%N" on Unix or "%R%N" on Windows if another compiler is
> used.
But I don't think that it is a Gobo issue. Platform-dependent
behavior should be handled by the compiler in my opinion.
I don't want to have to provide different versions of the
same class for the different Eiffel compilers and yet again
different versions for all supported platforms.
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|
|
From: Eric B. <er...@go...> - 2001-08-12 16:13:21
|
Christian Couder wrote:
>
> If what you say is true, perhaps it could still be possible to define
> the meaning of "eol" depending both on the target platform and the
> compiler.
> So that for example eol would always be "%N" if SmallEiffel is used, and
> it would be "%N" on Unix or "%R%N" on Windows if another compiler is
> used.
And then:
> For example if you want to set a multi-line message in a message box,
> you can use one of the following :
>
> 1)
>
> msg_box.set_msg(error_header + eol + error_msg + eol + help_msg)
These two suggestions seem incompatible to me.
Under Windows with some Eiffel compilers 'eol' will be
%N just to make sure that %R%N (and not %R%R%N) is actually
written to files, but then in your message box you really
want 'eol' to be %R%N because you are now dealing with
strings and not files anymore and the compilers do special
treatments for %N only in files and standard files, not
in strings.
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|
|
From: Eric B. <er...@go...> - 2001-08-11 20:18:28
|
Berend de Boer wrote:
>
> 2. Or use the new portable Gobo routines. Eric has done some fine work
> here, but the disadvantage is that %N will not work.
Hmmm, actually I think it works! I have modified the inherited
implementation of VE's files in the KL_*_FILE classes so that
it automatically converts %R%N to %N and vice-versa under
Windows in order to have a compatible behavior of the KL_*_FILE
classes when compiled with all Eiffel compilers.
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|
|
From: Berend de B. <be...@po...> - 2001-08-13 05:03:42
|
Eric Bezault <er...@go...> writes:
> Hmmm, actually I think it works!
Didn't look at the implementation, but this is great of course :-)
--
Groetjes,
Berend. (-:
|
|
From: Eric B. <er...@go...> - 2001-08-11 20:18:29
|
Greg C wrote:
>
> Here are my thoughts.
>
> The OS-specific EOL convention can be placed in a constant string, which
> is then used to match the actual EOL. Runtime interfaces can then be used
> to select a specific convention.
>
> for example (warning untested code follows):
> .
> .
> .
> feature {ANY}
> eol : STRING
> unix_eol : STRING is "%N"
> ms_eol : STRING is "%R%N"
> mac_eol : STRING is "%R"
>
> feature make is do eol := ms_eol end -- set a default
>
> set_unix_conventions is do eol := unix_eol end
>
> -- and so forth.
That is nice because one could read or write to a file
for a given platform while executing from another.
But it doesn't fit well with the underlying C used
by 3 out of 4 Eiffel compilers to implement their
FILE routines.
> The line reading code could even be adaptive, setting the EOL type the
> first time it encounters one.
Looks a little bit too clever to me.
> If you can seek on a file, you don't necessarily have to do a manual
> ungetc.
Not all Eiffel compilers support file seeking.
> Alternatively, if you follow the read_line convention (in SE) of not
> placing the EOL characters at the end of the string,
This is ELKS's behavior I think: "New line will be consumed
but not part of `last_string'."
> you may not need to
> do an ungetc or a seek to handle %R%N vs %R. Just stop processing whenever
> you see the %R, remember what you saw on the last call, and then skip the
> %N on the next call if it happens to be there.
That doesn't work. The new line character(s) must be consumed.
Which means that after a call to `read_line' if I call
`read_character' I should get the first character of the
next line, and not the leftover of the previous new-line
character(s).
> On the other hand, files are generally consistent with how they use EOL
> delimiters, once you figure out what it thinks an EOL is.
I have already seen files which had been edited under Windows
and Unix and which contained both %N and %R%N as line
separators. This is not a very good practice, but when
this happens the current implementation of `read_line'
will still work because it recognizes both %N and %R%N
as line separators.
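A sketch of the behavior described in this message, written in Python and not taken from the Gobo sources: `read_line` accepts both %N and %R%N as separators and fully consumes them, so a following `read_character` returns the first character of the next line. The class name `LineReader` is hypothetical, and treating a lone %R as ordinary data is an assumption of this sketch:

```python
import io


class LineReader:
    """read_line consumes the separator; read_character sees the next line."""

    def __init__(self, stream):
        self.stream = stream    # any object with read(1)

    def read_character(self):
        """Return the next character, or "" at end of input."""
        return self.stream.read(1)

    def read_line(self):
        """Return the next line without its separator, or None at end."""
        ch = self.read_character()
        if ch == "":
            return None
        chars = []
        while ch != "" and ch != "\n":
            if ch == "\r":
                following = self.read_character()
                if following == "\n":
                    break               # %R%N consumed as one separator
                chars.append(ch)        # lone %R kept as data (assumption)
                ch = following
                continue
            chars.append(ch)
            ch = self.read_character()
        return "".join(chars)
```

With the input "one%R%Ntwo%Nthree", a `read_line` followed by `read_character` gives "one" and then "t", and mixed separators in the same file are handled without any per-file setting.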
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|
|
From: Glenn M. <gle...@op...> - 2001-08-12 02:19:01
|
Hello Eric,
I'd like to try your new file handling classes but can't get the CVS version
of GOBO built. I know it is in a state of flux at the moment but do you have
any hints for building with ISE Eiffel on Win2000?
I've got as far as generating the .e files from .ge using the ge2e.sh script
under cygwin.
Is there anything else I need to do to use the library classes?
I'd also like to use the new version of getest (with the new switches) and
try out geant and gexace. What state are these tools in?
Thanks.
Glenn.
|
|
From: Eric B. <er...@go...> - 2001-08-12 06:01:38
|
Glenn Maughan wrote:
>
> I'd like to try your new file handling classes but can't get the CVS version
> of GOBO built. I know it is in a state of flux at the moment but do you have
> any hints for building with ISE Eiffel on Win2000?
Yes I know, a bootstrap procedure to allow easy building
of the Gobo package from CVS is still in the TODO list.
But with the progress that Sven made in the development of
'geant', I hope that we will be able to get rid of the
Makefile soon (i.e. no need for cygwin under Windows anymore).
> I've got as far as generating the .e files from .ge using the ge2e.sh script
> under cygwin.
>
> Is there anything else I need to do to use the library classes?
Just add the cluster $GOBO/library/kernel/io to your Ace file.
The .e files generated from the .ge should be placed in
$GOBO/library/kernel/spec/<compiler>. If you generated them
somewhere else, either copy them to the "standard" location,
or add the corresponding cluster to your Ace file.
> I'd also like to use the new version of getest (with the new switches) and
> try out geant and gexace. What state are these tools in?
getest:
------
If you already downloaded the modified classes, then you just
need to recompile the 'getest' executable using the Ace file
which was included in the Gobo 2.0 distribution in
$GOBO/src/getest.
geant:
-----
All the source code needed to compile 'geant' is now committed
in the Gobo CVS repository. 'geant' is still under development,
so some new tasks need to be added, and some already existing
tasks may have some of their attribute names changed before the
first official release. There is no doc yet, but Sven has already
written some examples in $GOBO/example/geant. I'm sure that Sven
will be able to provide you with more details.
gexace:
------
I still need to commit some classes in the Gobo CVS repository.
They are classes for the parsing of the Xace files.
They used to be in eXML in the 'xace' directory, but Andreas
agreed that the Gobo Eiffel Tools Library would be a better place
for them since we don't want to pollute the Eiffel XML library
with all sorts of XML applications and $GOBO/library/tools is
already a library dealing with Eiffel related tool building.
Andreas is currently busy with his summer job, but I'm sure that
he will continue improving 'gexace' as soon as he has time.
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|
|
From: Christian C. <chc...@cl...> - 2001-08-12 06:18:27
|
Berend de Boer wrote:
>
> Christian Couder <chc...@cl...> writes:
>
> > I think that %N and %R are C like stuff that should not be used in high
> > level programming. They are also very disturbing because you don't know
> > if your program will be portable to Windows or other platforms if you
> > use a hardcoded "%N".
>
> You perfectly know that it will work. In C you only have to use %N.
>
> All other stuff discussed here is because VE is a native code compiler
> that treats a %N as just a %N. And BM hasn't specified it. All other
> compilers use ANSI C, so a %N works just very, very fine.
And what about compilers to JVM ?
And what about when reading characters ?
Also if you want to write portable Eiffel code, you should not have to
know how each compiler works and you should not have to know if
something works in C.
And in my opinion it's very disturbing that "%N" could well mean "%R%N"
sometimes. "%N" should always have the same meaning, that is just a
"%N".
Regards,
Christian.
|
|
From: Christian C. <chc...@cl...> - 2001-08-13 05:50:53
|
Eric Bezault wrote :
>
> Christian Couder wrote:
> >
> > If what you say is true, perhaps it could still be possible to define
> > the meaning of "eol" depending both on the target platform and the
> > compiler.
> > So that for example eol would always be "%N" if SmallEiffel is used, and
> > it would be "%N" on Unix or "%R%N" on Windows if another compiler is
> > used.
>
> And then:
>
> > For example if you want to set a multi-line message in a message box,
> > you can use one of the following :
> >
> > 1)
> >
> > msg_box.set_msg(error_header + eol + error_msg + eol + help_msg)
>
> These two suggestions seem incompatible to me.
> Under Windows with some Eiffel compilers 'eol' will be
> %N just to make sure that %R%N (and not %R%R%N) is actually
> written to files, but then in your message box you really
> want 'eol' to be %R%N because you are now dealing with
> strings and not files anymore and the compilers do special
> treatments for %N only in files and standard files, not
> in strings.
In my opinion, the compilers shouldn't convert "%R%N" to "%N" when
reading from a file and then back from "%N" to "%R%N" when writing to a
file.
I think there are many good reasons why using something like "eol" would
be better for them than doing this.
First it's faster not to convert anything.
Then it's also less disturbing for the user to see that when he reads a
file into a string then the size of the string is the same as the size
of the file he read.
Now if they prefer to convert, then library developers should also,
if needed, convert back and forth in features like set_msg and get_msg,
which is another reason why converting is bad.
Regards,
Christian.
|
|
From: Berend de B. <be...@po...> - 2001-08-13 06:40:43
|
Christian Couder <chc...@cl...> writes:
> In my opinion, the compilers shouldn't convert "%R%N" to "%N" when
> reading from a file and then back from "%N" to "%R%N" when writing to a
> file.
You have the choice: use a text file if you want conversion, use
binary if you don't want to.
And remember, this only occurs on Windows systems, for POSIX systems
there's no distinction between text and binary files.
--
Groetjes,
Berend. (-:
|
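The text versus binary choice can be illustrated with Python's `io` module, used here only as a stand-in for the C text and binary modes under discussion. One caveat: Python's universal-newline reading also turns a lone "\r" into "\n", which a C runtime on Windows would not do. The helper names are made up for this sketch:

```python
import io

DATA = b"line one\r\nline two\r\n"   # bytes as stored by a Windows editor


def read_as_text(data):
    """Read with newline translation: "\r\n" arrives as "\n"."""
    # newline=None enables universal newlines on input.
    reader = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=None)
    return reader.read()


def read_as_binary(data):
    """Read without translation: the bytes come back exactly as stored."""
    return io.BytesIO(data).read()
```

Read as text, the string is two characters shorter than the file (one per line break); read as binary, the sizes match, which is the property asked for earlier in the thread.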
|
From: Eric B. <er...@go...> - 2001-08-13 07:22:06
|
[Cc:ed to SmallEiffel mailing list.]
Berend de Boer wrote:
>
> Christian Couder <chc...@cl...> writes:
>
> > In my opinion, the compilers shouldn't convert "%R%N" to "%N" when
> > reading from a file and then back from "%N" to "%R%N" when writing to a
> > file.
>
> You have the choice: use a text file if you want conversion, use
> binary if you don't want to.
I wish there was a binary file class which worked with
both C and JVM in SmallEiffel's lib_std. Is that something
that we could expect in future releases of SmallEiffel?
--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
|