Thread: [Taoscript-lang] rot13.tao reveals Tao (or documentation?) shortcoming :-<
From: Josef 'J. S. <ju...@gm...> - 2005-06-02 08:41:59
Hi!

At the end of this message I added rot13 written in Tao. If you do not know it: rot13 is a Caesar cipher; an implementation of rot13 using standard Unix tools is

    tr '[a-zA-Z]' '[n-za-mN-ZA-M]'

The Tao script works flawlessly - for a single line. I am an old Linux hand so I have two questions not answered by the documentation.

- How to make the program iterate over all input provided via standard input until EOF is encountered?

- How to access command line arguments?

The reason is obvious: Without these two means it is impossible to write Unix style tools in Tao that are capable of processing both input streams and files. Bad.

I am pretty sure most people know why this is bad but perhaps not all do. I therefore describe my motivation for having such capabilities.

A typical problem when dealing with data is this:

You have to implement operation F on data of the same type that is provided by different sources and needs to be provided to other people. While the data are in principle always of the same type you have to deal with input formats i1, i2, i3, ..., iN and output formats o1, o2, o3, ..., oM. If you implement all functionality in one program you end up modifying the program whenever a new source is added. Bad. A data processing program should only be modified if the information it processes, the information it extracts or the way in which the information input is transformed into the information output (i.e. the algorithm) changes, but it should *never* need to be modified if the representation of the data is changed without actually modifying the informational content. It's simply the "never change a winning team" (aka "never touch a running system") issue. It is much better to have filters that transform i1, i2, i3, ..., iN into a generic input format I, then process this input with a program that produces some generic output format O and have this output then transformed by other filters into any of o1, o2, o3, ..., oM.

A great example of tools that follow this style is netpbm that comes with filters that transform an awful lot of graphics formats into a PBM (a trivial graphics format) and others that transform PBM into an awful lot of output formats (there are also programs that can do quantization and the like so F is known to exist as well :-). If netpbm does not support your format simply write two tools: One that transforms the format to PBM and one for the opposite direction. Now netpbm supports your format.

Enough motivation, here's code (note that it does not work on EBCDIC systems!):

    a = read();
    orig = unpack(a);
    rot = { };
    foreach(orig:x) {
      if (x >= unpack("a")[0] && x <= unpack("z")[0]) {
        x = (x + 13 - unpack("a")[0]) % 26 + unpack("a")[0];
      } else if (x >= unpack("A")[0] && x <= unpack("Z")[0]) {
        x = (x + 13 - unpack("A")[0]) % 26 + unpack("A")[0];
      }
      rot.insert(x);
    }
    print (pack(rot), "\n");

Before I forget to ask: Does "rot[rot.#] = x" fail on purpose instead of meaning the same as "rot.insert(x)"?

Josef 'Jupp' SCHUGT
-- 
"NO" to the European constitution means "YES" to democracy, not "NO" to Europe - presently Europe as a whole is governed by a central committee while the parliament only has very limited power. Thank you, France.
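
For comparison, the Unix-filter behaviour asked for in the two questions above (read every line until EOF, and take file names from the command line) might look roughly like this. This is a Python sketch, not Tao; the function names `rot13_char`, `rot13` and `run_filter` are made up for illustration, and like the Tao script it only handles ASCII letters.

```python
import sys


def rot13_char(ch):
    """Rotate one ASCII letter by 13 places; leave everything else alone."""
    if "a" <= ch <= "z":
        base = ord("a")
    elif "A" <= ch <= "Z":
        base = ord("A")
    else:
        return ch
    return chr((ord(ch) - base + 13) % 26 + base)


def rot13(text):
    """Apply rot13 to every character of a string."""
    return "".join(rot13_char(ch) for ch in text)


def run_filter(argv, stdin=sys.stdin, stdout=sys.stdout):
    """Filter each file named in argv, or stdin when argv is empty."""
    if not argv:
        for line in stdin:  # the loop ends when EOF is reached
            stdout.write(rot13(line))
        return
    for path in argv:
        with open(path) as handle:
            for line in handle:
                stdout.write(rot13(line))
```

Calling `run_filter(sys.argv[1:])` at the bottom of a script would turn this into a filter that works both in a pipeline and on named files, which is the property the message argues a scripting language needs.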
From: Limin Fu <fu....@gm...> - 2005-06-06 09:09:18
This is a re-sent message which was not previously sent to this mailing list by mistake :)

Hi!

The feature in your first question is not implemented, I will do it.

The command line arguments are stored in an array named COMARG, which has to be accessed through the current namespace "this", namely this.COMARG or this::COMARG.

Command line argument access is not elegantly supported yet and will probably change, likely in the following ways: "this" will be given the same meaning as in C++, namely the current object, and the current namespace will be accessed with "here". A special namespace will be used to store command line arguments, environment variables etc. For convenience, accessing classes and routines in a namespace will not require explicitly specifying the namespace; they will be searched for automatically in the current namespace and in other namespaces accessible from it.

"rot[rot.#] = x" fails on purpose (I should have let it give a warning), since "rot" is an array. If "rot" and "x" are strings, "rot[rot.#] = x" will append "x" to the end of "rot". This kind of operation is not supported for arrays because if "x" is also an array, "rot[rot.#] = x" would be ambiguous, since "x" could be inserted into "rot" as a whole or element by element.

I understand your motivation for having such capabilities. Your suggestions are very good; by myself alone I can't foresee all possible important features for the language. Thank you for the suggestions :-)

Limin FU

On 6/1/05, Josef 'Jupp' SCHUGT wrote:
> Hi!
>
> At the end of this message I added rot13 written in Tao. [...]
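
The ambiguity described in the reply (inserting an array "as a whole or element by element") can be seen in languages that offer both operations. The following is a Python analogy only and says nothing about how Tao itself behaves; the variable names are chosen to mirror the thread.

```python
rot = [1, 2]
x = [3, 4]

# Interpretation 1: x is inserted as a whole, as one nested element.
as_whole = list(rot)
as_whole.append(x)

# Interpretation 2: x is inserted element by element.
element_wise = list(rot)
element_wise.extend(x)

print(as_whole)       # [1, 2, [3, 4]]
print(element_wise)   # [1, 2, 3, 4]
```

Python avoids the ambiguity by giving the two interpretations different method names, which is similar in spirit to requiring an explicit "rot.insert(x)".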
From: Josef 'J. S. <ju...@gm...> - 2005-06-10 19:16:52
Hi!

At Mon, 6 Jun 2005 11:08:35 +0200, Limin Fu wrote:
> This is a re-sent message which was not previously sent to this
> mailing list by mistake :)

What mail user agent (aka mail client) do you use? I use Wanderlust and, if anything, tend to send answers to too many recipients. By default it addresses answers both to the author of a message and to the list the message was delivered to, which only makes limited sense because the author of a message is usually subscribed to that list and therefore receives the message twice. That is one of the few things mutt is better at :-)

BTW: You should consider not using TOFU - (your) *t*ext *o*n-top, *f*ull-quote *u*nderneath (it)[1]. There are several reasons why one should not use it. The most important (at least to me) is that switching back and forth between the answer and the original message takes more time and is more confusing than reading a question that is immediately followed by its answer.

> I understand your motivation for having such capabilities. Your
> suggestions are very good; by myself alone I can't foresee all
> possible important features for the language. Thank you for
> the suggestions :-)

The best way of finding features a programming language lacks is writing programs in it. If one finds out that something cannot be done, one has to decide whether to implement the feature(s) required for that task or to refrain from supporting it. Recursion is such a feature: FORTRAN 77 does not support it - by intention. Such design decisions should not be fixed for eternity but should be reconsidered now and then. Guess what? Fortran 90 *does* support recursion.

[1] The correct etymology of the acronym is German "Text oben Fullquote unten" (I tried my best to provide an appropriate translation that results in the same acronym).

All for now,

Josef 'Jupp' SCHUGT
-- 
Your computer seems to have been infected by "nTOSkrnl.exe" (the "New Tramiel Operating System" is a revised version of the Atari ST/TT operating system and is known not to run on a PC). Please make sure to remove any file with that name…
From: Limin Fu <fu....@gm...> - 2005-06-13 09:33:34
Hi!

> What mail user agent (aka mail client) do you use? I use Wanderlust
> and, if anything, tend to send answers to too many recipients. By default
> it addresses answers both to the author of a message and to the list
> the message was delivered to, which only makes limited sense because
> the author of a message is usually subscribed to that list and
> therefore receives the message twice. That is one of the few things
> mutt is better at :-)

Usually I use the web browser to check and send email in my gmail account. I find it convenient enough, because it can group emails according to topics, like a forum does. Another reason is that when I registered this gmail account, it didn't support POP clients, though now it does.

Last time I replied to your post in the list by clicking "reply" instead of "reply all", so it was only sent to you. This time I clicked "reply all", and you will receive it twice :)

> BTW: You should consider not using TOFU - (your) *t*ext *o*n-top,
> *f*ull-quote *u*nderneath (it)[1]. There are several reasons why one
> should not use it. The most important (at least to me) is that
> switching back and forth between the answer and the original message
> takes more time and is more confusing than reading a question that
> is immediately followed by its answer.

Yes, sometimes I also reply in this way of breaking up the original message and inserting answers, especially when there are too many questions to answer. It's really more convenient to discuss something by email in this way.

> The best way of finding features a programming language lacks is
> writing programs in it. If one finds out that something cannot be done,
> one has to decide whether to implement the feature(s) required for
> that task or to refrain from supporting it.

You are right, that's what I'm doing. I started to use the Tao language in my work in March, and since then Tao has really improved a lot.

> Recursion is such a
> feature: FORTRAN 77 does not support it - by intention. Such design
> decisions should not be fixed for eternity but should be reconsidered
> now and then. Guess what? Fortran 90 *does* support recursion.

I think there was something similar in Tao: features which I thought were not really necessary, but in the end, when I started writing programs in Tao, I found it was important to have them :-)

> [1] The correct etymology of the acronym is German "Text oben
> Fullquote unten" (I tried my best to provide an appropriate
> translation that results in the same acronym).

You did it!

Limin