Thread: [Tack-devel] Question on em opcodes and other topics
Moved to https://github.com/davidgiven/ack
From: Carl E. C. <cec...@ya...> - 2018-09-25 16:34:43
Greetings, I am partially back. I have been reading (a few times) the em.pdf whitepaper, and I just wanted to know whether the opcodes, syntax, and load format are the same as in today's ACK, or whether I should go through the source code to understand it. If so, where would be the best place to look? Also, how many parts of the tools are currently not ANSI C compliant? Carl
From: George K. <ke...@gm...> - 2018-09-26 01:18:12
On Tue, Sep 25, 2018 at 12:34 PM Carl Eric Codere via Tack-devel <tac...@li...> wrote:
> I am partially back, i have been reading (a few times) the em.pdf whitepaper...

The instructions, pseudos, and syntax of EM source code are almost the same now as in the EM report (em.pdf). I have not checked whether the binary formats of EM compact assembly and EM machine code differ from the report. Beware that many backends are missing some EM instructions or EM traps from the report.

There are no new EM instructions, but there are new messages for `mes` (not in the report's section 11.1.4.4). You can see the messages in h/em_mes.h and search for symbols like ms_stb in the source code.

mes 12 (ms_stb) and mes 13 (ms_std) insert stabs for debugging. Most of our backends can't use the stabs, but I had `ack -g -mosxppc` partly working; gdb on PowerPC Mac OS X knew the names of some global and local variables from stabs.

mes 14 (ms_tes) is the top element size for unconditional branches. The peephole optimizer in util/opt inserts ms_tes messages, and ncg uses them for topeltsize(), so unconditional branches may keep the top element of the EM stack in a register.

The backends are missing many EM traps. Signed overflow should raise trap EIOVFL, but most backends just ignore signed overflow. Modula-2 programs behave strangely: they should trap both signed and unsigned overflow. They do trap unsigned overflow (because the Modula-2 compiler adds the check), but they ignore signed overflow (because the Modula-2 compiler expects the back end to check it, and the check is missing).

The sparc paper says, "Currently many backends do not implement error checks because they are too expensive and almost never needed. Some frontends even have facilities build in to generate EM-code to force these checks. If this trend continues we will end up with a de-facto and a de-jure standard both developed by the same people but nonetheless incompatible." There is no sparc.pdf on http://tack.sourceforge.net/olddocs.html, but see doc/sparc/5 in the source code.

> Also, how many parts of the tools are currently not ANSI C compliant?

The ACK's own code is a mix of traditional C, C89, and C99. (ANSI C is the old name for C89.) The ACK's C compiler should comply with C89, but some parts of libc are now broken. Some of the date and time functions cause linker errors.

--George Koehler
From: Carl E. C. <cec...@ya...> - 2018-09-26 07:27:18
Greetings, Thank you for all this information, really appreciated! I will check the code you indicated, and maybe see if I can update the documentation.

I have also looked at the prototypes of the monitor calls in em.pdf and compared them with both the ISO C90 and POSIX 2004 standards, and I feel a few of them comply with neither standard. I made a list of those that should be implemented to be at least compatible with ISO C90 and to have basic POSIX functionality. Should you not define a minimum set of monitor calls that must be implemented for each platform, instead of implementing all monitor calls, at least to have an ISO C90 compliant compiler?

Yesterday I forked the project and started working on it (taking a pause from my other project!), but my objective is mainly to contribute to the TACK project and send you pull requests if you find them useful. I have a lot of crazy ideas, but let me start small first :)

1. Adapt most tools in util so they build without warnings on GCC with -pedantic and -std=c89, using minimum POSIX functionality, probably using only POSIX 1992 (IEEE) interfaces if possible (so that it can build on most systems, even older ones, and eventually ack itself?)

2. Create Makefiles for each project in util so that they can build at least with the make tool in util/make, and also create CMake files, as I want to make sure it builds with different compilers.

The objective is that ack can be built on Windows (mingw), MacOS and Linux systems with different C compilers, and then to make sure at least the basic tools can be built by ack itself (do you feel that is feasible?). What is your opinion on this?

p.s.: I see the C code in util has some #ifdef for EON, OS9, etc.; do you feel using ISO C90 and POSIX only is ok?

Cheers, Carl
From: <u-...@ae...> - 2018-09-26 12:20:20
On Wed, Sep 26, 2018 at 08:56:33AM +0200, Carl Eric Codere via Tack-devel wrote:
> The objective is that ack can be built on Windows (mingw), MacOS and Linux
> systems through different C compilers, then to make sure at least the basic
> tools can be built by ack itself (is that feasible you feel?)

I would suggest targeting C89/90, based on a strict subset of POSIX. Keeping K&R compatibility would be nice, but it would make changes harder to test. Such a source should be well compatible with modern compilers, as long as they properly support "ANSI C".

I would add a small ANSI C compatible make there (Minix-2 has a nice one, <3k source lines), and let it be built first of all, by a sh script. Then there is the challenge of providing a well-structured source tree and a maintainable build system, using only the supplied "make" and its features. This would make the build system quite robust and portable.

> p.s: I see the C code in util has some #ifdef for EON, OS9, etc, do you feel
> using ISO C90 and POSIX only is ok?

As a matter of reliable engineering, ifdefs, if any, should target available or missing _features_, not _assumptions_ about which features which "platforms" or their versions offer or lack. How to automatically figure out the available features in a given environment is a contended and non-portable matter. Besides, as soon as you put in complicated tools like auto****, you lose self-containment. So if you really must do different things in different situations, then better let the "available-_feature_" knobs be few, documented and manually adjustable.

Good luck! Rune
From: George K. <ke...@gm...> - 2018-09-27 04:54:42
On Wed, Sep 26, 2018 at 3:27 AM Carl Eric Codere wrote:
> I have also looked at the prototypes of the monitor calls in em.pdf and
> compared with both ISO C90 and POSIX 2004 standards, and I feel a few of
> them are not compliant with neither standards.

Most platforms don't use EM's _mon_ instruction. They use the calls in plat/*/libsys, with the headers in plat/*/include. There are some prototypes in lang/cem/libcc.ansi/headers/unistd.h. (Beware that fcntl.h and signal.h just include unistd.h, so unistd.h is crowded with things from other headers.)

The libsys that uses EM's _mon_ is in plat/em/libsys, for the EM interpreter in util/int, where _mon_ is the only way to make a system call. The interpreter tries to provide the system calls from old Unix v7, where open(path, how) has no 3rd argument, and _how_ must be 0 (read), 1 (write), or 2 (read and write); other flags like O_CREAT don't exist.

Most platforms don't use EM's heap pointer, the `lor 2` and `str 2` instructions, but plat/em/libsys uses them in brk() and sbrk().

--George Koehler
From: David G. <dg...@co...> - 2018-09-26 09:52:38
On Wed, 26 Sep 2018 at 09:27 Carl Eric Codere via Tack-devel <tac...@li...> wrote: [...]
> 1. Adapt most tools in util so they build without warnings on GCC with
> -pedantic and -std=c89, using minimum POSIX functionality, probably
> using only POSIX 1992 (IEEE) interfaces if possible (so that it can
> build on most systems, even older ones and ack itself eventually?)

That would be really useful, particularly ANSI-fying the codebase. I've been slowly picking away at it but there's always more to do, and it's problematic with compilers like clang which don't like K&R code much. (I've had good luck with cproto to assist in this.)

> 2. Create Makefiles for each project in util so that they can build at
> least with the make tool in util/make and also create CMake files, as i
> want to make sure it builds with different compilers

This one's a lot harder. The ACK is exceptionally difficult to build, and I've gone through *multiple* build systems, including writing my own, before settling on the current one. There are multiple layers of code generators producing code which is used to build code generators, and libraries which are built multiple times with different configurations, etc, etc. Writing correct build rules for this lot is really hard, particularly as make simply can't handle targets which generate multiple output files, which the ACK uses a lot (you might like to read https://www.gnu.org/software/automake/manual/html_node/Multiple-Outputs.html and then cry).

A good place to start going down the rabbit hole is util/misc/convert.c, which is a very small program that is built twice to generate em_decode and em_encode. The dependency graph is epic...

The existing scripts do work on Unices, OSX, Cygwin and Haiku, and support relatively robust parallel builds (important as the number of build artifacts has just passed 7000!), so I'd kinda not like to fiddle with it at this point. What I'd really like is to burn it all down and replace it with bazel, but until bazel's mainstream enough to at least go into Debian I don't think that's an option. There are some build tool docs in first/ackbuilder.md, BTW.

> The objective is that ack can be built on Windows (mingw), MacOS and
> Linux systems through different C compilers, then to make sure at least
> the basic tools can be built by ack itself (is that feasible you feel?)

Self-hosting should definitely be feasible, and would make a very good test suite. The ACK traditionally always was self-hosted, so it should be possible.

Re OSX and Windows: yes, it absolutely *ought* to work there. But it doesn't. The issue is that several of the ACK's tools use brk and sbrk for memory allocation, which these platforms don't support. led, the linker, is particularly irritating here. It really ought to be rewritten to use conventional malloc(), but the code model is, uh, weird and not very tractable. (We have Travis CI set up for OSX, BTW, although it's disabled for now because it doesn't work.) (And I've learned how to work Tea CI for Windows platforms, so we can totally add that too.)

> p.s: I see the C code in util has some #ifdef for EON, OS9, etc, do you
> feel using ISO C90 and POSIX only is ok?

I think self-hosting is a good goal, but that means we'd be limited to the same C89/C90/ANSI C syntax which the ACK supports. That's not necessarily a *problem*, but I personally would love it if cemcom could be extended to support C99, because that's what I tend to write unless I think about it, so I keep finding C99-isms slipping through; I suspect that's a lot of work, though. For now I reckon that POSIX C89 is a good target --- everything of interest supports it.
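[Editor's note: the multiple-outputs problem described above is commonly worked around in plain make with a witness ("stamp") file, as the linked Automake manual page discusses. A generic sketch of the pattern follows; the tool and file names are invented for illustration, not the ACK's actual build rules.]

```make
# Hypothetical example: "tabgen" writes both tab.c and tab.h in one
# run. Listing both files as targets of one rule misbehaves under
# "make -j" (the recipe may run twice), so both outputs instead
# depend on a single witness file that the recipe touches last.
tab.c tab.h: tab.stamp ;

tab.stamp: tabgen tab.def
	./tabgen tab.def        # writes tab.c and tab.h
	touch tab.stamp
```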
From: Carl E. C. <cec...@ya...> - 2018-09-30 23:52:24
Greetings, see below for my comments.

On 2018-09-26 11:52, David Given wrote:
> That would be really useful, particularly ANSI-fying the codebase.
> I've been slowly picking away at it but there's always more to do, [...]

I have started locally on my side; this is the current status:

* util/make is now ANSI compliant and uses only basic POSIX functions; I also fixed some non-portable code. make can read the makefile of make to build itself correctly.
* util/arch is now mostly POSIX compliant. I cannot push this one to my fork for quite a while because of the issue below:
* modules/src/object is now ANSI C compliant using FILE*, but I did find that some of the code is strange, and the API itself also, and my changes affect everything else, so I need to port everything else before I push this! Also I saw that some functions use internal variables with no fd input, which I find special.

> This one's a lot harder. The ACK is exceptionally difficult to build,
> and I've gone through multiple build systems, including writing my
> own, before settling on the current one. [...] What I'd really like is
> to burn it all down and replace it with bazel, but until bazel's
> mainstream enough to at least go into Debian I don't think that's an
> option.

I understand your point of view; I am not far enough along to see the issue. The problem is that if I want to build some of the tools on legacy / weird platforms (which I will try), I would need to port those build tools. Is there an alternate solution or a happy medium? Maybe we could add CMake or basic makefiles for the util part only? From my understanding there are too many interdependencies, right? Can some of these be broken up a bit?

> Re OSX and Windows: yes, it absolutely *ought* to work there. But
> doesn't. The issue is that several of the ACK's tools use brk and sbrk
> for memory allocation, which these don't support. [...]

Well, I can take a look at it, no problem. As I said, I will start with the util/ directory, but it seems there is a long way to go and a lot of work even in only those directories. Maybe the linker should be completely rewritten...

Some interesting notes / questions from looking at the code:

* The name limit is 14 characters in libraries, probably to follow POSIX; I added a define for this in arch.h.
* I usually use BSD/Allman C code formatting (as described in Eclipse CDT); is that ok with you? If not, I can follow your way if needed.
* How is the testing done? Is there any framework you use? Personally I usually use simple applications with asserts... but I will follow your way as long as it is portable :)
* For documentation, I usually document using doxygen in headers, but I am not sure of the style used now; is there any approach in place?

> I think self-hosting is a good goal, but that means we'd be limited to
> the same C89/C90/ANSI C syntax which the ACK supports. [...] For now I
> reckon that POSIX C89 is a good target --- everything of interest
> supports it.

C99 in cemcom will be a far-off goal for me :), as I would like to work first on the interpreter, then the Pascal compiler, and add support for low-end platform targets... and see how to improve things so that the generated native code can be compatible with official ABIs.

Carl
From: <u-...@ae...> - 2018-10-01 12:45:11
On Mon, Oct 01, 2018 at 01:41:59AM +0200, Carl Eric Codere via Tack-devel wrote:
> I have started locally on my side, this is the current status:
> * util/make is now ANSI compliant and uses only basic POSIX functions, I
> also fixed some non portable code. make can read the makefile of make to
> build correctly itself.

Now I see that there _was_ a make in utils (not present in the 5.5). Even smaller than the Minix one; nice that you are making it portable.

> * util/arch is now mostly POSIX compliant, this one i cannot push it to my

While you are at it, would you make its output with the D flag reproducible? My take was:

    #ifdef DISTRIBUTION
        if (distr_fl) {
    -           static struct stat statbuf;
    -
    -           stat(progname, &statbuf);
    -           distr_time = statbuf.st_mtime;
    +           distr_time = 0;
        }
    #endif

Also this:

        if (distr_fl) {
                member.ar_uid = 2;
                member.ar_gid = 2;

...would, for generality, probably be better with 0.

> the problem is that if i want to build some of the tools on legacy / weird
> platforms (which i will try), i would need to port those build tools.

Exactly.

> alternate solution or a happy medium? Maybe we could add CMake or
> basic makefiles for the util part only? From my understanding there are too
> much interdependencies, right? Can some of these be broken a bit?

If a linear build is broken down into "reasonably sized" pieces and, given that the "previous" parts have completed successfully, can be restarted from any such point (unconditionally resetting and rebuilding all of the "following" parts), I suggest this would be good enough, without a need for much complexity. This strips the dependency graph down to one dimension and makes it coarse-grained, but it is fully sufficient for a correct build. It is also robust for maintenance.

Rune
From: <u-...@ae...> - 2019-02-16 13:25:59
On Sat, Feb 16, 2019 at 11:54:31AM +0100, David Given wrote:
> Being able to build the ACK on small systems is definitely valuable, and we
> should keep it.

Appreciated.

> That said, I'm not sure that using a custom system library's the right way
> to do this. These days C is much more standard than it was back then and
> the need for custom wrappers to make sure that, e.g., realloc with NULL
> works properly isn't necessary. A lot of the stuff in system is equivalent
> to the standard Posix calls, or is unused completely (sys_lock --- in fact,
> I see that function's not even built!).
>
> What I'd propose as a compromise is:
>
> - if a system function is a trivial reimplementation of Posix, replace it
> with the Posix version.
> - *do* replace File* with FILE*. If it turns out to be too expensive later,
> we can replace the buffered stdio with an unbuffered implementation
> equivalent to the one in system, but using the standard interfaces.

I am not sure I am fully following you (any code accidentally dereferencing a File* would break with FILE*, because FILE is opaque, but this might be a non-issue here). I assume that you intend to replace references to the "system" library entries everywhere with the corresponding POSIX and stdio equivalents, dropping the library out of existence, or possibly keeping some entry points not falling under the cases above (?)

If this is a correct interpretation, I do not like to sound negative, but unfortunately I anticipate some drawbacks. One is having to make changes in many files, which implies some additional work and a risk of subtle breakage. The other is the loss of the presently sufficient constrained interface. When someone has to adapt the ACK to a new constrained environment, they will have a challenge finding out which POSIX/libc subset needs to be ported there. By that time the necessary set of operations will probably have become larger than today, if more POSIX/libc calls are introduced in different parts of the code. It is hard to expect that an ACK developer would always have in mind to double-check whether a certain POSIX / C library call is already being used or not.

> This way we should end up with standard interfaces, which makes things
> easier to work on and maintain, but still allow the small systems.

This particular interface is small (17 entry points?) and the implementation takes 449 lines in *.[hc] and 324 lines of nroff in the man page. An implementation of a minimal standard-compatible unbuffered stdio would probably take at least about the same number of LOC, even if the man page would become redundant (?).

To get some idea, I checked with avr-libc (buffered stdio, so take it with a large grain of salt, but anyway). The docs say: "only a limited subset of standard IO is implemented". The non-formatted I/O stdio code:

    avr-libc-2.0.0/libc/stdio$ ls *.c | grep -v printf | grep -v scanf | xargs wc | tail -1
       1067  5812  39654 total

Everything in stdio:

    avr-libc-2.0.0/libc/stdio$ wc * | tail -1
       4998 23588 160132 total

Rune
From: David G. <dg...@co...> - 2018-10-04 15:20:10
On Mon, 1 Oct 2018 at 01:52 Carl Eric Codere via Tack-devel <tac...@li...> wrote: [...]
> * modules/src/object This is now ANSI C compliant using FILE*, but i did
> find that some of the code is strange and the API itself also, and my
> changes affects everything else, so i need to port everything else before i
> push this! Also I saw some of functions use internal variables with no fd
> input which i find special.

Yep. Very special --- lots of global state everywhere. I'd recommend not trying to change the API alongside other changes; that sort of thing belongs in the minimum possible change.

> I understand your point of view, i am not that far enough to see the
> issue, the problem is that if i want to build some of the tools on legacy /
> weird platforms (which i will try), i would need to port those build tools.
> Is there alternate solution or a happy medium? Maybe we could add CMake or
> basic makefiles for the util part only? From my understanding there are too
> much interdependencies, right? Can some of these be broken a bit?

It'd be difficult. Any tools which do anything interesting will have to touch the really complex stuff in, e.g., modules/src/read_em or modules/src/em_code, which are all built in multiple varieties (to handle ASCII em and bytecode em), and are based on generated tables from h/em_table, etc. Some of these libraries will call tools in util/ to generate some of their code (util/cmisc/tabgen is a favourite).

Simplifying into fewer libraries is certainly possible, but you still end up with a dependency between h/em_table and the code generators and your tool. No matter what you do, you still have to ensure that if a dependency changes, you rebuild things. Otherwise you get incorrect builds and subtle, horrible-to-find bugs.

[...]

> * name limit is 14 characters in libraries, probably to follow POSIX,
> added a define for this in arch.h

Yikes. Yes, definitely --- although it's just for member filenames, not symbol names, so it's not critical. It's probably 14 to match the Unix V7 file system.

> * I usually use BSD/allman C code formatting (as described in eclipse CDT),
> is that ok with you, or not i can follow your way if needed?

There's a .clang-format file in the root which I use, mostly to reformat the ancient K&R code before editing. I'm not precious, though.

> * How is the testing done? Any framework you use, personally i usually use
> simple applications with asserts... but will follow your way as long as it
> is portable :)

Yes, that's fine. Tests are currently mostly limited to compiler tests (in tests/plat), but there ought to be a simple framework for generic tests too. I like tests which run as part of the build, so failing tests cause build failures, but the compiler tests are sufficiently problematic that we can't run them on the CI systems. I'm hoping to change this eventually.

> * For documentation, i usually document using doxygen in headers, but i am
> not sure the style used now, is there any approach done now?

Documentation? We've heard of it.

What there is is mostly man pages, both API (e.g. https://github.com/davidgiven/ack/blob/default/modules/src/em_mes/em_mes.3) and tool (e.g. https://github.com/davidgiven/ack/blob/default/util/cmisc/tabgen.1). But they're both fairly antiquated. I like embedded documentation in headers, but I'm not terribly keen on the external dependency on doxygen. It must be possible to generate man pages from headers using a very small shell script...

> That will be far off for me as a goal of C99 of cemcom :), as i would like
> to work first on the interpreter, then the pascal compiler and add support
> for low-end platform targets... and see how to improve so that the
> generated native code can be compatible with official ABI's..

Regarding the last: I've actually been thinking about it. Stacking parameters is killing MIPS code generation quality. Unfortunately it's not at all easy to change.

The critical issue is that for register-based calling conventions, the caller needs to put parameters in the right registers, which means it needs to know what the function's expecting. Unfortunately, em bytecode doesn't carry this information! It's entirely legal to pass arbitrary parameters to a function where you don't know anything except the name.

For this to work, I think the only way is to modify the call instruction to take a parameter spec as well as the function name. Except this would require changing every compiler (five) to correctly pass in parameter specs, plus every code generator (three), plus fixing all the hand-written em code (lots), and this would be fundamentally incompatible with old code, so it'd have to be done as a single change. Plus em instructions can only take one parameter anyway...
From: Carl E. C. <cec...@ya...> - 2018-10-04 23:53:40
On 2018-10-04 17:19, David Given wrote: > On Mon, 1 Oct 2018 at 01:52 Carl Eric Codere via Tack-devel > <tac...@li... > <mailto:tac...@li...>> wrote: > [...] > > * modules/src/object This is now ANSI C compliant using FILE*, but > i did find that some of the code is strange and the API itself > also, and my changes affects everything else, so i need to port > everything else before i push this! Also I saw some of functions > use internal variables with no fd input which i find special. > > > Yep. Very special --- lots of global state everywhere. I'd recommend > not trying to change the API alongside other changes; that sort of > thing belongs in the minimum possible change. No problem, i will just try to make sure it is using ANSI calls for the moment. > > I understand your point of view, i am not that far enough to see > the issue, the problem is that if i want to build some of the > tools on legacy / weird platforms (which i will try), i would need > to port those build tools. Is there alternate solution or a happy > medium? Maybe we could add CMake or basic makefiles for the util > part only? From my understanding there are too much > interdependencies, right? Can some of these be broken a bit? > > > It'd be difficult. Any tools which do anything interesting will have > touch the really complex stuff in, e.g., modules/src/read_em or > modules/src/em_code, which are all built in multiple varieties (to > handle ASCII em and bytecode em), and are based on generated tables > from h/em_table, etc. Some of these libraries will call tools it util/ > to generate some of their code (util/cmisc/tabgen is a favourite). > > Simplifying into fewer libraries is certainly possible, but you still > end up with a dependency between h/em_table and the code generators > and your tool. No matter what you did, you still have to ensure that > if a dependency changes, you rebuild things. Otherwise you get > incorrect builds and subtle, horrible-to-find bugs. 
> Ok, i will get back to you when i am there, but am still far from there... > [...] > > * name limit is 14 characters in libraries, probably to follow > POSIX, added a define for this in arch.h > > > Yikes. Yes, definitely --- although it's just for member filenames, > not symbol names, so it's not critical. It's probably 14 to match the > Unix V7 file system. > > * I usually use BSD/allman C code formatting (as described in > eclipse CDT), is that ok with you, or not i can follow your way if > needed? > > > There's a .clang-format file in the root which I use, mostly to > reformat the ancient K&R code before editing. I'm not precious, though. Ok, clear. > > * How is the testing done? Any framework you use, personally i > usually use simple applications with asserts... but will follow > your way as long as it is portable :) > > > Yes, that's fine. Tests are currently mostly limited to compiler tests > (in tests/plat) but there ought to be a simple framework for generic > tests too. I like tests which run as part of the build, so failing > tests cause build failures, but the compiler tests are sufficiently > problematic that we can't run them on the CI systems. I'm hoping to > change this eventually. Ok, then i will try to use that approach also.. > * For documentation, i usually document using doxygen in headers, > but i am not sure the style used now, is there any approach done now? > > > Documentation? We've heard of it. > > What there is is mostly man pages, both API (e.g. > https://github.com/davidgiven/ack/blob/default/modules/src/em_mes/em_mes.3) > or tool (e.g. > https://github.com/davidgiven/ack/blob/default/util/cmisc/tabgen.1). > But they're both fairly antiquated. I like embedded documentation in > headers, but I'm not terribly keen on the external dependency on > doxygen. It must be possible to generate man pages from headers using > a very small shell script... Ok, no problem, I am updating the man pages i see when i can find them... 
> > That will be a far-off goal for me (C99 for cemcom) :), as I > would like to work first on the interpreter, then the pascal > compiler, and add support for low-end platform targets... and see > how to improve so that the generated native code can be compatible > with official ABI's.. > > > Regarding the last: I've actually been thinking about it. Stacking > parameters is killing MIPS code generation quality. Unfortunately it's > not at all easy to change. > > The critical issue is that for register-based calling conventions, the > caller needs to put parameters in the right registers, which means it > needs to know what the function's expecting. Unfortunately, em > bytecode doesn't carry this information! It's entirely legal to pass > arbitrary parameters to a function where you don't know anything > except the name. > > For this to work, I think the only way is to modify the call > instruction to take a parameter spec as well as the function name. > Except this would require changing every compiler (five) to correctly > pass in parameter specs, plus every code generator (three), plus fixing > all the hand-written em code (lots), and this would be fundamentally > incompatible with old code so it'd have to be done as a single change. > Plus em instructions can only take one parameter anyway... > Would it not be better to use one of the following options instead:
* Update the PRO pseudo-instruction with an additional parameter (at the end), which contains the signature of the routine parameter types. For example, for the C function void hello(int8_t value), the PRO pseudo-instruction would be: PRO _hello,,v_hello_i8 or something similar to this (we would need to define this signature, and/or see if a standard exists already).. so the assembler is still backward compatible, as the last operand is optional. 
Then in the em object file, we either add a new section including the type information OR add new symbol entries with the new names while also keeping the old ones, and internally we can search both when linking / converting. Need to think deeper about it though...
* Use a MES pseudo-instruction, required just after the PRO declaration, that would give the type information of the parameters... the format could once again be a function signature with type definitions.
Some requirements I see:
* old em assembler code should be easy to adapt
* the routine signatures should be as compact as possible to save precious space on low-end machines.
I would see this as a step-by-step process: adapt all compilers first to generate the type information - no problem for me to do this - and then adapt all the tools to ignore this type information, except for one target cg that we could test on first... Not sure if this makes sense... Some useful standards on naming conventions that could be used: https://en.wikipedia.org/wiki/Name_mangling Carl |
From: Carl E. C. <cec...@ya...> - 2019-02-06 04:13:24
|
Greetings, Small update on where I am at in my "porting" effort. Nothing is committed yet; the changes are quite large but contain few functional changes. About 60% of the tack project is now ported to ANSI C.
* Try to remove calls to the system library using sys_ when I know that all platforms support the ANSI C version of the calls.
** I need to add sys_tmpfile / sys_tmpnam since it's not portable because of Microsoft libc weirdness.
* Fix function prototypes to be C99-compatible as much as possible.
* Adapt to have header files with guards when functions are used across modules.
** Move the docs to the header file when they exist in the function definition.
** One-line sentence overview of each function when I can do it.
* Adapt to have internal prototypes and add STATIC to functions used internally to a module.
* Add CMake and use cmake to build portable scripts:
** Create sed scripts so there is no more shell requirement.
** Create awk scripts so there is no more shell requirement.
* Properly document all opcodes / pseudo-instructions so that a reference manual can be created in a new document.
I have only ported the pascal and basic compilers for now, and since ack is not compiled, I am testing by hand, but I see some issues:
* The pascal compiler generates a PRO pseudo-instruction without the size, since it's only one pass, so the old code generator chokes, because it expects it. I have questions on this:
** Do other compilers do this also? Will ncg also choke if the PRO does not have the locals parameter?
** Should we adapt cg or the compiler to properly process a PRO / END pair to get the number of local bytes?
BTW: Regarding the em documentation, I would like to update it (I see mistakes in the specification); just wondering, should it be converted to something other than troff, or is it ok like that? Latex? Asciidoc? etc... I will not change anything in the format for now... just want an opinion on this. I still have a lot of work to go... Carl |
From: David G. <dg...@co...> - 2019-02-16 14:31:59
|
Yup, that's exactly what I mean. Lose the library completely and use Posix interfaces throughout. On systems which need it, we can just typedef FILE to an int and then implement fread() as a tiny wrapper around read(). (We should probably have this anyway as a libc option. It looks like the ACK is mostly used for real on very small systems; I'm currently doing a tonne of work on the 8080 backend for Fuzix, and I get persistent bug reports from some people working with CP/M that the stdio is just too big.) This is equivalent to what the system library currently does --- except without the conceptual overhead of having to deal with an extra API with its own semantic quirks. Carl's reaction above on seeing File* is exactly what I'm trying to avoid here. Provided we make a reasonable effort to use a minimal subset of the APIs available, there's little work here and a lot of win. Code is cheap; understanding is expensive. Making the code easier to understand makes it easier to work with, and therefore there's more likelihood that people will understand what's going on (which we desperately need; a lot of the code is very opaque). Plus, on systems which *don't* need the small library, we get stdio file buffering for free and faster compiles. On Sat, 16 Feb 2019 at 14:14 <u-...@ae...> wrote: > On Sat, Feb 16, 2019 at 11:54:31AM +0100, David Given wrote: > > Being able to build the ACK on small systems is definitely valuable, and > we > > should keep it. > > Appreciated. > > > That said, I'm not sure that using a custom system library's the right > way > > to do this. These days C is much more standard than it was back then and > > the need for custom wrappers to make sure that, e.g., realloc with NULL > > works properly isn't necessary. A lot of the stuff in system is > equivalent > > to the standard Posix calls, or is unused completely (sys_lock --- in > fact, > > I see that function's not even built!). 
> > > > What I'd propose as a compromise is: > > > > - if a system function is a trivial reimplementation of Posix, replace it > > with the Posix version. > > - *do* replace File* with FILE*. If it turns out to be too expensive > later, > > we can replace the buffered stdio with an unbuffered implementation > > equivalent to the one in system, but using the standard interfaces. > > I am not sure I am fully following you (any code accidentally > dereferencing *File would break with *FILE because FILE is opaque, > but this might be a non-issue here), but I assume that you intend to replace > references to the "system" library entries everywhere with corresponding > Posix and stdio equivalents, dropping the library out of existence, > or possibly keeping some entry points not falling under the cases above (?) > > If this is a correct interpretation, I do not like to sound negative, > but unfortunately I anticipate some drawbacks. > > One is having to make changes in many files, which implies some > additional work and a risk of subtle breakage. > > The other is the loss of the present, sufficiently constrained interface. > > When someone has to adapt Ack to a new constrained environment, > they would face the challenge of finding out which Posix/libc subset needs > to be ported there. > > By then the necessary set of operations will probably have become larger > than today's, if more Posix/libc calls are introduced in different > parts of the code. > > It is hard to expect that an Ack developer would always remember to > double-check whether a certain Posix / C library call is already being > used or not. > > > This way we should end up with standard interfaces, which makes things > > easier to work on and maintain, but still allow the small systems. > > This particular interface is small (17 entry points?) and the > implementation > takes 449 lines in *.[hc] and 324 lines of nroff in the man page. 
> An implementation of a minimal standard-compatible unbuffered > stdio would probably take at least about the same amount of LOC even if > the man page would become redundant (?). > > To get some idea, I checked with avr-libc: > (buffered stdio, so take it with a large grain of salt, but anyway) > The docs say: "only a limited subset of standard IO is implemented" > > The non-formatted i/o stdio code: > avr-libc-2.0.0/libc/stdio$ ls *.c | grep -v printf | grep -v scanf | xargs > wc | tail -1 > 1067 5812 39654 total > > Everything in stdio: > avr-libc-2.0.0/libc/stdio$ wc * | tail -1 > 4998 23588 160132 total > > Rune > > |
From: Carl E. C. <cec...@ya...> - 2019-02-16 18:26:16
|
Greetings, Ok, I do understand your point of view and I have no issue with it, but I have several questions related to it once again.. sorry for the bother. Do we agree that we should avoid POSIX calls in the C layer and stick with ANSI C where possible (it might not be possible in certain cases), right? Now a real use case:
* I have replaced mkstemp() with tmpnam() in LLgen; it is now fully ANSI, even though it has known race conditions, something acceptable for this project, right?
* But as you might already know, tmpnam() is completely broken in Visual C++ and simply cannot be used. In that case, how do you solve it for Visual C++, or for other ANSI C libraries which have such issues... how do you fix it? The solution I made is to add a TMPNAM wrapper as a function in machdep.c and then, if necessary in the future, add platform-specific parts for those broken libc APIs there directly... Not sure if this is the correct approach though.
On the other hand, if POSIX calls are required, then we need to implement them on the platforms that are missing them; is this what you are saying? Using the semantics of the real POSIX API instead of a wrapper, so the "system" library would actually emulate the POSIX calls? And for low-end platforms, we simply have libc call the kernel as directly as possible, with typedefs etc., keeping it minimal, right? Just trying to understand... Carl On Saturday, February 16, 2019, 10:32:08 PM GMT+8, David Given <dg...@co...> wrote: Yup, that's exactly what I mean. Lose the library completely and use Posix interfaces throughout. On systems which need it, we can just typedef FILE to an int and then implement fread() as a tiny wrapper around read(). (We should probably have this anyway as a libc option. It looks like the ACK is mostly used for real for very small systems; I'm currently doing a tonne of work on the 8080 backend for Fuzix, and I get persistent bug reports from some working with CP/M that the stdio is just too big.) 
|
From: <u-...@ae...> - 2019-02-16 20:16:20
|
> On Sat, 16 Feb 2019 at 14:14 <u-...@ae...> wrote: > > unfortunately I anticipate some drawbacks. ... > > The other is a loss of the presently sufficient constrained interface. > > > > When someone would have to adapt Ack to a new constrained environment, > > one would have a challenge to find out which Posix/libc subset needs > > to be ported there. In practice, the developers will assume the availability of all standard features unless you explicitly document a certain subset to be used. Are you planning to define such a subset? On Sat, Feb 16, 2019 at 03:31:32PM +0100, David Given wrote: > Yup, that's exactly what I mean. Lose the library completely and use Posix > interfaces throughout. On systems which need it, we can just typedef FILE > to an int and then implement fread() as a tiny wrapper around read(). > > (We should probably have this anyway as a libc option. It looks like the > ACK is mostly used for real for very small systems; I'm currently doing a > tonne of work on the 8080 backend for Fuzix, and I get persistent bug > reports from some working with CP/M that the stdio is just too big.) > > This is equivalent to what the system library currently does --- except > without the conceptual overhead of having to deal with an extra API with > its own semantic quirks. Carl's reaction above on seeing File* is exactly > what I'm trying to avoid here. One way or another, "system" or "libc", it ought to be documented for the developers which interfaces the compiler is to use. See my question about a chosen subset above. > Provided we make a reasonable effort to use > a minimal subset of the APIs available, there's little work here and a lot > of win. > > Code is cheap; understanding is expensive. Making the code easier to > understand makes it easier to work with, and therefore there's more > likelihood that people will understand what's going on (which we > desperately need; a lot of the code is very opaque). 
(What do you mean by saying that code is cheap?) I agree with the rest of the paragraph, but the additional complexity of the tiny "system" API reflects one of the desirable properties of the code, to be as portable as feasible. The cost for this looks reasonable to me. > Plus, on systems which *don't* need the small library, we get stdio file > buffering for free and faster compiles. I believe I understand your motivation but still your arguments do not address the issue of replacing a small API ("system") with a much, much larger one (Posix/libc). It is the larger API which will need to be available on all systems, including the most limited ones, as soon as the compiler will depend on that larger API. The necessary effort is remarkably different regarding porting the "system" library compared to adding Posix/libc compatibility to a new environment. Adding the latter is a very good thing, but can be prohibitively expensive, especially if some features will have to be present for the sake of the compiler and not otherwise. The decision is of course yours. Hope that sharing my view helps in some way. Thanks for your work on Ack! Rune |
From: <u-...@ae...> - 2019-02-06 08:53:41
|
Hello Carl, Thanks for the progress report. On Wed, Feb 06, 2019 at 12:13:00PM +0800, Carl Eric Codere via Tack-devel wrote: > Greetings, > Small update on where I am at in my "porting" effort, > nothing committed yet, but the changes are quite important but not a lot of > functional changes, about 60% of the tack project is now ported to ANSI C. [...] > * Add CMake and use cmake to build portable scripts: > ** Create sed scripts so no more shell requirement > ** Create awk scripts so no more shell requirement. Just to make sure, what do you mean by "no more shell requirement"? (How does a "shell requirement" compare to "cmake/sed/awk requirements"?) > * Properly documenting all opcodes / pseudoinstructions so that a reference > manual can be created in a new document. That's great. > BTW: Regarding the em documentation, I would like to update it (I see > mistakes in the specification), just wondering should it be converted to > something else than troff, or ok like that? Latex? Asciidoc? etc... I will > not change anything on the format for now... just want an opinion on this. Corrections are straightforward as long as you know what the errors are; a choice of the format can hardly ever satisfy everyone. FWIW I suggest that you focus now on the contents, not on a format change, if you can live with troff. > I still have a lot of work to go to... > Carl Looking forward to an ANSIfied ack. What worries me is that it is hard to verify such large changes. It would be very helpful if we could produce identical compiler binaries from the sources "before" and "after", to know that they are equivalent. This implies of course that this change has to be kept separate from all other modifications. Rune |
From: David G. <dg...@co...> - 2019-02-06 23:37:36
|
Thanks very much --- that all sounds great! Please, though, send lots of small PRs as you go rather than one big one. It's vastly easier to review and reduces the risk of stuff you're doing crossing stuff I'm doing. For example, I recently ANSIfied the Pascal and Basic compiler myself (although pretty crudely, just to make the warnings go away). [...] > ** Move the docs to the header file when they exist in the function > definition. > ...for anything other than public functions (i.e. stuff in public headers in modules/), it's best left next to the definition. This is because these usually describe the *implementation*, not the specification. > ** One-line sentence overview of function when I can do it. > Very valuable. > * Adapt to have internal prototypes and add STATIC to functions used > internally to a module. > Any reason not to simply use 'static'? > * Add CMake and use cmake to build portable scripts: > ** Create sed scripts so no more shell requirement > ** Create awk scripts so no more shell requirement. > I don't think CMake will cut it with the ACK --- it's too complex. In particular, the ACK needs the same module to be compiled multiple times with different settings, and I've found very few build systems that can handle that. (I so, so badly want to use bazel for this.) > I have only ported the pascal and basic compilers now and since ack is > not compiled, I am testing by hand, but i see some issues: > There is a test suite, which is run automatically by the main build system. It's not a terribly complete test suite, but it does exist. > * The pascal compiler generates a PRO pseudoinstruction without the > size, since its only 1 pass, so the old code generator chokes, because > it expects it, I have questions on this: ** Do other compilers do this also? Will ncg also choke if the PRO does > not have the locals parameters? 
> The compiler output is always run through em_opt before being passed to a code generator, which among other things will add the parameter to PRO. You don't need to worry about it being missing in the code generator. > BTW: Regarding the em documentation, I would like to update it (I see > mistakes in the specification), just wondering should it be converted to > something else than troff, or ok like that? Latex? Asciidoc? etc... I > will not change anything on the format for now... just want an opinion > on this. > I'd very much like to keep the existing documentation, even though it's a mess. In my experience it's been surprisingly accurate; what errors did you find? Also, bear in mind that the code generators are significantly more relaxed about things like alignment than the spec actually decrees; I had to do a lot of fixing before the int interpreter would run some of the compiler output... |
From: Carl E. C. <cec...@ya...> - 2019-02-07 16:57:24
|
On 2019-02-07 07:37, David Given wrote: > Thanks very much --- that all sounds great! > > Please, though, send lots of small PRs as you go rather than one big > one. It's vastly easier to review and reduces the risk of stuff you're > doing crossing stuff I'm doing. For example, I recently ANSIfied the > Pascal and Basic compiler myself (although pretty crudely, just to > make the warnings go away). > > [...] Ohoh.. you did those already? Ok, I will double-check your changes and try to merge them with mine. While ansifying I try to follow the University of Michigan guidelines on C include files. Hope this is ok (umich.edu/~eecs381/handouts/CHeaderFileGuidelines.pdf). I will try to push something soon, maybe in small batches... > > ** Move the docs to the header file when they exist in the function > definition. > > > ...for anything other than public functions (i.e. stuff in public > headers in modules/), it's best left next to the definition. This is > because these usually describe the /implementation/, not the > specification. > > ** One-line sentence overview of function when I can do it. > > > Very valuable. > > * Adapt to have internal prototypes and add STATIC to functions used > internally to a module. > > > Any reason not to simply use 'static'? Actually you are right, but it is quite messy here: I saw that the ANSI C compiler's LINT code used a PRIVATE define while Pascal used a STATIC define. Better to stick with plain 'static' everywhere as you proposed, right? > * Add CMake and use cmake to build portable scripts: > ** Create sed scripts so no more shell requirement > ** Create awk scripts so no more shell requirement. > > > I don't think CMake will cut it with the ACK --- it's too complex. In > particular, the ACK needs the same module to be compiled multiple > times with different settings, and I've found very few build systems > that can handle that. (I so, so badly want to use bazel for this.) 
Actually, up to now i have had no issue.... BUT you are right, that I might get stuck when I am at building ACK... i will probably see what can be done, but my objective was to remove all references to bash shell scripts... and replace them with basic POSIX compliant tools... > I have only ported the pascal and basic compilers now and since > ack is > not compiled, I am testing by hand, but i see some issues: > > > There is a test suite, which is run automatically by the main build > system. It's not a terribly complete test suite, but it does exist. > > * The pascal compiler generates a PRO pseudoinstruction without the > size, since its only 1 pass, so the old code generator chokes, > because > it expects it, I have questions on this: > > ** Do other compilers do this also? Will ncg also choke if the PRO > does > not have the locals parameters? > > > The compiler output is always run through em_opt before being passed > to a code generator, which among other things will add the parameter > to PRO. You don't need to worry about it being missing in the code > generator. Ahhh.. ok, then after I have finished compiling the ANSI C compiler, i will go with porting that so i can test a bit and then start doing some pull requests... Thanks, that clarifies a lot. > > BTW: Regarding the em documentation, I would like to update it (I see > mistakes in the specification), just wondering should it be > converted to > something else than troff, or ok like that? Latex? Asciidoc? > etc... I > will not change anything on the format for now... just want an > opinion > on this. > > > I'd very much like to keep the existing documentation, even though > it's a mess. In my experience it's been surprisingly accurate; what > errors did you find? Also, bear in mind that the code generators are > significantly more relaxed about things like alignment than the spec > actually decrees; I had to do a lot of fixing before the int > interpreter would run some of the compiler output... 
> Agreed, I can keep troff, I just need to brush up on it I guess. Actually it's not a big mistake, more an omission: in the EM report I could not find, at least in what I read, how a single byte value is encoded; maybe I missed it? And in the EM PDF I have, the annex gives information on EM encodings which does not seem to fit with what is actually encoded in compact form... unless it's something different? Carl |
From: Carl E. C. <cec...@ya...> - 2019-02-13 15:56:05
|
Greetings, I am trying to be careful in the ansification process so I don't break anything, so I am carefully reviewing my changes before doing a commit in my branch. I have some questions, because I am probably missing some understanding.
* Why does the alloc library rely on the system library in NoMem(void)? Can we not simply call fprintf(stderr, ...) and exit(EXIT_FAILURE) instead? It would make it more portable and less interdependent with other libraries, no?
* Why does the print library rely on the system library for its I/O? Same question as before; we could just replace File* by standard FILE*?
The only reasons I could foresee are:
* that this was done before ISO C90 and was not portable at the time?
* when printing you may want to actually write somewhere else by implementing your own sys_xxx function.
* any others?
What is the objective of the system library? I thought it would contain API calls that are not fully portable, for example POSIX APIs that we should "emulate" for each platform, or ISO C routines that must be overridden because of broken implementations in some libraries (tmpfile() and tmpnam() in Visual C++ come to mind)? Other examples:
* get file modification time.
* create temp file.
But I see sys_write and sys_open ... I am ok to keep them, but should it not be a FILE* instead? Should it not also add some common APIs like gettmpdir() that could be used in different utilities, or maybe instead it needs to be reimplemented? Sorry for all those questions, but it is very difficult for me to understand the spirit of the above... Thanks in advance, Carl On 2019-02-08 00:57, Carl Eric Codere via Tack-devel wrote: > On 2019-02-07 07:37, David Given wrote: >> Thanks very much --- that all sounds great! >> >> Please, though, send lots of small PRs as you go rather than one big >> one. It's vastly easier to review and reduces the risk of stuff >> you're doing crossing stuff I'm doing. 
For example, I recently >> ANSIfied the Pascal and Basic compiler myself (although pretty >> crudely, just to make the warnings go away). >> >> [...] > Ohoh.. you did those already? Ok, i will double check your changes and > try to merge them with mine. While ansifying i try to follow the > guidelines of the university of Michigan on C include files. Hope this > is ok (umich.edu/~eecs381/handouts/CHeaderFileGuidelines.pdf). > > I will try to push something soon, maybe by small batches... >> >> ** Move the docs to the header file when they exist in the function >> definition. >> >> >> ...for anything other than public functions (i.e. stuff in public >> headers in modules/), it's best left next to the definition. This is >> because these usually describe the /implementation/, not the >> specification. >> >> ** One-line sentence overview of function when I can do it. >> >> >> Very valuable. >> >> * Adapt to have internal prototypes and add STATIC to functions used >> internally to a module. >> >> >> Any reason not to simply use 'static'? > Actually you are right, but it is quite messy here, i saw that for the > ANSI C compiler on LINT code they used a PRIVATE define while in > pascal it was a STATIC define, better directly stick with the static > as you proposed everywhere right? > >> * Add CMake and use cmake to build portable scripts: >> ** Create sed scripts so no more shell requirement >> ** Create awk scripts so no more shell requirement. >> >> >> I don't think CMake will cut it with the ACK --- it's too complex. In >> particular, the ACK needs the same module to be compiled multiple >> times with different settings, and I've found very few build systems >> that can handle that. (I so, so badly want to use bazel for this.) > Actually, up to now i have had no issue.... BUT you are right, that I > might get stuck when I am at building ACK... i will probably see what > can be done, but my objective was to remove all references to bash > shell scripts... 
and replace them with basic POSIX compliant tools... > >> I have only ported the pascal and basic compilers now and since >> ack is >> not compiled, I am testing by hand, but i see some issues: >> >> >> There is a test suite, which is run automatically by the main build >> system. It's not a terribly complete test suite, but it does exist. >> >> * The pascal compiler generates a PRO pseudoinstruction without the >> size, since its only 1 pass, so the old code generator chokes, >> because >> it expects it, I have questions on this: >> >> ** Do other compilers do this also? Will ncg also choke if the >> PRO does >> not have the locals parameters? >> >> >> The compiler output is always run through em_opt before being passed >> to a code generator, which among other things will add the parameter >> to PRO. You don't need to worry about it being missing in the code >> generator. > Ahhh.. ok, then after I have finished compiling the ANSI C compiler, i > will go with porting that so i can test a bit and then start doing > some pull requests... Thanks, that clarifies a lot. >> >> BTW: Regarding the em documentation, I would like to update it (I >> see >> mistakes in the specification), just wondering should it be >> converted to >> something else than troff, or ok like that? Latex? Asciidoc? >> etc... I >> will not change anything on the format for now... just want an >> opinion >> on this. >> >> >> I'd very much like to keep the existing documentation, even though >> it's a mess. In my experience it's been surprisingly accurate; what >> errors did you find? Also, bear in mind that the code generators are >> significantly more relaxed about things like alignment than the spec >> actually decrees; I had to do a lot of fixing before the int >> interpreter would run some of the compiler output... >> > Agreed, i can keep troff, just need to brush up on it i guess. 
> Actually it's not a big mistake, more an omission: in the EM report, I > could not find, at least in what I read, how a single byte value is > encoded; maybe I missed it? And in the EM PDF I have, the annex gives > information on EM encodings which do not seem to fit with what is > actually encoded in compact form... unless it's something different? > > Carl > > > > > > _______________________________________________ > Tack-devel mailing list > Tac...@li... > https://lists.sourceforge.net/lists/listinfo/tack-devel |
From: Jacobs, C.J.H. <c.j...@vu...> - 2019-02-14 08:15:37
|
Hi, back when we developed ACK, we were worried about code size (in particular, the resulting size of binaries). We needed to have versions of the compiler running in a 16-bit address space (OK, we had separate instruction and data spaces), but that meant that the compiler binary code had to fit in 64K, on systems that did not support shared libraries and such. Using stdio would make that virtually impossible, because even back then, that would add 8-16K to the binary size (I don’t remember exactly, but it was a lot). This is the reason that the ACK compiler front-ends don’t use printf, sprintf, FILE, etc., or anything that internally uses those. Best wishes, Ceriel Jacobs > On 13 Feb 2019, at 16:25, Carl Eric Codere via Tack-devel <tac...@li...> wrote: > > Greetings, > I am trying to be careful in the ansification process so I don't break anything, so I am carefully reviewing my changes before doing a commit in my branch. I have some questions, because I am probably missing some understanding. > > * Why is the alloc library relying on the system library in NoMem(void)? Can we not simply call fprintf(stderr, ...) and exit(EXIT_FAILURE) instead? 
It would make it more portable and less interdependent of other libraries, no? > * Why is the print library relying on the system library for its I/O? Same question as before: could we just replace File* by standard FILE*? > > The only reasons I could foresee are: > * that this was done before ISO C90 and was not portable at the time? > * when printing you may want to actually write somewhere else, by implementing your own sys_xxx function. > * any others? > > What is the objective of the system library? > I thought it would contain API calls that are not fully portable, for example POSIX APIs that we should "emulate" for each platform, or ISO C routines that must be overridden because of broken implementations in some libraries (tmpfile() and tmpnam() in Visual C++ come to mind)? other examples: > * get file modification time. > * create a temp file. > But I see sys_write and sys_open ... I am OK to keep them, but should it not be a FILE* instead? > Should it not also add some common APIs like gettmpdir() that could be used in different utilities, or maybe it instead needs to be reimplemented? > > Sorry for all those questions, but it is very difficult for me to understand the spirit of the above... > > Thanks in advance, > Carl > > On 2019-02-08 00:57, Carl Eric Codere via Tack-devel wrote: > [...] 
|
From: <u-...@ae...> - 2019-02-14 10:37:12
|
On Thu, Feb 14, 2019 at 08:15:25AM +0000, Jacobs, C.J.H. via Tack-devel wrote: > back when we developed ACK, we were worried about code size (in particular, the resulting size of binaries). > We needed to have versions of the compiler running in 16-bit address space (OK, we had separate instruction and data spaces), > but that meant that the compiler binary code had to fit in 64K, on systems that did not support shared libraries and such. > Using stdio would make that virtually impossible, because even back then, that would add 8-16K to the binary size (I don’t > remember exactly, but it was a lot). This is the reason that the ACK compiler front-ends don’t use printf, sprintf, FILE, etc., > or anything that internally uses those. Hello Ceriel, Thanks for the explanation. I hope this compactness does not have to be sacrificed for the ansification. It would certainly be exciting to have a version (as ack-4 was, or possibly even the ack-5-derived minix ackpack?) capable of self-hosting on 16-bit platforms, not necessarily only on retrostyle pdp11-lookalikes but why not on, say, risc-v rv16*? Run under Fuzix? IMHO as long as there is a compact i/o library sufficient for the compiler I would rather keep it - you never know when somebody will need to run the compiler or its derivative on a constrained platform. Rune |
From: Carl E. C. <cec...@ya...> - 2019-02-15 06:49:37
|
Greetings, Thank you for the clarifications, it makes me understand the initial objectives. I guess it's a question for the current maintainers now: In that case, I propose to keep system and, at least in the current version, have system use stdio as it is already doing now, clean it up a bit, expand it as necessary, and then when we need to support compiling on low-end machines we just need to adapt system and not everything? Is it also in the long-term objectives of ACK development to run the compiler on older systems that are maybe within a minimal threshold? Not only as targets, but also as hosts? It would be nice, no? Not sure if it's possible though. Thanks again, Carl On Thursday, February 14, 2019, 4:15:28 p.m. GMT+8, Jacobs, C.J.H. <c.j...@vu...> wrote: [...] 
|
From: <u-...@ae...> - 2019-02-16 10:33:56
|
On Fri, Feb 15, 2019 at 06:08:54AM +0000, Carl Eric Codere via Tack-devel wrote: > I guess it's a question for the current maintainers now: > In that case, I propose to keep system, and maybe at least in the current version have system use stdio as it is already doing now, clean it up a bit, expand it as necessary, and then when we need to support compiling on low-end machines we just need to adapt system and not everything? > > Is it also in the long-term objectives of ACK development to run the compiler on older systems that are maybe within a minimal threshold? Not only as targets, but also as hosts? It would be nice, no? Not sure if it's possible though. I only qualify as a current maintainer for my Linux ackpack port, but here follows my opinion: Ack has its specific and not easily replaceable value of being capable of running on constrained systems. Of course, if other virtues (which motivate people to contribute to ack at all?) conflict with this one, it is up to the benevolent dictator to make a choice. As long as such a virtue conflict is not present, I suggest not enforcing solutions heavier than ack actually needs. The same applies to the choice of the build tools. Complex build tools make self-hosting on constrained platforms impossible. This also negatively affects reproducibility. Nothing wrong with using advanced tools for the actual development and debugging, where you change random places in the code and wish to rebuild correctly, i.e. reflecting all build dependencies, and also very fast. But a distribution IMHO should always be accompanied by a straightforward script, taking paths to the needed resources explicitly (no fancy guessing), to linearly and fully compile the software, by minimal tools. That's what an integrator needs. 
(Hopefully the script would be written in a lightweight and easily available language like Bourne shell using minimal Posix tools; not GNU make, bash, perl, python with specific modules, or the like.) Otherwise, besides the usual dependency hell (not much in the case of ack), one is faced with a deeper hell of the build tools (impossible to build without hardware at least 10-100-1000 times more powerful than ack needs and without relying on 10-100-1000 times more lines of code? no, thanks! :) Regards, Rune |
From: David G. <dg...@co...> - 2019-02-16 10:55:02
|
Being able to build the ACK on small systems is definitely valuable, and we should keep it. That said, the current configuration can't do it and people aren't currently building it for these platforms. (Apart from anything else everything's built with the BigPars settings. See https://github.com/davidgiven/ack/blob/default/lang/cem/cemcom.ansi/BigPars vs SmallPars.) I'm quite sure that mcg won't work on these systems. That said, I'm not sure that using a custom system library's the right way to do this. These days C is much more standard than it was back then and the need for custom wrappers to make sure that, e.g., realloc with NULL works properly isn't necessary. A lot of the stuff in system is equivalent to the standard Posix calls, or is unused completely (sys_lock --- in fact, I see that function's not even built!). What I'd propose as a compromise is: - if a system function is a trivial reimplementation of Posix, replace it with the Posix version. - *do* replace File* with FILE*. If it turns out to be too expensive later, we can replace the buffered stdio with an unbuffered implementation equivalent to the one in system, but using the standard interfaces. - *don't* start calling printf() where we don't need it. One of the things on the to-do list is to make printf cheaper, somehow (I have a few ideas), but let's not tempt fate. This way we should end up with standard interfaces, which makes things easier to work on and maintain, but still allow the small systems. PS. I use cscope to search for symbols in the ACK tree. It's great. Do 'cscope -Rq' and then enter a function name in the box. On Fri, 15 Feb 2019 at 07:49 Carl Eric Codere via Tack-devel < tac...@li...> wrote: > [...] 
|
From: Carl E. C. <cec...@ya...> - 2019-02-16 18:31:47
|
Greetings, Thanks for the clarification, and I am happy you share the goal of making it work on small systems. I will try to see if I can eventually (very far from it) make it work on low-end systems and keep the .text segment to 64k. Ok, I had started to replace File* by FILE*, but it is too big of a change to do in one shot. I will try to first commit the changes I made to have proper function prototyping in the ANSI C compiler and its dependencies... once this is done, I will move to the next step of cleaning up the usage of system and getting rid of File*. Carl On Saturday, February 16, 2019, 6:54:44 PM GMT+8, David Given <dg...@co...> wrote: [...] 
One of the things on the to-do list is to make printf cheaper, somehow (I have a few ideas), but let's not tempt fate. This way we should end up with standard interfaces, which makes things easier to work on and maintain, but still allow the small systems. PS. I use cscope to search for symbols in the ACK tree. It's great. Do 'cscope -Rq' and then enter a function name in the box. On Fri, 15 Feb 2019 at 07:49 Carl Eric Codere via Tack-devel <tac...@li...> wrote: Greetings, Thank you for the clarifications, it makes me understand the initial objectives. I guess its a question for the current maintainers now: In that case, I propose to keep system, and maybe at least in the current version have system use stdio as it is already doing now, clean it up a bit, expand it as necessary, and then when we need to support compiling on low end machines we just need to adapt system and not everything? Is it also in the long term objectives of ACK development to run the compiler on older systems that are maybe within a minimal threshold? Not only as targets, but also as hosts? I would nice, no, not sure it its possible though? Thanks again, Carl On Thursday, February 14, 2019, 4:15:28 p.m. GMT+8, Jacobs, C.J.H. <c.j...@vu...> wrote: Hi, back when we developed ACK, we were worried about code size (in particular, the resulting size of binaries). We needed to have versions of the compiler running in 16-bit address space (OK, we had separate instruction and data spaces), but that meant that the compiler binary code had to fit in 64K, on systems that did not support shared libraries and such. Using stdio would make that virtually impossible, because even back then, that would add 8-16K to the binary size (I don’t remember exactly, but it was a lot). This is the reason that the ACK compiler front-ends don’t use printf, sprintf, FILE, etc., or anything that internally uses those. 
Best wishes,
Ceriel Jacobs

> On 13 Feb 2019, at 16:25, Carl Eric Codere via Tack-devel <tac...@li...> wrote:
>
> Greetings,
> I am trying to be careful in the ansification process so I don't break anything, so I am carefully reviewing my changes before doing a commit in my branch. I have some questions, because I am probably missing some understanding.
>
> * Why is the alloc library relying on the system library in NoMem(void)? Can we not simply call fprintf(stderr, ...) and exit(EXIT_FAILURE) instead? It would make it more portable and less dependent on other libraries, no?
> * Why is the print library relying on the system library for its I/O? Same question as before: could we just replace File* by the standard FILE*?
>
> The only reasons I could foresee are:
> * that this was done before ISO C90 and was not portable at the time?
> * that when printing you may want to actually write somewhere else by implementing your own sys_xxx function.
> * any others?
>
> What is the objective of the system library?
> I thought it would contain API calls that are not fully portable, for example POSIX APIs that we should "emulate" for each platform, or ISO C routines that must be overridden because of broken implementations in some libraries (tmpfile() and tmpnam() in Visual C++ come to mind). Other examples:
> * get file modification time.
> * create temp file.
> But I see sys_write and sys_open... I am OK to keep them, but should it not be a FILE* instead?
> Should it not also add some common APIs like gettmpdir() that are used in different utilities, or maybe those need to be reimplemented instead?
>
> Sorry for all those questions, but it is very difficult for me to understand the spirit of the above...
>
> Thanks in advance,
> Carl
>
> On 2019-02-08 00:57, Carl Eric Codere via Tack-devel wrote:
>> On 2019-02-07 07:37, David Given wrote:
>>> Thanks very much --- that all sounds great!
>>>
>>> Please, though, send lots of small PRs as you go rather than one big one. It's vastly easier to review and reduces the risk of stuff you're doing crossing stuff I'm doing. For example, I recently ANSIfied the Pascal and Basic compilers myself (although pretty crudely, just to make the warnings go away).
>>>
>>> [...]
>> Ohoh.. you did those already? Ok, I will double-check your changes and try to merge them with mine. While ansifying I try to follow the University of Michigan guidelines on C include files. I hope this is OK (umich.edu/~eecs381/handouts/CHeaderFileGuidelines.pdf).
>>
>> I will try to push something soon, maybe in small batches...
>>> ** Move the docs to the header file when they exist in the function definition.
>>>
>>> ...for anything other than public functions (i.e. stuff in public headers in modules/), it's best left next to the definition. This is because these usually describe the implementation, not the specification.
>>>
>>> ** One-line sentence overview of a function when I can do it.
>>>
>>> Very valuable.
>>>
>>> * Adapt to have internal prototypes and add STATIC to functions used internally to a module.
>>>
>>> Any reason not to simply use 'static'?
>> Actually you are right, but it is quite messy here: I saw that for the ANSI C compiler the LINT code used a PRIVATE define, while in Pascal it was a STATIC define. Better to stick directly with 'static' as you proposed, everywhere, right?
>>
>>> * Add CMake and use CMake to build portable scripts:
>>> ** Create sed scripts so there is no more shell requirement.
>>> ** Create awk scripts so there is no more shell requirement.
>>>
>>> I don't think CMake will cut it with the ACK --- it's too complex. In particular, the ACK needs the same module to be compiled multiple times with different settings, and I've found very few build systems that can handle that. (I so, so badly want to use bazel for this.)
>> Actually, up to now I have had no issue....
BUT you are right that I might get stuck when I get to building the ACK itself... I will probably see what can be done, but my objective was to remove all references to bash shell scripts... and replace them with basic POSIX-compliant tools...
>>
>>> I have only ported the Pascal and Basic compilers so far, and since ack is not compiled, I am testing by hand, but I see some issues:
>>>
>>> There is a test suite, which is run automatically by the main build system. It's not a terribly complete test suite, but it does exist.
>>>
>>> * The Pascal compiler generates a PRO pseudoinstruction without the size, since it's only one pass, so the old code generator chokes, because it expects it. I have questions on this:
>>> ** Do other compilers do this also? Will ncg also choke if the PRO does not have the locals parameter?
>>>
>>> The compiler output is always run through em_opt before being passed to a code generator, which among other things will add the parameter to PRO. You don't need to worry about it being missing in the code generator.
>> Ahhh.. OK, then after I have finished compiling the ANSI C compiler, I will go on with porting that so I can test a bit, and then start doing some pull requests... Thanks, that clarifies a lot.
>>>
>>> BTW: Regarding the EM documentation, I would like to update it (I see mistakes in the specification); just wondering, should it be converted to something other than troff, or is it OK like that? LaTeX? Asciidoc? etc... I will not change anything on the format for now... I just want an opinion on this.
>>>
>>> I'd very much like to keep the existing documentation, even though it's a mess. In my experience it's been surprisingly accurate; what errors did you find? Also, bear in mind that the code generators are significantly more relaxed about things like alignment than the spec actually decrees; I had to do a lot of fixing before the int interpreter would run some of the compiler output...
>>>
>> Agreed, I can keep troff; I just need to brush up on it, I guess. Actually it's not a big mistake, more an omission: in the EM report, at least in what I read, I could not find how a single byte value is encoded; maybe I missed it? And in the EM PDF I have, the annex gives information on EM encodings which does not seem to fit with what is actually encoded in compact form... unless it's something different?
>>
>> Carl

_______________________________________________
Tack-devel mailing list
Tac...@li...
https://lists.sourceforge.net/lists/listinfo/tack-devel
|