Hello,
I tested all the 50 samples of the MSXgl library with the new SDCC 4.5 version.
Everything works fine except 1 sample called "s_swsprt" for which there are rendering glitches.
The exact same code works fine with SDCC 4.2, 4.3 and 4.4.
I packaged all compiled output files for each version in the attached SDCC_4.5_bug.zip file.
You can test the .ROM files by drag & drop them on the online MSX emulator: https://webmsx.org.
With SDCC 4.5, adding a line of code after the bug location fix the issue.
I think the problem could be about inline functions.
You can compare the s_swsprt.asm generated in the out_4.5_bug/ folder and out_4.5_fix/.
BTW, I've been able to verify that not only does SDCC 4.5 correct the problems of increased final binary file sizes present in SDCC 4.3 and 4.4, but it even produces binaries that are generally smaller than SDCC 4.2, which was the reference in the field.
All the results can be found here: https://docs.google.com/spreadsheets/d/1ZTBx3nAW32ZxbbLYjrwJ2ISelW34hyPq6hx7q86Mobk/edit?usp=sharing
Once the bug in the “s_swsprt” sample has been corrected, I'll do some performance tests, as I have the impression that the code generated is, at least sometimes, less efficient than with SDCC 4.2.
SDCC version: 4.5.0 #15242 (MINGW64)
The way I read that report is: if two different programs are compiled, they produce different pictures. You show that two different programs on 4.5.0 produce different pictures, but they are different, as you say: "adding a line of code after the bug location fix the issue."
The question is, how can it be demonstrated that it's an actual problem in the compiler, and not the bug in the drawing code which was simply masked in previous versions of the compiler, that you only now fixed in your code?
You're right. I don't have formal proof that the bug wasn't hidden by older versions of SDCC, but it seems very unlikely to me.
I've tried to isolate the bug, but the problem is that it disappears as soon as I modify the code.
Here's where the bug occurs:
The VDP_CommandXXX functions are inline functions that setup a structure in RAM then call a function that copy it to the video processor.
The bug appear to be the structure to not been setup correctly when the code is as it.
If I move the sprite initialization code else where, for example, the generated assembler code for the bugged part is modified and the bug disappear.
I think someone with a good knowledge of z80 assembler and how the compiler works will easily find out what's going on.
As for me, I'll keep trying to isolate the problem by simplifing the code that generates the bug.
I made some progress.
The bug is linked to a local variable (sprtWidth), located at IX-17.
When the bug is present (with the original code), the use of this variable in first VDP_CommandLMMV() is not correct: the assembler code thinks that the value is in HL when it is not.
In the 2nd use, the value is correctly recovered from the right offset of IX.
Here is bugged assembler code (note that HL is got from AF value, what is nonsense):
Attached, the .C file and the corresponding .ASM.
Note that I was able to slightly simplify the loop while preserving the bug.
Last edit: Aoineko 2025-02-09
At first sight, it looks like the problem is a swapped order of the push de, push af vs. the pop hl, pop af.
Loos the same at second sight, too. Apparently a codegen bug in src/z80/gen.c, genPointerSet. Still, I'd like code to reproduce the issue, so I can write a regression test for it.
Last edit: Philipp Klaus Krause 2025-02-09
Ok, it seems the "only" problem is that the compiler think the value of
sprtWidthis in HL register when settingg_VDP_Command.NX, while in fact, it is in DE.I'm surprised no one else has come across this SDCC 4.5 bug.
Last edit: Aoineko 2025-02-09
Could you give us a compileable example, which does not depend on anything that is not part of SDCC? Just the affected function and all necessary declarations in one source file?
I understand that it would be far easier to check the problematic code if it were isolated, but it's a big task to remove all the dependencies and I don't necessarily have the time right now.
In the meantime, if you want to test the bug and a fix, you can clone MSXgl (https://github.com/aoineko-fr/MSXgl.git), change the SDCC path to your 4.5 version (
SDCCPath) inprojects/samples/project_config.js, then just execute there: “build s_swsprt” (or “./build.sh s_swsprt” on Linux).You'll have all the intermediate files in
/out/.Last edit: Aoineko 2025-02-09
Attached the reduced code which compiles only with
sdcc -c -mz80 s_swsprt.cand still produces in the asm:(edit: added the lines to see that the hl is used wrong)
Last edit: Janko Stamenović 2025-02-09
I've modified code so that after the
InitScreenfunction is run as:SX = 64; SY = 64; InitScreen();the bufferwordsis populated with the values from the calls which go wrong (I hope) and the bufferbuffwith the textual representation of the same values (I can see them so I'm sure that works). There are total 4 logged calls, and two calls are wrong: the first and the third. On that web emulator, linked with the rest of the runtime these 4 values are:0xa010, 1, 0xa010, 1and the "right" values should be all four times 1.Edit: here now the minimal version.
sdcc -mz80 bug_3834.cshould still produce the bug. Thatwordsis filled as described in the c file (now 5 words, and different values, two from 5 are bad).For higher optimizations I saw the correct values, for default I see that bad code (using the official mingw64 build of 4.5.0, and also my current build on Linux).
Last edit: Janko Stamenović 2025-02-09
Thank you very much Janko.
I hope this will enable the SDCC team to fix this bug quickly.
When this kind of compilation bug is fixed, is a new official version of SDCC released?
I think you will have to use a snapshot build in your project, if you want to use it before the new version is released. As far as I understood, the tradition of SDCC is to publish a new "official release" each year: 4.5 in 2025, 4.0 in 2020, and 3.5 in 2015, see:
https://sourceforge.net/projects/sdcc/files/sdcc-win64/
My experiments suggested that there's a chance that you can avoid that bug in your examples by simply adding something like e.g.
(or a bigger number) to your compilation options. As I tried to reproduce, I saw the problem go away with higher settings, and also with
--opt-code-size.Edit: tried: even if the above is true for the smaller examples as I've tried to reduce, it seems that your full C code is more "robust" against the optimizations(!)
BTW you should definitely learn to produce these kinds of C files that don't depend on any includes yourself, by adding the option :
to your compilation and then sending the output for a given .c file produced that way.
Last edit: Janko Stamenović 2025-02-09
The SDCC 4.5.0 release will probably still be the most stable version of SDCC. Your options are:
We've had SDCC 4.5.0 RC1 on 2025-01-03, and the release on 2025-01-28 (with RC2 and RC3 in between). That was 26 days of testing and fixing release candidates, more than usual for an SDCC release.
It's not that simple. The MSXgl library is used by quite a few MSX developers and I can't ask them to build their own version of SDCC to have a stable compiler. What's more, building a version manually, picking up patches here and there, doesn't provide any guarantees, as you end up with a version that nobody has ever tested.
26 days may be a long time for those who are on this forum every day, but personally I missed this period because I was busy elsewhere (I was finishing a game with a January 31 deadline).
Anyway, I'm going to continue my tests with SDCC 4.5 in particular to see how it performs, but this bug seems far too dangerous for me to consider this version as the new minimum version for MSXgl, which will remain the super-stable SDCC 4.2 version.
Thanks for the quick fix.
It is simpler than you think it is. You have your really nice set of programs you can use for tests, and you can just take a snapshot 15282 and run your tests with it, and you already include the SDCC binaries your package, so there's really nothing against including the snapshot 15282 in the package.
The "official releases" in the release process of SDCC by design aren't "the known optimum". The users can have their own "known" optimum, for their own use cases, which definitely can be a snapshot.
I admit my perspective is biased, as I expect from SDCC both the best and smallest code it can produce, and the "most correct" even for the "unknown" C input, and I still think 4.2.0 can't be better than what exists now in 15282. If 4.2.0 appears more stable to you it doesn't mean it's for your own purposes in any way better than the snapshot 15282. I definitely suggest you to save it, both the source and the binaries... until some more recent version with which you could be even more satisfied appears.
That was from the pure "what works" perspective. But I know some companies (read: managers) insist on using "just releases" and nothing else. If such a company actually cares about "just releases", the consistence of the timing of the official releases though the years allows it to very easily schedule their own tests to happen before each official release. In your case, you certainly didn't have to run your set of tests only in January: the problem was probably visible long ago, and now that you know it, if you would use some much older snapshots you could probably see it and confirm. Maybe it was harder to cause it to be manifested, maybe not, but it could be that it was present since a while: the specific
poporder, where the bug was in this case, seems to have been the same since 2021, and is only changed now.In short: whichever tests the compiler team does, they can't protect you from having to run your own tests to see that the code you write works correctly.
Additionally and tangentially, as I've had to reduce the code from your library, I some observations related to the "principles" on which it depends: the way the library is written and used is from the perspective of both the resulting code size and speed suboptimal. Specifically in this loop:
what you have behind these VDP_CommandLMMV functions are writes that can't be optimized by a compiler. You have a structure that has to be updated, and which has different fields and after the update of the structure some additional function is called. So in this case, even if it's inlined, the compiler can't know that that additional function is not wiping out the content of the structure, and it has to update the variables that potentially don't have to ever be updated, if the library or library use followed somewhat different principles. Specifically, if that additional function doesn't wipe the structure, then writing to pixelWidth, 1 and VDP_OP_IMP could be done only before the loop as the content then never changes. So the potential hypothetical code could be just:
and then the overall generated code is both shorter and faster, and also more obvious what it actually does.
But the more complex code helped catching a bug, so it's also good that it exists, so "thumb up" for that.
Also, one more observation: as far as I see the only widths for the given modes used are 256 and 512, in that case, instead of using width to do %, the faster code could get the mask of the mode and instead of
do just
(again, the compiler can't know if the called functions will change the content behind the src-> so it's you who have to move it out of the loop).
Fixed in [r15277].
Related
Commit: [r15277]