Hey, Japheth!
Thank you very much for your awesome JWasm.
Did you maybe consider this :
I have built a 64-bit version with MSVC12 RC Ultimate and it works beautifully.
I will compare the speed later and report it here.
I compiled twice with both the 64-bit and the 32-bit version, and here is the result:
(i7, 2.4 GHz)
>JWasm64: 17271 lines, 25 passes, 9367 ms, 0 warnings, 0 errors
>JWasm64: 17271 lines, 25 passes, 9314 ms, 0 warnings, 0 errors
>JWasm32: 17271 lines, 25 passes, 9770 ms, 0 warnings, 0 errors
>JWasm32: 17271 lines, 25 passes, 9757 ms, 0 warnings, 0 errors
Hello habran,
> Did you maybe consider this :
I'm not sure this is a good idea. I see that this little optimization may save some bytes in the resulting binary, but this probably comes at a cost:
- it forces jwasm to make at least 3 passes, because the "used" flag will always be clear in pass one.
- it may corrupt the listing if the -Sg switch is set.
The second reason is the more important one, of course. I guess one first has to change the way the listing is written before this optimization can be implemented.
> I compiled twice in both 64 and 32 and here is the result:
Interesting. So you succeeded in creating a 64-bit version that is faster than the 32-bit one, which I was unable to achieve.
Congrats! One may even boost the 64-bit version a bit if two locations are adjusted: function LclAlloc() in memalloc.c and macro GetAlignedPointer() in input.h. Both handle alignment, and in both cases the alignment is DWORD only, while for 64-bit an alignment of 8 (QWORD) is almost certainly better.
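To illustrate the DWORD versus QWORD alignment point, here is a minimal sketch of rounding a size or a pointer up to the build's natural alignment. The helper names align_up(), natural_align() and align_ptr() are hypothetical and are not taken from memalloc.c or input.h; the sketch only assumes that the alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not JWasm's actual code: round `size` up to the
 * next multiple of `align`, which must be a power of two. */
static size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Alignment that follows the pointer size of the build:
 * 4 (DWORD) on a 32-bit target, 8 (QWORD) on a typical 64-bit target. */
static size_t natural_align(void)
{
    return sizeof(void *);
}

/* Hypothetical pointer variant: advance `p` to the next aligned address. */
static void *align_ptr(void *p, size_t align)
{
    uintptr_t addr = (uintptr_t)p;

    assert(align != 0 && (align & (align - 1)) == 0);
    return (void *)((addr + align - 1) & ~((uintptr_t)align - 1));
}
```

With a fixed alignment of 4, a 5-byte request is rounded to 8 and the following item can start at an address that is only 4-byte aligned; with align_up(size, natural_align()) every item starts on an 8-byte boundary in a 64-bit build.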
Did you mean size+8?
I tried it; there is not much improvement, but there is some.
Here is the result of several builds:
AX64Edit.asm: 17271 lines, 25 passes, 9262 ms, 0 warnings, 0 errors
AX64Edit.asm: 17271 lines, 25 passes, 9240 ms, 0 warnings, 0 errors
AX64Edit.asm: 17271 lines, 25 passes, 9424 ms, 0 warnings, 0 errors
AX64Edit.asm: 17271 lines, 25 passes, 9480 ms, 0 warnings, 0 errors
AX64Edit.asm: 17271 lines, 25 passes, 9268 ms, 0 warnings, 0 errors
AX64Edit.asm: 17271 lines, 25 passes, 9448 ms, 0 warnings, 0 errors
AX64Edit.asm: 17271 lines, 25 passes, 9420 ms, 0 warnings, 0 errors
AX64Edit.asm: 17271 lines, 25 passes, 9298 ms, 0 warnings, 0 errors
AX64Edit.asm: 17271 lines, 25 passes, 9347 ms, 0 warnings, 0 errors
> it may corrupt the listing if the -Sg switch is set.
is it possible to make a switch for optimization?
E.g. bit 3 in option win64:
"option win64:7"
The results above were from the debug build,
and this is the 64-bit release build:
AX64Edit.asm: 17271 lines, 25 passes, 5341 ms, 0 warnings, 0 errors
AX64Edit.asm: 17271 lines, 25 passes, 5260 ms, 0 warnings, 0 errors
Almost twice as fast.
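As a rough check of "almost twice as fast", the posted timings can be averaged and compared. This is only a sketch over the numbers quoted in this post (nine debug runs, two release runs); the function names are made up for the illustration, and the sample is far too small for a rigorous benchmark.

```c
#include <assert.h>
#include <stddef.h>

/* Timings in milliseconds, copied from the runs posted above. */
static const unsigned debug_ms[] = {
    9262, 9240, 9424, 9480, 9268, 9448, 9420, 9298, 9347
};
static const unsigned release_ms[] = { 5341, 5260 };

/* Arithmetic mean of n timings; n must be greater than zero. */
static double mean_ms(const unsigned *ms, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += ms[i];
    return sum / (double)n;
}

/* Ratio of the mean debug time to the mean release time. */
static double speedup(void)
{
    double dbg = mean_ms(debug_ms, sizeof debug_ms / sizeof debug_ms[0]);
    double rel = mean_ms(release_ms, sizeof release_ms / sizeof release_ms[0]);

    return dbg / rel;
}
```

For these numbers the means are roughly 9354 ms and 5300 ms, so the release build is about 1.76 times as fast as the debug build.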
> did you mean size+8?
Not exactly. I meant:
However, the "size+4" is indeed a bug in 64-bit. I'll have to correct this.
> is it possible to make a switch for optimization?
This won't help with the listing. However, it may be possible to do "backpatching" when the ENDP directive is reached. At this point it's clear which parameters are used and which aren't. Then all labels inside the procedure are adjusted (virtually the same technique that's used when the assembler reaches a label that has been forward-referenced).
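For readers unfamiliar with the term, here is a minimal sketch of generic backpatching for forward references: placeholders are emitted and remembered, then filled in once the target is known. All names (struct fixup_list, emit_forward_ref(), backpatch()) are hypothetical, and this is not JWasm's data model or the proposed ENDP logic, only the general technique it is compared to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FIXUPS 16

/* Hypothetical bookkeeping: offsets in the code buffer whose value
 * cannot be computed yet. */
struct fixup_list {
    size_t where[MAX_FIXUPS];
    size_t count;
};

/* Emit a one-byte placeholder at *pos and remember its offset.
 * Returns 0 on success, -1 if the fixup list is full. */
static int emit_forward_ref(uint8_t *code, size_t *pos, struct fixup_list *fl)
{
    assert(code != NULL && pos != NULL && fl != NULL);
    if (fl->count >= MAX_FIXUPS)
        return -1;
    fl->where[fl->count++] = *pos;
    code[(*pos)++] = 0;
    return 0;
}

/* The target offset is now known: patch every recorded placeholder with
 * an 8-bit displacement relative to the byte after the operand. */
static void backpatch(uint8_t *code, struct fixup_list *fl, size_t target)
{
    for (size_t i = 0; i < fl->count; i++) {
        size_t at = fl->where[i];
        code[at] = (uint8_t)(target - (at + 1));
    }
    fl->count = 0;
}
```

The same idea would apply at ENDP under the assumption described above: offsets that depend on which parameters turn out to be used are recorded while the procedure body is assembled and corrected once the end of the procedure is reached.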
Ok, here's how it should look for both 32- and 64-bit:
memalloc.c
input.h
The "size+4" thing in memalloc.c is virtually irrelevant for 64-bit and can remain as it is.
if FASTMEM
That is what I did, but I also put size+8.
But a perfect solution is:
I changed it in the source and recompiled.
It doesn't make a remarkable change in speed:
AX64Edit.asm: 17271 lines, 25 passes, 5256 ms, 0 warnings, 0 errors (release version x64)
However, 17271 lines and 25 passes in 5256 ms is a great speed.
I did a test on my own to see how much the 64-bit alignment improves the 64-bit jwasm binary.
There is a small benefit of about 2% in a few benchmarks - just enough to be measurable. I don't think you'll notice any difference if a real-world assembly source is assembled.
A small benefit for little effort is much better than none for great effort.