>> that way it is possible to have faster program with less code
>
> Yes, it's more efficient. OTOH, this approach requires to use PUSH/POP very carefully inside the
> procedure; in any case, the programmer must know exactly what he's doing. Unlike C, which has full
> control of the stack, the assembler allows more freedom - and hence more possibilities to add bugs.
The reason for these bugs is usually readability. You start by hard-coding the locals (mov ,eax), then naming them (result equ ).
One way of handling this is to use a struct:
There should be some method for handling this, a reversed stack struct:
Well, a bit off topic, but I try to convince myself that there is a very simple solution to this.
Sizing up both arguments and locals in one struct seems to be the simplest way of doing this. Redefining the labels relative to the base enables direct access.
The _output function is basically a large switch with a lot of static functions, all using the same stack frame. In addition to this there is also a math section defined in different files. Child functions will then use the struct:
The definition of the proc in 16/16/32/64:

_output . . . .P Near 0000 _TEXT 001D Public PASCAL
  arg1 . . . . .Word        bp + 0008
  arg2 . . . . .Word        bp + 0006
  arg3 . . . . .Word        bp + 0004
  OP . . . . . .Byte[522]   bp - 020A

_output . . . .P Far 0000 _TEXT 001D Public PASCAL
  arg1 . . . . .DWord       bp + 000E
  arg2 . . . . .DWord       bp + 000A
  arg3 . . . . .DWord       bp + 0006
  OP . . . . . .Byte[522]   bp - 020A

_output . . . .P Near 00000000 _TEXT 0000001F Public STDCALL
  arg1 . . . . .DWord       ebp + 0008
  arg2 . . . . .DWord       ebp + 000C
  arg3 . . . . .DWord       ebp + 0010
  OP . . . . . .Byte[524]   ebp - 020C

_output . . . .P Near 00000000 _TEXT 0000002C Public FASTCALL
  arg3 . . . . .QWord       rbp + 0020
  arg2 . . . . .QWord       rbp + 0018
  arg1 . . . . .QWord       rbp + 0010
  OP . . . . . .Byte[528]   rbp - 0210

If the stack issue is handled by the proc definition, and invoke is used, the code becomes more readable, and to some degree also portable.

    include clib.inc
    include stdio.inc

    .code

printf proc _CDecl public format:ptr byte, argptr:VARARG
    invoke _stbuf,addr stdout
    push ax?                        ; save the _stbuf result
    invoke _output,addr stdout,format,addr argptr
    pop dx?                         ; dx? = _stbuf result
    push ax?                        ; save the _output result
    invoke _ftbuf,dx?,addr stdout
    pop ax?                         ; return the _output result
    ret
printf endp

    end
Hi japheth,
I am having problems with JWasm v2.07 because of stack alignment on an x64 machine.
When an odd number of registers is pushed and the procedure has a local variable, it decrements the stack by only 8 bytes.
v2.06e does the same.
When an even number of registers is pushed, it aligns to 16 bytes and works fine,
as it does if I add a dummy variable.
Here is a disassembly:
8515:SomeSubrutine PROC FRAME USES r12 r14 r15 a1:PTR ASMWEDIT, a2:PTR ASMWEDIT
8516: local lpAsmwItem :PTR ASMWEDITITEMW
8517:
8518: mov r14,rdx
000000000046D8A7 mov qword ptr ,rcx
000000000046D8AC mov qword ptr ,rdx
000000000046D8B1 mov qword ptr ,r8
000000000046D8B6 mov qword ptr ,r9
000000000046D8BB push rbp
000000000046D8BC mov rbp,rsp
000000000046D8BF push rbx
000000000046D8C0 push r12
000000000046D8C2 push r14
000000000046D8C4 push r15
000000000046D8C6 sub rsp,8
best regards
I have looked at how stack alignment is solved in MSVC for x64 programming
and found that it reserves space on the stack at the beginning of the subroutine, as big as the largest call plus the locals.
So it is not necessary to adjust the stack for each call.
That way it is possible to have a faster program with less code.
best regards
> that way it is possible to have a faster program with less code
Yes, it's more efficient. OTOH, this approach requires using PUSH/POP very carefully inside the procedure; in any case, the programmer must know exactly what he's doing. Unlike C, which has full control of the stack, the assembler allows more freedom - and hence more possibilities to add bugs.
In x64 programming there are plenty of registers available, so PUSH/POP is not needed so often.
Maybe you could add an .OPTION "CFRAME", so that when you don't need PUSH/POP you can use CFRAME, and FRAME otherwise.
You are right that the assembly programmer must know exactly what he's doing, and I think this would give us more freedom and speed.
thank you for a great tool
best regards
However,
I am still having a problem with the last QWORD of the local variables being overwritten if the number of pushed registers is odd
best regards
Hi japheth
here is the code to prove to you that there is a bug in JWasm versions 206 and 207;
205 is working correctly:

HeadersTo64bits proc FRAME USES rsi rdi r12 RawText:LPSTR, FixedText:LPSTR, pszFileName:LPCTSTR, fSize:DWORD
    local szName :BYTE
    local szVar :BYTE
    mov rsi,rcx
    mov rdi,rdx
    mov r12,rcx
    add r12,r9
    invoke lstrcpy,addr szName, pszFileName
;-----------------------------------------------------------------------
JWasm version 205bw compiles correctly:
mdi64!HeadersTo64bits:
00000000`0040268e 48894c2408      mov  qword ptr [rsp+8],rcx
00000000`00402693 4889542410      mov  qword ptr [rsp+10h],rdx
00000000`00402698 4c89442418      mov  qword ptr [rsp+18h],r8
00000000`0040269d 4c894c2420      mov  qword ptr [rsp+20h],r9
00000000`004026a2 55              push rbp
00000000`004026a3 488bec          mov  rbp,rsp
00000000`004026a6 56              push rsi
00000000`004026a7 57              push rdi
00000000`004026a8 4154            push r12
00000000`004026aa 4881ec08020000  sub  rsp,208h
00000000`004026b1 488bf1          mov  rsi,rcx
00000000`004026b4 488bfa          mov  rdi,rdx
00000000`004026b7 4c8be1          mov  r12,rcx
00000000`004026ba 4d03e1          add  r12,r9
00000000`004026bd 4883ec20        sub  rsp,20h
00000000`004026c1 488d8de4feffff  lea  rcx,[rbp-11Ch]          ;Correct
00000000`004026c8 488b5520        mov  rdx,qword ptr [rbp+20h]
00000000`004026cc ff15fe8f0000    call qword ptr [rip+8FFEh]
00000000`004026d2 4883c420        add  rsp,20h
JWasm version 207bw compiles it wrong,
and so does JWasm 206:
mdi64!HeadersTo64bits:
00000000`0040268e 48894c2408      mov  qword ptr [rsp+8],rcx
00000000`00402693 4889542410      mov  qword ptr [rsp+10h],rdx
00000000`00402698 4c89442418      mov  qword ptr [rsp+18h],r8
00000000`0040269d 4c894c2420      mov  qword ptr [rsp+20h],r9
00000000`004026a2 55              push rbp
00000000`004026a3 488bec          mov  rbp,rsp
00000000`004026a6 56              push rsi
00000000`004026a7 57              push rdi
00000000`004026a8 4154            push r12
00000000`004026aa 4881ec08020000  sub  rsp,208h
00000000`004026b1 488bf1          mov  rsi,rcx
00000000`004026b4 488bfa          mov  rdi,rdx
00000000`004026b7 4c8be1          mov  r12,rcx
00000000`004026ba 4d03e1          add  r12,r9
00000000`004026bd 4883ec20        sub  rsp,20h
00000000`004026c1 488d8ddcfeffff  lea  rcx,[rbp-124h]          ;Incorrect
00000000`004026c8 488b5520        mov  rdx,qword ptr [rbp+20h]
00000000`004026cc ff15fe8f0000    call qword ptr [rip+8FFEh]
00000000`004026d2 4883c420        add  rsp,20h
thanks
regards
Hi again,
JWasm version 207bw can not compile these macros:
GetGValue MACRO arg:REQ
    IFDIFI <arg>,<eax>
        mov eax, arg
    ENDIF
    shr eax, 8
    and eax, 0ffh
    EXITM <eax>
ENDM
GetBValue MACRO arg:REQ
    IFDIFI <arg>,<eax>
        mov eax, arg
    ENDIF
    shr eax, 16
    and eax, 0ffh
    EXITM <eax>
ENDM
GetRValue MACRO arg:REQ
    IFDIFI <arg>,<eax>
        mov eax, arg
    ENDIF
    and eax, 0ffh
    EXITM <eax>
ENDM
it throws:
Error A2209: Syntax error: GetBValue
Error A2209: Syntax error: GetGValue
Error A2209: Syntax error: GetRValue
JWasm version 205bw works fine
best regards
Habran,
> JWasm version 207bw can not compile these macros:
Please post bugs in the "Bugs Tracker"! Also, please don't add new bugs to existing threads. These rules ensure that no bug is forgotten.
FYI: I confirm the "stack alignment" bug. It's a side-effect of a bugfix in v2.05.
Hi Japheth,
The errors happened because of a conflict with the macros of the same names in wingdi.inc:
GetRValue macro rgb
    exitm <( ( rgb ) ) >
endm
GetGValue macro rgb
    exitm <( ( ( ( rgb ) ) shr 8 ) ) >
endm
GetBValue macro rgb
    exitm <( ( ( rgb ) shr 16 ) ) >
endm
I have changed GetRValue into GetaRValue and it works fine.
However, why did JWasm 205 not produce an error?
Also, these macros from wingdi.inc are not working;
maybe I don't know how to use them.
Anyway, thank you so much for taking care of that stack alignment.
I spent a lot of time looking for the bug in my program and couldn't find it;
then I realized that it is a bug in the compiler.
best regards
BTW what is a "Bugs Tracker"?
I found only an .err file in my folder and it contained the same message as I posted to you,
nothing else
http://en.wikipedia.org/wiki/Bug_tracking_system
Please take a look at the menu bar at the top of this site: Tracker -> Bugs
Or take this direct link:
https://sourceforge.net/tracker/?group_id=255677&atid=1126895
Regards
I apologize for my ignorance
Hi Japheth
sorry about your difficulties, I know how it feels when you cannot do programming
while I was waiting for you to come back I looked into that stack alignment problem and found what was wrong
here is the original source from v2.06e:
file: proc.c, line: 413
#if AMD64_SUPPORT
    /* adjust start displacement for Win64 FRAME procs.
     * v2.06: the list may contain xmm registers, which have size 16!
     */
    if ( info->isframe ) {
        uint_16 *regs = info->regslist;
        int sizestd = 0;
        int sizexmm = 0;
        if ( regs )
            for( cnt = *regs++; cnt; cnt--, regs++ )
                if ( GetValueSp( *regs ) & OP_XMM )
                    sizexmm += 16;
                else
                    sizestd += 8;
        displ = sizexmm + sizestd;
        if ( sizestd & 0xf ) // problem is here: it does not check whether there is any xmm register,
            displ += 8;      // it only checks for an odd or even count
    }
#endif
here is the corrected source:
#if AMD64_SUPPORT
    /* adjust start displacement for Win64 FRAME procs.
     * v2.06: the list may contain xmm registers, which have size 16!
     */
    if ( info->isframe ) {
        uint_16 *regs = info->regslist;
        int sizestd = 0;
        int sizexmm = 0;
        if ( regs )
            for( cnt = *regs++; cnt; cnt--, regs++ )
                if ( GetValueSp( *regs ) & OP_XMM )
                    sizexmm += 16;
                else
                    sizestd += 8;
        displ = sizexmm + sizestd;
        if (( sizestd & 0xf ) && sizexmm) // is there any xmm register?
            displ += 8;
    }
#endif
now it works fine
I hope you come back soon
we need you
best regards
I added a bug tracker entry for the stack alignment issue:
https://sourceforge.net/tracker/?func=detail&aid=3539225&group_id=255677&atid=1126895