Re: [Capstone-users] Decoding AArch64 (ARM64) instructions via Python bindings (unsuccessful attempts)


On Oct 14, 2014, at 1:49 AM, Nguyen Anh Quynh wrote:

> hi David,
> 
> (1) it is fine to use CS_MODE_ARM with Arm64, like below:
> 
>     md = Cs(CS_ARCH_ARM64, CS_MODE_ARM)
> 
> (2) obviously you need Big Endian mode in your case, like:
> 
>     md = Cs(CS_ARCH_ARM64, CS_MODE_ARM + CS_MODE_BIG_ENDIAN)
> 
> (3) on your machine, compiler produces Big Endian code for AArch64, and that is the reason why your code failed

AFAIK it does not, or at least it should not.

$ rpm --eval='%{_host}'
aarch64-redhat-linux-gnu
$ rpm --eval='%{_build}'
aarch64-redhat-linux-gnu
$ rpm --eval='%{_target}'
aarch64-linux

https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html

-mbig-endian
Generate big-endian code. This is the default when GCC is configured for an ‘aarch64_be-*-*’ target.

$ lscpu
Architecture:          aarch64
Byte Order:            Little Endian

I am running this code on little-endian AArch64 silicon, and the target triplet is also for a little-endian machine.

0x4F8010A4 == 0100'1111 1000'0000 0001'0000 1010'0100 (fmla    v4.4s, v5.4s, v0.s[0])

C7.3.108 from DDI0487A_c_armv8_arm.pdf

FMLA by element vector encoding

bits:
31    : 0
30    : 1 (Q)
29-23 : 0011111
22    : 0 (sz)
21    : 0 (L)
20    : 0 (M)
19-16 : 0000 (Rm)
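
As a cross-check of the field breakdown above, here is a minimal Python sketch that slices 0x4F8010A4 into bit ranges. The helper name bits() is made up for illustration, and the positions of opcode, H, Rn and Rd (bits 15-0) are taken from my reading of the same encoding diagram, not from the list above. With these ranges, bits 19-16 come out as 0000 (Rm = v0) and the 0001 pattern sits in bits 15-12.

```python
WORD = 0x4F8010A4  # fmla v4.4s, v5.4s, v0.s[0]


def bits(word, hi, lo):
    """Return bits hi..lo (inclusive) of word as an integer (hypothetical helper)."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


fields = {
    "bit31": bits(WORD, 31, 31),
    "Q": bits(WORD, 30, 30),
    "fixed_29_23": bits(WORD, 29, 23),  # expected pattern 0011111
    "sz": bits(WORD, 22, 22),
    "L": bits(WORD, 21, 21),
    "M": bits(WORD, 20, 20),
    "Rm": bits(WORD, 19, 16),
    "opcode": bits(WORD, 15, 12),       # assumed field position
    "H": bits(WORD, 11, 11),            # assumed field position
    "Rn": bits(WORD, 9, 5),             # assumed field position
    "Rd": bits(WORD, 4, 0),             # assumed field position
}

for name, value in fields.items():
    print(f"{name:12s} = {value:#b}")
```

Rn = 5 and Rd = 4 match the v5 and v4 operands that objdump prints.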

$ objdump -d inst.o
0000000000000000 <.text>:
   0:	4f8010a4 	fmla	v4.4s, v5.4s, v0.s[0]

$ od -t x1 -j 64 -N 4 inst.o
0000100 a4 10 80 4f
0000104

B2.5.2  Instruction endianness

In ARMv8-A, A64 instructions have a fixed length of 32 bits and are always little-endian.

Let's test it.

$ as -EL -o inst.o inst.s
$ objdump -d inst.o

inst.o:     file format elf64-littleaarch64

Disassembly of section .text:

0000000000000000 <.text>:
   0:	4f8010a4 	fmla	v4.4s, v5.4s, v0.s[0]

$ od -t x1 -j 64 -N 4 inst.o
0000100 a4 10 80 4f
0000104

$ as -EB -o inst.o inst.s
$ objdump -d inst.o

inst.o:     file format elf64-bigaarch64

Disassembly of section .text:

0000000000000000 <.text>:
   0:	4f8010a4 	fmla	v4.4s, v5.4s, v0.s[0]

$ od -t x1 -j 64 -N 4 inst.o
0000100 a4 10 80 4f
0000104

So they are always stored on disk as little endian, regardless of the mode. Objdump displays each instruction as a single 32-bit word, most significant byte first (i.e. it looks big endian). Well, that does make instructions easier to read by hand.
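
To make the byte order concrete, here is a small Python sketch that rebuilds the 32-bit word from the four bytes od printed, and then, only if the capstone package happens to be installed, feeds the same bytes to Capstone without CS_MODE_BIG_ENDIAN. The function name try_capstone() is made up for illustration, and I am not asserting here what Capstone prints for these bytes.

```python
import struct

# The four bytes exactly as od shows them in the object file.
ON_DISK = bytes.fromhex("a4 10 80 4f")

# Interpreting them as a little-endian word gives the value objdump displays.
word_le = struct.unpack("<I", ON_DISK)[0]
# Interpreting them as big-endian gives a different word.
word_be = struct.unpack(">I", ON_DISK)[0]

print(f"little-endian word: {word_le:#010x}")
print(f"big-endian word:    {word_be:#010x}")


def try_capstone(code):
    """Disassemble code in little-endian AArch64 mode, or return None
    if the capstone Python bindings are not available (hypothetical helper)."""
    try:
        from capstone import Cs, CS_ARCH_ARM64, CS_MODE_ARM
    except ImportError:
        return None
    md = Cs(CS_ARCH_ARM64, CS_MODE_ARM)  # no CS_MODE_BIG_ENDIAN added
    return [(insn.mnemonic, insn.op_str) for insn in md.disasm(code, 0)]


print(try_capstone(ON_DISK))
```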

Let's look at x86_64.

$ cat test.c

void dummy(void) {
  int a = 0;
  int b = 1;
  int c = 3;
  if (a > b && b < c) {
    int d = a + b + c;
  }
}

0000000000000000 <dummy>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	c7 45 fc 00 00 00 00 	movl   $0x0,-0x4(%rbp)
   b:	c7 45 f8 01 00 00 00 	movl   $0x1,-0x8(%rbp)
  12:	c7 45 f4 03 00 00 00 	movl   $0x3,-0xc(%rbp)
  19:	8b 45 fc             	mov    -0x4(%rbp),%eax
  1c:	3b 45 f8             	cmp    -0x8(%rbp),%eax
  1f:	7e 18                	jle    39 <dummy+0x39>
  21:	8b 45 f8             	mov    -0x8(%rbp),%eax
  24:	3b 45 f4             	cmp    -0xc(%rbp),%eax
  27:	7d 10                	jge    39 <dummy+0x39>
  29:	8b 45 f8             	mov    -0x8(%rbp),%eax
  2c:	8b 55 fc             	mov    -0x4(%rbp),%edx
  2f:	01 c2                	add    %eax,%edx
  31:	8b 45 f4             	mov    -0xc(%rbp),%eax
  34:	01 d0                	add    %edx,%eax
  36:	89 45 f0             	mov    %eax,-0x10(%rbp)
  39:	5d                   	pop    %rbp
  3a:	c3                   	retq

$ od -t x1 -j 68 -N 7 test.o
0000104 c7 45 fc 00 00 00 00
0000113

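It seems that objdump works differently for x86_64 and aarch64: on x86_64 it prints the instruction bytes in the order they are stored, while on aarch64 it prints each instruction as one 32-bit word.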
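
The two display conventions can be mimicked in a few lines of Python. This is only a sketch of what the output above looks like, not of how objdump is implemented, and both function names are made up for illustration.

```python
def dump_bytes(code):
    """x86_64-style display: every byte in the order it is stored."""
    return " ".join(f"{b:02x}" for b in code)


def dump_words_le(code):
    """AArch64-style display: each 4-byte group shown as one 32-bit
    word, with the stored bytes interpreted as little endian."""
    return " ".join(
        f"{int.from_bytes(code[i:i + 4], 'little'):08x}"
        for i in range(0, len(code), 4)
    )


x86_insn = bytes.fromhex("c7 45 fc 00 00 00 00")  # movl $0x0,-0x4(%rbp)
a64_insn = bytes.fromhex("a4 10 80 4f")           # fmla v4.4s, v5.4s, v0.s[0]

print(dump_bytes(x86_insn))
print(dump_words_le(a64_insn))
```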

david
