ElBlo

Harry Potter and the Cursed Assembler

June 2, 2022

GNU as (gas), the GNU assembler, behaves very strangely when you specify operand sizes in Intel syntax.

.global _start
.intel_syntax noprefix
_start:
    mov  al, BYTE [rdi]
    mov  ax, WORD [rdi]
    mov  al, BYTE PTR [rdi]
    mov  ax, WORD PTR [rdi]

gas assembles¹ it into:

_start:
 mov    al,BYTE PTR [rdi+0x1]
 mov    ax,WORD PTR [rdi+0x2]
 mov    al,BYTE PTR [rdi]
 mov    ax,WORD PTR [rdi]

Why is that? Because gas treats BYTE, WORD, DWORD, and QWORD as special symbols that are replaced with 1, 2, 4, and 8, respectively, when they are not followed by PTR. This means that mov al, BYTE [rdi] is interpreted as mov al, 1[rdi], which gas treats the same as mov al, [rdi + 1].
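The same substitution should apply to the wider sizes. The snippet below is a minimal sketch of what that rule implies for DWORD and QWORD. The comments show the expected reading, which follows from the values listed above and was not checked against a particular gas version.

```
.intel_syntax noprefix
    # Without PTR: the size keyword is assumed to become a plain displacement.
    mov  eax, DWORD [rdi]       # expected reading: mov eax, 4[rdi], i.e. [rdi + 4]
    mov  rax, QWORD [rdi]       # expected reading: mov rax, 8[rdi], i.e. [rdi + 8]

    # With PTR: the keyword only sets the operand size; the address stays [rdi].
    mov  eax, DWORD PTR [rdi]
    mov  rax, QWORD PTR [rdi]
```

Always writing PTR after the size keyword avoids the surprise, and the register name already fixes the operand size in these loads.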

If you are familiar with nasm, you would probably write the first two instructions and get bitten by this issue. Luckily, LLVM’s assembler rejects this input:

<source>:4:19: error: Expected 'PTR' or 'ptr' token!
    mov  al, BYTE [rdi]
                  ^

¹ Play in godbolt