How to disassemble a binary executable in Linux to get the assembly code?

c++ linux assembly executable disassembly

I don't think gcc has a flag for it, since it's primarily a compiler, but another of the GNU development tools does. objdump takes a -d/--disassemble flag:

$ objdump -d /path/to/binary

The disassembly looks like this:

080483b4 <main>:
 80483b4:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80483b8:   83 e4 f0                and    $0xfffffff0,%esp
 80483bb:   ff 71 fc                pushl  -0x4(%ecx)
 80483be:   55                      push   %ebp
 80483bf:   89 e5                   mov    %esp,%ebp
 80483c1:   51                      push   %ecx
 80483c2:   b8 00 00 00 00          mov    $0x0,%eax
 80483c7:   59                      pop    %ecx
 80483c8:   5d                      pop    %ebp
 80483c9:   8d 61 fc                lea    -0x4(%ecx),%esp
 80483cc:   c3                      ret
 80483cd:   90                      nop
 80483ce:   90                      nop
 80483cf:   90                      nop
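On a real binary, objdump -d prints every function, which gets long quickly. Here is a minimal sketch of a filter that keeps only one function. The helper name extract_func is made up for illustration, and it assumes GNU objdump's layout, where each function starts with an "address <symbol>:" header and functions are separated by blank lines.

```shell
#!/bin/sh
# Hypothetical helper (not part of binutils): read `objdump -d` output on
# stdin and print only the block belonging to the symbol named in $1.
extract_func() {
    # RS= puts awk in paragraph mode, so each blank-line-separated block
    # (one function in objdump's output) is handled as a single record.
    # Only the header line contains " <symbol>:", so index() finds it.
    awk -v RS= -v sym="$1" 'index($0, " <" sym ">:") { print }'
}

# Typical use (assumes ./a.out exists and objdump is installed):
#   objdump -d ./a.out | extract_func main
#   objdump -d -Mintel ./a.out | extract_func main   # Intel syntax
```

Newer binutils releases can also do this directly with objdump --disassemble=SYMBOL, if your version supports it.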


An interesting alternative to objdump is gdb. You don't have to run the binary or have debuginfo.

$ gdb -q ./a.out
Reading symbols from ./a.out...(no debugging symbols found)...done.
(gdb) info functions
All defined functions:
Non-debugging symbols:
0x00000000004003a8  _init
0x00000000004003e0  __libc_start_main@plt
0x00000000004003f0  __gmon_start__@plt
0x0000000000400400  _start
0x0000000000400430  deregister_tm_clones
0x0000000000400460  register_tm_clones
0x00000000004004a0  __do_global_dtors_aux
0x00000000004004c0  frame_dummy
0x00000000004004f0  fce
0x00000000004004fb  main
0x0000000000400510  __libc_csu_init
0x0000000000400580  __libc_csu_fini
0x0000000000400584  _fini
(gdb) disassemble main
Dump of assembler code for function main:
   0x00000000004004fb <+0>:     push   %rbp
   0x00000000004004fc <+1>:     mov    %rsp,%rbp
   0x00000000004004ff <+4>:     sub    $0x10,%rsp
   0x0000000000400503 <+8>:     callq  0x4004f0 <fce>
   0x0000000000400508 <+13>:    mov    %eax,-0x4(%rbp)
   0x000000000040050b <+16>:    mov    -0x4(%rbp),%eax
   0x000000000040050e <+19>:    leaveq
   0x000000000040050f <+20>:    retq
End of assembler dump.
(gdb) disassemble fce
Dump of assembler code for function fce:
   0x00000000004004f0 <+0>:     push   %rbp
   0x00000000004004f1 <+1>:     mov    %rsp,%rbp
   0x00000000004004f4 <+4>:     mov    $0x2a,%eax
   0x00000000004004f9 <+9>:     pop    %rbp
   0x00000000004004fa <+10>:    retq
End of assembler dump.
(gdb)
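If you only want the listing and not an interactive session, gdb can run a command and exit. This is a small sketch; the wrapper name gdb_disas is made up, and it assumes gdb is on your PATH.

```shell
#!/bin/sh
# Hypothetical wrapper: print the disassembly of one function without
# entering the interactive (gdb) prompt.
#   $1 = path to the binary, $2 = function name
gdb_disas() {
    binary=$1
    func=$2
    # -batch runs the -ex command(s) and then exits;
    # the program itself is never started.
    gdb -batch -ex "disassemble $func" "$binary"
}

# Typical use (assumes ./a.out exists):
#   gdb_disas ./a.out main
```

Because the output goes to stdout, it can be piped to less or redirected to a file like any other command.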

With full debugging info it's even better.

(gdb) disassemble /m main
Dump of assembler code for function main:
9       {
   0x00000000004004fb <+0>:     push   %rbp
   0x00000000004004fc <+1>:     mov    %rsp,%rbp
   0x00000000004004ff <+4>:     sub    $0x10,%rsp
10        int x = fce ();
   0x0000000000400503 <+8>:     callq  0x4004f0 <fce>
   0x0000000000400508 <+13>:    mov    %eax,-0x4(%rbp)
11        return x;
   0x000000000040050b <+16>:    mov    -0x4(%rbp),%eax
12      }
   0x000000000040050e <+19>:    leaveq
   0x000000000040050f <+20>:    retq
End of assembler dump.
(gdb)

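objdump has a similar option, -S (--source), which interleaves source lines with the disassembly when the binary was built with debug info.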
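To try the source-interleaved output end to end, something like the following sketch should work. The function name source_disasm_demo and the file names are made up, the C program just mirrors the fce/main example from the gdb session, and it assumes gcc and objdump are installed.

```shell
#!/bin/sh
# Hypothetical demo: build a tiny C program with debug info, then print
# its disassembly mixed with the source lines.
source_disasm_demo() {
    dir=$(mktemp -d) || return 1
    cat > "$dir/demo.c" <<'EOF'
int fce(void) { return 42; }
int main(void) { int x = fce(); return x; }
EOF
    # -g emits the line-number info that objdump -S relies on;
    # -O0 keeps the instructions close to source order.
    gcc -g -O0 -o "$dir/demo" "$dir/demo.c" &&
        objdump -d -S "$dir/demo"
    status=$?
    rm -rf "$dir"
    return "$status"
}

# Typical use:
#   source_disasm_demo | less
```

Without -g the same command still disassembles the binary, just without the source lines mixed in.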


This answer is specific to x86. Portable tools that can disassemble AArch64, MIPS, or whatever machine code include objdump and llvm-objdump.

Agner Fog's disassembler, objconv, is quite nice. It adds comments to the disassembly output that flag performance problems (like the dreaded LCP stall from instructions with 16-bit immediate constants).

objconv  -fyasm a.out /dev/stdout | less

(It doesn't recognize - as shorthand for stdout, and defaults to outputting to a file of similar name to the input file, with .asm tacked on.)

It also adds branch targets to the code. Other disassemblers usually disassemble jump instructions with just a numeric destination, and don't put any marker at a branch target to help you find the top of loops and so on.
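If you are stuck with objdump, you can approximate the branch-target markers with a small filter. This is only a rough sketch: the name mark_branch_targets and the .L_ label prefix are made up, it assumes GNU objdump's tab-separated "address, bytes, instruction" layout, and it only looks at direct jumps (mnemonics starting with j), not calls or indirect jumps.

```shell
#!/bin/sh
# Hypothetical filter: read `objdump -d` output on stdin and print a
# ".L_<address>:" marker line before every instruction that is the
# destination of a direct jump elsewhere in the listing.
mark_branch_targets() {
    awk -F'\t' '
        { line[NR] = $0 }
        # Instruction lines look like "  addr:<TAB>bytes<TAB>mnemonic operands".
        # For a direct jump, the first operand is the hex target address.
        NF >= 3 && $3 ~ /^j[a-z]+[ ]+[0-9a-f]+/ {
            split($3, part, /[ ]+/)
            target[part[2]] = 1
        }
        END {
            for (i = 1; i <= NR; i++) {
                addr = line[i]
                sub(/^[ ]+/, "", addr)   # drop the leading indentation
                sub(/:.*/, "", addr)     # keep only the address field
                if (addr in target) print ".L_" addr ":"
                print line[i]
            }
        }'
}

# Typical use (assumes ./a.out exists and objdump is installed):
#   objdump -d ./a.out | mark_branch_targets | less
```

Unlike objconv's output, the result is for reading only and is not meant to be assembled again.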

It also indicates NOPs more clearly than other disassemblers (making it clear when there's padding, rather than disassembling it as just another instruction.)

It's open source, and easy to compile for Linux. It can disassemble into NASM, YASM, MASM, or GNU (AT&T) syntax.

Sample output:

; Filling space: 0FH
; Filler type: Multi-byte NOP
;       db 0FH, 1FH, 44H, 00H, 00H, 66H, 2EH, 0FH
;       db 1FH, 84H, 00H, 00H, 00H, 00H, 00H
ALIGN   16
foo:    ; Function begin
        cmp     rdi, 1                                  ; 00400620 _ 48: 83. FF, 01
        jbe     ?_026                                   ; 00400624 _ 0F 86, 00000084
        mov     r11d, 1                                 ; 0040062A _ 41: BB, 00000001
?_020:  mov     r8, r11                                 ; 00400630 _ 4D: 89. D8
        imul    r8, r11                                 ; 00400633 _ 4D: 0F AF. C3
        add     r8, rdi                                 ; 00400637 _ 49: 01. F8
        cmp     r8, 3                                   ; 0040063A _ 49: 83. F8, 03
        jbe     ?_029                                   ; 0040063E _ 0F 86, 00000097
        mov     esi, 1                                  ; 00400644 _ BE, 00000001
; Filling space: 7H
; Filler type: Multi-byte NOP
;       db 0FH, 1FH, 80H, 00H, 00H, 00H, 00H
ALIGN   8
?_021:  add     rsi, rsi                                ; 00400650 _ 48: 01. F6
        mov     rax, rsi                                ; 00400653 _ 48: 89. F0
        imul    rax, rsi                                ; 00400656 _ 48: 0F AF. C6
        shl     rax, 2                                  ; 0040065A _ 48: C1. E0, 02
        cmp     r8, rax                                 ; 0040065E _ 49: 39. C0
        jnc     ?_021                                   ; 00400661 _ 73, ED
        lea     rcx, [rsi+rsi]                          ; 00400663 _ 48: 8D. 0C 36
...

Note that this output is ready to be assembled back into an object file, so you can tweak the code at the asm source level, rather than with a hex-editor on the machine code. (So you aren't limited to keeping things the same size.) With no changes, the result should be near-identical. It might not be, though, since disassembly of stuff like

  (from /lib/x86_64-linux-gnu/libc.so.6)
SECTION .plt    align=16 execute                        ; section number 11, code
?_00001:; Local function
        push    qword [rel ?_37996]                     ; 0001F420 _ FF. 35, 003A4BE2(rel)
        jmp     near [rel ?_37997]                      ; 0001F426 _ FF. 25, 003A4BE4(rel)
...
    ALIGN   8
?_00002:jmp     near [rel ?_37998]                      ; 0001F430 _ FF. 25, 003A4BE2(rel)
; Note: Immediate operand could be made smaller by sign extension
        push    11                                      ; 0001F436 _ 68, 0000000B
; Note: Immediate operand could be made smaller by sign extension
        jmp     ?_00001                                 ; 0001F43B _ E9, FFFFFFE0

doesn't have anything in the source to make sure it assembles to the longer encoding that leaves room for relocations to rewrite it with a 32-bit offset.

If you don't want to install objconv, GNU binutils' objdump -Mintel -d is very usable, and will already be installed if you have a normal Linux gcc setup.
