The world's smallest Mach-O executable!

written by mfazekas on April 4th, 2009 @ 06:44 PM

Amit Singh, the author of the excellent Mac OS X Internals: A Systems Approach, posted an article about creating a minimal Mach-O (OS X) executable: Crafting a Tiny Mach-O Executable. In the article he demonstrated a 165-byte Mach-O executable that returns the exit code 42. As he noted, there are some zeros in his program, so we should be able to compress it further. This article is my attempt to reduce the executable size a bit.

Strategy

This is the 165-byte executable from Amit's article:

; tiny.asm for Mac OS X (Mach-O Object File Format)
; nasm -f bin -o tiny tiny.asm

BITS 32
  org   0x1000

  db    0xce, 0xfa, 0xed, 0xfe       ; magic
  dd    7                            ; cputype (CPU_TYPE_X86)
  dd    3                            ; cpusubtype (CPU_SUBTYPE_I386_ALL)
  dd    2                            ; filetype (MH_EXECUTE)
  dd    2                            ; ncmds
  dd    _start - _cmds               ; sizeofcmds
  dd    0                            ; flags
_cmds:
  dd    1                            ; cmd (LC_SEGMENT)
  dd    44                           ; cmdsize
  db    "__TEXT"                     ; segname
  db    0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; segname
  dd    0x1000                       ; vmaddr
  dd    0x1000                       ; vmsize
  dd    0                            ; fileoff
  dd    filesize                     ; filesize
  dd    7                            ; maxprot

  dd    5                            ; cmd (LC_UNIXTHREAD)
  dd    80                           ; cmdsize
  dd    1                            ; flavor (i386_THREAD_STATE)
  dd    16                           ; count (i386_THREAD_STATE_COUNT)
  dd    0, 0, 0, 0, 0, 0, 0, 0       ; state
  dd    0, 0, _start, 0, 0, 0, 0, 0  ; state
_start:
  xor   eax,eax
  inc   eax
  push  byte 42
  sub   esp, 4
  int   0x80                         ; _exit(42)

filesize equ  $ - $$

So we have a mach_header followed by 2 load commands, then the executable code at _start. The total size is 165 bytes: the header with the 2 load commands takes 152 bytes and the code takes only 13 bytes. If we can embed the code into the load commands (there are a lot of zeroes), then we’ll be able to save these 13 bytes.
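The 152 + 13 split can be checked with a little shell arithmetic. This is only a sketch: the variable names are mine, and the field counts are read off the listing above (the LC_SEGMENT command as written there stops at maxprot, hence 44 bytes).

```shell
# Byte budget of the 165-byte layout, read off the listing (names are illustrative).
mach_header=$((7 * 4))               # magic .. flags: seven 32-bit fields
lc_segment=$((2 * 4 + 16 + 5 * 4))   # cmd, cmdsize, 16-byte segname, vmaddr .. maxprot
lc_unixthread=$((4 * 4 + 16 * 4))    # cmd, cmdsize, flavor, count, 16 registers
code=13                              # size of the code as quoted in the article

headers=$((mach_header + lc_segment + lc_unixthread))
total=$((headers + code))
echo "headers=$headers code=$code total=$total"
```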

Test rig

I’m using the following shell script as a test framework. It assembles the source, runs the executable and checks its exit code:

#!/bin/sh
nasm -f bin -o tiny tiny.asm
ls -las  ./tiny
./tiny
if [ $? -eq 42 ] ; then
  echo "OK" 
else
  echo "Fail" 
fi

The compressed code

So the plan is to place the 13 bytes of code into one of the unused runs of zeroes in the load commands. There are 2 runs of zeroes we can reuse. One is in LC_SEGMENT, after the __TEXT segment name. We need a single zero to keep the name a correctly null-terminated C string, but the others can be reused: that gives 9 bytes of space. There is a longer run of zeroes in LC_UNIXTHREAD. These zeroes represent the initial state of the registers, and it’s safe to change them to non-zero values. The slot holding _start has to stay, which leaves 40 bytes of free space before it and 20 bytes after it.
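The 9, 40 and 20 byte figures follow directly from the listing; a rough sketch of the arithmetic, with variable names of my own choosing:

```shell
# Free space in the two runs of zeroes, in bytes (names are illustrative).
segname_size=16                      # fixed width of the segname field
name_len=6                           # length of "__TEXT"
free_segname=$((segname_size - name_len - 1))   # keep one NUL terminator

regs=16                              # the count field in the listing
entry_index=10                       # position of _start among the 16 dwords, counting from 0
free_before=$((entry_index * 4))
free_after=$(( (regs - entry_index - 1) * 4 ))
echo "segname=$free_segname before=$free_before after=$free_after"
```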

The simple solution would be to place the 13 bytes of code into the 40-byte or the 20-byte gap. But that would be too simple. Instead I’ll place the first 4 bytes of code plus a 5-byte jmp into LC_SEGMENT, right after the null-terminated string __TEXT, and I’ll put the remaining 9 bytes of code into the zeroes of the thread state.
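A quick sanity check that both halves fit, assuming the instruction sizes quoted in the listing comments (the variable names are hypothetical):

```shell
# Does each half of the code fit in its gap? Sizes are the ones quoted in the article.
part1=$((2 + 2 + 5))                 # xor eax,eax + push byte 42 + jmp
part2=$((1 + 6 + 2))                 # inc eax + sub esp,4 + int 0x80
gap_segname=9                        # bytes free after "__TEXT" and its NUL
gap_state=40                         # bytes free at the start of the thread state
pad=$(( (4 - part2 % 4) % 4 ))       # zero bytes needed to get back to a dword boundary

[ "$part1" -le "$gap_segname" ] && echo "part 1 fits ($part1 of $gap_segname bytes)"
[ "$part2" -le "$gap_state" ] && echo "part 2 fits ($part2 of $gap_state bytes, $pad bytes of padding)"
```

The first half fills its gap exactly, so there is no slack: if the assembler picks a shorter encoding for the jmp, the fields that follow shift.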

So here is the code for the 152-byte executable:


; tiny.asm for Mac OS X (Mach-O Object File Format)
; nasm -f bin -o tiny tiny.asm

BITS 32
    org   0x1000

    db    0xce, 0xfa, 0xed, 0xfe       ; magic
    dd    7                            ; cputype (CPU_TYPE_X86)
    dd    3                            ; cpusubtype (CPU_SUBTYPE_I386_ALL)
    dd    2                            ; filetype (MH_EXECUTE)
    dd    2                            ; ncmds
    dd    _cmdend - _cmds              ; sizeofcmds
    dd    0                            ; flags
_cmds:
    dd    1                            ; cmd (LC_SEGMENT)
    dd    44                           ; cmdsize
    db    "__TEXT"                     ; segname
    db    0                            ; segname

_start0:                               ; first part of the code
    xor   eax,eax                      ; 2 bytes
    push  byte 42                      ; 2 bytes
    jmp   near _start2                 ; 5 bytes (near keeps the rel32 form)

    dd    0x1000                       ; vmaddr
    dd    0x1000                       ; vmsize
    dd    0                            ; fileoff
    dd    filesize                     ; filesize
    dd    7                            ; maxprot

    dd    5                            ; cmd (LC_UNIXTHREAD)
    dd    80                           ; cmdsize
    dd    1                            ; flavor (i386_THREAD_STATE)
    dd    16                           ; count (i386_THREAD_STATE_COUNT)

_start2:                               ; second part
    inc   eax                          ; 1 byte
    sub   esp, strict dword 4          ; 6 bytes (strict keeps the imm32 form)
    int   0x80                         ; 2 bytes

    db    0, 0, 0
    dd    0, 0, 0, 0, 0                 ; state 
    dd    0, 0, _start0, 0, 0, 0, 0, 0  ; state
_cmdend:

filesize equ $ - $$
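Since the whole layout depends on the assembler emitting exactly the byte counts noted in the comments, it may be worth checking the size of the output as well as the exit code. A minimal sketch, where check_size is a hypothetical helper and not part of the original test rig:

```shell
# check_size FILE EXPECTED: succeed only if FILE is exactly EXPECTED bytes long.
# (Hypothetical helper, not part of the original test rig.)
check_size() {
  actual=$(wc -c < "$1" | tr -d ' ')
  if [ "$actual" -eq "$2" ]; then
    echo "OK: $1 is $actual bytes"
  else
    echo "Fail: $1 is $actual bytes, expected $2"
    return 1
  fi
}
```

After assembling, it would be called as check_size ./tiny 152.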

Is it possible to compress the size further?

An even smaller Mach-O executable would require smaller load commands. But just reducing the size of the LC_UNIXTHREAD command and its cmdsize field caused a “Malformed Mach-o file” error, so the trivial way doesn’t work.

Leave a comment if you have an idea for further space optimizations!
