Portable Executable (PE) Format

Portable Executable (PE) is the file format that Windows uses for executable files such as .exe, .dll, and .cpl. It is based on COFF (Common Object File Format).

A PE file is a data structure that holds the information the OS loader needs to load the executable into memory and run it.

This article gives a basic overview of the PE structure. Understanding it is useful for reverse engineering in general, not just for analyzing malware binaries.

Note

The examples in this article are taken from an arbitrary executable file opened in PE-bear, a PE analysis tool.

Code examples are from winnt.h, a Windows SDK header file. You can get it by installing the Windows SDK, for example as part of Visual Studio.

Structure

DOS Header

The DOS header is the first 64 bytes of every PE file. Its most important fields are:
e_magic - Every PE file starts with the 2-byte magic number 0x5A4D ("MZ" in ASCII), which is used to verify that the file is a valid executable. In the screenshot below the bytes appear in reverse order (4D 5A) because the value is stored in little-endian byte order.
e_lfanew - These 4 bytes, at offset 0x3C, contain the file offset of the NT headers. The Windows loader reads this value to skip the DOS stub and go directly to the NT headers.
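
The two fields above can be read with a few lines of Python. This is a minimal sketch rather than a full parser: the function name parse_dos_header and the synthetic sample buffer are made up for illustration, and only e_magic and e_lfanew are inspected.

```python
import struct

DOS_HEADER_SIZE = 64      # size of IMAGE_DOS_HEADER
E_LFANEW_OFFSET = 0x3C    # e_lfanew occupies the last 4 bytes of the header


def parse_dos_header(data: bytes) -> int:
    """Return e_lfanew (file offset of the NT headers) or raise ValueError."""
    if len(data) < DOS_HEADER_SIZE:
        raise ValueError("file is too small to hold a DOS header")
    # "<H" = little-endian unsigned 16-bit: the bytes 4D 5A decode to 0x5A4D.
    (e_magic,) = struct.unpack_from("<H", data, 0)
    if e_magic != 0x5A4D:
        raise ValueError("missing MZ magic number")
    # "<I" = little-endian unsigned 32-bit (winnt.h declares the field as LONG).
    (e_lfanew,) = struct.unpack_from("<I", data, E_LFANEW_OFFSET)
    return e_lfanew


# Synthetic 64-byte header for illustration: "MZ", zero padding, e_lfanew = 0x80.
sample = b"MZ" + bytes(58) + struct.pack("<I", 0x80)
print(hex(parse_dos_header(sample)))  # prints 0x80
```

For a real file, the first 64 bytes read from disk would be passed in place of sample.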

DOS Stub

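It usually contains the message "This program cannot be run in DOS mode". The stub is a small DOS program that serves as a fallback: when the file is started on an older DOS system, which cannot process PE files, it prints this message and exits.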

NT Headers

The NT headers are located at the file offset stored in e_lfanew.

Signature - Used to check the validity of the structure. It has the value 0x4550 (the ASCII characters "PE", followed by two null bytes).
File Header - Contains information about the structure of the whole file, such as the machine type of the executable code, a time stamp, a pointer to the symbol table, and various flags. The machine type tells you whether the executable targets 32-bit x86 (value 0x014C) or 64-bit x64 (value 0x8664).
Optional Header - Despite its name, this header is not optional for executable images. It adds important information to the File Header: another magic number that tells whether the file is 32-bit (0x10B, PE32) or 64-bit (0x20B, PE32+), the subsystem the program runs under, the preferred base address, and security flags. Another important part is the data directories, which locate the import, export, and resource tables, among others. These tables describe the API functions the executable imports, the functions it exports, and strings and other static resources.
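
The checks described above can be sketched in Python as follows. The function name parse_nt_headers, the returned dictionary keys, and the synthetic fragment are assumptions made for illustration. The field layout follows IMAGE_FILE_HEADER in winnt.h, and the machine values are those of IMAGE_FILE_MACHINE_I386 (0x014C) and IMAGE_FILE_MACHINE_AMD64 (0x8664).

```python
import struct

# Machine values of IMAGE_FILE_MACHINE_I386 / IMAGE_FILE_MACHINE_AMD64.
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}
# Values of the Magic field at the start of the optional header.
OPTIONAL_MAGIC = {0x10B: "PE32 (32-bit)", 0x20B: "PE32+ (64-bit)"}

FILE_HEADER = struct.Struct("<HHIIIHH")  # IMAGE_FILE_HEADER, 20 bytes


def parse_nt_headers(data: bytes, e_lfanew: int) -> dict:
    """Read the signature, the file header, and the optional header's magic."""
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    file_header_offset = e_lfanew + 4
    (machine, number_of_sections, _timestamp, _symbol_table, _symbol_count,
     size_of_optional_header, _characteristics) = FILE_HEADER.unpack_from(
        data, file_header_offset)
    optional_offset = file_header_offset + FILE_HEADER.size
    (magic,) = struct.unpack_from("<H", data, optional_offset)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": number_of_sections,
        "format": OPTIONAL_MAGIC.get(magic, hex(magic)),
        # The section table starts right after the optional header.
        "section_table_offset": optional_offset + size_of_optional_header,
    }


# Synthetic fragment: signature, file header, first field of the optional header.
fragment = (b"PE\x00\x00"
            + FILE_HEADER.pack(0x8664, 3, 0, 0, 0, 240, 0x0022)
            + struct.pack("<H", 0x20B))
info = parse_nt_headers(fragment, 0)
print(info["machine"], "|", info["format"], "|", info["sections"], "sections")
```

With a real file, e_lfanew would come from the DOS header instead of being 0 as in this fragment.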

Section Header

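The section headers form an array with one entry per section. Each entry records the section's name, its location and size both in the file and in memory, and its characteristics flags.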
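
As an illustration, the array can be walked like this. It is a sketch under assumptions: parse_section_table and the sample table are hypothetical, the 40-byte layout follows IMAGE_SECTION_HEADER in winnt.h, and only the three IMAGE_SCN_MEM_* permission flags are decoded.

```python
import struct

# IMAGE_SECTION_HEADER: Name[8], VirtualSize, VirtualAddress, SizeOfRawData,
# PointerToRawData, PointerToRelocations, PointerToLinenumbers,
# NumberOfRelocations, NumberOfLinenumbers, Characteristics (40 bytes).
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def parse_section_table(data: bytes, offset: int, count: int) -> list:
    """Decode `count` consecutive section headers starting at `offset`."""
    sections = []
    for index in range(count):
        fields = SECTION_HEADER.unpack_from(
            data, offset + index * SECTION_HEADER.size)
        flags = fields[9]
        permissions = "".join(
            letter if flags & bit else "-"
            for letter, bit in (("r", IMAGE_SCN_MEM_READ),
                                ("w", IMAGE_SCN_MEM_WRITE),
                                ("x", IMAGE_SCN_MEM_EXECUTE)))
        sections.append({
            # Names are padded with null bytes up to 8 characters.
            "name": fields[0].rstrip(b"\x00").decode("ascii", errors="replace"),
            "virtual_size": fields[1],
            "virtual_address": fields[2],
            "raw_size": fields[3],
            "raw_offset": fields[4],
            "permissions": permissions,
        })
    return sections


# Synthetic two-entry table with flag values typical for code and data sections.
table = (SECTION_HEADER.pack(b".text", 0x1000, 0x1000, 0x1000, 0x400,
                             0, 0, 0, 0, 0x60000020)
         + SECTION_HEADER.pack(b".data", 0x200, 0x2000, 0x200, 0x1400,
                               0, 0, 0, 0, 0xC0000040))
for section in parse_section_table(table, 0, 2):
    print(section["name"], section["permissions"], hex(section["virtual_address"]))
```

In a real file the table starts immediately after the optional header, and the number of entries comes from the NumberOfSections field of the File Header.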

Sections

.text - Contains the executable code. This section includes all compiled instructions that the processor will execute. The section is typically marked as executable and read-only for security purposes.

.data - Contains initialized global data. This includes variables with initial values that the program requires during execution. The section is marked as readable and writable.

.rdata - Contains read-only data, including import and export tables. It stores constant data, string literals, and critical tables that support dynamic linking functionality.

.rsrc - Contains resources such as icons, images, and strings. This section organizes resources in a hierarchical structure that applications can access during runtime.

.reloc - Contains the relocation table, which the loader uses to recalculate addresses when the executable is not loaded at its preferred base address.

.tls (Thread Local Storage) - Contains thread-specific data. Each thread gets its own copy of the variables stored here.

This list of sections is not exhaustive; it only covers the most common ones.
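
To make the role of .reloc more concrete, the address recalculation can be sketched as below. This is a simplified illustration, not the loader's actual code: apply_rebase and fixup_offsets are hypothetical names, parsing of the relocation blocks themselves is omitted, and every fixup is assumed to be a 64-bit absolute address (the IMAGE_REL_BASED_DIR64 type).

```python
import struct

MASK_64 = 0xFFFFFFFFFFFFFFFF


def apply_rebase(image: bytearray, fixup_offsets, preferred_base: int,
                 actual_base: int) -> int:
    """Add the load delta to every 64-bit address listed in fixup_offsets."""
    delta = actual_base - preferred_base
    for offset in fixup_offsets:
        (address,) = struct.unpack_from("<Q", image, offset)
        # Wrap to 64 bits so a negative delta still packs correctly.
        struct.pack_into("<Q", image, offset, (address + delta) & MASK_64)
    return delta


# Synthetic image: one absolute pointer stored at offset 8.
preferred = 0x140000000
actual = 0x7FF600000000
image = bytearray(16)
struct.pack_into("<Q", image, 8, preferred + 0x1000)

apply_rebase(image, [8], preferred, actual)
print(hex(struct.unpack_from("<Q", image, 8)[0]))  # prints 0x7ff600001000
```

If the image is loaded at its preferred base address, the delta is zero and nothing needs to be patched.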

When analyzing a PE file, keep in mind that malicious executables can have unusually small or large headers or sections. For example, an unusually large header can be a sign of obfuscation, and a small or empty import table can be a sign that libraries are loaded dynamically at run time, which is common in malware.

Loading