PE Import Table Parser

The Portable Executable (PE) format

The Portable Executable (PE) format is the standard file format used by Windows for executables such as .exe and .dll. It was introduced with Windows NT (1993) and became the unified layout for both 32-bit and 64-bit programs.

A solid understanding of PE internals is a prerequisite for building security tooling on Windows, including antivirus engines, static analyzers, unpackers, and incident response utilities.

PE at a glance

A typical PE file contains:

  • DOS header (IMAGE_DOS_HEADER) for legacy compatibility and for locating the NT header.
  • NT headers (IMAGE_NT_HEADERS32/64), which include:
    • IMAGE_FILE_HEADER
    • IMAGE_OPTIONAL_HEADER32/64 (the name is historical; the header is effectively mandatory for PE images)
  • Section table (array of IMAGE_SECTION_HEADER)
  • Sections such as .text, .rdata, .data, .rsrc, etc.
  • Data directories (import, export, resources, relocations, debug, TLS, …)

Most PE-related structures and constants are declared in WinNT.h.

IMAGE_DOS_HEADER

IMAGE_DOS_HEADER is a legacy structure for MS-DOS compatibility. In practice, the two most important fields are:

  • e_magic: Signature that identifies the DOS header. For PE files it must be 0x5A4D, which corresponds to ASCII "MZ" (IMAGE_DOS_SIGNATURE).
  • e_lfanew: File offset (from the beginning of the file) to the NT headers (IMAGE_NT_HEADERS).

typedef struct _IMAGE_DOS_HEADER {
  WORD  e_magic;    // 0x5A4D ("MZ")
  WORD  e_cblp;
  WORD  e_cp;
  WORD  e_crlc;
  WORD  e_cparhdr;
  WORD  e_minalloc;
  WORD  e_maxalloc;
  WORD  e_ss;
  WORD  e_sp;
  WORD  e_csum;
  WORD  e_ip;
  WORD  e_cs;
  WORD  e_lfarlc;
  WORD  e_ovno;
  WORD  e_res[4];
  WORD  e_oemid;
  WORD  e_oeminfo;
  WORD  e_res2[10];
  LONG  e_lfanew;   // file offset of IMAGE_NT_HEADERS (declared as LONG in WinNT.h)
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;
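
As a minimal sketch of how these two fields are used, the function below checks e_magic and then follows e_lfanew to the "PE\0\0" signature in a raw byte buffer. It works on plain byte offsets so that it builds without WinNT.h. The names readAt and findNtHeaders are illustrative and are not taken from the DSAVE project; the offset 0x3C follows from the 30 WORD fields that precede e_lfanew.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy a value of type T from `off`; returns false if the read would go out of bounds.
template <typename T>
bool readAt(const std::vector<uint8_t>& buf, size_t off, T& out) {
  if (off > buf.size() || buf.size() - off < sizeof(T)) return false;
  std::memcpy(&out, buf.data() + off, sizeof(T));  // assumes a little-endian host
  return true;
}

// Returns the file offset of the NT headers, or -1 if the buffer is not a PE image.
int64_t findNtHeaders(const std::vector<uint8_t>& buf) {
  uint16_t magic = 0;
  if (!readAt(buf, 0, magic) || magic != 0x5A4D) return -1;       // "MZ"
  uint32_t lfanew = 0;
  if (!readAt(buf, 0x3C, lfanew)) return -1;                      // e_lfanew
  uint32_t sig = 0;
  if (!readAt(buf, lfanew, sig) || sig != 0x00004550) return -1;  // "PE\0\0"
  return lfanew;
}
```

Every read is bounds-checked because e_lfanew comes from the file and may point anywhere, including past the end of a truncated or hostile sample.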

Image from the article: MZ and e_lfanew

History of the "MZ" signature

In the early 1980s, MS-DOS needed a simple method to distinguish executable files from other file types. The "magic number" MZ, the initials of MS-DOS developer Mark Zbikowski, was chosen and written at the start of executables.

Today this signature remains as part of PE compatibility. It is stored in IMAGE_DOS_HEADER.e_magic as 0x5A4D.

IMAGE_NT_HEADERS (32 and 64)

The NT headers were introduced with Windows NT to standardize executable loading. There are two forms:

typedef struct _IMAGE_NT_HEADERS32 {
  DWORD Signature;
  IMAGE_FILE_HEADER FileHeader;
  IMAGE_OPTIONAL_HEADER32 OptionalHeader;
} IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;
typedef struct _IMAGE_NT_HEADERS64 {
  DWORD Signature;
  IMAGE_FILE_HEADER FileHeader;
  IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;

Image from the article: NT headers layout

IMAGE_FILE_HEADER

IMAGE_FILE_HEADER describes the image at a high level: architecture, number of sections, timestamp, and attributes. As declared in the Microsoft documentation:

typedef struct _IMAGE_FILE_HEADER {
  WORD  Machine;
  WORD  NumberOfSections;
  DWORD TimeDateStamp;
  DWORD PointerToSymbolTable;
  DWORD NumberOfSymbols;
  WORD  SizeOfOptionalHeader;
  WORD  Characteristics;
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;

Key fields:

  • Machine: Target CPU architecture (for example IMAGE_FILE_MACHINE_I386, IMAGE_FILE_MACHINE_AMD64).
  • NumberOfSections: Count of section headers that follow.
  • TimeDateStamp: Seconds since the UNIX epoch at which the file was created. Useful for debugging, versioning, and triage.
  • SizeOfOptionalHeader: Size of the optional header structure in bytes. The section table starts immediately after the optional header.
  • Characteristics: Flags such as:
    • IMAGE_FILE_EXECUTABLE_IMAGE
    • IMAGE_FILE_DLL
    • IMAGE_FILE_32BIT_MACHINE
    • IMAGE_FILE_DEBUG_STRIPPED
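
The fields above can be decoded with a few table lookups. The following is a small sketch, not the project's code: machineName and describeCharacteristics are hypothetical helper names, and the numeric constants are the WinNT.h values repeated locally so the example builds without the Windows SDK.

```cpp
#include <cstdint>
#include <string>

// Values copied from WinNT.h so the sketch builds without the Windows SDK.
constexpr uint16_t kMachineI386  = 0x014C;  // IMAGE_FILE_MACHINE_I386
constexpr uint16_t kMachineAmd64 = 0x8664;  // IMAGE_FILE_MACHINE_AMD64

// Hypothetical helper: map IMAGE_FILE_HEADER.Machine to a short label.
std::string machineName(uint16_t machine) {
  switch (machine) {
    case kMachineI386:  return "x86";
    case kMachineAmd64: return "x64";
    default:            return "other";
  }
}

// Hypothetical helper: list the Characteristics flags named in the text.
std::string describeCharacteristics(uint16_t flags) {
  struct Flag { uint16_t bit; const char* name; };
  static const Flag kFlags[] = {
    {0x0002, "EXECUTABLE_IMAGE"},  // IMAGE_FILE_EXECUTABLE_IMAGE
    {0x0100, "32BIT_MACHINE"},     // IMAGE_FILE_32BIT_MACHINE
    {0x0200, "DEBUG_STRIPPED"},    // IMAGE_FILE_DEBUG_STRIPPED
    {0x2000, "DLL"},               // IMAGE_FILE_DLL
  };
  std::string out;
  for (const Flag& f : kFlags) {
    if (flags & f.bit) {
      if (!out.empty()) out += '|';
      out += f.name;
    }
  }
  return out;
}
```

For example, a 32-bit DLL commonly carries the value 0x2102, which this helper renders as EXECUTABLE_IMAGE|32BIT_MACHINE|DLL.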

IMAGE_OPTIONAL_HEADER (32 and 64)

Despite the name, this header is essential. It contains memory layout information, the entry point, alignments, the subsystem, and the DataDirectory array.
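
To illustrate how the 32-bit and 64-bit variants differ in practice, here is a sketch that reads the import entry of the DataDirectory array from the raw bytes of an optional header. It relies on the documented layout: the Magic field is 0x10B for PE32 and 0x20B for PE32+, the DataDirectory array starts at offset 96 (PE32) or 112 (PE32+), NumberOfRvaAndSizes sits in the 4 bytes just before it, and the import directory is entry 1. The names DataDirectory and readImportDirectory are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct DataDirectory {  // mirrors IMAGE_DATA_DIRECTORY
  uint32_t rva = 0;
  uint32_t size = 0;
};

// `opt` holds the bytes of the optional header, starting at its Magic field.
// Returns false if the header is truncated, unknown, or has no import entry.
bool readImportDirectory(const std::vector<uint8_t>& opt, DataDirectory& out) {
  if (opt.size() < 2) return false;
  uint16_t magic = 0;
  std::memcpy(&magic, opt.data(), sizeof(magic));  // assumes a little-endian host

  size_t dirOff = 0;
  if (magic == 0x10B) dirOff = 96;        // PE32
  else if (magic == 0x20B) dirOff = 112;  // PE32+
  else return false;
  if (opt.size() < dirOff) return false;

  uint32_t count = 0;  // NumberOfRvaAndSizes
  std::memcpy(&count, opt.data() + dirOff - 4, sizeof(count));

  const size_t kImportIndex = 1;  // IMAGE_DIRECTORY_ENTRY_IMPORT
  if (count <= kImportIndex) return false;

  const size_t entryOff = dirOff + kImportIndex * 8;  // each entry is 8 bytes
  if (opt.size() < entryOff + 8) return false;
  std::memcpy(&out.rva, opt.data() + entryOff, 4);
  std::memcpy(&out.size, opt.data() + entryOff + 4, 4);
  return true;
}
```

Checking NumberOfRvaAndSizes before indexing matters because a file may declare fewer than the usual 16 directory entries.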

IMAGE_SECTION_HEADER

A PE is split into sections, each described by IMAGE_SECTION_HEADER. Sections are the fundamental mapping units used by the loader and by tooling. As declared in the Microsoft documentation:

#pragma pack(push, 1)
typedef struct _IMAGE_SECTION_HEADER {
  BYTE  Name[IMAGE_SIZEOF_SHORT_NAME];  // 8
  union {
    DWORD PhysicalAddress;
    DWORD VirtualSize;
  } Misc;
  DWORD VirtualAddress;
  DWORD SizeOfRawData;
  DWORD PointerToRawData;
  DWORD PointerToRelocations;
  DWORD PointerToLinenumbers;
  WORD  NumberOfRelocations;
  WORD  NumberOfLinenumbers;
  DWORD Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;
#pragma pack(pop)

Typical section names:

  • .text: executable code
  • .rdata: read-only data
  • .data: initialized writable data
  • .bss: uninitialized data (often virtual only)
  • .rsrc: resources

Images from the article:

  • Section header overview
  • Sections in a file
  • Section characteristics

VirtualAddress vs PhysicalAddress (file offset)

In a PE, many fields are expressed as an RVA (Relative Virtual Address), which is an address relative to the module base in memory. On disk, data is located by file offsets.

If you need to access something on disk using an RVA, you must convert it using an RVA-to-file-offset translation (often called rva2offset).

Image from the article: RVA vs file offset
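
A worked example makes the arithmetic concrete: the file offset is the RVA minus the section's VirtualAddress plus its PointerToRawData. The sketch below uses a reduced, hypothetical SectionInfo structure instead of IMAGE_SECTION_HEADER so that it is self-contained, and it is stricter than necessary in that it rejects RVAs with no bytes on disk.

```cpp
#include <cstdint>
#include <vector>

struct SectionInfo {  // hypothetical, reduced view of IMAGE_SECTION_HEADER
  uint32_t virtualAddress;    // RVA where the section is mapped
  uint32_t virtualSize;       // size in memory
  uint32_t pointerToRawData;  // file offset of the section data
  uint32_t sizeOfRawData;     // size on disk
};

// Returns the file offset for `rva`, or -1 if no section backs it on disk.
int64_t rvaToFileOffset(const std::vector<SectionInfo>& sections, uint32_t rva) {
  for (const SectionInfo& s : sections) {
    if (rva < s.virtualAddress) continue;
    const uint32_t delta = rva - s.virtualAddress;
    // The RVA must be inside the mapped section and inside its on-disk data.
    if (delta < s.virtualSize && delta < s.sizeOfRawData) {
      return static_cast<int64_t>(s.pointerToRawData) + delta;
    }
  }
  return -1;
}
```

With a section mapped at RVA 0x2000 whose raw data starts at file offset 0x600, the RVA 0x2010 translates to 0x2010 - 0x2000 + 0x600 = 0x610.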

Imports

When an executable depends on external functions (for example from user32.dll or third-party DLLs), it lists those dependencies in the Import Directory.

The Windows loader resolves imports at load time and fills the Import Address Table (IAT) with actual function pointers. Compilers typically generate indirect calls through the IAT.

Image from the article: Import table concept

IMAGE_IMPORT_DIRECTORY (Import Directory Table)

The import directory is a sequence of IMAGE_IMPORT_DESCRIPTOR structures, terminated by an all-zero descriptor. Each descriptor describes one imported DLL.

typedef struct _IMAGE_IMPORT_DESCRIPTOR {
  union {
    DWORD Characteristics;        // 0 for terminating null import descriptor
    DWORD OriginalFirstThunk;     // RVA to unbound IAT (lookup table)
  } DUMMYUNIONNAME;
  DWORD TimeDateStamp;
  DWORD ForwarderChain;
  DWORD Name;                    // RVA to DLL name
  DWORD FirstThunk;              // RVA to IAT (addresses after binding)
} IMAGE_IMPORT_DESCRIPTOR;
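
Reading the descriptor array from a file buffer can be sketched as follows. The ImportDescriptor structure repeats the five DWORD fields locally so the example builds without WinNT.h, and readDescriptors is a hypothetical name. The offset passed in is assumed to be a file offset that has already been translated from the import directory RVA.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct ImportDescriptor {  // same field order and sizes as IMAGE_IMPORT_DESCRIPTOR
  uint32_t originalFirstThunk;  // RVA of the lookup table
  uint32_t timeDateStamp;
  uint32_t forwarderChain;
  uint32_t name;                // RVA of the DLL name string
  uint32_t firstThunk;          // RVA of the IAT
};
static_assert(sizeof(ImportDescriptor) == 20, "descriptor must be 20 bytes");

// Collect descriptors starting at file offset `offset` until the terminator
// (Name == 0) or the end of the buffer is reached.
std::vector<ImportDescriptor> readDescriptors(const std::vector<uint8_t>& file,
                                              size_t offset) {
  std::vector<ImportDescriptor> out;
  while (offset <= file.size() &&
         file.size() - offset >= sizeof(ImportDescriptor)) {
    ImportDescriptor d;
    std::memcpy(&d, file.data() + offset, sizeof(d));  // little-endian host assumed
    if (d.name == 0) break;  // terminating descriptor
    out.push_back(d);
    offset += sizeof(d);
  }
  return out;
}
```

Stopping at the end of the buffer as well as at the terminator keeps the loop safe on files whose import directory is truncated or deliberately left unterminated.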

Visualization from the article:

  • OriginalFirstThunk/FirstThunk tables
  • Imported function name strings

How the import call works

Typical compiler output calls through a cell that the loader fills once:

; x86
call dword ptr [_cell_with_address_of_function]
; x64
call qword ptr [_cell_with_address_of_function]

Image from the article: IAT call

The loader only needs to write the function address into the IAT once, and the code keeps calling through it.
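
The same mechanism can be modelled in portable C++ with a function pointer that is filled in once and then called indirectly. This is only an analogy: realFunction, g_iatSlot, and fakeLoaderBind are invented names standing in for a DLL export, one IAT cell, and the loader's binding step.

```cpp
using ImportedFn = int (*)(int);

// Stands in for a function exported by a DLL.
int realFunction(int x) { return x + 1; }

// Stands in for one IAT cell; empty until the "loader" runs.
ImportedFn g_iatSlot = nullptr;

// Stands in for the loader: it writes the resolved address exactly once.
void fakeLoaderBind() { g_iatSlot = &realFunction; }

// Stands in for compiled code: every call goes through the cell.
int callThroughIat(int x) { return g_iatSlot(x); }
```

Because callers never embed the target address, the code section needs no patching when the DLL is loaded at a different base; only the cell changes.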

C++ import table parser (x86 + x64)

Below is a compact, practical approach:

  • Validate DOS header and PE signature
  • Detect architecture from IMAGE_FILE_HEADER.Machine
  • Locate sections and data directories
  • Convert RVA to file offsets (rva2offset)
  • Walk IMAGE_IMPORT_DESCRIPTOR entries
  • For each imported function, detect import-by-name vs import-by-ordinal

Full repository: https://github.com/SToFU-Systems/DSAVE

alignUp

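// Round value up to the next multiple of align.
DWORD alignUp(DWORD value, DWORD align) {
  if (align == 0) return value;  // malformed header: avoid division by zero
  DWORD mod = value % align;
  return value + (mod ? (align - mod) : 0);
}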

rva2offset

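static constexpr int64_t kRvaError = -1;

// Translate an RVA to a file offset; returns kRvaError if no section contains it.
// IMAGE_NTPE_DATA is not a WinNT.h type; it appears to be the project's own
// structure holding pointers to the parsed headers.
int64_t rva2offset(const IMAGE_NTPE_DATA& ntpe, DWORD rva) {
  // Note: on MSVC, catch (...) intercepts access violations only when built with /EHa.
  try {
    PIMAGE_SECTION_HEADER sec = ntpe.sectionDirectories;
    // An RVA below the first section lies in the headers, where RVA == file offset
    if (!ntpe.fileHeader->NumberOfSections || rva < sec->VirtualAddress) {
      return rva;
    }
    for (uint32_t i = 0; i < ntpe.fileHeader->NumberOfSections; i++, sec++) {
      DWORD secEnd = alignUp(sec->Misc.VirtualSize, ntpe.SecAlign) + sec->VirtualAddress;
      if (sec->VirtualAddress <= rva && secEnd > rva) {
        // The result can exceed SizeOfRawData for RVAs in the zero-filled tail of a
        // section, so callers should still bounds-check it against the file size.
        return (int64_t)rva - sec->VirtualAddress + sec->PointerToRawData;
      }
    }
  } catch (...) {
    // fall through to the error return
  }
  return kRvaError;
}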

Import walk core idea

// Pseudocode outline
// 1) Find DataDirectory[IMPORT].VirtualAddress
// 2) Convert RVA -> file offset
// 3) Walk IMAGE_IMPORT_DESCRIPTOR until Name == 0
// 4) For each descriptor, walk thunk table until entry == 0
// 5) If high-bit set -> ordinal, else -> name (IMAGE_IMPORT_BY_NAME)

The project accompanying the article implements this in a ready-to-use form, including both x86 and x64 support.
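
Steps 4 and 5 of the outline can be sketched on their own. The function below takes thunk values that have already been read from the lookup table (widened to 64 bits) and classifies each one. The ordinal flag is the top bit of the thunk: 0x80000000 for PE32 and 0x8000000000000000 for PE32+. For by-name imports, the low 31 bits hold the RVA of an IMAGE_IMPORT_BY_NAME record (a WORD hint followed by a null-terminated name). ImportRef and decodeThunks are hypothetical names, not the project's API.

```cpp
#include <cstdint>
#include <vector>

struct ImportRef {
  bool byOrdinal;    // true: imported by ordinal, false: imported by name
  uint16_t ordinal;  // valid when byOrdinal is true
  uint32_t nameRva;  // RVA of IMAGE_IMPORT_BY_NAME, valid when byOrdinal is false
};

// Classify thunk entries until the zero terminator.
// `is64` selects the PE32+ layout (64-bit thunks) over PE32 (32-bit thunks).
std::vector<ImportRef> decodeThunks(const std::vector<uint64_t>& thunks, bool is64) {
  const uint64_t ordinalFlag = is64 ? 0x8000000000000000ULL : 0x80000000ULL;
  std::vector<ImportRef> out;
  for (uint64_t t : thunks) {
    if (t == 0) break;  // end of the thunk table
    if (t & ordinalFlag) {
      out.push_back({true, static_cast<uint16_t>(t & 0xFFFF), 0});
    } else {
      out.push_back({false, 0, static_cast<uint32_t>(t & 0x7FFFFFFF)});
    }
  }
  return out;
}
```

Each nameRva still has to go through the RVA-to-file-offset translation before the hint and name can be read from disk.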

What is next

Next steps suggested in the original article:

  • Build a fuzzy hashing module
  • Discuss blacklists and whitelists
  • Expand the tooling around PE parsing and analysis

Contacts from the article: articles@stofu.io
Philip P. – CTO

Focused on fintech system engineering, low-level development, HFT infrastructure and building PoC to production-grade systems.
