This post covers memory management — including a brief summary of Windows 3’s memory model and some implementation details on Win3mu’s global and local heaps.
One of the amazing things about Windows 3 is just how much it managed to do in so little memory. At the time a typical PC might have about 1Mb RAM and Windows could run on as little as 384Kb.
In order pull this off a few key concepts were introduced over the simple, almost non-existent way memory was managed in DOS. This new programming model made it possible for Windows to fairly freely move allocated memory around and to discard memory that wasn’t actively in use:
- When a program allocates memory it receives a handle (as opposed to a pointer).
- When the memory needs to be accessed, the program calls a lock function passing the handle and gets back a pointer, the memory is read/written as required and then unlocked again.
- While not locked, Windows can shuffle memory around in order to satisfy other memory allocations.
The Global Heap
The global heap refers to all available memory. If we ignore real-mode and just focus on protected mode we can say that each allocation from the global heap is generally represented by one protected mode selector
(Although allocations greater than 64k have multiple selectors, it’s convenient to think of each global allocation as one selector).
In protected mode, even locked memory can be moved because of the extra layer of indirection provided by the selector descriptor table. The programming model is retained however so that programs work in real mode too.
One of the important things to understand about the global heap is that it represents not just memory allocated by a program but also encompasses all the code and data segments for all loaded programs and DLLs (see the previous post about the Windows executable format). Each of those segments is mapped to a selector which comes from the global heap.
The Local Heap
As you know, a selector is a 16-bit value of which 13 bits are an index into a descriptor table. This means there is a hard limit of 8192 (2¹³) selectors — across the entire operating system.
To overcome this limit Windows also offers local heaps. A local heap is essentially a heap that’s managed inside a chunk of memory allocated from the global heap. The same programming model is used — allocate a handle and lock/unlock when used.
Just like the global heap, Windows will move unlocked allocations around in order to make new allocations fit. Unlike the global heap however locked local allocations can’t be moved because there isn’t the extra level of indirection provided by protected mode selector table.
Win3mu’s Range Allocator
In Win3mu both the global and local heaps are based on the RangeAllocator class. RangeAllocator manages an address space providing functions for allocating, locking and freeing and can move unlocked allocations around to make new allocations fit.
The structure of the range allocator is pretty simple — it’s essentially a C# List<Range> where each Range has a position, length and a flag indicating if it’s allocated or free.
- When the heap is empty the list contains one entry spanning the entire address space that is marked unallocated.
- To make an allocation an unallocated entry is found and split — one part is marked allocated and the other becomes a smaller free slot.
- If an allocation doesn’t fit the heap is defragmented by moving and coalescing unlocked allocations to try and make room.
- When an allocation is freed its entry is joined back to any free adjacent entries.
It’s not the most efficient algorithm but it’s more than good enough for Win3mu’s needs.
As mentioned, RangeAllocator is used by both heaps:
- For the global heap a RangeAllocator is used to manage the set of allocated/available selectors (0–8192).
- For local heaps, RangeAllocator manages address ranges within a globally allocated block of memory — ie: which sections are allocated and which are free. (0–0xFFFF)
Selectors and Allocations
The global heap manages two closely related concepts — selectors and allocations. It’s possible to allocate a selector without actually mapping it to memory and it’s also possible to allocate two selectors for the same block of memory (say one with read-only code execute rights and one with read-write access).
To make an allocation from the global heap, the following occurs:
- Enough consecutive selectors to cover the entire allocation (1 selector for each 64Kb span) are allocated from the RangeAllocator.
- An allocation object is created that stores the size of the allocation, it’s flags and a byte array buffer for the actual memory.
- The selector is assigned a reference to the allocation object.
Although conceptually simple, testing the heap is essential because a memory corruption due to a heap management bug would be extremely difficult to track down.
Normally I like to write simple unit test cases but for something like this a brute force burn test provides a lot more confidence.
The test program:
- Makes random alloc, free, lock, unlock, realloc and defragment operations.
- Writes well known data to each allocation and checks for corruption.
- Goes through cycles of favoring allocations over frees, pushing the heap right up to the point where it runs out of memory and then cycles back to empty.
- Added debug code to the RangeAllocator class to check the integrity of its control structures before and after every operation.
Connecting the CPU to the Global Heap
Hooking the global heap up to the CPU was relatively straight forward — it just implements the IBus’ ReadByte and WriteByte methods:
It’s worth pointing out that Win3mu doesn’t have nearly the same demands as Windows so a few shortcuts have been taken:
- Win3mu only needs to handle the memory demands of one program. (Windows needed to cope with every loaded program).
- Win3mu can leverage C# collections to manage the contents of the heap externally to the actual heap. Windows needed to use heap memory to manage the heap. Basically this means Win3mu has a little more memory because it’s not losing that overhead.
- Win3mu doesn’t need to handle discardable memory. Today’s PCs easily have enough memory and because we only need to cope with one program there’s no point dealing with all the complexity of patching call stacks and function references to handle discarded code segments.
- Win3mu doesn’t need to manage physical memory — each allocation from the global heap is essentially allocating a new C# byte array. Windows had to manage the physical memory, updating CPU descriptor tables and more.
Time for a bit of a recap. So far I’ve got:
- A working 80286–ish CPU emulator with support for pseudo-protected mode.
- Ability to read Windows 3 .exe files and break out all the important information.
- Implementations of the global and local heaps.
- The CPU wired up to the global heap.
The next thing is to get some code loaded but before that I need to explain how the C# code will call into the emulated x86 code… and vice versa. This will be the topic of the next post.
ICYMI, here’s another game running on Win3mu, SkiFree:
Hi, I’m Brad Robinson — an independent software developer living in Sydney Australia. I write software for musicians and as an indie developer I rely on word of mouth.
If you enjoyed this article please consider sharing it by hitting the “recommend heart” below or by sharing on Facebook/Twitter. It’s a small gesture but makes a real difference.
Also, if your feed is lacking in hex dumps, disassembly listings and screen shots of old Windows 3 games you might like to follow me on Twitter.
Continue reading, Part 7 — “Thunking”