Making Your Own Memory Graph To Detect Memory Issues On iOS

In the iOS Memory Deep Dive session at WWDC 2018, Apple showed several approaches for debugging issues related to an application's memory. One of them is Debug Memory Graph: as the name says, it captures all of the application's objects and presents them as a convenient graph of nodes and the relations between them. If you haven't used this feature before, here is what it looks like:

A cool thing mentioned in that session is the ability to export the memory graph to a file with the *.memgraph extension and share it with your teammates for further debugging. Unfortunately, you can't retrieve this graph from a shipped binary*, so if your app shows some undefined behaviour, you can't see detailed information about its state.

*Unless your app crashes and debug symbols (dSYM) are enabled. In that case, you can import the .crashlog file into Xcode and continue debugging.

A couple of things to note. Sometimes the Xcode memory graph does not show all the relations between objects. For example, imagine we have objects in memory with the following relations:

Here, the Teacher object is retained by the Student via two fields (someStorableObj and _teacher), and it is also stored in an array instance. If we build a memory graph now, we get something like this:

As we can see, the Student object is missing from this graph. I guess this is one of the quirks of Xcode's memory graph implementation.

So our goal for this article is to build an app that creates a memory graph and prints it to the console in raw form. This approach has several advantages:

The ability to collect this type of information from shipped binaries and send it to our services, which makes it a great addition to crash reports.
It takes less time to build a graph. For instance, building a graph via Xcode in a large app can take several minutes🙀 (measured on a MacBook Pro i7/16GB).
(Very rare) The ability to make runtime checks. At some point in our app's lifecycle, we may need to be sure that some object exists or doesn't exist, or that it does or doesn't have certain retained or retainer objects.

The final result should look something like this:

So let's get started!

💀CAUTION💀

This app is intended for research purposes. Using these techniques in production code is not recommended: it may lead to undefined behaviour in future iOS releases, or your app may even be rejected from the App Store.

The complete source code is available on GitHub. This app works correctly with Xcode 12.4 and iOS SDK version 14.0. There are no guarantees that it will work correctly in future releases because it is built upon low-level APIs.

To build an object graph for our app, we need to somehow retrieve all objects from the heap. The malloc API will help us achieve this goal. The malloc.h header (malloc/malloc.h) has a suitable function for that:

kern_return_t (* MALLOC_INTROSPECT_FN_PTR(enumerator))(task_t task, void *,
    unsigned type_mask,
    vm_address_t zone_address,
    memory_reader_t reader,
    vm_range_recorder_t recorder);
/* enumerates all the malloc pointers in use */

To call this function, we have to do some preliminary work:

1. We need to get all malloc zones from our app,

2. ...and then get an introspection instance from that zone

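// `reader` is a memory_reader_t function; it is also passed to the enumerator later.
vm_address_t *zones = NULL;
unsigned int zoneCount = 0;
kern_return_t result = malloc_get_all_zones(TASK_NULL, reader, &zones, &zoneCount);

if (result == KERN_SUCCESS) {
    for (unsigned int i = 0; i < zoneCount; i++) {
        malloc_zone_t *zone = (malloc_zone_t *)zones[i];
        malloc_introspection_t *introspection = zone->introspect;
        if (introspection == NULL || introspection->enumerator == NULL) {
            continue; // this zone cannot be enumerated
        }
    }
}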

The enumerator function also accepts a recorder function pointer and an untyped context pointer (the void * parameter), through which we will pass a callback. Let's declare the recorder and the callback.

The recorder is a function that the enumerator calls whenever it finds blocks that contain memory objects.

Unfortunately, I couldn't find any diagrams showing how a malloc zone works, so I made one myself based on my assumptions 🙂

In this example, we have a malloc zone that contains four blocks of memory objects, so the recorder function will be called four times.

The callback function is called whenever the enumerator leaves the recorder's scope. So let's focus on how the recorder function works under the hood.

Here we enumerate the ranges (the blue rectangles in the diagram above) and access each vm_range_t value by subscript. A vm_range_t structure contains two fields: size and address. The latter is what we need. If we just print the value of range.address, we get the address of the object in hexadecimal format.
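Put together, the zone lookup, the reader, and the recorder might look like the sketch below. It is plain C written against the malloc/malloc.h declarations quoted earlier, and it only counts live blocks instead of turning them into objects. The names local_reader, count_ranges, and count_live_heap_blocks are made up for this illustration and are not taken from the app's source. The Apple-specific part sits behind an __APPLE__ guard because the header exists only on Apple platforms, and zone locking is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__APPLE__)
#include <mach/mach.h>
#include <malloc/malloc.h>

/* Reader for the current process: the "remote" address is already local,
   so it is handed back unchanged. (Hypothetical helper name.) */
static kern_return_t local_reader(task_t task, vm_address_t address,
                                  vm_size_t size, void **local_memory) {
    (void)task;
    (void)size;
    *local_memory = (void *)address;
    return KERN_SUCCESS;
}

/* Recorder: receives a batch of in-use ranges. `context` is the void *
   that was handed to the enumerator; here it points at a counter. */
static void count_ranges(task_t task, void *context, unsigned type,
                         vm_range_t *ranges, unsigned count) {
    (void)task;
    (void)type;
    assert(context != NULL);
    size_t *total = (size_t *)context;
    for (unsigned i = 0; i < count; i++) {
        vm_range_t range = ranges[i];
        if (range.address != 0 && range.size > 0) {
            *total += 1; /* range.address is where the block starts */
        }
    }
}
#endif

/* Counts live malloc blocks across all zones of the current process.
   Returns 0 on platforms without the Apple malloc introspection API. */
size_t count_live_heap_blocks(void) {
    size_t total = 0;
#if defined(__APPLE__)
    vm_address_t *zones = NULL;
    unsigned zone_count = 0;
    if (malloc_get_all_zones(TASK_NULL, local_reader, &zones, &zone_count) != KERN_SUCCESS) {
        return 0;
    }
    for (unsigned i = 0; i < zone_count; i++) {
        malloc_zone_t *zone = (malloc_zone_t *)zones[i];
        if (zone == NULL || zone->introspect == NULL || zone->introspect->enumerator == NULL) {
            continue; /* this zone cannot be enumerated */
        }
        zone->introspect->enumerator(TASK_NULL, &total, MALLOC_PTR_IN_USE_RANGE_TYPE,
                                     (vm_address_t)zone, local_reader, count_ranges);
    }
#endif
    return total;
}
```

Passing a counter as the context keeps the sketch small; the article's version passes a block through the same void * parameter instead.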

To build our memory graph, we have to walk through every instance variable and property of the inspected object, and then apply the same algorithm to each object those ivars and properties reference.

A plain hexadecimal address is not enough for that: we need the list of properties and ivars that belongs to the inspected address, and the Objective-C runtime is a good way to get it. To retrieve the fields via the Objective-C runtime, we need to convert the address to an Objective-C object, and we can use a special hack for that! 🤫

Let's declare a fake structure, objc_structure_mock, with a single field named isa:

typedef struct {
    Class isa;
} objc_structure_mock;

This looks a little bit tricky, but any structure that starts with a pointer to a Class structure can be treated as an objc_object.

As the final step here, we just need to cast our address to this mock structure:

objc_structure_mock *rawMemoryObject = (objc_structure_mock *)range.address;
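One caveat is worth illustrating: not every heap block is an Objective-C object, so the leading word of a block is only a candidate isa until it has been checked. The sketch below shows that idea in portable C with toy stand-ins. ToyClass, toy_structure_mock, ToyStudent, and class_of_block are hypothetical names invented for this example; a real implementation would presumably compare against the classes registered with the Objective-C runtime (for instance the list returned by objc_copyClassList). The leading word is read with memcpy rather than through a direct cast so that the example stays well-defined for arbitrary buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the Objective-C Class type. */
typedef struct ToyClass {
    const char *name;
} ToyClass;

/* Same shape as objc_structure_mock: a single leading class pointer. */
typedef struct {
    const ToyClass *isa;
} toy_structure_mock;

/* A toy "object": the class pointer comes first, the payload follows. */
typedef struct {
    const ToyClass *isa;
    int grade;
} ToyStudent;

/* Returns the class of `block` if its leading word matches one of the
   known classes, or NULL if the block does not look like an object. */
static const ToyClass *class_of_block(const void *block, size_t size,
                                      const ToyClass *const *known,
                                      size_t known_count) {
    assert(known != NULL || known_count == 0);
    if (block == NULL || size < sizeof(toy_structure_mock)) {
        return NULL; /* too small to hold an isa pointer */
    }
    toy_structure_mock header;
    memcpy(&header, block, sizeof header); /* read only the leading word */
    for (size_t i = 0; i < known_count; i++) {
        if (header.isa == known[i]) {
            return known[i];
        }
    }
    return NULL;
}
```

With a table containing a Student and a Teacher class, a ToyStudent whose isa points at the Student class is recognised, while a zero-filled buffer or a block smaller than one pointer is rejected.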

Now we can pass data from our recorder to our callback block simply by calling a context block inside the recorder and passing all the necessary data into it.

In the callback block, we get the object's class name and then create an instance of the MemoryObject class, a simple DTO class with an interface like this:

And that's it! We have a list of MemoryObject instances that represent all of our app's objects in the heap.

Two things are left. First, we need to go through every MemoryObject instance, wrap the objects referenced by its ivars and properties in MemoryObject instances as well, and then repeat the algorithm for each of them. Second, we have to build a graph from these objects.

Let's start with ivar and property introspection. The Objective-C runtime is perfect for such tasks!

Here we have a list of the object's ivars, and now we can iterate through them and read their values.

We declare an array of memory objects outside the loop. This array will contain the objects retained by the memoryObject that was passed to the method as an argument, and it is returned as the final result of the method. The IntrospectionResult class is a helper responsible for retrieving the value of the passed ivar.
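Repeating this step for every retained object needs one extra ingredient: a record of what has already been visited, otherwise a retain cycle would make the walk run forever. Here is a minimal sketch of that idea in portable C on a toy object model. ToyObject, VisitedSet, and collect_reachable are hypothetical names for this illustration; the fixed-size fields array stands in for whatever the ivar introspection returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FIELDS 4
#define MAX_OBJECTS 64

/* Toy object: a class name plus a few strong-reference slots (NULL = empty). */
typedef struct ToyObject {
    const char *class_name;
    struct ToyObject *fields[MAX_FIELDS];
} ToyObject;

/* Objects already inspected, in visiting order. */
typedef struct {
    const ToyObject *seen[MAX_OBJECTS];
    size_t count;
} VisitedSet;

static bool visited_contains(const VisitedSet *set, const ToyObject *object) {
    for (size_t i = 0; i < set->count; i++) {
        if (set->seen[i] == object) {
            return true;
        }
    }
    return false;
}

static bool visited_add(VisitedSet *set, const ToyObject *object) {
    if (set->count == MAX_OBJECTS) {
        return false; /* set is full; stop exploring */
    }
    set->seen[set->count++] = object;
    return true;
}

/* Depth-first walk over everything reachable from `root`.
   Each object is recorded once, so retain cycles terminate. */
static void collect_reachable(const ToyObject *root, VisitedSet *visited) {
    assert(visited != NULL);
    if (root == NULL || visited_contains(visited, root)) {
        return;
    }
    if (!visited_add(visited, root)) {
        return;
    }
    for (size_t i = 0; i < MAX_FIELDS; i++) {
        collect_reachable(root->fields[i], visited);
    }
}
```

The linear search in visited_contains is fine for a sketch; for a real heap with many thousands of objects a hash set keyed by address would be the more sensible choice.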

Now that we have a list of retained objects, we can build our graph.
A brief reminder of what a graph is: it is a data structure that consists of a finite (and possibly mutable) set of vertices (also called nodes or points), together with a set of unordered pairs of these vertices for an undirected graph or a set of ordered pairs for a directed graph. These pairs are known as edges (also called links or lines), and for a directed graph they are also known as arrows.

This data structure is a perfect fit for describing relations between memory objects. A classic simple graph is not suitable for our problem, though, so I concluded that a multigraph is what we want. A multigraph is a graph that is permitted to have multiple edges, that is, edges that have the same end nodes. Thus two vertices may be connected by more than one edge. In our case, this is the typical retain cycle scenario, where objects are connected to each other via strong references.

There are several common data structures for representing a graph:

The adjacency list is widely used, but for a multigraph we have to rework it somewhat. For our purpose, we need to know both which objects retain the source node and which objects the source node retains.

And here is the interface of the graph data structure.
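To make the idea concrete, here is a minimal sketch in portable C of an adjacency list reworked for a multigraph: every node keeps two lists, one for the objects it retains and one for its retainers, and parallel edges are allowed. All names (Link, Node, MemoryGraph, graph_add_node, graph_add_edge, graph_edge_count, graph_print) and the fixed capacities are assumptions made for this example; the app itself exposes its graph through Objective-C classes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_NODES 16
#define MAX_LINKS 16
#define NO_NODE ((size_t)-1)

typedef struct {
    size_t peer;     /* index of the node on the other end */
    const char *via; /* ivar or property name holding the reference */
} Link;

typedef struct {
    const char *name;
    Link retained[MAX_LINKS];  /* objects this node retains */
    size_t retained_count;
    Link retainers[MAX_LINKS]; /* objects that retain this node */
    size_t retainer_count;
} Node;

typedef struct {
    Node nodes[MAX_NODES];
    size_t count;
} MemoryGraph;

/* Adds a node and returns its index, or NO_NODE when the graph is full. */
static size_t graph_add_node(MemoryGraph *graph, const char *name) {
    assert(graph != NULL);
    if (graph->count == MAX_NODES) {
        return NO_NODE;
    }
    Node *node = &graph->nodes[graph->count];
    node->name = name;
    node->retained_count = 0;
    node->retainer_count = 0;
    return graph->count++;
}

/* Records "from retains to via <ivar>" on both ends. The same pair of
   nodes may be linked several times, which is what makes it a multigraph. */
static bool graph_add_edge(MemoryGraph *graph, size_t from, size_t to, const char *via) {
    assert(graph != NULL);
    if (from >= graph->count || to >= graph->count) {
        return false;
    }
    Node *source = &graph->nodes[from];
    Node *target = &graph->nodes[to];
    if (source->retained_count == MAX_LINKS || target->retainer_count == MAX_LINKS) {
        return false;
    }
    source->retained[source->retained_count++] = (Link){ to, via };
    target->retainers[target->retainer_count++] = (Link){ from, via };
    return true;
}

/* Number of parallel edges going from `from` to `to`. */
static size_t graph_edge_count(const MemoryGraph *graph, size_t from, size_t to) {
    size_t edges = 0;
    if (from >= graph->count) {
        return 0;
    }
    const Node *source = &graph->nodes[from];
    for (size_t i = 0; i < source->retained_count; i++) {
        if (source->retained[i].peer == to) {
            edges++;
        }
    }
    return edges;
}

/* Prints every node with its outgoing and incoming links. */
static void graph_print(const MemoryGraph *graph) {
    for (size_t i = 0; i < graph->count; i++) {
        const Node *node = &graph->nodes[i];
        printf("%s\n", node->name);
        for (size_t j = 0; j < node->retained_count; j++) {
            printf("  retains %s via %s\n",
                   graph->nodes[node->retained[j].peer].name, node->retained[j].via);
        }
        for (size_t j = 0; j < node->retainer_count; j++) {
            printf("  retained by %s via %s\n",
                   graph->nodes[node->retainers[j].peer].name, node->retainers[j].via);
        }
    }
}
```

Starting from an empty graph (count set to 0), adding Student, Teacher, and an array node and linking Student to Teacher twice (via someStorableObj and _teacher) yields two parallel edges between the same pair, which is exactly the relation from the Student and Teacher example earlier in the article.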

The last thing to do is fill our graph with the appropriate data. Since we have a list of MemoryObject instances, this is easy to do 🙂

And now, let's print out our graph!

MemoryService *memoryService = [MemoryService new];
NSDictionary *graph = [memoryService fullMemoryGraph];
NSLog(@"%@", graph);

Voila! That's exactly what we need! 🥳🥳🥳

Now we can send this data to our services and do other related work.

Limitations of this approach and ways to improve it

The most significant limitation of this approach is that pure Swift objects are not recorded in this graph. Only objects that inherit from NSObject are.
As mentioned before, this app is built upon low-level APIs, so there are no guarantees that it will work correctly in future iOS releases.
Unfortunately, we can't access the ivar values of some classes (such as UIScrollView and its subclasses): doing so leads to an EXC_BAD_ACCESS crash. As a workaround, we can just add UIScrollView to an exception list.

You are welcome to visit my GitHub and a gist page with useful materials on this topic. You can also find me on LinkedIn.