Welcome to a comprehensive exploration of multi-object tracking. Given the complex nature of the environment, handling the uncertainties and fluctuations in a sensor's field of view has been a long-standing challenge in the field of sensor fusion and tracking. Traditional techniques have struggled to maintain an optimal balance between accuracy and computational efficiency.

In this article, as an outcome, you can consider the PMBM filter as one of the most modern and high-performance algorithms in this domain. We will be unraveling its mathematical foundation and highlighting its practical implications. Whether you're an engineer, a researcher, a student, or simply a curious mind fascinated by the world of multi-object tracking, I hope to equip you with a solid understanding of the Poisson Multi-Bernoulli Mixture filter by the end of this article, thus enabling you to appreciate its remarkable contributions to the realm of multi-target tracking.

Problem statement

The primary task at hand revolves around multi-object tracking, a problem rooted in the complex and dynamic nature of real-world environments.

This problem involves continuously estimating the states of multiple objects over time using sensor observations. The four primary challenges in this problem are clutter, object appearance and disappearance, the unknown number of objects, and sensor fusion.

Clutter: Tracking can be complicated by 'clutter,' which refers to misleading sensor readings that are not associated with the objects of interest. This can be due to various factors such as environmental noise, sensor errors, or the presence of irrelevant objects. For instance, in radar tracking, clutter could be caused by reflections from buildings, terrain, or other non-target objects. In image processing, clutter can be introduced by reflections in mirrors, glass surfaces, or water bodies
Miss Detection: This occurs when an object of interest is not detected by the object detection system. For example, a camera might miss a person who briefly steps behind a pillar.
Dynamics and Uncertainty in Object Count: The spontaneous appearance or disappearance of objects within the sensor's field of view adds complexity to tracking. For instance, in a public square, people can enter or leave at any time. This dynamic nature, coupled with the uncertainty in the total number of objects, makes consistent and accurate multi-object tracking a challenging task."
Sensor Fusion: This refers to the complex task of merging data from multiple sensors, each with unique characteristics. For example, a self-driving car might need to integrate data from cameras, radars, and lidar, despite their potential inconsistencies and time delays.

The goal is to develop a robust multi-object tracking system that can accurately track multiple objects in cluttered environments, handle object dynamics, adapt to changing object counts, and effectively fuse data from various sensors.

Models

In this article, we focus on model-based methods. These algorithms are effective under the assumption that the processes involved can be accurately represented by a model.

Sensor model

In our simplified 2D model, we're examining an algorithm for a basic problem: tracking the 2D movement of an object starting from a known position, amidst significant sensor inaccuracies and clutter. Before we can start tracking, we first need to model the measurements, which includes modeling the clutter.

Missdetections are modeled using the concept of detection probability (P_D). This parameter indicates the likelihood of an object being detected by the sensor. A P_D less than 1 implies a chance of missdetection. For example, a P_D of 0.9 suggests the object will be detected 90% of the time, and missed 10% of the time.

class SensorModelConfig:
    def __init__(self, P_D, lambda_c, range_c):
        # P_D: object detection probability
        self.P_D = P_D
        # lambda_c: average number of clutter measurements per time scan, Poisson distributed
        self.lambda_c = lambda_c
        # range_c: range of surveillance area
        self.range_c = range_c
        # V: Volume of the surveillance area
        self.V = np.prod(np.diff(self.range_c))  
        # pdf_c: Spatial PDF of clutter
        self.pdf_c = 1 / self.V  
        # intensity_c: expected number of clutter detections per unit volume
        self.intensity_c = self.lambda_c / self.V  
    
    def handle_missdetection(self, objects):
        """Handles missdetection based on the detection probability."""
        # Generate a random number for each object
        random_numbers = np.random.uniform(size=len(objects))
        # Determine which objects are detected
        detected_objects = objects[random_numbers < self.P_D]
        return detected_objects

    def generate_clutter(self):
        """Generates clutter based on the clutter intensity."""
        # N_c: Number of clutter measurements, Poisson distributed
        N_c = np.random.poisson(lam=self.lambda_c)
        # Generate clutter within the surveillance area
        clutter = np.random.uniform(self.range_c[0], self.range_c[1], (N_c, 2))
        return clutter

To model clutter, we assume it's uniformly distributed within a certain area. The quantity of clutter is modeled using a Poisson distribution, which represents the number of events (in this case, clutter) occurring in a fixed interval, given a constant mean rate. This aligns with our assumption of uniformly distributed clutter. The average number of clutter measurements per time scan is defined by the parameter lambda_c. We randomly generate the number of clutter measurements, N_c, based on this Poisson distribution. This provides us with the flexibility to model varying levels of clutter in different scenarios.

Let s how it looks like for single object and P_G = 0.8, lambda_c = 0.5, range_c = [[-1000, 1000],[-1000, 1000].

Single object tracking

We consider a state space model where we have one object and one measurement from the real object at each step. To represent our state, we adopt a Gaussian distribution. This choice is optimal under the assumption of normally distributed measurement noise, as it allows us to effectively estimate the object's true state. By incorporating motion and measurement models, we can predict future states based on previous observations.

Assuming we know the initial position of the real object, we create a Gaussian distribution around it. As time progresses, we observe multiple measurements, some from the real object and others from clutter. To update the state, we employ the nearest measurement heuristic. However, since our state is represented by a Gaussian with mean and covariance, we need a way to measure the distance between the state and the measurements. The Mahalanobis distance, which takes into account the covariance matrix, provides a suitable solution.

In scenarios where we have missdetections and can only rely on measurements from objects part of the time, we need a sensor model. We introduce a detection probability, denoted as P_D, to represent the likelihood of detecting the object. Additionally, we model the clutter by utilizing the Poisson distribution. The clutter rate, lambda_c, and the intensity_c, representing the expected number of clutter in one time interval, play crucial roles in this modeling approach.

Returning to our problem, we calculate the weights of detection and missdetection as w_detection and w_missdetection, respectively. If w_detection is greater than w_missdetection, we update the state using the measurements. Otherwise, we return the state after the prediction step. This decision is based on a comparison between the weights, enabling us to make informed updates to the state estimate in the presence of missdetections.

Object tracking with known number of objects

The n-object tracking problem involves tracking multiple objects simultaneously. Unlike the previous single-object tracking problem where we worked with Gaussian distributions, in this case, we need to handle n Gaussians representing the multiple objects.

At each step, we receive k measurements and consider missdetections and k association hypotheses for each Gaussian component. We calculate weights for each hypothesis using a similar approach to likelihood calculation. The challenge lies in finding the association between n Gaussians and k + n hypotheses.

To solve this, we formulate the problem as a linear assignment problem. We construct a cost matrix of size n x (k + n) and assign a value of +inf to each field. We fill each field with negative likelihood values. Using the Hungarian algorithm, we minimize the sum of the costs to find the associations between Gaussians and hypotheses.

While the Hungarian algorithm provides a greedy approach, it may not be suitable when dealing with unknown objects due to uncertainties. In such cases, alternative algorithms like Murty or Gibbs sampling can be used. These algorithms calculate the top k good solutions, taking into account the uncertain nature of the problem.

Multi object tracking

In the more complex scenario of multi-object tracking with unknown object appearances and disappearances, we employ the Poisson Multi-Bernoulli Mixture (PMBM) algorithm. This algorithm handles the uncertainty in the number of objects and their dynamic behavior.

To begin, we introduce the concepts of the birth model and the death model within the framework of Random Finite Sets (RFS). The birth model can be represented by either a Bernoulli or a Poisson point process, while the death model incorporates a survival probability to represent the likelihood of an object continuing to exist in the next time step.

In the case of a Bernoulli Random Finite Set, we describe the state of each object using Bernoulli components. These components consist of a Gaussian density representing the object's state and an existence probability (r) indicating the likelihood of the object's presence.

Consider an example where we have one Bernoulli and receive a single measurement. This measurement generates two hypotheses: one assumes it is from a new object, while the other associates it with the existing Bernoulli (representing a detection). By extending this concept, if we have a set of Bernoullis, we create new Bernoullis for missdetections and associations for each existing Bernoulli, resulting in a Multi-Bernoulli Mixture.

The PMBM algorithm allows us to consider uncertainty in associations, providing multiple possible outcomes. Each association represents a global hypothesis. We construct a cost matrix and aim to find the top m associations since the problem may not have a unique solution. The output of the tracker is the most probable hypothesis.

For each global hypothesis, we repeat the same procedures in the next step, resulting in an exponential growth of global hypotheses. However, we can store only the most probable ones.

The core of the PMBM filter relies on conjugate distributions. By parameterizing a distribution, the model remains unchanged after an update, and only the parameters are updated. This enables us to calculate these parameters and utilize them in subsequent steps.

The PMBM algorithm involves several steps:

Prediction Step: Predict the states using the motion model, including the prediction methods for both the Poisson Point Process and the Multi-Bernoulli Mixture.
Update Step: Update the predicted states based on new measurements.
Pruning Step: Remove unlikely hypotheses to manage computational complexity.
Estimation Step: Generate estimates of the current state based on the updated and pruned hypotheses.

By following these steps, the PMBM algorithm effectively handles the complexities of multi-object tracking with unknown object appearances and disappearances.

References

García-Fernández, Ángel F., Jason L. Williams, Karl Granström, and Lennart Svensson. 2017. “Poisson Multi-Bernoulli Mixture Filter: Direct Derivation and Implementation.” arXiv [cs.CV]. arXiv. https://doi.org/10.1109/TAES.2018.2805153.
Multiple Object Tracking Youtube Channel.