Table of Links
Abstract and 1. Introduction
2. Relevant Work
3. Methods
3.1 Models
3.2 Summarising Features
3.3 Calibration of Market Model Parameters
4. Experiments
4.1 Zero Intelligence Trader
4.2 Extended Chiarella
4.3 Historical Data
5. Discussion & Future Work
6. Significance, Acknowledgments, and References

2 RELEVANT WORK

Interest in the development of realistic market simulators has increased in recent years due to new methods in deep learning and computer science, as well as significant increases in data and compute. The history of market simulators can potentially be traced back as far as 1962, to the computer simulations of Nobel laureate George Stigler [38]. However, over the past several decades, both the number and the accuracy of market simulators have increased dramatically. For an overview of early agent-based models (ABMs) for market simulation, we refer the reader to [2, 36]. Of the many recent examples of ABM market simulators, one of the most popular and widely used is the Agent-Based Interactive Discrete Event Simulation (ABIDES) framework [7], which provides high-fidelity market simulations. Recent work has extended the ABIDES framework to include a reinforcement learning (RL) environment and a ‘hybrid’ ABM/neural network-based market simulator [1, 37]. In this work, we use two custom-built ABMs for market simulation, which we describe in subsection 3.1.

Separately from ABMs, deep learning has also been used to recreate market behaviour. These approaches have used deep learning to reproduce market dynamics in an entirely data-driven way, without using explicit equations that describe market or trader behaviours.
This includes using convolutional neural networks (CNNs) with a recurrent architecture, in the so-called DeepLOB framework, as well as deep generative models such as generative adversarial networks (GANs) and variational autoencoders (VAEs) [6, 12, 43]. These methods provide higher accuracy than traditional generative approaches but have more difficulty reproducing out-of-distribution or long-tail events, such as those experienced during the COVID-19 pandemic or the global financial crisis of 2008.

As discussed in section 1, traditional simulation approaches such as ABMs require parameter sets to be calibrated to best reflect observations. Methods for calibrating market simulators include optimisation methods, which find point estimates of parameters, and Bayesian methods, which seek to estimate the posterior over parameter values. Optimisation-based methods include Gaussian processes and surrogate modelling approaches that minimise the distance between historical and simulated data for a set of metrics [3, 21]. Other methods have used GANs to train a ‘calibration agent’ that distinguishes between real and synthetic data and uses this to choose the parameters that are most likely to lead to realistic data generation [40].

More broadly, methods that seek to estimate the posterior probability distribution of parameters are increasingly grouped under the banner of likelihood-free or simulation-based inference [14]. These include classical methods such as approximate Bayesian computation (ABC), as well as more recent advances that leverage deep neural networks, such as neural posterior estimation (NPE). For a review of applications of simulation-based inference in the context of economics and financial time series, see [17], which demonstrates how these methods can be applied to models of market dynamics.
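To make the idea of likelihood-free inference concrete, the following is a minimal sketch of ABC rejection sampling, the simplest of the classical methods mentioned above. It is not the calibration procedure used in this paper. The toy simulator (i.i.d. Gaussian returns with a single volatility parameter `sigma`), the two summary statistics, the uniform prior, and all function names are assumptions chosen purely for illustration.

```python
import numpy as np


def simulate_returns(sigma, n_steps, rng):
    """Hypothetical toy 'market simulator': i.i.d. Gaussian returns."""
    return rng.normal(0.0, sigma, size=n_steps)


def summarise(returns):
    """Illustrative summary statistics: volatility and mean absolute return."""
    return np.array([returns.std(), np.abs(returns).mean()])


def abc_rejection(observed, prior_low, prior_high, n_draws, epsilon, seed=0):
    """Approximate the posterior over sigma by ABC rejection sampling.

    A parameter drawn from the uniform prior is kept only when the
    summaries of its simulated data lie within epsilon of the summaries
    of the observed data. No likelihood is ever evaluated.
    """
    rng = np.random.default_rng(seed)
    target = summarise(observed)
    accepted = []
    for _ in range(n_draws):
        sigma = rng.uniform(prior_low, prior_high)
        simulated = simulate_returns(sigma, len(observed), rng)
        distance = np.linalg.norm(summarise(simulated) - target)
        if distance < epsilon:
            accepted.append(sigma)
    return np.array(accepted)


if __name__ == "__main__":
    # Stand-in for historical data, generated with a known sigma.
    data_rng = np.random.default_rng(42)
    observed = simulate_returns(0.02, 500, data_rng)
    posterior = abc_rejection(observed, 0.001, 0.1, n_draws=5000, epsilon=0.002)
    print(f"accepted {len(posterior)} of 5000 draws")
    print(f"posterior mean of sigma: {posterior.mean():.4f}")
```

The accepted samples approximate the posterior over `sigma` rather than a single point estimate, which is the distinction drawn above between Bayesian and optimisation-based calibration. Rejection sampling wastes most of its simulations when the tolerance `epsilon` is small; neural approaches such as NPE are, broadly, intended to make better use of the same simulation budget.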
While simulation-based inference methods have been applied in many fields, including epidemiology, high-energy physics, and nonequilibrium systems, they have yet to be used to calibrate a market simulator to real market data [5, 22, 39]. Our work addresses this gap.

Authors:
(1) Namid R. Stillman, Simudyne Limited, United Kingdom (namid@simudyne.com);
(2) Rory Baggott, Simudyne Limited, United Kingdom (rory@simudyne.com);
(3) Justin Lyon, Simudyne Limited, United Kingdom (justin@simudyne.com);
(4) Jianfei Zhang, Hong Kong Exchanges and Clearing Limited, Hong Kong (jianfeizhang@hkex.com.hk);
(5) Dingqiu Zhu, Hong Kong Exchanges and Clearing Limited, Hong Kong (dingqiuzhu@hkex.com.hk);
(6) Tao Chen, Hong Kong Exchanges and Clearing Limited, Hong Kong (taochen@hkex.com.hk);
(7) Perukrishnen Vytelingum, Simudyne Limited, United Kingdom (krishnen@simudyne.com).

This paper is available on arXiv under the CC BY 4.0 DEED license.