```html
Authors: Almudena Carrera Vazquez, Caroline Tornow, Diego Ristè, Stefan Woerner, Maika Takita, Daniel J. Egger

Abstract

Quantum computers process information with the laws of quantum mechanics. Current quantum hardware is noisy, can only store information for a short time and is limited to a few quantum bits, that is, qubits, typically arranged in a planar connectivity. However, many applications of quantum computing require more connectivity than the planar lattice offered by the hardware on more qubits than is available on a single quantum processing unit (QPU). The community hopes to tackle these limitations by connecting QPUs using classical communication, which has not yet been proven experimentally. Here we experimentally realize error-mitigated dynamic circuits and circuit cutting to create quantum states requiring periodic connectivity using up to 142 qubits spanning two QPUs with 127 qubits each connected in real time with a classical link. In a dynamic circuit, quantum gates can be classically controlled by the outcomes of mid-circuit measurements within run-time, that is, within a fraction of the coherence time of the qubits. Our real-time classical link enables us to apply a quantum gate on one QPU conditioned on the outcome of a measurement on another QPU. Furthermore, the error-mitigated control flow enhances qubit connectivity and the instruction set of the hardware, thus increasing the versatility of our quantum computers. Our work demonstrates that we can use several quantum processors as one with error-mitigated dynamic circuits enabled by a real-time classical link.

Main

Quantum computers process information encoded in quantum bits with unitary operations. However, quantum computers are noisy and most large-scale architectures arrange the physical qubits in a planar lattice.
Nevertheless, current processors with error mitigation can already simulate hardware-native Ising models with 127 qubits and measure observables at a scale where brute-force approaches with classical computers begin to struggle. The usefulness of quantum computers hinges on further scaling and overcoming their limited qubit connectivity. A modular approach is important for scaling current noisy quantum processors and for achieving the large numbers of physical qubits needed for fault tolerance. Trapped ion and neutral atom architectures can achieve modularity by physically transporting the qubits. In the near term, modularity in superconducting qubits is achieved by short-range interconnects that link adjacent chips.

In the medium term, long-range gates operating in the microwave regime may be carried out over long conventional cables. This would enable non-planar qubit connectivity suitable for efficient error correction. A long-term alternative is to entangle remote QPUs with an optical link leveraging a microwave-to-optical transduction, which has not yet been demonstrated, to our knowledge. Moreover, dynamic circuits broaden the set of operations of a quantum computer by performing mid-circuit measurements (MCMs) and classically controlling a gate within the coherence time of the qubits. They enhance algorithmic quality and qubit connectivity. As we will show, dynamic circuits also enable modularity by connecting QPUs in real time through a classical link.

We take a complementary approach based on virtual gates to implement long-range interactions in a modular architecture. We connect qubits at arbitrary locations and create the statistics of entanglement through a quasi-probability decomposition (QPD). We compare a Local Operations (LO) only scheme to one augmented by Classical Communication (LOCC).
The LO scheme, demonstrated in a two-qubit setting, requires executing multiple quantum circuits with local operations only. By contrast, to implement LOCC, we consume virtual Bell pairs in a teleportation circuit to create two-qubit gates. On quantum hardware with sparse and planar connectivity, creating a Bell pair between arbitrary qubits requires a long-range controlled-NOT (CNOT) gate. To avoid these gates, we use a QPD over local operations resulting in cut Bell pairs that the teleportation consumes. LO does not need the classical link and is thus simpler to implement than LOCC. However, as LOCC only requires a single parameterized template circuit, it is more efficient to compile than LO and the cost of its QPD is lower than the cost of the LO scheme.

Our work makes four key contributions. First, we present the quantum circuits and QPD to create multiple cut Bell pairs to realize the virtual gates in ref. 17. Second, we suppress and mitigate the errors arising from the latency of the classical control hardware in dynamic circuits with a combination of dynamical decoupling and zero-noise extrapolation. Third, we leverage these methods to engineer periodic boundary conditions on a 103-node graph state. Fourth, we demonstrate a real-time classical connection between two separate QPUs, thereby showing that a system of distributed QPUs can be operated as one through a classical link. Combined with dynamic circuits, this enables us to operate both chips as a single quantum computer, which we exemplify by engineering a periodic graph state that spans both devices on 142 qubits. We discuss a path forward to create long-range gates and provide our conclusion.

Circuit cutting

We run large quantum circuits that may not be directly executable on our hardware because of limitations in qubit count or connectivity by cutting gates.
Circuit cutting decomposes a complex circuit into subcircuits that can be individually executed. However, we must run an increased number of circuits, which we call the sampling overhead. The results from these subcircuits are then classically recombined to yield the result of the original circuit (Methods).

As one of the main contributions of our work is implementing virtual gates with LOCC, we show how to create the required cut Bell pairs with local operations. Here, multiple cut Bell pairs are engineered by parameterized quantum circuits, which we call a cut Bell pair factory (Fig. 1b,c). Cutting multiple pairs at the same time requires a lower sampling overhead. As the cut Bell pair factory forms two disjoint quantum circuits, we place each subcircuit close to qubits that have long-range gates. The resulting resource is then consumed in a teleportation circuit. For instance, in Fig. 1b, the cut Bell pairs are consumed to create CNOT gates on the qubit pairs (0, 1) and (2, 3) (see section ‘Cut Bell pair factories’).

Fig. 1. a, Depiction of an IBM Quantum System Two architecture. Here, two 127-qubit Eagle QPUs are connected with a real-time classical link. Each QPU is controlled by its electronics in its rack. We tightly synchronize both racks to operate both QPUs as one. b, Template quantum circuit to implement virtual CNOT gates on qubit pairs ($q_0$, $q_1$) and ($q_2$, $q_3$) with LOCC by consuming cut Bell pairs in a teleportation circuit. The purple double lines correspond to the real-time classical link. c, Cut Bell pair factories $C_2(\theta_i)$ for two simultaneously cut Bell pairs. The QPD has a total of 27 different parameter sets $\theta_i$.

Periodic boundary conditions

We construct a graph state $|G\rangle$ with periodic boundary conditions on ibm_kyiv, an Eagle processor, going beyond the limits imposed by its physical connectivity (see section ‘Graph states’).
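For orientation, the usual textbook definition of a graph state and its node stabilizers is sketched below. This is the standard convention built from controlled-Z gates and is stated here as an assumption; the Methods section referred to above may use a different but equivalent form, for example one compiled to CNOT gates and single-qubit rotations. The symbol $N(i)$, the set of neighbours of node $i$, is introduced only for this sketch.

```latex
% Standard graph-state definitions for a graph G = (V, E).
% N(i) denotes the neighbours of node i (notation introduced for this sketch).
\begin{align}
  |G\rangle &= \prod_{(i,j) \in E} \mathrm{CZ}_{i,j}\, |+\rangle^{\otimes |V|}, \\
  S_i &= X_i \prod_{k \in N(i)} Z_k, \qquad
  S_i\, |G\rangle = |G\rangle \quad \text{for all } i \in V .
\end{align}
```

In this convention an ideal graph state gives an expectation value of 1 for every node stabilizer, so the deviation of a measured stabilizer from 1 is a natural error measure.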
Here, $G$ has $|V| = 103$ nodes and requires four long-range edges $E_{\mathrm{lr}}$ = {(1, 95), (2, 98), (6, 102), (7, 97)} between the top and bottom qubits of the Eagle processor (Fig. 2a). We measure the node stabilizers $S_i$ at each node $i \in V$ and the edge stabilizers formed by the product $S_i S_j$ across each edge $(i, j) \in E$. From these stabilizers, we build an entanglement witness, which is negative if there is bipartite entanglement across the edge $(i, j) \in E$ (ref. 27) (see section ‘Entanglement witness’). We focus on bipartite entanglement because this is the resource we wish to recreate with virtual gates. Measuring witnesses of entanglement between more than two parties will measure only the quality of the non-virtual gates and measurements, making the impact of the virtual gates less clear.

Fig. 2. a, The heavy-hexagonal graph is folded on itself into a tubular form by the edges (1, 95), (2, 98), (6, 102) and (7, 97) highlighted in blue. We cut these edges. b, The node stabilizers $S_j$ (top) and witnesses (bottom), with 1 standard deviation, for the nodes and edges close to the long-range edges. Vertical dashed lines group stabilizers and witnesses by their distance to cut edges. c, Cumulative distribution function of the stabilizer errors. The stars indicate node stabilizers that have an edge implemented by a long-range gate. In the dropped edge benchmark (dash-dotted red line), the long-range gates are not implemented and the star-indicated stabilizers thus have unit error. The grey region is the probability mass corresponding to node stabilizers affected by the cuts. d–f, In the two-dimensional layouts, the green nodes duplicate nodes 95, 98, 102 and 97 to show the cut edges. The blue nodes in e are qubit resources to create cut Bell pairs. The colour of node $i$ is the absolute error $|S_i - 1|$ of the measured stabilizer, as indicated by the colour bar. An edge is black if entanglement statistics are detected at a 99% confidence level and violet if not.
In d, the long-range gates are implemented with SWAP gates. In e, the same gates are implemented with LOCC. In f, they are not implemented at all.

We prepare $|G\rangle$ using three different methods. The hardware-native edges are always implemented with CNOT gates but the periodic boundary conditions are implemented with (1) SWAP gates, (2) LOCC and (3) LO to connect qubits across the whole lattice. The main difference between LOCC and LO is a feed-forward operation consisting of single-qubit gates conditioned on $2n$ measurement outcomes, where $n$ is the number of cuts. Each of the $2^{2n}$ cases triggers a unique combination of $X$ and/or $Z$ gates on the appropriate qubits. Acquiring the measurement results, determining the corresponding case and acting on it is performed in real time by the control hardware, at the cost of a fixed added latency. We mitigate and suppress the errors resulting from this latency with zero-noise extrapolation and staggered dynamical decoupling (see section ‘Error-mitigated quantum circuit switch instructions’).

We benchmark the SWAP, LOCC and LO implementations of $|G\rangle$ with a hardware-native graph state on $G' = (V, E')$ obtained by removing the long-range gates, that is, $E' = E \setminus E_{\mathrm{lr}}$. The circuit preparing $|G'\rangle$ thus requires only 112 CNOT gates arranged in three layers following the heavy-hexagonal topology of the Eagle processor. This circuit will report large errors when measuring the node and edge stabilizers of $|G\rangle$ for nodes on a cut gate because it is designed to implement $|G'\rangle$. We refer to this hardware-native benchmark as the dropped edge benchmark. The SWAP-based circuit requires an additional 262 CNOT gates to create the long-range edges $E_{\mathrm{lr}}$, which drastically reduces the value of the measured stabilizers (Fig. 2b–d). By contrast, the LOCC and LO implementation of the edges in $E_{\mathrm{lr}}$ does not require SWAP gates.
The errors of their node and edge stabilizers for nodes not involved in a cut gate closely follow the dropped edge benchmark (Fig. 2b,c). Conversely, the stabilizers involving a virtual gate have a lower error than the dropped edge benchmark and the SWAP implementation (Fig. 2c, star markers). As an overall quality metric, we first report the sum of absolute errors on the node stabilizers, that is, $\sum_{i \in V} |S_i - 1|$ (Extended Data Table 1). The large SWAP overhead is responsible for the 44.3 sum absolute error. The 13.1 error on the dropped edge benchmark is dominated by the eight nodes on the four cuts (Fig. 2c, star markers). By contrast, the LO and LOCC errors are affected by MCMs. We attribute the 1.9 additional error of LOCC over LO to the delays and the CNOT gates in the teleportation circuit and cut Bell pairs. In the SWAP-based results, the witness does not detect entanglement across 35 of the 116 edges at the 99% confidence level (Fig. 2b,d). For the LO and LOCC implementation, the witness detects the statistics of bipartite entanglement across all edges in $G$ at the 99% confidence level (Fig. 2e). These metrics show that virtual long-range gates produce stabilizers with smaller errors than their decomposition into SWAPs. Furthermore, they keep the variance low enough to verify the statistics of entanglement.

Operating two QPUs as one

We now combine two Eagle QPUs with 127 qubits each into a single QPU through a real-time classical connection. Operating the devices as a single, larger processor consists of executing quantum circuits spanning the larger qubit register. Apart from unitary gates and measurements running concurrently on the merged QPU, we use dynamic circuits to perform gates that act on qubits on both devices. This is enabled by a tight synchronization and fast classical communication between physically separate instruments required to collect measurement results and determine the control flow across the whole system.
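The idea of a gate that acts on qubits of both devices using only measurements and classically conditioned single-qubit gates can be illustrated with a small state-vector simulation. The sketch below is a minimal NumPy model of the textbook teleported CNOT, not the authors' circuit or control software. It assumes an ideal, noiseless Bell pair in place of the cut Bell pairs, and the qubit labels and helper names (`op`, `cnot`, `measure`, `teleported_cnot`) are hypothetical. Qubits 0 and 1 stand for one QPU, qubits 2 and 3 for the other, and the only information crossing between them is the two measured bits.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
P0 = np.diag([1, 0]).astype(complex)  # projector on |0>
P1 = np.diag([0, 1]).astype(complex)  # projector on |1>


def op(gates, n=4):
    """Tensor product with the given 2x2 matrices on the listed qubits (qubit 0 is leftmost)."""
    out = np.array([[1.0 + 0j]])
    for q in range(n):
        out = np.kron(out, gates.get(q, I2))
    return out


def cnot(c, t):
    """CNOT with control c and target t on the 4-qubit register."""
    return op({c: P0}) + op({c: P1, t: X})


def measure(state, q, rng):
    """Projective Z measurement of qubit q; returns the outcome and the collapsed state."""
    p1 = np.linalg.norm(op({q: P1}) @ state) ** 2
    m = int(rng.random() < p1)
    state = op({q: P1 if m else P0}) @ state
    return m, state / np.linalg.norm(state)


def teleported_cnot(psi_ct, rng):
    """Apply CNOT(control=0, target=3) using a Bell pair on qubits (1, 2) and feed-forward.

    Qubits 0, 1 model 'QPU A'; qubits 2, 3 model 'QPU B'. No two-qubit gate crosses devices.
    """
    bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
    # Register order is (control, bell_a, bell_b, target).
    state = np.einsum("ct,ab->cabt", psi_ct.reshape(2, 2), bell.reshape(2, 2)).reshape(16)

    state = cnot(0, 1) @ state            # local gate on QPU A
    m1, state = measure(state, 1, rng)    # mid-circuit measurement on QPU A
    if m1:
        state = op({2: X}) @ state        # X on QPU B, conditioned on QPU A's outcome

    state = cnot(2, 3) @ state            # local gate on QPU B
    state = op({2: H}) @ state            # rotate so the Z measurement reads the X basis
    m2, state = measure(state, 2, rng)    # mid-circuit measurement on QPU B
    if m2:
        state = op({0: Z}) @ state        # Z on QPU A, conditioned on QPU B's outcome

    # The measured qubits are in |m1>, |m2>; keep the amplitudes of control and target.
    out = state.reshape(2, 2, 2, 2)[:, m1, m2, :].reshape(4)
    return out, m1, m2
```

For any input state on the control and target, the returned state should equal the ordinary CNOT applied to that input, whichever of the four measurement outcomes occurs. This mirrors the statement that each outcome triggers its own combination of X and/or Z corrections.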
We test this real-time classical connection by engineering a graph state on 134 qubits built from heavy-hexagonal rings that wind through both QPUs (Fig. 3). These rings were chosen by excluding qubits plagued by two-level systems and readout issues to ensure a high-quality graph state. This graph forms a ring in three dimensions and requires four long-range gates that we implement with LO and LOCC. As before, the LOCC protocol requires two additional qubits per cut gate for the cut Bell pairs. As in the previous section, we benchmark our results against a graph that does not implement the edges that span both QPUs. As there is no quantum link between the two devices, a benchmark with SWAP gates is impossible. All edges exhibit the statistics of bipartite entanglement when we implement the graph with LO and LOCC at a 99% confidence level. Furthermore, the LO and LOCC stabilizers have the same quality as the dropped edge benchmark for nodes that are not affected by a long-range gate (Fig. 3c). Stabilizers affected by long-range gates have a large reduction in error compared with the dropped edge benchmark. The sum of absolute errors on the node stabilizers, $\sum_{i \in V} |S_i - 1|$, is 21.0, 19.2 and 12.6 for the dropped edge benchmark, LOCC and LO, respectively. As before, we attribute the 6.6 additional error of LOCC over LO to the delays and the CNOT gates in the teleportation circuit and cut Bell pairs. The LOCC results demonstrate how a dynamic quantum circuit in which two subcircuits are connected by a real-time classical link can be executed on two otherwise disjoint QPUs. The LO results could be obtained on a single device with 127 qubits at the cost of an additional factor of 2 in run-time as the subcircuits can be run successively.

Fig. 3. a, Graph state with periodic boundaries shown in three dimensions. The blue edges are the cut edges. b, Coupling map of two Eagle QPUs operated as a single device with 254 qubits.
The purple nodes are the qubits forming the graph state in a and the blue nodes are used for cut Bell pairs. c,d, Absolute error on the stabilizers (c) and edge witnesses (d) implemented with LOCC (solid green) and LO (solid orange) and on a dropped edge benchmark graph (dotted-dashed red) for the graph state in a. In c and d, the stars show stabilizers and edge witnesses that are affected by the cuts. In c and d, the grey region is the probability mass corresponding to node stabilizers and edge witnesses, respectively, affected by the cut. In c and d, we observe that the LO implementation outperforms the dropped edge benchmark, which we attribute to better device conditions as these data were taken on a different day from the benchmark and LOCC data.

Discussion and conclusion

We implement long-range gates with LO and LOCC. With these gates, we engineer periodic boundary conditions on a 103-node planar lattice and connect two Eagle processors in real time to create a graph state on 134 qubits, going beyond the abilities of a single chip. Here, we chose to implement graph states as an application to highlight the scalable properties of dynamic circuits. Our cut Bell pair factories enable the LOCC scheme presented in ref. 17. Both the LO and LOCC protocols deliver high-quality results that closely match a hardware-native benchmark. Circuit cutting increases the variance of measured observables. We can keep the variance under control in both the LO and LOCC schemes as indicated by the statistical tests on the witnesses. An in-depth discussion of the measured variance is found in the Supplementary Information.

The variance increase from the QPD is why research now focuses on reducing the sampling overhead. It was recently shown that cutting multiple two-qubit gates in parallel results in optimal LO QPDs with the same sampling overhead as LOCC but requires an additional ancilla qubit and possibly reset.
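The link between a QPD and its sampling overhead can be summarized with the generic relations below. They are a sketch of the standard quasi-probability sampling argument, not a derivation taken from this paper. The symbols $\mathcal{E}$, $\mathcal{E}_i$, $c_i$, $\gamma$, $p_i$ and $\epsilon$ are notation introduced here, and the specific values of $\gamma$ quoted in the comments are recalled from the circuit-knitting literature and should be checked against ref. 17.

```latex
% Generic quasi-probability decomposition (QPD) of a non-local operation E
% into locally implementable operations E_i with real coefficients c_i.
\begin{align}
  \mathcal{E} &= \sum_i c_i\, \mathcal{E}_i, \qquad
  \gamma = \sum_i |c_i| \ge 1, \\
  \langle O \rangle &= \gamma \sum_i p_i\, \operatorname{sign}(c_i)\, \langle O \rangle_i,
  \qquad p_i = \frac{|c_i|}{\gamma}, \\
  N_{\text{shots}} &\propto \frac{\gamma^{2}}{\epsilon^{2}}
  \quad \text{for a target precision } \epsilon .
\end{align}
% Assumed values (to be verified against ref. 17):
%   one CNOT cut with LO:                 gamma = 3,           overhead gamma^2 = 9
%   n gates cut independently with LO:    gamma = 3^n
%   n Bell pairs cut jointly for LOCC:    gamma = 2^{n+1} - 1  (n = 2: 7, overhead 49)
```

Under these assumed values, two cuts cost a factor of 49 in shots when the Bell pairs are cut jointly, compared with 81 when two gates are cut independently with LO.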
In LOCC, the QPD is required only to cut the Bell pairs. This costly QPD could be removed, that is, no shot overhead, by distributing entanglement across multiple chips. In the near to medium term, this could be done by operating gates in the microwave regime over conventional cables or, in the long term, with an optical-to-microwave transduction. Entanglement distribution is typically noisy and may result in non-maximally entangled states. However, gate teleportation requires a maximally entangled resource. Nevertheless, non-maximally entangled states could lower the sampling cost of the QPD and multiple copies of non-maximally entangled states could be distilled into a pure state for teleportation either during the execution of a quantum circuit or possibly during the delays between consecutive shots, which may be as large as 250 μs for resets. Combined with these settings, our error-mitigated and suppressed dynamic circuits would enable a modular quantum computing architecture without the sampling overhead of circuit cutting.

In an application setting, circuit cutting could benefit Hamiltonian simulation. Here, the cost of circuit cutting is exponential in the strength of the cut bonds times the evolution time. This cost may thus be reasonable for weak bonds and/or short evolution times. Furthermore, the LO scheme presented in ref. 42 requires ancilla qubits in a Hadamard test, which would require a reset through a dynamic circuit if the same bond is cut multiple times in a Trotterized time evolution.

Circuit cutting can be applied to both wires and gates. The resulting quantum circuits have a similar structure, making our approach applicable to both cases. Our real-time classical link implements long-range gates and classically couples disjoint quantum processors. The cut Bell pairs that we present have value beyond our work.
For example, these pairs are directly usable to cut circuits in measurement-based quantum computing, which relies on dynamic circuits