How many T gates are needed to approximate an arbitrary $n$-qubit quantum state to within error $\varepsilon$? Improving prior work of Low, Kliuchnikov, and Schaeffer, we show that the optimal asymptotic scaling is $Î\left(\sqrt{2^n\log(1/\varepsilon)}+\log(1/\varepsilon)\right)$ if we allow ancilla qubits. We also show that this is the optimal T-count for implementing an arbitrary diagonal $n$-qubit unitary to within error $\varepsilon$. We describe applications in which a tensor product of many single-qubit unitaries can be synthesized in parallel for the price of one.
This paper investigates a fundamental quantum computing problem: how many T gates are required to approximate an arbitrary n-qubit quantum state to error ε? Building upon improvements to prior work by Low, Kliuchnikov, and Schaeffer, the authors prove that with auxiliary qubits, the optimal asymptotic complexity is Θ(2nlog(1/ε)+log(1/ε)). They simultaneously establish that this is also the optimal T-gate count for implementing arbitrary diagonal n-qubit unitary matrices. The paper further describes applications where tensor products of multiple single-qubit unitary matrices can be synthesized in parallel.
Core issue in fault-tolerant quantum computing: In fault-tolerant quantum computing based on 2D stabilizer codes (such as surface codes), T gates are significantly more costly to implement than Clifford gates. T gates require magic state distillation, while Clifford gates can be implemented transversally.
Quantification of quantum magic: Quantum magic is a crucial metric for measuring quantum computing's advantage over classical computation. Understanding the non-Clifford resources required to implement quantum states and operations is essential for analyzing quantum advantage.
Complexity of classical simulation: Extensions of the Gottesman-Knill theorem show that the cost of classically simulating quantum computation is polynomial in all parameters except the number of T gates.
Single-qubit case: The Ross-Selinger algorithm provides optimal T-gate count O(log(1/ε)) for single-qubit unitary matrices, matching the information-theoretic lower bound.
Challenges for multi-qubit systems: Direct application of single-qubit methods to n-qubit cases yields T-gate count O(2n(n+log(1/ε))).
Room for improvement in LKS method: Low-Kliuchnikov-Schaeffer (2024) improved T-gate count to O(2nnlog(n/ε)+log2(n/ε)), but further optimization is possible.
Optimal quantum state preparation: Proves that the T-gate count for arbitrary n-qubit quantum states has both upper and lower bounds of Θ(2nlog(1/ε)+log(1/ε))
Optimal diagonal unitary synthesis: Establishes the same optimal T-gate count for implementing diagonal unitary matrices
Batch single-qubit unitary synthesis: For m=O(loglog(1/ε)) distinct single-qubit unitary matrices, T-gate count is O(log(1/ε))
Batch production of single-qubit unitaries: For m copies of a single-qubit unitary matrix U, T-gate count is O(m+log(1/ε))
Strengthened lower bound proof: The lower bound holds in the adaptive Clifford+T circuit model, which is stronger than the model used for the upper bound
Quantum state preparation task: Given an n-qubit target state ∣ψ⟩ and error parameter ε, design a Clifford+T circuit U such that U∣0n⟩∣0a⟩=∣ψ~⟩∣0a⟩, where ∥∣ψ⟩−∣ψ~⟩∥≤ε.
Diagonal unitary synthesis task: Given an n-qubit diagonal unitary matrix D and error ε, design a Clifford+T circuit that approximately implements D.
Using Khintchine's inequality, prove the existence of Boolean phase oracles B1,B2 such that ∣ϕ⟩=B2H⊗nB1H⊗n∣0n⟩ has constant overlap ≥1/2 with target state ∣ψ⟩
Stage two: Error reduction (Lemma 3.4)
Iteratively apply coarse approximation to the difference state ∣ψ⟩−∣ϕ⟩
Construct series expansion: ∣ψ⟩≈ζ⋅∑k=0O(log(1/ε))2−k/2∣ψk⟩
Implement using linear combination of unitaries (LCU) and exact amplitude amplification
Avoiding Grover-Rudolph overhead: Traditional methods require n multi-controlled single-qubit unitaries; this work requires only O(1) diagonal unitaries
Optimal diagonal unitary synthesis: Innovatively decomposes multi-qubit diagonal unitaries into single-qubit problems and Boolean oracle problems
Exact amplitude amplification: Cleverly selects amplitude sin(π/10) so that two rounds of amplitude amplification exactly yield the target state