Quantum Many-Body Harness

From first clone to your own discovery

A problem-solving harness for quantum many-body lattice problems (ground-state and finite-temperature), plus the scientific-research skills around it. You drive it in plain language; it routes to the right method, runs the calculation, verifies the result, and hands you a script, plot, or document you can rerun. The transcripts, plots, and reports below are actual output from captured runs, not mockups; the one exception is the brainstorming dialogue, which is marked as illustrative.

Early stage — contributors welcome; formal release planned for August 2026.

Exactly-solvable models: 63 oracle cards (exact spectra, energies, and gaps to check any numerical run against)

Widely-used models: 29 model cards (A1–D16 properties, phases, observables, and the methods that work on each)

Computational methods: 36 method cards (accuracy class, M1–M14 properties, cost scaling, and benchmarks)

① Get started

Three steps from nothing to a guided training session. No prior experience with agents, Julia, or many-body methods assumed.

Install an agent CLI — Claude Code, Codex CLI, or OpenCode. Don't have one? Follow the summer-school setup guide.

Clone the harness and install its skills.

git clone https://github.com/QuantumBFS/quantum.harness
cd quantum.harness
make skills

Start the guided training. Paste into your agent:
```
/beginner-training
```
It asks which track you want, then walks you through it one confirmed step at a time: every command is explained before it runs, and it never auto-solves for you.

② Your training path

Five tracks, recommended in order — but each stands alone, so start anywhere. Every track ends with a checkpoint that proves the result is trustworthy instead of just claiming it.

1 · Setup check ~0.5 h — verify skills, Python, Julia, and the GitHub CLI before you depend on them ✓ a deliberate failure that must fail

2 · Reproduce a paper ~2–4 h — pick a method track, regenerate its published figure at laptop size ✓ signature match + wrong-input control

3 · Conduct a literature survey ~1–2 h — build a real reference library with /download-ref and /survey ✓ bogus arXiv ID rejected; index counts match

4 · Develop code like an expert ~3–5 h — take a starter issue to a reviewed PR: brainstorm → plan → TDD → review ✓ the PR + its attached self-review

5 · Go beyond capstone — extend the paper with your own time-boxed challenge and ship the report ✓ every number traces to a real run

③ What people use it for

Three journeys, each shown as a real captured run — open a card for the full annotated transcript and the actual plot.

Reproduce & challenge: the summer-school journey (Harnessing Quantum 2026)

The summer-school journey. You arrive on a method track, reproduce its reference result to calibrate yourself and the tooling, then invent a feasible challenge that goes beyond the paper and ship it. The harness guides each phase and teaches as it goes — it never auto-solves the challenge for you.

/onboard (set up the stack) → /beginner-training (guided training: pick a track) → /reproduce-paper (reproduce the reference) → /challenge (go beyond it) → /challenge-report (clean PR + report)

Harnessing Quantum — Exact Diagonalization track

    > /beginner-training

    # Track 2, Reproduce a paper: the harness lists the method tracks
    # and their reference papers

    Track   Title                   Reference target
    ed      Exact Diagonalization   Turner et al., "Quantum many-body scars," Nat. Phys. 14, 745 (2018)

    > the ED track, reproduce Fig. 3

    Before any compute, confirming the setup:

    Model         PXP / "Fibonacci" chain (Rydberg blockade)
    Hamiltonian   H = Σ Pᵢ₋₁ Xᵢ Pᵢ₊₁
    Boundary      periodic; sector k=0, inversion-even
    Observable    E vs log₂|⟨Z₂|ψ⟩|² (the scar band)
    Sizes         L = 12, 16, 20 (beginner tier; paper uses L=32)

    Look right?

    > yes, run it

    # /reproduce-paper builds the constrained sector, dense-diagonalizes,
    # and checks every eigenpair residual
    Running L=12: constrained basis → k=0, I=+ sector → dense ED
    L=12 done: full D=322, sector D=26, max residual 2.2e-14
    L=16 done: full D=2207, sector D=99, max residual 2.5e-14
    L=20 done: full D=15127, sector D=455, max residual 4.4e-14
    Figure: figs/fig3a.png
    Completed in 3.29 seconds.

    Reproduced. The Z₂ scar band (red, FSA tower) sits cleanly above the thermal bulk: the paper's signature.
    self-checked: residuals ~1e-14, correct sector dimensions.

    > /challenge: what could I try beyond this?

    Ranked, time-boxed candidates (you choose, then take it to the help desk for go/no-go):
    1. Push L=24→28 on the cluster; track how the scar overlap scales with system size. (~hours, cluster)
    2. Add a chemical-potential detuning term and watch the scar band detach / reconnect. (~half day, local)
    3. Compare exact scars vs the FSA tower error vs L. (~hours, local)

    … you pick one, the mentor gives graded hints, then /challenge-report gates a clean PR under tracks/ed/.

Figure: PXP quantum-scar overlap scatter for L=12, 16, 20.

**Captured result.** Eigenstate energy vs overlap with |Z₂⟩ for the PXP chain. Gray = all eigenstates (thermal bulk); red ✕ = the forward-scattering-approximation (FSA) tower, i.e. the special scar states with anomalously large |Z₂⟩ overlap. The scar band sharpens as L grows (L=12 → 16 → 20). This is the Turner et al. (2018) Fig. 3 signature, reproduced in 3.3 s locally.
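The sector dimensions printed in the transcript can be cross-checked without diagonalizing anything. The following is a minimal sketch, not the harness's own code; the function names are hypothetical. It assumes the usual bitstring picture of the PXP chain (1 = excited atom, no two adjacent 1s on the periodic ring) and uses the fact that the k=0, inversion-even sector is the trivial representation of the translation-plus-reflection group, so its dimension equals the number of symmetry orbits of allowed bitstrings.

```python
def rotate(s, L):
    """Translate the ring by one site (cyclic left shift of an L-bit integer)."""
    mask = (1 << L) - 1
    return ((s << 1) | (s >> (L - 1))) & mask


def reflect(s, L):
    """Spatial inversion: reverse the order of the L bits."""
    return int(format(s, f"0{L}b")[::-1], 2)


def allowed_states(L):
    """Bitstrings on a periodic ring with no two adjacent 1s (Rydberg blockade)."""
    return [s for s in range(1 << L) if s & rotate(s, L) == 0]


def count_orbits(states, L):
    """Number of orbits under all translations and the reflection."""
    representatives = set()
    for s in states:
        best = s
        for t in (s, reflect(s, L)):
            for _ in range(L):  # L shifts cover every translation, identity included
                t = rotate(t, L)
                best = min(best, t)
        representatives.add(best)
    return len(representatives)


dims = {}
for L in (12, 16, 20):
    states = allowed_states(L)
    dims[L] = (len(states), count_orbits(states, L))
    print(f"L={L}: full D={dims[L][0]}, sector D={dims[L][1]}")
```

The printed pairs should agree with the transcript's 322/26, 2207/99, and 15127/455. The brute-force enumeration is only practical at these beginner-tier sizes.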

Takeaway. Reproduction is your calibration target: it confirms the setup, the tool, and your understanding before you spend effort going beyond. Here a faithful scar band landed in 3.3 s with residuals of ~1e-14 — solid ground to launch a challenge from.
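For readers new to the term, an eigenpair residual is the norm of H v − E v: it is zero for an exact eigenpair and of order machine precision for a well-converged numerical one. The toy sketch below illustrates the quantity on a 2×2 matrix with a known answer, plus a deliberately wrong input as a control. It is an illustration only; the `residual` helper is hypothetical and is not how the harness computes its check.

```python
import math


def residual(h, energy, vec):
    """Euclidean norm of H v - E v for a dense real matrix given as nested lists."""
    hv = [sum(a * b for a, b in zip(row, vec)) for row in h]
    return math.sqrt(sum((x - energy * y) ** 2 for x, y in zip(hv, vec)))


# Toy Hamiltonian: a single Pauli X, whose eigenpairs are E = +1 and E = -1.
h = [[0.0, 1.0], [1.0, 0.0]]
amp = 1.0 / math.sqrt(2.0)

good = residual(h, 1.0, [amp, amp])  # true eigenpair
bad = residual(h, 0.9, [amp, amp])   # wrong-input control: energy off by 0.1

print(f"true eigenpair: {good:.1e}   wrong energy: {bad:.1e}")
```

A residual near 1e-14, as in the captured run, sits with the first case; an error in the energy or the vector shows up at the size of the error itself.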

Bring your own problem: solve a model end to end (Computational methods)

The everyday path. State a concrete model in plain language; the harness infers the defaults, recommends a method, runs it, and verifies against a known limit. When the system gets too big for your laptop, the same run ships to the cluster unchanged. Heavy compute is a branch inside this workflow, not a separate one.

/solve (state the problem) → /method-mps (pick the method) → /using-itensors (pick the tool) → /using-slurm (scale up when heavy) → /report (HTML writeup)

J₁–J₂ Heisenberg chain — ground state

    > ground state of the J1-J2 Heisenberg chain at J2/J1 = 0.5

    Confirming before compute:
    H = J₁ Σ Sᵢ·Sᵢ₊₁ + J₂ Σ Sᵢ·Sᵢ₊₂, S=½
    J₁=1, J₂=0.5 → the Majumdar–Ghosh point
    N=100, open boundary, Sz=0 sector
    Method: DMRG (matrix-product states) via ITensors

    J₂/J₁=0.5 is exactly solvable: the ground state is the dimer product state with E₀/N = −3/8. That's a built-in check.

    > go

    # /method-mps → /using-itensors generates and runs the script,
    # ramping the bond dimension χ and printing every sweep
    sweep  1  χ= 20  E/N = −0.37484171  err = 1.6e-04
    sweep  2  χ= 20  E/N = −0.37499978  err = 2.2e-07
    sweep  3  χ= 20  E/N = −0.37500000  err = 7.3e-10
    sweep  4  χ= 20  E/N = −0.37500000  err = 4.0e-14
    sweep  8  χ= 40  E/N = −0.37500000  err = 1.7e-16
    sweep 16  χ=160  E/N = −0.37500000  err = 5.6e-16
    Final: E₀/N = −0.37500000  |error| = 5.55e-16  (26.6 s)

    E₀/N = −0.375000, converged. Verified against the Majumdar–Ghosh exact value to machine precision (5.6e-16).
    Script: scripts/j1j2_heisenberg_dmrg.jl

    > now I want N=400 and a J2 scan, too slow here

    Above the laptop budget → ship to the cluster:
    /using-slurm      stage script + sbatch to the active profile
    /parameter-scan   sweep J₂/J₁ ∈ [0, 1] as an array job
    one manifest per cell fetched back; nothing re-typed by hand

    … then /report renders the run into a self-contained HTML page.

Figure: DMRG energy convergence vs sweep for the J1-J2 chain.

**Captured result.** DMRG energy error `E − E_exact` vs sweep (log scale). At bond dimension χ=20 the error already drops below 1e-13 in four sweeps; raising χ to 40/80/160 (dashed lines) holds it at the machine-precision floor. The flat floor is the convergence proof: the calculation has nothing left to gain. Final E₀/N = −0.375000, matching the Majumdar–Ghosh exact value.
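The exact value used as the check can itself be verified on a small chain. The sketch below is an independent illustration with hypothetical helper names, not part of the harness. It builds the unnormalized dimer product state on an open chain of N sites (singlets on sites (0,1), (2,3), ...), applies the J₁–J₂ Hamiltonian using S_i·S_j = ½·swap − ¼, and confirms in exact rational arithmetic that the state is an eigenstate with energy −3N/8 at J₂/J₁ = ½. It only shows that the dimer state is an eigenstate; that it is the lowest one is the standard Majumdar–Ghosh result quoted in the transcript.

```python
from fractions import Fraction
from itertools import product


def dimer_state(n):
    """Unnormalized product of singlets on (0,1), (2,3), ... as {bitstring: amplitude}."""
    psi = {}
    for choice in product((0, 1), repeat=n // 2):
        state, sign = 0, 1
        for d, c in enumerate(choice):
            # each singlet is |up,down> - |down,up>; a set bit marks a down spin
            if c == 0:
                state |= 1 << (2 * d + 1)
            else:
                state |= 1 << (2 * d)
                sign = -sign
        psi[state] = Fraction(sign)
    return psi


def apply_heisenberg(psi, n, j1, j2):
    """Apply H = j1 * sum S_i.S_{i+1} + j2 * sum S_i.S_{i+2} on an open chain."""
    bonds = [(i, i + 1, j1) for i in range(n - 1)]
    bonds += [(i, i + 2, j2) for i in range(n - 2)]
    out = {}
    for s, amp in psi.items():
        for i, j, c in bonds:
            if ((s >> i) ^ (s >> j)) & 1:  # antiparallel spins: -1/4 diagonal, 1/2 swap
                flipped = s ^ ((1 << i) | (1 << j))
                out[s] = out.get(s, 0) - c * amp / 4
                out[flipped] = out.get(flipped, 0) + c * amp / 2
            else:  # parallel spins: +1/4 diagonal
                out[s] = out.get(s, 0) + c * amp / 4
    return out


def is_dimer_eigenstate(n, j1, j2):
    """True if H|dimer> equals (-3n/8) * j1 * |dimer> exactly."""
    psi = dimer_state(n)
    h_psi = apply_heisenberg(psi, n, j1, j2)
    energy = Fraction(-3 * n, 8) * j1
    return all(h_psi.get(s, 0) == energy * psi.get(s, 0) for s in set(psi) | set(h_psi))


for n in (4, 8, 12):
    print(n, is_dimer_eigenstate(n, Fraction(1), Fraction(1, 2)))
print("control, J2 = 0:", is_dimer_eigenstate(8, Fraction(1), Fraction(0)))
```

The control at J₂ = 0 is expected to fail, since the dimer state is not an eigenstate of the plain nearest-neighbor chain.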

Takeaway. You never picked a bond dimension, a sector, or a sweep schedule — you stated the physics and the harness chose them, then proved the answer against a known limit (|error| = 5.6e-16). The cluster is the same workflow with a bigger N, not a detour.
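To make the array-job idea concrete, here is a minimal sketch of how one task of a scan could pick its own J₂/J₁ value. Only the `SLURM_ARRAY_TASK_ID` environment variable is real Slurm behavior; the 11-point grid, the `scan_value` helper, and the local fallback are assumptions for illustration and do not describe what /parameter-scan actually generates.

```python
import os


def scan_value(task_id, n_points=11, lo=0.0, hi=1.0):
    """Map an array-task index 0..n_points-1 to an evenly spaced J2/J1 value."""
    if not 0 <= task_id < n_points:
        raise ValueError(f"task id {task_id} outside 0..{n_points - 1}")
    return lo + (hi - lo) * task_id / (n_points - 1)


# Slurm sets SLURM_ARRAY_TASK_ID inside each array task.
# The fallback "5" (an assumption) lets the script run outside the scheduler.
task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "5"))
print(f"task {task_id}: J2/J1 = {scan_value(task_id):.2f}")
```

Under these assumptions the job would be submitted with `sbatch --array=0-10`, so that each grid cell runs as its own task and writes its own output.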

Survey & brainstorm: the sci-brain research skills

The harness doesn't stop at numbers. The bundled sci-brain skills cover the research work around a calculation: mapping a field and writing it up, and brainstorming new directions with AI agents. Two flows below — a survey you can hand to a team, and an ideas report you can take to a help desk or advisor.

Survey writing

Map a research area from scratch: parallel search strategies build a literature knowledge base with BibTeX, the references are fetched and rendered to full text, then a structured state-of-the-art review is drafted section by section.

/survey (explore + build KB) → /download-ref (fetch + render PDFs) → /survey-writer (draft the review)

Survey — open-source many-body software

    > /survey: open-source software for quantum many-body computation

    Which search strategies should I run in parallel?
    ✓ Landscape mapping     broad field overview
    ✓ Cross-method          DMRG vs QMC vs VMC vs DMFT stacks
    ✓ Benchmarks/datasets   shared model/result infrastructure

    # one search worker per strategy, in parallel → findings reports
    Consolidated 3 reports, deduplicated by DOI/arXiv:
    • Tensor networks: ITensor, TeNPy, MPSKit, PEPSKit …
    • ED: QuSpin, XDiag, ALPS …
    • QMC / VMC / DMFT: ALF, mVMC, NetKet, TRIQS, w2dynamics …

    Which directions go into the knowledge base?

    > all of them

    # BibTeX verified against authoritative sources, never from memory
    Knowledge base: 53 references, 24 papers rendered to full text
    → .knowledge/literature/software/ ref.bib

    > /survey-writer: write it up, organized by method

    # drafts section by section, you approve each before the next
    § What and Why
    § Technical approaches (tensor networks · ED · QMC · VMC/NQS · DMFT)
    § At-a-glance comparison
    § Open problems
    Saved: articles/2026-06-20-many-body-software-review.pdf (5 pages)

    State-of-the-art review drafted from 53 grounded references, organized by technical approach with per-method trade-offs.

Figure: first page of the generated many-body software review.

**Captured result.** Page 1 of the generated review, `articles/2026-06-20-many-body-software-review.pdf`: a real 5-page state-of-the-art report built from a 53-reference knowledge base (24 papers rendered to full text). Organized by technical approach (tensor networks, ED, QMC, VMC/NQS, DMFT), each with its own state of the art and trade-offs.

Takeaway. From a one-line topic to a cited, team-ready review — the search, the BibTeX, and the full-text rendering are all real artifacts under .knowledge/ and articles/, not a chat summary.
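The "deduplicated by DOI/arXiv" step in the transcript can be pictured with a short sketch. This is one plausible way to do it, not the /survey implementation: the record fields, function names, and the identifiers in the sample data are all made-up placeholders. It assumes DOIs compare case-insensitively and that arXiv version suffixes (v1, v2, ...) refer to the same paper. The syntax check covers only new-style arXiv identifiers (YYMM.NNNN or YYMM.NNNNN), so old-style ones would be rejected by this sketch, and passing the check does not confirm that a paper exists.

```python
import re

# New-style arXiv identifier: two-digit year, month 01-12, 4 or 5 digits, optional version.
ARXIV_NEW = re.compile(r"(\d{2})(0[1-9]|1[0-2])\.(\d{4,5})(v\d+)?")


def normalize_arxiv(raw):
    """Return the version-free identifier, or raise ValueError on bad syntax."""
    match = ARXIV_NEW.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not a new-style arXiv identifier: {raw!r}")
    return f"{match.group(1)}{match.group(2)}.{match.group(3)}"


def keys_for(ref):
    """All normalized identifiers carried by one reference record."""
    keys = set()
    if ref.get("doi"):
        keys.add("doi:" + ref["doi"].strip().lower())
    if ref.get("arxiv"):
        keys.add("arxiv:" + normalize_arxiv(ref["arxiv"]))
    if not keys:
        raise ValueError("reference has neither DOI nor arXiv identifier")
    return keys


def deduplicate(refs):
    """Keep the first record for each paper; later records sharing any key are dropped."""
    seen, unique = set(), []
    for ref in refs:
        keys = keys_for(ref)
        if not keys & seen:
            unique.append(ref)
        seen |= keys  # remember every alias, including those of dropped duplicates
    return unique


# Placeholder records (identifiers are invented for the example).
reports = [
    {"title": "Paper A", "doi": "10.1234/Example.One"},
    {"title": "Paper A (preprint)", "doi": "10.1234/example.one", "arxiv": "2401.00001v2"},
    {"title": "Paper A (again)", "arxiv": "2401.00001"},
    {"title": "Paper B", "arxiv": "2401.00002v1"},
]
unique = deduplicate(reports)
print([ref["title"] for ref in unique])
```

Collecting every alias of a paper is the design choice that lets the third record be recognized as a duplicate even though it shares no identifier with the first one directly.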

Brainstorming ideas with AI agents

A research collaborator that thinks with you. A warm mentor drives the conversation while a separate critic agent stress-tests every idea — the deal is "you think, I fetch": the agents surface facts, references, and cross-field connections; you do the deep reasoning. The converged direction becomes a structured ideas report.

Ideator (mentor): Proposes directions, surfaces a related fact from a neighboring field, encourages deeper thinking.

Polya (critic): Asks "what is the plan? can you solve a part of it?" and pins down the minimum viable experiment and the failure signals.

/brainstorm-ideas (two-agent dialogue) → /idea-writer (structured proposal)

Brainstorm — neural states vs DMRG on frustrated magnets

    > /brainstorm-ideas: can neural quantum states beat DMRG on the 2D J1-J2 model near maximal frustration?

    Ideator: Ambitious, I like it. The good news: this is exactly where DMRG struggles (2D area law fights you). Oh, this reminds me: in the survey we just built, NetKet's recent results on J1-J2 are the closest prior art. Worth a look.

    Polya: Before we dream: what is the plan, concretely? And can you solve a *part* of it first, at a size where ED still gives ground truth?

    > start at a 6×6 lattice where I can cross-check against ED

    Ideator: Good instinct. I can't reason through the expressivity question myself, so use your evolving brain here: is a shallow CNN ansatz enough near J2/J1≈0.5? Tell me what you'd need to know and I'll fetch it.

    Polya: Define the kill-criterion now, while it's cheap.

    > ok. /idea-writer: turn this into a proposal

    # writes Research Question · Novelty · MVP · success/hope/pivot
    # signals → articles/…-ideas-report.md
    Saved: articles/2026-06-20-nqs-frustration-ideas-report.md

Neural quantum states near maximal frustration

ideas report · generated by /idea-writer

Research question: Can a CNN-based neural quantum state reach DMRG-competitive energies for the 2D J₁–J₂ model at J₂/J₁ ≈ 0.5?
Novelty: Targets the exact regime where 2D DMRG area-law cost explodes.
Why now / why you: NetKet + JAX make the ansatz a few lines; you have the ED cross-check.
Cross-field link: Expressivity bounds borrowed from ML approximation theory.
Min. viable exp.: 6×6 lattice vs ED ground truth before scaling up.
Success signal: Energy within ED error bars at 6×6, then beats DMRG χ-limit at 10×10.
Hope signal: Not yet at ED accuracy, but error falls steadily with width.
Pivot signal: Variance plateaus far above ED — change ansatz or abandon.

Representative artifact. The structure is exactly what /idea-writer emits — research question, novelty, minimum viable experiment, and explicit success / hope / pivot signals. (The survey above is a captured run; this brainstorming exchange is illustrative — the dialogue is interactive by nature.)

Takeaway. Two agents with distinct jobs — a mentor that opens directions and a critic that forces a minimum viable experiment and a kill-criterion — leave you with a proposal that already knows how it could fail. You stay the thinker; the agents fetch and challenge.

Where to start

New here? Run /beginner-training and pick a track. Already know what you want? Name a model to solve it directly, or say /onboard to set up the stack first. Checking a result? Browse the exactly-solvable models — 63 oracle cards with exact benchmark values.

Methods & the experts behind them

Each method's guidance is curated by a world-leading expert in that computational approach — what to compute, which checks matter, where the common failures hide.

Method	Expert contributor	Skill
Exact diagonalization	Chen Cheng (程晨)	`/method-ed`
MPS / LTRG / DMRG / TEBD	Wei Li (李伟)	`/method-mps` · `/method-ltrg`
PEPS / CTMRG	Hai-Jun Liao (廖海军)	`/method-peps`
Quantum Monte Carlo	Ming-Pu Qin (秦明普)	`/method-qmc`
Monte Carlo renormalization group	Yan-Tao Wu (武琰涛)	`/method-mcrg`
Quantum circuit simulation	Shi-Xin Zhang (张士欣)	`/method-qcs`
AI agent & knowledge base	Kun Chen (陈锟), Jin-Guo Liu (刘金国)	`tracks/agent-kb`

What every run leaves behind

a rerunnable script under scripts/ — e.g. julia --project=julia-env scripts/j1j2_heisenberg_dmrg.jl
data + a convergence/stability plot under results/ — the visual proof the result is trustworthy
a short run report: setup, settings, result, verification status, residual uncertainty

The two compute transcripts and their plots on this page were captured from real harness runs: the PXP scar reproduction (3.3 s, exact diagonalization) and the J₁–J₂ DMRG run (26.6 s, ITensors), both on a single laptop.