Multi-Agent Case Study

“What resources would we need to build a system that lets UNSW researchers build their own AI tools while preserving data sovereignty?”

A real question from Dr Sue Keay — Director of Artificial Intelligence, UNSW Sydney.

Not a demo prompt. Three collaborating AI agents researched it, wrote an academic working paper, and quality-gated the result — autonomously.

📄 Academic working paper · published · 🔗 32 verified references · 0 fabricated claims
The cast
Three specialist AIs, three jobs

An orchestrator with two specialists — a builder and an independent critic. Different models, deliberately. None of them marks its own work done.

🧭

METIS

The Orchestrator
Claude Opus

Grounds the question, writes the plan, dispatches the work to the two specialists, and integrates what comes back.

Coordinates and gates: she builds nothing and judges nothing herself.
🔨

MAKER

The Builder
MiniMax-M3

Searches the real world for evidence, grounds every claim, and writes the deliverable section by section.

Reports back with the artifact — never signs off its own work.
⚖️

JUDGE

The Critic
Claude Sonnet

Grades the plan before a line is written, and the live artifact after. PASS / FAIL, with evidence.

The independent gate — nothing ships without its verdict.
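
The separation of duties described above can be sketched as a small data structure. This is an illustration only, not the system's actual code: the `Agent` class, the capability flags, and the `may_sign_off` rule are hypothetical, and only the agent names, roles, and model names are taken from the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    role: str
    model: str
    can_build: bool   # assumed flag: may produce the deliverable
    can_judge: bool   # assumed flag: may issue a PASS / FAIL verdict


# Names, roles, and models come from the page; the flags are an assumed encoding.
METIS = Agent("METIS", "orchestrator", "Claude Opus", can_build=False, can_judge=False)
MAKER = Agent("MAKER", "builder", "MiniMax-M3", can_build=True, can_judge=False)
JUDGE = Agent("JUDGE", "critic", "Claude Sonnet", can_build=False, can_judge=True)


def may_sign_off(reviewer: Agent, author: Agent) -> bool:
    """Hypothetical rule: a verdict counts only if it comes from a judging
    agent that did not author the work and runs on a different model."""
    return (
        reviewer.can_judge
        and reviewer.name != author.name
        and reviewer.model != author.model
    )
```

Under this encoding, `may_sign_off(JUDGE, MAKER)` and `may_sign_off(JUDGE, METIS)` hold, while neither METIS nor MAKER can approve anything, including their own output.
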
How they worked
An audit trail, not a black box

Every step is gated. The orchestrator never declares success — only the independent critic can.

1 Asked · 2 Planned · 3 Critiqued · 4 Built · 5 Published · 6 Checked

Planning gate

METIS → JUDGE
PLAN-CRITIC: here's the plan + definition-of-done. Sound & grounded?
↳ JUDGE: PLAN — PASS
🧭 METIS orchestrates the loop
METIS → MAKER
BUILD: research real sources, ground every claim, write the paper section by section.
↳ MAKER: artifact built & deployed

Quality gate

METIS → JUDGE
QC-CRITIC: grade the live paper against the plan — fabrication, grounding, completeness.
↳ JUDGE: QC — 8 / 9
8/9 · Final QC verdict · 32 references verified · no fabrication
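
The two gates above can be sketched as a short control loop. This is a minimal sketch under stated assumptions, not the system's implementation: `run_gated`, `Verdict`, and `GateFailed` are hypothetical names, the agents are stood in for by plain callables, publishing is folded into the build step, and stopping on a failed gate is an assumption (the page does not say whether a failed gate triggers a retry).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    passed: bool
    evidence: str


class GateFailed(RuntimeError):
    """Raised when the critic rejects the plan or the artifact."""


def _label(v: Verdict) -> str:
    return "PASS" if v.passed else "FAIL"


def run_gated(
    question: str,
    plan: Callable[[str], str],
    critique_plan: Callable[[str], Verdict],
    build: Callable[[str], str],
    critique_artifact: Callable[[str, str], Verdict],
) -> Tuple[str, List[str]]:
    """Plan, gate the plan, build, gate the artifact; return artifact and audit trail."""
    trail = [f"ASKED: {question}"]

    the_plan = plan(question)
    trail.append("PLANNED")

    # Planning gate: nothing is built until the critic passes the plan.
    plan_verdict = critique_plan(the_plan)
    trail.append(f"PLAN-CRITIC: {_label(plan_verdict)}")
    if not plan_verdict.passed:
        raise GateFailed(plan_verdict.evidence)

    artifact = build(the_plan)
    trail.append("BUILT")

    # Quality gate: the critic grades the artifact against the plan.
    qc_verdict = critique_artifact(the_plan, artifact)
    trail.append(f"QC-CRITIC: {_label(qc_verdict)}")
    if not qc_verdict.passed:
        raise GateFailed(qc_verdict.evidence)

    return artifact, trail
```

The orchestrating function never sets a pass flag itself; success is reachable only through the two `Verdict` objects returned by the critic callables, which mirrors the rule that only the independent critic can declare the work done.
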
The run, in numbers
3 AI models
10 paper sections
32 verified references
0 fabricated claims
100% autonomous
The output
What the agents produced
A Sovereign AI Capability for UNSW Researchers
Working Paper · 2026
1 Introduction · 3 Researcher & AI landscape · 4 Current compute (Katana · Gadi · Pawsey) · 5 Reference architecture · 6 Resourcing · 7 Governance & data sovereignty (CARE · AIATSIS) · 8 Roadmap · 9 Risks · 10 References

The Working Paper

Live · HTTPS · independently QC-passed on content

A full academic-style paper answering Dr Keay's question — people, compute, data-sovereignty architecture, governance, funding and timeline — with every material claim grounded in a cited source.
