Multi-Agent Case Study

“What resources would we need to build a system that lets UNSW researchers build their own AI tools while preserving data sovereignty?”

A real question from Dr Sue Keay, Director of Artificial Intelligence, UNSW Sydney.

Not a demo prompt. Three collaborating AI agents researched it, wrote an academic working paper, and quality-gated the result, autonomously.

📄 Academic working paper, published · 🔗 32 verified references · 0 fabricated claims
The cast
Three specialist AIs, three jobs

An orchestrator with two specialists: a builder and an independent critic. Different models, deliberately. None of them marks its own work done.

🧭

METIS

The Orchestrator
Claude Opus

Grounds the question, writes the plan, dispatches the work to the two specialists, and integrates what comes back.

Coordinates and gates; builds nothing and judges nothing herself.
🔨

MAKER

The Builder
GPT 5.5

Searches the real world for evidence, grounds every claim, and writes the deliverable section by section.

Reports back with the artifact; never signs off on its own work.
⚖️

JUDGE

The Critic
Claude Sonnet

Grades the plan before a line is written, and the live artifact after. PASS / FAIL, with evidence.

The independent gate: nothing ships without its verdict.
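The separation of duties described above can be sketched as a small data structure. This is a minimal illustration, not the system's actual configuration: the `Agent` class, the `can_build` / `can_judge` flags and the `may_approve` helper are hypothetical names introduced here; only the agent names, roles and models come from the cast list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    role: str
    model: str
    can_build: bool   # assumption: models "writes the deliverable"
    can_judge: bool   # assumption: models "grades the plan and the artifact"


# Names, roles and models are from the cast list; the flags are assumptions.
METIS = Agent("METIS", "orchestrator", "Claude Opus", can_build=False, can_judge=False)
MAKER = Agent("MAKER", "builder", "GPT 5.5", can_build=True, can_judge=False)
JUDGE = Agent("JUDGE", "critic", "Claude Sonnet", can_build=False, can_judge=True)


def may_approve(reviewer: Agent, author: Agent) -> bool:
    """A reviewer may sign off only if it can judge and did not author the work."""
    return reviewer.can_judge and reviewer.name != author.name


print(may_approve(JUDGE, MAKER))  # the critic can gate the builder's work
print(may_approve(MAKER, MAKER))  # the builder cannot sign off its own work
```

Encoding the rule as a check on the reviewer, rather than as a convention, is one way to make "none of them marks its own work done" enforceable.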
How they worked
An audit trail, not a black box

Every step is gated. The orchestrator never declares success; only the independent critic can.

1 Asked · 2 Planned · 3 Critiqued · 4 Built · 5 Published · 6 Checked

Planning gate

METIS → JUDGE
PLAN-CRITIC: here is the plan and definition-of-done. Sound and grounded?
↳ JUDGE: PLAN, PASS
🧭 METIS orchestrates the loop
METIS → MAKER
BUILD: research real sources, ground every claim, write the paper section by section.
↳ MAKER: artifact built and deployed

Quality gate

METIS → JUDGE
QC-CRITIC: grade the live paper against the plan for fabrication, grounding and completeness.
↳ JUDGE: QC, 8 / 9
8/9 · Final QC verdict: 32 references verified, no fabrication
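The gated loop above (plan, plan gate, build, quality gate) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the agents' actual code: `run_gated`, `Verdict` and the stub functions are hypothetical, the publishing step is omitted, and the QC result is reduced to a boolean because the source does not say what pass threshold applies to a score such as 8/9.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Verdict(NamedTuple):
    gate: str       # "PLAN" or "QC"
    passed: bool
    evidence: str


def run_gated(
    question: str,
    plan_fn: Callable[[str], str],
    build_fn: Callable[[str], str],
    critic_fn: Callable[[str, str], Verdict],
) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Run plan -> plan gate -> build -> QC gate, recording an audit trail."""
    trail = [("Asked", question)]

    plan = plan_fn(question)
    trail.append(("Planned", plan))

    plan_verdict = critic_fn("PLAN", plan)
    trail.append(("Critiqued", f"PLAN {'PASS' if plan_verdict.passed else 'FAIL'}"))
    if not plan_verdict.passed:
        return None, trail  # nothing is built on a failed plan

    artifact = build_fn(plan)
    trail.append(("Built", artifact))

    qc_verdict = critic_fn("QC", artifact)
    trail.append(("Checked", f"QC {'PASS' if qc_verdict.passed else 'FAIL'}"))
    # Only the critic's verdict releases the artifact, never the orchestrator.
    return (artifact if qc_verdict.passed else None), trail


# Hypothetical stubs standing in for the three agents.
def stub_plan(question: str) -> str:
    return f"plan for: {question}"


def stub_build(plan: str) -> str:
    return f"paper built from [{plan}]"


def stub_critic(gate: str, work: str) -> Verdict:
    return Verdict(gate, passed=bool(work.strip()), evidence="stub: non-empty")


artifact, trail = run_gated("example question", stub_plan, stub_build, stub_critic)
for step, detail in trail:
    print(step, "-", detail)
```

The returned trail is what makes the run auditable: each gate's outcome is recorded whether or not the artifact is released.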
The run, in numbers
3 AI models · 10 paper sections · 32 verified references · 0 fabricated claims · 100% autonomous
The output
What the agents produced
A Sovereign AI Capability for UNSW Researchers
Working Paper, 2026
1 Introduction · 3 Researcher & AI landscape · 4 Current compute (Katana · Gadi · Pawsey) · 5 Reference architecture · 6 Resourcing · 7 Governance & data sovereignty (CARE · AIATSIS) · 8 Roadmap · 9 Risks · 10 References

The Working Paper

Live · HTTPS · independently QC-passed on content

A full academic-style paper answering Dr Keay's question, covering people, compute, data-sovereignty architecture, governance, funding and timeline, with every material claim grounded in a cited source.
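One narrow part of "every material claim grounded in a cited source" can be checked mechanically: that each claim cites at least one key, and that every cited key appears in the reference list. The sketch below assumes a hypothetical claim format and placeholder data; it does not verify that a reference exists or actually supports the claim, which would still need the critic's judgement.

```python
from typing import Dict, List


def ungrounded_claims(claims: List[Dict], reference_keys: List[str]) -> List[str]:
    """Return claims that cite nothing, or cite a key missing from the reference list."""
    known = set(reference_keys)
    flagged = []
    for claim in claims:
        cites = claim.get("cites", [])
        if not cites or any(key not in known for key in cites):
            flagged.append(claim["text"])
    return flagged


# Placeholder data for illustration only; not taken from the working paper.
references = ["ref-1", "ref-2"]
claims = [
    {"text": "claim A", "cites": ["ref-1"]},
    {"text": "claim B", "cites": []},
    {"text": "claim C", "cites": ["ref-9"]},
]

print(ungrounded_claims(claims, references))  # ['claim B', 'claim C']
```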

Open the Working Paper →