Consilium is a debate arena for hard questions. Ask one, and two of the strongest AI models go at it, one as host, the other as invited expert, until they agree or make clear exactly where they clash. You read along, steer, and can add a third model. What's left is a tested answer, with conclusions you can export and share straight away.
Ask an AI for a hard judgement and you get one answer, full of confidence, but you've no idea whether it holds up or where it wobbles. A model agrees easily, and every model has its own blind spot. For a real decision that's too thin.
Consilium grew out of my own need. I was building a photo pipeline with Gemini and groping in the dark about the right approach. What if Claude and Gemini could work it out together? That's how the idea was born, and it grew into a full opinion panel for my hardest questions.
One perspective
One model, one lens. You see the answer, but not the assumptions under it or the place where it breaks.
No pushback
An AI nods along easily. Without someone arguing the other way, you miss exactly the weak spot.
Models think differently
Claude weighs things differently from Gemini or GPT. Different training, different angles. One voice doesn't make use of that.
Put two of the strongest models against each other, let them fight it out, and what's left is tested from both sides.
Two models each take a position, go back and forth for rounds, and what they hold up together is the tested answer. Where they keep clashing, you read that too.
"Ignoring 30% of your revenue from one client because it doesn't fit the roadmap is a luxury you can't afford right now."
"That's exactly why you should be careful. Exclusive bespoke work fragments your product and deepens a dependency that's already large."
Build it as product value, not as client size. Decide on the feature for everyone, not on this one client.
The crux is in the follow-up question. After each round there's not just an interim standing, but also the question that opens the next layer. That's what makes a debate a deep dive instead of a round of opinions.
Five screens, each with its piece of the story above it: how it works, and what it solves. Consilium is at its best on desktop, so I'm leaving the mobile version out here.
Nuanced and thorough. Strong on trade-offs and long lines of reasoning.
Broad and fast. Strong on synthesis and clear structure.
Multimodal and factual. Strong on current context and large data.
Open weights. Fully open and able to run locally.
Do we build bespoke for our biggest client?
30% of your revenue isn't a detail, it's your right to exist. A strong relationship compounds: they stay, they introduce, they grow with you.
One side steering your product is the start of a consultancy, not a product. 30% with one client isn't strength, it's a single point of failure.
Don't build it as exclusive bespoke work under their deadline. Decide on product value, not on client size, and productise the feature for everyone, on your terms and timing.
Every model is a plug. Claude, GPT, Gemini, Llama, you pick host and expert, and set the number of rounds, ten by default, until they reach consensus or clearly name the difference. After each round an interim standing with a follow-up question, the engine under the deep dive. Conclusions go into the archive, exportable and shareable. Consilium belongs to the same family as Open Brain, the layer beneath everything, but I deliberately don't feed the debate from it: a judgement has to form cleanly, without polluted context. Only writing back to Open Brain is a logical next step.
Every major model is selectable, and it's often precisely the differences that make it interesting. They're clearly trained differently, and those other angles are exactly what opens a question up.
Not dressed up, just proof that it runs. This is Consilium as it stands day to day.

Summary. The tested answer.

New debate. Arena and question.

The debate. Round by round.

Model selector. Who debates.
Born out of my own need, grown into a tool I wouldn't want to do without. For an organisation it's the same: for a heavy choice you don't want any AI that bends along, but pushback, deliberation and an answer that passes the test. You're no longer alone with one model.
Two models force each other sharp. The weak spot surfaces because someone actively argues the other way.
You see not just the conclusion, but also how sharp the consensus is and where the difference stayed.
Pick the models that fit, add a third, set the number of rounds. The panel follows the question.
Answers go neatly into the archive, ready to share and to find again when the question comes back.
Part of the same family as Open Brain, the layer beneath everything, deliberately kept separate to keep the debate clean.
Consilium runs in production. Give me a real question, and we'll put two models against each other so you can see a tested answer take shape live.