Consilium · What's running

01 · the problem

One model gives you one opinion, not the test.

Ask an AI for a hard judgement and you get one answer, full of confidence, but you've no idea whether it holds up or where it wobbles. A model agrees easily, and every model has its own blind spot. For a real decision that's too thin.

Consilium grew out of my own need. I was building a photo pipeline with Gemini and groping in the dark about the right approach. What if Claude and Gemini could work it out together? That's how the idea was born, and it grew into a full opinion panel for my hardest questions.

One perspective

One model, one lens. You see the answer, but not the assumptions under it or the place where it breaks.

No pushback

An AI nods along easily. Without someone arguing the other way, the weak spot is exactly what you miss.

Models think differently

Claude weighs things differently from Gemini or GPT. Different training, different angles. One voice doesn't make use of that.

Put two of the strongest models against each other, let them fight it out, and what's left is tested from both sides.

02 · the debate

Not one opinion. An answer that survives the friction.

Two models each take a position and go back and forth over several rounds. What both still stand behind at the end is the tested answer. Where they keep clashing, you read that too.

Claude · host

"Ignoring 30% of your revenue from one client because it doesn't fit the roadmap is a luxury you can't afford right now."

GPT · expert

"That's exactly why you should be careful. Exclusive bespoke work fragments your product and deepens a dependency that's already large."

Consensus · 3 rounds

Build it as product value, not as client size. Decide on the feature for everyone, not on this one client.

The crux is in the follow-up question. After each round you get not just an interim result, but also the question that opens the next layer. That's what makes a debate a deep dive instead of a round of opinions.
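
The round structure described above can be sketched in a few lines of Python. This is a minimal illustration, not Consilium's actual code: `Debater`, `Moderator`, `Round`, `run_debate` and the two stub functions are all hypothetical names, and the stubs stand in for real model calls so the sketch runs without any API key.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical shapes, not Consilium's real interfaces:
# a debater gets the current question plus the transcript and returns its argument;
# a moderator gets the same and returns the follow-up question for the next round.
Debater = Callable[[str, list[str]], str]
Moderator = Callable[[str, list[str]], str]


@dataclass
class Round:
    question: str               # the question this round argued about
    statements: dict[str, str]  # debater name -> what it said
    follow_up: str              # the question that opens the next layer


def run_debate(question: str, debaters: dict[str, Debater],
               moderator: Moderator, rounds: int = 3) -> list[Round]:
    """Run a fixed number of rounds; each round ends with a follow-up question."""
    transcript: list[str] = []
    history: list[Round] = []
    current = question
    for _ in range(rounds):
        statements: dict[str, str] = {}
        for name, debater in debaters.items():
            reply = debater(current, transcript)
            statements[name] = reply
            transcript.append(f"{name}: {reply}")
        follow_up = moderator(current, transcript)
        history.append(Round(current, statements, follow_up))
        current = follow_up  # the next round digs into the follow-up
    return history


def stub_debater(position: str) -> Debater:
    """Stand-in for a model call so the sketch runs offline."""
    def respond(question: str, transcript: list[str]) -> str:
        return f"({position}) my take on: {question}"
    return respond


def stub_moderator(question: str, transcript: list[str]) -> str:
    """Stand-in that derives a follow-up from how far the debate has come."""
    return (f"Follow-up after {len(transcript)} statements: "
            "which assumption drives the disagreement?")
```

Calling `run_debate("Do we build bespoke?", {"Claude": stub_debater("build"), "GPT": stub_debater("hold focus")}, stub_moderator)` returns three `Round` objects, where each round's question is the previous round's follow-up. That chaining is the point of the design: the debate goes deeper instead of repeating itself.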

03 · up close

The application itself, in production.

Five screens, each with its piece of the story above it: how it works, and what it solves. Consilium is at its best on desktop, so I'm leaving the mobile version out here.

01 · New debate

Pick an arena, ask your question.

A debate starts with focus. Pick the arena, from product and strategy to society, and frame what you really want to know. Optionally, pass in your current best answer for the models to pick apart.

Key takeaway

A good question in the right context yields a sharper debate.

Consilium — New debate

Start a new debate · arena

General

free exploration, technology

Product & Strategy

building, positioning, go-to-market

Technology & AI

models, architecture, ethics

Society

policy, ethics, the long term

What do you want to discuss?

Our biggest client, worth 30% of revenue, is asking for an exclusive bespoke feature that doesn't fit the roadmap.

Start

02 · Model selector

Who do you put against each other?

Fully modular. Pick two flagship models, each with its own strengths, and add a third if you want. Claude, GPT, Gemini, Llama: you assemble the panel that fits your question.

Key takeaway

Different strengths against each other give a sharper, tested answer.
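
As a rough sketch of what "two models, optionally a third" could look like as a configuration object: the `Panel` class below is a hypothetical illustration, not the app's real data model. The rule that a model can only sit on the panel once is my assumption, and the default of three rounds is just a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Panel:
    """Hypothetical panel configuration: who debates, and for how many rounds."""
    models: tuple[str, ...]
    rounds: int = 3  # placeholder default, not taken from the app

    def __post_init__(self) -> None:
        # Two models are required, a third is optional.
        if not 2 <= len(self.models) <= 3:
            raise ValueError("a panel needs two models, optionally a third")
        # Assumption: the same model does not debate itself.
        if len(set(self.models)) != len(self.models):
            raise ValueError("each model can sit on the panel only once")
        if self.rounds < 1:
            raise ValueError("a debate needs at least one round")
```

`Panel(("Claude", "GPT"))` and `Panel(("Claude", "GPT", "Gemini"), rounds=5)` are both accepted, while `Panel(("Claude",))` raises a `ValueError`.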

Consilium — Model selector

Who do you put against each other? · pick two

Claude

Anthropic

Nuanced and thorough. Strong on trade-offs and long lines of reasoning.

nuance · reasoning

GPT

OpenAI

Broad and fast. Strong on synthesis and clear structure.

synthesis · breadth

Gemini

Google

Multimodal and factual. Strong on current context and large data.

multimodal · facts

Llama

03 · The debate

Read along while they fight it out.

Round by round the models take a position, rebut each other and sharpen the question. You read along, steer, and see exactly how the answer takes shape.

Key takeaway

No black box: you see the reasoning, not just the outcome.

Consilium — The debate

Do we build bespoke for our biggest client?

Claude

argues for building

GPT

argues for holding focus

Round 1 · opening position

Claude · build

30% of your revenue isn't a detail, it's your right to exist. A strong relationship compounds: they stay, they introduce, they grow with you.

GPT · hold focus

One client steering your product is the start of a consultancy, not a product. 30% with one client isn't strength, it's a single point of failure.

04 · Summary

The tested answer, plus where they clashed.

At the end there's a clear answer with a degree of agreement, the points where the models converged and, just as honestly, the points where they still differed. Ready to export and share.

Key takeaway

A conclusion you can trust, because you see how sharp the consensus is.
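
To make the shape of such a summary concrete, here is a minimal sketch of a record that can be exported as JSON or plain text. The `Summary` class and its field names are hypothetical. The text above doesn't say how the degree of agreement is scored, so the sketch simply stores it as a number between 0 and 1 rather than computing it.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Summary:
    """Hypothetical summary record; field names are illustrative only."""
    question: str
    answer: str
    agreement: float    # 0.0 to 1.0, stored as given, not computed here
    agreed: list[str]   # points where the models converged
    clashed: list[str]  # points where the difference stayed

    def to_json(self) -> str:
        """Machine-readable export."""
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)

    def to_text(self) -> str:
        """Plain-text export for sharing."""
        lines = [self.question, "", self.answer, "",
                 f"Agreement: {self.agreement:.0%}", "", "Where they agreed"]
        lines += [f"- {point}" for point in self.agreed]
        lines += ["", "Where they clashed"]
        lines += [f"- {point}" for point in self.clashed]
        return "\n".join(lines)
```

Keeping the open disagreements in the same record as the answer means an export never shows the conclusion without its caveats.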

Consilium — Summary

The tested answer · high consensus

Don't build it as exclusive bespoke work under their deadline. Decide on product value, not on client size, and productise the feature for everyone, on your terms and timing.

86%

Where they agreed

✓ Decide on product value, not on client size.

✓ Generalising is fine, exclusive bespoke work isn't.

Where they clashed

› Does the revenue risk weigh heavier than the damage to your roadmap?

› Can you say no without losing the client?

05 · Archive

Every conclusion neatly kept.

Every debate lands in the archive, findable, manageable and shareable. A growing library of tested answers to your hardest questions.

Key takeaway

Your decisions and the reasoning behind them are preserved, not lost.
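
"Findable" could be as simple as a filter over the stored debates. The sketch below is an assumption about how that might work, not the app's real storage: `ArchivedDebate` and `find` are hypothetical names, and the three entries are the ones shown in the archive screen.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ArchivedDebate:
    """Hypothetical archive entry, mirroring what the archive screen shows."""
    question: str
    arena: str
    models: tuple[str, ...]
    rounds: int
    agreement: float


ARCHIVE = [
    ArchivedDebate("Do we build bespoke for our biggest client?",
                   "product & strategy", ("Claude", "GPT"), 3, 0.86),
    ArchivedDebate("Which model for our RAG pipeline?",
                   "technology & AI", ("Claude", "Gemini"), 5, 0.92),
    ArchivedDebate("Remote-first or hybrid after the growth?",
                   "work & life", ("Claude", "GPT"), 4, 0.71),
]


def find(archive: list[ArchivedDebate], text: str = "",
         arena: Optional[str] = None,
         model: Optional[str] = None) -> list[ArchivedDebate]:
    """Case-insensitive search on the question, optionally narrowed by arena or model."""
    needle = text.casefold()
    return [
        debate for debate in archive
        if needle in debate.question.casefold()
        and (arena is None or debate.arena == arena)
        and (model is None or model in debate.models)
    ]
```

For example, `find(ARCHIVE, "rag")` returns only the RAG pipeline debate, and `find(ARCHIVE, model="GPT")` returns the two debates GPT took part in.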

Consilium — Archive

Archive · tested answers

Do we build bespoke for our biggest client?

product & strategy · Claude × GPT · 3 rounds

86%

Which model for our RAG pipeline?

technology & AI · Claude × Gemini · 5 rounds

92%

Remote-first or hybrid after the growth?

work & life · Claude × GPT · 4 rounds

71%

07 · the value

An opinion panel for your hardest decisions.

Born out of my own need, grown into a tool I wouldn't want to do without. For an organisation it's the same: for a weighty choice you don't want an AI that simply goes along with you, but pushback, deliberation and an answer that passes the test. You're no longer alone with one model.

Pushback built in

No yes-nodding AI.

Two models keep each other sharp. The weak spot surfaces because someone actively argues the other way.

Tested, not guessed

An answer with a degree of agreement.

You see not just the conclusion, but also how sharp the consensus is and where the difference stayed.

Fully modular

Your panel, your rounds.

Pick the models that fit, add a third, set the number of rounds. The panel follows the question.

Shareable & kept

Every conclusion exportable.

Answers go neatly into the archive, ready to share and to find again when the question comes back.

Part of the same family as Open Brain, the layer beneath everything, deliberately kept separate to keep the debate clean.

One debate, two flagship AIs, one tested answer.

A hard question? I'll happily have it fought out live.