Five things I check before approving an AI architecture
None of them are about the model. All of them turn out to matter when projects fail later.
When I’m asked to review an AI architecture, whether internal or for a client, I run through a short list of questions before getting into the interesting parts. None of them are clever. They are the boring questions that most often decide whether a project fails later.
Where does the data come from, and who owns it?
Not the vector store. The original sources. Are they authoritative? Are they current? Who is responsible for keeping them updated after the system goes live? If the answer to the last one is “we’ll figure that out later,” you don’t have an AI project. You have a data engineering project wearing AI clothing.
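One way to make the ownership question concrete is to write the sources down and check them mechanically. The sketch below is only an illustration: the `Source` fields, the source names, and the staleness limits are all made up, and a real registry would live wherever your team already tracks data assets.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Source:
    name: str                 # hypothetical source name
    owner: Optional[str]      # team accountable after launch; None means nobody
    last_updated: date        # when the original source was last refreshed
    max_age_days: int         # assumed limit on how stale it may get


def audit(sources: List[Source], today: date) -> List[str]:
    """Return one short problem description per unowned or stale source."""
    problems = []
    for s in sources:
        if not s.owner:
            problems.append(f"{s.name}: no owner")
        if today - s.last_updated > timedelta(days=s.max_age_days):
            problems.append(f"{s.name}: stale")
    return problems


# Illustrative data only.
sources = [
    Source("pricing-policy", "finance-ops", date(2024, 5, 1), 90),
    Source("hr-handbook", None, date(2022, 1, 10), 365),
]
print(audit(sources, today=date(2024, 6, 1)))
```

If a list like this cannot be filled in before launch, that is usually the "we’ll figure that out later" answer in disguise.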
What happens when the model is wrong?
Every system is wrong sometimes. The question is what happens next. Does the user see a confident wrong answer with no way to tell? Does the next step in a workflow run on that bad output? Is there a human in the loop before the result leaves the system? If there’s no answer to this, the system will eventually hurt someone.
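A minimal sketch of one possible answer is a gate that decides whether a result may leave the system or must go to a person first. Everything here is an assumption for illustration: the `confidence` score, the `has_sources` flag, and the 0.8 threshold are hypothetical, and how you would actually estimate confidence depends on the system.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    SEND = "send"
    HUMAN_REVIEW = "human_review"


@dataclass
class Draft:
    text: str
    confidence: float   # hypothetical score in [0, 1]
    has_sources: bool   # whether the answer is backed by retrieved material


def route(draft: Draft, threshold: float = 0.8) -> Route:
    """Hold back anything low-confidence or unsupported for human review."""
    if draft.confidence < threshold or not draft.has_sources:
        return Route.HUMAN_REVIEW
    return Route.SEND


print(route(Draft("Refunds take 30 days.", confidence=0.95, has_sources=True)))
print(route(Draft("Refunds take 90 days.", confidence=0.95, has_sources=False)))
```

The specific rule matters less than the fact that a rule exists and that downstream steps only run on output that passed it.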
Can you trace one response from end to end?
When a user complains about a specific answer, can you reconstruct what happened? What was retrieved, what was in the prompt, what the model said, what the post-processing did to it? If the answer is no, you cannot debug the system. You can only guess and hope.
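As a rough sketch of what "traceable" can mean, the example below records each stage of one request under a single ID. The retrieval, model call, and post-processing are stand-ins with hard-coded values, and the `Trace` fields and in-memory `store` are hypothetical; a real system would write to durable storage.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Trace:
    request_id: str
    user_input: str
    retrieved_ids: List[str] = field(default_factory=list)
    prompt: str = ""
    raw_output: str = ""
    final_output: str = ""


def handle(user_input: str, store: Dict[str, Trace]) -> Trace:
    """Run the (stubbed) pipeline and record every stage under one request ID."""
    trace = Trace(request_id=str(uuid.uuid4()), user_input=user_input)
    trace.retrieved_ids = ["doc-17", "doc-42"]               # stand-in for retrieval
    trace.prompt = f"Context: {trace.retrieved_ids}\nQuestion: {user_input}"
    trace.raw_output = "  The refund window is 30 days.  "   # stand-in for the model call
    trace.final_output = trace.raw_output.strip()            # stand-in for post-processing
    store[trace.request_id] = trace
    return trace


store: Dict[str, Trace] = {}
t = handle("How long do I have to return an item?", store)
# Given the ID from a complaint, the whole path can be reconstructed.
print(json.dumps(asdict(store[t.request_id]), indent=2))
```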
Is there an evaluation that looks like the real work?
Not benchmark scores. Not generic accuracy metrics. A set of inputs that resemble actual user requests, with expected outcomes, that runs before every meaningful change. If the only “evaluation” is anecdotal demos, regressions ship undetected and the team stops noticing.
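The smallest version of such an evaluation can be a list of realistic inputs with a check on each output. This is a sketch under simple assumptions: `fake_system` stands in for the real pipeline, the cases are invented, and a substring match is a deliberately crude notion of "expected outcome" that most teams would replace with something richer.

```python
from typing import Callable, List, Tuple

# (input resembling a real request, text the answer must contain)
Case = Tuple[str, str]


def run_eval(system: Callable[[str], str], cases: List[Case]) -> List[str]:
    """Return the inputs whose output did not contain the expected text."""
    failures = []
    for prompt, expected in cases:
        output = system(prompt)
        if expected.lower() not in output.lower():
            failures.append(prompt)
    return failures


def fake_system(prompt: str) -> str:
    # Hypothetical stand-in for the real system under test.
    if "refund" in prompt.lower():
        return "Refunds are accepted within 30 days."
    return "I don't know."


cases = [
    ("Can I get a refund after three weeks?", "30 days"),
    ("Who approves travel over $5,000?", "VP"),
]
failures = run_eval(fake_system, cases)
print(f"{len(cases) - len(failures)}/{len(cases)} passed; failed: {failures}")
```

Run before every meaningful change, even a harness this small turns "it looked fine in the demo" into a number that can go down.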
Who can turn it off?
In production, the system will misbehave eventually. The upstream model changes. A data source gets corrupted. A prompt injection finds a hole. Who has both the authority and the technical ability to disable the system quickly? Is that person on call? If not, that’s the first thing to fix.
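The technical half of that question can be as simple as a flag checked on every request. The sketch below assumes an environment variable named `AI_FEATURE_ENABLED` and a fixed fallback message; both are hypothetical, and in practice the flag would more likely live in a feature-flag service or config store that can be changed without a redeploy.

```python
import os
from typing import Callable, Mapping

FALLBACK = "This assistant is temporarily unavailable."


def ai_enabled(env: Mapping[str, str] = os.environ) -> bool:
    # Hypothetical flag name. Here a missing flag means "enabled";
    # a stricter design would treat a missing flag as "off".
    return env.get("AI_FEATURE_ENABLED", "true").lower() == "true"


def respond(user_input: str, model: Callable[[str], str],
            env: Mapping[str, str] = os.environ) -> str:
    """Check the switch on every request, so flipping it takes effect immediately."""
    if not ai_enabled(env):
        return FALLBACK
    return model(user_input)


def echo_model(text: str) -> str:
    return f"model says: {text}"   # stand-in for the real model call


print(respond("hello", echo_model, env={"AI_FEATURE_ENABLED": "true"}))
print(respond("hello", echo_model, env={"AI_FEATURE_ENABLED": "false"}))
```

The code is the easy part. The switch only helps if a named person is allowed to flip it and is reachable when it matters.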
These aren’t AI questions. They’re operational questions. They apply to any production system. The reason I lead with them on AI projects is that AI work skips this stage more often than other software work does, because the model is so visible and exciting that the system around it can feel like a distraction.
It isn’t. It’s most of what determines whether the thing stays working.