Agent product analytics: why the run matters more than the session
A practical framework for measuring agentic AI workflows by delegated work, trust, and business value instead of vanity activity metrics.
By Robin Fitzpatrick

What this article covers
If you are building with agents, the old dashboard can lie to you.
That is the real point behind Nate's latest note on agent product analytics: the session can look healthy while the actual delegated work is failing. A user can be active, a model can be busy, the logs can be full, and the output can still be wrong, unsafe, or simply not trusted.
For companies trying to turn AI into value, this matters because the business does not buy activity. It buys outcomes:
- less manual work
- faster execution
- better customer service
- higher-quality decisions
- fewer errors and rework cycles
If you cannot see the work inside the run, you cannot tell whether the workflow is creating value or just creating motion.
The core framework
The simplest useful model is a three-part sequence:
- Session - the visible interaction surface.
- Run - the delegated piece of work the agent actually executes.
- Trust - the point where a human accepts the result and lets the system do more next time.
That sequence is the backbone of agentic AI measurement.
If the session is busy but the run is weak, the system is decorative. If the run completes but the user does not trust it, the system is not yet operational. If the run is trusted and repeatable, the system is beginning to create leverage.
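Those three cases can be written down as a small classifier. This is a minimal sketch, not a standard: the field names (`run_completed`, `result_accepted`, `delegated_again`) and the `WorkflowState` labels are assumptions chosen for illustration, and a real product would need its own definition of what counts as a completed or trusted run.

```python
from dataclasses import dataclass
from enum import Enum


class WorkflowState(Enum):
    DECORATIVE = "decorative"                    # session busy, run weak
    NOT_YET_OPERATIONAL = "not yet operational"  # run completes, result not trusted
    CREATING_LEVERAGE = "creating leverage"      # run trusted and repeatable


@dataclass
class RunOutcome:
    # Hypothetical signals; how each one is detected is up to the product.
    run_completed: bool    # did the run finish the delegated work?
    result_accepted: bool  # did a human accept the result?
    delegated_again: bool  # did the user hand over similar work afterwards?


def classify(outcome: RunOutcome) -> WorkflowState:
    """Map one run outcome onto the three cases described above."""
    if not outcome.run_completed:
        return WorkflowState.DECORATIVE
    if not (outcome.result_accepted and outcome.delegated_again):
        return WorkflowState.NOT_YET_OPERATIONAL
    return WorkflowState.CREATING_LEVERAGE
```

Treating "trusted and repeatable" as acceptance plus a later re-delegation is one possible reading; a team could reasonably use a stricter rule, such as acceptance without any correction.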
What changed
Traditional product analytics was built for human clicks.
Agentic software is different because the user is not doing the work step by step. They are handing the work over. That means the meaningful events are not page views or time on site. They are:
- the handoff
- the tools the agent used
- the boundaries it hit
- the corrections it received
- whether the user accepted the result
If you do not track those events, you will see a calm dashboard while the actual work goes sideways.
This is where a lot of AI programs get stuck. They deploy the shiny front end, announce the pilot, and then stare at usage numbers that do not answer the real question: did the work get better?
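The events listed above fit naturally into one record per run. The sketch below is an assumption about what such a record might look like, not a schema from any analytics product: `RunRecord`, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RunRecord:
    """One delegated piece of work, with the events the list above names."""
    run_id: str
    handoff: str                                             # what the user handed over
    tools_used: List[str] = field(default_factory=list)      # tools the agent called
    boundaries_hit: List[str] = field(default_factory=list)  # limits or guardrails reached
    corrections: List[str] = field(default_factory=list)     # fixes the user had to make
    accepted: Optional[bool] = None                          # None until the user decides

    def needed_correction(self) -> bool:
        """True if the user had to fix anything in this run."""
        return len(self.corrections) > 0


# Hypothetical example: every value below is made up for illustration.
run = RunRecord(run_id="run-001", handoff="Draft a reply to a refund request")
run.tools_used.append("crm_lookup")
run.boundaries_hit.append("refund above approval limit")
run.corrections.append("user rewrote the closing paragraph")
run.accepted = True
```

Keeping `accepted` as `None` until the user decides separates "not yet reviewed" from "rejected", which a session-level dashboard would otherwise blur together.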
The value test
A&O's lens here is not academic. It is commercial.
Use the following questions to test whether an agentic workflow is actually worth keeping:
- Does this workflow reduce manual repetitive work?
- Does it improve the speed at which a team can complete a real task?
- Does it reduce error, rework, or escalation?
- Does it produce a result that a human trusts enough to reuse?
- Can we connect the workflow to customer value, operational leverage, or revenue impact?
If the answer to those questions is no, then the system may be impressive, but it is not yet useful.
A simple way to think about it
The three stages:
- Session - the dashboard says someone showed up and something happened.
- Run - the agent actually tried to do the delegated work.
- Trust - the human accepts the result and lets the system do more next time.
The three states a workflow can land in:
- Green but useless - the session looks fine, but the run did not move the work forward.
- Finished but not trusted - the model produced output, but the user still had to redo it.
- Trusted and repeatable - this is when autonomy starts to earn a bigger role.
The three sources that matter
The point is not unique to one newsletter. The broader agent literature keeps landing in the same place.
"Agents are systems that independently accomplish tasks on your behalf."
- OpenAI, A practical guide to building agents
"Workflows are systems where LLMs and tools are orchestrated through predefined code paths. Agents ... dynamically direct their own processes and tool usage."
- Anthropic, Building effective agents
"The unit of product behavior is becoming the agent run."
- Nate, Agent Product Analytics
Those three lines point to the same operational shift:
- OpenAI is saying an agent is about doing real work with tools and guardrails.
- Anthropic is saying sometimes you need workflows, sometimes you need true agentic control, and the distinction matters.
- Nate is saying the analytics has to move from usage to delegated work.
This is also the right way to explain the opportunity to a buyer:
- workflows are best when the path is predictable and you want control
- agents are best when the work is messy, variable, or dependent on context
- analytics must follow the operating model, or you will measure the wrong thing
That distinction matters because companies often buy the tool before they understand the work shape. The result is predictable: a pilot that looks clever, a dashboard that looks healthy, and a business that still feels no relief.
What this means in plain English
If you are an operator, founder, or team lead, the question is not "Did the AI run?"
The question is:
- Did it do the right work?
- Did it stop at the right boundaries?
- Did it need correction?
- Did the human trust the result?
- Would you let it do more next time?
That is the real scoring system.
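One way to turn those five questions into numbers is to answer them per run and then report the share of runs for each. The sketch below assumes the per-run answers already exist (some need human judgement, for example whether the work was the right work); `ScoredRun`, `scorecard`, and the metric names are hypothetical, not an established standard.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class ScoredRun:
    # Hypothetical per-run answers to the five questions above.
    did_right_work: bool
    boundary_hits: int        # how many limits or guardrails the run reached
    corrections: int          # how many fixes the human made
    accepted: bool            # did the human trust the result?
    would_delegate_more: bool


def scorecard(runs: Iterable[ScoredRun]) -> Dict[str, float]:
    """Return the share of runs (0.0 to 1.0) for each question."""
    runs = list(runs)
    total = len(runs)
    if total == 0:
        return {}  # no runs, nothing to score

    def share(flags) -> float:
        return sum(1 for flag in flags if flag) / total

    return {
        "right_work_rate": share(r.did_right_work for r in runs),
        "boundary_hit_rate": share(r.boundary_hits > 0 for r in runs),
        "correction_rate": share(r.corrections > 0 for r in runs),
        "acceptance_rate": share(r.accepted for r in runs),
        "redelegation_rate": share(r.would_delegate_more for r in runs),
    }
```

A boundary hit rate says only that limits were reached, not whether they were the right ones; that judgement still has to come from a reviewer.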
The consultant-level version is even simpler:
- if the work is structured, prefer workflows
- if the work is ambiguous, consider agents
- if the work is valuable, instrument trust
- if the work is repetitive, design for leverage
That is the line between AI theater and AI value.
What A&O would do with this
This is why A&O keeps talking about actionable training, agentic workflows, and practical strategy.
The useful system is not:
- "we added AI"
The useful system is:
- teams understand when to use AI and when not to
- repetitive manual work gets reduced
- the workflow gets instrumented so you can see failure early
- the business can tell whether the work was actually better
- leadership can see how the workflow connects to value creation, not just tool adoption
The takeaway
If you only measure sessions, you will miss the work.
If you measure runs, corrections, and trust, you can steer.
And in agentic systems, steering is the whole game.
So the operating principle is:
- measure the run
- prove the trust
- connect both to value
That is the point of the whole post, and the point of the product conversation around AI right now.
