What 150 hours of AI-assisted analysis actually looks like
I recently ran an extensive customer refresh analysis across Cisco's firewall portfolio to inform strategic decisions about where to focus resources. The analysis required me to stitch together multiple, separate data sources: hardware support contracts, software subscription renewal performance, and current bookings. I used AI heavily throughout the process, and for good reason: it wrote the Python scripts to stitch the datasets together, generated visualizations in minutes that would have taken hours by hand, and helped me iterate on analytical approaches far faster than I could have alone. It saved easily over 150 hours of manual work across data engineering, building the analytical logic, designing a dashboard for visualization, and preparing comprehensive documentation.
In this particular case, the resulting dashboard helped a cross-functional effort regain momentum after months of stalling over conflicting priorities. Being able to show concretely how the analysis could be approached, and what it could yield, helped the team agree on a path forward. It's now on track to become the foundation for multi-source data analysis going forward.
The challenge: the outputs looked polished, the charts were clean, and the narrative was compelling. The analysis was also wrong. Not obviously wrong, but plausibly wrong. The kind of wrong where the numbers pass a quick glance but quietly overstate success by several multiples (5.6x!). The kind of wrong that, if left unchecked, triggers a workstream involving a dozen people chasing a conclusion built on flawed data, wasting weeks of collective effort without meaningfully impacting the business.
As Cisco's Jeetu Patel put it at RSA this year: AI agents won't get fired for mistakes, but the person working with the AI agent owns the output. The agent apologizing for getting something wrong doesn't help you when a team has already been mobilized around a faulty recommendation.
In this case, my work functioned as a catalyst for a separate effort, and I wanted to share my journey as an optimistic yet cautionary tale.
Trust is the bottleneck
There's a critical distinction between delegating work to AI and delegating it safely. In my analysis, the AI confidently mixed up counting methodologies between datasets, included phantom records that inflated volumes, and missed that an entire tier of high-end platforms, ones architected as modular chassis with security blades rather than individual appliances, needed to be reflected differently in the data. Each of these errors produced numbers that looked reasonable on the surface. None of them triggered an obvious red flag.
That's what makes this dangerous. These numbers look right until someone realizes "that doesn't smell right." In this case, the overall opportunity value shifted dramatically once I included that high-end tier and corrected for how those platforms are actually counted. Without that domain knowledge, the headline number would have stood, and decisions would have been made on it.
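A guardrail that would have caught these errors earlier is a reconciliation check that runs before any chart is drawn: derive the same headline count from two datasets independently and refuse to proceed if they diverge. The sketch below is illustrative only; the dataset shapes, field names (`serial`, `form_factor`, `unit_count`), and tolerance are assumptions, not the actual schema:

```python
# Hypothetical sketch of a cross-dataset sanity check. Field names
# and thresholds are illustrative, not the real data dictionary.

def reconcile_unit_counts(contracts, subscriptions, tolerance=0.10):
    """Compare device counts derived from two datasets that should
    describe roughly the same installed base. A large gap suggests a
    counting-methodology mismatch (e.g. chassis vs. blades) or
    phantom records, and should block the analysis until explained."""
    # Contracts count physical appliances; drop records with no
    # serial number, a common source of phantom rows after joins.
    contract_units = sum(
        r["unit_count"] for r in contracts if r.get("serial")
    )
    # Subscriptions may count per-blade licenses; normalize modular
    # chassis down to one unit per chassis before comparing.
    sub_units = sum(
        1 if r["form_factor"] == "chassis" else r["unit_count"]
        for r in subscriptions
    )
    gap = abs(contract_units - sub_units) / max(contract_units, 1)
    return contract_units, sub_units, gap <= tolerance

contracts = [
    {"serial": "A1", "unit_count": 4},
    {"serial": None, "unit_count": 7},  # phantom row: excluded
    {"serial": "B2", "unit_count": 2},
]
subscriptions = [
    {"form_factor": "appliance", "unit_count": 5},
    {"form_factor": "chassis", "unit_count": 6},  # 6 blades, 1 chassis
]
print(reconcile_unit_counts(contracts, subscriptions))
```

The point isn't this particular formula; it's that the comparison is written down and runs every time, so a 5.6x overstatement fails loudly instead of passing a quick glance.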
The fix: machine-readable context
After iterating through dozens of corrections across multiple sessions, I did something that probably should have come first: with the help of our resident firewall data expert (shoutout to Jason!), I built a structured context file. A single, machine-readable markdown document that encodes the domain rules, data pitfalls, counting logic, and interpretation guidelines that any AI can leverage to work with this data correctly.
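To make this concrete, a context file of this kind might be structured roughly as follows. Every section name, file name, and rule here is an illustrative placeholder, not the actual document:

```markdown
# Firewall refresh data: analysis context (illustrative sketch)

## Datasets and what they answer
- `hw_contracts.csv`: hardware support coverage. Use for install-base
  counts, NOT for renewal rates.
- `sw_renewals.csv`: subscription renewal performance. Fiscal-quarter
  grain; do not join to calendar-quarter bookings without mapping.

## Counting rules
- Modular chassis platforms: count one unit per chassis, not one per
  security blade.
- Exclude records with no serial number; these are join artifacts.

## Known dead ends
- Deriving renewal rate from contract data double-counts multi-year
  deals; use the subscription dataset instead.
```

The value is less in any single rule than in having them all in one place, in a format an AI can ingest at the start of every session.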
Benefit 1: Stop re-explaining, start asking real questions
Without this document, every new analysis session would start with the same friction: re-clarifying what the columns mean, how fiscal calendars map, which records to exclude, and how units are counted differently across product families. By my estimate, that clarification overhead accounted for roughly 10 hours of back-and-forth on this analysis alone. With the context file loaded, the AI applies those rules on the first pass. That time is now available for the questions that actually matter to the business (or for getting back some sleep), not for re-teaching the AI things that belong in a data dictionary. This isn't just a convenience; it's increasingly how AI tooling is designed to work, with structured context loaded before the model ever sees your data. When every interaction with AI costs tokens, the question becomes: are you spending those tokens on real business insight, or on re-explaining the basics?
Benefit 2: Avoiding the avoidable
More importantly, the context file prevents the silent errors. It encodes which datasets answer which questions (and which don't), flags dead-end analyses before you spend a day rediscovering them, and documents the methodology choices that produce systematically different answers. This is the real trust layer. And it comes with a practice that should be standard regardless of AI involvement, but becomes critical at AI scale: be explicit about definitions in your data and assumptions made when sharing any analysis, AI-generated or otherwise. The sheer volume of output that AI enables makes it harder to keep track of what you're actually looking at. Documenting assumptions isn't just good hygiene; it's how you maintain the chain of trust from data to decision.
Judgment separates insight from slop
In a world of instant analysis, everyone can spin up a dashboard that looks insightful within minutes. The data is visualized, the trends are annotated, and the summary reads with authority. But knowing how to produce something that is actually accurate, something that survives scrutiny when a stakeholder asks "how did you get this number?", that's where human judgment comes in. It's the difference between output and insight.
Anthropic's 4D AI Fluency framework captures this well, particularly two of its competencies: Discernment, the ability to accurately assess whether AI output is trustworthy, and Diligence, owning the work even when AI helped produce it. These aren't abstract skills to put on a resume. They're the exact skills that caught the errors in my analysis. Someone had to notice the numbers didn't add up, investigate why, and encode that knowledge so it wouldn't happen again. That process isn't automatable (yet), and it remains a genuine career differentiator.
Going full circle
When I joined Cisco about nine years ago, I was hired specifically for my expertise in business analysis, running big-data analysis against thousands of device configuration files to identify which features needed to be prioritized for next-generation firewalls. That work was done in close collaboration with the customer success organization, writing and refining Python scripts to extract signal from massive datasets. Much of that kind of work is now automated, or well on its way. But the expertise that made it valuable, understanding the data well enough to know when the output is wrong, knowing which questions are worth asking, and having the judgment to steer analysis toward decisions that actually matter, that's exactly what this article is about. The tools change. The need for human judgment compounds.
If you're working with AI on data-intensive tasks:
a) Don't shy away from the raw data. Get really familiar with each individual input file you're asking the AI to analyze before trusting anything built on top of it.
b) Start building your context files now. Every correction you make today is an investment in faster, more trustworthy work tomorrow, for you and your entire team.