Deep Barot

Founder and CEO of ContextQA

About

Deep Barot is the founder and CEO of ContextQA, pioneering the next generation of software testing through agentic AI and context-aware automation. He has built a platform powered by 30+ specialized AI agents that autonomously generate, execute, heal, and optimize tests across web, mobile, Salesforce, and API environments. Deep’s vision centers on intelligent testing systems that understand application context rather than simply executing scripts. This approach has attracted enterprise clients in the airline, BFSI, and consumer tech sectors, who rely on ContextQA’s autonomous agents for speed and reliability. While scaling a venture-backed company with a large distributed engineering team, Deep is reimagining how software quality is achieved through multi-agent AI systems that collaborate to solve complex testing challenges, changing how enterprises approach quality assurance in an era of rapid development velocity.

Pull Over: AI Agent Testing Has Entered the Highway

Time: 11:00 AM - 12:00 PM
Room: Senate Chamber

Description

AI agents are writing code, making decisions, and shipping features — but who's testing the agents?

In this session, Deep Barot, CEO of ContextQA, tackles one of the most pressing challenges in modern software quality: how do you test something that doesn't behave the same way twice? Deep will walk through a live demo focused specifically on evaluating AI agents — from generating meaningful test scenarios to running multi-LLM judge evaluations that give your team real confidence in what's shipping.

You'll see how ContextQA approaches evals not as a checkbox, but as a rigorous, repeatable process — using multiple LLM judges to score outputs, surface disagreements, and produce confidence levels your team can actually act on.
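As a rough illustration of the multi-judge idea described above, the sketch below combines scores from several judges into a verdict, a confidence level, and a disagreement flag. It is not ContextQA's framework: the `JudgeScore` type, the `aggregate` function, the judge labels, and both thresholds are hypothetical, and the scores are assumed to have already been collected from each LLM judge on a 0.0 to 1.0 scale.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class JudgeScore:
    judge: str    # hypothetical label for one LLM judge
    score: float  # assumed scale: 0.0 (worst) to 1.0 (best)


def aggregate(scores, pass_threshold=0.7, disagreement_threshold=0.2):
    """Combine per-judge scores into a verdict with a confidence level.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    if not scores:
        raise ValueError("at least one judge score is required")

    values = [s.score for s in scores]
    # Each judge casts a pass/fail vote based on its own score.
    votes = [v >= pass_threshold for v in values]
    passes = sum(votes)
    # Strict majority is needed to pass; a tie counts as a fail.
    majority_pass = passes * 2 > len(votes)
    agreeing = passes if majority_pass else len(votes) - passes

    return {
        "verdict": "pass" if majority_pass else "fail",
        "mean_score": mean(values),
        # Confidence here is simply the share of judges siding with the verdict.
        "confidence": agreeing / len(votes),
        # Flag outputs where judges' scores are widely spread.
        "disagreement": pstdev(values) > disagreement_threshold,
    }


result = aggregate([
    JudgeScore("judge_a", 0.9),
    JudgeScore("judge_b", 0.8),
    JudgeScore("judge_c", 0.3),
])
print(result)
```

In this example two of three judges vote to pass, so the verdict is a pass, but the wide spread in scores raises the disagreement flag. A team might treat such a result as a candidate for human review before release.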

If your organization is building with AI agents and wondering how to trust what they do, this session gives you a practical, live look at how to bring quality engineering discipline to agentic systems.

Key Takeaways:
• How to design test scenarios purpose-built for AI agent behavior
• Why single-model evaluation falls short and how multi-LLM judging fills the gap
• How to interpret confidence levels across judges to make informed release decisions
• A live look at ContextQA's eval framework in action