What testing shows about OpenAI’s deep research agent
My first few jobs out of college were in research roles: first as a research assistant for a business school professor, then for a historian, and then as a researcher for a podcast network. A good amount of the work I do at Charter still involves research. So my interest was piqued when OpenAI released its new deep research tool, which can conduct in-depth research and generate detailed reports with citations.
Deep research has garnered a lot of praise over the past few weeks. Tech journalist Casey Newton wrote that it “might be the first good agent.” Economist and blogger Tyler Cowen wrote that he thinks about its “quality as comparable to having a good PhD-level research assistant, and sending that person away with a task for a week or two, or maybe more.” OpenAI CEO Sam Altman described it as “one of my favorite things we have ever shipped” and estimated that it “can do a single-digit percentage of all economically valuable tasks in the world.”
Here’s what we found in our own tests of deep research:
Background
Deep research is an agent by OpenAI powered by a version of the company’s “reasoning” o3 model. When you give it a research request, it responds with a set of clarifying questions to ensure it heads in the right direction. Once you answer those, it starts its research and comes back with a detailed answer or report minutes later. The tool is available to ChatGPT Pro, Plus, Team, Enterprise, and Edu users. A Pro subscription gets you 120 deep research queries per month, compared to the 10 you get through the other accounts.
Testing
Interview prep:
The first thing I wanted to see was whether deep research could help me prepare for an interview. I had already interviewed Ranjan Roy, Adore Me’s senior vice president of strategy, once about the company’s AI strategy. I had another call with him coming up, and I wanted to see if deep research could put together a relevant research report. Here’s what I told it:

After I answered its follow-up questions, deep research started searching. As it works, the product shows some of its steps in a panel on the right side, so you can get a sense of how it put together the final report. It looks like a much longer version of this:

After 15 minutes, deep research came back with a thorough report that included examples of how Adore Me uses AI and how other direct-to-consumer brands use it. The results were impressive: it would have taken me many hours to put together a similar report. The one glaring flaw was that some sections lacked citations, which raises questions about where the information came from and makes it harder to verify anything I may want to reference later.
Our case study on how Adore Me uses genAI