AI in product discovery at Teamleader

"I would like to say that we do continuous discovery, but we actually only do discovery for the topics we are already thinking about doing. And we often end up building them anyway, whether or not the research supported it, because by the time it is done, the team needs something to start working on."

Kalina Lipinska, VP of Product at Teamleader, said this in front of a room full of product managers, and most of them nodded.

We welcomed Kalina to our latest Product Apéro to talk about what Teamleader is doing about it, and what the shift requires from anyone running a product team right now.

Teamleader's context 

Teamleader builds all-in-one business software for small and medium businesses: CRM, invoicing, project management, and time tracking. Founded in Ghent and part of Visma since 2022, the company serves 12,000 customers across Belgium, the Netherlands, and Germany with its core product. The sweet spot is companies between two and twenty employees, spanning agencies, consultancies, construction firms, and IT companies. It is a mature, complex product built for a broad and varied customer base, which makes prioritisation difficult. Teamleader recently doubled in size following its merger with Yuki and AdminPulse.

The product team working on Teamleader Focus is seven people, alongside four designers. Small enough that every process decision matters and every tool either earns its place or creates friction.

The gap that AI is widening

Engineering at Teamleader started experimenting with Claude Code early in the year. By March, it was a full team mandate, a deliberate push to demonstrate value and let adoption spread. Kalina watched as skeptical developers became converts, burning through a week's token allocation in three days. The throughput increase is real, even if still difficult to measure precisely.

While development teams across the industry accelerate, product discovery is on track to become the next bottleneck. When delivery was the constraint, discovery had ample time to gather evidence; now that delivery is speeding up sharply, the slower discovery process becomes the system's weakest link. Product managers who cannot validate problems quickly enough will face growing pressure to decide on thin evidence, outdated research, or intuition rather than validated customer signals.

Kalina named the result that will follow: product slop. Features shipped with confidence but low adoption, carrying a maintenance cost that the team absorbs indefinitely.

At dualoop, we see this same pattern in most of the product organisations we work with. The discovery process was designed when delivery was the constraint, and it has not been revisited since engineering capacity changed. The question Kalina posed to her room, and that we would pose to any product leader, is whether your discovery process can keep pace with how fast your team can now build.

"Human in the loop is not about making sure the output is correct. It is about making sure the human is still thinking."

Treating your existing data as a discovery asset

The most practical part of Kalina's talk was about the data Teamleader already had, and how it could be leveraged further with AI.

The first example is Modjo: it started as a sales intelligence tool; sales used it to record demos and get coaching on objection handling. But the product team found that these recordings contained something else: conversations with leads, including those who did not convert. Before Modjo, understanding why prospects chose not to go with Teamleader meant trusting the account executive's summary, which is filtered through recency bias and deal size in ways that are difficult to correct for after the fact. With Modjo, there is a searchable library of the actual content of those conversations. It is the only source they have for understanding why leads do not become customers, and that information is structurally different from anything else in their research stack. 

The second is a custom Zendesk agent that reads incoming support tickets, checks whether a ticket contains a feature request, looks for a corresponding Jira issue, and connects the customer to it automatically. If no matching issue exists, it creates one. Every support ticket that contains a product signal now feeds directly into the discovery backlog without anyone having to triage it. The customer success team spends less time on admin work, and signals that previously disappeared into the helpdesk now reach product consistently. It also provides a partial fix for what Kalina called the spaghetti problem: a decade of feature requests logged by dozens of people, full of duplicates, vague tickets, bundled needs, and missing context. The right customers are now attached to the right requests automatically.
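
We do not have access to Teamleader's implementation, but the flow Kalina described is simple enough to sketch. Below is a minimal illustration in Python; the classifier and the issue search are naive keyword placeholders standing in for an LLM call and the real Zendesk and Jira APIs.

```python
# A minimal illustration, not Teamleader's code: extract_feature_request and
# find_matching_issue are placeholders for an LLM call and the Zendesk/Jira APIs.
from dataclasses import dataclass

@dataclass
class Ticket:
    id: str
    customer_id: str
    body: str

# Placeholder backlog: issue key -> (request summary, linked customer ids)
backlog: dict[str, tuple[str, list[str]]] = {}

def extract_feature_request(ticket: Ticket) -> str | None:
    """Stand-in for an LLM call: does this ticket contain a feature request?"""
    text = ticket.body.lower()
    if "it would be great if" in text or "feature request" in text:
        return ticket.body.strip()
    return None  # ordinary support question

def find_matching_issue(request: str) -> str | None:
    """Stand-in for semantic search over the existing backlog."""
    for key, (summary, _) in backlog.items():
        if summary.lower() == request.lower():
            return key
    return None

def create_issue(request: str) -> str:
    key = f"PROD-{len(backlog) + 1}"
    backlog[key] = (request, [])
    return key

def triage(ticket: Ticket) -> None:
    request = extract_feature_request(ticket)
    if request is None:
        return  # nothing to route to product
    key = find_matching_issue(request) or create_issue(request)
    backlog[key][1].append(ticket.customer_id)  # attach the customer to the issue

triage(Ticket("Z-101", "acme", "It would be great if invoices supported QR codes"))
print(backlog)  # {'PROD-1': ('It would be great if ...', ['acme'])}
```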

The third is a Python script built by one of her PMs. It runs semantic analysis across the entire Jira archive using ICP definitions and jobs to be done as inputs, and produces clusters of related requests mapped to customer segments. Instead of reading thousands of tickets to find patterns, product managers can query across the full archive and get a structured view of what matters for which customers. The output then feeds into Claude for further interrogation by topic. Kalina's governing principle here: AI should write and run the analysis script; the human interprets the output. Asking AI to interpret quantitative data directly is risky, because hallucinations in numbers are harder to spot than hallucinations in text.
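
The script itself is not public, but the approach is reproducible. A minimal sketch of the idea, assuming the sentence-transformers and scikit-learn packages: embed every request, cluster the embeddings, and label each cluster with its nearest job to be done (mapping clusters to ICP segments works the same way). The tickets and jobs below are invented placeholders for the real Jira export and JTBD definitions.

```python
# A minimal sketch of the approach, not the PM's actual script.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics.pairwise import cosine_similarity

tickets = [  # placeholder for thousands of exported Jira summaries
    "Allow exporting invoices as PDF in bulk",
    "Bulk download of all invoices for the accountant",
    "Gantt view for project planning",
    "Timeline view to plan projects visually",
]
jobs = [  # placeholder jobs-to-be-done definitions
    "Get paid and hand clean books to the accountant",
    "Plan and track project work",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
ticket_vecs = model.encode(tickets)
job_vecs = model.encode(jobs)

# Group semantically similar requests; the threshold is a tuning knob
# controlling how coarse or fine the clusters are.
labels = AgglomerativeClustering(
    n_clusters=None, distance_threshold=0.6, metric="cosine", linkage="average"
).fit_predict(ticket_vecs)

for cluster in sorted(set(labels)):
    idx = np.where(labels == cluster)[0]
    centroid = ticket_vecs[idx].mean(axis=0, keepdims=True)
    # Label the cluster with the closest job to be done
    job = jobs[int(cosine_similarity(centroid, job_vecs).argmax())]
    print(f"[{job}]")
    for i in idx:
        print("  -", tickets[i])
```

The structured clusters, not the raw tickets, are what then get handed to Claude for interrogation by topic, which keeps the model's role to interpretation support rather than statistics.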

On interview preparation: once the research context is assembled, Kalina's team feeds it into Claude alongside what they already know, what they are trying to learn, and who they are talking to. The result is a solid draft questionnaire in around twenty minutes rather than several hours. The warning she stressed: if you do not know what you are trying to find out before you open the chat, AI cannot figure that out for you. She watched a junior PM ask Claude to generate interview questions without any context, and the output was, predictably, useless. The quality of what comes out is a direct reflection of the clarity of the thinking going in.
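
As an illustration only, a prompt skeleton following the structure she describes might look like the sketch below; the wording and example values are ours, not Teamleader's actual template.

```python
# A hypothetical prompt skeleton mirroring the three inputs Kalina names:
# what you know, what you want to learn, and who you are talking to.
research_context = "Support tickets show recurring confusion around invoice reminders."
learning_goal = "Understand how small agencies chase unpaid invoices today."
participant = "Office manager at a ten-person marketing agency, customer for two years."

prompt = f"""You are helping prepare a customer interview.

What we already know:
{research_context}

What we are trying to learn:
{learning_goal}

Who we are talking to:
{participant}

Draft an interview guide: a short opener, eight to ten open questions ordered
from broad to specific, and follow-up probes. Avoid leading questions."""

print(prompt)
```

The skeleton makes the warning concrete: every section except the final instruction has to be filled in by the PM before the model can do anything useful.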

Download the slides of the presentation > 

What AI does not handle well in discovery

Kalina spent as much time on the failure modes as on the wins, and this is where the talk became most useful.

The first trap is hallucinated quotes. A language model does not retrieve quotes the way a search engine retrieves results; it generates text that is statistically plausible given the source material. Ask it to surface a customer quote that captures a particular sentiment and you will get a composite of what multiple customers said. Kalina called it a Frankenstein quote: it sounds right and captures a real theme, but no single customer ever said it. Used as evidence in a discovery document or a stakeholder presentation, it is not evidence at all, and it will pass unnoticed unless someone checks it against the original transcript.

The second trap is generic synthesis. Language models default to finding what is mentioned most frequently and presenting it as insight. In discovery, the most frequently mentioned items are usually the most obvious and least useful for actual decisions. The signal that changes a roadmap is often something only two customers mentioned, or the tension between what someone said at minute eight and how they reframed it at minute thirty-five. That kind of signal lives in the conversation itself, not in a summary generated afterwards. Kalina described putting the same set of transcripts into two different models and getting two completely different narratives, each delivered with equal confidence. The output looks like research, but the process was not research.

"If you start getting a summary of a summary, you really get very flat information." - Kalina Lipinska, VP of Product at Teamleader

Her approach to interviews: run Modjo and Granola simultaneously. Granola is useful during the conversation, as you can ask it questions about what is being said while the interview is still in progress, which means you can stay present rather than taking notes. Modjo works better for going back afterwards: reviewing the recording, seeing the customer's screen, searching across previous sessions, and following up on things you missed. Neither tool should be asked to produce the synthesis. After the interview, write down what you remember, form your own conclusions, and only then work with AI on the raw transcripts. Use AI to capture, use your brain to make sense of it.

What changes next and what stays the same

The broader shift Kalina described is one we are seeing across the product organisations we work with. The bottleneck has moved from delivery to decision quality. Engineering is already building a system of interconnected agents: tools that write, test, review, and orchestrate code. If product and design wait too long to build equivalent infrastructure for discovery and synthesis, the result is engineers with AI capabilities and product managers who cannot keep up. The window to build this as a shared system rather than engineering's system alone is now.

On team structure: Teamleader is moving away from fixed product teams toward smaller, more fluid configurations, which Kalina called packs. These are groups of two to three engineers working on a specific problem, with a PM shared across several of them. As AI increasingly handles ceremonies, refinements, user story writing, and routine documentation, the PM role shifts toward the parts that require real judgment: understanding what customers actually need and deciding what not to build as much as what to build. The strategic dimension of the role becomes more important, not less.

"The teams that will struggle are not the ones who adopt too slowly. They are the ones who adopt without raising the quality of their thinking first." - Kalina Lipinska, VP of Product at Teamleader

The closing framing from Kalina holds across all of this: AI amplifies what is already there. Teams with clear thinking and honest research habits will move faster and get better answers. Teams that were already guessing will guess more confidently, at a higher speed, with outputs that look like evidence. 

Looking to strengthen your discovery practice? Want to go deeper on the fundamentals of product management, from problem framing to customer research to prioritisation? Our product management training covers the full product lifecycle, including how to integrate AI at each stage. Learn more > 

We also work directly with product organisations on shaping their operating model when the process itself needs to change. Let’s talk >
