Generative AI and large language models (LLMs) have been all the rage in recent years, upending traditional online search via the likes of ChatGPT while improving customer support, content generation, translation and more. Now, one fledgling startup is using LLMs to build AI assistants capable specifically of answering complex questions for developers, software end-users, and employees — it’s like ChatGPT, but for technical products.
Founded in February last year, Kapa.ai is a graduate of Y Combinator’s (YC) Summer 2023 program, and it has already amassed a fairly impressive roster of customers, including ChatGPT-maker OpenAI, Docker, Reddit, Monday.com and Mapbox. Not bad for an 18-month-old business.
“Our initial concept came after several friends who ran tech companies reached out with the same problem, and after we built the first prototype of Kapa.ai to address this for them, we landed our first paid pilot within a week,” CEO and co-founder Emil Sorensen told TechCrunch. “This led to organic growth through word-of-mouth — our customers became our biggest advocates.”
To build on that early traction, Kapa.ai has now raised $3.2 million in a seed round of funding led by Initialized Capital.
Getting technical
In the broadest terms, companies feed their technical documentation into Kapa.ai, which then serves up an interface through which developers and end users can ask questions. Docker, for example, recently launched a new documentation assistant called Docker Docs AI, which provides instant responses to Docker-related questions from within its documentation pages; it is built using Kapa.ai.
But Kapa.ai can serve myriad use cases, such as customer support, community engagement, and acting as a workplace assistant that helps employees query their company’s knowledge base.
Under the hood, Kapa.ai is based on several LLMs from different providers and leans on a technique called retrieval-augmented generation (RAG), which improves LLM responses by retrieving relevant passages from external data sources and supplying them to the model as context.
“We’re model-agnostic — we work with multiple providers, including using our own models, in order to use the best-performing stack and retrieval techniques for each specific use case,” Sorensen said.
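To make the retrieval step concrete, here is a minimal sketch of the general RAG pattern, not Kapa.ai's actual implementation, which the company has not published. It assumes a toy in-memory corpus and a simple word-overlap (cosine) ranking; the names `DOCS`, `retrieve` and `build_prompt` are hypothetical, and production systems typically use learned embeddings and a vector index instead. The resulting prompt would then be sent to whichever LLM the system uses.

```python
import math
import re
from collections import Counter

# Hypothetical documentation snippets; a real deployment would index a full docs site.
DOCS = [
    "Use docker build -t name . to build an image from a Dockerfile.",
    "Use docker run -p 8080:80 image to publish a container port to the host.",
    "Use docker compose up to start every service defined in compose.yaml.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, docs, k=1):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(question, docs, k=1):
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"- {passage}" for passage in retrieve(question, docs, k))
    return (
        "Answer using only the context below. If the answer is not there, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )
```

Calling `build_prompt("How do I publish a container port?", DOCS)` places the `docker run -p` snippet in the context section, so the model answers from the documentation rather than from memory alone.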
It’s worth noting that there are a number of similar tools out there already, including venture-backed startups such as Sana and Kore.ai, which are essentially about bringing conversational AI to enterprise knowledge bases. Kapa.ai fits into that bucket, but the company says its main differentiator is that it largely focuses on external users rather than employees, and that has had a big influence on its design.
“When deploying an AI assistant externally to end-users, the level of scrutiny jumps ten-fold,” Sorensen said. “Accuracy is the only thing that matters, because companies are worried about AI misleading customers, and everyone has tried having ChatGPT or Claude hallucinate. A few bad answers and a company will immediately lose trust in your system. So that’s what we care about.”
Accuracy
This focus on providing accurate responses about technical documentation, with minimal hallucinations, highlights how Kapa.ai is a different kind of LLM animal — it is built for a much narrower use-case.
“Optimizing a system for accuracy naturally comes with trade-offs, as it means we have to design the system to be less creative than what other LLM systems can afford to be,” Sorensen said. “This is to guarantee the answers are only generated from the universe of content they provide.”
Then there is the thorny issue of data privacy, one of the major deterrents for enterprises that may want to adopt generative AI but are wary about exposing sensitive data to third-party systems. As such, Kapa.ai includes detection and masking of personally identifiable information (PII), which goes some way toward ensuring private information is neither stored nor shared.
This includes real-time PII scanning: When a message is received by Kapa.ai, it’s scanned for PII, and if any is detected, the message is rejected and not stored. Users can also configure Kapa.ai so that any PII detected in a document is anonymized.
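The two behaviors described above, rejecting incoming messages that contain PII and anonymizing PII found in documents, can be sketched roughly as follows. This is an illustration under stated assumptions rather than Kapa.ai's method: the regular expressions cover only email addresses and phone-like numbers, and `PII_PATTERNS`, `handle_message` and `anonymize_document` are hypothetical names. Real detectors handle many more data types and usually go beyond pattern matching.

```python
import re

# Illustrative patterns only; production PII detection covers far more types.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def contains_pii(text):
    """Return True if any pattern matches somewhere in the text."""
    return any(pattern.search(text) for pattern in PII_PATTERNS.values())


def handle_message(message, store):
    """Reject a message containing PII without storing it; otherwise store it."""
    if contains_pii(message):
        return False
    store.append(message)
    return True


def anonymize_document(text):
    """Replace each detected PII span with a placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

In this sketch, a question such as "How do I reset my API key?" is accepted and stored, while a message containing an email address is rejected before it reaches storage.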
Businesses can, of course, assemble something akin to Kapa.ai themselves using third-party tools such as Microsoft’s Azure OpenAI Service or Deepset’s Haystack. But that is a time-consuming and resource-intensive endeavor, especially when they can instead tap Kapa.ai’s website widget, deploy its bot for Slack or Zendesk, or use its API, which lets companies customize things a little with their own interfaces.
“Most of the people we work with don’t want to do all the engineering work, or don’t necessarily have the AI resources on their teams to do so,” Sorensen said. “They want an accurate and reliable AI engine that they can trust enough to expose directly to customers, and which has already been optimized for their use-case of answering technical product questions.”
In terms of pricing, Kapa.ai says it uses a SaaS subscription model, offering tiered pricing based on the complexity of the deployment and usage — though it doesn’t publish these prices.
The company has a remote team of nine, with two main hubs: Copenhagen, where Sorensen is based, and San Francisco.
Aside from lead backer Initialized Capital, Kapa.ai’s seed round saw participation from Y Combinator and a slew of angel investors, including Docker founder Solomon Hykes, Stanford professor and AI researcher Douwe Kiela, and Replit founder Amjad Masad.