‘Where we are today in biology AI is similar to GPT in 2020’: An interview with the CEO of Africa’s biggest AI startup



In January last year, German biotech company BioNTech acquired African AI startup Instadeep for over $550 million, a deal finalized in July of the same year. Instadeep, whose exit is currently the largest from Africa, has been operating under the German pharma umbrella for just over a year. Now is a good time to look at how it has fared since the acquisition.

Instadeep uses advanced machine learning techniques to bring AI into enterprise applications. Its products range from GPU-accelerated insights to self-learning decision-making systems. Before last year’s acquisition, the Tunis-born, Paris- and London-headquartered enterprise AI startup raised over $108 million from several global investors, including Google, Deutsche Bahn, and BioNTech. These three strategic investors were also among the startup’s biggest partners and clients.

Notably, the decade-old startup collaborated with BioNTech to develop an early warning system that could detect high-risk COVID-19 variants months ahead of time during the pandemic. Instadeep worked with Google DeepMind to create an early detection system for desert locust outbreaks in Africa. It also collaborated on a moonshot project to automate railway scheduling for Deutsche Bahn, the largest rail operator in Europe.

While these partnerships show various applications for Instadeep’s solutions, its acquirer had a clear use case: using AI to develop therapeutics and vaccines for various cancers and infectious diseases. Instadeep is now doubling down on that work under its new owner.

Fifteen months on from completion of the BioNTech acquisition, co-founder and CEO Karim Beguir told TechCrunch in an interview that Instadeep has made significant progress on that front, even as the AI company — which continues to operate independently — still delivers solutions to clients outside biotech.

“We’re strategically aligned with BioNTech on the objectives to be pursued in biology and bio AI capabilities,” the Instadeep chief said. “But we also have room to maneuver and continue to be a force in AI in Africa and in general while continuing to develop technologies that push the frontier of innovation in other verticals like industrial optimization.” 

Increasing capabilities within biotech 

Beguir notes that Instadeep’s objective in the past year since its acquisition has been to deploy AI at every step in BioNTech’s pipeline to improve existing processes. 

He shares an example in histology, which involves tissue analysis and the visual task of labeling different tissues, such as identifying tumor cells or healthy cells. According to him, experts at BioNTech traditionally performed this work manually. However, Instadeep’s tech has helped accelerate the process by deploying visual AI and segmentation systems, speeding up this tissue-labeling workflow fivefold.

Another example is the completion of its RiboMab project, which involves mRNA-encoded antibodies that have now become part of BioNTech’s toolkit as an immunotherapy company fighting cancer and other diseases. InstaDeep introduced this project on its DeepChain platform, which designs proteins and analyzes biological data, during the two companies’ first collaboration in 2020.

Biotech involves a wealth of sensitive healthcare data. Collecting and analyzing it is one thing. Keeping it safe is another. Just ask 23andMe, once heralded as a disruptor in the biotech space before it fell victim to a massive breach that exposed the data of nearly 7 million people, half of its customer base.

Interestingly, BioNTech is no stranger to such events. In 2020, hackers illegally accessed documents related to its COVID-19 vaccine, developed with Pfizer, by attacking the European Medicines Agency (EMA), Europe’s medicines regulator, which assesses medicines and vaccines. While Pfizer and BioNTech confirmed that their systems and trial data remained secure, the incident highlights how vulnerable organizations, even regulatory ones, can be to cyberattacks.

As any CEO would say, Beguir tells me that Instadeep and BioNTech are highly cautious with healthcare data, especially as the partnership is currently using AI to increase data assets, allowing them to identify precise protein sequences and potentially unlock new targets for cancer and other immunotherapy use cases. 

But the two companies divide up the data they work with. BioNTech handles personal, real-life patient data, while Instadeep typically develops models and trains them on publicly available data. This is how, for example, it trained its Nucleotide Transformer, a series of AI genomics models that today is the most downloaded and popular AI genomics model in the world. [Thanks in part to this open-source deal.]

“Instadeep developed and trained the nucleotide model on public data,” Beguir notes. “However, when we wanted to deploy the model on specific use cases and real-life patient data, we did this at the BioNTech level, with all the privacy guarantees that come from its position as one of the leading players in biopharma operating under strict regulations and following rigorous quality protocols.”

Developing new technologies within BioNTech and outside of biotech

When asked what the next milestones are for Instadeep within BioNTech, Beguir mentions the startup’s “latest breakthrough”: Bayesian flow networks (BFN), a new generative AI model for proteins that significantly outperforms autoregressive and diffusion models, according to the company. BioNTech CEO Ugur Sahin, in a statement, describes it as a “state-of-the-art technology.”

According to Beguir, the model produces the most natural-looking and best-behaved proteins on the market by allowing systems to search for specific properties on an antibody’s heavy chain, including chemical characteristics, hydrophobicity, or sequence length. Such models are crucial for understanding complex protein functions and engineering novel therapeutic proteins.

“We’re excited about the potential of AI innovations like ours to identify real use cases, collaborate closely with BioNTech, and build products that will be tested in labs and clinics, ultimately saving patients’ lives,” said Beguir. “If you consider where we are today in biology and AI, it’s similar to where we were with natural language processing in 2020 with GPT-3. Systems were starting to work, and their capabilities were impressive, but there was still room for improvement.”

Instadeep launched the new AI model last week alongside a new near-exascale supercomputer, which, according to the companies, places the partnership in the top 100 of compute and infrastructure and the top 20 of H100 GPU clusters globally.

Both developments highlight how Instadeep, under BioNTech, deploys AI across several life science use cases. Separately, it independently runs its other business line, which involves AI and deep reinforcement learning for industrial optimization.

One example is its years-long ongoing project to automate railway planning and dispatching for Deutsche Bahn, one of its long-time partners and Europe’s largest rail operator. Similarly, the Tunis- and London-based AI company has bolstered efforts to develop other industrial optimization use cases, such as collaborating with Fraport in Germany to optimize complex airport operations with AI.

“In general, we also see the potential of AI agents as very compelling for the future. We think industrial optimization and agentic-based systems, working hand in hand with human colleagues, will revolutionize industrial efficiency. So this is also another area we’ve been at for many years and one where we are continuing to invest,” noted Beguir.

Meanwhile, earlier this month in San Francisco, Instadeep launched the pro version of DeepPCB (Deep Printed Circuit Board), a printed circuit board design product driven entirely by autonomous AI powered by reinforcement learning. Beguir says the company’s competitors are smaller AI startups in the specific areas where it operates, such as Riyadh-based Intelmatix.

The Instadeep chief takes pride in his company’s work on solving more complex use cases of AI – for example, Gen AI for DNA or proteomics or agentic workflows for combinatorial optimization – and steering away from simple ones like Gen AI for NLP. He claims that, besides BioNTech’s acquisition, this ingenuity plays a considerable part in driving inbound interest from customers in the U.S., where the AI company now has two offices, and also across Europe: Berlin, Paris, and the U.K., in particular.  

Even though BioNTech spent over $550 million on Instadeep to boost its biotech capabilities, it keeps the AI company operationally independent for reasons like this, while funding its activities to serve customers beyond the biotech industry.

“Because we contribute value by being leaders in AI, and AI skills can be improved across multiple sectors,” answered Beguir when asked why BioNTech still allows the AI company to work on non-biotech projects. “It’s the same tech stack, so time working on AI outside biotech is not time lost at all. BioNTech also deploys InstaDeep on tasks outside biotech R&D, such as in operations optimization.”

Beguir explains that while InstaDeep wasn’t forced to sell, it was the shared vision and successful projects with BioNTech since 2019, long before the acquisition, that convinced the AI company to move forward with the deal. He believes the trust built over years of collaboration is why InstaDeep will remain independent under BioNTech. The key for InstaDeep now is to keep up its momentum, maintain high-quality results, and continue innovating for as long as possible.

Since the acquisition, InstaDeep has grown to over 400 employees worldwide. This includes its team in Africa, based in a new office in Kigali, which is leading the company’s geospatial intelligence work.

What began as an on-the-ground effort in partnership with Google to detect locust breeding grounds in Africa now uses past labeled data and satellite imaging to predict, with 80-85% accuracy, where locust breeding grounds will be in the next 30 days. Beguir says InstaGeo, the company’s framework that uses multispectral satellite imaging from NASA or the European Space Agency (ESA), is open-source and available for other companies to develop scalable solutions across the continent.

“This is a real example of how AI technology and capability is having an impact. Rather than collecting samples on the ground or depending on ground infrastructure, we can deliver those insights via satellites at scale and notify multiple governments and actors to tackle a growing challenge to food security, especially given the continent’s climate issues.”



