Retab raises USD 3.5M and launches most powerful document AI platform on the market

The developer-first platform for document automation comes out of stealth with new funding, new product, and a bold plan to become infrastructure for the next wave of vertical AI.

Paris, France & San Francisco, CA – July 30, 2025 – AI agents are poised to change the world, but they can’t read the documents that run it. Retab, a San Francisco-based startup founded by engineers frustrated by the broken state of document AI, is fixing that. Today, the company announces $3.5 million in pre-seed funding and the launch of its platform.

The round was backed by leading early-stage funds including VentureFriends, Kima Ventures, and K5 Global, alongside Eric Schmidt (via StemAI), Olivier Pomel (CEO, Datadog), and Florian Douetteau (CEO, Dataiku). The new capital will support platform development and community growth, as the company scales its infrastructure to meet rising demand from vertical AI startups and internal innovation teams alike.

Retab is a developer platform and SDK that redefines everything about document processing in the age of large language models. Developers define the schema of the data they need; Retab handles the rest – from dataset labeling and evaluations, to automated prompt engineering & model selection.
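To make the "define the schema" idea concrete, here is a minimal sketch in plain Python. It does not use Retab's SDK; the field names, the `INVOICE_SCHEMA` mapping, and the `validate` helper are all hypothetical and only illustrate what declaring target fields and checking an extraction result against them might look like.

```python
# Hypothetical target schema: field name -> expected Python type.
# These fields are illustrative and are not taken from Retab's documentation.
INVOICE_SCHEMA = {
    "invoice_number": str,
    "total_amount": float,
    "currency": str,
}


def validate(record: dict, schema: dict = INVOICE_SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for name, expected in schema.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            actual = type(record[name]).__name__
            problems.append(
                f"wrong type for {name}: expected {expected.__name__}, got {actual}"
            )
    return problems


if __name__ == "__main__":
    extracted = {"invoice_number": "INV-1", "total_amount": "12.5"}
    for problem in validate(extracted):
        print(problem)
```

A check like this is the kind of structural guard a schema makes possible: whatever model produced the output, the record either matches the declared fields or is flagged.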

“People keep building demos that look like magic, but break the moment you put them into production,” said Louis de Benoist, co-founder and CEO of Retab. “We lived that pain ourselves. Wiring up fragile pipelines just to extract a few fields from a PDF. We built Retab because it’s the developer-first platform we always wished we had.”

Louis and his co-founders cut their teeth building internal automation tools for document-heavy workflows in logistics. Over time, they realized their true value wasn’t in the output but in the orchestration layer they’d built to make the models work. That tooling became the foundation of Retab – now used by dozens of companies to extract structured data from messy, real-world inputs.

Retab is not another large language model. It’s the essential intelligence layer that makes the world’s most powerful models – from providers like OpenAI, Google, and Anthropic – usable for critical workflows. Developers define the data they need, and Retab’s platform manages the entire lifecycle to ensure verifiable accuracy.

The platform delivers guaranteed performance through a system of intelligent checks and balances:

Self-Optimizing Schemas: An AI agent automatically tests and refines instructions based on a user’s documents, maximizing accuracy before the system ever goes live.

Intelligent Model Routing: The platform is model-agnostic. It automatically benchmarks and routes each task to the best-performing model for the job, whether the priority is cost, speed, or accuracy. This can make processes up to 100x cheaper than other solutions.

Guided Reasoning & k-LLM Consensus: Retab forces models to “think” step-by-step and uses a consensus mechanism among multiple models to quantify uncertainty, acting as a powerful safety net to ensure trustworthy results.
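Two of the mechanisms above, model routing and k-LLM consensus, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Retab's implementation: the `route` and `consensus` functions, the benchmark numbers, and the model names are all made up for the example. Routing here means "cheapest model that meets an accuracy threshold", and consensus means a simple majority vote whose agreement ratio serves as a rough uncertainty signal.

```python
from collections import Counter


def route(benchmarks: dict[str, dict[str, float]], min_accuracy: float) -> str:
    """Pick the cheapest model whose measured accuracy meets the threshold."""
    eligible = {
        name: stats
        for name, stats in benchmarks.items()
        if stats["accuracy"] >= min_accuracy
    }
    if not eligible:
        raise ValueError("no model meets the accuracy threshold")
    return min(eligible, key=lambda name: eligible[name]["cost"])


def consensus(answers: list[str]) -> tuple[str, float]:
    """Majority vote over k answers; returns (winning value, agreement ratio)."""
    if not answers:
        raise ValueError("need at least one answer")
    # Normalize lightly so trivial formatting differences do not split the vote.
    normalized = [a.strip().lower() for a in answers]
    value, votes = Counter(normalized).most_common(1)[0]
    return value, votes / len(normalized)


if __name__ == "__main__":
    # Made-up benchmark results for three hypothetical model configurations.
    benchmarks = {
        "small": {"accuracy": 0.97, "cost": 1.0},
        "medium": {"accuracy": 0.992, "cost": 4.0},
        "large": {"accuracy": 0.995, "cost": 20.0},
    }
    print(route(benchmarks, min_accuracy=0.99))

    # Three hypothetical model outputs for the same field; one misreads a digit.
    print(consensus(["INV-1042", "inv-1042 ", "INV-1O42"]))
```

A low agreement ratio is the useful part of the second function: a field where the models disagree can be sent to human review instead of being accepted silently.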

“Retab is the OS for reliably extracting structured data,” said de Benoist. “It wraps the best models in a layer of logic that actually makes them usable with error handling and structured outputs. That’s what devs need if they want to build production apps, not just prototypes.”

Customers across logistics, finance, and healthcare are already seeing results. A major trucking company used Retab to find the smallest, fastest model configuration that could meet its 99% accuracy threshold, dramatically lowering operational costs. A financial services firm uses Retab to extract specific quantitative metrics and qualitative risk factors from 200-page quarterly reports – a task that previously took a team of analysts days to complete. Others are automating claims processing, medical records, identity verification, and onboarding with minimal setup.

According to Florian Douetteau, co-founder and CEO of Dataiku and investor in Retab, “the AI-fication of the economy depends on the capability to convert operations based on millions of documents into verified, structured data that autonomous systems can utilize. On a large scale, this process hinges on quality control, cost efficiency, and rapid implementation. The team at Retab understands this thoroughly and is uniquely positioned to solve it for the thousands of AI first companies that are emerging.”

Looking ahead, Retab is expanding its platform to apply the same reliable extraction methods to websites and is launching integrations with automation platforms like n8n, Zapier, and Dify.

Retab is also building toward its long-term vision: to serve as the intelligent middleware layer between the world’s unstructured data and the AI agents that need to understand it. Whether it’s parsing a loan file, a contract, or a customs manifest, Retab makes unstructured data usable, safe, and programmable.

With just ten employees and a fast-growing developer base, Retab is already positioning itself as a foundational layer in the AI infrastructure stack – a tool that doesn’t just show what’s possible, but lets anyone build with it.
