Gladia raises $16M for AI transcription and analytics


Gladia, an AI transcription and audio intelligence provider, has raised $16 million in funding.

The Paris-based company will use the funding to develop end-to-end audio infrastructure, starting with a new real-time audio transcription and analytics engine, so that voice-first platforms can deliver more value to their users across borders with cutting-edge AI.

It's a challenge to rivals such as Otter.ai and Fireflies.ai, as well as other AI-based services that transcribe voice conversations to text. In an interview with VentureBeat, CEO Jean-Louis Quéguiner explained to me why he started the company.

“As you can hear from a beautiful French accent, I’m not an English speaker and I was extremely frustrated with the accents,” Quéguiner said. “That’s why I founded the company.”

I got a demo of the AI transcription, and it worked in real time as Quéguiner spoke English with his heavy French accent. I'm used to services like Otter getting a few words wrong in a transcription, but in the first page of results from Gladia, I saw no errors. He also showed how he could speak two different languages and the system could shift from one language to another as needed.

XAnge led the round, with participation by Illuminate Financial, XTX Ventures, Athletico Ventures, Gaingels, Mana Ventures, Motier Ventures, Roosh Ventures, and Soma Capital.

Gladia uses AI for audio transcription.

Founded in 2022, Gladia has now raised a total of $20.3 million, with earlier seed investments led by New Wave, Sequoia Capital (as part of the First Sequoia Arc program), Cocoa, and GFC. Gladia was recently selected to participate in the AWS generative AI accelerator program.

“Gladia represents the qualities we like to champion at XAnge: a bold, global tech team at the forefront of AI innovation, with a proven business model to unlock new opportunities across industries,” said Alexis du Peloux, partner at XAnge, in a statement. “In a fast-paced AI environment, Jean-Louis Quéguiner and his team have executed extremely well, and we are proud to back Gladia for the Series A.”

Given that most speech recognition models today are trained predominantly on English audio data and are therefore inherently biased, Gladia prioritized building the first real-time product that is truly multilingual.

The new fine-tuned engine delivers superior real-time transcription in over 100 languages, along with enhanced support for accents and the unique ability to adapt to different languages on the fly.

Gladia's new engine is unique in its ability to extract insights from a call, such as the caller's sentiment, key information, and a conversation summary, in real time. This means it takes less than a second to generate both the transcript and insights from a call or meeting using Gladia.

New real-time AI transcription

Gladia founders Jonathan Soto (left) and Jean-Louis Quéguiner.

Building an accurate, low-latency, and multilingual engine in-house is a complex and resource-intensive task. It requires extensive expertise in language understanding and real-time data handling, along with continuous optimization and maintenance. Real-time models require more computing power and may struggle to provide accurate output immediately because of limited context.

Gladia's new product allows companies to bypass these challenges. The real-time speech-to-text engine boasts an industry-leading latency of under 300 milliseconds without compromising accuracy, regardless of the language, geography, or tech stack used.

“Companies are spending valuable time and resources trying to incorporate multiple AI functions into their existing platforms,” said Jonathan Soto, CTO of Gladia, in a statement. “Our single API is compatible with all existing tech stacks and protocols, including SIP, VoIP, FreeSwitch, and Asterisk. This allows us to easily integrate real-time transcription and analysis into our customers’ AI platforms, so they can focus on delivering the best services to their end users.”

What's ahead

The company's first async transcription and audio intelligence API launched in June 2023 and was based on a proprietary version of Whisper ASR.

It quickly gained traction in the enterprise market, particularly with meeting recorders and note-taking assistants. The API is now used by over 600 customers around the world, including Attention, Circleback, Method Financial, Recall, Sana, and VEED.IO, and has more than 70,000 users.

“Gladia’s technology allows companies in vertical markets that need cutting-edge real-time transcription, including sales enablement and contact center platform, to shift seamlessly from manual post-call processing to proactive, low-latency workflows,” Quéguiner said. “Whether it’s automated CRM enrichment or real-time guidance for support agents, Gladia is designed to help businesses operate smarter and more efficiently in record time, without requiring AI expertise in-house.”

Gladia will use the new capital to advance its R&D efforts, soon bring to market a one-stop AI toolkit for audio, and expand its product offering with additional à la carte models, including large language models (LLMs) and retrieval-augmented generation (RAG). With several design partners in the contact-center-as-a-service (CCaaS) segment, the company is currently piloting an agent-assist solution powered by Gladia's real-time AI engine. Additionally, Gladia will continue to expand its talent base as it prepares for international expansion.

“We are multilingual, and we have something that is called ‘code switching,’ which makes it unique,” Quéguiner said. “You can start with the language and switch to another.”

He went on to show me that he could start a call in English and initiate the transcription. Then he spoke French words, and the model correctly transcribed them in French.

“Keep in mind that [others] are not real time right now, and this one is real time,” he said. “Usually, real time is a little bit less accurate. You can also have your own custom vocabulary in real time, which is pretty unusual, with us. We have the capability to extract some real-time insights.”

The service has an AI summarizer, and it will get new optional features in the coming months. Quéguiner said that his service can also get acronyms right and detect the switch to another language.

“The model we use is similar to LLMs (large language models). It has no code decoder architecture, which isn’t the case for most of the models that you’ve seen with Fireflies, for instance,” Quéguiner said.

The market includes “meeting recorders,” Quéguiner said. The results can be passed on as real-time insights, which can help people like sales leads close deals faster.

The company also works with call centers, giving them a 30% faster time to completion when they are on the phone, thanks to better accuracy. The company will charge a flat fee, such as per-hour pricing.
