Meta has released an “open” implementation of the viral generate-a-podcast feature in Google’s NotebookLM.
Called NotebookLlama, the project uses Meta’s own Llama models for much of the processing, unsurprisingly. Like NotebookLM, it can generate back-and-forth, podcast-style digests of text files uploaded to it.
NotebookLlama first creates a transcript from a file, such as a PDF of a news article or blog post. Then it adds “more dramatization” and interruptions before feeding the transcript to open text-to-speech models.
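The three stages described above (write a transcript, dramatize it, synthesize speech) can be sketched roughly as follows. This is only an illustrative outline, not NotebookLlama's actual code: the function names, the stage signatures, and the stand-in stages are all hypothetical, and a real version would call a Llama model and a text-to-speech model where the stubs are.

```python
from typing import Callable, List, Tuple

# Hypothetical stage signatures, assumed for illustration only.
WriteTranscript = Callable[[str], str]                # source text -> draft transcript
Dramatize = Callable[[str], List[Tuple[str, str]]]    # draft -> (speaker, line) turns
Synthesize = Callable[[str, str], bytes]              # (speaker, line) -> audio bytes


def make_podcast(
    source_text: str,
    write_transcript: WriteTranscript,
    dramatize: Dramatize,
    synthesize: Synthesize,
) -> List[bytes]:
    """Run the three stages in order and return one audio clip per turn."""
    draft = write_transcript(source_text)
    turns = dramatize(draft)
    return [synthesize(speaker, line) for speaker, line in turns]


# Stand-in stages so the sketch runs without any model.
def fake_writer(text: str) -> str:
    return f"Summary: {text}"


def fake_dramatizer(draft: str) -> List[Tuple[str, str]]:
    return [("Speaker 1", draft), ("Speaker 2", "Wait, really?")]


def fake_tts(speaker: str, line: str) -> bytes:
    # A real text-to-speech model would return audio; this returns labeled text.
    return f"{speaker}: {line}".encode("utf-8")


clips = make_podcast(
    "Meta released NotebookLlama.", fake_writer, fake_dramatizer, fake_tts
)
```

Passing each stage in as a callable keeps the outline independent of any particular model, which matters here because the researchers single out the text-to-speech stage as the part most worth swapping.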
The results don’t sound nearly as good as NotebookLM’s. In the NotebookLlama samples I’ve listened to, the voices have a very obviously robotic quality to them, and tend to talk over each other at odd points.
But the Meta researchers behind the project say that the quality could be improved with stronger models.
“The text-to-speech model is the limitation of how natural this will sound,” they wrote on NotebookLlama’s GitHub page. “[Also,] another approach of writing the podcast would be having two agents debate the topic of interest and write the podcast outline. Right now we use a single model to write the podcast outline.”
NotebookLlama isn’t the first attempt to replicate NotebookLM’s podcast feature. Some projects have had more success than others. But none, not even NotebookLM itself, has managed to solve the hallucination problem that dogs all AI. That is to say, AI-generated podcasts are bound to contain some made-up stuff.