In an era where automation is not just a buzzword but a fundamental shift in operational paradigms across industries, the media localization sector is no exception.
The drive towards speed, efficiency and scalability has led to the integration of sophisticated artificial intelligence solutions at every level.
The rise of AI in media is a hot topic, with discussions centering around its potential to revolutionize the way we produce, translate, and distribute content.
The landscape is brimming with startups and tech giants alike, unveiling groundbreaking AI solutions tailored for the media. These innovations span various applications, from synthetic media to AI dubbing solutions, signaling a new era in media access and consumption.
Language technology: The past
In reality, language technologies have long been used in the media for accessibility purposes. Both speech recognition and speech synthesis have been used for over two decades to provide media accessibility to deaf and hard-of-hearing viewers as well as to blind and visually impaired viewers.
In the case of the former, live captioning and subtitling services have long been produced with a human in the loop in the form of a respeaker (or voice writer, as they are called in the States) speaking the on-screen dialogue into an automatic speech recognition (ASR) system that churns out live subtitles and captions at the speed of the audio delivery.
In the case of the latter, screen readers and audio subtitling have been used to voice on-screen text, while in the past decade audio description tracks have also been recorded with synthetic voices.
The advent of machine translation in broadcast media, however, marks a more recent milestone. Although early attempts trace back to the early 1990s, it wasn’t until neural machine translation (MT) came along and streaming platforms like iflix and later Netflix adopted it that the technology gained traction in the media sector.
The prospect of localizing vast libraries of content rapidly and cost-effectively for global audiences was too large to ignore and the fluency of neural MT too compelling, so everyone started experimenting with it.
Language technology in service of media localization challenges: The present
The application of language technologies in media localization can address several pressing challenges:
• Cost efficiency: Traditional localization workflows may offer high product quality, but they are labor-intensive and costly. Full or partial automation through language technologies can significantly reduce operational costs, making it viable to localize content for niche markets and languages, as well as for distribution avenues such as FAST (free ad-supported streaming TV) channels, which do not offer an upfront guarantee of a return on localization investment.
• Time-to-market: In the competitive landscape of digital media, speed is of the essence and day-and-date releases are becoming the norm. Language technologies streamline the localization process and as a result can reduce turnaround times from weeks to days or even hours, depending on the level of human involvement, thus enabling faster global distribution.
• Scalability: With the explosive growth of digital content, one of the foremost challenges is scaling localization efforts to meet global demand. Language technologies enable content providers not only to localize content faster and more cost-effectively but to do so in more languages and dialects than ever before.
• Accessibility: Beyond translation, language technologies play a pivotal role in making content accessible to people with disabilities. Live intralingual captioning, subtitling and audio description are not just regulatory requirements but essential services that enhance the viewing experience for all.
• Personalization: With the inclusivity afforded by accessibility services and the hyper-localization of content into viewers’ regional languages and dialects, language technologies make it possible for the first time to personalize access to content and truly break down language barriers globally.
The post-COVID recession in the market, paired with the resulting price pressure and a continuous demand to localize large volumes of content that could not have been localized within the time and cost constraints of traditional workflows, has led to an increased push for the use of such technologies.
Language service providers are being called on to deliver on all three aspects of the “speed, price, quality” trifecta at once, without trading one off against another.
One after the other, they are experimenting with and adopting fully automated and hybrid workflows where language technologies are the enablers of new product lines that can satisfy the demands of the market, as evidenced by recent stats and discussions at MESA’s ITS: Localisation! event in London.
Leveraging language technology: Integrating AI in media localization
OOONA, a leader in the field of software for content localization, is harnessing the power of AI to provide innovative solutions for localization workflows. Not only is it integrating speech recognition, machine translation and text-to-speech (TTS) engines into its vast array of tools, it also offers such integration upstream, directly in its media localization management platform.
In OOONA’s Integrated platform, AI is no more than an available resource to be selected in any given workflow. Once a video asset is available on the platform, it can be subtitled in the source audio language by a professional or by one of the available ASR engines. The same goes for any translation task. Such automation can carry on to post-editing by a professional with the relevant qualifications if the workflow envisages such a step.
Once an AI-enabled workflow is set up, task initiation and succession take place in an automated manner and the completion of each step automatically triggers the next one in the process with the relevant resource notified.
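The step-chaining behavior described above can be pictured with a minimal Python sketch. It is an illustration only and does not represent OOONA's implementation or API: the `Step` and `Workflow` classes, the resource names and the notification format are all hypothetical. It shows a sequence of tasks in which completing one step activates the next and notifies the resource assigned to it, whether that resource is a person or an AI engine.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str      # e.g. "transcribe", "translate", "post-edit"
    resource: str  # a professional or an AI engine; names are hypothetical


@dataclass
class Workflow:
    steps: list[Step]
    notifications: list[str] = field(default_factory=list)
    current: int = 0  # index of the step that is currently active

    def start(self) -> None:
        """Activate the first step, if the workflow has any steps."""
        if self.steps:
            self._notify()

    def complete_current(self) -> None:
        """Close the active step and automatically trigger the next one."""
        if self.finished:
            raise RuntimeError("workflow already finished")
        self.current += 1
        if not self.finished:
            self._notify()

    def _notify(self) -> None:
        # Stand-in for a real notification (email, API call, job queue, ...).
        step = self.steps[self.current]
        self.notifications.append(f"{step.resource}: '{step.name}' is ready")

    @property
    def finished(self) -> bool:
        return self.current >= len(self.steps)


workflow = Workflow([
    Step("transcribe", "asr-engine"),
    Step("translate", "mt-engine"),
    Step("post-edit", "editor@example.com"),
])
workflow.start()
workflow.complete_current()  # ASR done, so the MT step is triggered
```

In this sketch the human post-editor is simply one more resource in the list, which mirrors the idea that AI is just another resource that can be selected in a workflow.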
Scripts can be recorded by voice talents or by one of the synthetic voices available in each TTS engine, with audio description being the primary use case. The synthetic voice output can also be post-edited for pronunciation, speed, pitch, volume, etc., as all editable features of the TTS engine are made available in the OOONA user interface via API.
As client demand for these technologies increases, OOONA plans to adhere to its standard approach of prioritizing user feedback in further development efforts. This means integrating any AI engine and any interface functionality that users request.
Given the rapid advances in language technology development and the competitive landscape among the different engines, the best engine for a given language can change over time, so it is crucial to be able to select engines on a case-by-case basis.
To address this, OOONA offers flexible AI bundles for purchase, allowing users to deploy them as needed on any integrated ASR, MT, or TTS engine.
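As a rough illustration of case-by-case engine selection, the Python sketch below routes each task and language pair to an engine and draws usage from a shared bundle. Everything in it is an assumption made for the example: the engine names, the `ROUTES` table, the `pick_engine` function, the `Bundle` class and the notion of "credits" are hypothetical and do not describe how OOONA's bundles are actually priced or implemented.

```python
from dataclasses import dataclass

# Hypothetical routing table: (task, language) -> engine name.
# "*" marks the default engine for a task when no language-specific entry exists.
ROUTES = {
    ("mt", "ja"): "engine-b",
    ("mt", "*"): "engine-a",
    ("asr", "*"): "engine-c",
    ("tts", "*"): "engine-d",
}


def pick_engine(task: str, language: str, routes: dict = ROUTES) -> str:
    """Return the engine for a task/language pair, else the task's default."""
    for key in ((task, language), (task, "*")):
        if key in routes:
            return routes[key]
    raise KeyError(f"no engine configured for task {task!r}")


@dataclass
class Bundle:
    credits: int  # the unit (minutes, characters, ...) is an assumption

    def spend(self, task: str, language: str, cost: int) -> str:
        """Deduct usage from the bundle and return the engine to call."""
        engine = pick_engine(task, language)  # resolve first, then charge
        if cost > self.credits:
            raise ValueError("not enough credits left in the bundle")
        self.credits -= cost
        return engine


bundle = Bundle(credits=100)
engine = bundle.spend("mt", "ja", cost=30)  # uses the Japanese-specific engine
```

Keeping the routing table separate from the bundle means an engine can be swapped for one language by editing a single entry, without touching how usage is accounted for.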
Going forward
The journey of language technologies in media has been marked by remarkable progress, from their first steps in the 1990s to the specialized solutions of today.
As we navigate the challenges and opportunities of a global digital landscape, the role of language technologies has never been more pertinent.
OOONA’s approach to content localization automation focuses on the dynamic synergy between human expertise and artificial intelligence, leading the way towards a more inclusive, accessible, and connected world.
=====================================
By Ma’ayan Leeper Carr, CMO, OOONA