Text-to-Speech: The audio opportunity that shouldn’t be ignored

By Ann-Katrine Fredenslund | Feb 11 2021 | Insights | Cases | Media | Audio


Audio and spoken content have been making quite a noise among publishers and newspapers in recent years. And no wonder: reading takes time and focus, both of which are in short supply in modern life. At the same time, being accessible at all times is key, no matter the context in which users find themselves.

The German regional newspaper NOZ (Neue Osnabrücker Zeitung) and the Danish media group JFM (Jysk Fynske Medier) have both focused on audio in recent years.

Visiolink caught up with Patrick Körting, Head of Audio at NOZ Digital, and Jørn Broch, Digital Editor at JFM, to talk about the potential they see in audio and spoken content.


Secondary usage of media

On-demand consumption of audio is stronger than ever, and non-textual usage continues to grow. Giving readers the possibility to listen to the newspaper wherever and whenever they want enhances reach and engagement and creates a new kind of news experience.

Both NOZ and JFM work with human-narrated articles, where a speaker records a spoken version of the article, with podcasts, and with automated, machine-spoken Text-to-Speech articles.

“At JFM we have been working with audio for some years. Our starting point for working with audio was that all our offerings to consumers were centred around the activity of reading. We wanted to become relevant in other usage scenarios as well, for instance while driving, where within 10 minutes you could have the most important news read aloud,” Jørn Broch explains.

Patrick Körting adds to that point:

“Where reading the news is by nature an active way of consuming it, listening is passive. In that context, there are three instances where you will never be able to engage with users through a text-based product: while driving, while doing household chores and during the morning routine. These are things most of us do every single day, and here audio can play a key role in reaching those generations who have an on-demand audio behaviour, putting together their own personalized audio-based playlist.”

“In my opinion, audio is no longer a nice-to-have option which newspapers can choose to have. It is a must-have feature if they want, in the long term, to reach new and younger audiences, change habits and thereby ultimately survive,” Patrick Körting outlines.


Behavioural change in news consumption

Listening to news is becoming more and more relevant, both in engaging with existing users and in reaching new target groups.

According to Patrick Körting, audio is relevant to NOZ due to three factors.

“First, we have a huge asset of 400 colleagues who all work on creating a great amount of news content each day. Audio is a new lever for presenting this originally written content. Second, there is a behavioural change in the way people consume news and content: reading the news is one way of consuming it, but new ways of consuming prosper as well, audio being one of these. Third, the technological evolution within Text-to-Speech and recording technology has taken off in recent years, making it possible to create audio content easily and at a low price,” Patrick Körting explains and continues:

“The three factors tie well together, and audio fits perfectly with modern ways of living and consuming media in new usage scenarios while doing other things. Parallel use is the key.”

However, while an overall behavioural change is taking place in the way news is consumed, it takes time to get existing users into the habit of listening to news, Patrick Körting points out:

“It’s important to notice that while audio can over time generate new and younger audiences, who have already caught on to audio as part of their media consumption, changing the behaviour of existing users calls for a change of habit.”

“What we have done with our existing customers is to really put focus on and explain again and again that audio is available, where to find it and how to use it, how to put together a playlist and the advantages of having your individual news broadcast read aloud for instance while driving. In doing so, we can really see an increase in the number of people using audio – but it takes time and legwork,” he says.


Cost-effective audio reach

The quality of machine-narrated Text-to-Speech has improved rapidly in the past years, and the technological development continues to accelerate. Combined with the ability to create audio content easily and cheaply, this makes Text-to-Speech relevant to both JFM and NOZ.

“Text-to-Speech is relevant because it doesn’t take up people resources in our organisation to record audio content. Given the number of articles we produce at JFM, the business case would become way too expensive if people were to spend time narrating all articles,” Jørn Broch says, and continues:

“Text-to-Speech might not have the same personal connection as a human-narrated article has, but instead you have a cost-effective solution and the advantage of scaling, and the quality is beginning to draw close to human narration.”

Patrick Körting also comments on the quality of Text-to-Speech:

“We are seeing that technology is taking off so quickly that soon you will not be able to recognise whether you are listening to a human or a machine. It is my firm belief that within probably just two years, it will be close to impossible to determine whether you are listening to a computer voice or a human.”


Are you up for getting started with audio as well? Stay tuned as Visiolink will soon launch the next generation of Text-to-Speech based on Google Cloud.

Ann-Katrine Fredenslund

