AI NARRATION NAMING Guidelines for audiobooks

Developed by the industry, the Audio Publishers Association and the Audio Publishers Group (part of the UK Publishers Association), October 2024

INTRODUCTION

With the proliferation of AI narration tools, the aim of these guidelines is to establish best-practice guidance for publishers, distributors, book management software providers and retailers regarding the terms that should be used to distinguish types of AI narration products, and where and how these should be applied.

The naming conventions suggested in this document are designed to be transparent for publishers, retailers and distributors, while offering clarity for consumers, allowing for informed choice and ensuring continued trust in our audio products.

These guidelines make no judgements on the use of AI narration in the industry.

In the majority of cases, responsibility for correctly applying these naming conventions within an audiobook’s metadata should lie with the publisher.

It is further recommended that retailers and distributors should display this information prominently to provide clarity for consumers.

TYPES OF AI VOICE NARRATION

These guidelines identify two types of AI narration that should be distinguished from each other using the following naming conventions:

AI VOICE

An AI-based synthesized voice that has been generated using samples from a large group of unidentified speakers.

An example of this is Google’s “Auto Narration” voices, such as “Archie” or “Mary”.

AUTHORIZED VOICE REPLICA (AVR)

An AI-based voice that has been generated using authorized/licensed samples from a specific human voice and seeks to replicate that voice.

An example of this would be a publisher working with a deceased author’s estate to create an authorized voice replica based on archive samples of the author’s voice.

In the past, replication has also been referred to as ‘cloning’. For the purposes of these guidelines, ‘cloning’ refers to unauthorized replication where a human has not given permission for their voice to be replicated.

Most AI Voices have identifying names, in some cases a single given name. These guidelines encourage the use of this name as the ‘narrator’ or ‘reader’ in the title metadata, allowing consumers to easily identify and select voices, as they do with standard human narrators.

WHERE AND HOW TO APPLY THESE LABELS

FOR PUBLISHERS

In the majority of cases, it should be the responsibility of the publisher to ensure that AI narration is correctly identified in a title’s metadata.

A narrator voice is deemed to be an AI Voice or Authorized Voice Replica if it was the intention of the producer/publisher to use a synthetic or replica voice instead of a human narrator for the whole or part of an audiobook. In cases where AI narration has been used for only a small part of a production or in post-production, a publisher should consider using these terms if more than 10% of a voice has been created using AI tools.
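The 10% rule of thumb above can be expressed as a simple check. The sketch below is illustrative only: these guidelines do not say how the share of an AI-created voice should be measured, so measuring it by running time is an assumption, and the function name and parameters are hypothetical.

```python
def should_label_ai_narration(ai_seconds: float, total_seconds: float,
                              threshold: float = 0.10) -> bool:
    """Return True if the AI-created share of a voice exceeds the threshold.

    Assumption: the share is measured by running time, in seconds.
    """
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if not 0 <= ai_seconds <= total_seconds:
        raise ValueError("ai_seconds must be between 0 and total_seconds")
    return ai_seconds / total_seconds > threshold


# 45 minutes of AI-created voice in a 6-hour recording is 12.5% of the total.
print(should_label_ai_narration(45 * 60, 6 * 3600))  # True
```

A publisher may prefer a different measure (for example, word count), in which case only the two inputs change.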

How should publishers label AI narration in title metadata?

These guidelines recommend that publishers use title metadata only to convey to retailers the presence of AI narration.

ONIX can be used to easily transmit this AI narrator information to retailers. Many publishers will use book management software or a metadata portal in which new fields can be developed to easily reflect the following ONIX codelists and tags for each title.

For publishers not using ONIX, equivalent information regarding AI narrators could be supplied via a spreadsheet (such as Excel), or direct to a retailer in their proprietary format.
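As a rough illustration of the spreadsheet route, the sketch below writes the same narrator information to CSV text, which spreadsheet tools such as Excel can open. The column names are hypothetical and are not defined by these guidelines; the actual layout would need to be agreed with each retailer.

```python
import csv
import io

# Hypothetical column names; agree the real layout with each retailer.
FIELDNAMES = ["title", "narrator_name", "narration_type"]

rows = [
    {"title": "Pride and Prejudice", "narrator_name": "Amir",
     "narration_type": "AI Voice"},
    {"title": "The Hound of the Baskervilles", "narrator_name": "Jane Smith",
     "narration_type": "Authorized Voice Replica"},
]


def to_csv(records):
    """Serialize the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_csv(rows))
```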

Below are two examples of how ONIX code can be used to reflect the two types of AI narration established in these guidelines:

  • AI Voice

In ONIX List 19 (Unnamed persons), codes 05–07 can be used for AI Voice. ONIX calls this a “synthetic voice” – a generic AI-based voice generated using samples from a large group of unidentified speakers. In this example, the generic AI Voice has been given the name Amir.

<Contributor>
  <SequenceNumber>2</SequenceNumber>
  <ContributorRole>E07</ContributorRole> <!-- read by -->
  <UnnamedPersons>05</UnnamedPersons> <!-- synthesized voice male -->
  <AlternativeName>
    <NameType>07</NameType> <!-- fictional character name -->
    <PersonName>Amir</PersonName>
  </AlternativeName>
</Contributor>

In this example, Amir is a male AI Voice. The Unnamed Persons codes 06 (Synthetic Voice Female) or 07 (Synthetic Voice Neutral) could also be used.
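For publishers whose systems generate ONIX programmatically, the AI Voice composite above could be assembled along the following lines. This is a minimal sketch using Python's standard library; the function name and its parameters are hypothetical, and the element order simply mirrors the example in these guidelines.

```python
import xml.etree.ElementTree as ET

# List 19 codes named in these guidelines for a generic AI Voice.
AI_VOICE_CODES = {"05", "06", "07"}  # male, female, neutral


def ai_voice_contributor(sequence: int, unnamed_code: str,
                         voice_name: str) -> ET.Element:
    """Build a <Contributor> composite for a named AI Voice."""
    if unnamed_code not in AI_VOICE_CODES:
        raise ValueError("expected List 19 code 05, 06 or 07")
    contributor = ET.Element("Contributor")
    ET.SubElement(contributor, "SequenceNumber").text = str(sequence)
    ET.SubElement(contributor, "ContributorRole").text = "E07"  # read by
    ET.SubElement(contributor, "UnnamedPersons").text = unnamed_code
    alternative = ET.SubElement(contributor, "AlternativeName")
    ET.SubElement(alternative, "NameType").text = "07"  # fictional character name
    ET.SubElement(alternative, "PersonName").text = voice_name
    return contributor


element = ai_voice_contributor(2, "05", "Amir")
ET.indent(element)  # pretty-print; requires Python 3.9 or later
print(ET.tostring(element, encoding="unicode"))
```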

  • Authorized Voice Replica

Code 08 in List 19 (Unnamed Persons) indicates an Authorized Voice Replica. ONIX calls this a “synthetic voice based on a human voice actor”, and the Alternative name composite can be used to provide the real-world name of the actor – in this example we’ve used a fictitious narrator called Jane Smith.

<Contributor>
  <SequenceNumber>2</SequenceNumber>
  <ContributorRole>E07</ContributorRole> <!-- read by -->
  <UnnamedPersons>08</UnnamedPersons> <!-- synthesized voice based on -->
  <AlternativeName>
    <NameType>04</NameType> <!-- ‘real name’ -->
    <PersonName>Jane Smith</PersonName>
  </AlternativeName>
</Contributor>

The ‘real name’ can also be given an identifier such as an ISNI (see https://isni.org), to ensure there’s no ambiguity with other people of a similar name.

  • AI Translation Voices

In addition, where AI has been used to translate a voice into another language while maintaining the voice characteristics, these guidelines encourage publishers and producers to express this in the metadata. In this example, a book by Jane Austen has been read by narrator Jane Smith. An AI model has then been used to translate the text, and a foreign-language Authorized Voice Replica has been used for Jane Smith. This can be expressed using the contributor role code for “translated by” in List 17, together with <UnnamedPersons> code 09, which means “AI”. Jane Smith’s name can then be given with the same Authorized Voice Replica code as above:

<Contributor>
  <SequenceNumber>1</SequenceNumber>
  <ContributorRole>A01</ContributorRole> <!-- author -->
  <PersonName>Jane Austen</PersonName>
</Contributor>
<Contributor>
  <SequenceNumber>2</SequenceNumber>
  <ContributorRole>B06</ContributorRole> <!-- translator -->
  <FromLanguage>eng</FromLanguage>
  <UnnamedPersons>09</UnnamedPersons> <!-- generative AI -->
</Contributor>
<Contributor>
  <SequenceNumber>3</SequenceNumber>
  <ContributorRole>E07</ContributorRole> <!-- reader of audiobook -->
  <UnnamedPersons>08</UnnamedPersons> <!-- synthesized voice based on -->
  <AlternativeName>
    <NameType>04</NameType> <!-- ‘real name’ -->
    <PersonName>Jane Smith</PersonName>
  </AlternativeName>
</Contributor>

FOR RETAILERS

For consumer clarity, it is recommended that the AI narrator information conveyed in ONIX be ingested by retailers and the corresponding naming convention (AI Voice or Authorized Voice Replica) be displayed in the narrator line of a title listing, alongside the narrator’s name. For example:

Pride and Prejudice
Author: Jane Austen
Narrated by:
Amir (AI Voice)
Duration: 11 hours
Pub date: 01-01-29

-------------------------

The Hound of the Baskervilles
Author: Arthur Conan Doyle
Narrated by:
Jane Smith (Authorized Voice Replica)
Duration: 6 hours
Pub date: 01-01-29

------------------------

Please note, these are recommendations only, and it is up to each individual retailer to decide how to process and display the AI narration details conveyed in a title’s metadata. For example, international retailers could convey the information in the title metadata in their local languages.
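One possible way for a retailer to turn the ingested ONIX into the narrator line shown above is sketched here. It is not a prescribed implementation: the function name and label table are hypothetical, and it assumes a bare <Contributor> fragment without the XML namespace that a full ONIX message would normally carry.

```python
import xml.etree.ElementTree as ET

# Display labels keyed by ONIX List 19 code, following these guidelines.
# A retailer could swap in labels in its local language.
LABELS = {
    "05": "AI Voice",
    "06": "AI Voice",
    "07": "AI Voice",
    "08": "Authorized Voice Replica",
}

SAMPLE = """
<Contributor>
  <SequenceNumber>2</SequenceNumber>
  <ContributorRole>E07</ContributorRole> <!-- read by -->
  <UnnamedPersons>08</UnnamedPersons>
  <AlternativeName>
    <NameType>04</NameType> <!-- 'real name' -->
    <PersonName>Jane Smith</PersonName>
  </AlternativeName>
</Contributor>
"""


def narrator_line(contributor_xml: str) -> str:
    """Return the display text for one narrator <Contributor> fragment."""
    contributor = ET.fromstring(contributor_xml)
    code = (contributor.findtext("UnnamedPersons") or "").strip()
    label = LABELS.get(code)
    if label is None:
        # No AI code present: treat as a human narrator named directly.
        return (contributor.findtext("PersonName") or "").strip()
    name = (contributor.findtext("AlternativeName/PersonName") or "").strip()
    return f"{name} ({label})" if name else label


print(narrator_line(SAMPLE))  # Jane Smith (Authorized Voice Replica)
```

Keeping the labels in a single table means the wording shown to consumers can be changed or translated without touching the parsing logic.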

If a retailer is uncertain whether a label has been applied correctly to a title, or suspects that a replica voice may be unlicensed, then it is the recommendation of these guidelines that clarification be sought from the publisher.

CONCLUSION

It should be stressed that these guidelines are best-practice recommendations only. They are not requirements. However, they are the considered conclusions that emerged from cross-industry discussion.

These guidelines cannot be effective without cross-industry support. To bring the broadest benefit, we hope that publishers and retailers come together to adopt these naming practices and clearly convey them to consumers.

RESOURCES

EDItEUR – For more information on AI codes in ONIX, please visit:
https://www.editeur.org/files/ONIX%203/APPNOTE%20Aspects%20of%20AI%20in%20ONIX.pdf

And for a non-technical introduction to ONIX, there is a 15-minute pre-recorded video briefing at https://www.editeur.org/83/Overview/

BIC – https://bic.org.uk/resources/bic-digital-audiobook-supply-chain-best-practice/

Audio Publishers Association – https://www.audiopub.org/

Publishers Association – https://www.publishers.org.uk/

CONTRIBUTORS (in alphabetical order)

Audio Publishers Association
Audio Publishers Group (part of the UK Publishers Association)
Audible
Blackstone
Bloomsbury
Bonnier Books UK
Bookbeat
Bookwire
Dreamscape
EDItEUR
Hachette
HarperCollins
Hoopla Digital
Ingram Content Group
Macmillan
Midwest Tapes
Overdrive
Penguin Random House
Podium Audio
Princeton Press
Profile Books
Rakuten Kobo
Scholastic
Simon and Schuster
Sounded
Storytel
WF Howes
Xigxag
Yoto