Audiobox

Free and Open Source AI Speech and Voice Generation Models from Meta

Summary

Audiobox is Meta's new foundation research model for audio generation. It generates voices and sound effects from voice inputs and natural language text prompts.

Abstract

Audiobox is Meta's new foundation research model for audio generation. It combines voice inputs and natural language text prompts to generate voices and sound effects. The Audiobox family includes the specialist models Audiobox Speech and Audiobox Sound, both built on the shared self-supervised model Audiobox SSL. The website offers interactive audio demos that illustrate what Audiobox can do, and Audiobox Maker lets users create an original audio story of their own. The website also explains how Audiobox works and describes Meta's commitment to making AI safe for everyone.

Bullet Points

• Audiobox is a new foundation research model for audio generation by Meta.
• It can generate voices and sound effects using a combination of voice inputs and natural language text prompts.
• The Audiobox family of models includes specialist models Audiobox Speech and Audiobox Sound.
• All Audiobox models are built upon the shared self-supervised model Audiobox SSL.
• The website provides interactive audio demos to help users understand the unique capabilities of Audiobox.
• Users can express their creativity and make a fun and original audio story with Audiobox Maker.
• The website also provides information on how Audiobox works and Meta's commitment to making AI safe for everyone.