Audiobox is Meta's new foundation research model for audio generation. It generates voices and sound effects from voice inputs and natural language text prompts.
Audiobox includes the specialist models Audiobox Speech and Audiobox Sound and is built on the shared self-supervised model Audiobox SSL. The website offers interactive audio demos that show what Audiobox can do, and Audiobox Maker lets users create an original audio story of their own. The website also explains how Audiobox works and describes Meta's commitment to making AI safe for everyone.