Meta has introduced EnCodec, a high-fidelity codec that uses AI to compress audio files in real time without losing quality.
Meta's research shows how AI can let someone listen to an audio message in an area with low connectivity without it stalling or glitching. The team built a three-part system and trained it end to end to compress audio data to a target size.
The compressed data is then decoded by a neural network. The research team reports roughly 10x compression relative to MP3 at 64 kbps without a loss in quality.
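To make the 10x figure concrete, the short sketch below works through the arithmetic. It assumes "10x" means one tenth of the reference bitrate and uses a one-minute clip as an illustrative length; the function names are hypothetical and not part of any Meta code.

```python
def compressed_bitrate_kbps(reference_kbps: float, compression_factor: float) -> float:
    """Bitrate needed if the output is `compression_factor` times smaller than the reference."""
    return reference_kbps / compression_factor


def size_kilobytes(bitrate_kbps: float, seconds: float) -> float:
    """Payload size in kilobytes (1 kB = 1000 bytes, 8 bits per byte)."""
    return bitrate_kbps * seconds / 8


mp3_kbps = 64.0          # reference bitrate quoted in the article
clip_seconds = 60.0      # assumed clip length, for illustration only

neural_kbps = compressed_bitrate_kbps(mp3_kbps, 10.0)
mp3_size = size_kilobytes(mp3_kbps, clip_seconds)
neural_size = size_kilobytes(neural_kbps, clip_seconds)

print(f"Neural codec bitrate: {neural_kbps:.1f} kbps")   # 6.4 kbps
print(f"MP3 size for one minute: {mp3_size:.1f} kB")     # 480.0 kB
print(f"Neural size for one minute: {neural_size:.1f} kB")  # 48.0 kB
```

Under these assumptions, a one-minute message drops from about 480 kB to about 48 kB, which is the kind of saving that matters on a slow connection.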
Similar techniques have been explored before for speech. However, the Meta team says it is the first to make the approach work for 48 kHz stereo audio (roughly CD quality; CDs themselves are sampled at 44.1 kHz), which it describes as the standard for music distribution.
Although the team acknowledges that more work is needed, the approach could eventually support faster, better-quality calls under poor network conditions and deliver rich metaverse experiences without requiring major bandwidth improvements.
EnCodec consists of three parts:
The researchers use discriminators to improve the perceptual quality of the generated samples. The discriminators are trained to differentiate between real and reconstructed samples, while the compression model tries to fool them by making its reconstructions more perceptually similar to the originals.
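The tug-of-war between the discriminators and the compression model can be sketched as a pair of loss functions. The article does not say which loss is used, so the example below assumes a hinge loss, a common choice for adversarial training. The function names and the score values are hypothetical; a score stands for a discriminator's output on one sample, with higher meaning "looks real".

```python
def discriminator_hinge_loss(real_scores, reconstructed_scores):
    """Low when real samples score >= +1 and reconstructed samples score <= -1."""
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + s) for s in reconstructed_scores) / len(reconstructed_scores)
    return real_term + fake_term


def compression_model_adversarial_loss(reconstructed_scores):
    """Low when the discriminator scores reconstructed samples as real (>= +1)."""
    return sum(max(0.0, 1.0 - s) for s in reconstructed_scores) / len(reconstructed_scores)


# Made-up scores for two real clips and two reconstructed clips.
real_scores = [2.0, 0.5]
reconstructed_scores = [-2.0, 0.0]

d_loss = discriminator_hinge_loss(real_scores, reconstructed_scores)
g_loss = compression_model_adversarial_loss(reconstructed_scores)

print(f"discriminator loss: {d_loss}")      # 0.75
print(f"compression model loss: {g_loss}")  # 2.0
```

In this sketch the two losses pull in opposite directions: the compression model's loss falls only as its reconstructions earn higher scores, which is what pushes them toward the original audio.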