Lyra, the low-bitrate voice codec that Google open-sourced last year, can be combined with the open AV1 video codec to enable video chat over a 56kbps connection. Lyra uses machine learning and other techniques to compress speech at a very low bitrate of 3kbps.
Google officially open-sourced Lyra last year and recently announced the launch of Lyra V2. Compared to V1, Lyra V2 features a new architecture, supports more platforms, offers scalable bitrates, has better performance, and produces higher quality audio.
Lyra V2 is based on an end-to-end neural audio codec called SoundStream. The architecture places a residual vector quantizer (RVQ) on both sides of the transmission channel: on the encoder side it quantizes the encoded information into a bitstream, and on the decoder side it reconstructs that information from the bitstream.
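To make the idea of residual vector quantization concrete, here is a minimal toy sketch in Python. It is not Lyra's or SoundStream's implementation: the codebooks, the two-dimensional vectors, and the function names (`rvq_encode`, `rvq_decode`) are all made up for illustration. In a real neural codec the codebooks are learned and the vectors come from a neural encoder.

```python
def nearest(codebook, vector):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def dist(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vector))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(vector, codebooks):
    """Quantize stage by stage; each stage codes the residual left by the previous one."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(indices, codebooks):
    """Reconstruct by summing the selected entry of each stage that was sent."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical toy codebooks: a coarse stage followed by a finer stage.
# Each has 4 entries, so each index costs 2 bits.
CODEBOOKS = [
    [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]],
    [[-0.25, -0.25], [-0.25, 0.25], [0.25, -0.25], [0.25, 0.25]],
]

x = [0.8, -1.2]
codes = rvq_encode(x, CODEBOOKS)          # indices sent over the channel
coarse = rvq_decode(codes[:1], CODEBOOKS)  # decoder uses only the first stage
fine = rvq_decode(codes, CODEBOOKS)        # decoder uses both stages

print(codes, coarse, fine)
print(squared_error(x, coarse), squared_error(x, fine))
```

In this toy setup, decoding with only the first index costs 2 bits and gives a rougher reconstruction, while decoding with both costs 4 bits and gives a closer one. That is one way a staged quantizer can trade bits for quality.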
The new architecture reduces latency from 100ms in previous versions to 20ms. In this regard, Lyra V2 is comparable to Opus for WebRTC, the most widely used audio codec today, which has typical latencies of 26.5ms, 46.5ms, and 66.5ms.
Lyra V2 also encodes and decodes five times faster than previous versions. On a Pixel 6 Pro phone, Lyra V2 takes 0.57ms to encode and decode a 20ms audio frame, which is 35 times faster than real time. The reduced complexity means that more phones can run Lyra V2 in real time than could run V1, and that overall battery consumption is lower.
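The "35 times faster than real time" figure follows directly from the two numbers quoted above. As a quick check, the short Python sketch below computes the ratio; the function name `real_time_factor` is just an illustrative label.

```python
FRAME_MS = 20.0        # audio covered by one frame (from the article)
PROCESSING_MS = 0.57   # encode + decode time on a Pixel 6 Pro (from the article)


def real_time_factor(frame_ms, processing_ms):
    """How many times faster than real time the codec runs."""
    return frame_ms / processing_ms


rtf = real_time_factor(FRAME_MS, PROCESSING_MS)
busy_share = PROCESSING_MS / FRAME_MS  # fraction of each frame interval spent in the codec

print(f"{rtf:.1f}x real time")
print(f"{busy_share:.2%} of each frame interval spent encoding and decoding")
```

The ratio comes out at roughly 35, which matches the quoted figure. Put another way, the codec is busy for under 3% of each 20ms interval.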
- Generate higher quality audio
The quality of the generated audio has also improved, driven by years of machine learning research. Listening tests show that Lyra V2 at 3.2 kbps, 6 kbps, and 9.2 kbps matches the audio quality (measured by MUSHRA scores, which represent subjective quality) of Opus at 10 kbps, 13 kbps, and 14 kbps, respectively.
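To put these bitrates in perspective, the Python sketch below works out how many bits each Lyra V2 bitrate leaves for one audio frame, and how much bitrate is saved relative to the Opus setting it is compared against. It assumes the 20ms frame size mentioned earlier in the article and ignores any packet or transport overhead. The helper name `bits_per_frame` is illustrative only.

```python
FRAME_MS = 20  # assumed frame duration, taken from the 20ms figure in the article

# (Lyra V2 kbps, Opus kbps) pairs reported as comparable in quality
PAIRS_KBPS = [(3.2, 10), (6, 13), (9.2, 14)]


def bits_per_frame(kbps, frame_ms=FRAME_MS):
    """kbps * 1000 bits/s * (frame_ms / 1000) s simplifies to kbps * frame_ms bits."""
    return round(kbps * frame_ms)


def bitrate_saving(lyra_kbps, opus_kbps):
    """Fraction of the Opus bitrate that Lyra V2 does not need."""
    return 1 - lyra_kbps / opus_kbps


for lyra, opus in PAIRS_KBPS:
    print(
        f"Lyra V2 {lyra} kbps: {bits_per_frame(lyra)} bits per frame, "
        f"{bitrate_saving(lyra, opus):.0%} less than Opus at {opus} kbps"
    )
```

Under these assumptions, the lowest setting leaves only 64 bits (8 bytes) to describe each 20ms of speech.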
Lyra V2 continues to provide what was already available in Lyra V1 (build tools, testing framework, C++ encoding and decoding API, signal processing toolchain, and sample Android applications). Developers who have used the Lyra V1 API will find the V2 API familiar, although there are some changes. For example, it is now possible to change the bitrate during encoding. Additionally, model definitions and weights are now contained in .tflite files. Like V1, this release is a beta release, and API and bitstream changes are expected.