Abstract: Recent TTS models with decoder-only Transformer architecture, such as SPEAR-TTS and VALL-E, achieve impressive naturalness and demonstrate the ability for zero-shot adaptation given a speech ...
Note: this package is not 100% compatible with the CBOR specification. See the Not implemented section for more details.