The models in this repository are mainly audio autoencoders, which are used to train latent diffusion models. More models may be released in the future.