January 14, 2026 at 02:05 PM

Chinese experts have introduced a powerful neural network for voice synthesis and precise voice cloning — CosyVoice…

Chinese researchers have introduced a neural network for speech synthesis and accurate voice cloning. CosyVoice offers the following:

- It can run locally, even on modest hardware.
- The model is compact: competing models are about three times larger.
- It supports nine languages: Russian, English, Chinese, Japanese, Korean, German, Spanish, French, and Italian.
- Voice cloning needs only a short audio clip of 3 to 10 seconds, from which the model reproduces the speaker's timbre and manner of speaking.
- It supports streaming synthesis with a latency of about 150 milliseconds, which makes real-time voice cloning possible.
- The Pronunciation Inpainting feature lets you manually correct the pronunciation of individual words.
- A license for commercial use is available.

The model can be downloaded from GitHub (https://github.com/FunAudioLLM/CosyVoice) and from Hugging Face (https://huggingface.co/FunAudioLLM/Fun-CosyVoice3-0.5B-2512).
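Since the post says a reference clip of 3 to 10 seconds is enough for cloning, it may help to check a recording's length before passing it to the model. The sketch below does this with Python's standard `wave` module only and does not call CosyVoice itself. The function names, the assumption that the clip is an uncompressed PCM WAV file, and the treatment of the 3 and 10 second figures as hard limits are all illustrative assumptions, not part of the project.

```python
import io
import wave

# Bounds quoted in the post; treating them as hard limits is an assumption.
MIN_SECONDS = 3.0
MAX_SECONDS = 10.0


def clip_duration_seconds(wav_file) -> float:
    """Return the duration of a PCM WAV clip (file path or file-like object)."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(wav_file) -> bool:
    """Hypothetical helper: True if the clip length is within the quoted range."""
    return MIN_SECONDS <= clip_duration_seconds(wav_file) <= MAX_SECONDS


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence, used only for the demo."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for length in (1.0, 5.0, 12.0):
        clip = make_silent_wav(length)
        print(f"{length:>4} s clip usable: {is_usable_reference(clip)}")
```

In practice `is_usable_reference` would be given the path to a real recording; the silent clips here only exercise the length check.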
