Voice cloning best practices

Better cloning starts with better source material. Small improvements in the source audio can make a noticeable difference in the final voice.

Keep the samples clean

Aim for speech that is:

The quality of the capture matters more than the file format itself.

The model responds better when the sample material sounds like one coherent speaking style. Try to keep:

If the source swings between whispering, shouting, noisy clips, and clean narration, the cloned result can become less stable.
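
One aspect of consistency that is easy to check before uploading is loudness. The sketch below is a minimal Python example that compares the average (RMS) level of several clips and flags any that sit far from the rest, such as a whispered take among normal narration. It assumes the clips are already decoded to 16-bit PCM sample values; the function names and the 6 dB tolerance are illustrative assumptions, not settings from any particular cloning tool. It only looks at level, so it will not catch differences in tone or background noise.

```python
import math
from statistics import median

FULL_SCALE = 32768  # magnitude of the largest 16-bit PCM sample


def rms_dbfs(samples):
    """Average (RMS) level of a clip in dB relative to full scale."""
    if not samples:
        raise ValueError("clip has no samples")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / FULL_SCALE) if rms > 0 else float("-inf")


def flag_level_outliers(clips, tolerance_db=6.0):
    """Return the names of clips whose RMS level is far from the median.

    `clips` maps a clip name to its list of 16-bit sample values.
    `tolerance_db` is an assumed threshold; tune it by ear.
    """
    if not clips:
        return []
    levels = {name: rms_dbfs(samples) for name, samples in clips.items()}
    typical = median(levels.values())
    return sorted(
        name for name, level in levels.items()
        if abs(level - typical) > tolerance_db
    )
```

A clip flagged here is not necessarily unusable; it is a prompt to listen again and decide whether it matches the style of the other samples.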

Cloned voices tend to inherit the style of the sample material.

If you want a clean, steady narrator feel, source clips with that same style usually work better.

As a rule of thumb:

Try to avoid samples that are:

A balanced recording level is better than an aggressively loud one.
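
Recording level can be measured rather than judged by ear alone. The following is a minimal sketch in Python, assuming 16-bit PCM WAV input. It reports the peak level, the RMS level, and the fraction of samples at full scale (a sign of clipping). The function names and the -1 dBFS peak limit are assumptions chosen for illustration, not values from any specific voice cloning product.

```python
import math
import sys
import wave
from array import array

FULL_SCALE = 32768  # magnitude of the largest 16-bit PCM sample


def read_wav_16bit(path):
    """Read a 16-bit PCM WAV file into a list of sample values.

    Stereo files come back interleaved, which is fine for level checks.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        data = array("h")
        data.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        data.byteswap()  # WAV data is little-endian
    return list(data)


def _to_dbfs(value):
    return 20 * math.log10(value / FULL_SCALE) if value > 0 else float("-inf")


def level_stats(samples):
    """Return peak and RMS level in dBFS plus the fraction of clipped samples."""
    if not samples:
        raise ValueError("no samples")
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    clipped = sum(1 for s in samples if abs(s) >= FULL_SCALE - 1)
    return {
        "peak_dbfs": _to_dbfs(peak),
        "rms_dbfs": _to_dbfs(rms),
        "clipped_fraction": clipped / len(samples),
    }


def looks_too_loud(stats, peak_limit_dbfs=-1.0):
    """Assumed heuristic: any clipping, or peaks above the limit, is too hot."""
    return stats["peak_dbfs"] > peak_limit_dbfs or stats["clipped_fraction"] > 0
```

For example, `looks_too_loud(level_stats(read_wav_16bit("sample.wav")))` would check one file, where `sample.wav` is a hypothetical path. A recording that trips this check is usually better re-recorded at a lower input gain than turned down afterwards, since clipped peaks cannot be recovered.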

For most projects, start with Studio, and switch to Realistic only when the character of the environment is part of the experience.