How lip sync works
Lip sync takes a source video and a new audio track, then updates the speaker’s mouth movements so they better match the translated speech.

What changes and what stays the same
VoiceCheap focuses on the speaking face region. In practice:

- the mouth and lower-face motion are updated
- the target audio drives the new mouth movement
- the rest of the frame is preserved as closely as possible
The high-level pipeline
Detect the face
VoiceCheap identifies the speaking face and tracks the relevant facial landmarks around the mouth and lower face.
Analyze the translated audio
The system listens to the target audio and computes the mouth shapes needed to match the speech.
Generate new lip movement
The speaking region is regenerated frame by frame to match the translated audio more closely.
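The steps above can be illustrated with a toy sketch of the audio-driven step: speech sounds (phonemes) are mapped to mouth shapes (visemes), one per video frame. This is a simplified illustration, not VoiceCheap's actual model; the phoneme labels, mapping table, and timing values here are assumptions chosen for clarity.

```python
# Toy illustration: derive one mouth shape per video frame from a
# phoneme sequence. Real lip-sync systems use learned models; this
# lookup table and the fixed phoneme duration are simplifications.

PHONEME_TO_VISEME = {
    "AA": "open",          # as in "father"
    "IY": "wide",          # as in "see"
    "UW": "round",         # as in "you"
    "M": "closed", "B": "closed", "P": "closed",
    "F": "teeth-on-lip", "V": "teeth-on-lip",
}

def visemes_for_audio(phonemes, fps=25, phoneme_duration=0.08):
    """Map a phoneme sequence to a mouth shape for each video frame."""
    frames = []
    for ph in phonemes:
        shape = PHONEME_TO_VISEME.get(ph, "neutral")
        n_frames = max(1, round(phoneme_duration * fps))
        frames.extend([shape] * n_frames)
    return frames

# "M-AA-P" (as in "map"): lips close, open, then close again.
frames = visemes_for_audio(["M", "AA", "P"])
print(frames)
```

In a full pipeline, each per-frame mouth shape would then condition the model that regenerates the lower-face pixels for that frame.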
Where lip sync fits in the workflow
Lip sync is not the first quality step. The usual order is:

- get the transcript right
- choose the voice strategy
- make sure the dubbed audio sounds natural
- add lip sync if the project benefits from it
Limitations to expect
- strong profile views are harder than front-facing shots
- heavy obstructions around the mouth reduce quality
- shaky footage reduces face-tracking reliability
- noisy or overlapping audio can reduce the naturalness of the final result