Machine Learning Engineer - Generative Video & Avatar Models
Mindflix AI
Bengaluru, Karnataka, India
MID
AI
Generative Models
Job Description
Join Mindflix AI as a Machine Learning Engineer.
Responsibilities
- Train, fine-tune, and adapt generative video and avatar models.
- Work with video, image, audio, and face datasets for model training and evaluation.
- Build and maintain training pipelines using Python and PyTorch.
- Improve lip-sync quality, facial realism, identity preservation, audio-visual alignment, and temporal consistency.
- Read research papers and adapt open-source GitHub repositories into stable internal workflows.
- Debug model training issues, data pipeline issues, GPU memory issues, and inference-quality problems.
- Run experiments on local GPU machines, cloud GPUs, or GPU clusters.
- Evaluate outputs through visual inspection, internal quality benchmarks, and relevant technical metrics.
- Collaborate with product, engineering, and creative teams to improve avatar output quality for real-world use cases.
Requirements
- 2 to 4+ years of hands-on experience training deep learning models.
- Strong Python and PyTorch skills.
- Experience in computer vision, image/video processing, or generative AI.
- Experience working with generative models such as diffusion models, GANs, transformers, image generation models, or video generation models.
- Familiarity with OpenCV, FFmpeg, or similar video/image processing tools.
- Experience preparing, cleaning, and structuring datasets for model training.
- Ability to read research papers and implement/adapt research code.
- Experience training models on GPUs and debugging GPU-related issues.
- Strong problem-solving ability and comfort working with experimental model pipelines.
Preferred Experience
- Experience with talking-head generation, avatar generation, face animation, lip-sync, or audio-driven video generation.
- Experience with models such as Wav2Lip, SadTalker, MuseTalk, AnimateDiff, Stable Diffusion, diffusion transformers, or similar generative video systems.
- Experience with audio-visual models, speech-driven animation, or facial landmark-based animation.
- Experience improving temporal consistency, identity preservation, expression quality, or video realism.
- Experience with distributed training, multi-GPU training, mixed precision training, or model optimization.
- Familiarity with Hugging Face, CUDA, accelerate, DeepSpeed, or similar model training tools.
- Experience converting research code into repeatable internal pipelines or production-adjacent systems.
Work Style
- Comfortable working in an early-stage AI product environment.
- Strong ownership mindset and ability to work independently.
- Practical research mindset: able to balance experimentation with usable product outcomes.
- Willingness to work with imperfect data, fast iterations, and evolving technical goals.
- Interest in building avatar and video AI systems that can be used in real-world enterprise products.
Qualifications
- 2 to 4 years of experience
- Strong Python and PyTorch skills
Nice to have
- Experience with talking avatars
- Familiarity with generative models