Job Title


Senior ML Engineer / Applied Scientist – Post-Training (SFT, DPO, GRPO, RFT & Evaluation)


Company : gnani.ai


Location : Bengaluru, Karnataka


Created : 2025-12-18


Job Type : Full Time


Job Description

IndiaAI is building aligned, safe, and multilingual LLMs for 1.4B people. We’re hiring a Senior ML Engineer/Applied Scientist to lead post-training, covering SFT, DPO, GRPO, RFT, RLHF, multi-turn chat tuning, reward modeling, and evaluation, with a strong focus on Indic and low-resource languages.

What You’ll Do

- Build and scale SFT pipelines (single & multi-turn chat).
- Run DPO, GRPO, RFT and other preference optimization techniques.
- Train reward models and integrate them into alignment loops.
- Use leading libraries: HuggingFace TRL/PEFT, DeepSpeed-Chat, NeMo Alignment, OpenRLHF, Axolotl, Colossal-AI.
- Develop high-quality datasets for instructions, chat, and preference ranking.
- Conduct multilingual & Indic evaluation using lm-eval-harness, Ragas, HELM.
- Improve performance for low-resource Indic languages via augmentation & synthetic data loops.
- Work with infra teams to scale training on multi-GPU clusters.

What You Bring

- 4–8+ years in ML/NLP with deep experience in post-training.
- Strong expertise in SFT, DPO/GRPO/RFT, PPO-style RLHF.
- Hands-on with TRL, NeMo, DeepSpeed-Chat, OpenRLHF, Axolotl.
- Proficiency with LoRA/QLoRA, FSDP & distributed training.
- Experience with Indic languages and multilingual NLP.
- Strong evaluation and dataset engineering background.

Bonus Skills

- Experience with 7B–70B+ LLM tuning.
- Contributions to alignment libraries.
- Safety alignment or Constitutional AI experience.

Join us to build India’s aligned, safe, multilingual LLMs.