Job Title:
Sr. AI Engineer
Company: Fulcrum Digital Inc
Location: Pune, Maharashtra
Created: 2025-11-24
Job Type: Full Time
Job Description:
We are seeking a skilled, hands-on Sr. AI Engineer with 4–8 years of experience in developing, fine-tuning, and deploying machine learning and deep learning models, including Generative AI systems. The ideal candidate has a strong foundation in classification, anomaly detection, and time-series modeling, along with experience in Transformer-based architectures. Expertise in model optimization, quantization, and Retrieval-Augmented Generation (RAG) pipelines is highly desirable.

Experience: 4–8 years
Notice Period: Immediate to 15 days
Location: Pune (Hybrid)

Responsibilities:
- Design, train, and evaluate ML models for classification, anomaly detection, forecasting, and natural language understanding tasks.
- Build and fine-tune deep learning models, including RNNs, GRUs, LSTMs, and Transformer architectures (e.g., BERT, T5, GPT).
- Develop and deploy Generative AI solutions, including RAG pipelines for applications such as document search, Q&A, and summarization.
- Apply model optimization techniques, including quantization, to improve latency and reduce memory/compute overhead in production.
- Fine-tune large language models (LLMs) using Supervised Fine-Tuning (SFT) and Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA or QLoRA (optional).
- Define, track, and report relevant evaluation metrics; monitor model drift and retrain models as required.
- Collaborate with cross-functional teams (data engineering, backend, DevOps) to productionize ML models using CI/CD pipelines.
- Maintain clean, reproducible code with proper documentation and versioning of experiments.

Required Skills & Qualifications:
- 4–5 years of hands-on experience in machine learning, deep learning, or data science roles.
- Proficiency in Python and ML/DL libraries: scikit-learn, pandas, PyTorch, TensorFlow.
- Strong understanding of traditional ML and deep learning, particularly for sequence and NLP tasks.
- Experience with Transformer models and open-source LLMs (e.g., Hugging Face Transformers).
- Familiarity with Generative AI tools and RAG frameworks (e.g., LangChain, LlamaIndex).
- Experience with model quantization (dynamic/static, INT8) and deploying models in resource-constrained environments.
- Knowledge of vector stores (e.g., FAISS, Pinecone, Azure AI Search), embeddings, and retrieval techniques.
- Proficiency in evaluating models using statistical and business metrics.
- Experience with model deployment, monitoring, and performance tuning in production.
- Familiarity with Docker, MLflow, and CI/CD practices.