Job Description
Summary role description:
Hiring a Machine Learning (ML) Engineer for a leading provider of firmware and platform-level software.
Company description:
Our client is a high-end technology company whose products form part of the backbone of enterprise IT infrastructure. The company is headquartered in the US and has a global footprint. Its technology is integrated into millions of devices worldwide, including servers, desktops, and embedded systems. With decades of experience in low-level systems development, the company plays a critical role in shaping the foundational software that powers modern computing platforms.
Role details:
- Title / Designation: Machine Learning (ML) Engineer
- Location: Kolkata
- Experience: 7+ years
Role & responsibilities:
- Build, train, and fine-tune LLMs using techniques like SFT, LoRA/QLoRA, and RLHF, and evaluate their performance.
- Optimize models for local or edge environments by improving speed and efficiency through quantization, pruning, and distillation.
- Deploy models into production on-premises or at the edge using frameworks such as PyTorch, ONNX, TensorRT, vLLM, or llama.cpp, and integrate them into applications via APIs and internal services.
- Design and maintain scalable training and inference pipelines to ensure reproducibility and efficiency.
- Monitor model performance in production, including accuracy, drift, latency, and resource utilization, and continuously optimize outcomes.
- Ensure models meet security, privacy, and compliance requirements, especially in restricted or offline environments.
- Collaborate with software engineers, infrastructure teams, and domain experts to deliver end-to-end AI solutions.
- Document model architectures, training processes, and deployment workflows for clarity and future use.
Candidate requirements:
- Master’s or Ph.D. in Computer Science, Engineering, or a related field, or equivalent practical experience, along with 7+ years of experience in AI/ML, including at least 2 years working on LLMs, large-scale neural networks, RAG, or AI-driven automation.
- Strong hands-on experience with LLMs such as LLaMA, Mistral, Falcon, or similar open-weight models, along with proficiency in Python and frameworks like PyTorch or TensorFlow.
- Expertise in vector databases and retrieval systems (FAISS, Weaviate, Chroma, Pinecone, Milvus) and experience building RAG-based solutions.
- Experience developing and deploying models in local, on-premises, or resource-constrained environments, with a solid understanding of model optimization techniques like quantization, batching, and memory optimization.
- Hands-on experience with multi-agent AI systems (LangGraph, CrewAI, AutoGen, OpenAI Assistants API) and building autonomous or AI-driven workflows.
- Strong experience in end-to-end model development, working with business stakeholders to define KPIs and delivering multi-modal (text and image) or ensemble models.
- Familiarity with Linux, Docker, and basic cloud or on-prem infrastructure concepts.
- Experience with distributed training, multi-GPU systems, and handling large-scale models (10B+ parameters or multi-billion token datasets) is a plus.
- Knowledge of inference optimization tools such as vLLM, TensorRT-LLM, and ONNX, along with exposure to MLOps tools for model versioning and monitoring.
- Background in working with security-sensitive or regulated environments (such as finance, healthcare, or government) is preferred.
Selection process:
- Two technical rounds
- One HR round
