We are seeking a Lead AI Engineer to design, build and scale cutting-edge AI applications powered by large language models. In this role, you will partner with clients to deliver tailored LLM-driven solutions, architect agentic systems and drive the adoption of emerging AI technologies across enterprise environments.
Responsibilities
- Design, implement and maintain end-to-end AI applications, including chatbots, Q&A platforms, agent workflows and other LLM-driven solutions
- Collaborate directly with clients to understand their needs, identify opportunities and recommend tailored AI/LLM solutions that drive business value
- Architect and optimize robust data pipelines, prompt strategies and datasets to ensure effective, accurate and scalable AI models
- Evaluate, monitor and refine AI system performance, ensuring outputs are accurate, secure, scalable and compliant with industry regulations and best practices
- Conduct research, design experiments and perform rapid prototyping to validate technical feasibility and demonstrate the business value of AI solutions
- Stay current with evolving LLM technologies, frameworks, protocols (such as MCP, A2A, ACP) and methodologies, continuously improving solution quality and client outcomes
- Design and implement agentic systems with frameworks such as LangChain, LangGraph and Semantic Kernel, integrating them with vector databases and advanced memory architectures
- Develop and maintain APIs and system integrations for production-grade AI applications, including enterprise system integration (CRM, ERP, databases)
- Deploy AI solutions at scale, considering performance, cost-efficiency, maintainability, observability and security (including guardrails and prompt injection prevention)
- Implement and monitor retrieval systems (keyword search, vector search, embeddings), ranking algorithms and agent evaluation frameworks
- Use MLOps/AIOps practices for agentic systems and ensure robust observability and monitoring of deployed solutions
- Clearly communicate complex technical concepts and AI strategies to both technical and non-technical stakeholders, and iterate on models based on user feedback
Requirements
- Strong proficiency in at least one modern programming language (such as Python, Java, C# or Go); experience with web frameworks like FastAPI or similar is a plus
- Deep understanding of the AI application development lifecycle, including production deployment, system integration and rapid UI prototyping (Streamlit, Gradio or similar)
- Familiarity with major LLM platforms and APIs (OpenAI, Anthropic, Amazon Bedrock, Gemini) and related frameworks (LangChain, LangGraph, LlamaIndex, Strands Agents, etc.)
- Knowledge of advanced AI integration patterns (e.g., RAG, agent orchestration, tool calling), retrieval systems (keyword/vector search, embeddings) and ranking algorithms
- Experience deploying AI solutions at scale, with a focus on performance, cost-efficiency, maintainability, observability and security (including guardrails and prompt injection prevention)
- Proven ability to evaluate generative AI quality with retrieval/classification scores, LLM-based evaluation, agent evaluation metrics and A/B testing
- Experience with vector databases (Pinecone, Weaviate, ChromaDB, FAISS) and semantic/hybrid search
- Experience designing experiments, conducting A/B tests and iterating on models based on user feedback
- Experience with enterprise system integration (CRM, ERP, databases) and deployment to cloud AI platforms or on-premise solutions
- Experience with observability and monitoring tools/frameworks, and application of MLOps/AIOps practices for agentic systems
- Familiarity with emerging protocols (MCP, A2A, ACP) and advanced memory architectures
- Proven experience in AI engineering and delivery of ML-based solutions in production environments
- Strong problem-solving skills, attention to detail and ability to work independently and collaboratively
- Excellent communication, collaboration and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders
Technologies
- Proficiency in at least one modern programming language (e.g., Python, Java, C# or Go) for AI development
- Web frameworks: FastAPI, Streamlit, Gradio, Flask, Spring Boot, ASP.NET or similar
- Major LLM platforms and APIs: OpenAI, Anthropic, Amazon Bedrock, Gemini
- Agentic frameworks: LangChain, LangGraph, Semantic Kernel, LlamaIndex, Strands Agents
- Data pipeline and integration tools
- Vector databases: Qdrant, FAISS, Pinecone, Weaviate, ChromaDB
- Retrieval and ranking systems: keyword search, vector search, embeddings, ranking algorithms
- Cloud AI platforms: Azure OpenAI, Amazon Bedrock, GCP Vertex AI
- On-premise solutions: vLLM
- Enterprise AI platforms: Amazon Bedrock AgentCore, Databricks Agent Bricks, Google Agentspace, Azure AI Foundry
- Observability and monitoring tools/frameworks
- MLOps/AIOps practices for agentic systems
- Security and guardrail tools for AI applications
- Protocols: MCP, A2A, ACP
- Advanced memory architectures
We offer
- Career plan and real growth opportunities
- Unlimited access to LinkedIn learning solutions
- Constant training, mentoring, online corporate courses, eLearning and more
- English classes with a certified teacher
- Support for employee initiatives (algorithms club, Toastmasters, agile club and more)
- Enjoyable working environment (gaming room, napping area, amenities, events, sports teams and more)
- Flexible work schedule and dress code
- Collaborate in a multicultural environment and share best practices from around the globe
- Direct hire by EPAM, 100% on payroll
- Benefits required by law (IMSS, INFONAVIT, 25% vacation bonus)
- Life insurance and major medical expenses insurance with dental and vision coverage (for the employee and direct family members)
- 13% employee savings fund, capped at the legal limit
- Grocery coupons
- 30-day December bonus
- Employee Stock Purchase Plan
- 12 vacation days
- Official Mexican holidays, plus 5 extra holidays (Maundy Thursday and Good Friday, November 2nd, December 24th and 31st)
- Monthly non-taxable amount for the electricity and internet bills
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
By applying to our role, you are agreeing that your personal data may be used as set out in EPAM's Privacy Notice and Policy.