Senior Machine Learning Engineer
Docusign
Bengaluru, Karnataka, India
Senior
Hybrid
Job Description
Senior Machine Learning Engineer role to redefine global services.
Responsibilities
- Design and implement autonomous multi-agent systems using Reinforcement Learning (RL) loops that can interact with our infrastructure to perform safe, automated remediation actions
- Build GenAI agents capable of digesting logs, traces, and metrics to provide "Human-in-the-loop" root cause analysis and conversational debugging for our SREs
- Develop and deploy deep learning models (Transformers, LSTMs, etc.) for forecasting and anomaly detection on high-cardinality, high-volume time series data
- Optimize inference pipelines to run with low latency on streaming telemetry data (Kafka/Flink), ensuring we catch issues the moment they happen
- Own the lifecycle of your models—from feature engineering on petabyte-scale datasets to training, deployment, and monitoring in production Kubernetes environments
- Collaborate with Applied Scientists to translate bleeding-edge research (e.g., causal inference, decision transformers) into production-hardened AIOps tools
Qualifications
- 5+ years of professional experience in Machine Learning Engineering or Data Science, with a strong background in Software Engineering
- Deep understanding of PyTorch or TensorFlow, specifically regarding Time Series analysis (forecasting/anomaly detection) and NLP
- Proven experience building applications using LLMs (RAG pipelines, LangChain, vector databases) specifically for technical domains (code analysis, log parsing)
- Practical knowledge of RL concepts (policies, rewards, agents) and experience applying them to optimization or control problems
- Experience with distributed data processing and streaming technologies (Apache Spark, Kafka, Flink)
- Strong software engineering fundamentals (Python, C++, or Go), CI/CD for ML, and experience deploying models via APIs (FastAPI, Triton Inference Server)
- Familiarity with the "three pillars" of observability (logs, metrics, traces) and tools like Prometheus, Grafana, OpenTelemetry, or Jaeger
- Experience with frameworks like AutoGen, CrewAI, or Ray RLlib
- Deep experience with AWS/GCP/Azure and Kubernetes (K8s) orchestration
- A background in control theory or causal inference