Machine Learning Engineer | Python | PyTorch | Distributed Training | Optimisation | GPU | Hybrid, San Jose, CA Job at Enigma, San Jose, CA

  • Enigma
  • San Jose, CA

Job Description

Title: Machine Learning Engineer

Location: San Jose, CA

Responsibilities:

  • Productize and optimize models from Research into reliable, performant, and cost-efficient services with clear SLOs (latency, availability, cost).
  • Scale training across nodes/GPUs (DDP/FSDP/ZeRO, pipeline/tensor parallelism) and own throughput/time-to-train using profiling and optimization.
  • Implement model-efficiency techniques (quantization, distillation, pruning, KV-cache, Flash Attention) for training and inference without materially degrading quality.
  • Build and maintain model-serving systems (vLLM/Triton/TGI/ONNX/TensorRT/AITemplate) with batching, streaming, caching, and memory management.
  • Integrate with vector/feature stores and data pipelines (FAISS/Milvus/Pinecone/pgvector; Parquet/Delta) as needed for production.
  • Define and track performance and cost KPIs; run continuous improvement loops and capacity planning.
  • Partner with ML Ops on CI/CD, telemetry/observability, model registries; partner with Scientists on reproducible handoffs and evaluations.

Educational Qualifications:

  • Bachelor's in Computer Science, Electrical/Computer Engineering, or a related field required; Master's preferred (or equivalent industry experience).
  • Strong systems/ML engineering background with exposure to distributed training and inference optimization.

Industry Experience:

  • 3–5 years in ML/AI engineering roles owning training and/or serving in production at scale.
  • Demonstrated success delivering high-throughput, low-latency ML services with reliability and cost improvements.
  • Experience collaborating across Research, Platform/Infra, Data, and Product functions.

Technical Skills:

  • Familiarity with deep learning frameworks: PyTorch (primary), TensorFlow.
  • Exposure to large-model training techniques (DDP, FSDP, ZeRO, pipeline/tensor parallelism); distributed training experience a plus.
  • Optimization: experience profiling and optimizing code execution and model inference, including quantization (PTQ/QAT/AWQ/GPTQ), pruning, distillation, KV-cache optimization, and Flash Attention.
  • Scalable serving: autoscaling, load balancing, streaming, batching, caching; collaboration with platform engineers.
  • Data & storage: SQL/NoSQL, vector stores (FAISS/Milvus/Pinecone/pgvector), Parquet/Delta, object stores.
  • Ability to write performant, maintainable code.
  • Understanding of the full ML lifecycle: data collection, model training, deployment, inference, optimization, and evaluation.
