Site Reliability Engineer - Platform Core at Criteo

Job Overview

Company: Criteo

Location: Grenoble

Job Description

What You'll Do:

At Criteo, our Platform Core teams build the foundational services that power our global advertising platform.

We design and operate scalable, resilient systems that support real-time decision-making and data processing at massive scale.

As we expand our capabilities in high-performance inference and distributed computing, we’re forming a new team focused on GPU-powered services and cutting-edge ML serving technologies.

As a Site Reliability Engineer in this new team, you’ll be at the forefront of building and operating GPU-powered services for machine learning workloads.

Your mission will be to ensure the reliability, scalability, and performance of our systems that leverage:

Ray: You’ll manage on-demand provisioning of Ray clusters on Kubernetes, enabling scalable distributed computing as a service for ML training and inference. You’ll design, maintain, and monitor these Ray-as-a-service systems and deliver them as robust, self-service platform offerings.

NVIDIA Triton Inference Server: You’ll optimize and operate high-performance inference services using Triton, ensuring low-latency, high-throughput serving of deep learning models.

You’ll work closely with ML engineers, data scientists, and other infrastructure teams to deliver production-grade services that accelerate innovation across Criteo.

Who You Are:

Strong experience with Kubernetes, especially in dynamic provisioning and custom operators.

Hands-on experience with GPU workloads, ideally in ML training or inference contexts.

Solid programming skills in C#, Python, Go, or similar languages.

Passion for automation, observability, and building reliable services.

Bonus Points

Familiarity with Ray or other distributed computing frameworks.

Knowledge of NVIDIA Triton, TensorRT, or similar inference serving technologies.

Familiarity with cloud-native GPU orchestration (e.g., GKE, EKS, or on-prem equivalents).

Take a look at for access and insight into our engineering culture and achievements.

We understand that you might not meet every requirement listed above, or that your experience may differ a little from our specifications.

If you think that you can still bring value to the role, we want to hear from you.

Job Details:
https://fr.expertini.com/jobs/job/site-reliability-engineer-platform-core-grenoble-criteo-aa7c64d852dd/
