What's the Platform PRE group?
The concept of Product Reliability Engineering (PRE) draws inspiration from the principles of SRE.
At Criteo, PRE acts as the bridge between Product, Platform Engineering and Infrastructure.
The PRE group comprises nine global engineering teams helping R&D design, build, and operate large-scale distributed systems reliably and efficiently.
The common objective of the PRE teams is to build the most reliable platform in AdTech.
How You'll Make An Impact
As a Site Reliability Engineer within the PRE WebApps team, you'll work closely with product engineering to improve the reliability of our apps, systems and pipelines, and to assess where optimization is needed most.
You'll tell stories with meaningful monitoring and hopefully never be paged on your on-call rotation because we've worked hard with dev teams to make our platform the most reliable in AdTech.
Speaking of on-call: with a group of 7, you're looking at only around 8 weeks a year, and your time is compensated!
You'll learn skills from other team members along the way and have opportunities to teach us! It's perfect for an engineer who likes shipping code and wants to be involved in all aspects of reliability, efficiency & maintainability.
What You’ll Do
Engage in and improve the whole lifecycle of services - from inception and design, through to deployment, operation, and refinement.
Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
Scale, automate, and evolve systems by pushing for changes that improve reliability, efficiency and performance.
Optimize and maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
Practice incident response and blameless postmortems.
Work with Linux, Kubernetes, .NET Core, C#, Python, Java/Scala/JVM, Prometheus, Grafana, Kibana and more.
Who You Are:
You hold a master's or PhD degree in computer science or a related field, or have equivalent practical experience.
You have at least 5 years of experience as an SRE or Software/DevOps Engineer.
You have significant experience in software development in one or more programming languages, and with data structures or algorithms.
You're at ease designing, analyzing, and troubleshooting large-scale distributed systems.
You have experience working in computing, distributed systems, storage, or networking.
You are used to debugging and optimizing code and to automating routine tasks.
You show a systematic problem-solving approach, coupled with effective verbal and written communication skills.
Take a look at for access and insight into our engineering culture and achievements.
We understand that you might not meet every requirement listed above, or may have experience that is a little different from our specifications.
If you think that you can still bring value to the role, we want to hear from you.