Lead Site Reliability Engineer

Location: Cambridgeshire, Greater London, Warwickshire
Job Type: Full Time

Rare: we're not your typical developer. Over 30 years in the game-making business, every moment has been dedicated to kicking out one beloved title after another. We've tried all manner of new things in that time, but if there's one thing we've truly come to believe, it's the importance of having fun every single day and infusing it into everything we create. That's the core of who we are – check out our recent Rare Replay collection for three whirlwind decades of evidence. We strive to keep Rare a fantastic place to work, from its beautiful location and state-of-the-art facilities to a strong focus on work/life balance.

If you're a keen gamer, chances are you've already experienced our epic shared world pirate adventure Sea of Thieves, released in March and currently in active development as a live service. With every day bringing new challenges and discoveries in equal measure, there's never been a more rewarding time to join our daredevil crew!


Is the player experience at the forefront of your mind? Do you believe that high availability and quality of service are as important as functionality? Do you thrive on resolving technical issues and improving development culture? Then you may be our next Site Reliability Engineering Lead!

As Lead Site Reliability Engineer, you will ensure the excellence of service provision for our players. This involves accountability for the non-functional characteristics of service performance.

As a key part of the studio, you will be equipped with the tools to continuously minimise interruption to the retail player experience. Your dedication to this experience will ensure performance is at its best. Your focus will be on building our resilience to surges, peaks and malicious attacks, and on ensuring the recoverability of user data and the rapid restoration of service in the face of a data-centre disaster.

In this role, you will have the opportunity to both influence and grow a diverse engineering team and to keep customer experience front of mind. As part of this, you'll also be hiring, recognising and developing talent in the team.



As Lead Site Reliability Engineer, your key responsibilities will be the management of the live site and coordinating the following:

  • Work with the Service Engineering team to understand usage patterns and specify non-functional characteristics of new services work.
  • Quality/risk/acceptance of new deployments – including failure model analysis, pre-release validation and testing in production.
  • Work with the Deployment Pipeline team to reduce downtime and improve consistency and reliability.
  • Live operational management of service environment: deployments, scale and topology.
  • Continuously improve our capability to detect and respond to instances of service behaviour that negatively impact customers.
  • Support reliability engineering incident bridges with data on scope and impact and with first-line actions involving deployment, rollback or hotfixes.
  • Co-ordinate with partner engineering teams to drive solutions to improve customer experience.
  • Define metrics that represent high-quality service to players and outward reporting of historical performance.
  • Evaluate and forecast demand based on historical volume and forward commercial guidance, with associated outward risk reporting.
  • Identify service engineering work items to address efficiency, performance or scalability needs.



  • Proven experience managing operations in an IT and/or Critical Environment infrastructure.
  • The ability to recognise and develop a diverse, technical workforce, preferably with managerial experience.
  • Enterprise-level experience in managing large-scale and complex projects/programs.
  • Working knowledge of audit and compliance requirements in a large global enterprise.
  • Ability to express impact in terms of costs and benefits.
  • Problem-solving skills, investigative capabilities, data analysis and attention to detail.
  • Strong communication and organisation skills.

Rare is a flexible employer, and as such this role can be based at our studio in Twycross or in London or Cambridge. If you work in London or Cambridge, travel to Rare in Twycross will be required as business needs dictate.

For more on Rare's culture and achievements, see our About section.

Rare is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.