Site Reliability Engineer

Cloud & DevOps, Cybersecurity & IT Infrastructure, Software Development

12 January 2024

Full Time

Job Description

Are you passionate about maximizing system uptime and optimizing infrastructure? Do you want to make a significant impact on cloud infrastructure? Join our team as a Site Reliability Engineer and drive innovation in system reliability.

About the Role

As a Site Reliability Engineer, you’ll bridge the gap between development and operations. You’ll keep our systems running smoothly and efficiently, and you’ll implement automation that improves system reliability.

Key Objectives

Our Site Reliability Engineer will focus on these essential goals:

Monitor production environments and maintain system health
Build automated solutions for infrastructure management
Optimize system performance and improve uptime
Implement DevOps best practices and troubleshooting procedures
Collaborate with development teams on reliability improvements

Core Responsibilities

You’ll handle these key responsibilities:

Analyze system metrics and performance data for optimization
Partner with teams to implement rigorous testing procedures
Design scalable systems using cloud technologies
Create automation scripts using Python and Linux tools
Monitor application performance and troubleshoot issues
Participate in capacity planning and system design consulting

Required Skills & Qualifications

We’re looking for candidates with:

Bachelor’s degree in Computer Science or related field
Strong programming skills in Python and shell scripting
Experience with cloud platforms and monitoring tools
Knowledge of Linux system administration
Understanding of DevOps practices and automation
Proven troubleshooting and system engineering abilities

Application Process

Ready to become our next Site Reliability Engineer? Learn more about us and apply today. For additional insight into industry best practices, check out Google’s SRE Handbook.

Objectives of this role

Run the production/UAT environment by monitoring availability and taking a holistic view of system health
Build software and systems to manage platform infrastructure and applications
Improve reliability, quality, and time-to-market of our suite of software solutions
Measure and optimize system performance to push our capabilities forward, get ahead of customer needs, and innovate for continual improvement
Provide primary operational support and engineering for multiple large-scale distributed software applications
Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
Collaborate with stakeholders to set service-level objectives (SLOs) and maintain service-level indicators (SLIs) that are representative of our customer experience and/or committed SLAs

Responsibilities

Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding
Partner with development teams to improve services through rigorous testing and release procedures
Participate in system design consulting, platform management, and capacity planning
Create sustainable systems and services through automation and uplifts
Balance feature development speed and reliability with well-defined service-level objectives

Required skills and qualifications

Master’s/Bachelor’s degree (or equivalent) in computer science or related discipline
Ability to program using one or more high-level languages, such as Python and shell scripting
A proactive approach to identifying problems, performance bottlenecks, and areas for improvement


Related Jobs

Senior Gen AI Developer

BDM with Cybersecurity / Infrastructure

Senior DevOps Engineer

Front-End Web Developer

Call us

+91-9916037193

