Egen Site Reliability Engineer

Egen is a fast-growing and entrepreneurial company with a data-first mindset. We bring together the best engineering talent working with the most advanced technology platforms, including Google Cloud and Salesforce, to help clients drive action and impact through data and insights. We are committed to being a place where the best people choose to work so they can apply their engineering and technology expertise to envision what is next for how data and platforms can change the world for the better. We are dedicated to learning, thrive on solving tough problems, and continually innovate to achieve fast, effective results.

We are seeking a Site Reliability Engineer to ensure system reliability and provide infrastructure support. You will be responsible for scalability, performance optimization, incident management, and incident analysis.

Responsibilities:

Ensure system reliability and uptime of applications in line with the SLAs
Monitor system performance metrics and determine the approaches to optimize the system
Lead incident management efforts with available methodology and document the RCA (Root Cause Analysis), lessons learned, and any SOPs for solving the issue in the future
Work closely with DevOps and Application teams to align priorities, share knowledge and drive continuous improvement initiatives
Prioritize response efforts based on issue severity, potential impact on users, and business priorities
Evaluate and approve changes to production systems, balancing the need for innovation with the requirement of stability and reliability
Optimize resource usage and manage costs by identifying inefficiencies, rightsizing infrastructure resources, and implementing cost-saving measures

What we're looking for:

3+ years of SRE experience with Azure and/or AWS
Bachelor's degree is preferred, but relevant experience will be considered as an equivalent
Programming: Java, Spring Boot, SQL, Bash
Monitoring: DataDog, Splunk, Grafana
Docker, Kubernetes, Linux
Incident/Alerts Management: VictorOps, PagerDuty
Git, Bitbucket
Troubleshooting complex, intertwined distributed services
Attention to detail
Testing, Monitoring, Logging, Alerting
Documentation
Incident Management

Posted Egen Site Reliability Engineer on January 28, 2025
