Careers: DevOps Engineer

DevOps Engineer

JOB CODE: 9218

We're recruiting a DevOps Engineer to join our team. You'll be responsible for providing platform updates, troubleshooting production issues, and developing integrations to meet our clients' requirements. You will play a critical role in bridging the gap between development, quality assurance, and IT operations, as outlined below.

Objectives of this Role

DevOps Engineers fill an important strategic and technical role on engineering teams. They are responsible for supporting and improving processes to ensure high availability, redundancy, recoverability, scalability, and capacity.

This role implements methods, tools, and strategies to balance demands across the software development life cycle, from coding to deployment to maintenance and upgrades. It reduces complexity by bridging the gap between the activities required to support applications and infrastructure.

Primary Responsibilities

  • Focus on automation and support high-availability platforms.
  • Analyze insights from event patterns influencing business-critical uptime metrics, and generate and disseminate incident reports to a wide range of stakeholders at all levels.
  • Automate analytical processes such as system and network log analysis to reassemble and replay incident event history for root cause analysis and impact cost assessment.
  • Drive KPIs and proactively monitor outage frequency, volume, downtime costs, and dependability concerns.
  • Reduce operational inefficiencies in the incident management process through automation and continuous process improvement to provide the quickest path to SREs.
  • Triage issues before customers need to report them, and ensure that product/service faults are resolved, processes are improved, and documentation is updated.

Required Skills and Qualifications

  • 5+ years of incident management experience, particularly in enterprise-class, loosely coupled environments, with ownership of the end-to-end issue management process.
  • 3+ years of experience working across technology stacks and using the appropriate tools for the job, such as Kubernetes, Docker, Terraform, New Relic, Datadog, and Splunk.
  • Advanced understanding of ephemeral services in a complex cloud environment, with the ability to analyze new technologies and industry trends, produce proofs of concept, and deliver results to engineering.
  • Strong programming/scripting abilities in Python, Go, JavaScript, C/C++ or similar.
  • Experience with large-scale programs, including the ability to identify bottlenecks in complex systems, and close attention to the quality and detail of all deliverables.
  • Excellent ability to communicate across the org chart, and an appreciation for the value of establishing context for the team.

Preferred Qualifications

  • Intermediate to expert knowledge of infrastructure automation tools (Terraform, Ansible, K8s, etc.).
  • Experience managing public, private, and hybrid cloud environments (AWS preferred).
  • Experience designing distributed systems and business app integration architectures using microservices, containers, and cloud infrastructure.
  • Creative problem solving & thinking outside of the box.

Apply Now