Site Reliability Engineer

JOB CODE: 7391

Our site reliability engineers (SREs) focus on a rich feature set, high availability, and excellent performance to enable our users to complete their missions.

At Datapixels, we're seeking a Site Reliability Engineer to join our team. You'll be responsible for providing product updates, troubleshooting production issues, and developing integrations to fulfill our clients' needs. You will play a critical role in bridging the gap between development, quality assurance, and IT operations.

You'll seek to incorporate the routine tasks of software development, quality assurance, deployment, and integration into a single, continuous set of processes.

Objectives of this Role

  • Automating IT Operations by developing and integrating software solutions to increase the stability, automation, and scalability of organizational systems.
  • Monitoring critical applications and related services to ensure availability during critical business hours.
  • Specifying Service Level Indicators and Objectives.
  • Incident Management and Disaster Recovery.
  • On-Call Support and Issue Resolution.
  • Facilitating Post-Incident Analysis.

Primary Responsibilities

  • Ensuring that services are available, the underlying infrastructure is properly functioning, and other internal tools, processes, and systems are working as expected.
  • Analyzing historical data and setting realistic objectives to meet Service Level Agreements (SLAs).
  • Collaborating on high-priority incident tickets and ensuring system recovery within the SLA.
  • Ensuring high-priority tickets are resolved quickly enough to meet the SLA by investigating, diagnosing, and resolving the underlying problem.
  • Analyzing incidents to identify the root cause and to prevent similar incidents from recurring.

Required Skills and Qualifications

  • Bachelor’s degree in computer science or another highly technical, scientific discipline, or 5+ years of comparable experience.
  • Software development experience in one or more languages such as Python, Java, TypeScript, or JavaScript.
  • Experience with Docker, Kubernetes, and/or Terraform.
  • Experience with GitHub Actions, CircleCI, Jenkins, or other Continuous Integration tooling.
  • Experience with AWS, Google Cloud Platform, or Azure.
  • High proficiency with source control including Git.
  • Proficiency with command line navigation.
  • Experience with site performance profiling and tuning.
  • Experience working within a service-oriented architecture.

Preferred Qualifications

  • Experience implementing observability for GraphQL APIs.
  • Experience managing workloads for applications written in JavaScript and TypeScript on a Node.js runtime.
  • Experience facilitating blameless incident retrospectives.
