Site Reliability Engineer

JOB CODE: 7391

Our site reliability engineers (SREs) focus on a rich feature set, high availability, and excellent performance to enable our users to complete their missions.

At Datapixels, we're seeking a Site Reliability Engineer to join our team. You'll be responsible for providing product updates, troubleshooting production issues, and developing integrations to fulfill our clients' needs. You will play a critical role in bridging the gap between development, quality assurance, and IT operations.

You'll seek to incorporate the routine tasks of software development, quality assurance, deployment, and integration into a single, continuous set of processes.

Objectives of this Role

  • Automating IT operations by developing and integrating software solutions that increase the stability, automation, and scalability of organizational systems.
  • Monitoring critical applications and related services to ensure availability during critical business hours.
  • Specifying Service Level Indicators and Objectives.
  • Incident Management and Disaster Recovery.
  • On-Call Support and Issue Resolution.
  • Facilitating Post-Incident Analysis.

Primary Responsibilities

  • Ensuring that services are available, the underlying infrastructure is properly functioning, and other internal tools, processes, and systems are working as expected.
  • Analyzing historical data and setting realistic objectives to meet Service Level Agreements (SLAs).
  • Collaborating on high-priority incident tickets and ensuring system recovery within the SLA.
  • Ensuring high-priority tickets are resolved quickly enough to meet the SLA: the SRE investigates and diagnoses the problem, then resolves it.
  • Analyzing incidents to identify the root cause and to prevent similar incidents from recurring.

Required Skills and Qualifications

  • Bachelor’s degree in computer science or another highly technical or scientific discipline, or 5+ years of comparable experience.
  • Software development experience in one or more languages such as Python, Java, TypeScript, or JavaScript.
  • Experience with Docker, Kubernetes, and/or Terraform.
  • Experience with GitHub Actions, CircleCI, Jenkins, or other continuous integration tooling.
  • Experience with AWS, Google Cloud Platform, or Azure.
  • High proficiency with source control, including Git.
  • Proficiency with command line navigation.
  • Experience with site performance profiling and tuning.
  • Experience working within a service-oriented architecture.

Preferred Qualifications

  • Experience implementing observability for GraphQL APIs.
  • Experience managing workloads for applications written in JavaScript and TypeScript on the Node.js runtime.
  • Experience facilitating blameless incident retrospectives.

