Technical Intelligence Solutions

Site Reliability Engineer (SRE)

Date Posted: April 2022
Location: Reston, VA
Remote Capable: on-site

Security Compliance

  • US Citizenship
  • TS/SCI clearance
  • DoD 8570 IAT Level II compliant

Basic Qualifications

  • 5+ years of relevant experience
  • Security+ CE

Additional Qualifications

  • Experience with Agile/Scrum best practices
  • Ability to collaborate across multiple teams in an organized manner, prioritize tasks, communicate effectively, raise risks to senior leadership, and balance team skillsets
  • Understanding of full-stack infrastructure, stack rollout management, and the differences between managing a Kubernetes stack and a Cloudera stack
  • Understanding of DevOps Engineering Processes and Operations & Management (O&M)
  • 5+ years of experience working in Linux environments
  • 5+ years supporting production enterprise applications
  • Experience with large distributed, highly available environments
  • Experience with container technologies such as Docker, Kubernetes
  • Experience with declarative Infrastructure as Code tools, including Puppet, Terraform, and Ansible
  • Experience with GitOps and CI/CD tools such as ArgoCD, GitLab CI, and Jenkins
  • Experience with cloud deployments and cluster resource management on cloud platforms such as AWS or Azure, including monitoring, logging, and security implementation
  • Ability to dive deep into all aspects of the stack to troubleshoot, identify, and fix problems

Preferred Experience

  • Experience with Python/Go; Microservices; Serverless; MLOps; AIOps; Bash shell scripting
  • Experience with Big Data stack using Hadoop, Spark, Accumulo or MongoDB, and Solr or Elasticsearch
  • Experience with software development processes and code management tools
  • Experience with Prometheus/Grafana or other monitoring tools

Job Responsibilities

  1. Create and improve infrastructure as code and continuous deployment tools built on Kubernetes.
  2. Create and improve continuous automation across multiple technical stacks.
  3. Maintain, troubleshoot, and resolve issues related to infrastructure, CI/CD pipelines, and open-source and commercial tools.

Role Description

As a DevOps Site Reliability Engineer, you will use your development experience to write infrastructure as code and deploy it using industry best practices such as GitOps and CI/CD in a hybrid cloud environment. You will lead efforts to incorporate open-source tools, automation, and cloud resources to cut down on tedious, repetitive tasks and free up the team’s developers to do what they do best: innovate.