Senior Site Reliability Engineer (SRE)

Columbus, OH
This role is based in Columbus, OH, with a Syrinx Educational Technology Partner.
This role can be contract or contract-to-hire.

 
U.S. citizens and those authorized to work in the U.S. are encouraged to apply. We are unable to sponsor at this time.

We are looking for an adventurous Senior Site Reliability Engineer who loves AWS technologies. You will be a member of an engineering team where collaboration and innovation are key focuses. As part of this team, you will design, build, deploy, and monitor software and infrastructure that deliver new features to the market. Be prepared to explore new technologies and design concepts as an integral part of your job.

Essential Accountabilities:
  • Partner with engineering, security and product teams to improve the availability, scalability and efficiency of our products
  • Design, develop, deploy, monitor and support large-scale production systems and event-driven services hosted within AWS
  • Lead technology initiatives that drive scalability and reliability improvements
  • Build tools and automation that eliminate repetitive tasks, minimize downtime and achieve human-free operations
  • Participate in 24x7 operations support and on-call rotation
Skills and Qualifications:
  • Intermediate proficiency in one of the following programming languages: Java, Go or C#
  • Strong experience in Linux systems administration
  • You can design, build and support highly available production systems in AWS using technologies like ELB, EC2, ECS, RDS, ElastiCache, CloudFront, S3, IAM, Route 53 and DynamoDB
  • Experienced using automation technologies like Terraform, Puppet and Packer
  • Proficient with APM, infrastructure and log aggregation tooling to monitor system health and customer experience (e.g., New Relic, CloudWatch, Datadog, Sumo Logic, ELK)
  • You have participated in a 24x7 on-call rotation with your team and responded to incidents
  • A proven track record of diagnosing and fixing time-sensitive and critical production issues
  • Experience with Jenkins and/or CircleCI continuous integration: build, package, release and deploy
  • Git, GitHub, and GitFlow skills
  • Experience operating common middleware (e.g., Apache, NGINX, Tomcat, JBoss)
  • Solid understanding of DNS, DHCP, SSH, HTTP, TCP/IP and other common network protocols
 
Big Pluses:
  • Database administration skills (Amazon Aurora, MySQL, PostgreSQL, Oracle)
  • Have leveraged deployment strategies such as blue-green and canary
  • Experience building RESTful services and/or web applications
  • Experience automating software deployments and following a continuous delivery and deployment model
  • Experience with system analysis and troubleshooting in large-scale Linux environments
 
People who have been successful in this role:
  • Passionate and adept at software development and/or system engineering
  • Love to understand how new technologies and architectures work, educate coworkers and channel their knowledge into improving system reliability and performance
  • Continuously learning about application scalability, availability, reliability, and security
  • Intensely curious about how complex distributed systems operate and fail at scale
  • Think freely and independently, and are ready to share their views
  • Eager to learn from mistakes and socialize the lessons learned
  • Like to take ownership of infrastructure components and lead projects
