SRE - Site Reliability Engineer

Columbus, OH
Description:
•             Hands-on design, analysis, development and troubleshooting of highly distributed, large-scale production systems and event-driven services spanning on-prem and AWS-based hosting
•             Ownership of reliability, uptime, system security, cost, operations, capacity and performance analysis
•             Share a 24x7 on-call rotation with your team and respond to incidents; lead triage bridges during incidents and provide needed status updates
•             Create and maintain monitoring, alerting and dashboarding solutions that improve visibility into our applications' performance and business metrics and keep operational workload in check
•             Use automation technologies to ensure repeatability, eliminate toil, reduce time to action and repair services
•             Participate in technical training events and game day scenarios
•             Partner with engineering, security, performance, QA and product management teams to improve the availability and quality of service of our products
 
Required Skills:
•             Strong Linux administration/build/management skills
•             Development experience in at least one of these languages: Java, Go, C# or Python, with strong skills in reading, understanding and writing code in that language
•             Demonstrated expertise building and managing highly scaled production infrastructure in on-prem and AWS-based environments
•             Extensive experience troubleshooting n-tier architectures with diverse sets of technologies strongly desired (e.g. load balancers, web/app/caching/database servers, queues, threading, memory, CPU, heap, storage, network, OS)
•             Strong experience using application and infrastructure monitoring systems (like Splunk, CloudWatch, Datadog, New Relic, Sumo Logic, ELK)
•             Excellent presentation and communication skills
•             Mastery of infrastructure automation technologies (like Terraform, Puppet, Ansible, Chef)
•             Expertise with software development lifecycles based on continuous integration and continuous deployment (CI/CD)
•             Experience with common middleware (e.g., Apache, NGINX, IIS, Tomcat, JBoss)
•             Experience with SQL databases (e.g., PostgreSQL, Oracle, MySQL)
•             Expertise with SDLC branching, SCM, and code deployment systems (git/gitflow, Jenkins, CircleCI, TravisCI, etc.)
•             Expertise in container/container-fleet-orchestration technologies (like Docker, Vagrant, Mesosphere)
•             BS degree in Computer Science (or related technical field and/or equivalent industry experience)
