Jobs at Syrinx

Senior DevOps/Site Reliability Engineer

Boston, MA

We are hiring a Sr. Site Reliability Engineer who will work with our software engineers to build reliable, high-capacity, high-performance infrastructure in support of our mission to reimagine learning for millions of students worldwide. If you know AWS services inside out, have solid networking experience, and like engineering solutions to site reliability and operations problems, you will thrive in this position. The position is located at our Boston, MA facility.

Essential Accountabilities:

Hands-on design, analysis, and troubleshooting of highly distributed, large-scale production systems
Ownership of reliability, uptime, capacity, and performance, and of their analysis
Ensuring the repeatability, traceability, and transparency of our infrastructure automation
Identifying highest-impact opportunities to optimize existing systems
System design consulting for teams seeking to leverage or improve their production infrastructure
Anticipating, building, and planning capacity for upcoming product/feature launches

Required Skills:

Mastery of AWS services (IAM, EC2, S3, EBS/EFS, ELB/ALB, AutoScaling, RDS and replication techniques, VPC, Subnets, Elastic IP, Route53, CloudWatch, CloudFront, Lambda, CloudFormation, ECS, SNS, ElastiCache);
Expertise in container/container-fleet-orchestration technologies (like Docker, Kubernetes, AWS ECS);
Expertise in designing and managing escalation response plans, from monitoring through reaction, response, remediation, and retrospective, in culturally aligned (proactive, customer-focused, collaborative, data-driven, and AUTOMATED) ways;
Mastery of infrastructure build and configuration automation technologies (like Terraform, Ansible, Puppet, CodeDeploy, Chef);
Strong skills in reading, understanding, and writing code in at least two of: JavaScript, Python, PHP, Go, or Ruby;
Strong network engineering skills;
Cloud and container native Linux administration/build/management skills (AWS AMIs, Packer, etc.);
Significant experience troubleshooting concurrent and distributed system interactions;
Expertise with continuous-deployment software development lifecycles in the Cloud (CI/CD);
Cloud database operations and deployment experience (RDS MySQL/Postgres/Aurora), caching operations and deployments (Memcached, Redis);
Expertise with Lean/Agile deployment processes (ZDT: Blue/Green, Canary, DNS strategies);
Familiarity with site and infrastructure monitoring systems (CloudWatch, Datadog, New Relic, Sumo Logic, ThousandEyes);
Strong problem-solving, root cause analysis, and systems engineering skills;
Good presentation and communication skills;
Expertise with SDLC branching, SCM, and code deployment systems (Git/Gitflow, Jenkins, CircleCI, etc.);
BS Degree in Computer Science (or related technical field and/or equivalent industry experience).
