Site Reliability Engineer
Job Summary: In a highly collaborative, distributed agile team environment, this role will ensure the scalability, reliability and performance of Airspace Link’s systems and applications. Working in collaboration with the platform and software engineers, this role will automate and improve operational processes.
Duties and Responsibilities
· Reliability and Performance: Design, implement and maintain systems to ensure reliability, high availability and performance
· Scalability: Optimize applications and infrastructure to handle growth
· Monitoring and Alerting: Implement monitoring systems and performance metrics to proactively identify and address issues before they impact end users
· Incident Response: Respond to incidents in a timely manner. Lead efforts to resolve critical issues, prevent recurrence and run postmortems.
· Automation: Develop tools and scripts to automate manual operational tasks, increasing efficiency
· Infrastructure Management: Manage infrastructure using Infrastructure as Code (IaC) tools such as Terraform and Ansible
· Capacity Planning: Analyze and forecast future infrastructure needs.
· Change Management: Implement practices to safely release code (CI/CD, canary releases, feature flags). Reduce risk while increasing deployment velocity
· Collaboration: Work closely with development teams to improve software quality and reliability
· Disaster Recovery: Create disaster recovery plans to mitigate system failures
· Security and Compliance: Implement security controls and conduct vulnerability assessments. Ensure systems adhere to industry standards and regulations through compliance audits and assessments
Position Type: Full-Time, 40 hours per week
Status: Exempt
Location: Hybrid
Requirements
· B.S. in Computer Science or equivalent years of relevant experience or education
· 3+ years of professional experience in a similar SRE or DevOps role
· Strong programming skills (Python, Go, Java)
· Experience with cloud platforms (Azure, AWS, GCP)
· Experience with containerization technologies (Docker, Kubernetes)
· Experience with monitoring and logging tools (Prometheus, Grafana)
· Knowledge of system administration
· Strong problem-solving and analytical skills
· Great teamwork skills
· An eagerness to learn and adapt to the needs of a greenfield industry
· Part 107 or another pilot’s license a plus