Senior Site Reliability Engineer - AWS Cloud Operations

synava
Full-time • Karlsruhe, Germany

📝 Job Overview

  • Job Title: Senior Site Reliability Engineer - AWS Cloud Operations
  • Company: synava (medavis GmbH)
  • Location: Karlsruhe, Baden-Württemberg, Germany
  • Job Type: Full-time, On-site
  • Category: DevOps Engineer, Site Reliability Engineer
  • Date Posted: 2025-08-01
  • Experience Level: 5-10 years

🚀 Role Summary

  • Design and implement best practices for cloud architectures to ensure reliability, scalability, and security of the cloud platform.
  • Collaborate with DevOps, AWS Admins, and the Developer Team to automate, observe, and secure multi-account AWS environments.
  • Drive IaC best practices, automate provisioning and configuration tasks, and lead CI/CD implementation for operational pipelines.
  • Design and implement IAM strategies, compliance automation, and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs).

📝 Enhancement Note: This role requires a strong background in AWS cloud environments, with a focus on architecture, operations, automation, and security. The ideal candidate will have experience in multi-account AWS organizations and a deep understanding of cloud-native systems.

💻 Primary Responsibilities

  • Architecture & Design:

    • Design and implement best practices for cloud architectures.
    • Advise on the selection of platform components for existing or new products.
    • Collaborate with other teams to ensure architectural consistency and compliance.
  • Operations & Observability:

    • Design and implement scalable, highly available cloud-native systems within a complex, multi-account AWS organization structure.
    • Design and implement a telemetry stack that is cost-effective, compliant, and can be used across multiple organizations.
    • Help set up Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to actively manage service qualities.
  • Automation & Infrastructure as Code:

    • Drive IaC best practices and automate provisioning and configuration tasks to streamline operations and ensure consistency.
    • Lead CI/CD implementation for operational pipelines, including strategies for secure cross-account deployments.
  • Security & Identity:

    • Design and implement IAM strategies, such as Zero Trust, RBAC, and SSO.
    • Collaborate with other teams to manage shared service accounts and OU structures.
    • Design and implement compliance automation with policies as code, such as Service Control Policies, AWS Config, and AWS Security Hub.
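
As a rough illustration of the "policies as code" responsibility above, the following Python sketch builds a Service Control Policy document that denies requests outside an approved set of regions. The region list and the exempted global services are illustrative assumptions, not details from this posting, and the helper name build_region_deny_scp is hypothetical. The script only renders JSON; it does not call AWS.

```python
import json


def build_region_deny_scp(allowed_regions, exempt_actions):
    """Return an SCP document (as a dict) that denies requests outside allowed_regions.

    exempt_actions lists action patterns that stay usable everywhere
    (assumption: global services such as IAM are exempted).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                # NotAction: the deny applies to everything except these actions.
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


# Hypothetical values chosen only for the example.
policy = build_region_deny_scp(
    allowed_regions=["eu-central-1", "eu-west-1"],
    exempt_actions=["iam:*", "organizations:*", "route53:*", "sts:*", "support:*"],
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Keeping the policy as a generated, version-controlled artifact lets it go through the same review and CI/CD pipeline as the rest of the infrastructure code before it is attached to an OU.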

📝 Enhancement Note: This role requires a strong focus on automation, observability, and security. The ideal candidate will have experience in designing and implementing secure, scalable, and highly available cloud-native systems.

🎓 Skills & Qualifications

Education: A Bachelor's degree in Computer Science, Engineering, or a related field. Relevant experience may be considered in lieu of a degree.

Experience: 5+ years of experience as an SRE, DevOps Engineer, or Cloud Engineer in production-grade AWS environments or similar environments.

Required Skills:

  • Strong proficiency in core AWS services, including EC2, RDS, VPC, IAM, Route 53, CloudTrail, AWS Organizations, Resource Access Manager (RAM), and AWS SSO.
  • Strong proficiency in Infrastructure as Code using modern frameworks such as AWS CDK, Terraform, and CloudFormation, with a focus on reusable, cloud-agnostic infrastructure patterns.
  • Experience with modern observability tools such as Prometheus, Grafana, and OpenTelemetry (OTel) for metrics, logging, and distributed tracing.
  • Experience in designing secure cloud network architectures and implementing compliance-aligned access controls across cloud-native environments.
  • Excellent troubleshooting skills and an ownership mindset.
  • Fluent English (German is a plus).

Preferred Skills:

  • Experience with Kubernetes declarative deployments, Helm, and advanced troubleshooting.
  • Experience with policy-as-code frameworks such as AWS Service Control Policies (SCPs), AWS Config Rules, or equivalent governance tools.
  • Familiarity with AWS Well-Architected Framework and AWS Cloud Adoption Framework.

📝 Enhancement Note: The ideal candidate will have a strong background in AWS cloud environments, with a focus on architecture, operations, automation, and security. Experience in multi-account AWS organizations and a deep understanding of cloud-native systems are essential for this role.

📊 Web Portfolio & Project Requirements

Portfolio Essentials:

  • Demonstrate experience in designing and implementing secure, scalable, and highly available cloud-native systems.
  • Showcase your ability to automate provisioning and configuration tasks using Infrastructure as Code (IaC) tools.
  • Highlight your experience in managing multi-account AWS environments and implementing IAM strategies.
  • Include examples of your work in designing and implementing telemetry stacks and managing service level objectives (SLOs) and service level indicators (SLIs).

Technical Documentation:

  • Provide documentation for your cloud architecture, including diagrams and detailed explanations of your design choices.
  • Include code comments and documentation for your Infrastructure as Code (IaC) scripts.
  • Demonstrate your ability to write clear and concise technical documentation, explaining complex concepts in an easy-to-understand manner.

📝 Enhancement Note: The ideal candidate will have a strong portfolio demonstrating experience in designing and implementing secure, scalable, and highly available cloud-native systems. The portfolio should also showcase automation of provisioning and configuration tasks with Infrastructure as Code (IaC) tools, management of multi-account AWS environments, and implementation of IAM strategies.

💵 Compensation & Benefits

Salary Range: €70,000 - €90,000 per year (based on experience and qualifications)

Benefits:

  • Comprehensive health insurance
  • Company pension scheme
  • Flexible working hours and remote work options
  • Regular team events and company outings
  • Opportunities for professional development and training

Working Hours: Full-time, 40 hours per week. Flexible working hours and remote work options available.

📝 Enhancement Note: The salary range for this role is estimated based on market research for senior site reliability engineer positions in the AWS cloud environment in Germany. The benefits listed are based on the company's career page and may vary depending on the candidate's experience and qualifications.

🎯 Team & Company Context

🏢 Company Culture

Industry: Healthcare technology, focusing on optimizing workflows in radiology practices and clinics worldwide.

Company Size: Medium-sized company with around 100 employees.

Founded: 2009, with a strong focus on innovation and continuous improvement.

Team Structure:

  • The Cloud Operations Team is responsible for managing the AWS infrastructure and ensuring its reliability, scalability, and security.
  • The team works closely with the DevOps, AWS Admins, and Developer Teams to ensure architectural consistency and compliance.
  • The team is structured with a focus on collaboration, knowledge sharing, and continuous learning.

Development Methodology:

  • The company follows Agile methodologies, with a focus on iterative development and regular feedback.
  • The team uses JIRA for project management and GitHub for version control and collaboration.
  • The company encourages a culture of experimentation and innovation, with a focus on continuous improvement.

Company Website: medavis.com

📝 Enhancement Note: The company culture at medavis GmbH is focused on innovation, collaboration, and continuous improvement. The team structure is designed to encourage knowledge sharing and collaboration, with a strong focus on working closely with other teams to ensure architectural consistency and compliance.

📈 Career & Growth Analysis

Web Technology Career Level: Senior Site Reliability Engineer - AWS Cloud Operations

Reporting Structure: The Senior Site Reliability Engineer will report directly to the Head of Cloud Operations and work closely with the DevOps, AWS Admins, and Developer Teams.

Technical Impact: The Senior Site Reliability Engineer will have a significant impact on the reliability, scalability, and security of the cloud platform. They will be responsible for designing and implementing best practices for cloud architectures and ensuring the platform remains stable, secure, and compliant.

Growth Opportunities:

  • Technical Growth: The role offers opportunities for technical growth, including the chance to gain experience in managing multi-account AWS environments and implementing IAM strategies. The candidate will also have the opportunity to learn and apply the latest AWS services and best practices.
  • Leadership Growth: As the team grows, there may be opportunities for the candidate to take on a leadership role, mentoring junior team members and driving the team's technical direction.
  • Career Progression: The role offers opportunities for career progression, with the potential to move into a more senior or management role within the company.

📝 Enhancement Note: The Senior Site Reliability Engineer role at medavis GmbH offers significant opportunities for technical and career growth. The candidate will have the chance to gain experience in managing multi-account AWS environments and implementing IAM strategies, as well as the opportunity to learn and apply the latest AWS services and best practices. As the team grows, there may also be opportunities for the candidate to take on a leadership role or move into a more senior or management role within the company.

🌐 Work Environment

Office Type: Modern, collaborative office space with a focus on open communication and knowledge sharing.

Office Location(s): The company's headquarters is located in Karlsruhe, Germany, with additional offices in other European countries.

Workspace Context:

  • The workspace is designed to encourage collaboration and communication, with open-plan offices and dedicated team spaces.
  • The company provides modern equipment and tools to support the team's work, including high-quality monitors and fast, reliable internet connections.
  • The company encourages a healthy work-life balance, with flexible working hours and remote work options available.

Work Schedule: Full-time, 40 hours per week. Flexible working hours and remote work options available.

📝 Enhancement Note: The work environment at medavis GmbH is designed to encourage collaboration, open communication, and knowledge sharing. The company provides modern equipment and tools to support the team's work and encourages a healthy work-life balance, with flexible working hours and remote work options available.

📄 Application & Technical Interview Process

Interview Process:

  1. Phone/Video Screen: A brief conversation to discuss the candidate's experience, qualifications, and fit for the role.
  2. Technical Deep Dive: A detailed discussion of the candidate's technical skills and experience, focusing on their knowledge of AWS cloud environments, Infrastructure as Code (IaC), and cloud-native systems.
  3. Behavioral Questions: A series of behavioral questions designed to assess the candidate's problem-solving skills, communication abilities, and cultural fit.
  4. Final Interview: A meeting with the Head of Cloud Operations to discuss the candidate's fit for the role and the company's culture.

Portfolio Review Tips:

  • Highlight your experience in designing and implementing secure, scalable, and highly available cloud-native systems.
  • Include examples of your work in automating provisioning and configuration tasks using Infrastructure as Code (IaC) tools.
  • Showcase your experience in managing multi-account AWS environments and implementing IAM strategies.
  • Include examples of your work in designing and implementing telemetry stacks and managing service level objectives (SLOs) and service level indicators (SLIs).

Technical Challenge Preparation:

  • Brush up on your knowledge of AWS cloud environments, including core services such as EC2, RDS, VPC, IAM, Route 53, CloudTrail, AWS Organizations, Resource Access Manager (RAM), and AWS SSO.
  • Familiarize yourself with Infrastructure as Code (IaC) tools such as AWS CDK, Terraform, and CloudFormation.
  • Prepare for questions on your experience with modern observability tools such as Prometheus, Grafana, and OpenTelemetry (OTel) for metrics, logging, and distributed tracing.
  • Review your experience in designing secure cloud network architectures and implementing compliance-aligned access controls across cloud-native environments.

ATS Keywords:

  • Programming Languages: Python, Bash, PowerShell
  • Web Frameworks: N/A
  • Server Technologies: AWS, Kubernetes, Docker
  • Databases: Amazon RDS, Amazon DynamoDB, Amazon Redshift
  • Tools: AWS CloudFormation, AWS CDK, Terraform, Ansible, Jenkins, JIRA, GitHub
  • Methodologies: Agile, DevOps, Site Reliability Engineering (SRE)
  • Soft Skills: Problem-solving, troubleshooting, communication, collaboration, leadership
  • Industry Terms: AWS, Cloud Operations, Site Reliability Engineering (SRE), Infrastructure as Code (IaC), Telemetry, Compliance, IAM, CI/CD

📝 Enhancement Note: The interview process for the Senior Site Reliability Engineer role at medavis GmbH is designed to assess the candidate's technical skills and experience, as well as their cultural fit and problem-solving abilities. The candidate should be prepared to discuss their experience in designing and implementing secure, scalable, and highly available cloud-native systems, as well as their knowledge of AWS cloud environments, Infrastructure as Code (IaC), and modern observability tools.

🛠 Technology Stack & Web Infrastructure

Frontend Technologies: N/A

Backend & Server Technologies:

  • AWS Services: Amazon EC2, Amazon RDS, Amazon VPC, AWS IAM, Amazon Route 53, AWS CloudTrail, AWS Organizations, AWS Resource Access Manager (RAM), AWS SSO
  • Containerization: Docker, Kubernetes
  • Orchestration: Amazon EKS, Amazon ECS
  • Monitoring & Logging: Amazon CloudWatch, Prometheus, Grafana, ELK Stack, AWS CloudTrail
  • CI/CD: Jenkins, AWS CodePipeline, AWS CodeBuild
  • Infrastructure as Code (IaC): AWS CloudFormation, AWS CDK, Terraform

Development & DevOps Tools:

  • Version Control: Git, GitHub
  • Project Management: JIRA, Confluence
  • Communication: Slack, Microsoft Teams
  • Documentation: Confluence, GitHub Wikis

📝 Enhancement Note: The technology stack for the Senior Site Reliability Engineer role at medavis GmbH is focused on AWS cloud environments, with a strong emphasis on Infrastructure as Code (IaC) and modern observability tools. The candidate should have experience with AWS services such as Amazon EC2, Amazon RDS, Amazon VPC, AWS IAM, and Amazon Route 53, as well as containerization and orchestration tools such as Docker and Kubernetes.

👥 Team Culture & Values

Web Development Values:

  • Innovation: The company encourages a culture of experimentation and continuous improvement, with a focus on driving innovation in the healthcare technology industry.
  • Collaboration: The company values open communication and knowledge sharing, with a focus on working closely with other teams to ensure architectural consistency and compliance.
  • Quality: The company is committed to delivering high-quality products and services, with a focus on reliability, scalability, and security.
  • Customer Focus: The company is dedicated to understanding and meeting the needs of its customers, with a focus on delivering value and improving user experience.

Collaboration Style:

  • Cross-Functional Integration: The company encourages collaboration and communication between teams, with a focus on working closely with other teams to ensure architectural consistency and compliance.
  • Code Review Culture: The company values code review and pair programming, with a focus on ensuring code quality and knowledge sharing.
  • Knowledge Sharing: The company encourages a culture of knowledge sharing, with a focus on mentoring and supporting the growth and development of its team members.

📝 Enhancement Note: The team culture at medavis GmbH is focused on innovation, collaboration, and continuous improvement. The company values open communication and knowledge sharing, with a focus on working closely with other teams to ensure architectural consistency and compliance. The company is committed to delivering high-quality products and services, with a focus on reliability, scalability, and security.

⚑ Challenges & Growth Opportunities

Technical Challenges:

  • Multi-Account AWS Organization: Managing a complex, multi-account AWS organization structure requires strong technical skills and a deep understanding of AWS services and best practices.
  • Secure Cloud Network Architecture: Designing and implementing secure cloud network architectures requires a strong understanding of AWS security services and compliance-aligned access controls.
  • Telemetry Stack Design: Designing and implementing a telemetry stack that is cost-effective, compliant, and can be used across multiple organizations requires a strong understanding of modern observability tools and AWS services.
  • Service Level Objectives (SLOs) & Service Level Indicators (SLIs): Actively managing service qualities requires a strong understanding of Service Level Objectives (SLOs) and Service Level Indicators (SLIs), as well as the ability to design and implement effective monitoring and alerting strategies.
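
To make the SLO and SLI relationship above concrete, here is a minimal Python sketch, assuming a request-based availability SLI (good requests divided by total requests). The function name, the 99.9% target, and the request counts are hypothetical values chosen for illustration; they do not come from the posting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    sli: float              # observed fraction of good events
    error_budget: float     # allowed fraction of bad events (1 - target)
    budget_consumed: float  # share of the error budget already used


def evaluate_slo(good_events: int, total_events: int, slo_target: float) -> SloReport:
    """Compare a measured SLI against an SLO target and report budget usage."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    sli = good_events / total_events
    error_budget = 1.0 - slo_target
    # A value above 1.0 means the SLO was missed for this window.
    budget_consumed = (1.0 - sli) / error_budget
    return SloReport(sli, error_budget, budget_consumed)


# Hypothetical 30-day window: 1,000,000 requests, 500 of them failed.
report = evaluate_slo(good_events=999_500, total_events=1_000_000, slo_target=0.999)
print(f"SLI: {report.sli:.4%}, error budget consumed: {report.budget_consumed:.0%}")
```

Alerting on how fast the budget is being consumed, rather than on individual failures, is one common way to connect SLOs to the monitoring and alerting strategies mentioned above.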

Learning & Development Opportunities:

  • AWS Training: The company offers opportunities for AWS training and certification, with a focus on helping team members develop their skills and advance their careers.
  • Conference Attendance: The company supports attendance at relevant conferences and events, with a focus on helping team members stay up-to-date with the latest trends and best practices in the industry.
  • Mentorship: The company offers mentorship opportunities, with a focus on helping team members develop their skills and advance their careers.

📝 Enhancement Note: The technical challenges for the Senior Site Reliability Engineer role at medavis GmbH require a strong understanding of AWS cloud environments, with a focus on managing complex, multi-account organizations and designing secure cloud network architectures. The learning and development opportunities offered by the company are designed to help team members develop their skills and advance their careers, with a focus on AWS training, conference attendance, and mentorship.

💡 Interview Preparation

Technical Questions:

  • AWS Services: Questions on AWS services such as Amazon EC2, Amazon RDS, Amazon VPC, AWS IAM, and Amazon Route 53, as well as AWS Organizations, AWS Resource Access Manager (RAM), and AWS SSO.
  • Infrastructure as Code (IaC): Questions on Infrastructure as Code (IaC) tools such as AWS CloudFormation, AWS CDK, and Terraform, as well as best practices for designing and implementing reusable, cloud-agnostic infrastructure patterns.
  • Observability: Questions on modern observability tools such as Prometheus, Grafana, and OpenTelemetry (OTel) for metrics, logging, and distributed tracing, as well as best practices for designing and implementing effective monitoring and alerting strategies.
  • Security: Questions on designing secure cloud network architectures and implementing compliance-aligned access controls across cloud-native environments, as well as best practices for managing IAM strategies and implementing compliance automation with policies as code.

Company & Culture Questions:

  • Company Culture: Questions on the company's culture, values, and mission, as well as the team's dynamics and collaboration style.
  • Technical Challenges: Questions on the technical challenges faced by the team and how they are addressed, as well as the opportunities for growth and development within the role.
  • Growth Opportunities: Questions on the opportunities for career progression and technical growth within the company, as well as the team's plans for expansion and development.

Portfolio Presentation Strategy:

  • Architecture Walkthrough: Present a detailed walkthrough of your architecture, including diagrams and detailed explanations of your design choices.
  • Code Review: Include a code review section in your portfolio, with a focus on demonstrating your ability to write clear and concise code and your understanding of best practices for designing and implementing secure, scalable, and highly available cloud-native systems.
  • Problem-Solving: Include examples of your problem-solving skills and your ability to troubleshoot complex technical issues.

📝 Enhancement Note: The interview process for the Senior Site Reliability Engineer role at medavis GmbH is designed to assess the candidate's technical skills and experience, as well as their cultural fit and problem-solving abilities. The candidate should be prepared to discuss their experience in designing and implementing secure, scalable, and highly available cloud-native systems, as well as their knowledge of AWS cloud environments, Infrastructure as Code (IaC), and modern observability tools. The candidate should also be prepared to discuss the company's culture, values, and mission, as well as the team's dynamics and collaboration style.

📌 Application Steps

To apply for this Senior Site Reliability Engineer - AWS Cloud Operations position at medavis GmbH:

  1. Update Your Portfolio: Ensure your portfolio highlights your experience in designing and implementing secure, scalable, and highly available cloud-native systems, as well as your knowledge of AWS cloud environments, Infrastructure as Code (IaC), and modern observability tools.
  2. Tailor Your Resume: Highlight your relevant experience and skills, with a focus on your knowledge of AWS cloud environments, Infrastructure as Code (IaC), and modern observability tools.
  3. Prepare for Technical Interviews: Brush up on your knowledge of AWS cloud environments, Infrastructure as Code (IaC), and modern observability tools, as well as your problem-solving skills and your ability to troubleshoot complex technical issues.
  4. Research the Company: Familiarize yourself with the company's culture, values, and mission, as well as the team's dynamics and collaboration style.

⚠️ Important Notice: This enhanced job description includes AI-generated insights and web development/server administration industry-standard assumptions. All details should be verified directly with the hiring organization before making application decisions.

Application Requirements

The ideal candidate should have 5+ years of experience in AWS environments and proficiency in Kubernetes, IAM integration, and observability tools. Strong skills in Infrastructure as Code and designing secure cloud architectures are also essential.