Site Reliability Engineer at ComplyAdvantage

📍 Job Overview

Job Title: Site Reliability Engineer
Company: ComplyAdvantage
Location: Lisbon, Portugal
Job Type: Hybrid (2 days in the office)
Category: DevOps Engineer
Date Posted: 2025-07-25
Experience Level: Mid-Senior Level (5-10 years)
Remote Status: On-site/Hybrid

🚀 Role Summary

Key Responsibilities: Design, build, and maintain reliable foundational services for CI/CD pipelines and observability platforms. Participate in on-call rotations to respond to production incidents and lead post-incident reviews.
Key Technologies: Cloud-based infrastructure (AWS/GCP), Kubernetes, Terraform, Helm, CI/CD tooling, observability platforms, Python, and containerized workloads.

📝 Enhancement Note: This role requires a strong background in cloud services, Kubernetes, and CI/CD tooling. Familiarity with the company's tech stack, including AWS/GCP, Kubernetes, Terraform, Helm, and observability platforms, is essential for success in this position.

💻 Primary Responsibilities

Design and Build: Architect, implement, and maintain highly available and reliable foundational services for CI/CD pipelines, observability platforms, and the Internal Developer Platform.
Ensure Reliability: Participate in an on-call rotation to respond to and resolve production incidents swiftly. Lead thorough post-incident reviews to identify root causes and implement preventative measures.
Automate Infrastructure: Manage and automate the cloud infrastructure using Terraform and Helm, adhering to GitOps best practices.
Collaborate Effectively: Partner closely with development and data engineering teams to ensure seamless deployments and provide robust operational support.

📝 Enhancement Note: This role involves a significant amount of collaboration with other engineering teams. Strong communication and teamwork skills are crucial for success in this position.

🎓 Skills & Qualifications

Education: BSc/BA degree in computer science, engineering, or a related discipline, or relevant years of experience in required skills.

Experience: 5-10 years of experience in cloud services, Kubernetes, CI/CD tooling, and observability platforms.

Required Skills:

Deep expertise in cloud services (AWS and/or GCP)
Significant experience managing and troubleshooting services within Kubernetes environments
Proven track record with CI/CD tooling
Strong proficiency in observability platforms, including monitoring, alerting, and production operations
Hands-on experience codifying infrastructure with Terraform and Helm charts
Excellent incident response and troubleshooting abilities
Proficiency in scripting and automation using Python
Experience working with containerized workloads
Experience collaborating with software engineers to support production cloud-native applications

Preferred Skills:

Familiarity with ArgoCD, GitLab CI, and the Grafana, Mimir, Loki & Prometheus stack

📝 Enhancement Note: While not required, familiarity with the company's preferred tools and technologies can provide a significant advantage in this role.

📊 Web Portfolio & Project Requirements

Portfolio Essentials:

Demonstrate a strong understanding of cloud services, Kubernetes, and CI/CD tooling through relevant projects and case studies.
Showcase your ability to design, build, and maintain reliable foundational services for CI/CD pipelines and observability platforms.
Highlight your experience with Terraform and Helm, and provide examples of infrastructure automation projects.
Include examples of your incident response and troubleshooting skills, and describe how you have led post-incident reviews.

Technical Documentation:

Provide clear and concise documentation for your projects, including code comments, version control, and deployment processes.
Include performance metrics and optimization techniques used in your projects.

📝 Enhancement Note: A well-structured portfolio that demonstrates your technical skills and problem-solving abilities will be crucial for success in this role.

💵 Compensation & Benefits

Salary Range: €60,000 - €90,000 per year (based on market research and company size)

Benefits:

Equity as we want you to have a part of what we are building
Private medical insurance, giving you peace of mind while you excel in your career
Unlimited Time Off Policy - A work-life balance and focus on our well-being are critical to keeping us performing at our best
We embrace a hybrid approach that requires employees to be in the office for two days a week. We strongly believe that this approach fosters collaboration and enables the building of meaningful relationships
You will also get a new starter budget to kit out your home office
Opportunity to work on innovative projects with smart-minded people keen to share their knowledge and continuously improve
Annual learning budget (prorated based on start date) to drive your performance and career development

Working Hours: 40 hours per week, with flexible deployment windows and maintenance schedules

📝 Enhancement Note: The salary range provided is based on market research and company size. Actual compensation may vary based on experience and qualifications.

🎯 Team & Company Context

🏢 Company Culture

Industry: Financial crime risk data and detection technology

Company Size: Medium-sized company with around 1,000 employees

Founded: 2014

Team Structure:

The DevOps team is part of the Platform tribe, which is dedicated to building and maintaining foundational systems, tooling, and services for the Technology organization.
The team collaborates closely with other engineering teams, including development and data engineering teams.

Development Methodology:

The company uses modern development methodologies, including Agile and Scrum.
They emphasize engineering excellence and strive to ship the best possible code and solutions to their customers.

Company Website: complyadvantage.com

📝 Enhancement Note: The company's focus on engineering excellence and collaboration makes it an attractive place for DevOps engineers looking to grow their careers in a dynamic and innovative environment.

📈 Career & Growth Analysis

Career Level: Mid-Senior Level (5-10 years of experience)

Reporting Structure: The Site Reliability Engineer reports directly to the DevOps team lead and works closely with other engineering teams.

Technical Impact: The Site Reliability Engineer plays a crucial role in ensuring the reliability, scalability, and performance of the company's critical services. Their work directly impacts the user experience and the company's ability to deliver exceptional products to its customers.

Growth Opportunities:

Technical Growth: The role offers ample opportunities for technical growth, including working with cutting-edge technologies and collaborating with experienced engineers.
Leadership Potential: With experience, there is potential for growth into a technical leadership role, where you would be responsible for guiding the team's technical direction and mentoring other engineers.
Career Progression: As the company continues to grow, there may be opportunities for career progression into more senior roles within the DevOps team or other areas of the Technology organization.

📝 Enhancement Note: The company's focus on engineering excellence and innovation provides numerous opportunities for technical growth and career progression.

🌐 Work Environment

Office Type: Hybrid office environment, with employees required to be in the office for two days a week.

Office Location(s): Lisbon, Portugal

Workspace Context:

The company provides a collaborative workspace with multiple monitors and testing devices available for engineers.
The workspace is designed to foster cross-functional collaboration between developers, designers, and stakeholders.

Work Schedule: The work schedule is flexible, with deployment windows and maintenance schedules managed by the on-call rotation.

📝 Enhancement Note: The company's hybrid work environment and flexible work schedule allow for a healthy work-life balance while still fostering collaboration and innovation.

📄 Application & Technical Interview Process

Interview Process:

Technical Assessment: A hands-on technical assessment focused on cloud services, Kubernetes, CI/CD tooling, and observability platforms. This may include live coding exercises and system design discussions.
Behavioral Interview: A behavioral interview focused on your problem-solving skills, communication abilities, and cultural fit with the company.
Final Evaluation: A final evaluation based on your technical skills, cultural fit, and alignment with the company's mission and values.

Portfolio Review Tips:

Highlight your experience with cloud services, Kubernetes, CI/CD tooling, and observability platforms through relevant projects and case studies.
Emphasize your ability to design, build, and maintain reliable foundational services for CI/CD pipelines and observability platforms.
Include examples of your incident response and troubleshooting skills, and describe how you have led post-incident reviews.

Technical Challenge Preparation:

Brush up on your cloud services, Kubernetes, CI/CD tooling, and observability platforms skills.
Familiarize yourself with the company's tech stack, including AWS/GCP, Kubernetes, Terraform, Helm, and the Grafana, Mimir, Loki & Prometheus stack.
Prepare for live coding exercises and system design discussions by practicing common interview questions and working through relevant coding challenges.

ATS Keywords:

AWS, GCP, Kubernetes, Terraform, Helm, CI/CD, Observability, Monitoring, Alerting, Production Operations, Incident Response, Troubleshooting, Scripting, Automation, Containerized Workloads, Collaboration, Agile, Scrum, Engineering Excellence, Innovation, Hybrid Work Environment, Flexible Work Schedule, Technical Growth, Career Progression, Technical Leadership.

📝 Enhancement Note: Familiarize yourself with the company's tech stack and the relevant ATS keywords to optimize your resume and application materials.

🛠 Technology Stack & Web Infrastructure

Cloud-Based Infrastructure: Fully cloud-based with a Kubernetes-focused tech stack. Compute workloads run in Kubernetes clusters across multiple regions.

Backend & Server Technologies:

Cloud services: AWS and/or GCP
Container orchestration: Kubernetes
Infrastructure as Code: Terraform and Helm
CI/CD tooling: GitLab CI, ArgoCD
Observability platforms: Grafana, Mimir, Loki & Prometheus

Development & DevOps Tools:

Version control: Git
Collaboration: GitLab
Monitoring: Grafana, Mimir, Loki & Prometheus
Logging: ELK Stack (Elasticsearch, Logstash, Kibana)
Infrastructure as Code: Terraform, Helm
CI/CD: GitLab CI, ArgoCD

📝 Enhancement Note: Familiarize yourself with the company's tech stack and be prepared to discuss your experience with the relevant technologies during the interview process.

👥 Team Culture & Values

Engineering Values:

Reliability: The company values reliability and strives to ensure the availability and performance of its critical services.
Collaboration: The company fosters a culture of collaboration and encourages engineers to work closely with other teams to deliver exceptional products.
Innovation: The company embraces innovation and encourages engineers to explore new technologies and approaches to solve complex problems.
Continuous Learning: The company values continuous learning and provides opportunities for engineers to develop their skills and advance their careers.

Collaboration Style:

Cross-functional Integration: The company encourages collaboration between developers, designers, and stakeholders to deliver exceptional products.
Code Review Culture: The company emphasizes code review and pair programming practices to ensure code quality and knowledge sharing.
Knowledge Sharing: The company encourages knowledge sharing and provides opportunities for engineers to mentor and learn from one another.

📝 Enhancement Note: The company's focus on collaboration, innovation, and continuous learning makes it an attractive place for DevOps engineers looking to grow their careers in a dynamic and supportive environment.

⚡ Challenges & Growth Opportunities

Technical Challenges:

Cloud Services: Design, implement, and maintain highly available and reliable foundational services for CI/CD pipelines and observability platforms in a cloud-based environment.
Incident Response: Respond to and resolve production incidents swiftly, and lead thorough post-incident reviews to identify root causes and implement proactive preventative measures.
Automation: Manage and automate the cloud infrastructure using Terraform and Helm, adhering to GitOps best practices.

Learning & Development Opportunities:

Technical Skill Development: Develop your skills in cloud services, Kubernetes, CI/CD tooling, and observability platforms through hands-on projects and collaboration with experienced engineers.
Emerging Technologies: Stay up-to-date with emerging technologies and trends in cloud services, Kubernetes, and CI/CD tooling.
Leadership Development: Develop your leadership skills through mentoring, team management, and architecture decision-making opportunities.

📝 Enhancement Note: The company's focus on technical growth and innovation provides numerous opportunities for DevOps engineers to develop their skills and advance their careers.

💡 Interview Preparation

Technical Questions:

Cloud Services: Describe your experience with cloud services (AWS and/or GCP) and how you have used them to design, implement, and maintain highly available and reliable foundational services for CI/CD pipelines and observability platforms.
Kubernetes: Explain your experience managing and troubleshooting services in Kubernetes environments, and how you have used Terraform and Helm to automate the supporting cloud infrastructure.
Incident Response: Describe your incident response and troubleshooting skills, and provide examples of how you have led post-incident reviews to identify root causes and implement proactive preventative measures.

Company & Culture Questions:

Company Mission: Explain how your experience and skills align with the company's mission to neutralize the risk of money laundering, terrorist financing, corruption, and other financial crime.
Company Values: Describe how you embody the company's values, including reliability, collaboration, innovation, and continuous learning.

Portfolio Presentation Strategy:

Live Demo: Demonstrate your ability to design, build, and maintain reliable foundational services for CI/CD pipelines and observability platforms through a live demo of your portfolio projects.
Code Walkthrough: Provide a detailed walkthrough of your code, including your use of Terraform, Helm, and other relevant technologies.
Incident Response Example: Describe an incident you have responded to and how you led a post-incident review to identify root causes and implement proactive preventative measures.

📝 Enhancement Note: Prepare thoroughly for the technical and behavioral interview questions, and be ready to demonstrate your skills and experience through a live demo and code walkthrough.

📌 Application Steps

To apply for this Site Reliability Engineer position:

Customize Your Resume: Highlight your experience with cloud services, Kubernetes, CI/CD tooling, and observability platforms, and tailor your resume to the specific requirements of this role.
Prepare Your Portfolio: Showcase your ability to design, build, and maintain reliable foundational services for CI/CD pipelines and observability platforms through relevant projects and case studies.
Research the Company: Familiarize yourself with the company's mission, values, and tech stack, and be prepared to discuss how your experience and skills align with the company's goals and culture.
Practice Interview Questions: Prepare for the technical and behavioral interview questions by working through common interview questions and practicing your responses.

⚠️ Important Notice: This enhanced job description includes AI-generated insights and web development/DevOps industry-standard assumptions. All details should be verified directly with the hiring organization before making application decisions.
