Site Reliability Engineer (Middle) ID38916
📍 Job Overview
- Job Title: Site Reliability Engineer (Middle)
- Company: AgileEngine
- Location: Cali, West Kalimantan, Indonesia
- Job Type: On-site (Hybrid)
- Category: DevOps, Infrastructure
- Date Posted: July 29, 2025
- Experience Level: Mid-level (2-5 years)
- Remote Status: On-site with hybrid flexibility
🚀 Role Summary
- Shift: Monday – Thursday 8AM – 7PM PST (11AM – 10PM EST) with rotating on-call
- Key Responsibilities: Manage alerts, provide 24x7 on-call support, collaborate with teams, automate tasks, and improve infrastructure health
- Technical Skills: AWS, EKS, Terraform, Helm, Docker, Linux, Bash, Python, REST APIs, monitoring solutions, and strong communication skills
📝 Enhancement Note: This role focuses on maintaining and enhancing the reliability of SaaS services, requiring a strong background in AWS, infrastructure as code (IaC), and scripting. Familiarity with Datadog and a customer-centric mindset are also crucial for success in this position.
💻 Primary Responsibilities
- Alert Management: Monitor and manage alerts, escalate issues as needed, and document remediation steps
- On-Call Support: Provide 24x7 on-call support for critical SaaS events and be available for emergencies
- Infrastructure Management: Deploy to EKS/K8s cluster using Terraform and Helm, maintain existing infrastructure running under Docker Swarm, and improve infrastructure health
- Automation & Collaboration: Automate manual tasks, collaborate with other teams to provide high-level support, and work closely with various departments to ensure the best SaaS service for customers
- Root Cause Analysis (RCA) & Corrective Actions: Perform RCA, take corrective actions to prevent issue recurrence, and create and assign alert-related actions to the appropriate team after investigation
📝 Enhancement Note: This role requires a proactive approach to infrastructure management, with a focus on preventing issues before they occur. Strong problem-solving skills and the ability to work effectively in a collaborative environment are essential for success in this position.
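For candidates who want a concrete picture of the alert-management duty above (monitor, escalate as needed, document remediation steps), the following is a minimal Python sketch. The severity names, the `Alert` fields, and the escalation rule are all hypothetical and are not taken from AgileEngine's tooling; a real setup would likely take these from the team's monitoring platform.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical rule: these severities get escalated, everything else is
# handled directly by the engineer on shift.
ESCALATE_SEVERITIES = {"critical", "high"}


@dataclass
class Alert:
    name: str
    severity: str  # e.g. "critical", "high", "low" (assumed labels)
    service: str
    notes: list[str] = field(default_factory=list)  # remediation log


def triage(alert: Alert) -> str:
    """Return "escalate" or "handle" and record the decision on the alert."""
    if alert.severity.lower() in ESCALATE_SEVERITIES:
        decision = "escalate"
    else:
        decision = "handle"
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    alert.notes.append(f"{stamp} triage decision: {decision}")
    return decision


if __name__ == "__main__":
    alerts = [
        Alert(name="disk-usage-85", severity="low", service="api"),
        Alert(name="endpoint-down", severity="critical", service="checkout"),
    ]
    for a in alerts:
        print(a.name, "->", triage(a))
```

Keeping the decision and its timestamp on the alert itself is one simple way to cover the "document remediation steps" part of the responsibility; a ticketing system would normally hold that record instead.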
🎓 Skills & Qualifications
Education: Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)
Experience: 2+ years of professional experience in a similar role
Required Skills:
- Experience working with Datadog
- Hands-on experience as an AWS Cloud Engineer
- Working knowledge of EKS, Terraform, and Helm
- Working experience with Docker and Docker Swarm
- Good understanding of AWS IAM roles and policies
- Experience logging and monitoring AWS resources using CloudWatch logs
- Experience working in a Linux environment
- Proficient in Bash and/or Python scripting
- A strong understanding of web technologies such as REST APIs
- Working experience with monitoring solutions, such as Grafana and Prometheus
- Excellent oral and written communication skills
- Experience in Product/Application Support for SaaS-based products
- Understanding of APIs, Databases, Systems Architecture, and Design
- Designing, implementing, and operating in a DevSecOps environment
- Ability to work independently as well as within a collaborative environment
- A technical aptitude with the desire to learn new and evolving technologies
- Upper-Intermediate English level
Preferred Skills:
- Experience with AWS services (e.g., S3, RDS, Lambda, etc.)
- Familiarity with CI/CD pipelines and GitOps
- Knowledge of container orchestration and service mesh technologies
📝 Enhancement Note: While the required skills list is comprehensive, candidates with experience in AWS services, CI/CD pipelines, and container orchestration will have an advantage in this role. Additionally, a strong focus on customer-centric communication and problem-solving will be valuable for success in this position.
📊 Web Portfolio & Project Requirements
Portfolio Essentials:
- Demonstrate experience with AWS, EKS, Terraform, Helm, and Docker through relevant projects and case studies
- Showcase your ability to manage alerts, perform RCA, and take corrective actions through real-world examples
- Highlight your scripting skills (Bash, Python) and problem-solving approach in your portfolio projects
- Include examples of your collaboration and communication skills, demonstrating your ability to work effectively with teams
Technical Documentation:
- Document your approach to infrastructure as code (IaC) and provide examples of your Terraform and Helm configurations
- Explain your monitoring and alerting strategies, including any custom scripts or tools you've developed
- Describe your process for performing RCA and taking corrective actions, including any automation you've implemented
- Include any relevant certifications or training you've received, such as AWS certifications or Kubernetes certifications
📝 Enhancement Note: When preparing your portfolio for this role, focus on demonstrating your technical skills and problem-solving approach through real-world examples. Highlight your ability to manage alerts, perform RCA, and collaborate effectively with teams to provide high-level support for SaaS services.
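As an illustration of the kind of custom script a portfolio might include for monitoring or RCA work, here is a small Python sketch that counts error lines per minute and reports the first minute that crosses a threshold. The log line layout (`timestamp LEVEL message`), the function names, and the threshold are assumptions made for this example only.

```python
from collections import Counter


def errors_per_minute(lines):
    """Count ERROR lines per minute.

    Assumes each line looks like 'YYYY-MM-DDTHH:MM:SS LEVEL message'.
    Lines that do not match this shape are ignored.
    """
    counts = Counter()
    for line in lines:
        parts = line.split(maxsplit=2)
        if len(parts) >= 2 and parts[1] == "ERROR":
            minute = parts[0][:16]  # truncate the timestamp to the minute
            counts[minute] += 1
    return counts


def first_spike(counts, threshold):
    """Return the earliest minute with at least `threshold` errors, or None."""
    for minute in sorted(counts):
        if counts[minute] >= threshold:
            return minute
    return None


if __name__ == "__main__":
    sample = [
        "2025-07-29T10:00:12 INFO request ok",
        "2025-07-29T10:00:40 ERROR upstream timeout",
        "2025-07-29T10:01:05 ERROR upstream timeout",
        "2025-07-29T10:01:09 ERROR upstream timeout",
        "2025-07-29T10:01:30 ERROR connection reset",
    ]
    counts = errors_per_minute(sample)
    print(dict(counts))
    print("first spike:", first_spike(counts, threshold=3))
```

A script like this only narrows down when an incident started; the write-up that accompanies it should still explain the cause and the corrective action.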
💵 Compensation & Benefits
Salary Range: The salary range for this role is IDR 25,000,000 - 35,000,000 per month (approximately USD 1,750 - 2,450 per month), based on regional market standards and the candidate's experience level.
Benefits:
- Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps
- Competitive compensation: USD-based compensation and budgets for education, fitness, and team activities
- A selection of exciting projects: Join projects with modern solutions development and top-tier clients that include Fortune 500 enterprises and leading product brands
- Flextime: Tailor your schedule for an optimal work-life balance, with the option to work from home or from the office, whichever makes you happiest and most productive
Working Hours: 40 hours per week, with a hybrid work arrangement (Monday – Thursday 8AM – 7PM PST, rotating on-call)
📝 Enhancement Note: The provided salary range is based on regional market standards and the candidate's experience level. AgileEngine offers competitive compensation and benefits, including professional growth opportunities, exciting projects, and flextime arrangements to attract top talent in the industry.
🎯 Team & Company Context
🏢 Company Culture
Industry: AgileEngine is a software development company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries, ranking among the leaders in application development and AI/ML. Their people-first culture has earned them multiple Best Place to Work awards.
Company Size: AgileEngine is an Inc. 5000 company, indicating a mid-sized to large organization with a significant presence in the software development industry.
Founded: 2012
Team Structure:
- The Site Reliability Engineering team works closely with the Support, Customer Success, Migration, and Professional Services teams to provide best-in-class SaaS service to customers
- The team is responsible for managing alerts, providing on-call support, and improving infrastructure health to ensure high-level support for SaaS services
- The team consists of mid-level and senior Site Reliability Engineers, with a collaborative and customer-centric approach to problem-solving
Development Methodology:
- AgileEngine follows Agile methodologies, with a focus on iterative development, continuous improvement, and customer-centric problem-solving
- The team uses tools such as Jira, Confluence, and Bitbucket to manage projects, track progress, and collaborate effectively
- The company encourages a culture of learning and innovation, with regular TechTalks, mentorship programs, and personalized growth roadmaps
Company Website: AgileEngine
📝 Enhancement Note: AgileEngine's people-first culture, focus on customer-centric problem-solving, and commitment to professional growth make it an attractive employer for mid-level Site Reliability Engineers looking to advance their careers in the software development industry.
📈 Career & Growth Analysis
Web Technology Career Level: Mid-level Site Reliability Engineer
Reporting Structure: The Site Reliability Engineer reports directly to the Site Reliability Engineering Manager and works closely with other teams, including Support, Customer Success, Migration, and Professional Services.
Technical Impact: The Site Reliability Engineer plays a crucial role in ensuring the reliability and performance of AgileEngine's SaaS services, with a significant impact on customer experience and satisfaction.
Growth Opportunities:
- Technical Growth: Develop your skills in AWS, EKS, Terraform, Helm, and other relevant technologies through hands-on experience, mentorship, and personalized growth roadmaps
- Leadership Growth: Demonstrate your ability to lead projects, mentor junior team members, and make critical decisions that impact the reliability and performance of AgileEngine's SaaS services
- Architecture & Design: Contribute to the design and implementation of scalable, reliable, and secure infrastructure solutions that support AgileEngine's growing portfolio of SaaS services
📝 Enhancement Note: AgileEngine offers mid-level Site Reliability Engineers the opportunity to grow technically, take on leadership roles, and contribute to the design and implementation of critical infrastructure solutions that support the company's SaaS services.
🌐 Work Environment
Office Type: AgileEngine's office is a modern, collaborative workspace designed to foster creativity, innovation, and teamwork. The office features open-plan workspaces, meeting rooms, and relaxation areas to support a productive and enjoyable work environment.
Office Location(s): Cali, West Kalimantan, Indonesia
Workspace Context:
- Collaborative Workspace: The office is designed to encourage collaboration and communication between team members, with open-plan workspaces and dedicated meeting rooms for team discussions and brainstorming sessions
- Technical Infrastructure: AgileEngine provides its employees with access to modern hardware, software, and tools to ensure they have everything they need to succeed in their roles
- Cross-Functional Collaboration: The office is designed to facilitate cross-functional collaboration between different teams, with dedicated spaces for designers, developers, and project managers to work together on projects
Work Schedule: The hybrid work arrangement lets employees split their time between home and the office, with a focus on maintaining a healthy work-life balance.
📝 Enhancement Note: AgileEngine's modern, collaborative workspace and flexible work arrangements support a productive and enjoyable work environment for mid-level Site Reliability Engineers looking to grow their careers in the software development industry.
📄 Application & Technical Interview Process
Interview Process:
- Online Assessment: Complete an online assessment to evaluate your technical skills and problem-solving abilities
- Technical Phone Screen: Participate in a technical phone screen to discuss your experience, skills, and career goals with an AgileEngine recruiter
- On-site Technical Interview: Attend an on-site technical interview with the Site Reliability Engineering team, where you will be asked to perform hands-on tasks, discuss your portfolio, and answer technical questions related to AWS, EKS, Terraform, Helm, and other relevant technologies
- Final Interview: Meet with the Site Reliability Engineering Manager and other team members to discuss your career goals, cultural fit, and next steps in the interview process
Portfolio Review Tips:
- Highlight your experience with AWS, EKS, Terraform, Helm, and other relevant technologies through real-world examples and case studies
- Demonstrate your ability to manage alerts, perform RCA, and take corrective actions through specific examples and anecdotes
- Showcase your scripting skills (Bash, Python) and problem-solving approach in your portfolio projects
- Include examples of your collaboration and communication skills, demonstrating your ability to work effectively with teams
Technical Challenge Preparation:
- Brush up on AWS, EKS, Terraform, Helm, and other relevant technologies through online tutorials, documentation, and hands-on practice
- Familiarize yourself with AgileEngine's development methodologies, tools, and company culture to ensure a strong fit with the team and organization
- Prepare for hands-on tasks and technical questions related to AWS, EKS, Terraform, Helm, and other relevant technologies, with a focus on problem-solving, collaboration, and customer-centric approaches
📝 Enhancement Note: AgileEngine's interview process focuses on evaluating candidates' technical skills, problem-solving abilities, and cultural fit within the organization. By preparing for hands-on tasks, technical questions, and portfolio review, candidates can demonstrate their qualifications and increase their chances of success in the interview process.
🛠 Technology Stack & Web Infrastructure
Backend & Server Technologies:
- AWS: AgileEngine's infrastructure is built on AWS, with a focus on using managed services and infrastructure as code (IaC) to ensure reliability, scalability, and security
- EKS: AgileEngine uses Amazon Elastic Kubernetes Service (EKS) to manage and deploy containerized applications at scale
- Terraform: AgileEngine uses Terraform to provision and manage infrastructure as code (IaC), ensuring consistency, version control, and automated deployment
- Helm: AgileEngine uses Helm to package, configure, and deploy applications on Kubernetes clusters, with a focus on automation, version control, and dependency management
Development & DevOps Tools:
- Docker: AgileEngine uses Docker to containerize applications and ensure consistent deployment across different environments
- Git: AgileEngine uses Git for version control, collaboration, and code review, with a focus on branching, merging, and pull request workflows
- Jira: AgileEngine uses Jira for project management, issue tracking, and collaboration, with a focus on Agile methodologies and iterative development
- Confluence: AgileEngine uses Confluence for documentation, knowledge sharing, and collaboration, with a focus on maintaining up-to-date and accessible information for team members
- Bitbucket: AgileEngine uses Bitbucket for version control, code review, and collaboration, with a focus on Git workflows and continuous integration/continuous deployment (CI/CD) pipelines
📝 Enhancement Note: AgileEngine's technology stack focuses on AWS, EKS, Terraform, Helm, and other relevant technologies to ensure the reliability, scalability, and security of the company's SaaS services. By leveraging infrastructure as code (IaC), containerization, and automation, AgileEngine's development and DevOps teams can work efficiently and effectively to deliver high-quality software solutions.
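To make the Terraform, EKS, and Helm flow described above more concrete, here is a hedged Python sketch that only assembles the command lines for one possible deploy sequence; it does not run them. The function name and every argument value (cluster, region, release, chart path, namespace, values file) are placeholders invented for this example, and the actual pipeline used by the team is not described in this posting.

```python
import shlex


def build_deploy_commands(cluster, region, release, chart, namespace, values_file):
    """Return the CLI invocations for a plan/apply followed by a Helm release.

    Nothing is executed here; the caller decides how and where to run them.
    """
    return [
        # Provision or update infrastructure from the Terraform configuration.
        ["terraform", "init", "-input=false"],
        ["terraform", "plan", "-input=false", "-out=tfplan"],
        ["terraform", "apply", "-input=false", "tfplan"],
        # Point kubectl/helm at the EKS cluster.
        ["aws", "eks", "update-kubeconfig", "--name", cluster, "--region", region],
        # Install the chart, or upgrade it if the release already exists.
        [
            "helm", "upgrade", "--install", release, chart,
            "--namespace", namespace, "--create-namespace",
            "-f", values_file, "--wait",
        ],
    ]


if __name__ == "__main__":
    commands = build_deploy_commands(
        cluster="example-cluster",      # placeholder
        region="us-west-2",             # placeholder
        release="example-app",          # placeholder
        chart="./charts/example-app",   # placeholder
        namespace="example",            # placeholder
        values_file="values-prod.yaml", # placeholder
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Saving the plan to a file and applying that exact file is a common way to make sure what was reviewed is what gets applied.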
👥 Team Culture & Values
Web Development Values:
- Customer-Centric: AgileEngine prioritizes customer-centric problem-solving, with a focus on understanding customer needs, expectations, and pain points to deliver high-quality SaaS services
- Collaboration: AgileEngine encourages collaboration and teamwork, with a focus on open communication, knowledge sharing, and collective problem-solving
- Continuous Learning: AgileEngine fosters a culture of continuous learning and improvement, with a focus on staying up-to-date with the latest technologies, best practices, and industry trends
- Innovation: AgileEngine values innovation and creativity, with a focus on exploring new technologies, approaches, and solutions to drive business growth and customer success
Collaboration Style:
- Cross-Functional Collaboration: AgileEngine encourages collaboration between different teams, with a focus on breaking down silos, sharing knowledge, and working together to achieve common goals
- Code Review Culture: AgileEngine prioritizes code review and pair programming, with a focus on ensuring code quality, knowledge sharing, and collective code ownership
- Knowledge Sharing: AgileEngine fosters a culture of knowledge sharing, with a focus on mentorship, training, and continuous learning opportunities for team members
📝 Enhancement Note: AgileEngine's web development values and collaboration style focus on customer-centric problem-solving, continuous learning, and innovation. By prioritizing collaboration, knowledge sharing, and collective problem-solving, AgileEngine's teams can work effectively together to deliver high-quality SaaS services and drive business growth.
⚡ Challenges & Growth Opportunities
Technical Challenges:
- Infrastructure Management: Manage and maintain AgileEngine's infrastructure, with a focus on reliability, scalability, and security
- Alert Management: Monitor and manage alerts to minimize downtime and keep SaaS services performing well
- On-Call Support: Provide 24x7 on-call support for critical SaaS events so that customers experience minimal downtime
- Root Cause Analysis (RCA): Perform RCA and take corrective actions to prevent issue recurrence
- Automation & Optimization: Automate manual tasks and optimize infrastructure for performance and efficiency
Learning & Development Opportunities:
- Technical Skills: Develop your skills in AWS, EKS, Terraform, Helm, and other relevant technologies through hands-on experience, mentorship, and personalized growth roadmaps
- Leadership Skills: Demonstrate your ability to lead projects, mentor junior team members, and make critical decisions that impact the reliability and performance of AgileEngine's SaaS services
- Architecture & Design: Contribute to the design and implementation of scalable, reliable, and secure infrastructure solutions that support AgileEngine's growing portfolio of SaaS services
📝 Enhancement Note: AgileEngine offers mid-level Site Reliability Engineers the opportunity to tackle technical challenges, develop their skills, and contribute to the design and implementation of critical infrastructure solutions that support the company's SaaS services.
💡 Interview Preparation
Technical Questions:
- AWS: Describe your experience with AWS, including specific services such as S3, RDS, Lambda, and others. Explain how you've used AWS to manage and deploy applications at scale
- EKS: Explain your experience with EKS, including cluster management, deployment, and scaling. Describe how you've used EKS to manage and deploy containerized applications at scale
- Terraform: Describe your experience with Terraform, including infrastructure as code (IaC), version control, and automated deployment. Explain how you've used Terraform to manage and provision infrastructure at scale
- Helm: Explain your experience with Helm, including package management, dependency resolution, and deployment. Describe how you've used Helm to package, configure, and deploy applications on Kubernetes clusters
- Problem-Solving: Describe your approach to problem-solving, with a focus on root cause analysis, corrective actions, and preventive measures. Provide specific examples of how you've applied this approach in previous roles
Company & Culture Questions:
- Customer-Centric: Explain how you prioritize customer-centric problem-solving in your work, with a focus on understanding customer needs, expectations, and pain points
- Collaboration: Describe your approach to collaboration, with a focus on open communication, knowledge sharing, and collective problem-solving. Provide specific examples of how you've worked effectively with teams in previous roles
- Continuous Learning: Explain how you stay up-to-date with the latest technologies, best practices, and industry trends. Describe how you've applied this knowledge to drive business growth and customer success in previous roles
Portfolio Presentation Strategy:
- Project Selection: Choose projects that demonstrate your experience with AWS, EKS, Terraform, Helm, and other relevant technologies. Highlight your ability to manage alerts, perform RCA, and take corrective actions through real-world examples and case studies
- Storytelling: Use storytelling techniques to engage the interviewer and highlight your problem-solving approach, technical skills, and collaboration abilities
- Demonstration: Prepare a live demonstration of your portfolio projects, with a focus on walking the interviewer through your code, architecture, and problem-solving approach
📝 Enhancement Note: AgileEngine's interview process focuses on evaluating candidates' technical skills, problem-solving abilities, and cultural fit within the organization. By preparing for technical questions, company and culture questions, and portfolio presentation, candidates can demonstrate their qualifications and increase their chances of success in the interview process.
📌 Application Steps
To apply for this Site Reliability Engineer (Middle) position at AgileEngine:
- Submit Your Application: Click on the application link and complete the online application form
- Prepare Your Portfolio: Tailor your portfolio to highlight your experience with AWS, EKS, Terraform, Helm, and other relevant technologies. Include real-world examples and case studies that demonstrate your ability to manage alerts, perform RCA, and take corrective actions
- Optimize Your Resume: Highlight your technical skills, problem-solving approach, and collaboration abilities in your resume. Include relevant keywords and phrases to optimize your resume for Applicant Tracking System (ATS) screening
- Prepare for Technical Interview: Brush up on your technical skills, problem-solving abilities, and collaboration abilities. Prepare for hands-on tasks, technical questions, and portfolio review
- Research AgileEngine: Familiarize yourself with AgileEngine's company culture, development methodologies, and technology stack. Prepare for company and culture questions, with a focus on demonstrating your cultural fit within the organization
📝 Enhancement Note: AgileEngine's application process focuses on evaluating candidates' technical skills, problem-solving abilities, and cultural fit within the organization. By preparing a portfolio, optimizing a resume, and researching AgileEngine, candidates can demonstrate their qualifications and increase their chances of success in the application process.
Application Requirements
Candidates must have 2+ years of professional experience, hands-on experience with AWS, and a strong understanding of web technologies. Proficiency in scripting languages and experience with monitoring solutions are also required.