Site Reliability Engineer (Remote - MX)

Whitestack
Full-time

📝 Job Overview

  • Job Title: Site Reliability Engineer (Remote - MX)
  • Company: Whitestack
  • Location: Santiago, Región Metropolitana, Chile
  • Job Type: Full-Time
  • Category: DevOps, Site Reliability Engineering
  • Date Posted: 2025-06-23
  • Experience Level: Mid-Senior Level (2-5 years)
  • Remote Status: Remote (Mexico, Chile, Argentina, Colombia, Uruguay, Peru)

🚀 Role Summary

  • Key Responsibilities: Design, implement, and optimize monitoring solutions for cloud infrastructures. Ensure the stability, availability, and performance of production cloud environments.
  • Key Technologies: Kubernetes, OpenStack, Prometheus, Grafana, Elasticsearch, Kibana, Bash, Python, Linux, Docker

💻 Primary Responsibilities

  • Design, Implement, and Optimize Monitoring Solutions: Develop and maintain monitoring solutions for cloud infrastructures, ensuring optimal performance and minimal downtime.
  • Define and Implement Dashboards: Create and manage dashboards to visualize critical performance indicators, enabling proactive issue resolution.
  • Ensure Cloud Platform Stability: Maintain and optimize cloud platforms, especially those based on Kubernetes and OpenStack, to guarantee high availability and reliability.
  • Handle and Escalate Critical Incidents: Troubleshoot and resolve critical incidents, collaborating with senior engineers and product development teams as needed.
  • Manage Development and Testing Environments: Oversee and administer development and testing environments to ensure smooth deployment processes.
  • Develop and Operate CI/CD Pipelines: Create and maintain CI/CD pipelines for monitoring and updating images in production environments.

🎓 Skills & Qualifications

Education: Bachelor's degree in Computer Engineering, Computer Science, or a related field.

Experience:

  • At least 3 years of experience in managing, monitoring, and optimizing cloud infrastructures, with a focus on Kubernetes and OpenStack in production environments.
  • Proven experience in designing and implementing monitoring solutions and incident management.

Required Skills:

  • Proficiency in monitoring tools: Prometheus, Grafana, Elasticsearch, Kibana.
  • Experience with Kubernetes cluster administration and operation.
  • Intermediate-level Linux administration and Docker usage.
  • Intermediate-level Bash and Python automation skills.
  • Intermediate-level English proficiency (reading and writing).

Preferred Skills:

  • Experience with cloud platforms (AWS, GCP, Azure, or OpenStack).
  • Familiarity with agile methodologies (Scrum, Kanban).
  • Ability to adapt open-source tools.
  • Certifications in Linux, Kubernetes, or OpenStack.
  • Contributions or integration with open-source projects.
  • Basic networking knowledge.

📊 Web Portfolio & Project Requirements

Portfolio Essentials:

  • Demonstrate experience in cloud infrastructure monitoring and management, with a focus on Kubernetes and OpenStack.
  • Showcase incident management and problem-solving skills through case studies or examples.
  • Highlight proficiency in monitoring tools (Prometheus, Grafana, Elasticsearch, Kibana) and CI/CD pipeline development.

Technical Documentation:

  • Document code quality, commenting, and documentation standards for cloud infrastructure management and monitoring.
  • Explain version control, deployment processes, and server configuration for cloud environments.
  • Describe testing methodologies, performance metrics, and optimization techniques for cloud infrastructure.

💵 Compensation & Benefits

Salary Range: The estimated salary range for a Site Reliability Engineer with 2-5 years of experience in Santiago, Chile is CLP 3,500,000 - CLP 5,000,000 per year (USD 5,000 - USD 7,200 per year). This estimate is based on regional market research and industry standards for similar roles.

Benefits:

  • Private Medical Insurance
  • Access to Courses
  • Books
  • Certification Reimbursement
  • Language Courses
  • Technology Equipment Renewal
  • Performance Bonuses
  • Minimum 15 Days Vacation
  • Extra Day Off for Birthday
  • Extra Breaks Before Holidays
  • Budget for Recreational Activities
  • Innovation Culture

Working Hours: Full-time position with a standard 40-hour workweek. Flexible deployment windows and maintenance schedules may be required.

🎯 Team & Company Context

Company Culture:

  • Industry: Whitestack is a leading company in Latin America specializing in cloud solutions and hyper-scalable digital infrastructure. They work with open-source technology and industry-leading standards to drive digital transformation across the region.
  • Company Size: Whitestack is a mid-sized company with a strong focus on innovation, collaboration, and personal development.
  • Founded: Whitestack was founded with a mission to empower businesses through cutting-edge technology and digital transformation.

Team Structure:

  • The Site Reliability Engineering team is responsible for designing, implementing, and maintaining monitoring solutions for cloud infrastructures. They collaborate closely with senior engineers, product development teams, and other cross-functional teams to ensure optimal performance and minimal downtime.
  • The team follows Agile methodologies, such as Scrum or Kanban, to manage projects and ensure efficient workflows.

Development Methodology:

  • Whitestack employs Agile development methodologies, such as Scrum or Kanban, to manage projects and ensure efficient workflows.
  • The company uses code reviews, testing, and quality assurance practices to maintain high code quality and performance.
  • Whitestack employs CI/CD pipelines and automated deployment strategies to streamline the development and release process.

Company Website: whitestack.com

📝 Enhancement Note: Whitestack's focus on open-source technology and industry-leading standards enables the company to attract and retain top talent in the cloud and infrastructure management space. The company's commitment to innovation, collaboration, and personal development fosters a dynamic and engaging work environment for Site Reliability Engineers.

📈 Career & Growth Analysis

Career Level: The Site Reliability Engineer role at Whitestack is a mid-senior level position that requires a strong background in cloud infrastructure management, monitoring, and optimization. This role offers significant opportunities for growth and leadership within the organization.

Reporting Structure: Site Reliability Engineers report directly to the Engineering Manager or a similar role, depending on the specific team structure. They collaborate closely with senior engineers, product development teams, and other cross-functional teams to ensure optimal performance and minimal downtime.

Technical Impact: Site Reliability Engineers at Whitestack play a critical role in ensuring the stability, availability, and performance of the company's cloud platforms. Their work directly impacts the user experience and the overall success of the organization's digital transformation initiatives.

Growth Opportunities:

  • Technical Growth: Whitestack offers opportunities for Site Reliability Engineers to specialize in specific cloud technologies, monitoring tools, or infrastructure management areas. This allows professionals to deepen their expertise and stay up-to-date with the latest industry trends.
  • Leadership Development: As the company continues to grow, there are opportunities for Site Reliability Engineers to take on leadership roles, mentoring junior team members and contributing to strategic decision-making processes.
  • Architecture and Design: With experience, Site Reliability Engineers can transition into architecture roles, designing and implementing large-scale cloud infrastructure solutions that meet the organization's evolving needs.

📝 Enhancement Note: Whitestack's commitment to innovation and continuous learning provides Site Reliability Engineers with ample opportunities to grow both technically and professionally. The company's focus on open-source technology and industry-leading standards enables engineers to work on cutting-edge projects and stay at the forefront of the cloud and infrastructure management industry.

🌐 Work Environment

Office Type: Whitestack offers a remote-friendly work environment, with the option to work from one of the company's offices in Latin America or from a home office.

Office Location(s): Whitestack has offices in Santiago, Chile; Buenos Aires, Argentina; Bogotá, Colombia; and São Paulo, Brazil. Remote work is available for candidates based in Mexico, Chile, Argentina, Colombia, Uruguay, and Peru.

Workspace Context:

  • Remote Work: Whitestack's remote work policy allows Site Reliability Engineers to work from home or from a co-working space, providing flexibility and autonomy in their daily routines.
  • Collaboration Tools: The company provides access to collaboration tools, such as Slack, Microsoft Teams, or Google Workspace, to facilitate communication and teamwork among remote team members.
  • Training and Development: Whitestack offers access to courses, books, and certification reimbursement to help Site Reliability Engineers stay up-to-date with the latest industry trends and technologies.

Work Schedule: Whitestack offers a flexible work schedule, with core hours between 10:00 AM and 4:00 PM (local time). The company encourages employees to maintain a healthy work-life balance and offers additional time off during national holidays and special events.

📝 Enhancement Note: Whitestack's remote-friendly work environment and commitment to employee well-being enable Site Reliability Engineers to maintain a healthy work-life balance while collaborating with team members across Latin America. The company's focus on training and development ensures that engineers have access to the resources and support they need to grow both personally and professionally.

📄 Application & Technical Interview Process

Interview Process:

  1. Technical Assessment: Candidates will be asked to complete a technical assessment, focusing on cloud infrastructure management, monitoring, and optimization. This may include hands-on exercises, case studies, or problem-solving challenges related to Kubernetes, OpenStack, or other relevant technologies.
  2. Technical Deep Dive: In this stage, candidates will be asked to discuss their approach to designing, implementing, and maintaining monitoring solutions for cloud infrastructures. This may include questions about incident management, performance optimization, and architecture decision-making.
  3. Behavioral and Cultural Fit Assessment: Candidates will participate in a behavioral interview to assess their cultural fit with the Whitestack team. This may include questions about their problem-solving skills, communication style, and teamwork abilities.
  4. Final Evaluation: In the final stage, candidates will be evaluated based on their technical skills, cultural fit, and overall potential for success in the role. This may include a discussion of the candidate's long-term career goals and expectations.

Portfolio Review Tips:

  • Technical Portfolio: Highlight your experience in cloud infrastructure management, monitoring, and optimization. Include case studies or examples that demonstrate your ability to design, implement, and maintain monitoring solutions for cloud platforms.
  • Incident Management: Showcase your incident management and problem-solving skills through case studies or examples that illustrate your ability to handle and resolve critical incidents in cloud environments.
  • Monitoring Tools: Demonstrate your proficiency in monitoring tools, such as Prometheus, Grafana, Elasticsearch, and Kibana, through examples or projects that showcase your ability to create and manage dashboards, visualize critical performance indicators, and ensure optimal cloud platform performance.

Technical Challenge Preparation:

  • Cloud Infrastructure Management: Brush up on your knowledge of cloud infrastructure management, with a focus on Kubernetes and OpenStack. Familiarize yourself with the latest best practices and industry trends in cloud monitoring and optimization.
  • Incident Management: Review your incident management and problem-solving skills, focusing on your ability to handle and resolve critical incidents in cloud environments. Prepare examples or case studies that demonstrate your approach to incident management and resolution.
  • Architecture Decision-Making: Familiarize yourself with Whitestack's architecture and design principles, focusing on the company's commitment to open-source technology and industry-leading standards. Prepare for questions about your approach to architecture decision-making and your ability to contribute to the organization's long-term success.

ATS Keywords:

  • Cloud Technologies: Kubernetes, OpenStack, AWS, GCP, Azure, cloud infrastructure management, cloud monitoring, cloud optimization.
  • Monitoring Tools: Prometheus, Grafana, Elasticsearch, Kibana, monitoring solutions, dashboard creation, performance visualization.
  • Languages & Tooling: Bash, Python, Linux, Docker, CI/CD pipelines, automation, scripting.
  • Soft Skills: Problem-solving, incident management, communication, teamwork, collaboration, adaptability, continuous learning.

📝 Enhancement Note: Whitestack's interview process is designed to assess candidates' technical skills, cultural fit, and overall potential for success in the Site Reliability Engineer role. The company's focus on open-source technology and industry-leading standards enables engineers to work on cutting-edge projects and contribute to the organization's long-term success.

🛠 Technology Stack & Web Infrastructure

Cloud Technologies:

  • Primary: Kubernetes, OpenStack
  • Secondary: AWS, GCP, Azure

Monitoring Tools:

  • Primary: Prometheus, Grafana, Elasticsearch, Kibana
  • Secondary: New Relic, Datadog, AppDynamics

Programming Languages:

  • Primary: Bash, Python
  • Secondary: JavaScript, TypeScript, Go, Java

Infrastructure Tools:

  • CI/CD Pipelines: Jenkins, GitLab CI/CD, CircleCI
  • Containerization: Docker, Kubernetes
  • Server Configuration: Terraform, Ansible, Puppet

📝 Enhancement Note: Whitestack's technology stack is designed to provide Site Reliability Engineers with the tools and resources they need to manage, monitor, and optimize cloud infrastructures effectively. The company's commitment to open-source technology and industry-leading standards enables engineers to work with cutting-edge technologies and contribute to the organization's long-term success.

👥 Team Culture & Values

Whitestack Values:

  • Innovation: Whitestack fosters a culture of innovation, encouraging Site Reliability Engineers to explore new technologies, tools, and methodologies to improve cloud infrastructure management and optimization.
  • Collaboration: The company emphasizes cross-functional collaboration, encouraging Site Reliability Engineers to work closely with senior engineers, product development teams, and other departments to ensure optimal performance and minimal downtime.
  • Continuous Learning: Whitestack prioritizes continuous learning, providing Site Reliability Engineers with access to courses, books, and certification reimbursement to help them stay up-to-date with the latest industry trends and technologies.
  • Customer Focus: The company maintains a strong focus on customer satisfaction, ensuring that Site Reliability Engineers prioritize the needs and expectations of the organization's clients.

Collaboration Style:

  • Cross-Functional Integration: Whitestack encourages Site Reliability Engineers to collaborate closely with senior engineers, product development teams, and other departments to ensure optimal performance and minimal downtime.
  • Code Review Culture: The company fosters a code review culture, encouraging Site Reliability Engineers to review and provide feedback on each other's work to maintain high code quality and performance.
  • Knowledge Sharing: Whitestack promotes knowledge sharing, encouraging Site Reliability Engineers to share their expertise and experience with team members and other departments to drive continuous learning and improvement.

📝 Enhancement Note: Whitestack's values and collaboration style enable Site Reliability Engineers to work effectively in a dynamic and engaging work environment. The company's commitment to innovation, collaboration, and continuous learning fosters a culture of growth and success for both individual engineers and the organization as a whole.

🌐 Challenges & Growth Opportunities

Technical Challenges:

  • Cloud Infrastructure Management: Site Reliability Engineers at Whitestack may face technical challenges related to cloud infrastructure management, monitoring, and optimization. These may include issues with scalability, performance, or security, requiring engineers to design and implement innovative solutions to ensure optimal cloud platform performance.
  • Incident Management: The role may present challenges related to incident management, requiring Site Reliability Engineers to handle and resolve critical incidents in cloud environments. This may involve working with senior engineers, product development teams, or other departments to identify and address the root causes of incidents and prevent their recurrence.
  • Emerging Technologies: As cloud technologies continue to evolve, Site Reliability Engineers may face challenges related to the adoption and integration of emerging technologies. This may require engineers to stay up-to-date with the latest industry trends and best practices, and to adapt their skills and knowledge to new tools and methodologies.

Learning & Development Opportunities:

  • Technical Skill Development: Whitestack offers opportunities for Site Reliability Engineers to develop and enhance their technical skills through access to courses, books, and certification reimbursement. This enables engineers to stay up-to-date with the latest industry trends and technologies, and to specialize in specific areas of cloud infrastructure management, monitoring, or optimization.
  • Leadership Development: As the company continues to grow, there are opportunities for Site Reliability Engineers to take on leadership roles, mentoring junior team members and contributing to strategic decision-making processes. This may involve working with senior engineers, product development teams, or other departments to define and implement the organization's long-term vision and goals.
  • Architecture and Design: With experience, Site Reliability Engineers can transition into architecture roles, designing and implementing large-scale cloud infrastructure solutions that meet the organization's evolving needs. This may involve working with senior engineers, product development teams, or other departments to define and implement the organization's long-term vision and goals.

📝 Enhancement Note: Whitestack's commitment to innovation, collaboration, and continuous learning provides Site Reliability Engineers with ample opportunities to grow both technically and professionally. The company's focus on open-source technology and industry-leading standards enables engineers to work on cutting-edge projects and contribute to the organization's long-term success.

💡 Interview Preparation

Technical Questions:

  • Cloud Infrastructure Management: Prepare for questions related to cloud infrastructure management, monitoring, and optimization. This may include questions about your experience with Kubernetes, OpenStack, or other relevant technologies, as well as your approach to designing, implementing, and maintaining monitoring solutions for cloud platforms.
  • Incident Management: Brush up on your incident management and problem-solving skills, focusing on your ability to handle and resolve critical incidents in cloud environments. Prepare examples or case studies that demonstrate your approach to incident management and resolution.

Company & Culture Questions:

  • Whitestack Culture: Research Whitestack's company culture, values, and mission. Prepare for questions about your alignment with the company's commitment to innovation, collaboration, and continuous learning.
  • Agile Methodologies: Familiarize yourself with Whitestack's Agile methodologies, such as Scrum or Kanban. Prepare for questions about your experience with Agile development processes and your ability to collaborate effectively with cross-functional teams.
  • User Experience Impact: Brush up on your understanding of user experience (UX) principles and their impact on cloud infrastructure management and optimization. Prepare for questions about your approach to ensuring optimal cloud platform performance and minimal downtime.

Portfolio Presentation Strategy:

  • Architecture Decision-Making: Demonstrate your understanding of Whitestack's architecture and design principles, focusing on the company's commitment to open-source technology and industry-leading standards. Prepare for questions about your approach to architecture decision-making and your ability to contribute to the organization's long-term success.

📌 Application Steps

To apply for this Site Reliability Engineer position at Whitestack:

  1. Update Your Resume: Tailor your resume to highlight your experience in cloud infrastructure management, monitoring, and optimization. Include specific examples or case studies that demonstrate your ability to design, implement, and maintain monitoring solutions for cloud platforms.
  2. Prepare Your Portfolio: Curate a portfolio that showcases your incident management and problem-solving skills, as well as your understanding of Whitestack's architecture and design principles. Include case studies or examples that illustrate your approach to architecture decision-making and your ability to contribute to the organization's long-term success.
  3. Practice Technical Challenges: Brush up on your knowledge of cloud infrastructure management, monitoring, and optimization. Familiarize yourself with Whitestack's technology stack and prepare for technical challenges related to cloud infrastructure management, incident management, and architecture decision-making.
  4. Research Whitestack: Learn about Whitestack's company culture, values, and mission. Prepare for questions about your alignment with the company's commitment to innovation, collaboration, and continuous learning. Familiarize yourself with Whitestack's Agile methodologies and be prepared to discuss your experience with Agile development processes and cross-functional team collaboration.

📝 Important Notice: This enhanced job description includes AI-generated insights and web technology industry-standard assumptions. All details should be verified directly with Whitestack before making application decisions.


Application Requirements

Bachelor's degree in Computer Engineering or a related field, with at least 3 years of experience in cloud infrastructure management. Proficiency in monitoring tools (Prometheus, Grafana, Elasticsearch, Kibana) and in cloud platforms such as Kubernetes and OpenStack is required.