Site Reliability Engineer

Resident
Full-time · Tel Aviv-Yafo, Israel

📍 Job Overview

  • Job Title: Site Reliability Engineer
  • Company: Resident
  • Location: Tel Aviv-Yafo, Tel Aviv, Israel
  • Job Type: Full-time
  • Category: DevOps Engineer
  • Date Posted: June 24, 2025
  • Experience Level: Mid-Senior level (5-10 years)
  • Remote Status: On-site (Remote OK for specific locations)

🚀 Role Summary

  • Key Responsibilities: Ensure reliability, performance, and scalability of back-office solutions, develop SRE capabilities, and establish effective monitoring systems.
  • Key Skills: Site Reliability Engineering, DevOps, E-commerce flows, Automation, Software Development, Monitoring, Observability, Scripting, AWS, Infrastructure Provisioning, Problem Solving, Collaboration, English Proficiency.

📝 Enhancement Note: This role requires a strong background in Site Reliability Engineering and DevOps, with a focus on E-commerce back-office operations and order processing. Proficiency in automation, monitoring platforms, and AWS services is crucial for success in this position.

💻 Primary Responsibilities

  • SRE Capabilities Development: Develop and implement SRE capabilities to enhance the reliability, availability, and performance of Admin solutions.
  • Proactive Monitoring: Design and maintain proactive monitoring and alerting systems for deep visibility into critical business flows.
  • SDLC Improvement: Drive improvements in the Software Development Lifecycle (SDLC) for reliability and scalability from design to deployment.
  • Incident Management: Collaborate with development and operations teams to troubleshoot production incidents affecting the purchase flow and perform root cause analysis.
  • SRE Initiatives: Lead SRE initiatives to boost system resilience and operational efficiency.
  • Incident Management Processes: Implement best practices for incident management and conduct blameless post-mortems. Contribute to capacity planning and performance testing to ensure scalability.

📝 Enhancement Note: This role involves a high level of technical responsibility, requiring strong problem-solving skills and the ability to work effectively with cross-functional teams.

🎓 Skills & Qualifications

Education: Bachelor's degree in Computer Science, Engineering, or a related field. Relevant experience may be considered in lieu of a degree.

Experience: 5+ years of experience as a Site Reliability/DevOps Engineer, with a strong understanding of E-commerce flows, particularly back-office operations and order processing.

Required Skills:

  • Deep understanding of E-commerce flows, particularly back-office operations and order processing.
  • Experience as an Automation/Software Engineer, with a strong understanding of software development principles and hands-on experience building, testing, and deploying distributed systems.
  • Experience in designing, implementing, and utilizing monitoring and observability platforms such as Datadog, New Relic, Prometheus/Grafana, or the ELK stack.
  • Proficiency in scripting and automation using languages such as Python or Java.
  • Ability to create dashboards, alerts, and insightful queries.
  • Experience with AWS services to build and operate scalable and resilient applications (e.g., EC2, ECS/EKS, RDS, S3, Lambda, CloudWatch).

Preferred Skills:

  • Experience in automating infrastructure provisioning, application deployments, and repetitive operational tasks.
  • Familiarity with CI/CD pipelines and version control systems (e.g., Git).

📝 Enhancement Note: Candidates should have a proven track record in Site Reliability Engineering and DevOps, with a strong focus on E-commerce back-office operations and order processing. Proficiency in automation, monitoring platforms, and AWS services is essential for success in this role.

📊 Web Portfolio & Project Requirements

Portfolio Essentials:

  • Demonstrate experience in designing, implementing, and maintaining monitoring and observability platforms.
  • Showcase projects that highlight your ability to ensure the reliability, performance, and scalability of back-office solutions.
  • Provide examples of your problem-solving skills and incident management processes.

Technical Documentation:

  • Document your approach to designing, implementing, and maintaining monitoring and alerting systems.
  • Explain your methodology for driving improvements in the Software Development Lifecycle (SDLC) for reliability and scalability.
  • Describe your incident management processes, including root cause analysis and post-mortem analysis.

📝 Enhancement Note: As this role focuses on Site Reliability Engineering and DevOps, your portfolio should emphasize your technical skills and experience in ensuring the reliability, performance, and scalability of back-office solutions.

💵 Compensation & Benefits

Salary Range: The salary for this role is estimated at ₪350,000 to ₪500,000 per year (approximately $105,000 to $150,000 USD), based on market research for Site Reliability Engineers in Tel Aviv with 5-10 years of experience.

Benefits:

  • Health, dental, and vision insurance
  • Retirement savings plan with company match
  • Flexible work arrangements
  • Professional development opportunities
  • Competitive compensation and benefits package

Working Hours: Full-time position with standard working hours. Flexibility for deployment windows, maintenance, and project deadlines as needed.

📝 Enhancement Note: The salary range above is an estimate. The actual salary may vary depending on the candidate's qualifications and the company's internal compensation structure.

🎯 Team & Company Context

Company Culture:

  • Industry: E-commerce and Direct-to-Consumer (DTC)
  • Company Size: Medium to Large (200-500 employees)
  • Founded: 2015
  • Team Structure: The DevOps team works closely with development and operations teams to ensure the reliability, performance, and scalability of back-office solutions. The team is responsible for designing, implementing, and maintaining the processes, methodologies, and technologies that support the development of Resident's platform.

Development Methodology:

  • Agile/Scrum methodologies and sprint planning for web projects
  • Code review, testing, and quality assurance practices
  • Deployment strategies, CI/CD pipelines, and server management

Company Website: www.resident.com

📝 Enhancement Note: Resident is an industry leader in the Direct-to-Consumer (e-commerce) space, with a strong focus on data and technology to create a competitive advantage for its brands. The company's mission is to build a best-in-class e-commerce platform that delivers a world-class customer experience.

📈 Career & Growth Analysis

Web Technology Career Level: Mid-Senior level Site Reliability Engineer, responsible for ensuring the reliability, performance, and scalability of back-office solutions, leading the development of SRE capabilities, and establishing effective monitoring systems.

Reporting Structure: This role reports directly to the Head of DevOps and works closely with development and operations teams.

Technical Impact: This role has a significant impact on the reliability, performance, and scalability of back-office solutions, ensuring a seamless customer experience throughout the shopping and post-purchase journey.

Growth Opportunities:

  • Technical Growth: Expand your expertise in Site Reliability Engineering, DevOps, and E-commerce back-office operations. Gain experience in leading SRE initiatives and driving improvements in the Software Development Lifecycle (SDLC).
  • Leadership Growth: Develop your leadership skills by working with cross-functional teams and contributing to capacity planning and performance testing. Prepare for potential team management or architecture decision-making roles in the future.
  • Career Progression: As a mid-senior level Site Reliability Engineer, you have the opportunity to progress to senior or principal-level roles, or transition into technical leadership or architecture positions within the organization.

📝 Enhancement Note: This role offers significant growth opportunities in technical expertise, leadership, and career progression within the Site Reliability Engineering and DevOps domains.

🌐 Work Environment

Office Type: On-site office with remote work flexibility for specific locations.

Office Location(s): Tel Aviv, Israel

Workspace Context:

  • Collaborative workspace with a diverse group of experts around the globe
  • Flexible work arrangements to support a healthy work-life balance
  • Opportunities for virtual collaboration and knowledge sharing with global teams

Work Schedule: Standard working hours with flexibility for deployment windows, maintenance, and project deadlines as needed.

📝 Enhancement Note: Resident offers a collaborative work environment with a diverse team of experts, fostering a culture of continuous improvement and innovation in the E-commerce space.

📄 Application & Technical Interview Process

Interview Process:

  1. Technical Phone Screen (30 minutes): A brief phone call to discuss your technical background, experience, and understanding of Site Reliability Engineering and DevOps principles.
  2. Technical Deep Dive (60 minutes): A deeper dive into your technical skills, focusing on your experience with E-commerce flows, back-office operations, and order processing. Expect questions on monitoring, observability, automation, and AWS services.
  3. Behavioral & Cultural Fit Interview (30 minutes): An interview to assess your problem-solving skills, collaboration, and cultural fit within the organization.
  4. Final Decision & Offer (TBD): A final decision on your application and an offer for the position, if applicable.

Portfolio Review Tips:

  • Highlight your experience in designing, implementing, and maintaining monitoring and alerting systems for E-commerce back-office operations.
  • Showcase your ability to drive improvements in the Software Development Lifecycle (SDLC) for reliability and scalability.
  • Demonstrate your incident management processes, including root cause analysis and post-mortem analysis.

Technical Challenge Preparation:

  • Brush up on your knowledge of E-commerce flows, back-office operations, and order processing.
  • Review your experience with monitoring, observability, automation, and AWS services.
  • Prepare for questions on your problem-solving skills, collaboration, and cultural fit within the organization.

ATS Keywords: Site Reliability Engineering, DevOps, E-commerce, Back-office Operations, Order Processing, Monitoring, Observability, Automation, AWS, Infrastructure Provisioning, Problem Solving, Collaboration, English Proficiency.

📝 Enhancement Note: The interview process for this role is designed to assess your technical skills, problem-solving abilities, and cultural fit within the organization. Be prepared to discuss your experience with E-commerce flows, back-office operations, and order processing, as well as your proficiency in monitoring, observability, automation, and AWS services.

🛠 Technology Stack & Web Infrastructure

Monitoring & Observability Tools:

  • Datadog
  • New Relic
  • Prometheus/Grafana
  • ELK stack

AWS Services:

  • EC2
  • ECS/EKS
  • RDS
  • S3
  • Lambda
  • CloudWatch

Infrastructure Provisioning & Automation Tools:

  • Terraform
  • Ansible
  • Jenkins
  • Git

📝 Enhancement Note: This role requires proficiency in monitoring and observability tools, AWS services, and infrastructure provisioning and automation tools. Familiarity with these technologies is essential for success in this position.

👥 Team Culture & Values

Web Development Values:

  • Reliability: Ensure the reliability, performance, and scalability of back-office solutions to support a seamless customer experience.
  • Collaboration: Work effectively with cross-functional teams to drive improvements in the Software Development Lifecycle (SDLC) and incident management processes.
  • Innovation: Continuously improve and optimize monitoring and alerting systems to enhance the customer experience and drive business growth.
  • Customer Focus: Understand and prioritize the needs of the customer, ensuring that back-office solutions support a world-class customer experience.

Collaboration Style:

  • Cross-functional integration between developers, operations, and stakeholders
  • Code review culture and pair programming practices
  • Knowledge sharing, technical mentoring, and continuous learning

📝 Enhancement Note: Resident values a collaborative work environment, with a focus on reliability, innovation, and customer focus. The company fosters a culture of continuous improvement and optimization in the E-commerce space.

⚡ Challenges & Growth Opportunities

Technical Challenges:

  • Ensuring the reliability, performance, and scalability of back-office solutions in a dynamic E-commerce environment.
  • Developing and implementing SRE capabilities to define Service Level Indicators (SLIs) and meet Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
  • Establishing effective monitoring systems to gain deep visibility into critical business flows and identify functional issues.

Learning & Development Opportunities:

  • Technical Skill Development: Expand your expertise in Site Reliability Engineering, DevOps, and E-commerce back-office operations. Explore emerging technologies and best practices in monitoring, observability, and automation.
  • Conference Attendance & Certification: Attend industry conferences, webinars, and workshops to stay up-to-date with the latest trends and best practices in Site Reliability Engineering and DevOps. Pursue relevant certifications to enhance your technical skills and credibility.
  • Technical Mentorship & Leadership Development: Seek mentorship opportunities from experienced Site Reliability Engineers and DevOps professionals within the organization. Develop your leadership skills by contributing to capacity planning, performance testing, and incident management processes.

📝 Enhancement Note: This role presents significant technical challenges and learning opportunities in Site Reliability Engineering, DevOps, and E-commerce back-office operations. By embracing these challenges and pursuing continuous learning, you can drive your technical growth and career progression within the organization.

💡 Interview Preparation

Technical Questions:

  • E-commerce Flows & Back-office Operations: Describe your experience with E-commerce flows, back-office operations, and order processing. How have you ensured the reliability, performance, and scalability of these systems in previous roles?
  • Monitoring & Observability: Explain your approach to designing, implementing, and maintaining monitoring and alerting systems. How have you gained deep visibility into critical business flows and identified functional issues in the past?
  • Incident Management: Walk through your incident management process, including root cause analysis and post-mortem analysis. Describe a challenging incident you've handled and the lessons you learned from the experience.

Company & Culture Questions:

  • Company Culture: How do you see yourself contributing to Resident's culture of continuous improvement and innovation in the E-commerce space?
  • Collaboration & Teamwork: Describe your experience working with cross-functional teams in previous roles. How have you driven improvements in the Software Development Lifecycle (SDLC) and incident management processes?
  • Customer Focus: Explain how you prioritize the needs of the customer in your work. How have you ensured that back-office solutions support a world-class customer experience in previous roles?

Portfolio Presentation Strategy:

  • Technical Walkthrough: Provide a detailed walkthrough of your experience in designing, implementing, and maintaining monitoring and alerting systems for E-commerce back-office operations.
  • Incident Management Case Study: Present a case study of a challenging incident you've handled, highlighting your incident management process, root cause analysis, and post-mortem analysis.
  • Customer Impact: Explain how your work has contributed to a seamless customer experience and driven business growth in previous roles.

Application Requirements

Candidates should have 5+ years of experience in Site Reliability or DevOps Engineering, with a strong understanding of E-commerce flows and back-office operations. Proficiency in automation, monitoring platforms, and AWS services is essential, along with excellent problem-solving and collaboration skills.