Site Reliability Engineer
📍 Job Overview
- Job Title: Site Reliability Engineer
- Company: Resident
- Location: Tel Aviv-Yafo, Tel Aviv, Israel
- Job Type: Full-time, On-site
- Category: DevOps Engineer
- Date Posted: June 24, 2025
- Experience Level: Mid-Senior level (5-10 years)
- Remote Status: On-site (Remote OK for specific locations)
🚀 Role Summary
- Key Responsibilities: Ensure reliability, performance, and scalability of back-office solutions, develop SRE capabilities, and establish effective monitoring systems.
- Key Skills: Site Reliability Engineering, DevOps, E-commerce flows, Automation, Software Development, Monitoring, Observability, Scripting, AWS, Infrastructure Provisioning, Problem Solving, Collaboration, English Proficiency.
📝 Enhancement Note: This role requires a strong background in Site Reliability Engineering and DevOps, with a focus on E-commerce back-office operations and order processing. Proficiency in automation, monitoring platforms, and AWS services is crucial for success in this position.
💻 Primary Responsibilities
- SRE Capabilities Development: Develop and implement SRE capabilities to enhance the reliability, availability, and performance of Admin solutions.
- Proactive Monitoring: Design and maintain proactive monitoring and alerting systems for deep visibility into critical business flows.
- SDLC Improvement: Drive improvements in the Software Development Lifecycle (SDLC) for reliability and scalability from design to deployment.
- Incident Management: Collaborate with development and operations teams to troubleshoot production incidents affecting the purchase flow and perform root cause analysis.
- SRE Initiatives: Lead SRE initiatives to boost system resilience and operational efficiency.
- Incident Management Processes: Implement incident management best practices and conduct blameless post-mortems. Contribute to capacity planning and performance testing to ensure scalability.
📝 Enhancement Note: This role involves a high level of technical responsibility, requiring strong problem-solving skills and the ability to work effectively with cross-functional teams.
🎓 Skills & Qualifications
Education: Bachelor's degree in Computer Science, Engineering, or a related field. Relevant experience may be considered in lieu of a degree.
Experience: 5+ years of experience as a Site Reliability/DevOps Engineer, with a strong understanding of E-commerce flows, specifically back-office operations and order processing.
Required Skills:
- Deep understanding of E-commerce flows, specifically back-office operations and order processing.
- Experience as an Automation/Software Engineer, with a strong understanding of software development principles and hands-on experience building, testing, and deploying distributed systems.
- Experience in designing, implementing, and using monitoring and observability platforms such as Datadog, New Relic, Prometheus/Grafana, or the ELK stack.
- Proficiency in scripting and automation using languages such as Python or Java.
- Ability to create dashboards, alerts, and insightful queries.
- Experience with AWS services to build and operate scalable and resilient applications (e.g., EC2, ECS/EKS, RDS, S3, Lambda, CloudWatch).
Preferred Skills:
- Experience in automating infrastructure provisioning, application deployments, and repetitive operational tasks.
- Familiarity with CI/CD pipelines and version control systems (e.g., Git).
📝 Enhancement Note: Candidates should have a proven track record in Site Reliability Engineering and DevOps, with a strong focus on E-commerce back-office operations and order processing. Proficiency in automation, monitoring platforms, and AWS services is essential for success in this role.
📊 Web Portfolio & Project Requirements
Portfolio Essentials:
- Demonstrate experience in designing, implementing, and maintaining monitoring and observability platforms.
- Showcase projects that highlight your ability to ensure the reliability, performance, and scalability of back-office solutions.
- Provide examples of your problem-solving skills and incident management processes.
Technical Documentation:
- Document your approach to designing, implementing, and maintaining monitoring and alerting systems.
- Explain your methodology for driving improvements in the Software Development Lifecycle (SDLC) for reliability and scalability.
- Describe your incident management processes, including root cause analysis and post-mortem analysis.
📝 Enhancement Note: As this role focuses on Site Reliability Engineering and DevOps, your portfolio should emphasize your technical skills and experience in ensuring the reliability, performance, and scalability of back-office solutions.
💵 Compensation & Benefits
Salary Range: The salary for this role is estimated at ₪350,000 to ₪500,000 per year (approximately $105,000 to $150,000 USD), based on market research for Site Reliability Engineers in Tel Aviv with 5-10 years of experience.
Benefits:
- Health, dental, and vision insurance
- Retirement savings plan with company match
- Flexible work arrangements
- Professional development opportunities
- Competitive compensation and benefits package
Working Hours: Full-time position with standard working hours. Flexibility for deployment windows, maintenance, and project deadlines as needed.
📝 Enhancement Note: The salary range provided is an estimate based on market research for Site Reliability Engineers in Tel Aviv with 5-10 years of experience. The actual salary may vary depending on the candidate's qualifications and the company's internal compensation structure.
🎯 Team & Company Context
Company Culture:
- Industry: E-commerce and Direct-to-Consumer (DTC)
- Company Size: Medium to Large (200-500 employees)
- Founded: 2015
- Team Structure: The DevOps team works closely with development and operations teams to ensure the reliability, performance, and scalability of back-office solutions. The team is responsible for designing, implementing, and maintaining the processes, methodologies, and technologies that support the development of Resident's platform.
Development Methodology:
- Agile/Scrum methodologies and sprint planning for web projects
- Code review, testing, and quality assurance practices
- Deployment strategies, CI/CD pipelines, and server management
Company Website: www.resident.com
📝 Enhancement Note: Resident is an industry leader in the Direct-to-Consumer (e-commerce) space, with a strong focus on data and technology to create a competitive advantage for its brands. The company's mission is to build a best-in-class e-commerce platform that delivers a world-class customer experience.
📈 Career & Growth Analysis
Web Technology Career Level: Mid-Senior level Site Reliability Engineer, responsible for ensuring the reliability, performance, and scalability of back-office solutions, leading the development of SRE capabilities, and establishing effective monitoring systems.
Reporting Structure: This role reports directly to the Head of DevOps and works closely with development and operations teams.
Technical Impact: This role has a significant impact on the reliability, performance, and scalability of back-office solutions, ensuring a seamless customer experience throughout the shopping and post-purchase journey.
Growth Opportunities:
- Technical Growth: Expand your expertise in Site Reliability Engineering, DevOps, and E-commerce back-office operations. Gain experience in leading SRE initiatives and driving improvements in the Software Development Lifecycle (SDLC).
- Leadership Growth: Develop your leadership skills by working with cross-functional teams and contributing to capacity planning and performance testing. Prepare for potential team management or architecture decision-making roles in the future.
- Career Progression: As a mid-senior level Site Reliability Engineer, you have the opportunity to progress to senior or principal-level roles, or transition into technical leadership or architecture positions within the organization.
📝 Enhancement Note: This role offers significant growth opportunities in technical expertise, leadership, and career progression within the Site Reliability Engineering and DevOps domains.
🌐 Work Environment
Office Type: On-site office with remote work flexibility for specific locations.
Office Location(s): Tel Aviv, Israel
Workspace Context:
- Collaborative workspace with a diverse group of experts around the globe
- Flexible work arrangements to support a healthy work-life balance
- Opportunities for virtual collaboration and knowledge sharing with global teams
Work Schedule: Standard working hours with flexibility for deployment windows, maintenance, and project deadlines as needed.
📝 Enhancement Note: Resident offers a collaborative work environment with a diverse team of experts, fostering a culture of continuous improvement and innovation in the E-commerce space.
📄 Application & Technical Interview Process
Interview Process:
- Technical Phone Screen (30 minutes): A brief phone call to discuss your technical background, experience, and understanding of Site Reliability Engineering and DevOps principles.
- Technical Deep Dive (60 minutes): A deeper dive into your technical skills, focusing on your experience with E-commerce flows, back-office operations, and order processing. Expect questions on monitoring, observability, automation, and AWS services.
- Behavioral & Cultural Fit Interview (30 minutes): An interview to assess your problem-solving skills, collaboration, and cultural fit within the organization.
- Final Decision & Offer (TBD): A final decision on your application and an offer for the position, if applicable.
Portfolio Review Tips:
- Highlight your experience in designing, implementing, and maintaining monitoring and alerting systems for E-commerce back-office operations.
- Showcase your ability to drive improvements in the Software Development Lifecycle (SDLC) for reliability and scalability.
- Demonstrate your incident management processes, including root cause analysis and post-mortem analysis.
Technical Challenge Preparation:
- Brush up on your knowledge of E-commerce flows, back-office operations, and order processing.
- Review your experience with monitoring, observability, automation, and AWS services.
- Prepare for questions on your problem-solving skills, collaboration, and cultural fit within the organization.
ATS Keywords: Site Reliability Engineering, DevOps, E-commerce, Back-office Operations, Order Processing, Monitoring, Observability, Automation, AWS, Infrastructure Provisioning, Problem Solving, Collaboration, English Proficiency.
📝 Enhancement Note: The interview process for this role is designed to assess your technical skills, problem-solving abilities, and cultural fit within the organization. Be prepared to discuss your experience with E-commerce flows, back-office operations, and order processing, as well as your proficiency in monitoring, observability, automation, and AWS services.
🛠 Technology Stack & Web Infrastructure
Monitoring & Observability Tools:
- Datadog
- New Relic
- Prometheus/Grafana
- ELK stack
AWS Services:
- EC2
- ECS/EKS
- RDS
- S3
- Lambda
- CloudWatch
Infrastructure Provisioning & Automation Tools:
- Terraform
- Ansible
- Jenkins
- Git
📝 Enhancement Note: This role requires proficiency in monitoring and observability tools, AWS services, and infrastructure provisioning and automation tools. Familiarity with these technologies is essential for success in this position.
👥 Team Culture & Values
Web Development Values:
- Reliability: Ensure the reliability, performance, and scalability of back-office solutions to support a seamless customer experience.
- Collaboration: Work effectively with cross-functional teams to drive improvements in the Software Development Lifecycle (SDLC) and incident management processes.
- Innovation: Continuously improve and optimize monitoring and alerting systems to enhance the customer experience and drive business growth.
- Customer Focus: Understand and prioritize the needs of the customer, ensuring that back-office solutions support a world-class customer experience.
Collaboration Style:
- Cross-functional integration between developers, operations, and stakeholders
- Code review culture and pair programming practices
- Knowledge sharing, technical mentoring, and continuous learning
📝 Enhancement Note: Resident values a collaborative work environment built on reliability, innovation, and customer focus. The company fosters a culture of continuous improvement and optimization in the E-commerce space.
⚡ Challenges & Growth Opportunities
Technical Challenges:
- Ensuring the reliability, performance, and scalability of back-office solutions in a dynamic E-commerce environment.
- Developing and implementing SRE capabilities to define Service Level Indicators (SLIs) and to meet Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
- Establishing effective monitoring systems to gain deep visibility into critical business flows and identify functional issues.
Learning & Development Opportunities:
- Technical Skill Development: Expand your expertise in Site Reliability Engineering, DevOps, and E-commerce back-office operations. Explore emerging technologies and best practices in monitoring, observability, and automation.
- Conference Attendance & Certification: Attend industry conferences, webinars, and workshops to stay up to date with the latest trends and best practices in Site Reliability Engineering and DevOps. Pursue relevant certifications to enhance your technical skills and credibility.
- Technical Mentorship & Leadership Development: Seek mentorship opportunities from experienced Site Reliability Engineers and DevOps professionals within the organization. Develop your leadership skills by contributing to capacity planning, performance testing, and incident management processes.
📝 Enhancement Note: This role presents significant technical challenges and learning opportunities in Site Reliability Engineering, DevOps, and E-commerce back-office operations. By embracing these challenges and pursuing continuous learning, you can drive your technical growth and career progression within the organization.
💡 Interview Preparation
Technical Questions:
- E-commerce Flows & Back-office Operations: Describe your experience with E-commerce flows, back-office operations, and order processing. How have you ensured the reliability, performance, and scalability of these systems in previous roles?
- Monitoring & Observability: Explain your approach to designing, implementing, and maintaining monitoring and alerting systems. How have you gained deep visibility into critical business flows and identified functional issues in the past?
- Incident Management: Walk through your incident management process, including root cause analysis and post-mortem analysis. Describe a challenging incident you've handled and the lessons you learned from the experience.
Company & Culture Questions:
- Company Culture: How do you see yourself contributing to Resident's culture of continuous improvement and innovation in the E-commerce space?
- Collaboration & Teamwork: Describe your experience working with cross-functional teams in previous roles. How have you driven improvements in the Software Development Lifecycle (SDLC) and incident management processes?
- Customer Focus: Explain how you prioritize the needs of the customer in your work. How have you ensured that back-office solutions support a world-class customer experience in previous roles?
Portfolio Presentation Strategy:
- Technical Walkthrough: Provide a detailed walkthrough of your experience in designing, implementing, and maintaining monitoring and alerting systems for E-commerce back-office operations.
- Incident Management Case Study: Present a case study of a challenging incident you've handled, highlighting your incident management process, root cause analysis, and post-mortem analysis.
- Customer Impact: Explain how your work has contributed to a seamless customer experience and driven business growth in previous roles.
Application Requirements
Candidates should have 5+ years of experience in Site Reliability or DevOps Engineering, with a strong understanding of E-commerce flows and back-office operations. Proficiency in automation, monitoring platforms, and AWS services is essential, along with excellent problem-solving and collaboration skills.