Site Reliability Engineer, Release Engineering, Vice President
Job Overview
- Job Title: Site Reliability Engineer, Release Engineering, Vice President
- Company: BlackRock
- Location: Mumbai, Mahārāshtra, India
- Job Type: Hybrid (4 days in office, 1 day remote)
- Category: DevOps, Site Reliability Engineering
- Date Posted: 2025-06-24
- Experience Level: 10+ years
- Remote Status: Hybrid
Role Summary
- Key Responsibilities: Apply Site Reliability Engineering (SRE) principles to improve software release tools' reliability and availability. Contribute to the observability and stability of BlackRock's global trading platform, Aladdin.
- Key Skills: Observability tools (Grafana, Prometheus, Splunk, Datadog, AppDynamics), Java/Spring Framework, distributed applications, relational databases, Linux OS, TCP/IP, cloud platforms, microservices, APIs, agile development, DevOps, AI, Docker, Kubernetes.
- Nice to Have: Experience in scripting languages (Python, Golang), cloud deployment technology (Docker, Ansible, Terraform), optimization algorithms, and AI-related projects.
Primary Responsibilities
- Apply SRE Principles: Enhance software release tools' reliability and availability using SRE principles.
- Define and Refine Priorities: Based on performance and incident data, define and refine priorities to improve the platform's reliability and availability.
- Contribute to Observability and Stability: Significantly contribute to the observability and stability of Aladdin's global, multi-asset trading platform.
- Design and Develop Innovative Solutions: Identify issues and roadblocks, then design and develop innovative solutions to complex problems.
- Lead and Collaborate: Be a leader with vision, guiding and motivating others. Collaborate with partnering teams, sponsors, and user groups to drive a strong culture of inclusion and diversity.
Skills & Qualifications
Education: B.S. or M.S. degree in Computer Science, Engineering, or a related subject area.
Experience: 8+ years of experience in software engineering, with a strong background in Java, distributed applications, and observability tools.
Required Skills:
- Hands-on experience with observability tools (Grafana, Prometheus, Splunk, Datadog, AppDynamics, etc.)
- A background in setting, measuring, and achieving Service Level Objectives (SLOs) using Service Level Indicators (SLIs) and error budgets.
- Hands-on experience in Java/Spring Framework/Spring Boot.
- A track record of building high-quality software with design-focused and test-driven approaches.
- In-depth understanding of concurrent programming and experience in designing high throughput, high availability, fault-tolerant distributed applications.
- Understanding of relational databases, Linux OS, and TCP/IP fundamentals.
- Demonstrable experience building modern software using engineering tools such as Git, Maven, unit and integration testing tools, mocking frameworks, and CI/CD pipelines.
- Strong analytical, problem-solving, and communication skills.
- Some experience or interest in finance, investment processes, and translating business problems into technical solutions.
Preferred Skills:
- Experience in scripting languages such as Python and Golang.
- Expertise in building distributed applications using SQL and/or NoSQL technologies.
- Real-world experience applying cloud-native design patterns to event-driven microservice architectures.
- Exposure to high-scale distributed technologies such as Kafka, Mongo, Ignite, and Redis.
- Experience building microservices and APIs, ideally with REST, Kafka, or gRPC.
- Experience working in an agile development team or on open-source development projects.
- Experience with optimization, algorithms, or related quantitative processes.
- Experience with cloud platforms such as Microsoft Azure, AWS, or Google Cloud.
- Experience with cloud deployment technology (Docker, Ansible, Terraform, etc.).
- Experience with DevOps and tools like Azure DevOps.
- Experience with AI-related projects/products or working in an AI research environment.
- Exposure to Docker, Kubernetes, and cloud services.
- A degree, certifications, or open-source track record that shows mastery of software engineering principles.
Web Portfolio & Project Requirements
Portfolio Essentials:
- Demonstrate your proficiency in observability tools with case studies or projects showcasing your ability to set, measure, and achieve Service Level Objectives (SLOs).
- Highlight your experience in Java/Spring Framework/Spring Boot with examples of high-quality, test-driven software development.
- Showcase your understanding of distributed applications, concurrent programming, and fault-tolerant systems with relevant projects.
- Include examples of your problem-solving skills and ability to design innovative solutions to complex problems.
Technical Documentation:
- Provide examples of your code quality, commenting, and documentation standards.
- Include demonstrations of version control, deployment processes, and server configuration.
- Showcase testing methodologies, performance metrics, and optimization techniques used in your projects.
Compensation & Benefits
Salary Range: INR 2,500,000 - 3,500,000 per annum (Based on experience and market standards for Site Reliability Engineering roles in Mumbai)
Benefits:
- Strong retirement plan
- Tuition reimbursement
- Comprehensive healthcare
- Support for working parents
- Flexible Time Off (FTO)
Working Hours: 40 hours per week, with flexible remote work options (1 day remote, 4 days in the office)
Team & Company Context
Company Culture
Industry: Financial Services & Investment Management
Company Size: Large (Over 10,000 employees)
Founded: 1988
Team Structure:
- The Aladdin Engineering team resides inside the broader BlackRock organization.
- The team is structured with multiple sub-teams focusing on different aspects of Aladdin's development and maintenance.
- The team works collaboratively, with a strong emphasis on cross-functional collaboration with designers, marketers, and business teams.
Development Methodology:
- Agile/Scrum methodologies with sprint planning for web projects.
- Code review, testing, and quality assurance practices.
- Deployment strategies, CI/CD pipelines, and server management.
Company Website: BlackRock
Enhancement Note: BlackRock's culture is highly collaborative, with a strong emphasis on cross-functional teamwork and innovation. The company values diversity, inclusion, and continuous learning.
Career & Growth Analysis
Web Technology Career Level: Vice President, Release Engineering, Site Reliability Engineering
Reporting Structure: Reports directly to the Head of Aladdin Engineering or a similar leadership role within the organization.
Technical Impact: Significantly contributes to the reliability, availability, and performance of BlackRock's global trading platform, Aladdin. Works closely with development teams to ensure the platform's stability and scalability.
Growth Opportunities:
- Technical Growth: Develop expertise in cloud-native design patterns, event-driven microservice architectures, and AI-related projects.
- Leadership Growth: Gain experience leading development teams and projects, or taking responsibility for the design and technical quality of significant applications, systems, or components.
- Architecture Growth: Contribute to the platform's architecture and design decisions, driving innovation and continuous improvement.
Enhancement Note: BlackRock offers significant growth opportunities for technical and leadership development within the organization. The company values internal promotions and encourages employees to take on new challenges and responsibilities.
Work Environment
Office Type: Hybrid (4 days in the office, 1 day remote)
Office Location(s): Mumbai, India
Workspace Context:
- Collaborative workspaces designed to facilitate team interaction and knowledge sharing.
- Multiple monitors and testing devices available to support development and debugging tasks.
- Cross-functional collaboration opportunities with designers, marketers, and business teams.
Work Schedule: 40 hours per week, with flexible remote work options (1 day remote, 4 days in the office)
Enhancement Note: BlackRock's hybrid work model enables a balance between collaboration and flexibility, allowing employees to work from home one day a week while maintaining a strong collaborative culture in the office.
Application & Technical Interview Process
Interview Process:
- Online Assessment: A technical assessment focusing on coding, configuration, and problem-solving skills.
- Technical Deep Dive: A detailed discussion of your technical skills, architecture expectations, and system design approaches.
- Behavioral Interview: An evaluation of your cultural fit, communication skills, and problem-solving abilities.
- Final Evaluation: A comprehensive assessment of your technical impact, leadership potential, and alignment with the company's values.
Portfolio Review Tips:
- Observability Tools: Highlight your proficiency in observability tools with case studies or projects showcasing your ability to set, measure, and achieve Service Level Objectives (SLOs).
- Java/Spring Framework/Spring Boot: Demonstrate your experience in Java/Spring Framework/Spring Boot with examples of high-quality, test-driven software development.
- Distributed Applications: Showcase your understanding of distributed applications, concurrent programming, and fault-tolerant systems with relevant projects.
- Problem-Solving Skills: Highlight your ability to design innovative solutions to complex problems, with a focus on test-driven development and performance optimization.
Technical Challenge Preparation:
- Coding Challenges: Practice coding challenges focusing on Java, Spring Framework, and distributed systems design.
- System Design: Brush up on your system design skills, focusing on high availability, fault tolerance, and scalability.
- Performance Optimization: Review performance optimization techniques and best practices for distributed systems.
ATS Keywords: Observability, Site Reliability Engineering, Java, Spring Framework, Distributed Applications, Relational Databases, Linux OS, TCP/IP, Cloud Platforms, Microservices, APIs, Agile Development, DevOps, AI, Docker, Kubernetes, SLO, SLI, Error Budgets, CI/CD, Performance Optimization, System Design, Problem-Solving, Leadership, Architecture, Innovation, Collaboration, Cross-Functional Teams, Hybrid Work Model, BlackRock, Aladdin, Aladdin Engineering
Enhancement Note: BlackRock's interview process focuses on technical skills, problem-solving abilities, and cultural fit. The company values candidates who can drive innovation, collaborate effectively, and contribute to the platform's reliability and performance.
Application Steps
To apply for this Site Reliability Engineer, Release Engineering, Vice President role at BlackRock:
- Update Your Portfolio: Highlight your proficiency in observability tools, Java/Spring Framework/Spring Boot, distributed applications, and problem-solving skills with relevant projects and case studies.
- Tailor Your Resume: Emphasize your experience with observability tools, Java/Spring Framework/Spring Boot, distributed applications, and problem-solving skills. Include relevant keywords to optimize your resume for applicant tracking systems (ATS).
- Prepare for Technical Interviews: Brush up on your coding, system design, and performance optimization skills. Practice problem-solving exercises and review BlackRock's technical interview process.
- Research the Company: Familiarize yourself with BlackRock's mission, values, and culture. Understand the company's focus on financial well-being and investment management.
Application Requirements
Candidates should have a B.S. or M.S. in Computer Science or a related field and more than 8 years of experience. Hands-on experience with observability tools and a strong background in Java and distributed applications are essential.