Cross Technology Managed Services Engineer (L2) (SRE)
📍 Job Overview
- Job Title: Cross Technology Managed Services Engineer (L2) (SRE)
- Company: NTT Ltd.
- Location: Kallang, Singapore
- Job Type: Full-time
- Category: DevOps, Site Reliability Engineering
- Date Posted: 2025-06-11
- Experience Level: Mid-level (2-5 years)
- Remote Status: On-site
🚀 Role Summary
- Key Responsibilities: Ensure the reliability, scalability, and efficiency of operational IT infrastructure and systems through proactive monitoring, incident management, and automation.
- Key Skills: Infrastructure Monitoring, Incident Management, Problem Management, CI/CD, Deployment Automation, Linux Systems Administration, Configuration Management, Cloud Platforms, Scripting, Disaster Recovery, High Availability, ITIL Best Practices, SRE Best Practices, API, Automation, Ansible.
💻 Primary Responsibilities
Proactive Monitoring & Incident Management:
- Monitor and maintain client IT infrastructure and systems.
- Identify, investigate, and resolve technical incidents and problems.
- Restore service to clients with minimal downtime and reduce Mean Time to Recovery (MTTR).
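For reference, MTTR is commonly computed as the average time between an incident being opened and service being restored. The following is a minimal Python sketch of that arithmetic, not a description of NTT's tooling; the incident records and the `mean_time_to_recovery` helper are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Return the average (resolved - opened) duration as a timedelta.

    `incidents` is a list of (opened, resolved) datetime pairs.
    """
    if not incidents:
        raise ValueError("need at least one resolved incident")
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident records: (opened, resolved) timestamps.
incidents = [
    (datetime(2025, 6, 1, 9, 0), datetime(2025, 6, 1, 9, 45)),    # 45 minutes
    (datetime(2025, 6, 3, 14, 0), datetime(2025, 6, 3, 15, 30)),  # 90 minutes
    (datetime(2025, 6, 7, 2, 15), datetime(2025, 6, 7, 2, 30)),   # 15 minutes
]

mttr = mean_time_to_recovery(incidents)
print(mttr)  # 0:50:00
```

In practice the timestamps would come from a ticketing tool export rather than a hard-coded list, and teams differ on whether the clock starts at detection or at ticket creation.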
Automation & Operations:
- Develop automation scripts to reduce manual intervention and recurring operational tasks (toil).
- Set up and maintain monitoring, alerting, and logging tools (e.g., Prometheus, Grafana, PagerDuty).
- Participate in on-call rotations and drive fast resolution of P1/P2 incidents.
- Contribute to deployment pipelines using tools like Jenkins or GitLab CI/CD.
Security & Compliance:
- Harden security and ensure compliance across production systems through configuration management and patching.
Capacity Planning & Performance Optimization:
- Ensure systems can handle current and future loads.
- Optimize infrastructure usage and reduce waste (cost-efficiency).
Bridging Development & Operations:
- Advocate for and implement DevOps and SRE best practices.
- Collaborate with development teams to ensure production systems are always available, fast, and efficient.
🎓 Skills & Qualifications
Education:
- Bachelor's degree or equivalent qualification in IT/Computing.
Certifications:
- Relevant certifications carry additional weight, such as:
- Microsoft Certified
- AWS Certified
- VMware Certified
- Google Cloud Platform (GCP)
- VMware Certified Cloud Management and Automation
- SRE Certifications
Required Skills:
- Infrastructure Monitoring, Observability, and Telemetry
- Incident & Problem Management
- CI/CD and Deployment Automation
- Linux Systems Administration
- Configuration Management (Ansible, Puppet, etc.)
- Cloud Platforms (AWS, GCP, Azure)
- Scripting (PowerShell, Bash, Python)
- Disaster Recovery & High Availability
- ITIL / SRE Best Practices
- Familiarity with JSON, APIs, automation, Ansible, and CI/CD
Experience:
- A moderate level of relevant managed services experience handling cross-technology infrastructure.
- Moderate knowledge of ticketing tools, preferably ServiceNow.
- Moderate working knowledge of ITIL processes.
- Moderate experience working with vendors and/or third parties.
📊 Web Portfolio & Project Requirements
Portfolio Essentials:
- Demonstrate experience with infrastructure monitoring, incident management, and automation tools.
- Showcase projects that highlight your ability to ensure high availability, scalability, and efficiency of IT systems.
- Include examples of your scripting and automation skills.
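As an illustration of the kind of small, toil-reducing script a portfolio might include, here is a minimal Python sketch of a disk usage check. It uses only the standard library; the function names and the 80%/90% thresholds are hypothetical choices for the example, not requirements from the posting.

```python
import shutil


def disk_usage_percent(path="."):
    """Return the percentage of disk space used on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def classify(percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a severity label (thresholds are assumptions)."""
    if percent >= crit:
        return "CRITICAL"
    if percent >= warn:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    percent = disk_usage_percent(".")
    print(f"disk usage: {percent:.1f}% ({classify(percent)})")
```

A script like this becomes more convincing in a portfolio when paired with a short note on how it was scheduled, where its output was sent (for example, an alerting tool), and how much manual checking it replaced.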
Technical Documentation:
- Document your approach to incident management, problem resolution, and automation.
- Include any relevant case studies or success stories demonstrating your impact on system performance and reliability.
💵 Compensation & Benefits
Salary Range:
- The salary range for this role in Singapore is approximately SGD 80,000 - 120,000 per annum, depending on experience and qualifications. This estimate is based on market research and industry standards for mid-level DevOps and SRE roles in Singapore.
Benefits:
- Competitive benefits package, including health insurance, retirement plans, and employee assistance programs.
- Opportunities for professional development, training, and certifications.
- A global culture that embraces diversity and offers equal opportunities for growth and advancement.
Working Hours:
- Full-time position with standard working hours, including on-call rotations and maintenance windows as required.
🎯 Team & Company Context
Company Culture:
- Industry: Information Technology and Services
- Company Size: Large (10,001+ employees)
- Founded: 1967 (as NTT DATA, part of the NTT Group)
- Team Structure: Large, global teams with cross-functional collaboration and specialization in various technologies.
- Development Methodology: Agile, with a focus on DevOps and SRE best practices.
Company Website: https://www.nttdata.com/
📝 Enhancement Note: NTT DATA is a global innovator of business and technology services, serving 75% of the Fortune Global 100. They invest heavily in R&D and are committed to helping clients innovate, optimize, and transform for long-term success.
📈 Career & Growth Analysis
Career Level:
- Mid-level Site Reliability Engineer (SRE) responsible for ensuring the reliability, scalability, and efficiency of operational IT infrastructure and systems.
Reporting Structure:
- Reports to the Managed Services Team Lead or equivalent.
- Collaborates with cross-functional teams, including development, operations, and client-facing teams.
Technical Impact:
- Directly affects the reliability, performance, and security of client IT infrastructure and systems.
- Indirectly influences user experience and business continuity through proactive monitoring and incident management.
Growth Opportunities:
- Technical Growth: Develop expertise in emerging technologies, tools, and best practices related to SRE and DevOps.
- Leadership Potential: Gain experience in managing teams, mentoring junior engineers, and driving technical projects.
- Career Progression: Advance to senior SRE roles, technical lead positions, or move into management and architecture decision-making.
🌐 Work Environment
Office Type: Large, global offices with a collaborative and inclusive work environment.
Office Location(s): Kallang, Singapore, with additional offices worldwide.
Workspace Context:
- Modern, well-equipped workspaces with multiple monitors, testing devices, and development tools.
- Collaborative environment with opportunities for knowledge sharing, technical mentoring, and continuous learning.
Work Schedule:
- Standard full-time working hours with flexibility for deployment windows, maintenance, and project deadlines.
- On-call rotations and maintenance windows as required.
📝 Enhancement Note: NTT DATA encourages a positive work-life balance, offering flexible working arrangements and employee benefits to support the well-being of their team members.
📄 Application & Technical Interview Process
Interview Process:
- Phone/Screening: Technical phone or video call to assess communication skills and basic understanding of SRE and DevOps concepts.
- Technical Assessment: Hands-on assessment of your incident management, automation, and scripting skills. Expect to work on real-world scenarios and case studies.
- On-site/Final Interview: In-depth discussion of your approach to SRE, DevOps, and infrastructure management, along with an assessment of your cultural fit and long-term goals.
Portfolio Review Tips:
- Highlight your experience with infrastructure monitoring, incident management, and automation tools.
- Include examples of your scripting and automation skills, demonstrating your ability to reduce manual intervention and toil.
- Showcase your problem-solving skills and approach to root cause analysis and incident resolution.
Technical Challenge Preparation:
- Brush up on your knowledge of ITIL and SRE best practices.
- Familiarize yourself with the company's tech stack and any relevant certifications.
- Prepare for hands-on assessments and be ready to discuss your approach to incident management, automation, and infrastructure management.
ATS Keywords:
- Infrastructure Monitoring, Incident Management, Problem Management, CI/CD, Deployment Automation, Linux Systems Administration, Configuration Management, Cloud Platforms, Scripting, Disaster Recovery, High Availability, ITIL Best Practices, SRE Best Practices, API, Automation, Ansible, JSON, DevOps, Site Reliability Engineering
🛠 Technology Stack & Web Infrastructure
Monitoring & Observability Tools:
- Prometheus, Grafana, PagerDuty, ServiceNow
Cloud Platforms:
- AWS, GCP, Azure
Scripting Languages:
- PowerShell (PS), Bash, Python
Configuration Management:
- Ansible, Puppet
Deployment Automation:
- Jenkins, GitLab CI/CD
Infrastructure as Code (IaC):
- Terraform, CloudFormation
Containerization & Orchestration:
- Docker, Kubernetes
📝 Enhancement Note: NTT DATA uses a wide range of tools and technologies, and candidates should be open to learning and working with new tools as needed.
👥 Team Culture & Values
NTT DATA Values:
- Clients First: Prioritize client needs and create a positive client experience throughout the total client journey.
- Integrity: Uphold the highest ethical standards and act with honesty and transparency.
- Respect: Value diversity and inclusion, fostering a culture of collaboration and teamwork.
- Excellence: Strive for continuous improvement and deliver high-quality services and solutions.
- Sustainability: Contribute to a sustainable future by embracing environmentally responsible practices and technologies.
Collaboration Style:
- Cross-functional Integration: Collaborate with development, operations, and client-facing teams to ensure the reliability, scalability, and efficiency of operational IT infrastructure and systems.
- Code Review Culture: Encourage knowledge sharing, pair programming, and continuous learning.
- Knowledge Sharing: Foster a culture of mentoring and technical skill development.
⚡ Challenges & Growth Opportunities
Technical Challenges:
- Incident Management: Develop and refine incident management processes to minimize downtime and reduce Mean Time to Recovery (MTTR).
- Automation & Efficiency: Identify and eliminate manual, repetitive tasks (toil) through automation and process improvement.
- Scalability & Performance: Ensure IT systems can handle current and future loads while optimizing infrastructure usage and reducing waste (cost-efficiency).
- Emerging Technologies: Stay up-to-date with the latest tools, best practices, and trends in SRE and DevOps.
Learning & Development Opportunities:
- Technical Skill Development: Expand your expertise in SRE, DevOps, and related technologies through training, certifications, and hands-on projects.
- Leadership Development: Gain experience in managing teams, mentoring junior engineers, and driving technical projects.
- Architecture Decision-Making: Contribute to strategic architecture decisions and influence the direction of IT infrastructure and systems.
📝 Enhancement Note: NTT DATA offers a supportive learning environment with opportunities for professional development, training, and certifications.
💡 Interview Preparation
Technical Questions:
- SRE & DevOps Fundamentals: Demonstrate your understanding of SRE and DevOps principles, best practices, and tools.
- Incident Management: Walk through real-world scenarios and discuss your approach to incident management, problem resolution, and automation.
- Automation & Scripting: Showcase your scripting skills and discuss your approach to automating manual, repetitive tasks (toil).
Company & Culture Questions:
- NTT DATA Culture: Demonstrate your understanding of NTT DATA's values, culture, and commitment to client success.
- Team Dynamics: Discuss your approach to collaboration, knowledge sharing, and working with cross-functional teams.
- Long-term Goals: Align your career aspirations with NTT DATA's growth opportunities and commitment to employee development.
Portfolio Presentation Strategy:
- Live Demonstration: Showcase your incident management, automation, and scripting skills through live demonstrations and walkthroughs.
- Technical Deep Dive: Provide detailed explanations of your approach to infrastructure management, automation, and problem resolution.
- User Impact: Highlight the user impact of your work and discuss how your efforts contribute to client success and business continuity.
📝 Enhancement Note: Prepare thoroughly for the interview process, focusing on your technical skills, problem-solving approach, and cultural fit with NTT DATA.
📌 Application Steps
To apply for this SRE (Site Reliability Engineer) position at NTT DATA:
- Tailor Your Resume: Highlight your relevant experience with infrastructure monitoring, incident management, and automation tools.
- Prepare Your Portfolio: Showcase your incident management, automation, and scripting skills through live demonstrations and walkthroughs.
- Research the Company: Familiarize yourself with NTT DATA's values, culture, and commitment to client success.
- Prepare for the Interview: Brush up on your technical skills, problem-solving approach, and cultural fit with NTT DATA.
📝 Enhancement Note: This enhanced job description includes AI-generated insights and industry-standard assumptions. All details should be verified directly with NTT DATA before making application decisions.
Application Requirements
Candidates should have a Bachelor's degree in IT/Computing or equivalent experience, along with relevant certifications. Required skills include infrastructure monitoring, incident management, CI/CD, and experience with cloud platforms.