Infra Tech Lead Analyst - VP
Job Overview
- Job Title: Infra Tech Lead Analyst - VP
- Company: Citi
- Location: Irving, Texas, United States
- Job Type: On-site
- Category: DevOps, Infrastructure
- Date Posted: 2025-06-24
- Experience Level: 5-10 years
- Remote Status: On-site
Role Summary
- Lead the monitoring, troubleshooting, and maintenance of AWS/GCP environments to ensure high availability and operational excellence.
- Manage incident response, perform root cause analysis, and drive continuous improvement.
- Collaborate with cross-functional teams to design testing approaches, automate repetitive tasks, and enhance operational processes.
- Provide technical direction to team members and act as a subject matter expert (SME) for senior stakeholders.
- Ensure ongoing compliance with regulatory requirements and maintain a strong focus on incident response and automation.
Primary Responsibilities
- Environment Monitoring & Management:
- Monitor AWS/GCP infrastructure and services to ensure availability, performance, and reliability.
- Perform incident management, including triage, impact assessment, and coordination with engineering teams to resolve issues.
- Participate in the on-call rotation to provide support coverage for high-severity and major incidents.
- Incident Response & Resolution:
- Provide root cause analysis (RCA) after service is restored.
- Design testing approaches, complex processes, and reporting streams, and assist with automating repetitive tasks.
- Create, maintain, and enhance operational runbooks, standard operating procedures (SOPs), and knowledge base articles.
- Cloud Infrastructure Management:
- Support provisioning and configuration of cloud resources across multiple environments.
- Implement and maintain monitoring, logging, and alerting tools (e.g., CloudWatch, Stackdriver, Prometheus).
- Assist in deployments, patching, and disaster recovery procedures.
- Team Leadership & Collaboration:
- Provide technical/strategic direction to team members.
- Act as an SME to senior stakeholders and/or other team members.
- Collaborate with product, engineering, security, and other stakeholders towards value-adding outcomes.
- Risk Management & Compliance:
- Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citi, its clients, and assets.
- Ensure ongoing compliance with applicable laws, rules, and regulations, adhering to policy, applying sound ethical judgment regarding personal behavior, conduct, and business practices.
Skills & Qualifications
Education:
- Bachelor's/University degree or equivalent experience
Experience:
- At least 6 years of experience in roles centered on infrastructure delivery (application hosting and/or end-user services), with a proven track record of operational process change and improvement.
- Experience in cloud operations/support and site reliability engineering.
- Hands-on experience with AWS and/or GCP.
- Proficiency with infrastructure as code (IaC) tools such as Terraform and CloudFormation, and working knowledge of scripting (Bash, Python, or similar).
- Strong understanding of networking, DNS, IAM, load balancing, and cloud-native services.
Required Skills:
- Proven experience in cloud operations/support and site reliability engineering.
- Hands-on experience with AWS and/or GCP.
- Proficiency with infrastructure as code (IaC) tools and scripting.
- Strong understanding of networking, DNS, IAM, load balancing, and cloud-native services.
- Ability to work with virtual/in-person teams and work under pressure/deadlines.
- Effective written and verbal communication skills.
- Effective analytic/diagnostic skills.
- Ability to communicate technical concepts well to non-technical audiences.
Preferred Skills:
- Experience in a financial services or large complex and/or global environment.
- Ability to develop projects required for design of metrics, analytical tools, benchmarking activities, and best practices.
Web Portfolio & Project Requirements
Portfolio Essentials:
- Demonstrate experience in cloud infrastructure management, incident response, and automation.
- Showcase projects that highlight your ability to monitor, troubleshoot, and maintain cloud environments.
- Include examples of your incident management process, root cause analysis, and automation efforts.
Technical Documentation:
- Provide documentation for your cloud infrastructure projects, including runbooks, SOPs, and knowledge base articles.
- Include code comments, version control, deployment processes, and server configuration details.
- Demonstrate your understanding of testing methodologies, performance metrics, and optimization techniques.
Compensation & Benefits
Salary Range:
- $125,760 - $188,640 per year
Benefits:
- Medical, dental, and vision coverage
- 401(k) plan
- Life, accident, and disability insurance
- Wellness programs
- Paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays
Team & Company Context
Company Culture:
- Industry: Financial Services
- Company Size: Large (200,000+ employees)
- Founded: 1812
- Team Structure:
- Large, cross-functional teams with multiple levels of specialization (e.g., cloud architects, cloud engineers, cloud support engineers).
- Flat hierarchy with clear reporting structures and collaboration with various business lines and technology teams.
- Development Methodology:
- Agile/Scrum methodologies with sprint planning for cloud projects.
- Code review, testing, and quality assurance practices.
- Deployment strategies, CI/CD pipelines, and server management.
Enhancement Note: Citi's large size and global presence offer extensive opportunities for career growth and collaboration with diverse teams. The company's focus on risk management and compliance ensures a strong emphasis on ethical conduct and business practices.
Career & Growth Analysis
Web Technology Career Level:
- Role: Infra Tech Lead Analyst - VP
- Level: Executive/Management
- Responsibility Scope: Strategic planning, team leadership, and cross-functional collaboration to ensure high availability, performance, and operational excellence of cloud environments.
Reporting Structure:
- Reports directly to the senior leadership team, collaborating with various business lines and technology teams.
Technical Impact:
- Drives technical direction, incident response, and automation efforts to improve cloud infrastructure performance and reliability.
- Ensures ongoing compliance with regulatory requirements and maintains a strong focus on incident response and automation.
Growth Opportunities:
- Technical Growth: Develop expertise in emerging cloud technologies, infrastructure as code (IaC) tools, and automation best practices.
- Leadership Development: Gain experience in team management, strategic planning, and cross-functional collaboration.
- Architecture Decisions: Influence architectural decisions related to cloud infrastructure, incident response, and automation.
Work Environment
Office Type:
- On-site, with opportunities for hybrid work arrangements in certain roles and locations.
Office Location(s):
- Irving, Texas, United States (primary location)
- Additional offices worldwide
Workspace Context:
- Collaborative workspaces with virtual/in-person team interaction and cross-functional collaboration opportunities.
- Access to development tools, multiple monitors, testing devices, and cloud infrastructure resources.
- Flexible work schedules with deployment windows, maintenance, and project deadlines.
Work Schedule:
- Full-time, with opportunities for flexible scheduling and remote work in certain roles and locations.
Application & Technical Interview Process
Interview Process:
- Technical Preparation: Brush up on AWS/GCP services, incident management processes, and automation best practices.
- Incident Management & Automation: Prepare for scenario-based questions focusing on incident response, RCA, and automation.
- Cloud Infrastructure Management: Demonstrate your understanding of cloud infrastructure, networking, and server management.
- Technical Deep Dive: Showcase your expertise in infrastructure as code (IaC), scripting, and cloud-native services.
- Cultural Fit Assessment: Highlight your ability to work in a large, global organization with a strong focus on risk management and compliance.
Portfolio Review Tips:
- Cloud Infrastructure Projects: Highlight your experience in cloud infrastructure management, incident response, and automation.
- Incident Management Documentation: Include runbooks, SOPs, and knowledge base articles demonstrating your incident management process.
- Automation & Scripting: Showcase your ability to automate repetitive tasks and optimize cloud infrastructure performance.
Technical Challenge Preparation:
- Incident Management Scenarios: Practice incident management scenarios, focusing on RCA, automation, and collaboration with cross-functional teams.
- Cloud Infrastructure Design: Prepare for cloud infrastructure design challenges, emphasizing performance optimization, scalability, and high availability.
- Technical Deep Dive: Brush up on your knowledge of AWS/GCP services, infrastructure as code (IaC) tools, and cloud-native services.
ATS Keywords:
- Cloud Operations, AWS, GCP, Incident Management, Automation, Root Cause Analysis, Infrastructure as Code, Scripting, Networking, DNS, IAM, Load Balancing, Cloud Native Services, Analytical Tools, Communication Skills, Diagnostic Skills
Technology Stack & Web Infrastructure
Frontend Technologies:
- Not applicable (focus on cloud infrastructure and operations)
Backend & Server Technologies:
- AWS (Amazon Web Services)
- EC2, RDS, DynamoDB, CloudWatch, CloudFormation, Lambda, API Gateway
- GCP (Google Cloud Platform)
- Compute Engine, Cloud SQL, BigQuery, Stackdriver, App Engine, Cloud Functions, Cloud Endpoints
- Infrastructure as Code (IaC) Tools
- Terraform, CloudFormation, Ansible, Puppet
- Scripting Languages
- Bash, Python, PowerShell
- Cloud Native Services
- Kubernetes, Docker, Istio, Prometheus, Grafana
Development & DevOps Tools:
- Version Control Systems
- Git, GitHub, Bitbucket
- CI/CD Pipelines
- Jenkins, GitLab CI/CD, CircleCI, Travis CI
- Monitoring & Logging Tools
- CloudWatch, Stackdriver, Prometheus, ELK Stack, Datadog, New Relic
- Configuration Management Tools
- Ansible, Puppet, Chef
Enhancement Note: Citi's extensive use of AWS and GCP services, along with its focus on incident response, automation, and infrastructure as code (IaC), requires strong proficiency in these technologies and best practices.
Team Culture & Values
Web Development Values:
- User-Centric Design: Prioritize user experience and accessibility in cloud infrastructure design and management.
- Performance Optimization: Focus on cloud infrastructure performance, scalability, and high availability.
- Automation & Efficiency: Emphasize automation, continuous improvement, and incident response processes.
- Collaboration & Knowledge Sharing: Encourage cross-functional collaboration and technical mentoring within and across teams.
Collaboration Style:
- Cross-Functional Integration: Work closely with developers, designers, and stakeholders to ensure cloud infrastructure meets user experience and performance expectations.
- Code Review Culture: Participate in code reviews and pair programming to maintain high-quality cloud infrastructure and automation processes.
- Knowledge Sharing: Facilitate technical mentoring, workshops, and brown bag sessions to foster a culture of continuous learning and skill development.
Challenges & Growth Opportunities
Technical Challenges:
- Cloud Infrastructure Management: Stay up-to-date with the latest AWS/GCP services, best practices, and emerging technologies.
- Incident Response & Automation: Continuously improve incident management processes, automation, and collaboration with cross-functional teams.
- Performance Optimization: Optimize cloud infrastructure performance, scalability, and high availability through continuous monitoring, testing, and optimization efforts.
Learning & Development Opportunities:
- Technical Skill Development: Pursue certifications, online courses, and workshops to enhance your expertise in cloud infrastructure, incident response, and automation.
- Emerging Technologies: Stay informed about emerging cloud technologies, trends, and best practices to drive innovation and continuous improvement.
- Leadership Development: Participate in leadership development programs, mentoring, and coaching opportunities to advance your career in technical leadership roles.
Interview Preparation
Technical Questions:
- Cloud Infrastructure Management:
- Describe your experience with AWS/GCP services, infrastructure as code (IaC) tools, and cloud-native services.
- Explain your approach to cloud infrastructure design, performance optimization, and high availability.
- Discuss your experience with incident management, root cause analysis, and automation.
- Incident Management & Automation:
- Walk through a scenario-based incident management question, focusing on RCA, automation, and collaboration with cross-functional teams.
- Explain your approach to incident management process improvement, automation, and collaboration.
- Cloud Infrastructure Design:
- Describe your experience with cloud infrastructure design, emphasizing performance optimization, scalability, and high availability.
- Discuss your approach to cloud infrastructure cost management, security, and compliance.
Company & Culture Questions:
- Citi's Culture & Values: Explain how you align with Citi's risk management focus, ethical conduct, and business practices.
- Team Dynamics: Describe your experience working in large, global organizations and collaborating with diverse teams.
- Adaptability: Discuss your ability to adapt to new technologies, tools, and work environments.
Portfolio Presentation Strategy:
- Cloud Infrastructure Projects: Highlight your experience in cloud infrastructure management, incident response, and automation.
- Incident Management Documentation: Include runbooks, SOPs, and knowledge base articles demonstrating your incident management process.
- Automation & Scripting: Showcase your ability to automate repetitive tasks and optimize cloud infrastructure performance.
Application Steps
To apply for this Infra Tech Lead Analyst - VP position at Citi:
- Customize Your Portfolio: Tailor your cloud infrastructure projects, incident management documentation, and automation examples to showcase your skills and experience relevant to Citi's requirements.
- Resume Optimization: Highlight your relevant cloud infrastructure, incident management, and automation experience, emphasizing your technical skills and accomplishments.
- Technical Interview Preparation: Brush up on your knowledge of AWS/GCP services, infrastructure as code (IaC) tools, and cloud-native services. Practice incident management scenarios, cloud infrastructure design challenges, and technical deep dive questions.
- Company Research: Thoroughly research Citi's company culture, risk management focus, and ethical conduct expectations to ensure a strong cultural fit and alignment with the organization's values.
Important Notice: This enhanced job description includes AI-generated insights and web development/server administration industry-standard assumptions. All details should be verified directly with the hiring organization before making application decisions.
Application Requirements
Candidates should have at least 6 years of experience in infrastructure delivery with a strong focus on cloud operations and support. Proficiency in AWS/GCP, Infrastructure as Code tools, and scripting is essential.