Senior Big Data Platform Engineer – C12 – AVP – Pune

Citi
Full-time · Pune, India

📍 Job Overview

  • Job Title: Senior Big Data Platform Engineer – C12 – AVP – Pune
  • Company: Citi
  • Location: Pune, Maharashtra, India
  • Job Type: Full-Time
  • Category: DevOps Engineer
  • Date Posted: 2025-08-01
  • Experience Level: Mid-Senior level (5-10 years)
  • Remote Status: On-site

🚀 Role Summary

  • Big Data Infrastructure Management: Oversee and maintain Cloudera clusters, ensuring high availability and optimal performance.
  • Platform Software Upgrades: Plan and execute major platform software and operating system upgrades across physical environments.
  • Automation & Security: Develop and automate processes for maintenance and monitoring, and implement security measures for the cluster.
  • Resource Utilization & Performance Tuning: Monitor resource utilization, review performance stats, and recommend changes for tuning Hive/Impala queries.
  • Technical Documentation: Create and maintain detailed, up-to-date technical documentation.

📝 Enhancement Note: This role requires a strong background in big data tools and cloud services, with a focus on Hadoop ecosystem components and security measures.

💻 Primary Responsibilities

  • Cluster Management: Install, configure, and maintain Cloudera clusters, ensuring high availability and proper resource utilization.
  • Upgrade & Maintenance: Plan and execute major platform software and operating system upgrades, and coordinate maintenance activities with application teams.
  • Automation & Monitoring: Develop and automate processes for maintenance and monitoring using Python, Java, and Shell Script, and monitor cluster connectivity and security (a monitoring sketch follows this list).
  • Performance Tuning: Review performance stats, query execution, and explain plans; recommend changes for tuning Hive/Impala queries and optimize Hadoop cluster performance.
  • Security & Documentation: Implement security measures for all aspects of the cluster, and create and maintain detailed technical documentation.
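
As a flavor of the automation work above, here is a minimal Python sketch of a cluster health probe against the Cloudera Manager REST API. The host, port, API version, credentials, and the response fields it reads are assumptions; verify them against the API documentation for your CM release.

```python
import requests

# Hypothetical CM endpoint; the API version path (v41 here) varies by release.
CM_BASE = "https://cm.example.internal:7183/api/v41"

def cluster_health(session: requests.Session) -> None:
    """Print each cluster's name and health status."""
    resp = session.get(f"{CM_BASE}/clusters", timeout=30)
    resp.raise_for_status()
    for cluster in resp.json().get("items", []):
        # 'entityStatus' (e.g. GOOD_HEALTH) is an assumption; check your API docs.
        print(cluster.get("name"), cluster.get("entityStatus"))

if __name__ == "__main__":
    s = requests.Session()
    s.auth = ("monitor_svc", "********")  # placeholder read-only service account
    s.verify = "/etc/pki/tls/certs/ca-bundle.crt"  # placeholder cluster CA bundle
    cluster_health(s)
```

A probe like this can be wired into a scheduler so that anything other than a healthy status pages the on-call engineer.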

📝 Enhancement Note: This role involves a mix of technical tasks, including hands-on cluster management, automation, and performance tuning, as well as collaborative work with application teams for upgrades and maintenance.

🎓 Skills & Qualifications

Education: Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent university degree).

Experience: 5+ years of experience with big data tools and 3+ years with Python, Java, and Shell Script. Proven experience with cloud services (preferably AWS) and cluster maintenance.

Required Skills:

  • Big data tools: Hadoop, Hive, HBase, Spark, Kafka, Impala, Solr, Phoenix
  • Programming languages: Python, Java, Shell Script
  • Cloud services: AWS (preferred)
  • Cluster maintenance and performance tuning
  • Security measures and role-based access control
  • Technical documentation and knowledge sharing

Preferred Skills:

  • Experience with containerization (e.g., Docker, Kubernetes)
  • Familiarity with infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation)
  • Knowledge of CI/CD pipelines and DevOps practices
  • Strong communication and collaboration skills

📝 Enhancement Note: While the required skills are well-defined, the preferred skills section highlights areas where additional experience or knowledge would be beneficial for career growth in this role.

📊 Portfolio & Project Requirements

Portfolio Essentials:

  • Big Data Projects: Showcase your experience with big data tools by highlighting projects that demonstrate your ability to install, configure, and maintain clusters, as well as performance tuning and optimization.
  • Automation & Scripting: Include examples of automation scripts (Python, Java, Shell Script) used for maintenance and monitoring tasks; an illustrative sketch follows this list.
  • Security & Compliance: Highlight projects that showcase your understanding of security measures and compliance in big data environments.
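
To make the automation bullet concrete, a small script like the following is the kind of artifact worth including. This is a sketch only: it assumes the hdfs CLI is on PATH (and a valid Kerberos ticket on secured clusters), the report's label format can vary by Hadoop version, and the 80% threshold is an arbitrary example.

```python
import re
import subprocess
import sys

THRESHOLD_PCT = 80.0  # example alert threshold

# Parse the summary section of `hdfs dfsadmin -report` for overall DFS usage.
report = subprocess.run(
    ["hdfs", "dfsadmin", "-report"],
    capture_output=True, text=True, check=True,
).stdout
match = re.search(r"DFS Used%:\s*([\d.]+)%", report)
if match and float(match.group(1)) > THRESHOLD_PCT:
    print(f"WARNING: DFS usage at {match.group(1)}% exceeds {THRESHOLD_PCT}%")
    sys.exit(1)
print("HDFS capacity within threshold")
```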

Technical Documentation:

  • Documentation Samples: Provide samples of technical documentation created for big data projects, demonstrating your ability to create clear, concise, and well-organized documentation.
  • Knowledge Sharing: Include examples of knowledge-sharing activities, such as blog posts, presentations, or training sessions, that showcase your ability to communicate complex technical concepts effectively.

📝 Enhancement Note: As this role involves a significant amount of technical documentation and knowledge sharing, it's essential to highlight these aspects in your portfolio.

💵 Compensation & Benefits

Salary Range: INR 1,500,000 - 2,500,000 per annum (Estimated based on market standards for senior-level big data roles in Pune, India)

Benefits:

  • Competitive salary and performance-based bonuses
  • Health, dental, and vision insurance
  • Retirement savings plans
  • Employee stock purchase plan
  • Tuition assistance and professional development opportunities
  • Employee discounts on various products and services

Working Hours: Full-time position with standard working hours (Monday to Friday, 9:00 AM to 6:00 PM IST) and occasional on-call duties for cluster maintenance and support.

📝 Enhancement Note: The salary range is estimated based on market research for senior-level big data roles in Pune, India. Benefits are typical for a large financial institution like Citi and may vary based on individual eligibility and company policies.

🎯 Team & Company Context

🏢 Company Culture

Industry: Financial Services

Company Size: Large (over 200,000 employees globally)

Founded: 1812 (New York, USA)

Team Structure:

  • The team consists of big data engineers, data scientists, and data analysts working collaboratively to manage and maintain big data infrastructure and deliver insights to stakeholders.
  • The role reports directly to the Big Data Platform Manager and works closely with application teams for upgrades and maintenance.

Development Methodology:

  • Agile/Scrum methodologies are used for project management and collaboration.
  • Code reviews, testing, and quality assurance practices are in place to ensure code quality and performance.
  • CI/CD pipelines and automated deployment processes are used to streamline development and release cycles.

Company Website: https://www.citigroup.com

📝 Enhancement Note: Citi is a large, global financial institution with a strong focus on technology and innovation. The company culture values collaboration, continuous learning, and a commitment to delivering exceptional customer experiences.

📈 Career & Growth Analysis

Big Data Platform Engineer Career Level: This role is at the senior level, with a focus on managing and maintaining big data infrastructure, automating processes, and ensuring optimal performance and security. The role requires a deep understanding of big data tools and cloud services, as well as strong communication and collaboration skills.

Reporting Structure: The role reports directly to the Big Data Platform Manager and works closely with application teams for upgrades and maintenance. The team structure encourages collaboration and knowledge sharing among team members.

Technical Impact: The role has a significant impact on the performance, availability, and security of big data infrastructure, directly influencing the quality and timeliness of data-driven insights for stakeholders.

Growth Opportunities:

  • Technical Leadership: With experience and strong performance, there may be opportunities to move into a technical leadership role, mentoring junior team members and driving architectural decisions.
  • Architecture & Design: As the role involves managing and maintaining big data infrastructure, there may be opportunities to contribute to architecture and design decisions, driving innovation and improvement in the big data ecosystem.
  • Cross-Functional Collaboration: Working closely with application teams for upgrades and maintenance can provide opportunities to expand your skill set and gain exposure to different aspects of the business.

📝 Enhancement Note: This role offers significant opportunities for career growth and development, with a focus on technical leadership, architecture and design, and cross-functional collaboration.

🌐 Work Environment

Office Type: Modern, collaborative workspaces with a focus on employee well-being and productivity.

Office Location(s): Pune, Maharashtra, India (Tower B, EON Free Zone II)

Workspace Context:

  • Collaboration: The workspace is designed to encourage collaboration and communication among team members, with open-plan offices and dedicated meeting spaces.
  • Workstations: Modern workstations with multiple monitors, high-speed internet connectivity, and access to relevant software tools and applications.
  • Flexibility: The work environment offers flexibility for employees to balance their work and personal lives, with remote work options available for certain roles and situations.

Work Schedule: Full-time position with standard working hours (Monday to Friday, 9:00 AM to 6:00 PM IST) and occasional on-call duties for cluster maintenance and support.

📝 Enhancement Note: The work environment at Citi is designed to be collaborative, flexible, and supportive of employee well-being and productivity. The company offers modern workspaces, state-of-the-art technology, and a range of benefits and perks to support employee growth and development.

📄 Application & Technical Interview Process

Interview Process:

  1. Phone/Video Screen: A brief phone or video call to discuss your background, experience, and motivation for the role.
  2. Technical Assessment: A hands-on technical assessment, focusing on your ability to install, configure, and maintain big data clusters, as well as your understanding of performance tuning and optimization.
  3. Behavioral & Cultural Fit: A discussion to assess your communication skills, problem-solving abilities, and cultural fit within the team and organization.
  4. Final Interview: A final interview with the hiring manager or a panel of stakeholders to discuss your fit for the role and answer any remaining questions.

Portfolio Review Tips:

  • Highlight your experience with big data tools and cloud services, focusing on projects that demonstrate your ability to manage and maintain big data infrastructure.
  • Include examples of automation scripts and technical documentation, showcasing your attention to detail and commitment to knowledge sharing.
  • Be prepared to discuss your approach to security and compliance in big data environments, and how you ensure the integrity and confidentiality of data.

Technical Challenge Preparation:

  • Brush up on your knowledge of big data tools, with a focus on Hadoop ecosystem components and cloud services (preferably AWS).
  • Practice installing, configuring, and maintaining big data clusters, and be prepared to discuss your approach to performance tuning and optimization (an explain-plan sketch follows this list).
  • Familiarize yourself with security best practices and role-based access control in big data environments.
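
A simple way to practice explain-plan review is against a HiveServer2 endpoint via PyHive (`pip install 'pyhive[hive]'`). The host, port, sample table, and date value below are placeholders; the SET properties are standard Hive settings, but confirm them against your cluster's Hive version.

```python
from pyhive import hive

conn = hive.Connection(host="hs2.example.internal", port=10000, username="analyst")
cur = conn.cursor()

# Session-level tuning knobs to experiment with.
cur.execute("SET hive.vectorized.execution.enabled=true")
cur.execute("SET hive.exec.parallel=true")

# Compare the plan with and without the partition filter to see pruning.
cur.execute("EXPLAIN SELECT count(*) FROM sales WHERE ds = '2025-08-01'")
for (line,) in cur.fetchall():
    print(line)
```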

ATS Keywords: (Organized by category)

  • Big Data Tools: Hadoop, Hive, HBase, Spark, Kafka, Impala, Solr, Phoenix
  • Programming Languages: Python, Java, Shell Script
  • Cloud Services: AWS, Google Cloud Platform, Microsoft Azure
  • Cluster Management: Cloudera, Hortonworks, MapR
  • Security & Compliance: Role-Based Access Control (RBAC), SSL, Disk Encryption
  • Technical Documentation: Confluence, JIRA, GitHub
  • Soft Skills: Communication, Collaboration, Problem-Solving, Attention to Detail

📝 Enhancement Note: The interview process for this role is designed to assess your technical skills, problem-solving abilities, and cultural fit within the team and organization. The portfolio review and technical challenge preparation tips are tailored to help you demonstrate your expertise in big data infrastructure management and optimization.

🛠 Technology Stack & Web Infrastructure

Big Data Tools:

  • Hadoop: A framework for distributed storage (HDFS) and distributed processing (YARN/MapReduce) of large datasets across clusters of commodity hardware.
  • Hive: A data warehousing solution built on top of Hadoop, enabling SQL-like queries for big data.
  • HBase: A distributed, scalable big data store built on top of Hadoop HDFS, providing random, real-time read/write access to big data.
  • Spark: A fast, general-purpose engine for large-scale data processing, with built-in libraries for machine learning, graph computation, and streaming (a short example follows this list).
  • Kafka: A distributed streaming platform that enables real-time data pipelines and event-driven architectures.
  • Impala: A native analytic database for Hadoop, designed to provide fast, interactive SQL queries on big data.
  • Solr: A highly reliable, scalable, and distributed search and analytics platform built on Apache Lucene.
  • Phoenix: An open-source, distributed SQL database built on Apache HBase, providing ACID transactions, secondary indexes, and SQL access to big data.
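
As a small illustration of how these pieces fit together, the PySpark sketch below reads a Hive table through Spark's catalog integration. The database, table, and column names are hypothetical, and it assumes a Spark deployment with Hive support enabled.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive-read-example")
    .enableHiveSupport()  # use the cluster's Hive metastore as Spark's catalog
    .getOrCreate()
)

# Aggregate a (hypothetical) partitioned fact table, pruning by partition column.
df = spark.table("analytics.transactions").where("ds = '2025-08-01'")
df.groupBy("channel").count().show()

spark.stop()
```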

Cloud Services:

  • AWS: Amazon Web Services, a comprehensive, evolving cloud computing platform that offers a mix of Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS) offerings.
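
On the AWS side, a minimal boto3 sketch (assuming credentials are already configured via a profile, environment variables, or an instance role; the region is an example) lists active EMR clusters, a common first step when auditing cloud-hosted Hadoop workloads:

```python
import boto3

emr = boto3.client("emr", region_name="ap-south-1")  # example region (Mumbai)
page = emr.list_clusters(ClusterStates=["RUNNING", "WAITING"])
for cluster in page["Clusters"]:
    print(cluster["Id"], cluster["Name"], cluster["Status"]["State"])
```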

Programming Languages:

  • Python: A high-level, interpreted, and general-purpose programming language known for its simplicity and readability.
  • Java: A high-level, object-oriented programming language designed to have as few implementation dependencies as possible.
  • Shell Script: Scripting for Unix-like shells (such as Bash), used to automate tasks and simplify complex operational processes.

DevOps & Infrastructure Tools:

  • Ansible: A simple, agentless open-source automation platform that enables the rapid deployment and management of applications and services.
  • Terraform: An open-source infrastructure as code (IaC) software tool that provides a consistent workflow to manage and provision cloud resources.
  • CloudFormation: A service that helps you model and set up your Amazon Web Services resources, provisioning and updating them in an orderly and predictable fashion.
  • Prometheus: An open-source monitoring and alerting toolkit that enables real-time monitoring and alerting for complex, distributed systems.
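
As a building block for the kind of monitoring Prometheus enables, here is a sketch that pulls a metric from its HTTP API. The /api/v1/query endpoint is standard; the host and the job label are placeholders.

```python
import requests

PROM = "http://prometheus.example.internal:9090"  # placeholder host

resp = requests.get(
    f"{PROM}/api/v1/query",
    params={"query": 'up{job="hadoop-datanode"}'},  # hypothetical job label
    timeout=10,
)
resp.raise_for_status()
for result in resp.json()["data"]["result"]:
    instance = result["metric"].get("instance", "?")
    _, value = result["value"]  # [timestamp, value-as-string]
    print(instance, "up" if value == "1" else "DOWN")
```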

📝 Enhancement Note: The technology stack for this role is focused on big data tools and cloud services, with a strong emphasis on Hadoop ecosystem components and security measures. The DevOps and infrastructure tools listed are commonly used in big data environments to automate processes and manage resources.

👥 Team Culture & Values

Big Data Platform Engineering Values:

  • Customer Focus: A commitment to understanding and addressing the needs of internal and external customers, ensuring that big data infrastructure meets their requirements and expectations.
  • Innovation: A dedication to staying up-to-date with the latest big data tools, technologies, and best practices, and driving continuous improvement in big data infrastructure and processes.
  • Collaboration: A belief in the power of teamwork and knowledge sharing, with a focus on working together to achieve common goals and drive success for the organization.
  • Integrity: A commitment to acting with honesty, transparency, and accountability in all aspects of big data infrastructure management and optimization.

Collaboration Style:

  • Cross-Functional Integration: The team works closely with application teams to ensure that big data infrastructure meets their needs and supports their goals.
  • Code Review Culture: The team follows a code review process to ensure code quality, performance, and maintainability.
  • Knowledge Sharing: The team encourages knowledge sharing and learning, with regular training sessions, workshops, and brown bag sessions to help team members develop their skills and stay up-to-date with the latest big data tools and technologies.

📝 Enhancement Note: The big data platform engineering team at Citi values customer focus, innovation, collaboration, and integrity. The team's collaboration style emphasizes cross-functional integration, code review culture, and knowledge sharing, fostering a culture of continuous learning and improvement.

⚡ Challenges & Growth Opportunities

Technical Challenges:

  • Big Data Infrastructure Management: Managing and maintaining big data infrastructure at scale, ensuring high availability, performance, and security.
  • Cloud Migration & Optimization: Migrating big data workloads to the cloud and optimizing them for performance, cost, and scalability (a migration sketch follows this list).
  • Data Governance & Compliance: Ensuring data governance, security, and compliance in big data environments, with a focus on data privacy, access control, and regulatory compliance.
  • Emerging Technologies: Staying up-to-date with emerging big data technologies and trends, and integrating them into the big data ecosystem to drive innovation and improvement.
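
One common migration step, sketched under stated assumptions (the hadoop CLI has the s3a connector and credentials configured; the paths and bucket are placeholders), is copying HDFS data to S3 with DistCp:

```python
import subprocess

# Re-runnable copy of an HDFS dataset to S3; -update only copies changed files.
subprocess.run(
    [
        "hadoop", "distcp",
        "-update",
        "hdfs:///data/warehouse/sales",           # placeholder source path
        "s3a://example-bucket/warehouse/sales",   # placeholder bucket/prefix
    ],
    check=True,
)
```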

Learning & Development Opportunities:

  • Big Data Technologies: Deepening your knowledge of big data tools, technologies, and best practices, with a focus on emerging trends and innovations.
  • Cloud Services: Expanding your expertise in cloud services, with a focus on AWS, Google Cloud Platform, and Microsoft Azure.
  • Leadership & Management: Developing your leadership and management skills, with a focus on driving technical strategy, architecture, and decision-making in big data environments.

📝 Enhancement Note: The technical challenges and learning opportunities for this role are focused on big data infrastructure management, cloud migration and optimization, data governance and compliance, and emerging technologies. The learning and development opportunities emphasize big data technologies, cloud services, and leadership and management skills.

💡 Interview Preparation

Technical Questions:

  • Big Data Infrastructure Management: Questions focused on your experience with big data tools, cloud services, and cluster management, with a focus on performance tuning, optimization, and security.
  • Cloud Migration & Optimization: Questions focused on your experience with cloud migration and optimization, with a focus on cost, performance, and scalability.
  • Data Governance & Compliance: Questions focused on your understanding of data governance, security, and compliance in big data environments, with a focus on data privacy, access control, and regulatory compliance.

Company & Culture Questions:

  • Big Data Platform Engineering Values: Questions focused on your understanding and alignment with the big data platform engineering values of customer focus, innovation, collaboration, and integrity.
  • Collaboration Style: Questions focused on your experience with cross-functional integration, code review culture, and knowledge sharing, with a focus on driving success for the organization.
  • Career Growth & Development: Questions focused on your long-term career goals and aspirations, with a focus on driving technical strategy, architecture, and decision-making in big data environments.

Portfolio Presentation Strategy:

  • Big Data Infrastructure Management: Highlight your experience with big data tools, cloud services, and cluster management, with a focus on performance tuning, optimization, and security.
  • Cloud Migration & Optimization: Highlight your experience with cloud migration and optimization, with a focus on cost, performance, and scalability.
  • Data Governance & Compliance: Highlight your understanding of data governance, security, and compliance in big data environments, with a focus on data privacy, access control, and regulatory compliance.

📝 Enhancement Note: The technical questions for this role are focused on big data infrastructure management, cloud migration and optimization, and data governance and compliance. The company and culture questions emphasize the big data platform engineering values, collaboration style, and career growth and development. The portfolio presentation strategy highlights your experience with big data tools, cloud services, and cluster management, with a focus on performance tuning, optimization, and security.

📌 Application Steps

To apply for this Senior Big Data Platform Engineer – C12 – AVP – Pune position:

  1. Update Your Resume: Highlight your experience with big data tools, cloud services, and cluster management, with a focus on performance tuning, optimization, and security. Include any relevant portfolio projects or case studies that demonstrate your expertise in big data infrastructure management and optimization.
  2. Prepare Your Portfolio: Include examples of automation scripts, technical documentation, and big data infrastructure management projects that showcase your skills and experience. Be prepared to discuss your approach to performance tuning, optimization, and security in big data environments.
  3. Research the Company: Familiarize yourself with Citi's big data platform engineering values, collaboration style, and career growth and development opportunities. Be prepared to discuss your alignment with these aspects of the role and the organization.
  4. Practice Technical Interview Questions: Brush up on your knowledge of big data tools, cloud services, and cluster management, with a focus on performance tuning, optimization, and security. Practice answering technical interview questions and be prepared to discuss your approach to big data infrastructure management and optimization.

⚠️ Important Notice: This enhanced job description includes AI-generated insights and big data infrastructure management industry-standard assumptions. All details should be verified directly with the hiring organization before making application decisions.

Application Requirements

Candidates should have over 5 years of experience with big data tools and at least 3 years of experience with Python, Java, and Shell Script. Familiarity with cloud services, troubleshooting, and performance tuning of Hadoop clusters is also required.