Senior Data Center Operations Engineer

FluidStack
Full-time, United States

📍 Job Overview

  • Job Title: Senior Data Center Operations Engineer
  • Company: FluidStack
  • Location: United States (Multiple Locations)
  • Job Type: Full-Time, On-Site
  • Category: Infrastructure, Data Center Operations
  • Date Posted: 2025-08-03
  • Experience Level: 5-10 years

🚀 Role Summary

  • Own regional data center operations end-to-end, ensuring high availability and efficiency of AI supercomputing infrastructure.
  • Develop and implement automation tools to eliminate manual tasks and maximize data center performance.
  • Collaborate with cross-functional teams to drive continuous improvement and innovation in data center operations.

📝 Enhancement Note: This role requires a balance of technical expertise, process improvement, and stakeholder management skills to succeed in a dynamic, high-impact environment.

💻 Primary Responsibilities

  • Regional Data Center Management: Oversee day-to-day operations of multiple data centers, ensuring optimal power, cooling, and rack infrastructure.
  • Automation & Optimization: Develop and maintain automation tools to reduce manual tasks, improve efficiency, and drive data center optimization.
  • Vendor Management: Liaise with vendors to ensure contractual obligations are met, and negotiate contracts for maximum value.
  • DCIM Implementation & Asset Tracking: Lead DCIM implementation and adoption, track assets across multiple locations, and maintain high accuracy in asset management.
  • Policy & Procedure Documentation: Maintain and update standard operating procedures (SOPs) and methods of procedure (MOPs) for compliance and efficiency.
  • Reporting & Analytics: Design reports and dashboards to identify trends, optimize capacity, and drive data-driven decision-making.
  • Collaboration & Communication: Work closely with cross-functional teams, including IT, engineering, and facilities, to ensure seamless data center operations.

📝 Enhancement Note: This role involves a high degree of ownership and responsibility, requiring strong problem-solving skills, attention to detail, and the ability to work effectively in a team environment.

🎓 Skills & Qualifications

Education: Bachelor's degree in Computer Science, Engineering, or a related field. Relevant military training may also be considered.

Experience: 5+ years of proven experience in data center operations, with a strong track record of achieving high accuracy in physical audits and managing complex infrastructure projects.

Required Skills:

  • Proven expertise in data center operations, including power infrastructure, cooling, and rack management.
  • Strong vendor management skills, with experience negotiating contracts and holding partners accountable.
  • Proficiency in DCIM tools and asset tracking, with a demonstrated ability to reduce retrieval times and improve data accuracy.
  • Excellent technical documentation skills, with experience creating and maintaining SOPs, MOPs, and training materials.
  • Strong data-driven approach, with experience creating dashboards and reports to drive infrastructure decisions.
  • Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams.

Preferred Skills:

  • Experience with GPU infrastructure and high-performance computing environments.
  • Familiarity with AI/ML workloads and their infrastructure requirements.
  • Knowledge of liquid cooling systems for high-density compute.
  • Experience building custom monitoring and automation tools.
  • Background in hyperscale or cloud data center operations.

📊 Portfolio & Project Requirements

Portfolio Essentials:

  • Detailed documentation of past data center operations projects, highlighting your role, responsibilities, and achievements.
  • Case studies demonstrating your ability to optimize data center infrastructure, improve efficiency, and reduce costs.
  • Examples of automation tools you've developed to eliminate manual tasks and drive data center performance.

Technical Documentation:

  • Well-structured and clearly written SOPs and MOPs for data center operations.
  • Detailed reports and dashboards showcasing your ability to analyze data center performance and drive informed decision-making.
  • Examples of your ability to collaborate with cross-functional teams, such as IT and engineering, to ensure seamless data center operations.

📝 Enhancement Note: Your portfolio should demonstrate your ability to manage complex data center operations, drive continuous improvement, and work effectively with stakeholders to achieve organizational goals.

💵 Compensation & Benefits

Salary Range: $120,000 - $180,000 per year (Based on regional market data and experience level)

Benefits:

  • Competitive total compensation package (salary + equity).
  • Retirement or pension plan, in line with local norms.
  • Health, dental, and vision insurance.
  • Generous PTO policy, in line with local norms.

Working Hours: Full-time, on-site position with a standard workweek of 40 hours. Occasional overtime may be required to support critical maintenance or emergency situations.

📝 Enhancement Note: The salary range provided is an estimate based on market data and experience level. Actual compensation may vary depending on factors such as location, qualifications, and company performance.

🎯 Team & Company Context

🏢 Company Culture

Industry: AI Cloud Platform, Supercomputing

Company Size: Small, highly motivated team focused on providing a world-class supercomputing experience for AI labs, governments, and enterprises.

Founded: 2020

Team Structure:

  • Small, highly motivated team with a strong focus on customer success and continuous improvement.
  • Collaborative environment that values effectiveness, competence, and a growth mindset.
  • Flat organizational structure with a high degree of ownership and autonomy.

Development Methodology:

  • Agile development processes, with a focus on continuous integration and deployment.
  • Strong emphasis on automation, monitoring, and optimization to ensure high data center availability and performance.
  • Regular team meetings and retrospectives to drive continuous improvement and innovation.

Company Website: www.fluidstack.io

📝 Enhancement Note: FluidStack's small, highly motivated team values effectiveness, competence, and a growth mindset. This culture fosters a high degree of ownership and autonomy, making it an ideal environment for a proactive, self-driven senior data center operations engineer.

📈 Career & Growth Analysis

Data Center Operations Career Level: Senior level, with a focus on end-to-end regional data center management, automation, and optimization.

Reporting Structure: This role reports directly to the Director of Infrastructure, with a dotted-line reporting relationship to the relevant regional facility managers.

Technical Impact: This role has a significant impact on the overall performance and availability of FluidStack's AI supercomputing infrastructure. The senior data center operations engineer is responsible for ensuring that data centers run flawlessly 24/7, supporting the world's leading AI labs and enterprises.

Growth Opportunities:

  • Technical Growth: Expand your expertise in data center operations, automation, and optimization, with opportunities to work on cutting-edge AI supercomputing infrastructure.
  • Leadership Development: Develop your leadership skills by mentoring junior team members and driving continuous improvement initiatives across the data center operations team.
  • Architecture & Design: Contribute to the design and architecture of FluidStack's data center infrastructure, ensuring that it meets the needs of its growing customer base and supports the company's long-term strategic goals.

📝 Enhancement Note: This role offers significant opportunities for technical growth, leadership development, and architectural contributions, making it a strong fit for a motivated senior data center operations engineer who wants to make a real impact in a fast-moving environment.

🌐 Work Environment

Office Type: On-site, with multiple regional data center locations across the United States.

Office Location(s): United States (Multiple Locations)

Workspace Context:

  • Collaborative workspace with a strong focus on teamwork, communication, and continuous improvement.
  • Modern data center facilities with state-of-the-art infrastructure and monitoring tools.
  • Opportunities to work with cutting-edge AI supercomputing technology and high-performance computing environments.

📝 Enhancement Note: FluidStack's on-site, collaborative work environment offers the chance to work with cutting-edge technology and drive continuous improvement in data center operations. The regional data center locations add variety to the work and room for professional growth.

📄 Application & Technical Interview Process

Interview Process:

  1. Phone Screen: A brief phone call to discuss your experience, qualifications, and career goals. Be prepared to answer questions about your data center operations experience and technical skills.
  2. On-Site Interview: A full-day on-site interview at one of FluidStack's regional data center locations. This will include a tour of the facility, meetings with key stakeholders, and a series of technical and behavioral interviews. Be prepared to discuss your past data center operations projects, automation tools, and problem-solving strategies in detail.
  3. Final Decision: A final decision will be made based on your interview performance, technical skills, and cultural fit with the FluidStack team.

Portfolio Review Tips:

  • Highlight your past data center operations projects, focusing on your role, responsibilities, and achievements.
  • Include case studies demonstrating your ability to optimize data center infrastructure, improve efficiency, and reduce costs.
  • Showcase your automation tools and explain how they have improved data center performance and eliminated manual tasks.
  • Be prepared to discuss your approach to technical documentation, asset tracking, and vendor management.

Technical Challenge Preparation:

  • Brush up on your data center operations knowledge, with a focus on power infrastructure, cooling, and rack management.
  • Familiarize yourself with DCIM tools and asset tracking best practices.
  • Prepare for behavioral interview questions that focus on your problem-solving skills, communication, and collaboration abilities.

ATS Keywords: (Organized by category)

  • Data Center Operations: Data center management, power infrastructure, cooling, rack management, DCIM, asset tracking, energy optimization, capacity planning, infrastructure optimization.
  • Automation & Monitoring: Automation tools, monitoring tools, data center performance, data center availability, high-availability infrastructure, continuous integration, continuous deployment.
  • Vendor Management: Vendor management, contract negotiation, vendor accountability, ITAD relationships, hardware disposals, e-waste revenue generation.
  • Leadership & Collaboration: Team leadership, cross-functional collaboration, stakeholder management, problem-solving, communication, continuous improvement.

📝 Enhancement Note: The interview process for this role is designed to assess your technical expertise, problem-solving skills, and cultural fit with the FluidStack team. By preparing thoroughly and showcasing your past data center operations projects, automation tools, and problem-solving strategies, you'll be well-positioned to succeed in the interview process.

🛠 Technology Stack & Infrastructure

Data Center Infrastructure:

  • Power Infrastructure: PDUs, UPSs, generators, and other power distribution and backup systems.
  • Cooling Infrastructure: CRACs, CRAHs, chilled water systems, and other cooling solutions.
  • Rack Infrastructure: Standard 19" rack enclosures, custom rack designs, and other data center racking solutions.

DCIM & Asset Tracking:

  • DCIM Tools: DCIM software platforms for data center infrastructure management, asset tracking, and capacity planning.
  • Asset Tracking: RFID tracking, barcode scanning, and other asset tracking technologies.
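
To make "high accuracy in asset management" concrete, here is a minimal Python sketch of a physical audit reconciliation. It assumes a DCIM export and a floor scan (barcode or RFID) can each be reduced to a mapping of asset tag to rack location. The function name, asset tags, and rack labels are hypothetical and are not tied to any particular DCIM product or to FluidStack's tooling.

```python
from dataclasses import dataclass, field


@dataclass
class AuditResult:
    matched: list = field(default_factory=list)    # scanned at the location DCIM records
    misplaced: list = field(default_factory=list)  # scanned, but at a different location
    missing: list = field(default_factory=list)    # in DCIM, not found on the floor
    untracked: list = field(default_factory=list)  # found on the floor, not in DCIM

    @property
    def accuracy(self) -> float:
        """Share of DCIM records confirmed at their recorded location."""
        total = len(self.matched) + len(self.misplaced) + len(self.missing)
        return len(self.matched) / total if total else 1.0


def reconcile(dcim: dict[str, str], scanned: dict[str, str]) -> AuditResult:
    """Compare DCIM records against a floor scan (tag -> location)."""
    result = AuditResult()
    for tag, location in sorted(dcim.items()):
        if tag not in scanned:
            result.missing.append(tag)
        elif scanned[tag] == location:
            result.matched.append(tag)
        else:
            result.misplaced.append(tag)
    result.untracked = sorted(set(scanned) - set(dcim))
    return result


# Hypothetical sample data for illustration only.
dcim_records = {
    "GPU-0001": "R12-U10",
    "GPU-0002": "R12-U14",
    "GPU-0003": "R13-U10",
    "SW-0100": "R12-U40",
}
floor_scan = {
    "GPU-0001": "R12-U10",
    "GPU-0002": "R12-U18",  # moved without a DCIM update
    "GPU-0099": "R13-U10",  # never entered into DCIM
    "SW-0100": "R12-U40",
}

audit = reconcile(dcim_records, floor_scan)
print(f"audit accuracy: {audit.accuracy:.0%}")
print("misplaced:", audit.misplaced)
print("missing:", audit.missing)
print("untracked:", audit.untracked)
```

Splitting discrepancies into misplaced, missing, and untracked is a design choice in this sketch: each category usually calls for a different follow-up (update the record, search for the asset, or onboard it).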

Automation & Monitoring:

  • Automation Tools: Scripting languages (Python, Bash), automation frameworks (Ansible, Puppet), and other automation tools.
  • Monitoring Tools: Data center infrastructure management (DCIM) platforms, power and cooling monitoring, and other data center monitoring tools.
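
As a small example of the kind of Python automation this stack implies, the sketch below flags racks whose peak power draw is approaching the provisioned budget. The rack names, readings, budgets, and the 80% alert threshold are all illustrative assumptions; in practice the numbers would come from PDU telemetry or a DCIM export.

```python
def rack_utilization(
    readings_kw: dict[str, list[float]],
    budget_kw: dict[str, float],
) -> dict[str, float]:
    """Return peak draw as a fraction of the power budget for each rack."""
    return {
        rack: max(samples) / budget_kw[rack]
        for rack, samples in readings_kw.items()
        if samples  # skip racks with no readings
    }


def racks_over_threshold(utilization: dict[str, float], threshold: float = 0.8) -> list[str]:
    """List racks whose peak utilization meets or exceeds the threshold."""
    return sorted(rack for rack, used in utilization.items() if used >= threshold)


# Hypothetical PDU readings in kW, for illustration only.
readings = {
    "R12": [31.2, 33.5, 34.1],
    "R13": [18.0, 19.4, 17.9],
    "R14": [38.6, 39.9, 37.2],
}
budgets = {"R12": 40.0, "R13": 40.0, "R14": 44.0}

utilization = rack_utilization(readings, budgets)
flagged = racks_over_threshold(utilization)

for rack in flagged:
    print(f"{rack}: peak draw at {utilization[rack]:.0%} of budget")
```

Using the peak reading rather than the average is a deliberately conservative choice here, since breaker and cooling limits are driven by worst-case draw.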

📝 Enhancement Note: Familiarity with the latest data center infrastructure, DCIM tools, and automation technologies is essential for success in this role. By staying up-to-date with industry trends and best practices, you'll be well-positioned to drive continuous improvement and optimization in FluidStack's data center operations.

👥 Team Culture & Values

Data Center Operations Values:

  • Extreme Ownership: Take full responsibility for data center operations, from inception to delivery, and approach every problem with an open mind and a positive attitude.
  • Automation-First Mindset: Automate manual tasks wherever possible, and keep improving data center performance through ongoing optimization.
  • Continuous Improvement: Regularly review and refine data center operations processes, tools, and infrastructure.
  • Customer-Focused Approach: Prioritize customer success in every aspect of data center operations, and work closely with cross-functional teams to ensure seamless data center performance and availability.

Collaboration Style:

  • Cross-Functional Integration: Collaborate closely with IT, engineering, and other cross-functional teams to ensure seamless data center operations and support FluidStack's strategic goals.
  • Code Review Culture: Review automation scripts, tooling, and infrastructure changes through a collaborative peer review process.
  • Knowledge Sharing: Share your expertise and experience with the FluidStack team, and learn from your colleagues to drive continuous improvement and innovation in data center operations.

📝 Enhancement Note: FluidStack's data center operations team values extreme ownership, automation, continuous improvement, and a customer-focused approach. By embracing these values and collaborating effectively with cross-functional teams, you'll be well-positioned to drive continuous improvement and innovation in data center operations.

⚡ Challenges & Growth Opportunities

Technical Challenges:

  • Data Center Operations: Manage complex data center operations across multiple regional locations, ensuring high availability and efficiency.
  • Automation & Optimization: Develop and implement automation tools to eliminate manual tasks, improve data center performance, and drive continuous optimization.
  • Vendor Management: Negotiate contracts, manage vendor relationships, and generate revenue from e-waste disposal.
  • DCIM Implementation: Lead DCIM implementation and adoption, track assets across multiple locations, and maintain high accuracy in asset management.
  • Policy & Procedure Documentation: Maintain and update SOPs and MOPs for compliance and efficiency, ensuring that data center operations run smoothly and effectively.

Learning & Development Opportunities:

  • Technical Skill Development: Expand your expertise in data center operations, automation, and optimization, with opportunities to work on cutting-edge AI supercomputing infrastructure.
  • Leadership Development: Develop your leadership skills by mentoring junior team members and driving continuous improvement initiatives across the data center operations team.
  • Architecture & Design: Contribute to the design and architecture of FluidStack's data center infrastructure, ensuring that it meets the needs of its growing customer base and supports the company's long-term strategic goals.

📝 Enhancement Note: This role offers significant technical challenges and learning opportunities, making it a strong fit for a motivated senior data center operations engineer who wants to make a real impact in a fast-moving environment.

💡 Interview Preparation

Technical Questions:

  • Data Center Operations: Describe your experience managing complex data center operations, and discuss your approach to power infrastructure, cooling, and rack management.
  • Automation & Optimization: Explain your experience with automation tools, and discuss your approach to eliminating manual tasks, improving data center performance, and driving continuous optimization.
  • Vendor Management: Describe your experience with vendor management, and discuss your approach to negotiating contracts, managing vendor relationships, and generating revenue from e-waste disposal.
  • DCIM Implementation: Explain your experience with DCIM tools, and discuss your approach to asset tracking, data center infrastructure management, and capacity planning.

Company & Culture Questions:

  • Company Culture: Describe what you value in a company culture, and explain how your values align with FluidStack's data center operations team.
  • Team Collaboration: Explain your approach to cross-functional collaboration, and discuss your experience working with IT, engineering, and other cross-functional teams to ensure seamless data center operations.
  • Customer Focus: Describe your approach to customer success, and explain how you prioritize customer needs in data center operations.

Portfolio Presentation Strategy:

  • Project Case Studies: Highlight your past data center operations projects, focusing on your role, responsibilities, and achievements.
  • Automation Tools: Showcase your automation tools, and explain how they have improved data center performance and eliminated manual tasks.
  • Technical Documentation: Present your technical documentation, including SOPs, MOPs, and other data center operations processes and tools.
  • Asset Tracking: Demonstrate your ability to track assets across multiple locations, and maintain high accuracy in asset management.

📝 Enhancement Note: By preparing thoroughly and showcasing your past data center operations projects, automation tools, and problem-solving strategies, you'll be well-positioned to succeed in the interview process and make a real impact on FluidStack's data center operations team.

📌 Application Steps

To apply for this Senior Data Center Operations Engineer position:

  1. Customize Your Resume: Tailor your resume to highlight your data center operations experience, technical skills, and achievements. Include relevant keywords and phrases to optimize your resume for the FluidStack ATS.
  2. Prepare Your Portfolio: Showcase your past data center operations projects, automation tools, and technical documentation. Include case studies that demonstrate your ability to optimize data center infrastructure, improve efficiency, and reduce costs.
  3. Research FluidStack: Familiarize yourself with FluidStack's company culture, data center operations values, and technical infrastructure. Prepare thoughtful questions to ask during the interview process, demonstrating your interest in the role and the company.
  4. Practice Interview Questions: Rehearse common data center operations interview questions, and practice your responses using the tips and strategies outlined in this enhanced job description. Focus on showcasing your technical expertise, problem-solving skills, and cultural fit with the FluidStack team.
  5. Prepare for On-Site Interview: Plan your travel and accommodations for the on-site interview, and ensure that you have all necessary documents and materials ready for the interview process.

⚠️ Important Notice: This enhanced job description includes AI-generated insights and data center operations industry-standard assumptions. All details should be verified directly with the hiring organization before making application decisions.

Application Requirements

5+ years managing data center operations at scale with a proven track record of achieving high accuracy in physical audits. Strong vendor management skills and expertise in power infrastructure are essential.