📍 Job Overview

Job Title: Workload Scheduling HPC Platform Engineer
Company: Qube Research & Technologies
Location: London
Job Type: On-site
Category: DevOps Engineer
Date Posted: June 27, 2025
Experience Level: Mid-level (2-5 years)
Remote Status: On-site

🚀 Role Summary

Develop and maintain High-Performance Computing (HPC) platforms using YellowDog and Ray schedulers across cloud and on-premises infrastructures.
Collaborate with internal teams to integrate HPC solutions, enabling scalable and efficient compute capabilities.
Improve performance, resilience, and observability of compute infrastructure through continuous improvement initiatives.

📝 Enhancement Note: This role requires a strong background in HPC environments and a solid understanding of cloud services to support both business and technology groups effectively.

💻 Primary Responsibilities

Platform Development & Support: Develop and support scalable workload scheduling solutions for HPC environments using YellowDog and Ray schedulers.
Team Collaboration: Work closely with internal teams to adopt and optimize HPC platforms, ensuring they meet the needs of various stakeholders.
Infrastructure Optimization: Improve the performance, resilience, and observability of compute infrastructure by identifying bottlenecks and implementing enhancements.
Automation & Continuous Improvement: Contribute to infrastructure automation and continuous improvement initiatives to streamline workflows and increase efficiency.
Expertise Sharing: Share your expertise and support team development through coaching and collaboration, fostering a culture of learning and growth.

📝 Enhancement Note: This role requires a balance of technical expertise and strong communication skills to effectively collaborate with diverse teams and stakeholders.

🎓 Skills & Qualifications

Education: A bachelor's degree in Computer Science, Engineering, or a related field. Relevant experience may be considered in lieu of a degree.

Experience: Proven experience (2-5 years) in engineering and supporting HPC environments, with a strong focus on workload scheduling and large-scale systems.

Required Skills:

Experience with at least one HPC scheduler (YellowDog, Ray, Slurm, or IBM Symphony)
Deep understanding of both loosely coupled and tightly coupled HPC workloads
Experience developing and supporting large-scale systems (5000+ nodes) and high levels of concurrency (100k+ tasks)
Strong Linux administration skills
Solid understanding of core AWS services, including VPC security, networking, EC2 configuration, and storage services (S3, EFS, EBS, and FSx)
Experience with infrastructure-as-code using configuration languages and tools, particularly Terraform and Ansible
Proficiency in Python for developing applications and tools
Good understanding of monitoring and visualization tools for large-scale systems (CloudWatch, CloudTrail, OpenSearch, Athena)
Experience with performance tuning of compute, network, and storage components
Strong communication and collaboration skills, with the ability to describe complex issues at the appropriate level for different audiences

Preferred Skills:

Experience with identity providers such as Okta and user authorization in large-scale distributed environments
Familiarity with containerization (e.g., Docker, Kubernetes) and orchestration tools
Knowledge of CI/CD pipelines and deployment strategies
Experience with data analysis and visualization tools (e.g., Tableau, Power BI)

📝 Enhancement Note: Candidates with a strong background in both HPC environments and cloud services will be well-positioned to succeed in this role.

📊 Web Portfolio & Project Requirements

Portfolio Essentials:

Demonstrate your experience with HPC schedulers by showcasing projects where you've developed, optimized, or maintained large-scale HPC platforms.
Highlight your cloud services expertise by presenting projects that showcase your ability to manage and optimize AWS resources.
Include examples of your Python development work, emphasizing projects that involve infrastructure automation or data processing.
Showcase your Linux administration skills by demonstrating your ability to manage and maintain large-scale systems.

Technical Documentation:

Provide clear and concise documentation for your HPC projects, explaining the architecture, configuration, and any performance optimizations implemented.
Include documentation for your Python projects, detailing the purpose, functionality, and any relevant code snippets or examples.
Demonstrate your understanding of monitoring and visualization tools by including screenshots, graphs, or other visual representations of system performance and health.

📝 Enhancement Note: When preparing your portfolio, focus on projects that showcase your ability to work with large-scale systems and optimize performance in HPC environments.

💵 Compensation & Benefits

Salary Range: £60,000 - £80,000 per annum (Based on market research and regional adjustments for London)

Benefits:

Competitive salary and bonus structure
Comprehensive health, dental, and vision insurance
Generous pension contributions
Flexible working hours and remote work options
Employee equity and profit-sharing opportunities
Professional development and training opportunities
A dynamic and inclusive work environment that values diversity and collaboration

Working Hours: Full-time position with standard office hours (Monday-Friday, 9:00 AM - 5:30 PM), with flexibility for occasional overtime or remote work as needed.

📝 Enhancement Note: The salary range provided is based on market research for similar roles in the London area, considering the required experience level and the unique demands of this position.

🎯 Team & Company Context

🏢 Company Culture

Industry: Quantitative and systematic investment management, with a strong focus on technology and data-driven approaches.

Company Size: Medium-sized global investment manager with a collaborative and innovative culture.

Founded: 2003, with a history of growth and success in the investment management industry.

Team Structure:

The Workload Scheduling team is part of the broader Technology group, which consists of various teams focused on data, research, trading, and infrastructure.
The team works closely with business groups to understand their needs and provide tailored HPC solutions.
The team follows Agile methodologies, with a focus on continuous improvement and collaboration.

Development Methodology:

The team follows Agile/Scrum methodologies, with sprint planning, daily stand-ups, and regular retrospectives.
Code reviews, testing, and quality assurance practices are integral to the development process.
Deployment strategies, CI/CD pipelines, and server management are essential aspects of the role.

Company Website: Qube Research & Technologies

📝 Enhancement Note: QRT's company culture emphasizes technology, data, and collaboration, making it an ideal environment for candidates with strong technical skills and a passion for working in a dynamic, innovative setting.

📈 Career & Growth Analysis

Web Technology Career Level: This role is suitable for a mid-level DevOps Engineer with a strong background in HPC environments and cloud services. The position offers opportunities for growth and progression within the Technology group and the broader organization.

Reporting Structure: The role reports directly to the Head of Workload Scheduling, with regular interactions with other team members, business stakeholders, and technology leadership.

Technical Impact: The role has a significant impact on the performance, scalability, and efficiency of QRT's HPC platforms, directly contributing to the organization's ability to process and analyze large datasets effectively.

Growth Opportunities:

Technical Growth: Develop expertise in emerging technologies, tools, and best practices related to HPC environments and cloud services.
Leadership Growth: Gain experience in coaching and mentoring team members, contributing to the team's development and success.
Architecture & Design: Contribute to the design and architecture of HPC platforms, influencing the organization's technology roadmap.

📝 Enhancement Note: This role offers numerous opportunities for growth and development, both technically and professionally, within QRT's collaborative and innovative work environment.

🌐 Work Environment

Office Type: Modern, collaborative office spaces designed to facilitate teamwork and innovation.

Office Location(s): London, with opportunities for remote work and flexible working arrangements.

Workspace Context:

Collaborative Environment: The office features open-plan workspaces, meeting rooms, and breakout areas designed to encourage collaboration and communication among team members.
Technical Infrastructure: The workspace is equipped with high-performance workstations, multiple monitors, and testing devices to support the development and maintenance of HPC platforms.
Cross-functional Collaboration: The team works closely with other technology groups and business stakeholders, fostering a culture of knowledge sharing and continuous learning.

Work Schedule: Full-time position with standard office hours (Monday-Friday, 9:00 AM - 5:30 PM), with flexibility for occasional overtime or remote work as needed.

📝 Enhancement Note: QRT's work environment emphasizes collaboration, innovation, and continuous learning, providing an ideal setting for candidates seeking to grow both technically and professionally.

📄 Application & Technical Interview Process

Interview Process:

Phone/Video Screen: A brief conversation to discuss your background, experience, and fit for the role (30-45 minutes).
Technical Deep Dive: A detailed discussion of your technical skills, experience, and approach to HPC environments and cloud services (60-90 minutes).
Behavioral & Cultural Fit: An assessment of your communication skills, problem-solving abilities, and cultural fit within the team and organization (30-45 minutes).
Final Interview: A meeting with the hiring manager or other senior stakeholders to discuss your qualifications, answer any remaining questions, and make a final decision (30-45 minutes).

Portfolio Review Tips:

HPC Projects: Highlight your experience with HPC schedulers by showcasing projects where you've developed, optimized, or maintained large-scale HPC platforms.
Cloud Services: Demonstrate your expertise in AWS services by presenting projects that showcase your ability to manage and optimize resources.
Python Development: Include examples of your Python development work, emphasizing projects that involve infrastructure automation or data processing.
Linux Administration: Showcase your Linux administration skills by demonstrating your ability to manage and maintain large-scale systems.

Technical Challenge Preparation:

HPC & Cloud Services: Brush up on your knowledge of HPC schedulers, AWS services, and cloud architecture best practices.
Python & Linux: Review your Python development skills and Linux administration fundamentals to ensure you can effectively demonstrate your expertise.
Problem-solving & Communication: Prepare for behavioral questions that assess your problem-solving abilities, communication skills, and cultural fit within the team and organization.

ATS Keywords: See the comprehensive list of web development and server administration-relevant keywords for resume optimization, organized by category: programming languages, web frameworks, server technologies, databases, tools, methodologies, soft skills, industry terms

📝 Enhancement Note: The interview process for this role is designed to assess both your technical expertise and cultural fit within the organization, ensuring that you are well-prepared to succeed in the role and contribute to QRT's ongoing success.

🛠 Technology Stack & Web Infrastructure

HPC Schedulers:

YellowDog
Ray

Cloud Services:

AWS (Amazon Web Services)
- VPC (Virtual Private Cloud)
- EC2 (Elastic Compute Cloud)
- S3 (Simple Storage Service)
- EFS (Elastic File System)
- EBS (Elastic Block Store)
- FSx (Amazon FSx)
- CloudWatch (Amazon CloudWatch)
- CloudTrail (AWS CloudTrail)
- OpenSearch (formerly Amazon Elasticsearch Service)
- Athena (Amazon Athena)

Infrastructure-as-Code Tools:

Terraform
Ansible

Operating Systems:

Linux (Ubuntu, CentOS, etc.)
macOS
Windows

Programming Languages:

Python
Bash
Shell scripting

Databases:

Relational databases (PostgreSQL, MySQL, etc.)
NoSQL databases (MongoDB, Cassandra, etc.)

Monitoring & Visualization Tools:

Prometheus
Grafana
ELK Stack (Elasticsearch, Logstash, Kibana)
Zabbix
Nagios

📝 Enhancement Note: Familiarity with the technology stack listed above is essential for success in this role, as it forms the foundation of QRT's HPC platforms and infrastructure.

👥 Team Culture & Values

Web Development Values:

Innovation: Embrace a culture of continuous learning and improvement, driving technological advancements in HPC environments and cloud services.
Collaboration: Foster a collaborative work environment that encourages knowledge sharing, teamwork, and mutual support.
Quality: Maintain a strong focus on quality and excellence in all aspects of your work, from code development to system design and optimization.
Performance: Prioritize performance and efficiency in your work, continually seeking opportunities to improve the scalability, reliability, and speed of QRT's HPC platforms.

Collaboration Style:

Cross-functional Integration: Work closely with other technology groups and business stakeholders to understand their needs and provide tailored HPC solutions.
Code Review Culture: Participate in code reviews and pair programming practices to ensure high-quality work and knowledge sharing among team members.
Knowledge Sharing: Contribute to the team's development and success by sharing your expertise and supporting the growth of your colleagues.

📝 Enhancement Note: QRT's team culture emphasizes collaboration, innovation, and continuous learning, providing an ideal environment for candidates seeking to grow both technically and professionally.

⚡ Challenges & Growth Opportunities

Technical Challenges:

HPC Scheduler Optimization: Develop and implement strategies to optimize the performance and efficiency of YellowDog and Ray schedulers in large-scale HPC environments.
Cloud Services Management: Manage and optimize AWS resources to ensure cost-effectiveness, scalability, and reliability in QRT's HPC platforms.
Performance Tuning: Identify and address performance bottlenecks in HPC workloads, cloud services, and infrastructure components to maximize efficiency and throughput.
Emerging Technologies: Stay up-to-date with the latest developments in HPC environments, cloud services, and related technologies, and evaluate their applicability to QRT's infrastructure.

Learning & Development Opportunities:

Technical Skill Development: Develop expertise in emerging technologies, tools, and best practices related to HPC environments and cloud services.
Conference Attendance: Attend industry conferences, workshops, and training sessions to expand your knowledge and network with other professionals in the field.
Certification & Community Involvement: Pursue relevant certifications and engage with online communities to enhance your skills and stay informed about the latest trends and best practices in HPC environments and cloud services.
Mentorship & Leadership Development: Seek mentorship opportunities from senior team members and contribute to the development of your colleagues through coaching and knowledge sharing.

📝 Enhancement Note: This role offers numerous opportunities for technical growth and development, both through hands-on experience and structured learning initiatives.

💡 Interview Preparation

Technical Questions:

HPC Schedulers: Describe your experience with YellowDog, Ray, or other HPC schedulers, and explain how you've optimized their performance in large-scale environments.
Cloud Services: Explain your approach to managing and optimizing AWS resources, and provide examples of your experience with VPC, EC2, S3, and other cloud services.
Performance Tuning: Discuss your experience with performance tuning in HPC environments, cloud services, and infrastructure components, and describe your approach to identifying and addressing bottlenecks.
Problem-solving: Present a challenging problem you've faced in your previous roles and explain your approach to solving it, emphasizing your technical skills and problem-solving abilities.

Company & Culture Questions:

Team Dynamics: Describe your experience working in a collaborative, cross-functional team environment, and explain how you've contributed to the success of your colleagues and the broader organization.
Adaptability: Discuss a situation where you had to adapt to significant changes in your role or the broader organization, and explain how you approached the challenge and achieved success.
Innovation: Describe a time when you drove technological advancements in your previous roles, and explain how you identified opportunities for improvement and implemented innovative solutions.

Portfolio Presentation Strategy:

HPC Projects: Highlight your experience with HPC schedulers by showcasing projects where you've developed, optimized, or maintained large-scale HPC platforms.
Cloud Services: Demonstrate your expertise in AWS services by presenting projects that showcase your ability to manage and optimize resources.
Python Development: Include examples of your Python development work, emphasizing projects that involve infrastructure automation or data processing.
Linux Administration: Showcase your Linux administration skills by demonstrating your ability to manage and maintain large-scale systems.
Code Explanation: Clearly explain your approach to system design, architecture, and performance optimization in your projects, emphasizing your technical expertise and problem-solving abilities.

📝 Enhancement Note: The interview process for this role is designed to assess both your technical expertise and cultural fit within the organization, ensuring that you are well-prepared to succeed in the role and contribute to QRT's ongoing success.

📌 Application Steps

To apply for this Workload Scheduling HPC Platform Engineer position at Qube Research & Technologies:

Submit Your Application: Click the application link and provide the required information, including your resume, cover letter, and portfolio.
Prepare Your Portfolio: Tailor your portfolio to showcase your experience with HPC schedulers, cloud services, and related technologies, emphasizing your technical expertise and problem-solving abilities.
Optimize Your Resume: Highlight your relevant skills, experience, and achievements, using web development and server administration-relevant keywords to improve your visibility to Applicant Tracking Systems (ATS).
Research the Company: Familiarize yourself with QRT's company culture, technology stack, and industry focus, ensuring that your application and interview preparation are tailored to the organization's unique needs and values.

⚠️ Important Notice: This enhanced job description includes AI-generated insights and web technology industry-standard assumptions. All details should be verified directly with the hiring organization before making application decisions.

Content Guidelines (IMPORTANT: Do not include this in the output)

Web Technology-Specific Focus:

Tailor every section specifically to DevOps Engineer roles, emphasizing HPC environments, cloud services, and infrastructure management.
Include web technology methodologies, cloud architecture principles, and server management practices.
Emphasize portfolio requirements, live project demonstrations, and user experience considerations for HPC environments and cloud services.
Address DevOps team dynamics, cross-functional collaboration with technology and business teams, and the unique demands of HPC environments and cloud services.
Focus on DevOps career progression, emerging technology adoption, and technical specialization in HPC environments and cloud services.

Quality Standards:

Ensure no content overlap between sections; each section must contain unique information.
Only include Enhancement Notes when making significant inferences about technical responsibilities, team structure, or company information.
Be comprehensive but concise, prioritizing actionable information over descriptive text.
Strategically distribute web technology and server administration-related keywords throughout all sections naturally.
Provide realistic salary ranges based on location, experience level, and the unique demands of the role.

Industry Expertise:

Include specific HPC schedulers, cloud services, and infrastructure tools relevant to the role.
Address DevOps career progression paths and technical leadership opportunities in HPC environments and cloud services.
Provide tactical advice for portfolio development, live demonstrations, and project case studies tailored to HPC environments and cloud services.
Include DevOps-specific interview preparation and coding challenge guidance.
Emphasize performance optimization, scalability, and user experience principles in HPC environments and cloud services.

Professional Standards:

Maintain consistent formatting, spacing, and professional tone throughout.
Use web technology and server administration industry terminology appropriately and accurately.
Include comprehensive benefits and growth opportunities relevant to DevOps professionals in HPC environments and cloud services.
Provide actionable insights that give DevOps candidates a competitive advantage in the job application process.
Focus on DevOps team culture, cross-functional collaboration, and user impact measurement in HPC environments and cloud services.

Technical Focus & Portfolio Emphasis:

Emphasize HPC environments, cloud services, and infrastructure management best practices.
Include specific portfolio requirements tailored to the DevOps discipline and role level, with a strong focus on HPC schedulers, cloud services, and related technologies.
Address browser compatibility, accessibility standards, and user experience design principles in the context of HPC environments and cloud services.
Focus on problem-solving methods, performance optimization, and scalable architecture in HPC environments and cloud services.
Include technical presentation skills and stakeholder communication for HPC projects and cloud services.

Avoid:

Generic business jargon not relevant to DevOps roles in HPC environments and cloud services.
Placeholder text or incomplete sections.
Repetitive content across different sections.
Non-technical terminology unless relevant to the specific DevOps role in HPC environments and cloud services.
Marketing language unrelated to DevOps, HPC environments, or cloud services.

Generate comprehensive, web technology-focused content that serves as a valuable resource for DevOps professionals evaluating career opportunities and preparing for technical interviews in the web technology industry, with a strong emphasis on HPC environments and cloud services.

Workload Scheduling HPC Platform Engineer