Senior Machine Learning Infrastructure Engineer

Halter
Full-time, Auckland, New Zealand

📍 Job Overview

  • Job Title: Senior Machine Learning Infrastructure Engineer
  • Company: Halter
  • Location: Auckland, Auckland, New Zealand
  • Job Type: On-site, Full-time
  • Category: Machine Learning Infrastructure Engineer
  • Date Posted: 2025-07-24
  • Experience Level: 5-10 years
  • Remote Status: On-site

🚀 Role Summary

  • Key Responsibilities: Design and maintain scalable ML pipelines, build optimized model serving infrastructure, manage data pipelines, optimize model performance, and collaborate with data scientists to productionize models.
  • Key Skills: Machine Learning, Cloud Platforms, Containerization, MLOps, Data Pipelines, Python, SQL, Infrastructure as Code, Monitoring Tools, CI/CD, Database Management, Data Warehousing, Stream Processing, Model Registry, Performance Optimization, Agile Development.

📝 Enhancement Note: This role requires a strong background in machine learning infrastructure and data engineering to ensure the smooth operation and scalability of Halter's predictive systems.

💻 Primary Responsibilities

  • ML Pipeline Design & Maintenance:

    • Design and maintain scalable ML pipelines for training, validation, and inference.
    • Implement MLOps practices including automated testing, deployment, and rollback systems.
  • Model Serving Infrastructure:

    • Build and optimize model serving infrastructure with proper monitoring, logging, and alerting.
    • Optimize model performance and resource utilization across different environments.
  • Data Pipeline Management:

    • Manage data pipelines to ensure data quality, lineage, and governance.
    • Collaborate with data scientists to integrate research experiments into production systems.
  • Collaboration & Mentorship:

    • Work closely with data scientists to productionize models and grow the team's capabilities.
    • Provide technical guidance and mentorship to junior team members.

📝 Enhancement Note: This role involves a high degree of collaboration with data scientists and other technical teams, requiring strong communication and problem-solving skills.

🎓 Skills & Qualifications

Education: Bachelor's degree in Computer Science, Mathematics, Statistics, or a related field. A Master's degree would be an asset.

Experience: 5 to 10 years of proven experience in a similar role, with a strong portfolio of machine learning infrastructure projects.

Required Skills:

  • Strong proficiency in cloud platforms (AWS, GCP, Azure) and containerization (Docker, Kubernetes).
  • Experience with ML frameworks (TensorFlow, PyTorch) and serving systems.
  • Knowledge of orchestration tools and infrastructure as code (Terraform).
  • Proficiency in Python, SQL, and version control systems (Git).
  • Experience with monitoring and observability tools.
  • Understanding of CI/CD pipelines and data engineering best practices.

Preferred Skills:

  • Experience with stream processing and real-time data systems (Kafka, Spark Streaming).
  • Familiarity with model registry and experiment tracking systems.
  • Knowledge of performance optimization and cost management in cloud environments.

📝 Enhancement Note: While not required, experience in the agriculture or IoT industries could provide a unique perspective and added value to this role.

📊 Portfolio & Project Requirements

Portfolio Essentials:

  • ML Infrastructure Projects: Include examples of scalable ML pipelines you've designed and maintained, highlighting the technologies and tools used.
  • Model Serving Infrastructure: Showcase optimized model serving infrastructure projects, demonstrating your ability to ensure proper monitoring, logging, and alerting.
  • Data Pipeline Management: Highlight projects where you've managed data pipelines, ensuring data quality, lineage, and governance.
  • Collaboration & Mentorship: Provide examples of successful collaborations with data scientists and other technical teams, demonstrating your ability to provide technical guidance and mentorship.

Technical Documentation:

  • Code Quality: Include code samples and documentation demonstrating your commitment to code quality, commenting, and best practices.
  • Version Control: Showcase your experience with version control systems (Git) and explain your approach to collaborative development.
  • Deployment Processes: Detail your experience with deployment processes, automated testing, and CI/CD pipelines.

📝 Enhancement Note: Given the collaborative nature of this role, it's essential to demonstrate your ability to work effectively with data scientists and other technical teams in your portfolio.

💵 Compensation & Benefits

Salary Range: The salary range for a Senior Machine Learning Infrastructure Engineer in Auckland, New Zealand, is approximately NZD 150,000 - 200,000 per year, based on industry standards and the level of experience required for this role.

Benefits:

  • Parental Leave: 6 months of fully paid parental leave for primary caregivers, 4 weeks for secondary caregivers.
  • Self-Development Budget: An annual NZD 1,000 budget for personal growth and development.
  • Wellness Leave: Unlimited paid annual leave and wellness leave to support work-life balance.
  • Health Insurance: Southern Cross Health Insurance to support your well-being.
  • Employee Stock Ownership Plan: Halter offers an attractive remuneration package, including an employee stock ownership plan.

Working Hours: This role requires a standard 40-hour workweek, with flexibility for deployment windows, maintenance, and project deadlines.

📝 Enhancement Note: The salary range provided is an estimate based on market research and may vary depending on the candidate's experience and skills.

🎯 Team & Company Context

🏢 Company Culture

Industry: Halter operates in the agriculture technology industry, focusing on enabling farmers and graziers to run more productive and sustainable operations.

Company Size: Halter is a high-growth technology scale-up with a team of around 100 employees, providing ample opportunities for growth and impact.

Founded: Halter was founded in 2016, with a mission to make a difference in the world by revolutionizing grazing practices and transforming the agriculture industry.

Team Structure:

  • Halter's engineering team consists of frontend, backend, and infrastructure engineers, as well as data scientists and machine learning engineers.
  • The team follows an agile development practice, working in cross-functional development teams to deliver high-quality products.
  • Collaboration and knowledge sharing are encouraged, with a strong focus on continuous learning and growth.

Development Methodology:

  • Halter uses Agile methodologies, including Scrum, to manage development processes and ensure efficient project delivery.
  • Code reviews, testing, and quality assurance practices are integral to Halter's development process.
  • Deployment strategies, CI/CD pipelines, and server management are essential aspects of Halter's infrastructure and development processes.

Company Website: Halter HQ

📝 Enhancement Note: Halter's culture is built around growth, high performance, and a genuine connection to its mission. The company values innovation, collaboration, and a strong customer focus.

📈 Career & Growth Analysis

Career Level: This role is at the senior level, requiring a high degree of technical expertise and leadership. The ideal candidate will have a solid track record in machine learning infrastructure and data engineering, with a deep understanding of the fundamentals and design systems behind building reliable, scalable, and fit-for-purpose machine learning models and infrastructure.

Reporting Structure: This role reports directly to the Head of Engineering, with a dotted-line reporting structure to the Head of Data Science and Machine Learning. The Senior Machine Learning Infrastructure Engineer will work closely with data scientists, backend engineers, and other technical teams to ensure the smooth operation and scalability of Halter's predictive systems.

Technical Impact: The Senior Machine Learning Infrastructure Engineer will have a significant impact on Halter's data products and predictive systems. Their work will enable R&D on highly complex systems, unlock untapped value, and ensure that Halter's ML initiatives run at maximum velocity with minimal risk.

Growth Opportunities:

  • Technical Growth: Halter offers ample opportunities for technical growth, with a strong focus on emerging technologies and continuous learning.
  • Leadership Development: As Halter continues to grow, there will be opportunities for the Senior Machine Learning Infrastructure Engineer to take on more significant leadership roles, mentoring junior team members and driving technical strategy.
  • Architecture Decisions: The Senior Machine Learning Infrastructure Engineer will have the opportunity to make critical architecture decisions, shaping Halter's technical direction and ensuring the scalability and reliability of its predictive systems.

📝 Enhancement Note: Halter's high-growth environment provides numerous opportunities for career progression and technical leadership, with a strong emphasis on continuous learning and development.

🌐 Work Environment

Office Type: Halter's office is state-of-the-art, dog-friendly, and centrally located in Auckland city. The office has been thoughtfully designed to foster collaboration, creativity, and a high-performing team culture.

Office Location(s): Halter's office is located at 320 Queen Street, Auckland, New Zealand.

Workspace Context:

  • Collaborative Workspace: Halter's open-plan office encourages collaboration and knowledge sharing, with ample space for team meetings and workshops.
  • Development Tools: Halter provides state-of-the-art development tools, including multiple monitors and testing devices, to ensure optimal productivity.
  • Cross-Functional Collaboration: Halter's cross-functional teams work closely together, with regular stand-ups, sprint planning, and retrospectives to ensure alignment and continuous improvement.

Work Schedule: Halter operates on an office-first policy, with a high-trust culture that allows for flexibility when needed. The standard workweek is 40 hours, with flexibility for deployment windows, maintenance, and project deadlines.

📝 Enhancement Note: Halter's office-first approach fosters a strong team culture, with ample opportunities for collaboration, learning, and growth. The high-trust environment allows for flexibility and work-life balance.

📄 Application & Technical Interview Process

Interview Process:

  1. Technical Phone Screen: A 30-minute phone screen to assess your technical skills and cultural fit.
  2. On-site Technical Interview: A 2-hour on-site interview consisting of a technical deep dive, a system design discussion, and a live coding challenge.
  3. Behavioral Interview: A 30-minute behavioral interview to assess your problem-solving skills, communication, and cultural fit.
  4. Final Evaluation: A final evaluation with Halter's leadership team to discuss your technical impact and cultural fit.

Portfolio Review Tips:

  • Portfolio Structure: Organize your portfolio by project, highlighting your role and the technologies used in each.
  • Code Quality: Demonstrate your commitment to code quality, commenting, and best practices by including code samples and documentation.
  • Technical Challenges: Prepare for live coding challenges and be ready to explain your thought process and problem-solving approach.

Technical Challenge Preparation:

  • Cloud Platforms: Brush up on your knowledge of cloud platforms (AWS, GCP, Azure) and containerization (Docker, Kubernetes).
  • MLOps & Data Pipelines: Familiarize yourself with MLOps practices, data pipeline management, and version control systems (Git).
  • System Design & Architecture: Prepare for system design discussions and architecture decision-making scenarios.
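
As a warm-up for the MLOps item above, one possible practice exercise is a rollback gate that compares a newly deployed model version against the current baseline. The sketch below is only an illustration under stated assumptions: the `ModelMetrics` fields, the `should_rollback` function, and the threshold values are all hypothetical and are not taken from Halter's systems or interview material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelMetrics:
    """Aggregated serving metrics for one model version (hypothetical schema)."""
    error_rate: float      # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float  # 95th percentile latency in milliseconds


def should_rollback(
    baseline: ModelMetrics,
    candidate: ModelMetrics,
    max_error_increase: float = 0.01,  # assumed tolerance, absolute
    max_latency_ratio: float = 1.2,    # assumed tolerance, relative
) -> bool:
    """Return True if the candidate regresses beyond either tolerance."""
    error_regressed = (
        candidate.error_rate - baseline.error_rate > max_error_increase
    )
    latency_regressed = (
        candidate.p95_latency_ms > baseline.p95_latency_ms * max_latency_ratio
    )
    return error_regressed or latency_regressed


if __name__ == "__main__":
    baseline = ModelMetrics(error_rate=0.010, p95_latency_ms=80.0)
    healthy = ModelMetrics(error_rate=0.012, p95_latency_ms=85.0)
    degraded = ModelMetrics(error_rate=0.050, p95_latency_ms=82.0)
    print(should_rollback(baseline, healthy))   # False
    print(should_rollback(baseline, degraded))  # True
```

In an interview setting, it may be worth being ready to discuss where such thresholds would come from and how the check would plug into a CI/CD pipeline or monitoring alert.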

📝 Enhancement Note: Halter's interview process is designed to assess your technical skills, problem-solving abilities, and cultural fit. The portfolio review and technical challenge preparation tips provided will help you make a strong impression and demonstrate your qualifications for the role.

🛠 Technology Stack & Infrastructure

Cloud Platforms:

  • AWS: Halter uses AWS for cloud infrastructure, with expertise in services such as EC2, S3, RDS, and Lambda.
  • GCP: Halter has experience with GCP, with expertise in services such as Compute Engine, Cloud Storage, BigQuery, and Cloud Functions.
  • Azure: Halter has experience with Azure, with expertise in services such as Virtual Machines, Azure Storage, Azure SQL Database, and Azure Functions.

Containerization:

  • Docker: Halter uses Docker for containerization, with expertise in building, deploying, and managing Docker images and containers.
  • Kubernetes: Halter uses Kubernetes for orchestration, with expertise in cluster management, deployment, and scaling.

ML Frameworks & Serving Systems:

  • TensorFlow: Halter uses TensorFlow for machine learning, with expertise in building, training, and deploying ML models.
  • PyTorch: Halter has experience with PyTorch, with expertise in building and training ML models.
  • MLflow: Halter uses MLflow for model tracking, with expertise in managing model versions, experiments, and deployments.

Data Engineering Tools:

  • Apache Kafka: Halter uses Kafka for real-time data streaming, with expertise in producing, consuming, and processing data streams.
  • Apache Spark: Halter uses Spark for data processing, with expertise in batch and stream processing, as well as machine learning workloads.
  • Apache Airflow: Halter uses Airflow for orchestrating data pipelines, with expertise in creating, scheduling, and monitoring workflows.

Infrastructure as Code (IaC):

  • Terraform: Halter uses Terraform for IaC, with expertise in building, managing, and versioning infrastructure as code.
  • Ansible: Halter uses Ansible for configuration management, with expertise in automating deployment and configuration tasks.

Monitoring & Observability Tools:

  • Prometheus: Halter uses Prometheus for monitoring, with expertise in collecting, storing, and alerting on time-series data.
  • Grafana: Halter uses Grafana for visualization, with expertise in creating dashboards and visualizing data from Prometheus and other data sources.

📝 Enhancement Note: Halter's technology stack is designed to ensure the scalability, reliability, and performance of its predictive systems. The tools and technologies listed are essential for the Senior Machine Learning Infrastructure Engineer role.

👥 Team Culture & Values

Engineering Values:

  • Innovation: Halter values innovation and encourages its team members to think creatively and challenge the status quo.
  • Collaboration: Halter fosters a culture of collaboration, with a strong emphasis on knowledge sharing, mentoring, and continuous learning.
  • Customer Focus: Halter prioritizes the customer experience, ensuring that its products and services meet the needs of its users.
  • Performance Optimization: Halter is committed to optimizing the performance of its predictive systems, with a focus on scalability, efficiency, and cost management.

Collaboration Style:

  • Code Review Culture: Halter emphasizes code reviews and pair programming, with a focus on knowledge sharing, learning, and quality.
  • Knowledge Sharing: Halter encourages knowledge sharing, with regular technical talks, workshops, and brown bag sessions.

📝 Enhancement Note: Halter's culture is built around growth, high performance, and a genuine connection to its mission. The company values innovation, collaboration, and a strong customer focus, with a commitment to optimizing the performance of its predictive systems.

🌐 Challenges & Growth Opportunities

Technical Challenges:

  • Scalability: Halter's predictive systems must be designed to scale, with a focus on horizontal and vertical scalability, load balancing, and auto-scaling.
  • Performance Optimization: Halter is committed to optimizing the performance of its predictive systems, with a focus on resource utilization, cost management, and efficiency.
  • Data Quality & Governance: Halter must ensure the quality, lineage, and governance of its data, with a focus on data validation, data cleansing, and data privacy.
  • Emerging Technologies: Halter is committed to staying at the forefront of emerging technologies, with a focus on continuous learning, research, and development.

Learning & Development Opportunities:

  • Conference Attendance: Halter supports its team members in attending industry conferences, with a focus on knowledge sharing, networking, and professional development.
  • Certification & Community Involvement: Halter encourages its team members to pursue relevant certifications and engage with the broader technical community, with a focus on knowledge sharing, learning, and growth.

📝 Enhancement Note: Halter's high-growth environment provides numerous opportunities for career progression and technical leadership, with a strong emphasis on continuous learning and development. The technical challenges and learning opportunities listed are essential for the Senior Machine Learning Infrastructure Engineer role.

💡 Interview Preparation

Technical Questions:

  • Cloud Platforms: Prepare for questions related to cloud platforms (AWS, GCP, Azure), with a focus on infrastructure as code, containerization, and deployment strategies.
  • MLOps & Data Pipelines: Brush up on your knowledge of MLOps practices, data pipeline management, and version control systems (Git).
  • System Design & Architecture: Prepare for system design discussions and architecture decision-making scenarios, with a focus on scalability, performance, and cost management.

Company & Culture Questions:

  • Company Culture: Research Halter's company culture, values, and mission, and be prepared to discuss how you align with the company's goals and objectives.
  • Team Dynamics: Familiarize yourself with Halter's team structure, development methodologies, and collaborative work environment.
  • Customer Focus: Prepare for questions related to Halter's customer focus, user experience, and the impact of your work on the company's products and services.

Portfolio Presentation Strategy:

  • Live Demonstration: Prepare a live demonstration of your portfolio, highlighting your technical skills, problem-solving abilities, and attention to detail.
  • Code Explanation: Be ready to explain your code, architecture decisions, and the thought process behind your projects.
  • User Experience: Prepare to discuss the user experience of your projects, with a focus on accessibility, performance, and responsiveness.

📝 Enhancement Note: Halter's interview process is designed to assess your technical skills, problem-solving abilities, and cultural fit. The interview preparation tips provided will help you make a strong impression and demonstrate your qualifications for the role.

📌 Application Steps

To apply for this Senior Machine Learning Infrastructure Engineer position at Halter:

  1. Customize Your Portfolio: Tailor your portfolio to highlight your relevant experience, technical skills, and achievements in machine learning infrastructure, data engineering, and cloud platforms.
  2. Optimize Your Resume: Update your resume to emphasize your experience, skills, and accomplishments in machine learning infrastructure, data engineering, and cloud platforms.
  3. Prepare for Technical Interviews: Brush up on your technical skills, prepare for live coding challenges, and practice explaining your thought process and problem-solving approach.
  4. Research Halter: Familiarize yourself with Halter's company culture, values, and mission, and be prepared to discuss how you align with the company's goals and objectives.

⚠️ Important Notice: This enhanced job description includes AI-generated insights and web development/server administration industry-standard assumptions. All details should be verified directly with the hiring organization before making application decisions.

Application Requirements

Strong proficiency in cloud platforms and experience with ML frameworks are essential. A solid track record in machine learning engineering and an understanding of data engineering best practices are required.