DevOps Engineer (AI/GPU Infrastructure)
📍 Job Overview
- Job Title: DevOps Engineer (AI/GPU Infrastructure)
- Company: Business Alliance
- Location: Ramallah, West Bank, Palestinian Territory
- Job Type: On-site
- Category: DevOps Engineer
- Date Posted: 2025-07-24
- Experience Level: 2-5 years
🚀 Role Summary
- Design, Manage, and Optimize Kubernetes Clusters: Design, manage, and optimize Kubernetes clusters, using ArgoCD for GitOps pipelines.
- Build and Maintain CI/CD Workflows: Develop and maintain robust CI/CD workflows, infrastructure automation, and system observability to ensure efficient and reliable deployments.
- Administer and Secure Linux Systems and Cloud Environments: Administer and secure Linux systems and cloud environments such as Oracle Cloud Infrastructure (OCI).
- Manage and Troubleshoot Databases: Manage and troubleshoot PostgreSQL and MySQL databases to keep them performant and reliable.
- Support and Maintain GPU-based AI Infrastructure: Support and maintain GPU-based AI environments, including NVIDIA GPUs, CUDA drivers, and AI frameworks like PyTorch and TensorFlow.
📝 Enhancement Note: This role requires a strong background in cloud-native technologies and a solid understanding of GPU infrastructure for AI/ML workloads to scale the company's infrastructure and support GPU-enabled environments.
💻 Primary Responsibilities
- Cluster Management: Design, manage, and optimize Kubernetes clusters and GitOps pipelines using ArgoCD to ensure efficient and reliable deployments.
- CI/CD Pipeline Development: Build and maintain CI/CD workflows, infrastructure automation, and system observability to streamline the development and deployment process.
- Linux System Administration: Administer and secure Linux systems and cloud environments, ensuring the stability and security of the infrastructure.
- Database Management: Manage and troubleshoot PostgreSQL and MySQL databases, optimizing performance and ensuring data integrity.
- GPU Infrastructure Support: Support and maintain GPU-based AI infrastructure, including NVIDIA GPUs, CUDA drivers, and AI frameworks like PyTorch and TensorFlow, to enable AI/ML workloads.
📝 Enhancement Note: The primary responsibilities of this role focus on managing and optimizing infrastructure, ensuring efficient deployments, and supporting GPU-based AI environments to facilitate the company's digital healthcare solutions.
🎓 Skills & Qualifications
Education: A Bachelor's degree in Computer Science, Information Technology, Computer Engineering, or a related field is required for this role.
Experience: A minimum of 3 years of experience in a similar role is necessary to apply for this position.
Required Skills:
- Kubernetes: Proven experience in designing, managing, and optimizing Kubernetes clusters and GitOps pipelines using ArgoCD.
- Docker: Familiarity with Docker for containerization and deployment of applications.
- Linux System Administration: Strong Linux system administration and troubleshooting skills, including experience with cloud environments like OCI.
- Database Management: Experience managing and troubleshooting PostgreSQL and MySQL databases, ensuring optimal performance and data integrity.
- GPU Infrastructure: Working knowledge of GPU infrastructure and AI/ML workload environments, including NVIDIA GPUs, CUDA stack, GPU provisioning, and monitoring.
- AI Frameworks: Basic setup and tuning experience with AI frameworks like PyTorch and TensorFlow.
- Cloud Providers: Experience with cloud providers such as OCI or other relevant platforms.
- CI/CD: Strong understanding of CI/CD pipelines and experience with tools like Jenkins, GitLab CI/CD, or similar platforms.
Preferred Skills:
- Infrastructure as Code (IaC): Familiarity with IaC tools such as Terraform or CloudFormation to automate infrastructure provisioning and management.
- Monitoring Tools: Experience with monitoring tools like Prometheus, Grafana, or ELK Stack to ensure system observability and performance optimization.
- AI/ML Experience: Hands-on experience with AI/ML workloads, including data preprocessing, model training, and deployment.
- Agile Methodologies: Familiarity with Agile methodologies and experience working in an Agile/Scrum environment.
📝 Enhancement Note: The required and preferred skills for this role emphasize cloud-native technologies, GPU infrastructure, and AI/ML workloads, reflecting the company's focus on digital healthcare solutions and AI/GPU-based infrastructure.
📊 Web Portfolio & Project Requirements
Portfolio Essentials:
- Kubernetes Cluster Demonstration: Include a project showcasing your Kubernetes cluster design, management, and optimization using ArgoCD for GitOps pipelines.
- CI/CD Pipeline Project: Present a project that demonstrates your ability to build and maintain CI/CD workflows, infrastructure automation, and system observability.
- Linux System Administration Case Study: Highlight a project or case study that showcases your Linux system administration skills and experience managing cloud environments.
- Database Management Demonstration: Include a project or case study that demonstrates your ability to manage and troubleshoot PostgreSQL and MySQL databases, ensuring optimal performance and data integrity.
- GPU-based AI Infrastructure Project: Present a project that showcases your support and maintenance of GPU-based AI infrastructure, including NVIDIA GPUs, CUDA drivers, and AI frameworks like PyTorch and TensorFlow.
Technical Documentation:
- Code Quality and Documentation: Ensure your code is well-documented, follows best practices, and adheres to coding standards relevant to the project.
- Version Control and Deployment Processes: Demonstrate your experience with version control systems like Git and explain your deployment processes, including CI/CD pipelines and server configuration.
- Testing Methodologies and Performance Metrics: Include information about your testing methodologies, performance metrics, and optimization techniques to ensure the quality and efficiency of your projects.
📝 Enhancement Note: The portfolio requirements for this role focus on demonstrating your technical skills and experience in managing and optimizing infrastructure, supporting GPU-based AI environments, and ensuring efficient deployments.
💵 Compensation & Benefits
Salary Range: The salary range for this role is estimated to be between $45,000 and $65,000 per year, based on regional market standards for DevOps Engineers with 2-5 years of experience in the West Bank, Palestinian Territory.
Benefits:
- Health Insurance: Comprehensive health insurance plan for employees and their dependents.
- Retirement Savings: Retirement savings plan with employer matching contributions.
- Paid Time Off: Competitive paid time off policy, including vacation, sick leave, and holidays.
- Professional Development: Opportunities for professional development, including training, workshops, and conference attendance.
- Company Culture: A dynamic and collaborative work environment that fosters innovation and growth.
Working Hours: The standard workweek is 40 hours, with flexible scheduling to accommodate project deadlines and maintenance windows.
📝 Enhancement Note: The salary range and benefits for this role are estimates based on regional market standards for DevOps Engineers with 2-5 years of experience in the West Bank, Palestinian Territory. Actual compensation may vary based on the candidate's qualifications and the company's internal policies.
🎯 Team & Company Context
🏢 Company Culture
Industry: Business Alliance is a high-tech company offering digital healthcare solutions, focusing on AI/ML workloads and GPU-based infrastructure.
Company Size: As a mid-sized company, Business Alliance offers a collaborative and dynamic work environment that encourages innovation and growth.
Founded: Business Alliance was founded in 2015, with a mission to revolutionize the digital healthcare industry through cutting-edge technology and AI/ML solutions.
Team Structure:
- Web Technology Team: The web technology team consists of frontend and backend developers, DevOps engineers, and data scientists, working collaboratively to deliver digital healthcare solutions.
- Reporting Structure: The DevOps Engineer will report directly to the Head of Infrastructure and work closely with the development and data science teams.
- Cross-functional Collaboration: The DevOps Engineer will collaborate with various teams, including frontend and backend developers, data scientists, and product managers, to ensure efficient and reliable deployments and infrastructure management.
Development Methodology:
- Agile/Scrum Methodologies: Business Alliance follows Agile/Scrum methodologies for software development, with sprint planning, code reviews, and regular team meetings to ensure efficient project management.
- Code Review and Quality Assurance: The company emphasizes code review, testing, and quality assurance practices to ensure the reliability and performance of its digital healthcare solutions.
- Deployment Strategies: Business Alliance employs CI/CD pipelines and automated deployment strategies to facilitate efficient and reliable deployments of its digital healthcare solutions.
Company Website: www.businessallianceinc.com
📝 Enhancement Note: Business Alliance is a high-tech company focused on digital healthcare solutions, with a mid-sized team structure that encourages collaboration and innovation. The company follows Agile/Scrum methodologies and emphasizes code review, testing, and quality assurance practices to ensure the reliability and performance of its digital healthcare solutions.
📈 Career & Growth Analysis
Web Technology Career Level: This role is a mid-level position within the web technology team, focusing on managing and optimizing infrastructure, supporting GPU-based AI environments, and ensuring efficient deployments.
Reporting Structure: The DevOps Engineer will report directly to the Head of Infrastructure and work closely with the development and data science teams, collaborating on projects and contributing to the company's digital healthcare solutions.
Technical Impact: The DevOps Engineer will have a significant impact on the company's digital healthcare solutions, ensuring efficient and reliable deployments, managing GPU-based AI infrastructure, and supporting the development and data science teams.
Growth Opportunities:
- Technical Leadership: As the company grows, there will be opportunities for the DevOps Engineer to take on more technical leadership roles, mentoring junior team members and contributing to architecture decisions.
- Specialization: The DevOps Engineer may have the opportunity to specialize in specific areas, such as AI/ML workloads, GPU infrastructure, or cloud-native technologies, depending on the company's growth and project requirements.
- Career Progression: With experience and demonstrated success in the role, the DevOps Engineer may have the opportunity to progress to senior or management positions within the company.
📝 Enhancement Note: This role offers a mid-level position within the web technology team, with significant technical impact on the company's digital healthcare solutions and opportunities for growth, including technical leadership, specialization, and career progression.
🌐 Work Environment
Office Type: Business Alliance offers a modern and collaborative office environment, with open-plan workspaces and dedicated meeting rooms for team discussions and brainstorming sessions.
Office Location(s): The company's headquarters is located in Ramallah, West Bank, Palestinian Territory, with additional offices in major cities worldwide.
Workspace Context:
- Collaborative Workspace: The office features collaborative workspaces, with dedicated areas for team meetings, brainstorming sessions, and informal discussions.
- Development Tools and Equipment: The office is equipped with state-of-the-art development tools, multiple monitors, and testing devices to facilitate efficient and productive work.
- Cross-functional Collaboration: The office layout encourages cross-functional collaboration between web technology teams, data scientists, and product managers, fostering a dynamic and innovative work environment.
Work Schedule: The standard workweek is 40 hours, with flexible scheduling to accommodate project deadlines and maintenance windows. The company offers a hybrid work arrangement, with employees working on-site and remotely as needed.
📝 Enhancement Note: Business Alliance offers a modern and collaborative office environment, with a focus on cross-functional collaboration and a hybrid work arrangement to facilitate efficient and productive work.
📄 Application & Technical Interview Process
Interview Process:
- Technical Phone Screen: A brief phone or video call to assess your technical skills and experience in managing and optimizing infrastructure, supporting GPU-based AI environments, and ensuring efficient deployments.
- Technical Deep Dive: A comprehensive technical interview focused on your Kubernetes, Docker, and Linux system administration skills, as well as your experience with GPU infrastructure and AI/ML workloads.
- Behavioral and Cultural Fit Assessment: A behavioral interview to assess your problem-solving skills, communication, and cultural fit within the company.
- Final Evaluation: A final evaluation based on your technical skills, experience, and cultural fit, as well as your potential to contribute to the company's digital healthcare solutions.
Portfolio Review Tips:
- Kubernetes Cluster Demonstration: Highlight your Kubernetes cluster design, management, and optimization using ArgoCD for GitOps pipelines, with a focus on efficiency, reliability, and scalability.
- CI/CD Pipeline Project: Demonstrate your ability to build and maintain CI/CD workflows, infrastructure automation, and system observability, with a focus on streamlined deployment processes and performance optimization.
- Linux System Administration Case Study: Showcase your Linux system administration skills and experience managing cloud environments, with a focus on security, stability, and performance.
- Database Management Demonstration: Present your ability to manage and troubleshoot PostgreSQL and MySQL databases, with a focus on data integrity, performance optimization, and efficient query execution.
- GPU-based AI Infrastructure Project: Demonstrate your support and maintenance of GPU-based AI infrastructure, including NVIDIA GPUs, CUDA drivers, and AI frameworks like PyTorch and TensorFlow, with a focus on enabling efficient AI/ML workloads.
Technical Challenge Preparation:
- Kubernetes and ArgoCD: Brush up on your Kubernetes and ArgoCD skills, focusing on cluster management, GitOps pipelines, and efficient deployment strategies.
- Linux System Administration: Review your Linux system administration skills, with a focus on security, stability, and performance optimization.
- GPU Infrastructure and AI/ML Workloads: Review GPU infrastructure for AI/ML workloads, including NVIDIA GPUs, CUDA drivers, and AI frameworks like PyTorch and TensorFlow, with a focus on running those workloads efficiently and reliably.
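For the GPU monitoring part of this preparation, one possible exercise is to collect per-GPU memory and utilization figures from `nvidia-smi` and turn them into structured records. The sketch below is only an illustration and is not taken from the employer's tooling: the function names (`parse_gpu_csv`, `query_gpus`) and the output keys are hypothetical, and it assumes `nvidia-smi` is on the PATH of the host being inspected.

```python
import csv
import io
import shutil
import subprocess

# Fields requested from nvidia-smi; the parser relies on this column order.
QUERY_FIELDS = ["index", "name", "memory.used", "memory.total", "utilization.gpu"]


def _to_int(value):
    """Return an int, or None when the field is non-numeric (for example "[N/A]")."""
    try:
        return int(value)
    except ValueError:
        return None


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into a list of dicts."""
    gpus = []
    for row in csv.reader(io.StringIO(text)):
        values = [cell.strip() for cell in row]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip blank or malformed lines
        record = dict(zip(QUERY_FIELDS, values))
        gpus.append({
            "index": _to_int(record["index"]),
            "name": record["name"],
            "memory_used_mib": _to_int(record["memory.used"]),
            "memory_total_mib": _to_int(record["memory.total"]),
            "utilization_pct": _to_int(record["utilization.gpu"]),
        })
    return gpus


def query_gpus():
    """Run nvidia-smi and return parsed records, or [] if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    command = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, check=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return []
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

Keeping the parser separate from the subprocess call means it can be exercised on a machine without a GPU by passing it sample text.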
📝 Enhancement Note: The interview process for this role focuses on assessing the candidate's technical skills and experience in managing and optimizing infrastructure, supporting GPU-based AI environments, and ensuring efficient deployments. The portfolio review tips and technical challenge preparation emphasize Kubernetes, Docker, Linux system administration, and GPU infrastructure, with a focus on enabling efficient and reliable AI/ML workloads.
🛠 Technology Stack & Web Infrastructure
Frontend Technologies:
- React: Business Alliance uses React for building user interfaces, with a focus on performance, accessibility, and responsive design.
- Redux: The company employs Redux for state management in its React applications, ensuring efficient and predictable state updates.
- Material-UI: Business Alliance uses Material-UI for creating visually appealing and consistent user interfaces, following Material Design guidelines.
Backend & Server Technologies:
- Node.js: The company uses Node.js for building scalable and efficient server-side applications, with a focus on performance and real-time data processing.
- Express.js: Business Alliance employs Express.js as its web application framework, providing a robust set of features for building web and mobile applications.
- PostgreSQL and MySQL: The company uses PostgreSQL and MySQL for data storage and management, ensuring data integrity, performance, and efficient query execution.
Development & DevOps Tools:
- Git: Business Alliance uses Git for version control and collaboration, with a focus on efficient and reliable code management.
- Jenkins: The company employs Jenkins for CI/CD pipeline automation, ensuring efficient and reliable deployments of its digital healthcare solutions.
- Prometheus and Grafana: Business Alliance uses Prometheus and Grafana for system monitoring and visualization, ensuring optimal performance and reliability of its digital healthcare solutions.
📝 Enhancement Note: Business Alliance's technology stack centers on modern web development and server administration technologies, with an emphasis on performance, accessibility, and efficiency. The company employs React, Redux, and Material-UI for frontend development, Node.js and Express.js for backend development, and PostgreSQL and MySQL for data storage and management. It also uses Git for version control, Jenkins for CI/CD pipeline automation, and Prometheus and Grafana for system monitoring.
👥 Team Culture & Values
Web Development Values:
- User-centric Design: Business Alliance prioritizes user-centric design, with a focus on creating intuitive, accessible, and responsive user interfaces.
- Performance Optimization: The company emphasizes performance optimization, with a focus on efficient code, minimal resource usage, and fast loading times.
- Collaborative Development: Business Alliance fosters a collaborative development environment, with a focus on code review, pair programming, and knowledge sharing.
- Continuous Learning: The company encourages continuous learning and professional development, with a focus on staying up-to-date with the latest web technologies and best practices.
Collaboration Style:
- Cross-functional Integration: Business Alliance encourages cross-functional integration between web technology teams, data scientists, and product managers, fostering a dynamic and innovative work environment.
- Code Review Culture: The company emphasizes code review, with a focus on maintaining high coding standards, efficient code, and minimal technical debt.
- Knowledge Sharing: Business Alliance encourages knowledge sharing and technical mentoring, with a focus on fostering a collaborative and supportive work environment.
📝 Enhancement Note: Business Alliance's web development values emphasize user-centric design, performance optimization, collaborative development, and continuous learning. The company's collaboration style focuses on cross-functional integration, code review culture, and knowledge sharing, fostering a dynamic and innovative work environment.
⚡ Challenges & Growth Opportunities
Technical Challenges:
- Kubernetes Cluster Optimization: Design, manage, and optimize Kubernetes clusters using ArgoCD for GitOps pipelines, with a focus on efficiency, reliability, and scalability.
- CI/CD Pipeline Automation: Build and maintain CI/CD workflows, infrastructure automation, and system observability, with a focus on streamlined deployment processes and performance optimization.
- GPU Infrastructure Management: Support and maintain GPU-based AI infrastructure, including NVIDIA GPUs, CUDA drivers, and AI frameworks like PyTorch and TensorFlow, to enable efficient AI/ML workloads.
- AI/ML Workload Optimization: Optimize AI/ML workloads for efficient and reliable execution, with a focus on performance, scalability, and cost-effectiveness.
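As a concrete illustration of the GPU infrastructure challenge above, the sketch below builds a Kubernetes Pod manifest that requests one GPU and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. It is a sketch under stated assumptions rather than the company's actual configuration: the helper name `gpu_pod_manifest`, the pod name, and the image `registry.example.com/cuda-smoke-test:latest` are hypothetical, and the `nvidia.com/gpu` resource is only schedulable on clusters where the NVIDIA device plugin is installed.

```python
import json


def gpu_pod_manifest(name, image, gpus=1, command=None):
    """Build a Pod manifest whose single container requests `gpus` NVIDIA GPUs."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    container = {
        "name": name,
        "image": image,
        # Extended resources such as GPUs are requested under limits.
        "resources": {"limits": {"nvidia.com/gpu": str(gpus)}},
    }
    if command:
        container["command"] = list(command)
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {"restartPolicy": "Never", "containers": [container]},
    }


if __name__ == "__main__":
    manifest = gpu_pod_manifest(
        "gpu-smoke-test",
        "registry.example.com/cuda-smoke-test:latest",  # hypothetical image
        command=["nvidia-smi"],
    )
    print(json.dumps(manifest, indent=2))
```

In a GitOps setup, a manifest like this would normally be committed to the repository that ArgoCD syncs from rather than applied by hand.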
Learning & Development Opportunities:
- Web Technology Specialization: Specialize in a specific area of the stack, such as React, Node.js, or AI/ML workloads, to deepen your expertise and contribute to the company's digital healthcare solutions.
- Conference Attendance and Certification: Attend industry conferences, workshops, and training sessions to stay up-to-date with the latest web technologies and best practices, and pursue relevant certifications to enhance your skills and credentials.
- Technical Mentorship and Leadership: Provide technical mentorship to junior team members and contribute to architecture decisions, fostering a collaborative and supportive work environment.
📝 Enhancement Note: The technical challenges for this role focus on managing and optimizing infrastructure, supporting GPU-based AI environments, and ensuring efficient deployments. The learning and development opportunities emphasize web technology specialization, conference attendance, and technical mentorship, fostering a collaborative and supportive work environment.
💡 Interview Preparation
Technical Questions:
- Kubernetes and ArgoCD: Be prepared to discuss your experience with Kubernetes and ArgoCD, focusing on cluster management, GitOps pipelines, and efficient deployment strategies.
- Linux System Administration: Brush up on your Linux system administration skills, with a focus on security, stability, and performance optimization.
- GPU Infrastructure and AI/ML Workloads: Be prepared to discuss GPU infrastructure for AI/ML workloads, including NVIDIA GPUs, CUDA drivers, and AI frameworks like PyTorch and TensorFlow, and how to keep those workloads efficient and reliable.
Company & Culture Questions:
- Digital Healthcare Solutions: Research Business Alliance's digital healthcare solutions and be prepared to discuss how your technical skills and experience can contribute to the company's mission and goals.
- Agile Methodologies: Brush up on your understanding of Agile methodologies and be prepared to discuss your experience working in an Agile/Scrum environment.
- Collaboration and Communication: Prepare to discuss your experience working in a collaborative and dynamic team environment, with a focus on effective communication, problem-solving, and decision-making.
📝 Enhancement Note: The interview preparation for this role focuses on assessing the candidate's technical skills and experience in managing and optimizing infrastructure, supporting GPU-based AI environments, and ensuring efficient deployments. The technical questions emphasize Kubernetes, Docker, Linux system administration, and GPU infrastructure, with a focus on enabling efficient and reliable AI/ML workloads. The company and culture questions focus on the candidate's understanding of Business Alliance's digital healthcare solutions, Agile methodologies, and collaboration and communication skills.
📌 Application Steps
To apply for this DevOps Engineer (AI/GPU Infrastructure) position at Business Alliance:
- Customize Your Resume: Tailor your resume to highlight your technical skills and experience in managing and optimizing infrastructure, supporting GPU-based AI environments, and ensuring efficient deployments. Include relevant keywords and examples to demonstrate your qualifications for the role.
- Prepare Your Portfolio: Curate a portfolio that showcases your technical skills and experience in managing and optimizing infrastructure, supporting GPU-based AI environments, and ensuring efficient deployments. Include projects that demonstrate your ability to design, manage, and optimize Kubernetes clusters using ArgoCD, build and maintain CI/CD workflows, and support and maintain GPU-based AI infrastructure.
- Research the Company: Familiarize yourself with Business Alliance's digital healthcare solutions, Agile methodologies, and company culture to ensure a strong fit for the role and demonstrate your enthusiasm for the company's mission and goals.
- Prepare for the Technical Interview: Brush up on your technical skills and experience in managing and optimizing infrastructure, supporting GPU-based AI environments, and ensuring efficient deployments. Review the technology stack and be prepared to discuss your experience with Kubernetes, Docker, Linux system administration, and GPU infrastructure.
⚠️ Important Notice: This enhanced job description includes AI-generated insights and web development/server administration industry-standard assumptions. All details should be verified directly with the hiring organization before making application decisions.
Application Requirements
Candidates should have hands-on experience with Kubernetes, Docker, and strong Linux system administration skills. Familiarity with GPU infrastructure and AI frameworks like PyTorch and TensorFlow is also required.