Engineering Manager, DevOps & SRE
Job Overview
- Job Title: Engineering Manager, DevOps & SRE
- Company: LucidLink
- Location: Sofia, Sofia-Grad, Bulgaria
- Job Type: Full-Time
- Category: Engineering Management
- Date Posted: 2025-07-04
- Experience Level: 5-10 years
- Remote Status: Remote OK
Role Summary
- Lead and grow the DevOps & SRE team to ensure the scalability, security, and cost-efficiency of LucidLink's cloud-native infrastructure.
- Collaborate cross-functionally with product, platform, and customer success teams to define and execute DevOps initiatives.
- Drive the development of internal infrastructure products and automation tools to improve engineering efficiency and system reliability.
- Foster a culture of collaboration, ownership, and continuous improvement within the DevOps & SRE team.
Enhancement Note: This role sits at the intersection of engineering leadership and technical operations, playing a critical part in scaling and maintaining LucidLink's globally distributed product.
Primary Responsibilities
Platform Strategy and Execution:
- Own the planning and prioritization of DevOps initiatives, balancing urgent needs with long-term roadmap work.
- Collaborate with engineering and product stakeholders to collect requirements and define an execution path.
- Drive the development of internal infrastructure products and automation tools to enhance engineering efficiency and system reliability.
Team Leadership and Development:
- Provide strong leadership to the DevOps & SRE team, fostering a culture of collaboration, ownership, and continuous improvement.
- Mentor and support individual growth, conducting regular feedback and development sessions, and defining clear objectives that align with company goals.
- Lead distributed team members with empathy and structure, ensuring alignment across time zones and geographies.
Technical and Operational Excellence:
- Oversee LucidLink's always-on, cloud-native infrastructure, ensuring uptime, performance, scalability, and disaster recovery readiness.
- Maintain deep knowledge of cloud platforms (e.g., AWS), modern DevOps toolchains (CI/CD, monitoring, infrastructure-as-code), and incident management best practices.
- Manage the on-call rotation at the leadership level, ensuring 24/7 support readiness for critical issues.
- Track infrastructure usage and cost, working with finance and vendors to manage budgets and optimize spend.
Stakeholder Communication:
- Keep key stakeholders informed about priorities, progress, risks, and incidents in a clear and timely manner.
- Represent the DevOps team in cross-functional initiatives and planning discussions.
- Build and maintain strong vendor relationships, including contract discussions and support escalations with providers like AWS and IBM.
Skills & Qualifications
Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Experience: 4+ years in a DevOps/SRE Engineering Manager role, ideally in a scale-up or product company.
Required Skills:
- Technical expertise in cloud infrastructure, site reliability engineering, and infrastructure-as-code.
- Proven success managing and mentoring senior engineers, including remote and distributed team members.
- Strong communication and stakeholder management skills; able to operate effectively across technical and business teams.
- Experience managing always-on systems with complex operational needs.
Preferred Skills:
- Familiarity with budgeting, vendor management, and cloud cost optimization.
Web Portfolio & Project Requirements
Portfolio Essentials:
- Demonstrate experience in managing and scaling cloud-native infrastructure.
- Showcase successful projects that improved system reliability, performance, or cost-efficiency.
- Highlight your ability to lead and mentor technical teams, driving collaboration and continuous improvement.
Technical Documentation:
- Provide examples of well-documented infrastructure code, CI/CD pipelines, and incident management processes.
- Share case studies or blog posts that demonstrate your technical expertise and thought leadership in DevOps and SRE.
Compensation & Benefits
Salary Range: The estimated salary range for this role is $150,000 - $200,000 USD per year, based on industry standards for a DevOps/SRE Engineering Manager in Sofia, Bulgaria.
Benefits:
- Unlimited PTO
- Competitive Salary
- Stock Options
- Full Health Coverage
Working Hours: Full-time, with a flexible schedule that may include on-call rotations for critical incident management.
Team & Company Context
Company Culture
Industry: LucidLink operates in the cloud storage and file sharing industry, with a focus on enabling remote and hybrid workforces to collaborate securely and efficiently.
Company Size: LucidLink has around 170 employees, providing a mid-sized company environment where your impact will be significant.
Founded: LucidLink was founded in 2016, making it a well-established startup with a proven track record of success.
Team Structure:
- The DevOps & SRE team consists of experienced engineers responsible for maintaining and scaling LucidLink's cloud-native infrastructure.
- The team works closely with product, platform, and customer success teams to ensure the reliability, performance, and security of LucidLink's services.
- The engineering organization is led by the VP of Engineering, who reports directly to the CEO.
Development Methodology:
- LucidLink follows Agile development methodologies, with a focus on continuous integration, continuous deployment, and iterative improvement.
- The company uses modern DevOps toolchains, including CI/CD pipelines, infrastructure-as-code, and automated testing.
- LucidLink prioritizes collaboration, cross-functional teamwork, and a culture of learning and innovation.
Company Website: LucidLink
Enhancement Note: LucidLink's culture is defined by its values of integrity, innovation, and empathy, which guide every decision and interaction within the company.
Career & Growth Analysis
Web Technology Career Level: This role is an intermediate to senior-level position, offering significant growth opportunities in technical leadership, team management, and architecture decision-making.
Reporting Structure: The Engineering Manager, DevOps & SRE reports directly to the VP of Engineering and leads a team of experienced DevOps and SRE engineers.
Technical Impact: This role has a significant impact on LucidLink's infrastructure, ensuring the scalability, security, and cost-efficiency of the company's cloud-native services.
Growth Opportunities:
- Technical Growth: Deepen your expertise in cloud infrastructure, site reliability engineering, and infrastructure-as-code, while staying up-to-date with emerging technologies and best practices.
- Leadership Development: Gain experience in managing and mentoring senior engineers, fostering a culture of collaboration and continuous improvement, and driving technical decision-making.
- Architecture Decision-Making: Contribute to strategic architecture decisions that shape LucidLink's infrastructure and enable the company's continued growth and success.
Enhancement Note: LucidLink's hypergrowth presents unparalleled opportunities for advancement and learning on the company's path toward unicorn status.
Work Environment
Office Type: LucidLink has an engineering office in Sofia, Bulgaria, with a hybrid work arrangement that combines on-site collaboration with remote work flexibility.
Office Location(s): Sofia, Bulgaria
Workspace Context:
- The Sofia office provides a modern, collaborative workspace designed to facilitate team interaction and innovation.
- Engineers have access to multiple monitors, testing devices, and the tools they need to perform their jobs effectively.
- LucidLink encourages cross-functional collaboration between developers, designers, and stakeholders to ensure user-focused, high-quality products.
Work Schedule: Full-time, with a flexible schedule that may include on-call rotations for critical incident management. LucidLink offers unlimited PTO, allowing employees to maintain a strong work-life balance.
Enhancement Note: LucidLink's work environment fosters a strong sense of teamwork, collaboration, and mutual support, enabling employees to thrive both professionally and personally.
Application & Technical Interview Process
Interview Process:
- Screening: A 30-minute phone or video call to discuss your background, experience, and motivation for the role.
- Technical Deep Dive: A 60-90 minute technical conversation focused on your experience with cloud infrastructure, site reliability engineering, and infrastructure-as-code. Be prepared to discuss your approach to managing complex systems, incident management, and team leadership.
- Behavioral & Cultural Fit: A 60-minute conversation to assess your cultural fit with LucidLink, leadership style, and problem-solving approach.
- Final Decision: A final discussion with the VP of Engineering to make a hiring decision.
Portfolio Review Tips:
- Highlight your experience in managing and scaling cloud-native infrastructure, with a focus on system reliability, performance, and cost-efficiency.
- Showcase your ability to lead and mentor technical teams, driving collaboration and continuous improvement.
- Demonstrate your technical expertise and thought leadership in DevOps and SRE through case studies, blog posts, or open-source contributions.
Technical Challenge Preparation:
- Brush up on your knowledge of cloud platforms (e.g., AWS), modern DevOps toolchains (CI/CD, monitoring, infrastructure-as-code), and incident management best practices.
- Prepare for behavioral questions that assess your leadership style, problem-solving approach, and cultural fit with LucidLink.
- Research LucidLink's products, industry, and competition to demonstrate your understanding of the company's mission and market position.
ATS Keywords:
- Cloud Platforms: AWS, GCP, Azure
- Infrastructure-as-Code: Terraform, CloudFormation, Pulumi
- CI/CD: Jenkins, GitLab CI/CD, CircleCI
- Monitoring: Prometheus, Grafana, Datadog
- Incident Management: PagerDuty, OpsGenie, VictorOps
- Configuration Management: Ansible, Puppet, Chef
- Containerization: Docker, Kubernetes, ECS
- Serverless: AWS Lambda, Azure Functions, Google Cloud Functions
- Leadership: Team Management, Mentoring, Stakeholder Communication
- Problem-Solving: Troubleshooting, Root Cause Analysis, Incident Management
- Collaboration: Agile, Scrum, Kanban
- Cloud Cost Optimization: AWS Cost Explorer, Azure Cost Management, Google Cloud Billing
Enhancement Note: Tailor your resume and portfolio to highlight the ATS keywords most relevant to this role, ensuring that your application showcases your technical expertise and fit for the position.
Technology Stack & Web Infrastructure
Cloud Platforms:
- AWS (Primary)
- GCP, Azure (Secondary)
Infrastructure-as-Code:
- Terraform
- CloudFormation
- Pulumi
CI/CD:
- GitLab CI/CD (Primary)
- Jenkins, CircleCI (Secondary)
Monitoring:
- Prometheus (Primary)
- Grafana, Datadog (Secondary)
Incident Management:
- PagerDuty (Primary)
- OpsGenie, VictorOps (Secondary)
Configuration Management:
- Ansible (Primary)
- Puppet, Chef (Secondary)
Containerization:
- Docker (Primary)
- Kubernetes, ECS (Secondary)
Serverless:
- AWS Lambda (Primary)
- Azure Functions, Google Cloud Functions (Secondary)
Database:
- PostgreSQL (Primary)
- MySQL, MongoDB (Secondary)
Caching:
- Redis (Primary)
- Memcached, Varnish (Secondary)
Enhancement Note: LucidLink's technology stack is primarily based on cloud-native, open-source tools, with a focus on scalability, security, and cost-efficiency.
Team Culture & Values
Web Development Values:
- Integrity: LucidLink values integrity in all aspects of its business, from product development to customer interactions.
- Innovation: The company encourages continuous learning, experimentation, and iteration to drive technological advancements and improve user experiences.
- Empathy: LucidLink works to understand and address the needs of its users, fostering a customer-centric culture focused on user experience and satisfaction.
Collaboration Style:
- Cross-Functional Integration: LucidLink encourages collaboration between teams, with a focus on user-centered design, agile development methodologies, and iterative improvement.
- Code Review Culture: The company values peer review and feedback, fostering a culture of learning and continuous improvement.
- Knowledge Sharing: LucidLink encourages engineers to share their expertise with the team, contributing to the collective knowledge base and driving technical growth.
Enhancement Note: LucidLink's values-led culture fosters a collaborative, performance-driven team environment that empowers engineers to grow both personally and professionally.
Challenges & Growth Opportunities
Technical Challenges:
- Scalability: Manage the growth and scalability of LucidLink's cloud-native infrastructure to support the company's hypergrowth journey.
- Security: Ensure the security and compliance of LucidLink's infrastructure, protecting user data and maintaining regulatory compliance.
- Cost Optimization: Monitor and optimize cloud costs, balancing performance and efficiency with budget constraints.
- Incident Management: Develop and maintain incident management processes that minimize downtime and ensure rapid resolution of critical issues.
Learning & Development Opportunities:
- Technical Skill Development: Deepen your expertise in cloud infrastructure, site reliability engineering, and infrastructure-as-code, staying up-to-date with emerging technologies and best practices.
- Leadership Development: Gain experience in managing and mentoring senior engineers, fostering a culture of collaboration and continuous improvement, and driving technical decision-making.
- Architecture Decision-Making: Contribute to strategic architecture decisions that shape LucidLink's infrastructure and enable the company's continued growth and success.
Enhancement Note: LucidLink's hypergrowth presents unparalleled opportunities for learning and growth on the company's path toward unicorn status.
Interview Preparation
Technical Questions:
- Cloud Infrastructure: Describe your experience with cloud platforms (e.g., AWS, GCP, Azure) and their services. How have you used these platforms to manage and scale infrastructure for high-growth companies?
- Site Reliability Engineering: Explain your approach to site reliability engineering, including incident management, system monitoring, and performance optimization. Provide examples of your experience in ensuring high availability and minimal downtime for critical systems.
- Infrastructure-as-Code: Discuss your expertise in infrastructure-as-code tools (e.g., Terraform, CloudFormation, Pulumi). How have you used these tools to automate infrastructure provisioning, ensure consistency, and improve deployment efficiency?
Company & Culture Questions:
- LucidLink's Mission: Explain why you are excited about LucidLink's mission to make data instantly and securely accessible from everywhere. How do you see yourself contributing to this mission in your role as Engineering Manager, DevOps & SRE?
- Team Culture: Describe your experience working in a values-led, collaborative team environment. How have you contributed to fostering a culture of innovation, integrity, and empathy in your previous roles?
- Problem-Solving: Share an example of a complex technical challenge you faced in a previous role and how you approached solving it. What was the outcome, and what did you learn from the experience?
Portfolio Presentation Strategy:
- Technical Deep Dive: Prepare a detailed walkthrough of your experience in managing and scaling cloud-native infrastructure, highlighting your approach to system reliability, performance, and cost-efficiency.
- Leadership & Teamwork: Showcase your ability to lead and mentor technical teams, driving collaboration and continuous improvement. Provide examples of your experience in fostering a culture of learning, innovation, and high performance.
- Problem-Solving & Incident Management: Demonstrate your expertise in incident management, root cause analysis, and problem-solving. Provide examples of how you have minimized downtime and ensured rapid resolution of critical issues.
Enhancement Note: Tailor your interview preparation to highlight your technical expertise, leadership skills, and cultural fit with LucidLink's mission, values, and team environment.
Application Steps
To apply for this Engineering Manager, DevOps & SRE position at LucidLink:
- Customize Your Resume: Highlight your experience with cloud infrastructure, site reliability engineering, and infrastructure-as-code, emphasizing your leadership skills, problem-solving approach, and cultural fit with LucidLink.
- Prepare Your Portfolio: Showcase your experience in managing and scaling cloud-native infrastructure, with a focus on system reliability, performance, and cost-efficiency. Include examples of your leadership and teamwork skills, as well as your problem-solving and incident management expertise.
- Research LucidLink: Familiarize yourself with LucidLink's products, industry, and competition. Prepare thoughtful questions that demonstrate your understanding of the company's mission and market position.
- Prepare for Technical Interviews: Brush up on your knowledge of cloud platforms, modern DevOps toolchains, and incident management best practices. Practice behavioral questions that assess your leadership style, problem-solving approach, and cultural fit with LucidLink.
Important Notice: This enhanced job description includes AI-generated insights and web development/server administration industry-standard assumptions. All details should be verified directly with the hiring organization before making application decisions.
Application Requirements
Candidates should have a degree in Computer Science or a related field and at least 4 years of experience in a DevOps/SRE Engineering Manager role. Strong communication skills and experience managing remote teams are essential.