Senior Software Engineer - Observability Infrastructure SRE
📍 Job Overview
- Job Title: Senior Software Engineer - Observability Infrastructure SRE
- Company: Datadog
- Location: New York, New York, USA
- Job Type: Hybrid
- Category: Senior Software Engineering
- Date Posted: 2025-07-21
- Experience Level: 5-10 years
- Remote Status: On-site/Hybrid
🚀 Role Summary
- 📝 Enhancement Note: This role focuses on managing Datadog's internal observability tooling and practices, with a specific emphasis on the telemetry data plane, which collects large volumes of observability data across all Datadog environments. The ideal candidate will have experience in software engineering, running production systems at scale, and building telemetry pipelines or observability platforms in a cloud-native environment.
💻 Primary Responsibilities
- 📝 Enhancement Note: The primary responsibilities of this role revolve around leading projects that impact Datadog's specific observability, instrumentation, or telemetry collection problems. This involves designing and implementing internal solutions or contributing to Datadog's products. Additionally, the role requires building, scaling, and operating a robust telemetry data plane while ensuring strict performance and reliability objectives.
🎓 Skills & Qualifications
Education: Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
Experience: 5+ years of experience in software engineering, running production systems at scale, and building telemetry pipelines or observability platforms in a cloud-native environment.
Required Skills:
- Expertise in software engineering and running production systems at scale
- Experience building telemetry pipelines or observability platforms in a cloud-native environment
- Strong programming skills with a structured programming language (Go, Python)
- Strong communication skills and experience working in cross-team programs/projects
- Experience leading the adoption of programs/projects with wide impact across Engineering
Preferred Skills:
- Familiarity with Datadog's products and services
- Experience with multi-cloud environments and infrastructure as code (IaC) tools
- Knowledge of observability best practices and principles
📊 Web Portfolio & Project Requirements
Portfolio Essentials:
- A strong portfolio showcasing your experience in software engineering, with a focus on telemetry pipeline and observability platform projects
- Examples of your ability to scale and operate robust telemetry data planes with strict performance and reliability objectives
- Demonstrations of your leadership in driving product innovation and maintaining operational excellence
Technical Documentation:
- Detailed documentation of your past projects, highlighting your problem-solving approach, technical implementation, and results
- Code samples and walkthroughs showcasing your proficiency in Go or Python and your understanding of observability principles
💵 Compensation & Benefits
Salary Range: $187,000 - $240,000 USD per year
Benefits:
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
- Continuous professional development, product training, and career pathing
- Intradepartmental mentor and buddy program for in-house networking
- An inclusive company culture and the ability to join our Community Guilds (Datadog employee resource groups)
- Access to Inclusion Talks, our internal panel discussions
- Free, global mental health benefits for employees and dependents age 6+
- Competitive global benefits
Working Hours: 40 hours per week, with flexibility for deployment windows, maintenance, and project deadlines
🎯 Team & Company Context
🏢 Company Culture
Industry: Datadog is a global SaaS business focused on delivering a rare combination of growth and profitability. They are on a mission to break down silos and solve complexity in the cloud age by enabling digital transformation, cloud migration, and infrastructure monitoring of their customers' entire technology stacks.
Company Size: Datadog has a significant global presence, with over 1,500 employees across multiple locations worldwide.
Founded: Datadog was founded in 2010 and is headquartered in New York, New York.
Team Structure:
- The Observability Infrastructure SRE team is part of the Core Observability SRE group and the broader SRE/Security organization
- The team works closely with various engineering teams to build and operate systems more effectively
- The team is responsible for managing Datadog's internal observability tooling and practices, with a focus on the telemetry data plane
Development Methodology:
- Datadog follows Agile methodologies, with a focus on continuous integration, continuous deployment, and iterative development
- The company emphasizes collaboration, cross-functional teamwork, and a culture of learning and improvement
Company Website: Datadog
📈 Career & Growth Analysis
Web Technology Career Level: This role is at the senior software engineering level, with a focus on observability infrastructure and site reliability engineering (SRE). The ideal candidate will have 5-10 years of experience in software engineering, running production systems at scale, and building telemetry pipelines or observability platforms in a cloud-native environment.
Reporting Structure: The Observability Infrastructure SRE team reports to the Core Observability SRE group and the broader SRE/Security organization. The team works closely with various engineering teams to build and operate systems more effectively.
Technical Impact: The role has a significant impact on Datadog's internal observability tooling and practices, with a focus on the telemetry data plane that collects observability data across all Datadog environments.
Growth Opportunities:
- Opportunities for career progression within the SRE/Security organization or other engineering teams at Datadog
- Potential to take on more significant projects and leadership roles as the company continues to grow and expand its observability offerings
🌐 Work Environment
Office Type: Datadog operates a hybrid workplace, allowing employees to create a work-life harmony that best fits their needs.
Office Location(s): Datadog's headquarters is located in New York, New York, with additional offices worldwide.
Workspace Context:
- Datadog's offices are designed to foster collaboration, creativity, and a strong company culture
- Employees have access to multiple monitors, testing devices, and other tools necessary for their roles
- The company encourages cross-functional teamwork and knowledge sharing among its employees
Work Schedule: Datadog operates on a standard 40-hour workweek, with flexibility for deployment windows, maintenance, and project deadlines. The company offers a hybrid work arrangement, allowing employees to work from the office or remotely as needed.
📄 Application & Technical Interview Process
Interview Process:
- Technical Phone Screen: A 45-minute phone or video call to assess your technical skills and cultural fit with Datadog
- On-site Interview: A half-day on-site interview at Datadog's headquarters in New York, New York, or a virtual interview if applicable. This includes:
  - A technical deep dive into your experience with telemetry pipelines, observability platforms, and cloud-native environments
  - A discussion of your approach to problem-solving, leadership, and collaboration
  - A review of your portfolio and a presentation of your past projects
- Final Interview: A final interview with a senior member of the team or hiring manager to discuss your fit for the role and answer any remaining questions
Portfolio Review Tips:
- Highlight your experience in software engineering, with a focus on telemetry pipeline and observability platform projects
- Showcase your ability to scale and operate robust telemetry data planes with strict performance and reliability objectives
- Demonstrate your leadership in driving product innovation and maintaining operational excellence
Technical Challenge Preparation:
- Brush up on your knowledge of Datadog's products and services, as well as observability best practices and principles
- Prepare for technical questions related to software engineering, telemetry pipelines, observability platforms, and cloud-native environments
- Practice explaining your problem-solving approach, technical implementation, and results in a clear and concise manner
🛠 Technology Stack & Web Infrastructure
Frontend Technologies: (Not applicable for this role)
Backend & Server Technologies:
- Go
- Python
- Cloud-native environments (e.g., Kubernetes, Docker)
- Multi-cloud environments (e.g., AWS, GCP, Azure)
- Infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation)
Development & DevOps Tools:
- Git
- GitHub
- Jenkins
- Datadog's products and services (e.g., Datadog APM, Datadog Logs, Datadog Traces)
- Observability best practices and principles
👥 Team Culture & Values
Web Development Values:
- Datadog values people from all walks of life and fosters an inclusive company culture
- The company encourages continuous learning, collaboration, and innovation among its employees
- Datadog is committed to maintaining high standards of technical and operational excellence
Collaboration Style:
- Datadog operates on a cross-functional, collaborative approach to problem-solving and decision-making
- The company encourages knowledge sharing, mentoring, and continuous learning among its employees
- Datadog's culture is built on a foundation of trust, respect, and open communication
⚡ Challenges & Growth Opportunities
Technical Challenges:
- Scaling and operating a robust telemetry data plane with strict performance and reliability objectives in a multi-region, multi-cloud provider ecosystem
- Integrating the latest Datadog features and driving product innovation while maintaining operational excellence
- Managing the complex technical challenges of observability infrastructure and site reliability engineering (SRE)
Learning & Development Opportunities:
- Opportunities to work on cutting-edge observability projects and gain exposure to the latest technologies and best practices
- Access to Datadog's learning and development resources, including product training, career pathing, and mentorship programs
- Opportunities to contribute to Datadog's open-source projects and engage with the broader observability community
💡 Interview Preparation
Technical Questions:
- Datadog's products and services (e.g., Datadog APM, Datadog Logs, Datadog Traces)
- Observability best practices and principles
- Telemetry pipelines, observability platforms, and cloud-native environments
- Problem-solving, leadership, and collaboration in a software engineering context
Company & Culture Questions:
- Datadog's mission, values, and company culture
- The role of the Observability Infrastructure SRE team within Datadog's broader organization
- Datadog's approach to work-life balance, remote work, and employee well-being
Portfolio Presentation Strategy:
- Tailor your portfolio to showcase your experience in software engineering, with a focus on telemetry pipeline and observability platform projects
- Highlight your ability to scale and operate robust telemetry data planes with strict performance and reliability objectives
- Demonstrate your leadership in driving product innovation and maintaining operational excellence
📌 Application Steps
To apply for this Senior Software Engineer - Observability Infrastructure SRE position at Datadog, follow these steps:
- Submit your application through the application link provided in the job listing.
- Prepare a tailored resume highlighting your experience in software engineering, telemetry pipelines, observability platforms, and cloud-native environments.
- Update your portfolio to showcase your experience and accomplishments in these areas, with a focus on driving product innovation and maintaining operational excellence.
- Brush up on your knowledge of Datadog's products and services, observability best practices, and the company's culture and values.
- Practice for technical interviews by reviewing your past projects, problem-solving approaches, and technical implementation details.
- Research Datadog's mission, values, and company culture to ensure a strong cultural fit and alignment with your personal goals and aspirations.
📝 Enhancement Note: This enhanced job description includes AI-generated insights and software engineering industry-standard assumptions. All details should be verified directly with Datadog before making application decisions.
Application Requirements
Candidates should have 5+ years of experience in software engineering and expertise in building telemetry pipelines in a cloud-native environment. Strong programming skills in Go or Python and experience in cross-team collaboration are essential.