Remote

Principal Performance Architect

DataDirect Networks

United States

Dec 20, 2025

Job Locations: US-Remote
Job ID: 2025-5289
Name Linked: Remote: US
Country: United States
City: Remote
Worker Type: Regular Full-Time Employee

Overview

This is an incredible opportunity to be part of a company that has been at the forefront of AI and high-performance data storage innovation for over two decades. DataDirect Networks (DDN) is a global market leader renowned for powering many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, government, academia, research, and manufacturing.

"DDN's A3I solutions are transforming the landscape of AI infrastructure." - IDC

"The real differentiator is DDN. I never hesitate to recommend DDN. DDN is the de facto name for AI Storage in high performance environments" - Marc Hamilton, VP, Solutions Architecture & Engineering | NVIDIA

DDN is the global leader in AI and multi-cloud data management at scale. Our cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data. With a proven track record of performance, reliability, and scalability, DDN empowers businesses to tackle the most challenging AI and data-intensive workloads with confidence.

Our success is driven by our unwavering commitment to innovation, customer-centricity, and a team of passionate professionals who bring their expertise and dedication to every project. This is a chance to make a significant impact at a company that is shaping the future of AI and data management.

Our commitment to innovation, customer success, and market leadership makes this an exciting and rewarding role for a driven professional looking to make a lasting impact in the world of AI and data storage.

Job Description

DDN is seeking a Principal Performance Architect to drive end-to-end performance analysis and optimization for Infinia, our AI-native, highly distributed data intelligence platform.

In this critical role, you will own the visibility, measurement, and tuning of system performance across the entire stack - from AI applications, vector databases, and inference pipelines, down to I/O patterns on NVMe drives. You'll partner with teams across engineering, architecture, and field engineering to deliver insights and innovations that drive customer value at scale.

This is a hands-on, full-stack systems performance role - ideal for someone who thrives on pushing the boundaries of scale, throughput, and latency across deeply interconnected layers.

We are looking for remote candidates in the following metropolitan areas: Boston, MA; Raleigh, NC; Denver, CO; or Tucson, AZ. Occasional in-person meetings or team events may be required.

Key Responsibilities

End-to-End Performance Architecture

Design and implement comprehensive performance instrumentation across the Infinia platform - from AI pipelines to storage backends.

Develop and maintain end-to-end performance models that span applications (e.g., LLM inference, vector search), query engines, data pipelines, and NVMe-backed storage.

Profiling, Benchmarking & Bottleneck Analysis

Build and execute reproducible performance tests that simulate realistic AI and data-intensive workloads.
Use advanced profiling and tracing tools (e.g., perf, eBPF, flamegraphs, custom telemetry) to identify and address latency hotspots, bandwidth bottlenecks, and concurrency inefficiencies.
Drive performance regression testing into CI pipelines and track key performance metrics over time.

Collaboration & Optimization

Partner with component teams (I/O Path, Core Platform, Data Engine, and AI Applications) to deliver performance fixes and architectural recommendations.
Work closely with hardware and storage teams to analyze I/O patterns on NVMe drives and optimize storage usage for real-world applications.
Provide tuning guidance for AI/ML applications, vector databases, and orchestration layers to maximize system utilization and efficiency.

Technical Leadership & Reporting

Own the performance roadmap and ensure performance visibility across teams and stakeholders.
Deliver clear, actionable performance summaries to engineering, product management, and field teams.
Mentor engineers on best practices in profiling, benchmarking, and performance-aware development.

Required Qualifications

15+ years of experience in systems performance engineering or low-level infrastructure development.
Proven expertise in end-to-end performance debugging across distributed systems, from application logic down to I/O subsystems.
Strong background in system-level tools (e.g., perf, ftrace, bpftrace, nvme-cli, or similar).
Deep understanding of NVMe, CPU and memory hierarchy, caching, thread scheduling, and Linux kernel internals.
Experience optimizing compute- or data-intensive applications such as AI inference, search, or analytics.
Proficiency in C/C++ or Rust and scripting languages such as Python or Bash.
Excellent written and verbal communication skills, especially around technical reporting and architectural recommendations.

Preferred Qualifications

Familiarity with vector databases (e.g., FAISS, Milvus, Weaviate) and LLM inference pipelines.
Experience with SPDK, RDMA, or high-performance I/O libraries.
Exposure to storage telemetry, AI model performance tuning, or multi-tenant throughput optimization.
Experience working in highly scaled distributed systems or HPC environments.

This position requires participation in an on-call rotation to provide after-hours support as needed.

Success Metrics - First 30 Days

Visibility & Integration

Establish baseline performance benchmarks across Infinia subsystems and components.
Deploy instrumentation and telemetry hooks for critical paths from app layer to storage.

Early Wins

Deliver a performance deep dive or RCA on an identified bottleneck impacting an AI workload.
Propose a prioritized list of performance improvements and related ownership areas.

Team & Process Engagement

Begin collaborating with 2-3 core component teams on performance-focused epics or tuning initiatives.
Integrate performance regression checks into the CI pipeline or release workflows.

Success Metrics - Beyond 30 Days

Measurable improvements in end-to-end latency, IOPS, throughput, or resource utilization across benchmarked workloads.
Performance transparency across teams via regular reports, dashboards, or incident retrospectives.
Influential contributions to system architecture based on empirical performance data.
Recognized as the go-to performance authority across the Infinia engineering organization.

DDN

Join our dynamic and driven team, where engineering excellence is at the heart of everything we do. We seek individuals who love to challenge themselves and are fueled by curiosity. Here, you'll have the opportunity to work across various areas of the company, thanks to our flat organizational structure that encourages hands-on involvement and direct contributions to our mission. Leadership is earned by those who take initiative and consistently deliver outstanding results, in both their work ethic and their deliverables, which makes strong prioritization skills essential. We also value strong communication skills in all our engineers and researchers, as these skills are crucial to the success of our teams and the company as a whole.

Interview Process: After submitting your application, one of our recruiters will review your resume. If your application passes this stage, you will be invited to a 30-minute interview during which a member of our team will ask some basic questions. If you clear the interview, you will enter the main process, which can consist of up to four interviews in total:

Coding assessment: Often in a language of your choice.
Systems design: Translate high-level requirements into a scalable, fault-tolerant service (depending on role).
Real-time problem-solving: Demonstrate practical skills in a live problem-solving session.
Meet and greet with the wider team.

Our goal is to finish the main process in 2-3 weeks at most.

DataDirect Networks (DDN) is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity, gender expression, transgender status, sex stereotyping, sexual orientation, national origin, disability, protected veteran status, or any other characteristic protected by applicable federal, state, or local law.