Platform Architect
Company: Etched
Location: San Jose
Posted on: April 1, 2026
Job Description:
About Etched
Etched is building the world’s first AI inference system
purpose-built for transformers, delivering over 10x higher
performance and dramatically lower cost and latency than a B200.
With Etched ASICs, you can build products that would be impossible
with GPUs, like real-time video generation models and extremely
deep & parallel chain-of-thought reasoning agents. Backed by
hundreds of millions from top-tier investors and staffed by leading
engineers, Etched is redefining the infrastructure layer for the
fastest growing industry in history.

Job Summary
As a Platform
Architect, you will lead the definition and realization of our AI
server platform architecture, from server board design to
rack-level integration and multi-rack POD-scale system
orchestration. This is a hands-on technical leadership role that
requires deep expertise in PCIe and fabric topologies, power and
thermal constraints, system controls, and high-speed networking.
Key responsibilities include creating new platform architectures
for next-generation Sohu AI servers as part of the future product
development roadmap. You will collaborate
cross-functionally with electrical, mechanical, thermal, firmware,
and operations teams to architect systems that scale from a single
server to full-rack and multi-rack POD deployments.

Key responsibilities
- Architect the end-to-end hardware system stack, including
  server-level components, rack-scale systems, and multi-rack POD
  designs optimized for AI and high-performance workloads
- Design and implement advanced PCIe Gen5/Gen6 topologies: root
  complex architecture, retimer placement, switch hierarchy, and
  accelerator fan-out strategies
- Define scalable BMC architecture and platform management features
  across fleet deployments, including telemetry pipelines,
  orchestration hooks, and API integrations (e.g., Redfish, IPMI)
- Specify and lead the implementation of chip-to-chip interconnects
  such as NVLink, UCIe, and other emerging high-bandwidth,
  low-latency fabrics
- Develop integration strategies for power distribution, control
  planes, cooling systems (air and liquid), and shared interconnect
  fabrics at the rack level
- Own the networking architecture across servers and racks,
  including 400G/800G Ethernet, leaf-spine switching, NIC-to-ToR
  planning, and cross-rack topology
- Specify power delivery systems for high-density, multi-kilowatt
  platforms: VRM selection, power trees, sequencing, and protection
  logic
- Guide system design decisions with awareness of mechanical and
  thermal constraints to ensure performance, manufacturability, and
  serviceability
- Contribute to rack-level management infrastructure: CDU planning,
  telemetry aggregation, rack controller architecture, and
  out-of-band control
- Support bring-up and validation teams in debugging complex issues
  at the system, rack, and POD levels

You may be a good fit if you
have:
- 8 years of experience in system or server hardware architecture,
  ideally in HPC, AI infrastructure, or hyperscale data centers
- Deep understanding of PCIe protocols and topologies, including
  bifurcation, retimer tuning, switch fabrics, and accelerator
  communication
- Experience with rack-level and multi-rack system design,
  including shared power and networking infrastructure
- Strong expertise in BMC systems, control buses, telemetry
  integration, and orchestration tooling
- Familiarity with modern high-speed networking technologies: 400G
  Ethernet, InfiniBand, CXL fabrics, and NIC-switch integration
- Proven background in power architecture for dense compute
  systems, including power budgeting, sequencing logic, and VRM
  optimization
- Rack-level management infrastructure design experience, including
  CDU layout, telemetry aggregation, and rack controller
  implementation
- Proven track record of building infrastructure for at-scale
  deployment, such as automated diagnostics, health monitoring, and
  fleet orchestration frameworks
- Understanding of thermal design principles such as airflow,
  heatsink selection, and liquid cooling systems
- A systems-level perspective with the ability to design scalable,
  maintainable, and high-performance platforms
- Excellent communication skills and experience collaborating with
  hardware, firmware, validation, and mechanical engineering teams

Benefits
- Medical, dental, and vision packages with generous premium
  coverage
- $500 per month credit for waiving medical benefits
- Housing subsidy of $2k per month for those living within walking
  distance of the office
- Relocation support for those moving to San Jose (Santana Row)
- Various wellness benefits covering fitness, mental health, and
  more
- Daily lunch and dinner in our office

How we’re different
Etched believes in the Bitter Lesson. We think most of the progress
in
the AI field has come from using more FLOPs to train and run
models, and the best way to get more FLOPs is to build
model-specific hardware. Larger and larger training runs encourage
companies to consolidate around fewer model architectures, which
creates a market for single-model ASICs. We are a fully in-person
team in San Jose (Santana Row), and greatly value engineering
skills. We do not have boundaries between engineering and research,
and we expect all of our technical staff to contribute to both as
needed.