Senior/Principal Researcher: Next-generation NPU and Agentic CPU Micro-architecture
If you are enthusiastic about shaping Huawei’s European Research Institute together with a multicultural team of leading researchers, this is the right opportunity for you!
Huawei envisions a world where technology connects people, empowers industries, and unlocks human potential. Guided by its mission to enrich lives through communication and intelligent innovation, Huawei stands at the forefront of global digital transformation. As a leader in Information and Communications Technology (ICT), the company pioneers breakthroughs in artificial intelligence, cloud computing, and smart devices—building the intelligent foundation of a fully connected world.
Through its Carrier, Enterprise, and Consumer business groups, Huawei delivers resilient digital infrastructure, advanced cloud and AI platforms, and transformative devices that enable progress at every level. Supporting 45 of the world’s top 50 telecom operators and serving one-third of the global population across more than 170 countries, Huawei is shaping a future where connectivity becomes a powerful catalyst for opportunity and sustainable growth.
This spirit of bold innovation is embodied by Huawei Technologies Switzerland AG. From its research hubs in Zurich and Lausanne, pioneering teams push the boundaries of High-Performance Computing, Computer Architecture, Computer Vision, Robotics, Artificial Intelligence, Neuromorphic Computing, Wireless Technologies, and Networking—architecting the intelligent systems that will define tomorrow’s digital era.
Responsibilities:
The research investigates the next generation of Neural Processing Units (NPUs), with particular focus on core micro-architecture. This spans the frontend (branch prediction, BTB, instruction prefetchers), register files, issue and wake-up logic, scalar functional units, vector functional units, and tensor units; the backend (TLBs, L1, scratchpads); and the broader cache hierarchies and memory systems that feed them. In parallel, the work explores a new generation of CPUs tailored to the agentic AI era. As agent-based systems now spend time not only on the GPU/NPU but also on the CPU—handling tool calling, agent-logic scheduling, context management, and similar tasks—the CPU must become resilient to bursty compute and capable of rapid context switching, without the resource thrashing and limited on-chip contexts that constrain today's designs (most of which support only 2-way SMT).
Investigate and prototype new architectural features, including but not limited to:
NPU Core Micro-architecture: Explore frontend mechanisms (branch prediction, BTB, instruction prefetching), register file organization, issue and wake-up logic, and the design of scalar, vector, and tensor functional units to maximize throughput and utilization for AI workloads.
NPU Backend and Memory System: Investigate backend structures including TLBs, L1 caches, and scratchpads, along with cache hierarchies and memory systems that sustain high bandwidth and low latency to the compute units.
Agentic CPU Architecture: Design CPUs tailored to agentic AI, where the processor handles tool calling, agent-logic scheduling, and context management alongside accelerator-bound work.
Resilience to Bursty Compute and Context Switching: Develop architectural support for rapid context switching and high thread-level concurrency that goes beyond conventional 2-way SMT, mitigating the on-chip resource thrashing and limited context capacity of current designs.
Produce and present research papers at top-tier conferences and journals (e.g., ASPLOS, ISCA, MICRO, HPCA).
Establish and maintain collaborations with leading academic institutions and faculty.
Mentor and support junior researchers and interns in their professional development.
Requirements:
PhD in Computer Science, Electrical Engineering, or related field.
Strong background in Computer Architecture and Micro-architecture is a must.
Creativity and the ability to think outside the box to develop innovative technologies.
Research experience and/or strong knowledge in at least one of the following areas:
Computer Architecture and Micro-architecture: Modern Intel/ARM/AMD CPUs, NVIDIA GPUs, or Google’s TPUs.
Computer architecture simulators: gem5, ChampSim, Sniper, ZSim, or QFlex.
Vector and Matrix Extensions: ARM SVE/SME or Intel SSE/AVX/AMX.
GPU programming model: Understanding of CUDA kernels and PTX/SASS instructions.
Micro-architecture characterization: Understanding of profiling, characterization, and bottleneck analysis of applications on CPUs/GPUs using Performance Monitoring Units (PMUs).
Proven track record of publishing research papers in top-tier conferences or journals.
Excellent analytical, problem-solving, and system-level thinking skills.
Strong development and prototyping skills are a must.
Strong interpersonal skills, with a collaborative spirit and the ability to work independently.
Why join us:
Collaborate with world-class scientists and engineers in an open, curiosity-driven environment;
Access to state-of-the-art technology and tools;
Opportunities for professional growth and development;
A competitive salary and a high quality of life in Zurich, at the center of Europe;
Last but certainly not least: be part of innovative projects that make a difference.
- Department: Advance Computing & Storage
- Locations: Zürich
- Employment type: Full-time
- Employment level: Professionals