Senior Researcher - LLM System Architecture
If you are enthusiastic about shaping Huawei's European Research Institute together with a multicultural team of leading researchers, this is the right opportunity for you!
Huawei is a leading global information and communications technology (ICT) solutions provider. Driven by a commitment to sound operations, ongoing innovation, and open collaboration, we have established a competitive ICT portfolio of end-to-end solutions in telecom and enterprise networks, devices, and cloud technology and services. Our ICT solutions, products, and services are used in more than 170 countries and regions, serving over one-third of the world's population. With 180,000 employees, Huawei is committed to enabling the future information society, and building a Better Connected World.
Huawei's Switzerland Research Centre in Zurich is responsible for advanced technical research, architecture evolution design, and strategic technical planning in computer architecture.
We are currently looking for Researchers for our new Computer Architecture Innovation Lab.
Job Responsibilities:
Efficient LLM Inference Engine Design: Architect and implement solutions to improve the efficiency of open-source (e.g., vLLM, SGLang) and proprietary (e.g., MindSpore) LLM inference engines, using techniques such as quantization, sparse attention, and KV cache reuse.
Hardware-Software Co-Design: Design efficient kernels to support LLM inference on Huawei NPUs.
Performance Engineering: Profile end-to-end AI workflows to identify bottlenecks in NPU-Python frameworks; implement low-latency, high-throughput solutions for transformer-based models and generative AI workloads.
Research & Ecosystem Leadership: Publish cutting-edge research at top-tier conferences (e.g., ISCA, ASPLOS, MLSys) and contribute to open-source projects; drive adoption of NPU frameworks through developer tools, documentation, and industry partnerships.
Requirements and Qualifications:
PhD or MSc in computer science or a field related to computer architecture.
Proficiency in Python and C/C++ for system-level programming.
Proficiency in AI frameworks (PyTorch, TensorFlow) and model optimization techniques.
Experience with xPU programming (e.g., AscendC, Triton, CUDA) and toolchains is a plus.
Experience with large visual models (LVM) and/or multi-modal models is a plus.
Excellent oral and written English.
What We Offer:
Competitive salary and incentive schemes
Research on high-impact topics
Collaboration with top researchers and university professors
International mobility
Application Process:
To apply, please click on the link.
Department: Computing Product Line
Locations: Zürich, Lausanne