High-Performance Optimisation (Postdoctoral Researcher)
Huawei is a leading global information and communications technology (ICT) solutions provider. Through our constant dedication to customer-centric innovation and strong partnerships, we have established leading end-to-end capabilities and strengths across the carrier networks, enterprise, consumer, and cloud computing fields. Our products and solutions have been deployed in over 170 countries serving more than one third of the world’s population.
For the Computing Systems Laboratory, we are hiring three Postdoctoral Researchers within the scope of an optimization-focused high-performance computing platform. In the coming 1-2 years, we aim exclusively at foundational scientific problems in high-performance distributed- and shared-memory parallel optimization, with the goal of producing publications in top scientific journals. In the longer term, the platform aims to solve industrial optimization problems either on-premises or as-a-Service.
This research is conducted jointly with a team of leading scientists at Huawei’s Theory Lab in Hong Kong, with whom successful candidates are expected to work closely. Researchers in Zurich are expected to work on basic operators that provide functionality at a level similar to BLAS, SparseBLAS, GraphBLAS, LAPACK, LAGraph, etc., focusing on the foundational operators required by optimization workloads.
Successful candidates will work on:
- Identifying both existing and novel basic operations relevant to optimization platforms;
- Speed-of-light analyses of both existing and newly identified basic operations that can a) identify fundamental performance bottlenecks, b) accurately predict scalability properties (e.g., iso-efficiency), c) predict trade-off effects (e.g., memory vs. communication), and d) predict which combination of devices (classic CPUs or specialized accelerators, and how many) leads to the most efficient solves (see the sketch after this list);
- The design and prototyping of highly scalable, highly efficient, and highly productive software systems that lie at the foundation of our next-generation optimization platform.
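For concreteness, below is a minimal sketch of the kind of speed-of-light analysis meant above, assuming a simple roofline-style model in which an operation can run no faster than either its compute-bound or its memory-bound limit; the machine and kernel figures used are illustrative placeholders only.

```cpp
// Minimal roofline-style speed-of-light estimate for a basic operation.
// All machine and kernel figures below are illustrative placeholders.
#include <algorithm>
#include <cstdio>

struct Machine {
    double peak_flops;     // peak compute throughput, in flop/s
    double mem_bandwidth;  // peak memory bandwidth, in bytes/s
};

struct Kernel {
    double flops;          // total floating-point work
    double bytes;          // minimum data moved to and from memory
};

// Lower bound on execution time: the operation can be no faster than
// either its compute-bound or its memory-bound limit.
double speed_of_light_seconds(const Machine& m, const Kernel& k) {
    return std::max(k.flops / m.peak_flops, k.bytes / m.mem_bandwidth);
}

int main() {
    // Example: double-precision SpMV with nnz nonzeros in CRS format,
    // counted as 2 flops and roughly 12 bytes (value plus column index)
    // streamed per nonzero.
    const double nnz = 1e8;
    const Kernel spmv{ 2.0 * nnz, 12.0 * nnz };
    const Machine node{ 2e12, 2e11 };  // 2 Tflop/s, 200 GB/s (placeholders)
    std::printf("lower bound: %.4f s\n", speed_of_light_seconds(node, spmv));
    return 0;
}
```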
Responsibilities
As such, successful candidates will:
- Design and implement novel basic operators required for our optimization platform;
- Analyze specific algorithms for basic operators and establish fundamental limits in models of parallel computation that account not only for classic work (flops) and compute power (flop/s), but also for data reuse, memory throughput, and access latencies;
- Follow, as appropriate, cache-aware or cache-oblivious paradigms, as well as standard HPC paradigms for shared- and distributed-memory parallelization, vectorization, etc.;
- Research novel data structures to speed up basic operator execution on traditional CPUs with vector and matrix SIMD, as well as less traditional xPUs such as AI accelerators;
- Ensure solvers may be easily expressed as data-centric C++ control flow around calls to basic operators that automatically dispatch the solver over potentially multiple xPUs (see the solver sketch after this list);
- Use and, if necessary, extend run-time systems and communication layers to achieve higher basic operator efficiency and better scalability, and to automate computational trade-offs;
- Ensure the quality and performance of all solvers implemented on top of our basic operators, enabling the solution of next-generation scientific and industrial problems.
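To give a flavour of the interplay between solvers and basic operators, the following is a hypothetical sketch of a solver written as plain, data-centric C++ control flow around calls to basic operators. The container types, operator names, and the sequential reference implementations are illustrative placeholders only; in the actual platform, each operator would dispatch over one or more xPUs.

```cpp
// Hypothetical sketch: a solver as data-centric C++ control flow around
// basic operators. All types and names are illustrative placeholders; the
// plain loops stand in for operators that would dispatch over xPUs.
#include <cstddef>
#include <cstdio>
#include <vector>

using Vector = std::vector<double>;

struct Matrix {                       // CRS storage of a sparse matrix
    std::vector<std::size_t> row_ptr, col_idx;
    std::vector<double>      values;
};

// Basic operators (sequential reference versions).
void spmv(Vector& y, const Matrix& A, const Vector& x) {     // y = A x
    for (std::size_t i = 0; i + 1 < A.row_ptr.size(); ++i) {
        double sum = 0.0;
        for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
            sum += A.values[k] * x[A.col_idx[k]];
        y[i] = sum;
    }
}
double dot(const Vector& x, const Vector& y) {                // <x, y>
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) sum += x[i] * y[i];
    return sum;
}
void axpy(Vector& y, double alpha, const Vector& x) {         // y += alpha x
    for (std::size_t i = 0; i < y.size(); ++i) y[i] += alpha * x[i];
}

// Steepest descent for A x = b (A symmetric positive definite): the solver
// itself is ordinary control flow; the heavy lifting stays in the operators.
void steepest_descent(Vector& x, const Matrix& A, const Vector& b,
                      std::size_t max_iters, double tol2) {
    Vector r(x.size()), Ar(x.size());
    spmv(r, A, x);
    axpy(r, -1.0, b);                  // r = A x - b (the gradient)
    for (std::size_t k = 0; k < max_iters; ++k) {
        const double rr = dot(r, r);
        if (rr < tol2) break;
        spmv(Ar, A, r);
        const double alpha = rr / dot(r, Ar);
        axpy(x, -alpha, r);            // descend along -r
        axpy(r, -alpha, Ar);           // update the gradient incrementally
    }
}

int main() {
    // Tiny SPD system: [[4,1],[1,3]] x = [1,2], solution ~ (0.0909, 0.6364).
    Matrix A{ {0, 2, 4}, {0, 1, 0, 1}, {4.0, 1.0, 1.0, 3.0} };
    Vector b{1.0, 2.0}, x{0.0, 0.0};
    steepest_descent(x, A, b, 1000, 1e-20);
    std::printf("x = (%f, %f)\n", x[0], x[1]);
    return 0;
}
```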
Requirements
Successful candidates will have in-depth experience with several of the following:
- Optimization of irregular algorithms, such as graph computations or sparse numerical linear algebra, spanning high-level data structures and algorithms down to low-level code optimisations such as SIMD and coarse- and fine-grained locking mechanisms (see the SpMV sketch after this list);
- Multi-core and many-core programming (e.g., POSIX Threads or OpenMP);
- Distributed-memory programming (e.g., MPI, BSP, or LPF), using both collective communications and raw RDMA;
- Code generation for high-performance computations and/or in-depth knowledge of its underlying methodologies (e.g., ALP, BLIS, DaCe, Spiral, FLAME, Firedrake, etc.).
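As an illustration of the kind of low-level work on irregular kernels referred to above, here is a minimal OpenMP-parallel sparse matrix-vector multiply over CRS storage; the data layout and function name are illustrative only.

```cpp
// Minimal shared-memory parallelisation of an irregular kernel:
// OpenMP-parallel sparse matrix-vector multiply in CRS format.
// The data layout and names are illustrative placeholders.
#include <cstddef>

void spmv_crs(double* y, const std::size_t* row_ptr, const std::size_t* col_idx,
              const double* values, const double* x, std::size_t n_rows) {
    // Rows are independent; a dynamic schedule balances irregular row lengths.
    #pragma omp parallel for schedule(dynamic, 64)
    for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(n_rows); ++i) {
        double sum = 0.0;
        for (std::size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += values[k] * x[col_idx[k]];
        y[i] = sum;
    }
}
```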
In addition, successful candidates master all of the following:
- Generic programming in C++11 (or higher), with strong knowledge of standard algorithms and data structures as found in the STL and beyond;
- Performance analysis and parallel debugging (e.g., Valgrind, GNU Debugger, CI testing);
- Excellent written and verbal communication skills with a proven ability to present complex technical information clearly and concisely to a variety of audiences;
- Track record of publications at top HPC or applied math conferences or journals;
- Collaborative work style with the ability to work in a multicultural environment.
The following additional experiences and in-depth knowledge would be considered a plus:
- GraphBLAS or Algebraic Programming (ALP);
- Any aspect of optimization or its key solvers;
- State-of-the-art fabrics and their programming (e.g., InfiniBand and ibverbs);
- Publications at top venues in physical sciences or theoretical computer science; and
- SIMT or accelerator programming (e.g., CUDA, OpenCL), in particular with Huawei Ascend.
- Department: Computing Systems
- Locations: Huawei Research Center Zürich
- Employment type: Contract