Chenxi Wang (王晨曦)

Associate Professor
Institute of Computing Technology (ICT)
Chinese Academy of Sciences (CAS)
University of Chinese Academy of Sciences (UCAS)
Email: wangchenxi@ict.ac.cn

News

- Orthrus, the first online monitoring system for CPU silent errors, has been accepted to SOSP'25. Congrats to Chenxiao, Zhenting, and all the collaborators!
- A follow-up to MemLiner has been accepted to TOCS'25. Congrats to Shengkai, Haonan, and all the collaborators!
- Serving on the PCs of NSDI 2026 and ASPLOS 2026.
- Beehive has been accepted to NSDI 2025. Congrats to Quanxi, Hong, Ying, Yanwen, and all the collaborators!
- Serving on the PCs of PLDI 2025 and USENIX ATC 2025.
- Atlas has been accepted to OSDI 2024, the first student OSDI paper from ICT, CAS. Congrats to Lei and Shi!
- Serving on the PCs of NSDI 2025, ASPLOS 2025, and USENIX ATC 2024.
- Occamy has been accepted to ASPLOS'23. Congratulations to Zhongcheng and all the collaborators!
- Awarded the ACM SIGOPS China Rising Star Award.
- Hermit has been accepted to NSDI'23. Congrats to Yifan and all the collaborators!
- Canvas has been accepted to NSDI'23. Congrats to Yifan and all the collaborators!
- MemLiner received the Jay Lepreau Best Paper Award at OSDI'22. Congrats to Haoran and the great team!

I am actively looking for self-motivated Master's and Ph.D. students. Please contact me if you are interested in systems research!

About me

I have been an associate professor at the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS) since 2022. Before joining ICT, I was a postdoc at UCLA, working with Dr. Harry Xu. I received my Ph.D. from the Institute of Computing Technology, Chinese Academy of Sciences in 2018, under the supervision of Dr. Xiaobing Feng, and my Bachelor's degree from Tianjin University in 2013. My research interest is in building hard-core systems, including managed runtimes and big data systems, for emerging hardware such as GPUs and resource-disaggregated datacenters.

View my GitHub

Research

My current research focuses on building cross-layer systems for warehouse-scale computers.

#1 System for Resource Disaggregation

As an emerging datacenter architecture, resource disaggregation reorganizes each kind of datacenter hardware into dedicated resource servers, which improves resource utilization and fault tolerance and simplifies hardware adoption. These servers are connected by advanced network fabrics, such as InfiniBand and Intel Fabrics. As a result, a cloud application running on a resource-disaggregated cluster obtains its compute and memory resources from different servers.

(1) Runtime for Disaggregated Memory

I have developed a series of runtime systems that improve data locality and mitigate resource contention for cloud applications running on resource-disaggregated datacenters. Semeru [OSDI'20] separates a program's application threads from its GC tracing threads and schedules them onto the corresponding CPU and memory servers; the GC threads can then run concurrently and continuously on the memory servers without interfering with the application threads on the CPU servers. We also proposed Mako [PLDI'22], a low-pause, concurrent compaction algorithm for resource-disaggregated clusters. MemLiner [OSDI'22] lines up the memory accesses of application threads and GC threads to reduce their interference, yielding a smaller working set and cleaner access patterns for the prefetcher.
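To make the decoupling concrete, below is a minimal Java sketch of the general idea, not Semeru's actual code: a "GC tracing" thread continuously drains a shared gray queue while the "application" thread keeps allocating, and a simple allocate-gray barrier keeps the trace sound. All class and variable names here are invented for illustration.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;

// Toy heap object: just a list of outgoing references.
final class Obj {
    final List<Obj> fields = new CopyOnWriteArrayList<>();
}

public class DecoupledTracing {
    static final ConcurrentLinkedQueue<Obj> gray = new ConcurrentLinkedQueue<>();
    static final Set<Obj> marked =
            Collections.newSetFromMap(Collections.synchronizedMap(new IdentityHashMap<>()));
    static volatile boolean mutating = true;

    // Mark an object and enqueue it for tracing exactly once.
    static void markGray(Obj o) {
        if (marked.add(o)) gray.add(o);
    }

    public static void main(String[] args) throws InterruptedException {
        Obj root = new Obj();
        markGray(root);

        // "Memory-server" side: trace reachable objects, concurrently with mutation.
        Thread tracer = new Thread(() -> {
            while (mutating || !gray.isEmpty()) {
                Obj o = gray.poll();
                if (o == null) { Thread.onSpinWait(); continue; }
                for (Obj child : o.fields) markGray(child);
            }
        });

        // "CPU-server" side: the application allocates and links new objects.
        Thread mutator = new Thread(() -> {
            Obj cur = root;
            for (int i = 0; i < 100_000; i++) {
                Obj fresh = new Obj();
                markGray(fresh);       // allocation barrier: new objects start marked
                cur.fields.add(fresh);
                cur = fresh;
            }
            mutating = false;
        });

        tracer.start();
        mutator.start();
        mutator.join();
        tracer.join();
        System.out.println("traced " + marked.size() + " objects"); // 100_001
    }
}
```

In a real disaggregated runtime the two threads would live on different servers, and the hard problems are exactly the ones the papers tackle: keeping the trace sound under concurrent mutation, and keeping the two kinds of threads from polluting each other's working sets.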

(2) Kernel for Disaggregated Memory

Existing paging and swap systems are often used to manage the remote memory servers. However, the current swap data plane was designed long ago for slow, disk-based swapping, so it becomes a major performance bottleneck when fast remote memory is used: it (1) lacks performance isolation and (2) scales poorly. To solve these problems, I redesigned the swap system as Canvas [NSDI'23], which provides holistic resource isolation and adaptive management strategies, e.g., for swap-entry allocation, prefetching, and scheduling, for diverse cloud applications.
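The isolation idea can be illustrated with a small user-level Java toy; Canvas itself is a Linux kernel redesign, and none of the names below come from it. Here each application allocates swap slots from its own private partition, so one application's allocation churn cannot contend with or fragment another's slots.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// User-level toy (NOT the Canvas kernel code): per-application swap-slot
// partitions, so applications never compete for the same allocation bitmap.
public class PartitionedSwapSlots {
    static final int SLOTS_PER_PARTITION = 1 << 16;
    private final Map<Integer, BitSet> partitions = new HashMap<>(); // appId -> used slots

    // Returns a global slot id (partition id in the high bits), or -1 if full.
    synchronized long allocate(int appId) {
        BitSet used = partitions.computeIfAbsent(appId, id -> new BitSet(SLOTS_PER_PARTITION));
        int slot = used.nextClearBit(0);
        if (slot >= SLOTS_PER_PARTITION) return -1;
        used.set(slot);
        return ((long) appId << 32) | slot;
    }

    synchronized void free(long globalSlot) {
        BitSet used = partitions.get((int) (globalSlot >>> 32));
        if (used != null) used.clear((int) globalSlot);
    }

    public static void main(String[] args) {
        PartitionedSwapSlots swap = new PartitionedSwapSlots();
        long a = swap.allocate(1), b = swap.allocate(2);
        System.out.printf("app1 slot=%d, app2 slot=%d%n", a & 0xffffffffL, b & 0xffffffffL);
        swap.free(a);
    }
}
```

A real swap data plane faces far harsher constraints (per-CPU caches, interrupt context), but the design choice sketched here is the same: partition resources per application first, then adapt each partition's policy to its workload.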

#2 System for GPUs

The rapid evolution of AI workloads has driven an unprecedented demand for GPUs in datacenters, making efficient utilization and cost reduction critically important. However, it is challenging to meet the diverse performance requirements of individual AI models while simultaneously optimizing resource usage. My research focuses on building operating system–level support for GPUs to enable transparent, fine-grained resource management.

#3 System for Mobile Devices

The Android ecosystem continues to expand, resulting in wide variations in CPU and memory demands across applications. This diversity and complexity pose significant challenges for the system, especially in thread scheduling and memory management. For example, long-running apps intensify memory pressure and incur more GC overhead, which degrades system smoothness and responsiveness. We are currently studying the mismatches between applications and the Android system and have resolved several practical issues. One such issue is the WeakReference block (https://android-review.googlesource.com/c/platform/art/+/3683212), in which application threads are blocked while the GC processes WeakReferences. By profiling WeakReference blocking in detail, we restructured the code that accesses WeakReferences during GC, reducing WeakReference-induced blocking by over 80%. [This code patch](https://github.com/ICTPLSys/Android-Art-WeakReference) has been tested by Xiaomi and Honor and is planned for deployment in their systems.
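For readers unfamiliar with weak references, the snippet below (ordinary Java, not the ART patch itself) shows the semantics at play: a WeakReference does not keep its referent alive, so the runtime must coordinate between the GC and any thread calling get() while references are being processed, and on ART that coordination is what blocked application threads.

```java
import java.lang.ref.WeakReference;

// Plain-Java illustration of WeakReference semantics (not the ART patch).
public class WeakRefDemo {
    public static void main(String[] args) {
        Object payload = new Object();
        WeakReference<Object> ref = new WeakReference<>(payload);

        System.out.println("before GC: " + (ref.get() != null)); // true: still strongly reachable

        payload = null;   // drop the only strong reference
        System.gc();      // only a hint: the JVM may or may not collect now

        // Once a collection actually runs, the referent is cleared.
        System.out.println("after GC:  " + (ref.get() != null)); // typically false
    }
}
```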

Students


I am lucky to work with the following undergraduate and graduate students:

Alumni

Service

Released systems

Selected publications

(An asterisk (*) marks the corresponding author.)

Conferences

Journals

Workshops



Complete publication list