CV
Summary
Computer architecture researcher focusing on computer system performance evaluation, simulation methodology, and hardware-software co-design for emerging applications. I build novel sampled-simulation frameworks that make performance evaluation fast, accurate, and portable.
Education
- Computer Science (2026-06-01), University of California, Davis
- Electrical and Computer Engineering (2026-06-01), Cornell University
- Computer Science and Engineering (2023-06-01), University of California, Davis
Work Experience
- Graduate Researcher (2023-06-01 to present), DArchR Lab, UC Davis. Advisor: Jason Lowe-Power. Working on Nugget (LLVM IR sampling framework), LoopPoint validation, and gem5 contributions.
- Visiting Student (2025-08-01 to present), Computer Systems Laboratory, Cornell University. Advisor: Christopher Batten. Working on agile robotic hardware-software co-design and STM32G4 MCU modeling in gem5.
- Undergraduate Researcher (2022-06-01 to 2023-06-01), DArchR Lab, UC Davis. Advisor: Jason Lowe-Power. Implemented SimPoint and LoopPoint support in gem5.
Skills
Programming Languages
- C/C++
- Python
- Bash
- CUDA
- Assembly
- LaTeX
Hardware Description Languages
- Chisel
System Evaluation Tools
- gem5
- QEMU
Profiling & Instrumentation
- LLVM passes
- DynamoRIO
- Valgrind
- Linux perf
- PAPI
Linux Tools
- cpuset
- CRIU
- cgroups
- Docker
Compilers
- GCC/G++
- GFortran
- Clang/LLVM
Publications
- Nugget: Portable Program Snippets (2026), HPCA 2026. Introduces an LLVM IR-based sampling framework that turns long-running workloads into portable program snippets which can be analyzed once, then reused across binaries, ISAs, and microarchitectures. Achieves up to 100x+ speedup versus functional simulation on SPEC CPU2017, NPB, and LSMS workloads.
- Accelerating the Simulation of Parallel Workloads using Loop-Bounded Checkpoints (2025), ACM TACO (Under Review). Proposed LoopPoint, a synchronization-agnostic loop-based sampling methodology that enables fast, accurate simulation of multi-threaded workloads via loop-bounded checkpoints, achieving up to 801x speedup on SPEC CPU2017 with ~2.3% average runtime error.
Presentations
- LoopPoint Tools: Sampled Simulation of Complex Multi-threaded Workloads using Sniper and gem5 (2023), HPCA 2023, Montreal, Canada. Tutorial on LoopPoint sampling methodology.
- gem5 Workshop (2023), ISCA 2023, Orlando, FL, USA. Workshop on gem5 simulator.
- gem5 Bootcamp 2024 (2024), gem5 Bootcamp, Virtual. Bootcamp on gem5 simulation framework.
Teaching
- ECS 154B: Computer Architecture (2024), University of California, Davis. Role: Teaching Assistant. Led weekly discussion sections and office hours; created assignment material on Chisel-based CPU model (DINO CPU).
Portfolio
- Computer Systems Seminar at UC Davis (2024), Founder. Organized weekly speaker series with 12+ talks from academia and industry (ongoing).
- gem5 Repository Contributions (2022), Open source. Author of 50+ commits to the gem5 simulator, including full-system sampling support and related simulation features.
- CXL Shared Memory Filesystem Optimization (2024), Course project. Investigated application needs in a CXL shared-memory system and improved the FAMFS framework with efficient allocation/deallocation and zero-copy operations.
- CUDA Microbenchmarks (2024), Course project. Developed configurable microbenchmarks to measure shared-memory latency and memory-scaling behavior on GPUs.
Languages
- English: Native speaker
- Cantonese: Native speaker
- Mandarin: Professional working proficiency
Interests
- Computer Architecture: Performance Evaluation, Simulation, Hardware-Software Co-design
- Robotic Applications: Tiny Robots, HPC Workloads