CV

Education, experience, awards, skills, and patents.

Contact Information

Name: Leo (Linxiao) Li
Professional Title: AI Infrastructure & LLM Systems Researcher
Email: llinxiao520@gmail.com

Professional Summary

Machine learning systems researcher focusing on efficient training and inference for large-scale language models. My work centers on system-level optimizations, including MFU improvement, communication–computation overlap, MoE inference optimization, and hardware-efficient execution for large-scale AI infrastructure.

Currently AI Infrastructure / Training Systems Researcher at AIGCode (蔻町科技). Previously Senior Algorithm Researcher at Huawei Cloud Architecture Innovation Lab. Bachelor of Computer Science from the University of Sydney.

Experience

  • 2025 – Present

    Shanghai, China

    AI Infrastructure / Training Systems Researcher
    AIGCode (蔻町科技)
    Leading model pretraining optimization and LLM training infrastructure development.
    • Achieved a 5× performance improvement on a fused single operator (MoE Combine + FFN)
    • Tech stack: Triton, TileLang, and PyPTO with tile-paradigm programming
  • 2023 - 2025

    Shenzhen, China

    Senior Algorithm Researcher / Project Lead (PL), Huawei Fellow Team
    Huawei Technologies Co., Ltd.
    Huawei Cloud CTO Fellow Team, Cloud Architecture Innovation Lab. Collaborated with Guangming Lab Intelligent Recommendation and Resource Scheduling Group.
    • Worked on large-scale LLM inference systems and AI infrastructure
    • Led MoE inference optimization and NPU virtualization projects
    • FRCP Capacity Prediction: iTransformer-based intelligent capacity prediction system, deployed at scale
    • QoS MoE Model: MAE 3%, application classification accuracy 99.995%
  • 2022 - 2023

    Chengdu, China

    Software Development Leader
    Huawei Technologies Co., Ltd.
    Led a 5-person team as a Committer. Core developer on the Data Workshop distributed messaging system.
    • Built on the Apache Pulsar distributed messaging middleware with a self-developed storage engine
    • Zero P3 incidents over 6 months; supported major customers at scale

Selected Research Projects

  • Ascend NPU User-Space Virtualization

    Designed a user-space virtualization framework for Ascend NPUs to improve hardware utilization in large-scale AI training.

    • Achieved <3% performance overhead while increasing NPU utilization from 10% to 30%
  • MoE Inference Memory Offloading

    Designed a memory offloading framework for large-scale Mixture-of-Experts inference to reduce GPU memory pressure.

    • Achieved <3% overhead while maintaining TTFT <3s and TPOT <100ms in production systems
    • Huawei Cloud High-Potential Patent: A MoE Model-Oriented Inference Optimization Method
  • Both projects were deployed in Huawei Cloud and announced at Huawei HDC 2025 (around the 50-minute mark).

Education

  • 2017 - 2021

    Sydney, Australia

    Bachelor of Computer Science
    The University of Sydney
    Computer Science
    • WAM: 76/100
    • GRE: 334 (Q170, V164, AW 5.5)
    • Advisor: Prof. Wei (Wilson) Li
    • Received a fully funded doctoral offer from the University of Sydney
  • 2020 - 2021

    Shanghai, China

    Exchange Student
    Fudan University
    Economics
    • GPA: 3.5/4.0
  • 2019 - 2020

    Shanghai, China

    Exchange Student
    Shanghai Jiao Tong University
    Data Science and Analytics
    • GPA: 4.0/4.0

Awards

  • 2025
    Huawei ICT Software Competition — 14th / 5200+
    Huawei

    Top 3 in Huawei Cloud product line, 1st in Chengdu Research Institute.

  • 2024
    Golden Code Award (金代码)
    Huawei Cloud

    Top 2% recognition for outstanding code contributions.

  • 2024
    HCS Cloud Summit Star (云巅之星)
    Huawei Cloud

    Top 5% internal technical honor for outstanding contributions.

  • 2023
    Huawei "Rising Star" (明日之星) Award
    Huawei

    Top 5% recognition.

  • 2023
    Huawei Trustworthy Professional Certification
    Huawei

    Passed Huawei Trustworthy Professional Level Certification.

  • 2022
    Huawei "Rising Star" (明日之星) Award
    Huawei

    Top 5% recognition.

Skills

Core Domains
LLM Training/Inference, MFU Optimization, Operator/Kernel Fusion, MoE Inference & Memory Offloading, Quantization (BF16/W8A8/W4A8), Pipeline Scheduling, Multi-GPU Parallel Training
Programming Languages
Python, Java/Spring Boot, Go, C/C++, Triton, TileLang
AI Frameworks & Tools
vLLM, Megatron, MindIE, PyTorch, CANN, HCCL, NCCL, Docker, Kubernetes, Prometheus

Patents

  • 2024
    A MoE Model-Oriented Inference Optimization Method
    Huawei Cloud High-Potential Patent

    Invention patent covering memory-offloading optimization for large-scale MoE model inference.

  • 2024
    A Latency-Aware Serverless Request Scheduling Strategy

    Invention patent for intelligent serverless request scheduling.