Electronic Design Automation (EDA), known as “the mother of chips”, is the cornerstone of the chip design industry. The Design, Automation and Test in Europe Conference (DATE) is a top international academic conference in EDA. Recently, the Parallel Data and Storage Laboratory (HUST-PDSL), led by Prof. Wan Jiguang of the Wuhan National Laboratory for Optoelectronics (WNLO), collaborated with Dr. Wang Jianzong’s team from the Artificial Intelligence Center at Ping An Technology (PAT). Their joint paper, published at DATE 2025, received the Best Paper Award.
The team proposed a novel chunk-level mixed-precision quantization strategy named “Cocktail” to optimize long-context inference for large language models (LLMs). By comparing the similarity between the query and each chunk of the context, the researchers developed an efficient mixed-precision quantization search for the key-value (KV) cache. In addition, a KV cache chunk reordering strategy significantly improved the hardware efficiency of the cache after mixed-precision quantization. Compared with the previous industry-standard mixed-precision quantization algorithm, this approach can reduce GPU memory usage by 10.4% and token generation latency by 21.4%.
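The two ideas described above, choosing a precision per chunk from query-chunk similarity and then reordering chunks so that chunks of equal precision sit together, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the cosine-similarity measure, the thresholds (0.3 and 0.6) and the bit widths (2, 4 and 16) are all assumptions chosen only for illustration.

```python
import numpy as np

def assign_chunk_bits(query_vec, chunk_vecs, thresholds=(0.3, 0.6), bit_options=(2, 4, 16)):
    """Pick a bit width per context chunk from its cosine similarity to the query.

    Hypothetical rule: less similar chunks get fewer bits. The thresholds and
    bit widths are illustrative assumptions, not values from the paper.
    """
    q = query_vec / np.linalg.norm(query_vec)
    c = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    sims = c @ q  # cosine similarity of every chunk to the query
    # np.digitize maps each similarity to a bucket index 0..len(thresholds)
    buckets = np.digitize(sims, thresholds)
    return sims, np.asarray(bit_options)[buckets]

def reorder_by_precision(bits):
    """Return a chunk order that groups equal bit widths contiguously.

    A stable sort keeps the original order inside each precision group.
    """
    return np.argsort(bits, kind="stable")

# Toy embeddings: one query vector and four context chunks (made-up data).
query = np.array([1.0, 0.0])
chunks = np.array([[1.0, 0.1], [0.0, 1.0], [1.0, 2.0], [0.9, 0.0]])

sims, bits = assign_chunk_bits(query, chunks)
order = reorder_by_precision(bits)
print(bits.tolist())   # bit width chosen for each chunk
print(order.tolist())  # chunk indices after grouping by bit width
```

In this toy setup the chunk least related to the query receives the lowest precision, and the reordering places same-precision chunks next to each other so that each group could be stored and processed as one contiguous block.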
Artificial intelligence (AI), especially large language models (LLMs), has entered a stage of widespread deployment and application. However, performance bottlenecks during deployment remain a major challenge for AI development, while also presenting key opportunities for integrating LLMs with computer architecture. This achievement, a collaborative effort between the laboratory and Ping An Technology, demonstrates the in-depth integration of storage and AI, highlights the strong partnership between academia and industry, and provides an effective method for accelerating inference.
DATE, co-organized by the Institute of Electrical and Electronics Engineers (IEEE) and the Association for Computing Machinery (ACM) since 1994, attracts leading scholars and industry experts in electronic design automation and testing from around the world. This year’s conference received over 1,200 submissions, with a 25% acceptance rate. A total of four Best Paper Awards were selected across the D, A, T and E tracks; PDSL’s paper won the award in the E track.
The paper lists Huazhong University of Science and Technology (HUST) as the first contributing institution, with PhD candidate Tao Wei from Wuhan National Laboratory for Optoelectronics as the first author. Prof. Wan Jiguang and Dr. Wang Jianzong serve as co-corresponding authors. The research was supported by projects including the Guangdong Provincial Key R&D Program (“Development and Application of Edge Computing Open Systems for Human-Computer Collaboration”) and the National Key R&D Program (“Distributed Storage Systems for New Computing Paradigms”).
PDSL, led by Prof. Wan Jiguang of WNLO, focuses on frontier research in areas such as storage systems, computer system architecture, and database systems. The laboratory developed a parallel file system that improved the world record in the IO500 supercomputer storage performance ranking (ten-node list) by 15 times, and has cultivated two recipients of Huawei’s “Talented Youngsters” program. PDSL will continue to collaborate with PAT to explore innovative applications of large models in storage system optimization.
Written by: Su Wanxin
Edited by: Ren Xinni, Chang Wen, Peng Yumeng