📝 Publications

🧑‍🎨 Large Language Models

Annual Report on the Development of Computer Science and Technology in China 2025 (中国计算机科学技术发展年度报告 2025)

Advances and Trends in Theoretical Research on Large Language Models (大语言模型理论研究进展与趋势)
Yong Liu, Xiaolin Hu, Pengwei Tang, Zixuan Gong

  • Three theoretical dimensions: the report focuses on expressive power, optimization theory, and generalization theory, which jointly determine model performance and stability.
  • Theory guiding practice: scaling laws, data mixing ratios, parameter-efficient fine-tuning, prompt engineering, and related results provide theoretical grounding for pre-training, fine-tuning, and deployment (a standard scaling-law form is sketched below).
  • Mechanisms of emergent abilities: capabilities such as in-context learning (ICL) and chain-of-thought (CoT) reasoning can be explained from these three theoretical dimensions, offering theoretical support for future model design.
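As a concrete reference for the scaling laws mentioned above, a commonly cited parametric form is the Chinchilla-style law of Hoffmann et al.; this is an illustrative, standard form, not the specific formulation analyzed in the report:

```latex
% Chinchilla-style scaling law (Hoffmann et al., 2022); illustrative form only.
% L(N, D): pre-training loss as a function of parameter count N and token count D.
\[
  L(N, D) \;=\; E \;+\; \frac{A}{N^{\alpha}} \;+\; \frac{B}{D^{\beta}}
\]
% E is the irreducible loss; A, B, \alpha, \beta are fitted constants.
% Minimizing L(N, D) under a fixed compute budget C \approx 6ND gives the
% compute-optimal trade-off between model size and data size.
```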
ICLR 2025

Towards Auto-Regressive Next-Token Prediction: In-context Learning Emerges from Generalization
Zixuan Gong*, Xiaolin Hu*, Huayi Tang, Yong Liu (* Equal contribution)

  • We explore the emergence of in-context learning (ICL) capabilities in auto-regressive next-token prediction models.
  • To bridge the pre-training and ICL phases, we introduce a two-level expectation over the data and topic distributions and derive PAC-Bayes generalization bounds to support our analysis (a generic bound of this form is sketched below).
  • Additionally, we model the training process with Stochastic Differential Equations (SDEs), showing that ICL emerges from strong generalization across both sequences and topics.
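For reference, a standard McAllester/Maurer-style PAC-Bayes bound has the shape below; the paper's two-level variant over data and topic distributions is not reproduced here:

```latex
% Generic PAC-Bayes bound for losses in [0, 1] (Maurer-style); illustrative only.
% With probability at least 1 - \delta over an i.i.d. sample S of size n,
% simultaneously for all posteriors Q over hypotheses:
\[
  \mathbb{E}_{h \sim Q}\big[L(h)\big]
  \;\le\;
  \mathbb{E}_{h \sim Q}\big[\widehat{L}_S(h)\big]
  \;+\;
  \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
\]
% P is a data-independent prior, L the population risk, \widehat{L}_S the empirical risk.
```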
ICLR 2025

ADePT: Adaptive Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
Pengwei Tang, Xiaolin Hu, Yong Liu

  • We propose Adaptive Decomposed Prompt Tuning (ADePT), which produces a unique token embedding offset for each input token (a hypothetical sketch of this mechanism follows this list).
  • ADePT addresses the limitations of DePT, enabling better optimization and generalization without increasing inference time or parameters.
  • Experiments on 23 NLP tasks and 4 PLMs show ADePT outperforms leading PEFT methods and even full fine-tuning in some cases.
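A minimal sketch of the token-wise offset idea, assuming a shared shallow bottleneck network that maps each token embedding to its own offset, combined with a short learnable soft prompt; the module names, dimensions, and structure here are hypothetical, not the authors' code:

```python
import torch
import torch.nn as nn

class AdaptiveDecomposedPrompt(nn.Module):
    """Sketch of ADePT-style prompt tuning: a short soft prompt plus a
    token-wise embedding offset produced by a shared bottleneck network.
    Hypothetical illustration of the general idea, not the paper's code."""

    def __init__(self, embed_dim: int, prompt_len: int = 20, bottleneck: int = 64):
        super().__init__()
        # Learnable soft prompt prepended to every input sequence.
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)
        # Shared shallow network: each token embedding -> its own offset.
        self.offset_net = nn.Sequential(
            nn.Linear(embed_dim, bottleneck),
            nn.ReLU(),
            nn.Linear(bottleneck, embed_dim),
        )

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, embed_dim) from the frozen embedding layer.
        offsets = self.offset_net(token_embeds)      # unique offset per token
        shifted = token_embeds + offsets             # shift the input embeddings
        prompt = self.soft_prompt.unsqueeze(0).expand(token_embeds.size(0), -1, -1)
        return torch.cat([prompt, shifted], dim=1)   # prepend the soft prompt

# Usage: only this module is trained; the backbone PLM stays frozen and
# receives the returned tensor via its `inputs_embeds` argument.
adept = AdaptiveDecomposedPrompt(embed_dim=768)
dummy = torch.randn(2, 16, 768)                      # stand-in token embeddings
inputs_embeds = adept(dummy)                         # (2, 20 + 16, 768)
```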
NeurIPS 2024

Enhancing In-Context Learning with just SVD-Based Pruning: A Theoretical Perspective
Xinhao Yao, Xiaolin Hu, Shenzhi Yang, Yong Liu

  • We reveal a surprising phenomenon: SVD-based weight pruning can enhance in-context learning (ICL) performance.
  • We provide a theoretical analysis by characterizing the implicit gradient descent (GD) dynamics of ICL and deriving generalization bounds for ICL.
  • We further propose a simple, derivative-free algorithm to enhance ICL; experiments demonstrate its effectiveness (a minimal rank-truncation sketch follows this list).
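The core rank-truncation operation behind the bullets above can be written in a few lines; this is a generic sketch, and the choice of which layers/weights to prune and how to set the rank (the subject of the paper's derivative-free algorithm) is not encoded here:

```python
import torch

def svd_prune(weight: torch.Tensor, rank: int) -> torch.Tensor:
    """Keep only the top-`rank` singular directions of a 2-D weight matrix.
    Generic SVD-based truncation; layer selection and rank choice are the
    subject of the paper and are not reproduced here."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    return U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]

# Usage (hypothetical module path): prune one attention projection in place.
# with torch.no_grad():
#     layer.self_attn.v_proj.weight.copy_(
#         svd_prune(layer.self_attn.v_proj.weight, rank=64))
```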
COLING 2025

PMSS: Pretrained Matrices Skeleton Selection for LLM Fine-tuning
Qibin Wang, Xiaolin Hu, Weikai Xu, Wei Liu, Jian Luan, Bin Wang

  • We propose PMSS, which enables high-rank updates at low cost by selecting skeletons from the pre-trained weight matrices (a hypothetical sketch of the idea follows this list).
  • PMSS overcomes LoRA's low-rank limitation and optimizes the initialization to exploit semantic and linguistic information.
  • Experiments show PMSS outperforms LoRA and excels in tasks like DROP and math reasoning with fewer trainable parameters.
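A hypothetical sketch of the skeleton-selection idea, assuming a CUR-style parameterization in which frozen column/row sub-matrices ("skeletons") are taken from the pre-trained weight and only a small core matrix is trained; the selection here is random purely for illustration, whereas the paper's selection and initialization are informed, and this is not the paper's exact formulation:

```python
import torch
import torch.nn as nn

class SkeletonAdapter(nn.Module):
    """Illustrative skeleton-selection adapter in the spirit of PMSS.
    Frozen column/row skeletons come from the pre-trained weight; only the
    small (r x r) core is trainable, so r can be set larger than a LoRA
    rank under the same trainable-parameter budget."""

    def __init__(self, pretrained_weight: torch.Tensor, r: int = 64):
        super().__init__()
        out_dim, in_dim = pretrained_weight.shape
        # Random selection for illustration only; the paper selects skeletons
        # to preserve semantic/linguistic information, which is not modeled here.
        col_idx = torch.randperm(in_dim)[:r]
        row_idx = torch.randperm(out_dim)[:r]
        self.register_buffer("C", pretrained_weight[:, col_idx].clone())  # (out_dim, r), frozen
        self.register_buffer("R", pretrained_weight[row_idx, :].clone())  # (r, in_dim), frozen
        self.core = nn.Parameter(torch.zeros(r, r))                       # trainable

    def delta(self) -> torch.Tensor:
        # Weight update assembled from frozen skeletons and the trained core.
        return self.C @ self.core @ self.R                                # (out_dim, in_dim)

# Usage: effective weight during fine-tuning = frozen W + adapter.delta().
W = torch.randn(768, 768)
adapter = SkeletonAdapter(W, r=64)
W_eff = W + adapter.delta()
```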
KDD 2024

Neural Retrievers are Biased Towards LLM-Generated Content
Sunhao Dai, Yuqi Zhou, Liang Pang, Weihao Liu, Xiaolin Hu, Yong Liu, Xiao Zhang, Gang Wang, Jun Xu

  • We explore how LLM-generated texts influence IR systems, revealing a source bias where neural models favor LLM-generated documents.
  • We use information theory to explain this bias, showing it arises from the focused semantics of LLM-generated content.

🎙 Federated Learning Generalization

AAAI 2025

Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology
Xiaolin Hu, Zixuan Gong, Gengze Xu, Wei Liu, Jian Luan, Bin Wang, Yong Liu (Oral)

  • This paper provides the first generalization analysis of ZO-DSGD with changing topology.
  • The obtained generalization bounds align with SGD in (strongly) convex cases and with DSGD in non-convex cases.
  • The results reflect the impact of the number of clients, sample size, and communication topology on generalization performance (an illustrative single-round sketch follows this list).
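An illustrative single round of the algorithm analyzed above, assuming a standard two-point zeroth-order gradient estimator and a doubly stochastic gossip matrix that may change between rounds; the names and exact update order are illustrative, not the paper's notation:

```python
import numpy as np

def zo_dsgd_round(params, losses, mixing_W, lr=0.01, mu=1e-3, rng=None):
    """One round of zeroth-order decentralized SGD (illustrative sketch).

    params   : (m, d) array, one parameter vector per client
    losses   : list of m callables; losses[i](x) returns a stochastic loss value
    mixing_W : (m, m) doubly stochastic gossip matrix for the current topology
               (redrawn each round when the topology changes)
    mu       : smoothing radius of the two-point gradient estimator
    """
    rng = rng or np.random.default_rng()
    m, d = params.shape
    grads = np.zeros_like(params)
    for i in range(m):
        u = rng.standard_normal(d)                       # random probing direction
        # Two-point zeroth-order estimate of the local stochastic gradient.
        diff = losses[i](params[i] + mu * u) - losses[i](params[i] - mu * u)
        grads[i] = (diff / (2.0 * mu)) * u
    # Gossip averaging over the current topology, then a local zeroth-order step.
    return mixing_W @ params - lr * grads
```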
ICLR 2023

Generalization Bounds for Federated Learning: Fast Rates, Unparticipating Clients and Unbounded Losses
Xiaolin Hu, Shaojie Li, Yong Liu

Video

  • We present a theoretical analysis of the generalization error for non-participating clients in federated learning.
  • The obtained generalization bounds in high probability form capture the performance of a single trial, rather than the average over multiple trials.
  • We derive generalization bounds for heavy-tailed losses, applicable to federated learning with unbounded losses such as the cross-entropy loss (a schematic of the client-level and sample-level gaps follows below).
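To make the quantity of interest concrete, the schematic below writes the generalization gap for unparticipating clients and the typical orders of its two components; this is a simplified illustration of the setting, not the paper's fast-rate, high-probability, or heavy-tailed bounds:

```latex
% Schematic only: m participating clients drawn from a meta-distribution
% \mathcal{P}, n samples per client, hypothesis h trained on the m*n samples.
\[
  \underbrace{\mathbb{E}_{P \sim \mathcal{P}}\,\mathbb{E}_{z \sim P}\big[\ell(h; z)\big]}_{\text{risk on unparticipating clients}}
  -\;
  \underbrace{\frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\ell(h; z_{i,j})}_{\text{empirical risk on participating clients}}
  \;=\;
  \underbrace{\text{client-level gap}}_{\text{typically } \widetilde{\mathcal{O}}(1/\sqrt{m})}
  +\;
  \underbrace{\text{sample-level gap}}_{\text{typically } \widetilde{\mathcal{O}}(1/\sqrt{mn})}.
\]
```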

🧬 AI+Science

APMC 2020

A Deep Learning Framework for Solving Rectangular Waveguide Problems
Xiaolin Hu, Nicholas E. Buri (Oral)

Project

  • We employ Physics-Informed Neural Networks (PINNs) to solve rectangular waveguide problems (an illustrative setup follows this list).
  • We successfully apply PINNs to solving for the electric and magnetic fields, which are governed by partial differential equations (PDEs).
  • We also show that the framework can predict unknown parameters such as the wavenumber.
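A minimal sketch of the PINN setup, assuming a 2-D Helmholtz form of the waveguide field problem with a squared PDE-residual loss and zero-field wall conditions enforced by penalty; the network size, sampling, and cutoff wavenumber value are illustrative, not the paper's configuration:

```python
import torch
import torch.nn as nn

# Small fully connected network approximating one field component E(x, y).
net = nn.Sequential(
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)
kc = 3.0  # assumed cutoff wavenumber (illustrative value)

def pinn_loss(interior_xy: torch.Tensor, boundary_xy: torch.Tensor) -> torch.Tensor:
    """Residual of the Helmholtz equation  E_xx + E_yy + kc^2 * E = 0  on
    interior points, plus a penalty forcing E = 0 on the conducting walls."""
    xy = interior_xy.clone().requires_grad_(True)
    E = net(xy)
    grads = torch.autograd.grad(E.sum(), xy, create_graph=True)[0]
    E_x, E_y = grads[:, 0:1], grads[:, 1:2]
    E_xx = torch.autograd.grad(E_x.sum(), xy, create_graph=True)[0][:, 0:1]
    E_yy = torch.autograd.grad(E_y.sum(), xy, create_graph=True)[0][:, 1:2]
    residual = E_xx + E_yy + kc ** 2 * E
    # In practice an extra normalization or source term is needed to rule out
    # the trivial solution E = 0; it is omitted from this sketch.
    return residual.pow(2).mean() + net(boundary_xy).pow(2).mean()

# Usage: sample points inside the waveguide cross-section and on its walls,
# then minimize pinn_loss with a gradient-based optimizer such as Adam.
```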

🚍 Others