On Robust Prefix-Tuning for Text Classification

Prefix-tuning lacks robustness, and existing defense methods hamper the modularity of the prefixes. We tune an additional prefix during inference to steer the pretrained LM toward its correct activations, which significantly improves robustness.
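
A minimal toy sketch of the inference-time mechanism described above, assuming a tiny frozen model: only the additional prefix is tuned, so that activations on a perturbed input match canonical activations recorded from clean inputs. The model, shapes, loss, and optimizer settings are illustrative assumptions, not the authors' implementation.

```python
import torch

torch.manual_seed(0)
d, n_prefix = 16, 4
W = torch.randn(d, d)                    # frozen "pretrained" weights
task_prefix = torch.randn(n_prefix, d)   # frozen task-specific prefix

def activations(extra_prefix, tokens):
    """Prepend both prefixes and return a pooled layer activation."""
    seq = torch.cat([task_prefix + extra_prefix, tokens], dim=0)
    return torch.tanh(seq @ W).mean(dim=0)

clean = torch.randn(8, d)
canonical = activations(torch.zeros(n_prefix, d), clean).detach()

perturbed = clean + 0.5 * torch.randn_like(clean)  # stand-in for an attack
extra_prefix = torch.zeros(n_prefix, d, requires_grad=True)
opt = torch.optim.Adam([extra_prefix], lr=1e-2)
for _ in range(100):                     # tune ONLY the additional prefix
    loss = ((activations(extra_prefix, perturbed) - canonical) ** 2).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the model and the original prefix stay frozen, the extra prefix can be discarded after inference, preserving the modularity that other defenses sacrifice.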

A Class of Short-term Recurrence Anderson Mixing Methods and Their Applications

We develop a novel class of short-term recurrence Anderson mixing methods and validate their effectiveness in several applications, including solving fixed-point problems and training neural networks.
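
For background, here is classical limited-memory Anderson acceleration for a fixed-point problem x = g(x) in NumPy. The paper's short-term recurrence variant is not reproduced here; the window size m and damping beta are illustrative choices.

```python
import numpy as np

def anderson(g, x0, m=2, iters=50, beta=1.0):
    x = x0
    X, F = [], []                        # short histories of iterates / residuals
    for _ in range(iters):
        f = g(x) - x                     # fixed-point residual
        X.append(x); F.append(f)
        if len(F) > m + 1:               # keep only a short window
            X.pop(0); F.pop(0)
        if len(F) == 1:
            x = x + beta * f             # plain (damped) fixed-point step
            continue
        dX = np.stack([X[i+1] - X[i] for i in range(len(X)-1)], axis=1)
        dF = np.stack([F[i+1] - F[i] for i in range(len(F)-1)], axis=1)
        gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)  # mixing coefficients
        x = x + beta * f - (dX + beta * dF) @ gamma
    return x

# Example: solve x = cos(x) componentwise.
x_star = anderson(np.cos, np.ones(3))
print(np.max(np.abs(x_star - np.cos(x_star))))  # residual near zero
```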

Stochastic Anderson Mixing for Nonconvex Stochastic Optimization

We propose a stochastic version of Anderson mixing with theoretical guarantees and promising results in training neural networks.
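
A heavily simplified sketch of the underlying idea, assuming a toy quadratic objective: treat the scaled negative stochastic gradient as the fixed-point residual and apply the same mixing update as above. The ridge term delta stands in for the stabilization a practical method needs; the paper's actual update and its guarantees are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(w):                       # stochastic gradient of 0.5 * ||w||^2
    return w + 0.01 * rng.normal(size=w.shape)

w = rng.normal(size=5)
W_hist, R_hist, lr, m, delta = [], [], 0.1, 2, 1e-2
for step in range(300):
    r = -lr * noisy_grad(w)              # residual = scaled descent direction
    W_hist.append(w); R_hist.append(r)
    if len(R_hist) > m + 1:
        W_hist.pop(0); R_hist.pop(0)
    if len(R_hist) == 1:
        w = w + r                        # first step: plain SGD
        continue
    dW = np.stack([W_hist[i+1] - W_hist[i] for i in range(len(W_hist)-1)], axis=1)
    dR = np.stack([R_hist[i+1] - R_hist[i] for i in range(len(R_hist)-1)], axis=1)
    # Ridge-regularized least squares keeps the mixing step stable
    # under gradient noise (a stand-in for the paper's stabilization).
    A = dR.T @ dR + delta * np.eye(dR.shape[1])
    gamma = np.linalg.solve(A, dR.T @ r)
    w = w + r - (dW + dR) @ gamma
print(np.linalg.norm(w))                 # shrinks toward the noise floor
```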

Segment, Mask, and Predict: Augmenting Chinese Word Segmentation with Self-Supervision

We propose a self-supervised approach to Chinese word segmentation (CWS) with a straightforward and effective architecture, which outperforms previous methods on nine CWS datasets.
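
A toy interpretation of the segment-mask-predict idea, not the paper's architecture: score a candidate segmentation by how predictable each of its words is once masked. Here a unigram lexicon stands in for the paper's predictor; the lexicon and probabilities are invented for illustration.

```python
import math

# Toy stand-in: word "predictability" from a tiny frequency lexicon.
LEXICON = {"我们": 0.4, "喜欢": 0.3, "自然语言": 0.2, "自然": 0.05, "语言": 0.05}
UNK = 1e-4                               # unknown words are hard to predict

def predictability(words):
    """Log-probability that a unigram predictor recovers each masked word."""
    return sum(math.log(LEXICON.get(w, UNK)) for w in words)

# Two segmentations of "我们喜欢自然语言" ("we like natural language").
cands = [["我们", "喜欢", "自然语言"], ["我们", "喜欢", "自然", "语言"]]
best = max(cands, key=predictability)
print("/".join(best))                    # 我们/喜欢/自然语言
```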