On Robust Prefix-Tuning for Text Classification
Prefix-tuning lacks robustness, and current defense methods hamper the modularity of the prefixes. We tune an additional prefix during inference to steer correct activation of the pretrained LM, which significantly improves robustness.

A Class of Short-term Recurrence Anderson Mixing Methods and Their Applications
We develop a novel class of short-term recurrence Anderson mixing methods and validate their effectiveness in several applications, including solving fixed-point problems and training neural networks.

Stochastic Anderson Mixing for Nonconvex Stochastic Optimization
We propose a stochastic version of Anderson mixing with theoretical guarantees and promising results in training neural networks.

Segment, Mask, and Predict: Augmenting Chinese Word Segmentation with Self-Supervision
We propose a self-supervised CWS approach with a straightforward and effective architecture, which outperforms previous methods on 9 different CWS datasets.