Abstract
Overparameterized neural networks trained to minimize average loss can be highly accurate on average on an i.i.d. test set, yet consistently fail on atypical groups of the data (e.g., by learning spurious correlations that hold on average but not in such groups). Distributionally robust optimization (DRO) provides an approach for learning models that instead minimize the worst-case training loss over a set of pre-defined groups. We find that naively applying group DRO to overparameterized neural networks fails: these models can perfectly fit the training data, and any model with vanishing average training loss already has vanishing worst-case training loss as well. Instead, their poor worst-case performance arises from poor generalization on some groups. By coupling group DRO models with increased regularization (stronger-than-typical ℓ2 regularization or early stopping), we achieve substantially higher worst-group accuracies, with 10-40 percentage point improvements over standard models on a natural language inference task and two image tasks, while maintaining high average accuracies. Our results suggest that regularization is critical for worst-group generalization in the overparameterized regime, even if it is not needed for average generalization. Finally, we introduce and provide convergence guarantees for a stochastic optimizer for this group DRO setting, which underpins the empirical study above.
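To make the objective concrete, the sketch below illustrates a worst-group training loss in a PyTorch-style setting. This is an illustrative simplification, not the paper's stochastic optimizer: the function name `worst_group_loss`, the cross-entropy loss, and the batch-level maximum over groups are assumptions introduced here for exposition.

```python
# Minimal sketch (assumes PyTorch): minimize the worst per-group loss in a batch,
# rather than the average loss, as in the group DRO objective described above.
import torch
import torch.nn.functional as F

def worst_group_loss(logits, labels, group_ids, n_groups):
    """Return the maximum average cross-entropy loss over groups present in the batch."""
    per_example = F.cross_entropy(logits, labels, reduction="none")
    group_losses = []
    for g in range(n_groups):
        mask = group_ids == g
        if mask.any():  # skip groups with no examples in this batch
            group_losses.append(per_example[mask].mean())
    return torch.stack(group_losses).max()

# Hypothetical usage: backpropagate the worst-group loss and pair it with strong
# L2 regularization (weight decay) or early stopping, as the abstract suggests.
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=1.0)
# loss = worst_group_loss(model(x), y, group_ids, n_groups)
# loss.backward(); optimizer.step()
```

In practice one would use a smoother stochastic procedure (e.g., maintaining weights over groups) rather than a hard per-batch maximum; the hard maximum here is only meant to convey the worst-case objective.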