Abstract
To combat adversarial spelling mistakes, we propose placing a word recognition model in front of the downstream classifier. Our word recognition models build upon the RNN semi-character architecture, introducing several new backoff strategies for handling rare and unseen words. Trained to recognize words corrupted by random adds, drops, swaps, and keyboard mistakes, our method achieves 32% relative (and 3.3% absolute) error reduction over the vanilla semi-character model. Notably, our pipeline confers robustness on the downstream classifier, outperforming both adversarial training and off-the-shelf spell checkers. Against a BERT model fine-tuned for sentiment analysis, a single adversarially-chosen character attack lowers accuracy from 90.3% to 45.8%. Our defense restores accuracy to 75%. Surprisingly, better word recognition does not always entail greater robustness. Our analysis reveals that robustness also depends upon a quantity that we denote the sensitivity.
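To make the four corruption types named above concrete, the following is a minimal sketch of single-character adds, drops, swaps, and keyboard mistakes. It is an illustration under stated assumptions, not the authors' implementation: the function names, the explicit position argument, and the small `KEYBOARD_NEIGHBOURS` table are all hypothetical.

```python
# Hypothetical, partial QWERTY neighbour table (illustration only).
KEYBOARD_NEIGHBOURS = {
    "a": "qwsz",
    "e": "wrsd",
    "o": "ipkl",
    "s": "awedxz",
    "t": "rfgy",
}


def swap(word: str, i: int) -> str:
    """Swap the characters at positions i and i + 1."""
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def drop(word: str, i: int) -> str:
    """Delete the character at position i."""
    return word[:i] + word[i + 1:]


def add(word: str, i: int, ch: str) -> str:
    """Insert character ch before position i."""
    return word[:i] + ch + word[i:]


def keyboard(word: str, i: int, k: int = 0) -> str:
    """Replace the character at position i with its k-th keyboard neighbour.

    Returns the word unchanged if the character is not in the
    (assumed, partial) neighbour table.
    """
    neighbours = KEYBOARD_NEIGHBOURS.get(word[i])
    if not neighbours:
        return word
    return word[:i] + neighbours[k % len(neighbours)] + word[i + 1:]


if __name__ == "__main__":
    print(swap("great", 1))      # gerat
    print(drop("great", 2))      # grat
    print(add("great", 2, "x"))  # grxeat
    print(keyboard("great", 2))  # grwat
```

Taking the edit position as an argument keeps the sketch deterministic; a random attack would sample the position and edit type, while an adversarial one would search over them for the edit that changes the classifier's prediction.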