Abstract Recently, DNN model compression based on network architecture design, e.g., SqueezeNet, has attracted a lot of attention. Compared to well-known models, these extremely compact networks show no accuracy drop on image classification. An emerging question, however, is whether these compression techniques hurt a DNN's learning ability beyond classifying images on a single dataset. Our preliminary experiment shows that these compression methods could degrade domain adaptation (DA) ability, even though classification performance is preserved. In this work, we propose a new compact network architecture and an unsupervised DA method. The DNN is built on a new basic module, Conv-M, that provides more diverse feature extractors without significantly increasing parameters. The unified framework of our DA method simultaneously learns invariance across domains, reduces divergence of feature representations, and adapts label prediction. Our DNN has 4.1M parameters, only 6.7% of AlexNet or 59% of GoogLeNet. Experiments show that our DNN obtains GoogLeNet-level accuracy on both classification and DA, and our DA method slightly outperforms previous competitive ones. Put together, our DA strategy based on our DNN achieves state-of-the-art results on sixteen of the eighteen DA tasks on the popular Office-31 and Office-Caltech datasets.