
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)


Abstract

The interpretation of deep learning models is a challenge due to their size, complexity, and often opaque internal state. In addition, many systems, such as image classifiers, operate on low-level features rather than high-level concepts. To address these challenges, we introduce Concept Activation Vectors (CAVs), which provide an interpretation of a neural net's internal state in terms of human-friendly concepts. The key idea is to view the high-dimensional internal state of a neural net as an aid, not an obstacle. We show how to use CAVs as part of a technique, Testing with CAVs (TCAV), that uses directional derivatives to quantify the degree to which a user-defined concept is important to a classification result; for example, how sensitive a prediction of "zebra" is to the presence of stripes. Using the domain of image classification as a testing ground, we describe how CAVs may be used to explore hypotheses and generate insights for a standard image classification network as well as a medical application.
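To make the abstract's key idea concrete, the following is a minimal sketch of the TCAV recipe it describes: fit a linear classifier that separates activations of concept examples from random examples, take its normal vector as the CAV, and score a class by the fraction of its examples whose directional derivative along the CAV is positive. It assumes layer activations and gradients have already been extracted as NumPy arrays; the function and variable names are illustrative, not from the paper's released code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def compute_cav(concept_acts, random_acts):
    """Fit a linear classifier separating concept vs. random activations
    at one layer; its (unit-norm) normal vector is the CAV."""
    X = np.concatenate([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    v = clf.coef_.ravel()
    return v / np.linalg.norm(v)

def tcav_score(class_grads, cav):
    """Fraction of class examples whose class logit increases along the CAV,
    i.e. whose directional derivative (gradient dot CAV) is positive."""
    directional_derivs = class_grads @ cav
    return float(np.mean(directional_derivs > 0))

# Hypothetical usage (shapes only), for the "zebra"/"stripes" example:
#   concept_acts: layer activations for images of stripes, shape (n_c, d)
#   random_acts:  layer activations for random images,     shape (n_r, d)
#   class_grads:  d(zebra logit)/d(layer activations) for zebra images, shape (n, d)
#   score = tcav_score(class_grads, compute_cav(concept_acts, random_acts))
```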

