Authors:
(1) Omid Davoodi, Carleton University, School of Computer Science;
(2) Shayan Mohammadizadehsamakosh, Sharif University of Technology, Department of Computer Engineering;
(3) Majid Komeili, Carleton University, School of Computer Science.
Interpretability of the Decision-Making Process
The Effects of Low Prototype Counts
Author Contributions, Data availability, Competing Interests, and References
In a substantial portion of the previous work, it was implicitly assumed that part-prototype networks are interpretable by nature. We have devised three human-centric evaluation schemes to assess that assumption: one for evaluating the interpretability of the prototypes themselves, one for evaluating the similarity between prototypes and the activated regions of the query sample, and one for evaluating the interpretability of the decision-making process itself. Our experiments show that these schemes can differentiate between various methods in terms of interpretability while not suffering from the problems of previous works. Moreover, we applied these schemes to seven related methods on three datasets. The results shed light on the interpretability of these methods from a human perspective.
The results show that not all part-prototype methods are equal when it comes to human interpretability. In some cases, there are severe issues that hurt the interpretability of these methods. Chief among them is the dissimilarity between the prototype and the activation region of the query, a problem that existed to some degree for all of the models tested. In addition, the prototypes of some models were not sufficiently interpretable. Finally, ProtoPool has a noticeable problem with the interpretability of its decision-making process because it assigns multiple classes to each prototype and requires a huge number of prototypes to make its final decisions. The decision-making processes of some other methods were easy to understand using the top 10 activated prototypes, but they still needed more than 10 prototypes to make their final decisions.
Still, in other cases, the methods performed relatively well. Apart from ProtoPool, all methods did quite well on the decision-making-process interpretability test. ProtoPNet and Deformable ProtoPNet were also able to make the majority of their predictions using the top 10 activated prototypes. Prototype interpretability was also particularly high for some of the methods involved. Considering these results, along with the observations on low-prototype-count models, our suggestion is to look at the set of top prototypes behind the final decision rather than only individual prototypes to get a better understanding of a model. Even if the most activated prototype has low interpretability or is dissimilar to its activation region, the other top prototypes involved in the decision may still explain it well. It is important to note that the opposite can also happen: the top prototype may be interpretable while the others reveal a flawed decision-making process.
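To make the recommendation above concrete, the following is a minimal sketch of how one might inspect the set of top activated prototypes behind a single prediction, assuming a ProtoPNet-style model in which the logit of each class is a weighted sum of per-prototype similarity scores. The function and attribute names (e.g., prototype_similarities, last_layer) are hypothetical placeholders, not the API of any specific implementation evaluated in this work.

```python
import torch

def top_k_prototype_contributions(similarities, class_weights, predicted_class, k=10):
    """Rank prototypes by their contribution to the predicted class logit.

    similarities:    1-D tensor of per-prototype activation (similarity) scores
                     for a single query image, e.g., after global max-pooling.
    class_weights:   2-D tensor (num_classes x num_prototypes) from the final
                     fully connected layer of a ProtoPNet-style model.
    predicted_class: index of the class whose decision we want to explain.
    """
    # In ProtoPNet-style models, each prototype's contribution to a class logit
    # is its similarity score multiplied by its class-connection weight.
    contributions = similarities * class_weights[predicted_class]
    top_vals, top_idx = torch.topk(contributions, k)
    return list(zip(top_idx.tolist(), top_vals.tolist()))

# Hypothetical usage, assuming the model exposes per-prototype similarities
# and a final linear layer; attribute names are illustrative only.
# sims = model.prototype_similarities(image)            # shape: (num_prototypes,)
# pred = model(image).argmax().item()
# for proto_idx, contrib in top_k_prototype_contributions(sims, model.last_layer.weight, pred):
#     print(f"prototype {proto_idx}: contribution {contrib:.3f}")
```

Examining the full ranked set in this way, rather than only the single most activated prototype, is what allows a flawed or a sound decision-making process to be recognized even when the top prototype alone is misleading.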
The interpretability of the many machine learning methods used in production is very important. However, it is even more important to understand the limitations and peculiarities of the models, methods, and tools used to address the interpretability problem. In particular, unified frameworks with an emphasis on human-centric assessments are necessary if we want to truly evaluate interpretability methods. We think more studies like this one should be conducted in other areas of AI interpretability to further our understanding of such models.
O.D. designed, developed, and implemented the experiments and wrote the paper; S.M. developed and implemented the experiments; M.K. designed the experiments and wrote the paper.
Anonymized data of the experiments can be found in the following repository: https://github.com/omiddavoodi/part-prototypeinterpretability-data
The authors declare no competing interests.
Molnar, C. Interpretable machine learning (Lulu.com, 2020).
Bezdek, J. C. & Castelaz, P. F. Prototype classification and feature selection with fuzzy sets. IEEE Transactions on Systems, Man, and Cybernetics 7, 87–92 (1977).
Kohonen, T. Improved versions of learning vector quantization. In 1990 IJCNN International Joint Conference on Neural Networks, 545–550 (IEEE, 1990).
Kuncheva, L. I. & Bezdek, J. C. Nearest prototype classification: Clustering, genetic algorithms, or random search? IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 28, 160–164 (1998).
Seo, S., Bode, M. & Obermayer, K. Soft nearest prototype classification. IEEE Transactions on Neural Networks 14, 390–398 (2003).
Graf, A. B., Bousquet, O., Rätsch, G. & Schölkopf, B. Prototype classification: Insights from machine learning. Neural computation 21, 272–300 (2009).
Li, O., Liu, H., Chen, C. & Rudin, C. Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018).
Davoudi, S. O. & Komeili, M. Toward faithful case-based reasoning through learning prototypes in a nearest neighbor-friendly space. In International Conference on Learning Representations (2021).
Chen, C. et al. This looks like that: deep learning for interpretable image recognition. Adv. neural information processing systems 32 (2019).
Huang, Q. et al. Evaluation and improvement of interpretability for self-explainable part-prototype networks. arXiv preprint arXiv:2212.05946 (2023).
Bontempelli, A., Teso, S., Tentori, K., Giunchiglia, F. & Passerini, A. Concept-level debugging of part-prototype networks. arXiv preprint arXiv:2205.15769 (2022).
Kim, S. S., Meister, N., Ramaswamy, V. V., Fong, R. & Russakovsky, O. HIVE: Evaluating the human interpretability of visual explanations. In European Conference on Computer Vision, 280–298 (Springer, 2022).
Krosnick, J. A. Questionnaire Design, 439–455 (Springer International Publishing, Cham, 2018).
Hoffmann, A., Fanconi, C., Rade, R. & Kohler, J. This looks like that... does it? shortcomings of latent space prototype interpretability in deep networks. arXiv preprint arXiv:2105.02968 (2021).
Lage, I., Ross, A., Gershman, S. J., Kim, B. & Doshi-Velez, F. Human-in-the-loop interpretability prior. Adv. neural information processing systems 31 (2018).
Colin, J., Fel, T., Cadène, R. & Serre, T. What I cannot predict, I do not understand: A human-centered evaluation framework for explainability methods. Adv. Neural Inf. Process. Syst. 35, 2832–2845 (2022).
Kraft, S. et al. Sparrow: semantically coherent prototypes for image classification. In The 32nd British Machine Vision Conference (BMVC) (2021).
Donnelly, J., Barnett, A. J. & Chen, C. Deformable ProtoPNet: An interpretable image classifier using deformable prototypes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10265–10275 (2022).
Nauta, M., Van Bree, R. & Seifert, C. Neural prototype trees for interpretable fine-grained image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 14933–14943 (2021).
Wang, J., Liu, H., Wang, X. & Jing, L. Interpretable image recognition by constructing transparent embedding space. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 895–904 (2021).
Rymarczyk, D. et al. Interpretable image classification with differentiable prototypes assignment. In European Conference on Computer Vision, 351–368 (Springer, 2022).
Ghorbani, A., Wexler, J., Zou, J. Y. & Kim, B. Towards automatic concept-based explanations. Adv. neural information processing systems 32 (2019).
Wah, C., Branson, S., Welinder, P., Perona, P. & Belongie, S. The CUB-200-2011 dataset. Tech. Rep. CNS-TR-2011-001, California Institute of Technology (2011).
Krause, J., Stark, M., Deng, J. & Fei-Fei, L. 3d object representations for fine-grained categorization. In 2013 IEEE International Conference on Computer Vision Workshops, 554–561, DOI: 10.1109/ICCVW.2013.77 (2013).
Deng, J. et al. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255 (IEEE, 2009).
This paper is available on arxiv under CC 4.0 license.