Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics

Amirata Ghorbani, Dina Berenbaum, Maor Ivgi, Yuval Dafna and James Zou

Published in Information (MDPI), vol. 13, no. 1, 2022

Interpretability is becoming an active research topic as machine learning (ML) models are more widely used to make critical decisions. Tabular data is one of the most commonly used modes of data in diverse applications such as healthcare and finance. Most existing interpretability methods for tabular data only report feature-importance scores—either locally (per example) or globally (per model)—but they do not provide interpretation or visualization of how the features interact. We address this limitation by introducing Feature Vectors, a new global interpretability method designed for tabular datasets. In addition to providing feature-importance scores, Feature Vectors discovers the inherent semantic relationship among features via an intuitive feature visualization technique. Our systematic experiments demonstrate the empirical utility of this new method by applying it to several real-world datasets. We further provide an easy-to-use Python package for Feature Vectors.
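To make the kind of output concrete, here is a toy sketch that is not the paper's Feature Vectors algorithm: it computes generic permutation importances with scikit-learn and lays the features out in 2D by their correlation structure, so that importance (marker size) and feature relatedness (proximity) appear in a single plot. The dataset and every call below are illustrative assumptions, not the method described in the paper.

```python
# Toy illustration only: generic feature importance plus a 2D layout.
# This is NOT the Feature Vectors algorithm from the paper; it only mimics
# the style of output (importance + semantic proximity in one picture).
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Global importance per feature (a stand-in for the paper's importance scores).
imp = permutation_importance(model, X, y, n_repeats=5, random_state=0).importances_mean

# Crude "semantics": place correlated features near each other in 2D.
coords = PCA(n_components=2).fit_transform(np.corrcoef(X.values, rowvar=False))

sizes = 1000 * np.abs(imp) / np.abs(imp).max()
plt.scatter(coords[:, 0], coords[:, 1], s=sizes, alpha=0.5)
for (x0, y0), name in zip(coords, X.columns):
    plt.annotate(name, (x0, y0), fontsize=6)
plt.title("Toy feature map: size = importance, proximity = relatedness")
plt.show()
```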

See the paper page here or download the PDF directly.

Check out the paper's code repository on GitHub.

Install the Python package described in the paper from PyPI.
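A minimal install-and-use sketch follows; the distribution name, import path, and every function call in it are assumptions rather than the package's documented API, so check the GitHub README and PyPI page for the real names.

```python
# Assumed install command; the exact PyPI distribution name may differ:
#   pip install feature-vectors
#
# Hypothetical usage -- the import path, class, and method names below are
# illustrative assumptions, not the package's documented API.
import pandas as pd
from feature_vectors import FeatureVectors  # assumed import path

df = pd.read_csv("adult.csv")                     # any tabular dataset
X, y = df.drop(columns=["income"]), df["income"]  # illustrative column names

fv = FeatureVectors()   # assumed constructor
fv.fit(X, y)            # assumed: learns importance scores and feature semantics
fv.plot()               # assumed: draws the feature-vector visualization
```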

Cite as

@Article{info13010015, 
AUTHOR = {Ghorbani, Amirata and Berenbaum, Dina and Ivgi, Maor and Dafna, Yuval and Zou, James Y.},
TITLE = {Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics},
JOURNAL = {Information},
VOLUME = {13},
YEAR = {2022},
NUMBER = {1},
ARTICLE-NUMBER = {15},
URL = {https://www.mdpi.com/2078-2489/13/1/15},
ISSN = {2078-2489},
DOI = {10.3390/info13010015}
}