Model Cards for Model Reporting

Andrew Zaldivar

Ben Hutchinson

Elena Spitzer

Inioluwa Deborah Raji

Lucy Vasserman

M. Mitchell

Parker Barnes

Simone Sanoian McCloskey Wu

Timnit Gebru

(2019)


Abstract

Trained machine learning models are increasingly used to perform high-impact tasks such as determining crime recidivism rates and predicting health risks. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards (or M-cards) to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic subgroups (e.g., race, geographic location, sex, Fitzpatrick skin tone) and intersectional subgroups (e.g., age and race, or sex and Fitzpatrick skin tone) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for models trained to detect smiling faces on the CelebA dataset (Liu et al., 2015) and models trained to detect toxicity in the Conversation AI dataset (Dixon et al., 2018). We propose this work as a step towards the responsible democratization of machine learning and related AI technology, providing context around machine learning models and increasing transparency into how well such models work. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed documentation.
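
The idea of reporting evaluation results per subgroup and per intersectional subgroup can be illustrated with a small sketch. This is not the paper's implementation: the function name `disaggregated_accuracy`, the attribute names `age_group` and `sex`, the record layout, and the toy data are all hypothetical, and accuracy stands in for whatever metrics a real model card would report.

```python
from collections import defaultdict
from itertools import combinations


def disaggregated_accuracy(records, attributes):
    """Accuracy for each single-attribute subgroup and each pairwise intersection.

    records: list of dicts holding 'label', 'prediction', and one key per attribute.
    Returns a dict mapping a tuple of (attribute, value) pairs to accuracy.
    """
    counts = defaultdict(lambda: [0, 0])  # subgroup key -> [correct, total]
    # Single attributes first, then intersectional pairs of attributes.
    groupings = [(a,) for a in attributes] + list(combinations(attributes, 2))
    for record in records:
        correct = int(record["label"] == record["prediction"])
        for grouping in groupings:
            key = tuple((a, record[a]) for a in grouping)
            counts[key][0] += correct
            counts[key][1] += 1
    return {key: c / n for key, (c, n) in counts.items()}


# Hypothetical toy data, invented purely for illustration.
records = [
    {"age_group": "young", "sex": "female", "label": 1, "prediction": 1},
    {"age_group": "young", "sex": "male", "label": 0, "prediction": 1},
    {"age_group": "old", "sex": "female", "label": 1, "prediction": 0},
    {"age_group": "old", "sex": "male", "label": 0, "prediction": 0},
    {"age_group": "young", "sex": "female", "label": 0, "prediction": 0},
    {"age_group": "old", "sex": "male", "label": 1, "prediction": 1},
]

report = disaggregated_accuracy(records, ["age_group", "sex"])
for subgroup, accuracy in sorted(report.items()):
    print(subgroup, round(accuracy, 2))
```

In this toy data every single-attribute subgroup has the same accuracy, while the intersectional subgroups differ sharply, which is the kind of gap that intersectional reporting is meant to surface.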
