Been Kim
Been is a research scientist at Google Brain. Her research focuses on improving interpretability in machine learning, both by building interpretability methods for already-trained models and by building inherently interpretable models. She has MS and PhD degrees from MIT. Been has given tutorials on interpretability at ICML 2017, at the Deep Learning Summer School at the University of Toronto and the Vector Institute in 2018, and at CVPR 2018. She is one of the executive board members of Women in Machine Learning (WiML) and helps with various ML conferences as a workshop chair, an area chair, a steering committee member, and a program chair.
Authored Publications
    DISSECT: Disentangled Simultaneous Explanations via Concept Traversals
    Chun-Liang Li
    Brian Eoff
    Rosalind Picard
    International Conference on Learning Representations (2022)
Explaining deep learning model inferences is a promising avenue for scientific understanding, improving safety, uncovering hidden biases, evaluating fairness, and beyond, as argued by many scholars. One of the principal benefits of counterfactual explanations is allowing users to explore "what-if" scenarios through what does not and cannot exist in the data, a capability that many other forms of explanation, such as heatmaps and influence functions, inherently lack. However, most previous work on generative explainability cannot disentangle important concepts effectively, produces unrealistic examples, or fails to retain relevant information. We propose a novel approach, DISSECT, that jointly trains a generator, a discriminator, and a concept disentangler to overcome such challenges using little supervision. DISSECT generates Concept Traversals (CTs), defined as a sequence of generated examples with increasing degrees of concepts that influence a classifier's decision. By training a generative model from a classifier's signal, DISSECT offers a way to discover a classifier's inherent "notion" of distinct concepts automatically rather than rely on user-predefined concepts. We show that DISSECT produces CTs that (1) disentangle several concepts, (2) are influential to a classifier's decision and are coupled to its reasoning due to joint training, (3) are realistic, (4) preserve relevant information, and (5) are stable across similar inputs. We validate DISSECT on several challenging synthetic and realistic datasets where previous methods fall short of satisfying desirable criteria for interpretability and show that it performs consistently well. Finally, we present experiments showing applications of DISSECT for detecting potential biases of a classifier and identifying spurious artifacts that impact predictions.
    Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis
    Shayegan Omidshafiei
    Yannick Assogba
    Advances in Neural Information Processing Systems (NeurIPS) (2022) (to appear)
Each year, expert-level performance is attained in increasingly complex multiagent domains, notable examples including Go, Poker, and StarCraft II. This rapid progression is accompanied by a commensurate need to better understand how such agents attain this performance, to enable their safe deployment, identify limitations, and reveal potential means of improving them. In this paper we take a step back from performance-focused multiagent learning and instead turn our attention towards agent behavior analysis. We introduce a model-agnostic method for discovery of behavior clusters in multiagent domains, using variational inference to learn a hierarchy of behaviors at the joint and local agent levels. Our framework makes no assumption about agents' underlying learning algorithms, does not require access to their latent states or policies, and is trained using only offline observational data. We illustrate the effectiveness of our method for enabling coupled understanding of behaviors at the joint and local agent levels, detecting behavior changepoints throughout training, and discovering core behavioral concepts; we also demonstrate the approach's scalability to a high-dimensional multiagent MuJoCo control domain and show that it can disentangle previously-trained policies in OpenAI's hide-and-seek domain.
Interpretability techniques aim to provide the rationale behind a model's decision, typically by explaining either an individual prediction (local explanation, e.g. "why is this patient diagnosed with this condition") or a class of predictions (global explanation, e.g. "why is this set of patients diagnosed with this condition in general"). While there are many methods focused on either one, few frameworks can provide both local and global explanations in a consistent manner. In this work, we combine two powerful existing techniques, one local (Integrated Gradients, IG) and one global (Testing with Concept Activation Vectors, TCAV), to provide local and global concept-based explanations. We first sanity check our idea using two synthetic datasets with a known ground truth, and further demonstrate with a benchmark natural image dataset. We test our method with various concepts, target classes, model architectures and IG parameters (e.g. baselines). We show that our method improves global explanations over vanilla TCAV when compared to ground truth, and provides useful local insights. Finally, a user study demonstrates the usefulness of the method compared to no or global explanations only. We hope our work provides a step towards building bridges between many existing local and global methods to get the best of both worlds.
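As an illustration of the local half of this combination, below is a minimal numpy sketch of Integrated Gradients, assuming a hypothetical `grad_fn` that returns the gradient of the target-class logit with respect to its input and a caller-chosen `baseline`; how the result would be projected onto a concept direction is only indicated in a closing comment and is one plausible way to combine the two techniques, not necessarily the paper's exact formulation.

```python
# Minimal sketch of Integrated Gradients (the local method in this combination).
# `grad_fn(x)` is an assumed helper returning d(target-class logit)/dx for input x.
import numpy as np

def integrated_gradients(x, baseline, grad_fn, steps=50):
    """Approximate IG_i = (x_i - x'_i) * mean_k grad_i(x' + k/steps * (x - x'))."""
    alphas = np.linspace(0.0, 1.0, steps + 1)
    # Gradients evaluated along the straight-line path from baseline to input.
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    avg_grad = grads.mean(axis=0)        # Riemann approximation of the path integral
    return (x - baseline) * avg_grad     # attribution per input feature

# In a concept-based variant, IG could be computed with respect to a hidden layer's
# activations and projected onto a concept activation vector (hypothetical names):
# local_concept_score = integrated_gradients(act, act_baseline, grad_fn_wrt_layer) @ cav
```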
Concept-based explanations can be a key direction to understand how DNNs make decisions. In this paper, we study concept-based explainability in a systematic framework. First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is in explaining the model's behavior. Based on performance and variability motivations, we propose two definitions to quantify completeness. We show that they yield the commonly-used PCA method under certain assumptions. Next, we study two additional constraints to ensure the interpretability of discovered concepts, based on sparsity principles. Through systematic experiments on a specifically-designed synthetic dataset and real-world text and image datasets, we demonstrate the superiority of our framework in finding concepts that are complete (in explaining the decision) and interpretable.
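A hedged sketch of one way such a completeness score could be computed (not necessarily the paper's exact definition): project a layer's activations onto the span of the discovered concept vectors, rerun the rest of the network, and see how much of the model's accuracy survives. `acts`, `concept_vectors`, `head_fn` (the part of the network above the chosen layer), and `base_acc` are assumed inputs.

```python
# Hedged sketch of a concept "completeness" measure under the assumptions above.
import numpy as np

def completeness_score(acts, labels, concept_vectors, head_fn, base_acc):
    """acts: (n, d) layer activations; concept_vectors: (k, d) concept directions."""
    C = np.asarray(concept_vectors)            # (k, d)
    Q, _ = np.linalg.qr(C.T)                   # orthonormal basis of the concept subspace
    projected = acts @ Q @ Q.T                 # keep only the part explained by the concepts
    preds = head_fn(projected).argmax(axis=1)  # rerun the rest of the network
    proj_acc = (preds == labels).mean()
    # Ratio near 1.0 suggests the concept set is (nearly) complete for this model.
    return proj_acc / base_acc
```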
    Concept Bottleneck Models
    Pang Wei Koh
    Thao Nguyen
    Yew Siang Tang
    Stephen Mussmann
    Emma Pierson
    Percy Liang
ICML (2020) (to appear)
We seek to learn models that support interventions on high-level concepts: e.g., would the model have predicted severe arthritis if it didn't think that there was a bone spur in the x-ray? However, state-of-the-art neural networks are trained end-to-end from raw input (e.g., pixels) to output (e.g., arthritis severity), and do not admit manipulation of high-level concepts like "the existence of bone spurs". In this paper, we revisit the classic idea of learning concept bottleneck models that first predict concepts (provided at training time) from the raw input, and then predict the final label from these concepts. By construction, we can intervene on the predicted concepts at test time and propagate these changes to the final prediction. On an x-ray dataset and a bird species recognition dataset, concept bottleneck models achieve competitive predictive accuracy with standard end-to-end models, while allowing us to explain predictions in terms of high-level clinical concepts ("bone spurs") and bird attributes ("wing color"). Moreover, concept bottleneck models allow for richer human-model interaction: model accuracy improves significantly if we can correct model mistakes on concepts at test time.
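A minimal sketch of an independently-trained concept bottleneck, with scikit-learn models standing in for the neural sub-networks and binary concept annotations assumed; the `interventions` argument illustrates the test-time concept correction described above. The paper also studies other training schemes; this sketch covers only the simplest case.

```python
# Hedged sketch of a concept bottleneck: x -> concepts -> label, with test-time intervention.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_bottleneck(X_train, C_train, y_train):
    # Stage 1: one concept predictor per annotated concept (x -> c).
    concept_models = [LogisticRegression(max_iter=1000).fit(X_train, C_train[:, j])
                      for j in range(C_train.shape[1])]
    # Stage 2: label predictor that only sees the concepts (c -> y).
    label_model = LogisticRegression(max_iter=1000).fit(C_train, y_train)
    return concept_models, label_model

def predict(X, concept_models, label_model, interventions=None):
    c_hat = np.column_stack([m.predict_proba(X)[:, 1] for m in concept_models])
    if interventions:                              # test-time intervention: overwrite a
        for j, value in interventions.items():     # predicted concept with its true value
            c_hat[:, j] = value
    return label_model.predict(c_hat)              # the change propagates to the label
```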
Machine learning (ML) is increasingly being used in image retrieval systems for medical decision making. One application of ML is to retrieve visually similar medical images from past patients (e.g. tissue from biopsies) to reference when making a medical decision with a new patient. However, no algorithm can perfectly capture an expert's ideal notion of similarity for every case: an image that is algorithmically determined to be similar may not be medically relevant to a doctor's specific diagnostic needs. In this paper, we identified the needs of pathologists when searching for similar images retrieved using a deep learning algorithm, and developed tools that empower users to cope with the search algorithm on-the-fly, communicating what types of similarity are most important at different moments in time. In two evaluations with pathologists, we found that these refinement tools increased the diagnostic utility of images found and increased user trust in the algorithm. The tools were preferred over a traditional interface, without a loss in diagnostic accuracy. We also observed that users adopted new strategies when using refinement tools, re-purposing them to test and understand the underlying algorithm and to disambiguate ML errors from their own errors. Taken together, these findings inform future human-ML collaborative systems for expert decision-making.
Interpretability has become an important topic of research as more machine learning (ML) models are deployed and widely used to make important decisions. Most current explanation methods provide explanations through feature importance scores, which identify features that are salient for each individual input. However, how to systematically summarize and interpret such per-sample feature importance scores is itself challenging. In this work, we propose principles and desiderata for concept-based explanation, which goes beyond per-sample features to identify higher-level human-understandable concepts that apply across the entire dataset. We develop a new algorithm, ACE, to automatically extract visual concepts. Our systematic experiments demonstrate that ACE discovers concepts that are human-meaningful, coherent, and salient for the neural network's predictions.
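A hedged sketch of the concept-discovery portion of a pipeline like the one described above: superpixel segmentation, embedding each segment with the network, and clustering the embeddings into candidate concepts. `embed_fn` (which would resize a masked patch and return its activations at a chosen layer) is an assumption, and the resulting clusters would still need to be scored for importance, e.g. with TCAV.

```python
# Hedged sketch of automatic visual concept discovery via segmentation + clustering.
import numpy as np
from skimage.segmentation import slic
from sklearn.cluster import KMeans

def discover_concepts(images, embed_fn, n_segments=15, n_concepts=25):
    patches, embeddings = [], []
    for img in images:
        segments = slic(img, n_segments=n_segments, compactness=10)
        for s in np.unique(segments):
            patch = img * (segments == s)[..., None]   # keep one superpixel, mask the rest
            patches.append(patch)
            embeddings.append(embed_fn(patch))         # assumed: layer activations of the patch
    embeddings = np.stack(embeddings)
    labels = KMeans(n_clusters=n_concepts, n_init=10).fit_predict(embeddings)
    # Each cluster of similar patches is a candidate concept to be scored for importance.
    return patches, labels
```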
DeConvNet, Guided BackProp, and LRP were invented to better understand deep neural networks. We show that these methods do not produce the theoretically correct explanation for a linear model. Yet they are used on multi-layer networks with millions of parameters. This is a cause for concern since linear models are simple neural networks. We argue that explanation methods for neural nets should work reliably in the limit of simplicity, the linear models. Based on our analysis of linear models we propose a generalization that yields two explanation techniques (PatternNet and PatternAttribution) that are theoretically sound for linear models and produce improved explanations for deep networks.
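A small numpy sketch of the linear-model analysis behind this work, under the assumption of a scalar linear model y = w·x: the signal direction (the "pattern") is proportional to cov(x, y), whereas the plain gradient simply returns w, which also reflects distractor directions.

```python
# Hedged sketch: estimate the "pattern" (signal direction) for a linear model y = X @ w.
import numpy as np

def linear_pattern(X, w):
    """X: (n, d) inputs, w: (d,) weights of the linear model."""
    y = X @ w
    cov_xy = (X - X.mean(0)).T @ (y - y.mean()) / (len(y) - 1)   # cov(x, y), shape (d,)
    a = cov_xy / (w @ cov_xy)       # pattern normalized so that w.T @ a = 1
    # The gradient-based "explanation" would just be w; an attribution in the spirit of
    # PatternAttribution would instead weight the input by w * a elementwise.
    return a
```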
Explaining the output of a complicated machine learning model like a deep neural network (DNN) is a central challenge in machine learning. Increasingly, explanations are required for debugging models, building trust prior to model deployment, and potentially identifying unwanted effects like model bias. Several methods have been proposed to address this issue. Local explanation methods provide explanations of the output of a model on a single input. Given the importance of these explanations to the use and deployment of these models, we ask: can we trust local explanations for DNNs created using current methods? In particular, we seek to assess how specific local explanations are to the parameter values of DNNs. We compare explanations generated using fully trained DNNs to explanations of DNNs with some or all parameters replaced by random values. Somewhat surprisingly, we find that, for several local explanation methods, explanations derived from networks with randomized weights and trained weights are both visually and quantitatively similar; in some cases, virtually indistinguishable. By randomizing different portions of the network, we find that local explanations are significantly reliant on lower-level features of the DNN.
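A hedged sketch of the kind of quantitative comparison described above, assuming hypothetical helpers `saliency_fn(model, x)` and `randomize(model)`; rank correlation is an illustrative choice of similarity metric, not necessarily the one used in the paper.

```python
# Hedged sketch: compare explanations from a trained model and a weight-randomized copy.
import numpy as np
from scipy.stats import spearmanr

def explanation_similarity(model, x, saliency_fn, randomize):
    s_trained = np.abs(saliency_fn(model, x)).ravel()
    s_random = np.abs(saliency_fn(randomize(model), x)).ravel()
    rank_corr, _ = spearmanr(s_trained, s_random)
    # A value near 1.0 means the explanation barely depends on the learned parameters.
    return rank_corr
```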
    To Trust Or Not To Trust A Classifier
    Heinrich Jiang
    Melody Guan
    Maya Gupta
    NeurIPS (2018)
Knowing when a classifier's prediction can be trusted is useful in many applications and critical for safely using AI. While the bulk of the effort in machine learning research has been towards improving classifier performance, understanding when a classifier's predictions should and should not be trusted has received far less attention. The standard approach is to use the classifier's discriminant or confidence score; however, we show there exists an alternative that is more effective in many situations. We propose a new score, called the trust score, which measures the agreement between the classifier and a modified nearest-neighbor classifier on the testing example. We show empirically that high (low) trust scores produce surprisingly high precision at identifying correctly (incorrectly) classified examples, consistently outperforming the classifier's confidence score as well as many other baselines. Further, under some mild distributional assumptions, we show that if the trust score for an example is high (low), the classifier will likely agree (disagree) with the Bayes-optimal classifier. Our guarantees consist of non-asymptotic rates of statistical consistency under various nonparametric settings and build on recent developments in topological data analysis.
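A minimal sketch of a trust-score-style computation: the ratio of the distance to the nearest example of any class other than the predicted one over the distance to the nearest example of the predicted class. The paper's additional step of first filtering the training set to a high-density subset is omitted here for brevity.

```python
# Hedged sketch of trust scores via per-class nearest neighbors.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def trust_scores(X_train, y_train, X_test, y_pred):
    classes = np.unique(y_train)
    nns = {c: NearestNeighbors(n_neighbors=1).fit(X_train[y_train == c]) for c in classes}
    dists = {c: nns[c].kneighbors(X_test)[0][:, 0] for c in classes}  # distance to each class
    d_pred = np.array([dists[c][i] for i, c in enumerate(y_pred)])
    d_other = np.array([min(dists[c][i] for c in classes if c != y_pred[i])
                        for i in range(len(y_pred))])
    # Higher score = the predicted class is much closer than any other class.
    return d_other / (d_pred + 1e-12)
```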
Estimating the influence of a given feature on a model prediction is challenging. We introduce ROAR (RemOve And Retrain), a benchmark to evaluate the accuracy of interpretability methods that estimate input feature importance in deep neural networks. We remove a fraction of the input features deemed most important according to each estimator and measure the change to the model accuracy upon retraining. The most accurate estimator will identify as important those inputs whose removal causes the most damage to model performance relative to all other estimators. This evaluation produces thought-provoking results: we find that several estimators are less accurate than a random assignment of feature importance. However, averaging a set of squared noisy estimators (a variant of a technique proposed by Smilkov et al. (2017)) leads to significant gains in accuracy for each method considered and far outperforms such a random guess.
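A hedged sketch of the ROAR loop, assuming per-example importance matrices for the train and test splits and a `train_fn` that retrains a fresh model; replacing removed features with the per-feature mean is one simple choice of uninformative value.

```python
# Hedged sketch of a ROAR-style accuracy curve for one importance estimator.
import numpy as np

def roar_curve(X_train, y_train, X_test, y_test, imp_train, imp_test, train_fn,
               fractions=(0.1, 0.3, 0.5, 0.7, 0.9)):
    fill = X_train.mean(axis=0)                         # uninformative replacement value

    def degrade(X, imp, k):
        Xd = X.copy()
        top = np.argsort(-imp, axis=1)[:, :k]           # top-k most important features per example
        for i, cols in enumerate(top):
            Xd[i, cols] = fill[cols]
        return Xd

    accs = []
    for t in fractions:
        k = int(t * X_train.shape[1])
        model = train_fn(degrade(X_train, imp_train, k), y_train)   # retrain from scratch
        accs.append((model.predict(degrade(X_test, imp_test, k)) == y_test).mean())
    # A better estimator produces a steeper accuracy drop as more features are removed.
    return accs
```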
The interpretation of deep learning models is a challenge due to their size, complexity, and often opaque internal state. In addition, many systems, such as image classifiers, operate on low-level features rather than high-level concepts. To address these challenges, we introduce Concept Activation Vectors (CAVs), which provide an interpretation of a neural net's internal state in terms of human-friendly concepts. The key idea is to view the high-dimensional internal state of a neural net as an aid, not an obstacle. We show how to use CAVs as part of a technique, Testing with CAVs (TCAV), that uses directional derivatives to quantify the degree to which a user-defined concept is important to a classification result; for example, how sensitive a prediction of "zebra" is to the presence of stripes. Using the domain of image classification as a testing ground, we describe how CAVs may be used to explore hypotheses and generate insights for a standard image classification network as well as a medical application.
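A minimal sketch of the TCAV computation: fit a linear classifier separating concept activations from random activations at a chosen layer, take its weight vector as the CAV, and report the fraction of class examples whose class logit increases along that direction. `grad_logit_wrt_layer` (the gradient of the class logit with respect to the layer's activations) is an assumed autodiff helper.

```python
# Hedged sketch of CAV training and a TCAV score.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_cav(concept_acts, random_acts):
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    cav = LogisticRegression(max_iter=1000).fit(X, y).coef_[0]   # direction of the concept
    return cav / np.linalg.norm(cav)

def tcav_score(class_inputs, cav, grad_logit_wrt_layer):
    # Directional derivative of the class logit along the concept direction.
    dirs = np.array([grad_logit_wrt_layer(x) @ cav for x in class_inputs])
    return (dirs > 0).mean()   # e.g. how often "striped-ness" raises the "zebra" logit
```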
    Human-in-the-Loop Interpretability Prior
    Isaac Lage
    Andrew Ross
    Samuel J. Gershman
    Finale Doshi-Velez
    NeurIPS (Spotlight) (2018)
We often desire our models to be interpretable as well as accurate. Prior work on optimizing models for interpretability has relied on easy-to-quantify proxies for interpretability, such as sparsity or the number of operations required. In this work, we optimize for interpretability by directly including humans in the optimization loop. We develop an algorithm that minimizes the number of user studies to find models that are both predictive and interpretable and demonstrate our approach on several data sets. Our human subjects results show trends towards different proxy notions of interpretability on different datasets, which suggests that different proxies are preferred on different tasks.
    Sanity Checks for Saliency Maps
    Julius Adebayo
    Justin Gilmer
    Michael Christoph Muelly
    Ian Goodfellow
    Moritz Hardt
    NeurIPS (Spotlight) (2018)
Saliency methods have emerged as a popular tool to highlight features in an input deemed relevant for the prediction of a learned model. Several saliency methods have been proposed, often guided by visual appeal on image data. In this work, we propose an actionable methodology to evaluate what kinds of explanations a given method can and cannot provide. We find that reliance solely on visual assessment can be misleading. Through extensive experiments we show that some existing saliency methods are independent both of the model and of the data generating process. Consequently, methods that fail the proposed tests are inadequate for tasks that are sensitive to either data or model, such as finding outliers in the data, explaining the relationship between inputs and outputs that the model learned, and debugging the model. We interpret our findings through an analogy with edge detection in images, a technique that requires neither training data nor a model. Theory in the case of a linear model and a single-layer convolutional neural network supports our experimental findings.
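A hedged sketch of the model-parameter randomization test described above, with `saliency_fn`, `copy_model`, and `reinit_layer` as assumed helpers and Spearman rank correlation as one of several possible similarity measures.

```python
# Hedged sketch of cascading parameter randomization as a sanity check for saliency maps.
import numpy as np
from scipy.stats import spearmanr

def cascading_randomization(model, x, layer_names, saliency_fn, copy_model, reinit_layer):
    reference = np.abs(saliency_fn(model, x)).ravel()
    randomized = copy_model(model)
    correlations = []
    for name in layer_names:              # from the top (logits) down to the first layer
        reinit_layer(randomized, name)    # destroy what this layer has learned
        s = np.abs(saliency_fn(randomized, x)).ravel()
        corr, _ = spearmanr(reference, s)
        correlations.append((name, corr))
    # A method that passes the test should see similarity fall as layers are randomized;
    # a flat curve near 1.0 means the "explanation" is insensitive to the learned model.
    return correlations
```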
As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanations for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite the interest in interpretability, there is very little consensus on what interpretable machine learning is and how it should be measured. In this position paper, we first define interpretability and describe when interpretability is needed (and when it is not). Next, we suggest a taxonomy for rigorous evaluation and expose open questions towards a more rigorous science of interpretable machine learning.
    The (Un)reliability of Saliency methods
    Sara Hooker
    Julius Adebayo
    Maximilian Alber
    Kristof T. Schütt
    Sven Dähne
    NIPS Workshop (2017)
Saliency methods aim to explain the predictions of deep neural networks. These methods lack reliability when the explanation is sensitive to factors that do not contribute to the model prediction. We use a simple and common pre-processing step ---adding a constant shift to the input data--- to show that a transformation with no effect on the model can cause numerous methods to incorrectly attribute. In order to guarantee reliability, we posit that methods should fulfill input invariance, the requirement that a saliency method mirror the sensitivity of the model with respect to transformations of the input. We show, through several examples, that saliency methods that do not satisfy input invariance result in misleading attribution.
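A hedged sketch of an input-invariance check in the spirit of this paper, assuming a helper `make_shift_compensated_model` that folds the constant shift into the network (e.g. by adjusting the first layer's bias) so that predictions are unchanged, and an `attribution_fn` for the saliency method under test.

```python
# Hedged sketch: does the attribution change under a transformation the model ignores?
import numpy as np

def input_invariance_gap(model, x, shift, attribution_fn, make_shift_compensated_model):
    model2 = make_shift_compensated_model(model, shift)  # identical predictions on x + shift
    a1 = attribution_fn(model, x)
    a2 = attribution_fn(model2, x + shift)
    # An input-invariant method gives (near-)identical attributions; a large gap means the
    # method responds to a transformation that has no effect on the model's output.
    return np.max(np.abs(a1 - a2))
```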