
Lime tutorial #681

Closed
wants to merge 6 commits

Conversation

aobo-y
Contributor

@aobo-y aobo-y commented Jun 12, 2021

Add a tutorial to demonstrate Lime. It is made of 2 sections:

  • one for image classification, using the high-level Lime class (see the sketch after this list)
  • one for text classification, manually crafting almost every customizable argument of the LimeBase class
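For reference, a rough sketch of what the high-level call in the image section looks like; the names model, image, segment_mask, and predicted_class are placeholders rather than the tutorial's actual variables, and the sample count is arbitrary:

```python
from captum.attr import Lime

# model: a trained image classifier; image: a (1, C, H, W) input tensor;
# segment_mask: an integer tensor assigning each pixel to an interpretable segment.
lime = Lime(model)
attribution = lime.attribute(
    image,
    target=predicted_class,     # class index to explain
    feature_mask=segment_mask,  # groups pixels into interpretable features
    n_samples=40,               # number of perturbed samples to draw
)
```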

GitHub does not render the last text visualization part, which is made of IPython HTML; it should look like the screenshot below in the actual notebook.
[Screenshot: Screen Shot 2021-06-12 at 1.19.28 AM]

@aobo-y aobo-y requested review from NarineK, vivekmig and bilalsal June 14, 2021 19:39
@facebook-github-bot
Contributor

@aobo-y has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@vivekmig vivekmig left a comment


Looks great, thanks for the awesome work on this :) ! I really like that it covers 2 different use cases, using both the generic parent class and the specific child class.

A few small nits and minor wording suggestions:

  • When explaining this line "Lime creates the pertubed samples in the form of interpretable representation, i.e., a binary vector indicating the “presence” or “absence” of features.", would it make sense to make it more concrete with an example of a vector of length 3 and a visualization of the corresponding image (a toy illustration follows this list)? While this is not necessary for using Lime, it may help users better understand what occurs internally
  • Do we want to save the parameters trained for the second model to avoid a user needing to train? How long did the training take?
  • For the text visualization issue, the BERT tutorial also includes a png of the results (https://captum.ai/tutorials/Bert_SQUAD_Interpret), could consider that here as well, particularly if the visualization issue still occurs if downloading the notebook or through the website.
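For concreteness, a toy sketch of such a length-3 interpretable vector and the corresponding perturbed image; the tensors here are illustrative placeholders and a zero baseline is assumed for absent segments, not necessarily the tutorial's choices:

```python
import torch

# Toy 3x3 "image" split into 3 segments by a feature mask.
image = torch.arange(9, dtype=torch.float).reshape(3, 3)
feature_mask = torch.tensor([[0, 0, 1],
                             [0, 2, 2],
                             [1, 1, 2]])

# One perturbed sample in the interpretable space: segment 1 is "absent".
interp_sample = torch.tensor([1.0, 0.0, 1.0])

# Convert back to image space: pixels whose segment is absent fall back to the baseline (0).
perturbed_image = image * interp_sample[feature_mask]
```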

Wording notes:

  • nit: What does POV stand for in "POV's segmentation masks"? Is this supposed to be VOC?
  • nit: "Everytime" -> "Every time"
  • nit: "check the document for details" -> "check the documentation for details"
  • nit: "machine still has capaciity" -> "capacity". Also can consider rewording this line a bit, since parallelize may be confusing to users, can instead say something like batching multiple samples in one forward pass evaluation, so readers understand this relates to max batch size they may typically use
  • nit: return_input_shape defaults to True for consistency with other methods, this doesn't need to be passed in cell 12
  • nit: "Lasso regularization can effectively help us filter them" -> can we add something like "by adding regularization to the interpretable model"
  • nit: "build-in sklearn" -> "built-in sklearn"
  • "pertubed binary interpretable" -> "perturbed binary interpretable"
  • "if the word of each position presents or not" -> "if the word of each position is present or not"
  • "Setting them to any baselines will surely pollute the calculation and end with imperfect result." -> Can we reword this a little to say that in this example we would like to remove characters rather than replacing with a baseline token? In this example, it makes sense why removing would be preferable considering the embedding bag structure, but I'm concerned this may lead users to think that replacement should not be used for text generally, but in general, just removing arbitrary tokens is also generating out of distribution inputs which may not always be better than replacement.
  • "forward_func, the target to attribute" -> target to attribute shouldn't be here?

@facebook-github-bot
Contributor

@aobo-y has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@aobo-y
Contributor Author

aobo-y commented Jun 25, 2021

thanks for the feedback, @vivekmig. I have updated all the wording as suggested.

Just want to further bring up the following items:

"Setting them to any baselines will surely pollute the calculation and end with imperfect result." -> Can we reword this a little to say that in this example we would like to remove characters rather than replacing with a baseline token? In this example, it makes sense why removing would be preferable considering the embedding bag structure, but I'm concerned this may lead users to think that replacement should not be used for text generally, but in general, just removing arbitrary tokens is also generating out of distribution inputs which may not always be better than replacement.

Yes, I agree. I believe setting tokens to a baseline can also be used for text, so I have tried to soften the tone about our choice in the tutorial. Let me know if you have further suggestions.
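To make the distinction concrete, here is a minimal sketch of the two options for an "absent" word when the model is an EmbeddingBag over a flat tensor of token ids; the token ids and PAD_ID here are hypothetical, not the tutorial's data:

```python
import torch

token_ids = torch.tensor([12, 7, 55, 3])                 # hypothetical token ids
present = torch.tensor([1, 0, 1, 1], dtype=torch.bool)   # interpretable sample: word 2 absent

# Removal (the choice discussed for this tutorial): drop the absent token entirely,
# which an EmbeddingBag handles naturally since it consumes a variable-length bag.
removed = token_ids[present]

# Replacement (the alternative raised above): substitute a baseline token id,
# e.g. a <pad>/<unk> id; PAD_ID is an assumed placeholder for the vocabulary.
PAD_ID = 0
replaced = torch.where(present, token_ids, torch.tensor(PAD_ID))
```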

When explaining this line "Lime creates the pertubed samples in the form of interpretable representation, i.e., a binary vector indicating the “presence” or “absence” of features.", would it make sense to make it more concrete with examples of vector length 3 and visualizing the corresponding image? While this is not necessary for usage of Lime, it may be helpful for users to better understand what occurs internally

This would definitely help users understand Lime, but I cannot find a non-awkward way to inject the illustration into the quoted point, because it would interrupt the explanation of how to call attribute, which is the most important goal of this tutorial.

So I plan to append a dedicated section for that, "1.4 Understand the sampling process", where I can show what a perturbed interpretable sample looks like, what the converted image looks like, what the model predicts for it, and how the similarity is calculated in feature space. I will ship the current tutorial and make another PR for the new section once I finish, so it will be easier to review.
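As a rough sketch of the similarity weight such a section might compute, assuming cosine distance in feature space passed through an exponential kernel (an illustrative choice, not necessarily what the tutorial will use):

```python
import torch
import torch.nn.functional as F

def exp_cosine_similarity(original_feat: torch.Tensor,
                          perturbed_feat: torch.Tensor,
                          kernel_width: float = 1.0) -> torch.Tensor:
    # Cosine distance between flattened feature vectors of the original and
    # perturbed inputs, turned into a weight for the interpretable (linear)
    # model's training sample via an exponential kernel.
    distance = 1 - F.cosine_similarity(
        original_feat.flatten(), perturbed_feat.flatten(), dim=0
    )
    return torch.exp(-(distance ** 2) / (kernel_width ** 2))
```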

Do we want to save the parameters trained for the second model to avoid a user needing to train? How long did the training take?

As the embedding-bag model is both shallow and non-sequential, training finishes in around 6 minutes on my CPU machine. I feel that is acceptable, but I will add a checkpoint in the next PR.

For the text visualization issue, the BERT tutorial also includes a png of the results (https://captum.ai/tutorials/Bert_SQUAD_Interpret), could consider that here as well, particularly if the visualization issue still occurs if downloading the notebook or through the website.

I was curious why those tutorials have duplicate outputs... I believe the visualization issue is just GitHub escaping the HTML on the web; all the HTML tables in previous tutorials are collapsed as well. I have also included one duplicate screenshot to beautify the GitHub view.

@facebook-github-bot
Contributor

@aobo-y has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@aobo-y merged this pull request in cf9bf08.

facebook-github-bot pushed a commit that referenced this pull request Jul 1, 2021
Summary:
Based on the discussion in #681 (review),
- add the trained checkpoint of the embedding-bag model and load it by default
- add a new section diving into the internal Lime sampling process to illustrate how the perturbed interpretable sample, transformed image, and similarity are created and used.

Pull Request resolved: #695

Reviewed By: vivekmig

Differential Revision: D29458646

Pulled By: aobo-y

fbshipit-source-id: a17fa87d6768c205de12ea62a98ab5d3164f0c83