[2023 NeurIPS Spotlight] Spuriosity Rankings: Sorting Data to Measure and Mitigate Biases #185

Jasonlee1995 · 2024-02-14T10:03:21Z

The paper's six main contributions are as follows

Identify : using the prior work Salient ImageNet, separate core features from spurious features for every class in ImageNet-1K
Measure : propose spuriosity, which measures how strongly spurious features are present in each image
Measure : propose the spurious gap, a way to measure model bias using spuriosity rankings
spurious gap = acc(high spuriosity images) - acc(low spuriosity images)
Mitigate : propose a method to mitigate bias toward high spuriosity images
fine-tune only the last linear layer on low spuriosity images
Analysis : bias is mainly caused by the data; model architecture and training method are hard to regard as major causes
Analysis : in classes with negative spurious gaps, the high spuriosity images are mislabeled or contain multiple objects

In short, the paper determines whether an image classification dataset contains spurious correlations and proposes a method to mitigate them

Below is a brief summary of only the parts I found important

1. Discovering Spurious Features in ImageNet and Beyond

1.0. Salient ImageNet framework

1.1. 5000 Reasons Deep Models Use to Perform ImageNet Classification

Train the model with adversarial training
Prior work shows that adversarial training leads to perceptually aligned gradients
This makes gradient-based interpretation more useful, so the model becomes more interpretable
Select the important neural features to annotate for each class
For each class, select the top-5 most activated features
Criterion for "activated" : feature activation x linear classification head weight
Using visualizations of the neural features, humans label each one as a core feature or a spurious feature
Visualized with CAM heatmaps and feature attacks
Given the class info and the top-5 neural features, label each feature as core or spurious
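
The selection rule above (activation x head weight, top-5 per class) can be sketched roughly as follows. This is a minimal sketch rather than the authors' code: `features`, `head_weight`, and `top_k_features_per_class` are hypothetical names, and averaging the contribution over all images of a class is an assumption.

```python
import numpy as np

def top_k_features_per_class(features, labels, head_weight, k=5):
    """Pick the k neural features that contribute most to each class logit.

    features:    (n_images, n_features) penultimate-layer activations
    labels:      (n_images,) integer class labels
    head_weight: (n_classes, n_features) weights of the linear classification head
    """
    top = {}
    for c in np.unique(labels):
        class_feats = features[labels == c]
        # Contribution = mean activation x head weight (averaging is an assumption).
        contribution = class_feats.mean(axis=0) * head_weight[c]
        top[int(c)] = np.argsort(contribution)[::-1][:k].tolist()
    return top

# Toy example: constant activations, so the ranking follows the head weights.
features = np.ones((20, 8))
labels = np.array([0] * 10 + [1] * 10)
head_weight = np.zeros((2, 8))
head_weight[0, [1, 3]] = [2.0, 1.0]
head_weight[1, 4] = 3.0
top = top_k_features_per_class(features, labels, head_weight, k=2)
print(top[0])  # feature ids to hand to human annotators for class 0
```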

Since this essentially reuses the method proposed in the prior Salient ImageNet paper as is, the contribution of this section is roughly the manual grunt work of covering every class in ImageNet-1K?

1.2. Spurious Feature Discovery without Adversarial Training

When the dataset is small, adversarial training itself is challenging

Propose identifying a dataset's spurious correlations by fine-tuning only the linear layer of an ImageNet-1K adversarially trained model
(prior work shows that adversarially trained models transfer well)

Experiments on three datasets (Waterbirds, Celeb-A, UTKFace) show that this approach can reveal a dataset's spurious correlations

Spurious correlations are found in Waterbirds and in Celeb-A hair color classification
(background bias in Waterbirds, racial bias in Celeb-A)

Spurious correlations are found in UTKFace gender and race classification
(suit and tie for male class, pink for female class, glasses for Indian class)

1.3. Is a Human in the Loop Necessary?

Propose two ways to automate the process without human labeling

Open-vocabulary segmentation models
Use an open-vocabulary segmentation model such as Segment Anything to segment the object, then decide core vs. spurious from the IoU between that segmentation and the soft segmentation given by the neural activation map
Vision-language models (VLMs)
Use a VLM such as CLIP to compute the similarity between a concept and an image, and measure spuriosity from it
(however, a human still has to decide which concepts to use)
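
The segmentation-based check can be illustrated with a small IoU computation. This is only a sketch under assumptions: the min/max "soft IoU", the 0.5 cut-off, and the names `soft_iou` and `label_feature` are hypothetical and not taken from the paper; the object mask is assumed to come from a separate segmentation model.

```python
import numpy as np

def soft_iou(object_mask, activation_map):
    """IoU between a binary object mask and an activation map with values in [0, 1]."""
    object_mask = object_mask.astype(float)
    intersection = np.minimum(object_mask, activation_map).sum()
    union = np.maximum(object_mask, activation_map).sum()
    return float(intersection / union) if union > 0 else 0.0

def label_feature(object_mask, activation_map, threshold=0.5):
    # threshold is a hypothetical cut-off, not a value reported in the paper
    return "core" if soft_iou(object_mask, activation_map) >= threshold else "spurious"

# Toy 4x4 image: the object occupies the top-left 2x2 block.
object_mask = np.zeros((4, 4), dtype=bool)
object_mask[:2, :2] = True

on_object = np.zeros((4, 4))
on_object[:2, :2] = 0.9          # feature fires on the object

on_background = np.zeros((4, 4))
on_background[2:, 2:] = 0.9      # feature fires on the background

print(label_feature(object_mask, on_object))
print(label_feature(object_mask, on_background))
```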

2. Measuring and Mitigating Biases with Spuriosity Rankings

2.1. Spuriosity Rankings: Organizing Image Data via Robust Neural Features

Image with high spuriosity : image with strong spurious cues
Image with low spuriosity : image without spurious cues

Within each class, the spuriosity of an image is measured by how strongly the class's spurious features are activated
(spuriosity : how strongly spurious cues are present in an image)

Why class information is needed : whether a neural feature is core or spurious differs by class
(as in Figure 3, the same neural feature can be a core feature or a spurious feature depending on the class)

In other words, the spurious correlation problem is class-dependent
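
A per-class spuriosity score and the resulting ranking could look roughly like this. It is a minimal sketch: z-scoring each spurious feature within the class before averaging is an assumption, and `spuriosity`, `spuriosity_ranking`, and `spurious_idx` are hypothetical names.

```python
import numpy as np

def spuriosity(features, spurious_idx):
    """One score per image: mean standardized activation of the class's spurious features.

    features:     (n_images_in_class, n_features) activations for a single class
    spurious_idx: ids of the features annotated as spurious for that class
    """
    feats = features[:, spurious_idx]
    # Standardize per feature within the class (assumption), then average.
    z = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
    return z.mean(axis=1)

def spuriosity_ranking(features, spurious_idx):
    """Image indices sorted from highest to lowest spuriosity."""
    return np.argsort(spuriosity(features, spurious_idx))[::-1]

# Four images of one class, three neural features; feature 2 is annotated spurious.
features = np.array([[0.1, 0.5, 0.0],
                     [0.2, 0.4, 3.0],
                     [0.3, 0.6, 1.0],
                     [0.1, 0.5, 2.0]])
ranking = spuriosity_ranking(features, spurious_idx=[2])
print(ranking)
```

Because `spurious_idx` is chosen per class, the same feature can raise the score for one class and be ignored for another, which matches the class dependence noted above.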

2.2. Measuring Bias: Computing Spurious Gaps

More examples can be found here

Sorting by spuriosity gives a spuriosity ranking for each class

The spurious gap, computed from the spuriosity ranking, measures a model's bias
(spurious gap : acc(top-k highest spuriosity validation images) - acc(top-k lowest spuriosity validation images))
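
The spurious gap definition above translates directly into a few lines of code. This is a sketch for one class; `spurious_gap`, `scores`, and `correct` are hypothetical names, and the per-image correctness is assumed to be precomputed.

```python
import numpy as np

def spurious_gap(spuriosity_scores, correct, k):
    """acc(top-k highest spuriosity images) - acc(top-k lowest spuriosity images)."""
    order = np.argsort(spuriosity_scores)   # ascending spuriosity
    acc_low = correct[order[:k]].mean()
    acc_high = correct[order[-k:]].mean()
    return float(acc_high - acc_low)

# Six validation images of one class: spuriosity score and whether the model was correct.
scores = np.array([0.1, 0.9, 0.3, 0.8, 0.2, 0.7])
correct = np.array([0, 1, 1, 1, 0, 1], dtype=float)
gap = spurious_gap(scores, correct, k=2)
print(gap)  # positive gap: the model does better when spurious cues are present
```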

Figure 6 (left) : all 89 models perform worse when spurious cues are absent → all models are biased
Note : CLIP shows a different trend from the other vision models
Figure 6 (right) : the variance of the spurious gap is larger per class than per model → the spurious correlation problem is mainly caused by the data; model architecture and training method are hard to regard as major causes
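
The per-class vs. per-model comparison can be illustrated on a small matrix of spurious gaps. This is a sketch with made-up numbers; how exactly the paper aggregates the variance is not stated in these notes, so comparing the variance of class-averaged and model-averaged gaps is an assumption, and `gap_variances` is a hypothetical name.

```python
import numpy as np

def gap_variances(gaps):
    """gaps: (n_models, n_classes) matrix of spurious gaps."""
    per_class_var = gaps.mean(axis=0).var()   # spread across classes (averaged over models)
    per_model_var = gaps.mean(axis=1).var()   # spread across models (averaged over classes)
    return float(per_class_var), float(per_model_var)

# Illustrative values only: rows are models, columns are classes.
gaps = np.array([[0.30, 0.05, -0.10],
                 [0.32, 0.07, -0.08],
                 [0.28, 0.03, -0.12]])
per_class_var, per_model_var = gap_variances(gaps)
print(per_class_var, per_model_var)
```

When the class term dominates, as in this toy matrix, which class an image belongs to says more about the gap than which model is evaluated.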

Figure 7 : class-wise spurious gaps of the adversarially trained model and of other vision models are highly correlated → justifies measuring spuriosity with a single adversarially trained model
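
The class-wise agreement between two models can be checked with a plain Pearson correlation. This is a sketch with made-up gap values; the notes do not say which correlation measure the paper uses, so Pearson is an assumption and `gap_correlation` is a hypothetical name.

```python
import numpy as np

def gap_correlation(gaps_a, gaps_b):
    """Pearson correlation between two models' class-wise spurious gaps."""
    return float(np.corrcoef(gaps_a, gaps_b)[0, 1])

# Illustrative class-wise gaps for four classes.
robust_model_gaps = np.array([0.30, 0.05, -0.10, 0.20])
other_model_gaps = np.array([0.25, 0.02, -0.05, 0.18])
r = gap_correlation(robust_model_gaps, other_model_gaps)
print(r)
```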

2.3. Mitigating Bias: Closing Spurious Gaps

The conclusion from Section 2.2 is that data is the main issue in the spurious correlation problem

Since the trained model already contains core feature information, fine-tune only the linear layer on low spuriosity images to mitigate reliance on spurious features
(to avoid overfitting, early stop once the spurious gap falls below 5%)

Figure 8 (left) : low spuriosity tuning decreases validation accuracy but reduces spurious gaps
Figure 8 (right) : after low spuriosity tuning, accuracy is similar regardless of spuriosity ranking
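
Low spuriosity tuning with the 5% early-stopping rule can be sketched on a toy binary problem. This is not the authors' training code: a logistic head over two frozen features stands in for the last linear layer, and `low_spuriosity_tune`, the learning rate, and the toy data are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def accuracy(w, x, y):
    return float(((x @ w > 0).astype(int) == y).mean())

def low_spuriosity_tune(w, x_low, y_low, val_high, val_low,
                        lr=1.0, max_steps=100, target_gap=0.05):
    """Tune only the linear head on low spuriosity data; stop once the gap is below 5%."""
    w = w.copy()
    for step in range(max_steps):
        gap = accuracy(w, *val_high) - accuracy(w, *val_low)
        if gap < target_gap:                      # early stopping on the spurious gap
            return w, step, gap
        grad = x_low.T @ (sigmoid(x_low @ w) - y_low) / len(y_low)
        w -= lr * grad                            # gradient step on the head only
    return w, max_steps, accuracy(w, *val_high) - accuracy(w, *val_low)

# Frozen features: column 0 is a core feature, column 1 a spurious feature.
x_low = np.array([[1.0, 0.0], [-1.0, 0.0]])       # low spuriosity: spurious cue absent
y_low = np.array([1, 0])
val_high = (np.array([[1.0, 1.0], [-1.0, -1.0]]), np.array([1, 0]))
val_low = (x_low, y_low)

w0 = np.array([0.0, 4.0])                         # head that relies only on the spurious feature
w, steps, gap = low_spuriosity_tune(w0, x_low, y_low, val_high, val_low)
print(w, steps, gap)
```

In this toy case the head picks up a weight on the core feature after a single step, and the weight on the spurious feature is left untouched because that feature is zero in the low spuriosity data.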

3. Additional Application: Flagging and Fixing Labeling Errors

Spurious feature collision

According to the authors' analysis, 63.8% of spurious features are core features in another class

That is, a spurious feature affects multiple classes, not just a single class

Spurious feature collision : the case where a spurious feature of class $c_1$ is more strongly correlated with class $c_2$

When a spurious feature collision occurs, images of class $c_1$ are misclassified as class $c_2$
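
A statistic like the 63.8% figure can be computed from the core/spurious annotations alone. This is a sketch with toy annotations; counting per (class, feature) pair is an assumption, and `spurious_also_core_fraction` and the feature ids are hypothetical.

```python
def spurious_also_core_fraction(annotations):
    """Fraction of (class, spurious feature) pairs whose feature is core for another class.

    annotations: {class_name: {"core": set of feature ids, "spurious": set of feature ids}}
    """
    collided = total = 0
    for cls, ann in annotations.items():
        for feature in ann["spurious"]:
            total += 1
            if any(feature in other["core"]
                   for name, other in annotations.items() if name != cls):
                collided += 1
    return collided / total if total else 0.0

# Toy annotations: feature 20 is spurious for one class and core for another.
annotations = {
    "desktop computer": {"core": {10}, "spurious": {20, 30}},
    "keyboard": {"core": {20}, "spurious": {40}},
}
fraction = spurious_also_core_fraction(annotations)
print(fraction)
```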

Negative spurious gap

Since models do well on images with spurious cues and poorly on images without them, a positive spurious gap is the natural case

However, contrary to this expectation, there are classes with a negative spurious gap

That is, reliance on spurious features actually hurts performance

Analyzing the high spuriosity images of classes with a negative spurious gap shows that 80% of them contain multiple objects

The authors therefore hypothesize that negative spurious gaps are caused by spurious feature collision

To prevent this, they refine high spuriosity images using the neural activation maps of core features
(compute neural activation maps from the core robust neural features of classes with a negative spurious gap → build a soft segmentation mask from the neural activation maps → crop the image using the soft segmentation)

4. Appendix

4.1. Characterizing Feature Sensitivities

After obtaining a soft segmentation in input space from a neural feature, apply targeted corruption only to the masked region

Three corruptions are chosen so that each affects only one of color, texture, and shape : graying, blurring, random patch rotation
(graying - color, blurring - texture, random patch rotation - shape)

The accuracy drop under corruption measures feature sensitivity, which shows what information a feature carries and which information matters
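
One of the three corruptions, graying inside the soft segmentation, can be sketched as below. This is a minimal sketch: the channel-average grayscale, the blending with the soft mask, and the names `gray_masked_region` and `sensitivity` are assumptions rather than the paper's exact procedure.

```python
import numpy as np

def gray_masked_region(image, soft_mask):
    """Remove color only where the soft mask is active.

    image:     (H, W, 3) floats in [0, 1]
    soft_mask: (H, W) floats in [0, 1] derived from a neural feature
    """
    gray = np.repeat(image.mean(axis=2, keepdims=True), 3, axis=2)  # simple channel average
    m = soft_mask[..., None]
    return m * gray + (1.0 - m) * image

def sensitivity(acc_clean, acc_corrupted):
    """Feature sensitivity as the accuracy drop caused by the targeted corruption."""
    return acc_clean - acc_corrupted

# Pure red 2x2 image; only the top-left pixel is inside the mask.
image = np.zeros((2, 2, 3))
image[..., 0] = 1.0
mask = np.array([[1.0, 0.0], [0.0, 0.0]])
out = gray_masked_region(image, mask)
print(out[0, 0], out[1, 1])
```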

Figure 10 (a) : color information in the core feature of lizard body is crucial for successful prediction
Figure 10 (b) : spurious feature of keyboard on desk hurts classification → blurring out the keyboard and desk leads to more accurate prediction of the desktop computer → texture of the keyboard and desk contributes more to confusing the model and leading to misclassification
Figure 10 (c) : rotating patches leads to an incorrect prediction → shape of the books is crucial

Strongly correlated sensitivities → model behavior is determined much more by the data it operates over than the specific decisions made during training

4.2. Full Pretrained Model Evaluation Results

Effective robustness is measured by how close the line relating acc(highest spuriosity) and acc(lowest spuriosity) is to y=x

That is, it measures how consistent model performance is whether or not spurious cues are present

Zero-shot CLIP has good effective robustness, but fine-tuning a linear head sharply reduces it

Figure 14 : per-class spurious gaps correlate strongly between models

zero-shot CLIP models have the worst correlation with other models, suggesting that their perception is fundamentally different

4.3. Alternate Baseline: Error Tuning Does Not Close Spurious Gaps

Most prior work on spurious correlations addresses the problem by focusing on the data the model misclassifies
(e.g. Learning from Failure, ...)

Train the classification head on misclassified images instead of low spuriosity images

Error tuning can reduce the spurious gap, but the accuracy loss is too large

Low spuriosity tuning loses far less accuracy than error tuning and also reduces the spurious gap better

That is, low spuriosity images provide a more reliable learning signal against shortcuts than error images

4.4. Details on Core-Cropping

Details on refining high spuriosity images with the neural activation maps of core features from Section 3

Average the neural activation maps of the core neural features to get a soft segmentation
Obtain a mask using a threshold of 0.9
Find the bounding box that encompasses the mask
Expand the bounding box height and width by 20%, then make it square
(extend the shorter side to match the longer side)

The authors say this method works reasonably for a class's top 20th percentile activation images but is not reliable for the remaining 80%
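
The four core-cropping steps above can be sketched as a single function. This is a minimal sketch under assumptions: activation maps are taken to be already scaled to [0, 1], the box is expanded around its center and clipped to the image, and `core_crop_box` is a hypothetical name.

```python
import numpy as np

def core_crop_box(activation_maps, threshold=0.9, expand=0.2):
    """Return a square crop box (y0, y1, x0, x1) around the core-feature region.

    activation_maps: (n_core_features, H, W), values assumed to lie in [0, 1];
    at least one pixel is assumed to pass the threshold.
    """
    soft_seg = activation_maps.mean(axis=0)          # 1. average maps -> soft segmentation
    ys, xs = np.where(soft_seg >= threshold)         # 2. threshold 0.9 -> mask
    top, bottom = int(ys.min()), int(ys.max()) + 1   # 3. bounding box around the mask
    left, right = int(xs.min()), int(xs.max()) + 1
    # 4. expand by 20% and extend the shorter side to match the longer one
    side = int(round(max(bottom - top, right - left) * (1 + expand)))
    cy, cx = (top + bottom) / 2, (left + right) / 2
    y0 = int(np.floor(cy - side / 2))
    x0 = int(np.floor(cx - side / 2))
    height, width = soft_seg.shape
    return max(0, y0), min(height, y0 + side), max(0, x0), min(width, x0 + side)

# Two core-feature maps that both fire on a 5x10 region of a 20x20 image.
maps = np.zeros((2, 20, 20))
maps[:, 5:10, 4:14] = 1.0
box = core_crop_box(maps)
print(box)  # crop with image[y0:y1, x0:x1]
```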

Jasonlee1995 added the labels Principle (Understanding the AI), Data (Related with data), Vision (Related with Computer Vision tasks), and Robust Learning (Learning on Biased Dataset, Noisy Dataset, Imbalance Dataset) on Feb 14, 2024