Abstract
How can we explain the predictions of a black-box model? In this paper, we use influence functions — a classic technique from robust statistics — to trace a model’s prediction through the learning algorithm and back to its training data, thereby identifying training points most responsible for a given prediction. To scale up influence functions to modern machine learning settings, we develop a simple, efficient implementation that requires only oracle access to gradients and Hessian-vector products. We show that even on non-convex and non-differentiable models where the theory breaks down, approximations to influence functions can still provide valuable information. On linear models and convolutional neural networks, we demonstrate that influence functions are useful for multiple purposes: understanding model behavior, debugging models, detecting dataset errors, and even creating visually indistinguishable training-set attacks.
https://arxiv.org/pdf/1703.04730.pdf
Pang Wei Koh, Percy Liang
Problem: what went wrong to make our black-box model less effective? Is something wrong with the training data? Wrong labels?
Idea: use influence functions to trace a prediction on a test sample back to the training samples, measuring how much each training sample influenced it.
The degree of influence of a single training sample z on all the model parameters θ is calculated as:

$$\mathcal{I}_{\text{up,params}}(z) \;=\; \left.\frac{d\hat{\theta}_{\epsilon,z}}{d\epsilon}\right|_{\epsilon=0} \;=\; -H_{\hat{\theta}}^{-1}\,\nabla_\theta L(z,\hat{\theta})$$

where ε is the weight of sample z relative to the other training samples; with n training samples, it can be interpreted as being on the order of 1/n. The Hessian

$$H_{\hat{\theta}} \;=\; \frac{1}{n}\sum_{i=1}^{n}\nabla_\theta^2 L(z_i,\hat{\theta})$$

is the matrix of second-order partial derivatives, which aggregates the influence of all n training samples on the model parameters θ. The gradient

$$\nabla_\theta L(z,\hat{\theta})$$

captures the effect of the single training sample z on the model parameters θ, where L is the loss function.
The influence score I is thus composed of two pieces of information (see the sketch after this list):
- the influence of the other training samples, implied by the Hessian matrix;
- the effect of the current training sample z on the model parameters θ.
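To make the formula concrete, here is a minimal sketch (not the paper’s code; the authors’ release is linked under Reference) that computes I_up,params for binary logistic regression, where the Hessian has a closed form. All names here (`X`, `y`, `theta`, `damping`) are illustrative:

```python
# Minimal sketch: I_up,params(z) = -H^{-1} grad L(z, theta) for binary
# logistic regression, where the Hessian has a simple closed form.
# X: (n, d) training inputs, y: (n,) labels in {0, 1}, theta: (d,) params.
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def grad_loss(x, y, theta):
    # Gradient of the log loss at a single sample (x, y).
    return (sigmoid(x @ theta) - y) * x

def hessian(X, theta, damping=1e-3):
    # H = (1/n) X^T diag(p(1-p)) X, averaged over all n training samples;
    # a small damping term keeps H invertible on ill-conditioned problems.
    p = sigmoid(X @ theta)
    H = (X * (p * (1.0 - p))[:, None]).T @ X / X.shape[0]
    return H + damping * np.eye(X.shape[1])

def influence_on_params(X, y, theta, i):
    # I_up,params(z_i) = -H^{-1} grad L(z_i, theta)
    return -np.linalg.solve(hessian(X, theta), grad_loss(X[i], y[i], theta))
```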
The paper further gives the influence of a single training sample z on the prediction (loss) at a single test sample z_test:

$$\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) \;=\; -\nabla_\theta L(z_{\text{test}},\hat{\theta})^{\top}\, H_{\hat{\theta}}^{-1}\, \nabla_\theta L(z,\hat{\theta})$$
This influence consists of three pieces of information, corresponding to the three factors in the formula (see the sketch after this list):
- how the test sample z_test is affected by the model parameters θ (the test-loss gradient);
- the influence of the other training samples, implied by the Hessian matrix;
- the effect of the current training sample z on the model parameters θ.
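For models where forming H explicitly is infeasible, the paper computes I_up,loss using only gradient and Hessian-vector-product oracles. Below is a hedged PyTorch sketch of that idea, using double backprop for the HVP and a LiSSA-style recursion (one of the estimators the paper uses) to approximate H⁻¹v; `loss_fn` and the other closures are placeholders, and `steps`/`scale` would need tuning per problem:

```python
# Sketch: I_up,loss(z, z_test) = -grad L(z_test)^T H^{-1} grad L(z),
# using only gradients and Hessian-vector products (no explicit Hessian).
import torch

def flat_grad(loss, params, create_graph=False):
    grads = torch.autograd.grad(loss, params, create_graph=create_graph)
    return torch.cat([g.reshape(-1) for g in grads])

def hvp(loss, params, v):
    # Hessian-vector product via double backprop: H v = d/dtheta (grad^T v).
    g = flat_grad(loss, params, create_graph=True)
    return flat_grad(g @ v, params)

def inverse_hvp(loss_fn, params, v, steps=100, scale=10.0):
    # LiSSA-style recursion h <- v + (I - H/scale) h, which converges to
    # scale * H^{-1} v when the eigenvalues of H/scale lie in (0, 1).
    # loss_fn() recomputes the training loss so each step has a fresh graph.
    h = v.clone()
    for _ in range(steps):
        h = v + h - hvp(loss_fn(), params, h) / scale
    return h / scale

def influence_on_test_loss(loss_fn, z_loss_fn, test_loss_fn, params):
    # loss_fn(): full training loss (defines H); z_loss_fn(), test_loss_fn():
    # losses at the single training sample z and the test sample z_test.
    v = flat_grad(test_loss_fn(), params)           # grad L(z_test)
    s_test = inverse_hvp(loss_fn, params, v)        # ~ H^{-1} grad L(z_test)
    return -(s_test @ flat_grad(z_loss_fn(), params)).item()
```

The authors’ code release implements the same quantity with conjugate gradients and stochastic estimation; this sketch only shows the shape of the computation.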
Let’s see what happens if a term is missing from the formula:
- Without the third factor, the loss gradient of the single training sample z, the computed influence I of sample z deviates from its true value (as in the left panel of the paper’s figure).
- Without the second factor, the Hessian matrix, there is no reference to the other training samples: a training sample with the same label as the test sample (green) could only ever help the prediction, and a training sample with a different label (red) could only ever hurt it.
This is not what actually happens: some training samples share the test sample’s label yet hurt training. The image on the right of the figure below is exactly such a disruptive training sample: if one more copy of that “7” training image is added, the test sample on the left becomes more likely to be misclassified (its loss increases).
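To see the role of the Hessian term in code: dropping H⁻¹ leaves a plain gradient dot product, which behaves exactly as the list above describes. A tiny sketch, reusing `flat_grad` from the earlier PyTorch sketch:

```python
# Influence with the Hessian term dropped: reduces to a gradient dot
# product -grad L(z_test)^T grad L(z). Unlike the full score, this cannot
# flag a same-label training point (like the "7" above) as harmful.
def influence_without_hessian(z_loss_fn, test_loss_fn, params):
    g_test = flat_grad(test_loss_fn(), params)
    g_z = flat_grad(z_loss_fn(), params)
    return -(g_test @ g_z).item()
```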
The paper closes with several practical use cases for influence functions.
1. Understand the behavior of the model
The paper gives a vivid example, using influence functions to compare a support vector machine (SVM) and a deep network (Inception) on a model that distinguishes “fish” from “dog”.
The green dots are training samples labeled “fish”, and the red dots are training samples labeled “dog”.
Comparing the SVM plot with the Inception plot: the x-axis is the Euclidean distance between a training sample and the test sample (roughly, image similarity), and the y-axis is the influence of that training sample on the single test sample.
What is interesting:
1. In the SVM model, training samples that differ greatly from the test sample (large Euclidean distance) contribute almost nothing to the model’s prediction on the test sample (I ≈ 0). This matches the central role support vectors play in discrimination: the harder a training sample is, the greater its impact on the model.
2. In the Inception deep network, training samples influence the prediction on the test sample regardless of Euclidean distance. This reflects an advantage of deep networks: every training sample can contribute to model optimization, whether positively or negatively.
3. The two sample images on the right are the training samples with the greatest influence on the prediction. For the SVM, the two samples that look least like “fish” play the greater role in discrimination, whereas for Inception the two most influential samples are clearly “fish” images.
2. Generate adversarial training examples
By taking the gradient of the influence score with respect to a training image’s pixels, the paper crafts visually indistinguishable perturbations of training points that flip the model’s prediction on a target test point — a training-set attack.
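A hedged sketch of one such attack step, reusing `flat_grad` and `inverse_hvp` from the earlier sketch. `make_z_loss` is a hypothetical closure that rebuilds the loss at z from perturbed pixels; the paper’s actual recipe (its I_pert,loss) iterates steps like this and projects back to valid images:

```python
# One step of a training-set attack sketch: nudge training-image pixels
# in the direction that most increases the loss at the target test point.
def attack_step(x_train, y_train, make_z_loss, test_loss_fn, loss_fn,
                params, eps=1e-2):
    x = x_train.clone().requires_grad_(True)
    v = flat_grad(test_loss_fn(), params)
    s_test = inverse_hvp(loss_fn, params, v)      # H^{-1} grad L(z_test)
    g = flat_grad(make_z_loss(x, y_train), params, create_graph=True)
    # Influence of z (as a function of its pixels x) on the test loss.
    score = -(s_test @ g)
    grad_x = torch.autograd.grad(score, x)[0]
    # Gradient ascent on the influence score (up to the 1/n upweighting
    # scale), so retraining on the perturbed z raises the test loss.
    return (x + eps * grad_x.sign()).detach()
```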
3. Assess the value of the training sample set
If the training sample set and the test sample set do not come from the same domain or the same distribution, then collecting more training samples will not help train the model.
If, when you compute the influence scores I, only a very small fraction of the training samples have any effect on the predictions of the test samples, be careful: the domain from which your training samples were collected may be wrong, and you may need to collect training samples in a different way. A quick diagnostic along these lines is sketched below.
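A hedged sketch of that diagnostic (the relative threshold is arbitrary and would need tuning per problem); `influences` is assumed to be an array of I_up,loss scores, one per training point:

```python
# Diagnostic sketch: what fraction of training points have influence
# scores that are non-negligible relative to the strongest one?
import numpy as np

def influential_fraction(influences, rel_threshold=0.01):
    scores = np.abs(np.asarray(influences, dtype=float))
    if scores.max() == 0.0:
        return 0.0
    return float(np.mean(scores > rel_threshold * scores.max()))
```

A value near zero suggests the model’s test-time behavior rests on a sliver of the training set, consistent with a domain mismatch.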
4. Find training samples with wrong labels
The paper measures each training point’s influence on its own loss, I_up,loss(z_i, z_i) (its self-influence); mislabeled points tend to have unusually large self-influence, so checking the training set in that order lets a human reviewer find label errors much faster than random inspection. A sketch follows.
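A sketch of that recipe, reusing `influence_on_test_loss` from the earlier sketch; `make_loss_i` is a hypothetical closure factory returning the loss at training point i:

```python
# Rank training points by self-influence I_up,loss(z_i, z_i); inspect the
# top of the list first when hunting for mislabeled samples.
def rank_by_self_influence(n_train, make_loss_i, loss_fn, params):
    def self_influence(i):
        loss_i = make_loss_i(i)
        return influence_on_test_loss(loss_fn, loss_i, loss_i, params)
    return sorted(range(n_train), key=lambda i: -abs(self_influence(i)))
```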
Reference:
https://github.com/kohpangwei/influence-release