Studying Large Language Model Generalization with Influence Functions

DOI: 10.48550/arXiv.2308.03296 Publication Date: 2023-08-07
ABSTRACT
When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is: which training examples most contribute to a given behavior? Influence functions aim to answer a counterfactual: how would the model's parameters (and hence its outputs) change if a given sequence were added to the training set? While influence functions have produced insights for small models, they are difficult to scale to large language models (LLMs) due to the difficulty of computing an inverse-Hessian-vector product (IHVP). We use the Eigenvalue-corrected Kronecker-Factored Approximate Curvature (EK-FAC) approximation to scale influence functions up to LLMs with up to 52 billion parameters. In our experiments, EK-FAC achieves similar accuracy to traditional influence function estimators despite the IHVP computation being orders of magnitude faster. We investigate two algorithmic techniques to reduce the cost of computing gradients for candidate training sequences: TF-IDF filtering and query batching. We use influence functions to investigate the generalization patterns of LLMs, including the sparsity of the influence patterns, increasing abstraction with scale, math and programming abilities, cross-lingual generalization, and role-playing behavior. Despite many apparently sophisticated forms of generalization, we identify a surprising limitation: influences decay to near-zero when the order of the key phrases is flipped. Overall, influence functions give us a powerful new tool for studying the generalization properties of LLMs.
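To make the counterfactual concrete, here is a minimal sketch, assuming a toy logistic-regression model in place of an LLM and synthetic data in place of real training sequences. It computes influence(z_train, z_query) = -grad_query^T H^{-1} grad_train, where H is the damped Hessian of the training loss. For real LLMs, forming or inverting H is infeasible, which is why the paper approximates the inverse-Hessian-vector product (IHVP) with EK-FAC; this sketch swaps in an exact solve, which is only possible because the toy model is small. It is an illustration of the general technique, not the authors' implementation.

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))                     # stand-in "training sequences"
y = (X @ rng.normal(size=d) > 0).astype(float)  # stand-in labels

def per_example_grads(theta, X, y):
    """Gradients of the logistic loss, one row per example."""
    return (expit(X @ theta) - y)[:, None] * X

def damped_hessian(theta, X, damping=1e-3):
    """Exact Hessian of the mean training loss, plus damping."""
    p = expit(X @ theta)
    H = (X * (p * (1 - p))[:, None]).T @ X / len(X)
    return H + damping * np.eye(X.shape[1])

# "Train" the toy model with a few damped Newton steps.
theta = np.zeros(d)
for _ in range(20):
    g = per_example_grads(theta, X, y).mean(axis=0)
    theta -= np.linalg.solve(damped_hessian(theta, X), g)

# IHVP for one held-out query, then the influence of every training example.
x_q, y_q = rng.normal(size=(1, d)), np.array([1.0])
ihvp = np.linalg.solve(damped_hessian(theta, X),
                       per_example_grads(theta, x_q, y_q)[0])  # H^{-1} v
influences = -per_example_grads(theta, X, y) @ ihvp

print("most influential training examples:", np.argsort(influences)[-5:][::-1])
```

Note the structure of the computation: the expensive IHVP is solved once per query, after which scoring each candidate training example costs only one dot product with its gradient; this is what makes large-scale candidate scans feasible once the IHVP itself is approximated.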
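The TF-IDF filtering technique mentioned in the abstract can be illustrated with a similarly hedged sketch: rather than computing a gradient for every candidate training sequence, one keeps only the candidates whose TF-IDF vectors are most similar to the query, then runs the expensive influence computation on that shortlist. The corpus, query, and top_k value below are illustrative assumptions, not the paper's data or settings; sklearn's off-the-shelf TfidfVectorizer stands in for whatever vectorizer the authors used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Hypothetical candidate training sequences and query.
candidates = [
    "the derivative of x squared is two x",
    "recipe for sourdough bread with rye flour",
    "gradient descent updates parameters along the negative gradient",
    "the capital of France is Paris",
]
query = "how does gradient descent minimize a loss function"

vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(candidates + [query])

# TfidfVectorizer L2-normalizes rows, so the linear kernel between the
# query (last row) and each candidate is their cosine similarity.
sims = linear_kernel(doc_vecs[-1], doc_vecs[:-1]).ravel()

top_k = 2  # illustrative shortlist size
shortlist = sims.argsort()[::-1][:top_k]
for i in shortlist:
    print(f"{sims[i]:.3f}  {candidates[i]}")
# Only shortlisted sequences would proceed to gradient computation.
```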