The availability of data and computation has led to a proliferation of machine learning models, even compared to the already machine learning-focused field of the 2000s. But do we really understand what those models do, what their strengths and weaknesses are, and where they make mistakes? This page surveys methods for analyzing models to answer those questions.
Model analysis is different from explanation/interpretation: the latter either examines individual decisions or tries to capture a holistic view of the model in a person's head (which is often impossible); the former aims at discovering prominent or important characteristics of a model and predicting its behavior.
The most basic form of model analysis is test set performance measurement, which is what virtually every machine learning paper does. Ideally, by measuring on a held-out data set, one can predict the performance in real-world applications. However, the observed performance can be drastically worse in practice for several reasons: overfitting to a dataset that has been used for decades (as is the case for the WSJ dataset in syntactic parsing), differences in domain (e.g. medical vs. financial) or genre (e.g. broadcast news vs. printed papers), or time-related changes (test sets are often collected around the same time as the training data, while applications come much later).
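As a rough illustration of why a single held-out score can mislead, the sketch below computes accuracy overall and then separately per domain. The labels, predictions, and domain tags are made up for the example, and the helper names `accuracy` and `accuracy_by_slice` are hypothetical, not taken from any particular library.

```python
from collections import defaultdict


def accuracy(gold, predicted):
    """Fraction of positions where the prediction equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def accuracy_by_slice(gold, predicted, slices):
    """Accuracy computed separately for each slice label (e.g. a domain)."""
    grouped = defaultdict(lambda: ([], []))
    for g, p, s in zip(gold, predicted, slices):
        grouped[s][0].append(g)
        grouped[s][1].append(p)
    return {s: accuracy(gs, ps) for s, (gs, ps) in grouped.items()}


# Toy, made-up data purely for illustration (not real results).
gold = ["pos", "neg", "pos", "neg", "pos", "neg"]
predicted = ["pos", "neg", "pos", "pos", "neg", "neg"]
domains = ["financial", "financial", "financial", "medical", "medical", "medical"]

overall = accuracy(gold, predicted)
per_domain = accuracy_by_slice(gold, predicted, domains)

print(f"overall: {overall:.2f}")
for domain, score in sorted(per_domain.items()):
    print(f"{domain}: {score:.2f}")
```

In this toy setup the aggregate number hides a large gap between the two domains, which is the kind of discrepancy a single test set score cannot reveal. The same slicing idea could be applied to genre or collection date, assuming such metadata is available for the test examples.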
IMPORTANT SURVEY: Belinkov and Glass (2019)