Constrained conditional models

Main reference: Chang et al. (2008)^[1] from Dan Roth's group.

A constrained conditional model (CCM) is an NLP modeling paradigm in which the objective function is a linear combination of feature functions $\phi_i$, weighted by $w_i$, and constraint terms $C_j$, penalized by $\rho_j$:

$f(x, y) = \sum_i w_i \phi_i (x,y) - \sum_j \rho_j C_j(x,y)$

At test time, an annotation is obtained by maximizing the objective function:

$y^* = \operatorname*{argmax}_{y \in \mathcal{Y}} f(x,y)$
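
The objective and its maximization can be sketched on a toy tagging problem. This is only an illustration under stated assumptions, not code from Chang et al. (2008): the tag set, the feature function `features`, the single constraint in `violations` ("the sentence contains a verb"), and all weights are hypothetical, and the argmax is an exhaustive search that is feasible only for tiny inputs.

```python
from itertools import product

TAGS = ("DET", "NOUN", "VERB")  # hypothetical tag set


def features(x, y):
    """Hypothetical phi_i(x, y): counts of (word, tag) pairs."""
    counts = {}
    for word, tag in zip(x, y):
        counts[(word, tag)] = counts.get((word, tag), 0) + 1
    return counts


def violations(x, y):
    """Hypothetical C_1(x, y): 1 if the tag sequence has no verb, else 0."""
    return [0 if "VERB" in y else 1]


def score(x, y, w, rho):
    """f(x, y) = sum_i w_i * phi_i(x, y) - sum_j rho_j * C_j(x, y)."""
    feature_term = sum(w.get(k, 0.0) * v for k, v in features(x, y).items())
    penalty_term = sum(r * c for r, c in zip(rho, violations(x, y)))
    return feature_term - penalty_term


def predict(x, w, rho):
    """y* = argmax over every tag sequence of the same length as x."""
    return max(product(TAGS, repeat=len(x)), key=lambda y: score(x, y, w, rho))


x = ("the", "dog", "barks")
w = {
    ("the", "DET"): 2.0,
    ("dog", "NOUN"): 2.0,
    ("barks", "NOUN"): 1.0,  # the features alone slightly prefer NOUN here
    ("barks", "VERB"): 0.8,
}

print(predict(x, w, rho=[5.0]))  # constraint penalty switched on
print(predict(x, w, rho=[0.0]))  # constraint penalty switched off
```

With the penalty switched on, the verb-less sequence loses to the one that tags "barks" as a verb; with $\rho_1 = 0$ the features alone decide. In practice the exhaustive search would be replaced by a solver.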

Note that inference is global: when the output has many components (say, the POS tags of a sentence or the arguments of a predicate), they are chosen jointly to optimize a single function. In other words, they are decided simultaneously. This differs from, e.g., transition systems, which assign one piece of the output at a time.

	CCM	Transition-based
Inference	+ no error propagation	- suffers from error propagation
	- harder to implement, though general-purpose solvers (e.g. ILP solvers) are available	+ easier to implement
	- slower	+ faster (with greedy decoding, and also with beam search)
Features	- restricted set of features (otherwise inference becomes intractable)	+ rich set of features
Examples	anything that uses LBJ (Rizzolo and Roth, 2010)^[2]	transition-based dependency parsing, e.g. MaltParser (Nivre et al., 2006)^[3]
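
The error-propagation contrast can be made concrete with a small sketch. Everything here is a hypothetical assumption chosen for illustration: the two abstract tags, the local scores in `EMIT` and `TRANS`, and the helper names. The greedy decoder commits to one tag at a time, as a transition-style system would, while the global decoder scores whole sequences jointly.

```python
from itertools import product

TAGS = ("A", "B")  # hypothetical, abstract tags

# Hypothetical local scores: EMIT[position][tag] and TRANS[(previous_tag, tag)].
EMIT = [{"A": 1.0, "B": 0.9}, {"A": 0.1, "B": 0.0}]
TRANS = {("A", "A"): 0.0, ("A", "B"): 0.0, ("B", "A"): 0.0, ("B", "B"): 2.0}


def sequence_score(y):
    """Score of a complete tag sequence: per-position scores plus transitions."""
    total = sum(EMIT[i][tag] for i, tag in enumerate(y))
    total += sum(TRANS[(prev, cur)] for prev, cur in zip(y, y[1:]))
    return total


def greedy_decode():
    """Pick the locally best tag at each position, left to right, never revising."""
    y = []
    for i in range(len(EMIT)):
        def step_score(tag):
            trans = TRANS[(y[-1], tag)] if y else 0.0
            return EMIT[i][tag] + trans
        y.append(max(TAGS, key=step_score))
    return tuple(y)


def global_decode():
    """Pick the jointly best sequence by exhaustive search (tiny inputs only)."""
    return max(product(TAGS, repeat=len(EMIT)), key=sequence_score)


print(greedy_decode(), sequence_score(greedy_decode()))
print(global_decode(), sequence_score(global_decode()))
```

The greedy decoder picks "A" first because it looks marginally better locally, and that early choice rules out the high-scoring "B, B" sequence that joint inference finds.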

References

[1] Chang, M., Ratinov, L., Rizzolo, N., & Roth, D. (2008). Learning and Inference with Constraints. In Proc. of AAAI.
[2] Rizzolo, N., & Roth, D. (2010). Learning Based Java for Rapid Development of NLP Systems. Proceedings of the Language Resources and Evaluation Conference, 957–964. Retrieved from http://www.lrec-conf.org/proceedings/lrec2010/pdf/747_Paper.pdf
[3] Nivre, J., Hall, J., & Nilsson, J. (2006). MaltParser: A data-driven parser-generator for dependency parsing. In LREC 2006 (Vol. 6, pp. 2216–2219).