Main reference: Chang et al. (2008)[1] from Dan Roth's group.
A constrained conditional model (CCM) is an NLP modeling paradigm in which the objective function is expressed as a linear combination of feature functions and constraints.
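As a sketch in the notation commonly used for CCMs, following Chang et al. (2008)[1], the objective can be written as below. The symbol names are not defined elsewhere on this page and should be read as assumptions: $\phi_i$ are feature functions with weights $w_i$, $C_k$ are constraints, $d_{C_k}(x, y)$ measures how strongly output $y$ violates constraint $C_k$ on input $x$, and $\rho_k$ is the penalty weight for that violation.

$$
f_{\Phi,C}(x, y) = \sum_{i} w_i \, \phi_i(x, y) \;-\; \sum_{k} \rho_k \, d_{C_k}(x, y)
$$

Letting a penalty $\rho_k$ grow without bound turns the corresponding constraint into a hard one, while a finite $\rho_k$ gives a soft constraint that can be traded off against the feature score.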
At test time, an annotation is obtained by maximizing the objective function over the candidate outputs.
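Written out as a sketch, with $f(x, y)$ standing for the objective function and $\mathcal{Y}(x)$ for the set of candidate annotations of input $x$ (both symbol names are assumptions, not taken from this page), the prediction is:

$$
y^{*} = \arg\max_{y \in \mathcal{Y}(x)} f(x, y)
$$

The maximization ranges over complete outputs $y$, not over individual components of the output.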
Notice that inference is global: when the output has many components (say, the POS tags of a sentence or the arguments of a predicate), they are chosen jointly to optimize a single function. In other words, they are decided simultaneously. This differs from, e.g., transition systems, which assign one piece of the output at a time.
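The difference between deciding jointly and deciding one piece at a time can be illustrated with a toy tagging problem. The sketch below is not taken from Chang et al. (2008): the scores in `LOCAL`, the penalty `RHO`, the "exactly one verb" constraint, and all function names are hypothetical, and exhaustive enumeration stands in for the ILP solver a real system would more likely use.

```python
from itertools import product

TAGS = ("D", "N", "V")  # determiner, noun, verb

# Hypothetical local scores for a three-token sentence; each entry stands in
# for the weighted feature sum at one position.
LOCAL = [
    {"D": 2.0, "N": 0.5, "V": 0.25},
    {"D": 0.25, "N": 1.0, "V": 1.25},
    {"D": 0.25, "N": 0.5, "V": 1.5},
]
RHO = 10.0  # hypothetical penalty weight for the constraint


def violations(tags):
    """Toy global constraint: the sentence contains exactly one verb."""
    return abs(list(tags).count("V") - 1)


def objective(tags):
    """Feature score minus the weighted constraint penalty."""
    return sum(LOCAL[i][t] for i, t in enumerate(tags)) - RHO * violations(tags)


def joint_inference():
    """Score every complete tag sequence and keep the best one."""
    return max(product(TAGS, repeat=len(LOCAL)), key=objective)


def greedy_inference():
    """Commit to one tag at a time, left to right, never revising a choice."""
    chosen = []
    for scores in LOCAL:
        def step_score(tag):
            # Penalize a second verb given the tags already committed.
            clash = tag == "V" and "V" in chosen
            return scores[tag] - (RHO if clash else 0.0)
        chosen.append(max(TAGS, key=step_score))
    return tuple(chosen)


joint = joint_inference()
greedy = greedy_inference()
print(joint, objective(joint))    # ('D', 'N', 'V') 4.5
print(greedy, objective(greedy))  # ('D', 'V', 'N') 3.75
```

In this toy setup the greedy tagger commits to a verb at the second token because it looks best locally, and is then forced into a weaker tag at the third token. The joint search never makes that early commitment, at the cost of examining all 27 complete sequences.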
| | CCM | Transition-based |
|---|---|---|
| Inference | + no error propagation | - suffers from error propagation |
| | - harder to implement, but there are general-purpose solvers (e.g. ILP solvers) | + easier to implement |
| | - slower | + faster (with greedy decoding, but also with beam search) |
| Features | - restricted set of features (otherwise the model becomes intractable) | + rich set of features |
| Examples | anything that uses LBJ (Rizzolo and Roth, 2010)[2] | transition-based dependency parsing, e.g. MaltParser (Nivre et al., 2006)[3] |
References
1. Chang, M., Ratinov, L., Rizzolo, N., & Roth, D. (2008). Learning and Inference with Constraints. In Proceedings of AAAI.
2. Rizzolo, N., & Roth, D. (2010). Learning Based Java for Rapid Development of NLP Systems. In Proceedings of the Language Resources and Evaluation Conference (pp. 957–964). Retrieved from http://www.lrec-conf.org/proceedings/lrec2010/pdf/747_Paper.pdf
3. Nivre, J., Hall, J., & Nilsson, J. (2006). MaltParser: A data-driven parser-generator for dependency parsing. In LREC 2006 (Vol. 6, pp. 2216–2219).