Natural Language Understanding Wiki

Main reference: Chang et al. (2008)[1] from Dan Roth's group.

A constrained conditional model (CCM) is an NLP modeling paradigm in which the objective function is expressed as a linear combination of feature functions and constraints:
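In the notation usually used for CCMs (the symbols are assumptions here, since this page does not define them), $\phi_i$ are feature functions with weights $w_i$, and $d_{C_k}$ measures how strongly the output $y$ violates constraint $C_k$, weighted by a penalty $\rho_k$. Under those assumptions the objective can be sketched as:

```latex
f(x, y) \;=\; \sum_{i} w_i \, \phi_i(x, y) \;-\; \sum_{k} \rho_k \, d_{C_k}(x, y)
```

A very large $\rho_k$ makes the corresponding constraint effectively hard, while a small one lets the feature scores override it.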

At test time, an annotation is obtained by maximizing the objective function:
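Using the usual CCM notation (assumed here rather than taken from this page: feature functions $\phi_i$ with weights $w_i$, constraint-violation measures $d_{C_k}$ with penalties $\rho_k$, and $\mathcal{Y}(x)$ for the set of candidate outputs of input $x$), the prediction can be written as:

```latex
\hat{y} \;=\; \operatorname*{arg\,max}_{y \in \mathcal{Y}(x)}
  \left[ \sum_{i} w_i \, \phi_i(x, y) \;-\; \sum_{k} \rho_k \, d_{C_k}(x, y) \right]
```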

Note that inference is global: when the output has many components (say, the POS tags of a sentence or the arguments of a predicate), they are chosen jointly to optimize a single function. In other words, they are decided simultaneously. This differs from, e.g., transition systems, which assign one piece of the output at a time.
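The difference between joint and piecewise decisions can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the tag set, the local scores, the penalty `RHO`, and the "at least one verb" constraint are made up for illustration, and the greedy decoder is only a simple stand-in for a system that commits to one tag at a time. Joint inference is done by brute-force enumeration, which is feasible only for toy inputs.

```python
from itertools import product

TAGS = ["N", "V"]

# Hypothetical local scores (standing in for w . phi) for a 3-token sentence.
local_scores = [
    {"N": 2.0, "V": 1.5},
    {"N": 1.0, "V": 0.8},
    {"N": 3.0, "V": 0.1},
]
RHO = 5.0  # hypothetical penalty for violating the constraint


def violations(tags):
    """Constraint: the sentence must contain at least one verb."""
    return 0 if "V" in tags else 1


def objective(tags):
    """Linear feature score minus the weighted constraint violation."""
    feature_score = sum(local_scores[i][t] for i, t in enumerate(tags))
    return feature_score - RHO * violations(tags)


def joint_decode():
    """Global inference: score every complete tag sequence and keep the best."""
    return max(product(TAGS, repeat=len(local_scores)), key=objective)


def greedy_decode():
    """Piecewise inference: commit to the locally best tag at each position."""
    return tuple(max(TAGS, key=lambda t: s[t]) for s in local_scores)


if __name__ == "__main__":
    for name, tags in [("joint", joint_decode()), ("greedy", greedy_decode())]:
        print(name, tags, objective(tags))
```

In this toy setting the greedy decoder picks the best tag at every position and ends up violating the constraint, whereas the joint decoder gives up a little local score at one position to satisfy it.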

A CCM can encode dependencies between output parts, but only fairly simple ones; otherwise inference becomes intractable. For example, Rizzolo and Roth (2010)[2] show how to encode hidden Markov models in this framework.
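To see why an HMM fits a linear objective, note that its log joint probability is a sum of log parameters, one per start, transition, and emission event. The Python sketch below is not taken from Rizzolo and Roth (2010); it uses a made-up two-state HMM, treats each event as an indicator feature whose weight is the log of the corresponding parameter, and decodes by brute-force enumeration rather than Viterbi.

```python
import math
from collections import Counter
from itertools import product

STATES = ["N", "V"]
# Hypothetical toy HMM parameters.
start = {"N": 0.7, "V": 0.3}
trans = {("N", "N"): 0.4, ("N", "V"): 0.6, ("V", "N"): 0.8, ("V", "V"): 0.2}
emit = {("N", "dogs"): 0.6, ("N", "bark"): 0.4,
        ("V", "dogs"): 0.1, ("V", "bark"): 0.9}

# One weight per indicator feature: the log of the matching HMM parameter.
weights = {}
weights.update({("start", s): math.log(p) for s, p in start.items()})
weights.update({("trans",) + k: math.log(p) for k, p in trans.items()})
weights.update({("emit",) + k: math.log(p) for k, p in emit.items()})


def features(words, tags):
    """Count how often each indicator feature fires in (words, tags)."""
    phi = Counter()
    phi[("start", tags[0])] += 1
    for prev, cur in zip(tags, tags[1:]):
        phi[("trans", prev, cur)] += 1
    for tag, word in zip(tags, words):
        phi[("emit", tag, word)] += 1
    return phi


def linear_score(words, tags):
    """Dot product of weights and feature counts."""
    return sum(weights[f] * n for f, n in features(words, tags).items())


def hmm_log_joint(words, tags):
    """Log joint probability computed directly from the HMM definition."""
    logp = math.log(start[tags[0]]) + math.log(emit[(tags[0], words[0])])
    for i in range(1, len(tags)):
        logp += math.log(trans[(tags[i - 1], tags[i])])
        logp += math.log(emit[(tags[i], words[i])])
    return logp


def decode(words):
    """Brute-force argmax of the linear score over all tag sequences."""
    return max(product(STATES, repeat=len(words)),
               key=lambda tags: linear_score(words, tags))
```

Because the linear score equals the HMM log joint probability for every tag sequence, maximizing one maximizes the other; constraints could then be added as extra penalty terms on top of the same score.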

| | CCM | Transition-based |
|---|---|---|
| Inference | + no error propagation | - suffers from error propagation |
| | - harder to implement, but there are general-purpose solvers (e.g. ILP solvers) | + easier to implement |
| | - slower | + faster (for greedy decoding, but also beam search) |
| Features | - restricted set of features (otherwise the model becomes intractable) | + rich set of features |
| Examples | anything that uses LBJ (Rizzolo and Roth, 2010)[2] | transition-based dependency parsing, e.g. MaltParser (Nivre et al., 2006)[3] |


  1. Chang, M., Ratinov, L., Rizzolo, N., & Roth, D. (2008). Learning and Inference with Constraints. In Proceedings of AAAI.
  2. Rizzolo, N., & Roth, D. (2010). Learning Based Java for Rapid Development of NLP Systems. Proceedings of the Language Resources and Evaluation Conference, 957–964. Retrieved from
  3. Nivre, J., Hall, J., & Nilsson, J. (2006). MaltParser: A data-driven parser-generator for dependency parsing. In LREC 2006 (Vol. 6, pp. 2216–2219).