From Shen et al. (2016)[1]: "The basic idea is to introduce evaluation metrics as loss functions and assume that the optimal set of model parameters should minimize the expected loss on the training data."
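
As a minimal sketch of that idea (not the implementation of Shen et al.), assume a small explicit candidate set where the model gives each candidate a score and the evaluation metric gives it a loss. The risk is then the loss averaged under the model's softmax distribution, and its gradient with respect to the scores has a simple closed form. The function names `softmax`, `expected_risk`, and `risk_gradient` are hypothetical, chosen only for illustration; real systems typically approximate the sum with a sampled or n-best subset of outputs.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of model scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def expected_risk(scores, losses):
    # Risk = sum over candidates of P(candidate) * loss(candidate).
    probs = softmax(scores)
    return sum(p * l for p, l in zip(probs, losses))

def risk_gradient(scores, losses):
    # d(risk)/d(score_j) = p_j * (loss_j - risk) for a softmax distribution.
    probs = softmax(scores)
    risk = sum(p * l for p, l in zip(probs, losses))
    return [p * (l - risk) for p, l in zip(probs, losses)]

# Two candidates with equal scores; the second one has the higher loss.
scores = [0.0, 0.0]
losses = [0.0, 1.0]
risk = expected_risk(scores, losses)
grad = risk_gradient(scores, losses)

# One gradient-descent step on the scores (step size 1.0 is an arbitrary choice).
new_scores = [s - 1.0 * g for s, g in zip(scores, grad)]
new_risk = expected_risk(new_scores, losses)
```

In this toy case the gradient step moves probability mass toward the lower-loss candidate, so `new_risk` is smaller than `risk`.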

Zhao and Ng (2010)[3] call the same approach "Maximum Metric Score Training".

Relation to reinforcement learning:

  • From Stoyanov and Eisner (2012)[2]: "In general, our proposed ERMA setting of trying to directly minimize the loss (or maximize the reward) of a controller is familiar in reinforcement learning, e.g., in model free methods such as policy gradient."
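
The connection to policy gradient can be illustrated with a small sketch. It assumes a softmax distribution over a handful of candidate outputs and compares the exact gradient of the expected loss with a score-function (REINFORCE-style) estimate built from samples. All function names, the sample count, and the seed are hypothetical choices for illustration and are not taken from either paper.

```python
import math
import random

def softmax(scores):
    # Numerically stable softmax over a list of model scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def exact_risk_gradient(scores, losses):
    # Closed form for a softmax distribution: p_j * (loss_j - risk).
    probs = softmax(scores)
    risk = sum(p * l for p, l in zip(probs, losses))
    return [p * (l - risk) for p, l in zip(probs, losses)]

def sampled_risk_gradient(scores, losses, num_samples=20000, seed=0):
    # Score-function estimator: average of loss(y) * d log p(y) / d score_j,
    # where d log p(y) / d score_j = 1[j == y] - p_j for a softmax.
    rng = random.Random(seed)
    probs = softmax(scores)
    n = len(scores)
    grad = [0.0] * n
    for _ in range(num_samples):
        y = rng.choices(range(n), weights=probs, k=1)[0]
        for j in range(n):
            indicator = 1.0 if j == y else 0.0
            grad[j] += losses[y] * (indicator - probs[j])
    return [g / num_samples for g in grad]

scores = [0.5, -0.2, 0.1]
losses = [0.1, 0.9, 0.4]
exact = exact_risk_gradient(scores, losses)
estimate = sampled_risk_gradient(scores, losses)
max_error = max(abs(e - s) for e, s in zip(exact, estimate))
```

With enough samples the estimate approaches the exact gradient, which is the sense in which minimizing expected loss over sampled outputs resembles a policy-gradient update with the negative loss as the reward.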

Zhao and Ng (2010)[3] directly optimize the B³ (B-cubed) and MUC metrics for (entity) coreference resolution.

TODO: Norouzi et al. (2016)[4]

References

  1. Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. 2016. Minimum Risk Training for Neural Machine Translation. In Proceedings of ACL 2016, Berlin, Germany, August.
  2. Veselin Stoyanov and Jason Eisner. 2012. Minimum-Risk Training of Approximate CRF-Based NLP Systems. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics.
  3. Shanheng Zhao and Hwee Tou Ng. 2010. Maximum Metric Score Training for Coreference Resolution. In Proceedings of Coling 2010, pages 1308–1316, August.
  4. Mohammad Norouzi et al. 2016. Reward Augmented Maximum Likelihood for Neural Structured Prediction. In Advances in Neural Information Processing Systems.