Changes: Dependency parsing

Latest revision as of 08:43, 30 July 2016

Two main streams:

Transition-based
- Non-monotonic ~: http://www.aclweb.org/anthology/D/D15/D15-1162.pdf
- An important reference: Goldberg & Nirve (2013) they come close to reinforcement learning. TODO: Vaswani & Sagae (2016)^[1] use a special "error state" to recognize erroneous derivations.
Graph-based

Transition-based parsing[]

Transition systems[]

Arc-standard
Arc-eager
Arc-hybrid (Kuhlmann et al., 2011)^[2]
Swap-standard
Extended arc-standard (Attardi, 2006)^[3]

Feature selection[]

Traditional feature templates (TODO)
Primitive, embedded features: Chen and Manning (2014)^[4]
Stack LSTM: Dyer et al. (2015)^[5]
Bi-directional LSTMs: Kiperwasser and Goldberg (2016)^[6]

List-based (non-directional) systems[]

Easy-first (Goldberg and Elhadad, 2010)^[7]
Easy-first with Hierarchical Tree LSTMs (Kiperwasser & Goldberg, 2016)^[8]
Dependency parsing as head selection, using bi-directional LSTMs (Zhang et al. 2016)^[9]

Graph-based systems[]

TODO: Yu and Bohnet (2016) PDF

Classifier[]

From Yazdani & Henderson (2015)^[10]: "To train a classifier to choose the best actions, previous work has proposed memory-based classifiers (Nivre et al., 2004), SVMs (Nivre et al., 2006), structured perceptron (Huang et al., 2012; Zhang and Clark, 2011), two-layer neural networks (Chen and Manning, 2014)^[4], and Incremental Sigmoid Belief Networks (ISBN) (Titov and Henderson, 2007b), amongst other approaches."

Yazdani & Henderson (2015): recurrent neural network.

Corpora[]

Papers mainly use WSJ of Penn Treebank (TODO: confirm this). Although there is more in the treebank, only WSJ has been patched with gold NP-bracketing (Vadas & Curran, 2007)^[11].

Evaluation[]

Label Attachment Score (LAS): % of tokens for which a system has predicted the correct HEAD and DEPREL
Unlabeled A5achment Score (UAS): % of tokens with correct HEAD •
Label Accuracy (LA): % of tokens with correct DEPREL

For in-depth discussion: Volokh (2013)^[12]

Open problems[]

Ng and Curran (2015)^[13]: "Our results showthat noun phrases remain challenging for dependency parsers, both in choosing the correct head, and in determining the internal structure. Punctuation, despite being commonly ignored in parsers and evaluation, causes substantial cascading errors when misattached."

McDonald and Nivre (2007)^[14]: "Error propagation is an issue for MaltParser, which typically performs worse on long sentences, long dependency arcs and arcs higher in the graphs". This is true for gready transition-based parsers in general. Beam search can alleviate the problem but doesn't solve it. Global inference (Andor et al., 2016)^[15] might help but empirical verification is needed.

References[]

↑ Vaswani, A., & Sagae, K. (2016). Efficient Structured Inference for Transition-Based Parsing with Neural Networks and Error States. Transactions of the Association for Computational Linguistics, 4, 183–196.
↑ Marco Kuhlmann, Carlos G´ omez-Rodrıguez, and Giorgio Satta. 2011. Dynamic programming algorithms for transition-based dependency parsers. In Proceed- ings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), pages 673–682.
↑ Attardi, G. (2006). Experiments with a Multilanguage Non-Projective Dependency Parser. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X) (pp. 166–170). New York City: Association for Computational Linguistics. Retrieved from http://www.aclweb.org/anthology/W06-2922
↑ ^4.0 ^4.1 Chen, D., & Manning, C. (2014). A Fast and Accurate Dependency Parser using Neural Networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 740–750). Doha, Qatar: Association for Computational Linguistics.
↑ Dyer, C., Ballesteros, M., Ling, W., Matthews, A., & Smith, N. A. (2015). Transition-Based Dependency Parsing with Stack Long Short-Term Memory. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 334–343). Retrieved from http://www.aclweb.org/anthology/P15-1033
↑ Kiperwasser, E., & Goldberg, Y. (2016). Simple and Accurate Dependency Parsing Using Bidirectional {LSTM} Feature Representations. CoRR, abs/1603.0.
↑ Yoav Goldberg and Michael Elhadad. 2010. An effi- cient algorithm for easy-first non-directional dependency parsing. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguis- tics (NAACL HLT), pages 742–750.
↑ Kiperwasser, E., & Goldberg, Y. (2016). Easy-First Dependency Parsing with Hierarchical Tree LSTMs. Arxiv. Retrieved from http://arxiv.org/abs/1603.00375
↑ Zhang, X., Cheng, J., & Lapata, M. (2016). Dependency Parsing as Head Selection. Retrieved from http://arxiv.org/abs/1606.01280
↑ Yazdani, Majid, and James Henderson. "Incremental Recurrent Neural Network Dependency Parser with Search-based Discriminative Training." (2015): 142-152.
↑ Vadas, D., & Curran, J. R. (2007). Adding noun phrase structure to the Penn Treebank. 45th Annual Meeting of the Association of Computational Linguistics, (June), 240–247. Retrieved from http://acl.ldc.upenn.edu/P/P07/P07-1031.pdf
↑ Volokh, Alexander. "Performance-oriented dependency parsing." (2013). PDF
↑ Ng, D., & Curran, J. R. (2015). Identifying Cascading Errors using Constraints in Dependency Parsing. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (pp. 1148–1158). Beijing: ACL.
↑ McDonald, R., & Nivre, J. (2007). Characterizing the Errors of Data-Driven Dependency Parsing Models. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL).
↑ Andor, D., Alberti, C., Weiss, D., Severyn, A., Presta, A., Ganchev, K., Petrove, S., Collins, M. (2016). Globally Normalized Transition-Based Neural Networks. arXiv.org, cs.CL.

[1] Vaswani, A., & Sagae, K. (2016). Efficient Structured Inference for Transition-Based Parsing with Neural Networks and Error States. Transactions of the Association for Computational Linguistics, 4, 183–196.

[2] Marco Kuhlmann, Carlos G´ omez-Rodrıguez, and Giorgio Satta. 2011. Dynamic programming algorithms for transition-based dependency parsers. In Proceed- ings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), pages 673–682.

[3] Attardi, G. (2006). Experiments with a Multilanguage Non-Projective Dependency Parser. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X) (pp. 166–170). New York City: Association for Computational Linguistics. Retrieved from http://www.aclweb.org/anthology/W06-2922

[chen.manning.2014-4] 4.0 ^4.1 Chen, D., & Manning, C. (2014). A Fast and Accurate Dependency Parser using Neural Networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 740–750). Doha, Qatar: Association for Computational Linguistics.

[5] Dyer, C., Ballesteros, M., Ling, W., Matthews, A., & Smith, N. A. (2015). Transition-Based Dependency Parsing with Stack Long Short-Term Memory. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 334–343). Retrieved from http://www.aclweb.org/anthology/P15-1033

[6] Kiperwasser, E., & Goldberg, Y. (2016). Simple and Accurate Dependency Parsing Using Bidirectional {LSTM} Feature Representations. CoRR, abs/1603.0.

[7] Yoav Goldberg and Michael Elhadad. 2010. An effi- cient algorithm for easy-first non-directional dependency parsing. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguis- tics (NAACL HLT), pages 742–750.

[8] Kiperwasser, E., & Goldberg, Y. (2016). Easy-First Dependency Parsing with Hierarchical Tree LSTMs. Arxiv. Retrieved from http://arxiv.org/abs/1603.00375

[9] Zhang, X., Cheng, J., & Lapata, M. (2016). Dependency Parsing as Head Selection. Retrieved from http://arxiv.org/abs/1606.01280

[10] Yazdani, Majid, and James Henderson. "Incremental Recurrent Neural Network Dependency Parser with Search-based Discriminative Training." (2015): 142-152.

[11] Vadas, D., & Curran, J. R. (2007). Adding noun phrase structure to the Penn Treebank. 45th Annual Meeting of the Association of Computational Linguistics, (June), 240–247. Retrieved from http://acl.ldc.upenn.edu/P/P07/P07-1031.pdf

[12] Volokh, Alexander. "Performance-oriented dependency parsing." (2013). PDF

[13] Ng, D., & Curran, J. R. (2015). Identifying Cascading Errors using Constraints in Dependency Parsing. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (pp. 1148–1158). Beijing: ACL.

[14] McDonald, R., & Nivre, J. (2007). Characterizing the Errors of Data-Driven Dependency Parsing Models. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL).

[15] Andor, D., Alberti, C., Weiss, D., Severyn, A., Presta, A., Ganchev, K., Petrove, S., Collins, M. (2016). Globally Normalized Transition-Based Neural Networks. arXiv.org, cs.CL.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

@@ Line 47: / Line 47: @@
 * Unlabeled A5achment Score (UAS): % of tokens with correct HEAD •
 * Label Accuracy (LA): % of tokens with correct DEPREL
+For in-depth discussion: Volokh (2013)<ref>Volokh, Alexander. "Performance-oriented dependency parsing." (2013). [http://universaar.uni-saarland.de/monographien/volltexte/2013/109/pdf/Volokh_OABook.pdf PDF]</ref>
 == Open problems ==