Natural Language Understanding Wiki

This page documents the steps needed to reproduce the results of Chen & Manning (2014)[1] for English (including a re-implementation) and makes explicit the decisions that the paper does not cover.

  1. Obtain data: the WSJ part of the Penn Treebank. Sections 02-21 are used for training, section 22 for development, and section 23 for testing.
  2. Constituent-to-dependency conversion:
    1. LTH Constituent-to-Dependency Conversion Tool
      • Downloaded pennconverter
      • Command: java -jar pennconverter.jar -format=conllx -rightBranching=false -verbosity 2 -stopOnError
    2. Stanford Basic Dependencies
      • TODO
  3. Assign POS tags using the Stanford POS tagger, with ten-way jackknifing of the training data
    • Reported accuracy: ≈ 97.3%
    • I used version 3.6.0 downloaded here and followed the instructions in the JavaDoc.
    • Reused english-bidirectional-distsim.tagger.props (without the distributional similarity features, since I don't have that resource). Fixed some problems: [1]
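
The data split in step 1 and the jackknifing in step 3 can be sketched in Python as below. This is only an illustration, not the code used for the results on this page. The function names split_for_section and jackknife_folds are hypothetical, and assigning sentences to folds round-robin is an assumption: neither the paper nor this page says how the ten folds were formed. The idea of jackknifing is that each training sentence receives its POS tags from a tagger trained on the other nine folds, so the parser sees tags of realistic quality at training time.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_for_section(section: int) -> str:
    """Map a WSJ section number to the split used on this page."""
    if 2 <= section <= 21:
        return "train"
    if section == 22:
        return "dev"
    if section == 23:
        return "test"
    return "unused"  # the remaining sections are not used in this setup


def jackknife_folds(
    sentences: Sequence[T], n_folds: int = 10
) -> List[Tuple[List[T], List[T]]]:
    """Return one (train_part, held_out_part) pair per fold.

    Sentence i is placed in fold i % n_folds (an assumption; contiguous
    blocks would serve the same purpose). A tagger trained on train_part
    would then be used to tag held_out_part.
    """
    folds = []
    for k in range(n_folds):
        held_out = [s for i, s in enumerate(sentences) if i % n_folds == k]
        train = [s for i, s in enumerate(sentences) if i % n_folds != k]
        folds.append((train, held_out))
    return folds


if __name__ == "__main__":
    # Toy stand-in for the training sentences of sections 02-21.
    toy = [f"sent{i}" for i in range(25)]
    for k, (train, held_out) in enumerate(jackknife_folds(toy)):
        print(k, len(train), len(held_out))
```

Training and running the tagger on each pair is left out on purpose, since it depends on how the Stanford POS tagger is invoked locally.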

References

  1. Chen, D., & Manning, C. (2014). A Fast and Accurate Dependency Parser using Neural Networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 740–750). Doha, Qatar: Association for Computational Linguistics.