This page documents the steps needed to reproduce the results of Chen & Manning (2014)[1] for English (including a re-implementation) and makes explicit the decisions that aren't covered in the paper.
- Obtain data: the WSJ part of the Penn Treebank. Sections 02-21 for training, 22 for development, 23 for testing.
- I first tried the revised version (LDC2015T13), but the LTH converter doesn't work with it, so I used Penn Treebank 3 instead.
- Assign POS tags using the Stanford POS tagger with ten-way jackknifing of the training data
- Reported accuracy: ≈ 97.3%
- I used version 3.6.0, downloaded here, and followed the instructions in the JavaDoc.
- Reused english-bidirectional-distsim.tagger.props. Downloaded the word clusters. Fixed a crash.
- Instructions say: "The part-of-speech tags used as input for training and testing were generated by the Stanford POS Tagger (using the bidirectional5words model)."
- It's not clear how to divide the folds, which can make a difference. I divided by sentences and got 97.18% accuracy; dividing by documents wasn't better. (The split is sketched below.)
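- For concreteness, a minimal sketch of the sentence-level fold split (my own reading of ten-way jackknifing; the modulo-based assignment and helper structure are assumptions, not from the paper):
import java.util.ArrayList;
import java.util.List;

class Jackknife {
    // Ten-way jackknifing by sentences: each sentence is held out in exactly
    // one fold; a tagger trained on the other nine folds then tags it.
    static List<List<String>> splitAndTag(List<String> sentences) {
        int k = 10;
        List<List<String>> taggedFolds = new ArrayList<>();
        for (int fold = 0; fold < k; fold++) {
            List<String> train = new ArrayList<>();
            List<String> heldOut = new ArrayList<>();
            for (int i = 0; i < sentences.size(); i++) {
                if (i % k == fold) heldOut.add(sentences.get(i));
                else train.add(sentences.get(i));
            }
            // Train a tagger on `train` and tag `heldOut` here; the actual
            // Stanford tagger calls are omitted.
            taggedFolds.add(heldOut);
        }
        return taggedFolds;
    }
}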
- Constituent-to-dependency conversion:
- LTH Constituent-to-Dependency Conversion Tool
- Downloaded pennconverter
- The paper specifies neither the command-line options nor which conversion convention was used
- Head-finding rules matter
- The default is CoNLL-2008 conventions and CoNLL-X file format
- I tried -oldLTH and -conll2007, but neither splits tokens with slashes (different from footnote 6, page 745)
- Tried -rightBranching=false and the performance of MaltParser was low: around 80% instead of 90%
- Command: java -jar pennconverter.jar (full invocation sketched below)
- The converter raised an error on one sentence, which I skipped. I submitted a question on Stack Overflow.
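- For reference, pennconverter reads mrg trees on stdin and writes CoNLL on stdout, so the full invocation looks roughly like this (file names are placeholders):
java -jar pennconverter.jar < wsj-02-21.mrg > wsj-02-21.conll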
- Stanford Basic Dependencies
- Use Stanford parser v3.3.0 (page 745), downloaded here under the name stanford-parser-full-2013-11-12.
- Convert the Penn Treebank to Stanford Basic Dependencies using:
java -cp stanford-parser-full-2014-10-31/stanford-parser.jar edu.stanford.nlp.trees.EnglishGrammaticalStructure -basic -conllx -originalDependencies -treeFile xxx
- Measure statistics: sentences, words, POS tags, labels, percentage of projective trees (Table 3)
- TODO (a projectivity test is sketched below)
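- Counting the projective percentage needs a projectivity test; a minimal sketch (my own; tokens are 1-indexed, 0 denotes the root, heads[0] is unused):
class Projectivity {
    // heads[d] is the head of token d.
    static boolean isProjective(int[] heads) {
        for (int d = 1; d < heads.length; d++) {
            int h = heads[d];
            int lo = Math.min(h, d), hi = Math.max(h, d);
            for (int k = lo + 1; k < hi; k++) {
                // The arc (h, d) is projective iff every token strictly
                // between h and d is a descendant of h: walk from k toward
                // the root and check that the path passes through h.
                int a = k;
                while (a != 0 && a != h) a = heads[a];
                if (a != h) return false;
            }
        }
        return true;
    }
}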
- Evaluation tool:
- Downloaded MaltEval
- Should I use the CoNLL-X eval script instead? What is the difference between them?
- Stanford also provides an evaluation tool: "The package includes a tool for scoring of generic dependency parses, in a class edu.stanford.nlp.trees.DependencyScoring. This tool measures scores for dependency trees, doing F1 and labeled attachment scoring. The included usage message gives a detailed description of how to use the tool."
- Counter-intuitive observation: counting punctuation actually decreases UAS and LAS by ~3%, i.e. the parser gets punctuation wrong more often than the average token (a scoring sketch follows).
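- To make the punctuation effect reproducible, here is a sketch of UAS/LAS scoring with punctuation optionally excluded. Punctuation is identified by gold POS tag, the usual convention in this literature; the method signature and array layout are my own:
import java.util.Set;

class Scorer {
    static final Set<String> PUNCT = Set.of("``", "''", ",", ".", ":");

    // Returns {UAS, LAS} in percent; excludePunct toggles the effect above.
    static double[] score(int[] goldHead, String[] goldLabel,
                          int[] predHead, String[] predLabel,
                          String[] pos, boolean excludePunct) {
        int total = 0, uas = 0, las = 0;
        for (int i = 0; i < goldHead.length; i++) {
            if (excludePunct && PUNCT.contains(pos[i])) continue;
            total++;
            if (goldHead[i] == predHead[i]) {
                uas++;
                if (goldLabel[i].equals(predLabel[i])) las++;
            }
        }
        return new double[]{100.0 * uas / total, 100.0 * las / total};
    }
}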
- Run the Stanford neural parser on the data and measure results.
- Download stanford-parser-full-2014-10-31.zip as instructed here (example commands below)
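- The nndep page documents training and testing commands along these lines (file paths and the embedding file are placeholders; double-check the exact flags against the page):
java -cp stanford-parser-full-2014-10-31/stanford-parser.jar edu.stanford.nlp.parser.nndep.DependencyParser -trainFile train.conllx -devFile dev.conllx -embedFile embeddings.txt -model model.txt.gz
java -cp stanford-parser-full-2014-10-31/stanford-parser.jar edu.stanford.nlp.parser.nndep.DependencyParser -model model.txt.gz -testFile test.conllx -outFile test-out.conllx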
- Run off-the-shelf MaltParser and MSTParser on the dev and test sets (a tentative MaltParser invocation is below).
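- From my reading of the MaltParser user guide, learning and parsing go through a named configuration via -c and a mode via -m; treat the following as an unverified sketch (jar and file names are placeholders):
java -jar maltparser.jar -c wsj -i train.conll -m learn
java -jar maltparser.jar -c wsj -i test.conll -o test-out.conll -m parse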
- Implement oracle (a sketch appears after the next two items)
- Implement parser
- Minuscule detail in the implementation: a right child is to the right of the node of interest, and a left child is to its left.
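- For the oracle, the standard static oracle for the arc-standard system suffices; a minimal sketch (naming and data layout are mine, not the paper's):
import java.util.Deque;
import java.util.Iterator;
import java.util.Queue;

class Oracle {
    // heads[i] is the gold head of token i; s1 is the stack top (pushed via
    // Deque.push, so iteration starts at the top), s2 the element below it.
    static String next(Deque<Integer> stack, Queue<Integer> buffer, int[] heads) {
        if (stack.size() >= 2) {
            Iterator<Integer> it = stack.iterator();
            int s1 = it.next(), s2 = it.next();
            if (heads[s2] == s1) return "LEFT-ARC";
            if (heads[s1] == s2) {
                // RIGHT-ARC only once s1 has collected all its dependents,
                // since s1 becomes unreachable after being reduced.
                boolean done = true;
                for (int b : buffer) if (heads[b] == s1) done = false;
                if (done) return "RIGHT-ARC";
            }
        }
        return "SHIFT";
    }
}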
- Implement neural net
- Dropout: it isn't clear whether dropout is applied to the output of the embedding layer or to the hidden layer; from the source code, it is applied to the hidden-layer units.
- The paper implies that the learning rate was varied during training ("initial learning rate of Adagrad α = 0.01") but doesn't reveal the method (annealing/linear/etc.) or by how much.
- The paper says "A slight variation is that we compute the softmax probabilities only among the feasible transitions in practice." but the implementation actually computes all probabilities (see the sketch below).
- Note from the source code: the output layer doesn't have bias terms, which is consistent with the paper: there is no bias in the feature templates.
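- For greedy decoding only the argmax matters, so restricting the softmax to feasible transitions versus computing all probabilities changes nothing in the output; a sketch of the decode step under the two notes above (array layout is mine):
class Decoder {
    // Scores transitions as W2·h with no output bias and takes the argmax
    // over feasible transitions only; softmax normalization is irrelevant
    // to the argmax.
    static int predict(double[] hidden, double[][] w2, boolean[] feasible) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int t = 0; t < w2.length; t++) {
            if (!feasible[t]) continue;          // skip illegal transitions
            double score = 0.0;                  // no bias term
            for (int j = 0; j < hidden.length; j++)
                score += w2[t][j] * hidden[j];
            if (score > bestScore) { bestScore = score; best = t; }
        }
        return best;
    }
}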
References
[1] Chen, D., & Manning, C. (2014). A Fast and Accurate Dependency Parser using Neural Networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 740–750). Doha, Qatar: Association for Computational Linguistics.