In the EMNLP paper we presented several experiments to test the performance of our method. This page provides the data, models and instructions needed to replicate those experiments. All the required material is distributed as a zip archive.
The software was built and tested on Ubuntu 10.04 and Ubuntu 12.04. On other systems, you may have to adapt the instructions below slightly. To run them, you definitely need:
Three datasets are employed in the experiments: the Groningen Meaning Bank for English, the PAISÀ corpus for Italian and the Twente News Corpus (TwNC) for Dutch. The first two corpora are freely redistributable, so we include their data in IOB format in the data directory of the experiments archive. The TwNC must be obtained independently.
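As a quick way to inspect the IOB data, the following sketch counts label frequencies. It assumes a simple two-column "character TAB label" layout with blank lines between units and the labels I, T, O and S that appear in the evaluation output below; the actual file layout in the archive may differ.

```python
from collections import Counter

def label_counts(lines):
    """Count IOB labels, skipping blank separator lines.

    Assumes each non-blank line ends in a tab-separated label;
    this layout is an assumption, not taken from the archive.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line:
            counts[line.split("\t")[-1]] += 1
    return counts

# Tiny illustrative input: one sentence start, one token-internal
# character, one whitespace character, one token start.
sample = "H\tS\ni\tI\n \tO\n!\tT\n".splitlines()
print(sorted(label_counts(sample).items()))
# [('I', 1), ('O', 1), ('S', 1), ('T', 1)]
```

For a real file you would pass `open("data/dutch.iob", encoding="utf-8")` instead of the sample lines.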
Assuming that TwNC-0.2 has been downloaded and unpacked into the directory data/ext/TwNC-0.2, and that Go is installed on your system (see requirements above), the following commands extract and convert the part of the corpus used in the experiments. The commands must be issued in the root directory of the extracted experiments archive.
make generated/twnc/database/2000/20000112/ad20000112.iob
make generated/twnc/database/2000/20000112/dd20000112.iob
make generated/twnc/database/2000/20000112/gra20000112.iob
make generated/twnc/database/2000/20000112/nrc20000112.iob
make generated/twnc/database/2000/20000112/parool20000112.iob
make generated/twnc/database/2000/20000112/trouw20000112.iob
make generated/twnc/database/2000/20000112/volkskrant20000112.iob
make generated/twnc/database/2000/20000122/ad20000122.iob
make generated/twnc/database/2000/20000122/dd20000122.iob
make generated/twnc/database/2000/20000122/nrc20000122.iob
make generated/twnc/database/2000/20000122/parool20000122.iob
make generated/twnc/database/2000/20000122/trouw20000122.iob
make generated/twnc/database/2000/20000122/vnl20000122.iob
make generated/twnc/database/2000/20000122/volkskrant20000122.iob
for file in generated/twnc/database/2000/200001*2/*.iob; do cat $file; echo; done > data/dutch.iob
./src/scripts/seqsplit.py data/dutch.iob
To get the results published in Table 2 (Error rates obtained with different feature sets) and Table 3 (Using different context window sizes), a GNU Make makefile is provided. The name of the target should have the format generated/${DATASET}.${SPLIT}.${FEATURESET}${WINDOWSIZE}.eval, so for instance the command
$ make generated/dutch.dev.codecat9.eval
produces the results for the dev subset of the Dutch dataset, using both Unicode character codes and categories as features, with a window size of 9. The content of the file contains information about the errors:
$ cat generated/dutch.dev.codecat9.eval
Annotated units: 489291
Errors: 774
Error rate: 0.001582
        I      T      O     S
I  328099    498      0     2
T     237  80856      0    20
O       0      0  75234     0
S       3     14      0  4328
    fp   fn      tp  prec            rec             f1
I  240  500  328099  0.999269048148  0.998478388553  0.998873561889
T  512  257   80856  0.993707600039  0.996831580634  0.995267138927
O    0    0   75234  1.0             1.0             1.0
S   22   17    4328  0.994942528736  0.996087456847  0.9955146636
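The figures in the .eval file follow directly from the raw counts it reports; a short sketch recomputing the overall error rate and the per-class metrics for class I shows the definitions used:

```python
def metrics(tp, fp, fn):
    """Standard precision, recall and F1 from true/false positive
    and false negative counts."""
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)
    return prec, rec, f1

# Overall error rate: errors / annotated units.
annotated, errors = 489291, 774
print(round(errors / annotated, 6))  # 0.001582

# Class I: tp, fp, fn taken from the table above.
prec, rec, f1 = metrics(tp=328099, fp=240, fn=500)
print(round(prec, 6), round(rec, 6), round(f1, 6))
# 0.999269 0.998478 0.998874
```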
To get the results published in Table 4 (Results obtained using different context window sizes and addition of SRN features), you first need to build the tools elman and wapiti, and make sure the required programs and scripts are on your PATH, like this:
$ make bin/elman bin/wapiti
$ export PATH=`pwd`/bin:`pwd`/src/scripts-srn:$PATH
Then change into the experiments-srn subdirectory:
$ cd experiments-srn
Here, another makefile is provided. The name of the target should have the format ${DATASET}/1.0/${FEATURESET}${WINDOWSIZE}-top10/${SPLIT}.eval so for instance the command
$ make dutch/1.0/codecat9-top10/dev.eval
produces the results for the dev subset of the Dutch dataset, using both Unicode character codes and categories as features, with a window size of 9. The content of the file contains information about the errors:
$ cat dutch/1.0/codecat9-top10/dev.eval
processed 489291 tokens with 414057 phrases; found: 414057 phrases; correct: 413922.
accuracy:  99.97%; precision:  99.97%; recall:  99.97%; FB1:  99.97
             I: precision:  99.99%; recall:  99.98%; FB1:  99.98  328560
             S: precision:  99.77%; recall:  99.70%; FB1:  99.74  4342
             T: precision:  99.89%; recall:  99.94%; FB1:  99.92  81155
135 489291 0.000276
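Since the targets follow the ${DATASET}/1.0/${FEATURESET}${WINDOWSIZE}-top10/${SPLIT}.eval scheme, a sweep over configurations can be scripted. The sketch below only prints the make commands rather than running them; the window sizes other than 9 are assumptions (only codecat9 is confirmed by the text above).

```shell
# Print candidate make targets for a sweep over window sizes and
# splits. Window sizes 1, 3 and 5 are assumed, not confirmed.
for w in 1 3 5 9; do
  for split in dev test; do
    echo "make dutch/1.0/codecat${w}-top10/${split}.eval"
  done
done
```

Piping the output into `sh` would run the sweep once the expected feature files are in place.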