Building your own corpus – TagAnt

Laurence Anthony does it again, bringing difficult-to-set-up programs to the masses with a #tagger called #TagAnt (http://www.antlab.sci.waseda.ac.jp/software.html).

TagAnt is based on TreeTagger (http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/); follow the link and you will see how involved TreeTagger is to set up!

I ran TagAnt on my multimedia #corpus then used the tagged corpus in AntConc.

Before working with a tagged corpus in AntConc, make sure to check all the boxes under Global Settings > Token Definitions, as in Screenshot1.

Also, here is a link to a list of all the tags that TreeTagger uses:

http://courses.washington.edu/hypertxt/csar-v02/penntable.html

Then use the #Clusters tool in AntConc (a tip gleaned from reading the AntConc Google group, https://groups.google.com/forum/#!forum/antconc).

For example, Screenshot2 shows a search for verb + noun (inspect Screenshot2 for the exact search term). Note that I have set the cluster size to 2.

This shows me that verb + support is quite common:

add support

adds support

adding support

include support

including support

bringing support

drop support

introduces support
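
If you want to double-check a cluster result like this outside AntConc, here is a rough Python sketch of the same idea. It is not what AntConc does internally. It assumes the tagged file is plain text with whitespace-separated word_TAG tokens, that verb tags start with V, and that noun tags start with NN; check your own TagAnt output and the tag list before relying on it. The function name and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

# Assumption: each token looks like word_TAG (e.g. adds_VVZ, support_NN).
TOKEN = re.compile(r"^(?P<word>.+)_(?P<tag>[A-Z$]+)$")


def verb_noun_pairs(tagged_text):
    """Count adjacent (verb, noun) pairs in a word_TAG tagged string."""
    tokens = []
    for raw in tagged_text.split():
        match = TOKEN.match(raw)
        if match:
            tokens.append((match.group("word").lower(), match.group("tag")))
        else:
            # Keep untagged tokens in place so adjacency is not distorted.
            tokens.append((raw.lower(), ""))

    pairs = Counter()
    # Walk over neighbouring tokens, i.e. clusters of size 2.
    for (w1, t1), (w2, t2) in zip(tokens, tokens[1:]):
        # Assumption: verb tags start with "V", noun tags with "NN".
        if t1.startswith("V") and t2.startswith("NN"):
            pairs[(w1, w2)] += 1
    return pairs


# Invented sample text, not taken from my corpus.
sample = (
    "This_DT release_NN adds_VVZ support_NN for_IN tags_NNS "
    "and_CC drops_VVZ support_NN for_IN plugins_NNS"
)
counts = verb_noun_pairs(sample)

# Keep only the pairs where the noun is "support".
support = {verb: n for (verb, noun), n in counts.items() if noun == "support"}
print(support)
```

To run it on a real file, read the tagged text with open(...).read() and pass it to verb_noun_pairs; counts.most_common(20) then gives the most frequent verb + noun pairs.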

I did not notice this when using the non-tagged corpus. Although support was in the top 60 of the wordlist, it would have taken me longer to discern interesting patterns.

Happy Tagging!

P.S. This continues the DIY corpus series I started on my blog, which you can read here if you are interested: http://eflnotes.wordpress.com/tag/build-your-own-corpus/
