Description: The Perl scripts included in the BootCaT toolkit implement an
iterative procedure to bootstrap specialized corpora and terms from
the web, requiring only a list of "seeds" (terms that are expected
to be typical of the domain of interest) as input.
In implementing the algorithm, we followed the old UNIX adage that
each program should do only one thing, but do it well. Thus, we
developed a small, independent tool for each separate subtask of the
algorithm.
As a result, BootCaT is extremely modular: One can easily run a subset
of the programs, look at intermediate output files, add new tools to
the suite, or change one program without having to worry about the
others.
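
The iterative, one-tool-per-subtask design described above can be illustrated with a minimal sketch. This is not the toolkit's Perl code: it is written in Python, every function name and parameter here (make_queries, retrieve, extract_terms, bootstrap, tuple_size, top_n) is hypothetical, the "web" is replaced by an in-memory list of documents, and term extraction is reduced to simple frequency counting, which is an assumption made only to keep the example self-contained.

```python
import itertools
import random
from collections import Counter


def make_queries(seeds, tuple_size=2, n_queries=3, rng=None):
    """Combine seeds into tuples; each tuple stands for one search query."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    combos = list(itertools.combinations(sorted(seeds), tuple_size))
    rng.shuffle(combos)
    return combos[:n_queries]


def retrieve(queries, documents):
    """Stand-in for web search: keep documents matching all words of any query."""
    hits = []
    for doc in documents:
        words = set(doc.lower().split())
        if any(all(term in words for term in query) for query in queries):
            hits.append(doc)
    return hits


def extract_terms(corpus, known, top_n=3):
    """Assumed heuristic: the most frequent longer words not already seeds."""
    counts = Counter(
        word
        for doc in corpus
        for word in doc.lower().split()
        if word not in known and len(word) > 3
    )
    return [word for word, _ in counts.most_common(top_n)]


def bootstrap(seeds, documents, iterations=2):
    """Chain the independent steps; extracted terms become seeds for the next round."""
    seeds = set(seeds)
    corpus = []
    for _ in range(iterations):
        queries = make_queries(seeds)
        for doc in retrieve(queries, documents):
            if doc not in corpus:
                corpus.append(doc)
        seeds |= set(extract_terms(corpus, seeds))
    return corpus, seeds
```

Because each step is a separate function with plain inputs and outputs, any one of them can be run alone, inspected, or swapped out, which mirrors the modularity the description claims for the toolkit. For example, bootstrap({"corpus", "web"}, documents) can be called with any list of strings as documents, and the intermediate result of make_queries can be examined on its own.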
File list (Check if you may need any files):
11912894BootCaT-0.1.2.tar