Description: The project’s goal is to provide an
application that produces a brief listing, in XML
format, for a set of books. People can browse this
listing to decide which book they want, or to check
whether the set contains books of a given genre.
The application should therefore provide at least
the Title, Author, Language, Release Date and Genre
fields. To provide this information, the
application should fetch the test files and
training files, process them to find the desired
content, and store only the extracted content in
the output file (books.xml). The extracted content
should help people understand what the books are
about. One important fact the application should
provide is the genre, because people may want to
search only a certain category of books.
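
As an illustration of the output step, here is a minimal sketch in Python of
writing extracted fields to books.xml. The element names (books, book,
ReleaseDate), the helper names build_books_xml and write_books_xml, and the
dictionary-based record layout are assumptions made for this example; the
description only fixes the five fields and the output file name.

```python
import xml.etree.ElementTree as ET

# The five fields named in the description. The exact element names are
# an assumption (for example "ReleaseDate" without a space).
FIELDS = ["Title", "Author", "Language", "ReleaseDate", "Genre"]


def build_books_xml(records):
    """Build an ElementTree with one <book> element per extracted record.

    `records` is a list of dicts keyed by field name (hypothetical layout).
    Fields missing from a record are written as empty elements.
    """
    root = ET.Element("books")
    for record in records:
        book = ET.SubElement(root, "book")
        for field in FIELDS:
            child = ET.SubElement(book, field)
            child.text = record.get(field, "")
    return ET.ElementTree(root)


def write_books_xml(records, path="books.xml"):
    """Write only the extracted content to the output file."""
    tree = build_books_xml(records)
    tree.write(path, encoding="utf-8", xml_declaration=True)
```

Calling write_books_xml with a list of records would produce one book entry
per record, so a reader can scan the Genre element to find a category.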
To implement the tasks above, the application first
tokenizes the books (the test and training XML
files) to build a representation of each document,
which is then used for extracting facts and
deciding the classification.
This step should be careful to
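
The tokenization step could be sketched as follows in Python. This is only one
possible approach: the description does not say how tokens are defined or how
the document is represented, so the lowercase word tokens, the regex-based
tag stripping, and the bag-of-words counts below are all assumptions, as are
the names tokenize and bag_of_words.

```python
import re
from collections import Counter

# Assumed token definition: runs of letters or digits, optionally followed
# by an apostrophe suffix such as "'s".
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")

# Crude tag stripper (an assumption); a real XML parser would be more
# robust for malformed or nested markup.
TAG_RE = re.compile(r"<[^>]+>")


def tokenize(text):
    """Remove markup, lowercase the text, and return the list of tokens."""
    plain = TAG_RE.sub(" ", text)
    return TOKEN_RE.findall(plain.lower())


def bag_of_words(text):
    """Represent a document as token counts (one assumed representation)."""
    return Counter(tokenize(text))
```

The same counts could feed both fact extraction and genre classification,
since training and test files would share one representation.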
To Search:
File list (Check if you may need any files):
bsquaredold
bsquared.pl