Description: The growth of blog text on the internet has brought new challenges to Chinese text classification. To address the lack of semantic information in traditional Chinese text classification methods, this paper implements a method that classifies a blog as joy, angry, sad or fear using a simple unsupervised learning algorithm. The class of a blog text is predicted from the maximum semantic orientation (SO) of the phrases in the text that contain adjectives or adverbs. The SO of a phrase is calculated as the mutual information between the phrase and the polar words, and the SO of the blog text is then determined by the maximum mutual information value. For example, a blog text is classified as joy if the SO of its phrases is joy. Two corpora are used to test the method: the Blog corpus collected by the Monitor and Research Center for National Language Resource Network Multimedia Sub-branch Center, and the Chinese dataset provided by COAE2
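The scoring step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: it assumes one polar word per emotion class, uses pointwise mutual information computed from raw co-occurrence counts, and all names (`pmi`, `classify`) and all counts are hypothetical toy values.

```python
import math

def pmi(joint, count_a, count_b, total):
    """Pointwise mutual information log2(p(a,b) / (p(a) * p(b))) from raw counts."""
    if joint == 0 or count_a == 0 or count_b == 0:
        return float("-inf")  # no evidence of association
    return math.log2(joint * total / (count_a * count_b))

def classify(phrases, phrase_counts, polar_counts, joint_counts, total):
    """Return (label, score) for the phrase/polar-word pair with the highest PMI."""
    best_label, best_score = None, float("-inf")
    for phrase in phrases:
        for label, polar_count in polar_counts.items():
            score = pmi(
                joint_counts.get((phrase, label), 0),
                phrase_counts.get(phrase, 0),
                polar_count,
                total,
            )
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score

# Toy counts (assumed, not taken from the paper or its corpora).
total = 1000
polar_counts = {"joy": 100, "angry": 100, "sad": 100, "fear": 100}
phrase_counts = {"very happy": 20, "quite late": 50}
joint_counts = {
    ("very happy", "joy"): 16,
    ("very happy", "sad"): 1,
    ("quite late", "angry"): 10,
}

label, score = classify(
    ["very happy", "quite late"], phrase_counts, polar_counts, joint_counts, total
)
print(label, score)
```

In this sketch the blog text takes the label of the single strongest phrase and polar word association, which mirrors the "maximum mutual information value" rule in the description. How the phrases are extracted and where the counts come from are left open here.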
File list (Check if you may need any files):
PMl-IR.pdf