I. Purpose
The goal is to efficiently extract useful information about the relations between Biomedical Named Entities (BNEs) from the biomedical literature, using not only existing text mining methods but also various data mining methods, and to visualize the extracted information.
Ⅱ. Contents
A. Biomedical literature data preprocessing
Extract literature data from the web database PubMed and perform parsing and POS tagging.
B. Design and develop a biomedical Named Entity Recognition (NER) model
· Design and develop dictionary-based NER: recognize entities using a dictionary of protein/gene names and symbols extracted from protein/gene databases.
· Design and develop rule-based NER: extract BNE candidates based on POS combination rules generated using a Bayesian network based finite state machine and a frequent pattern mining method.
· Design and develop data mining based NER: extract features from the entity candidates and predict the types of the entities using an incremental data mining method.
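The dictionary-based step can be illustrated with a minimal sketch. It assumes a greedy longest-match lookup over lowercased tokens, which is one common way to do dictionary matching and not necessarily the method used in this work. The dictionary entries, the entity type labels, and the function names are hypothetical.

```python
import re

# Hypothetical mini-dictionary; a real one would be built from
# protein/gene databases, as described above.
GENE_DICTIONARY = {
    ("p53",): "GENE",
    ("brca1",): "GENE",
    ("tumor", "necrosis", "factor"): "PROTEIN",
}
MAX_LEN = max(len(key) for key in GENE_DICTIONARY)


def tokenize(text):
    """Split text into word tokens, keeping internal hyphens (e.g. IL-2)."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)


def dictionary_ner(text):
    """Return (surface form, type) pairs found by greedy longest-match lookup."""
    tokens = tokenize(text)
    entities = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest span first so multi-word names win over their parts.
        for length in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + length])
            if key in GENE_DICTIONARY:
                match = (" ".join(tokens[i:i + length]), GENE_DICTIONARY[key])
                i += length
                break
        if match:
            entities.append(match)
        else:
            i += 1
    return entities


print(dictionary_ner("BRCA1 interacts with p53 and tumor necrosis factor."))
```

Exact lookup of this kind misses spelling variants and unseen names, which is the gap the rule-based and data mining based recognizers are meant to cover.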
C. Design and develop a BNE interaction pattern extraction model
· Select the sentences that include more than two BNEs.
· Search for protein/gene interaction words.
· Map the interacting BNEs to their normalized gene names.
· Extract protein/gene interaction patterns using the parse tree.
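The first three steps of this list can be sketched as a simple sentence filter. This is only an illustration under stated assumptions: the normalization table, the interaction word list, and the function name are hypothetical, entity mentions are found by plain token lookup, and the parse-tree step is not covered. The default of three entities reflects the "more than two BNEs" criterion.

```python
import re

# Hypothetical lookup tables, for illustration only.
NORMALIZED_NAMES = {"p53": "TP53", "mdm2": "MDM2", "brca1": "BRCA1"}
INTERACTION_WORDS = {"binds", "inhibits", "activates", "phosphorylates"}


def find_interaction_candidates(text, min_entities=3):
    """Return one record per sentence with enough BNEs and an interaction word."""
    records = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = re.findall(r"[A-Za-z0-9]+", sentence)
        # Map each recognized mention to its normalized gene name, keeping order.
        genes = []
        for token in tokens:
            name = NORMALIZED_NAMES.get(token.lower())
            if name and name not in genes:
                genes.append(name)
        words = [t.lower() for t in tokens if t.lower() in INTERACTION_WORDS]
        if len(genes) >= min_entities and words:
            records.append({"sentence": sentence, "genes": genes, "words": words})
    return records


sample = "MDM2 binds p53 and inhibits BRCA1. p53 is a tumor suppressor."
for record in find_interaction_candidates(sample):
    print(record["genes"], record["words"])
```

The records produced this way only say that entities and interaction words co-occur in a sentence; deciding which entity acts on which is left to the parse-tree based pattern extraction.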
D. Construct the BNE interaction network and implement an interaction pattern visualization program
· Extract co-expression patterns from gene expression data.
· Implement a BNE interaction pattern visualization program.
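As a rough illustration of the co-expression step, the sketch below links two genes when their expression profiles are strongly correlated. The use of Pearson correlation, the 0.9 threshold, the gene names, and the expression values are all assumptions made for the example; the source does not state which co-expression measure is used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene_a, gene_b, r) for every pair with |r| >= threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Made-up expression values across four samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 1.0, 3.0, 2.0],
}
print(coexpression_edges(expression))
```

The resulting edge list is the kind of input a network visualization program could draw, with genes as nodes and correlated pairs as links.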
Ⅲ. Results
A. Development of a biomedical NER algorithm and a biomedical NER program.
B. Development of a protein/gene interaction pattern extraction algorithm and program.
C. Verification that the developed biomedical NER program performs better than the existing biomedical NER programs presented at the conference.
D. Construction of a database containing basic protein/gene information, protein/gene interaction patterns, and biomedical literature information.