IMPLEMENTATION OF BERT BASED MACHINE LEARNING MODEL TO EXTRACT CANCER –MIRNA RELATIONSHIP FROM RESEARCH LITERATURE

ECU Author/Contributor (non-ECU co-authors, if there are any, appear on document)
Arunprasad Sundharam (Creator)
Institution
East Carolina University (ECU)
Web Site: http://www.ecu.edu/lib/

Abstract: Text mining, the extraction of useful information from large collections of text, is a popular and growing branch of information technology. Thousands of medical research papers study how microRNAs (miRNAs) can assist or impede the development of various types of cancer. miRCancer is a repository that records these cancer-miRNA associations, obtained by analyzing more than 6,500 research papers with text mining techniques. It would be helpful to have a machine learning model that can analyze the title and abstract of a research paper and extract the cancer-miRNA association when one is present in the text. In this thesis, we propose such a model built on BERT, the open-source NLP framework provided by Google, to identify the cancer-miRNA relationship in a given abstract. BERT is a deep learning model that is pretrained on a Wikipedia text corpus and has built-in knowledge of English language usage. As part of this work, we designed and implemented a machine learning model using the BERT framework and prepared the dataset required to train it to identify cancer-miRNA relationships in text. The model achieved an overall accuracy of 90.3% in retrieving the required information from the research papers in the test dataset, and it can therefore be leveraged to review the results of the existing miRCancer text mining implementation.
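
The abstract does not describe how the dataset was built or how accuracy was computed, so the following is only a minimal sketch of one plausible setup, not the thesis implementation. It assumes that each paper's title and abstract are joined into a single classifier input, that miRNA mentions can be located with a simple (hypothetical) regular expression, and that the labels are "up", "down", or "none". The names `MIRNA_PATTERN`, `Example`, `build_examples`, and `accuracy` are invented for illustration, and the sample record is made up.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for miRNA mentions such as "miR-21" or "hsa-miR-155-5p".
# Real miRNA nomenclature is richer (e.g. the let-7 family); illustration only.
MIRNA_PATTERN = re.compile(
    r"\b(?:[a-z]{3}-)?miR-\d+[a-z]?(?:-[35]p)?\b", re.IGNORECASE
)


@dataclass
class Example:
    text: str   # title and abstract joined into one classifier input
    mirna: str  # the miRNA mention this example is about
    label: str  # assumed label set: "up", "down", or "none"


def build_examples(title, abstract, annotations):
    """Create one example per distinct miRNA mentioned in the title or abstract.

    `annotations` maps a miRNA name to its curated label; mentions without
    an annotation receive the label "none".
    """
    text = f"{title.strip()} {abstract.strip()}"
    mentions = sorted({m.group(0) for m in MIRNA_PATTERN.finditer(text)})
    return [Example(text, m, annotations.get(m, "none")) for m in mentions]


def accuracy(predicted, expected):
    """Fraction of predicted labels that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


if __name__ == "__main__":
    # Invented toy record, not taken from the thesis dataset.
    examples = build_examples(
        "miR-21 promotes tumor growth in an invented example",
        "We observed that miR-21 was upregulated; miR-145 is mentioned only in passing.",
        {"miR-21": "up"},
    )
    for ex in examples:
        print(ex.mirna, ex.label)
```

In a setup like this, each `Example.text` would be passed to a fine-tuned BERT classifier, and `accuracy` would be applied to its predictions on a held-out test set.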

Additional Information

Publication
Thesis
Language: English
Date: 2023
Subjects
BERT; Biological text mining

This item references:

Title: IMPLEMENTATION OF BERT BASED MACHINE LEARNING MODEL TO EXTRACT CANCER –MIRNA RELATIONSHIP FROM RESEARCH LITERATURE
Location & Link: http://hdl.handle.net/10342/9120
Type of Relationship: The described resource references, cites, or otherwise points to the related resource.