New Approach for Risk Estimation Algorithms of BRCA1/2 Negativeness Detection with Modelling Supervised Machine Learning Techniques

YAZICI, Hülya; Odemis, Demet; AKSU, Doğukan; Erdogan, ÖZGE; TUNÇER, ŞEREF; Avsar, Mukaddes; Kilic, SEDA; Turkcan, Gozde; ÇELİK, BETÜL; AYDIN, Muhammed

doi:10.1155/2020/8594090

YAZICI H., Odemis D., AKSU D., Erdogan Ö., TUNÇER Ş. B., Avsar M., et al.

DISEASE MARKERS, vol. 2020, 2020 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 2020
Publication Date: 2020
DOI: 10.1155/2020/8594090
Journal Name: DISEASE MARKERS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Agricultural & Environmental Science Database, BIOSIS, Biotechnology Research Abstracts, CAB Abstracts, EMBASE, MEDLINE, Directory of Open Access Journals
Open Archive Collection: AVESİS Open Access Collection
Affiliated with Istanbul University: Yes

Abstract

BRCA1/2 gene testing is difficult, expensive, and time-consuming, and it imposes a heavy workload. Identifying BRCA1/2 gene mutations is important for treatment selection and for assessing the risk of secondary cancer. In the present study, we aimed to develop an algorithm that considers all clinical, demographic, and genetic features of patients in order to identify BRCA1/2 negativity. An experimental dataset was created by collecting all clinical, demographic, and genetic features of breast cancer patients over 20 years. This dataset consisted of 125 features of 2070 high-risk breast cancer patients. All data were converted to numerical form and normalized for detection of BRCA1/2 negativity by the machine learning algorithm. The performance of the algorithm was assessed by evaluating the machine learning model on the test data. The k-nearest neighbours (KNN) and decision tree (DT) accuracy rates obtained with Dataset 2, which involved 9 features, were found to be the most effective. Removing unnecessary data from the dataset by reducing the number of features was shown to increase the accuracy rate of the algorithm compared with the DT. With this algorithm, BRCA1/2 negativity was identified in high-risk breast cancer patients with 92.88% accuracy within minutes and without performing the BRCA1/2 gene test, avoiding the stress of waiting for test results as well as the associated loss of time and money. The algorithm is suggested to be useful for carrying out patients' treatment plans quickly and accurately, in addition to speeding up clinical practice.
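
The workflow summarized in the abstract (numerical encoding, normalization, training a KNN classifier, and evaluating it on held-out test data with a full and a reduced feature set) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, and the function names, the three toy features, the 80/20 split, and `k=3` are all assumptions chosen for illustration only.

```python
import math
import random
from collections import Counter


def fit_min_max(rows):
    """Return per-column (lows, highs) computed from the given rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(rows, lows, highs):
    """Scale each column with the given bounds; constant columns map to 0.0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k nearest training rows (Euclidean distance)."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_x, train_y, x, k) == y
        for x, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


def make_toy_data(n=200, seed=0):
    """Synthetic stand-in data (assumption): two informative features, one noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = hypothetical "negative" class
        rows.append([
            rng.gauss(40 + 15 * label, 4),   # informative feature
            rng.gauss(1 + 3 * label, 0.7),   # informative feature
            rng.uniform(0, 1000),            # uninformative feature
        ])
        labels.append(label)
    return rows, labels


def select(rows, keep):
    """Keep only the listed column indices."""
    return [[row[i] for i in keep] for row in rows]


rows, labels = make_toy_data()
split = int(0.8 * len(rows))
train_y, test_y = labels[:split], labels[split:]

# Fit the scaling on training rows only, then apply it to both partitions.
lows, highs = fit_min_max(rows[:split])
train_x = apply_min_max(rows[:split], lows, highs)
test_x = apply_min_max(rows[split:], lows, highs)

acc_all = accuracy(train_x, train_y, test_x, test_y)
# Drop the uninformative column to mimic reducing the feature set.
acc_sel = accuracy(select(train_x, [0, 1]), train_y,
                   select(test_x, [0, 1]), test_y)
print(f"all features: {acc_all:.3f}, selected features: {acc_sel:.3f}")
```

Comparing the two printed accuracies shows how a reduced feature set can be evaluated against the full one; on real clinical data the outcome depends on which features are kept and how they are chosen.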