Prediction of Ovarian Cancer with Deep Machine Learning and Alternative Splicing

Katharine Linder, MD; Rachel Watson, MD; Keely Ulmer, MD; David Bender, MD; Michael J Goodheart, MD; Eric Devor, PhD; Jesus Gonzalez Bosquet, MD, PhD

Abstract

Objective: Early detection of ovarian cancer could improve survival rates; however, no method has been able to predict ovarian cancer reliably. This study aims to determine whether processing alternative splicing data from high grade serous ovarian cancer patients with machine learning analytics can discriminate high grade serous ovarian cancer from normal fallopian tube samples. The ultimate goal is a model that can predict high grade serous ovarian cancer from a blood test.


Methods: This is a case-control study of patients with confirmed high grade serous ovarian cancer and patients undergoing salpingectomy for benign indications. RNA sequencing was performed on all samples. The RNA-sequencing data were then processed with the Deep-learning augmented RNA-seq analysis of transcript splicing (DARTS) software suite, which created a model of differential alternative splicing aimed at discriminating between high grade serous ovarian cancer and normal fallopian tube. DEXSeq analysis was used to determine exon-based expression. Initial results from both analytics were then modelled with multivariate lasso regression to create prediction models, with performance measured by area under the curve and its 95% confidence interval (CI). The resulting models were then validated using The Cancer Genome Atlas data sets.
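
The lasso modelling step can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a binary outcome (cancer versus benign), uses synthetic exon-expression values in place of real RNA-sequencing output, and fits an L1-penalized logistic regression by proximal gradient descent in plain Python. The function names, the penalty value `lam`, the step size, and the synthetic data generator are all hypothetical choices made for illustration.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero; this is the L1 (lasso) proximal step."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_lasso_logistic(X, y, lam=0.05, step=0.1, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent.

    lam, step and n_iter are illustrative values, not tuned ones.
    The intercept is left unpenalized.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        # Gradient step on the log-loss, then soft-thresholding for the L1 penalty.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def make_synthetic(n_cases=112, n_controls=12, n_exons=10, seed=0):
    """Synthetic data (assumption): exon 0 is informative, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, count in ((1, n_cases), (0, n_controls)):
        for _ in range(count):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_exons)]
            row[0] += 3.0 if label == 1 else -3.0
            X.append(row)
            y.append(label)
    return X, y


X, y = make_synthetic()
weights, intercept = fit_lasso_logistic(X, y)
# Exons whose coefficients survived the penalty.
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected exon indices:", selected)
```

The L1 penalty is what makes lasso a feature selector: coefficients of uninformative exons are driven to exactly zero, so the surviving indices in `selected` form the candidate predictor set. In practice the penalty strength would be chosen by cross-validation rather than fixed by hand.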


Results: One hundred and twelve high grade serous ovarian cancer samples and 12 benign samples were successfully sequenced. Deep-learning augmented RNA-seq analysis of transcript splicing identified 998 unique differentially expressed exons between high grade serous ovarian cancer and controls. Multivariate lasso regression analysis identified several exons that predicted high grade serous ovarian cancer with high performance. Specifically, ENSG00000182512:E001 from gene GLRX5 was highly predictive of high grade serous ovarian cancer, with an area under the curve of 100%.
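
To make the performance metric concrete, the following sketch computes an area under the curve and a 95% CI for a single score. It rests on assumptions: the abstract does not say how the interval was obtained, so a stratified percentile bootstrap is used here purely for illustration, and the scores are synthetic stand-ins for an exon's expression values. The function names and parameters are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of case/control pairs in which the case scores
    higher than the control (ties count as half)."""
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in).

    Resampling is stratified by class so every replicate keeps both
    cases and controls.
    """
    rng = random.Random(seed)
    case_idx = [i for i, l in enumerate(labels) if l == 1]
    ctrl_idx = [i for i, l in enumerate(labels) if l == 0]
    stats = []
    for _ in range(n_boot):
        idx = ([rng.choice(case_idx) for _ in case_idx]
               + [rng.choice(ctrl_idx) for _ in ctrl_idx])
        stats.append(auc([scores[i] for i in idx], [labels[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic scores (assumption): 112 cases and 12 controls, as in the cohort sizes above.
rng = random.Random(1)
labels = [1] * 112 + [0] * 12

# A score that separates the groups completely: every case exceeds every control.
perfect = [2.0 + 0.01 * i for i in range(112)] + [1.0 - 0.01 * i for i in range(12)]
perfect_auc = auc(perfect, labels)

# A score with overlapping distributions, for contrast.
overlap = ([rng.gauss(1.0, 1.0) for _ in range(112)]
           + [rng.gauss(0.0, 1.0) for _ in range(12)])
overlap_auc = auc(overlap, labels)
overlap_ci = bootstrap_ci(overlap, labels)

print("perfect separation AUC:", perfect_auc)
print("overlapping AUC and 95% CI:", overlap_auc, overlap_ci)
```

An AUC of 1.0 (100%) means that every case sample has a higher score than every control sample. The overlapping example shows the more common situation in which the estimate carries a visible interval.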


Conclusions: Applying machine learning analytics to differential exon expression, most likely due to alternative splicing, predicted high grade serous ovarian cancer with high performance. These results were validated in an independent dataset of cases and controls. Differential exon expression from cell-free RNA could potentially be used for early diagnosis of high grade serous ovarian cancer.

Keywords: ovarian cancer, alternative splicing, machine learning

Article Details

How to Cite
LINDER, Katharine et al. Prediction of Ovarian Cancer with Deep Machine Learning and Alternative Splicing. Medical Research Archives, [S.l.], v. 11, n. 11, nov. 2023. ISSN 2375-1924. Available at: <https://esmed.org/MRA/mra/article/view/4602>. Date accessed: 16 may 2024. doi: https://doi.org/10.18103/mra.v11i11.4602.
Section
Research Articles
