Abstract
Medical artificial intelligence (MAI) is artificial intelligence (AI) applied to healthcare. AI can be applied to many aspects of genetics, including variant classification. Having little or no prior experience in AI coding, we share our experience of variant classification using Variant Artificial Intelligence Easy Scoring (VARIES), an open-access platform, and the Automatic Machine Learning (AutoML) of the Google Cloud Platform.
We investigated exome sequencing data from a sample of 1410 individuals. The majority (80%) were used for training and the remaining 20% for testing. The VARIES model was created with the user-friendly Google Cloud Platform, and the TRIPOD checklist was followed in developing and validating the prediction model.
Training reached optimal results at an early stage of iteration, with a loss value near zero in approximately 4 min. For the testing dataset, the micro-average F1 was 0.64, the macro-average F1 was 0.34, the micro-average area under the curve (AUC; one-vs-rest) was 0.81, and the macro-average AUC (one-vs-rest) was 0.73. The overall performance characteristics of the VARIES model suggest that the classifier has a high predictive ability.
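The gap between the micro- and macro-average F1 reported above is typical of imbalanced multiclass data, where a few frequent classes dominate the pooled counts. The following is a minimal sketch of how the two averages are computed, not the VARIES or AutoML implementation. The function name `f1_scores` and the toy labels are hypothetical and are not drawn from the study data.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro average: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro average: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical toy labels (illustrative only): one frequent class, three rare ones.
y_true = ["B"] * 6 + ["LB", "LB", "VUS", "P"]
y_pred = ["B"] * 6 + ["B", "LB", "B", "B"]

micro, macro = f1_scores(y_true, y_pred)
print(f"micro F1 = {micro:.2f}, macro F1 = {macro:.2f}")
```

On this toy data the micro-average F1 is 0.70 and the macro-average F1 is about 0.37, because the two rare classes that are never predicted correctly each contribute an F1 of zero to the unweighted mean.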
We present a systematic guideline for creating a genomic AI prediction tool with high predictive power using the graphical user interface provided by Google Cloud Platform, without prior experience in writing the software programs required.
• The VARIES system enables variant classification with minimal or no prior expertise in AI, using automatic machine learning (AutoML).
• VARIES could reduce the number of benign (B) and likely benign (LB) variants that require manual curation by approximately 94% and 16%, respectively.
• VARIES can build different prediction models from the same dataset, for example a model that predicts family information, HPO, or even family relatedness.