OBJECTIVE:
To assess the performance of ensemble prediction models combining embryologists’ weighted predictions with an AI tool’s (VIOLET, Future Fertility) prediction of blastocyst development from images of MII oocytes.
METHODS:
VIOLET analyzed 300 static MII oocyte images to predict blastocyst development. Seventeen embryologists (from 3 clinic groups) assessed the same MII oocytes, predicting blastocyst development using their best judgment and scoring their confidence (1–3) in each prediction. A weighted probability was calculated for each oocyte and used in the ensemble models. Two ensemble techniques were employed to combine VIOLET’s and the embryologists’ probabilities of blastocyst development. Ensemble 1 used a lambda value (0, 0.25, 0.5, 0.75, or 1), with a higher lambda placing more weight on VIOLET’s probability. Ensemble 2 used thresholds on VIOLET’s confidence (10%, 30%, 50%, 70%, 90%): if VIOLET’s confidence exceeded the threshold, its prediction was used; if it was lower than the threshold, the embryologists’ weighted prediction was used instead. Accuracy, specificity, sensitivity, and AUC were calculated for both ensembles to assess performance.
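The two ensemble rules described above can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not give the exact formulas, so the linear blend used for Ensemble 1, the 0.5 classification cutoff, the handling of a confidence exactly equal to the threshold, and all function and variable names are assumptions made for illustration. VIOLET's confidence is passed in as a separate input because the abstract does not say how it is derived.

```python
from typing import Dict, List, Sequence


def ensemble_lambda(p_violet: Sequence[float],
                    p_embryologist: Sequence[float],
                    lam: float) -> List[float]:
    """Ensemble 1 (assumed form): linear blend of the two probabilities.

    lam = 1 uses only VIOLET; lam = 0 uses only the embryologists.
    """
    return [lam * v + (1.0 - lam) * e
            for v, e in zip(p_violet, p_embryologist)]


def ensemble_threshold(p_violet: Sequence[float],
                       violet_confidence: Sequence[float],
                       p_embryologist: Sequence[float],
                       threshold: float) -> List[float]:
    """Ensemble 2: use VIOLET when its confidence exceeds the threshold,
    otherwise fall back to the embryologists' weighted probability.

    Assumption: a confidence exactly equal to the threshold falls back
    to the embryologists (the abstract does not specify this case).
    """
    return [v if c > threshold else e
            for v, c, e in zip(p_violet, violet_confidence, p_embryologist)]


def classification_metrics(probs: Sequence[float],
                           labels: Sequence[int],
                           cutoff: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at an assumed 0.5 cutoff.

    labels: 1 = oocyte developed into a blastocyst, 0 = it did not.
    """
    preds = [1 if p >= cutoff else 0 for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(labels) if labels else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }
```

Under these assumptions, sweeping `lam` over 0, 0.25, 0.5, 0.75, and 1, or `threshold` over 0.1, 0.3, 0.5, 0.7, and 0.9, and calling `classification_metrics` on each result mirrors the comparison described in the methods.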
RESULTS:
Ensemble 1 displayed a stepwise increase in prediction accuracy (0.54–0.61), AUC (0.61–0.66), and specificity (0.32–0.67), and a decrease in sensitivity (0.9–0.53), as the lambda value increased from 0 to 1 (more weight on VIOLET’s probability). Ensemble 2 displayed a decrease in prediction accuracy (0.62–0.56), AUC (0.66–0.61), and specificity (0.63–0.33) as the threshold increased from 10% to 90% (increasing utilization of embryologist prediction), whereas sensitivity increased (0.61–0.89).
CONCLUSIONS:
Placing more weight on VIOLET’s predictions yielded more balanced and higher overall model performance. This suggests that VIOLET accounts for relevant oocyte features detectable by the human eye while also extracting additional information that is imperceptible to it, providing a consistent, efficient, and more accurate assessment.