Unlike for sperm and embryos, a standardized non-invasive laboratory assessment has been difficult to establish for oocytes because they lack visual markers of quality. Patient age has been used as a proxy for oocyte quality but cannot account for variation within oocyte cohorts or between IVF cycles. Therefore, to detect patterns imperceptible to the human eye, a deep learning model (DLM) was applied to assess oocyte quality, measured by the oocyte's potential to develop into a blastocyst.
A dataset of 37,133 static images of MII oocytes denuded for IVF-ICSI cycles, with associated reproductive outcomes, was used to develop the DLM. Of these, 29,326 images were allocated to the training set and 7,807 to the test set. The model was further validated on a new, external dataset comprising 12,371 images. Model performance was evaluated using area under the curve (AUC), accuracy, specificity, and sensitivity. Blastocyst probabilities generated by the model were converted nonlinearly into scores (0-10) for simpler interpretation, and the scores were assessed for correlation with blastocyst development and quality. To further evaluate clinical relevance, the model’s predictive ability was compared with that of a predictive model based on the current standard of care (age at oocyte retrieval) and with manual assessment by embryologists (who assigned scores of 0-10 based on visible dysmorphisms). In addition, several applications of oocyte assessment were explored: sorting by oocyte quality for group culture methods, applicability to the donor population, and analysis of how clinical factors (e.g., stimulation protocol) affect quality.
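The four performance measures named above can be illustrated with a minimal sketch. It assumes that AUC refers to the area under the receiver operating characteristic curve and that a probability threshold of 0.5 separates predicted blastocysts from non-blastocysts; neither detail is stated in the abstract. The function names are hypothetical and this is not the authors' evaluation code.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """ROC AUC as the probability that a randomly chosen positive
    (blastocyst) is ranked above a randomly chosen negative; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not given in the abstract
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Toy labels (1 = developed into a blastocyst) and made-up probabilities.
    labels = [1, 0, 1, 0]
    probs = [0.9, 0.4, 0.3, 0.6]
    print(roc_auc(labels, probs))
    print(threshold_metrics(labels, probs))
```

Unlike accuracy, sensitivity, and specificity, AUC does not depend on the chosen threshold, which makes it the most comparable figure across datasets.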
The DLM achieved similar AUC, accuracy, specificity, and sensitivity during model development (0.64, 0.60, 0.55, 0.65) and external testing (0.63, 0.58, 0.57, 0.59). When probabilities were converted to scores, higher oocyte scores correlated with both higher blastocyst development and higher blastocyst quality. The DLM also demonstrated greater relevance, with a more balanced distribution of scores than the embryologists’ manually assigned scores, which skewed high because dysmorphisms are rare. Despite the relevance of age at the patient-population level, a predictive model for blastocyst development built on age alone had an AUC of 0.5, equivalent to chance. Thus, the DLM was better suited to investigating a range of applications. In embryo group culture, oocytes were sorted into dishes based on score (Group A: 0-2.5, B: 2.6-5.0, C: 5.1-7.5, D: 7.6-10). Subsequent blastocyst rates showed a stepwise increase from Group A to Group D, with significant differences between groups. Applicability to the oocyte donor population was assessed as a preliminary analysis for further investigation into donor oocyte distribution. Donor oocytes that developed into a blastocyst had a significantly higher mean score than those that did not. Finally, oocytes matured using a GnRH-antagonist protocol tended towards higher quality than those matured using a GnRH-agonist protocol, even when stratified by age, a finding confirmed by higher blastocyst rates.
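The group culture analysis can be sketched as follows, using the score bands reported above. The sketch assumes scores are reported to one decimal place, so that an upper bound of 2.5 for Group A and a lower bound of 2.6 for Group B leave no gap. The function names and the toy records are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def score_group(score: float) -> str:
    """Map a 0-10 oocyte score to the culture groups A-D from the abstract."""
    if not 0 <= score <= 10:
        raise ValueError(f"score out of range: {score}")
    if score <= 2.5:
        return "A"
    if score <= 5.0:
        return "B"
    if score <= 7.5:
        return "C"
    return "D"


def blastocyst_rate_by_group(
    records: Iterable[Tuple[float, bool]],
) -> Dict[str, float]:
    """Return the fraction of oocytes reaching blastocyst in each group.

    Each record is (score, became_blastocyst). Groups with no oocytes
    are omitted from the result.
    """
    totals: Dict[str, int] = defaultdict(int)
    blasts: Dict[str, int] = defaultdict(int)
    for score, became_blastocyst in records:
        group = score_group(score)
        totals[group] += 1
        if became_blastocyst:
            blasts[group] += 1
    return {g: blasts[g] / totals[g] for g in sorted(totals)}


if __name__ == "__main__":
    # Made-up records for illustration only.
    toy = [(1.0, False), (2.5, True), (2.6, True),
           (7.6, False), (8.0, True), (9.9, True)]
    print(blastocyst_rate_by_group(toy))
```

Testing whether the per-group rates differ significantly, as reported above, would require a statistical test that the abstract does not name, so it is left out of the sketch.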
A DLM is capable of non-invasively evaluating oocyte quality, filling a gap in standard practice. By displaying good performance across new and diverse datasets, the DLM shows versatility in its clinical applications (group culture, donor oocytes, stimulation protocols) and affords novel insights.