Analysis of maturation features in fetal brain ultrasound via artificial intelligence for the estimation of gestational age

BACKGROUND: Optimal prenatal care relies on accurate gestational age dating. After the first trimester, the accuracy of current gestational age estimation methods diminishes with increasing gestational age. Considering that, in many countries, access to first trimester crown rump length is still difficult owing to late booking, infrequent access to prenatal care, and unavailability of early ultrasound examination, the development of accurate methods for gestational age estimation in the second and third trimester of pregnancy remains an unsolved challenge in fetal medicine.
OBJECTIVE: This study aimed to evaluate the performance of an artificial intelligence method based on automated analysis of fetal brain morphology on standard cranial ultrasound sections to estimate the gestational age in second and third trimester fetuses compared with the current formulas using standard fetal biometry.
STUDY DESIGN: Standard transthalamic axial plane images from a total of 1394 patients undergoing routine fetal ultrasound were used to develop an artificial intelligence method to automatically estimate gestational age from the analysis of fetal brain information. We compared its performance (stand-alone or in combination with fetal biometric parameters) against 4 currently used fetal biometry formulas on a series of 3065 scans from 1992 patients undergoing second (n=1761) or third trimester (n=1298) routine ultrasound, with known gestational age estimated from crown rump length in the first trimester.
RESULTS: Overall, the 95% confidence interval of the error in gestational age estimation was 14.2 days for the artificial intelligence method alone and 11.0 days when used in combination with fetal biometric parameters, compared with 12.9 days for the best method using standard biometrics alone. In the third trimester, the lower 95% confidence interval errors were 14.3 days for artificial intelligence in combination with biometric parameters and 17 days for fetal biometrics, whereas in the second trimester, the 95% confidence interval errors were 6.7 and 7 days, respectively. The performance differences were even larger in the small-for-gestational-age fetuses group (14.8 and 18.5 days, respectively).
CONCLUSION: An automated artificial intelligence method using standard sonographic fetal planes yielded similar or lower error in gestational age estimation compared with fetal biometric parameters, especially in the third trimester. These results support further research to improve the performance of these methods in larger studies.


Introduction
Optimal prenatal care relies on accurate gestational age (GA) dating. 1,2−6 Inaccurate GA estimation can lead to suboptimal or iatrogenic prenatal care.−9 After the first trimester, biparietal diameter (BPD) and head circumference (HC) are considered the best single predictors for GA estimation, 10,11 whereas the combination of BPD, HC, abdomen circumference (AC), and femur length (FL) has shown higher accuracy in the second trimester. 12 Although many regression equations have been described to estimate GA in the second and third trimester, 13−18 the reported accuracy is still substantially lower than that of crown rump length (CRL) dating and diminishes with increasing GA. The reported deviation is around 12 to 14 days at 26 weeks' gestation, whereas it increases to >19 days later in the third trimester. Considering that, in many countries, access to first trimester CRL is still difficult owing to late booking, infrequent access to prenatal care, and unavailability of early ultrasound examination, 19,20 the development of accurate methods for GA estimation in the second and third trimester of pregnancy remains an unsolved challenge in fetal medicine.
The field of artificial intelligence (AI) has made remarkable progress during the last decade, owing to deep learning (DL) algorithms, 21 and it is now a part of our daily lives. In medicine, AI methods have shown their potential to quantitatively analyze images (classifying and measuring structures, organs, lesions, etc) across a wide range of fields, such as cardiology, 22 radiology, 23 or dermoscopy, 24 to name just a few. The potential use of AI applied to fetal ultrasound has been recently reported, 25 and several groups have evaluated its use for fetal diagnosis. 26,27 The use of AI for GA estimation has been attempted in preliminary studies. Considering that the brain undergoes significant morphologic changes during fetal development, 28 these studies have estimated GA from automatic fetal brain biometric parameters 29 or from an in-depth analysis of brain features using 3-dimensional (3D) ultrasound or fetal magnetic resonance imaging (MRI). 30,31 However, to our knowledge, no attempts have been made to perform this analysis with 2D ultrasound.
The goal of this study was to develop a novel AI method to automatically estimate GA from the routine brain transthalamic axial plane on 2D ultrasound and to compare its performance against current formulas based on standard fetal biometric parameters on a large cohort of pregnancies.

Study design
This was a prospective observational study performed at BCNatal, Barcelona Center for Maternal-Fetal and Neonatal Medicine (Hospital Clinic and Hospital Sant Joan de Déu, Barcelona, Spain). The study protocol was approved by the local ethics committee on March 14, 2019, under protocol identifier HCB 2018/0031, and patients provided written informed consent to use ultrasound images for research purposes.
A total of 8580 images from 2034 patients were acquired during standard clinical practice for 6 months, between September 2019 and February 2020. All pregnant women attending for routine ultrasound at the second or third trimester of pregnancy were included in the study. Congenital fetal malformations, aneuploidies, and multiple pregnancies were excluded from the study. GA was determined by CRL measurements on first-trimester ultrasound 7 and ranged from 16 to 42 weeks.

Fetal biometry measurements and image acquisition
Ultrasounds were performed by clinicians with varying degrees of expertise.
Voluson E6 (GE Medical Systems, Zipf, Austria), Voluson S8, Voluson S10, and Aloka (Aloka Co, Ltd, Tokyo, Japan) systems were used for image acquisition by means of abdominal ultrasound using a curved transducer with a frequency range from 3 to 7 MHz. GA at ultrasound evaluation and fetal biometric parameters (BPD, HC, abdomen circumference [AC], and FL) measured manually were recorded. Estimated fetal weight and fetal percentile according to sex and local charts 32 were calculated. Fetuses were classified as normal (10≤percentile≤97), small-for-gestational-age (SGA) (percentile<10), or large-for-gestational-age (LGA) (percentile>97). Images were stored during fetal ultrasound evaluation by either sonographers or fetal medicine specialists. The images were retrieved from the picture archiving and communication system, and we used the exact same images taken for actual clinical measurements (BPD and HC), avoiding the use of any type of postprocessing or artifacts, such as smoothing, noise, pointers, or calipers. Other image settings parameters, such as gain, frequency, and gain compensation, were left to the discretion of each operator. Images were stored in the original Digital Imaging and Communication in Medicine (DICOM) format.
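The percentile cutoffs stated above can be written as a small rule. The following Python sketch is illustrative only: the function name `growth_category` is hypothetical and not part of the study software, and it assumes the percentile has already been computed from estimated fetal weight, sex, and the local charts.

```python
def growth_category(percentile: float) -> str:
    """Classify a fetus from its estimated-fetal-weight percentile.

    Cutoffs follow the text: SGA below the 10th percentile, LGA above
    the 97th percentile, and normal for 10 <= percentile <= 97.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile < 10:
        return "SGA"
    if percentile > 97:
        return "LGA"
    return "normal"
```

Both boundary values (the 10th and 97th percentiles) fall in the normal group under this rule.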

Gestational age estimation via artificial intelligence
A novel method for GA estimation from the transthalamic plane (axial plane for BPD measurement) was developed using state-of-the-art DL techniques. The method is based on supervised learning and therefore required a previous learning stage. It was trained using images from 1394 patients collected during the months before the evaluation set used in this study, under the same protocol. These images were manually labeled with the orientation and specific brain landmarks by a maternal-fetal specialist (B.V.A.) using computer software. This was necessary for the method to learn to locate the brain in the image (first step of the method). We coined this new method quantusGA.
The outline of how quantusGA works is shown in Figure 1. It receives the fetal brain ultrasound image in DICOM format as input and provides GA estimation, in days, as the final output. The method is not limited to the measurement of a specific set of structures of the fetal brain; it combines textural information together with the pixel resolution in mm of the image (which is stored in the DICOM) for a more robust estimation. This is important for the method to work on any image, considering that ultrasound settings were not standardized and different machines were used. Therefore, brain resolution in pixels varied between studies. Optionally, the method can incorporate fetal biometric parameters for GA estimation (we provide results both with and without fetal biometrics).
AJOG MFM at a Glance

Why was this study conducted?
This study aimed to evaluate whether artificial intelligence (AI) could improve gestational age (GA) dating in the second and third trimester.

Key findings
An automated AI tool yielded similar or lower error in GA estimation compared with fetal biometric parameters, especially in the third trimester.

What does this add to what is known?
AI can be used to improve current accuracy in GA estimation in the second and third trimester from standard brain ultrasound.

QuantusGA performs the following 3 steps: (1) it automatically detects the position and orientation of the fetal brain in the image by detecting the skull and internal key points, such as the midline and the anterior/posterior regions; (2) the key point positions are used to crop and rotate the brain, resulting in a horizontally aligned brain image; and (3) it extracts textural and size information from the brain pixels and uses this information (optionally in combination with fetal biometric parameters) to estimate GA. Each step is explained in more detail below.
Brain key point detection: The first step uses a deep convolutional neural network (CNN) 21,33 to detect 2D key points in the brain image. The network, previously trained on 1394 patients, is capable of detecting the position in the image of key structures, such as the midline, skull contour, anterior horns, and cerebellar hemisphere.
Brain alignment: Using standard 2D Euclidean geometry from the aforementioned 2D brain key points, an oriented ellipse can be fitted to the image, providing the centroid, orientation, and extrema of the brain. Then, the fetal brain image is cropped and rotated in such a way that the midline is fully horizontal and the anterior part of the brain is located to the right.
Final GA estimation: Once the fetal brain image is aligned, another deep CNN 21,33 (previously trained using the same 1394 patients) is used to estimate GA from the brain image pixels and the image resolution. The image resolution is extracted directly from the DICOM and is important to provide the network internally with information on the true size of the structures seen in the image.
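The geometry behind the brain alignment step can be illustrated with a short Python sketch. This is a simplified stand-in, not the quantusGA implementation: instead of fitting an oriented ellipse, it assumes only the two midline end points (occiput and sinciput) are available as (x, y) image coordinates, and the function names are hypothetical.

```python
import math


def alignment_angle(occiput, sinciput):
    """Angle in radians of the occiput-to-sinciput vector relative to the +x axis."""
    dx = sinciput[0] - occiput[0]
    dy = sinciput[1] - occiput[1]
    return math.atan2(dy, dx)


def rotate_point(point, center, angle):
    """Rotate `point` about `center` by -angle, undoing the midline tilt."""
    x = point[0] - center[0]
    y = point[1] - center[1]
    c, s = math.cos(-angle), math.sin(-angle)
    return (center[0] + c * x - s * y, center[1] + s * x + c * y)


def align_key_points(key_points, occiput, sinciput):
    """Rotate key points so the midline is horizontal and the front points right.

    The rotation center is the midpoint of the two midline end points
    (an assumption; the text uses the centroid of a fitted ellipse).
    """
    center = ((occiput[0] + sinciput[0]) / 2.0, (occiput[1] + sinciput[1]) / 2.0)
    angle = alignment_angle(occiput, sinciput)
    return [rotate_point(p, center, angle) for p in key_points]
```

For example, a vertical midline running from an occiput at (0, 0) to a sinciput at (0, 10) is mapped to a horizontal segment with the sinciput on the right. The same angle would then be applied to the image pixels before cropping.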
The CNN used was an Xception 34 adapted to the task by replacing regular convolutions with a series of slightly altered coordinated convolution 35 layers, which incorporate image resolution into the computation. The convolutional part of the network extracts textural and size information from the pixels, whereas the last layers (predictor part of the network) use this information to estimate GA in days, using a linear regression. The linear regression can optionally also receive as input the values of fetal biometric parameters (BPD, HC, AC, FL) in mm, which it then combines with the textural and size information extracted from the image to estimate GA using a stepwise regression.
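The text does not detail how the altered coordinate convolutions use image resolution. One plausible reading, shown here purely as an assumption, is that each pixel is given extra input channels holding its position in millimeters rather than in pixels, so that the network can relate texture to true physical size. The helper `coordinate_channels_mm` below is hypothetical.

```python
def coordinate_channels_mm(height, width, mm_per_pixel):
    """Build two height x width grids with each pixel's offset from the image center in mm.

    Assumption: isotropic pixels, with `mm_per_pixel` taken from the
    DICOM pixel resolution mentioned in the text.
    """
    cx = (width - 1) / 2.0
    cy = (height - 1) / 2.0
    x_mm = [[(col - cx) * mm_per_pixel for col in range(width)] for _ in range(height)]
    y_mm = [[(row - cy) * mm_per_pixel for _ in range(width)] for row in range(height)]
    return x_mm, y_mm
```

Under this sketch, the same brain imaged at 0.2 mm/pixel and at 0.3 mm/pixel yields different pixel extents but the same coordinate values in mm, which is the property the resolution input is meant to provide.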

Artificial intelligence evaluation and comparison
quantusGA was directly compared with the following 4 fetal biometry GA estimation strategies based on the same data: (1) the use of fetal brain biometric parameters (BPD and HC), 15,16 evaluated to analyze whether our method was capable of extracting more information from the brain image than cephalic biometric parameters alone; (2) the use of all 4 biometric parameters through the classic Hadlock formula 13; (3) the use of HC and FL through the formula developed for the Intergrowth project, 17 which has the current best reported results for second and third trimester pregnancies; and (4) the use of all 4 biometric parameters through the formula proposed by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), 18 which has the second-best reported results for second and third trimester pregnancies. All formulas were evaluated for a comprehensive comparison against current state-of-the-art methods.

Statistical analysis
FIGURE 1
Proposed AI method (quantusGA) outline
Brain key points are based on HC/BPD measurement: (1) midline sinciput (front), (2) cavum septum, (3) midline center, (4) mirror point with respect to cavum septum along midline, (5) midline occiput (back), (6) upper parietal bone at 90° angle from the midline center, and (7) lower parietal bone at 90° angle from the midline center.
Burgos-Artizzu. Gestational age estimation from fetal brain ultrasound via artificial intelligence. Am J Obstet Gynecol MFM 2021.

All statistical analyses were performed using Python (Python Software Foundation, Wilmington, DE). All the methods mentioned above were applied to all available ultrasound studies, storing the output: GA estimation in days. Whenever several images of the same patient with the same study date were available, the average GA estimation was used (both for biometrics and AI method). Then, the estimated GA was compared with gold-standard GA obtained from CRL during the first trimester of pregnancy. Regression error metrics (R-squared, the average absolute deviation and its 95% confidence interval, and the proportion of errors above 2 weeks) were computed. Furthermore, the power of the differences between the regression errors of the different techniques was analyzed by means of statistical power sampling using a level of significance (alpha) of 0.05 (5%).
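The per-study averaging and the error metrics described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the study's analysis code: the function and key names are hypothetical, and because the text does not specify how its 95% interval of the error is computed, the sketch reports a nearest-rank 95th percentile of the absolute error as a stand-in.

```python
import math
from collections import defaultdict
from statistics import mean


def average_per_study(estimates):
    """Average GA estimates (days) over images sharing patient and study date.

    `estimates` is an iterable of (patient_id, study_date, ga_days) tuples.
    """
    grouped = defaultdict(list)
    for patient_id, study_date, ga_days in estimates:
        grouped[(patient_id, study_date)].append(ga_days)
    return {key: mean(values) for key, values in grouped.items()}


def regression_errors(predicted, reference, large_error_days=14):
    """Compare predicted GA with CRL-based reference GA (both in days)."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    abs_errors = [abs(p - r) for p, r in zip(predicted, reference)]
    ref_mean = mean(reference)
    ss_res = sum((r - p) ** 2 for p, r in zip(predicted, reference))
    ss_tot = sum((r - ref_mean) ** 2 for r in reference)
    r_squared = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    # Assumption: nearest-rank 95th percentile stands in for the paper's 95% bound.
    ranked = sorted(abs_errors)
    p95 = ranked[math.ceil(0.95 * len(ranked)) - 1]
    n_large = sum(1 for e in abs_errors if e > large_error_days)
    return {
        "r_squared": r_squared,
        "mean_abs_error": mean(abs_errors),
        "p95_abs_error": p95,
        "pct_large_errors": 100.0 * n_large / len(abs_errors),
    }
```

In use, the averaged per-study estimates from `average_per_study` would be paired with the CRL-based GA of each study and passed to `regression_errors`, once per estimation method.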

Study population
Table 1 summarizes the characteristics of the study population. A total of 1992 patients were included in the study. Half of the patients had >1 ultrasound examination during the pregnancy, resulting in a total of 3065 scans. The number of brain images recorded during each ultrasound was on average 2.7 (±1.4). The total fetal ultrasound brain image count was 8391. Mean GA was 26.5 (±6.5) weeks (range, 16+0 to 41+2 weeks). The Supplemental Figure shows the full GA distribution histogram.

Gestational age estimation
Table 2 shows the results of GA estimation on the 3065 ultrasound studies from 1992 patients. The new AI model estimated GA with a 95% CI deviation of 14.2 days (standard error, 4.76 days). In comparison, 95% CI deviation results of the biometry calculators were 18.8 days for the brain biometry calculator (BPD and HC), 15,16 13 days for Hadlock, 13 14.9 days for Intergrowth, 17 and 12.9 days for NICHD. 18 Adding all biometric parameters to the AI model improved results, with a 95% CI deviation of 11 days (standard error, 3.74 days).
In second trimester fetuses (N=1761), the AI method had a 95% CI deviation of 6.7 and 8.6 days with and without the association of fetal biometric parameters, respectively. Deviations for biometry calculators alone were 13.8 days for BPD and HC, 15,16 7.1 days for Hadlock, 13 7.2 days for Intergrowth, 17 and 7 days for NICHD. 18
In third trimester fetuses (N=1298), the AI method had a 95% CI deviation of 14.3 and 17.9 days with and without the association of fetal biometric parameters, respectively. Deviations for biometry calculators alone were 24.1 days for BPD and HC, 15,16 18.4 days for Hadlock, 13 18.8 days for Intergrowth, 17 and 17 days for NICHD. 18 Large errors were reduced from 7.9% to 5.5% when using the method in combination with biometric parameters, compared with best biometry results.
Figure 2 shows the scatter plots for each of the 6 methods (brain biometry calculator, 15,16 Hadlock, 13 Intergrowth calculator, 17 NICHD, 18 AI method alone, and AI method and FL) using all ultrasound studies (N=3065).
The statistical power between the deviations from true GA when using NICHD 18 vs the AI method alone was 8% (small differences between the 2). However, the power when comparing the errors of NICHD vs those of the AI method and biometric parameters was 99% (clearly different). Power between the AI method alone and brain biometry calculator 15,16 was >99% (clearly different).
Table 3 shows the results of GA estimation on SGA (weight percentile <10) and LGA (weight percentile >97) fetuses. In the 137 SGA studies, the new AI model estimated GA with a 95% CI deviation of 14.8 days (standard error, 5.4 days). In comparison, deviations for biometry calculators alone were 28.3 days for BPD and HC, 15,16 24.3 days for Hadlock, 13 18.5 days for Intergrowth, 17 and 20 days for NICHD. 18 Adding biometrics to the model did not improve results (deviation of 15.5 days).
In 104 LGA studies, the new AI model estimated GA with a 95% CI deviation of 19.8 days (standard error, 7.64 days). In comparison, deviations for biometry calculators alone were 29.5 days for BPD and HC, 15,16 21.9 days for Hadlock, 13 34 days for Intergrowth, 17 and 20.5 days for NICHD. 18 Adding biometrics to the

Comment

Principal findings
We developed an AI method (coined quantusGA) to estimate GA directly from a plane routinely used in all fetal ultrasound screenings. The method estimates GA from a standard transthalamic plane, where BPD and HC are usually measured. This AI was developed for this purpose using a single ultrasound image acquisition. The AI model evaluated here, when used in combination with FL, provided a statistically different and more accurate GA estimation than biometric evaluation. Furthermore, the errors observed were lower than those of previous approaches using other specialized biometrics, 36 3D ultrasound, 30 or MRI, 31 especially in the third trimester of pregnancy. Using the entire image at its full resolution, the method was able to detect changes relevant for GA estimation not identifiable by the human eye, likely associated with brain growth and maturation.

Comparison with previous results/studies
The results of this study are in line with previous studies suggesting that AI methods might be a way forward to aid in the estimation of GA. A recent study by Namburete et al 30 reported the use of an automated framework for predicting GA and neurodevelopmental maturation based on 3D ultrasound volumes of the fetal brain, showing promising results (95% CI, ±11.6 days). In this study, we report a similar performance (95% CI, ±11 days) using a 2D-based approach. Other studies 31 have reported fetal age estimation from T2-weighted MRI images, using an automated AI framework. Average error reported was 5.37 days, with an R² of 0.92 (95% CI was not reported). We found a similar or slightly better performance using 2D ultrasound, with the obvious advantages of ultrasound over MRI in terms of clinical applicability. Therefore, our results are in line with previous studies and further provide a clinically feasible approach that can be tested in large numbers of cases, because it is applied on routine ultrasound sections now used in standard practice.
Of note, the AI method reduced the occurrence of large errors, defined as GA estimation deviations >14 days, by almost half. Finally, an observation with high potential was that the performance of the method tested here significantly improved results on SGA and LGA fetuses, although this notion requires confirmation in studies with a larger sample size.

Clinical implications
From a clinical standpoint, this study supports that AI is able to extract additional information from ultrasound images that might improve the accuracy of estimating GA based on currently used methods. The AI could be integrated in automated software applicable in any ultrasound machine. This application would be particularly relevant for large areas worldwide, where large numbers of women still have late booking in pregnancy and therefore access to first trimester ultrasound is not guaranteed. Here, the method described is based on a routine axial section and therefore no additional training is required to acquire the image. We have previously shown that AI-based methods are robust and can be used in different ultrasound machines. 6,27 This study was performed using images from different ultrasound machines of mid- to high-range quality. Images were taken by different operators during routine clinical practice, with no specific instructions or constraints on image quality or machine settings. Therefore, we expect the system described here to be fairly robust with respect to the acquisition environment.

Strengths and limitations
This study has several strengths. Images were collected from 2 different sites using different machines and presets chosen by each technician, therefore mimicking real clinical conditions. A new model was designed and evaluated, fully capable of estimating GA from a fetal ultrasound brain image more reliably than using the fetal brain biometric parameters alone. Moreover, the method is automated and complementary to biometrics, such that both can be combined for a more accurate estimation, and results were also promising on the small and large fetuses evaluated. Finally, it is important to note that no image filtering or quality check was performed whatsoever; the method was tested on real clinical images without any human intervention tailored to improve results.
We acknowledge some limitations of this study. First, the AI method was trained using patients from the same sites as the evaluation set, and in both cases, the data were skewed by the much larger numbers studied at 20 weeks. However, the number of patients used to train the method was lower than the number of evaluation patients (1394 to train and 1992 to evaluate, which translates to a 41%/59% split), and the training patients were randomly selected, which mitigates overfitting concerns. Nevertheless, further research to confirm the results of this study in a larger and less homogeneous population will be pursued. Second, the main goal of this study was to develop an AI method to automatically estimate GA from the routine transthalamic axial plane on 2D ultrasound. For this reason, fetal malformations, such as holoprosencephaly, agenesis of the corpus callosum (ACC), and hydrocephaly, were excluded, and we have no data in this respect. The use of this AI model offers better accuracy in GA estimation; however, there is still deviation when compared with dating by CRL measurement in the first trimester, so the development of better methods for improving estimation of GA, especially in late pregnancy, remains an unsolved challenge in obstetrics.

Conclusions
We describe a fully automated AI method for GA estimation, which requires an image of the transthalamic plane routinely obtained during fetal standard ultrasound examination. The method can be combined with fetal biometric parameters for a more precise GA estimation. These results should be validated externally in large samples and multicenter studies. The results of this study support further research to develop automated AI methods, improving the accuracy of GA estimation in pregnancies that could not be dated by a first trimester ultrasound.

TABLE 1
General characteristics of the study population
Full GA distribution is shown in the Supplemental Figure. AC, abdomen circumference; BPD, biparietal diameter; FL, femur length; GA, gestational age; HC, head circumference; LGA, large for gestational age; percentile, fetal growth percentile computed from estimated fetal weight; SD, standard deviation; SGA, small for gestational age.

TABLE 2
Performance on all the patients from the study
AC, abdomen circumference; AI, artificial intelligence; Avg, average; BPD, biparietal diameter; CI, confidence interval; FL, femur length; GA, gestational age; HC, head circumference; NICHD, National Institute of Child Health and Human Development.

TABLE 3
Performance on small-for-gestational-age and large-for-gestational-age fetuses
R² = R-squared. Avg error (d) = average absolute error in days. 95% CI (d) = 95% CI of the absolute error in days. Error >14 d (%) = percentage of cases where error was >2 weeks. AC, abdomen circumference; Avg, average; BPD, biparietal diameter; CI, confidence interval; FL, femur length; GA, gestational age; HC, head circumference; LGA, large for gestational age; NICHD, National Institute of Child Health and Human Development; SGA, small for gestational age.