Using Predictive Performance from an Elastic Net Regression to Classify Developmental Language Disorder (DLD) - Abstract
Purpose: Evidence suggests developmental disorders are best viewed from a multidimensional perspective, in which the deficit profile of a disorder may be
highly variable due to the complex interaction of factors that vary along a continuum. In this study, we leverage individual variability to determine whether a
multidimensional disorder, such as developmental language disorder (DLD), can be identified.
Method: We used repeated elastic net logistic regression with 71 high-density measures from 223 children ages 7-11 (DLD = 110; typically developing
(TD) controls = 113) from the Montgomery et al. [1] study. In Study 1, we trained the model on 70% of the data and tested its performance on the remaining
30% holdout set. In Study 2, we used the complete data set to derive the fitted models and compared the characteristics of the best- and
worst-performing models.
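The Study 1 procedure described above (repeated 70/30 splits, an elastic net logistic regression, AUROC on the training and holdout sets) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic stand-ins generated with scikit-learn, and the function name `repeated_holdout_auroc`, the number of repeats, the mixing parameter `l1_ratio`, and the penalty strength `C` are all hypothetical choices, since the abstract does not report how the hyperparameters were tuned.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def repeated_holdout_auroc(X, y, n_repeats=20, test_size=0.30,
                           l1_ratio=0.5, C=1.0, seed=0):
    """Fit an elastic net logistic regression on repeated stratified
    train/holdout splits and return the AUROC of each split."""
    splitter = StratifiedShuffleSplit(
        n_splits=n_repeats, test_size=test_size, random_state=seed
    )
    train_auc, test_auc = [], []
    for train_idx, test_idx in splitter.split(X, y):
        # Scaling is fit on the training portion only, so the holdout
        # set does not leak into the penalized fit.
        model = make_pipeline(
            StandardScaler(),
            LogisticRegression(
                penalty="elasticnet",  # mix of L1 and L2 penalties
                solver="saga",         # solver that supports elastic net
                l1_ratio=l1_ratio,     # assumed value, not from the study
                C=C,                   # assumed value, not from the study
                max_iter=5000,
            ),
        )
        model.fit(X[train_idx], y[train_idx])
        p_train = model.predict_proba(X[train_idx])[:, 1]
        p_test = model.predict_proba(X[test_idx])[:, 1]
        train_auc.append(roc_auc_score(y[train_idx], p_train))
        test_auc.append(roc_auc_score(y[test_idx], p_test))
    return np.array(train_auc), np.array(test_auc)


# Synthetic placeholder data with the same shape as the study's data set
# (223 children, 71 measures); the values carry no clinical meaning.
X, y = make_classification(
    n_samples=223, n_features=71, n_informative=10, random_state=0
)
train_auc, test_auc = repeated_holdout_auroc(X, y)
print(f"train AUROC: mean={train_auc.mean():.2f}, SD={train_auc.std(ddof=1):.3f}")
print(f"holdout AUROC: mean={test_auc.mean():.2f}, SD={test_auc.std(ddof=1):.3f}")
```

Summarizing each set of AUROC values by its mean and SD mirrors how the Results are reported; the numbers printed here come from simulated data and are not comparable to the study's values.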
Results: Area under the receiver operating characteristic curve (AUROC) was used to evaluate the performance of the fitted models. For the fitted model
in Study 1, the average AUROC for discriminating the DLD and TD groups was 0.88 (SD = 0.017) in the training set and 0.85 (SD = 0.04) in the holdout set. The
average AUROC of the fitted models in Study 2 was 0.87 (SD = 0.002). The model-estimated probability scores from both the Study 1 and Study 2 models
were also significantly correlated with the language severity measure.
Conclusion: Our successful development of a predictive model based on an elastic net algorithm that distinguished children with DLD from those without, using
a multidimensional dataset, provides indirect support for the notion that DLD is a multidimensional disorder. Some of the conundrums of data-driven model
derivation and complementary findings, as well as the pros and cons of methodologies in Study 1 and Study 2, are discussed.