Translational Statistics
- 1. Department of Statistics, Purdue University, USA
Abstract
Statistical analyses presented in research articles typically focus on methodological approaches that summarize and explain the scientific results of a study or studies. Translational statistics are designed to facilitate the use of the scientific results in practice. We present three examples that illustrate the use of translational statistics and make some recommendations regarding their use.
Keywords
• Meta-analysis
• Prediction intervals
• Vitamin A
• Anemia
• Calcium
Citation
McCabe GP (2014) Translational Statistics. J Transl Med Epidemiol 2(1): 1022.
ABBREVIATIONS
IU: International Units; N: sample size; RR: Relative Risk; g/L: Grams per Liter; mg/d: Milligrams per Day; g/d: Grams per Day.
INTRODUCTION
Statistical analyses presented in research articles typically focus on methodological approaches that summarize and explain the scientific results of a study or studies. Translational statistics are designed to facilitate the use of the scientific results in practice. We present three examples that illustrate the use of translational statistics and make some recommendations regarding their use.
MATERIALS AND METHODS
Three studies that use statistical analysis are described. The scientific results are explained using summary statistics. Then, for each of these studies, translational statistics are described that go beyond these results and are designed to facilitate the use of the scientific results in practice.
RESULTS AND DISCUSSION
Results
Vitamin A and mortality of young children in developing countries: A lack of vitamin A can cause a condition called xerophthalmia or night blindness. Programs to treat this illness by periodic dosing with high levels of vitamin A have been implemented in many developing countries [1]. In the course of these programs, researchers noticed that the dosing appeared to be associated with a decrease in the mortality of young children. To examine this relationship, a study of the effect of vitamin A supplementation on mortality was conducted in the Aceh province of Indonesia. Treated subjects were given a dose of 200,000 IU of vitamin A at the beginning of the study and again 6 months later. A total of 450 villages were randomized to either treatment or control. There were 12,991 children who were treated and 12,209 controls.
The outcome variable was vital status (dead or alive) one year after the beginning of the study. In the treated group, 101 children died (0.78%) and in the control group, 130 children died (1.06%). The relative risk was 0.73, corresponding to a 27% reduction in mortality due to vitamin A supplementation [2]. As a result of this research, the United Nations Subcommittee on Nutrition issued a statement saying that young child mortality might be an additional reason for increasing efforts to control vitamin A deficiency.
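The relative risk quoted above follows directly from the reported counts. A minimal Python sketch of that arithmetic is given here; it uses only the figures in the text, the variable names are illustrative, and it is a crude calculation that ignores the village-level clustering of the design.

```python
# Counts reported in the text for the Aceh trial.
treated_n, treated_deaths = 12_991, 101
control_n, control_deaths = 12_209, 130

# One-year risk of death in each arm.
risk_treated = treated_deaths / treated_n
risk_control = control_deaths / control_n

# Relative risk and the corresponding percentage reduction in mortality.
relative_risk = risk_treated / risk_control
reduction_pct = (1 - relative_risk) * 100

print(f"risk (treated) = {risk_treated:.2%}")
print(f"risk (control) = {risk_control:.2%}")
print(f"RR = {relative_risk:.2f}, reduction = {reduction_pct:.0f}%")
```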
In the years following the initial study, additional trials were conducted that followed a variety of protocols but shared the same basic structure: a comparison of a vitamin A treatment group with a control group, with vital status as the outcome variable. The results varied and some controversy ensued. The United Nations Subcommittee on Nutrition then urged that an independent, objective review of the studies be undertaken. The Canadian International Development Agency funded a study that reported a meta-analysis of the results of 8 trials [3].
The meta-analysis took into account design characteristics of the individual studies, such as the cluster sampling that results from randomization by village. A design effect was calculated for each study and used to adjust the variability measures that had been calculated under the assumption of independent samples. Given the diversity of protocols used in the studies, a random effects model was used for the analysis.
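The two steps described above, inflating each study's variance by its design effect and then pooling under a random effects model, can be sketched as follows. The text does not say which random effects estimator was used; the DerSimonian-Laird method is swapped in here as one common choice. The function names and all input numbers are hypothetical and are not the data of the 8 trials.

```python
import math

def adjust_variance(var_independent, design_effect):
    """Inflate a variance computed under independence by a design effect."""
    return var_independent * design_effect

def random_effects_pool(log_rr, variances):
    """DerSimonian-Laird random effects pooling of log relative risks.

    Returns (pooled log RR, its standard error, between-study variance).
    """
    k = len(log_rr)
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random effects weights add the between-study variance to each study.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_rr)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# HYPOTHETICAL inputs for illustration only (not the trial data).
rr = [0.70, 0.95, 0.60, 0.85]
var_indep = [0.010, 0.020, 0.030, 0.015]   # variances of log RR assuming independence
deff = [1.3, 1.1, 1.5, 1.2]                # assumed design effects

log_rr = [math.log(r) for r in rr]
variances = [adjust_variance(v, d) for v, d in zip(var_indep, deff)]
pooled, se, tau2 = random_effects_pool(log_rr, variances)
lo, hi = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled RR = {math.exp(pooled):.2f}, 95% CI = ({lo:.2f}, {hi:.2f}), tau^2 = {tau2:.4f}")
```

Working on the log scale keeps the interval for the relative risk positive and treats ratios above and below 1 symmetrically.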
The relative risk for the effect of vitamin A supplementation on young child mortality was estimated to be 0.77 with a 95% confidence interval of (0.68, 0.88). Thus, the analysis suggested a 23% reduction in the mortality due to vitamin A dosing. This conclusion is based on the analysis of data from a total of more than 172,000 children under the age of 6 years.
The results of this meta-analysis, and many others like it, focused on the scientific conclusions that can be drawn from the data. But is there more that can be said? Are there translational statistics that would provide additional important information for those who might be concerned about using the scientific findings of the study?
In the preparation of an early draft of the report, there was a lively discussion of how the results would be interpreted by policy makers who might be responsible for initiating vitamin A supplementation programs. Some concerns were expressed that there could be overly optimistic misinterpretation of the confidence interval as a range of values for the relative risk that could be expected from a new study. Note that the relative risk varied from 0.50 to 1.04 for the 8 studies analyzed in the meta-analysis. As a result of these discussions, we decided to address the issue of what to expect from a new study based on a statistical analysis of the data at hand. This analysis is our first example of translational statistics.
We used prediction intervals to address the translational needs of the meta-analysis. These intervals are designed to provide an interval estimate of what might be expected from a new observation on a random variable. The topic is often taught in a course in linear models for advanced undergraduate and graduate students [4] but it can also be taught in an elementary course that does not require calculus [5]. In the latter case, it is a very useful instructional tool for helping students learn what a confidence interval does and does not mean.
Our use of a prediction interval in the context of the meta-analysis is somewhat more complicated than the applications typically taught in statistics courses, but the basic concepts are the same. Here, the variance of the log relative risk has two additive components: between study variance and within study variance. In assessing the likely outcomes of a future study, the first component is fixed while the second depends on the study characteristics. Note that the within study variances for the 8 studies in the meta-analysis ranged from 0.005 to 0.067. A study having within study variance of 0.01 has a prediction interval of (0.56, 1.06); for 0.03, the interval is (0.50, 1.16); and for 0.06, it is (0.45, 1.32). A relative risk of 1.0 corresponds to no effect, so these intervals indicate that, particularly when the within study variance is large, there is a reasonable chance that the estimated effect will be in the wrong direction (RR > 1). These probabilities are 5.5%, 11.3%, and 17.2% for within study variances 0.01, 0.03, and 0.06, respectively.
The study characteristics that determine the within study variance are the true relative risk that would be seen with an arbitrarily large sample (true RR), the numbers of subjects in the treatment and control groups (n in each), and the design effect. We can compute the probability of failing to see an effect in a new study as a function of these parameters. Note that the sample size for the studies in the meta-analysis ranged from about 4,000 to 14,000. For a study with n=5000 and RR=0.77 (a 23% reduction in mortality), there is a 24% chance of failing to observe an effect (RR > 1).
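The prediction intervals and wrong-direction probabilities above can be approximately reproduced with a normal approximation on the log scale. The sketch below is not the authors' calculation. The pooled relative risk of 0.77 is taken from the text, but the fixed variance component (between study variance plus the uncertainty of the pooled estimate) is not reported, so the value 0.016 is an assumption back-solved from the reported intervals. The function names are illustrative.

```python
import math

POOLED_RR = 0.77        # pooled estimate reported in the text
BASE_VARIANCE = 0.016   # ASSUMPTION: back-solved from the reported intervals, not given in the text

def prediction_interval(within_var, pooled_rr=POOLED_RR, base_var=BASE_VARIANCE, z=1.96):
    """Approximate 95% prediction interval for the RR estimated by a new study."""
    sd = math.sqrt(base_var + within_var)   # the variance components add
    center = math.log(pooled_rr)
    return math.exp(center - z * sd), math.exp(center + z * sd)

def prob_wrong_direction(within_var, pooled_rr=POOLED_RR, base_var=BASE_VARIANCE):
    """Approximate probability that a new study estimates RR > 1."""
    sd = math.sqrt(base_var + within_var)
    z = (0.0 - math.log(pooled_rr)) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))   # upper tail of the standard normal

for v in (0.01, 0.03, 0.06):
    lo, hi = prediction_interval(v)
    print(f"within variance {v:.2f}: interval ({lo:.2f}, {hi:.2f}), "
          f"P(RR > 1) = {prob_wrong_direction(v):.1%}")
```

Under this assumption the output agrees with the reported values only to within rounding of about 0.01 to 0.02, which is expected given that the fixed component was inferred rather than taken from the original analysis.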
Weekly versus Daily Iron Supplementation for the Control of Iron Deficiency Anemia in Developing Countries: Iron deficiency anemia is recognized as a major nutritional disorder, particularly in developing countries [6]. A standard treatment for this disorder is to give daily iron supplements.
Unfortunately, the side effects from this treatment can be substantial and, as a result, compliance can be very low. Some research suggested that weekly supplementation could be as effective as, or more effective than, daily supplementation [7]. If true, the consequences could be substantial, with reduced side effects and greatly increased compliance. Meta-analyses were conducted using original data from 4 studies (775 subjects) of pregnant women, 5 studies (1862 subjects) of adolescents, and 4 studies (1225 subjects) of children. Each study included subjects randomized to weekly or daily iron supplementation. A random effects model was used [8].
The primary measure of iron status in this study was hemoglobin measured in serum. An analysis was performed to estimate the difference between daily and weekly supplementation in final mean hemoglobin level, adjusted for initial hemoglobin level. The summary estimates with confidence intervals (all in g/L) were: pregnant women (4 studies, n=775), 2.47, (-0.38, 5.32); adolescents and schoolers (5 studies, n=1862), 2.25, (-2.00, 6.50); and preschoolers (4 studies, n=1225), 1.91, (-0.27, 4.09). An additional meta-analysis that included all 13 studies (3862 subjects) gave 2.17, (-0.04, 4.38). Each of these confidence intervals includes zero, corresponding to no difference, so we conclude that these data do not provide evidence for a difference in the effect on mean hemoglobin level between weekly and daily supplementation of iron. To interpret these results, we would like to understand the clinical significance of a 2 or 3 g/L change in mean serum hemoglobin. Is this a change that would be important if a new study with more power could be performed? A discussion of this issue led to the consensus that the analysis of mean serum hemoglobin levels is not directly related to the question about anemia that the study was designed to investigate.
Anemia is defined by the World Health Organization [9] as serum hemoglobin at 120 g/L or less for adolescents and schoolers, and 110 g/L or less for pregnant women and preschoolers. In this study, adjustments to the cutoffs were made based on the stage of pregnancy, because of increases in blood volume, and based on differences in norms for subjects living at high altitudes. The analysis of mean hemoglobin did not directly address a question about anemia. However, the treatments are given to decrease the prevalence of anemia. Using anemia as the outcome is our second example of a translational statistic.
An analysis of anemia gives a different picture of the comparison of weekly with daily supplementation. Here are the results expressed as the relative risk of anemia for weekly versus daily subjects with 95% confidence intervals: pregnant women (8 studies, n=1375), 1.29 (1.51); adolescents and schoolers (9 studies, n=4641), 1.44 (1.33, 1.56); preschoolers (4 studies, n=1230), 1.06 (0.84, 1.34); all (21 studies, n=7246), 1.34 (1.09, 1.52). Note that the numbers of studies here are larger than those used in the analysis of mean hemoglobin because anemia results were available in the published papers and therefore original data were not required for these analyses. We see that there is a 29% increased risk of anemia with weekly dosing versus daily dosing for pregnant women and a 44% increased risk for adolescents and schoolers. The 95% confidence intervals do not include 1, so the results are significant at the 5% level. For preschoolers, there is a 6% increase but the increase is not statistically significant.
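To make the change of outcome concrete, the sketch below dichotomizes hemoglobin values at the cutoffs quoted above and compares anemia prevalence between two groups. The cutoffs come from the text; the hemoglobin values, group labels, and function names are hypothetical, and the adjustments for stage of pregnancy and altitude mentioned above are not modeled.

```python
# Cutoffs quoted in the text (g/L); a value at or below the cutoff counts as anemic.
ANEMIA_CUTOFF_G_PER_L = {
    "pregnant": 110,
    "preschool": 110,
    "adolescent_or_school": 120,
}

def is_anemic(hemoglobin_g_per_l, group):
    """Classify one subject using the unadjusted cutoff for the group."""
    return hemoglobin_g_per_l <= ANEMIA_CUTOFF_G_PER_L[group]

def anemia_prevalence(hemoglobin_values, group):
    """Proportion of subjects classified as anemic."""
    flags = [is_anemic(hb, group) for hb in hemoglobin_values]
    return sum(flags) / len(flags)

# HYPOTHETICAL final hemoglobin values (g/L) for pregnant women; not study data.
weekly = [104, 109, 112, 118, 121, 108]
daily = [106, 113, 115, 119, 123, 111]

mean_diff = sum(daily) / len(daily) - sum(weekly) / len(weekly)
p_weekly = anemia_prevalence(weekly, "pregnant")
p_daily = anemia_prevalence(daily, "pregnant")
rr = p_weekly / p_daily

print(f"mean difference (daily - weekly) = {mean_diff:.1f} g/L")
print(f"anemia prevalence: weekly = {p_weekly:.2f}, daily = {p_daily:.2f}, RR = {rr:.1f}")
```

In this made-up example a modest difference in means sits alongside a large difference in anemia prevalence, because prevalence depends on how many subjects fall near the cutoff rather than on the mean alone.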
Effects of calcium intake and sodium intake on calcium retention: Calcium retention (the amount of calcium retained by the body, usually expressed as mg/d) is a major concern for young girls at ages when their bones are growing rapidly. Building bone mass while young is believed to reduce the risk of osteoporosis in later life. Many studies have been conducted to examine factors that influence calcium retention in adolescents [10,11].
One series of studies varied the amount of calcium and the amount of sodium (salt) in the diet of adolescents. The studies used different subjects for three different levels of calcium intake (800 mg/d, 1300 mg/d, and 1800 mg/d) with a crossover for two levels of sodium (1.30 and 3.86 g/d) [12,13]. The middle level of calcium and the low level of sodium correspond to the recommended intakes for adolescents [14,15]. The analyses performed included estimation of the main effects of calcium and sodium as well as the interaction between these two intakes.
Results indicated that the two main effects were statistically significant and that the interaction was not. Higher calcium is associated with increased calcium retention and high salt is associated with decreased calcium retention. To achieve high calcium retention, adolescent girls should eat a diet that is high in calcium and low in salt.
Our third example of a translational statistic is based on asking how these results can be used to say something about the effect of a dietary change. Consider an adolescent girl whose diet is low in calcium and high in sodium, for example 800 mg/d of calcium and 3.86 g/d of sodium. The estimated calcium retention for this individual is 295 mg/d. Using the results of our analysis, we can estimate the consequences of changes in this diet. The effects of these changes are assumed to be linear over the ranges of intakes. A decrease of 1 g/d of sodium (3.86 to 2.86 g/d) would produce an expected increase of 27 mg/d in calcium retention. The same increase could also be achieved by increasing the calcium intake by 60 mg/d (800 to 860 mg/d). These changes correspond to a 26% decrease in sodium intake and a 7.5% increase in calcium intake. The change in calcium intake would be relatively easy (a cup of milk contains 300 mg, an ounce of almonds contains 80 mg), but the change in sodium would require a substantial change in the use of salt.
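The dietary trade-off described above is simple linear arithmetic, sketched below. The sodium effect (27 mg/d of retention per 1 g/d of sodium) and the baseline values come from the text; the calcium slope is not reported directly and is derived here from the stated equivalence of 60 mg/d of calcium to the same 27 mg/d gain. The names are illustrative, and the linearity is the assumption stated in the text.

```python
# Baseline diet and estimated retention from the text.
BASELINE = {"calcium_mg": 800.0, "sodium_g": 3.86, "retention_mg": 295.0}

SODIUM_EFFECT = -27.0          # mg/d retention per +1 g/d sodium (from the text)
CALCIUM_EFFECT = 27.0 / 60.0   # mg/d retention per +1 mg/d calcium (derived from the text)

def predicted_retention(calcium_mg, sodium_g):
    """Linear prediction of calcium retention (mg/d) relative to the baseline diet."""
    return (BASELINE["retention_mg"]
            + CALCIUM_EFFECT * (calcium_mg - BASELINE["calcium_mg"])
            + SODIUM_EFFECT * (sodium_g - BASELINE["sodium_g"]))

less_sodium = predicted_retention(800, 2.86)    # cut sodium by 1 g/d
more_calcium = predicted_retention(860, 3.86)   # add 60 mg/d of calcium

sodium_change_pct = 1.0 / BASELINE["sodium_g"] * 100
calcium_change_pct = 60.0 / BASELINE["calcium_mg"] * 100

print(f"retention with less sodium:  {less_sodium:.0f} mg/d")
print(f"retention with more calcium: {more_calcium:.0f} mg/d")
print(f"sodium change: {sodium_change_pct:.0f}%, calcium change: {calcium_change_pct:.1f}%")
```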
Discussion
In the first example, we performed an analysis that provided additional information about what a program planner might expect to see if a new vitamin A supplementation program was initiated. Today, many countries include a vitamin A supplement in their National Immunization Day programs [16].
In the second example, the analysis of hemoglobin means led to the conclusion that the effects of weekly and daily dosing with iron supplements were similar. However, by choosing an outcome measure more closely related to the purpose of the supplementation, the inferiority of weekly dosing became evident. Unfortunately, neither method of supplementation is very effective. There is still a need to find better ways to address iron deficiency anemia.
In the third example, we estimated the increase in calcium retention that would be achieved by dietary changes. These additional calculations led to the conclusion that a relatively small increase in calcium intake would result in the same increase in calcium retention as a very large change in sodium intake.
CONCLUSION
In the three settings considered, translational statistics provided additional information to facilitate the use of the scientific results in practice. Together, statisticians and their collaborators can focus on how scientific results can be used to inform practice. Translational statistics can then be developed to meet these needs.