Multiple-Regression: A Comprehensive Approach for Analysis of Weight-Length Relationships in Macrobrachium rosenbergii
- 1. Division of Genetics and Molecular Biology, University of Malaya, Malaysia
- 2. Centre for Research in Biotechnology for Agriculture (CEBAR), University of Malaya, Malaysia
Abstract
The Malaysian giant prawn, Macrobrachium rosenbergii [3], is an important crustacean aquaculture candidate cultured year-round in hot-climate regions and seasonally in some temperate areas. To the authors' knowledge, no multiple-regression approach has been reported so far, at least for shrimps, for estimating weight from morphometric measurements. The authors hypothesized that the few morphometric traits with proven significant simple relationships to body weight affect it simultaneously. Hence, this study used a multiple-regression approach with these morphometric measurements as predictors of body weight. The approach was found to be more effective and accurate than simple relationships: it explained 73-94% of the total variance and showed significant differences between populations of the species collected from different geographical locations. Accordingly, the authors confidently recommend this method for such estimations in other crustaceans and other metazoans as well.
Citation
Elsheikh MO, Bhassu S (2016) Multiple-Regression: A Comprehensive Approach for Analysis of Weight-Length Relationships in Macrobrachium rosenbergii. JSM Biol 1(1): 1002.
Keywords
Freshwater aquaculture, Morphometric traits, Giant Malaysian Prawn
INTRODUCTION
The giant Malaysian prawn (GMP), Macrobrachium rosenbergii [3], is an important aquaculture candidate farmed universally in the tropics and subtropics. It is naturally found in South and Southeast Asia covering East Pakistan, east coast of India, Bay of Bengal, Gulf of Thailand, Malaysia and the northern Indonesian islands of Sumatra, Java and Kalimantan [11].
As the GMP is an aquatic animal, prolonged time out of water is stressful to it. However, companies involved in its production need a means of estimating fresh body weight (WT), which is crucial for deciding on harvest time. Measuring length under field conditions is easier than weighing animals: length can be measured under previously prepared clear, oxygenated water, whereas weight cannot. A regression model that estimates WT from quickly measured lengths such as total length (TL), tail length (TA) and carapace length (CL) is therefore quite useful. Regression has indeed been applied to estimate body weight from morphometric body measurements because of their high association with WT. In addition, these measurements are less variable and easier to take in the field [7,9], are useful in describing growth in wild populations [1,2], in defining stocks and in comparative growth studies [8,9], and are highly associated with fecundity [3] and meat yield [6].
All the predictors seem to affect the response variable (WT) collectively; hence a multiple regression approach could be more convenient and accurate than obtaining simple relationships between WT and one of the other variables at a time [15]. Although an appreciable literature on weight-length relationships is available, we found no reports applying a multiple approach: all the studies we came across fitted simple regressions between two traits at a time. We therefore conducted this study applying a stepwise multiple regression approach, to be used for predicting genetic gain in future studies, which is essential for conducting breeding studies.
MATERIALS AND METHODS
Two groups of samples were collected for this study. The first group comprised four parental ecotype populations of GMP collected directly from wild sources in four rivers, viz. Kota Kuala Muda (5.5833° N, 100.3833° E), Muar (2.0500° N, 102.567° E), Perak (4.183° N, 101.267° E) and Timon (4.932° N, 115.396° E), in the states of Kedah, Johor, Perak and Negeri Sembilan respectively. The second group comprised 8 sets of progenies produced in the second round of cyclical mating of the first group (i.e. the parental ecotype populations) and raised in grow-out ponds at Tapah village (4.183° N, 101.267° E) in Perak state, Malaysia.
Morphometric measurements, namely total length (TL), tail length (TA) and carapace length (CL), together with fresh body weight (WT), were collected for the two groups of giant Malaysian prawns mentioned above. WT was measured with a digital balance (Sartorius, accuracy of 0.01 g) to the nearest gram. TL was measured from the tip of the rostrum to the tip of the telson, CL from the eye to the first abdominal segment, and TA from the first abdominal segment to the tip of the telson. All length measurements were taken to the nearest millimeter using a 30 cm ruler.
In addition to the morphometric body measurements, sex was recorded for the parental groups only.
Statistical analysis
The correlation procedure of SAS software, version 9.4 (SAS Institute Inc. 2015), was used to estimate correlation coefficients between each predictor and WT (the response variable) as well as among the predictors. Before the regression procedure was run, the data were logarithmically transformed, because the relationships of these morphometric measurements with WT are curvilinear [7], whereas the regression procedure in SAS assumes linearity. In this regression analysis, length measurements were used as predictors for estimating WT as the response variable. Data for the parental groups were analyzed both with sexes combined and with sexes separated, but each progeny group was treated as a single set without sexing.
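The transform-then-fit step described above can be sketched in a few lines. This is a minimal illustration, not the SAS procedure the authors ran: it assumes natural logarithms (the base is not stated in the text), uses ordinary least squares without stepwise selection, and the helper names `solve_linear_system` and `fit_log_log` as well as the synthetic data are hypothetical.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_log_log(weights, *length_columns):
    """Least-squares fit of log(WT) on the logs of the given length columns.

    Returns [intercept, slope_1, slope_2, ...] on the natural-log scale.
    """
    y = [math.log(w) for w in weights]
    # Design matrix: a leading 1 for the intercept, then the log lengths.
    rows = [[1.0] + [math.log(col[i]) for col in length_columns]
            for i in range(len(weights))]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic, noise-free data generated from WT = 0.02 * TL^2 * CL (made up).
tl = [10.0, 12.0, 15.0, 18.0, 20.0, 22.0]
cl = [3.0, 4.1, 4.4, 6.0, 6.2, 7.5]
wt = [0.02 * t ** 2 * c for t, c in zip(tl, cl)]
coefs = fit_log_log(wt, tl, cl)
print([round(v, 4) for v in coefs])
```

Because the synthetic weights follow an exact power law, the fit recovers log(0.02) as the intercept and the exponents 2 and 1 as slopes, which is the sense in which the log transform turns a curvilinear relationship into a linear one.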
The general formula applied for these regressions is:
$$WT = \mu + \beta_0\,TL + \beta_1\,CL + \beta_2\,TA + e_{ijk}$$
where μ is a constant; eijk is a random error, independent and normally distributed; β0 is the expected difference between two experimental units whose TL differs by one unit, with all other explanatory variables held the same; and β1 and β2 are defined like β0, with CL and TA respectively as the explanatory variable.
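Since the data were log-transformed before fitting, the general formula is presumably applied to the transformed variables. Under that reading, and assuming natural logarithms (the base is not stated in the text), the fitted model and its back-transformed form would be:

```latex
\begin{align}
\ln WT &= \mu + \beta_0 \ln TL + \beta_1 \ln CL + \beta_2 \ln TA + e_{ijk} \\
WT &= \exp(\mu)\; TL^{\beta_0}\; CL^{\beta_1}\; TA^{\beta_2}\; \exp(e_{ijk})
\end{align}
```

On this scale each slope acts as an exponent: a 1% increase in a length, with the other lengths held fixed, corresponds to roughly a β% change in WT.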
RESULTS
Correlation coefficients of WT with TL, CL and TA were 0.902, 0.900 and 0.846 respectively, while among the predictor variables the coefficients were 0.889, 0.950 and 0.849 for TL with CL, TL with TA and CL with TA respectively.
Tables 1-3 show the components of the stepwise regression equations for mixed sexes of the parental ecotype populations, separated sexes of the parental ecotypes and mixed sexes of the progeny groups respectively.
Figure 1 Fit diagnostic of residuals against predicted values of WT for Kedah ecotype (mixed).
In this analysis, the model was tuned so that the alpha value used was one third of the default value (0.15) of the regression procedure of SAS. This tuned value equals the conventional alpha (P ≤ .05) used in most statistical tests in various fields. Even with this stricter threshold, the results were unlikely to have occurred by chance: the maximum probability (Pr > |t|) for each considered trait was as low as ≤ .0007, .0243 and .0005 in the mixed parental ecotype populations, the separated-sex parental populations and the mixed progeny groups respectively. In addition, the model explained a minimum of 73% of the total variance. In more detail, the first predictor alone explained a minimum of 68% of the total variance, as indicated in the tables by the standardized estimate (standard partial regression coefficient), which reached up to 0.86. Eliminating unimportant variables resulted in a loss of only 0.02 of the variance explained. This does not mean that an eliminated predictor makes no contribution to the response variable; it means that no additional variance is left for it to explain once its predecessor predictors are in the model. This follows from the intercorrelations between the predictor variables stated above. Further, Figures 1-3 show plots of fit diagnostics for log WT as an example for each of the three groups.
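The point about intercorrelated predictors can be illustrated with the standard formula for the R² of a two-predictor regression written in terms of pairwise correlations. The sketch below plugs in the pooled correlation coefficients reported at the start of the Results. It is an illustration only: it is not the authors' stepwise procedure, the text does not say whether those correlations were computed on raw or log-transformed data, and the function name `r2_two_predictors` is hypothetical, so the numbers are not expected to reproduce the partial R² values in the tables.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """R² of y regressed on two predictors, from pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Correlations reported in the Results section.
r_wt_tl, r_wt_cl, r_wt_ta = 0.902, 0.900, 0.846
r_tl_cl, r_tl_ta = 0.889, 0.950

# Variance explained by TL alone, then the gain from adding a second predictor.
r2_tl = r_wt_tl ** 2
gain_ta = r2_two_predictors(r_wt_tl, r_wt_ta, r_tl_ta) - r2_tl
gain_cl = r2_two_predictors(r_wt_tl, r_wt_cl, r_tl_cl) - r2_tl

print(f"TL alone: {r2_tl:.3f}")
print(f"gain from adding TA: {gain_ta:.3f}")
print(f"gain from adding CL: {gain_cl:.3f}")
```

With these inputs, TA adds almost nothing once TL is in the model, because TL and TA are correlated at 0.950, whereas CL, which is less strongly correlated with TL, adds somewhat more. A predictor can therefore be well correlated with WT and still be dropped by the stepwise procedure.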
Figure 2 Fit diagnostic of residuals against predicted values of WT for Negeri Sembilan Females
These plots indicate fairly random distributions of the residuals, with a few observations falling outside the threshold limits, for example of the Cook's D statistic, meaning that these readings are outliers.
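For reference, the Cook's D statistic mentioned above is conventionally defined as follows, where e_i is the residual of observation i, h_ii its leverage, p the number of estimated parameters (including the intercept) and s² the residual mean square. The threshold used in the plots is not stated in the text; 4/n, with n the number of observations, is a commonly used reference value and is given here only as an assumption.

```latex
D_i = \frac{e_i^{2}}{p\, s^{2}} \cdot \frac{h_{ii}}{(1 - h_{ii})^{2}},
\qquad \text{flag observation } i \text{ if } D_i > \frac{4}{n}
```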
Figure 3 Fit diagnostic of residuals against predicted values of WT for Negeri Sembilan Males.
In particular, total length (TL) was the first predictor in all progeny groups except one (87.5%), in none of the parental groups when each ecotype was treated as a single group (without sexing), and in all the males plus 50% of the females when the parental groups were sexed. Carapace length (CL), in contrast, was the first predictor in 75% of the unsexed parental groups and in 50% of the females when sexed.
Figure 4 Fit diagnostic of residuals against predicted values of WT for Progeny - NJ×PP
Different equations were obtained for different parental ecotypes (different watersheds) as well as for different progeny groups.
Table 1: Components of the equations obtained by stepwise multiple regression for estimating WT from TL, TA and CL for mixed sexes of parental Malaysian prawn ecotypes.
Constants | Value | Partial R² | Model R² | SE | Standardized Estimate | Pr > F |
Kedah | ||||||
Intercept | -4.02089 | | | 0.21199 | 0 | < .0001 |
β0 TL | 1.72757 | 0.0405 | 0.9353 | 0.22519 | 0.47573 | < .0001 |
β1 CL | 1.46630 | 0.8949 | 0.8949 | 0.17660 | 0.51488 | < .0001 |
Perak | ||||||
Intercept | -2.81113 | | | 0.26397 | 0 | < .0001 |
β0 TL | 0.31591 | 0.0544 | 0.8887 | 0.06751 | 0.26615 | < .0001 |
β2 TA | 0.71990 | 0.0142 | 0.9029 | 0.20159 | 0.20157 | .0006 |
β1 CL | 1.50409 | 0.8343 | 0.8343 | 0.15038 | 0.56090 | < .0001 |
Negeri Sembilan | ||||||
Intercept | -3.82881 | | | 0.16455 | 0 | < .0001 |
β0 TL | 0.35691 | 0.1080 | 0.9112 | 0.03917 | 0.31123 | < .0001 |
β2 TA | 2.06019 | 0.8032 | 0.8032 | 0.12515 | 0.55019 | < .0001 |
β1 CL | 0.50317 | 0.0117 | 0.9229 | 0.10109 | 0.20752 | < .0001 |
Johor | ||||||
Intercept | -3.44132 | | | 0.36477 | 0 | < .0001 |
β2 TA | 1.14120 | 0.0498 | 0.7295 | 0.32005 | 0.36603 | |
β1 CL | 1.75245 | 0.6797 | 0.6797 | 0.33667 | 0.53435 | < .0001 |
R² = coefficient of determination; SE = standard error; Pr = probability.
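As a usage sketch, an equation from Table 1 can be turned into a weight estimate by evaluating it on the log scale and back-transforming. The example below uses the Kedah (mixed sexes) row. It assumes natural logarithms and lengths expressed in the same units as the fitted data, neither of which is stated in the text, so the returned number is illustrative only; the function name `predict_weight` is hypothetical.

```python
import math

# Kedah (mixed sexes) coefficients as listed in Table 1.
INTERCEPT = -4.02089
B_TL = 1.72757
B_CL = 1.46630

def predict_weight(tl, cl):
    """Estimate WT from TL and CL by back-transforming the log-scale equation.

    Assumes natural logs and the measurement units of the fitted data.
    """
    log_wt = INTERCEPT + B_TL * math.log(tl) + B_CL * math.log(cl)
    return math.exp(log_wt)

# Hypothetical animal with TL = 20 and CL = 6 (units as in the fitted data).
print(predict_weight(20.0, 6.0))
```

If the authors used base-10 logarithms instead, `math.log10` and `10 ** log_wt` would replace the two calls; the structure of the calculation stays the same.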
Table 2: Components of the equations obtained by stepwise multiple regression for estimating WT from TL, TA and CL for separate sexes of parental Malaysian prawn ecotypes.
Constants | Value | Partial R² | Model R² | SE | Standardized Estimate | Pr > F |
Kedah Male | ||||||
Intercept | -4.80893 | | | 0.32975 | 0 | < .0001 |
β0 TL | 1.04820 | 0.8950 | 0.8950 | 0.45348 | 0.29676 | .0243 |
β2 TA | 1.28653 | 0.0254 | 0.9204 | 0.42369 | 0.34957 | .0035 |
β1 CL | 1.03656 | 0.0106 | 0.9310 | 0.24556 | 0.34710 | < .0001 |
Kedah Female | ||||||
Intercept | -4.02218 | | | 0.30622 | 0 | < .0001 |
β0 TL | 0.82440 | 0.0335 | 0.8691 | 0.30254 | 0.33702 | .0095 |
β1 CL | 1.56875 | 0.8356 | 0.8356 | 0.29771 | 0.63107 | < .0001 |
Perak Male | ||||||
Intercept | -3.11764 | | | 0.42696 | 0 | < .0001 |
β0 TL | 0.96648 | 0.7656 | 0.7656 | 0.39727 | 0.38002 | .0204 |
β2 TA | 1.40918 | 0.0348 | 0.8004 | 0.40476 | 0.54383 | < .0001 |
Perak Female | ||||||
Intercept | -2.67566 | | | 0.34605 | 0 | < .0001 |
β0 TL | 0.82674 | 0.0493 | 0.8173 | 0.22279 | 0.26427 | .0005 |
β1 CL | 1.66893 | 0.7679 | 0.7679 | 0.16212 | 0.73313 | < .0001 |
Negeri Sembilan Male | ||||||
Intercept | -4.06354 | | | 0.43282 | 0 | < .0001 |
β0 TL | 1.39323 | 0.8350 | 0.8350 | 0.40194 | 0.43347 | .0012 |
β2 TA | 0.88155 | 0.0327 | 0.8677 | 0.36830 | 0.28154 | .0210 |
β1 CL | 0.62696 | 0.0152 | 0.8829 | 0.24790 | 0.26903 | .0151 |
Negeri Sembilan Female | ||||||
Intercept | -3.95486 | | | 0.18822 | 0 | < .0001 |
β0 TL | 2.53042 | 0.8644 | 0.8644 | 0.13915 | 0.80581 | < .0001 |
β1 CL | 0.44737 | 0.0168 | 0.8812 | 0.11062 | 0.17919 | < .0001 |
Johor Male | ||||||
Intercept | -6.11865 | | | 0.28776 | 0 | |
β0 TL | 4.28833 | 0.9209 | 0.9209 | 0.29079 | 1.27451 | < .0001 |
β1 CL | -1.11848 | 0.0193 | 0.9403 | 0.28086 | -0.34418 | .0002 |
Johor Female | ||||||
Intercept | -3.50722 | | | 0.67490 | 0 | < .0001 |
β0 TL | 2.63774 | 0.7458 | 0.7458 | 0.36293 | 0.86362 | < .0001 |
R² = coefficient of determination; SE = standard error; Pr = probability.
Table 3: Components of the stepwise multiple regression equations for estimating WT for unsexed progeny groups.
Constants | Value | Partial R² | Model R² | SE | Standardized Estimate | Pr > F |
PP×KJ | ||||||
Intercept | -5.04386 | | | 0.37201 | | < .0001 |
β0 TL | 2.05612 | 0.8295 | 0.8295 | 0.30709 | 0.62671 | < .0001 |
β2 TA | 1.12735 | 0.0429 | 0.8723 | 0.30020 | 0.35149 | .0005 |
KP×NJ | ||||||
Intercept | -4.39223 | | | 0.40006 | | < .0001 |
β0 TL | 2.18260 | 0.7903 | 0.7903 | 0.25600 | 0.67700 | < .0001 |
β1 CL | 0.73534 | 0.0559 | 0.8462 | 0.18388 | 0.31755 | .0002 |
KK×PJ | ||||||
Intercept | -5.90934 | | | 0.60435 | | < .0001 |
β0 TL | 3.44949 | 0.8177 | 0.8177 | 0.29250 | 0.90429 | < .0001 |
NJ×PP | ||||||
Intercept | -4.27239 | | | 0.43769 | | < .0001 |
β0 TL | 2.28870 | 0.7889 | 0.7889 | 0.26185 | 0.74984 | < .0001 |
β1 CL | 0.50120 | 0.0368 | 0.8257 | 0.18178 | 0.23653 | .0091 |
PP×KN | ||||||
Intercept | -4.30191 | | | 0.38637 | | < .0001 |
β0 TL | 2.29108 | 0.8843 | 0.8843 | 0.31178 | 0.75255 | < .0001 |
β1 CL | 0.51889 | 0.0127 | 0.8971 | 0.24249 | 0.21914 | .0390 |
PJ×KK | ||||||
Intercept | -5.14127 | | | 0.46667 | | < .0001 |
β0 TL | 2.70878 | 0.8918 | 0.8918 | 0.33733 | 0.76640 | < .0001 |
β1 CL | 0.50594 | 0.0144 | 0.9062 | 0.22505 | 0.21456 | .0314 |
KN×JJ | ||||||
Intercept | -5.33628 | | | 0.26267 | | < .0001 |
β0 TL | 2.52682 | 0.9225 | 0.9225 | 0.24601 | 0.78224 | < .0001 |
β2 TA | 0.73820 | 0.0114 | 0.9339 | 0.27047 | 0.20787 | .0092 |
JJ×KP | ||||||
Intercept | -4.42795 | | | 0.38059 | | < .0001 |
β0 TL | 1.16095 | 0.0348 | 0.8635 | 0.39452 | 0.41375 | .0058 |
β2 TA | 1.81996 | 0.8287 | 0.8287 | 0.47300 | 0.54099 | .0005 |
J = Johor, N = Negeri Sembilan, K = Kedah and P = Perak; R² = coefficient of determination; SE = standard error; Pr = probability.
DISCUSSION
Naturally, growth in the giant Malaysian prawn (GMP) proceeds in a leap-frog manner, with almost complete cessation before each upcoming moult, and its pace decreases with age, expressed as a drop in the slope of the growth curve [5,14]. Growth is reflected in proportional increments of weight and length measurements. Hence, in this study, results of weight-length relationships in Macrobrachium rosenbergii obtained by stepwise multiple linear regression are presented and discussed.
To our knowledge, this is the first report applying a multiple approach to weight-length relationship data of shrimps. As the available literature reports only simple relationships, we could compare only the first predictors in our equations with those results.
Notably, the regression equations for the mixed-sex groups showed high coefficients of determination (R²). This is in apparent contrast to the known skewed growth of the species, but can be explained by the fact that, of the three known male morphotypes, only blue claw (BC) males were collected, because the aim was to use them in a breeding program. The small male morphotype seems to be the one causing most of the skewness in grow-outs of the Malaysian prawn, so excluding it made the skewness disappear. Furthermore, small males are equally fertile but, being socially submissive, are unlikely to get a chance to mate with berried females in the presence of their dominant counterparts; hence excluding them is better. Growth in females is quite homogeneous, so adding them to the BC males did not distort the data; on the contrary, it increased the sample size and resulted in higher coefficients of determination (R²) than the equations obtained for separate sexes (Table 2), meaning that it increased the explained variance.
As reported in the Results, TL was the first predictor in all progeny groups except one (87.5%), in none of the unsexed parental groups, and in all the males plus 50% of the females when the parental groups were sexed, whereas CL was the first predictor in 75% of the unsexed parental groups and in 50% of the females when sexed. Equations with TL as the first predictor are in line with the reports of [2,6,12], while equations with CL as the first predictor coincide with the results reported by Kuun et al. [3].
TL, representing the whole body length of the animal, is expected to be the first predictor of weight, but its ratio may be affected by rostrum breakage [7]. In the parental samples, much fighting would have occurred to form a new social hierarchy, because the animals were collected from different, hierarchically formed groups. This aggression would have resulted in breakage of their rostrums, reducing their total body lengths. This is not expected to happen in the progeny groups, which had been living in one pond for a long time and had a well-developed hierarchy (Table 3).
In addition, obtaining different equations for the parental ecotypes (different watersheds) and the progeny groups reveals the effects on these relationships of life stage, as well as of genetics through differences in prawn sources. This is in line with results reported by Primavera et al. [9] and Lalrinsanga et al. [7].
In conclusion, as multiple regression can include all effective predictor variables at a time, it best fits analyses of weight-length relationship data. Consequently, we recommend this approach for studying such relationships in shrimps in general and in other aquatic species as well.
ACKNOWLEDGEMENTS
We would like to thank MOSTI ABI funding ABI 53-02-03-1030 (2008-2011) for the breeding work and for the GRA internship awarded to Mr. Mohamed Omer Elsheikh, and HIR-MOHE funding project H-23001-G000006 by University Malaya, awarded to Assoc. Prof. Dr. Subha Bhassu, for allowing us to conduct this research. We also thank the National Prawn Fry Production and Research Centre, FRI, Kedah, Malaysia, for providing the infrastructure for the brood stock management program, which was financed by MOSTI ABI funding.