Artificial neural networks as auxiliary tools for the improvement of bean plant architecture
Published: June 29, 2017
Genet. Mol. Res. 16(2): gmr16029500
DOI: 10.4238/gmr16029500
Abstract
Classification using a visual scoring scale is a strategy used to select erect bean plants in order to improve bean plant architecture. Use of morphological traits associated with the phenotypic expression of bean architecture in classification procedures may enhance selection. The objective of this study was to evaluate the potential of artificial neural networks (ANNs) as auxiliary tools in the improvement of bean plant architecture. Data from 19 lines were evaluated for 22 traits, in 2007 and 2009 winter crops. Hypocotyl diameter and plant height were selected for analysis through ANNs. For classification purposes, these lines were separated into two groups, determined by the plant architecture scores. The predictive ability of ANNs was evaluated according to two scenarios to predict the plant architecture: training with 2007 data and validating in 2009 data (scenario 1), and vice versa (scenario 2). For this, ANNs were trained and validated using data from replicates of the evaluated lines for hypocotyl diameter individually, or together with the mean height of plants in the plot. In each scenario, the use of data from replicates or line means was evaluated for prediction through previously trained and validated ANNs. In both scenarios, ANNs based on hypocotyl diameter and mean height of plants were superior, since the error rates obtained were lower than those obtained using hypocotyl diameter only. Lower apparent error rates were verified in both scenarios for prediction when data on the means of the evaluated traits were submitted to better trained and validated ANNs.
Introduction
Common bean (Phaseolus vulgaris L.) is a staple food of the Brazilian population, and is a relevant source of protein, iron, and carbohydrates. Moreover, it is an agricultural product of high social and economic importance since it is cultivated in large areas, generating jobs throughout the crop cycle (Borém and Carneiro, 2015).
Plant architecture is a trait that has received great attention in bean breeding programs (Mendes et al., 2009), since the cultivation of erect plants allows mechanized harvesting with minor losses, facilitates crop management practices, reduces the incidence of some diseases (Coyne, 1980; Pires et al., 2014), and makes it possible to obtain better quality grain. Bean plant architecture is a trait governed by several genes and is influenced greatly by the environment (Teixeira et al., 1999; Basset, 2004; Moreto et al., 2007), which hinders the work of breeders.
Greater accuracy in the selection of erect plants requires information regarding other morphological traits involved in the phenotypic expression of bean plant architecture, such as growth habit, length of the main stem, number of internodes, internode length, plant height, number of branches, branch angles, distribution of pods, and hypocotyl diameter (Teixeira et al., 1999; Kelly, 2001).
Scoring scales are a strategy widely used in breeding programs to evaluate bean plant architecture (Collicchio et al., 1997; Teixeira et al., 1999; de Menezes Júnior et al., 2008). However, there are some issues regarding this practice, such as the efficacy of evaluation using a visual criterion, and the range of scores assigned by different evaluators. Furthermore, studies have shown that the evaluation of plant architecture using scores from individual plants is not very efficient (Teixeira et al., 1999).
Acquaah et al. (1991) and Moura et al. (2013) concluded that hypocotyl diameter, plant height, and branch angles are the main traits associated with bean plant architecture. Additionally, these traits permit more accurate evaluations when compared with evaluation by scores. Morphological traits associated with the phenotypic expression of bean architecture in discriminatory techniques can be used to increase accuracy in the selection of more erect plants.
Artificial neural networks (ANNs) are tools that have been highlighted for separating individuals into groups (Braga et al., 2011); these are computational techniques based on mathematical models (Nelson and Illingworth, 1991; Haykin, 2008), and their functioning is inspired by the human brain, in that they acquire knowledge by means of experience. ANNs are effective for predicting, recognizing patterns, and establishing clusterings (Haykin, 2008). In the agricultural field, they have been used to predict yield (Kaul et al., 2005), the behavior of diseases and pests (Batchelor et al., 1997), and water retention in soil (Schaap and Bouten, 1996), among other traits. In breeding, ANNs have been used in studies on genetic diversity (Barbosa et al., 2011), prediction of genetic values (Silva et al., 2014), and analyses of adaptability and stability (Barroso et al., 2013; Nascimento et al., 2013).
One of the main attributes of ANNs is their nonlinear structure, which allows more complex properties of data to be captured (Galvão et al., 1999). They also stand out as they do not require detailed information on the physical processes of the systems to be modeled (Sudheer et al., 2003). As a classification method, ANNs have some advantages, such as being non-parametric (Kavzoglu and Mather, 2003) and being tolerant to data loss (Bishop, 1995). Thus, the objective of this study was to evaluate the potential of ANNs as auxiliary tools in the improvement of bean plant architecture.
Materials and Methods
Data on 36 bean lines evaluated in the 2007 and 2009 winter crops were used in this study. Data regarding 22 agronomic traits were obtained in experiments carried out in the experimental field of the Department of Plant Science of the Federal University of Viçosa (UFV) in the municipality of Coimbra, Minas Gerais (20°51'24"S, 42°48'10"W, at 720 m asl).
The experiment carried out in the 2007 winter crop consisted of a randomized block design with three replicates, and plots consisted of three 3-m rows, spaced 0.5 m apart. The experiment carried out in the 2009 winter crop had a similar design, except that plots with four rows were used. Sixteen holes per meter were used, with three seeds per hole and subsequent thinning, leaving two plants per hole. The cultivation treatments adopted followed the recommendation for bean crops in the region (Vieira et al., 2015).
The characteristics evaluated were: days to flowering; days until harvest; score of plant architecture at flowering and at harvest; mean plant height within the plot at flowering and at harvest; grain yield; height of insertion of the first pod, measured in the field and after harvest; branch angles; number of pods in the main stem; number of pods in the branches; epicotyl diameter; hypocotyl diameter; total number of branches; number of aborted branches; number of internodes in the main stem; number of internodes in branches with pods; length of the first four internodes of the main stem; total length of internodes, number of grains per pod; and weight of 100 grains.
The number of days to flowering comprised the period between emergence and flowering of 50% of the plants within the plot. Days until harvest refer to the period from emergence to harvest.
In relation to the architecture, plants were evaluated at physiological maturity and close to harvest, considering the center rows of the plot and using a scoring scale from 1 to 5 as proposed by Collicchio et al. (1997), in which: 1 refers to plant type II, erect, with one stem, and with high insertion of the first pod; 2 refers to plant type II, erect, with some branches; 3 refers to plant type II or III, with many branches and a tendency to prostrate; 4 refers to plant type III, semi-erect or semi-prostrate; and 5 refers to plant type III, with long internodes and very prostrate.
Mean plant height within the plot was measured in centimeters from the ground level to the insertion of the last leaf, considering three representative points in the plot, both at flowering and at harvest.
The two lateral rows were harvested in each plot in order to obtain grain yield in the 2007 harvest. In the 2009 harvest, one of the central rows was used to evaluate grain yield, while the other was used to measure other traits after harvest. The height of insertion of the first pod was measured in the field and after harvest, from the ground level to the height of insertion of the first pod in the raceme. The measurement of the height of insertion of the first pod after harvest differed from the measurement carried out in the field, since the plant was maintained erect. Branch angles were measured with a semicircular protractor graduated from 0 to 180°, and the three branches following those with primary leaves were considered.
The mean number of pods in the main stem, the mean number of pods in the branches, and the mean number of aborted branches were determined from nine representative plants within the plot. The hypocotyl and epicotyl diameters (mm) were measured using a digital caliper. The epicotyl diameter was taken 1 cm above the cotyledonary node, and the hypocotyl diameter was taken 1 cm below the same node. To determine the number of internodes in the main stem and the number of internodes in branches, only branches/racemes with pods were considered. The length of the first four internodes of the main stem was measured in cm, starting from the cotyledonary node. The number of grains per pod was obtained from nine representative plants within each plot.
In order to identify the main traits determining bean plant architecture, multiple regression analyses with a stepwise option were carried out, aiming to select variables for the model, which originally included 22 traits measured in the lines in both evaluations (2007 and 2009 winter crops). Multiple regression analysis with a stepwise selection strategy was carried out with the aid of the GENES software (Cruz, 2013).
After obtaining the necessary information, data on the variables of interest were subjected to individual analysis of variance for each year, according to the randomized block model. Then, a joint analysis of variance was carried out across the years. For all analyses, the effects were all considered as fixed, with the exception of the error.
For analysis with ANNs, data on the lines in each replication were used in order to increase the sample size. Lines were allocated into two groups established by their plant architecture scores. The first group was composed of lines with scores up to 2.5, and the second was composed of lines with scores greater than 2.5. Lines allocated to different groups in the replications and/or years were not considered in the analysis. Thus, 19 of the 36 lines evaluated in 2007 and 2009 were used in the analysis with ANNs (Table 1), totaling 57 observations per evaluation year, since data from each replication were used for ANN training and validation.
Line | Commercial group | Group (2007/2009) |
---|---|---|
Meia Noite | Black | 1 |
BRS Supremo | Black | 1 |
CNFC8006 | Carioca | 1 |
CNFC9454 | Carioca | 1 |
A 805 | Carioca | 1 |
IAPAR 44 | Black | 1 |
TB 94-01 | Black | 1 |
A 170 | Mulatinho | 1 |
A 525 | Mulatinho | 1 |
IPA 6 | Mulatinho | 2 |
VC 3 | Carioca | 2 |
Carioca 1030 | Carioca | 2 |
BRS Perola | Carioca | 2 |
BRSMG Talismã | Carioca | 2 |
BRSMG Majestoso | Carioca | 2 |
Ouro Vermelho | Red | 2 |
Vermelhinho | Red | 2 |
Ouro Negro | Black | 2 |
1840 4 PS | Black | 2 |
Table 1: Registration name, description of grain type (commercial group), and classification of 19 lines of the Bean Active Germplasm Bank of Universidade Federal de Viçosa.
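The grouping rule described before Table 1 can be illustrated with a short sketch. The function names and the example scores are hypothetical and serve only to show the rule (scores up to 2.5 go to group 1, higher scores to group 2, and lines that change group across replications or years are dropped); they are not data from the study.

```python
THRESHOLD = 2.5  # score cut-off between group 1 and group 2, as stated in the text


def assign_group(score, threshold=THRESHOLD):
    """Return 1 for scores up to the threshold and 2 for higher scores."""
    return 1 if score <= threshold else 2


def consistent_lines(scores_by_line):
    """Keep only lines whose plots all fall in the same group.

    scores_by_line maps a line name to its plot scores across replications
    and years. The result maps each retained line to its group.
    """
    kept = {}
    for line, scores in scores_by_line.items():
        groups = {assign_group(s) for s in scores}
        if len(groups) == 1:
            kept[line] = groups.pop()
    return kept


# Hypothetical scores (three replications in each of two years), illustration only.
example = {
    "line_A": [1.5, 2.0, 2.0, 1.0, 2.5, 2.0],
    "line_B": [3.5, 4.0, 3.0, 4.5, 4.0, 3.5],
    "line_C": [2.0, 3.0, 2.5, 2.0, 3.5, 2.0],  # switches groups, so it is dropped
}
print(consistent_lines(example))  # {'line_A': 1, 'line_B': 2}
```

Applied to the 36 evaluated lines, a filter of this kind would leave the 19 lines listed in Table 1.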
ANN analyses were carried out under the following scenarios:
Scenario 1: In this scenario, the ability of ANNs to predict the architecture of lines in 2009 was evaluated, with ANNs based on data obtained in 2007 for hypocotyl diameter (HD), individually or in conjunction with mean plant height within the plot (PH). In the ANN training, data related to replications of the 2007 experiment were subjected to the expansion process, as mentioned below, and information of 300 genotypes per group with the same properties (mean, variance, and covariance) of the original lines was obtained. Replication data (57 observations) used in the expansion process was used for validation, and prediction was carried out with data from individual replicates (57 observations) and with the means of replicates (19 observations) from the 2009 experiment, as follows: 2007 harvest - Training and validation; 2009 harvest - Prediction (57 observations - replication data); 2007 harvest - Training and validation; 2009 harvest - Prediction (19 observations - mean of replication data).
Scenario 2: In this scenario, the ability of ANNs to predict the architecture of lines in 2007 was evaluated, with ANNs based on data obtained in 2009 for HD, individually or in conjunction with PH. In the ANN training, data related to replications of the 2009 experiment were subjected to the expansion process, and information was obtained on 300 genotypes per group with the same properties (mean, variance, and covariance) of the original lines. Validation occurred with the replicate data (57 observations) used in the expansion process, and predictions were made with data on individual replicates (57 observations), and the mean of the replicate data (19 observations) of the 2007 experiment, as follows: 2009 harvest - Training and validation; 2007 harvest - Prediction (57 observations - replication data); 2009 harvest - Training and validation; 2007 harvest - Prediction (19 observations - mean of replication data).
In both scenarios, apparent error rates (AER) for ANNs training, validation, and prediction were estimated. AER was determined as the percentage of misclassification, considering the allocation groups of the lines. The apparent error rates per group for validation and for predictions for both scenarios were also estimated.
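As a minimal sketch of the AER definition above, the overall and per-group rates can be computed from the true and predicted groups. The function names are hypothetical, and the predictions below are invented for illustration; only the group sizes (9 lines in group 1 and 10 in group 2) come from Table 1.

```python
def apparent_error_rate(true_groups, predicted_groups):
    """Percentage of observations allocated to the wrong group."""
    if len(true_groups) != len(predicted_groups):
        raise ValueError("inputs must have the same length")
    wrong = sum(t != p for t, p in zip(true_groups, predicted_groups))
    return 100.0 * wrong / len(true_groups)


def error_rate_per_group(true_groups, predicted_groups):
    """Percentage of misallocated observations within each true group."""
    rates = {}
    for group in sorted(set(true_groups)):
        preds = [p for t, p in zip(true_groups, predicted_groups) if t == group]
        rates[group] = 100.0 * sum(p != group for p in preds) / len(preds)
    return rates


# 19 line means: 9 lines in group 1 and 10 in group 2 (Table 1).
true_groups = [1] * 9 + [2] * 10
# Hypothetical predictions: three group-2 lines wrongly allocated to group 1.
predicted = [1] * 9 + [1] * 3 + [2] * 7

print(round(apparent_error_rate(true_groups, predicted), 2))  # 15.79
print(error_rate_per_group(true_groups, predicted))  # {1: 0.0, 2: 30.0}
```

With 19 observations, each misclassified line adds 100/19, or about 5.26 percentage points, to the AER.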
Data expansion
For ANN training, 300 new observations per group were simulated from the data of each proposed scenario (Table 1). The data were expanded from the information of each group, based on the mean vector and the covariance matrix of the main traits determining plant architecture, as identified by multiple regression with a stepwise option for variable selection. These new datasets presented the same properties (mean, variance, and covariance) as the original datasets. The expansion process was carried out with the aid of the GENES software (Cruz, 2013).
For the purpose of network training, the initial dataset was expanded using the following process: the simulated values were taken as a random variable $Y \sim N(\phi, \Sigma)$; data were transformed into a random variable $Z \sim N(\phi, I)$ through the linear transformation $Z = F'Y$, in which $F$ was obtained by spectral decomposition of $\Sigma$, such that $\Sigma^{-1} = FF'$. The expansion process consisted of the simulation of new values of $Y$ through $Y = (F')^{-1}Z$. That is, for the simulation of new information the reverse process was applied: independent normal variables were generated, and data with the desired covariance matrix were obtained by the reverse transformation.
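The reverse transformation can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the GENES implementation: the function name `expand_group` and the example data are hypothetical, the standard normal values come from NumPy's generator, and the simulated sample matches the original mean and covariance only in expectation.

```python
import numpy as np


def expand_group(data, n_new=300, seed=0):
    """Simulate n_new observations from the sample mean and covariance of data.

    data has shape (n_observations, n_traits) and holds one group.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    # Spectral decomposition of the covariance matrix: cov = V diag(w) V'.
    w, v = np.linalg.eigh(cov)
    # A = V diag(sqrt(w)) satisfies A A' = cov; it plays the role of (F')^-1.
    a = v * np.sqrt(np.clip(w, 0.0, None))
    # Independent standard normal values, one row per simulated observation.
    z = rng.standard_normal((n_new, data.shape[1]))
    # Reverse transformation: impose the covariance, then add the mean.
    return mean + z @ a.T


# Hypothetical group of 27 plots with two traits (HD and PH), illustration only.
rng = np.random.default_rng(1)
group = rng.normal(loc=[0.5, 35.0], scale=[0.05, 5.0], size=(27, 2))
new = expand_group(group)
print(new.shape)  # (300, 2)
```

Running the function once per group gives the 300 simulated observations per group used for training.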
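The Box-Muller transform was used to simulate the independent normal values; $U_1$ and $U_2$ are independent values generated from the uniform distribution between 0 and 1. Thus, $Z_1 = \sqrt{-2 \ln U_1}\,\cos(2\pi U_2)$ and $Z_2 = \sqrt{-2 \ln U_1}\,\sin(2\pi U_2)$ are independent standard normal values.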
Artificial neural networks (ANNs)
Data from the 2007 and 2009 experiments were subjected to ANN analysis, carried out with the aid of the MATLAB software (MATLAB, 2016). For ANN training, 600 simulated observations from the expansion process were used (300 for each group), considering the multilayer perceptron architecture with the following topology settings:
a) Number of hidden layers: three hidden layers were considered.
b) Number of neurons: combinations of between three and 12 neurons were considered for each hidden layer.
c) Activation function: a linear activation function was adopted for the output layer. For the hidden layers, the linear, logistic, and hyperbolic tangent functions were tested to establish the best architecture.
d) Number of training cycles: 5000 iterations were used. The number of iterations was limited so that training did not become excessive, which could lead to a loss of generalizability.
e) Training function: trainbr, a backpropagation training function that updates weight and bias values based on Levenberg-Marquardt optimization. It minimizes a combination of squared errors and squared weights and determines the combination that produces a network with good generalizability, a process known as Bayesian regularization.
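To make the topology concrete, the following NumPy sketch runs a forward pass through a multilayer perceptron with three hidden layers and a linear output neuron. It is only an illustration: the weights are random rather than trained, the Bayesian regularization performed by trainbr is not reproduced, and `init_network`, `forward`, and the input values are hypothetical. The activation names follow the MATLAB functions logsig, tansig, and purelin.

```python
import numpy as np


def logsig(x):
    """Logistic activation."""
    return 1.0 / (1.0 + np.exp(-x))


def tansig(x):
    """Hyperbolic tangent activation."""
    return np.tanh(x)


def purelin(x):
    """Linear (identity) activation."""
    return x


ACTIVATIONS = {"logsig": logsig, "tansig": tansig, "purelin": purelin}


def init_network(n_inputs, hidden_sizes, seed=0):
    """Random weights for the hidden layers plus one linear output neuron."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs, *hidden_sizes, 1]
    return [(rng.normal(size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(x, layers, hidden_activations):
    """Propagate inputs through the hidden layers and the linear output layer."""
    out = np.asarray(x, dtype=float)
    for (w, b), name in zip(layers[:-1], hidden_activations):
        out = ACTIVATIONS[name](out @ w + b)
    w, b = layers[-1]
    return purelin(out @ w + b).ravel()


# Two inputs (HD and PH) and three hidden layers of three neurons each.
layers = init_network(2, (3, 3, 3))
x = np.array([[0.48, 37.5], [0.56, 28.5]])  # hypothetical HD and PH values
scores = forward(x, layers, ("logsig", "purelin", "purelin"))
print(scores.shape)  # (2,)
```

In a trained network, the output value of each observation would then be mapped to group 1 or group 2; how that mapping was done is not described in the text, so it is left out here.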
Results
Based on multiple regression analysis with a stepwise option for variable selection, the HD, PH, and the mean branch angles (MBA) were the traits determining bean plant architecture (ARC). In the analyses with ANNs, the HD was used individually or in conjunction with plant height within the plot under each of the proposed scenarios, aiming to predict plant architecture.
Table 2 summarizes individual analysis of variance regarding ARC, PH, and HD in 19 bean lines in the 2007 and 2009 winter crops. The coefficient of experimental variation (CVe) for the 2007 and 2009 experiments was below 20% for most traits, indicating good experimental precision (Pimentel Gomes, 1985). The CVe values are consistent with those reported in similar experiments with bean crops (Moura et al., 2013; Poersch, 2013).
SV | d.f. | ARC (2007) | PH (2007) | HD (2007) | ARC (2009) | PH (2009) | HD (2009) |
---|---|---|---|---|---|---|---|
Lines | 18 | 13.87** | 173.79** | 0.01** | 16.14** | 371.31** | 0.02** |
CVe (%) | | 11.99 | 11.50 | 7.52 | 10.80 | 22.39 | 6.89 |
h2 (%) | | 97.24 | 89.26 | 88.05 | 98.01 | 89.05 | 92.37 |
Mean | | 5.16 | 37.55 | 0.48 | 5.25 | 28.47 | 0.56 |
Entries in the Lines row are mean squares. **Significant at the 1% level as determined by the F-test (P < 0.01); CVe = coefficient of experimental variation; h2 = coefficient of genotypic determination.
Table 2: Summary of individual analysis of variance for the plant architecture traits (ARC), plant height within the plot (PH) and hypocotyl diameter (HD), evaluated in 19 bean lines in 2007 and 2009.
Significant effects were observed (P < 0.01) for the effect of lines in both experiments (Table 2), indicating the existence of genetic variability between lines for the three traits evaluated in both years. Coefficients of genotypic determination (h2) of ARC, PH, and HD were of high magnitude for both experiments.
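The CVe and h2 values in Table 2 can be cross-checked against each other. The sketch below assumes the usual definitions, which the text does not state explicitly: h2 = (MS lines - MS residual) / MS lines and CVe = 100 x sqrt(MS residual) / mean. Because the residual mean square is not listed in the table, it is back-calculated from h2; the function names are hypothetical.

```python
import math


def residual_ms_from_h2(ms_lines, h2_percent):
    """Residual mean square implied by h2 = (MS_lines - MS_res) / MS_lines."""
    return ms_lines * (1.0 - h2_percent / 100.0)


def cve_percent(residual_ms, trait_mean):
    """Coefficient of experimental variation, in percent."""
    return 100.0 * math.sqrt(residual_ms) / trait_mean


# Plant architecture (ARC) in 2007, values taken from Table 2.
ms_res = residual_ms_from_h2(13.87, 97.24)
print(round(cve_percent(ms_res, 5.16), 2))  # 11.99
```

Under these assumed definitions, the result agrees with the CVe of 11.99% reported for ARC in 2007, and the same check gives values within rounding of the reported CVe for PH in both years.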
Table 3 summarizes the joint analysis for ARC, PH, and HD, which were evaluated in both 2007 and 2009. There was a significant effect (P < 0.01) of lines on ARC, PH, and HD. In relation to the source of variation environments (in this case, years), significant effects were observed (P < 0.01) for PH and HD. There was a significant effect (P < 0.05) of the lines x environment interaction for ARC and HD. Based on the methodology described by Cruz and Castoldi (1991), these interactions were observed to be simple, and the estimates of the complex fractions were 19.04 and 30.44% for ARC and HD, respectively.
SV | d.f. | ARC | PH | HD |
---|---|---|---|---|
Lines | 18 | 29.34** | 480.37** | 0.03** |
Environments | 1 | 0.22ns | 2349.54** | 0.20** |
LxE | 18 | 0.67* | 64.73* | 0.0023ns |
CVe (%) | | 11.40 | 16.50 | 7.18 |
h2 (%) | | 98.80 | 93.83 | 95.04 |
Mean | | 5.20 | 33.01 | 0.51 |
Entries in the Lines, Environments, and LxE rows are mean squares. **,*Significant at the 1 and 5% level, respectively, as determined by the F test; ns = not significant; CVe = coefficient of experimental variation; h2 = coefficient of genotypic determination.
Table 3: Summary of joint analysis of variance of the traits plant architecture (ARC), plant height within the plot (PH), and hypocotyl diameter (HD), evaluated in 19 bean lines in 2007 and 2009.
Table 4 presents the results obtained by ANNs in scenarios 1 and 2, using the HD individually or in conjunction with PH. For scenario 1, a higher AER was observed for training, validation, and prediction with the ANN based on HD alone. A lower AER was also observed in the prediction with mean data (15.79 and 5.26%) than with replication data (24.56 and 14.04%) for the ANNs using HD individually and in conjunction with PH, respectively.
Scenario | Procedure | Training | Validation | Prediction1 | Prediction2 |
---|---|---|---|---|---|
1 | ANN (HD) | 10.33 | 15.79 | 24.56 | 15.79 |
1 | ANN (HD + PH) | 6.83 | 12.28 | 14.04 | 5.26 |
2 | ANN (HD) | 12.83 | 14.04 | 33.33 | 15.79 |
2 | ANN (HD + PH) | 1.83 | 3.51 | 14.04 | 0.00 |
1Prediction of replication data (57 observations); 2prediction of means data (19 observations).
Table 4: Apparent error rate (%) obtained in scenarios 1 and 2, using the hypocotyl diameter (HD) individually or in conjunction with plant height (PH).
In scenario 2, there was also a higher AER for training, validation, and prediction for the ANN based on HD. When comparing the predictions, subjecting the replication data of lines to prediction, the AER of ANNs was found to be higher. By using data on the means, considering HD + PH, ANN presented an AER of 0.00%, and was able to correctly classify all of the lines into their respective groups (Table 4).
Considering the classification of lines in relation to the groups (Table 5), in the validation of ANNs using the HD and PH for scenario 1, there was a higher percentage of correctness (81.48%) in the allocation of lines of group 1 than when using only the HD (77.78%). A similar result was observed for the classification of lines in group 2, with 93.33% correctness when considering HD and PH, against 90.00% when using HD individually.
Scenario | Group | Validation, group 1 | Validation, group 2 | PredictionA, group 1 | PredictionA, group 2 | PredictionB, group 1 | PredictionB, group 2 |
---|---|---|---|---|---|---|---|
1 | 1 | 77.78C (81.48)D | 22.22 (18.52) | 100.00 (100.00) | 0.00 (0.00) | 100.00 (100.00) | 0.00 (0.00) |
1 | 2 | 10.00 (6.67) | 90.00 (93.33) | 46.67 (26.67) | 53.33 (73.33) | 30.00 (10.00) | 70.00 (90.00) |
2 | 1 | 92.59 (92.59) | 7.41 (7.41) | 29.63 (85.19) | 70.37 (14.82) | 77.78 (100.00) | 22.22 (0.00) |
2 | 2 | 20.00 (0.00) | 80.00 (100.00) | 0.00 (13.33) | 100.00 (86.67) | 10.00 (0.00) | 90.00 (100.00) |
APrediction of replication data (57 observations, 2009); Bprediction of means data (19 observations, 2009); Cvalues without brackets refer to the % of classification considering HD; Dvalues in brackets refer to the % of classification considering HD + PH.
Table 5: Percentage of classified bean lines in the groups for plant architecture (ARC), scenarios 1 and 2, using hypocotyl diameter (HD) individually or in conjunction with plant height (PH).
By subjecting replication data of the lines evaluated in 2009 to prediction, the ANNs using HD individually or in conjunction with PH correctly allocated all lines of group 1. For group 2, the ANNs based on HD and PH were superior, with 73.33% accuracy, while 53.33% correctness was observed when considering HD individually.
When subjecting the mean data of the lines evaluated in 2009 to prediction, ANNs using HD individually or in conjunction with PH also correctly allocated all lines of group 1. For group 2, ANNs based on HD and PH were superior, with 90.00% correctness, while 70.00% correctness was obtained when considering HD individually.
When comparing the predictions, subjecting the replication data or mean data of lines evaluated in 2009 to prediction, ANNs based on HD correctly allocated all the lines of group 1. However, for group 2, prediction based on mean data was superior, with 70.00% correctness against 53.33% when using replication data. Based on HD in conjunction with PH, when subjecting the replication data or mean data to prediction, ANNs also correctly allocated all lines of group 1. Again, predictions for group 2 based on the mean data were superior, with 90.00% correctness against 73.33% when using the replication data.
For scenario 2, the validation of ANNs using HD individually or in conjunction with PH revealed the same level of correctness (92.59%) in the allocation of group 1 lines (Table 5). For group 2, the ANN based on HD and PH was superior, with 100.00% correctness against 80.00% when considering HD individually.
By subjecting the replication data of the lines to prediction, the ANN based on both traits had a higher level of correctness at 85.19%, against 29.63% of the ANN that considered HD individually for the classification of the lines of group 1. For group 2, the percentage of correctness was 100% for the ANN based on HD, and 86.67% for the ANN considering both traits. When means data were subjected to prediction, the ANN using HD in conjunction with PH had a higher percentage of correctness, at 100.00%, against 77.78% for the ANN when considering HD individually for the classification of the lines of group 1. For group 2, the ANN based on HD and PH was superior, with 100.00% correctness, against 90.00% when considering HD individually.
When comparing the predictions for the ANNs based on HD, subjecting the mean data of the lines to prediction was superior for group 1, with 77.78% of the lines correctly allocated, compared with 29.63% when replication data were subjected to prediction. For group 2, prediction based on replication data correctly allocated all the lines of this group, i.e., 100.00% correctness against 90.00% when mean data were subjected to prediction. Based on the HD in conjunction with PH, predictions with the mean data were superior, since they correctly allocated all lines of group 1 (100.00% correctness), against 85.19% when subjecting replication data to prediction. Predictions based on the mean data were also superior in the allocation of lines of group 2, with 100.00% correctness, against 86.67% when using replication data.
Table 6 shows the topologies, considering the multilayer perceptron architecture, the number of neurons, and the activation function in the hidden layers of the ANN that presented low AER in the validation using the HD individually or in conjunction with PH. ANNs based on HD individually presented more complex topology in scenario 2, since they required a larger number of neurons per hidden layer, while for the ANN based on HD and PH, the same number of neurons were observed in both scenarios.
The topologies of ANNs based on HD used more complex activation functions, such as the logistic and hyperbolic tangent functions, than ANNs based on HD and PH in both scenarios (Table 6). ANNs based on HD presented architectures of similar complexity in both scenarios, since they required activation functions of the same complexity. For ANNs based on HD and PH, topologies of the same complexity were also observed in both scenarios, since linear activation functions predominated.
SN | HD: neurons O1 | HD: neurons O2 | HD: neurons O3 | HD: activation O1 | HD: activation O2 | HD: activation O3 | HD + PH: neurons O1 | HD + PH: neurons O2 | HD + PH: neurons O3 | HD + PH: activation O1 | HD + PH: activation O2 | HD + PH: activation O3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 3 | 3 | 3 | LogA | Log | Log | 3 | 3 | 3 | Log | PurC | Pur |
2 | 3 | 12 | 3 | TanB | Log | Tan | 3 | 3 | 3 | Pur | Pur | Pur |
ALogistic function: logsig; BHyperbolic tangent function: tansig; CLinear function: purelin.
Table 6: Topology of ANNs regarding the number of neurons and activation function in the hidden layers (O1, O2, and O3), using the hypocotyl diameter (HD) individually or in conjunction with plant height (PH) in scenarios (SN) 1 and 2.
Discussion
Greater accuracy in the selection of erect plants requires information on other morphological traits involved in the phenotypic expression of bean plant architecture. In this study, the determinant traits of plant architecture were observed to be HD, PH, and MBA. Similar results were reported previously by Acquaah et al. (1991), also using regression with a stepwise option for variable selection, and by Moura et al. (2013) based on path analysis.
Since bean plant architecture is a trait governed by several genes, and is influenced by the environment, the selection of more erect bean plants based on the evaluation by scores has low precision (Teixeira et al., 1999; Basset, 2004; Moreto et al., 2007). In this case, the indirect selection for plant architecture based on auxiliary traits has potential for bean breeders. The HD and the mean plant height within the plot stand out as auxiliary traits since they are easier to evaluate. Conversely, the mean branch angle is difficult to measure (Moura et al., 2013). The use of morphological traits associated with the phenotypic expression of bean architecture in discriminatory techniques will be effective if based on high-accuracy traits for the selection process and on easily measurable traits. In this sense, ANNs were based only on the hypocotyl diameter and on mean plant height within the plot.
Under both scenarios, ANNs based on HD and PH were superior to ANNs based on HD individually, since they presented low AER for the training, validation, and prediction stages. Furthermore, ANNs based on HD and PH presented an AER lower than 15% at all stages, which represented misclassification of only three of the 19 tested lines, demonstrating the high generalizability of ANNs, which was also reported by Braga et al. (2011).
In prediction, when using the mean data of lines, ANNs based on HD and PH were also superior to ANNs based on HD individually, since the AERs were much lower. In scenario 2, ANNs were able to correctly classify all lines in their respective groups using this kind of prediction.
In bean breeding programs aimed at the development of more erect plant architecture, plants with scores below 2.5 are usually selected, which correspond to the lines allocated in group 1. Therefore, considering the predictions, ANNs based on the traits HD and PH were found to be superior to those based on HD individually. This is because in scenario 1, both ANNs presented the same percentage of correctness for group 1, while the ANN based on both traits presented a higher percentage of correctness for lines of group 2. In scenario 2, ANNs based on both traits presented higher correctness for the lines of group 1 in both predictions, and also higher correctness for the lines of group 2 in the prediction using the means of the lines. In those cases, ANNs based on HD and PH presented percentages of correctness above 80%, confirming the result reported by Braga et al. (2011) for the high predictive ability of ANNs.
For both scenarios, predictions using mean data of the lines were superior, since in this kind of prediction, AERs were lower, and the correctness of line allocation was higher than when predicted using replication data. These results are consistent with the high environmental influence on bean plant architecture reported by other authors (Basset, 2004; Moreto et al., 2007). The higher accuracy observed using the mean data for HD and PH for prediction is due to the fact that the environmental effects tend to be canceled when means are used.
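The cancellation argument can be made explicit under a simple additive model, which is an assumption introduced here for illustration and is not stated in the text. Let $y_{ij}$ be the value of line $i$ in replicate $j$, with genotypic effect $g_i$ and independent residuals $e_{ij}$ of variance $\sigma_e^2$; block effects are ignored for simplicity.

```latex
y_{ij} = \mu + g_i + e_{ij},
\qquad
\bar{y}_{i\cdot} = \mu + g_i + \frac{1}{r}\sum_{j=1}^{r} e_{ij},
\qquad
\operatorname{Var}\left(\bar{y}_{i\cdot} \mid g_i\right) = \frac{\sigma_e^2}{r}
```

With $r = 3$ replicates, as in these experiments, the residual variance of a line mean would be one third of that of a single plot, while the genotypic signal $g_i$ is unchanged.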
When lines were evaluated using the scoring scale, 17 of the 36 lines presented contradictory scores for architecture in replications within the same experiment, and/or in different experiments. Considering these contradictions as evaluation errors, an error rate of 47.22% associated with the evaluation by scores for bean plant architecture was confirmed in the present study. This error rate was much higher than the apparent error rate of ANNs based on HD and PH, which highlights the potential use of ANNs for the improvement of bean plants aimed at obtaining more erect plant architecture. ANNs have been very effective for solving prediction problems, in pattern recognition, and in clusterings (Haykin, 2008), which are also problems found at different stages of breeding programs.
In conclusion, ANNs trained and validated with replication data of HD and PH within the plot are superior to ANNs that use HD only to predict bean plant architecture. The use of mean data for prediction with ANNs generates more reliable results regarding bean plant architecture. Smaller numbers of explanatory variables for training and validation require ANNs with more complex architectures.
Conflicts of interest
The authors declare no conflict of interest.
Acknowledgments
The authors thank Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Fundação de Pesquisa do Estado de Minas Gerais (FAPEMIG), and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) for financial support and scholarships.
About the Authors
Corresponding Author
V.Q. Carneiro
Departamento de Biologia Geral, Universidade Federal de Viçosa, Viçosa, MG, Brazil
Email: vqcarneiro@gmail.com
References
- Acquaah G, Adams MW and Kelly JD (1991). Identification of effective indicators of erect plant architecture in dry bean. Crop Sci. 31: 261-264. https://doi.org/10.2135/cropsci1991.0011183X003100020004x
- Barbosa CD, Viana AP, Silva S, Quintal R, et al. (2011). Artificial neural network analysis of genetic diversity in Carica papaya L. Crop Breed. Appl. Biotechnol. 11: 224-231. https://doi.org/10.1590/S1984-70332011000300004
- Barroso LMA, Nascimento M, Nascimento ACC, Silva FF, et al. (2013). Uso do método de Eberhart e Russell como informação a priori para aplicação de redes neurais artificiais e análise discriminante visando a classificação de genótipos de alfafa quanto à adaptabilidade e estabilidade. Rev. Bras. Biometria 31: 176-188.
- Basset MJ (2004). List of genes - Phaseolus vulgaris L. Annu. Rep. Bean Improv. Coop. 47: 1-24.
- Batchelor WD, Yang XB and Tschanz AT (1997). Development of a neural network for soybean rust epidemics. Trans. ASAE 40: 247-252. https://doi.org/10.13031/2013.21237
- Bishop CM (1995). Neural networks for pattern recognition. 1st edn. Clarendon Press, Oxford, Birmingham, UK.
- Borém A and Carneiro JE de S (2015). A cultura. In: Feijão - Do Plantio a Colheita. Editora UFV, Viçosa.
- Braga A de P, Carvalho ACPLF and Ludemir TB (2011). Redes Neurais Artificiais - Teoria e Aplicações. 2nd edn. LTC, Rio de Janeiro.
- Collicchio E, Ramalho MAP and Abreu ADFB (1997). Associação entre o porte da planta do feijoeiro e o tamanho dos grãos. Pesqui. Agropecu. Bras. 32: 297-304.
- Coyne DP (1980). Modification of plant architecture and crop yield by breeding. HortScience 15: 244-247.
- Cruz CD (2013). GENES - a software package for analysis in experimental statistics and quantitative genetics. Acta Sci. Agron. 35: 271-276. https://doi.org/10.4025/actasciagron.v35i3.21251
- Cruz CD and Castoldi FL (1991). Desempenho da interação genótipo x ambientes em partes simples e complexa. Rev. Ceres 38: 422-430.
- de Menezes Júnior JÂN, Ramalho MAP and De Abreu ÂFB (2008). Seleção recorrente para três caracteres do feijoeiro. Bragantia 67: 833-838. https://doi.org/10.1590/S0006-87052008000400004
- Galvão CO, Valença MJS, Vieira VPPB, Diniz LS, et al. (1999). Sistemas inteligentes: aplicações a recursos hídricos e ciências ambientais. Editora Universidade, Porto Alegre.
- Haykin S (2008). Neural Networks and Learning Machines. 3rd edn. Pearson - Prentice Hall, Hamilton.
- Kaul M, Hill RL and Walthall C (2005). Artificial neural networks for corn and soybean yield prediction. Agric. Syst. 85: 1-18. https://doi.org/10.1016/j.agsy.2004.07.009
- Kavzoglu T and Mather P (2003). The use of backpropagation artificial neural networks in land cover classification. Int. J. Remote Sens. 24: 4907-4938. https://doi.org/10.1080/0143116031000114851
- Kelly JD (2001). Remaking bean plant architecture for efficient production. Adv. Agron. 71: 109-143. https://doi.org/10.1016/S0065-2113(01)71013-9
- MATLAB (R2016a). Natick, Massachusetts: The MathWorks Inc., 2016.
- Mendes FF, Ramalho MAP and Abreu Â de FB (2009). Índice de seleção para escolha de populações segregantes de feijoeiro-comum. Pesqui. Agropecu. Bras. 44: 1312-1318. https://doi.org/10.1590/S0100-204X2009001000015
- Moreto AL, Ramalho MAP, Nunes JAR and Abreu Â de FB (2007). Estimação dos componentes da variância fenotípica em feijoeiro utilizando o método genealógico. Cienc. Agrotec. 31: 1035-1042.
- Moura MM, Carneiro PCS, Carneiro JE de S and Cruz CD (2013). Potencial de caracteres na avaliação da arquitetura de plantas de feijão. Pesqui. Agropecu. Bras. 48: 417-425. https://doi.org/10.1590/S0100-204X2013000400010
- Nascimento M, Peternelli LA, Cruz CD, Campana ACN, et al. (2013). Artificial neural networks for adaptability and stability evaluation in alfalfa genotypes. Crop Breed. Appl. Biotechnol. 13: 152-156. https://doi.org/10.1590/S1984-70332013000200008
- Nelson MM and Illingworth WT (1991). A Practical Guide to Neural Networks. Prentice Hall PTR, Reading.
- Pimentel Gomes F (1985). Curso de Estatística Experimental. 11th edn. Nobel, São Paulo.
- Pires LPM, Ramalho MAP, Abreu AFB and Ferreira MC (2014). Recurrent mass selection for upright plant architecture in common bean. Sci. Agric. 71: 240-243. https://doi.org/10.1590/S0103-90162014000300009
- Poersch NL (2013). Diâmetro do hipocótilo como caráter auxiliar no melhoramento da arquitetura do feijoeiro. Federal University of Viçosa, Viçosa. Available at [http://www.locus.ufv.br/bitstream/handle/123456789/1361/texto%20completo.pdf?sequence=1]. Accessed August 17, 2015.
- Schaap MG and Bouten W (1996). Modeling water retention curves of sandy soils using neural networks. Water Resour. Res. 32: 3033-3040. https://doi.org/10.1029/96WR02278
- Silva GN, Tomaz RS, Sant'Anna IC, Nascimento M, et al. (2014). Neural networks for predicting breeding values and genetic gains. Sci. Agric. 71: 494-498. https://doi.org/10.1590/0103-9016-2014-0057
- Sudheer KP, Gosain AK and Ramasastr KS (2003). Estimating actual evapotranspiration from limited climatic data using neural computing technique. J. Irrig. Drain. Eng. 129: 214-218. https://doi.org/10.1061/(ASCE)0733-9437(2003)129:3(214)
- Teixeira FF, Ramalho MAP and Abreu ÂFB (1999). Genetic control of plant architecture in the common bean (Phaseolus vulgaris L.). Genet. Mol. Biol. 22: 577-582. https://doi.org/10.1590/S1415-47571999000400019
- Vieira RF, Lima M, Neves JCL and Andrade MJB de (2015). Adubação. In: Feijão - Do Plantio a Colheita. Editora UFV, Viçosa.