
Identifying QTL-linked Markers in marker-deficient Crops

Subhash Chandra, Hutokshi K. Buhariwalla, J. Kashiwagi, S. Harikrishna, K. Rupa Sridevi, L. Krishnamurthy, Rachid Serraj and J.H. Crouch

International Crops Research Institute for the Semi-arid Tropics, Patancheru – 502 324, AP, India www.icrisat.org, Email s.chandra@cgiar.org

Abstract

The approach widely adopted to identify genetic markers linked to quantitative trait loci (QTLs) is based on a genetic map uniformly populated with 100-300 markers. In less studied crops, this may not be feasible because too few polymorphic markers are available. With only a few (say $m$) polymorphic markers available, we propose a multiple marker analysis approach to reliably identify important markers for use in marker-aided selection (MAS). Assuming a large mapping population is available, the approach applies the Bayesian information criterion to all possible $2^m$ regressions. We demonstrate the approach using data on 257 RILs genotyped for 14 microsatellite markers to identify markers linked to three drought-avoidance traits in chickpea.

Media Summary

In the absence of a genetic map, QTL-linked markers could be reliably identified based on a multiple marker analysis approach using Bayesian information criterion.

Key Words

QTL-linked markers, MAS, Bayesian information criterion, stepwise regression

Introduction

Molecular markers are now widely used for genetic dissection of quantitative traits of agro-economic importance. A primary objective is to identify markers linked to quantitative trait loci (QTLs) for subsequent use in marker-aided selection (MAS). A widely used approach is to construct (or use available) genetic maps to detect, map and estimate the effects of QTLs. This map-based approach, using appropriate interval-based statistical methods, makes it possible to choose markers that show minimum recombination with the QTL and to resolve linked QTLs. In less studied crops, however, an adequate number of markers may not initially be available to construct a genetic map, which precludes map-based identification of important markers. It is nevertheless still important to use the available information to identify important markers for possible use in MAS. Using data on 257 RILs genotyped for 14 polymorphic SSR markers, we propose a map-free multiple marker analysis approach for reliable identification of markers associated with three drought-avoidance traits in chickpea.

Methods

Plant materials, phenotyping and genotyping

A population of 257 RILs (F8) was generated by single seed descent from an intraspecific cross between ICC-4958 (a drought-tolerant cultivar with large root mass and length, Parent A) and Annigeri (an agronomically favored but drought-susceptible variety with a smaller root mass, Parent B). Phenotyping was done under glasshouse conditions using a completely randomized design with three replications. Data on root dry weight (RDW), shoot dry weight (SDW), and root length (RL) were recorded at 35 days after sowing. Further details are reported in Krishnamurthy et al. (2004). The 257 RILs were genotyped with 14 polymorphic SSR markers following standard genotyping protocols.

Biometric analysis

Data on each trait were analyzed using a linear additive random effects model $y_{ik} = \mu + g_k + \varepsilon_{ik}$, where $\mu$ represents the general mean, $g_k$ the effect of genotype $k$, and $\varepsilon_{ik}$ the residual effect, assuming $g_k$ and $\varepsilon_{ik}$ to be normal random variables, each with zero mean and constant variance. The best linear unbiased prediction (BLUP) of the performance of the genotypes was obtained for each trait using restricted maximum likelihood (ReML) (Patterson and Thompson 1971). The line mean heritability of each trait was estimated as $h^2 = \sigma_g^2/[\sigma_g^2 + (\sigma_\varepsilon^2/r)]$, where $\sigma_g^2$ and $\sigma_\varepsilon^2$ represent, respectively, the ReML estimates of genetic and error variances, and $r$ is the number of replications.
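As a small illustration of the heritability formula above, the following Python sketch evaluates it for given variance components. The function name and the numerical values are hypothetical and are not taken from the chickpea data; in practice the variance components would come from a ReML fit.

```python
def line_mean_heritability(var_g, var_e, reps):
    """Line mean heritability h^2 = var_g / (var_g + var_e / reps)."""
    if reps < 1:
        raise ValueError("reps must be at least 1")
    total = var_g + var_e / reps
    if total <= 0:
        raise ValueError("variance components must give a positive total")
    return var_g / total


# Hypothetical variance components (illustration only, not from the paper).
h2 = line_mean_heritability(var_g=2.0, var_e=3.0, reps=3)
print(round(h2, 3))
```

With more replications the error term shrinks, so the same variance components give a higher line mean heritability.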

The BLUPs ($y_k$) of the $n=257$ RILs and their genotyping data from $m=14$ SSR markers were used to assess marker-trait associations to identify potential QTL-linked markers. Based on a $\chi^2$ test, the genotyping data from each of these 14 markers conformed to the expected Mendelian ratio of A:B::1:1.
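The segregation check described above can be sketched as a one-degree-of-freedom goodness-of-fit test. This is a minimal illustration using only the Python standard library; the function name and the allele counts are hypothetical, and the software actually used for the test is not stated in the paper.

```python
import math


def chi_square_1to1(n_a, n_b):
    """Chi-square goodness-of-fit test of an A:B = 1:1 ratio (1 df)."""
    total = n_a + n_b
    if total == 0:
        raise ValueError("no scored lines")
    expected = total / 2
    stat = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    # With 1 df, the upper-tail probability equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical genotype counts for one marker scored on 257 lines.
stat, p_value = chi_square_1to1(135, 122)
print(f"chi-square = {stat:.3f}, P = {p_value:.3f}")
```

A marker would be taken to conform to the 1:1 ratio when the P-value exceeds the chosen significance level, for example 0.05.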

Marker-trait association analysis

A single-marker analysis (SMA) approach, using simple linear regression, is still commonly adopted to identify potential QTL-harboring markers. A quantitative trait is however expected to result from the joint effect of several QTLs. Approaches that model multiple QTLs simultaneously will therefore possess greater statistical power for detecting the underlying genes and will better separate linked QTLs. In the absence of a linkage map, or with too few markers to build one, the identification of QTL-linked markers could proceed on the assumption that each marker is itself a potential QTL. On this assumption, the simplest approach to detect multiple QTLs is to use multiple linear regression that jointly considers all available $m=14$ markers:

$$y_k = \alpha + \sum_{l=1}^{m} \beta_l x_{kl} + \varepsilon_k, \qquad k = 1, \ldots, n$$

where $x_{kl}$ is the marker score ($x=0$ or $1$ for marker genotype B or A, respectively) of the $k$-th RIL at the $l$-th marker and $\beta_l$ is the partial regression coefficient (additive genetic effect in the case of RILs) of a putative QTL linked to the $l$-th marker. The above model assumes that QTLs act additively. Our aim is to identify a subset of $q \le m$ (QTL-linked) markers that are simultaneously significantly linked to the trait. This is a model selection problem where we seek to identify a subset of $q \le m$ markers for which $\beta_l \ne 0$. This is in contrast to the traditional model selection approach that focuses on selecting a subset of markers to minimize prediction error.
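To make the multiple-marker model concrete, the sketch below fits it by ordinary least squares in plain Python and reports the adjusted coefficient of determination. It is only an illustration under stated assumptions: the helper names (`solve_linear`, `fit_markers`) and the toy data are hypothetical, and a statistical package would normally be used instead of hand-written normal equations.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design: markers are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_markers(scores, y):
    """OLS fit of y_k = alpha + sum_l beta_l * x_kl.

    Returns ([alpha, beta_1, ..., beta_m], adjusted R^2).
    """
    n, m = len(y), len(scores[0])
    design = [[1.0] + [float(v) for v in row] for row in scores]
    p = m + 1  # intercept plus m marker effects
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * yk for r, yk in zip(design, y)) for i in range(p)]
    coefs = solve_linear(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in design]
    rss = sum((yk - f) ** 2 for yk, f in zip(y, fitted))
    mean_y = sum(y) / n
    tss = sum((yk - mean_y) ** 2 for yk in y)
    adj_r2 = 1.0 - (rss / (n - p)) / (tss / (n - 1))
    return coefs, adj_r2


# Toy data (hypothetical): 8 lines, 2 markers scored 0 (B) or 1 (A).
scores = [[0, 0], [0, 1], [1, 0], [1, 1]] * 2
y = [1.0 + 2.0 * a - 0.5 * b for a, b in scores]
coefs, adj_r2 = fit_markers(scores, y)
print([round(c, 3) for c in coefs], round(adj_r2, 3))
```

Each fitted coefficient after the intercept is the estimated additive effect of carrying the Parent A allele at that marker, holding the other markers fixed.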

We used stepwise regression (SWR) to identify an appropriate model, using $F_{in}=F_{out}=4$ as the threshold for the partial F-statistic to include linked and exclude unlinked markers. The minimum Bayesian information criterion (BIC), applied to all possible $2^m$ regressions, was used for comparison and to identify QTL-harboring markers that are consistently selected by the different approaches.
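The all-subsets search under the minimum BIC criterion can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes the common form BIC = n ln(RSS/n) + p ln(n), where p counts the intercept and the selected markers, which may differ from the exact form used to produce the BIC values in Table 1. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def rss_for_subset(scores, y, subset):
    """Residual sum of squares of an OLS fit on an intercept plus `subset`."""
    rows = [[1.0] + [float(r[j]) for j in subset] for r in scores]
    p = len(subset) + 1
    # Augmented normal equations [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           + [sum(r[i] * yk for r, yk in zip(rows, y))]
           for i in range(p)]
    for col in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[piv][col]) < 1e-12:
            return None  # collinear markers: this subset cannot be fitted
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(p):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    coefs = [aug[i][p] / aug[i][i] for i in range(p)]
    return sum((yk - sum(c * v for c, v in zip(coefs, r))) ** 2
               for r, yk in zip(rows, y))


def min_bic_subset(scores, y):
    """Search all 2^m marker subsets; return (best_subset, best_bic)."""
    n, m = len(y), len(scores[0])
    best = None
    for q in range(m + 1):
        for subset in combinations(range(m), q):
            rss = rss_for_subset(scores, y, subset)
            if rss is None or rss <= 0:
                continue  # skip unfittable or perfectly fitting subsets
            # Assumed BIC form: n ln(RSS/n) + (q + 1) ln(n).
            bic = n * math.log(rss / n) + (q + 1) * math.log(n)
            if best is None or bic < best[1]:
                best = (subset, bic)
    return best


# Toy data (hypothetical): 3 markers, 32 lines, only marker 0 has an effect.
cells = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
noise = [0.1, -0.1, 0.2, -0.2]
scores = [list(cell) for cell in cells for _ in noise]
y = [5.0 + 3.0 * cell[0] + e for cell in cells for e in noise]
subset, bic = min_bic_subset(scores, y)
print(subset, round(bic, 2))
```

With $m=14$ markers the search covers $2^{14} = 16384$ regressions, which is small enough to enumerate exhaustively.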

Results

The distributions of the traits were normal and showed transgressive segregation. Heritabilities were 84% for SDW and 54% for both RDW and RL. Results of the marker-trait association analyses are shown in Table 1. For SDW, SMA identified six markers (Ta2, Gaa58, Tr20, Tr8, Ta47, Taa170), with the adjusted coefficient of determination $R_a^2$ varying from 1.6% for Ta47 to 56.2% for Taa170. For RDW and RL, SMA detected six markers (Ta2, Ga34, Gaa58, Tr20, Tr8, Taa170). Across the three traits, the closely linked markers Ta2, Tr20, Tr8 and Taa170 were commonly selected; SMA thus tends to select closely linked markers. The marker locus Taa170 accounted for the most phenotypic variation in all traits: SDW ($R_a^2$=56.2%), RDW ($R_a^2$=33.1%) and RL ($R_a^2$=33.4%).

Multiple marker analyses (MMA)

In contrast to SWR, the minimum BIC criterion generally delivered the most parsimonious model, selecting at most three markers as important for any trait (Table 1). The difference in model $R_a^2$ between the two approaches was practically negligible. Minimum BIC also generally picked the fewest (unlinked) markers, and these were consistently selected as important by SWR as well. Given the cost of using many markers in molecular breeding programs, both observations support using the minimum BIC criterion to identify a small number of important markers for MAS. RDW, which had a heritability of 54%, showed linkage with three marker loci, whereas SDW, which had the highest $h^2$ of 84%, showed significant linkage with only one marker under the minimum BIC criterion. This accords with the expectation that highly heritable traits may be controlled by fewer QTLs of large effect.

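Table 1. QTL-linked markers identified by different model selection criteria

| Trait | Method | Statistic | Ta2 | Tr20 | Tr8 | Taa170 | Ta203 | Gaa58 | Ga34 | Ta106 | Ta47 | $R_a^2$ (%) | BIC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Linkage | + | + | + | + | ++ | ++ | +++ | +++ | | | |
| SDW | SMA | b | .22** | .23** | .25** | .46*** | | -.09* | | | .09* | | |
| SDW | SMA | $R_a^2$ (%) | 11.7 | 12.7 | 16.1 | 56.2 | | 1.6 | | | 1.6 | | |
| SDW | SWR | b | | .07* | | .44*** | | | -.08* | .08* | | 58.0 | |
| SDW | BIC | b | | | | .46*** | | | | | | 56.2 | 249 |
| RDW | SMA | b | .07** | .01* | .09** | .23*** | | -.06* | -.05* | | | | |
| RDW | SMA | $R_a^2$ (%) | 2.8 | 2.5 | 4.3 | 33.1 | | 1.6 | 1.5 | | | | |
| RDW | SWR | b | | | | .23*** | -.06* | | -.06* | | | 37.5 | |
| RDW | BIC | b | | | | .23*** | -.06* | | -.06* | | | 37.5 | 235 |
| RL | SMA | b | 2.9** | 2.7** | 3.5** | 9.6*** | | -2.4* | -2.2* | | | | |
| RL | SMA | $R_a^2$ (%) | 2.7 | 2.4 | 4.2 | 33.4 | | 1.7 | 1.5 | | | | |
| RL | SWR | b | | | | 9.9*** | -2.6* | | -2.8* | | | 38.1 | |
| RL | BIC | b | | | | 9.9*** | -2.6* | | -2.8* | | | 38.1 | 235 |

Markers with the same number of + signs are linked (recombination frequency ≤.35);

SDW=shoot dry weight, RDW=root dry weight, RL=root length;

SMA=single marker analysis using simple linear regression; SWR=stepwise regression;

BIC=Bayesian information criterion;

*P<.05, **P<.01, ***P<.001; b=linear regression coefficient; $R_a^2$=adjusted coefficient of determination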

Conclusions

SMA tends to select many markers, some of which may be tightly linked and some of which may explain very little variation in the trait. Also, SMA cannot distinguish a marker close to a QTL of small effect from a marker distantly associated with a QTL of substantial effect (Doerge 2001). SWR selects fewer markers than SMA, with none being spurious, but tends to pick up markers that may be linked; for example, the markers identified for SDW are all significantly linked. Our results thus support the finding of Broman & Speed (2002) that the minimum BIC criterion is perhaps the most appropriate means of identifying important markers. The markers selected by this approach also tend to be identified by SWR. This provides increased confidence for using markers selected by the minimum BIC criterion in MAS programs.

SMA generally selects markers that are also identified as important by more stringent criteria such as minimum BIC. Thus it seems that SMA is generally unlikely to miss important markers. In view of this, a possible strategy could be to select markers based on SMA using information on linkage among markers. This approach however may not allow optimal selection of loosely linked markers, though a set of completely unlinked markers, each with, say, $R_a^2$>5%, may still be identified. In our view, it seems safer to use more than one marker selection method to identify the most important markers more reliably. Also, when using SWR, it seems prudent to apply increasingly stringent thresholds for F-in and F-out to assess the robustness of a marker. We found that as the F-in and F-out values were increased from 4 to 8, SWR selected the same markers as the minimum BIC criterion for all traits. Thus, where only a relatively small number of markers is available, applying the minimum BIC criterion to all possible regressions ($2^m$) may provide a reliable result without posing a serious computational problem.

Although the 14 markers used in our analysis are not adequate for uniform coverage of the chickpea genome, we used them to construct a linkage map and carry out QTL mapping as a crude verification of our map-free approach. With an F8 RIL mapping population and a sample size of 257, we could expect those of the 14 markers that had already been mapped to fall at their expected positions on available maps. Doerge (2001) points out that having more individuals rather than more markers is more desirable for accurate QTL mapping, since it is the observed recombinants that are more important. The already mapped markers mapped to the expected positions (with only distances varying) on the map reported by Winter et al. (2000). The most significant marker, Taa170, has not been previously mapped but falls 16 cM from Ta2, which has been widely mapped. QTL analysis using PlabQTL (not shown) indicated QTLs to be present in the close neighborhood of the markers identified as important by the minimum BIC criterion. This provides preliminary confirmation of the utility of the minimum BIC criterion over other model selection criteria in correctly identifying QTL-linked markers without a linkage map.

References

Broman KW and Speed TP (2002). A model selection approach for the identification of quantitative trait loci in experimental crosses. Journal of the Royal Statistical Society B 64, 641-656.

Doerge RW (2001). Mapping and analysis of quantitative trait loci in experimental populations. Nature Reviews Genetics 3, 43-52.

Krishnamurthy L, Kashiwagi J, Jagdish Kumar, Chandra S, Crouch JH and Serraj R (2004). Phenotyping variation in root traits and the scope for its use in molecular breeding for terminal drought tolerance in chickpea (Cicer arietinum L.). Field Crops Research (In press).

Patterson HD and Thompson R (1971). Recovery of inter-block information when block sizes are unequal. Biometrika 58, 545-554.

Winter P, Benko-Iseppon AM, Huttel B, Ratnaparkhe M, Tullu A, Sonnante G, Pfaff T, Tekeoglu M, Santra D, Sant VJ, Rajesh PN, Kahl G and Muehlbauer FJ (2000). A linkage map of chickpea (Cicer arietinum L.) genome based on recombinant inbred lines from a C. arietinum x C. reticulatum cross: localization of resistance genes for fusarium wilt races 4 and 5. Theoretical and Applied Genetics 101, 1155-1163.
