Molecular Evidence for High Levels of Intrapopulation Genetic Diversity in Woodrats (Neotoma micropus)

Francisca M. Mendez-Harclerode, Richard E. Strauss, Charles F. Fulhorst, Mary L. Milazzo, Donald C. Ruthven III, Robert D. Bradley
DOI: http://dx.doi.org/10.1644/05-MAMM-A-377R1.1; pp. 360-370; first published online: 20 April 2007

Abstract

Nucleotide sequences from the mitochondrial control region and genotypes from 5 nuclear microsatellite loci were used to examine genetic structure and infer recent (within approximately the last 3,000 years) evolutionary history of a population (549 individuals) of the southern plains woodrat (Neotoma micropus). Observed heterozygosity values ranged from 0.61 to 0.89 across microsatellite loci and systematically were lower than expected heterozygosity values (0.66–0.95). Probability of unique identity using microsatellite data was high (1 individual in 66,005,424). Fifty-three mitochondrial haplotypes were obtained from 150 individuals. Fst values estimated from sequence and microsatellite data were 0.061 and 0.011, respectively, and the Rst for microsatellite data was 0.007. Within-group genetic variation ranged from 93.90% to 99.99% depending on whether sequence or microsatellite data were examined. Analyses of microsatellite data suggested that all sampled individuals belonged to a single population, albeit genetically diverse. However, combined data analyses suggested the presence of low levels of substructure attributable to maternal lineages within the population. Low nucleotide-diversity values (0.007–0.010) in addition to high haplotype-diversity values (0.915–0.933) indicate a high number of closely related haplotypes, and suggest that this population may have undergone a recent expansion. However, Fu's Fs statistic did not fully support this finding, because it did not reveal a significant excess of recent mutations. A phylogenetic approach using the haplotype sequence data and a combined set including both haplotype and genotype data was used to test for evolutionary patterns and history.

Key words
  • control region
  • D-loop
  • evolutionary history
  • genetic structure
  • microsatellites
  • Neotoma micropus
  • population genetics

Most population-genetic studies of mammals have focused on interpopulation comparisons, often limited to sampling a few individuals per population, or comparing samples collected across large geographic regions (Anderson et al. 2004; Ashley 2004; Birungi and Arctander 2000; Bowman et al. 2000; Castleberry et al. 2002; Ehrich and Stenseth 2001; Fedorov et al. 1999; Matocq 2002, 2004; Matson et al. 2000; Monty et al. 2003; Onorato et al. 2004; Peakall et al. 2003; Reese et al. 2001; Rooney et al. 2001; Wisely et al. 2004). Only a handful of studies (Höglund and Shorey 2003; Knutsen et al. 2003; Planes and Lenfant 2002; Stahlhut and Cowan 2004) have been dedicated exclusively to the examination of genetic structure within a single population, and many of these studies may not have been restricted to a single wild population. These discrepancies may be the result of semantics; after all, it is often difficult to effectively define the meaning of the word “population,” especially for different taxa. At the beginning of the present study, we chose to consider a population as a group of individuals that inhabit an area in sufficient proximity to allow short-term gene flow among individuals. However, we also were interested in testing the definition of Pritchard et al. (2000) of a population: a group of individuals that share a homogeneous set of allele frequencies at each locus.

Although interpopulation studies contribute to our general knowledge of genetic diversity and structure across large geographic regions, this approach is likely to underrepresent genetic diversity in local populations because of lack of intensive sampling within a locality. The sampling of several hundred individuals per locality is often impractical, because many studies are designed to address questions of conservation and genetic diversity in small populations composed of few individuals. Nevertheless, not all populations are small and therefore single population studies that report population genetic parameters based on sample sizes representative of a greater percentage of the total number of individuals are needed.

For example, a recent study identified a single population of the big-eared woodrat (Neotoma macrotis) based on a sample size of 127, yet identified 2 populations of the dusky-footed woodrat (N. fuscipes) based on a sample size of 30 (M. Haynie, pers. comm.). Although these results may represent accurately the genetic structure of these groups, we cannot be sure these results are not a consequence of chance due to small sample sizes until additional samples of N. fuscipes are examined. Similarly, Méndez-Harclerode et al. (2005) reported that haplotype-diversity values in 114 southern plains woodrats (N. micropus) from a relatively small area (approximately 36 km²) were considerably higher than those reported in other mammalian population studies (Ehrich and Stenseth 2001; Klaus et al. 2001; Matocq et al. 2000; Matson et al. 2000). Genetic diversity was sufficiently high (h = 0.964) among individuals (79% of all genetic variability was within groups), that the authors hypothesized it could be the product of a recent population growth event, or the commingling of several formerly distinct populations in the area of study. Consequently, Méndez-Harclerode et al. (2005) concluded that a multilocus approach was needed to discern further among these hypotheses, because their study was based on a single maternal marker, nucleotide sequences of the mitochondrial control (D-loop) region.

This study further examines the genetic structure and recent evolutionary history (within approximately the last 3,000 years—F. M. Méndez-Harclerode et al., in litt.) of N. micropus. The goals of this study were 2-fold: 1st, to ensure a meaningful estimate of genetic diversity by intensively sampling members of a local population; and 2nd, to utilize different types of genetic markers to obtain an accurate estimate of intrapopulation relationships. The mitochondrial D-loop and 5 microsatellite loci were chosen as markers for this study. The D-loop was selected to facilitate comparison between this study and the results obtained by Méndez-Harclerode et al. (2005) and for its usefulness in elucidating population structure and evolutionary history because of its rapid rate of evolution (Avise 2000; Matson and Baker 2001; Parker et al. 1998; Sunnucks 2000; Wynen et al. 2001). For example, this mitochondrial marker (maternally inherited) would detect the dispersal and distribution of maternal lineages. Microsatellites were chosen because of their fine degree of resolution, often individual-specific (Parker et al. 1998), and for addressing questions of recent population evolutionary history, relatedness, and movement of individuals (Sunnucks 2000). This type of nuclear (biparentally inherited) marker could detect dispersal of individuals, or when used in conjunction with the maternal marker, detect possible dispersal of males. Additionally, the use of the mitochondrial marker in conjunction with nuclear markers presumably would provide resolution at both deep and shallow nodes of a phylogenetic tree, therefore providing a better description of the evolutionary history of the individuals present in the study area.

Materials and Methods

Study site.—This research was part of an ongoing study on the ecology of Whitewater Arroyo virus and studies on the population genetics and natural history of N. micropus on the Chaparral Wildlife Management Area, as described in Méndez-Harclerode et al. (2005). Suchecki et al. (2004), Ruthven and Synatzske (2002), and Fulhorst et al. (2002) provide detailed descriptions of the study area, including vegetation types, climate, topography, and a description of historical and current land-use practices.

Sampling.—Three trapping webs (identified as web I, web II, and web III) as per Anderson et al. (1983) were used to capture 611 woodrats over 16 trapping periods (once per season over 4 years, 2001–2004) on the Chaparral Wildlife Management Area (Méndez-Harclerode et al. 2005). The 3 webs were established approximately 3–4 km apart with each web containing 16 equidistant spokes, with 20 Sherman traps (H. B. Sherman Trap, Inc., Tallahassee, Florida) placed 5 m apart on each spoke (320 traps per web). In addition, 13 individuals of N. micropus collected from other localities in Texas, New Mexico, and Mexico, and 1 white-toothed woodrat (N. leucodon) were included as reference and outgroup taxa for phylogenetic analyses. Collection localities for the reference and outgroup individuals are provided by Méndez-Harclerode et al. (2005). Toes and ear punches from each of the individuals captured were deposited in the Genetic Resource Collection in the Natural Science Research Laboratory, Museum of Texas Tech University, and served as the DNA source in this study. Genomic DNA was isolated using a Puregene DNA isolation kit (Gentra Systems, Minneapolis, Minnesota) in order to provide a single source of DNA for both the mitochondrial (D-loop sequencing) and nuclear (microsatellite genotyping) studies.

Five hundred forty-nine woodrats (individual identification numbers are available from the senior author) were selected for the microsatellite genotyping portion of the study, 171 from web I, 212 from web II, and 166 from web III, including individuals from all 16 trapping periods across a 4-year time span. A subset of 150 woodrats, 50 from web I, 47 from web II, and 53 from web III was selected for the haplotype (DNA sequencing) portion and included animals captured during 8 trapping periods spanning the 4-year study. This subset contained 28 individuals previously included by Méndez-Harclerode et al. (2005).

D-loop amplification, sequencing, and data analysis.—Polymerase chain reaction primers specific to the D-loop (Castro-Campillo et al. 1999; Méndez-Harclerode et al. 2005) were used to amplify the entire region (size ranging from 956 to 959 base pairs [bp]) of the mitochondrial genome. The following thermal profile was used for amplification: an initial cycle of 93.5°C for 1 min, followed by 33 cycles of 93.5°C for 40 s, 49°C for 40 s, and 72°C for 2 min 40 s, and a final cycle of 72°C for 2 min. Polymerase chain reaction products were purified with a QIAquick kit (Qiagen Inc., Chatsworth, California). Sequencing reactions used the same cycle sequencing primers as in Méndez-Harclerode et al. (2005). Both forward and reverse sequences were obtained using an automated ABI 3100-Avant sequencer (Applied Biosystems, Foster City, California). Sequences initially were aligned using Sequencher 3.1 software (Gene Codes, Ann Arbor, Michigan) and then adjusted manually to produce the same sequence alignment as Méndez-Harclerode et al. (2005). All sequences were deposited into GenBank (accession numbers are provided in Appendix I).

Number of variable sites, transitions, and transversions were obtained using the software program MEGA, version 2.1 (Kumar et al. 2001). Haplotype frequencies and their distribution, haplotype diversity (h—Nei 1987) and nucleotide diversity (π—Tajima 1983) were estimated using Arlequin 2.0 software (Schneider et al. 2000). Estimates were obtained under 2 schemes: 1st, with all individuals in a single group, and 2nd, with individuals subdivided into 3 groups according to collecting web. Arlequin 2.0 also was used to perform an analysis of molecular variance (AMOVA—Excoffier et al. 1992) and Fu's test of neutrality (Fs—Fu 1996).
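The two diversity statistics named above can be illustrated with a minimal sketch. It assumes the usual textbook forms: haplotype diversity as h = n/(n − 1) × (1 − Σ p_i²), and nucleotide diversity as the mean proportion of differing sites over all pairs of aligned sequences. This is not Arlequin's implementation; the function names and the toy sequences are hypothetical, and gaps would be treated as ordinary characters.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least 2 sequences")
    counts = Counter(sequences)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    n = len(sequences)
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")
    total_diffs = 0
    for a, b in combinations(sequences, 2):
        total_diffs += sum(x != y for x, y in zip(a, b))
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


# Hypothetical toy alignment: 4 individuals, 3 distinct haplotypes.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
print(round(haplotype_diversity(toy), 3))
print(round(nucleotide_diversity(toy), 3))
```

Many haplotypes that differ at only 1 or 2 sites give a high h together with a low π, which is the pattern examined in this study.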

In concordance with other studies (Bickham et al. 1996, 1998a, 1998b; Trujillo et al. 2004), a haplotype was defined as a unique nucleotide sequence. When haplotypes differed by a single site, chromatograms were used to verify if sequences were indeed different. Because polymerase error is estimated to be 2.4 × 10⁻⁵ to 8.9 × 10⁻⁵ (Cariello et al. 1991), and the D-loop region is 956–959 bp long, it was possible that haplotypes that differed by a single site were artificially different. However, the presence of multiple individuals possessing the same haplotype would require the mistake to have occurred in the same base position multiple times, making this type of error unlikely. In cases where a single individual represented a haplotype, the specific change was at a site that was variable in other haplotypes. Alternatively, haplotypes may have resulted from mitochondrial heteroplasmy, which has been reported in mammals (mice [Gyllensten et al. 1991], dogs [Savolainen et al. 2000], humans [Schwartz and Vissing 2002], and bats [Wilkinson and Chapman 1991]). However, no great differences in length were observed, and chromatograms of both strands (forward and reverse) agreed. Additionally, all DNA samples were obtained from approximately the same kind of tissue (toes or ears), which decreases the likelihood of tissue heteroplasmy.
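The scale of the polymerase-error concern can be gauged with a short calculation. The sketch below assumes that the quoted rate applies per base for each copied strand; the text does not state the units, so this reading is an assumption, and the function names are hypothetical.

```python
def expected_errors(rate_per_base, length_bp):
    """Expected number of misincorporated bases in one copied strand."""
    return rate_per_base * length_bp


def prob_at_least_one_error(rate_per_base, length_bp):
    """Probability that one copied strand carries at least one error."""
    return 1.0 - (1.0 - rate_per_base) ** length_bp


# Error-rate bounds and fragment lengths quoted in the text.
low = expected_errors(2.4e-5, 956)
high = expected_errors(8.9e-5, 959)
p_high = prob_at_least_one_error(8.9e-5, 959)

print(f"expected errors per copied strand: {low:.3f} to {high:.3f}")
print(f"P(at least 1 error), upper rate: {p_high:.3f}")
```

Under this reading a single-site artifact in any one strand is possible but uncommon, and the same artifact recurring at the same position in several individuals would be rarer still.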

A neighbor-joining distance phylogram of the absolute number of differences among the mitochondrial haplotypes, including the 13 reference individuals from across the range of N. micropus, was constructed using the software PAUP (Swofford 2002) to elucidate evolutionary patterns and evolutionary history of the mitochondrial haplotypes. The absolute number of differences was used to construct the phylogram to enable comparison of this tree with that resulting from the combined data analysis. N. leucodon was used as the outgroup taxon.

Microsatellite amplification, genotyping, and data analysis.—Microsatellite primers and methods reported in Castleberry et al. (2000) specifically for Neotoma were used to amplify 5 polymorphic microsatellite loci (Nma01, Nma04, Nma06, Nma10, and Nma11). Amplicons were obtained from each locus and were analyzed using an ABI 3100-Avant Genetic Analyzer (Applied Biosystems). Alleles were scored using GeneMapper software, version 3 (Applied Biosystems). The software program micro-checker (Oosterhout et al. 2004) was used to test for scoring errors due to stuttering, evidence of large allele dropout, and the presence of null alleles.

The software program Structure (Pritchard et al. 2000) was used to estimate the most likely number of populations (K) represented in the sample based on the genetic data without a priori knowledge of geographic provenance using a Bayesian clustering approach. To avoid underestimation of the number of populations represented in the study area, 3 hypotheses were tested: that all the individuals belong to a single population (K = 1), that the individuals belong to 2 populations (K = 2), and that the individuals belong to 3 populations (K = 3). Hypotheses were tested using the admixture model, a burn-in of 500,000, and 1,000,000 Markov chains with 3 repetitions per value of K. This range of K was chosen to represent the possibility that individuals from all collecting webs were part of the same population or, alternatively, that each collecting web represented a genetically distinct population. Allele frequencies, null allele frequencies, observed and expected heterozygosities, and polymorphic information contents were estimated using the software program CERVUS (Marshall et al. 1998). An AMOVA (Excoffier et al. 1992) was performed using Arlequin 2.0 software (Schneider et al. 2000). Allelic richness, F-statistics (Weir and Cockerham 1984), RST (Goodman 1997; Rousset 1996), relatedness (Queller and Goodnight 1989), Hardy-Weinberg equilibrium, and genotypic disequilibrium among all loci pairs, as well as a test of whether FIS per locus and sample is significantly positive or negative (indicative of a significant deficit or excess of heterozygotes, respectively) were estimated using the software program FSTAT (J. Goudet 2001, FSTAT, a program to estimate and test gene diversities and fixation indices [version 2.9.3], http://www.unil.ch/izea/softwares/fstat.html). Sequential Bonferroni corrections (Holm 1979; Rice 1989) were made to avoid type I errors.
Probability of identity (the probability of encountering 2 individuals with identical genotypes in the same population) was estimated using the software program IDENTITY (Wagner and Sefc 1999). Mantel tests of each web and of the entire data set were performed using the software program GenAlEx, version 5 (Peakall and Smouse 2001) to test for isolation-by-distance.
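The sequential Bonferroni correction applied to the tests above can be sketched as follows. This is a generic step-down procedure in the sense of Holm (1979), not the routine built into FSTAT; the function name and the example P values are hypothetical.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True where the null hypothesis is rejected.

    P values are visited from smallest to largest. The k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k), and testing stops
    at the first P value that exceeds its threshold.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        threshold = alpha / (m - rank)
        if p_values[idx] <= threshold:
            rejected[idx] = True
        else:
            break  # all larger P values are also retained
    return rejected


# Hypothetical P values for 4 tests.
example = [0.001, 0.04, 0.03, 0.005]
print(sequential_bonferroni(example))
```

In this example the two smallest P values pass their thresholds (0.0125 and about 0.0167), whereas 0.03 exceeds 0.025, so it and the remaining test are retained.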

Analysis of composite genotypes.—D-loop sequences and microsatellite data were combined in order to infer the evolutionary history of individuals. A pairwise distance matrix based on the absolute number of differences among all sequence pairs was 1st generated using the software program MEGA, version 2.1 (Kumar et al. 2001). This distance matrix was used as input for a multidimensional scaling function (written in Matlab; Mathworks Inc., Natick, Massachusetts) to reduce that matrix into the 5 statistically independent genetic-distance “factors” that best approximated the pairwise distances among all individuals. Additionally, a separate set of 10 independent genetic-distance “factors” was estimated for the same matrix of interindividual distances. Sets of 5 and 10 genetic-distance scores were chosen to provide an equal number of “haplotype loci” and microsatellite loci. Five distances allowed for the coding of each distance as the 1st allele of each “haplotype locus,” with the 2nd allele coded as “missing” as suggested by J. K. Pritchard (pers. comm., 2004). The 10 distances were paired to form 5 “diploid loci” with no missing data and therefore equal weights were attributed to microsatellite and sequence data. Two separate files, 1 containing the haplotype loci composed of the 5 distances along with the genotype data, and the other containing the haplotype loci composed of the 10 distances and the genotype data, were constructed. Only the 150 individuals having both sequence and microsatellite data were included in all combined data analyses. Structure (Pritchard et al. 2000) was rerun using these combined data sets to obtain a comprehensive estimate of K, the “true” number of populations. The same run settings as in the microsatellite analysis were used with the exception of the range of K tested, which was expanded to 1–13 for the combined data analysis. 
This was done because the likelihood value for K = 1–3 increased steadily with increasing Ks and merited further testing. For the combined analysis, values of K were tested until the likelihood ceased to increase consistently, as recommended by the Documentation for Structure Software: Version 2 (Pritchard and Wen 2003). Additionally, a Mantel test with 999 permutations was run to test the congruency of the haplotype and microsatellite data sets, using the software GenAlEx, version 5 (Peakall and Smouse 2001). A neighbor-joining distance phylogram based on the absolute number of differences was constructed with the combined data set containing the 5 haplotype distances using PAUP (Swofford 2002). This was accomplished by coding each allele size and haplotype distance into a single character. The resulting unrooted network was rooted with 3 individuals belonging to the haplotype with the largest outgroup weight as indicated by a statistical parsimony network constructed in TCS, version 1.18 (Clement et al. 2000).
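The logic of a Mantel test with 999 permutations can be sketched as below. This is a generic one-tailed version that assumes a Pearson correlation between the upper triangles of two distance matrices; it is not the GenAlEx implementation, and the function names and toy matrices are hypothetical.

```python
import math
import random


def upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after relabeling by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Return (observed r, one-tailed P) for two distance matrices."""
    n = len(dist_a)
    identity = list(range(n))
    fixed = upper_triangle(dist_b, identity)
    r_obs = pearson(upper_triangle(dist_a, identity), fixed)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel rows and columns of dist_a together
        if pearson(upper_triangle(dist_a, perm), fixed) >= r_obs - 1e-12:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy data: 5 individuals placed on a line.
points = [0, 1, 3, 7, 15]
dist_a = [[abs(p - q) for q in points] for p in points]
dist_b = [[2 * d for d in row] for row in dist_a]  # perfectly correlated matrix
r, p = mantel_test(dist_a, dist_b)
print(round(r, 3), p)
```

With 999 permutations the smallest attainable P value is 0.001, which is the value reported for several of the Mantel tests in this study.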

Adequacy of sample size.—An analysis designed to investigate how many individuals would be required to obtain similar population genetic estimates, and whether the estimates vary significantly across sample sizes, was performed as follows. The entire microsatellite data set was used to construct 10 cumulative groups in increments of 55 individuals according to order of capture in order to maximize comparisons and preserve a balanced sample from the 3 webs that approximates natural trapping conditions. Estimates of allelic richness, observed and expected heterozygosity, FIS, FST, and relatedness were obtained for each of the 10 groups using the software program FSTAT (J. Goudet 2001, FSTAT, a program to estimate and test gene diversities and fixation indices [version 2.9.3], http://www.unil.ch/izea/softwares/fstat.html), and 2-sided P values were obtained after 10,000 permutations.
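The construction of the cumulative groups can be sketched as follows, assuming a list of individual identifiers already sorted by order of capture. The function name and the placeholder identifiers are hypothetical; because 549 is not a multiple of 55, the last group is assumed to take the remaining 54 individuals.

```python
def cumulative_groups(ordered_ids, step=55, n_groups=10):
    """Nested groups holding the first 55, 110, ... individuals in capture order."""
    sizes = [min(step * i, len(ordered_ids)) for i in range(1, n_groups + 1)]
    return [ordered_ids[:size] for size in sizes]


# Placeholder identifiers standing in for the 549 genotyped woodrats.
capture_order = [f"ind{i:03d}" for i in range(1, 550)]
groups = cumulative_groups(capture_order)
print([len(g) for g in groups])
```

Each group contains every individual of the preceding group, so the estimates for successive groups are not independent of one another.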

Results

D-loop sequences.—The aligned nucleotide sequences were 962 bp in length and had 69 variable sites, 8 transitions (4 A–G, 4 T–C), and no transversions. On average, nucleotide composition was 29.2% T, 25.6% C, 32.2% A, and 13.0% G. Fifty-three unique haplotypes were delineated based on sequence data obtained from the mitochondrial D-loop of 150 individuals. Eight haplotypes were present on multiple webs, whereas the remaining 45 haplotypes were unique to a single web (Table 1). Overall haplotype diversity was 0.964, ranging from 0.915 to 0.933 among all 3 trapping localities, whereas overall nucleotide diversity was 0.009, ranging from 0.007 to 0.011 (Table 2). Fu's test of neutrality produced marginally significant Fs values (Table 2). The AMOVA partitioned the genetic variation into 6.100% among webs and 93.900% within webs. The overall FST value was 0.061.

Table 1.

Distribution of the 53 unique haplotypes based on mitochondrial D-loop sequence data from 150 individuals. Number of individuals per haplotype is given per capture site (web) and total. Total number of individuals per web is listed in the last row.

Haplotype no.   Count(s) per web where present (web order I to III)   Total
1               1                                                     1
3               1                                                     1
4               5, 7                                                  12
5               5, 1                                                  6
7               1, 5, 10                                              16
10              10                                                    10
11              1                                                     1
13              2                                                     2
14              9                                                     9
15              5                                                     5
16              1, 4                                                  5
17              2, 3                                                  5
19              3                                                     3
20              1, 6                                                  7
30              1                                                     1
31              2                                                     2
32              4                                                     4
33              1                                                     1
34              2                                                     2
35              9                                                     9
36              1                                                     1
37              4                                                     4
38              1, 1                                                  2
40              1                                                     1
41              1                                                     1
42              2, 1                                                  3
43              1                                                     1
44              3                                                     3
45              1                                                     1
46              1                                                     1
47              1                                                     1
48              1                                                     1
49              5                                                     5
50              1                                                     1
51              1                                                     1
52              1                                                     1
53              3                                                     3
54              1                                                     1
55              1                                                     1
57              1                                                     1
58              1                                                     1
59              1                                                     1
60              1                                                     1
61              1                                                     1
62              1                                                     1
63              1                                                     1
64              1                                                     1
65              1                                                     1
66              1                                                     1
67              1                                                     1
68              1                                                     1
83              1                                                     1
84              1                                                     1
Total           50, 47, 53                                            150

Table 2.

Haplotype (h) and nucleotide (π) diversity with their respective standard errors (SEs), Fu's statistic (Fs), and probability of significance of Fs based on D-loop sequence data from 150 individuals from the study area. Statistics are listed for each capture site (web) and overall.

Locality   h       SE      π       SE      Fs       Probability
Web 1      0.915   0.020   0.011   0.006   0.157    0.574
Web 2      0.933   0.023   0.010   0.005   −3.208   0.165
Web 3      0.923   0.018   0.007   0.004   −1.687   0.296
Overall    0.964   0.006   0.009   0.005   −8.169   0.056

The neighbor-joining phylogram based on a distance matrix calculated from the absolute number of differences among the mitochondrial haplotypes revealed clades including individuals from the study area along with reference individuals, some of which were from localities as far away as 692 km from the study site. Based on our samples, other clades appeared to be restricted to the Chaparral Wildlife Management Area (Fig. 1).

Microsatellite genotypes.—No evidence for scoring error due to stuttering or large allele dropout was identified by micro-checker, but the program warned of the possible presence of null alleles at loci Nma01, Nma04, Nma06, and Nma10, as suggested by the general excess of homozygotes for most allele size classes in those loci. Null allele frequencies as estimated by CERVUS (Marshall et al. 1998) ranged among loci and trapping webs from −0.032 to +0.060 (Table 3). Additionally, error rates based on repeated genotyping (at least twice) of 6% of all genotypes were estimated to be as follows: 0% for Nma01, Nma04, and Nma06; 0.200% for Nma10; and 0.400% for Nma11; resulting in an overall estimated error rate of 0.010%.

Table 3.

Number of alleles (k), allelic richness (AR), observed (HO) and expected (HE) heterozygosity, polymorphic information content (PIC), and null allele frequency (NF) based on microsatellite data from 549 individuals from the study area. Values are listed by locus and mean (average of all 5 loci) for each capture site (web) and overall.

Web no.   Locus   k    AR       HO      HE      PIC     NF
I         Nma01   15   14.942   0.807   0.868   0.853   +0.034
          Nma04   16   15.941   0.825   0.831   0.812   +0.004
          Nma06   20   19.911   0.871   0.877   0.862   +0.001
          Nma10   37   36.905   0.883   0.946   0.941   +0.034
          Nma11   11   10.941   0.608   0.663   0.617   +0.040
          Mean                          0.837   0.817
II        Nma01   16   15.566   0.882   0.905   0.894   +0.012
          Nma04   20   19.206   0.764   0.808   0.792   +0.030
          Nma06   12   11.736   0.854   0.866   0.850   +0.006
          Nma10   37   35.907   0.854   0.946   0.941   +0.050
          Nma11   14   13.426   0.708   0.668   0.628   −0.032
          Mean                          0.839   0.821
III       Nma01   14   14.000   0.831   0.900   0.888   +0.039
          Nma04   19   19.000   0.669   0.757   0.728   +0.060
          Nma06   19   19.000   0.807   0.887   0.873   +0.047
          Nma10   37   37.000   0.886   0.951   0.946   +0.034
          Nma11   12   12.000   0.741   0.705   0.663   −0.029
          Mean                          0.840   0.820
Overall   Nma01   17   17.000   0.843   0.901   0.892   +0.033
          Nma04   26   26.000   0.754   0.808   0.792   +0.036
          Nma06   24   24.000   0.845   0.884   0.872   +0.022
          Nma10   47   47.000   0.872   0.952   0.949   +0.044
          Nma11   16   16.000   0.687   0.679   0.663   −0.008
          Mean                          0.845   0.829

The hypothesis that all individuals belong to a single population (K = 1) yielded the highest probability values and smallest confidence intervals, indicating that genetically, all sampled individuals belong to a single population. All subsequent tests were performed 1st by grouping all individuals, and then by subdividing individuals according to web of provenance (I–III).

Observed heterozygosity values ranged from 0.608 to 0.886 across microsatellite loci and expected heterozygosity ranged from 0.663 to 0.952 (Table 3). The polymorphic information content values ranged from 0.617 to 0.949 across loci and webs (Table 3), and the total probability of identity was 1.52 × 10⁻⁸ (1 individual in 66,005,424). Allelic richness per locus per population ranged from 10.941 to 37.000, whereas allelic richness over all individuals per locus ranged from 16 to 47 (Table 3). The overall relatedness value with its respective 95% lower and upper confidence bounds was 0.021 (0.012, 0.028). After Bonferroni corrections (adjusted 95% significance level = 0.003) the test of significant deficiency or excess of heterozygotes revealed marginally significant values (P = 0.003) of heterozygote deficiency for web I and web II locus Nma10 and web III loci Nma04, Nma06, and Nma10.
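Two of the figures above can be checked with simple arithmetic. The sketch below confirms that a probability of identity of 1.52 × 10⁻⁸ corresponds to 1 individual in 66,005,424. It also shows that an adjusted significance level of 0.003 is what a Bonferroni division of 0.05 by 15 tests would give. The count of 15 tests (5 loci × 3 webs) is an assumption made for illustration and is not stated in the text.

```python
# Reported probability of identity, expressed as "1 individual in N".
reported_one_in = 66_005_424
pi_from_ratio = 1 / reported_one_in

# Bonferroni-adjusted significance level.
alpha = 0.05
n_tests = 5 * 3  # assumption: 5 loci tested in each of 3 webs
adjusted_alpha = alpha / n_tests

print(f"probability of identity: {pi_from_ratio:.2e}")
print(f"adjusted significance level: {adjusted_alpha:.3f}")
```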

The F-statistics over all loci and their respective 95% confidence bounds were as follows: FIT = 0.056 (0.026, 0.078), FST = 0.011 (0.006, 0.015), and FIS = 0.046 (0.018, 0.067); estimates of RST over loci were: 0.007 (weighted), 0.004 (Goodman 1997), and 0.004 (unweighted). The AMOVA partitioned genetic variation into 0.160% among individuals within webs, 99.820% within individuals, and 0.010% among webs. Hardy-Weinberg equilibrium tests among individuals within webs and among individuals among webs revealed significant deviations from Hardy-Weinberg equilibrium after 1,000 permutations at loci Nma01 (P = 0.001), Nma04 (P = 0.001), Nma06 (P = 0.022), Nma10 (P = 0.001), and over all loci (P = 0.001). After Bonferroni corrections (adjusted 95% significance level = 0.005, after 1,000 permutations), 6 pairs of loci were found to be significantly associated (indicating genotypic disequilibrium). These were: Nma01–Nma06 (P = 0.001), Nma01–Nma10 (P = 0.002), Nma04–Nma06 (P = 0.001), Nma04–Nma11 (P = 0.001), Nma06–Nma10 (P = 0.003), and Nma10–Nma11 (P = 0.028). Mantel tests yielded statistically significant probabilities for the correlation of genetic and geographic distances for web I (P = 0.001), web II (P = 0.014), web III (P = 0.001), and for the entire data set (P = 0.001).

Composite genotypes.—To estimate the actual number of populations represented, a wide range of increasing values of K (1–13) was tested with both 5 and 10 haplotype scores, yet none yielded optimal results. For the set of 5 haplotype scores, likelihood values increased by 100 or more points over the range K = 1–5, after which they continued to increase less rapidly until K = 10. From K = 11 to K = 13 likelihood values varied without any apparent trend. The set of 10 haplotype scores produced similar results, with likelihood values increasing by several hundred points from K = 1 to K = 7, and values consistently increasing by a lesser margin until K = 9, after which likelihood values varied without apparent trend. However, individuals were strongly assigned to groups having asymmetrical sample sizes in simulations of K = 5–8 and K = 4–11 for the sets of 5 and 10 haplotype scores, respectively, which is strongly indicative of the presence of population structure (Pritchard and Wen 2003).

Interestingly, individuals belonging to the same haplotype clustered together in the neighbor-joining distance phylogram based on the combined data set (Fig. 2). In addition, the Mantel test yielded a statistically significant probability (P = 0.001) that the haplotype and microsatellite distance matrices are correlated, and therefore congruent. However, none of the clustering schemes provided by Structure, for any of the most likely range of K, mapped to include any particular clade in the combined data phylogram.

Fig. 1

Neighbor-joining phylogram of D-loop haplotypes constructed from the absolute number of differences among haplotypes. Haplotype number (H) is followed by the trapping web number (W) in which the haplotype is present. The number of individuals possessing the haplotype in each web is given in parentheses. Reference samples are labeled according to county of provenance. All other samples were collected within the Chaparral Wildlife Management Area.

Adequacy of sample size.—Significant differences among the 10 cumulative groups were detected in observed (P = 0.007) and expected (P = 0.019) heterozygosity, and no significant differences were observed in any of the other parameters (allelic richness, P = 0.998; Fst, P = 0.955; and relatedness, P = 0.946), although the differences observed in FIS were marginally significant (P = 0.057). Estimates for each parameter appeared to asymptote at different points in the cumulative analysis with observed heterozygosity needing as few as 275 individuals, and allelic richness and FIS requiring 495 individuals (Table 4).

Table 4.

Sample size (SS), sample size per web (SW; WI = web I, WII = web II, WIII = web III), allelic richness (AR), observed (HO) and expected (HE) heterozygosity, FIS, Fst, and relatedness (Rel.) for each of the 10 groups used to test for the effects of sample size.

Group no.   SS    SW         AR      HO      HE      FIS     Fst     Rel.
1           55    24 WI      0.836   0.851   0.017   0.017   0.017   0.017
                  10 WII
                  21 WIII
2           110   41 WI      8.973   0.811   0.852   0.048   0.017   0.032
                  23 WII
                  46 WIII
3           165   58 WI      8.838   0.800   0.845   0.053   0.016   0.031
                  37 WII
                  70 WIII
4           220   71 WI      8.980   0.808   0.846   0.044   0.014   0.026
                  69 WII
                  80 WIII
5           275   84 WI      8.847   0.794   0.843   0.058   0.014   0.026
                  91 WII
                  100 WIII
6           330   100 WI     8.781   0.795   0.839   0.053   0.014   0.025
                  112 WII
                  118 WIII
7           385   118 WI     8.793   0.795   0.839   0.053   0.012   0.023
                  143 WII
                  124 WIII
8           440   137 WI     8.783   0.797   0.841   0.053   0.011   0.021
                  169 WII
                  134 WIII
9           495   163 WI     8.765   0.800   0.840   0.048   0.011   0.021
                  184 WII
                  148 WIII
10          549   172 WI     8.763   0.798   0.836   0.046   0.011   0.021
                  211 WII
                  166 WIII
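The heterozygosity and FIS columns of Table 4 can be compared through the simple approximation FIS ≈ 1 − HO/HE. The sketch below applies it to groups 2–10, with values copied from Table 4. FSTAT estimates FIS with the estimator of Weir and Cockerham (1984), so the approximation is expected to agree only to within rounding; the function name is hypothetical.

```python
def fis_from_heterozygosity(ho, he):
    """Approximate inbreeding coefficient from observed and expected heterozygosity."""
    return 1.0 - ho / he


# (group, HO, HE, reported FIS) for groups 2-10 of Table 4.
rows = [
    (2, 0.811, 0.852, 0.048),
    (3, 0.800, 0.845, 0.053),
    (4, 0.808, 0.846, 0.044),
    (5, 0.794, 0.843, 0.058),
    (6, 0.795, 0.839, 0.053),
    (7, 0.795, 0.839, 0.053),
    (8, 0.797, 0.841, 0.053),
    (9, 0.800, 0.840, 0.048),
    (10, 0.798, 0.836, 0.046),
]

max_gap = 0.0
for group, ho, he, reported in rows:
    approx = fis_from_heterozygosity(ho, he)
    max_gap = max(max_gap, abs(approx - reported))
    print(f"group {group}: approx {approx:.3f}, reported {reported:.3f}")
print(f"largest difference: {max_gap:.4f}")
```

Positive values indicate fewer heterozygotes than expected, the same direction of departure as the heterozygote deficits described in the microsatellite results.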

Discussion

D-loop sequences.—The number of haplotypes reported in this study (53 from 150 individuals) was similar proportionally to that (42 from 114) reported by Méndez-Harclerode et al. (2005). In relation to the size of the study site, the number of haplotypes was considerably higher than those reported by other mammal studies (Ehrich and Stenseth 2001; Klaus et al. 2001; Matocq et al. 2000; Matson et al. 2000) but comparable to those found by Bickham et al. (1996) after compensating for the size of D-loop fragment used. Low nucleotide-diversity values indicated that haplotypes were closely related; most haplotypes differed by only 1 or 2 nucleotides. High haplotype diversity in concurrence with low nucleotide diversity has been linked to population growth after a period of low effective population size (Grant and Bowen 1998). Further evidence for the growth of this population may have been suggested by the marginally significant negative overall Fs value (−8.169, P = 0.056), indicative of an excess of new mutations concomitant with population growth. The overall Fst value (0.061) was moderate according to the criteria given by Wright (1978), probably because of the number of unique web-specific haplotypes (i.e., 45 haplotypes were restricted to a particular trapping web in relation to the 8 haplotypes present in multiple webs).

The neighbor-joining phylogram also supported the presence of many closely related haplotypes. As reported by Méndez-Harclerode et al. (2005), some haplotype clades have a widespread distribution, some including haplotypes from areas as distant as southern, western, and northwestern Texas and eastern New Mexico (an area of approximately 10,350,000 km²), whereas other haplotype clades appear to be restricted to the Chaparral Wildlife Management Area (approximately 36 km²).

Microsatellite genotypes.—The 5 microsatellite loci utilized in this study were highly variable, as evidenced by the presence of 47 alleles at 1 locus (Nma10), the high probability of unique identity (1 individual in 66,005,424), and the moderate-to-high polymorphic information content values (0.617–0.946 across loci and webs). Overall relatedness was low (0.021), which may be due to the highly polymorphic loci examined.
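
The variability measures mentioned above can be sketched from allele frequencies alone. The code below assumes the standard formulas for expected heterozygosity (1 − Σp²), polymorphic information content (1 − Σp² − Σ2p²q² over allele pairs), and the single-locus probability of identity under Hardy-Weinberg proportions (Σp⁴ + Σ(2pq)² over allele pairs). The allele frequencies are hypothetical, and the program used in the study may apply different corrections, so this is an illustration rather than a reproduction of the published values.

```python
from itertools import combinations
from math import prod


def expected_heterozygosity(freqs):
    """HE = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over allele pairs of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


def probability_of_identity(freqs):
    """Chance that 2 unrelated individuals share a genotype at one locus."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


# Hypothetical locus with 4 equally frequent alleles
locus = [0.25, 0.25, 0.25, 0.25]
print(f"HE  = {expected_heterozygosity(locus):.4f}")
print(f"PIC = {polymorphic_information_content(locus):.4f}")

# Across independent loci the single-locus probabilities multiply;
# the reciprocal gives a "1 individual in N" figure.
multilocus = prod(probability_of_identity(locus) for _ in range(5))
print(f"1 individual in {1 / multilocus:,.0f}")
```

Loci with many alleles at similar frequencies drive the single-locus probability of identity down quickly, which is why a small panel of highly polymorphic loci can distinguish individuals.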

Bayesian clustering analysis indicated that all sampled individuals belong to a single population, a result supported by the low overall FST value (0.011), which suggests an absence of structure among individuals. However, a slight deficit of heterozygotes in some locus-web combinations (webs I and II at locus Nma10, and web III at loci Nma04, Nma06, and Nma10) suggested the possible presence of null alleles, which could lead to mistakes in the estimation of K (Pritchard and Wen 2003). Null alleles are nevertheless unlikely to be responsible for the observed deviations from Hardy-Weinberg equilibrium, because we obtained amplicons for all individuals at each locus. Alternatively, the heterozygote deficiency could be the result of rare alleles, correlations among loci, or the sampling of closely related individuals. Additional evidence for the presence of structure was provided by the significant correlations between genetic and geographic distances yielded by the Mantel tests (P = 0.001 to P = 0.014), suggesting that isolation-by-distance may be a major mechanism driving the population structure of woodrats in this area.
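
The logic of a Mantel test for isolation-by-distance can be sketched as a permutation test on two distance matrices. The version below is a minimal one-sided implementation in plain Python, assuming Pearson correlation as the test statistic. The coordinates and the genetic distances derived from them are invented for illustration, and the function names, permutation count, and seed are arbitrary choices rather than the settings used in the study.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test: correlation of 2 distance matrices and its P value."""
    n = len(dist_a)
    pairs = list(combinations(range(n), 2))
    a_values = [dist_a[i][j] for i, j in pairs]
    observed = pearson(a_values, [dist_b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel individuals in the second matrix
        shuffled = [dist_b[order[i]][order[j]] for i, j in pairs]
        if pearson(a_values, shuffled) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Hypothetical individuals along a transect; genetic distance rises with
# geographic distance, as expected under isolation-by-distance.
positions = [0, 1, 2, 4, 7, 11]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[math.sqrt(d) for d in row] for row in geographic]

r, p = mantel_test(geographic, genetic)
print(f"Mantel r = {r:.3f}, P = {p:.3f}")
```

Whole individuals are permuted rather than single matrix entries because pairwise distances that share an individual are not independent of one another.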

Composite genotypes.—All of the most likely values for K based on the combined data were high in comparison to the value of K obtained from the analysis of microsatellite data alone. It is therefore probable that K in the combined analysis primarily reflected maternal lineages, although this population does not contain sufficient genetic structure to allow such groups to be detected. However, the combined data phylogram (Fig. 2) correctly grouped individuals belonging to the same haplotype, even though only 5 of the 15 markers used to construct the tree were haplotype markers. This result, along with the significant correlation between the haplotype and genotype scores, suggests that units of structure such as maternal lineages may exist in this population but that, because of either admixture or insufficient power of the genetic markers used, we were not able to identify them.

Adequacy of sample size and loci.—The goals of this study were to obtain meaningful and accurate estimates of population genetic parameters through intensive sampling and the utilization of nuclear and mitochondrial markers. The fact that both types of genetic markers, microsatellite loci and nucleotide sequence data from the mitochondrial D-loop, yield comparable and congruent estimates of population genetic parameters suggests that the estimates are accurate, thereby achieving the latter goal. The former goal concerning a meaningful estimate of genetic diversity by intensive sampling of members of the local population was addressed through the testing of differences in population genetic parameters based on 10 cumulative groups of individuals. The results of this test, listed in Table 4, suggest that population genetic studies may need to sample several hundred individuals in order to obtain meaningful results, depending on the parameters of interest.

Conclusions.—Analyses of D-loop sequence data and microsatellite genotype data provide similar estimates of genetic structure after accounting for differing degrees of variability inherent to each marker. For example, a high number of haplotypes was observed in the D-loop sequence data, yet in many instances identical haplotypes were observed in 2 or more individuals (e.g., 16 individuals possessed haplotype 7), whereas all microsatellite genotypes were individual-specific. Results of the AMOVA for each data set attributed most of the genetic variation to differences among individuals within webs, and FST values were moderate to low. This apparent congruence of the 2 data sets was supported by the significant correlation indicated by the Mantel test, which provided further support for the combined data approach used in this study.

Fig. 2

Neighbor-joining phylogram based on the absolute number of differences, constructed from the combined data set containing the 5 haplotype distances and 5 microsatellite loci, coded with a single character per haplotype distance or allele. Haplotype number (H) is followed by the trapping web number (W) for each individual. The number of individuals possessing the haplotype in each web is given in parentheses. All samples were collected at the Chaparral Wildlife Management Area.

Méndez-Harclerode et al. (2005) proposed that the high levels of genetic diversity could be a product of the commingling of 3 formerly distinct woodrat populations. Results of the Bayesian clustering analysis on the microsatellite genotypes argued against this hypothesis and supported the presence of a single population within the study area. However, results of the same analysis on the combined data suggested the presence of levels of substructure that, although inadequately defined, may be real. The presence of structure also was supported by the detection of isolation-by-distance within each web and across the whole Chaparral Wildlife Management Area. The idea that the population may be growing was supported in the current study by the pattern of low nucleotide diversity and high haplotype diversity, and by the value of the overall Fs statistic, which approached significance (P = 0.056). This population may be growing because of the land-use practices in the Chaparral Wildlife Management Area. Events such as turnover in habitat, periodic burns, plowing, and cattle grazing may provide sufficient land disturbance to increase the area habitable by woodrats and encourage their colonization of this area. Similarly, high levels of predation may encourage greater genetic turnover, resulting in a pattern of genetic population expansion.

Additionally, these results indicated the presence of widespread maternal lineages, some occupying most of Texas and eastern New Mexico. A number of phenomena could be acting alone or in conjunction to produce these widespread lineages, such as larger than expected ranges, dispersal, high mutation rates, preferential mating, high woodrat density, inadequate sampling, or retention of ancestral haplotypes (Avise 1994). Additionally, some of the variation could be an artifact of temporal effects introduced by grouping different cohorts in the same sample. However, because we have recaptured some individual woodrats over 3 years (indicating a minimum life span of 3 years), it is unlikely that temporal effects could have significantly biased the results of our 4-year study.

Analyses of D-loop sequence data and microsatellite genotype data, independently and in conjunction, have confirmed the presence of high levels of genetic diversity in the woodrat population in the Chaparral Wildlife Management Area. As previously discussed, few studies have documented a similar level of genetic diversity within a single population. This may be due to insufficient sampling, given that expected heterozygosity, also called gene diversity, did not asymptote until after 330 individuals were included in the analysis. Similarly, FST, a commonly reported population genetic parameter, required at least 440 individuals to reach stable values. These results suggest that levels of genetic diversity may be underestimated in population studies that fail to sample a sufficient number of individuals from a single population.
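
One way to make the notion of a stable estimate concrete is to find the smallest cumulative sample size from which every later estimate stays within a fixed tolerance of the final one. The sketch below applies that rule to the HE and Fst columns of Table 4. The function name is illustrative, and the tolerances are arbitrary choices picked here so that the sketch reproduces the thresholds quoted above; the text does not state the numerical criterion that was actually used.

```python
def first_stable_sample_size(sample_sizes, values, tolerance):
    """Smallest sample size from which all later estimates stay within
    `tolerance` of the final estimate."""
    final = values[-1]
    stable_from = sample_sizes[-1]
    # Walk backward from the largest group until an estimate falls outside the band.
    for size, value in zip(reversed(sample_sizes), reversed(values)):
        if abs(value - final) > tolerance:
            break
        stable_from = size
    return stable_from


# Cumulative groups 1-10 from Table 4
sample_sizes = [55, 110, 165, 220, 275, 330, 385, 440, 495, 549]
h_exp = [0.851, 0.852, 0.845, 0.846, 0.843, 0.839, 0.839, 0.841, 0.840, 0.836]
fst = [0.017, 0.017, 0.016, 0.014, 0.014, 0.014, 0.012, 0.011, 0.011, 0.011]

# Illustrative tolerances (assumptions, not values taken from the study)
he_stable = first_stable_sample_size(sample_sizes, h_exp, tolerance=0.006)
fst_stable = first_stable_sample_size(sample_sizes, fst, tolerance=0.0005)
print(f"HE stable from {he_stable} individuals")
print(f"Fst stable from {fst_stable} individuals")
```

A different tolerance would move these thresholds, so the result depends on how much residual drift is considered acceptable for the parameter of interest.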

Acknowledgments

We thank K. Sefc for providing a version of IDENTITY that would accommodate our large sample size; J. Pritchard for providing helpful suggestions for the combined data analysis; S. Davis and R. A. Van Den Bussche, who were of great help with data analysis; J. J. Harclerode and INCODE, who provided use of computers for the Structure analysis; and M. L. Haynie, who performed the most thorough revision and who, along with B. D. Baxter, J. D. Hanson, and N. D. Durish, listened and debated ideas pertaining to this manuscript and provided helpful suggestions. R. J. Baker and C. Jones also revised the manuscript and provided useful advice. This research was financially supported by National Institutes of Health grant AI-41435, entitled “Ecology and Epidemiology of Emerging Arenaviruses in the Southwestern United States.”

Appendix I

GenBank accession numbers.—Specimens are listed by locality, TK number (unique Texas Tech museum karyotype reference number), collecting site (WI-WIII for web sites I–III), haplotype number (H), and GenBank accession number (AY). This list includes only the 122 newly sampled individuals. GenBank numbers for the remaining 28 individuals are given in Méndez-Harclerode et al. (2005).

Neotoma micropus.—Texas; Dimmit Co., Chaparral Wildlife Management Area (TK100149, WI, H14, AY903782; TK100150, WI, H47, AY903783; TK100152, WI, H49, AY903758; TK100153, WI, H04, AY903784; TK100190, WII, H10, AY903786; TK100191, WII, H16, AY903787; TK100192, WII, H42, AY903788; TK100205, WII, H42, AY903761; TK100206, WII, H30, AY903762; TK100207, WII, H07, AY903763; TK100208, WII, H10, AY903764; TK100226, WI, H14, AY903789; TK100229, WI, H35, AY903844; TK100231, WI, H14, AY903765; TK100310, WI, H49, AY903842; TK100311, WI, H49, AY903875; TK100312, WI, H17, AY903791; TK100313, WI, H14, AY903766; TK100315, WI, H49, AY903792; TK100316, WI, H05, AY903793; TK100317, WI, H49, AY903767; TK100318, WI, H55, AY903768; TK100380, WI, H35, AY903846; TK100381, WI, H61, AY903847; TK100382, WI, H35, AY903796; TK100383, WI, H58, AY903843; TK100384, WI, H35, AY903797; TK100385, WI, H04, AY903798; TK100386, WI, H04, AY903799; TK100395, WII, H43, AY903769; TK100396, WII, H15, AY903770; TK100398, WII, H10, AY903771; TK100399, WII, H10, AY903800; TK100400, WII, H07, AY903801; TK100401, WII, H52, AY903802; TK100402, WII, H44, AY903772; TK100445, WI, H04, AY903805; TK100448, WI, H05, AY903807; TK100450, WII, H10, AY903773; TK100451, WII, H10, AY903774; TK100518, WI, H05, AY903874; TK100519, WI, H35, AY903808; TK100521, WI, H05, AY903809; TK100522, WII, H05, AY903775; TK100524, WII, H17, AY903776; TK100525, WII, H45, AY903777; TK100526, WII, H44, AY903778; TK100527, WII, H10, AY903810; TK100582, WI, H17, AY903815; TK100584, WI, H14, AY903816; TK100589, WII, H07, AY903817; TK100590, WII, H46, AY903779; TK100591, WII, H13, AY903831; TK100644, WII, H60, AY903845; TK100717, WII, H50, AY903821; TK100718, WII, H02, AY903822; TK100719, WII, H02, AY903823; TK100721, WII, H02, AY903824; TK100722, WII, H02, AY903829; TK100765, WII, H02, AY903834; TK100766, WII, H02, AY903825; TK100768, WII, H02, AY903827; TK100770, WII, H02, AY903828; TK100771, WII, H02, AY903826; TK100772, WII, H02, AY903871; 
TK123780, WI, H63, AY903849; TK123799, WI, H14, AY903850; TK124401, WI, H35, AY903851; TK124402, WI, H34, AY903868; TK124489, WI, H66, AY903854; TK124490, WI, H04, AY903855; TK124491, WI, H35, AY903856; TK124492, WI, H67, AY903857; TK124493, WI, H03, AY903869; TK124498, WI, H20, AY903859; TK124572, WI, H14, AY903863; TK124573, WI, H68, AY903864; TK124574, WI, H35, AY903870).

Neotoma micropus.—Texas; La Salle Co., Chaparral Wildlife Management Area (TK100084, WIII, H54, AY903754; TK100126, WIII, H04, AY903780; TK100128, WIII, H53, AY903755; TK100130, WIII, H19, AY903756; TK100131, WIII, H57, AY903830; TK100132, WIII, H31, AY903757; TK100133, WIII, H04, AY903781; TK100160, WIII, H07, AY903759; TK100161, WIII, H48, AY903785; TK100162, WIII, H53, AY903760; TK100287, WIII, H07, AY903790; TK100337, WIII, H07, AY903794; TK100338, WIII, H07, AY903839; TK100339, WIII, H07, AY903840; TK100340, WIII, H16, AY903795; TK100346, WIII, H59, AY903836; TK100404, WIII, H07, AY903803; TK100406, WIII, H04, AY903837; TK100409, WIII, H32, AY903838; TK100411, WIII, H31, AY903804; TK100412, WIII, H20, AY903835; TK100453, WIII, H07, AY903806; TK100529, WIII, H04, AY903811; TK100531, WIII, H53, AY903833; TK100533, WIII, H20, AY903812; TK100534, WIII, H32, AY903813; TK100536, WIII, H37, AY903814; TK100592, WIII, H04, AY903818; TK100593, WIII, H32, AY903819; TK100594, WIII, H38, AY903820; TK100599, WIII, H62, AY903848; TK100600, WIII, H83, AY903873; TK100725, WIII, H07, AY903841; TK100761, WIII, H03, AY903832; TK123793, WIII, H37, AY903865; TK123794, WIII, H64, AY903866; TK123978, WIII, H04, AY903867; TK124406, WIII, H20, AY903852; TK124407, WIII, H65, AY903853; TK124495, WIII, H20, AY903858; TK124497, WIII, H37, AY903872; TK124568, WIII, H04, AY903860; TK124569, WIII, H16, AY903861; TK124570, WIII, H07, AY903862).

Footnotes

  • Associate Editor was Jesús E. Maldonado.

