Der. First, our results apply to opposite-sex non-Hispanic white pairs within the United States. For nonwhite pairs within the United States, different results might be obtained due to limited genetic variance among non-Hispanic whites compared with other groups (23) or because of different social contexts for non-Hispanic whites compared with others (e.g., the racial inequities that exist in the United States). That is, if individuals are selecting into a relationship because of genetic similarity, then we might expect GAM to be higher among non-Hispanic whites, who are less likely than others to face limitations in terms of residential, educational, or occupational choices. Second, patterns of GAM and EAM might differ in same-sex couples. Third, differences may be changing over time. For example, recent research (24) suggests that there has been a rise in assortative mating, which has contributed to a rise in income inequality. Fourth, we estimated genetic similarity using SNPs from across the genome. Future research could focus on SNPs known to be important for education (11) or those identified in other GWAS to examine homogamy at a finer level than our whole-genome approach. Given our results from the SNPs implicated in the education GWAS, it might be that analyses at levels finer than the entire genome but much larger than a single SNP, such as chromosomes, would be appropriate.

Materials and Methods

Data. This paper uses data from the Health and Retirement Study (HRS) RAND fat files (13). Access to the genome-wide data was approved by the National Center for Biotechnology Information Genotypes and Phenotypes Database (access no. 19335-3). Of the 9,429 individuals with genetic data (described below), 4,584 were from the HRS cohort (five other cohorts are also included in the full data). Of the 4,584, there were 3,504 non-Hispanic whites.
Of these, 1,763 individuals were in 862 spousal pairs (some individuals had more than one spouse). We focus only on those individuals in spousal pairs with complete data (1,716 individuals in 825 spousal pairs), as there are differences between individuals in spousal pairs and those not in spousal pairs (e.g., spouses have roughly a quarter year more education on average). These individuals were born over a wide span of years (between 1920 and 1970), but the majority (59%) were born in the 1930s. To assess EAM, we used total years of education. In our sample, 14% had less than a high school education, 38% had a high school education, and the remainder had more than a high school education. We also used information on the respondent's birthplace (coded as one of nine census divisions plus two categories for US birth with no additional information and foreign birth, 0.1% and 5.1% of the sample, respectively).

Genetic data for the HRS are based on DNA samples collected in two phases. The first phase was collected via buccal swabs in 2006 using the Qiagen Autopure method. The second phase used saliva samples collected in 2008 and extracted with Oragene. Genotype calls were then made based on a clustering of both data sets using the Illumina HumanOmni2.5-4v1 array (details on the quality control process can be found via ref. 25). After standard quality control procedures (e.g., removing SNPs that were missing in more than 5% of samples, had minor allele frequencies below 1%, or failed to meet Hardy-Weinberg equilibrium, violations of which suggest errors in the genotyping process), we retained 1,707.
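The quality control filters named above can be illustrated with a minimal sketch. This is not the pipeline used for the HRS data (see ref. 25 for that); it only shows what the three filters do. It assumes a hypothetical samples-by-SNPs array coded as 0, 1, or 2 copies of one allele, with NaN for missing calls. The 5% missingness and 1% minor allele frequency cutoffs come from the text. The Hardy-Weinberg p-value cutoff (`HWE_P_MIN`) is an assumption, since the text does not state one, and the chi-square test is one common choice among several.

```python
import math
import numpy as np

MAX_MISSING = 0.05  # from the text: drop SNPs missing in more than 5% of samples
MIN_MAF = 0.01      # from the text: drop SNPs with minor allele frequency below 1%
HWE_P_MIN = 1e-4    # ASSUMPTION: the text gives no Hardy-Weinberg cutoff


def hwe_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg proportions from genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:  # monomorphic SNP: the test is undefined
        return 1.0
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def qc_mask(genotypes):
    """Return a boolean mask marking the SNP columns that pass all three filters."""
    g = np.asarray(genotypes, dtype=float)
    keep = np.zeros(g.shape[1], dtype=bool)
    for j in range(g.shape[1]):
        col = g[:, j]
        missing = np.isnan(col)
        if missing.all() or missing.mean() > MAX_MISSING:
            continue  # too many missing calls
        called = col[~missing]
        freq = called.sum() / (2 * called.size)
        if min(freq, 1 - freq) < MIN_MAF:
            continue  # minor allele too rare
        counts = [int((called == k).sum()) for k in (0, 1, 2)]
        if hwe_pvalue(*counts) < HWE_P_MIN:
            continue  # genotype counts depart from Hardy-Weinberg proportions
        keep[j] = True
    return keep
```

Applied to a genotype array, `g[:, qc_mask(g)]` would keep only the SNP columns that pass. In practice these filters are usually run with dedicated genetics software rather than hand-written code.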