Learn to use dbGaP, the NCBI database of Genotypes and Phenotypes, a repository for the results of studies investigating the interaction between genotypes and phenotypes. The dbGaP resource includes detailed reports of variables, documents, analyses, and data sets from genome-wide association studies and other large-scale studies that use high-throughput genotyping and sequencing methods. Learn about the two levels of access to dbGaP, open and controlled, and how to search and retrieve open-access data. The dbGaP resource organizes studies under stable identifiers and in standard formats, so the data can be browsed, downloaded, and used to support additional studies and the replication of results.
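
As a small illustration of what a stable identifier looks like in practice, the sketch below parses a dbGaP study accession of the form phs000001.v3.p1 (study number, version, participant set). The pattern and the names StudyAccession and parse_study_accession are assumptions made for this example and do not come from the tutorial; check any accession against the dbGaP site before relying on it.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape of a study accession: "phs" + 6 digits, ".v" + version,
# ".p" + participant-set number, e.g. phs000001.v3.p1.
STUDY_ACCESSION = re.compile(r"^phs(\d{6})\.v(\d+)\.p(\d+)$")


class StudyAccession(NamedTuple):
    """Hypothetical container for the three parts of a study accession."""
    study_number: int
    version: int
    participant_set: int


def parse_study_accession(text: str) -> Optional[StudyAccession]:
    """Return the parsed parts, or None if the text does not match the pattern."""
    match = STUDY_ACCESSION.match(text.strip())
    if match is None:
        return None
    return StudyAccession(*(int(group) for group in match.groups()))
```

Because the version and participant-set numbers are part of the accession, two accessions with the same study number but different versions refer to different releases of the same study.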

You will learn:

  • to perform basic and advanced searches and navigate the dbGaP site
  • to understand the displays for the main open access data types: studies, variables, documents, analyses, and data sets
  • to use the analysis browser to identify candidate genomic regions for genotype-phenotype associations
  • to manipulate and customize the browser displays
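
The basic searches in the first item can also be approximated outside the browser. The sketch below is a minimal example that assumes dbGaP is exposed through the NCBI E-utilities esearch endpoint under the database name "gap"; the helper names build_dbgap_search_url and search_dbgap are hypothetical, and the tutorial itself covers only the web interface.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_dbgap_search_url(term, retmax=20):
    """Build an esearch URL for a free-text query (assumes db name "gap")."""
    params = {"db": "gap", "term": term, "retmode": "json", "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


def search_dbgap(term, retmax=20):
    """Run the query over the network and return the list of matching record IDs."""
    with urlopen(build_dbgap_search_url(term, retmax)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]
```

Calling search_dbgap("macular degeneration", retmax=5) needs network access and returns internal record IDs only. It reaches the open-access summaries, not controlled-access data.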


This tutorial is a part of the tutorial group Human variations. You might find the other tutorials in the group interesting:

GAD: Genetic Association Database: An archived database associating human genes and polymorphisms with diseases

Madeline 2.0: Human pedigree diagram tools

DrugBank: A chemoinformatics and bioinformatics resource

DGV: Database of Genomic Variants: Catalogs and displays structural variation in the human genome

OMIM: Online Mendelian Inheritance in Man: A database of human genes, genetic diseases and disorders

CGAP: Characterize the molecular genetic changes that cause a normal cell to become a cancer cell

ENCODE Foundations: ENCyclopedia of DNA Elements

GeneSNPs: An integrated view of gene structure and SNP variations

NIEHS SNPs: National Institute of Environmental Health Sciences Environmental Genome Project (EGP) SNPs

HapMap: A database and analysis resource of human variation

Genetics Home Reference: A collection of data describing the effects of genetic variability on human health and disease

SeattleSNPs: Human SNPs in genes

dbSNP: NCBI's SNP database

GeneTests: A current, comprehensive genetic testing resource


Variation & Medical: Resources that include information about sequence variation, phenotypes, or medically relevant conditions.

NCBI: This category includes resources maintained at the National Center for Biotechnology Information (NCBI).


Friday SNPpets: This week's SNPpets consist of an abbreviated week, because I'm on the road for a conference. But there's plenty of interesting stuff already this week, and #BoG16 is just getting going. As usual, new ...

Friday SNPpets: This week's SNPpets include a lot of back-and-forth on the artist formerly known as personal genomics (now personalized medicine). We got the EU mucking up access, we have privacy issues unearthed, we ...

Video Tip of the Week: PhenDisco, "phenotype discoverer" for dbGaP data: The dbGaP, database of Genotypes and Phenotypes, repository at NCBI collects information from research projects that link genotype and phenotype information and human variation, across many different t...

Video Tip of the Week: PheGenI, Phenotype-Genotype Integrator: The hunt for variations in genes and genomes has been both fruitful and frustrating. We can see genome variations in a variety of ways, but we can't always connect them with a phenotype easily. And vic...

Video Tip of the Week: 1000 Genomes Dataset Browser from NCBI: A recent NCBI Newsletter announced the release of a new resource named the 1000 Genomes Dataset Browser, and that is the resource that I will be featuring in this tip. It is one of the tools available...

