Welcome to the Y-DNA Data Warehouse

The purpose of this project is to collect Y-DNA related test results from a variety of sources and make that information available to citizen scientists. The information in this collection is hosted by the Haplogroup R project and access to it is governed by our data policy outlined below.

Notice for YSEQ customers submitting WGS files: please join the Y-DNA Warehouse group. This allows us to directly access your BAM containing only chrY reads. You can then copy the link to your hg38_chrY.bam into the Shared Link Download URL box below.

Uploads of either BAM or CRAM files are supported using the shared link. If you have the option readily available, prefer CRAM: a CRAM file contains the same information as a BAM file but is about 50% of the size.

We are preparing to move to a new platform!

For more information see the migration status.

We may need to contact you if issues arise while processing your file.
Test Information
Unknown/Not Provided We will generate an identifier for this test.
The most common reference build is GRCh37/hg19, but new Big Y results are GRCh38/hg38. The easiest way to determine which you have is to look at the size of the ZIP archive. A Big Y VCF/BED ZIP archive larger than 1MB is GRCh38/hg38. It is expected that other labs will begin issuing GRCh38/hg38 results as well.
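The size check described above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb, not the Warehouse's own logic: the function names are hypothetical, and it assumes "1MB" means 1,000,000 bytes, which this page does not specify.

```python
import os

# Assumption: "1MB" is read as 1,000,000 bytes; the page does not say
# whether a decimal or binary megabyte is meant.
ONE_MB = 1_000_000


def guess_build_from_size(size_bytes: int) -> str:
    """Apply the rule of thumb: a Big Y VCF/BED ZIP over 1MB is GRCh38/hg38."""
    return "GRCh38/hg38" if size_bytes > ONE_MB else "GRCh37/hg19"


def guess_build(zip_path: str) -> str:
    """Guess the reference build of a Big Y ZIP archive on disk from its size."""
    return guess_build_from_size(os.path.getsize(zip_path))
```

The guess applies only to Big Y VCF/BED ZIP archives. Results from other labs need a different check.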
Most Distant Known Paternal Ancestor (Optional)
Please include only the paternal family name here. First and middle names should not be included. If you wish to provide more precise information, add it to the other information below.
The modern name of the most specific location where your ancestor was born. If you are only certain of a new world country, please use it. Check that the map pin corresponds to the correct region before submitting the form.
Any additional information you would like to share publicly about the known ancestry of this tester. DO NOT INCLUDE INFORMATION ABOUT LIVING INDIVIDUALS. USE THE CONTACT FORM FOR SENSITIVE INFORMATION. 2048 characters left.

Raw data upload

Dropbox users: please ensure the download link ends with dl=1 instead of dl=0. This allows the file to be downloaded directly.
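For those who prefer to script the change, here is a minimal Python sketch that rewrites a Dropbox shared link so that it ends with dl=1. The function name and the example URL are hypothetical. It assumes the link carries the flag as a `dl` query parameter, as Dropbox shared links normally do.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def dropbox_direct_link(url: str) -> str:
    """Return the shared link with its dl flag set to 1 and placed last."""
    parts = urlsplit(url)
    # Keep every query parameter except any existing dl flag.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "dl"]
    # Append dl=1 so the link ends with it.
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical example link, for illustration only.
print(dropbox_direct_link("https://www.dropbox.com/s/abc123/hg38_chrY.bam?dl=0"))
```

Any other query parameters in the link are kept as they are.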

Warning: The maximum upload size of the ZIP file is 60MB. If the file is larger or you are experiencing connection time-outs, please use a shared link from a service such as Dropbox or Google Drive. Both of these services offer free online storage for files up to 2GB. Many testing vendors also offer a sharing URL that can be pasted into the shared link box.
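You can check a file against the 60MB limit before uploading. The sketch below is illustrative only: the names are hypothetical, and it assumes 60MB means 60,000,000 bytes, which this page does not specify.

```python
import os

# Assumption: the 60MB limit is read as 60,000,000 bytes.
MAX_DIRECT_UPLOAD_BYTES = 60 * 1_000_000


def upload_method_for_size(size_bytes: int) -> str:
    """Return 'direct' if the file fits the upload limit, else 'shared_link'."""
    return "direct" if size_bytes <= MAX_DIRECT_UPLOAD_BYTES else "shared_link"


def upload_method(zip_path: str) -> str:
    """Choose an upload method for a ZIP archive on disk from its size."""
    return upload_method_for_size(os.path.getsize(zip_path))
```

A file under the limit may still time out on a slow connection. In that case, use a shared link anyway.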

As the legal owner of this data, or their proxy, I agree that this data can be used as per the Data Policy shown below.

Data Policy

This policy was created to balance the rights and privacy of individuals with the benefit to the whole community of gathering information for their research projects. We have tried to ensure the safety and privacy of any personal data, including data likely to have significant medical relevance or which can identify a specific person. At the same time, we have tried to retain enough information that test results are useful for research, meaningful for close matches, and can be cross-referenced against information on other sites.

  1. The following is the Policy agreed to on upload of data, between Submitters of that data (genetic testers or their designated proxies) and the Project. The Project is defined as those persons with administrative access to the data archive, or successors thereof.
  2. Submitters give the Project free license to analyse the genetic and ancestral data they submit, and publicly release semi-anonymized, filtered analyses of that data, and any associated meta-data found in the public domain. Released genetic data is to be limited to calls assigned to the Y-chromosome.
  3. Raw DNA sequencing data (e.g. BAM or FASTQ datasets) will only be shared with a member's explicit written consent. However, reduced sets of Y-chromosome data (including calls in VCF/gVCF format, test coverage information in BED format, and submitted meta-data) may be shared with co-operating projects.
  4. Tests are publicly identified by the meta-data supplied on submission, i.e. kit numbers and most-distant known paternal ancestor information. Project members may request that public reports anonymize all or part of this information to an internal project identifier instead. Such requests should be made by e-mail before submission to prevent public release of information.
  5. Submitters or legal data owners have the right to request that their raw data is removed from the data analysis at any time. However, since we release a reduced set of data into the public domain, we cannot guarantee these data are removed from external sites once the kit has been analyzed.
  6. The Project may contact Submitters about specific queries regarding their data, using the e-mail address supplied on submission. Sharing of e-mail addresses with any third parties will only be done with Submitters' consent.
  7. Minor updates to this agreement may be necessary, e.g. to modify or make explicit the names of people and parties; to include new data formats; to clarify specific points of ambiguity; or to ensure compliance with national and international law, existing privacy agreements with testing companies, or community guidelines. Such changes may be made by the Project without notification, provided they don't constitute material infringements of the rights and/or privacy granted to Submitters, as described in the version of the Policy they initially accept.

As of 19 October 2017, the list of project administrators is: James Kane (www.haplogroup-r.org), Alex Williamson (www.ytree.net), Iain McDonald (www.jb.man.ac.uk/~mcdonald/genetics.html), and Jef Treece (data analyst).