Y-DNA Warehouse Migration News

After years of development effort, the new Y-DNA Warehouse is preparing to relaunch in multiple phases. This page explains what is changing and provides some metrics on the migration progress.

UPDATE: Oct-10-2021

The maximum likelihood tree has been loaded into the demo site. Samples that were not part of this analysis are undergoing initial placement for follow-up analysis. This process will mostly focus on the branches not in The Big Tree until an internal team is trained to use the tools.

You can get a preview of the new tree on the demo site. Please note registration is disabled until we switch over.

Phase 1 Outline

The new platform is built upon a scalable reactive web framework, which finally retires the very basic submission and reporting systems. Everyone who has submitted a kit with the existing form will have an account registered on the new system. A mass email will be sent when we are live, allowing you to reset your password and log in for the first time.

The Subject Screen

Once logged in, you will see a Subject entry for every kit submitted from your email address. If you have a large number of samples, you can use the search box to quickly locate the one you are looking for. If two subjects are the same person tested at different labs, the "Merge Selected" tool combines them. Merging lets the system combine the sequencing information into a single profile for reporting and ensures the most complete signature possible for the phylogenetic tree development team.

The subjects screen
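As a rough illustration of what merging two subjects means for the data, here is a minimal sketch in Python. The `Subject` class, its fields, and the lab names are hypothetical; this page does not describe the warehouse's actual schema or merge logic, so treat this only as one way the idea could work.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    # Hypothetical model, not the warehouse's real schema.
    subject_id: int
    # Maps a lab name to the set of positive SNP calls reported by that lab.
    kits: dict[str, set[str]] = field(default_factory=dict)


def merge_subjects(primary: Subject, duplicate: Subject) -> Subject:
    """Return a new Subject holding the kits of both inputs.

    The primary subject's id is kept; calls from the same lab are unioned.
    """
    merged = {lab: set(calls) for lab, calls in primary.kits.items()}
    for lab, calls in duplicate.kits.items():
        merged.setdefault(lab, set()).update(calls)
    return Subject(primary.subject_id, merged)


def combined_calls(subject: Subject) -> set[str]:
    """All positive calls across every lab: the single combined profile."""
    return set().union(*subject.kits.values())


# Example: the same person tested at two (hypothetical) labs.
a = Subject(1, {"Lab A": {"M269", "L21"}})
b = Subject(2, {"Lab B": {"L21", "DF13"}})
merged = merge_subjects(a, b)
print(sorted(combined_calls(merged)))  # ['DF13', 'L21', 'M269']
```

The point of the sketch is that a merged profile can carry calls no single test reported on its own, which is why a merged subject gives the tree team a more complete signature.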

The Test Data Screen

Inside the subject management screens you can update the most distant known ancestor information (if needed) and see a summary of your STR and SNP testing as well as a report of the calls positioning you within the SNP tree. You can add additional testing or remove the data you no longer wish to share here.

The test data screen

Current Progress

There are roughly 12,000 samples slated to be part of the migration. Depending on the amount of testing, each import takes between 1 and 6 minutes. At the time of publication, 5,000 submissions are fully loaded and have had an elementary analysis performed to find the nearest branch; 2,200 submissions are staged to receive the first-pass analysis; and 500 have failed the first attempt to import. The remainder are running through the call import batch.
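The figures above imply how many samples are still in the call import batch and roughly how long they could take. The short Python sketch below works through that arithmetic. It assumes the imports run one at a time, which this page does not state; if imports run in parallel, the elapsed time would be shorter.

```python
# Approximate figures quoted in the progress paragraph.
TOTAL = 12_000
LOADED = 5_000
STAGED = 2_200
FAILED = 500
MIN_MINUTES, MAX_MINUTES = 1, 6  # per-import time range


def remaining_in_batch(total: int, loaded: int, staged: int, failed: int) -> int:
    """Samples not yet loaded, staged, or failed."""
    return total - loaded - staged - failed


def serial_hours(count: int, minutes_each: float) -> float:
    """Hours needed if imports run strictly one after another (an assumption)."""
    return count * minutes_each / 60


remaining = remaining_in_batch(TOTAL, LOADED, STAGED, FAILED)
low = serial_hours(remaining, MIN_MINUTES)
high = serial_hours(remaining, MAX_MINUTES)

print(f"Still in the call import batch: {remaining}")  # 4300
print(f"Fully loaded so far: {LOADED / TOTAL:.0%}")    # 42%
print(f"Serial import time for the remainder: {low:.0f} to {high:.0f} hours")  # 72 to 430
```

Under that serial assumption, the remaining 4,300 or so samples would need somewhere between about 3 and 18 days of import time.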

Phase 2 Outline

A suite of tools is in development to help a team of maintainers build out the tree with anonymized data within the warehouse. We are starting from a great point with iqtree, but the variety of file formats makes it challenging at best to standardize the non-BAM raw data from various labs into the call matrices needed for fully automated products. The tools are expected shortly after the relaunch window.

Phase 3 Outline

The Study system is the heart of the planned collaboration and matching systems. It allows you to contact other site members enrolled in a topic of mutual interest. Joining a study formed for a surname or haplogroup allows the members to communicate through the internal messaging system and project feeds. Researchers may also form higher-level studies that allow access to enrolled members with only MRCA information. This type of study is more appropriate for research on population movements, which does not require detailed genealogical information.

Thank you everyone for your patience in getting to this point!