Jim Cullen - Concept & JavaScript coding. Ken Nordtvedt - Research on modal haplotypes of population
varieties within haplogroup 'I'. Additional Credits & Acknowledgments included at the lower portion of the page
http://members.bex.net/jtcullen515/haplotest.htm
Welcome to the World Haplogroup & Haplo-I Subclade Predictor. Please don't look at the mess - we're still under construction - don't be surprised if the furniture gets moved around during the next few weeks. The movers seem to be taking their time and work at odd hours so I'll be sure to leave a light on for you. Besides these few inconveniences... come on in, we're open for visitors!
Let's start off by saying that Cullen's Predictor in no way replaces Whit Athey's Predictor, a program I still consider a marvel of applied statistics. I highly recommend Whit's Predictor and there's much to be learned by comparing the results obtained from both; we have two different predictors with slightly different goals and methods. To try out Whit's program for yourself, please follow this link to Whit Athey's Haplogroup Predictor.
The World Haplogroup & Haplo-I Subclade Predictor works on a Bootstrap WGD (weighted genetic distance) algorithm, a variation of a goodness-of-fit test intended to balance the trade-off between the size of the database, the complexity of the algorithm, and the accuracy of the prediction. In basic terms, the Predictor draws a large number of random samples of markers from the entered haplotype and predicts, for each sample, which modal haplotype best describes it. Each modal haplotype is then rated in percent by its ability to best describe the sampled markers over the trials.
Whit's Bayesian method is surprisingly accurate in predicting amongst 20-some haplogroups with only a few markers. My hope is to extend that to 100-some haplogroups, subhaplogroups, and subclades without sacrificing too much in the way of accuracy. Currently I am utilising 86 modal haplotypes (Y-STR-37 markers) representing the world's haplogroups and subhaplogroups, and 56 modal haplotypes (Y-STR-67 markers) of Haplo-I subclades, for a total of 142 modal haplotypes used for comparison to your own Y-STR signature.
You may find the following links helpful while learning to use the Predictor:
- The Help Page: Includes instructions for the use of the Predictor as well as for the COMET utility. Will include FAQs. The revision history will also be kept on this page.
- Explanation of Results: Soon to be added. Provides extra information regarding test results, with links, modal characteristics, SNP trees, and other material.
- Predictor Reference: Includes tables of the modal haplotypes used in the Predictor (to be updated) and a full description of the operation of the Predictor algorithm (to be added).
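For readers who think in code, here is a rough JavaScript sketch of the bootstrap idea described above. It is not the Predictor's actual source. The function names (`weightedDistance`, `bootstrapPredict`), the distance formula (a weighted sum of absolute repeat differences), the default weight of 1, sampling with replacement, and the tie-breaking rule are all assumptions made for illustration, and the marker values are made up.

```javascript
// Illustrative sketch only - NOT the Predictor's actual source.
// A haplotype is an object mapping marker name -> repeat count.
// Assumption: every modal defines every marker present in the haplotype.

// Weighted genetic distance over one sample of marker names.
// Assumption: distance = sum of weight * |entered - modal|; unlisted weights count as 1.
function weightedDistance(haplotype, modal, sample, weights) {
  let distance = 0;
  for (const marker of sample) {
    const w = weights[marker] !== undefined ? weights[marker] : 1;
    distance += w * Math.abs(haplotype[marker] - modal.markers[marker]);
  }
  return distance;
}

// Run `trials` bootstrap trials and rate each modal by its share of wins, in percent.
function bootstrapPredict(haplotype, modals, trials, weights = {}, rng = Math.random) {
  const markerNames = Object.keys(haplotype);
  const wins = new Map(modals.map((modal) => [modal.name, 0]));
  for (let t = 0; t < trials; t++) {
    // Draw markers at random, with replacement (assumed resampling scheme).
    const sample = [];
    for (let i = 0; i < markerNames.length; i++) {
      sample.push(markerNames[Math.floor(rng() * markerNames.length)]);
    }
    // The modal with the smallest distance wins this trial (ties go to the first).
    let bestName = null;
    let bestDistance = Infinity;
    for (const modal of modals) {
      const d = weightedDistance(haplotype, modal, sample, weights);
      if (d < bestDistance) {
        bestDistance = d;
        bestName = modal.name;
      }
    }
    wins.set(bestName, wins.get(bestName) + 1);
  }
  const percentages = {};
  for (const [name, count] of wins) {
    percentages[name] = (100 * count) / trials;
  }
  return percentages;
}

// Made-up values for illustration; these are not real modal haplotypes.
const enteredHaplotype = { "390": 24, "19": 14, "391": 10 };
const exampleModals = [
  { name: "Modal A", markers: { "390": 24, "19": 14, "391": 10 } },
  { name: "Modal B", markers: { "390": 22, "19": 16, "391": 11 } },
];
console.log(bootstrapPredict(enteredHaplotype, exampleModals, 200));
```

In this toy case "Modal A" matches the entered values exactly, so it wins every trial and is rated 100%. With real data several modals would share the wins, and the percentages would shift a little from run to run.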
Updates: Beta-Version 0.93a
After finishing the recent internal modifications, the tweaks, and the addition of several modal sets, as well as addressing several issues, I'm calling this a minor revision, 0.93a. This will probably be the last version posted (except for minor touches) while the core routine is rewritten. The 'Corbett-Q' module will also be added during the rewrite. By the time Beta-Version 0.95 is released, all modals should be in full 67-marker format.
The core routine must be rewritten due to timeout constraints on the JavaScript code. As more modals are added, trial runs must be reduced to avoid script timeouts. Fewer trial runs result in more variable prediction percentages from the random bootstrap prediction algorithm. Tolerable margins for prediction variability are now very difficult to maintain with 142 modal sets, and the problem will have to be addressed before I can contemplate taking all modal sets to full 67-marker format.
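To put rough numbers on the trade-off between trial runs and prediction variability, here is a back-of-envelope JavaScript sketch. It assumes the trials are independent, so that a modal's reported percentage behaves like a binomial proportion. The function names and the simple cost model (work proportional to trials x modals x markers) are my own assumptions for illustration, not measurements of the Predictor.

```javascript
// Back-of-envelope sketch; the independence assumption and the cost model are assumptions.

// Standard error, in percentage points, of a reported percentage when a modal's
// true win rate is p (between 0 and 1) and the run uses `trials` independent trials.
function percentStdError(p, trials) {
  return 100 * Math.sqrt((p * (1 - p)) / trials);
}

// Trials needed to bring that standard error down to `targetPoints` percentage points.
function trialsForStdError(p, targetPoints) {
  return Math.ceil((p * (1 - p) * 10000) / (targetPoints * targetPoints));
}

// Hypothetical cost model: one run costs about trials * modals * markers comparisons,
// so a fixed budget of comparisons caps the number of trials.
function maxTrials(budget, modalCount, markersPerModal) {
  return Math.floor(budget / (modalCount * markersPerModal));
}

console.log(percentStdError(0.5, 100)); // about 5 percentage points
console.log(percentStdError(0.5, 400)); // about 2.5: four times the trials halves the error
console.log(trialsForStdError(0.5, 1)); // 2500
```

Under these assumptions the error shrinks only with the square root of the number of trials, while the work per run grows with every modal and marker added. That is why adding modal sets under a fixed time limit makes the percentages noticeably less stable.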
390 | 19 | 391 | 385a | 385b | 426 | 388 | 439 | 389i | 392 | 389ii
FTDNA Panel 2 | 458 | 459a | 459b | 455 | 454 | 447 | 437 | 448 | 449 | 464a | 464b | 464c | 464d
FTDNA Panel 3 | 460 | H4 | YCAIIa | YCAIIb | 456 | 607 | 576 | 570 | CDYa | CDYb | 442 | 438
FTDNA Panel 4 | 531 | 578 | 395a | 395b | 590 | 537 | 641 | 472 | 406 | 511 | 425 | 413a | 413b
557 | 594 | 436 | 490 | 534 | 450 | 444 | 481