Efficient learning of microbial genotype-phenotype association rules.

Bioinformatics. 2010 Aug 1;26(15):1834-40

Authors: MacDonald NJ, Beiko RG

Abstract
MOTIVATION: Finding biologically causative genotype-phenotype associations from whole-genome data is difficult due to the large gene feature space to mine, the potential for interactions among genes and phylogenetic correlations between genomes. Associations within phylogenetically distinct organisms with unusual molecular mechanisms underlying their phenotype may be particularly difficult to assess.
RESULTS: We have developed a new genotype-phenotype association approach that uses Classification based on Predictive Association Rules (CPAR), and compared it with NETCAR, a recently published association algorithm. Our implementation of CPAR gave, on average, slightly higher classification accuracy and ran approximately 100 times faster. Given the influence of phylogenetic correlations on the extraction of genotype-phenotype association rules, we furthermore propose a novel measure that downweights the dependence among samples by modeling shared ancestry using conditional mutual information, and demonstrate that it complements traditional mining approaches.
AVAILABILITY: Software implemented for this study is available under the Creative Commons Attribution 3.0 license from the author at http://kiwi.cs.dal.ca/Software/PICA
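ILLUSTRATION: The abstract does not give the exact form of the conditional mutual information measure, so the following is only a minimal sketch of the textbook quantity I(X;Y|Z) for discrete variables, not the authors' implementation. The function name and the choice of inputs (gene presence X, phenotype Y, and a hypothetical clade label Z standing in for shared ancestry) are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Return I(X;Y|Z) in bits for equal-length sequences of discrete values.

    Probabilities are plain empirical frequencies (no smoothing), which is
    an assumption of this sketch rather than anything stated in the abstract.
    """
    n = len(x)
    if not (n == len(y) == len(z)):
        raise ValueError("x, y and z must have the same length")
    if n == 0:
        return 0.0

    # Joint and marginal counts needed for the definition.
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)

    cmi = 0.0
    for (xi, yi, zi), n_xyz in c_xyz.items():
        # p(x,y,z) * log2( p(z) * p(x,y,z) / (p(x,z) * p(y,z)) );
        # the 1/n factors cancel inside the logarithm.
        ratio = (c_z[zi] * n_xyz) / (c_xz[(xi, zi)] * c_yz[(yi, zi)])
        cmi += (n_xyz / n) * log2(ratio)
    return cmi
```

For a gene profile [1, 1, 0, 0] and a phenotype profile [1, 1, 0, 0], conditioning on the clade labels [0, 0, 1, 1] gives 0 bits, because the association is fully explained by clade membership, whereas conditioning on a single shared clade [0, 0, 0, 0] gives 1 bit.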

PMID: 20529891 [PubMed - indexed for MEDLINE]