Publication
Journal of Proteome Research
Paper

Clinical and pharmacogenomic data mining: 3. Zeta theory as a general tactic for clinical bioinformatics

Abstract

A new approach, a Zeta Theory of observations, data, and data mining, is being forged from a theory of expected information into an even more cohesive and comprehensive form by the challenge of general genomic, pharmacogenomic, and proteomic data. In this paper, the focus is not on studies using the specific tool FANO (CliniMiner) but on extensions to a new broader theoretical approach, aspects of which can easily be implemented into, or otherwise support, excellent existing methods, such as forms of multivariate analysis and IBM's product Intelligent Miner. The theory should perhaps be distinguished from an existing purely number-theoretic area sometimes also known as Zeta Theory, which focuses on the Riemann Zeta Function and the ways in which it governs the distribution of prime numbers. However, Zeta Theory as used here overlaps heavily with it and actually makes use of these same matters. The distinction is that it enters from a Bayesian information theory and data representation perspective. It could thus be considered an application of the 'mathematician's version'. The application is by no means confined to areas of modern biomedicine, and indeed its generality, even merging into quantum mechanics, is a key feature. Other areas with some similar challenges as modern biology, and which have inspired data mining methods such as IBM's Intelligent Miner, include commerce. But for several reasons discussed, modern molecular biology and medicine seem particularly challenging, and this relates to the often irreducible high dimensionality of the data. This thus remains our main target. © 2005 American Chemical Society.
