Copyright © 2006 by Oxford University Press, Inc.
102 DETERMINATION OF COMPLEX REACTION MECHANISMS
any distribution is less than or equal to that of a Gaussian distribution, we see that the CMC method gives upper bounds to correlation distances between species.
The CMC method gives a reasonable approximation to the reaction pathways of many systems, including nonlinear systems; strong nonlinearities may demand the use of the EMC method, which requires more data points than the CMC method. We do not know how widespread the need is for the EMC method.
9.2 Entropy Reduction Method (ERM)
EMC provides a more accurate representation of the relationship between two variables because nonlinearities are better described when higher moments of the pair distribution are considered. There is, of course, a price to be paid for this higher accuracy: the requirement for large amounts of data. With well-estimated pair distribution functions, however, one can in principle make better, less ambiguous predictions of network structure from these measurements.
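The advantage of entropy-based pair measures over correlations can be seen in a small sketch; the estimator below (a simple histogram estimate of mutual information) and the toy variables are illustrative assumptions, not the estimator used in the EMC work itself. A dependence of the form Y ≈ |X| has near-zero Pearson correlation but large mutual information:

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Crude histogram estimate of I(X;Y) in nats."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()                      # joint probabilities
    px = pxy.sum(axis=1)                       # marginals
    py = pxy.sum(axis=0)
    nz = pxy > 0                               # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz])))

rng = np.random.default_rng(1)
x = rng.normal(size=4000)
y_dep = np.abs(x) + 0.1 * rng.normal(size=4000)   # nonlinear, nearly uncorrelated
y_ind = rng.normal(size=4000)                     # truly independent
print(mutual_information(x, y_dep) > mutual_information(x, y_ind))
```

The histogram estimate also makes the data cost concrete: the number of cells to populate grows with the square of the bin count for a pair, and far faster for joint distributions of many variables.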
One possible extension of the EMC method is what we call the entropy reduction method. Here we require, in essence, that the nonlinear variation in a variable Y be explainable by the variations in the concentrations of a subset (possibly all) of the other variables in the system. In terms of entropy, this condition is obtained by minimizing the conditional differential entropy h(Y|X) over the set of variables X. The term h(Y|X) is simply h(X,Y) − h(X), which can be written h(Y|X) = −∫∫ p(X,Y) log[p(X,Y)/p(X)] dX dY. If Y is completely independent of X, then h(Y|X) = h(Y); otherwise it is less than h(Y). Formally, h(Y|X) goes to negative infinity when X completely determines Y. This latter statement is equivalent to saying that the size of the support set, exp[h(Y|X)], is zero. Thus, by iterating through cycles of adding to X the variable xi that minimizes exp[h(Y|X)], until further additions no longer decrease the support set, we can find an ordered set of variables that control the variation in Y.
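The greedy loop just described can be sketched as follows. This is a toy sketch, not the procedure of the original work: to keep it self-contained it replaces the full density estimate of h(Y|X) with a Gaussian (least-squares residual-variance) approximation, and the function names, stopping tolerance, and toy network are invented for illustration.

```python
import numpy as np

def gaussian_h_cond(data, y_idx, x_idx):
    """h(Y|X) under a Gaussian approximation: 0.5*log(2*pi*e*var(Y|X)),
    with var(Y|X) taken as the residual variance of regressing Y on X."""
    y = data[:, y_idx]
    if not x_idx:
        resid_var = y.var()
    else:
        X = np.column_stack([data[:, j] for j in x_idx] + [np.ones(len(y))])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid_var = np.mean((y - X @ coef) ** 2)
    return 0.5 * np.log(2 * np.pi * np.e * resid_var)

def erm_order(data, y_idx):
    """Greedily add the x_i that most shrinks the support set exp[h(Y|X)];
    stop when further additions no longer help."""
    remaining = [j for j in range(data.shape[1]) if j != y_idx]
    chosen, best = [], gaussian_h_cond(data, y_idx, [])
    while remaining:
        trials = {j: gaussian_h_cond(data, y_idx, chosen + [j]) for j in remaining}
        j_min = min(trials, key=trials.get)
        if trials[j_min] >= best - 1e-2:       # support set no longer shrinking
            break
        chosen.append(j_min)
        remaining.remove(j_min)
        best = trials[j_min]
    return chosen

# toy network: y is driven by x0 and x2; x1 is irrelevant noise
rng = np.random.default_rng(2)
x = rng.normal(size=(2000, 3))
y = 2.0 * x[:, 0] - x[:, 2] + 0.05 * rng.normal(size=2000)
data = np.column_stack([x, y])
print(erm_order(data, y_idx=3))   # x0 and x2 are selected; the noise variable x1 is left out
```

The order in which variables enter the chosen set is itself informative: variables that reduce h(Y|X) most are added first, giving the ordered set of controlling variables mentioned above.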
This technique uses the full joint probability distribution of Y and the set X, and so requires a still larger amount of data. For multivariate Gaussian distributions, for example, the amount of data needed grows exponentially with the number of variables [11]. The quality of the data and the measurement error, as in the other methods, also play a role in determining the effectiveness of this method. However, it may be shown that this method eliminates much of the ambiguity that arises when only the second moment, or a few moments, of the distribution are used to define the ordering and causality among the chemical concentration variables. For an expanded discussion of this method see [2].
Acknowledgment This chapter is based on selections from the article “On the deduction of chemical reaction pathways from measurements of time series of concentrations” by Michael Samoilov, Adam Arkin, and John Ross [1].
References
[1] Samoilov, M.; Arkin, A.; Ross, J. On the deduction of chemical reaction pathways from measurements of time series of concentrations. Chaos 2001, 11, 108–114.
[2] Samoilov, M. Reconstruction and functional analysis of general chemical reactions and reaction networks. Ph.D. thesis, Stanford University, 1997.
[3] Michaels, G. S.; Carr, D.; Askenazi, B. M.; Wen, X.; Fuhrman, S.; Somogyi, R. Cluster analysis and data visualization of large scale gene expression data. Pacific Symposium on Biocomputing 1998, 3, 42–53.
[4] Liang, S.; Fuhrman, S.; Somogyi, R. REVEAL, a general reverse engineering algorithm for inference of genetic network structures. Pacific Symposium on Biocomputing 1998, 3, 18–29.
[5] Fuhrman, S.; Cunningham, M. J.; Wen, X.; Zweiger, G. J.; Seilhamer, J.; Somogyi, R. The application of Shannon entropy in the identification of putative drug targets. BioSystems 2000, 55, 5–14.
[6] Butte, A. J.; Kohane, I. S. Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. Pacific Symposium on Biocomputing 2000, 5, 418–429.
[7] Cover, T.; Thomas, J. Elements of Information Theory; Wiley: New York, 1991.
[8] Shepard, R. N. Multidimensional scaling, tree-fitting, and clustering. Science 1980, 210, 390–398.
[9] Fraser, A. M.; Swinney, H. L. Independent coordinates for strange attractors from mutual information. Phys. Rev. A 1986, 33, 1134–1140.
[10] Gallant, A.; Tauchen, G. New Directions in Time Series Analysis, Part II; Springer-Verlag: New York, 1992; p 71. SNP: A Program for Nonparametric Time Series Analysis, Version 8.4, User's Guide; Chapel Hill, NC, 1995.
[11] Silverman, B. W. Density Estimation for Statistics and Data Analysis; Monographs on Statistics and Applied Probability 26; Chapman & Hall: New York, 1986; p 94.