Researcher profile

Steven A. Frank

Steven A. Frank contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
30 works
0 followers
14 topics
3 close collaborators

Published work

30 published items

preprint | 2020 | arXiv

The generalized Price equation: forces that change population statistics

The Price equation partitions the change in the expected value of a population measure. The first component describes the partial change caused by altered frequencies. The second component describes the partial change caused by altered measurements. In biology, frequency changes often associate with the direct effect of natural selection. Measure changes reflect processes during transmission that alter trait values. More broadly, the two components describe the direct forces that change population composition and the altered frame of reference that changes measured values. The classic Price equation is limited to population statistics that can be expressed as the expected value of a measure. Many statistics cannot be expressed as expected values, such as the harmonic mean and the family of rescaled diversity measures. We generalize the Price equation to any population statistic that can be expressed as a function of frequencies and measurements. We obtain the generalized partition between the direct forces that cause frequency change and the altered frame of reference that changes measurements.
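The two-part partition described in this abstract can be sketched numerically. The snippet below is a minimal illustration, not the paper's own formulation: it assumes the frequency part is evaluated by changing frequencies while holding measurements fixed, and the measurement part by then changing measurements at the new frequencies. The function names (`partition`, `mean`, `harmonic_mean`) and the example numbers are hypothetical.

```python
def mean(q, z):
    """Expected value of measurements z under frequencies q."""
    return sum(qi * zi for qi, zi in zip(q, z))


def harmonic_mean(q, z):
    """Frequency-weighted harmonic mean, a statistic that is not an expected value of z."""
    return 1.0 / sum(qi / zi for qi, zi in zip(q, z))


def partition(stat, q, z, q_new, z_new):
    """Split the total change in stat into a frequency part and a measurement part.

    Assumed ordering for this sketch: frequencies change first (z held fixed),
    then measurements change (evaluated at the new frequencies).
    """
    freq_part = stat(q_new, z) - stat(q, z)
    meas_part = stat(q_new, z_new) - stat(q_new, z)
    return freq_part, meas_part


if __name__ == "__main__":
    # Hypothetical two-type population before and after one step of change.
    q, z = [0.5, 0.5], [1.0, 2.0]
    q_new, z_new = [0.25, 0.75], [1.0, 2.5]
    for stat in (mean, harmonic_mean):
        f, m = partition(stat, q, z, q_new, z_new)
        print(stat.__name__, "frequency part:", f, "measurement part:", m)
```

For the ordinary mean, the two parts reduce to the classic terms: the sum of frequency changes times the original measurements, and the sum of new frequencies times the measurement changes. The same function applies unchanged to the harmonic mean, which is the kind of statistic the abstract says the classic form cannot handle.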

preprint | 2016 | arXiv

Common probability patterns arise from simple invariances

Shift and stretch invariance lead to the exponential-Boltzmann probability distribution. Rotational invariance generates the Gaussian distribution. Particular scaling relations transform the canonical exponential and Gaussian patterns into the variety of commonly observed patterns. The scaling relations themselves arise from the fundamental invariances of shift, stretch, and rotation, plus a few additional invariances. Prior work described the three fundamental invariances as a consequence of the equilibrium canonical ensemble of statistical mechanics or the Jaynesian maximization of information entropy. By contrast, I emphasize the primacy and sufficiency of invariance alone to explain the commonly observed patterns. Primary invariance naturally creates the array of commonly observed scaling relations and associated probability patterns, whereas the classical approaches derived from statistical mechanics or information theory require special assumptions to derive commonly observed scales.

preprint | 2016 | arXiv

Invariant death

In nematodes, environmental or physiological perturbations alter death's scaling of time. In human cancer, genetic perturbations alter death's curvature of time. Those changes in scale and curvature follow the constraining contours of death's invariant geometry. I show that the constraints arise from a fundamental extension to the theories of randomness, invariance and scale. A generalized Gompertz law follows. The constraints imposed by the invariant Gompertz geometry explain the tendency of perturbations to stretch or bend death's scaling of time. Variability in death rate arises from a combination of constraining universal laws and particular biological processes.

preprint | 2016 | arXiv

Puzzles in modern biology. II. Language, cancer and the recursive processes of evolutionary innovation

Human language emerged abruptly. Diverse body forms evolved suddenly. Seed-bearing plants spread rapidly. How do complex evolutionary innovations arise so quickly? Resolving alternative claims remains difficult. The great events of the past happened a long time ago. Cancer provides a model to study evolutionary innovation. A tumor must evolve many novel traits to become an aggressive cancer. I use what we know or could study about cancer to describe the key processes of innovation. In general, evolutionary systems form a hierarchy of recursive processes. Those recursive processes determine the rates at which innovations are generated, spread and transmitted. I relate the recursive processes to abrupt evolutionary innovation.

preprint | 2016 | arXiv

Puzzles in modern biology. III. Two kinds of causality in age-related disease

The two primary causal dimensions of age-related disease are rate and function. Change in rate of disease development shifts the age of onset. Change in physiological function provides necessary steps in disease progression. A causal factor may alter the rate of physiological change, but that causal factor itself may have no direct physiological role. Alternatively, a causal factor may provide a necessary physiological function, but that causal factor itself may not alter the rate of disease onset. The rate-function duality provides the basis for solving puzzles of age-related disease. Causal factors of cancer illustrate the duality between rate processes of discovery, such as somatic mutation, and necessary physiological functions, such as invasive penetration across tissue barriers. Examples from cancer suggest general principles of age-related disease.

preprint | 2016 | arXiv

Puzzles in modern biology. IV. Neurodegeneration, localized origin and widespread decay

The motor neuron disease amyotrophic lateral sclerosis (ALS) typically begins with localized muscle weakness. Progressive, widespread paralysis often follows over a few years. Does the disease begin with local changes in a small piece of neural tissue and then spread? Or does neural decay happen independently across diverse spatial locations? The distinction matters, because local initiation may arise by local changes in a tissue microenvironment, by somatic mutation, or by various epigenetic or regulatory fluctuations in a few cells. A local trigger must be coupled with a mechanism for spread. By contrast, independent decay across spatial locations cannot begin by a local change, but must depend on some global predisposition or spatially distributed change that leads to approximately synchronous decay. This article outlines the conceptual frame by which one contrasts local triggers and spread versus parallel spatially distributed decay. Various neurodegenerative diseases differ in their mechanistic details, but all can usefully be understood as falling along a continuum of interacting local and global processes. Cancer provides an example of disease progression by local triggers and spatial spread, setting a conceptual basis for clarifying puzzles in neurodegeneration. Heart disease also has crucial interactions between global processes, such as circulating lipid levels, and local processes in the development of atherosclerotic plaques. The distinction between local and global processes helps to understand these various age-related diseases.

preprint | 2016 | arXiv

The inductive theory of natural selection: summary and synthesis

The theory of natural selection has two forms. Deductive theory describes how populations change over time. One starts with an initial population and some rules for change. From those assumptions, one calculates the future state of the population. Deductive theory predicts how populations adapt to environmental challenge. Inductive theory describes the causes of change in populations. One starts with a given amount of change. One then assigns different parts of the total change to particular causes. Inductive theory analyzes alternative causal models for how populations have adapted to environmental challenge. This chapter emphasizes the inductive analysis of cause.

preprint | 2016 | arXiv

The invariances of power law size distributions

Size varies. Small things are typically more frequent than large things. The logarithm of frequency often declines linearly with the logarithm of size. That power law relation forms one of the common patterns of nature. Why does the complexity of nature reduce to such a simple pattern? Why do things as different as tree size and enzyme rate follow similarly simple patterns? Here I analyze such patterns by their invariant properties. For example, a common pattern should not change when adding a constant value to all observations. That shift is essentially the renumbering of the points on a ruler without changing the metric information provided by the ruler. A ruler is shift invariant only when its scale is properly calibrated to the pattern being measured. Stretch invariance corresponds to the conservation of the total amount of something, such as the total biomass and consequently the average size. Rotational invariance corresponds to pattern that does not depend on the order in which underlying processes occur, for example, a scale that additively combines the component processes leading to observed values. I use tree size as an example to illustrate how the key invariances shape pattern. A simple interpretation of common pattern follows. That simple interpretation connects the normal distribution to a wide variety of other common patterns through the transformations of scale set by the fundamental invariances.

preprint | 2015 | arXiv

Commentary: The nature of cancer research

Cancer research reflects an implicit conflict. On the one hand, there is an overwhelming desire to control the disease. We all wish that. On the other hand, we would like to understand why cancer follows so many clearly defined yet puzzling patterns. Why is there such regularity in the rates of progression? Why do different tissues vary so much? There should, of course, be no conflict between control and understanding. But the history of cancer research seems to say that those different goals remain oddly estranged. Peto's 1977 article locates the seeds of this conflict most clearly. He describes what is still the most powerful theoretical perspective for analyzing the causes of cancer. He presents many key unsolved puzzles within that context. He also says why most cancer researchers are not interested in these fundamental issues. The subsequent decades of research grew around this rift, blindly, in the way that research disciplines often grow. Let us revisit Peto, almost 40 years ago. We can learn much about the current nature of cancer research.

preprint | 2015 | arXiv

d'Alembert's direct and inertial forces acting on populations: the Price equation and the fundamental theorem of natural selection

I develop a framework for interpreting the forces that act on any population described by frequencies. The conservation of total frequency, or total probability, shapes the characteristics of force. I begin with Fisher's fundamental theorem of natural selection. That theorem partitions the total evolutionary change of a population into two components. The first component is the partial change caused by the direct force of natural selection, holding constant all aspects of the environment. The second component is the partial change caused by the changing environment. I demonstrate that Fisher's partition of total change into the direct force of selection and the forces from the changing environmental frame of reference is identical to d'Alembert's principle of mechanics, which separates the work done by the direct forces from the work done by the inertial forces associated with the changing frame of reference. In d'Alembert's principle, there exist inertial forces from a change in the frame of reference that exactly balance the direct forces. I show that the conservation of total probability strongly shapes the form of the balance between the direct and inertial forces. I then use the strong results for conserved probability to obtain general results for the change in any system quantity, such as biological fitness or energy. Those general results derive from simple coordinate changes between frequencies and system quantities. Ultimately, d'Alembert's separation of direct and inertial forces provides deep conceptual insight into the interpretation of forces and the unification of disparate fields of study.

preprint | 2014 | arXiv

Generative models versus underlying symmetries to explain biological pattern

Mathematical models play an increasingly important role in the interpretation of biological experiments. Studies often present a model that generates the observations, connecting hypothesized process to an observed pattern. Such generative models confirm the plausibility of an explanation and make testable hypotheses for further experiments. However, studies rarely consider the broad family of alternative models that match the same observed pattern. The symmetries that define the broad class of matching models are in fact the only aspects of information truly revealed by observed pattern. Commonly observed patterns derive from simple underlying symmetries. This article illustrates the problem by showing the symmetry associated with the observed rate of increase in fitness in a constant environment. That underlying symmetry reveals how each particular generative model defines a single example within the broad class of matching models. Further progress on the relation between pattern and process requires deeper consideration of the underlying symmetries.

preprint | 2014 | arXiv

How to read probability distributions as statements about process

Probability distributions can be read as simple expressions of information. Each continuous probability distribution describes how information changes with magnitude. Once one learns to read a probability distribution as a measurement scale of information, opportunities arise to understand the processes that generate the commonly observed patterns. Probability expressions may be parsed into four components: the dissipation of all information, except the preservation of average values, taken over the measurement scale that relates changes in observed values to changes in information, and the transformation from the underlying scale on which information dissipates to alternative scales on which probability pattern may be expressed. Information invariances set the commonly observed measurement scales and the relations between them. In particular, a measurement scale for information is defined by its invariance to specific transformations of underlying values into measurable outputs. Essentially all common distributions can be understood within this simple framework of information invariance and measurement scale.

preprint | 2014 | arXiv

Microbial metabolism: optimal control of uptake versus synthesis

Microbes require several complex organic molecules for growth. A species may obtain a required factor by taking up molecules released by other species or by synthesizing the molecule. The patterns of uptake and synthesis set a flow of resources through the multiple species that create a microbial community. This article analyzes a simple mathematical model of the tradeoff between uptake and synthesis. Key factors include the influx rate from external sources relative to the outflux rate, the rate of internal decay within cells, and the cost of synthesis. Aspects of demography also matter, such as cellular birth and death rates, the expected time course of a local resource flow, and the associated lifespan of the local population. Spatial patterns of genetic variability and differentiation between populations may also strongly influence the evolution of metabolic regulatory controls of individual species and thus the structuring of microbial communities. The widespread use of optimality approaches in recent work on microbial metabolism has ignored demography and genetic structure.

preprint | 2013 | arXiv

Input-output relations in biological systems: measurement, information and the Hill equation

Biological systems produce outputs in response to variable inputs. Input-output relations tend to follow a few regular patterns. For example, many chemical processes follow the S-shaped Hill equation relation between input concentrations and output concentrations. That Hill equation pattern contradicts the fundamental Michaelis-Menten theory of enzyme kinetics. I use the discrepancy between the expected Michaelis-Menten process of enzyme kinetics and the widely observed Hill equation pattern of biological systems to explore the general properties of biological input-output relations. I start with the various processes that could explain the discrepancy between basic chemistry and biological pattern. I then expand the analysis to consider broader aspects that shape biological input-output relations. Key aspects include the input-output processing by component subsystems and how those components combine to determine the system's overall input-output relations. That aggregate structure often imposes strong regularity on underlying disorder. Aggregation imposes order by dissipating information as it flows through the components of a system. The dissipation of information may be evaluated by the analysis of measurement and precision, explaining why certain common scaling patterns arise so frequently in input-output relations. I discuss how aggregation, measurement and scale provide a framework for understanding the relations between pattern and process. The regularity imposed by those broader structural aspects sets the contours of variation in biology. Thus, biological design will also tend to follow those contours. Natural selection may act primarily to modulate system properties within those broad constraints.
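The contrast between the two input-output forms named in this abstract can be made concrete with a short sketch. It uses the standard textbook forms of the Hill equation and the Michaelis-Menten rate law; the parameter names (`m`, `k`, `vmax`, `km`) and the sample inputs are assumptions chosen for illustration, not values from the paper.

```python
def hill(x, m=1.0, k=1.0):
    """Hill equation: output as a fraction of maximum for input x.

    m is the input giving half-maximal output; k is the Hill coefficient.
    """
    return x**k / (m**k + x**k)


def michaelis_menten(s, vmax=1.0, km=1.0):
    """Michaelis-Menten rate for substrate concentration s."""
    return vmax * s / (km + s)


if __name__ == "__main__":
    # With k = 1 the Hill form coincides with the hyperbolic Michaelis-Menten curve.
    # With k > 1 the response is S-shaped: lower at small inputs, higher at large ones.
    for x in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(x, michaelis_menten(x), hill(x, k=1.0), hill(x, k=4.0))
```

A Hill coefficient above 1 suppresses the response below the half-maximal input and sharpens it above, which is the departure from the hyperbolic Michaelis-Menten curve that the abstract treats as the puzzle.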

preprint | 2013 | arXiv

Natural selection. VI. Partitioning the information in fitness and characters by path analysis

Three steps aid in the analysis of selection. First, describe phenotypes by their component causes. Components include genes, maternal effects, symbionts, and any other predictors of phenotype that are of interest. Second, describe fitness by its component causes, such as an individual's phenotype, its neighbors' phenotypes, resource availability, and so on. Third, put the predictors of phenotype and fitness into an exact equation for evolutionary change, providing a complete expression of selection and other evolutionary processes. The complete expression separates the distinct causal roles of the various hypothesized components of phenotypes and fitness. Traditionally, those components are given by the covariance, variance, and regression terms of evolutionary models. I show how to interpret those statistical expressions with respect to information theory. The resulting interpretation allows one to read the fundamental equations of selection and evolution as sentences that express how various causes lead to the accumulation of information by selection and the decay of information by other evolutionary processes. The interpretation in terms of information leads to a deeper understanding of selection and heritability, and a clearer sense of how to formulate causal hypotheses about evolutionary process. Kin selection appears as a particular type of causal analysis that partitions social effects into meaningful components.

preprint | 2013 | arXiv

Natural selection. VII. History and interpretation of kin selection theory

Kin selection theory is a kind of causal analysis. The initial form of kin selection ascribed cause to costs, benefits, and genetic relatedness. The theory then slowly developed a deeper and more sophisticated approach to partitioning the causes of social evolution. Controversy followed because causal analysis inevitably attracts opposing views. It is always possible to separate total effects into different component causes. Alternative causal schemes emphasize different aspects of a problem, reflecting the distinct goals, interests, and biases of different perspectives. For example, group selection is a particular causal scheme with certain advantages and significant limitations. Ultimately, to use kin selection theory to analyze natural patterns and to understand the history of debates over different approaches, one must follow the underlying history of causal analysis. This article describes the history of kin selection theory, with emphasis on how the causal perspective improved through the study of key patterns of natural history, such as dispersal and sex ratio, and through a unified approach to demographic and social processes. Independent historical developments in the multivariate analysis of quantitative traits merged with the causal analysis of social evolution by kin selection.

preprint | 2012 | arXiv

Natural selection. IV. The Price equation

The Price equation partitions total evolutionary change into two components. The first component provides an abstract expression of natural selection. The second component subsumes all other evolutionary processes, including changes during transmission. The natural selection component is often used in applications. Those applications attract widespread interest for their simplicity of expression and ease of interpretation. Those same applications attract widespread criticism by dropping the second component of evolutionary change and by leaving unspecified the detailed assumptions needed for a complete study of dynamics. Controversies over approximation and dynamics have nothing to do with the Price equation itself, which is simply a mathematical equivalence relation for total evolutionary change expressed in an alternative form. Disagreements about approach have to do with the tension between the relative valuation of abstract versus concrete analyses. The Price equation's greatest value has been on the abstract side, particularly the invariance relations that illuminate the understanding of natural selection. Those abstract insights lay the foundation for applications in terms of kin selection, information theory interpretations of natural selection, and partitions of causes by path analysis. I discuss recent critiques of the Price equation by Nowak and van Veelen.
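For orientation, the two components mentioned in this abstract are commonly written as follows. This is the standard textbook form, given here as a sketch; the notation is assumed for illustration and may differ from the paper's.

```latex
% Notation (assumed for this sketch): q_i = frequency of type i, z_i = its trait value,
% w_i = its fitness, \bar{w} = \sum_i q_i w_i; primes mark the descendant population.
\begin{align*}
\Delta \bar{z} &= \sum_i q_i' z_i' - \sum_i q_i z_i \\
  &= \sum_i (\Delta q_i)\, z_i + \sum_i q_i'\, (\Delta z_i),
  \qquad \Delta q_i = q_i' - q_i,\quad \Delta z_i = z_i' - z_i . \\
\intertext{Substituting $q_i' = q_i w_i / \bar{w}$ gives the covariance form:}
\bar{w}\, \Delta \bar{z} &= \operatorname{Cov}(w, z) + \operatorname{E}(w\, \Delta z).
\end{align*}
```

The covariance term is the selection component, and the expectation term collects changes during transmission. Dropping the second term is the approximation that the abstract says attracts criticism.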

preprint | 2012 | arXiv

Natural selection. V. How to read the fundamental equations of evolutionary change in terms of information theory

The equations of evolutionary change by natural selection are commonly expressed in statistical terms. Fisher's fundamental theorem emphasizes the variance in fitness. Quantitative genetics expresses selection with covariances and regressions. Population genetic equations depend on genetic variances. How can we read those statistical expressions with respect to the meaning of natural selection? One possibility is to relate the statistical expressions to the amount of information that populations accumulate by selection. However, the connection between selection and information theory has never been compelling. Here, I show the correct relations between statistical expressions for selection and information theory expressions for selection. Those relations link selection to the fundamental concepts of entropy and information in the theories of physics, statistics, and communication. We can now read the equations of selection in terms of their natural meaning. Selection causes populations to accumulate information about the environment.

preprint | 2011 | arXiv

A general model of the public goods dilemma

An individually costly act that benefits all group members is a public good. Natural selection favors individual contribution to public goods only when some benefit to the individual offsets the cost of contribution. Problems of sex ratio, parasite virulence, microbial metabolism, punishment of noncooperators, and nearly all aspects of sociality have been analyzed as public goods shaped by kin and group selection. Here, I develop two general aspects of the public goods problem that have received relatively little attention. First, variation in individual resources favors selfish individuals to vary their allocation to public goods. Those individuals better endowed contribute their excess resources to public benefit, whereas those individuals with fewer resources contribute less to the public good. Thus, purely selfish behavior causes individuals to stratify into upper classes that contribute greatly to public benefit and social cohesion and to lower classes that contribute little to the public good. Second, if group success absolutely requires production of the public good, then the pressure favoring production is relatively high. By contrast, if group success depends weakly on the public good, then the pressure favoring production is relatively weak. Stated in this way, it is obvious that the role of baseline success is important. However, discussions of public goods problems sometimes fail to emphasize this point sufficiently. The models here suggest simple tests for the roles of resource variation and baseline success. Given the widespread importance of public goods, better models and tests would greatly deepen our understanding of many processes in biology and sociality.

preprint | 2011 | arXiv

Demography and the tragedy of the commons

Individual success in group-structured populations has two components. First, an individual gains by outcompeting its neighbors for local resources. Second, an individual's share of group success must be weighted by the total productivity of the group. The essence of sociality arises from the tension between selfish gains against neighbors and the associated loss that selfishness imposes by degrading the efficiency of the group. Without some force to modulate selfishness, the natural tendencies of self interest typically degrade group performance to the detriment of all. This is the tragedy of the commons. Kin selection provides the most widely discussed way in which the tragedy is overcome in biology. Kin selection arises from behavioral associations within groups caused either by genetical kinship or by other processes that correlate the behaviors of group members. Here, I emphasize demography as a second factor that may also modulate the tragedy of the commons and favor cooperative integration of groups. Each act of selfishness or cooperation in a group often influences group survival and fecundity over many subsequent generations. For example, a cooperative act early in the growth cycle of a colony may enhance the future size and survival of the colony. This time-dependent benefit can greatly increase the degree of cooperation favored by natural selection, providing another way in which to overcome the tragedy of the commons and enhance the integration of group behavior. I conclude that analyses of sociality must account for both the behavioral associations of kin selection theory and the demographic consequences of life history theory.

preprint | 2011 | arXiv

Evolutionary foundations of cooperation and group cohesion

In biology, the evolution of increasingly cooperative groups has shaped the history of life. Genes collaborate in the control of cells; cells efficiently divide tasks to produce cohesive multicellular individuals; individual members of insect colonies cooperate in integrated societies. Biological cooperation provides a foundation on which to understand human behavior. Conceptually, the economics of efficient allocation and the game-like processes of strategy are well understood in biology; we find the same essential processes in many successful theories of human sociality. Historically, the trace of biological evolution informs in two ways. First, the evolutionary transformations in biological cooperation provide insight into how economic and strategic processes play out over time--a source of analogy that, when applied thoughtfully, aids analysis of human sociality. Second, humans arose from biological history--a factual account of the past that tells us much about the material basis of human behavior.

preprint | 2011 | arXiv

Maladaptation and the paradox of robustness in evolution

Background. Organisms use a variety of mechanisms to protect themselves against perturbations. For example, repair mechanisms fix damage, feedback loops keep homeostatic systems at their setpoints, and biochemical filters distinguish signal from noise. Such buffering mechanisms are often discussed in terms of robustness, which may be measured by reduced sensitivity of performance to perturbations. Methodology/Principal Findings. I use a mathematical model to analyze the evolutionary dynamics of robustness in order to understand aspects of organismal design by natural selection. I focus on two characters: one character performs an adaptive task; the other character buffers the performance of the first character against perturbations. Increased perturbations favor enhanced buffering and robustness, which in turn decreases sensitivity and reduces the intensity of natural selection on the adaptive character. Reduced selective pressure on the adaptive character often leads to a less costly, lower performance trait. Conclusions/Significance. The paradox of robustness arises from evolutionary dynamics: enhanced robustness causes an evolutionary reduction in the adaptive performance of the target character, leading to a degree of maladaptation compared to what could be achieved by natural selection in the absence of robustness mechanisms. Over evolutionary time, buffering traits may become layered on top of each other, while the underlying adaptive traits become replaced by cheaper, lower performance components. The paradox of robustness has widespread implications for understanding organismal design.

preprint | 2011 | arXiv

Natural selection. I. Variable environments and uncertain returns on investment

Many studies have analyzed how variability in reproductive success affects fitness. However, each study tends to focus on a particular problem, leaving unclear the overall structure of variability in populations. This fractured conceptual framework often causes particular applications to be incomplete or improperly analyzed. In this paper, I present a concise introduction to the two key aspects of the theory. First, all measures of fitness ultimately arise from the relative comparison of the reproductive success of individuals or genotypes with the average reproductive success in the population. That relative measure creates a diminishing relation between reproductive success and fitness. Diminishing returns reduce fitness in proportion to variability in reproductive success. The relative measurement of success also induces a frequency dependence that favors rare types. Second, variability in populations has a hierarchical structure. Variable success in different traits of an individual affects that individual's variation in reproduction. Correlation between different individuals' reproduction affects variation in the aggregate success of particular alleles across the population. One must consider the hierarchical structure of variability in relation to different consequences of temporal, spatial, and developmental variability. Although a complete analysis of variability has many separate parts, this simple framework allows one to see the structure of the whole and to place particular problems in their proper relation to the general theory. The biological understanding of relative success and the hierarchical structure of variability in populations may also contribute to a deeper economic theory of returns under uncertainty.
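The claim that variability in reproductive success reduces fitness can be illustrated with a familiar textbook case, used here as an assumption rather than as the paper's own model: a lineage whose per-generation success alternates with equal probability between `mu + sigma` and `mu - sigma`. Its long-run growth factor is the geometric mean, which falls below `mu` by roughly `sigma**2 / (2 * mu)`. The function names and numbers are hypothetical.

```python
import math


def geometric_mean_two_state(mu, sigma):
    """Long-run growth factor when success is mu+sigma or mu-sigma with equal probability."""
    return math.sqrt((mu + sigma) * (mu - sigma))


def variance_discount(mu, sigma):
    """Second-order approximation to the geometric mean: mu - sigma^2 / (2 * mu)."""
    return mu - sigma**2 / (2 * mu)


if __name__ == "__main__":
    mu = 2.0
    for sigma in (0.0, 0.2, 0.4, 0.8):
        exact = geometric_mean_two_state(mu, sigma)
        approx = variance_discount(mu, sigma)
        print(f"sigma={sigma}: exact={exact:.4f} approx={approx:.4f}")
```

The approximation is accurate only when `sigma` is small relative to `mu`. It shows the diminishing-returns effect in its simplest form: two strategies with the same average success differ in long-run growth when their variances differ.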

preprint | 2011 | arXiv

Natural selection. II. Developmental variability and evolutionary rate

In classical evolutionary theory, genetic variation provides the source of heritable phenotypic variation on which natural selection acts. Against this classical view, several theories have emphasized that developmental variability and learning enhance nonheritable phenotypic variation, which in turn can accelerate evolutionary response. In this paper, I show how developmental variability alters evolutionary dynamics by smoothing the landscape that relates genotype to fitness. In a fitness landscape with multiple peaks and valleys, developmental variability can smooth the landscape to provide a directly increasing path of fitness to the highest peak. Developmental variability also allows initial survival of a genotype in response to novel or extreme environmental challenge, providing an opportunity for subsequent adaptation. This initial survival advantage arises from the way in which developmental variability smooths and broadens the fitness landscape. Ultimately, the synergism between developmental processes and genetic variation sets evolutionary rate.

preprint2011arXiv

Natural selection. III. Selection versus transmission and the levels of selection

George Williams defined an evolutionary unit as hereditary information for which the selection bias between competing units dominates the informational decay caused by imperfect transmission. In this article, I extend Williams' approach to show that the ratio of selection bias to transmission bias provides a unifying framework for diverse biological problems. Specific examples include Haldane and Lande's mutation-selection balance, Eigen's error threshold and quasispecies, Van Valen's clade selection, Price's multilevel formulation of group selection, Szathmary and Demeter's evolutionary origin of primitive cells, Levin and Bull's short-sighted evolution of HIV virulence, Frank's timescale analysis of microbial metabolism, and Maynard Smith and Szathmary's major transitions in evolution. The insights from these diverse applications lead to a deeper understanding of kin selection, group selection, multilevel evolutionary analysis, and the philosophical problems of evolutionary units and individuality.
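The role of the ratio between selection bias and transmission bias can be sketched with the simplest case named above, mutation-selection balance. The model below is a minimal haploid illustration under assumptions that are not in the abstract: one-way mutation at rate mu, a selection coefficient s against the mutant, and hypothetical function names.

```python
def next_mutant_frequency(q, s, mu):
    """One generation: selection against the mutant, then one-way mutation."""
    q_selected = q * (1 - s) / (1 - s * q)     # selection bias removes mutants
    return q_selected + mu * (1 - q_selected)  # transmission bias adds mutants

def equilibrium_frequency(s, mu, q0=0.5, generations=5000):
    """Iterate the recursion long enough to approach its stable frequency."""
    q = q0
    for _ in range(generations):
        q = next_mutant_frequency(q, s, mu)
    return q

# Selection dominates transmission: the mutant settles near mu / s.
strong = equilibrium_frequency(s=0.1, mu=0.001)

# Transmission dominates selection: the favored type is lost.
weak = equilibrium_frequency(s=0.001, mu=0.01)

print(strong, weak)
```

In this toy model the mutant frequency settles at mu / s when s exceeds mu, and the favored type decays away when mu exceeds s, so the outcome depends on the ratio of the two biases rather than on either one alone.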

preprint2011arXiv

Wright's adaptive landscape versus Fisher's fundamental theorem

Two giants of evolutionary theory, Sewall Wright and R. A. Fisher, fought bitterly for over thirty years. The Wright-Fisher controversy forms a cornerstone of the history and philosophy of biology. I argue that the standard interpretations of the Wright-Fisher controversy do not accurately represent the ideas and arguments of these two key historical figures. The usual account contrasts the major slogans attached to each name: Wright's adaptive landscape and shifting balance theory of evolution versus Fisher's fundamental theorem of natural selection. These alternative theories are in fact incommensurable. Wright's theory is a detailed dynamical model of evolutionary change in actual populations. Fisher's theory is an abstract invariance and conservation law that, like all physical laws, captures essential features of a system but does not account for all aspects of dynamics in real examples. This key contrast between embodied theories of real cases and abstract laws is missing from prior analyses of Wright versus Fisher. They never argued about this contrast. Instead, the issue at stake in their arguments concerned the actual dynamics of real populations. Both agreed that fluctuations of nonadditive (epistatic) gene combinations play a central role in evolution. Wright emphasized stochastic fluctuations of gene combinations in small, isolated populations. By contrast, Fisher believed that fluctuating selection in large populations was the main cause of fluctuation in nonadditive gene combinations. Close reading shows that widely cited views attributed to Fisher mostly come from what Wright said about Fisher, whereas Fisher's own writings clearly do not support such views.

preprint2010arXiv

A simple derivation and classification of common probability distributions based on information symmetry and measurement scale

Commonly observed patterns typically follow a few distinct families of probability distributions. Over one hundred years ago, Karl Pearson provided a systematic derivation and classification of the common continuous distributions. His approach was phenomenological: a differential equation that generated common distributions without any underlying conceptual basis for why common distributions have particular forms and what explains the familial relations. Pearson's system and its descendants remain the most popular systematic classification of probability distributions. Here, we unify the disparate forms of common distributions into a single system based on two meaningful and justifiable propositions. First, distributions follow maximum entropy subject to constraints, where maximum entropy is equivalent to minimum information. Second, different problems associate magnitude to information in different ways, an association we describe in terms of the relation between information invariance and measurement scale. Our framework relates the different continuous probability distributions through the variations in measurement scale that change each family of maximum entropy distributions into a distinct family.

preprint2010arXiv

Measurement Invariance, Entropy, and Probability

We show that the natural scaling of measurement for a particular problem defines the most likely probability distribution of observations taken from that measurement scale. Our approach extends the method of maximum entropy to use measurement scale as a type of information constraint. We argue that a very common measurement scale is linear at small magnitudes grading into logarithmic at large magnitudes, leading to observations that often follow Student's probability distribution, which has a Gaussian shape for small fluctuations from the mean and a power law shape for large fluctuations from the mean. An inverse scaling often arises in which measures naturally grade from logarithmic to linear as one moves from small to large magnitudes, leading to observations that often follow a gamma probability distribution. A gamma distribution has a power law shape for small magnitudes and an exponential shape for large magnitudes. The two measurement scales are natural inverses connected by the Laplace integral transform. This inversion connects the two major scaling patterns commonly found in nature. We also show that superstatistics is a special case of an integral transform, and thus can be understood as a particular way in which to change the scale of measurement. Incorporating information about measurement scale into maximum entropy provides a general approach to the relations between measurement, information, and probability.
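The limiting shapes mentioned above can be read directly from the standard density forms. The sketch below uses the usual textbook parameterizations, which are an assumption here since the abstract gives no formulas: \nu for the Student degrees of freedom, and k and \lambda for the gamma shape and rate.

```latex
% Student's distribution, with y measured as deviation from the mean
p(y) \propto \left(1 + \frac{y^2}{\nu}\right)^{-(\nu+1)/2}

% Small fluctuations, y^2 \ll \nu: use \log(1+x) \approx x
\log p(y) \approx -\frac{\nu+1}{2\nu}\, y^2
\quad\Rightarrow\quad \text{Gaussian shape}

% Large fluctuations, y^2 \gg \nu
p(y) \propto |y|^{-(\nu+1)}
\quad\Rightarrow\quad \text{power law tail}

% Gamma distribution, y > 0
p(y) \propto y^{k-1} e^{-\lambda y},
\qquad \log p(y) = (k-1)\log y - \lambda y + \text{const}

% Small y: e^{-\lambda y} \to 1, so p(y) \propto y^{k-1} (power law)
% Large y: the linear term -\lambda y dominates (exponential decay)
```

In both cases the logarithm of the density mixes a term that is linear or quadratic in y with a term that is logarithmic in y, and which term dominates depends on magnitude. That matches the description of scales that grade between linear and logarithmic.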

preprint2010arXiv

Measurement scale in maximum entropy models of species abundance

The consistency of the species abundance distribution across diverse communities has attracted widespread attention. In this paper, I argue that the consistency of pattern arises because diverse ecological mechanisms share a common symmetry with regard to measurement scale. By symmetry, I mean that different ecological processes preserve the same measure of information and lose all other information in the aggregation of various perturbations. I frame these explanations of symmetry, measurement, and aggregation in terms of a recently developed extension to the theory of maximum entropy. I show that the natural measurement scale for the species abundance distribution is log-linear: the information in observations at small population sizes scales logarithmically and, as population size increases, the scaling of information grades from logarithmic to linear. Such log-linear scaling leads naturally to a gamma distribution for species abundance, which matches well with the observed patterns. Much of the variation between samples can be explained by the magnitude at which the measurement scale grades from logarithmic to linear. This measurement approach can be applied to the similar problem of allelic diversity in population genetics and to a wide variety of other patterns in biology.