2. What is Information?

A vast number of definitions of information have been proposed in widely different contexts. These include definitions from genetics, biology, psychology, linguistics, information technology, library science, literary interpretation, cultural studies, quantum physics, thermodynamics, and philosophy. In most cases, the proposed definitions have little applicability beyond the specialized topic of study.

The purpose here is to identify and explore fundamental concepts of information that are applicable throughout the wide spectrum of uses of the term. This discussion focuses on a few relatively simple concepts. Examples of their applicability (and inapplicability) are described for various scientific disciplines.

Creativity is an important aspect of information processing in living systems that is often overlooked in discussions about the basic nature of information. Creativity involves the generation of new conditions and behaviors, and is a fundamental property of life on all levels—from the evolutionary adaptations of individual living cells through human imagination producing technology and art. Creativity emerges from information processing in living systems. However, most definitions of information and discussions about the basic nature of information focus on information about existing conditions, and basically ignore the creative aspects of information processing. The concepts described here attempt to provide a foundation that incorporates all aspects of information, including the creative aspects.

The terms symbolic information and physical information are used here to distinguish between two fundamentally different concepts of information. The common, everyday use of the term information typically refers to some form of symbolic information. Dictionary definitions of information focus on knowledge, facts, and data. These definitions imply symbolic information. Physical information is more specialized and technical, and refers to the concept of information that is generally assumed with the mathematical models of information used in physics. Recognizing the differences between the two concepts of information is critical for understanding the unique properties and uses of information.

Symbolic information as described here is based on symbols created by living entities as part of processes for perception, memory, communication, purposeful action, or planning. Symbolic information has three components: symbolic representation, media, and interpretational infrastructure.

2.1 Symbolic Representation

One component of symbolic information is symbolic representation. The knowledge, facts, and data of information are represented in some type of symbolic form. For example, perceptions are symbolic representations of conditions in the external world and genes are symbolic genetic information.

For humans, words are the most common symbols for conveying information. The evolution of human culture has resulted in increasing layers of symbolic representation. For example, a video recording of a scientific lecture is a symbolic representation of the original lecture, which in turn provided symbolic representations of scientific findings that were published in journals and were based on measurements that were symbolic representations of the outcomes of certain experimental conditions. Behavior other than vocalization can also have symbolic meaning, such as gestures, courtship behavior, and other body language.

The symbols in symbolic information are physical states or patterns created by living entities specifically to represent something for processes of perception, memory, communication, purposeful action, or planning. As will be discussed in later sections, a natural condition that is relevant to a living being (e.g., the weather) is not considered symbolic information as described here. Symbolic information occurs when a living system creates symbols that represent the natural condition. The symbols could be created internally by a living being during direct perception, or created externally by registering on recording technology.

2.2 Media

A second component of symbolic information is media for storing and transmitting the symbols. Different media can be used, such as printed pages, electronic signals, living brains, sound waves, and DNA. The same information can be stored and transmitted in different media, and a given medium can be used with different information. Media involve matter and energy, whereas the symbolic representation has meaning that is distinct from the media.

2.3 Interpretational Infrastructure

The third component of symbolic information is an interpretational infrastructure that establishes meaning, value, and usefulness for the symbols, and can generate and decode the symbols. Without consistent meaning of the symbols, there can be no stable knowledge, facts, or data. For example, language requires the interpretational infrastructure of the human mind in context of culture. Note that the symbols and interpretational infrastructure intrinsically have meaning and purpose, which implies a living system.

The interpretational infrastructure includes the abilities to generate and to decode the symbols in the media, and to take actions based on the symbols. For example, perception requires the ability to make some type of internal symbols that represent conditions in the external world, and language requires the ability to speak as well as to listen.

The meanings assigned to symbols are arbitrary in the sense that a different symbol could have been used for a given meaning, as occurs with different languages. The meaning of a symbol is a convention established by the interpretational infrastructure. After the initial assignment or development of meaning, the interpretation of symbols must remain consistent if the symbols are to be used for perception, memory, communication, purposeful action, or planning.
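The role of convention can be illustrated with a small sketch. The two codebooks below are hypothetical stand-ins for two interpretational infrastructures; the names CODEBOOK_A, CODEBOOK_B, encode, and decode are assumptions made for this illustration and are not part of the original discussion. The sketch shows only that different symbols can carry the same meanings, and that the symbols are recoverable only under the convention that produced them.

```python
# Two hypothetical conventions that assign different symbols to the same meanings.
CODEBOOK_A = {"stop": "red", "caution": "yellow", "go": "green"}
CODEBOOK_B = {"stop": "X", "caution": "!", "go": "O"}


def encode(meanings, codebook):
    """Generate symbols for a sequence of meanings under one convention."""
    return [codebook[meaning] for meaning in meanings]


def decode(symbols, codebook):
    """Interpret symbols under one convention; unknown symbols decode to None."""
    reverse = {symbol: meaning for meaning, symbol in codebook.items()}
    return [reverse.get(symbol) for symbol in symbols]


message = ["go", "caution", "stop"]
symbols_a = encode(message, CODEBOOK_A)
symbols_b = encode(message, CODEBOOK_B)

print(symbols_a, symbols_b)            # different symbols, same meanings
print(decode(symbols_a, CODEBOOK_A))   # the matching convention recovers the message
print(decode(symbols_a, CODEBOOK_B))   # a mismatched convention recovers nothing
```

A lookup table is of course far simpler than any biological or cultural interpretational infrastructure; the point is only that the choice of symbols is free while their interpretation must stay fixed once chosen.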

The arbitrary assignment of meaning provides great flexibility and power in the use of symbols, and involves an element of creativity. The interpretational infrastructure establishes a relationship or cause-effect sequence that did not exist before. In some cases the symbols may not be completely arbitrary, such as placing male and female figures on the doors of restrooms. However, even in these cases, different symbols could have been used, such as written words. The potential for arbitrary assignment of meaning to symbols is a key factor for identifying true symbolic information.

The interpretational infrastructure is typically a complex process that requires other information processing steps. As will become apparent, symbolic information processing implies the existence and interaction of multiple information processing systems.

Unfortunately, the interpretational infrastructure is often overlooked in discussions of information. However, the symbols would have no meaning or usefulness without the interpretational infrastructure. Because the symbols and the interpretational infrastructure are both essential, they must develop or evolve together.

2.4 Disproportionate Influences on Mass and Energy

One of the primary properties of symbolic information is that a relatively small amount of matter and energy in the media and symbols can be used to guide much larger amounts of matter and energy through the interpretational infrastructure. At the time of conception, the amount of mass and energy in the DNA of an elephant embryo is small; however, that genetic information will ultimately guide the development of a large animal. Similarly, blueprints guide the creation of large buildings. The amount of mass and energy in a stop sign is much smaller than for the vehicles that the sign influences. A military commander calling out orders produces physical effects that are vastly greater in magnitude than the direct physical force of the vibration of air molecules in his spoken words.

Guiding and controlling the distribution and flow of matter and energy to support living beings appears to be a basic purpose of symbolic information. The interpretational infrastructures have sources of matter and energy that are utilized in response to the symbols. The responses are not limited by the matter and energy in the symbols or in the systems that created the symbols. In general, symbolic information is an innately emergent property that can have strikingly disproportionate influences on the distribution and flow of mass and energy.

Note that the response or effect of information also depends on context and interactions. A military commander issuing orders to an army of 10,000 soldiers has a much greater effect than the same orders issued to a group of 4 soldiers. The construction of a building from blueprints depends on the availability of financial resources. The development of an organism from DNA is influenced by environmental conditions.

2.5 Creativity

Symbolic information is a mechanism for living beings to generate variability, adaptability, and creativity from the fixed, relatively deterministic forces of physics. The flexibility to change symbols and to adapt the interpretation and responses for symbols is the foundation for producing different behaviors and outcomes. As discussed in later sections, the manifestations of creativity include evolutionary adaptations and human imagination.

Symbolic information processing is the fundamental reason that living systems have properties that are categorically different from the properties of nonliving matter. As discussed in later sections, all living entities, from individual cells to human beings, manipulate and control matter and energy to achieve their own purposes.

2.6 Physical Information

Physicists often use the term information to indicate conditions in the universe that do not involve symbols, media, and an interpretational infrastructure. For example, the geologic strata in rocks and sediments indicate the geological history of an area. One can argue that these strata are information.

However, it can also be argued that differences such as geologic strata are not information until they are perceived and interpreted. According to this perspective, prior to perception and interpretation the strata can be considered as physical differences—but not meaningful information. Perception and interpretation require symbolic information processing, which includes the creation of symbols that represent the strata and that can be interpreted.

These different perspectives indicate two fundamentally different concepts of information. One concept of information focuses on the creation and interpretation of symbols by living systems, and the other concept focuses on physical differences, whether produced by inanimate or living processes. The physical differences can be any non-uniformity or distinction for any physical parameter. The differences can include the distinction between two objects, or between an object and a background, or differences in the location and movement of matter, or differences in energy states. The differences can include dynamic factors and interactions. When physicists talk about information, they are usually, but not always, referring to these physical differences without regard for whether the differences are perceived and interpreted.

The term physical information is used here for the concept that any physical non-uniformity, difference, or distinction is information. Physical information is conceptually related to thermodynamic entropy, which reflects the homogeneity of the distribution of energy and matter in a system (Avery, 2003; Brillouin, 1962; Lambert, 2005). A system with random, homogeneous distribution of energy and matter has high entropy and low physical information. On the other hand, a complex system with discrete elements and concentrated energies has lower entropy and higher physical information.

Although the conceptual relationship between physical information and thermodynamic entropy is clear, a more precise, quantitative relationship tends to be controversial (Stewart, 2003, pp. 142-144). A quantitative relationship requires assumptions that are applicable only in certain contexts. Thermodynamic entropy is based on energy and temperature, whereas physical information can include differences in other physical parameters. States that have different interpretations can have the same thermodynamic entropy. The writings that focus on definitions of information based on thermodynamics (e.g., Brillouin, 1962; Lloyd, 2006) generally have been controversial and have limited usefulness in understanding the full nature and implications of information. Information based on thermodynamics can be considered a subset of physical information. As Marcos (2011, p. 77) commented, “the basis for a general measure of information could not be [thermodynamic] entropy, negentropy, or distance from equilibrium.”

The plain term information will be used here to refer to symbolic information. Physical information will always be identified explicitly as physical information.

2.7 Differences Between Symbolic and Physical Information

The fundamental differences between symbolic and physical information include:

· Symbolic information is an active process initiated by living systems involving the creation and interpretation of symbols, and typically producing physical effects. Physical information is a descriptive property of the distribution of matter and energy.

· Symbolic information intrinsically has purpose for living systems, whereas life is basically incidental to physical information.

· Physical information can describe the media used for symbolic information, but does not consider the interpretation and meaning of the symbols, or the effects resulting from the symbols.

2.8 Failure to Distinguish Between Symbolic and Physical Information

Scientific discussions of information frequently fail to distinguish between symbolic and physical information. This oversight too often results in (a) attributing properties of symbolic information processing and life to nonliving processes, (b) an inadequate appreciation of the active role and complexity of the interpretational infrastructure, and (c) not recognizing the creative aspects of symbolic information processing. Examples are given in later sections.

When designing communication systems, treating the interpretational infrastructure as an unexplained given may be an appropriate assumption. However, that assumption is counterproductive when concepts of information are proposed for fundamental scientific explanations. For scientific explanations, understanding the relevant interpretational infrastructure is important, as is the distinction between symbolic and physical information.

2.9 Quantitative Information Theory

Quantitative information theory was developed to evaluate and design electronic communication systems. The theory focuses on quantifying and optimizing the information transmission rate in a communication channel and the reliability of transmission through a noisy channel (Cover & Thomas, 2006). The methods can be used to quantify information in other areas of investigation, including biology and psychology. Quantitative information theory deals with different states or outcomes without regard for the meaning of the outcomes or whether the outcomes result from a living process or from an inanimate process.

Quantitative information theory requires that the probabilities for different states or outcomes are known, which significantly limits its use. The theory is most reliably applied in highly controlled situations such as the evaluation of communications technology or analyses in academic scientific experiments. Quantitative information theory is least applicable in situations with creativity where the range of possible outcomes and probabilities for the outcomes are dynamically determined by living systems.

Information is quantified as a reduction in uncertainty, where uncertainty is measured in terms of probability (Cover & Thomas, 2006). The unit of measure for quantitative information is a bit. For example, learning the outcome of a fair coin toss provides one bit of information. The term entropy is used to refer to the amount of uncertainty that could become information. Thus, a fair coin toss also has one bit of entropy prior to observing the outcome.
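The coin-toss figure can be checked with a short calculation. The sketch below uses the standard formula for information entropy, the negative sum of p times the base-2 logarithm of p over all outcomes; the function name shannon_entropy and the biased-coin and die examples are assumptions added for illustration.

```python
import math


def shannon_entropy(probabilities):
    """Return the entropy, in bits, of a discrete probability distribution."""
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Outcomes with zero probability contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]   # hypothetical coin that lands heads 90% of the time
fair_die = [1 / 6] * 6

print(shannon_entropy(fair_coin))     # 1.0 bit
print(shannon_entropy(biased_coin))   # about 0.47 bits
print(shannon_entropy(fair_die))      # about 2.58 bits
```

The biased coin has less entropy than the fair coin because its outcome is less uncertain before the toss. Note that the calculation needs the probabilities as input, and nothing else about the outcomes.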

This definition of entropy is often called information entropy or Shannon entropy (after Claude Shannon, who defined it). Information entropy is based on probability whereas thermodynamic entropy is based on energy. Probability and energy are fundamentally different concepts. Some writings attempt to equate information entropy and thermodynamic entropy (e.g., Brillouin, 1962). These ideas need to be evaluated with caution. Quantitative information theory can be applied to thermodynamic energy microstates and gives results that are basically the same as thermodynamic entropy. However, these writings sometimes give an inappropriate impression that this is the only use for concepts of information and should be the definition of information.
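For reference, the two quantities can be written side by side. The notation is an assumption made here for illustration: p_i is the probability of state i and k_B is Boltzmann's constant. The second expression is the Gibbs form of thermodynamic entropy from statistical mechanics, which is not given in the sources discussed above.

```latex
H = -\sum_i p_i \log_2 p_i
  \quad \text{(information entropy, in bits)}

S = -k_B \sum_i p_i \ln p_i
  \quad \text{(Gibbs entropy, in joules per kelvin)}

S = (k_B \ln 2)\, H
  \quad \text{when the } p_i \text{ are the probabilities of the energy microstates}
```

The two expressions differ only by a constant factor in that special case. For any other choice of states, such as the faces of a die or the letters of a message, H is still well defined but has no thermodynamic interpretation.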

Virtually any probabilistic or statistical model can be expressed mathematically in terms of quantitative information theory. For example, statistical hypothesis testing and information theory are closely related (Cover & Thomas, 2006). The statistical results of a scientific experiment or the result of an individual scientific measurement can be viewed as information obtained about nature.

Like other probability models, quantitative information theory can be applied to different phenomena and is useful only to the degree that the assumptions fit the phenomena being analyzed. It is a method of analysis, not a testable scientific theory about the nature of information.

This mathematical approach can hinder the understanding of information processing if the distinction between probability and meaning is not recognized. Probability and meaning are different dimensions. Quantitative information theory is based on probability and does not consider the meaning, purpose, usefulness to living creatures, or influences on the distribution and flow of matter and energy (Brillouin, 1962, pp. 9-10; Roederer, 2005, pp. 13, 32-33). For example, one possible outcome in a situation may have a probability of .5 and produce effects that are highly adverse for one living being. Another possible outcome may also have a probability of .5, but produce effects that are highly beneficial for many living beings. Quantitative information theory considers both outcomes to provide the same amount of information and ignores the fact that they have entirely different meanings or results.
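The example of two equally probable outcomes can be made concrete. In the sketch below, the information provided by observing an outcome is computed as the negative base-2 logarithm of its probability, the standard measure for a single outcome. The Outcome class, its field names, and the figure of 1,000 living beings are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Outcome:
    label: str
    probability: float
    effect: str            # hypothetical description of the consequences
    beings_affected: int   # hypothetical count of living beings affected


def information_bits(probability):
    """Information, in bits, gained by observing an outcome of this probability."""
    return -math.log2(probability)


outcomes = [
    Outcome("A", 0.5, "highly adverse", 1),
    Outcome("B", 0.5, "highly beneficial", 1000),
]

for outcome in outcomes:
    bits = information_bits(outcome.probability)
    print(outcome.label, bits, outcome.effect, outcome.beings_affected)
```

Both outcomes yield exactly one bit. The effect and beings_affected fields never enter the calculation, which is the limitation described above.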

For living beings, the responses from the interpretational infrastructure are much more important than the quantifiable probabilities for the possible outcomes or symbols. Quantitative information theory does not incorporate the basic principle that the meaning and responses for symbolic information are disproportionate to the occurrence of the associated symbols. For example, the probability for a genetic mutation is not indicative of the effect or meaning of the mutation. Similarly, the impact of the death of a loved one is not meaningfully characterized by the probability of the death.

Quantitative information theory tends to obscure the distinction between symbolic and physical information and to obscure the unique properties of symbolic information. The theory is useful for designing and evaluating electronic communication systems or evaluating processes that can be modeled as an electronic communication system. However, it has very limited usefulness for understanding basic concepts of information or understanding information processing when interpretation or creativity has significant roles.

2.10 Subcategories of Information

Subcategories of physical and symbolic information may be useful. A few subcategories are noted here, but subcategories are not a major focus of this discussion.

For symbolic information, numerous subcategories are obvious and useful. Many commonly used terms define subcategories or types of symbolic information, such as perception, memory, imagine, plan, instructions, read, write, talk, book, picture, and video. In addition, major areas of study such as genetics, information technology, and linguistics are higher-level subcategories.

One obvious subcategory of physical information is physical information that is useful or meaningful to living beings. The weather is a classic example. Another distinction for physical information is whether or not the system being described includes symbols in media created by a living entity for purposes of information processing.

2.11 Related Concepts

Information and Intentionality

In discussing the concept of information in biology, John Maynard Smith (2010) identified two different contexts for the term information. He noted that a cloud provides information about the weather, but the cloud does not have the purpose of providing information about the weather. On the other hand, a weather forecast has the specific purpose of providing information about the weather. He described the weather forecast as information having “intentionality,” while the information associated with a cloud does not have intentionality. He identified the information in biology as having intentionality.

He also recognized the symbolic, creative nature of the intentional information in biology. “I think that it is the symbolic nature of molecular biology that makes possible an indefinitely large number of biological forms” (Maynard Smith, 2010, p. 133). His overall conclusions were similar to the concepts of physical and symbolic information described here, although he provided little consideration of the interpretational infrastructure or the disproportionate responses to symbols.

Two Types of Information

Marcia Bates (2005, 2006) and Anthony Reading (2011) each defined two types of information, one type pertaining to physical properties and the other resulting from meaning by living beings. Using concepts described by mathematician Norbert Wiener, Reading (2011, p. 10) defined intrinsic information as “the way the various particles, atoms, molecules, and objects in the universe are organized and arranged.” Similarly, Bates (2005) defined information 1 as “the pattern of organization of matter and energy.” She says this concept of information was “endemic in the 1970s.” Both of these definitions are equivalent to physical information described here.

Their definitions for information associated with living beings focus on responses to stimuli and differ significantly from the concept of symbolic information described here. Reading (2011, p. 4) defines meaningful information as “a detected pattern of matter or energy that generates a response in a recipient.” Similarly, Bates (2005, 2006) defines information 2 as “some pattern of organization of matter and energy that has been given meaning by a living being.”

For both definitions, the writers specifically state that books that are not currently being read by a person are excluded. Books on a shelf are not generating a response in a recipient and are not currently being given meaning by a living being—and therefore do not qualify as meaningful information.

According to the concept of symbolic information described here, a book is symbolic information because it was created by a living being specifically to symbolically represent certain conditions and ideas. Symbolic information is based on the creation of symbols in media, not whether the symbols are actually being perceived or used at a given point in time. It appears to me that symbolic information is the most important concept for understanding information processing in living systems.

Biosemiotics

“Biosemiotics is an interdisciplinary research agenda investigating the myriad forms of communication and signification found in and between living systems. It is thus the study of representation, meaning, sense, and the biological significance of codes and sign processes, from genetic code sequences to intercellular signaling processes to animal display behavior to human semiotic artifacts such as language and abstract symbolic thought” (International Society for Biosemiotic Studies, 2012).

The term sign rather than symbol is used broadly in biosemiotics. Signs include symbolic information as described here plus inanimate environmental conditions that have meaning to a living being (e.g., thunder indicating a storm) (Barbieri, 2010; Brier, 2010).

The focus on signs and lack of conceptual distinction between symbolic and physical information can be expected to encourage vague concepts of information processing. The importance of interpretation is widely recognized in biosemiotic writings. The active, creative aspects of generating symbols have sometimes been noted (e.g., Pattee, 2007), but generally have received much less attention.

Biosemiotics is a relatively new area of study and basic concepts are still being developed. The field focuses on living creatures and does not include information-processing technology. Symbolic information as discussed here applies to information-processing technology as an extension of human information processing.

Concepts in the Philosophy of Information

Philosopher Luciano Floridi (2010) uses the term data for “lacks of uniformity in the real world” (p. 23)—which is equivalent to physical information as defined here. His General Definition of Information is data that are “meaningful.” He then distinguishes several binary subcategories of information, including environmental/semantic, instructional/factual, true/untrue, and intentional/unintentional. With the exception of environmental information, these categories apply to symbolic information and are primarily subcategories of semantic information.

Floridi’s emphasis on semantic information appears to be based on human information processing, and he does not describe a broader category of symbolic information. He argues that DNA does not contain, carry, or encode information because these concepts are part of semantic information (p. 79). Semantic information has intentionality and DNA does not have intentionality (p. 80).

I think that a broad concept of symbolic information is needed to understand the basic nature of information. As described in the next section, DNA is a classic example of symbolic information, with symbols created by a living system specifically for the purpose of transferring or communicating information to the next generation. The concept of symbolic information used here requires living systems, but does not require living beings with consciousness and language that fit a narrow interpretation of semantics.

Floridi’s categories also classify all information as instructional, factual, or environmental, without a category for the creative symbolic representation of hypothetical possibilities, problem solving, and fantasy that can occur with imagination. As discussed in later sections, imagination is an important ability for humans.

More generally, what makes data or physical information “meaningful”? It appears to me that physical differences become meaningful when they are symbolically represented and interpreted by a living system or a technological extension of a living system. Recognizing the unique, powerful properties of the creation and use of symbols is the most directly useful concept for understanding information in living systems.

Pragmatic Information

Pragmatic information focuses on the impact of a message on a receiving system and associated changes to the structure and behavior of the receiving system (Kornwachs and Jacoby, 1996). Conceptually, syntactics is the theory of the relations between signs, semantics is the theory of the relations between signs and the objects symbolized, and pragmatics is the theory of the relations between the signs and their users. Traditional quantitative information theory deals with syntactics and misses the meaning and uses of information.

Writings on pragmatic information implicitly incorporate symbolic and physical information, but do not explicitly distinguish these aspects of information. These writings focus on the receiving systems and pay little attention to the creation and interpretation of symbols—which results in not fully appreciating the role and complexity of the interpretational infrastructure or the creative aspects of information. These writings also tend to be abstract and mathematical, with little practical application or insight about the properties of specific information processing systems.

References

Avery, John, (2003). Information Theory and Evolution. Singapore: World Scientific Publishing.

Bates, Marcia J., (2005). Information and Knowledge: An Evolutionary Framework for Information Science. Information Research, 10(4). Available at http://informationr.net/ir/10-4/paper239.html.

Bates, Marcia J., (2006). Fundamental Forms of Information. Journal of the American Society for Information Science and Technology, 57(8), 1033-1045. Available at http://pages.gseis.ucla.edu/faculty/bates/articles/NatRep_info_11m_050514.html.

Barbieri, Marcello, (2010). Biosemiotics: A New Understanding of Life. In Donald Favareau (ed.) Essential Readings in Biosemiotics: Anthology and Commentary, pp. 751-796. New York: Springer.

Brier, Søren, (2010). The Cybersemiotic Model of Communication: An Evolutionary View on the Threshold between Semiosis and Information Exchange. In Donald Favareau (ed.) Essential Readings in Biosemiotics: Anthology and Commentary, pp. 697-730. New York: Springer.

Brillouin, L. (1962). Science and Information Theory (2nd ed.). New York: Academic Press.

Cover, T.M. & Thomas, J.A. (2006). Elements of Information Theory (2nd ed.). Hoboken, New Jersey: John Wiley & Sons.

Floridi, Luciano, (2010). Information: A Very Short Introduction. New York: Oxford University Press.

International Society for Biosemiotic Studies, (2012). “What is Biosemiotics?” Accessed June 6, 2012 at http://www.biosemiotics.org/.

Kornwachs, Klaus, and Jacoby, Konstantin (editors), (1996). Information: New Questions to a Multidisciplinary Concept. Berlin: Akademie Verlag.

Lambert, Frank, (2005). “Entropy is Simple, Qualitatively.” Accessed June 19, 2012 at http://entropysite.oxy.edu/entropy_is_simple/index.html.

Lloyd, Seth, (2006). Programming the Universe. New York: Random House.

Marcos, Alfredo, (2011). Bioinformation as a Triadic Relation. In George Terzis and Robert Arp (eds.) Information and Living Systems: Philosophical and Scientific Perspectives, pp. 55-90. Cambridge, MA: MIT Press.

Maynard Smith, John, (2010). The Concept of Information in Biology. In Paul Davies and Niels Henrik Gregersen (eds.) Information and the Nature of Reality: From Physics to Metaphysics, pp. 123-145. Cambridge, UK: Cambridge University Press.

Pattee, H.H., (2007). The Necessity of Biosemiotics: Matter-Symbol Complementarity. In Marcello Barbieri (ed.), Introduction to Biosemiotics, pp. 115-132. Dordrecht, The Netherlands: Springer.

Reading, Anthony, (2011). Meaningful Information: The Bridge between Biology, Brain, and Behavior. New York: Springer.

Roederer, J.G., (2005). Information and Its Role in Nature. Berlin, Germany: Springer.

Stewart, Ian, (2003). The Second Law of Gravity and the Fourth Law of Thermodynamics. In Niels Henrik Gregersen (ed.) From Complexity to Life: On the Emergence of Life and Meaning, pp. 114-150. New York: Oxford University Press.

[Version of 4/14/2023]

[Originally published on the internet on 5/10/2012 at http://science.jeksite.org/info1/ ]