Skip to main content
Entropy logoLink to Entropy
. 2018 Mar 29;20(4):237. doi: 10.3390/e20040237

Axiomatic Information Thermodynamics

Austin Hulse 1, Benjamin Schumacher 1,*, Michael D Westmoreland 2
PMCID: PMC7512752  PMID: 33265328

Abstract

We present an axiomatic framework for thermodynamics that incorporates information as a fundamental concept. The axioms describe both ordinary thermodynamic processes and those in which information is acquired, used and erased, as in the operation of Maxwell’s demon. This system, similar to previous axiomatic systems for thermodynamics, supports the construction of conserved quantities and an entropy function governing state changes. Here, however, the entropy exhibits both information and thermodynamic aspects. Although our axioms are not based upon probabilistic concepts, a natural and highly useful concept of probability emerges from the entropy function itself. Our abstract system has many models, including both classical and quantum examples.

Keywords: axiomatic thermodynamics, information, entropy, Maxwell’s demon

1. Introduction

Axiomatic approaches to physics are useful for exploring the conceptual basis and logical structure of a theory. One classic example was presented by Robin Giles over fifty years ago in his monograph Mathematical Foundations of Thermodynamics [1]. His theory is constructed upon three phenomenological concepts: thermodynamic states, an operation (+) that combines states into composites, and a relation (→) describing possible state transformations. From a small number of basic axioms, Giles derives a remarkable amount of thermodynamic machinery, including conserved quantities (“components of content”), the existence of an entropy function that characterizes irreversibility for possible processes, and so on.

Alternative axiomatic developments of thermodynamics have been constructed by others along different lines. One notable example is the framework of Lieb and Yngvason [2,3] (which has recently been used by Thess as the basis for a textbook [4]). Giles’s abstract system, meanwhile, has found application beyond the realm of classical thermodynamics, e.g., in the theory of quantum entanglement [5].

Other work on the foundations of thermodynamics has focused on the concept of information. Much of this has been inspired by Maxwell’s famous thought-experiment of the demon [6]. The demon is an entity that can acquire and use information about the microscopic state of a thermodynamic system, producing apparent violations of the Second Law of Thermodynamics. These violations are only “apparent” because the demon is itself a physical system, and its information processes are also governed by the underlying dynamical laws.

Let us examine a highly simplified example of the demon at work. Our thermodynamic system is a one-particle gas enclosed in a container (a simple system also used in [7]). The gas may be allowed to freely expand into a larger volume, but this process is irreversible. A “free compression” process that took the gas from a larger to a smaller volume with no other net change would decrease the entropy of the system and contradict the Second Law. See Figure 1.

Figure 1.

Figure 1

A one-particle gas may freely expand to occupy a larger volume, but the reverse process would violate the Second Law.

Now, we introduce the demon, which is a machine that can interact with the gas particle and acquire information about its location. The demon contains a one-bit memory register, initially in the state 0. First, a partition is introduced in the container, so that the gas particle is confined to the upper or lower half (U or L). The demon measures the gas particle and records its location in memory, with 0 standing for U and 1 for L. On the basis of this memory bit value, the demon moves the volume containing the particle. In the end, the gas particle is confined to one particular smaller volume, apparently reducing the entropy of the system. This is illustrated in Figure 2.

Figure 2.

Figure 2

A Maxwell’s demon device acquires one bit of information about the location of a gas particle, allowing it to contract the gas to a smaller volume.

As Bennett pointed out [8], every demon operation we have described can be carried out reversibly, with no increase in global entropy. Even the “measurement” process can be described by a reversible interaction between the particle location and the demon’s memory bit b, changing the states according to

(U,b)(U,b)(L,b)(L,b¯), (1)

where b¯ is the binary negation of b. However, the demon process as described leaves the demon in a new situation, since an initially blank memory register now stores a bit of information. To operate in a cycle (and thus unambiguously violate the Second Law), the demon must erase this bit and recover the blank memory. However, as Landauer showed [9], the erasure of a bit of information is always accompanied by an entropy increase of at least kBln2 in the surroundings, just enough to ensure that there is no overall entropy decrease in the demon’s operation. The Second Law remains valid.

Information erasure is a physical process. The demon we have described can also erase a stored bit simply by reversing the steps outlined. Depending on the value of the bit register, the demon moves the gas particle to one of two corresponding regions separated by a partition. The same interaction that accomplished the “measurement” process (which takes (U,0)(U,0) and (L,0)(L,1)) can now be used to reset the bit value to 0 (taking (U,0)(U,0) and (L,1)(L,0)). In other words, the information stored redundantly in both the gas and the register is “uncopied”, so that it remains only in the gas. Finally, the partition is removed and the gas expands to the larger volume. The net effect is to erase the memory register while increasing the entropy of the gas by an amount kBln2, in conventional units.

Another link between thermodynamics and information comes from statistical mechanics. As Jaynes has shown [10,11], the concepts and measures of information devised by Shannon for communication theory [12] can be used in the statistical derivation of macroscopic thermodynamic properties. In a macroscopic system, we typically possess only a small amount of information about a few large-scale parameters (total energy, volume, etc.). According to Jaynes, we should therefore choose a probability distribution over microstates that maximizes the Shannon entropy, consistent with our data. That is to say, the rational probability assignment includes no information (in the Shannon sense) except that found in the macroscopic state of the system. This prescription yields the usual distributions used in statistical mechanics, from which thermodynamic properties may be derived.

Axiomatic theories and information analyses each provide important insights into the meaning of thermodynamics. The purpose of this paper is to synthesize these two approaches. We will present an axiomatic basis for thermodynamics that uses information ideas from the very outset. In such a theory, Maxwell’s demon, which accomplishes state changes by acquiring information, is neither a paradox nor a sideshow curiosity. Instead, it is a central conceptual tool for understanding the transformations between thermodynamic states. In our view, thermodynamics is essentially a theory of the descriptions of systems possessed by agents that are themselves (like the demon) physical systems. These descriptions may change as the external systems undergo various processes; however, they also may change when the agent acquires, uses or discards information.

Our work is thus similar in spirit to that of Weilenmann et al. [13], though our approach is very different. They essentially take the Lieb-Yngvason axiomatic framework and apply it to various quantum resource theories. In particular, for the resource theory of non-uniformity [14], the Lieb-Yngvason entropy function coincides with the von Neumann entropy of the quantum density operator, which is a measure of quantum information. We, on the other hand, seek to modify axiomatic thermodynamics itself to describe processes involving “information engines” such as Maxwell’s demon. We do not rely on any particular microphysics and have both classical and quantum models for our axioms. The connections we find between information and thermodynamic entropy are thus as general as the axioms themselves. Furthermore, to make our concept of information as clear as possible, we will seek to base our development on the most elementary ideas of state and process.

We therefore take as our prototype the axiomatic theory of Giles [1]. In fact, Giles’s monograph contains two different axiomatic developments. (The first system is presented in Chapters 1–6 of Giles’s book, and the second is introduced beginning in Chapter 7.) The first, which we might designate Giles I, is based on straightforward postulates about the properties of states and processes. The second, Giles II, is more sophisticated and powerful. The axioms are less transparent in meaning (e.g., the assumed existence of “anti-equilibrium” and “internal” thermodynamic states), but they support stronger theorems. This difference can be illustrated by the status of entropy functions in the two developments. In Giles I, it is shown that an entropy function exists; in fact, there may be many such functions. In Giles II, it is shown that an absolute entropy function, one that is zero for every anti-equilibrium state, exists and is unique. Giles himself regarded the second system as “The Formal Theory”, which he summarizes in an Appendix of that title.

The information-based system we present here, on the other hand, is closer to the more elementary framework of Giles I. We have taken some care to use notation and concepts that are as analogous as possible to that theory. We derive many of Giles’s propositions, including some that he uses as axioms. Despite the similarities, however, even the most elementary ideas (such as the combination of states represented by the + operation) will require some modifications. These changes highlight the new ideas embodied in axiomatic information thermodynamics, and so we will take some care to discuss them as they arise.

Section 2 introduces the fundamental idea of an eidostate, including how two or more eidostates may be combined. In Section 3, we introduce the → relation between eidostates: AB means that eidostate A can be transformed into eidostate B, with no other net change in the apparatus that accomplishes the transformation. The collection of processes has an algebraic structure, which we present in Section 4. We introduce our concept of information in Section 5 and show that the structure of information processes imposes a unique entropy measure on pure information states.

Section 6 introduces axioms inspired by Maxwell’s demon and shows their implications for processes involving thermodynamic states. Section 7 and Section 8 outline how our information axioms also yield Giles’s central results about thermodynamic processes, including the existence of an entropy function, conserved components of content, and mechanical states. In Section 10, we see that an entropy function can be extended in a unique way to a wider class of “uniform” eidostates. This, rather remarkably, gives rise to a unique probability measure within uniform eidostates, as we describe in Section 11.

Section 12 and Section 13 present two detailed models of our axioms, each one highlighting a different aspect of the theory. We conclude in Section 14 with some remarks on the connection between information and thermodynamic entropy, a connection that emerges necessarily from the structure of processes in our theory.

2. Eidostates

Giles [1] bases his development on a set S of states. The term “state” is undefined in the formal theory, but heuristically it represents an equilibrium macrostate of some thermodynamic system. (Giles, in fact, identifies aS with some method of preparation; the concept of a “system” does not appear in his theory, or in ours.) States can be combined using the + operation, so that if a,bS then a+bS also. The new state a+b is understood as the state of affairs arising from simultaneous and independent preparations of states a and b. For Giles, this operation is commutative and associative; for example, a+b is exactly the same state as b+a.

We also have a set S. However, our theory differs in two major respects. First, the combination of states is not assumed to be commutative and associative. On the contrary, we regard a+b and b+a as entirely distinct states for any ab. The motivation for this is that the way that a composite state is assembled encodes information about its preparation, and we want to be able to keep track of this information. On the states in S, the operation + is simply that of forming a Cartesian pair of states: a+b=(a,b), and nothing more.

A second and more far-reaching difference is that we cannot confine our theory to individual elements of S. A theory that keeps track of information in a physical system must admit non-deterministic processes. For instance, if a measurement is made, the single state a before the measurement may result in any one of a collection of possible states {a0,a1,,an} corresponding to the various results.

Therefore, our basic notion is the eidostate, which is a finite nonempty set of states. The term “eidostate” derives from the Greek word eidos, meaning “to see”. The eidostate is a collection of states that may be regarded as possible from the point of view of some agent. The set of eidostates is designated by E. When we combine two eidostates A,BE into the composite A+B, we mean the set of all combinations a+b of elements of these sets. That is, A+B is exactly the Cartesian product (commonly denoted A×B) for the sets. Thus, our state combination operation + is very different from that envisioned by Giles. His operation is an additional structure on S that requires an axiom to fix its properties, whereas we simply use the Cartesian product provided by standard set theory, which we of course assume.

Some eidostates in E are Cartesian products of other sets (which are also eidostates); some eidostates are not, and are thus “prime”. We assume that each eidostate has a “prime factorization” into a finite number of components. Here, is our first axiom:

Axiom 1.

Eidostates: E is a collection of sets called eidostates such that:

  • (a) 

    Every AE is a finite nonempty set with a finite prime Cartesian factorization.

  • (b) 

    A+BE if and only if A,BE.

  • (c) 

    Every nonempty subset of an eidostate is also an eidostate.

Part (c) of this axiom ensures, among other things, that every element a of an eidostate can be associated with a singleton eidostate {a}. Without too much confusion, we can simply denote any singleton eidostate {a} by the element it contains, writing a instead. We can therefore regard the set of states S in two ways. Either we may think of S as the collection of singleton eidostates in E, or we may say that S is the collection of all elements of all eidostates: S=AEA. Either way, the set E characterized by Axiom 1 is the more fundamental object.

Our “eidostates” are very similar to the “specifications” introduced by del Rio et al. [15] in their general framework for resource theories of knowledge. The two ideas, however, are not quite identical. To begin with, a specification V may be any subset of a state space Ω, whereas the eidostates in E are required to be finite nonempty subsets of S—and indeed, not every such subset need be an eidostate. For example, the union AB of two eidostates is not necessarily an eidostate. The set E therefore does not form a Boolean lattice. Specifications are a general concept applicable to state spaces of many different kinds, and are used in [15] to analyze different resource theories and to express general notions of approximation and locality. We, however, will be restricting our attention to an E (and S) with very particular properties, expressed by Axiom 1 and our later axioms, that are designed to model thermodynamics in the presence of information engines such as Maxwell’s demon.

Composite eidostates are formed by combining eidostates with the + operation (Cartesian product). The same pieces may be combined in different ways. We say that A,BE are similar (written AB) if they are made up of the same components. That is, AB provided there are eidostates E1,,En such that A=FA(E1,,En) and B=FB(E1,,En) for two Cartesian product formulas FA and FB. Thus,

(E1+E2)+E3E2+(E1+E3), (2)

and so on. The similarity relation ∼ is an equivalence relation on E.

We sometimes wish to combine an eidostate with itself several times. For the integer n1, we denote by nA the eidostate A+(A+), where A appears n times in the nested Cartesian product. This is one particular way to combine the n instances of A, though of course all such ways are similar. Thus, we may assert equality in A+nA=(n+1)A, but only similarity in nA+A(n+1)A.

Finally, we note that we have introduced as yet no probabilistic ideas. An eidostate is a simple enumeration of possible states, without any indication that some are more or less likely than others. However, as we will see in Section 11, in certain contexts, a natural probability measure for states does emerge from our axioms.

3. Processes

In the axiomatic thermodynamics of Giles, the → relation describes state transformations. The relation ab means that there exists another state z and a definite time interval τ0 so that

a+zτb+z, (3)

where τ indicates time evolution over the period τ. The pair (z,τ) is the “apparatus” that accomplishes the transformation from a to b. This dynamical evolution is a deterministic process; that is, in the presence of the apparatus (z,τ) the initial state a is guaranteed to evolve to the final state b. This rule of interpretation for → motivates the properties assumed for the relation in the axioms.

Our version of the arrow relation → is slightly different, in that it encompasses non-deterministic processes. Again, we envision an apparatus (z,τ) and we write

a+zτb+z, (4)

to mean that the initial state a+z may possibly evolve to b+z over the stated interval. Then, for eidostates A and B, the relation AB means that, if aA and bS, then there exists an apparatus (z,τ) such that a+zτb+z only if bB. Each possible initial state a in A might evolve to one or more final states, but all of the possible final states are contained in B. For singleton eidostates a and b, the relation ab represents a deterministic process, as in Giles’s theory.

Again, we use our heuristic interpretation of → to motivate the essential properties specified in an axiom:

Axiom 2.

Processes: Let eidostates A,B,CE, and sS.

  • (a) 

    If AB, then AB.

  • (b) 

    If AB and BC, then AC.

  • (c) 

    If AB, then A+CB+C.

  • (d) 

    If A+sB+s, then AB.

Part (a) of the axiom asserts that it is always possible to “rearrange the pieces” of a composite state. Thus, A+BB+A and so on. Of course, since ∼ is a symmetric relation, AB implies both AB and BA, which we might write as AB.

Part (b) says that a process from A to C may proceed via an intermediate eidostate B. Parts (c) and (d) establish the relationship between → and +. We can always append a “bystander” state C to any process AB, and a bystander singleton state s in A+sB+s can be viewed as part of the apparatus that accomplishes the transformation AB.

We use the → relation to characterize various conceivable processes. A formal process is simply a pair of eidostates A,B. Following Giles, we may say that A,B is:

  • natural if AB;

  • antinatural if BA;

  • possible if AB or BA (which may be written AB);

  • impossible if it is not possible;

  • reversible if AB; and

  • irreversible if it is possible but not reversible.

Thus, any formal process must be one of four types: reversible, natural irreversible, antinatural irreversible, or impossible.

Any nonempty subset of an eidostate is an eidostate. If the eidostate A is an enumeration of possible states, the proper subset eidostate BA may be regarded as an enumeration with some additional condition present that eliminates one or more of the possibilities. What can we say about processes involving these “conditional” eidostates? To answer this question, we introduce two further axioms. The first is this:

Axiom 3.

If A,BE and B is a proper subset of A, then AB.

This expresses the idea that no natural process can simply eliminate a state from a list of possibilities. This is a deeper principle than it first appears. Indeed, as we will find, in our theory, it is the essential ingredient in the Second Law of Thermodynamics.

To express the second new axiom, we must introduce the notion of a uniform eidostate. The eidostate A is said to be uniform provided, for every a,bA, we have ab. That is, every pair of states in A is connected by a possible process. This means that all of the A states are “comparable” in some way involving the → relation.

All singleton eidostates are uniform. Are there any non-uniform eidostates? The axioms in fact do not tell us. We will find models of the axioms that contain non-uniform eidostates and others that contain none. Even if there are states a,bS for which ab and ba, nothing in our axioms guarantees the existence of an eidostate that contains both a and b as elements. We denote the set of uniform eidostates by UE.

Now, we may state the second axiom about conditional processes.

Axiom 4.

Conditional processes:

  • (a) 

    Suppose A,AE and bS. If Ab and AA then Ab.

  • (b) 

    Suppose A and B are uniform eidostates that are each disjoint unions of eidostates: A=A1A2 and B=B1B2. If A1B1 and A2B2 then AB.

The first part makes sense given our interpretation of the → relation in terms of an apparatus (z,τ). If every aA satisfies a+zτb+z for only one state b, then the same will be true for every aAA. Part (b) of the axiom posits that, if we can find an apparatus whose dynamics transforms A1 states into B1 states, and another whose dynamics transforms A2 states into B2 states, then we can devise a apparatus with “conditional dynamics” that does both tasks, taking A to B. This is a rather strong proposition, and we limit its scope by restricting it to the special class of uniform eidostates.

We can as a corollary extend Part (b) of Axiom 4 to more than two subsets. That is, suppose A,BU and the sets A and B are each partitioned into n mutually disjoint, nonempty subsets: A=A1An and B=B1Bn. In addition, suppose AkBk for all k=1,,n. Then, AB.

4. Process Algebra and Irreversibility

Following Giles, we now explore the algebraic structure of formal processes and describe how the type of a possible process may be characterized by a single real-valued function. Although the broad outlines of our development will follow that of Giles [1], there are significant differences. (For example, in our theory, the unrestricted set P of eidostate processes does not form a group.)

There is an equivalence relation among formal processes, based on the similarity relation ∼ among eidostates. We say that A,BC,D if there exist singletons x,yS such that A+xC+y and B+xD+y. That is, if we augment the eidostates in A,B by x and those in C,D by y, the corresponding eidostates in the two processes are mere rearrangements of each other. It is not hard to establish that this is an equivalence relation. Furthermore, equivalent processes (under ≐) are always of the same type. To see this, suppose that if A,BC,D and AB. Then, A+xC+y, etc., and

C+yA+xB+xD+y. (5)

Hence, CD. It is a straightforward corollary that AB if and only if CD.

Processes (or more strictly, equivalence classes of processes under ≐) have an algebraic structure. We define the sum of two processes as

A,B+C,D=A+C,B+D, (6)

and the negation of a process as A,B=B,A. We call the − operation “negation” even though in general A,B is not the additive inverse of B,A. Such an inverse may not exist. As we will see, however, the negation does yield an additive inverse process in some important special contexts.

The sum and negation operations on processes are both compatible with the equivalence ≐. That is, first, if A,BA,B then A,BA,B. Furthermore, if A,BA,B and C,DC,D, then

A,B+C,DA,B+C,D. (7)

This means that the sum and negation operations are well-defined on equivalence classes of processes. If A,B and C,D denote two equivalence classes represented by A,B and C,D, then

A,B+C,D=A+C,B+D, (8)

which is the same regardless of the particular representatives chosen to indicate the two classes being added.

We let P denote the collection of all equivalence classes of formal processes. Then, the sum operation both is associative and commutative on P. Furthermore, it contains a zero element 0=s,s for some singleton s. That is, if Γ=A,B,

Γ+0=A,B+s,s=A+s,B+s=A,B=Γ. (9)

The set P is thus a monoid (a semigroup with identity) under the operation +. Moreover, the subset PS of singleton processes—equivalence classes of formal processes with singleton initial and final states—is actually an Abelian group, since

a,b+b,a=a+b,b+a=a+b,a+b=0. (10)

In PS, the negation operation does yield the additive inverse of an element.

The → relation on states induces a corresponding relation on processes. If Γ,ΔP, we say that ΓΔ provided the process ΓΔ is natural (a condition we might formally write as ΓΔ0). Intuitively, this means that process Γ can “drive process Δ backward”, so that Γ and the opposite of Δ together form a natural process.

Now, suppose that P^ is a collection of equivalence classes of possible processes that is closed under addition and negation. An irreversibility function I is a real-valued function on P^ such that

  1. If Γ,ΔP^, then I(Γ+Δ)=I(Γ)+I(Δ).

  2. The value of I determines the type of the processes in P^:
    • I(Γ)>0 whenever Γ is natural irreversible.
    • I(Γ)=0 whenever Γ is reversible.
    • I(Γ)<0 whenever Γ is antinatural irreversible.

An irreversibility function, if it exists, has a number of elementary properties. For instance, since the process Γ+(Γ) is always reversible for any ΓP^, it follows that I(Γ)=I(Γ). In fact, we can show that all irreversibility functions on P^ are essentially the same:

Theorem 1.

An irreversibility function on P^ is unique up to an overall positive factor.

Proof. 

Suppose I1 and I2 are irreversibility functions on the set P^. If Γ is reversible, then I1(Γ)=I2(Γ)=0. If there are no irreversible processes in P^, then I1=I2.

Now, suppose that P^ contains at least one irreversible process Γ, which we may suppose is natural irreversible. Thus, both I1(Γ)>0 and I2(Γ)>0. Consider some other process ΔP^. We must show that

I1(Δ)I1(Γ)=I2(Δ)I2(Γ). (11)

We proceed by contradiction, imagining that the two ratios are not equal and (without loss of generality) that the second one is larger. Then, there exists a rational number m/n (with n>0) such that

I1(Δ)I1(Γ)<mn<I2(Δ)I2(Γ). (12)

The first inequality yields mI1(Γ)nI1(Δ)>0, so that the process mΓnΔ (that is, mΓ+n(Δ)) must be natural irreversible. The second inequality yields mI2(Γ)nI2(Δ)<0, so that the process mΓnΔ must be antinatural irreversible. These cannot both be true, so the original ratios must be equal. ☐

The additive irreversibility function I is only defined on a set P^ of possible processes. However, we can under some circumstances extend an additive function to a wider domain. It is convenient to state here the general mathematical result we will use later for this purpose:

Theorem 2 (“Hahn–Banach theorem” for Abelian groups).

Let G be an Abelian group. Let ϕ be a real-valued function defined and additive on a subgroup G0 of G. Then, there exists an additive function ϕ defined on G such that ϕ(x)=ϕ(x) for all x in G0.

Note that the extension ϕ is not necessarily unique—that is, there may be many different extensions of a single additive function ϕ.

5. Information and Entropy

In our theory, information resides in the distinction among possible states. Thus, the eidostate A={a1,a2,a3} represents information in the distinction among it elements. However, thermodynamic states such as the aks may also have other properties such as energy, particle content, and so on. To disentangle the concept of information from the other properties of these states, we introduce a notion of a “pure” information state.

The intuitive idea is this. We imagine that the world contains freely available memory devices. Different configurations of these memories—different memory records—are distinct states that are degenerate in energy and every other conserved quantity. Any particular memory record can thus be freely created from or reset to some null value.

We therefore define a record state to be an element rS such that there exists aS so that aa+r. The state a can be thought of as part of the apparatus that reversibly exchanges the particular record r with a null value. In fact, if A is any eidostate at all, we find that

A+aA+(a+r)(A+r)+a, (13)

and so by cancellation of the singleton state a, AA+r. We denote the set of record states by R. If r,sR, then r+sR, and furthermore rs. Any particular record state can be reversibly transformed into any other, a fact that expresses the arbitrariness of the “code” used to represent information in a memory device.

An information state is an eidostate whose elements are all record states, and the set of such eidostates is denoted I. All information states are uniform eidostates. An information process is one that is equivalent to a process I,J, where I,JI. (Of course, information processes also include processes of the form I+x,J+x for a non-record state xS, as well as more complex combinations of record and non-record states.) Roughly speaking, an information process is a kind of computation performed on information states.

It is convenient at this point to define a bit state (denoted Ib) as an information state containing exactly two record states. That is, Ib={r0,r1}. A bit process is an information process of the form Θb=r,Ib for some rR—that is, a process by which a bit state is created from a single record state.

We can illustrate a bit process in a thermodynamics context by considering a thought-experiment involving Maxwell’s demon. We imagine that the demon operates on a one-particle gas confined to a volume, a situation described by gas state v (see Figure 3). The demon, whose memory initially has a record state r, inserts a partition into the container, dividing the volume in two halves labeled 0 and 1. The gas molecule is certainly in one sub-volume or the other. The demon then records in memory which half the particle occupies. Finally, the partition is removed and the particle wanders freely around the whole volume of the container. In thermodynamic terms, the gas state relaxes to the original state v. The overall process establishes the following relations:

v+r{v0,v1}+r{v0+r0,v1+r1}v+{r0,r1}, (14)

and so r{r0,r1}. In our example, the bit process Θb=r,Ib is natural one.

Figure 3.

Figure 3

Maxwell’s demon interacting with a one-particle gas, illustrating a bit process r,{r0,r1}.

However, we do not yet know that there actually are information states and information processes in our theory. We address this by a new axiom.

Axiom 5.

Information: There exist a bit state and a possible bit process.

Axiom 5 has a wealth of consequences. Since an information state exists, record states necessarily also exist. There are infinitely many record states, since for any rR we also have distinct states r+r,r+(r+r),,nr,, all in R. We may have information states in I that contain arbitrarily many record states, because nIb contains 2n elements. Furthermore, since every nonempty subset of an information state is also an information state, for any integer k1 there exists II so that #(I)=k.

Any two bit states can be reversibly transformed into one another. Consider Ib={r0,r1} and Ib={s0,s1}. Since r0s0 and r1s1, it follows by Axiom 4 that IbIb (and hence IbIb.)

If any bit process is possible, then every bit process is possible. Furthermore, every such process is of the natural irreversible type. To see why, consider the bit state Ib={r0,r1} and suppose Ibr for a record state r. Since rr0, this implies that {r0,r1}r0, which is a violation of Axiom 3. Therefore, it must be that rIb but Ibr; and this is true for any choice of r and Ib.

Now, we can prove that every information process is possible, and that the → relation is determined solely by the relative sizes of the initial and final information states.

Theorem 3.

Suppose I,JI. Then, IJ if and only if #(I)#(J).

Proof. 

To begin with, we can see that #(I)=#(J) implies IJ. This is because we can write I={r1,,rn} and J={s1,,sn}. Since rksk for every k=1,,n, the finite extension of Axiom 4 tells us that IJ.

Now, imagine that #(I)>#(J). There exists a proper subset I of I so that #(I)=#(J), and thus JI. If it happened that IJ, it would follow that II, a contradiction of Axiom 4. Hence, #(I)>#(J) implies IJ.

It remains to show that if #(I)<#(J), it must be that IJ. As a first case, suppose that #(I)=n and #(J)=n+1. Letting I={r1,,rn} and J={s1,,sn,sn+1}, we note that r1s1,r2s2,,rn{sn,sn+1} (the last being a bit process. It follows that IJ.

We now proceed inductively. Given #(J)=#(I)+n, we can imagine a sequence of information states Km with successive numbers of elements between #(I) and #(J), so that #(Km)=#(I)+m. From what we have already proved,

IK1Kn1J, (15)

and by transitivity of the → relation we may conclude that IJ. Therefore, IJ if and only if #(I)#(J). ☐

Intuitively, the greater the number of distinct record states in an information state, the more information it represents. Thus, under our axioms, a natural information process may maintain or increase the amount of information, but never decrease it.

We can sharpen this intuition considerably. Let PI be the set of information processes, which is closed under addition and negation, and in which every process is possible. We can define a function I on PI as follows: For Γ=I,JPI,

I(Γ)=log#(J)#(I)=log#(J)log#(I). (16)

The logarithm function guarantees the additivity of I when two processes are combined, and the sign of I exactly determines whether IJ. Thus, I is an irreversibility function on PI, as the notation suggests. This function is unique up to a positive constant factor—i.e., the choice of logarithm base. If we choose to use base-2 logarithms, so that a bit process has I(Θb)=1, then I is uniquely determined.

The irreversibility function on information processes can be expressed in terms of a function on information states. Suppose we have a collection of eidostates KE that is closed under the + operation. With Giles, we define a quasi-entropy on K to be a real-valued function S such that, for A,BK:

  1. S(A+B)=S(A)+S(B).

  2. If A,B is natural irreversible, then S(A)<S(B).

  3. If A,B is reversible, then S(A)=S(B).

(A full “entropy” function satisfies one additional requirement, which we will address in Section 8 below.) Given a quasi-entropy S, we can derive an irreversibility function I on possible K-processes by I(A,B)=S(B)S(A).

Obviously, S(I)=log#(I) is a quasi-entropy function on I that yields the irreversibility function on PI. We recognize it as the Hartley–Shannon entropy of an information source with #(I) possible outputs [16]. In fact, this is the only possible quasi-entropy on I. Since every information process is possible, two different quasi-entropy functions S and S can only differ by an additive constant. However, since I+rI for II and rR, we know that S(r)=0 for any quasi-entropy. Therefore, S(I)=log#(I) is the unique quasi-entropy function for information states. The quasi-entropy of a bit state is S(Ib)=1.

6. Demons

Maxwell’s demon accomplishes changes in thermodynamic states by acquiring and using information. For example, a demon that operates a trapdoor between two containers of gas can arrange for all of the gas molecules to end up in one of the containers, “compressing” the gas without work. As we have seen, it is also possible to imagine a reversible demon, which acquires and manipulates information in a completely reversible way. If such a demon produces a transformation from state x to state y by acquiring k bits of information in its memory, it can accomplish the reverse transformation (from y to x) while erasing k bits from its memory.

Maxwell’s demon is a key concept in axiomatic information thermodynamics, and we introduce a new axiom to describe “demonic” processes.

Axiom 6.

Demons: Suppose a,bS and JI such that ab+J.

  • (a) 

    There exists II such that ba+I.

  • (b) 

    For any II, either ab+I or b+Ia.

In Part (a), we assert that what one demon can do (transforming a to b by acquiring information in J), another demon can undo (transforming b to a by acquiring information in I). Part (b) envisions a reversible demon. Any amount of information in I is either large enough that we can turn a to b by acquiring I, or small enough that we can erase the information by turning b to a.

A process A,B is said to be demonically possible if one of two equivalent conditions hold:

  • There exists an information state JI such that either AB+J or BA+J.

  • There exists an information process I,JPI such that A,B+I,J is possible; that is, either A+IB+J or B+JA+I.

It is not hard to see that these are equivalent. Suppose we have JI such that AB+J. Then, for any II, A+IB+(I+J). Conversely, we note that AA+I for any II. Thus, if A+IB+J then AB+J as well.

If a process is possible, then it is also demonically possible, since if AB it is also true that A+IB+I. For singleton processes in PS, moreover, the converse is also true. Suppose a,bS and a,b is demonically possible. Then, there exists II such that either ab+I or ba+I. Either way, either trivially or by an application of Axiom 6, there must be JI so that ab+J. A single record state rR is a singleton information state in I. Thus, by Axiom 6 it must be that either ab+r or b+ra. Since b+rb, we find that ab, and so a,b is possible.

A singleton process is demonically possible if and only if it is possible. This means that we can use processes involving demons to understand processes that do not. In fact, we can use Axiom 6 to prove a highly significant fact about the → relation on singleton eidostates.

Theorem 4.

Suppose a,b,cS. If a,b and a,c are possible, then b,c is possible.

Proof. 

First, we note the general fact that, if x,y is a possible singleton process, then there exist I,JI so that xy+I and yx+J.

Given our hypothesis, therefore, there must be I,JI such that ba+I and ac+J. Then

ba+Ic+(I+J). (17)

That is, b,c is demonically possible, and hence possible. ☐

This fact is so fundamental that Giles made it an axiom in his theory. For us, it is a straightforward consequence of the axiom about processes involving demons. It tells us that the set of singleton states S is partitioned into equivalence classes, within each of which all states are related by ⇌.

This statement is more primitive than, but closely related to, a well-known principle called the Comparison Hypothesis. The Comparison Hypothesis deals with a state relation called adiabatic accessibility (denoted ≺) which is definable in Giles’s theory (and ours) but is taken as an undefined relation in some other axiomatic developments. According to the Comparison Hypothesis, if X and Y are states in a given thermodynamic space, either XY or YX. Lieb and Yngvason, for instance, show that the Comparison Hypothesis can emerge as a consequence of certain axioms for thermodynamic states, spaces, and equilibrium [2,3].

Theorem 4 also sheds light on our axiom about conditional processes, Axiom 4. In Part (a) of this axiom, we suppose that Ab for some AE and bS. The axiom itself allows us to infer that ab for every aA. However, Theorem 4 now implies that, for every a,aA, either aa or aa. In other words, Part (a) of Axiom 4, similar to Part (b) of the same axiom, only applies to uniform eidostates.

Finally, we introduce one further “demonic” axiom.

Axiom 7.

Stability: Suppose A,BE and JI. If nAnB+J for arbitrarily large values of n, then AB.

According to the Stability Axiom, if a demon can transform arbitrarily many copies of eidostate A into arbitrarily many copies of B while acquiring a bounded amount of information, then we may say that AB. This can be viewed as a kind of “asymptotic regularization” of the → relation. The form of the Stability Axiom that we have chosen is a particularly simple one, and it suffices for our purposes in this paper. However, more sophisticated axiomatic developments might require a refinement of the axiom. Compare, for instance, Axiom 2.1.3 in Giles to its refinement in Axiom 7.2.1 [1].

To illustrate the use of the Stability Axiom, suppose that A,BE and II such that A+IB+I. Our axioms do not provide a “cancellation law” for information states, so we cannot immediately conclude that AB. However, we can show that nAnB+I for all positive integers n. The case n=1 holds since AA+IB+I. Now, we proceed inductively, assuming that nAnB+I for some n. Then,

(n+1)AnA+A(nB+I)+AnB+(A+I)nB+(B+I)(n+1)B+I. (18)

Thus, nAnB+I for arbitrarily large (and indeed all) values of n. By the Stability Axiom, we see that AB. Thus, there is after all a general cancellation law information states that appear on both sides of the → relation.

7. Irreversibility for Singleton Processes

From our two “demonic” axioms (Axioms 6 and 7), we can use the properties of information states to derive an irreversibility function on singleton processes. Let P^S denote the set of possible singleton processes. This is a subgroup of the Abelian group PS. Thus, if we can find an irreversibility function I on P^S, we will be able to extend it to all of PS.

We begin by proving a useful fact about possible singleton processes:

Theorem 5.

Suppose a,bS so that α=a,b is possible. Then, for any integers n,m0, either a+mIbb+nIb or b+nIba+mIb.

Proof. 

If m=n, the result is easy. Suppose that n>m, so that n=m+k for positive integer k. By Axiom 6, either ab+kIb or b+kIba. We can then append the information state mIb to both sides and rearrange the components. The argument for m>n is exactly similar. ☐

This fact has a corollary that we may state using the → relation on processes. Suppose α=a,b is a possible singleton process, and let q,p be integers with q>0. Then, either qαpΘb or qαpΘb (so that qαpΘb is either natural or antinatural) for the bit process Θb=r,Ib.

Given α, therefore, we can define two sets of rational numbers:

Lα={p/q:qαpΘb}Uα={p/q:qαpΘb} (19)

where q>0. Both sets are nonempty and every rational number is in at least one of these sets. Furthermore, if p/qUα and p/qLα, we have that qαpΘb and qαpΘb, and so

pqΘbqqαpqΘb. (20)

Hence, (pqpq)Θb0. Since Θb is itself a natural irreversible process, it follows that pqpq0, and so

pqpq. (21)

Every element of Uα is an upper bound for Lα. It follows that Uα and Lα form a Dedekind cut of the rationals, which leads us to the following important result.

Theorem 6.

For αP^S, define I(α)=infUα=supLα. Then, I is an irreversibility function on P^S.

Proof. 

First, we must show that I is additive. Suppose α,βP^S. If p/qUα and p/qUβ, then q(α+β)(p+p)Θb, and so (p+p)/qUα+β. It follows that I(α+β)I(α)+I(β). The corresponding argument involving Lα and Lβ proves that I(α+β)I(α)+I(β). Thus, I must be additive.

Next, we must show that the value of I(α) tells us the type of the process α. If I(α)>0, then 0Lα but 0Uα, and so α0 but α0. That is, α is natural irreversible. Likewise, if I(α)<0, then 0Lα but 0Uα, from which we find that α must be antinatural irreversible. Finally, if I(α)=0, we find that qαΘb0 and qα+Θb0 for arbitrarily large values of q. From Axiom 7, we may conclude that α0, and so α is reversible. ☐

Notice that we have arrived at an irreversibility function for possible singleton processes—those most analogous to the ordinary processes in Giles or any text on classical thermodynamics—from axioms about information and processes involving demons (Axioms 5–7). In our view, such ideas are not “extras” to be appended onto a thermodynamic theory, but are instead central concepts throughout. In ordinary thermodynamics, the possibility of a reversible heat engine can have implications for processes that do not involve any heat engines at all. In the information thermodynamics whose axiomatic foundations we are exploring, the possibility of a Maxwell’s demon has implications even for situations in which no demon acts.

We now have irreversibility functions for both information processes and singleton processes. These are closely related. In fact, it is possible to prove the following general result:

Theorem 7.

If αP^S and ΓPI, then the combined process α+Γ is natural if and only if I(α)+I(Γ)0.

Since the set P^S of possible singleton processes is a subgroup of the Abelian group PS of all singleton processes, we can extend the additive irreversiblity function I to all of PS. Though I is unique on the possible set P^S, its extension to PS is generally not unique.

8. Components of Content and Entropy

Our axiomatic theory of information thermodynamics is fundamentally about the set of eidostates E. However, the part of that theory dealing with the set S of singleton eidostates includes many of the concepts and results of ordinary axiomatic thermodynamics [1]. We have a group of singleton processes PS containing a subgroup P^S of possible processes, and we have constructed an irreversibility function I on P^S that may be extended to all of PS. From these we can establish several facts.

  • We can construct components of content, which are the abstract versions of conserved quantities. A component of content Q is an additive function on S such that, if the singleton process a,b is possible, then Q(a)=Q(b). (In conventional thermodynamics, components of content include energy, particle number, etc.)

  • We can find a sufficient set of components of content. The singleton process a,b is possible if and only if Q(a)=Q(b) for all Q in the sufficient set.

  • We can use I to define a quasi-entropy S on S as follows: S(a)=I(a,2a). This is an additive function on states in S such that I(a,b)=S(b)S(a).

Because the extension of the irreversibility function I from P^S to all of PS is not unique, the quasi-entropy S is not unique either. How could various quasi-entropies differ? Suppose I1 and I2 are two different extensions of the same original I, leading to two quasi-entropy functions S1 and S2 on S. Then, the difference Q=S1S2 is a component of content. That is, if a,b is possible,

Q(b)Q(a)=S1(b)S2(b)S1(a)+S2(a)=I1(a,b)I2(a,b)=0, (22)

since I1 and I2 agree on P^S, which contains a,b.

Another idea that we can inherit without alteration is the concept of a mechanical state. A mechanical state is a singleton state that reversibly stores one or more components of content, in much the same way that we can store energy reversibly as the work done to lift or lower a weight. The mechanical state in this example is the height of the weight. In Giles’s theory [1], mechanical states are the subject of an axiom, which we also adopt:

Axiom 8.

Mechanical states: There exists a subset MS of mechanical states such that:

  • (a) 

    If l,mM, then l+mM.

  • (b) 

    For l,mM, if lm then ml.

Nothing in this axiom asserts the actual existence of any mechanical state. It might be that M=. Furthermore, the choice of the designated set M is not determined solely by the → relations among the states. For instance, the set R of record states might be included in M, or not. This explains why the introduction of mechanical states must be phrased as an axiom, rather than a definition: a complete specification of the system must include the choice of which set is to be designated as M. Whatever choice is made for M, the set PM of mechanical processes (i.e., those equivalent to l,m for l,mM) will form a subgroup of PS.

A mechanical state may “reversibly store” a component of content Q, but it need not be true that every Q can be stored like this. We say that a component of content Q is non-mechanical if Q(m)=0 for all mM. For example, we might store energy by lifting or lowering a weight, but we cannot store particle number in this way.

Once we have mechanical states and processes, we can give a new classification of processes. A process ΓP is said to be adiabatically natural (possible, reversible, antinatural) if there exists a mechanical process μPM such that Γ+μ is natural (possible, reversible, antinatural). We can also define the “adiabatic accessibility” relation for states in S, as mentioned in Section 6: ab whenever the process a,b is adiabatically natural.

The set M of mechanical states allows us to refine the idea of a quasi-entropy into an entropy, which is a quasi-entropy S that takes the value S(m)=0 for any mechanical state m. Such a function is guaranteed to exist. We end up with a characterization theorem, identical to a result of Giles [1], that summarizes the general thermodynamics of singleton eidostates in our axiomatic theory.

Theorem 8.

There exist an entropy function S and a set of components of content Q on S with the following properties:

  • (a) 

    For any a,bS, S(a+b)=S(a)+S(b).

  • (b) 

    For any a,bS and component of content Q, Q(a+b)=Q(a)+Q(b).

  • (c) 

    For any a,bS, ab if and only if S(a)S(b) and Q(a)=Q(b) for every component of content Q.

  • (d) 

    S(m)=0 for all mM.

An entropy function is not unique. Two entropy functions may differ by a non-mechanical component of content.

The entropy function on S is related to the information entropy function we found for information states in I. Suppose we have a,bS and I,JI. Then, Theorem 7 tells us that a+Ib+J only if

S(a)+log#(I)S(b)+log#(J). (23)

Let ESI represent the set of eidostates that are similar to a singleton state combined with an information state. Then, S(a+I)=S(a)+log#(I) is an entropy function on ESI. In the next section, we will extend the domain of the entropy function even further, to the set U of all uniform eidostates.

9. State Equivalence

Consider a thought-experiment (illustrated in Figure 4) in which a one-particle gas starts out in a volume v0 and a second thermodynamic system starts out in one of three states e1, e2 or e3. We assume that all conserved quantities are the same for these three states, but they may differ in entropy. We can formally describe the overall situation by the eidostate E+v0, where E={e1,e2,e3} is uniform.

Figure 4.

Figure 4

A state equivalence thought-experiment involving states of an arbitrary thermodynamic system and a one-particle gas.

We can reversibly transform each of the ek states to the same state e, compensating for the various changes in entropy by expanding or contracting the volume occupied by the gas. That is, we can have v0+ekvk+e for adjacent but non-overlapping volumes vk. Axiom 4 indicates that we can write E+v0e+V, where V={v1,v2,v3}.

Now, we note that V can itself be reversibly transformed into a singleton eidostate v. A gas molecule in one of the sub-volumes (eidostate V) can be turned into a molecule in the whole volume (eidostate e) by removing internal partitions between the sub-volumes; and when we re-insert these partitions the particle is once again in just one of sub-volumes. Thus, Vv. To summarize, we have

E+v0e+v. (24)

The uniform eidostate E, taken together with the gas state v0, can be reversibly transformed into the singleton state v+e. We call this a state equivalence for the uniform eidostate E. By choosing the state e properly (say, by letting e=ek for some k), we can also guarantee that the volume v is larger than v0, so that v0v by free expansion.

This discussion motivates our final axiom, which states that this type of reversible transformation is always possible for a uniform eidostate.

Axiom 9.

State equivalence: If E is a uniform eidostate then there exist states e,x,yS such that xy and E+xe+y.

Axiom 9 closes a number of gaps in our theory. For example, our previous axioms (Axioms 1–8) do not by themselves guarantee than any state in S has a nonzero entropy. With the new axiom, however, we can prove that such states exist. The bit state Ib, which is uniform, has some state equivalence given by Ib+xe+y. Since the eidostates on each side of this relation are in ESI, we can determine the entropies on each side. We find that

1+S(x)=S(e)+S(y). (25)

It follows that at least one of the states e,x,y must have S0.

State equivalence also allows us to define the entropy of any uniform eidostate EU. If E+xe+y then we let

S(E)=S(e)+S(y)S(x). (26)

We must first establish that this expression is well-defined. If we have two state equivalences for the same E, so that E+xe+y and E+xe+y, then

(e+y)+xE+(x+x)(e+y)+x, (27)

from which it follows that

S(e)+S(y)S(x)=S(e)+S(y)S(x). (28)

Thus, our definition for S(E) does not depend on our choice of state equivalence for E.

Is S an entropy function on U? It is straightforward to show that S is additive on U and that S(m)=0 for any mechanical state m. It remains to show that, for any E,FU with E,F possible, EF if and only if S(E)S(F). We will use the state equivalences E+xe+y and F+wf+z.

Suppose first that EF. Then, E+(x+w)F+(x+w) and so

(e+y)+w(E+x)+w(F+w)+x(f+z)+x. (29)

From this, it follows that

S(e)+S(y)S(x)S(f)+S(z)S(w), (30)

and hence S(E)S(F).

We can actually extract one more fact from this argument. If we assume that E,F is possible, it must also be true that (e+y)+w,(f+z)+x is a possible singleton process. If we now suppose that S(E)S(F), we know that S((e+y)+w)S((f+z)+x) and thus

E+(x+w)(e+y)+w(f+z)+xF+(x+w). (31)

Therefore, EF, as desired.

We have extended the entropy S to uniform eidostates. It is even easier to extend any component of content function Q to these states. If EU, then any e1,e2E must have Q(e1)=Q(e2), since e1e2. Thus, we can define Q(E)=Q(ek) for any ekE. This is additive because the elements of E+F are combinations ek+fj of states in E and F. Furthermore, suppose we have a state equivalence E+xe+y. Since we assume xy in a state equivalence, Q(x)=Q(y). By the conditional process axiom (Axiom 4) we know that ek+xe+y for any ekE. It follows that Q(e)=Q(ek)=Q(E).

Now, let E1 and E2 be uniform eidostates with state equivalences Ek+xkek+yk. Suppose further that Q(E1)=Q(E2) for every component of content Q. We know that Q(x1)=Q(y1), Q(x2)=Q(y2) and Q(e1)=Q(e2) for every component of content. Thus

E1+(x1+x2)(e1+y1)+x2(e2+y2)+x1E2+(x1+x2). (32)

It follows that E1E2, i.e., that E1,E2 is a possible eidostate process.

We have therefore extended Theorem 8 to all uniform eidostates. We state the new result here.

Theorem 9 (Uniform eidostate thermodynamics).

There exist an entropy function S and a set of components of content Q on U with the following properties:

  • (a) 

    For any E,FU, S(E+F)=S(E)+S(F).

  • (b) 

    For any E,FU and component of content Q, Q(E+F)=Q(E)+Q(F).

  • (c) 

    For any E,FU, EF if and only if S(E)S(F) and Q(E)=Q(F) for every component of content Q.

  • (d) 

    S(m)=0 for all mM.

The set U of uniform eidostates includes the singleton states in S, the information states in I, all combinations of these, and perhaps many other states as well. (Non-uniform eidostates in E might exist, as we will see in the model we discuss in Section 12, but their existence cannot be proved from our axioms.) The type of every process involving uniform eidostates can be determined by a single entropy function (which must not decrease) and a set of components of content (which must be conserved).

10. Entropy for Uniform Eidostates

We have extended the entropy function from singleton states and information states to all uniform eidostates. It turns out that this extension is unique. The following theorem and its corollaries actually allow us to compute the entropy of any EU from the entropies of the states contained in E.

Theorem 10.

Suppose EU is a disjoint union of uniform eidostates E1 and E2. Then

S(E)=S(E1E2)=log2S(E1)+2S(E2). (33)

Proof. 

E and Ek (k=1,2) all have equal components of content. Define

Δ(E1,E2)=S(E)log2S(E1)+2S(E2). (34)

Note that, if we replace Ek by Ek=Ek+I for II, then these new eidostates are still disjoint and

E=E1E2=E1+IE2+I=E+I. (35)

These eidostates have the same components of content as the original E. Furthermore,

Δ(E1,E2)=Δ(E1+I,E2+I)=S(E+I)log2S(E1+I)+2S(E2+I)=S(E)+log#(I)log2log#(I)2S(E1)+2S(E2)=Δ(E1,E2). (36)

We can find a uniform eidostate E0 with the same components of content such that S0=S(E0) is less than or equal to S(E1), S(E2) and S(E). (It suffices to pick E0 to be the state of smallest entropy among E1, E2 and E.) Then, there exist integers mk1 such that

  • There are disjoint information states Jk containing mk record states.

  • There are disjoint information states Jk* containing mk+1 record states.

  • J=J1J2 and J*=J1*J2* have m1+m2 and m1+m2+2 record states, respectively.

  • We have
    S0+logmkS(Ek)<S0+log(mk+1). (37)
    That is, we choose mk so that S(Ek)S00 is between log(mk) and log(mk+1). To put it more simply, mk=2S(Ek)S0.

Once we have mk, we can write that

2S0·mk2S(Ek)<2S0·(mk+1). (38)

Adding these inequalities for k=1,2 and taking the logarithm yields

S0+log(m1+m2)log2S(E1)+2S(E2)<S0+log(m1+m2+2). (39)

How far apart are the two ends of this chain of inequalities? Here, is a useful fact about base-2 logarithms: If n1, then log(n+2)<log(n)+2/n. This implies

S0+log(m1+m2)log2S(E1)+2S(E2)<S0+log(m1+m2)+2m1+m2. (40)

The two ends of the inequality differ by less than 2/(m1+m2).

We can get another chain of inequalities by applying Axiom 4 about conditional processes. Since all of our uniform eidostates have the same components of content, we know that

E0+JkEkE0+Jk*. (41)

From the axiom, we can therefore say

E0+JEE0+J* (42)

which implies that

S0+log(m1+m2)S(E)S0+log(m1+m2+2)<S0+log(m1+m2)+2m1+m2. (43)

We have two quantities that lie in the same interval. Their separation is therefore bounded by the interval width—i.e., less than 2/(m1+m2). Therefore,

Δ(E1,E2)=S(E)log2S(E1)+2S(E2)<2m1+m2. (44)

How big are the numbers mk? We can make such numbers as large as we like by considering instead the eidostates Ek=Ek+I, where I is an information state. As we have seen, Δ(E1,E2)=Δ(E1,E2). Given any ϵ>0, we can choose I so that

mk=2S(Ek)S0=#(I)·2S(Ek)S0>1ϵ. (45)

Then, 2/(m1+m2)<ϵ, and so

Δ(E1,E2)=Δ(E1,E2)<ϵ. (46)

Since this is true for any ϵ>0, we must have Δ(E1,E2)=0, and so

S(E)=log2S(E1)+2S(E2), (47)

as desired. ☐

Theorem 10 has a corollary, which we obtain by applying the theorem inductively:

Theorem 11.

If E is a uniform eidostate,

S(E)=logekE2S(ek). (48)

The entropy of any uniform eidostate is a straightforward function of the entropies of the states contained therein. If E contains more than one state, we notice that S(E)>S(ek) for any ekE. It follows that ek,E is a natural irreversible process.

To take a simple example of Theorem 11, consider the entropy of an information state. Every record state r has S(r)=0. Thus, for II,

S(I)=logrkI20=log#(I), (49)

as we have already seen.

11. Probability

An eidostate in E represents a state of knowledge of a thermodynamic agent. It is, as we have said, a simple list of possible states, without any assignment of probabilities to them. If the agent is to use probabilistic reasoning, then it needs to assign conditional probabilities of the form P(A|B) where A,BE.

The entropy formula in Theorem 11 allows us to make such an assignment based on the entropy itself, provided the eidostate conditioned upon is uniform. Suppose EU and let aS. Then, the entropic probability of a conditioned on E is

P(a|E)=2S(a)2S(E)=2S(a)S(E)aE0aE. (50)

Clearly, 0P(a|E)1 and aP(a|E)=1. It is worth noting that, although E is “uniform” (in the sense that all aE have exactly the same conserved components of content), the probability distribution P(a|E) is not uniform, but assigns a higher probability to states of higher entropy.

We can generalize entropic probabilities a bit further. Let A be any subset of S (be it an eidostate or not) and EU. Then, AE is either a uniform eidostate or the empty set ∅. If we formally assign S()=, then both E and AE have well-defined entropies. Then, we define

p(A|E)=aAP(a|E)=2S(AE)2S(E)=2S(AE)S(E). (51)

Obviously, P(E|E)=1. Now, consider two disjoint sets A and B along with EU. The set (AB)E is a disjoint union of uniform eidostates (or empty sets) AE and BE. Thus,

P(AB|E)=2S((AB)E)2S(E)=2S(AE)+2S(BE)2S(E)=P(A|E)+P(B|E), (52)

in accordance with the rules of probability. We also have the usual rule for conditional probabilities. Suppose A,BS and EU such that AE. Then

P(B|AE)=2S(B(AE))2S(AE)=P(BA|E)P(A|E). (53)

The entropy formula in Theorem 11 and the probability assignment in Equation (50) call to mind familiar ideas from statistical mechanics. According to Boltzmann’s formula, the entropy is S=logΩ, where Ω is (depending on the context) the number (or phase space volume or Hilbert space dimension) of the microstates consistent with macroscopic data about a system. In the microcanonical ensemble, a uniform probability distribution is assigned to these microstates. In an eidostate E comprising non-overlapping macrostates states {e1,e2,}, we would therefore expect Ω(E)=Ω(e1)+Ω(e2)+, and the probability of state ek should be proportional to Ω(ek). However, in our axiomatic system Theorem 11 and Equation (50) do not arise from any assumptions about microstates, but solely from the “phenomenological” → relation among eidostates in E.

Even though the entropy function S is not unique, the entropic probability assignment is unique. Suppose S1 and S2 are two entropy functions for the same states and processes. Then, as we have seen, the difference S1S2 is a component of content. All of the states within a uniform eidostate E (as well as E itself) have the same values for all components of content. That is, S1(a)S2(a)=S1(E)S2(E) for any aE. Thus,

P1(a|E)=2S1(a)S1(E)=2S2(a)S2(E)=P2(a|E). (54)

(Both probabilities are zero for aE, of course.)

The entropic probability assignment is not the only possible probability assignment, but it does have a number of remarkable properties. For instance, suppose E and F are two uniform eidostates. If we prepare them independently, we have the combined eidostate E+F and the probability of some particular state x+y is

P(x+y|E+F)=2S(x)+S(y)2S(E)+S(F)=P(x|E)P(y|F), (55)

as we would expect for independent events.

The entropic probability also yields an elegant expression for the entropy S(E) of a uniform eidostate E.

Theorem 12.

Suppose E is a uniform eidostate, and P(a|E) is the entropic probability for state aE. Then

S(E)=S(a)+H(P), (56)

where S(a) is the average state entropy in E and H(P) is the Shannon entropy of the P(a|E) distribution.

Proof. 

We first note that, for any aE, logP(a|E)=S(a)S(E). We rewrite this as S(E)=S(a)logP(a|E) and take the mean value with respect to the P(a|E) probabilities:

S(E)=S(E)=aEP(a|E)S(a)logP(a|E)=aEP(a|E)S(a)aEP(a|E)logP(a|E), (57)

Therefore, S(E)=S(a)+H(P), as desired. ☐

We previously said that the list of possible states in an eidostate represents a kind of information. Theorem 12 puts this intuition on a quantitative footing. The entropy of a uniform eidostate E can be decomposed into two parts: the average entropy of the states, and an additional term representing the information contained in the distinction among the possible states. For a singleton state, the entropy is all of the first sort. For a pure information state II, it is all of the second.

In fact, the decomposition itself uniquely picks out the entropic probability assignment. Suppose P(a|E) is the entropic probability of a given E, and P(a|E) is some other probability distribution over states in E. By Gibbs’s inequality [17],

0aEP(a|E)logP(a|E)P(a|E) (58)

with equality if and only if P(a|E)=P(a|E) for all aE. We find that

0aEP(a|E)logP(a|E)2S(E)2S(a)=aEP(a|E)logP(a|E)+S(E)aEP(a|E)S(a) (59)

and so

S(E)S(a)P+H(P), (60)

with equality if and only if the P distribution is the entropic one.

An agent that employs entropic probabilities will regard the entropy of a uniform eidostate E as the sum of two parts, one the average entropy of the possible states and the other the Shannon entropy of the distribution. For an agent that employs some other probability distribution, things will not be so simple. Besides the average state entropy and the Shannon entropy of the distribution, S(E) will include an extra, otherwise unexplained term. Thus, for a given collection of eidostates connected by the → relation, the entropic probability provides a uniquely simple account of the entropy of any uniform eidostate.

It is in this sense we say that the entropic probability “emerges” from the entropy function S, just as that function itself emerges from the → relation among eidostates.

Some further remarks about probability are in order. Every formal basis for probability emphasizes a distinct idea about it. In Kolmogorov’s axioms [18], probability is simply a measure on a sample space. High-measure subsets are more probable. In the Bayesian approach of Cox [19], probability is a rational measure of confidence in a proposition. Propositions in which a rational agent is more confident are also more probable. Laplace’s early discussion [20] is based on symmetry. Symmetrically equivalent events—two different orderings of a shuffled deck of cards, for instance—are equally probable. (Zurek [21] has used a similar principle of “envariance” to discuss the origin of quantum probabilities.) In algorithmic information theory [16], the algorithmic probability of a bit string is related to its complexity. Simpler bit strings are more probable.

In a similar way, entropic probabilities express facts about state transformations. In a uniform eidostate E, any two states a,bE are related by a possible process. If ab, then the output state is at least as probable as the input state: P(a|E)P(b|E).

12. A Model for the Axioms

A model for an axiomatic system may serve several purposes. The existence of a model establishes that the axioms are self-consistent. A model may also demonstrate that the system can describe an actual realistic physical situation. If the axioms have a variety of interesting models, then the axiomatic theory is widely applicable. We may almost say that the entire significance of an axiomatic system lies in the range of models for that system.

Terms that are undefined in an abstract axiomatic system are defined within a model as particular mathematical structures. The axioms of the system are provable properties of those structures. Therefore, a model for axiomatic information thermodynamics must include several elements:

  • A set S of states and a collection E of finite nonempty subsets of S to be designated as eidostates.

  • A rule for interpreting the combination of states (+) in S.

  • A relation → on E.

  • A designated set MS of mechanical states (which might be empty).

  • Proofs of Axioms 1–9 within the model, including the general properties of →, the existence of record states and information states, etc.

The model will therefore involve specific meanings for S, E, +, → and so forth. It will also yield interpretations of derived concepts and results, such as entropy functions and conserved components of content. In the abstract theory, the combination a+b is simply the Cartesian pair (a,b). This definition may suffice for the model, or the model may have a different interpretation of +. In any case, it must be true in the model that a+b=a+b implies a=a and b=b.

Our first model for axiomatic thermodynamics is based on a set A of “atomic” states, from which all states in S and all eidostates in E are constructed. We assign entropies and components of content to these states, which extend to composite states by additivity. Let us consider a simple but non-trivial example that has just one component of content Q. A suitable set A of atomic states is shown in Figure 5. It includes a special state r (with S(r)=0 and Q(r)=0) and a continuous set of states sλ with Q(sλ)=1 and S(sλ)=λ. The parameter λ ranges over the closed interval [0,1].

Figure 5.

Figure 5

Atomic states in our simple “macrostate” model.

The set of states S includes everything that can be constructed from A by finite application of the pairing operation. In this way, we can build up aS with any non-negative integer value of the component of content Q(a) and any entropy value 0S(a)Q(a). Indeed, there will typically be many different ways to create given (Q,S) values. To obtain Q(a)=2 and S(a)=1, for example, we might have a=s1+s0,s0+s1,(s1/2+s1/2)+r,.

Anticipating somewhat, we call an eidostate uniform if all of its elements have the same Q-value. We calculate the entropy of a uniform eidostate by applying Theorem 11 to it.

Our model allows any finite nonempty set of states to play the role of an eidostate. Hence, we have both uniform and non-uniform eidostates in E. Each eidostate A has a finite Cartesian factorization

A=FA(E1,,En). (61)

If A is uniform, then all of its factors are also uniform. If none of its factors are uniform, we say that A is completely non-uniform. More generally, we can write down an NU-decomposition for any eidostate A:

ANA+UA, (62)

where NA is a completely non-uniform eidostate and UA is a uniform eidostate. Of course, if A itself is either completely non-uniform or uniform, one or the other of these eidostates may be absent from the decomposition. The NU-decomposition is unique up to similarity: If NA+UANA+UA for completely non-uniform Ns and uniform Us, then NANA and UAUA.

We can now define the → relation on E in our model. If A,BE, we first write down NU-decompositions ANA+UA and BNB+UB. We say that AB provided three conditions hold:

  1. Either NA and NB both do not exist, or NANB.

  2. Either UA and UB both do not exist, or only one exists and its Q-value is 0, or both exist and Q(UA)=Q(UB).

  3. Either UA and UB both do not exist, or only UA exists and S(UA)=0, or only UB exists and S(UB)0, or both exist and S(UA)S(UB).

We may call these the N-criterion, Q-criterion, and S-criterion, and summarize their meaning as follows: AB provided we can transform A to B by: (a) rearranging the non-uniform factors; and (b) transforming the uniform factors in a way that conserves Q and does not decrease S.

Now, let us examine each of the axioms in turn.

  • Axiom 1

    The basic properties of eidostates follow by construction.

  • Axiom 2

    Part (a) holds because AB implies that A and B can have the same NU-decomposition. Part (b) holds because similarity, equality (for Q) and inequality (for S) are all transitive. Parts (c) and (d) make use of the general facts that NA+BNA+NB and UA+BUA+UB.

  • Axiom 3

    If NANB, then AB. If NANB, then it must be true that UBUA, and so S(UA)>S(UB). The S-criterion fails, so AB in this case as well.

  • Axiom 4

    For Part (a), we note that A must be uniform, and so AA is also uniform. The statement follows from the S-criterion. Part (b) also follows from the S-criterion.

  • Axiom 5

    The atomic state r with Q(r)=0 and S(r)=0 is a record state, as is r+r, etc. We can take our bit state to be Ib={r,r+r}. Since every information state II is uniform with Q(I)=0, every information process (including Θb=r,Ib) is possible.

  • Axiom 6

    Since all of the states of the form a+I are uniform, the statements in this axiom follow from the S-criterion.

  • Axiom 7

    Suppose nAnB+J. Since J is uniform, it must be that nNAnNB, from which it follows that NANB. The S-criterion for UA and UB follows from a typical stability argument—that is, if nxny+z for arbitrarily large values of n, then it must be true xy.

  • Axiom 8

    The set M of mechanical states may be defined to include all states that can be constructed from the zero-entropy atomic state s0 (s0+s0, s0+(s0+s0), etc.). The required properties of M follow.

  • Axiom 9

    The uniform eidostate E has Q(E)=q0 and S(E)=σ0. Choose an integer n>σ. Now, let e=qs0 (or e=r if q=0), x=ns0, and y=nsλ where λ=σ/n. We find that Q(e)=q, Q(x)=Q(y)=n, S(e)=S(x)=0 and S(y)=n(σ/n)=σ. It follows that xy and E+xe+y.

This model based on a simple set of atomic states has several sophisticated characteristics, including a non-trivial component of content Q and possible processes involving non-uniform eidostates. It is not difficult to create models of this type that are even richer and more complex. However, it may be objected that this type of model obscures one of the key features of axiomatic information thermodynamics. Here, the entropy function S does not emerge from the → relation among eidostates, but instead is imposed by hand to define → within the model. We address this deficiency in our next model.

13. A Simple Quantum Model

Now, we present a model for the axioms in which the entropy function does emerge from the underlying structure. The model is a simple one without mechanical states or non-trivial components of content. Every eidostate is uniform and every process is possible. On the other hand, the model is based on quantum mechanics, and so is not devoid of features of interest.

Consider an agent A that can act upon an external qubit system Q having Hilbert space Q. Based on the information the agent possesses, it may assign the states ψ1 or ψ2 to the qubit. These state vectors need not be orthogonal. That is, it may be that no measurement of Q can perfectly distinguish which of the two states is actually present. The states, however, correspond to states of knowledge of agent A, and the agent is able to perform different operations on Q depending on whether it judges the qubit to be in one state or the other. Our notion of information possessed by the agent is thus similar to Zurek’s concept of actionable information [22]. Roughly speaking, information is actionable if it can be used as a control variable for conditional unitary dynamics. This means that the two states of the agent’s memory (μ1 and μ2 in a Hilbert space A) must be distinguishable. Hence, if we include the agent in our description of the entire system, the states μ1ψ1 and μ2ψ2 are orthogonal, even if ψ1 and ψ2 are not.

Our model for axiomatic information thermodynamics envisions a world consisting of an agent A and an unbounded number of external qubits. Nothing essential in our model would be altered if the external systems had dimQ=d —“qudits” instead of qubits. The thermodynamic states of the qubit systems are actually states of knowledge of the agent, and so we must include the corresponding state of the agent’s memory in our physical description. The quantum state space for our model is of the form:

H=AQQ (63)

To be a bit more rigorous, we restrict H to vectors of the form Ψ00, where ΨAQn for some finite n, and 0 is a designated “zero” ket in Q. Physical states in H have ΨΨ=1. (The space H is not quite a Hilbert space, since it is not topologically complete, but this mathematical nicety will not affect our discussion.)

Since the Qs are qubits, dimQ=2. The agent space A, however, must be infinite-dimensional, so that it contains a countably infinite set of orthogonal quantum states. These are to be identified as distinct records of the agent’s memory.

In our thermodynamic model, the elements of S (the thermodynamic “states”) are projection operators on H. For any aS, we have a projection on H of the form

Πa=aaπa00 (64)

where a is an agent state in A and πa is a non-null projection in Qn for some finite n1. The value of n is determined by a specified integer function L(a), which we call the length of the state a. Heuristically, the thermodynamic state a means that the state of the world lies in the subspace Sa onto which Πa projects. The agent’s memory is in the state a and the the quantum state of the first L(a) external qubits lies somewhere in the subspace onto which πa projects. (All of the subsequent qubits are in the state 0.)

Two distinct thermodynamic states correspond to orthogonal states of the agent’s memory. If a,bS with ab, then ab=0. The projections Πa and Πb are orthogonal to each other (so that ΠaΠb=0). However, it need not be the case that πa and πb are orthogonal.

Given a, the projection Πa projects onto the subspace Sa The dimension of this subspace is da=dimSa=TrΠa=Trπa. Note that da2L(a). We will assume that there are Sa subspaces of every finite dimension: For any integer n1, there exists aS with da=n.

Suppose a,bS correspond to Πa=aaπa and Πb=bbπb. Then, we will specify that the combined state a+b corresponds to

Πa+b=a+ba+bπaπb. (65)

Since the state a+b entails a distinct state of the agent’s knowledge, the agent state vector a+b is orthogonal to both a and b. We also note that L(a+b)=L(a)+L(b).

Here is a clarifying example. Suppose a,b,cS. Then,

Π(a+b)+c=(a+b)+c(a+b)+cπaπbπc (66)
Πa+(b+c)=a+(b+c)a+(b+c)πaπbπc (67)

are distinct thermodynamic states and hence orthogonal projections in H, even though they correspond to exactly the same qubit states. The difference between (a+b)+c and a+(b+c) entirely lies in the distinct representations of the states in the agent’s memory.

The eidostates in our model are the finite nonempty collections of states in S. We can associate each eidostate with a projection operator as well. Let E={a,} be an eidostate. We define

ΠE=aEΠa=aEaaπa. (68)

This is a projection operator because the Πa projections are orthogonal to one another. ΠE projects onto a subspace SE, which is the linear span of the collection of subspaces {Sa,}.

Interestingly, this subspace SE might contain quantum states in which the agent A is entangled with one or more external qubits. Suppose a,bS are associated with single-qubit projections onto distinct states ψa and ψb. The eidostate E={a,b} is associated with the projection

ΠE=aaψaψa+bbψbψb, (69)

which projects onto a subspace SE that contains the quantum state

Ψ=12aψa+bψb. (70)

In this state, the agent does not have a definite memory record state. However, if a measurement is performed on the agent (perhaps by asking it a question), then the resulting memory record a or b would certainly be found to be consistent with the state of the qubit system, ψa or ψb.

Suppose we combine two eidostates A={a,} and B={b,}. Then, the quantum state lies in a subspace of dimension

dA+B=TrΠA+B=Tra,bΠa+b=a,bTra+ba+bπaπb=a,bTrπaTrπb=aTrπabTrπb=dA·dB. (71)

When eidostates combine, subspace dimension is multiplicative.

It remains to define the → relation in our quantum model. We say that AB if there exists a unitary time evolution operator U on H such that ΨSA implies that UΨSB. That is, every quantum state consistent with A evolves to one consistent with B under the time evolution U. (Note that the evolution includes a suitable updating of the agent’s own memory state.) This requirement is easily expressed as a subspace dimension criterion: AB if and only if dAdB.

We are now ready to verify our axioms.

  • Axiom 1

    This follows from our construction of the eidostates E.

  • Axiom 2

    All of these basic properties of the → relation follow from the subspace dimension criterion.

  • Axiom 3

    If BA, then dA>dB, and so AB.

  • Axiom 4

    Again, both parts of this axiom follow from the subspace dimension criterion. If eidostate A is a disjoint union of eidostates A1 and A2, then dA=dA1+dA2.

  • Axiom 5

    Any state r with dr=1 functions as a record state. We have assumed that such a state exists. We can take Ib={r,r+r}. The bit process Θb=r,Ib is natural (rIb) by the subspace dimension criterion. Notice that, for any information state I, dI=#(I).

  • Axiom 6

    For any bS and II, we have db+I=db·#(I). For Part (a), we can always find a large enough information state so that dbda·#(I). For Part (b), either dadb+I or db+Ida.

  • Axiom 7

    If (dA)n(dB)n·#(J) for arbitrarily large values of n, then dAdB.

  • Axiom 8

    It is consistent to take M=.

  • Axiom 9

    All of our eidostates are uniform. For any eidostate E, we can choose e so that de=dE. (Recall that we have assumed states with every positive subspace dimension.) If we chose x=y to be any state, then E+xe+y.

Our quantum mechanical model is therefore a model of the axioms of information thermodynamics. It is a relatively simple model, of course, having no non-trivial conserved components of content and no mechanical states.

In the quantum model, the entropy of any eidostate is simply the logarithm of the dimension of the corresponding subspace: S(E)=logdE. This is the von Neumann entropy of a uniform density operator ρE=1dEΠE. We can, in fact, recast our entire discussion in terms of these mixed states, and this approach does yield some insights. For example, we find that the density operator ρE for eidostate E is a mixture of the density operators for its constituent states:

ρE=aEP(a|E)ρa, (72)

where P(a|E) is the entropic probability

P(a|E)=2S(a)2S(E)=dadE. (73)

There are, of course, many quantum states of the agent and its qubit world that do not lie within any eidostate subspace SE. For example, consider a state associated with a pure state projection Πa=aaψaψa. Let χ be a state orthogonal to ψa. Then, aχ is a perfectly legitimate quantum state that is orthogonal to Sa and every other eidostate subspace. This state represents a situation in which the agent’s memory record indicates that the first L(a) external qubits are in state ψa, but the agent is wrong.

The exclusion of such physically possible but incongruous quantum states tells us something significant about our theory of axiomatic information thermodynamics. The set E does not necessarily include all possible physical situations; the arrow relations → between eidostates do not necessarily represent all possible time evolutions. Our axiomatic system is simply a theory of what transformations are possible among a collection of allowable states. In this, it is similar to ordinary classical thermodynamics, which is designed to consider processes that begin and end with states in internal thermodynamic equilibrium.

14. Remarks

The emergence of the entropy S, a state function that determines the irreversibility of processes, is a key benchmark for any axiomatic system of thermodynamics. Our axiomatic system does not yield a unique entropy on S, since it is based on the extension of an irreversibility function to impossible processes. However, many of our results and formulas for entropy are uniquely determined by our axioms. The entropy measure for information states, the Hartley–Shannon entropy log#(I), is unique up to the choice of logarithm base. This in turn uniquely determines the irreversibility function on possible singleton processes, since this is defined in terms of the creation and erasure of bit states. There is a unique relationship between the entropy of a uniform eidostate and the entropy of the possible states it contains. Finally, the entropic probability distribution on a uniform eidostate, which might appear at first to depend on the singleton state entropy, is nonetheless unique. It remains to be seen how the axiomatic system developed here for state transformations is related to the axiomatic system, similar in some respects, given by Knuth and Skilling for considering problems of inference [23]. There, symmetry axioms in a lattice of states give rise to probability and entropy measures. In a similar way, the “entropy first, probability after” idea presented here is reminiscent of Caticha’s “entropic inference” [24], in which the probabilistic Bayes rule is derived (along with the maximum entropy method) from a single relative entropy functional.

The entropy in information thermodynamics has two aspects: a measure of the information in a pure information state and an irreversibility measure for a singleton eidostate (the kind most closely analogous to a conventional thermodynamic state). The most general expression for the entropy of a uniform eidostate in Theorem 12 exhibits this twofold character. However, the two aspects of entropy are not really distinct in our axiomatic system. Both are based on the structure of the → relation among eidostates, which tells how one eidostate may be transformed into another by processes that may include demons.

The connection between information and thermodynamic entropy is nowhere more clearly stated than in Landauer’s principle, the minimum thermodynamic cost of information erasure. It is easy to state a theorem of our system corresponding to Landauer’s principle. Suppose a,bS and Ib is a bit state. If a+Ibb, then it follows that S(b)S(a)+1. Erasing a bit state is necessarily accompanied by an increase of at least one unit in the thermodynamic entropy. More generally, suppose A and B are uniform eidostates with AB. If we use Theorem 12 to write the entropies of the two states as

S(A)=SA+HAandS(B)=SB+HB, (74)

then it follows that ΔSΔH for the process taking A to B. In other words, any decrease of the “Shannon information” part of the eidostate entropy (ΔH) must be accompanied by an increase, on average, in the thermodynamic entropy of the possible states (ΔS).

These theorems do not really constitute a “proof” of Landauer’s principle, because they depend upon the physical applicability of our axioms. On the other hand, our axiomatic framework does obviate some of the objections that have been raised to existing derivations of Landauer’s principle. Norton [25], for example, has argued that many “proofs” of the principle improperly mix together two different kinds of probability distribution, the microstate distributions associated with thermodynamic equilibrium states and the probability assignments to memory records that represent information. He calls this the problem of “illicit ensembles”. Our approach, by contrast, is not based on concepts of probability, except for those that emerge naturally from the structure of the → relation. We do not resolve thermodynamic states into their microstates at all, and we represent information by a simple enumeration of possible memory records.

The Second Law of Thermodynamics, the law of non-decrease of entropy, is the canonical example of the “arrow of time” in physics. Time asymmetry in our axiomatic theory is found in the direction of the → relation. It is an enlightening exercise to consider the axioms of information thermodynamics with the arrows reversed. Some axioms (e.g., Axioms 1 and 2) are actually unchanged by this. Others may be modified but still remain true statements in the theory. A few become false. The most striking example of the last is Axiom 3, which states that no eidostate may be transformed into a proper subset of itself. That is, no process can deterministically delete one of a list of possible states. This is the principle that leads to irreversbility in information processes, and through them, to more general irreversibility and entropy measures.

Acknowledgments

The authors are pleased to acknowledge many useful conversations with, and suggestions from, Rob Spekkens, Jonathan Oppenheim, David Jennings, Charles Bennett, John Norton, Graham Reid, Avery Tishue, Ian George, Aixin Li, and Tiancheng Zheng. AH was supported by the Kenyon Summer Science Scholar program in the summer of 2016. BWS and MDW gratefully acknowledge funding from the Foundational Questions Institute (FQXi), via grant FQXi-RFP-1517.

Appendix A

This appendix provides a convenient summary of axioms of our theory.

First, we remind ourselves of a few essential definitions:

An eidostate is a set whose elements are called states. The collection of eidostates is E and the collection of states is S. An element aS may be identified with the singleton eidostate {a}E.

Eidostates are combined by the Cartesian product, which we denote by the symbol +. When we combine an eidostate with itself n times, we use nA to denote A+(A+(A+)). Two eidostates are similar (written AB) if they are made up of the same Cartesian factors, perhaps combined in a different way.

There is a relation → on E, and thus also on the singletons in S. An eidostate A is uniform if, for all a,bA, either ab or ba. A formal process is a pair of eidostates A,B. We say that a process A,B is possible if either AB or BA.

A record state r is a state for which there exists another state a such that aa+r and a+ra (denoted aa+r). An information state is an eidostate containing only record states, and the set of information states is called I. A bit stateIb is an information state containing exactly two distinct record states. A bit process is a formal process r,Ib, where r is a record state and Ib is a bit state.

Now, we may state our axioms:

  • Axiom 1
    (Eidostates.) E is a collection of sets called eidostates such that:
    • (a) 
      Every AE is a finite nonempty set with a finite prime Cartesian factorization.
    • (b) 
      A+BE if and only if A,BE.
    • (c) 
      Every nonempty subset of an eidostate is also an eidostate.
  • Axiom 2
    (Processes.) Let eidostates A,B,CE, and sS.
    • (a) 
      If AB, then AB.
    • (b) 
      If AB and BC, then AC.
    • (c) 
      If AB, then A+CB+C.
    • (d) 
      If A+sB+s, then AB.
  • Axiom 3

    If A,BE and B is a proper subset of A, then AB.

  • Axiom 4
    (Conditional processes.)
    • (a) 
      Suppose A,AE and bS. If Ab and AA then Ab.
    • (b) 
      Suppose A and B are uniform eidostates that are each disjoint unions of eidostates: A=A1A2 and B=B1B2. If A1B1 and A2B2 then AB.
  • Axiom 5

    (Information.) There exist a bit state and a possible bit process.

  • Axiom 6
    (Demons.) Suppose a,bS and JI such that ab+J.
    • (a) 
      There exists II such that ba+I.
    • (b) 
      For any II, either ab+I or b+Ia.
  • Axiom 7

    (Stability.) Suppose A,BE and JI. If nAnB+J for arbitrarily large values of n, then AB.

  • Axiom 8
    (Mechanical states.) There exists a subset MS of mechanical states such that:
    • (a) 
      If l,mM, then l+mM.
    • (b) 
      For l,mM, if lm then ml.
  • Axiom 9

    (State equivalence.) If E is a uniform eidostate then there exist states e,x,yS such that xy and E+xe+y.

Author Contributions

This work began as a student research project by A.H. under the supervision of B.S., with M.D.W. acting as a consultant. That project established many of the basic concepts, axioms and proofs (particularly the elegant formulation of Axiom 3). B.S. and M.D.W. refined and extended the axiomatic system and wrote the present paper. All three authors have contributed editorial input.

Conflicts of Interest

The authors declare no conflict of interest.

References

  • 1.Giles R. Mathematical Foundations of Thermodynamics. Pergamon Press Ltd.; Oxford, UK: 1964. [Google Scholar]
  • 2.Lieb E.H., Yngvason J. The physics and mathematics of the second law of thermodynamics. Phys. Rep. 1999;310:1–96. doi: 10.1016/S0370-1573(98)00082-9. [DOI] [Google Scholar]
  • 3.Lieb E.H., Yngvason J. A guide to entropy and the second law of thermodynamics. Not. Am. Math. Soc. 1998;45:571–581. [Google Scholar]
  • 4.Thess A. The Entropy Principle: Thermodynamics for the Unsatisfied. Springer; Berlin, Germany: 1964. [Google Scholar]
  • 5.Vedral V., Kashefi E. Uniqueness of the entanglement measure for bipartite pure states. Phys. Rev. Lett. 2002;89:037903. doi: 10.1103/PhysRevLett.89.037903. [DOI] [PubMed] [Google Scholar]
  • 6.Leff H.S., Rex A.F. Maxwell’s Demon 2: Entropy, Classical and Quantum Information, Computing. IOP Publishing Ltd.; Bristol, UK: 2003. [Google Scholar]
  • 7.Szilard L. On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings. Z. Phys. 1929;53:840–856. doi: 10.1007/BF01341281. (In German) [DOI] [PubMed] [Google Scholar]
  • 8.Bennett C.H. The thermodynamics of computation—A review. Int. J. Theor. Phys. 1982;21:905–940. doi: 10.1007/BF02084158. [DOI] [Google Scholar]
  • 9.Landauer R. Irreversibility and heat generation in the computing process. IBM J. Res. Dev. 1961;5:183–191. doi: 10.1147/rd.53.0183. [DOI] [Google Scholar]
  • 10.Jaynes E.T. Information theory and statistical mechanics. Phys. Rev. 1957;106:620–630. doi: 10.1103/PhysRev.106.620. [DOI] [Google Scholar]
  • 11.Jaynes E.T. Information theory and statistical mechanics II. Phys. Rev. 1957;108:171–190. doi: 10.1103/PhysRev.108.171. [DOI] [Google Scholar]
  • 12.Shannon C.E. A mathematical theory of communication. Bell Sys. Tech. J. 1948;27:379–423, 623–656. doi: 10.1002/j.1538-7305.1948.tb01338.x. [DOI] [Google Scholar]
  • 13.Weilenmann M., Kraemer L., Faist P., Renner R. Axiomatic relation between thermodynamic and information-theoretic entropies. Phys. Rev. Lett. 2016;117:260601. doi: 10.1103/PhysRevLett.117.260601. [DOI] [PubMed] [Google Scholar]
  • 14.Gour G., Müller M.P., Narasimhachar V., Spekkens R.W., Halpern N.Y. Resource theory of informational nonequilibrium in thermodynamics. Phys. Rep. 2015;583:1–58. doi: 10.1016/j.physrep.2015.04.003. [DOI] [Google Scholar]
  • 15.Del Rio L., Kraemer L., Renner R. Resource theories of knowledge. arXiv. 2015. 1511.08818
  • 16.Cover T.M., Thomas J.A. Elements of Information Theory. 2nd ed. John Wiley and Sons; Hoboken, NJ, USA: 2006. [Google Scholar]
  • 17.MacKay D.J.C. Information Theory, Inference, and Learning Algorithms. Cambridge University Press; Cambridge, UK: 2003. [Google Scholar]
  • 18.Kolmogorov A.N. Grundbegriffe der Wahrscheinlichkeitsrechnung. Springer; Berlin, Germany: 1933. (In German) [Google Scholar]
  • 19.Cox R.T. The Algebra of Probable Inference. Johns Hopkins Press; Baltimore, MD, USA: 1961. [Google Scholar]
  • 20.De Laplace P.S. A Philosophical Essay on Probabilities. Dover; New York, NY, USA: 1951. [Google Scholar]
  • 21.Zurek W.H. Probabilities from entanglement, Born’s rule pk = |ψk|2 from envariance. Phys. Rev. A. 2005;71:052105. doi: 10.1103/PhysRevA.71.052105. [DOI] [Google Scholar]
  • 22.Zurek W.H. Wave-packet collapse and the core quantum postulates: Discreteness of quantum jumps from unitarity, repeatability, and actionable information. Phys. Rev. A. 2013;87:052111. doi: 10.1103/PhysRevA.87.052111. [DOI] [Google Scholar]
  • 23.Knuth K.H., Skilling J. Foundations of inference. Axioms. 2012;1:38–73. doi: 10.3390/axioms1010038. [DOI] [Google Scholar]
  • 24.Caticha A. Entropic inference. AIP Conf. Proc. 2011;1305:20–29. [Google Scholar]
  • 25.Norton J.D. Eaters of the lotus: Landauer’s principle and the return of Maxwell’s demon. Stud. Hist. Phil. Mod. Phys. 2005;36:375–411. doi: 10.1016/j.shpsb.2004.12.002. [DOI] [Google Scholar]

Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)

RESOURCES