ON ENTROPY IN NONEQUILIBRIUM STATISTICAL MECHANICS
PHYSICS
Submitted 1969-01-01 | SovietRxiv: ru-196901.33459 | Translated from Russian

Abstract

This paper examines the definition of entropy in nonequilibrium statistical mechanics, focusing on the relation between Planck, Boltzmann, and Gibbs coarse-grained entropies for an isolated classical system divided into weakly interacting subsystems. It defines coarse-graining by assigning a constant probability density within each macrostate region of phase space and proves inequalities connecting the expectation of Planck entropy, Gibbs entropy, and Boltzmann entropy. The paper further estimates the differences among these quantities, showing that with an appropriate number of coarse-graining cells they are equivalent for describing average nonequilibrium behavior, up to controlled errors. It argues that coarse-grained entropy can retain thermodynamic meaning, whereas fine-structure entropy remains invariant and therefore cannot represent irreversible thermodynamic change.

Full Text

UDC 530:161

PHYSICS

Yu. E. MURAKHVER

ON ENTROPY IN NONEQUILIBRIUM STATISTICAL MECHANICS

(Presented by Academician V. A. Fock on 30 IV 1968)

1. The present work is devoted to the study of the concept of entropy in nonequilibrium statistical mechanics. The problem of irreversibility is not addressed here; the entropy is nevertheless chosen so that it can, in principle, increase, and so that such an increase is quite plausible.

We shall study the connection between the Boltzmann–Planck and Gibbs entropies. The qualitative distinction between the concepts of these authors is discussed in detail in Ehrenfest’s review \((^{1})\), in Krylov’s book \((^{2})\), and in a number of papers by other authors \((^{3,4})\). A quantitative study of the connection between the indicated definitions of entropy was carried out in \((^{3-5})\). In the present article substantially stronger results are obtained (see Sec. 4).

Let us divide our isolated system into \(N\) weakly interacting subsystems, each of which contains \(r\) particles. We regard the subsystems as localized, so that their permutation leads to a new state \((^{6,7})\). The results of the work are also applicable to nonlocalized subsystems; it is only necessary, in all the expressions given below for the entropy, to subtract \(\ln N!\).

A microstate of a classical system is determined by specifying its coordinates \(x\) in \(\Gamma\)-space. A macrostate is determined by a set of occupation numbers \(N_j\) \((j = 1,\ldots,M)\) of the cells of \(\mu\)-space. By \(\mu\)-space we mean the phase space of a subsystem, which is a generalization in comparison with \((^{1})\). Since (see Sec. 3) \(M \sim \sqrt{N}\), the occupation numbers are on the average \(\gg 1\). This makes it possible, in principle, to satisfy the second law of thermodynamics (see Sec. 4) and, by virtue of Theorem 3, does not lead to physically undesirable consequences.

The spatial localization of the subsystems implies their division into parts whose volumes are, in some sense, fixed. Consequently, the number of particles in a subsystem, generally speaking, may change. This, however, does not limit the range of applicability of the results of the work, since the consideration may be carried out in the spirit of the grand canonical Gibbs ensemble, combining in one cell states with different \(N\).

2. To a distribution \((N_1,\ldots,N_M)\) over the cells of \(\mu\)-space there corresponds a region of \(\Gamma\)-space \(Z(N_1,\ldots,N_M)\), called a \(Z\)-star. Its volume is equal to (\(\mu_j\) is the volume of a cell)

\[ \gamma(N_1,\ldots,N_M)=\frac{N!}{N_1!\ldots N_M!}\,\mu_1^{N_1}\ldots \mu_M^{N_M}. \tag{1} \]

Planck proposed the following definition of entropy:

\[ S_P=\ln \Omega(N_1,\ldots,N_M) \tag{2} \]

(\(\Omega\) is the number of microstates realizing the given macrostate), which he called Boltzmann’s (see, for example, \((^{10})\)). Although this name is generally accepted, for convenience we shall call \(S_P\) the Planck entropy.

In the classical case

\[ \Omega(N_1,\ldots,N_M)=\gamma(N_1,\ldots,N_M)/h^{3rN}. \tag{3} \]

Let us note that \(\gamma\) is defined in a way quite different from the phase volume entering Liouville’s theorem. Ignoring this circumstance leads to an apparent contradiction between the second law of thermodynamics and this theorem \((^{8,9})\).

By the Boltzmann entropy we shall mean the quantity

\[ S_B=-\sum_j N_j\ln (N_j/NG_j), \tag{4} \]

where \(G_j=\mu_j/h^{3r}\).
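Definitions (1)–(4) can be evaluated directly for a given set of occupation numbers. The following is a minimal sketch, not part of the original paper: the function names and the sample values of \(N_j\) and \(G_j\) are hypothetical, and \(\ln n!\) is computed with `math.lgamma` to avoid overflow.

```python
import math

def planck_entropy(occupations, G):
    """S_P = ln Omega, where Omega = N!/(N_1!...N_M!) * prod_j G_j**N_j (Eqs. 1-3)."""
    N = sum(occupations)
    s = math.lgamma(N + 1)                      # ln N!
    for n, g in zip(occupations, G):
        s += n * math.log(g) - math.lgamma(n + 1)
    return s

def boltzmann_entropy(occupations, G):
    """S_B = -sum_j N_j ln(N_j / (N G_j)) (Eq. 4); empty cells contribute zero."""
    N = sum(occupations)
    return -sum(n * math.log(n / (N * g))
                for n, g in zip(occupations, G) if n > 0)

# Hypothetical example: N = 100 subsystems over M = 4 cells.
occupations = [40, 30, 20, 10]     # occupation numbers N_j (assumed values)
G = [5.0, 5.0, 5.0, 5.0]           # cell volumes mu_j / h^(3r) (assumed values)
S_P = planck_entropy(occupations, G)
S_B = boltzmann_entropy(occupations, G)
print(S_P, S_B, S_B - S_P)
```

In this example the difference \(S_B-S_P\) is positive and small compared with either entropy, which is the situation the estimates of Sec. 3 quantify.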

\(S_P\) and \(S_B\), by definition, are functions of the coordinates of the system \(x\) in \(\Gamma\)-space, i.e., random variables. Their mathematical expectations are

\[ \overline{S}_P=\int S_P(x)\rho(x,t)\,dx,\qquad \overline{S}_B=\int S_B(x)\rho(x,t)\,dx, \tag{5} \]

where \(\rho(x,t)\) is the probability density satisfying Liouville’s equation.

The Gibbs entropy (coarse-grained entropy) is defined by the formula

\[ S_G=-\int \widetilde{\rho}\ln(\widetilde{\rho}h^{3rN})\,dx; \tag{6} \]

where \(\widetilde{\rho}\) is the coarse-grained probability density introduced by Ehrenfest \((^1)\). Thus, \(S_G\) is a definite function of time and cannot undergo fluctuations.

\(\widetilde{\rho}\) has not previously been defined exactly. It is natural to define it so that it is constant within any \(Z\)-star, namely

\[ \widetilde{\rho}_{\,x\in Z(N_1,\ldots,N_M)} = w(N_1,\ldots,N_M)/\gamma(N_1,\ldots,N_M). \tag{7} \]

Here \(w(N_1,\ldots,N_M)\) denotes the probability of the given distribution over cells, i.e.,

\[ w(N_1,\ldots,N_M)= \int_{Z(N_1,\ldots,N_M)} \rho(x,t)\,dx = \int_{Z(N_1,\ldots,N_M)} \widetilde{\rho}(x,t)\,dx. \tag{8} \]
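The coarse-graining rule (7)–(8) has a simple discrete analogue, sketched below under stated assumptions: the phase space is replaced by a finite set of points of unit volume, the regions are given by a hypothetical labelling function, and all names and numbers are illustrative rather than taken from the paper.

```python
import math
from collections import defaultdict

def coarse_grain(rho, label):
    """Replace rho by its average over each region (discrete analogue of Eq. 7)."""
    w = defaultdict(float)     # probability of each region, as in Eq. (8)
    gamma = defaultdict(int)   # number of points in each region (its "volume")
    for x, p in rho.items():
        w[label(x)] += p
        gamma[label(x)] += 1
    return {x: w[label(x)] / gamma[label(x)] for x in rho}

def gibbs_entropy(rho):
    """-sum rho ln rho over the discrete points."""
    return -sum(p * math.log(p) for p in rho.values() if p > 0)

# Hypothetical toy "phase space": points 0..7, two regions {0..3} and {4..7}.
rho = {0: 0.4, 1: 0.1, 2: 0.05, 3: 0.05, 4: 0.1, 5: 0.1, 6: 0.1, 7: 0.1}
rho_tilde = coarse_grain(rho, lambda x: x // 4)
print(rho_tilde)
print(gibbs_entropy(rho), gibbs_entropy(rho_tilde))
```

The averaged density keeps the probability of each region unchanged, and its entropy is never smaller than that of the original density.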

For the equilibrium case \(S_P\), \(S_B\), and \(S_G\) give the thermodynamic entropy, which is proved in the usual way. In the proof the weakness of the interaction between subsystems is used essentially.

3. Lemma. If \(\widetilde{\rho}\) is expressed in terms of \(\rho\) by formula (7), then

\[ \widetilde{p}(j)=p(j), \tag{9} \]

where \(p(j)\) and \(\widetilde{p}(j)\) are the probabilities that a subsystem taken at random will be in the \(j\)-th cell; here \(p(j)\) is defined in the usual way through \(\rho(x)\), and \(\widetilde{p}(j)\) analogously through \(\widetilde{\rho}(x)\).

Indeed (we denote the coordinates of the \(i\)-th subsystem by \(y_i\); summation is over all distributions among cells):

\[ \widetilde{p}_1(j)= \int_{y_1\in\mu_j}\widetilde{\rho}(x)\,dx = \sum_{(N_1,\ldots,N_M)} \frac{w(N_1,\ldots,N_M)}{\gamma(N_1,\ldots,N_M)} \int_{Z(N_1,\ldots,N_M),\,y_1\in\mu_j} dx = p_1(j). \]
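The lemma can be checked numerically on a small model. The sketch below is an illustration under assumptions not made in the paper: the microstates are reduced to cell assignments \((j_1,\ldots,j_N)\) of equal weight, the toy sizes and the random distribution are hypothetical, and \(p(j)\) is taken as the probability for a subsystem chosen at random.

```python
import itertools
import random
from collections import defaultdict

N, M = 3, 2   # hypothetical toy sizes: 3 subsystems, 2 cells

def occupation(assignment):
    """Occupation numbers (N_1, ..., N_M) of a cell assignment (j_1, ..., j_N)."""
    return tuple(assignment.count(j) for j in range(M))

def coarse_grain(P):
    """Make P constant on every set of assignments with equal occupation numbers."""
    w = defaultdict(float)
    size = defaultdict(int)
    for a, prob in P.items():
        w[occupation(a)] += prob
        size[occupation(a)] += 1
    return {a: w[occupation(a)] / size[occupation(a)] for a in P}

def cell_probabilities(P):
    """Probability that a subsystem taken at random is in cell j."""
    p = [0.0] * M
    for a, prob in P.items():
        for j in a:
            p[j] += prob / N
    return p

random.seed(1)
assignments = list(itertools.product(range(M), repeat=N))
raw = [random.random() for _ in assignments]
total = sum(raw)
P = {a: r / total for a, r in zip(assignments, raw)}   # arbitrary distribution
P_tilde = coarse_grain(P)
p = cell_probabilities(P)
p_tilde = cell_probabilities(P_tilde)
print(p, p_tilde)
```

The two lists agree because the fraction of subsystems in cell \(j\) is the same for every assignment inside one \(Z\)-star, so averaging over the star cannot change it.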

Theorem 1. If \(\widetilde{\rho}\) is expressed in terms of \(\rho\) by formula (7), then

\[ \overline{S}_P\le S_G\le S_B(\overline{N}_1,\ldots,\overline{N}_M); \qquad \overline{S}_P<\overline{S}_B\le S_B(\overline{N}_1,\ldots,\overline{N}_M). \tag{10} \]

By virtue of (3) and (6)–(8) we have

\[ S_G=\overline{S}_P- \sum_{(N_1,\ldots,N_M)} w(N_1,\ldots,N_M)\ln w(N_1,\ldots,N_M), \tag{11} \]

whence, in particular, the inequality \(\overline{S}_P\le S_G\) follows.
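The step leading to (11) can be written out as follows. This is a reconstruction from definitions (2), (3), and (5)–(8), not a quotation from the paper; \(w\), \(\gamma\), and \(\Omega\) are abbreviations for \(w(N_1,\ldots,N_M)\), \(\gamma(N_1,\ldots,N_M)\), and \(\Omega(N_1,\ldots,N_M)\).

```latex
\begin{align*}
S_G &= -\sum_{(N_1,\ldots,N_M)} \int_{Z(N_1,\ldots,N_M)}
        \frac{w}{\gamma}\,\ln\frac{w\,h^{3rN}}{\gamma}\,dx
     = -\sum_{(N_1,\ldots,N_M)} w\,\ln\frac{w}{\Omega} \\
    &= \sum_{(N_1,\ldots,N_M)} w\,\ln\Omega
       \;-\; \sum_{(N_1,\ldots,N_M)} w\,\ln w , \\
\overline{S}_P &= \sum_{(N_1,\ldots,N_M)} \ln\Omega
        \int_{Z(N_1,\ldots,N_M)} \rho\,dx
     = \sum_{(N_1,\ldots,N_M)} w\,\ln\Omega .
\end{align*}
```

Since \(0\le w\le 1\) for every distribution over cells, the term \(-\sum w\ln w\) is nonnegative, which gives \(\overline{S}_P\le S_G\).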

Let us now verify the validity of the inequality \(S_G \leqslant S_B(\overline{N}_1,\ldots,\overline{N}_M)\). On the basis of (4), (6), and the lemma we have

\[ S_G=-\sum_{j_1,\ldots,j_N=1}^{M} \mathscr{P}(j_1,\ldots,j_N)\ln \frac{\mathscr{P}(j_1,\ldots,j_N)}{G_{j_1}\cdots G_{j_N}}; \tag{12} \]

\[ S_B(\overline{N}_1,\ldots,\overline{N}_M) =-N\sum_j p(j)\ln\bigl(p(j)/G_j\bigr), \tag{13} \]

where

\[ \mathscr{P}(j_1,\ldots,j_N)= \int_{y_1\in \mu_{j_1},\ldots,y_N\in \mu_{j_N}} \widetilde{\rho}(x)\,dx . \]

We shall now use the well-known inequality

\[ \sum_i p_i\ln p_i \geqslant \sum_i p_i\ln q_i, \quad \text{where } \sum_i p_i \geqslant \sum_i q_i \quad \bigl(({}^{11}),\ \text{Ch. 2}\bigr). \]

Then

\[ S_G \leqslant -\sum_{j_1,\ldots,j_N} \mathscr{P}(j_1,\ldots,j_N) \ln\frac{p(j_1)\cdots p(j_N)}{G_{j_1}\cdots G_{j_N}} = S_B(\overline{N}_1,\ldots,\overline{N}_M). \]

Finally, from the refined Stirling formula \(({}^{12})\),

\[ \sqrt{2\pi n}\,n^n \exp\left(-n+\frac{1}{12n+1}\right) < n! < \sqrt{2\pi n}\,n^n \exp\left(-n+\frac{1}{12n}\right) \tag{14} \]

there follows the inequality \(S_B>S_P\), while \(\overline{S}_B \leqslant S_B(\overline{N}_1,\ldots,\overline{N}_M)\) is simply Jensen’s inequality \(({}^{13})\). The theorem is proved.
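The chain of inequalities (10) can be tested numerically for a small system. The sketch below is only an illustration: the sizes \(N=6\), \(M=3\), the values of \(G_j\), and the random probabilities \(w(N_1,\ldots,N_M)\) are hypothetical, and \(S_G\) is computed from relation (11).

```python
import math
import random

def compositions(N, M):
    """Yield all occupation tuples (N_1, ..., N_M) with N_1 + ... + N_M = N."""
    if M == 1:
        yield (N,)
        return
    for first in range(N + 1):
        for rest in compositions(N - first, M - 1):
            yield (first,) + rest

def ln_omega(counts, G):
    """S_P = ln Omega for given occupation numbers (Eqs. 1-3)."""
    N = sum(counts)
    return math.lgamma(N + 1) + sum(n * math.log(g) - math.lgamma(n + 1)
                                    for n, g in zip(counts, G))

def s_boltzmann(counts, G):
    """S_B of Eq. (4); also accepts non-integer (mean) occupation numbers."""
    N = sum(counts)
    return -sum(n * math.log(n / (N * g)) for n, g in zip(counts, G) if n > 0)

def entropies(w, G):
    """Return (mean S_P, mean S_B, S_G, S_B of mean occupations) for weights w."""
    M = len(G)
    mean_SP = sum(p * ln_omega(c, G) for c, p in w.items())
    mean_SB = sum(p * s_boltzmann(c, G) for c, p in w.items())
    S_G = mean_SP - sum(p * math.log(p) for p in w.values() if p > 0)  # Eq. (11)
    mean_counts = [sum(p * c[j] for c, p in w.items()) for j in range(M)]
    return mean_SP, mean_SB, S_G, s_boltzmann(mean_counts, G)

random.seed(0)
N, M = 6, 3                      # hypothetical toy sizes
G = [2.0, 3.0, 4.0]              # assumed cell volumes in units of h^(3r)
states = list(compositions(N, M))
weights = [random.random() for _ in states]
total = sum(weights)
w = {c: x / total for c, x in zip(states, weights)}   # arbitrary w(N_1..N_M)
mean_SP, mean_SB, S_G, SB_of_mean = entropies(w, G)
print(mean_SP, S_G, mean_SB, SB_of_mean)
```

For a product distribution, in which the subsystems occupy the cells independently, \(S_G\) computed this way coincides with \(S_B(\overline{N}_1,\ldots,\overline{N}_M)\), so the upper bound in (10) is attained.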

Theorem 2. In order that the differences \(S_B-\overline{S}_P\) and \(S_G-\overline{S}_P\) not exceed \(C\sqrt{N}\) (for an arbitrary distribution), it is sufficient that the number of cells \(M\) not exceed \(C\sqrt{N}/\ln N\).

Indeed, by virtue of (11) and the condition of maximality of the entropy \(({}^{11})\), Ch. 2, we have \(S_G-\overline{S}_P\leqslant \ln C\), where \(C=(N+M-1)!/(M-1)!N!\) is the number of distributions of the subsystems over the cells. Simplifying \(C\) by Stirling’s formula, we find

\[ S_G-\overline{S}_P < M\Bigl(1+\ln \frac{N+M}{M}\Bigr). \tag{15} \]
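The estimate (15) can be compared with the exact logarithm of the number of distributions. The sketch below assumes that the argument of the logarithm in (15) is \((N+M)/M\); the function names and the sample values of \(N\) and \(M\) are hypothetical.

```python
import math

def ln_num_distributions(N, M):
    """ln of (N+M-1)! / ((M-1)! N!), the number of distributions over cells."""
    return math.lgamma(N + M) - math.lgamma(M) - math.lgamma(N + 1)

def bound_15(N, M):
    """Right-hand side of Eq. (15), reading the log argument as (N+M)/M."""
    return M * (1.0 + math.log((N + M) / M))

# Assumed sample sizes, chosen only for illustration.
for N, M in [(10, 3), (100, 5), (10**4, 20), (10**6, 72)]:
    print(N, M, ln_num_distributions(N, M), bound_15(N, M))
```

In each sampled case the exact logarithm lies below the bound, consistent with the strict inequality in (15).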

Using (14) and the well-known inequality \(({}^{13})\)

\[ \ln x_1+\cdots+\ln x_M \leqslant M\ln\bigl[(x_1+\cdots+x_M)/M\bigr], \]

we also obtain

\[ S_B-S_P < M\Bigl(\ln \sqrt{2\pi N/M} + \tfrac{1}{12}\Bigr). \tag{16} \]
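The steps behind (16) can be reconstructed as follows. This is a sketch based on definitions (1)–(4), the Stirling bounds (14), and the inequality for the sum of logarithms with \(x_j=2\pi N_j\); it is not quoted from the paper. Empty cells contribute nothing to the sums, and replacing the number of occupied cells by \(M\) only weakens the bound as long as \(M<2\pi N/e\), which holds for \(M\sim\sqrt{N}\).

```latex
\begin{align*}
S_B - S_P
 &= N\ln N - \ln N! + \sum_{j}\bigl(\ln N_j! - N_j\ln N_j\bigr) \\
 &< \Bigl(N - \tfrac12\ln 2\pi N\Bigr)
    + \sum_{j:\,N_j\ge 1}\Bigl(\tfrac12\ln 2\pi N_j - N_j + \tfrac{1}{12N_j}\Bigr) \\
 &\le \tfrac12\sum_{j:\,N_j\ge 1}\ln 2\pi N_j + \tfrac{M}{12} \\
 &\le \tfrac{M}{2}\ln\frac{2\pi N}{M} + \tfrac{M}{12}
  = M\Bigl(\ln\sqrt{2\pi N/M} + \tfrac{1}{12}\Bigr).
\end{align*}
```

The second line uses the lower bound of (14) for \(N!\) and the upper bound for each \(N_j!\); the terms \(N\) and \(-\sum_j N_j\) cancel, and the negative term \(-\tfrac12\ln 2\pi N\) is dropped in the third line.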

From this it is not difficult to arrive at the theorem formulated above. Theorem 2 means that, with a suitable choice of coarse-grained averaging and of the number of cells, the entropy definitions under consideration are equivalent in describing the average behavior of the system. If we want the fluctuations of \(S_B\) and \(S_P\) (equal, in order of magnitude, to \(\sqrt{Nr}\)) also to coincide, we must take \(C\sim 1\).
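The sufficient condition of Theorem 2 can be illustrated numerically by evaluating the right-hand sides of (15) and (16) for the largest admissible number of cells. The sketch below takes \(C=1\) and a few large values of \(N\) as assumptions, and reads the logarithm arguments in (15) and (16) as \((N+M)/M\) and \(\sqrt{2\pi N/M}\); the function names are hypothetical.

```python
import math

def max_cells(N, C=1.0):
    """Largest integer M with M <= C*sqrt(N)/ln N (at least 1)."""
    return max(1, math.floor(C * math.sqrt(N) / math.log(N)))

def bound_15(N, M):
    """Right-hand side of Eq. (15): bound on S_G minus the mean Planck entropy."""
    return M * (1.0 + math.log((N + M) / M))

def bound_16(N, M):
    """Right-hand side of Eq. (16): bound on S_B - S_P."""
    return M * (0.5 * math.log(2.0 * math.pi * N / M) + 1.0 / 12.0)

# Assumed sample values of N; the ratios printed are bounds divided by sqrt(N).
for N in (10**6, 10**8, 10**10):
    M = max_cells(N)
    root = math.sqrt(N)
    print(N, M, bound_15(N, M) / root, bound_16(N, M) / root)
```

For these values both ratios stay below 1, so the differences do not exceed \(C\sqrt{N}\) with \(C=1\). For small \(N\) the lower-order terms in (15) are not negligible, and the comparison should be made with the actual integer \(M\).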

Theorem 3. If the conditions of Theorems 1–2 are fulfilled and two sets of cells are such that \(\rho(x)\) changes within the limits of a cell by no more than \(1-\varepsilon\) (\(\varepsilon>0\)) of its value, then, to within terms not exceeding \((1-\varepsilon)/2\varepsilon\), the entropy \((S_G,\overline{S}_B,\overline{S}_P)\) is the same for these sets.

The proof follows from formula (12).

The entropy also must not depend on the choice of subsystems (i.e., on the numbers \(N\) and \(r\)). This will be so as long as the correlations between the states of the subsystems are sufficiently weak (they, of course, need not be absent altogether). In ordinary situations this is the case; however, there are cases (for example, laser radiation) when the correlations between any subsystems are strong. The definition of entropy for such systems requires additional analysis.

4. In the present section we shall compare the results of this work with the conclusions of other authors.

In the book (4) and in a number of other works it is shown that, in the state of equilibrium, the Gibbs and Boltzmann–Planck entropies coincide. As for the nonequilibrium state, the results obtained on this question have so far been of a partial and preliminary character. In Klein’s work (3), \(S_G\) and \(S_P\) were investigated as applied to the Ehrenfest model. It was shown that \(S_G \geq \overline{S}_P\), but the difference \(S_G-\overline{S}_P\) was not estimated. In Jaynes’s work (5), the difference \(S_B-S_G\) was estimated, with \(S_B\) given by a formula obtained by further simplifying (13); the accuracy of that formula for a system of strongly interacting particles is very doubtful.

A correct description of the entropy of an isolated system requires the operation of coarse-grained averaging. It can be defined in such a way that, for an arbitrary nonequilibrium state (when describing the average behavior of the system), the definitions of entropy considered are equivalent.

If, however, \(\widetilde\rho\) in formula (6) is replaced by \(\rho\), then \(S_G\) becomes the fine-structure entropy \(S_G'\), which, as is well known, is independent of \(t\) and therefore does not satisfy the second law of thermodynamics. It coincides with the information-theoretic entropy, and its invariance is a consequence of the determinism of classical theory.

\(S_G'\) is also constant in quantum theory (see, for example, (14)). There, however, it is not connected with information, and its constancy cannot be interpreted in the manner indicated above. The constancy of the fine-structure Planck entropy is obvious, since as \(\mu_j\to 0\), \(N_j\) tends to a constant value equal to 0 or 1.

Coarse-grained averaging reflects the fact that the most accurate description of a macrostate gives only coarse information about the microstate. If, however, the verifying experiment is complete (in the classical or quantum-mechanical sense), then the entropy does not change with time and thereby loses its thermodynamic meaning. Thus, there exists a kind of complementarity between macroscopic and microscopic descriptions. An analogous point of view was defended by Ya. I. Frenkel (15), who noted that entropy does not change when it is defined exactly and increases when it is defined inexactly.

In conclusion, the author expresses his deep gratitude to A. I. Anselm and Yu. N. Obraztsov for useful discussions.

Leningrad Forestry Engineering Academy
named after S. M. Kirov

Received 17 IV 1968

REFERENCES

  1. P. and T. Ehrenfest, in: P. Ehrenfest, Collected Scientific Papers, Amsterdam, 1959.
  2. N. S. Krylov, Works on the Foundations of Statistical Physics, Moscow–Leningrad, 1950.
  3. M. J. Klein, Physica, 22, No. 7, 569 (1956).
  4. S. de Groot, P. Mazur, Nonequilibrium Thermodynamics, Moscow, 1964.
  5. E. T. Jaynes, Am. J. Phys., 33, No. 5, 391 (1965).
  6. R. Fowler, E. Guggenheim, Statistical Thermodynamics, Moscow, 1949.
  7. E. Schrödinger, Statistical Thermodynamics, Moscow, 1948.
  8. G. Uhlenbeck, G. Ford, Lectures on Statistical Mechanics, Moscow, 1965.
  9. M. Kac, Probability and Related Questions in Physics, Moscow, 1965.
  10. A. Sommerfeld, Thermodynamics and Statistical Physics, Moscow, 1955.
  11. L. Brillouin, Science and Information Theory, Moscow, 1960.
  12. W. Feller, An Introduction to Probability Theory and Its Applications, 1, Moscow, 1964.
  13. A. M. Yaglom, I. M. Yaglom, Probability and Information, Moscow, 1960.
  14. V. M. Fain, UFN, 79, No. 4, 641 (1963).
  15. Ya. I. Frenkel, Statistical Physics, Moscow–Leningrad, 1948.
