Chapter 1
INTRODUCTION
Entropy occupies a secure and prominent position on any list of the most fundamental and important concepts of physics, but this trite observation does not fully do it justice. Entropy has a certain solitary mystique all its own, which both fascinates and frustrates those who aspire to comprehend its elusive and multifaceted subtleties. Unlike mass, momentum, and energy, the entropy of an isolated system is not conserved but has the peculiar property of spontaneously increasing to a maximum. A quantity which can only grow larger is curious if not unsettling to contemplate; it vaguely suggests instability, or even some inexorable impending catastrophe worthy of a science fiction movie, perhaps entitled The Heat Death of the Universe! It also seems curious that in spite of its thermodynamic origins, entropy cannot be fully understood in thermodynamic terms alone but requires statistical concepts for its complete elucidation. Indeed, the statistical interpretation of entropy is in many respects simpler and easier to comprehend than its thermodynamic aspects, and arguably provides the most transparent pedagogical approach to the subject. This of course is the rationale for treating thermodynamics and statistical mechanics together as a single discipline, an approach which was once considered heretical but has now become accepted and commonplace.
The statistical interpretation of entropy is approaching its sesquicentennial, and by now has been explored and expounded in a virtually uncountable and unsurveyable multitude of papers, textbooks, treatises, essays, reviews, conference proceedings, etc. This prolonged collective endeavor has largely demystified entropy, while at the same time expanding its scope well beyond the confines of thermodynamics. Indeed, entropy has become perhaps the most pervasive and multifaceted of all physical concepts in terms of its wide applicability to a variety of other disciplines. In short, entropy emerged from its formative years long ago and has now evolved into a mature concept which for the most part is well established and well understood. In spite of this, however, the concept of entropy has yet to converge to a stable equilibrium state. Although the correct formulae to be employed in applications are by now well established, there is no similar consensus as to their proper derivation and interpretation. Moreover, the interpretation, significance, and even the proper definition of entropy itself remain remarkably controversial and contentious [1–8]. This situation is rather disconcerting and can hardly be regarded as satisfactory.
This book represents an attempt to ameliorate the residual controversy and conceptual confusion surrounding entropy by revisiting and systematically reconstructing its statistical foundations ab initio, making full use of the clarity of hindsight. Our intention and objective have been to develop and present those foundations and their most basic implications in a straightforward, concise, unified, and hopefully cogent form. To this end, the logical structure of the subject is emphasized by clearly stating the minimal postulates and hypotheses required in order to proceed, and deriving their logical consequences as simply and economically as possible. The development is therefore essentially axiomatic and deductive in spirit and structure, but in an informal physical sense rather than a rigorous mathematical sense. An assiduous effort has been made to avoid logical lacunae and non sequiturs, which alas are not uncommon in textbooks. As described in the Preface, the material has been organized in the manner which seemed to result in the most direct and transparent logical structure, which in several respects departs significantly from the orthodox beaten path. The overall organization and main distinguishing features of the book are indicated in the Preface and Table of Contents, so it would be redundant to reiterate that information here. The remainder of this introductory discussion accordingly focuses on those aspects of the treatment which warrant further emphasis or elaboration at the outset.
The present development is predicated on the deceptively simple but highly fruitful proposition that the essence of entropy as a statistical concept is that it represents a consistent quantitative measure of uncertainty which is additive for statistically independent systems. Somewhat surprisingly, this sole and superficially trivial criterion suffices to determine the entropy almost uniquely. The remainder of the development consists, at least in principle, mainly in mere details of elaboration, but as usual the latter are wherein the Devil resides. Indeed, it would not be entirely inaccurate to say that it's all uphill from there, or that entropy is so simple that it borders on the incomprehensible, in the sense that it is at first difficult to imagine how such an apparently simplistic notion could have such subtle and profound consequences. Thus, to contraphrase Einstein, our task is to make statistical entropy as complicated as necessary to be comprehensible, but hopefully no more so.
Uncertainty implies a plurality of possibilities, which in turn implies the need for a statistical or probabilistic description. For example, the uncertainty as to whether flipping a coin will result in heads (H) or tails (T) evidently depends on the relative probabilities p(H) and p(T) = 1 − p(H) of those two outcomes, and is clearly larger if the coin is fair (i.e., p(H) = p(T) = 1/2) than if the coin is loaded so that the outcome is almost certain to be heads (e.g., p(H) = 1 − p(T) = 0.99). This observation suggests that the uncertainty of a situation is largest, ceteris paribus, when all possibilities are equally likely. Conversely, in situations where the uncertainty is perceived to be a maximum it is therefore not unreasonable to presume, at least provisionally, that all outcomes are equally likely. This presumption is traditionally referred to as the hypothesis of equal a priori probabilities (EAPP). This hypothesis is obviously inappropriate in situations where the probabilities are manifestly unequal, so it must be expected that its validity will in general be subject to certain restrictions. In physical applications, the most important such restriction is to systems of constant energy. What is remarkable is that virtually all of equilibrium statistical mechanics can then be derived, in a relatively straightforward way, by combining the EAPP hypothesis with the superficially imprecise conception of entropy as a quantitative measure of uncertainty.
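The coin comparison can be made quantitative with a short calculation. The following sketch anticipates, for illustration only, the logarithmic measure of uncertainty −Σ p ln p that will be properly motivated and derived in the chapters that follow; the probabilities are simply the illustrative values quoted above.

```python
from math import log

def uncertainty(probs):
    """Logarithmic measure of uncertainty, -sum(p * ln p); terms with p = 0 contribute nothing."""
    return -sum(p * log(p) for p in probs if p > 0)

fair_coin   = [0.5, 0.5]     # p(H) = p(T) = 1/2
loaded_coin = [0.99, 0.01]   # p(H) = 0.99, p(T) = 0.01

print(uncertainty(fair_coin))    # ~0.693 (= ln 2), the maximum for two outcomes
print(uncertainty(loaded_coin))  # ~0.056, much smaller: the outcome is almost certain
```

The fair coin, with its equally likely outcomes, indeed yields the larger uncertainty, consistent with the observation above.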
The most important applications of entropy are to composite systems comprising a very large number of elementary parts or components which can be arranged in a very large number of possible ways or configurations. Those configurations constitute the states of the system, and the uncertainty then pertains to which of those states the system occupies. Those states are almost always defined and characterized in terms of the states of the individual components. As a trivial example, the states of an individual coin are H and T, so a system of two distinguishable coins has four possible states, which can be represented by the ordered pairs (H, H), (H, T), (T, H), and (T, T). In contrast, if the coins are identical and indistinguishable there are only three possible states: {H, H} (both heads), {T, T} (both tails), and {H, T} = {T, H} (one of each).
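The two-coin bookkeeping can be verified by a direct enumeration; the sketch below merely lists ordered pairs for distinguishable coins and unordered multisets for indistinguishable ones.

```python
from itertools import product, combinations_with_replacement

faces = ("H", "T")

# Distinguishable coins: ordered pairs, so (H, T) and (T, H) are distinct states.
distinguishable = list(product(faces, repeat=2))
print(len(distinguishable), distinguishable)      # 4 states

# Indistinguishable coins: only the multiset of faces matters, so {H, T} = {T, H}.
indistinguishable = list(combinations_with_replacement(faces, 2))
print(len(indistinguishable), indistinguishable)  # 3 states
```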
The composite systems of greatest interest in physics are macroscopic thermodynamic systems, which are composed of very large numbers of very small particles (usually atoms, molecules, ions, and/or free electrons). For simplicity we shall restrict attention to pure systems whose constituent particles are all of the same type or species. Mixtures are therefore specifically excluded from the discussion. They present no special conceptual difficulties, but they introduce cumbersome notational and bookkeeping complications and distractions which are best avoided in a critical discussion of the fundamentals. The omission of mixtures implies that our primary focus is on systems of identical indistinguishable particles. However, it is necessary to consider distinguishable particles as well, because a clear conceptual understanding of the one ironically requires an understanding of the other. Indeed, as will be seen, the states of a system of distinguishable particles are more easily defined and characterized than those of an otherwise identical system of indistinguishable particles, so it is sometimes useful to describe the latter in terms of the former. However, mathematical convenience must not be confused with physical reality. If a physical system is composed of physically indistinguishable particles, then the corresponding system of distinguishable particles is fictitious, and so are its states, however useful they may be mathematically.
As noted in the Preface, this book differs from many if not most introductory treatments of statistical mechanics in that (1) many-particle systems are not discussed until relatively late in the book, and (2) the combinatorial arguments employed for that purpose are minimized. The reasons for these deviations from the norm are as follows: (1) Much of the general formalism of entropy is actually independent of the detailed structure of the states and the variables used to describe them. This generality is obscured if the formalism is specialized to particular cases such as many-particle systems before it is necessary to do so. (2) More conventional treatments in which combinatorial concepts play a central role exaggerate their importance in the conceptual structure of entropy. Indeed, such concepts do not even arise until the general formalism is applied to composite systems, where they play a limited albeit important role as mathematical tools used to enumerate and classify the states. Even for that purpose, they can be largely circumvented by the use of simpler and more elegant analytical techniques, in particular the canonical and grand canonical probability distributions. Although statistical mechanics can be, and often still is, motivated and developed by means of combinatorial arguments, they do not in my judgment provide the simplest or most transparent approach to the subject. Of course, such arguments played an important role in its historical development, of which the best known and most important example is the conceptually precarious "method of the most probable distribution" developed by Boltzmann circa 1877. This method is still commonly invoked in introductory treatments because of its simplicity, even though it is universally regarded and acknowledged as unsatisfactory. In contrast, I regard it as being of historical interest only and make no use of it in this book.
Perhaps the most heretical distinguishing feature of the present treatment is its categorical rejection of the persistent misconception that the indistinguishability of identical particles is peculiar to quantum mechanics, and that identical classical particles are in principle distinguishable. This unfortunate misconception has afflicted statistical mechanics for nearly a century, and its eviction from the subject is long overdue. It has been propagated from each generation of textbooks to the next for so long that it has evolved into a pernicious memetic defect which most authors simply take for granted and no longer subject to critical scrutiny. As a result, the alleged distinguishability of identical classical particles remains firmly entrenched as established dogma, a situation which future historians of science are likely to find incomprehensible. However, this misconception has not gone entirely unchallenged; it has been intermittently disputed and/or refuted on various occasions by several intrepid authors [2, 9–16], not to mention Gibbs himself [17], but from various different perspectives which have quite likely detracted from their collective cogency. In particular, some authors seem to consider indistinguishability a property of the states rather than the particles, whereas this book is based on the converse view.
The present treatment of multi-particle systems is firmly based on the principle that indistinguishability is an intrinsic property of identical particles in general, and is not peculiar to quantum mechanics. Readers who are unfavorably disposed toward that principle are encouraged to peruse Chapter 9, where we adduce arguments to the effect that the obvious methods whereby one might imagine distinguishing between identical classical particles are untenable. It is my fond but possibly forlorn hope that those arguments will hasten the ultimate demise of the misconception that identical classical particles are somehow distinguishable. This misconception has created an artificial and entirely unnecessary dichotomy between classical and quantum statistical mechanics, thereby obscuring the conceptual coherence of the whole subject and making it more difficult to understand. Conversely, the fundamental principle that all identical particles are inherently and unconditionally indistinguishable unifies and simplifies classical and quantum statistical mechanics, which no longer require separate consideration and coalesce into a single unified treatment. The Gibbs paradox and its infamous ad hoc 1/N! correction factor are thereby entirely circumvented, and the resulting formulation yields the further insight that the essential distinction between classical and quantum statistics does not reside in the alleged distinguishability of classical particles, but rather in their statistical independence.
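Although the formal treatment of many-particle systems is deferred to later chapters, the role of the 1/N! factor in restoring extensivity can be previewed by a purely numerical check. The sketch below compares the configurational term ln V^N with ln(V^N/N!) for an ideal-gas-like system; the particular values of N and V are arbitrary and merely illustrative.

```python
from math import log, lgamma

def s_without_factor(N, V):
    """Configurational term ln(V**N) = N ln V, omitting the 1/N! factor."""
    return N * log(V)

def s_with_factor(N, V):
    """Configurational term ln(V**N / N!) = N ln V - ln N!, with ln N! = lgamma(N + 1)."""
    return N * log(V) - lgamma(N + 1)

N, V = 1_000_000, 5_000_000.0   # arbitrary illustrative values

for label, s in (("without 1/N!", s_without_factor), ("with 1/N!", s_with_factor)):
    ratio = s(2 * N, 2 * V) / s(N, V)
    print(f"{label}: doubling N and V multiplies the term by {ratio:.4f}")
# Only the term including 1/N! doubles (ratio 2.0000); i.e., only it is extensive.
```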
Finally, it is recommended that this book be accessed serially or sequentially rather than randomly or selectively, for only in that manner does the logical structure and coherence of the material fully reveal itself. References specifically cited in the text are numbered and listed at the back of the book in the order in which they appear, but references to basic, familiar, and/or well known formulae and results are sporadic. General references wherein useful discussions of such topics can be found are listed in a supplementary Bibliography which precedes the numbered References.
Chapter 2
FUNDAMENTALS
2.1 Entropy as Uncertainty
The most common and popular description of entropy in everyday language is that it provides a measure of disorder. Other terms used to convey an intuitive feeling for entropy include randomness, disorganization, "mixed-up-ness" (Gibbs), missing information, incomplete knowledge, complexity, chaos, ignorance, and uncertainty. Of course, such descriptors are inherently imprecise and qualitative, and are inadequate to provide a precise quantitative definition, but the vernacular term that we believe best captures the essence of entropy is uncertainty. ("Ignorance" would be equally accurate, but seems less suitable due to its negative connotations.)
Thus we shall regard and approach entropy as a quantitative measure of uncertainty. Uncertainty is clearly related to the number of possibilities that exist, and would be expected to increase as that number becomes larger. It also seems natural to define the collective uncertainty of two or more independent or unrelated situations as the sum of their individual uncertainties, so that uncertainty is additive. In the present context, uncertainty pertains to which of its accessible states a particular system occupies. Thus we seek a quantitative measure of uncertainty which is additive over statistically independent systems.
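The additivity requirement can be checked concretely. Again anticipating, for illustration only, the logarithmic measure −Σ p ln p that will emerge from the development below, the following sketch verifies that when two systems are statistically independent (so that their joint probabilities factorize), the uncertainty of the composite system is the sum of the individual uncertainties; the two distributions are arbitrary illustrative choices.

```python
from math import log, isclose

def uncertainty(probs):
    """Logarithmic uncertainty measure -sum(p * ln p) of a probability distribution."""
    return -sum(p * log(p) for p in probs if p > 0)

p_A = [0.2, 0.3, 0.5]   # state probabilities of system A (illustrative)
p_B = [0.6, 0.4]        # state probabilities of system B (illustrative)

# Statistical independence: the joint probabilities factorize, p_ij = p_i * q_j.
p_AB = [pa * pb for pa in p_A for pb in p_B]

print(isclose(uncertainty(p_AB), uncertainty(p_A) + uncertainty(p_B)))  # True
```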
2.2 Systems and States
The concept of the "state" of a system is of fundamental importance in many areas of physics, and lies at the very heart of entropy. As will be seen, evaluating the entropy requires identifying and labelin...