
7 The W + Q Equation

7.1 Grady and Ungrady One-Forms

Sometimes people who are trying to write equation 6.5 or equation 6.11 instead write something like

dE = dW + dQ (allegedly) (7.1)

which is deplorable.

Using the language of differential forms, the situation can be understood as follows:

E is a scalar state-function.
V is a scalar state-function.
S is a scalar state-function.
P is a scalar state-function.
T is a scalar state-function.
ΔE := E₂ − E₁ is a scalar function of two states.
ΔS := S₂ − S₁ is a scalar function of two states.
ΔV := V₂ − V₁ is a scalar function of two states.
dE is a grady one-form state-function.
dS is a grady one-form state-function.
dV is a grady one-form state-function.
w := PdV is in general an ungrady one-form state-function.
q := TdS is in general an ungrady one-form state-function.
There is in general no state-function W such that w = dW.
There is in general no state-function Q such that q = dQ.

where in the last four items, we have to say “in general” because exceptions can occur in peculiar situations, mainly cramped situations where it is not possible to construct a heat engine. Such situations are very unlike the general case, and not worth much discussion beyond what was said in conjunction with equation 6.30. When we say something is a state-function we mean it is a function of the thermodynamic state. The last two items follow immediately from the definition of grady versus ungrady.

Figure 7.1 shows the difference between a grady one-form and an ungrady one-form.

As you can see on the left side of the figure, the quantity dS is grady. If you integrate clockwise around the loop as shown, the net number of upward steps is zero. This is related to the fact that we can assign an unambiguous height (S) to each point in (T,S) space.

In contrast, as you can see on the right side of the diagram, the quantity TdS is not grady. If you integrate clockwise around the loop as shown, there are considerably more upward steps than downward steps. There is no hope of assigning a height “Q” to points in (T,S) space.

Figure 7.1: dS is Grady, TdS is Not
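The step-counting picture can be checked numerically. What follows is a minimal sketch, not a reproduction of the figure: it assumes an elliptical loop in (T,S) space with made-up center and radii, and the names `loop_integral`, `dS` and `TdS` are hypothetical helpers introduced only for this illustration. Each one-form is summed over the small steps of the loop.

```python
import math

def loop_integral(one_form, n_steps=1000, t0=300.0, s0=5.0, a=50.0, b=2.0):
    """Sum a one-form over the steps of a closed elliptical loop in (T, S) space.

    one_form(T, S, dT, dS) returns the number that the one-form assigns to a
    small step (dT, dS) taken at the point (T, S).  The loop center (t0, s0)
    and radii (a, b) are arbitrary illustrative values.
    """
    total = 0.0
    for k in range(n_steps):
        th0 = 2.0 * math.pi * k / n_steps
        th1 = 2.0 * math.pi * (k + 1) / n_steps
        T0, S0 = t0 + a * math.cos(th0), s0 + b * math.sin(th0)
        T1, S1 = t0 + a * math.cos(th1), s0 + b * math.sin(th1)
        # Evaluate the one-form at the midpoint of the step.
        total += one_form(0.5 * (T0 + T1), 0.5 * (S0 + S1), T1 - T0, S1 - S0)
    return total

def dS(T, S, dT, dS_step):
    return dS_step          # grady: it is d(S)

def TdS(T, S, dT, dS_step):
    return T * dS_step      # ungrady: not d(anything)

net_dS = loop_integral(dS)
net_TdS = loop_integral(TdS)
print(net_dS)    # zero, up to rounding
print(net_TdS)   # close to pi * a * b, the area enclosed by the loop
```

Traversing the loop in the opposite direction flips the sign of the second result but does not make it vanish, whereas the first result is zero for any closed loop.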

For details on the properties of one-forms, see reference 3 and perhaps reference 18.

Be warned that in the mathematical literature, what we are calling ungrady one-forms are called “inexact” one-forms. The two terms are entirely synonymous. A one-form is called “exact” if and only if it is the gradient of something. We avoid the terms “exact” and “inexact” because they are too easily misunderstood. In particular, in this context,
exact is not even remotely the same as accurate.
inexact is not even remotely the same as inaccurate.
inexact does not mean “plus or minus something”.
exact just means grady. An exact one-form is the gradient of some potential.

The difference between grady and ungrady has important consequences for practical situations such as heat engines. Even if we restrict attention to reversible situations, we still cannot think of Q as a function of state, for the following reasons: You can define any number of functions Q₁, Q₂, ⋯ by integrating TdS along some paths Γ₁, Γ₂, ⋯ of your choosing. Each such Q_i can be interpreted as the total heat that has flowed into the system along the specified path. As an example, let’s choose Γ₆ to be the path that a heat engine follows as it goes around a complete cycle – a reversible cycle, perhaps a Carnot cycle or some such. Let Q₆(N) be the value of Q₆ at the end of the Nth cycle. We see that even after specifying the path, Q₆ is still not a state function, because at the end of each cycle, all the state functions return to their initial values, whereas Q₆(N) grows linearly with N. This proves that in any situation where you can build a heat engine, q is not equal to d(anything).
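The heat-engine argument can be sketched in a few lines. This is only an illustration under stated assumptions: the cycle is idealized as a rectangle in (T,S) space, the temperatures and entropies are made-up numbers, and `run_cycles` is a hypothetical name.

```python
def run_cycles(n_cycles, t_hot=400.0, t_cold=300.0, s_lo=1.0, s_hi=3.0):
    """Follow a reversible rectangular (Carnot-like) cycle in (T, S) space.

    Returns a list of (T, S, Q) recorded at the end of each cycle, where Q is
    the running integral of T dS along the path.  All numbers are illustrative.
    """
    T, S, Q = t_cold, s_lo, 0.0
    history = []
    for _ in range(n_cycles):
        # Leg 1: adiabatic step up to t_hot (dS = 0, so T dS contributes nothing).
        T = t_hot
        # Leg 2: isothermal at t_hot, entropy rises from s_lo to s_hi.
        Q += t_hot * (s_hi - s_lo)
        S = s_hi
        # Leg 3: adiabatic step down to t_cold (dS = 0 again).
        T = t_cold
        # Leg 4: isothermal at t_cold, entropy falls back to s_lo.
        Q += t_cold * (s_lo - s_hi)
        S = s_lo
        history.append((T, S, Q))
    return history

for n, (T, S, Q) in enumerate(run_cycles(3), start=1):
    print(n, T, S, Q)
```

At the end of every cycle T and S are back at their starting values, while Q has grown by (t_hot − t_cold)(s_hi − s_lo) per cycle. No function of (T,S) alone can behave that way.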

7.2 Abuse of the Notation

Suppose there are two people, namely wayne and dwayne. There is no special relationship between them. In particular, we interpret dwayne as a simple six-letter name, not as d(wayne) i.e. not as the derivative of wayne.

Some people try to use the same approach to supposedly define dQ to be a “two-letter name” that represents T dS – supposedly without implying that dQ is the derivative of anything. That is emphatically not acceptable. That would be a terrible abuse of the notation.

In accordance with almost-universally accepted convention, d is an operator, and dQ denotes the operator d applied to the variable Q. If you give it any other interpretation, you are going to confuse yourself and everybody else.

The point remains that in thermodynamics, there does not exist any Q such that dQ = T dS (except perhaps in trivial cases). Wishing for such a Q does not make it so. See chapter 18 for more on this.
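One way to test whether such a Q could exist: if a one-form a dT + b dS were equal to dQ for some function Q(T,S), the mixed partial derivatives would have to agree, i.e. ∂b/∂T = ∂a/∂S. The sketch below applies that test by finite differences. Using (T,S) as the coordinates, the sample point, and the name `exterior_derivative_2d` are all assumptions made for the sake of illustration.

```python
def exterior_derivative_2d(a, b, T, S, h=1e-4):
    """Return (db/dT - da/dS) at the point (T, S) by central differences.

    The one-form under test is a(T, S) dT + b(T, S) dS.  A grady one-form
    gives zero everywhere; a nonzero result rules out q = d(anything).
    """
    db_dT = (b(T + h, S) - b(T - h, S)) / (2.0 * h)
    da_dS = (a(T, S + h) - a(T, S - h)) / (2.0 * h)
    return db_dT - da_dS

# q = T dS: the dT-component is 0 and the dS-component is T.
q_test = exterior_derivative_2d(lambda T, S: 0.0, lambda T, S: T, T=300.0, S=5.0)

# d(T*S) = S dT + T dS: the gradient of the scalar function T*S.
grady_test = exterior_derivative_2d(lambda T, S: S, lambda T, S: T, T=300.0, S=5.0)

print(q_test)       # approximately 1, so T dS is not d(anything)
print(grady_test)   # approximately 0, as expected for a gradient
```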

7.3 Procedure for Extirpating dW and dQ

Constructive suggestion: If you are reading a book that uses dW and dQ, you can repair it using the following simple procedure:

For reversible processes, it’s easy: Every time you see dQ, cross it out and write T dS instead. Every time you see dW, cross it out and write P dV or −P dV instead. The choice of sign depends on convention. It should be easy to determine which convention the book is using.
For irreversible processes, much more effort is required. Classical thermodynamics books like to say that for an irreversible process «T dS is greater than dQ». In this case, you can’t simply replace dQ by T dS because dQ (to the extent that it means anything at all) sometimes does not account for the entire T dS. In this context, it probably involves only the entropy that flowed in across the boundary – not the entropy that was created from scratch. So the rule in this context is to cross out dQ and replace it by T dS_transferred.
As for the idea that T dS > T dS_transferred for an irreversible process, we cannot accept that at face value. For one thing, we would have problems at negative temperatures. We can fix that by getting rid of the T on both sides of the equation. Another problem is that according to the modern interpretation of the symbols, dS is a vector, and it is not possible to define a “greater-than” relation involving vectors. That is to say, vectors have no natural ordering. We can fix this by integrating. The relevant equation is:

∫_Γ dS = ∫_Γ (dS_transferred + dS_created) > ∫_Γ dS_transferred (7.2)

for some definite path Γ. We need Γ to specify the “forward” direction of the transformation; otherwise the inequality wouldn’t mean anything. We have an inequality, not an equality, because we are considering an irreversible process.
At the end of the day, we find that the assertion that «T dS is greater than dQ» is just a complicated and defective way of saying that the irreversible process created some entropy from scratch.
Note: The underlying idea is that for an irreversible process, entropy is not conserved, so we don’t have conservative flow. Therefore the classical approach was a bad idea to begin with, because it tried to define entropy in terms of heat divided by temperature, and tried to define heat in terms of flow. That was a bad idea on practical grounds and pedagogical grounds, in the case where entropy is being created from scratch rather than flowing. It was a bad idea on conceptual grounds, even before it was expressed using symbols such as dQ that don’t make sense on mathematical grounds.
Beware: The classical thermo books are inconsistent. Even within a single book, even within a single chapter, sometimes they use dQ to mean the entire T dS and sometimes only the T dS_transferred.
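Equation 7.2 can be illustrated with a standard textbook example that is not taken from the discussion above: the free expansion of an ideal gas into vacuum inside an insulated container. No entropy crosses the boundary, so the integral of dS_transferred is zero, yet the entropy of the gas rises. The sketch below assumes an ideal gas, for which dS = n R dV / V along a path at constant temperature; the function name and the numbers are illustrative.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)

def entropy_change_free_expansion(n_mol, v1, v2, n_steps=10000):
    """Integrate dS = n R dV / V along the path from v1 to v2 (midpoint rule)."""
    dv = (v2 - v1) / n_steps
    total = 0.0
    for k in range(n_steps):
        v_mid = v1 + (k + 0.5) * dv
        total += n_mol * R * dv / v_mid
    return total

delta_s = entropy_change_free_expansion(n_mol=1.0, v1=1.0, v2=2.0)
s_transferred = 0.0                   # insulated container: nothing crosses the boundary
s_created = delta_s - s_transferred   # everything was created from scratch

print(delta_s, R * math.log(2.0))     # the two numbers should agree closely
print(delta_s > s_transferred)        # the inequality in equation 7.2
```

Evaluating the same function with the endpoints swapped gives a negative number, which is one way to see why the path Γ has to specify the forward direction before the inequality means anything.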

7.4 Some Reasons Why dW and dQ Might Be Tempting

It is remarkable that people are fond of writing things like dQ … even in cases where it does not exist. (The remarks in this section apply equally well to dW and similar monstrosities.)

Even people who know it is wrong do it anyway. They call dQ an “inexact differential” and sometimes put a slash through the d to call attention to this. The problem is, neither dQ nor ðQ is a differential at all. Yes, TdS is an ungrady one-form or (equivalently) an inexact one-form, but no, it is not properly called an inexact differential, since it is generally not a differential at all. It is not the derivative of anything.

One wonders how such a bizarre tangle of contradictions could arise, and how it could persist. I hypothesize part of the problem is a too-narrow interpretation of the traditional notation for integrals. Most mathematics books say that every integral should be written in the form

∫ (integrand) d(something) (7.3)

where the d is alleged to be merely part of the notation – an obligatory and purely mechanical part of the notation – and the integrand is considered to be separate from the d(something).

However, it doesn’t have to be that way. If you think about a simple scalar integral from the Lebesgue point of view (as opposed to the Riemann point of view), you realize that what is indispensable is a weighting function. Specifically: d(something) is a perfectly fine, normal type of weighting function, but not the only possible type of weighting function.

In an ordinary one-dimensional integral, we are integrating along a path, which in the simplest case is just an interval on the number line. Each element of the path is a little pointy vector, and the weighting function needs to map that pointy vector to a number. Any one-form will do, grady or otherwise. The grady one-forms can be written as d(something), while the ungrady ones cannot.

For purposes of discussion, in the rest of this section we will put square brackets around the weighting function, to make it easy to recognize even if it takes a somewhat unfamiliar form. As a simple example, a typical integral can be written as:

∫_Γ (integrand) [(weight)] (7.4)

where Γ is the domain to be integrated over, and the weight is typically something like dx.
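To make the idea of a weighting function concrete, here is a rough sketch in which the weight is literally a function that maps a small step vector to a number. The helper `integrate_along`, the sample paths, and the one-form y dx are assumptions chosen for illustration; they do not come from the text.

```python
def integrate_along(path, integrand, weight):
    """Sum integrand(point) * weight(point, step) over the steps of a path.

    `path` is a list of points (tuples).  `weight` is any one-form: a function
    that maps a small step vector, taken at a given point, to a number.
    """
    total = 0.0
    for p0, p1 in zip(path, path[1:]):
        mid = tuple(0.5 * (u + v) for u, v in zip(p0, p1))
        step = tuple(v - u for u, v in zip(p0, p1))
        total += integrand(mid) * weight(mid, step)
    return total

n = 1000

# The simplest case: an interval on the number line, with weight [dx].
interval = [(k / n,) for k in range(n + 1)]
dx = lambda point, step: step[0]
area = integrate_along(interval, lambda p: p[0] ** 2, dx)

# An ungrady weight, [y dx], on two different paths from (0,0) to (1,1).
y_dx = lambda point, step: point[1] * step[0]
straight = [(k / n, k / n) for k in range(n + 1)]
elbow = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
along_straight = integrate_along(straight, lambda p: 1.0, y_dx)
along_elbow = integrate_along(elbow, lambda p: 1.0, y_dx)

print(area)                          # close to 1/3
print(along_straight, along_elbow)   # close to 1/2 and exactly 0
```

The ungrady weight works perfectly well as a weight. The only thing it cannot do is give an answer that depends on the endpoints alone.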

As a more intricate example, in two dimensions the moment of inertia of an object Ω is:

I := ∫_Ω r² [dm] (7.5)

where the weight is dm. As usual, r denotes distance and m denotes mass. The integral runs over all elements of the object, and we can think of dm as an operator that tells us the mass of each such element. To my way of thinking, this is the definition of moment of inertia: a sum of r², summed over all elements of mass in the object.

The previous expression can be expanded as:

I = ∫_Ω r² [ρ(x,y) dx dy] (7.6)

where the weighting function is the same as before, just rewritten in terms of the density, ρ.

Things begin to get interesting if we rewrite that as:

I = ∫_Ω r² ρ(x,y) [dx dy] (7.7)

where ρ is no longer part of the weight but has become part of the integrand. We see that the distinction between the integrand and the weight is becoming a bit vague. Exploiting this vagueness in the other direction, we can write:

I = ∫_Ω [r² dm] = ∫_Ω [r² ρ(x,y) dx dy] (7.8)

which tells us that the distinction between integrand and weighting function is completely meaningless. Henceforth I will treat everything inside the integral on the same footing. The integrand and weight together will be called the argument¹ of the integral.
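A small numerical check shows that it makes no difference where the bracket is drawn. The sketch assumes a uniform square plate, for which the moment of inertia about the center is known to be M a²/6; the function name and grid size are illustrative choices.

```python
def moment_of_inertia_square(side, mass, n=200):
    """Moment of inertia of a uniform square plate about its center.

    Computed twice: once as a sum of r^2 [dm], and once as a sum of
    r^2 rho [dx dy].  Returns both sums.
    """
    rho = mass / side ** 2          # uniform areal density (an assumption)
    h = side / n                    # width of one grid cell
    sum_r2_dm = 0.0
    sum_r2_rho_dxdy = 0.0
    for i in range(n):
        x = -side / 2 + (i + 0.5) * h
        for j in range(n):
            y = -side / 2 + (j + 0.5) * h
            r2 = x * x + y * y
            dm = rho * h * h                          # weight [dm]: mass of this cell
            sum_r2_dm += r2 * dm
            sum_r2_rho_dxdy += (r2 * rho) * (h * h)   # weight [dx dy]: area of this cell
    return sum_r2_dm, sum_r2_rho_dxdy

via_dm, via_dxdy = moment_of_inertia_square(side=2.0, mass=3.0)
print(via_dm, via_dxdy)   # both close to mass * side**2 / 6
```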

Using an example from thermodynamics, we can write

Q_Γ = ∫_Γ T [dS] = ∫_Γ [T dS] = ∫_Γ [q] (7.9)

where Γ is some path through thermodynamic state-space, and where q is an ungrady one-form, defined as q := TdS.

It must be emphasized that these integrals must not be written as ∫[dQ] nor as ∫[dq]. This is because the argument in equation 7.9 is an ungrady one-form, and therefore cannot be equal to d(anything).

There is no problem with using TdS as the weighting function in an integral. The only problem comes when you try to write TdS as d(something) or ð(something):

Yes, TdS is a weighting function.
Yes, it is a one-form.
No, it is not a grady one-form.
No, it is not d(anything).

I realize an expression like ∫[q] will come as a shock to some people, but I think it expresses the correct ideas. It’s a whole lot more expressive and more correct than trying to write TdS as d(something) or ð(something).

Once you understand the ideas, the square brackets used in this section no longer serve any important purpose. Feel free to omit them if you wish.

There is a proverb that says if the only tool you have is a hammer, everything begins to look like a nail. The point is that even though a hammer is the ideal tool for pounding nails, it is suboptimal for many other purposes. Analogously, the traditional notation ∫ ⋯ dx is ideal for some purposes, but not for all. Specifically: sometimes it is OK to have no explicit d inside the integral.

There are only two things that are required: the integral must have a domain to be integrated over, and it must have some sort of argument. The argument must be an operator, which operates on an element of the domain to produce something (usually a number or a vector) that can be summed by the integral.

A one-form certainly suffices to serve as an argument (when elements of the domain are pointy vectors). Indeed, some math books introduce the notion of one-forms by defining them to be operators of the sort we need. That is, the space of one-forms is defined as an operator space, consisting of the operators that map column vectors to scalars. (So once again we see that one-forms correspond to row vectors, assuming pointy vectors correspond to column vectors). Using these operators does not require taking a dot product. (You don’t need a dot product unless you want to multiply two column vectors.) The operation of applying a row vector to a column vector to produce a scalar is called a contraction, not a dot product.
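The difference between a contraction and a dot product shows up under a change of basis. In the sketch below, the matrix `A`, the sample components, and the helper names are all made up for illustration. Column components are transformed with A and row components with the inverse of A, which is the usual rule for vectors and one-forms respectively.

```python
def contract(row, col):
    """Apply a one-form (row of components) to a pointy vector (column)."""
    return sum(w * v for w, v in zip(row, col))

def mat_vec(m, v):
    """Matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def row_mat(w, m):
    """Row vector times matrix."""
    return [sum(w[i] * m[i][j] for i in range(len(w))) for j in range(len(m[0]))]

# A change of basis that is neither a rotation nor a uniform scaling.
A = [[2.0, 1.0], [0.0, 1.0]]
A_inv = [[0.5, -0.5], [0.0, 1.0]]   # inverse of A

step = [3.0, 4.0]     # a pointy vector (column)
omega = [1.0, 2.0]    # a one-form (row)
other = [1.0, 2.0]    # a second pointy vector (column)

step_new = mat_vec(A, step)
omega_new = row_mat(omega, A_inv)
other_new = mat_vec(A, other)

before = contract(omega, step)
after = contract(omega_new, step_new)
naive_before = sum(u * v for u, v in zip(other, step))
naive_after = sum(u * v for u, v in zip(other_new, step_new))

print(before, after)              # the contraction is unchanged
print(naive_before, naive_after)  # the component-wise product of two columns is not
```

The contraction needs nothing beyond the row and the column. Multiplying two columns in a basis-independent way would require extra structure (a metric), which is why the two operations should not be confused.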

It is interesting to note that an ordinary summation of the form ∑_i F_i corresponds exactly to a Lebesgue integral using a measure that assigns unit weight to each integer (i) in the domain. No explicit d is needed when doing this “integral”. The idea of “weighting function” is closely analogous to the idea of “measure” in Lebesgue integrals, but not exactly the same. We must resist the temptation to use the two terms interchangeably. In particular, a measure is by definition a scalar, but sometimes (such as when integrating along a curve) it is important to use a weighting function that is a vector.
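The two requirements, a domain and an argument, can be written down almost verbatim. This is a toy sketch with hypothetical names (`integral`, `F`, `x_dx`), not a general-purpose integrator. It treats an ordinary summation and a path integral on the same footing.

```python
def integral(domain, argument):
    """Sum argument(element) over every element of the domain."""
    return sum(argument(element) for element in domain)

# An ordinary summation: the domain is a set of integers and each integer
# carries unit weight, so no explicit d appears anywhere.
F = lambda i: i * i
total = integral(range(1, 5), F)          # 1 + 4 + 9 + 16

# A path integral: the domain is a list of small steps along the interval
# from 0 to 1, and the argument is a one-form that turns each step into a number.
n = 1000
steps = [((k + 0.5) / n, 1.0 / n) for k in range(n)]   # (midpoint x, step dx)
x_dx = lambda s: s[0] * s[1]                           # the one-form x dx
half = integral(steps, x_dx)

print(total)   # 30
print(half)    # close to 1/2
```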

People heretofore have interpreted d in several ways: as a differential operator (with the power, among other things, to produce one-forms from scalars), as an infinitesimal step in some direction, and as the marker for the weighting function in an integral. The more I think about it, the more convinced I am that the differential operator interpretation is far and away the most advantageous. The other interpretations of d can be seen as mere approximations of the operator interpretation. The approximations work OK in elementary situations, but produce profound misconceptions and contradictions when applied to more general situations … such as thermodynamics.

In contrast, note that in section 16.1, I do not take such a hard line about the multiple incompatible definitions of heat. I don’t label any of them as right or wrong. Rather, I recognize that each of them in isolation has some merit, and it is only when you put them together that conflicts arise.

Bottom line: There are two really simple ideas here: (1) d always means exterior derivative. The exterior derivative of any scalar-valued function is a vector. It is a one-form, not a pointy vector. In particular it is always a grady one-form. (2) An integral needs to have a weighting function, which is not necessarily of the form d(something).

1: This corresponds to saying that θ is the argument of the cosine in the expression cos(θ).
