Review of Economic Dynamics 8 (2005) 877–901
www.elsevier.com/locate/red
The optimality of a control band policy
Jose M. Plehn-Dujowich
Department of Economics, University at Buffalo (SUNY), 435 Fronczak Hall, Buffalo, NY 14260
Received 9 July 2002; revised 18 November 2003
Available online 3 August 2005
Abstract
Consider a firm with an arbitrary profit function whose relative price follows a Brownian motion
with negative drift. When the firm faces a fixed cost of price adjustment, we prove the optimal pricing
policy is a control band if the following sufficient conditions are met: the profit function is continuous, strictly concave and single-peaked; moreover, together with its first and second derivatives, it
is bounded in absolute value by a polynomial. We also demonstrate various ways of constructing the
value function associated with the control band policy and show it has certain properties carried over
from the profit function. Numerical examples are found to be consistent with empirical estimates
regarding the frequency of price adjustments.
© 2005 Elsevier Inc. All rights reserved.
JEL classification: E31; C61
Keywords: Stochastic control; Menu cost
1. Introduction
Models with fixed costs of adjustment are popular due to their applicability in various
fields of economics. It is frequently assumed in such models that firms adopt an (s, S)
policy; the familiar value-matching and smooth-pasting conditions are then utilized. If the
firm chooses to follow a control band policy, these conditions determine the optimal choice
of trigger and return points, but they do not guarantee the type of policy is optimal against
all possible alternatives (for example, non-Markov policies). This paper provides sufficient
conditions for a control band policy to be optimal against all alternatives.
E-mail address: [email protected].
1094-2025/$ – see front matter © 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.red.2005.05.001
Optimality of the control band policy has been proven for specific profit functions. In
Sheshinski and Weiss (1983), the profit function is quasi-concave. In Scarf (1959), the
inventory holding cost function is convex, which is equivalent to a concave profit function.
In Danziger (1983), the profit function is a strictly concave polynomial. Caplin and Leahy
(1997) and Tsiddon (1993) assume the profit function is quadratic.
Assuming the uncontrolled process follows a Brownian motion, we prove a control
band policy is optimal if the value function and its derivatives exist and are continuous and
bounded in absolute value, and the constants of the homogeneous solution of the Bellman
equation are strictly positive. We demonstrate various ways of constructing the value function associated with the control band policy and show it has certain desirable properties
that are carried over from the profit function. By imposing the following three sufficient
conditions on the profit function, we ensure the value function has the stated properties
guaranteeing optimality of the control band policy: the profit function and its derivative are
continuous; the profit function is strictly concave and single-peaked (unimodal); and the
profit function together with its first and second derivatives are bounded in absolute value
by a polynomial.
Our results are proven for a firm facing fixed costs of nominal price adjustment, i.e.
menu costs. If there is inflation but no uncertainty, then the firm employs an (s, S) policy as
shown in Sheshinski and Weiss (1977). Similarly, if there is uncertainty but no inflation, we
also have price control of the (s, S) type as in Caplin and Leahy (1997). For our problem,
we incorporate both features by modeling the relative price as a Brownian motion with
negative drift.
We then perform a numerical analysis of the model with a linear-quadratic profit function to show the following. First, as in Bertola and Caballero (1990), we find that the firm
chooses a control band policy that causes its steady state price to be very close to its profit-maximizing level. In the absence of menu costs, the firm would continuously regulate its
price to be at that level. With menu costs, the firm achieves the second-best by being at
the profit-maximizing level on average. Second, though inflation per se is detrimental to
the firm, price volatility is significantly more costly in terms of lost profits. This ties in
with our third finding, that the control band policy considerably reduces the (steady state)
variance of the price, even up to an order of magnitude. Finally, in the baseline parameter
configuration, the firm adjusts its price on average 1.44 times per year, which is consistent
with empirical estimates in Blinder (1991).
Menu cost models emerged from the inventory and cash management literature. In the
former, the solution to the firm’s problem typically leads to a type of impulse control policy. In the latter, we observe instantaneous control. A control band is a specific type of
impulse control policy; in general terms, these are given by a sequence of stopping times
and corresponding jumps. For example, if the firm found it optimal to have different trigger
and return points depending on its state, then this would be another type of impulse control
policy. This is what we observe in models that have discontinuous behavior at the aggregate level, as in Sheshinski and Weiss (1983). In their model, the economy is governed
by two states, one in which there is no inflation, and one with positive inflation. In each
state, the firm uses a different control band policy. In inventory problems, the underlying
stochastic process becomes the demand for the firm’s product. In a discrete-time setting,
Scarf (1959) is the first to have successfully solved and fully characterized the solution to
the firm’s problem in such a framework.
The case of instantaneous control arises when the state in question cannot exceed certain
boundaries. In cash management problems, the state variable is the cash holdings of the
firm; its business practices imply the holdings cannot get too low (typically, they must
be non-negative) or too high (the opportunity cost of holding cash becomes large). Such
problems have been modeled by having two cost functions, one associated with holding
cash, the other with changing its state (the menu cost). The solution to cash management
problems thus usually becomes one of regulating the state, and not necessarily subjecting
it to large, discrete jumps. Harrison et al. (1983) and Constantinides and Richard (1978)
analyzed the inventory and cash management problems, respectively.
The finance literature has also applied menu cost models, motivated by the stylized fact
that investment at the firm level is lumpy, having the same discrete and staggered pattern
observed with inventory, cash, and price levels. In the Abel and Eberly (1994) model of
investment, there are flow fixed costs, giving rise to instantaneous control (characterized
by a band) rather than impulse control. Finally, a similar concept to that of an (s, S) policy
has been used to derive the value of exercising an option when the underlying asset follows
a Brownian motion process (Dixit and Pindyck, 1994).
The paper is organized as follows. Section 2 begins by formally describing the firm’s
problem and deriving a set of sufficient conditions a value function must satisfy for it to
equal the value of following the optimal impulse control policy. We then describe the control band policy we postulate to be optimal and construct its associated value function.
Section 3 proves the control band policy is optimal by showing its value function satisfies the sufficient conditions stipulated in Section 2. At the end of Section 3, we derive
the ergodic distribution of the controlled relative price and the expected waiting time in
the inaction region of the control band. Section 4 solves a linear-quadratic example and
compares its predictions to empirical estimates. Finally, Section 5 concludes. Appendix A
contains the proofs of all theorems.
2. The firm’s problem
The firm seeks to maximize the discounted flow of profits π(x) net of price adjustment
costs over an infinite horizon. The firm’s profits depend solely on the price of its product
relative to some average price of the economy. We shall interpret x as being the log of the
price ratio, but it will be referred to simply as the price. The cost of adjusting the price
is assumed to be fixed at B > 0 and not depend on the amount of the price change. The
firm operates in an inflationary environment so that its relative price decreases over time
on average. In particular, when price control is not exercised, the relative price of the firm
follows a Brownian motion with negative drift:
dx(t) = −g dt + σ dw,  (1)
where w is a Wiener process, and the drift and variance terms are constants. If the process starts at x, then we may write (1) equivalently as
x(t) = x − gt + σ w(t).  (2)
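The uncontrolled dynamics (2) are straightforward to simulate, which is useful for checking later results numerically. A minimal sketch; the parameter values are illustrative, not taken from the paper:

```python
import numpy as np

def simulate_price(x0, g, sigma, T, n_steps, n_paths, seed=0):
    """Simulate x(t) = x0 - g*t + sigma*w(t), eq. (2), on a grid of n_steps."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Brownian increments are i.i.d. N(0, dt); scale by sigma for the diffusion term
    dw = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    t = dt * np.arange(1, n_steps + 1)
    return x0 - g * t + sigma * np.cumsum(dw, axis=1)

paths = simulate_price(x0=0.0, g=0.02, sigma=0.1, T=1.0, n_steps=250, n_paths=20000)
# Sanity: the terminal cross-section has mean near x0 - g*T and variance near sigma^2 * T
```

With negative drift g > 0, the relative price erodes on average, which is what eventually forces the firm to adjust.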
We immediately rule out the possibility of instantaneous control, whereby the firm continuously regulates the price, because of the assumed menu cost structure. Each time the
process is regulated, the firm incurs a fixed cost; thus, since we are dealing with Brownian
motion, it would incur infinite costs in finite time: once it reaches a barrier of the regulator, which occurs with positive probability, it will have to adjust the price an uncountably
infinite number of times since a Brownian motion is a continuous process (Harrison, 1985,
Chapter 2). Hence, the firm’s problem becomes one of impulse control: it must choose
times at which to change its price, and by how much. Formally, a general impulse control policy p is defined as a sequence of stopping times denoted by τ and a sequence of corresponding return points denoted by y: p = {τ_0, y_0; τ_1, y_1; . . .}. As the underlying process is stochastic, these sequences are random variables. For convenience, we initialize τ_0 = 0.
The process x(t) describes the behavior of the relative price absent of any form of
control. We now define a new process, z(t), which describes the relative price subject to
control. The policy p is a prescription for the control procedure, so z(t) depends on p.
Since the process x(t) follows (1), the controlled process z(t) associated with policy p is defined as follows: dz(t) = −g dt + σ dw for τ_i ≤ t < τ_{i+1}; z(τ_i) = y_i; and z(0) = x. The value of following policy p is given by
V(p; x) = E_x { ∫_0^∞ e^{−rt} π(z(t)) dt − Σ_{i=0}^∞ e^{−rτ_i} B },
where the expectation is taken conditional on the initial condition x and r > 0 is the constant interest rate.
Our goal is to find an impulse control policy pˆ that yields the greatest possible value
out of all possible impulse control policies. We will not associate with pˆ a specific type
of policy. Rather, we describe its features as they compare to the arbitrary impulse control
policy p defined above. To achieve this, we begin with a candidate value function u(x). The
inaction region I can be defined as the set of states for which the value of inaction, given by
u(x), exceeds the largest net value of exercising control: I = {x : u(x) > sup_y {u(y) − B}}.
To understand I , suppose the firm is considering exercising control by jumping to the
relative price y. The cost of doing so is B, implying the net gain of such control is u(y)−B.
Because the firm can jump to any relative price, incurring the same fixed cost B, the maximum net value of exercising control at any point in time (and hence at any state) is given
by sup_y {u(y) − B}. If the firm's relative price is currently x, then the value of inaction (i.e.
doing nothing) is simply u(x). It follows that if the firm’s current relative price is x, and
u(x) > sup_y {u(y) − B}, then it is optimal for the firm to do nothing. In other words, that
point x lies in the inaction region I .
By construction, once the process leaves the inaction region, control is exercised. Moreover, τ̂_{i+1} is the smallest time, after τ̂_i, at which control is exercised. So the stopping times of policy p̂ can be defined recursively as τ̂_{i+1} = inf{t > τ̂_i : ẑ(t) ∉ I}, where once again we initialize τ̂_0 = 0. The optimal return points must maximize the return to exercising control, so they are given by ŷ_i = arg sup_y {u(y) − B}. Since the cost is fixed irrespective of the magnitude or direction of control, the return point is always the same in this model. Finally, the controlled process ẑ(t) associated with policy p̂ is defined as was done for p: dẑ(t) = −g dt + σ dw for τ̂_i ≤ t < τ̂_{i+1}; ẑ(τ̂_i) = ŷ_i; and ẑ(0) = x.
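To make the objective concrete, the value V(p; x) of a given impulse control policy can be estimated by Monte Carlo. The sketch below does this for a simple band policy (jump to a fixed return point whenever the price exits an interval), of the kind studied below; the band, the quadratic profit function, and all parameter values are illustrative assumptions, not the paper's calibration:

```python
import numpy as np

def band_policy_value(x0, s, S_ret, S_bar, g, sigma, r, B, pi,
                      T=100.0, dt=0.01, n_paths=2000, seed=1):
    """Monte Carlo estimate of V(p; x): discounted profit flow net of menu
    costs when z(t) follows (1) inside (s, S_bar) and jumps to S_ret on exit."""
    rng = np.random.default_rng(seed)
    z = np.full(n_paths, x0, dtype=float)
    value = np.zeros(n_paths)
    disc = 1.0
    for _ in range(int(T / dt)):
        value += disc * pi(z) * dt                      # accrue the profit flow
        z += -g * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
        hit = (z <= s) | (z >= S_bar)                   # a trigger point is reached
        value -= disc * B * hit                         # pay the fixed cost B
        z[hit] = S_ret                                  # jump to the return point
        disc *= np.exp(-r * dt)
    return value.mean()

pi = lambda z: -z ** 2                                  # illustrative concave profit
v = band_policy_value(x0=0.0, s=-0.3, S_ret=0.05, S_bar=0.4,
                      g=0.02, sigma=0.1, r=0.05, B=0.1, pi=pi)
```

The horizon T truncates the infinite integral; with r = 0.05 the discarded tail is discounted by e^{−rT} and is negligible here.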
The following theorem provides two important results. First, it proves the policy p̂ constructed above is indeed the optimal policy, as it maximizes V(p; x) over all possible impulse control policies p. Second, it specifies sufficient conditions for the candidate value function u(x) to equal the value of the optimal impulse control policy V(p̂; x). The theorem is a type of principle-of-optimality result, as it allows us to go from the sequential problem, which is given by maximizing V(p; x) over all possible sequences of return points and stopping times, to a recursive formulation dictated by a value function. (All proofs are contained in Appendix A.)
Theorem 1. If there exists a value function u(x) that satisfies the following four conditions:

(P1) u(x) is bounded in absolute value and continuous;
(P2) u′(x) is bounded in absolute value and continuous;
(P3) u(x) ≥ sup_y {u(y) − B} for all x;
(P4) ru(x) + gu′(x) − (1/2)σ²u″(x) = π(x) for all x ∈ I,

then the policy p̂ is the optimal impulse control policy, and u(x) attains its associated value. That is, u(x) = V(p̂; x) ≥ V(p; x) for all p.
We now proceed with constructing a value function V (x) which will ultimately satisfy
the sufficient conditions (P1)–(P4) of Theorem 1. Following Dixit (1991), we postulate
that the optimal impulse control policy is a control band, so we shall derive the value
function associated with following such a policy. Because we have a fixed cost of price
adjustment which does not depend on the amount of control exercised, the band we choose
is characterized by two barrier points and a single return point, such that s < S < S̄. That is, when the relative price hits either s or S̄, the firm instantly jumps to S. For our problem, then, the inaction region corresponds to the open interval (s, S̄). As Dixit showed, if there
is a proportional cost of adjustment, then the return points will be different, since in that
case the marginal cost of adjustment has to be taken into consideration.
We begin by deriving the continuous-time Bellman equation of the firm’s problem. In
the absence of control, the relative price follows (1). Using Ito’s Lemma and standard
methods, the Bellman equation is given by
rV(x) + gV′(x) − (1/2)σ²V″(x) = π(x).  (3)
By construction, the Bellman equation holds only over the inaction region. For every sample path
of the Brownian motion, the Bellman equation is an ordinary differential equation (ODE)
(Karatzas and Shreve, 1991). Thus, standard calculus techniques may be used in solving
(3). The general solution hence takes the following form:
V(x) = V_P(x) + c_1 e^{R_1 x} + c_2 e^{R_2 x},  (4)
where V_P(x) is a particular solution of (3), c_1, c_2 are constants to be determined by boundary conditions, and the roots solve the standard characteristic equation of the ODE: R_1 = (1/σ²)[g + √(g² + 2rσ²)] and R_2 = (1/σ²)[g − √(g² + 2rσ²)]. Since the Bellman ODE only holds over the inaction region (s, S̄), the same holds true for the general solution (4).
Recognizing the Bellman ODE as a representation of the famous Cauchy problem, a particular solution of (3) is given by the Feynman–Kac formula (Theorem 7.6 of Karatzas and Shreve, 1991, p. 366). It is equal to the expected discounted value of profits in the absence of control:
V_P(x) = E_x { ∫_0^∞ e^{−rt} π(x(t)) dt }.  (5)
This is the particular solution used by Dixit. Harrison (1985, p. 45) showed that it is equivalent to the following, whereby we integrate over states instead of time:
V_P(x) = (2 / (σ²(R_1 − R_2))) [ e^{R_1 x} ∫_x^∞ π(y) e^{−R_1 y} dy + e^{R_2 x} ∫_{−∞}^x π(y) e^{−R_2 y} dy ].  (6)
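Representation (6) can be verified numerically. For a quadratic profit π(y) = −(y − m)², the time-domain integral (5) has a closed form, since E_x[(x(t) − m)²] = (x − gt − m)² + σ²t can be integrated against e^{−rt} term by term; the quadratic profit and the parameter values below are illustrative assumptions:

```python
import numpy as np
from scipy.integrate import quad

g, sigma, r, m = 0.02, 0.1, 0.05, 0.0            # illustrative parameters
root = np.sqrt(g**2 + 2*r*sigma**2)
R1, R2 = (g + root) / sigma**2, (g - root) / sigma**2

pi = lambda y: -(y - m) ** 2                     # strictly concave, single-peaked

def VP_states(x):
    """Particular solution via (6): integrate the profit over states."""
    up = quad(lambda y: pi(y) * np.exp(-R1 * y), x, np.inf)[0]
    dn = quad(lambda y: pi(y) * np.exp(-R2 * y), -np.inf, x)[0]
    return 2.0 / (sigma**2 * (R1 - R2)) * (np.exp(R1 * x) * up + np.exp(R2 * x) * dn)

def VP_time(x):
    """Particular solution via (5): E_x[pi(x(t))] = -[(x - g*t - m)^2 + sigma^2*t]
    integrated against exp(-r*t), using ∫ t^k e^{-rt} dt = k!/r^{k+1}."""
    d = x - m
    return -(d**2 / r - 2 * g * d / r**2 + 2 * g**2 / r**3 + sigma**2 / r**2)
```

The two representations agree to quadrature precision, illustrating Harrison's equivalence between integrating over time and integrating over states.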
This alternative representation can also be derived directly using the calculus method of
variation of parameters (Edwards and Penney, 1993, p. 171). The integral in (5) is not
necessarily finite. To guarantee it exists, we must first check whether the drift and variance
terms in (1) satisfy Lipschitz conditions; in our case, they do since they are constants
(Karatzas and Shreve, 1991, p. 289). The second requirement is that the profit function be
bounded in absolute value by a polynomial. Alternatively, we could have the profit function
be non-negative. Since we wish to allow for profit functions that take on negative values
(as they frequently appear in menu cost models and in models of investment with adjustment costs), we assume the former, together with other conditions, to which we now turn.
Throughout the remainder of this paper, we assume the profit function satisfies the following three conditions:

(A1) π and π′ are continuous;
(A2) π is strictly concave and single-peaked (unimodal);
(A3) |π|, |π′|, and |π″| are bounded by a polynomial.
Our assumptions are generally consistent with the existing literature. In menu cost models,
it seems that concavity and some form of boundedness assumptions are needed to ensure optimality of a control band policy. Sheshinski and Weiss (1983) required the profit
function to be quasi-concave. Similarly, Scarf (1959) assumed the inventory holding cost
function was convex, which translates to a concave profit function in our model. In a
model with geometric Brownian motion, Danziger (1983) assumed the profit function was
a strictly concave polynomial. Most papers, such as Caplin and Leahy (1997) and Tsiddon
(1993), assume the profit function is quadratic, which is allowed by (A1)–(A3).
The goal is to construct a value function that will satisfy the conditions stipulated by
Theorem 1. The assumptions (A1)–(A3) reflect this strategy taking into account that properties imposed on the profit function are inherited by the value function. In order to apply
Theorem 1, and thus demonstrate that the control band policy is optimal, the value function
and its derivatives must exist and be continuous and bounded in absolute value; moreover,
the constants c1 , c2 of the homogeneous solution of the Bellman equation must be strictly
positive (which we argue below demonstrates the type of control being exercised is beneficial to the firm). We prove (A1)–(A3) is a sufficient set of conditions for the value function
to have these properties. Nevertheless, if the chosen profit function does not satisfy assumptions (A1)–(A3), yet its associated value function meets the stipulated conditions,
then the control band policy is the optimal policy against all alternatives.
Continuity of the profit function and its derivative ensures continuity of the value function and its derivative, which is required for various reasons. First, it enables the usage of
Ito’s Lemma in proving Theorem 1. Second, it allows one to prove that the value function
and its derivative are bounded, which is also required by Theorem 1 according to (P1) and
(P2). Third, it is used to prove the value function has an extremum point. When combined
with the assumption that the profit function is single-peaked (implying the value function
has this property), we show the extremum point is the unique return point of the control
band policy and the unique maximum of the value function.
Assuming the profit function is strictly concave ensures the value function is strictly
concave. We utilize this property to prove the constants c1 , c2 of the homogeneous solution
of the Bellman equation are strictly positive. Quasi-concavity would not yield the desired
outcome. The assumption that the profit function and its derivative (and thus the value
function and its derivative) are bounded in absolute value by a polynomial is also used to
prove the constants c1 , c2 are strictly positive. Moreover, assuming the profit function and
its first two derivatives are bounded in absolute value by a polynomial ensures the value
function and its first two derivatives exist. There are other ways to guarantee the constants
c1 , c2 are strictly positive and the value function and its derivatives exist, so theoretically
one could extend or modify these assumptions to suit the situation at hand.
A number of cases do not satisfy the stated assumptions (A1)–(A3), though the proof
can be modified to accommodate most cases. Having a linear profit function obviously violates (A2). The inventory model of Harrison et al. (1983) is a cost minimization problem
wherein holding costs are linear for positive inventory holdings and infinite for negative
holdings, so this special case also does not satisfy our assumptions. Bar-Ilan and Sulem
(1995) fails the differentiability condition. Together with Harrison et al., the papers share
the feature that they have a single kink in the payoff function at zero. As long as the
value function is continuous, differentiable (with continuous derivatives), strictly concave,
single-peaked, and bounded in absolute value by a polynomial (together with its derivatives), then the proof of optimality still applies (without having to check that c1 , c2 are
strictly positive).
We now derive the familiar value-matching conditions associated with the {s, S, S̄} control band policy. By definition, the net return to jumping to a point is equal to the value function evaluated at that point minus the fixed cost B of exercising control. It follows that if the choice of trigger and return points is optimal, the net return to jumping must equal the value at the trigger points: V(s) = V(S̄) = V(S) − B. Since control is exercised outside the inaction region (s, S̄), the value function is constant at V(s) = V(S̄) everywhere outside that open interval, so we may extend our value function over the whole real line by defining it as V(x) = V(s) = V(S̄) for x < s and x > S̄.
In addition to the value-matching conditions, we have the smooth-pasting conditions at the barrier and return points. They can be derived from the following arbitrage argument using the random walk approximation of a Brownian motion (Cox and Miller, 1965). Suppose the relative price is at the lower trigger point s. We shall compare the payoff to waiting at s, yielding an expected return of R, versus jumping to state S. The two returns should be equal if the control band policy was chosen optimally. With probability p = (1/2)[1 − (g/σ)√∆t] the process jumps up, where ∆t is the length of the time interval and h = σ√∆t is the size of the jump, and with probability q = 1 − p it jumps down, at which point control is exercised instantly. So the expected return to waiting at s is given by R = π(s)∆t + (pV(s + h) + q[V(S) − B])/(1 + r∆t). Approximate V(s + h) with a first-order Taylor expansion around s: V(s + h) = V(s) + V′(s)h. By the value-matching conditions, we have V(s) = V(S) − B; dropping terms of higher order, the return becomes R = V(S) − B + (1/2)V′(s)h. The net return to jumping to S is simply V(S) − B. Thus, for the two returns to be equal, we require that (1/2)V′(s)h = 0, that is, V′(s) = 0. Doing the same calculation for the upper barrier, we get the smooth-pasting conditions at the trigger points: V′(s) = V′(S̄) = 0. Dixit derived the condition at the return point to be V′(S) = 0, which we shall see later follows from the first-order condition of a maximization problem.
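The binomial approximation underlying this argument matches the Brownian increments' mean and variance to first order in ∆t, which is what makes the arbitrage comparison valid; a quick arithmetic check with illustrative values:

```python
import numpy as np

g, sigma, dt = 0.02, 0.1, 1e-4                      # illustrative step size
h = sigma * np.sqrt(dt)                             # jump size
p = 0.5 * (1 - (g / sigma) * np.sqrt(dt))           # probability of an up jump
q = 1 - p

mean_step = p * h + q * (-h)                        # equals -g*dt exactly
var_step = p * h**2 + q * h**2 - mean_step**2       # sigma^2*dt - (g*dt)^2
```

The step mean reproduces the drift −g dt exactly, and the variance matches σ²dt up to a term of order (dt)², which vanishes in the continuous-time limit.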
The value-matching and smooth-pasting conditions determine the optimal control band policy by providing the five equations needed to solve for the three policy parameters {s, S, S̄} and the two constants of the homogeneous solution c_1, c_2. We now obtain expressions for the constants as a function of the policy parameters using the hitting-time equations in Harrison, in which case one just needs to use the smooth-pasting conditions to solve the system. Let T be the first time the process x(t) hits either barrier s or S̄. Since the value function V(x) satisfies the Bellman ODE (3), we can apply Proposition 2 of Harrison (p. 74):
V(x) = E_x { ∫_0^T e^{−rt} π(x(t)) dt } + E_x { e^{−rT} V(x_T) }.  (7)
This is a type of Bellman equation, as the value today is decomposed into its two components: the expected discounted value up to the hitting time (or the value over the inaction region), and the expected discounted terminal value once the process has hit either barrier. Define the first component as V_{PP}(x) ≡ E_x { ∫_0^T e^{−rt} π(x(t)) dt }. Then by Proposition 3 of Harrison (p. 49):
V_{PP}(x) = V_P(x) − V_P(s)ψ_2(x) − V_P(S̄)ψ_1(x),  (8)
where
ψ_1(x) = [θ_1(x) − θ_2(x)θ_1(s)] / [1 − θ_2(S̄)θ_1(s)],
ψ_2(x) = [θ_2(x) − θ_1(x)θ_2(S̄)] / [1 − θ_2(S̄)θ_1(s)],
θ_1(x) = e^{R_1(x−S̄)}  and  θ_2(x) = e^{R_2(x−s)}.
It can be shown that V_{PP}(x) satisfies the Bellman ODE, so it is also a particular solution. Moreover, as we shall see, we may interpret V_{PP}(x) as the choice of particular integral which takes on the value zero when x = s or x = S̄. Using the above definitions, we find that E_x{e^{−rT} V(x_T)} = V(s)ψ_2(x) + V(S̄)ψ_1(x), leading to the following expression for the value function:
V(x) = V_P(x) + ψ_2(x)[V(s) − V_P(s)] + ψ_1(x)[V(S̄) − V_P(S̄)].  (9)
(In effect, (9) is the particularization of (4) in which the arbitrary constants c_1 and c_2 have been replaced by "new" arbitrary constants V(s) and V(S̄).) Alternatively, using only the value-matching condition V(s) = V(S̄), we have
V(x) = V_{PP}(x) + V(s)[ψ_1(x) + ψ_2(x)].  (10)
Note further that rψ_1(x) + gψ_1′(x) − (1/2)σ²ψ_1″(x) = rψ_2(x) + gψ_2′(x) − (1/2)σ²ψ_2″(x) = 0. Therefore, since both V_P(x) and V_{PP}(x) satisfy the Bellman ODE, so does V(x) as constructed in (9) and (10). It remains to define V(s) and V(S̄) using our newly created value function. Because ψ_1(S̄) = ψ_2(s) = 1 and ψ_1(s) = ψ_2(S̄) = 0, evaluating V(s) and V(S̄) using either (9) or (10) leads to identities since V(s) = V(S̄). For this reason, we use the other value-matching condition, V(S) − B = V(s) = V(S̄), which yields V(s) = V(S̄) = (V_{PP}(S) − B) / (1 − ψ_1(S) − ψ_2(S)).
Given a profit function, suppose one wanted to solve this model to find the optimal
barrier and return points. There are two ways of doing so. First, one could treat the constants of the homogeneous solution as unknowns, yielding a total of five variables to be
calculated. The smooth-pasting and value-matching conditions would provide the necessary five equations to solve the system. The second approach involves using the equations
derived above. Since these gave us explicit formulas for the constants as a function of the
policy, we would then have just three variables to solve for; the three equations to be used
would simply be the smooth-pasting conditions. Indeed, given a control band policy, one
can solve for the constants using the value-matching conditions (Dixit, 1991).
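As noted, given a control band {s, S, S̄}, the two value-matching conditions pin down c_1 and c_2 as the solution of a linear system. The sketch below does this for an illustrative quadratic profit (which admits a closed-form particular solution); the band itself is an arbitrary assumption here, not the optimum, which would additionally impose the three smooth-pasting conditions:

```python
import numpy as np

g, sigma, r, B, m = 0.02, 0.1, 0.05, 0.1, 0.0
root = np.sqrt(g**2 + 2*r*sigma**2)
R1, R2 = (g + root) / sigma**2, (g - root) / sigma**2

def VP(x):
    """Closed-form particular solution for pi(y) = -(y - m)^2."""
    d = x - m
    return -(d**2 / r - 2 * g * d / r**2 + 2 * g**2 / r**3 + sigma**2 / r**2)

s, S, S_bar = -0.3, 0.05, 0.4                  # an illustrative (not optimal) band
# Value-matching: V(s) = V(S_bar) and V(S) - B = V(s), where
# V(x) = VP(x) + c1*exp(R1*x) + c2*exp(R2*x); this is linear in (c1, c2).
A = np.array([
    [np.exp(R1 * s) - np.exp(R1 * S_bar), np.exp(R2 * s) - np.exp(R2 * S_bar)],
    [np.exp(R1 * S) - np.exp(R1 * s),     np.exp(R2 * S) - np.exp(R2 * s)],
])
b = np.array([VP(S_bar) - VP(s), B - VP(S) + VP(s)])
c1, c2 = np.linalg.solve(A, b)
V = lambda x: VP(x) + c1 * np.exp(R1 * x) + c2 * np.exp(R2 * x)
```

To recover the optimal band one would wrap this linear solve inside a root-finder over (s, S, S̄) that zeroes the smooth-pasting conditions V′(s) = V′(S) = V′(S̄) = 0.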
3. Optimality of the control band policy
Our goal is to prove that the {s, S, S̄} control band policy described in the previous section is the optimal policy out of all possible impulse control policies. By virtue of Theorem 1, to achieve this, we must show the value function V(x) constructed above satisfies
the four conditions (P1)–(P4). Before this can be done, some preliminary results are required. Specifically, we show in the next theorem that most of the properties assumed for
the profit function carry over to the particular solution. Though this type of result is commonly and easily obtained in discrete time Bellman equations, it is by no means immediate
in continuous time.
Theorem 2. V_P(x) and V_P′(x) are continuous and bounded in absolute value by a polynomial; V_P(x) and V_{PP}(x) are strictly concave; moreover, V_{PP}(x) is single-peaked with a unique maximum that lies within the inaction region.
Recall that VP (x) represents the value of the firm when no control is exercised. Hence,
we may interpret the homogeneous solution of the Bellman ODE as the value of exercising
control. Dixit argued heuristically that because of this, the constants of the homogeneous
solution must be positive. This property does not follow directly from our construction
of the value function. Rather, it is akin to making a statement regarding the optimality
of following such a policy. The following theorem proves Dixit’s intuition to be correct.
However, the key ingredient in proving this result follows from the assumed strict concavity
and boundedness of the profit function, which by Theorem 2 led the particular solution
to have the same properties. Moreover, since we assumed the profit function is strictly
concave and single-peaked, it follows that it has a unique maximum m. The following
theorem also proves m lies within the inaction region. The intuition for this result is quite
obvious: in order to maximize the value function, the flow of profits over the inaction
region, VP P (x), must be maximized.
Theorem 3. The constants of the homogeneous solution c1 , c2 are strictly positive; moreover, the global maximum m of the profit function lies within the inaction region.
The next theorem proves the value function is bounded from below by V(S) − B over the whole real line. Outside the inaction region, control is exercised so that V(x) = V(s) = V(S̄), which equals V(S) − B by the value-matching conditions. Thus, the result is trivial outside the inaction region. However, it also holds within the inaction region; the motivation for this result follows from an arbitrage argument. Suppose the firm is at a point x ∈ (s, S̄). The return to waiting at x is simply V(x). The net return to jumping to S is
V (S) − B. If the latter exceeds the former, then it is not optimal to remain at x, but rather
to exercise control and jump to S. Since x was chosen to be in the interior of the inaction
region, it follows that the choice of barrier points was not optimal. Hence, it must be the
case that V(x) ≥ V(S) − B for all x in the inaction region (s, S̄). The theorem also proves that S is the unique maximum of the value function, which allows us to represent S as follows: S = arg max_x {V(x) − B}. The intuition for this representation should be quite clear. If S was chosen correctly, it must be the optimal point to jump to, so that the net return of exercising control is maximized. This leads to the first-order condition V′(S) = 0, which is of course one of the smooth-pasting conditions.
Theorem 4. V(x) is continuous, single-peaked with a unique maximum S, and bounded from below by V(S) − B; moreover, V′(x) is continuous and bounded.
We now proceed with the main result of this paper. In deriving the value function, we
obtained the value-matching and smooth-pasting conditions, which determine the optimal
choice of trigger and return points for the control band policy. However, this does not mean
that the type of policy is optimal. The assumed cost structure would tend to imply that a
control band is the best choice, but this has not yet been proven. Theorem 1 provided us
with a set of testable conditions to show precisely this since it considered an arbitrary
impulse control policy. In turn, Theorem 5 shows the value function associated with a
control band policy satisfies these conditions. Theorems 2 through 4 did all the legwork,
so the proof of Theorem 5 merely summarizes the findings.
Theorem 5. The {s, S, S̄} control band policy is the optimal policy out of all possible
impulse control policies.
From an empirical standpoint, the menu cost model can be used to estimate the cost of
inflation. By varying the drift rate of the Brownian motion process, one could evaluate the
amount by which the value of the firm changes. The steady-state average value function
would be a good choice as a metric since it incorporates the long-run effects of the firm’s
policy. The following theorem derives the unconditional probability density function of
the controlled relative price; integrating the value function against this density yields the
average value function. (Bertola and Caballero (1990) already calculated the unconditional
distribution of the control variable when an agent follows an (s, S) policy in a slightly
different context.) To analyze the effect of changes in the variance of aggregate prices, one
could use the same metric.
To study the impact of inflation and the variability of prices at the firm level, the literature has typically estimated the effect of changing the underlying parameters on the firm’s
choice of trigger and return points, and in particular the width of the control band. However, it has been found that the width of the control band is an imperfect measure of the
frequency of adjustment. A more meaningful approach would involve looking at the effect
of parameters on the steady-state average and variance of the firm’s price. If we seek to
estimate the frequency of price adjustments, then instead of using the width of the control
band, a more appropriate metric is the average waiting time in the inaction region, which
is also derived in the following theorem.
Theorem 6. The unconditional distribution of the controlled relative price z is given by
A0 + C0 e^{αz} for z ∈ [s, S) and A1 + C1 e^{αz} for z ∈ [S, S̄], where −α = 2g/σ², A0 =
−C0 e^{αs}, A1 = −C1 e^{αS̄}, and

\[
C_0 = \frac{e^{\alpha \bar S} - e^{\alpha S}}{(S - s)e^{\alpha s}\big[e^{\alpha S} - e^{\alpha \bar S}\big] + (\bar S - S)e^{\alpha \bar S}\big[e^{\alpha S} - e^{\alpha s}\big]},
\qquad
C_1 = \frac{e^{\alpha s} - e^{\alpha S}}{(S - s)e^{\alpha s}\big[e^{\alpha S} - e^{\alpha \bar S}\big] + (\bar S - S)e^{\alpha \bar S}\big[e^{\alpha S} - e^{\alpha s}\big]}.
\]

Moreover, the expected waiting time in the inaction region is given by [x − pS̄ − (1 −
p)s]/g, where p = (e^{−αx} − e^{−αs})/(e^{−αS̄} − e^{−αs}) and s < x < S̄.
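As a numerical illustration, the formulas of Theorem 6 can be evaluated directly. The sketch below uses purely illustrative values for g, σ², and the band points s < S < S̄ (they are not the paper's calibrated figures), and checks that the two-piece density vanishes at the barriers and integrates to one over [s, S̄].

```python
import math

# Illustrative values only (not the paper's calibration).
g, sigma2 = 0.05, 0.15
s, S, S_bar = 0.1, 0.5, 0.9

alpha = -2.0 * g / sigma2            # Theorem 6: -alpha = 2g/sigma^2

den = ((S - s) * math.exp(alpha * s) * (math.exp(alpha * S) - math.exp(alpha * S_bar))
       + (S_bar - S) * math.exp(alpha * S_bar) * (math.exp(alpha * S) - math.exp(alpha * s)))
C0 = (math.exp(alpha * S_bar) - math.exp(alpha * S)) / den
C1 = (math.exp(alpha * s) - math.exp(alpha * S)) / den
A0 = -C0 * math.exp(alpha * s)       # density vanishes at the lower barrier s
A1 = -C1 * math.exp(alpha * S_bar)   # density vanishes at the upper barrier S_bar

def density(z):
    """Two-piece unconditional density of the controlled relative price."""
    if s <= z < S:
        return A0 + C0 * math.exp(alpha * z)
    if S <= z <= S_bar:
        return A1 + C1 * math.exp(alpha * z)
    return 0.0

# Midpoint-rule check that the density integrates to one over [s, S_bar].
n = 100_000
h = (S_bar - s) / n
total = sum(density(s + (i + 0.5) * h) for i in range(n)) * h

def expected_wait(x):
    """Expected waiting time in the inaction region, starting from x."""
    p = (math.exp(-alpha * x) - math.exp(-alpha * s)) / \
        (math.exp(-alpha * S_bar) - math.exp(-alpha * s))
    return (x - p * S_bar - (1 - p) * s) / g

print(total)               # should be close to 1
print(expected_wait(S))    # average spell of inaction, starting from the return point
```

The reciprocal of the expected waiting time starting from the return point S is the model's average frequency of price adjustment, the statistic compared with Blinder's survey evidence in Section 4.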
4. A quadratic example
This section demonstrates how a simple quadratic menu cost model can replicate fairly
accurately the pricing behavior of firms in various types of industries. We are also interested in obtaining some comparative statics results to understand how a firm’s pricing
behavior adapts to changes in the environment, as measured by the interest rate, cost of
adjustment, and rate of inflation, for example. Theoretically such results generally depend
on the chosen functional form for the profit function. However, after having calibrated numerous examples, it became clear that the functional form affects merely the magnitude,
and not the direction, of the changes.
The following are some of our key findings. First, as in Bertola and Caballero (1990),
the steady state price of the firm (i.e. the unconditional mean of the controlled relative
price, calculated using the equations in Theorem 6) is very close to the maximum of its
profit function. In the absence of menu costs, the firm would perpetually remain at its
profit-maximizing price. With adjustment costs, it exercises control in such a fashion that
it remains at that price on average. Second, the control band policy significantly reduces
the variance of the price, at times by an order of magnitude. This occurs because of our
third finding, that the variance of the relative price, and not the presence of inflation, is
most costly to the firm. Finally, in the baseline model, the firm adjusts its price on average
1.44 times per year, which is consistent with empirical estimates.
In any calibration experiment for this model, the most difficult question involves deciding the magnitude of the fixed cost. We begin by normalizing a unit of time to equal one
year. Given this normalization, we opted to represent the fixed cost as a percentage of the
maximum level of profits. With this choice, we require a baseline percentage in order to
determine the magnitude of the fixed cost. The paper by Levy et al. (1997) is a useful reference for our purposes. The authors studied the pricing behavior of five large supermarket
chains and directly measured the menu costs associated with changing their product prices.
They found that, on average, the menu costs amounted to 0.70% of yearly revenues, or 35%
of yearly profits. The menu costs were large as a percentage of profits since the net profit
margins of these stores lie between 1 and 3%. The low profit margins followed from the
intense competition characteristic of that industry. We do not expect most industries to
have such low profit margins and hence such large menu costs relative to profits. So as an
approximation, we may consider the 35% fixed cost rate to be a type of upper bound.
A crucial metric of the model is the average frequency of price adjustments. Indeed,
the most important property associated with menu cost models relates to the observed
staggered pricing of firms; that is, even though average prices may be moving continuously,
a single firm does not change its price continuously, but rather in discrete time intervals,
and typically in large relative amounts. Naturally, we postulate this pattern arises due to
the presence of menu costs. To test the model, we compare the predicted frequency of
price adjustments to those measured by Blinder (1991). The author interviewed 72 firms
in a wide variety of industries, asking them about the frequency of their price adjustments
and their reasons for doing so. He found that 37.7% changed their prices once a year,
20.3% between once and twice, 24.5% more than twice, and 17.4% less than once per
year. Therefore, a total of 58% changed their prices either once or twice, or in between.
We consider the simple case when the profit function is linear-quadratic of the form
π(x) = ax − (1/2)bx², where a > 0, b > 0. Using the standard calculus method of undetermined coefficients, we find that a particular solution of the Bellman ODE is given by:

\[
V_P(x) = -\frac{b}{2r}x^2 + \left(\frac{a}{r} + \frac{bg}{r^2}\right)x - \left(\frac{\sigma^2 b}{2r^2} + \frac{ag}{r^2} + \frac{bg^2}{r^3}\right). \tag{11}
\]
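Equation (11) can be checked mechanically: substituting VP into the Bellman ODE rV(x) = π(x) − gV′(x) + (1/2)σ²V″(x) should give an identity in x. The sketch below does exactly this at a few points, using arbitrary illustrative parameter values.

```python
# Verify that the particular solution (11) satisfies the Bellman ODE
#   r*V(x) = pi(x) - g*V'(x) + (1/2)*sigma^2*V''(x)
# for the quadratic profit pi(x) = a*x - (b/2)*x^2.
# Parameter values here are illustrative only.
a, b, g, sigma, r = 1.0, 2.0, 0.05, 0.4, 0.10

def VP(x):
    return (-(b / (2 * r)) * x**2
            + (a / r + b * g / r**2) * x
            - (sigma**2 * b / (2 * r**2) + a * g / r**2 + b * g**2 / r**3))

def VP1(x):          # VP'(x)
    return -(b / r) * x + a / r + b * g / r**2

VP2 = -b / r         # VP''(x) is constant

def pi(x):
    return a * x - 0.5 * b * x**2

for x in (-1.0, 0.0, 0.5, 2.0):
    lhs = r * VP(x)
    rhs = pi(x) - g * VP1(x) + 0.5 * sigma**2 * VP2
    assert abs(lhs - rhs) < 1e-12, (x, lhs, rhs)
print("ODE satisfied at all test points")
```

Because the identity holds coefficient by coefficient (in x², x, and the constant), the assertion passes exactly, up to floating-point rounding, at every x.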
We set a = 1 and b = 2, so that the profit function becomes x − x², yielding an argmax of
0.5. The following four parameters fully describe our model: the trend component g (inflation),
the variance component σ², the fixed cost percentage, and the interest rate r. These are
varied between low, middle, and high values. Recall that we consider 35% to be an upper
bound for the fixed cost percentage, so this is taken as the high value. The middle value is
20%, and the low one 5%. We would like the inflation and variance combinations to reflect
the cases of various countries or industries, so the low inflation rate is set at 5%, the middle
one at 15%, and the high one at 50%. Similarly, the variance rates are chosen as 15, 25,
and 50%. Finally, we allow r to vary between 10, 15, and 30%. A typical industry could
hence be represented by an inflation rate of 5%, variance of 15%, and interest rate of 10%
(all the low values).
We seek to track the values of the following seven variables. First, to describe the optimal control band policy, we measure the width of the control band S̄ − s, and calculate
the value of the return point S. These are then compared to the maximum of the profit
function m and the unconditional expectation of the controlled price (using the steady state
density), mean. Second, we wish to compare the variance parameter σ² to its steady state
counterpart, the unconditional variance of the controlled price. This will allow us to evaluate by how much controlling the price reduces the resulting price variability.
So as to describe the average value of the firm, we calculate the unconditional expectation of the value function, evalue, using the steady state density. The average frequency of
price adjustments is then calculated using results from Theorem 6. Finally, to evaluate the
steady state gain of exercising control, we calculate the unconditional expectation of the
homogeneous solution.
In all 81 cases, the mean was found to be very close to 0.5. The lowest value was
0.49, and the highest 0.50. Recall that for our profit function, 0.5 was set to be its argmax.
Since the firm would prefer to be at its profit-maximizing level, it chose a control band
policy that led its mean to be as close as possible to 0.5. When the parameters of the profit
function a and b were varied, the mean would always adjust to approximately equal the
argmax. In general, the expected variance was very small. It became as low as 1.78%, and
reached a maximum of 8.83%. The effect of the control band policy is therefore not only
to maintain the relative price near the argmax, but also to lower significantly the resulting
price variability.
Perhaps the most interesting result of these simulations follows from the calculated frequency of price adjustments per year. The model predicts that in an environment governed
by 5% inflation, 15% variance, 10% interest rate, and 5% fixed cost, a firm will change
its price on average 1.44 times per year. This quantity is representative of the estimates
obtained by Blinder, since he found that the majority of firms interviewed adjusted their
prices between once and twice per year. The highest frequency, equal to 3.00, occurred with
50% inflation, 50% variance, 10% interest rate, and 5% fixed cost. The smallest frequency
of 0.55 occurred with 5% inflation, 15% variance, 30% interest rate, and 35% fixed cost.
Moreover, the frequency is very sensitive to the fixed cost. As the fixed cost is increased
from 5 to 20%, it tends to fall by half, and when it rises from 20 to 35%, it falls by about
20%. Our figures are also very similar to those calculated by Sheshinski and Weiss (1977),
who obtained frequencies of about 1.3 in their deterministic model.
The expected gain is our measure of the steady state gain of exercising control. We
found this to be very large relative to the expected value. It was as much as 1000 times larger
in the extreme case when inflation and variance were both at 50%, and on average it was
between 20 and 50 times larger, suggesting the gains from exercising control are considerable, and perhaps much larger than one would expect. Analyzing the expected value sheds
some light on the costs associated with inflation, uncertainty, and the menu cost. When
the variance is 15%, interest rate 10%, and fixed cost 5%, going from 5 to 15% inflation
leads to a fall in evalue of only 0.3%. However, if the fixed cost is 35%, then the expected
value falls 2.8%. If inflation is 5%, interest rate 10%, and fixed cost 35%, going from 15%
variance to 25% reduces evalue by 17.8%, which is very large. If the fixed cost is 5%, the
reduction is only 4.8%. These calculations suggest it is the variance of the relative price,
and not the presence of inflation, which is most costly to the firm. Of course, the fixed cost
has a very large effect on evalue. In the baseline model, going from 5 to 35% reduces the
expected value by 27.9%. Moreover, if the variance is 25%, the fall increases to 37.8%. It
seems clear then that the variance drives the fall in value. Similarly, increasing the interest rate has a very large cost, as would be expected. For example, in the baseline model,
increasing it by 5% reduces evalue by 33.3%.
The width of the control band is positively related with all the parameters. The positive
effect of inflation and the fixed cost on the width coincides with the result in Sheshinski and
Weiss (1977), implying the introduction of uncertainty does not change these relationships.
It is well known that the width of the band is an imperfect measure of the frequency of price
adjustments. We confirm this to be the case. When the band widens, one might expect
the frequency to fall since it will take longer to reach a barrier point. However, the two
move in opposite directions only in response to changes in the interest rate and the fixed
cost. The expected variance is positively related to all the parameters, so the width of the
control band is a better measure of the expected variance than of the frequency of
price adjustments. In fact, the correlation between the band and expected variance is 0.99,
whereas that between the frequency and band is −0.48, suggesting the width of the band
is a measure of dispersion.
5. Conclusion
This paper lays out the groundwork for further theoretical and empirical research. We
assumed the cost of price adjustment is fixed and not proportional to the amount of the price
change. The theorems presented here would require only slight modification to incorporate
this alternative cost structure; moreover, we believe the required assumptions on the profit
function would remain the same. The smooth-pasting conditions will be different: instead
of setting derivatives of the value function equal to zero at the trigger points, they will equal
the marginal cost of price adjustment. Dixit (1991) derived these conditions when the cost
function depends linearly on the amount of control exercised.
The theory behind this menu cost model has been fully developed, so it would be a good
starting point for a macroeconomic model (Caplin and Leahy, 1991, 1997; Tsiddon, 1993;
Caplin and Spulber, 1987). With such models, one must first assume or prove that a control
band policy is optimal at the firm level. This paper allows one to bypass that step if the
chosen profit function satisfies the stipulated sufficient conditions.
An interesting variation of this model would be to consider the effect of having the firm
maximize the long-run average of expected discounted profits net of adjustment costs. This
would imply that instead of maximizing the value function, the firm would be maximizing
the average (over relative prices) of the value function using the steady state density. The
firm’s choice of trigger and return points would be different, since the steady state density
depends on these points. Comparing the optimal control band policy of the current model
to that of this alternative setup could shed some light on the long-run versus short-run
effects of aggregate uncertainty and inflation on the firm’s policy and value.
The simulations reported are perhaps representative of menu cost models. An empirical study could then compare the theoretical predictions to those found in the data more
rigorously than was done here. The most difficult aspect arises from the lack of data on
menu costs, so that the veracity of the model cannot easily be tested directly. However, the
book by Blinder et al. (1998) reports very detailed survey data on observed frequencies, so
the argument could be turned around to answer the following question. Suppose the menu
cost model is an accurate representation of an economy or industry. Given the observed
frequency of price adjustments, how large must the fixed cost be in order to generate those
frequencies?
Acknowledgments
I thank Nancy Stokey and John Leahy for helpful comments. Financial support from the
National Research Council is gratefully acknowledged.
Appendix A
Proof of Theorem 1. The proof is a modification of the work by Richard (1977) who
considered the parallel cost minimization problem. Similar results can be found in Theorem 4.1 of Fleming and Rishel (1975, p. 159), Constantinides and Richard (1978), and
Propositions 2.13 and 2.18 in Harrison et al. (1983).
To prove the result, it is helpful to slightly modify some of our earlier notation. We
now define a general impulse control policy p as a sequence of stopping times τ and
a sequence of corresponding jumps J , such that p = {τ0 , J0 ; τ1 , J1 ; . . .}. As before, we
initialize without loss of generality τ0 = 0. The controlled process z(t) associated with p
still follows dz(t) = −g dt + σ dw for all τi ≤ t < τi+1. However, now we have z(τi) =
z(τi⁻) + Ji, where τi⁻ is the time immediately before control is exercised the ith time.
We then set z(0) = x. The optimal policy p̂ is constructed in a similar fashion: dẑ(t) =
−g dt + σ dw for all τ̂i ≤ t < τ̂i+1; ẑ(τ̂i) = ẑ(τ̂i⁻) + Ĵi; and ẑ(0) = x. Finally, we must
construct a jump function J(x) which describes the optimal jump size when in state x,
using the candidate value function u(x) to evaluate states. The function is simply defined
by u(x + J(x)) − B ≡ sup_{j≠0} {u(x + j) − B}. The optimal jump at the ith stopping time
will thus equal Ĵi = J(ẑ(τ̂i⁻)).
To simplify the exposition, the proof is broken up into three lemmas. The first derives
a useful result based on Ito's Lemma. The second shows u(x) ≥ V(p; x). The third shows
u(x) = V(p̂; x). Put together, the three lemmas prove the desired result. For simplicity,
define the differential operator

\[
\Lambda \equiv -g\,\frac{d}{dx} + \frac{1}{2}\sigma^2\,\frac{d^2}{dx^2}.
\]
Lemma 1. Suppose there exists a function u(x) that satisfies (P1) and (P2). Then

\[
u(x) = E_x\bigg\{\sum_{i=0}^{\infty} e^{-r\tau_i}\Big[u\big(z(\tau_i^-)\big) - u\big(z(\tau_i)\big)\Big]\bigg\}
+ E_x\bigg\{\int_0^{\infty} e^{-rt}\,[ru - \Lambda u]\big(z(t)\big)\,dt\bigg\}, \tag{A.1}
\]

\[
u(x) = E_x\bigg\{\sum_{i=0}^{\infty} e^{-r\hat\tau_i}\Big[u\big(\hat z(\hat\tau_i^-)\big) - u\big(\hat z(\hat\tau_i)\big)\Big]\bigg\}
+ E_x\bigg\{\int_0^{\infty} e^{-rt}\,[ru - \Lambda u]\big(\hat z(t)\big)\,dt\bigg\}. \tag{A.2}
\]
Proof. Let p̃ be an arbitrary impulse control policy and z̃ its associated process, defined as
was done for the other two policies. Consider an arbitrary function F(z̃(t), t) that is continuous and has continuous partial derivatives. Using the specification of our uncontrolled
process x(t), we begin by applying Ito's Lemma to the time interval [τ̃i−1, τ̃i):

\[
F\big(\tilde z(\tilde\tau_i^-), \tilde\tau_i\big) = F\big(\tilde z(\tilde\tau_{i-1}), \tilde\tau_{i-1}\big)
+ \int_{\tilde\tau_{i-1}}^{\tilde\tau_i} \sigma F_1\big(\tilde z(t), t\big)\,dw
+ \int_{\tilde\tau_{i-1}}^{\tilde\tau_i} \Big[F_2\big(\tilde z(t), t\big) - gF_1\big(\tilde z(t), t\big) + \tfrac{1}{2}\sigma^2 F_{11}\big(\tilde z(t), t\big)\Big]\,dt.
\]
Now let F(z̃(t), t) = e^{−rt}u(z̃(t)). Consider an arbitrary time T > 0. Plug in our choice of
F into the above expression and sum up the intervals from τ̃0 = 0 to T to get:

\[
e^{-rT}u\big(\tilde z(T^-)\big) - u\big(\tilde z(0)\big)
= \int_0^T e^{-rt}\,[\Lambda u - ru]\big(\tilde z(t)\big)\,dt
+ \int_0^T \sigma e^{-rt}u'\big(\tilde z(t)\big)\,dw
+ \sum_{\tilde\tau_i < T} e^{-r\tilde\tau_i}\Big[u\big(\tilde z(\tilde\tau_i)\big) - u\big(\tilde z(\tilde\tau_i^-)\big)\Big].
\]

This summing up procedure is identical to that described in Proposition 4 of Harrison
(pp. 71–72). As always, we initialize z̃(0) = x. Now let T → ∞. Since u is bounded by
(P1), e^{−rT}u(z̃(T⁻)) → 0. So we are left with:
\[
-u(x) = \int_0^{\infty} e^{-rt}\,[\Lambda u - ru]\big(\tilde z(t)\big)\,dt
+ \int_0^{\infty} \sigma e^{-rt}u'\big(\tilde z(t)\big)\,dw
+ \sum_{i=0}^{\infty} e^{-r\tilde\tau_i}\Big[u\big(\tilde z(\tilde\tau_i)\big) - u\big(\tilde z(\tilde\tau_i^-)\big)\Big].
\]
Now take expectations of both sides. Since u′ is bounded by (P2), the zero-expectation
property applies to the stochastic integral [Proposition 5 of Harrison (p. 62)], leaving us
with:

\[
-u(x) = E_x\bigg\{\int_0^{\infty} e^{-rt}\,[\Lambda u - ru]\big(\tilde z(t)\big)\,dt\bigg\}
+ E_x\bigg\{\sum_{i=0}^{\infty} e^{-r\tilde\tau_i}\Big[u\big(\tilde z(\tilde\tau_i)\big) - u\big(\tilde z(\tilde\tau_i^-)\big)\Big]\bigg\}.
\]

Since this holds for any policy and its associated process, it holds for p and p̂. □
Lemma 2. Suppose there exists a function u(x) that satisfies (P1)–(P4). Then u(x) ≥ V(p; x).
Proof. Consider policy p. By (P3), we know that, for all i,

\[
u\big(z(\tau_i^-)\big) \ge u\big(z(\tau_i)\big) - B.
\]

So we can multiply this by e^{−rτi}, sum over i, and take expectations to get:

\[
E_x\bigg\{\sum_{i=0}^{\infty} e^{-r\tau_i}\Big[u\big(z(\tau_i^-)\big) - u\big(z(\tau_i)\big)\Big]\bigg\}
\ge -E_x\bigg\{\sum_{i=0}^{\infty} e^{-r\tau_i}B\bigg\}. \tag{A.3}
\]
In general, z will not remain in the inaction region I. However, by (P4), we have that

\[
ru(x) + gu'(x) - \frac{1}{2}\sigma^2 u''(x) \ge \pi(x) \quad \text{for all } x \notin I.
\]

Hence, combining (P4) with the above inequality, we have that, for all z, ru(z) − Λu(z) ≥ π(z). This leads to the following:

\[
E_x\bigg\{\int_0^{\infty} e^{-rt}\,[ru - \Lambda u]\big(z(t)\big)\,dt\bigg\}
\ge E_x\bigg\{\int_0^{\infty} e^{-rt}\pi\big(z(t)\big)\,dt\bigg\}. \tag{A.4}
\]
Now add the LHS of (A.4) to the LHS of (A.3), and the RHS of (A.4) to the RHS of (A.3),
to get the following inequality:

\[
E_x\bigg\{\sum_{i=0}^{\infty} e^{-r\tau_i}\Big[u\big(z(\tau_i^-)\big) - u\big(z(\tau_i)\big)\Big]\bigg\}
+ E_x\bigg\{\int_0^{\infty} e^{-rt}\,[ru - \Lambda u]\big(z(t)\big)\,dt\bigg\}
\ge E_x\bigg\{\int_0^{\infty} e^{-rt}\pi\big(z(t)\big)\,dt\bigg\}
- E_x\bigg\{\sum_{i=0}^{\infty} e^{-r\tau_i}B\bigg\}.
\]

By (A.1) of Lemma 1, the LHS is u(x). The RHS is V(p; x) by definition. □
Lemma 3. Suppose there exists a function u(x) that satisfies (P1)–(P4). Then u(x) =
V(p̂; x).

Proof. By the definition of the jumps of p̂, we have

\[
u\big(\hat z(\hat\tau_i^-)\big) - u\big(\hat z(\hat\tau_i)\big) = -B \quad \text{for all } i. \tag{A.5}
\]

By construction, the controlled process ẑ always remains in the inaction region I, so by
(P4) we have that, for all ẑ, ru(ẑ) − Λu(ẑ) = π(ẑ). Plugging in this result together with
(A.5) into (A.2) of Lemma 1, we get

\[
u(x) = E_x\bigg\{\int_0^{\infty} e^{-rt}\pi\big(\hat z(t)\big)\,dt\bigg\}
- E_x\bigg\{\sum_{i=0}^{\infty} e^{-r\hat\tau_i}B\bigg\}.
\]

We recognize the RHS as V(p̂; x). □
Proof of Theorem 2. We begin by proving the particular solution is continuous. By Theorem 6.13 in Rudin (1976, p. 129), we have the following inequality:

\[
\big|V_P(x) - V_P(\bar x)\big|
= \bigg|\int_0^{\infty} e^{-rt}E_{x,\bar x}\big[\pi\big(x(t)\big) - \pi\big(\bar x(t)\big)\big]\,dt\bigg|
\le \int_0^{\infty} e^{-rt}\,\big|E_{x,\bar x}\big[\pi\big(x(t)\big) - \pi\big(\bar x(t)\big)\big]\big|\,dt.
\]

The absolute value function is convex, so we may apply Jensen's inequality to the RHS of
this inequality to get the following:

\[
\big|V_P(x) - V_P(\bar x)\big| \le \int_0^{\infty} e^{-rt}E_{x,\bar x}\big|\pi\big(x(t)\big) - \pi\big(\bar x(t)\big)\big|\,dt.
\]
Since the profit function is continuous, the continuity of VP(x) follows directly by applying
the standard epsilon–delta definition in Rudin (1976, p. 85). Now consider V′P(x). By the
Lebesgue Dominated Convergence Theorem (Billingsley, 1995, p. 209), we can bring the
derivative inside the expectation to get

\[
V_P'(x) = \int_0^{\infty} e^{-rt}E_x\Big[\frac{d}{dx}\pi\big(x(t)\big)\Big]\,dt
= \int_0^{\infty} e^{-rt}E_x\big[\pi'\big(x(t)\big)\big]\,dt,
\]

where the first equality follows from Fubini's Theorem (bringing the expectation operator
inside the integral), and the last equality follows from (2). By (A3), we know the integral
exists. Since the derivative of the profit function is continuous by assumption, the same
proof is applicable.
We now prove the particular solution is bounded by a polynomial. Consider the first
integral of (6). By assumption, the profit function is bounded by a polynomial, so there
exist an L > 0 and λ ≥ 1 such that π(y) ≤ L(1 + y^λ) for all y. Multiply this by e^{−R1 y} and
integrate to get

\[
\int_x^{\infty} e^{-R_1 y}\pi(y)\,dy \le L\int_x^{\infty} e^{-R_1 y}\,dy + L\int_x^{\infty} e^{-R_1 y}y^{\lambda}\,dy.
\]

Now perform integration by parts repeatedly on the second integral of the RHS, yielding:

\[
e^{R_1 x}\int_x^{\infty} e^{-R_1 y}\pi(y)\,dy
\le \frac{L}{R_1}\bigg[x^{\lambda} + \frac{\lambda}{R_1}x^{\lambda-1} + \frac{\lambda(\lambda-1)}{R_1^2}x^{\lambda-2} + \cdots + \frac{\lambda!}{R_1^{\lambda}} + 1\bigg].
\]
Doing the same for the second integral of (6), we have

\[
e^{R_2 x}\int_{-\infty}^{x} e^{-R_2 y}\pi(y)\,dy
\le \frac{L}{R_2}\bigg[x^{\lambda} + \frac{\lambda}{R_2}x^{\lambda-1} + \frac{\lambda(\lambda-1)}{R_2^2}x^{\lambda-2} + \cdots + \frac{\lambda!}{R_2^{\lambda}} + 1\bigg].
\]

It follows that VP(x) is bounded by a polynomial.
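The repeated integration by parts used above can be summarized in closed form (for a constant R > 0 and integer λ ≥ 1):

\[
\int_x^{\infty} e^{-Ry}y^{\lambda}\,dy
= \frac{e^{-Rx}}{R}x^{\lambda} + \frac{\lambda}{R}\int_x^{\infty} e^{-Ry}y^{\lambda-1}\,dy
= e^{-Rx}\sum_{k=0}^{\lambda}\frac{\lambda!}{(\lambda-k)!}\,\frac{x^{\lambda-k}}{R^{k+1}}.
\]

Each application of parts lowers the power of y by one and contributes one term of the bracketed polynomial, while the final constant term in the bound comes from the elementary integral L∫_x^∞ e^{−Ry} dy = Le^{−Rx}/R.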
To show V′P(x) is bounded by a polynomial, we first take the derivative of (6). Since the
profit function is continuous by assumption, Theorem 6.20 in Rudin (1976, p. 133) states
that

\[
\frac{d}{dx}\int_x^{\infty} e^{-R_1 y}\pi(y)\,dy = -e^{-R_1 x}\pi(x)
\quad\text{and}\quad
\frac{d}{dx}\int_{-\infty}^{x} e^{-R_2 y}\pi(y)\,dy = e^{-R_2 x}\pi(x).
\]

Using the chain rule,

\[
V_P'(x) = \frac{2}{\sigma^2(R_1 - R_2)}\bigg[R_1 e^{R_1 x}\int_x^{\infty}\pi(y)e^{-R_1 y}\,dy
+ R_2 e^{R_2 x}\int_{-\infty}^{x}\pi(y)e^{-R_2 y}\,dy\bigg],
\]

so the same proof as that for VP(x) is applicable. Note that we only used the continuity
and boundedness of the profit function, and not that of its derivative.
The above calculations prove the result when the profit function is non-negative everywhere. To extend the result to the case where the profit function takes on negative values,
and hence show that the particular solution and its derivative are bounded in absolute value
by a polynomial, one simply breaks up the integrals in (6) and the above expression into
their positive and negative components. Using the same integration by parts procedure,
one then shows that each component is bounded by a polynomial from above and below.
This follows from the assumption that the profit function is bounded in absolute value by
a polynomial.
We now prove that VP(x) is strictly concave using (5). Consider two initial conditions
x1, x2. Let x̄ = θx1 + (1 − θ)x2, where 0 < θ < 1. Suppose the Brownian motion process
has initial condition x̄. Then since the profit function is strictly concave,

\[
\pi\big(\bar x - gt + \sigma w(t)\big) > \theta\pi\big(x_1 - gt + \sigma w(t)\big) + (1 - \theta)\pi\big(x_2 - gt + \sigma w(t)\big).
\]

This inequality holds for all possible realizations of the Wiener process, so we may take
the conditional expectation of both sides, preserving the relation, to get:

\[
E_{\bar x}\big[\pi\big(x(t)\big)\big] > \theta E_{x_1}\big[\pi\big(x(t)\big)\big] + (1 - \theta)E_{x_2}\big[\pi\big(x(t)\big)\big].
\]
Multiply through by e^{−rt} and integrate over time. By Fubini's Theorem (Harrison, 1985,
p. 131), we may take the expectation operator outside the integral. The inequality becomes:

\[
E_{\bar x}\bigg\{\int_0^{\infty} e^{-rt}\pi\big(x(t)\big)\,dt\bigg\}
> \theta E_{x_1}\bigg\{\int_0^{\infty} e^{-rt}\pi\big(x(t)\big)\,dt\bigg\}
+ (1 - \theta)E_{x_2}\bigg\{\int_0^{\infty} e^{-rt}\pi\big(x(t)\big)\,dt\bigg\}.
\]

Therefore, VP(x) is strictly concave.
We now prove VPP(x) is strictly concave. According to the definition VPP(x) ≡
Ex{∫₀^T e^{−rt}π(x(t)) dt}, we have that VPP(x) = ∫₀^T e^{−rt}Ex{π(x(t))} dt, where T is a stopping time, so the same proof follows through as it did for VP(x). Moreover, from the equations defining VPP(x), we find that VPP(s) = VPP(S̄) = 0, so it is single-peaked with a
unique maximum that lies in (s, S̄). □
Proof of Theorem 3. We begin by proving the first claim. Before doing so, we must derive
a couple of equations using the method proposed by Dixit (1991) to obtain the smooth-pasting
condition at the return point. Differentiate with respect to S the value-matching condition
V(s) = V(S̄), using the general solution (4):

\[
\frac{\partial c_1}{\partial S}\big[e^{R_1 s} - e^{R_1 \bar S}\big]
= \frac{\partial c_2}{\partial S}\big[e^{R_2 \bar S} - e^{R_2 s}\big].
\]

Since s < S̄ and R1 > 0, R2 < 0, the partial derivatives have the same sign. Now consider
the effect of S on the value function V(x):

\[
\frac{\partial V(x)}{\partial S} = \frac{\partial c_1}{\partial S}e^{R_1 x} + \frac{\partial c_2}{\partial S}e^{R_2 x}.
\]

Since the partial derivatives have the same sign, the entire value function is shifted up
or down. So the optimal choice of S involves setting the above expression equal to zero,
implying:

\[
\frac{\partial c_1}{\partial S} = \frac{\partial c_2}{\partial S} = 0. \tag{A.6}
\]
We will prove the constants of the homogeneous solution are strictly positive by ruling out
all the other possible cases. Consider the smooth-pasting condition V′(S) = 0, which is

\[
V_P'(S) = -c_1 R_1 e^{R_1 S} - c_2 R_2 e^{R_2 S}.
\]

Since this holds as an identity, we may differentiate it with respect to S and preserve the
relation. Using the result (A.6), we are left with

\[
V_P''(S) = -c_1 R_1^2 e^{R_1 S} - c_2 R_2^2 e^{R_2 S}.
\]

Suppose both constants are zero. Then the RHS is equal to zero, which cannot be since the
particular solution is strictly concave according to Theorem 2. Moreover, if c1 = 0, c2 < 0,
then the RHS is strictly positive, which again is not possible. Similarly, we can rule out the
cases c1 < 0, c2 = 0 and c1 < 0, c2 < 0. Ruling out the other cases involves more work.
We will look at the particular and homogeneous solutions of the Bellman ODE and analyze
their properties under each scenario. In each case, the smooth-pasting conditions will yield
a contradiction.
Consider the case c1 > 0, c2 < 0. Let f(x) equal the derivative of the homogeneous solution: f(x) ≡ c1R1e^{R1x} + c2R2e^{R2x}. We shall study the number of intersections between
f(x) and g(x) ≡ −V′P(x). Take the derivative of f(x) and set it equal to zero, to obtain
x = (R1 − R2)^{−1} ln[−c2R2²/(c1R1²)]. Since c1 > 0, c2 < 0, the argument of the log function is positive, so f′(x) = 0 has a unique, real solution. Now take the second derivative
of f(x), to obtain f″(x) = c1R1³e^{R1x} + c2R2³e^{R2x}. Since c1 > 0, c2 < 0, R1 > 0, R2 < 0,
the entire RHS is strictly positive. Collating these results, we have shown that f(x) is
strictly convex, with a unique minimum. Now consider g(x). Since VP(x) is strictly concave by Theorem 2, g′(x) > 0. By Theorem 2, g(x) is bounded by a polynomial, implying
that g(x) < f(x) for both large and small x since f(x) grows exponentially. Putting together all these properties, it follows that f(x) and g(x) will intersect an even number of
times, if at all. The smooth-pasting conditions require that there be exactly three points of
intersection, so we cannot have that c1 > 0, c2 < 0. If c1 > 0, c2 = 0, then f(x) is strictly
increasing and strictly convex, so there will be at most two points of intersection, which
also cannot happen.
Finally, consider the case c1 < 0, c2 > 0. As before, f′(x) = 0 has a unique, real solution, but now f(x) is strictly concave with a unique maximum. Therefore, g(x) > f(x) for
both large and small x. So as before, we will get an even number of intersections, if at all,
which cannot be. If c1 = 0, c2 > 0, then f(x) is strictly increasing and strictly concave, so
there will be at most two points of intersection, which is also ruled out.
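The shape claims in this case analysis are easy to verify numerically. The sketch below uses illustrative values for r, g, σ², c1, and c2 (not taken from the paper) and checks that, for c1 > 0, c2 < 0, the derivative of the homogeneous solution is strictly convex with the unique critical point derived above.

```python
import math

# Illustrative parameter values (hypothetical, not the paper's calibration).
r, g, sigma2 = 0.10, 0.05, 0.15

# Roots of the characteristic equation (1/2)*sigma^2*R^2 - g*R - r = 0.
disc = math.sqrt(g**2 + 2 * sigma2 * r)
R1 = (g + disc) / sigma2    # R1 > 0
R2 = (g - disc) / sigma2    # R2 < 0
assert R1 > 0 > R2

c1, c2 = 1.0, -1.0          # the case c1 > 0, c2 < 0

def f(x):                   # derivative of the homogeneous solution
    return c1 * R1 * math.exp(R1 * x) + c2 * R2 * math.exp(R2 * x)

def f1(x):                  # f'(x) = c1*R1^2*e^{R1 x} + c2*R2^2*e^{R2 x}
    return c1 * R1**2 * math.exp(R1 * x) + c2 * R2**2 * math.exp(R2 * x)

def f2(x):                  # f''(x) = c1*R1^3*e^{R1 x} + c2*R2^3*e^{R2 x}
    return c1 * R1**3 * math.exp(R1 * x) + c2 * R2**3 * math.exp(R2 * x)

# Unique critical point derived in the text.
x_star = math.log(-c2 * R2**2 / (c1 * R1**2)) / (R1 - R2)

# f'(x*) = 0, and f'' > 0 everywhere, so x* is the unique minimum of f.
assert abs(f1(x_star)) < 1e-9
assert all(f2(x) > 0 for x in (-2.0, x_star, 2.0))

# Sanity check of f1 against a central finite difference of f.
h = 1e-6
assert abs((f(0.3 + h) - f(0.3 - h)) / (2 * h) - f1(0.3)) < 1e-4
print("f is strictly convex with unique minimum at x* =", x_star)
```

Since f tends to +∞ as x → ±∞ in this case while g(x) = −V′P(x) is polynomially bounded, any intersections indeed come in even numbers, which is the contradiction the proof exploits.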
We now prove that the global maximum m of the profit function lies within the inaction
region. Using (10), we define the function η(x) ≡ V(s)[ψ1(x) + ψ2(x)] = V(x) − VPP(x).
Because rψ1(x) + gψ1′(x) − (1/2)σ²ψ1″(x) = rψ2(x) + gψ2′(x) − (1/2)σ²ψ2″(x) = 0, it follows
that η(x) has the same property: rη(x) + gη′(x) − (1/2)σ²η″(x) = 0. Moreover, due to the
definitions of ψ1(x) and ψ2(x), we find that η(x) is single-signed in the region [s, S̄], and
η(s) = η(S̄) = V(s). If η(x) is not constant, then it has an extremum point in the interval,
call it e. Evaluating the ODE at this extremum point e, we find rη(e) = (1/2)σ²η″(e), implying
that η″(x) has the same sign as η(x) at the extremum. It follows that η(x) and VPP(x)
behave in the same fashion. Because Theorem 2 showed VPP(x) is single-peaked with a
unique maximum that lies within the inaction region, the result follows immediately. □
Proof of Theorem 4. The continuity within the inaction region of the value function and
its derivative follows directly from Theorem 2. By Dumas (1991), they are continuous at
the barrier points; outside the inaction region, they are continuous since the value function
is constant by definition. Outside the interval [s, S̄], V′(x) is zero (and hence bounded)
by construction. Since the interval [s, S̄] is compact, and V′(x) is continuous over that
interval, V′(x) is bounded over [s, S̄] by Theorem 4.15 in Rudin (p. 89). Therefore, V′(x)
is bounded over the whole real line. The value function has an extremum point within the
inaction region by the Mean-Value Theorem (since it is continuous), and it is unique by
the smooth-pasting condition, so it equals S. To show S is a maximum, we prove V(x) is
increasing immediately to the right of s, and decreasing immediately to the left of S̄. Since
V(x) = V(S) − B for all x outside the inaction region, this will hence demonstrate that
V(x) ≥ V(S) − B for all x, further implying the value function is single-peaked.
We prove V(x) is increasing immediately to the right of s by showing that, for small
ε > 0, V(s + ε) ≥ V(s). Using (4), this will hold if

\[
V_P(s + \varepsilon) + c_1 e^{R_1(s+\varepsilon)} + c_2 e^{R_2(s+\varepsilon)}
\ge V_P(s) + c_1 e^{R_1 s} + c_2 e^{R_2 s}.
\]

Rewriting this, we find that V(s + ε) ≥ V(s) if

\[
V_P(s + \varepsilon) - V_P(s) + c_1 e^{R_1 s}\big[e^{R_1 \varepsilon} - 1\big]
\ge c_2 e^{R_2 s}\big[1 - e^{R_2 \varepsilon}\big].
\]

Now do a Taylor expansion of VP(s + ε) around s:

\[
V_P(s + \varepsilon) = V_P(s) + \varepsilon V_P'(s) + \frac{1}{2}\varepsilon^2 V_P''(s).
\]

Since we chose ε small, we set the second-order term equal to zero. So the inequality
becomes:

\[
\varepsilon V_P'(s) + c_1 e^{R_1 s}\big[e^{R_1 \varepsilon} - 1\big]
\ge c_2 e^{R_2 s}\big[1 - e^{R_2 \varepsilon}\big].
\]
Now consider the smooth-pasting condition at s, given by V (s) = 0. Using (4) again, this
is VP (s) = −c1 R1 eR1 s − c2 R2 eR2 s . Plug this equation above to get
c1 eR1 s eR1 ε − 1 − εR1 c2 eR2 s 1 + εR2 − eR2 ε .
(A.7)
So if we can prove that this holds, then V(s + ε) ≥ V(s). Consider the function e^x − (1 + x).
We argue that it is non-negative for all x. This will be so if and only if x ≥ ln(1 + x). Define
y = 1 + x and f(y) = ln y. Since the natural log function is concave, a first-order Taylor
expansion (the tangent line) around any point always lies above the function itself; hence, doing
the expansion around unity proves the claim. Since e^x − (1 + x) is non-negative for all x, the
term in brackets on the LHS of (A.7) is positive, and that on the RHS is negative. By Theorem 3,
c_1 > 0 and c_2 > 0, so the entire LHS of (A.7) is positive, and the entire RHS is negative, so
the inequality holds.
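The sign argument behind (A.7) can be illustrated numerically. The sketch below uses purely hypothetical values for c_1, c_2, R_1, R_2, s, and ε (the proof only requires c_1 > 0 and c_2 > 0, by Theorem 3, and ε > 0 small; nothing else is fixed here):

```python
import math

# Hypothetical parameter values, for illustration only.
c1, c2 = 0.7, 1.3      # constants of the homogeneous solution (positive, per Theorem 3)
R1, R2 = 0.9, -2.1     # roots of the characteristic equation (signs are immaterial here)
s = -1.0               # lower trigger of the band
eps = 1e-3             # small perturbation

# LHS and RHS of inequality (A.7):
#   c1*e^{R1 s}(e^{R1 eps} - 1 - eps*R1) >= c2*e^{R2 s}(1 + eps*R2 - e^{R2 eps})
lhs = c1 * math.exp(R1 * s) * (math.exp(R1 * eps) - 1 - eps * R1)
rhs = c2 * math.exp(R2 * s) * (1 + eps * R2 - math.exp(R2 * eps))

# Since e^x - (1 + x) >= 0, the LHS bracket is non-negative and the RHS
# bracket is non-positive, so (A.7) holds whenever c1, c2 > 0.
print(lhs >= 0, rhs <= 0, lhs >= rhs)  # True True True
```

The same check, with e^{−x} + x − 1 in place of e^x − (1 + x), covers inequality (A.11) below.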
To prove V(x) is decreasing immediately to the left of S̄, one uses a similar procedure.
We show that for small ε > 0, the following holds: V(S̄ − ε) ≥ V(S̄). Using (4), this will
hold if

V_P(S̄ − ε) − V_P(S̄) + c_1 e^{R_1 S̄}(e^{−R_1 ε} − 1) ≥ c_2 e^{R_2 S̄}(1 − e^{−R_2 ε}).    (A.8)

Dropping second-order terms as ε is chosen small, a Taylor approximation around S̄ yields

V_P(S̄ − ε) = V_P(S̄) − εV_P′(S̄).    (A.9)

The smooth-pasting condition at S̄ implies

−εV_P′(S̄) = εc_1 R_1 e^{R_1 S̄} + εc_2 R_2 e^{R_2 S̄}.    (A.10)

Plugging (A.10) and (A.9) into (A.8) leads to

c_1 e^{R_1 S̄}(e^{−εR_1} + εR_1 − 1) ≥ c_2 e^{R_2 S̄}(1 − εR_2 − e^{−εR_2}).    (A.11)
We argue that e^{−x} + x − 1 ≥ 0 for all x. This will be so if −x ≥ ln(1 − x). Let y = 1 − x and
f(y) = ln y. Then since the natural log function is concave, we find that f(y) ≤ f(1) +
f′(1)(y − 1). Evaluating these functions, we get ln(1 − x) ≤ −x, which is equivalent
to −x ≥ ln(1 − x). Hence, the term in brackets on the LHS of (A.11) is positive, and
that on the RHS is negative. Since the constants are positive by Theorem 3, the entire
LHS is positive, and the entire RHS is negative. It follows that (A.11) holds, implying
V(S̄ − ε) ≥ V(S̄) also holds. □
Proof of Theorem 5. By Theorem 1, we must show the value function V(x) satisfies
conditions (P1)–(P4). By Theorem 4, both V(x) and V′(x) are continuous and
bounded, implying (P1) and (P2) are satisfied. Now consider (P3), which requires V(x) ≥
sup_y {V(y) − B}. By Theorem 4, S is the unique maximum of V(x), so the RHS equals
V(S) − B. Hence, (P3) is also satisfied since V(x) ≥ V(S) − B by Theorem 4. By construction,
the value function satisfies the Bellman ODE over the inaction region, which in our
case becomes I = {x: V(x) > V(S) − B} = (s, S̄). Therefore, (P4) is satisfied, completing
the proof. □
Proof of Theorem 6. To derive the stationary distribution, we approximate the controlled
Brownian motion process z(t) by a random walk, so that we may use standard Markov
chain techniques.
Suppose the length of a time interval is dt and the size of a jump is dz = σ√dt. The
probability p of an upward jump is given by

p = ½(1 − (g/σ)√dt) = ½(1 − (g/σ²) dz).
Let q = 1 − p be the probability of a downward jump. Let f(z) denote the density of the
controlled process. Consider a state z that lies in the set (s, S) ∪ (S, S̄) (the union of the
sets). It can be reached either by jumping up from z − dz or down from z + dz, so we have
f(z) = pf(z − dz) + qf(z + dz). Re-arranging terms and using the definitions of p and
dz, we get

0 = [f(z + dz) − f(z)] − [f(z) − f(z − dz)]
    + (g/σ²) dz [(f(z + dz) − f(z)) + (f(z) − f(z − dz))].
Now divide this by (dz)² and take the limit as dz → 0. We get the ODE f″(z) =
−(2g/σ²)f′(z). As it is a term that appears throughout the calculations, define α =
−2g/σ², and note that R_1 + R_2 = −α. The general solution of the ODE is f(z) =
A + Ce^{αz}, where the constants A and C (a separate pair on each of the two sub-intervals)
remain to be determined. They will be given by boundary conditions that we now derive.
Once the process hits either s or S̄, it instantly jumps to S, so there will be no mass at
the barriers: f(s) = f(S̄) = 0. Consider the return point S. It can be reached in four ways:
an upward jump from S − dz; a downward jump from S + dz; an upward jump from S̄ − dz
(after which the process resets to S); or a downward jump from s + dz (likewise followed
by a reset). Therefore, we have

f(S) = pf(S − dz) + qf(S + dz) + pf(S̄ − dz) + qf(s + dz).
Re-arranging terms, we get the following:

f(S) − f(S − dz) = [f(S + dz) − f(S)] + [f(s + dz) − f(s)] − [f(S̄) − f(S̄ − dz)]
    − g (dt/dz) {[f(S − dz) − f(S + dz)] − [f(s + dz) − f(s)] − [f(S̄) − f(S̄ − dz)]}.
Now divide this by dz and take the limit as dz → 0. Since (dz)² = σ² dt, the entire
last term converges to zero. Letting f′₋(z), f′₊(z) denote the LHS and RHS derivatives of
f(z) respectively, this becomes f′₋(S) = f′₊(S) + f′₊(s) − f′₋(S̄). Finally, since f(z) is a
density, we have that it must integrate to one: ∫_s^{S̄} f(z) dz = 1. We thus get five equations
in four unknowns, whose unique solution is given in the text.
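The random-walk approximation can be checked by direct simulation. The following sketch uses hypothetical values for g, σ, and the band s < S < S̄ (S̄ denoting the upper barrier; none of these are pinned down in this section), stepping the walk up with probability p and down with probability q = 1 − p on an integer grid, and resetting to the return point S whenever a barrier is reached:

```python
import math
import random

random.seed(0)

# Hypothetical parameters for illustration; the paper leaves them general.
g, sigma = 0.5, 1.0
dt = 1e-4
dz = sigma * math.sqrt(dt)                   # jump size, as in the text
p = 0.5 * (1 - (g / sigma**2) * dz)          # upward-jump probability

# Band on an integer grid: z = k*dz, with s = k_s*dz, S = 0, S_bar = k_bar*dz.
k_s, k_S, k_bar = -100, 0, 100

k = k_S
visits = [0] * (k_bar - k_s + 1)
for _ in range(500_000):
    k += 1 if random.random() < p else -1
    if k <= k_s or k >= k_bar:
        k = k_S          # barrier hit: instant jump back to the return point S
    visits[k - k_s] += 1

# Consistent with the boundary conditions f(s) = f(S_bar) = 0: the barriers
# accumulate no mass, while the interior of the band does.
print(visits[0], visits[-1], visits[k_S - k_s] > 0)  # 0 0 True
```

The empirical visit frequencies trace out the piecewise-exponential shape A + Ce^{αz} on each sub-interval, with a kink at the return point.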
We now turn to deriving the expected waiting time in the inaction region. As before,
let x(t) denote our Brownian motion given by (1). Define a process u(t) associated with
x(t) as follows: u(t) = −x(t)/g − t. We claim u(t) is a martingale with respect to x(t).
According to Karlin and Taylor (1975, p. 239), we must show that E{u(t + s) | x(r), 0 ≤
r ≤ t} = u(t) for all t > 0 and s > 0. Using the definition (2) for x(t), this can be shown
easily:
E{u(t + s) | x(r), 0 ≤ r ≤ t} = −(1/g) E{x(t + s) | x(t)} − (t + s)
    = −(1/g)[x(t) − gs] − (t + s) = u(t).
Let T be the first time x(t) reaches either s or S̄. Then by the Optional Sampling Theorem
for martingales (Theorem 3.2 of Karlin and Taylor, 1975, p. 261), E[u(T)] = E[u(0)] =
−x(0)/g, where by assumption s < x(0) < S̄.
From the definition of u(t), we have E[u(T)] = −(1/g)E[x(T)] − E[T]. Plugging in
the previous result, we have E[T] = x(0)/g − (1/g)E[x(T)]. Now it remains to calculate
E[x(T)]. Let p be the probability that x(t) hits the upper barrier before the
lower barrier. In Theorem 5.2 of Karlin and Taylor (1975, p. 361), p is given as p =
(e^{−αx(0)} − e^{−αs})/(e^{−αS̄} − e^{−αs}), where −α = 2g/σ² as before. Then the expectation is
E[x(T)] = pS̄ + (1 − p)s, which yields the result given in the text. □
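The waiting-time formula can likewise be checked by simulation. Below is a minimal sketch with hypothetical values for g, σ, the band endpoints, and x(0) (none fixed in the text), comparing the closed-form E[T] = x(0)/g − (1/g)E[x(T)] against a Monte Carlo average of Euler-discretized paths of dx = −g dt + σ dW:

```python
import math
import random

random.seed(1)

# Hypothetical parameters for illustration only.
g, sigma = 0.5, 1.0
s, S_bar, x0 = -1.0, 1.0, 0.0
alpha = -2 * g / sigma**2            # as defined in the text: alpha = -2g/sigma^2

# Closed-form hitting probability and expected exit time from the band.
p = (math.exp(-alpha * x0) - math.exp(-alpha * s)) / (
    math.exp(-alpha * S_bar) - math.exp(-alpha * s))
Ex_T = p * S_bar + (1 - p) * s       # E[x(T)] = p*S_bar + (1-p)*s
ET = x0 / g - Ex_T / g               # E[T] = x(0)/g - (1/g)*E[x(T)]

# Monte Carlo check: simulate paths until they exit (s, S_bar).
dt = 1e-3
total = 0.0
n_paths = 2000
for _ in range(n_paths):
    x, t = x0, 0.0
    while s < x < S_bar:
        x += -g * dt + sigma * math.sqrt(dt) * random.gauss(0, 1)
        t += dt
    total += t

# For these parameters the closed form evaluates to about 0.924; the
# Monte Carlo average should be close (up to discretization bias).
print(round(ET, 3), round(total / n_paths, 3))
```

The agreement between the two numbers is an informal consistency check of the Optional Sampling argument, not part of the proof.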
References
Abel, A., Eberly, J., 1994. A unified model of investment under uncertainty. American Economic Review 84,
1369–1384.
Bar-Ilan, A., Sulem, A., 1995. Explicit solution of inventory problems with delivery lags. Mathematics of Operations Research 20, 709–720.
Bertola, G., Caballero, R., 1990. Kinked adjustment costs and aggregate dynamics. In: NBER Macroeconomics
Annual. MIT Press, Cambridge, MA.
Billingsley, P., 1995. Probability and Measure. Wiley.
Blinder, A., 1991. Why are prices sticky? Preliminary results from an interview study. American Economic
Review 81, 89–96.
Blinder, A., Canetti, E., Lebow, D., Rudd, J., 1998. Asking About Prices: A New Approach to Understanding
Price Stickiness. Russell Sage Foundation, New York, NY.
Caplin, A., Leahy, J., 1991. State-dependent pricing and the dynamics of money and output. Quarterly Journal of
Economics 106, 683–708.
Caplin, A., Leahy, J., 1997. Aggregation and optimization with state-dependent pricing. Econometrica 65, 601–
625.
Caplin, A., Spulber, D., 1987. Menu costs and the neutrality of money. Quarterly Journal of Economics 102,
703–725.
Constantinides, G., Richard, S., 1978. Existence of optimal simple policies for discounted-cost inventory and
cash management in continuous time. Operations Research 26, 620–636.
Cox, D.R., Miller, H.D., 1965. The Theory of Stochastic Processes. Chapman and Hall.
Danziger, L., 1983. Price adjustments with stochastic inflation. International Economic Review 24, 699–707.
Dixit, A., 1991. A simplified treatment of the theory of optimal control of Brownian motion. Journal of Economic
Dynamics and Control 15, 657–673.
Dixit, A.K., Pindyck, R.S., 1994. Investment Under Uncertainty. Princeton Univ. Press, Princeton, NJ.
Dumas, B., 1991. Super contact and related optimality conditions. Journal of Economic Dynamics and Control 15,
675–685.
Edwards, C.H., Penney, D.E., 1993. Elementary Differential Equations. Prentice Hall.
Fleming, W.H., Rishel, R.W., 1975. Deterministic and Stochastic Optimal Control. Springer-Verlag.
Harrison, M.J., 1985. Brownian Motion and Stochastic Flow Systems. Krieger.
Harrison, M., Sellke, T., Taylor, A., 1983. Impulse control of Brownian motion. Mathematics of Operations
Research 8, 454–466.
Karatzas, I., Shreve, S.E., 1991. Brownian Motion and Stochastic Calculus. Springer-Verlag.
Karlin, S., Taylor, H.M., 1975. A First Course in Stochastic Processes. Academic Press.
Levy, D., Bergen, M., Dutta, S., Venable, R., 1997. The magnitude of menu costs: Direct evidence from large US
supermarket chains. Quarterly Journal of Economics 112, 791–825.
Richard, S., 1977. Optimal impulse control of a diffusion process with both fixed and proportional costs of control.
SIAM Journal on Control and Optimization 15, 79–91.
Rudin, W., 1976. Principles of Mathematical Analysis. McGraw–Hill.
Scarf, H., 1959. The optimality of (S, s) policies in the dynamic inventory problem. In: Arrow, K., Karlin, S.,
Suppes, P. (Eds.), Mathematical Methods in Social Sciences. Stanford Univ. Press, Palo Alto, CA, pp. 196–
202.
Sheshinski, E., Weiss, Y., 1977. Inflation and costs of price adjustment. Review of Economic Studies 44, 287–303.
Sheshinski, E., Weiss, Y., 1983. Optimum pricing policy under stochastic inflation. Review of Economic Studies 50, 513–529.
Tsiddon, D., 1993. The (mis)behavior of the aggregate price level. Review of Economic Studies 60, 889–902.