Review of Economic Dynamics 8 (2005) 877–901
www.elsevier.com/locate/red

The optimality of a control band policy

Jose M. Plehn-Dujowich
Department of Economics, University at Buffalo (SUNY), 435 Fronczak Hall, Buffalo, NY 14260
E-mail address: [email protected]

Received 9 July 2002; revised 18 November 2003
Available online 3 August 2005

Abstract

Consider a firm with an arbitrary profit function whose relative price follows a Brownian motion with negative drift. When the firm faces a fixed cost of price adjustment, we prove the optimal pricing policy is a control band if the following sufficient conditions are met: the profit function is continuous, strictly concave, and single-peaked; moreover, together with its first and second derivatives, it is bounded in absolute value by a polynomial. We also demonstrate various ways of constructing the value function associated with the control band policy and show it has certain properties carried over from the profit function. Numerical examples are found to be consistent with empirical estimates regarding the frequency of price adjustments.
© 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.red.2005.05.001

JEL classification: E31; C61
Keywords: Stochastic control; Menu cost

1. Introduction

Models with fixed costs of adjustment are popular due to their applicability in various fields of economics. It is frequently assumed in such models that firms adopt an (s, S) policy; the familiar value-matching and smooth-pasting conditions are then utilized. If the firm chooses to follow a control band policy, these conditions determine the optimal choice of trigger and return points, but they do not guarantee the type of policy is optimal against all possible alternatives (for example, non-Markov policies). This paper provides sufficient conditions for a control band policy to be optimal against all alternatives.
Optimality of the control band policy has been proven for specific profit functions. In Sheshinski and Weiss (1983), the profit function is quasi-concave. In Scarf (1959), the inventory holding cost function is convex, which is equivalent to a concave profit function. In Danziger (1983), the profit function is a strictly concave polynomial. Caplin and Leahy (1997) and Tsiddon (1993) assume the profit function is quadratic.

Assuming the uncontrolled process follows a Brownian motion, we prove a control band policy is optimal if the value function and its derivatives exist and are continuous and bounded in absolute value, and the constants of the homogeneous solution of the Bellman equation are strictly positive. We demonstrate various ways of constructing the value function associated with the control band policy and show it has certain desirable properties that are carried over from the profit function. By imposing the following three sufficient conditions on the profit function, we ensure the value function has the stated properties guaranteeing optimality of the control band policy: the profit function and its derivative are continuous; the profit function is strictly concave and single-peaked (unimodal); and the profit function together with its first and second derivatives are bounded in absolute value by a polynomial.

Our results are proven for a firm facing fixed costs of nominal price adjustment, i.e. menu costs. If there is inflation but no uncertainty, then the firm employs an (s, S) policy as shown in Sheshinski and Weiss (1977). Similarly, if there is uncertainty but no inflation, we also have price control of the (s, S) type as in Caplin and Leahy (1997). For our problem, we incorporate both features by modeling the relative price as a Brownian motion with negative drift. We then perform a numerical analysis of the model with a linear-quadratic profit function to show the following.
First, as in Bertola and Caballero (1990), we find that the firm chooses a control band policy that causes its steady state price to be very close to its profit-maximizing level. In the absence of menu costs, the firm would continuously regulate its price to be at that level. With menu costs, the firm achieves the second-best by being at the profit-maximizing level on average. Second, though inflation per se is detrimental to the firm, price volatility is significantly more costly in terms of lost profits. This ties in with our third finding, that the control band policy considerably reduces the (steady state) variance of the price, even up to an order of magnitude. Finally, in the baseline parameter configuration, the firm adjusts its price on average 1.44 times per year, which is consistent with empirical estimates in Blinder (1991).

Menu cost models emerged from the inventory and cash management literature. In the former, the solution to the firm's problem typically leads to a type of impulse control policy. In the latter, we observe instantaneous control. A control band is a specific type of impulse control policy; in general terms, these are given by a sequence of stopping times and corresponding jumps. For example, if the firm found it optimal to have different trigger and return points depending on its state, then this would be another type of impulse control policy. This is what we observe in models that have discontinuous behavior at the aggregate level, as in Sheshinski and Weiss (1983). In their model, the economy is governed by two states, one in which there is no inflation, and one with positive inflation. In each state, the firm uses a different control band policy. In inventory problems, the underlying stochastic process becomes the demand for the firm's product. In a discrete-time setting,
Scarf (1959) is the first to have successfully solved and fully characterized the solution to the firm's problem in such a framework. The case of instantaneous control arises when the state in question cannot exceed certain boundaries. In cash management problems, the state variable is the cash holdings of the firm; its business practices imply the holdings cannot get too low (typically, they must be non-negative) or too high (the opportunity cost of holding cash becomes large). Such problems have been modeled by having two cost functions, one associated with holding cash, the other with changing its state (the menu cost). The solution to cash management problems thus usually becomes one of regulating the state, and not necessarily subjecting it to large, discrete jumps. Harrison et al. (1983) and Constantinides and Richard (1978) analyzed the inventory and cash management problems, respectively.

The finance literature has also applied menu cost models, motivated by the stylized fact that investment at the firm level is lumpy, having the same discrete and staggered pattern observed with inventory, cash, and price levels. In the Abel and Eberly (1994) model of investment, there are flow fixed costs, giving rise to instantaneous control (characterized by a band) rather than impulse control. Finally, a similar concept to that of an (s, S) policy has been used to derive the value of exercising an option when the underlying asset follows a Brownian motion process (Dixit and Pindyck, 1994).

The paper is organized as follows. Section 2 begins by formally describing the firm's problem and deriving a set of sufficient conditions a value function must satisfy for it to equal the value of following the optimal impulse control policy. We then describe the control band policy we postulate to be optimal and construct its associated value function.
Section 3 proves the control band policy is optimal by showing its value function satisfies the sufficient conditions stipulated in Section 2. At the end of Section 3, we derive the ergodic distribution of the controlled relative price and the expected waiting time in the inaction region of the control band. Section 4 solves a linear-quadratic example and compares its predictions to empirical estimates. Finally, Section 5 concludes. Appendix A contains the proofs of all theorems.

2. The firm's problem

The firm seeks to maximize the discounted flow of profits π(x) net of price adjustment costs over an infinite horizon. The firm's profits depend solely on the price of its product relative to some average price of the economy. We shall interpret x as the log of the price ratio, but it will be referred to simply as the price. The cost of adjusting the price is assumed to be fixed at B > 0 and to be independent of the amount of the price change. The firm operates in an inflationary environment, so its relative price decreases over time on average. In particular, when price control is not exercised, the relative price of the firm follows a Brownian motion with negative drift:

dx(t) = −g dt + σ dw,    (1)

where w is a Wiener process, and the drift and variance terms are constants. If the process starts at x, then we may write (1) equivalently as

x(t) = x − gt + σw(t).    (2)

We immediately rule out the possibility of instantaneous control, whereby the firm continuously regulates the price, because of the assumed menu cost structure.
Each time the process is regulated, the firm incurs a fixed cost; thus, since we are dealing with Brownian motion, it would incur infinite costs in finite time: once it reaches a barrier of the regulator, which occurs with positive probability, it will have to adjust the price an uncountably infinite number of times since a Brownian motion is a continuous process (Harrison, 1985, Chapter 2). Hence, the firm's problem becomes one of impulse control: it must choose times at which to change its price, and by how much.

Formally, a general impulse control policy p is defined as a sequence of stopping times denoted by τ and a sequence of corresponding return points denoted by y: p = {τ_0, y_0; τ_1, y_1; ...}. As the underlying process is stochastic, these sequences are random variables. For convenience, we initialize τ_0 = 0. The process x(t) describes the behavior of the relative price absent any form of control. We now define a new process, z(t), which describes the relative price subject to control. The policy p is a prescription for the control procedure, so z(t) depends on p. Since the process x(t) follows (1), the controlled process z(t) associated with policy p is defined as follows: dz(t) = −g dt + σ dw for τ_i ≤ t < τ_{i+1}; z(τ_i) = y_i; and z(0) = x. The value of following policy p is given by

V(p; x) = Ex{ ∫_0^∞ e^{−rt} π(z(t)) dt − Σ_{i=0}^∞ e^{−rτ_i} B },

where the expectation is taken conditional on the initial condition x and r > 0 is the constant interest rate.

Our goal is to find an impulse control policy p̂ that yields the greatest possible value out of all possible impulse control policies. We will not associate with p̂ a specific type of policy. Rather, we describe its features as they compare to the arbitrary impulse control policy p defined above. To achieve this, we begin with a candidate value function u(x).
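The objects just defined can be made concrete by simulation. The sketch below discretizes (1), applies a band-type impulse control policy (pay the fixed cost and jump to a return point whenever a barrier is crossed), and estimates the value of that policy along one sample path. All parameter values and the band {s, S, S̄} are assumptions chosen purely for illustration, not the paper's calibration.

```python
import math
import random

# Assumed illustrative parameters (not from the paper's calibration).
g, sigma, r, B = 0.05, 0.1, 0.04, 0.02
s, S_ret, S_bar = -0.4, 0.0, 0.3   # a band policy with s < S_ret < S_bar

def profit(z):
    # A quadratic profit function, single-peaked at z = 0.
    return 1.0 - z ** 2

def value_of_band_policy(x0, horizon=200.0, dt=1.0 / 250.0, seed=0):
    """One-path estimate of V(p; x0): discounted profits minus discounted
    menu costs B paid at each stopping time (average many paths in practice)."""
    rng = random.Random(seed)
    z, t, value, n_adjust = x0, 0.0, 0.0, 0
    while t < horizon:
        value += math.exp(-r * t) * profit(z) * dt
        # Euler step of dz = -g dt + sigma dw between adjustments.
        z += -g * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
        if z <= s or z >= S_bar:   # barrier crossed: pay B, jump to return point
            value -= math.exp(-r * t) * B
            z = S_ret
            n_adjust += 1
    return value, n_adjust

v, n = value_of_band_policy(0.0)
```

Since max π = 1 here, discounted profits are bounded above by 1/r = 25 along any path; with the negative drift, exits occur mostly through the lower barrier s.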
The inaction region I can be defined as the set of states for which the value of inaction, given by u(x), exceeds the largest net value of exercising control: I = {x: u(x) > sup_y{u(y) − B}}. To understand I, suppose the firm is considering exercising control by jumping to the relative price y. The cost of doing so is B, implying the net gain of such control is u(y) − B. Because the firm can jump to any relative price, incurring the same fixed cost B, the maximum net value of exercising control at any point in time (and hence at any state) is given by sup_y{u(y) − B}. If the firm's relative price is currently x, then the value of inaction (i.e. doing nothing) is simply u(x). It follows that if the firm's current relative price is x, and u(x) > sup_y{u(y) − B}, then it is optimal for the firm to do nothing. In other words, that point x lies in the inaction region I.

By construction, once the process leaves the inaction region, control is exercised. Moreover, τ̂_{i+1} is the smallest time, after τ̂_i, at which control is exercised. So the stopping times of policy p̂ can be defined recursively as τ̂_{i+1} = inf{t > τ̂_i: ẑ(t) ∉ I}, where once again we initialize τ̂_0 = 0. The optimal return points must maximize the return to exercising control, so they are given by ŷ_i = arg sup_y{u(y) − B}. Since the cost is fixed irrespective of the magnitude or direction of control, the return point is always the same in this model. Finally, the controlled process ẑ(t) associated with policy p̂ is defined as was done for p: dẑ(t) = −g dt + σ dw for τ̂_i ≤ t < τ̂_{i+1}; ẑ(τ̂_i) = ŷ_i; and ẑ(0) = x.

The following theorem provides two important results. First, it proves the policy p̂ constructed above is indeed the optimal policy, as it maximizes V(p; x) over all possible impulse control policies p. Second, it specifies sufficient conditions for the candidate value function u(x) to equal the value of the optimal impulse control policy V(p̂; x).
The theorem is a type of principle of optimality result, as it allows us to go from the sequential problem, which is given by maximizing V(p; x) over all possible sequences of return points and stopping times, to a recursive formulation dictated by a value function. (All proofs are contained in Appendix A.)

Theorem 1. If there exists a value function u(x) that satisfies the following four conditions:

(P1) u(x) is bounded in absolute value and continuous;
(P2) u′(x) is bounded in absolute value and continuous;
(P3) u(x) ≥ sup_y{u(y) − B} for all x;
(P4) ru(x) + gu′(x) − ½σ²u″(x) = π(x) for all x ∈ I,

then the policy p̂ is the optimal impulse control policy, and u(x) attains its associated value. That is, u(x) = V(p̂; x) ≥ V(p; x) for all p.

We now proceed with constructing a value function V(x) which will ultimately satisfy the sufficient conditions (P1)–(P4) of Theorem 1. Following Dixit (1991), we postulate that the optimal impulse control policy is a control band, so we shall derive the value function associated with following such a policy. Because we have a fixed cost of price adjustment which does not depend on the amount of control exercised, the band we choose is characterized by two barrier points and a single return point, such that s < S < S̄. That is, when the relative price hits either s or S̄, the firm instantly jumps to S. For our problem, then, the inaction region corresponds to the open interval (s, S̄). As Dixit showed, if there is a proportional cost of adjustment, then the return points will be different, since in that case the marginal cost of adjustment has to be taken into consideration.

We begin by deriving the continuous-time Bellman equation of the firm's problem. In the absence of control, the relative price follows (1). Using Ito's Lemma and standard methods, the Bellman equation is given by

rV(x) + gV′(x) − ½σ²V″(x) = π(x).    (3)
By construction, the Bellman equation only holds over the inaction region. For every sample path of the Brownian motion, the Bellman equation is an ordinary differential equation (ODE) (Karatzas and Shreve, 1991). Thus, standard calculus techniques may be used in solving (3). The general solution hence takes the following form:

V(x) = V_P(x) + c1 e^{R1 x} + c2 e^{R2 x},    (4)

where V_P(x) is a particular solution of (3), c1, c2 are constants to be determined by boundary conditions, and the roots solve the standard characteristic equation of the ODE; they are given by R1 = (1/σ²)[g + √(g² + 2rσ²)] and R2 = (1/σ²)[g − √(g² + 2rσ²)]. Since the Bellman ODE only holds over the inaction region (s, S̄), the same holds true for the general solution (4).

Recognizing the Bellman ODE as a representation of the famous Cauchy problem, a particular solution of (3) is given by the Feynman–Kac solution (Theorem 7.6 of Karatzas and Shreve, 1991, p. 366). It is equal to the expected discounted value of profits in the absence of control:

V_P(x) = Ex{ ∫_0^∞ e^{−rt} π(x(t)) dt }.    (5)

This is the particular solution used by Dixit. Harrison (1985, p. 45) showed that it is equivalent to the following, whereby we integrate over states instead of time:

V_P(x) = [2/(σ²(R1 − R2))] { e^{R1 x} ∫_x^∞ π(y) e^{−R1 y} dy + e^{R2 x} ∫_{−∞}^x π(y) e^{−R2 y} dy }.    (6)

This alternative representation can also be derived directly using the calculus method of variation of parameters (Edwards and Penney, 1993, p. 171).

The integral in (5) is not necessarily finite. To guarantee it exists, we must first check whether the drift and variance terms in (1) satisfy Lipschitz conditions; in our case, they do since they are constants (Karatzas and Shreve, 1991, p. 289). The second requirement is that the profit function be bounded in absolute value by a polynomial. Alternatively, we could have the profit function be non-negative.
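For a concrete case, the roots and the representation (6) can be evaluated numerically. The sketch below assumes a quadratic profit π(x) = 1 − x² and illustrative parameter values; the infinite limits in (6) are truncated, which is harmless because the exponential kernels decay in the respective tails. (Under these assumptions, (5) also integrates in closed form to V_P(x) = 1/r − x²/r + 2gx/r² − 2g²/r³ − σ²/r², which serves as a check on the quadrature.)

```python
import math

# Assumed illustrative parameters (not a calibration from the paper).
g, sigma, r = 0.05, 0.1, 0.04

_disc = math.sqrt(g ** 2 + 2 * r * sigma ** 2)
R1 = (g + _disc) / sigma ** 2   # positive root of (1/2) sigma^2 R^2 - g R - r = 0
R2 = (g - _disc) / sigma ** 2   # negative root

def profit(y):
    return 1.0 - y ** 2

def V_P(x, width=40.0, n=20000):
    """Particular solution (6) by midpoint quadrature. Substituting u = y - x
    (resp. u = x - y) keeps the exponential factors well scaled numerically."""
    h = width / n
    up = sum(profit(x + (i + 0.5) * h) * math.exp(-R1 * (i + 0.5) * h)
             for i in range(n)) * h        # e^{R1 x} * integral over y > x
    dn = sum(profit(x - (i + 0.5) * h) * math.exp(R2 * (i + 0.5) * h)
             for i in range(n)) * h        # e^{R2 x} * integral over y < x
    return 2.0 / (sigma ** 2 * (R1 - R2)) * (up + dn)
```

With these parameters the drift term dominates the discounting, so V_P is large and negative near the profit peak; the point of the sketch is only that (5) and (6) agree.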
Since we wish to allow for profit functions that take on negative values (as they frequently appear in menu cost models and in models of investment with adjustment costs), we assume the former, together with other conditions, which we now turn to. Throughout the remainder of this paper, we assume the profit function satisfies the following three conditions:

(A1) π and π′ are continuous;
(A2) π is strictly concave and single-peaked (unimodal);
(A3) |π|, |π′|, and |π″| are bounded by a polynomial.

Our assumptions are generally consistent with the existing literature. In menu cost models, it seems that concavity and some form of boundedness assumptions are needed to ensure optimality of a control band policy. Sheshinski and Weiss (1983) required the profit function to be quasi-concave. Similarly, Scarf (1959) assumed the inventory holding cost function was convex, which translates to a concave profit function in our model. In a model with geometric Brownian motion, Danziger (1983) assumed the profit function was a strictly concave polynomial. Most papers, such as Caplin and Leahy (1997) and Tsiddon (1993), assume the profit function is quadratic, which is allowed by (A1)–(A3).

The goal is to construct a value function that will satisfy the conditions stipulated by Theorem 1. The assumptions (A1)–(A3) reflect this strategy, taking into account that properties imposed on the profit function are inherited by the value function. In order to apply Theorem 1, and thus demonstrate that the control band policy is optimal, the value function and its derivatives must exist and be continuous and bounded in absolute value; moreover, the constants c1, c2 of the homogeneous solution of the Bellman equation must be strictly positive (which, we argue below, demonstrates the type of control being exercised is beneficial to the firm). We prove (A1)–(A3) is a sufficient set of conditions for the value function to have these properties.
Nevertheless, if the chosen profit function does not satisfy assumptions (A1)–(A3), yet its associated value function meets the stipulated conditions, then the control band policy is the optimal policy against all alternatives.

Continuity of the profit function and its derivative ensures continuity of the value function and its derivative, which is required for various reasons. First, it enables the usage of Ito's Lemma in proving Theorem 1. Second, it allows one to prove that the value function and its derivative are bounded, which is also required by Theorem 1 according to (P1) and (P2). Third, it is used to prove the value function has an extremum point. When combined with the assumption that the profit function is single-peaked (implying the value function has this property), we show the extremum point is the unique return point of the control band policy and the unique maximum of the value function.

Assuming the profit function is strictly concave ensures the value function is strictly concave. We utilize this property to prove the constants c1, c2 of the homogeneous solution of the Bellman equation are strictly positive; quasi-concavity would not yield the desired outcome. The assumption that the profit function and its derivative (and thus the value function and its derivative) are bounded in absolute value by a polynomial is also used to prove the constants c1, c2 are strictly positive. Moreover, assuming the profit function and its first two derivatives are bounded in absolute value by a polynomial ensures the value function and its first two derivatives exist. There are other ways to guarantee the constants c1, c2 are strictly positive and the value function and its derivatives exist, so theoretically one could extend or modify these assumptions to suit the situation at hand.
A number of cases do not satisfy the stated assumptions (A1)–(A3), though the proof can be modified to accommodate most of them. Having a linear profit function obviously violates (A2). The inventory model of Harrison et al. (1983) is a cost minimization problem wherein holding costs are linear for positive inventory holdings and infinite for negative holdings, so this special case also does not satisfy our assumptions. Bar-Ilan and Sulem (1995) fails the differentiability condition. Together with Harrison et al., these papers share the feature that they have a single kink in the payoff function at zero. As long as the value function is continuous, differentiable (with continuous derivatives), strictly concave, single-peaked, and bounded in absolute value by a polynomial (together with its derivatives), the proof of optimality still applies (without having to check that c1, c2 are strictly positive).

We now derive the familiar value-matching conditions associated with the {s, S, S̄} control band policy. By definition, the net return to jumping to a point is equal to the value function evaluated at that point minus the fixed cost B of exercising control. It follows that if the choice of trigger and return points is optimal, the net return to jumping must equal the value at the trigger points: V(s) = V(S̄) = V(S) − B. Since control is exercised outside the inaction region (s, S̄), the value function is constant at V(s) = V(S̄) everywhere outside that open interval, so we may extend our value function over the whole real line by defining it as V(x) = V(s) = V(S̄) for x < s and x > S̄.

In addition to the value-matching conditions, we have the smooth-pasting conditions at the barrier and return points. They can be derived from the following arbitrage argument using the random walk approximation of a Brownian motion (Cox and Miller, 1965). Suppose the relative price is at the lower trigger point s.
We shall compare the payoff to waiting at s, yielding an expected return of R, versus jumping to state S. The two returns should be equal if the control band policy was chosen optimally. With probability p = ½[1 − (g/σ)√Δt] the process jumps up, where Δt is the length of the time interval and h = σ√Δt is the size of the jump, and with probability q = 1 − p it jumps down, at which point control is exercised instantly. So the expected return to waiting at s is given by R = π(s)Δt + (pV(s + h) + q[V(S) − B])/(1 + rΔt). Approximate V(s + h) with a first-order Taylor expansion around s: V(s + h) = V(s) + V′(s)h. By the value-matching conditions, we have V(s) = V(S) − B; dropping terms of higher order, the return becomes R = V(S) − B + ½V′(s)h. The net return to jumping to S is simply V(S) − B. Thus, for the two returns to be equal, we require that ½V′(s)h = 0. Doing the same calculation at the upper barrier, we get the smooth-pasting conditions at the trigger points: V′(s) = V′(S̄) = 0. Dixit derived the condition at the return point to be V′(S) = 0, which we shall see later follows from the first-order condition of a maximization problem.

The value-matching and smooth-pasting conditions determine the optimal control band policy by providing the five equations needed to solve for the three policy parameters {s, S, S̄} and the two constants of the homogeneous solution c1, c2. We now obtain expressions for the constants as a function of the policy parameters using the hitting-time equations in Harrison, in which case one just needs to use the smooth-pasting conditions to solve the system. Let T be the first time the process x(t) hits either barrier s or S̄. Since the value function V(x) satisfies the Bellman ODE (3), we can apply Proposition 2 of Harrison (p. 74):

V(x) = Ex{ ∫_0^T e^{−rt} π(x(t)) dt } + Ex{ e^{−rT} V(x_T) }.    (7)
This is a type of Bellman equation, as the value today is decomposed into its two components: the expected discounted value up to the hitting time (that is, the value over the inaction region), and the expected discounted terminal value once the process has hit either barrier. Define the first component as V_PP(x) ≡ Ex{ ∫_0^T e^{−rt} π(x(t)) dt }. Then by Proposition 3 of Harrison (p. 49):

V_PP(x) = V_P(x) − V_P(s)ψ2(x) − V_P(S̄)ψ1(x),    (8)

where

ψ1(x) = [θ1(x) − θ2(x)θ1(s)] / [1 − θ2(S̄)θ1(s)],
ψ2(x) = [θ2(x) − θ1(x)θ2(S̄)] / [1 − θ2(S̄)θ1(s)],
θ1(x) = e^{R1(x − S̄)}, and θ2(x) = e^{R2(x − s)}.

It can be shown that V_PP(x) satisfies the Bellman ODE, so it is also a particular solution. Moreover, as we shall see, we may interpret V_PP(x) as the choice of particular integral which takes on the value zero when x = s or x = S̄. Using the above definitions, we find that Ex{e^{−rT} V(x_T)} = V(s)ψ2(x) + V(S̄)ψ1(x), leading to the following expression for the value function:

V(x) = V_P(x) + ψ2(x)[V(s) − V_P(s)] + ψ1(x)[V(S̄) − V_P(S̄)].    (9)

(In effect, (9) is the particularization of (4) in which the arbitrary constants c1 and c2 have been replaced by "new" arbitrary constants V(s) and V(S̄).) Alternatively, using only the value-matching condition V(s) = V(S̄), we have

V(x) = V_PP(x) + V(s)[ψ1(x) + ψ2(x)].    (10)

Note further that rψ1(x) + gψ1′(x) − ½σ²ψ1″(x) = rψ2(x) + gψ2′(x) − ½σ²ψ2″(x) = 0. Therefore, since both V_P(x) and V_PP(x) satisfy the Bellman ODE, so does V(x) as constructed in (9) and (10).

It remains to pin down V(s) and V(S̄) using our newly created value function. Because ψ1(S̄) = ψ2(s) = 1 and ψ1(s) = ψ2(S̄) = 0, evaluating V(s) and V(S̄) using either (9) or (10) leads to identities since V(s) = V(S̄). For this reason, we use the other value-matching condition, V(S) − B = V(s) = V(S̄), which yields V(s) = V(S̄) = (V_PP(S) − B)/(1 − ψ1(S) − ψ2(S)).
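The functions θ and ψ are simple exponentials, so their defining properties can be verified directly. The sketch below uses assumed parameter values and an assumed band; it checks the boundary values ψ1(S̄) = ψ2(s) = 1 and ψ1(s) = ψ2(S̄) = 0 used above, and that each ψ solves the homogeneous Bellman ODE. (A standard interpretation, following Harrison, is that ψ2 and ψ1 are the expected discount factors e^{−rT} on paths exiting at s and at S̄, respectively.)

```python
import math

# Assumed illustrative parameters and band (not calibrated).
g, sigma, r = 0.05, 0.1, 0.04
s, S_bar = -0.4, 0.3

_disc = math.sqrt(g ** 2 + 2 * r * sigma ** 2)
R1, R2 = (g + _disc) / sigma ** 2, (g - _disc) / sigma ** 2

def theta1(x):
    return math.exp(R1 * (x - S_bar))

def theta2(x):
    return math.exp(R2 * (x - s))

_D = 1.0 - theta2(S_bar) * theta1(s)   # common denominator of psi1 and psi2

def psi1(x):
    return (theta1(x) - theta2(x) * theta1(s)) / _D

def psi2(x):
    return (theta2(x) - theta1(x) * theta2(S_bar)) / _D

def bellman_residual(psi, x, h=1e-4):
    """Finite-difference check of r*psi + g*psi' - (1/2) sigma^2 psi'' = 0."""
    d1 = (psi(x + h) - psi(x - h)) / (2 * h)
    d2 = (psi(x + h) - 2 * psi(x) + psi(x - h)) / h ** 2
    return r * psi(x) + g * d1 - 0.5 * sigma ** 2 * d2
```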
Given a profit function, suppose one wanted to solve this model to find the optimal barrier and return points. There are two ways of doing so. First, one could treat the constants of the homogeneous solution as unknowns, yielding a total of five variables to be calculated. The smooth-pasting and value-matching conditions would provide the necessary five equations to solve the system. The second approach involves using the equations derived above. Since these gave us explicit formulas for the constants as a function of the policy, we would then have just three variables to solve for; the three equations to be used would simply be the smooth-pasting conditions. Indeed, given a control band policy, one can solve for the constants using the value-matching conditions (Dixit, 1991).

3. Optimality of the control band policy

Our goal is to prove that the {s, S, S̄} control band policy described in the previous section is the optimal policy out of all possible impulse control policies. By virtue of Theorem 1, to achieve this, we must show the value function V(x) constructed above satisfies the four conditions (P1)–(P4). Before this can be done, some preliminary results are required. Specifically, we show in the next theorem that most of the properties assumed for the profit function carry over to the particular solution. Though this type of result is commonly and easily obtained in discrete-time Bellman equations, it is by no means immediate in continuous time.

Theorem 2. V_P(x) and V_P′(x) are continuous and bounded in absolute value by a polynomial; V_P(x) and V_PP(x) are strictly concave; moreover, V_PP(x) is single-peaked with a unique maximum that lies within the inaction region.

Recall that V_P(x) represents the value of the firm when no control is exercised. Hence, we may interpret the homogeneous solution of the Bellman ODE as the value of exercising control.
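As a concrete illustration of this solution procedure, consider the driftless special case g = 0 with the quadratic profit π(x) = 1 − x²; the parameter values below are assumptions for illustration only. By symmetry the band is then {−b, 0, b}: smooth pasting V′(b) = 0 gives the homogeneous constant in closed form, and the remaining value-matching condition V(0) − B = V(b) is a single equation in the half-width b, solved here by bisection.

```python
import math

# Driftless (g = 0) symmetric illustration; assumed parameter values.
r, sigma, B = 0.05, 0.2, 0.05
R = math.sqrt(2 * r) / sigma          # characteristic roots are +R and -R

def V_P(x):
    # Closed-form particular solution for pi(x) = 1 - x^2 when g = 0.
    return 1.0 / r - sigma ** 2 / r ** 2 - x ** 2 / r

def c_of(b):
    # Smooth pasting V'(b) = 0 pins down the common homogeneous constant c.
    return (2.0 * b / r) / (R * (math.exp(R * b) - math.exp(-R * b)))

def V(x, b):
    return V_P(x) + c_of(b) * (math.exp(R * x) + math.exp(-R * x))

def phi(b):
    # Remaining value-matching condition: phi(b) = 0 at the optimal half-width.
    return V(0.0, b) - B - V(b, b)

lo, hi = 1e-3, 1.0                    # phi(lo) < 0 < phi(hi) for these values
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if phi(mid) < 0 else (lo, mid)
b_star = 0.5 * (lo + hi)
```

The constant c_of(b_star) comes out strictly positive, consistent with the interpretation of the homogeneous solution as the (positive) value of exercising control; with drift, the same logic requires solving the three smooth-pasting equations jointly.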
Dixit argued heuristically that, because the homogeneous solution represents the value of exercising control, its constants must be positive. This property does not follow directly from our construction of the value function. Rather, it is akin to making a statement regarding the optimality of following such a policy. The following theorem proves Dixit's intuition to be correct. The key ingredient in proving this result is the assumed strict concavity and boundedness of the profit function, which by Theorem 2 led the particular solution to have the same properties. Moreover, since we assumed the profit function is strictly concave and single-peaked, it follows that it has a unique maximum m. The following theorem also proves m lies within the inaction region. The intuition for this result is quite obvious: in order to maximize the value function, the flow of profits over the inaction region, V_PP(x), must be maximized.

Theorem 3. The constants of the homogeneous solution c1, c2 are strictly positive; moreover, the global maximum m of the profit function lies within the inaction region.

The next theorem proves the value function is bounded from below by V(S) − B over the whole real line. Outside the inaction region, control is exercised so that V(x) = V(s) = V(S̄), which equals V(S) − B by the value-matching conditions. Thus, the result is trivial outside the inaction region. However, it also holds within the inaction region; the motivation for this result follows from an arbitrage argument. Suppose the firm is at a point x ∈ (s, S̄). The return to waiting at x is simply V(x). The net return to jumping to S is V(S) − B. If the latter exceeded the former, then it would not be optimal to remain at x, but rather to exercise control and jump to S. Since x was chosen in the interior of the inaction region, it would follow that the choice of barrier points was not optimal.
Hence, it must be the case that V(x) ≥ V(S) − B for all x in the inaction region (s, S̄). The theorem also proves that S is the unique maximum of the value function, which allows us to represent S as follows: S = arg max_x{V(x) − B}. The intuition for this representation should be quite clear. If S was chosen correctly, it must be the optimal point to jump to, so that the net return of exercising control is maximized. This leads to the first-order condition V′(S) = 0, which is of course one of the smooth-pasting conditions.

Theorem 4. V(x) is continuous, single-peaked with a unique maximum S, and bounded from below by V(S) − B; moreover, V′(x) is continuous and bounded.

We now proceed with the main result of this paper. In deriving the value function, we obtained the value-matching and smooth-pasting conditions, which determine the optimal choice of trigger and return points for the control band policy. However, this does not mean that the type of policy is optimal. The assumed cost structure would tend to imply that a control band is the best choice, but this has not yet been proven. Theorem 1 provided us with a set of testable conditions to show precisely this, since it considered an arbitrary impulse control policy. In turn, Theorem 5 shows the value function associated with a control band policy satisfies these conditions. Theorems 2 through 4 did all the legwork, so the proof of Theorem 5 merely summarizes the findings.

Theorem 5. The {s, S, S̄} control band policy is the optimal policy out of all possible impulse control policies.

From an empirical standpoint, the menu cost model can be used to estimate the cost of inflation. By varying the drift rate of the Brownian motion process, one could evaluate the amount by which the value of the firm changes. The steady-state average value function would be a good choice as a metric since it incorporates the long-run effects of the firm's policy.
The following theorem derives the unconditional probability density function of the controlled relative price; integrating the value function against this density yields the average value function. (Bertola and Caballero, 1990, already calculated the unconditional distribution of the control variable when an agent follows an (s, S) policy in a slightly different context.) To analyze the effect of changes in the variance of aggregate prices, one could use the same metric. To study the impact of inflation and the variability of prices at the firm level, the literature has typically estimated the effect of changing the underlying parameters on the firm's choice of trigger and return points, and in particular the width of the control band. However, it has been found that the width of the control band is an imperfect measure of the frequency of adjustment. A more meaningful approach would involve looking at the effect of parameters on the steady-state average and variance of the firm's price. If we seek to estimate the frequency of price adjustments, then instead of using the width of the control band, a more appropriate metric is the average waiting time in the inaction region, which is also derived in the following theorem.

Theorem 6. The unconditional distribution of the controlled relative price z is given by A0 + C0 e^{αz} for z ∈ [s, S) and A1 + C1 e^{αz} for z ∈ [S, S̄], where α = −2g/σ², A0 = −C0 e^{αs}, A1 = −C1 e^{αS̄}, and

C0 = (e^{αS̄} − e^{αS}) / D,   C1 = (e^{αs} − e^{αS}) / D,
D = (S − s) e^{αs} [e^{αS} − e^{αS̄}] + (S̄ − S) e^{αS̄} [e^{αS} − e^{αs}].

Moreover, the expected waiting time in the inaction region is given by [x − pS̄ − (1 − p)s]/g, where p = (e^{−αx} − e^{−αs})/(e^{−αS̄} − e^{−αs}) and s < x < S̄.

4. A quadratic example

This section demonstrates how a simple quadratic menu cost model can replicate fairly accurately the pricing behavior of firms in various types of industries.
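Before turning to the example, the closed-form density and waiting time in Theorem 6 can be sanity-checked numerically. The following is a minimal sketch; the band values s, S, S̄ and the parameters g, σ² are hypothetical choices for illustration, not the paper's calibrated values:

```python
import math

# Hypothetical band and process parameters, chosen only to illustrate
# Theorem 6 (not the paper's calibrated values).
g, sigma2 = 0.05, 0.09           # drift g and variance sigma^2
s, S, Sbar = 0.0, 0.5, 1.0       # lower trigger, return point, upper trigger
alpha = -2.0 * g / sigma2

e = math.exp
# Constants of the stationary density from Theorem 6
D = ((S - s) * e(alpha * s) * (e(alpha * S) - e(alpha * Sbar))
     + (Sbar - S) * e(alpha * Sbar) * (e(alpha * S) - e(alpha * s)))
C0 = (e(alpha * Sbar) - e(alpha * S)) / D
C1 = (e(alpha * s) - e(alpha * S)) / D
A0, A1 = -C0 * e(alpha * s), -C1 * e(alpha * Sbar)

def f(z):
    """Unconditional density of the controlled relative price."""
    A, C = (A0, C0) if z < S else (A1, C1)
    return A + C * e(alpha * z)

def antideriv(z, A, C):
    """Antiderivative of A + C e^{alpha z}."""
    return A * z + C / alpha * e(alpha * z)

# Total probability mass over the band, computed in closed form
mass = (antideriv(S, A0, C0) - antideriv(s, A0, C0)
        + antideriv(Sbar, A1, C1) - antideriv(S, A1, C1))

def expected_waiting_time(x):
    """Expected time in the inaction region starting from x, s < x < Sbar."""
    p = (e(-alpha * x) - e(-alpha * s)) / (e(-alpha * Sbar) - e(-alpha * s))
    return (x - p * Sbar - (1 - p) * s) / g
```

The checks confirm that the density vanishes at the barriers, is continuous at the return point S, is non-negative, and integrates to one.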
We are also interested in obtaining some comparative statics results to understand how a firm's pricing behavior adapts to changes in the environment, as measured by the interest rate, cost of adjustment, and rate of inflation, for example. Theoretically such results generally depend on the chosen functional form for the profit function. However, after having calibrated numerous examples, it became clear that the functional form affects merely the magnitude, and not the direction, of the changes. The following are some of our key findings. First, as in Bertola and Caballero (1990), the steady state price of the firm (i.e. the unconditional mean of the controlled relative price, calculated using the equations in Theorem 6) is very close to the maximum of its profit function. In the absence of menu costs, the firm would perpetually remain at its profit-maximizing price. With adjustment costs, it exercises control in such a fashion that it remains at that price on average. Second, the control band policy significantly reduces the variance of the price, at times by an order of magnitude. This occurs because of our third finding, that the variance of the relative price, and not the presence of inflation, is most costly to the firm. Finally, in the baseline model, the firm adjusts its price on average 1.44 times per year, which is consistent with empirical estimates. In any calibration experiment for this model, the most difficult question involves deciding the magnitude of the fixed cost. We begin by normalizing a unit of time to equal one year. Given this normalization, we opted to represent the fixed cost as a percentage of the maximum level of profits. With this choice, we require a baseline percentage in order to determine the magnitude of the fixed cost. The paper by Levy et al. (1997) is a useful reference for our purposes.
The authors studied the pricing behavior of five large supermarket chains and directly measured the menu costs associated with changing their product prices. They found that, on average, the menu costs amounted to 0.70% of yearly revenues, or 35% of yearly profits. The menu costs were large as a percentage of profits since the net profit margins of these stores lie between 1 and 3%. The low profit margins followed from the intense competition characteristic of that industry. We do not expect most industries to have such low profit margins and hence such large menu costs relative to profits. So as an approximation, we may consider the 35% fixed cost rate to be a type of upper bound. A crucial metric of the model is the average frequency of price adjustments. Indeed, the most important property associated with menu cost models relates to the observed staggered pricing of firms; that is, even though average prices may be moving continuously, a single firm does not change its price continuously, but rather at discrete time intervals, and typically by large relative amounts. Naturally, we postulate this pattern arises due to the presence of menu costs. To test the model, we compare the predicted frequency of price adjustments to those measured by Blinder (1991). The author interviewed 72 firms in a wide variety of industries, asking them about the frequency of their price adjustments and their reasons for doing so. He found that 37.7% changed their prices once a year, 20.3% between once and twice, 24.5% more than twice, and 17.4% less than once per year. Therefore, a total of 58% changed their prices either once or twice, or in between. We consider the simple case when the profit function is linear-quadratic of the form π(x) = ax − (1/2)bx², where a > 0, b > 0. Using the standard calculus method of undetermined coefficients, we find that a particular solution of the Bellman ODE is given by:

VP(x) = −(b/(2r)) x² + (a/r + bg/r²) x − (σ²b/(2r²) + ag/r² + bg²/r³).   (11)

We set a = 1 and b = 2, so that the profit function becomes x − x², yielding an argmax of 0.5. The following four parameters fully describe our model: the trend component (inflation), the variance component σ², the fixed cost percentage, and the interest rate r. These are varied between low, middle, and high values. Recall that we consider 35% to be an upper bound for the fixed cost percentage, so this is taken as the high value. The middle value is 20%, and the low one 5%. We would like the inflation and variance combinations to reflect the cases of various countries or industries, so the low inflation rate is set at 5%, the middle one at 15%, and the high one at 50%. Similarly, the variance rates are chosen as 15, 25, and 50%. Finally, we allow r to vary between 10, 15, and 30%. A typical industry could hence be represented by an inflation rate of 5%, variance of 15%, and interest rate of 10% (all the low values). We seek to track the values of the following seven variables. First, to describe the optimal control band policy, we measure the width of the control band S̄ − s, and calculate the value of the return point S. These are then compared to the maximum of the profit function m and the unconditional expectation of the controlled price (using the steady state density), mean. Second, we wish to compare the variance parameter σ² to its steady state counterpart, the unconditional variance of the controlled price. This will allow us to evaluate by how much controlling the price reduces the resulting price variability. So as to describe the average value of the firm, we calculate the unconditional expectation of the value function, evalue, using the steady state density. The average frequency of price adjustments is then calculated using results from Theorem 6.
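As a quick consistency check, the particular solution (11) can be verified against the Bellman ODE rV = π(x) − gV′(x) + (1/2)σ²V″(x). A minimal sketch, using the baseline parameter values quoted in the text:

```python
# Check that the particular solution (11) satisfies the Bellman ODE
#   r*V = pi(x) - g*V'(x) + (1/2)*sigma2*V''(x)
# for the linear-quadratic profit function pi(x) = a*x - (1/2)*b*x^2.
a, b = 1.0, 2.0                  # profit function parameters from the text
r, g, sigma2 = 0.10, 0.05, 0.15  # baseline interest, inflation, variance

def pi(x):
    return a * x - 0.5 * b * x ** 2

def VP(x):  # Eq. (11)
    return (-b / (2 * r) * x ** 2
            + (a / r + b * g / r ** 2) * x
            - (sigma2 * b / (2 * r ** 2) + a * g / r ** 2 + b * g ** 2 / r ** 3))

def dVP(x):
    return -b / r * x + a / r + b * g / r ** 2

ddVP = -b / r  # V'' is constant for a quadratic profit function

def bellman_residual(x):
    return r * VP(x) - (pi(x) - g * dVP(x) + 0.5 * sigma2 * ddVP)
```

The residual vanishes identically in x, confirming the undetermined-coefficients calculation.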
Finally, to evaluate the steady state gain of exercising control, we calculate the unconditional expectation of the homogeneous solution. In all 81 cases, the mean was found to be very close to 0.5. The lowest value was 0.49, and the highest 0.50. Recall that for our profit function, 0.5 was set to be its argmax. Since the firm would prefer to be at its profit-maximizing level, it chose a control band policy that led its mean to be as close as possible to 0.5. When the parameters of the profit function a and b were varied, the mean would always adjust to approximately equal the argmax. In general, the expected variance was very small. It became as low as 1.78%, and reached a maximum of 8.83%. The effect of the control band policy is therefore not only to maintain the relative price near the argmax, but also to lower significantly the resulting price variability. Perhaps the most interesting result of these simulations follows from the calculated frequency of price adjustments per year. The model predicts that in an environment governed by 5% inflation, 15% variance, 10% interest rate, and 5% fixed cost, a firm will change its price on average 1.44 times per year. This quantity is representative of the estimates obtained by Blinder, since he found that the majority of firms interviewed adjusted their prices between once and twice per year. The highest frequency, equal to 3.00, occurred with 50% inflation, 50% variance, 10% interest rate, and 5% fixed cost. The smallest frequency of 0.55 occurred with 5% inflation, 15% variance, 30% interest rate, and 35% fixed cost. Moreover, the frequency is very sensitive to the fixed cost. As the fixed cost is increased from 5 to 20%, it tends to fall by half, and when it rises from 20 to 35%, it falls by about 20%. Our figures are also very similar to those calculated by Sheshinski and Weiss (1977) in their deterministic model, who obtained frequencies of about 1.3. 
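The adjustment frequencies discussed above can also be reproduced qualitatively by direct simulation of a control band policy. A minimal sketch; the band values and parameters here are hypothetical illustrations (the paper's calibrated bands are not reproduced in the text):

```python
import random

# Hypothetical band values and parameters, for illustration only.
g, sigma = 0.05, 0.4          # drift and volatility of dz = -g dt + sigma dw
s, S, Sbar = 0.0, 0.5, 1.0    # lower trigger, return point, upper trigger
dt, n_steps = 1e-3, 50_000    # Euler step (years) and number of steps
horizon = dt * n_steps        # 50 years

random.seed(7)
z, n_adjustments = S, 0       # start at the return point
path = []
for _ in range(n_steps):
    z += -g * dt + sigma * dt ** 0.5 * random.gauss(0.0, 1.0)
    if z <= s or z >= Sbar:   # trigger hit: pay the fixed cost, return to S
        z = S
        n_adjustments += 1
    path.append(z)

avg_rate = n_adjustments / horizon   # price adjustments per year
```

The simulated path stays inside the band by construction, and the long-run adjustment rate is of the same order as the once-to-twice-a-year frequencies discussed in the text.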
The expected gain is our measure of the steady state gain of exercising control. We found this to be very large relative to the expected value. It reached 1000 times larger in the extreme case when inflation and variance were at 50%, and on average it tended to be between 20 and 50 times larger, suggesting the gains of exercising control are considerable, and perhaps much larger than one would expect. Analyzing the expected value sheds some light on the costs associated with inflation, uncertainty, and the menu cost. When the variance is 15%, the interest rate 10%, and the fixed cost 5%, going from 5 to 15% inflation leads to a fall in evalue of only 0.3%. However, if the fixed cost is 35%, then the expected value falls 2.8%. If inflation is 5%, the interest rate 10%, and the fixed cost 35%, going from 15% variance to 25% reduces evalue by 17.8%, which is very large. If the fixed cost is 5%, the reduction is only 4.8%. These calculations suggest it is the variance of the relative price, and not the presence of inflation, which is most costly to the firm. Of course, the fixed cost has a very large effect on evalue. In the baseline model, going from 5 to 35% reduces the expected value by 27.9%. Moreover, if the variance is 25%, the fall increases to 37.8%. It seems clear then that the variance drives the fall in value. Similarly, increasing the interest rate has a very large cost, as would be expected. For example, in the baseline model, increasing it by 5% reduces evalue by 33.3%. The width of the control band is positively related to all the parameters. The positive effect of inflation and the fixed cost on the width coincides with the result in Sheshinski and Weiss (1977), implying the introduction of uncertainty does not change these relationships. It is well known that the width of the band is an imperfect measure of the frequency of price adjustments. We confirm this to be the case.
When the band widens, one might expect the frequency to fall since it will take longer to reach a barrier point. However, the two move in opposite directions only in response to changes in the interest rate and the fixed cost. The expected variance is positively related to all the parameters, so the width of the control band is a better measure of the expected variance than of the frequency of price adjustments. In fact, the correlation between the band and the expected variance is 0.99, whereas that between the frequency and the band is −0.48, suggesting the width of the band is a measure of dispersion.

5. Conclusion

This paper lays out the groundwork for further theoretical and empirical research. We assumed the cost of price adjustment is fixed and not proportional to the amount of the price change. The theorems presented here would require only slight modification to incorporate this alternative cost structure; moreover, we believe the required assumptions on the profit function would remain the same. The smooth-pasting conditions will be different: instead of setting derivatives of the value function equal to zero at the trigger points, they will equal the marginal cost of price adjustment. Dixit (1991) derived these conditions when the cost function depends linearly on the amount of control exercised. The theory behind this menu cost model has been fully developed, so it would be a good starting point for a macroeconomic model (Caplin and Leahy, 1991, 1997; Tsiddon, 1993; Caplin and Spulber, 1987). With such models, one must first assume or prove that a control band policy is optimal at the firm level. This paper allows one to bypass that step if the chosen profit function satisfies the stipulated sufficient conditions. An interesting variation of this model would be to consider the effect of having the firm maximize the long-run average of expected discounted profits net of adjustment costs.
This would imply that instead of maximizing the value function, the firm would be maximizing the average (over relative prices) of the value function using the steady state density. The firm's choice of trigger and return points would be different, since the steady state density depends on these points. Comparing the optimal control band policy of the current model to that of this alternative setup could shed some light on the long-run versus short-run effects of aggregate uncertainty and inflation on the firm's policy and value. The simulations reported are perhaps representative of menu cost models. An empirical study could then compare the theoretical predictions to those found in the data more rigorously than was done here. The most difficult aspect arises from the lack of data on menu costs, so that the veracity of the model cannot easily be tested directly. However, the book by Blinder et al. (1998) reports very detailed survey data on observed frequencies, so the argument could be turned around to answer the following question. Suppose the menu cost model is an accurate representation of an economy or industry. Given the observed frequency of price adjustments, how large must the fixed cost be in order to generate those frequencies?

Acknowledgments

I thank Nancy Stokey and John Leahy for helpful comments. Financial support from the National Research Council is gratefully acknowledged.

Appendix A

Proof of Theorem 1. The proof is a modification of the work by Richard (1977), who considered the parallel cost minimization problem. Similar results can be found in Theorem 4.1 of Fleming and Rishel (1975, p. 159), Constantinides and Richard (1978), and Propositions 2.13 and 2.18 in Harrison et al. (1983). To prove the result, it is helpful to slightly modify some of our earlier notation.
We now define a general impulse control policy p as a sequence of stopping times τ and a sequence of corresponding jumps J, such that p = {τ0, J0; τ1, J1; ...}. As before, we initialize without loss of generality τ0 = 0. The controlled process z(t) associated with p still follows dz(t) = −g dt + σ dw for all τi ≤ t < τi+1. However, now we have z(τi) = z(τi−) + Ji, where τi− is the time immediately before control is exercised the ith time. We then set z(0) = x. The optimal policy p̂ is constructed in a similar fashion: dẑ(t) = −g dt + σ dw for all τ̂i ≤ t < τ̂i+1; ẑ(τ̂i) = ẑ(τ̂i−) + Ĵi; and ẑ(0) = x. Finally, we must construct a jump function J(x) which describes the optimal jump size when in state x, using the candidate value function u(x) to evaluate states. The function is simply defined by u(x + J(x)) − B ≡ sup_{j≠0} {u(x + j) − B}. The optimal jump at the ith stopping time will thus equal Ĵi = J(ẑ(τ̂i−)). To simplify the exposition, the proof is broken up into three lemmas. The first derives a useful result based on Ito's Lemma. The second shows u(x) ≥ V(p; x). The third shows u(x) = V(p̂; x). Put together, the three lemmas prove the desired result. For simplicity, define the differential operator Γ ≡ −g d/dx + (1/2)σ² d²/dx².

Lemma 1. Suppose there exists a function u(x) that satisfies (P1) and (P2). Then

u(x) = Ex{ Σ_{i=0}^∞ [u(z(τi−)) − u(z(τi))] e^{−rτi} } + Ex{ ∫_0^∞ e^{−rt} [ru − Γu](z(t)) dt },   (A.1)

u(x) = Ex{ Σ_{i=0}^∞ e^{−rτ̂i} [u(ẑ(τ̂i−)) − u(ẑ(τ̂i))] } + Ex{ ∫_0^∞ e^{−rt} [ru − Γu](ẑ(t)) dt }.   (A.2)

Proof. Let p̃ be an arbitrary impulse control policy and z̃ its associated process defined as was done for the other two policies. Consider an arbitrary function F(z̃(t), t) that is continuous and has continuous partial derivatives.
Using the specification of our uncontrolled process x(t), we begin by applying Ito's Lemma to the time interval [τ̃i−1, τ̃i):

F(z̃(τ̃i−), τ̃i) = F(z̃(τ̃i−1), τ̃i−1) + ∫_{τ̃i−1}^{τ̃i} σ F1(z̃(t), t) dw + ∫_{τ̃i−1}^{τ̃i} [F2(z̃(t), t) − g F1(z̃(t), t) + (1/2)σ² F11(z̃(t), t)] dt.

Now let F(z̃(t), t) = e^{−rt} u(z̃(t)). Consider an arbitrary time T > 0. Plug in our choice of F into the above expression and sum up the intervals from τ̃0 = 0 to T to get:

e^{−rT} u(z̃(T−)) − u(z̃(0)) = ∫_0^T e^{−rt} [Γu − ru](z̃(t)) dt + ∫_0^T σ e^{−rt} u′(z̃(t)) dw + Σ_{τ̃i<T} e^{−rτ̃i} [u(z̃(τ̃i)) − u(z̃(τ̃i−))].

This summing up procedure is identical to that described in Proposition 4 of Harrison (pp. 71–72). As always, we initialize z̃(0) = x. Now let T → ∞. Since u is bounded by (P1), e^{−rT} u(z̃(T−)) → 0. So we are left with:

−u(x) = ∫_0^∞ e^{−rt} [Γu − ru](z̃(t)) dt + ∫_0^∞ σ e^{−rt} u′(z̃(t)) dw + Σ_{i=0}^∞ e^{−rτ̃i} [u(z̃(τ̃i)) − u(z̃(τ̃i−))].

Now take expectations of both sides. Since u′ is bounded by (P2), the zero-expectation property applies to the stochastic integral [Proposition 5 of Harrison (p. 62)], leaving us with:

−u(x) = Ex{ ∫_0^∞ e^{−rt} [Γu − ru](z̃(t)) dt } + Ex{ Σ_{i=0}^∞ e^{−rτ̃i} [u(z̃(τ̃i)) − u(z̃(τ̃i−))] }.

Since this holds for any policy and its associated process, it holds for p and p̂. □

Lemma 2. Suppose there exists a function u(x) that satisfies (P1)–(P4). Then u(x) ≥ V(p; x).

Proof. Consider policy p. By (P3), we know that, for all i, u(z(τi−)) ≥ u(z(τi)) − B. So we can multiply this by e^{−rτi}, sum over i, and take expectations to get:

Ex{ Σ_{i=0}^∞ [u(z(τi−)) − u(z(τi))] e^{−rτi} } ≥ −Ex{ Σ_{i=0}^∞ e^{−rτi} B }.   (A.3)

In general, z will not remain in the inaction region I. However, by (P4), we have that ru(x) + gu′(x) − (1/2)σ²u″(x) ≥ π(x) for all x ∉ I. Hence, combining (P4) with the above inequality, we have that, for all z, ru(z) − Γu(z) ≥ π(z). This leads to the following:

Ex{ ∫_0^∞ e^{−rt} [ru − Γu](z(t)) dt } ≥ Ex{ ∫_0^∞ e^{−rt} π(z(t)) dt }.   (A.4)

Now add the LHS of (A.4) to the LHS of (A.3), and the RHS of (A.4) to the RHS of (A.3) to get the following inequality:

Ex{ Σ_{i=0}^∞ [u(z(τi−)) − u(z(τi))] e^{−rτi} } + Ex{ ∫_0^∞ e^{−rt} [ru − Γu](z(t)) dt } ≥ Ex{ ∫_0^∞ e^{−rt} π(z(t)) dt } − Ex{ Σ_{i=0}^∞ e^{−rτi} B }.

By (A.1) of Lemma 1, the LHS is u(x). The RHS is V(p; x) by definition. □

Lemma 3. Suppose there exists a function u(x) that satisfies (P1)–(P4). Then u(x) = V(p̂; x).

Proof. By the definition of the jumps of p̂, we have

u(ẑ(τ̂i−)) − u(ẑ(τ̂i)) = −B for all i.   (A.5)

By construction, the controlled process ẑ always remains in the inaction region I, so by (P4) we have that, for all ẑ, ru(ẑ) − Γu(ẑ) = π(ẑ). Plugging in this result together with (A.5) into (A.2) of Lemma 1, we get

u(x) = Ex{ ∫_0^∞ e^{−rt} π(ẑ(t)) dt } − Ex{ Σ_{i=0}^∞ e^{−rτ̂i} B }.

We recognize the RHS as V(p̂; x). □

Proof of Theorem 2. We begin by proving the particular solution is continuous. By Theorem 6.13 in Rudin (1976, p. 129), we have the following inequality:

|VP(x) − VP(x̄)| = |∫_0^∞ e^{−rt} Ex,x̄{π(x(t)) − π(x̄(t))} dt| ≤ ∫_0^∞ e^{−rt} |Ex,x̄{π(x(t)) − π(x̄(t))}| dt.

The absolute value function is convex, so we may apply Jensen's inequality to the RHS of this inequality to get the following:

|VP(x) − VP(x̄)| ≤ ∫_0^∞ e^{−rt} Ex,x̄{|π(x(t)) − π(x̄(t))|} dt.

Since the profit function is continuous, the continuity of VP(x) follows directly by applying the standard epsilon–delta definition in Rudin (1976, p. 85). Now consider VP′(x). By the Lebesgue Dominated Convergence Theorem (Billingsley, 1995, p. 209), we can bring the derivative inside the expectation to get

VP′(x) = ∫_0^∞ e^{−rt} Ex{ (d/dx) π(x(t)) } dt = ∫_0^∞ e^{−rt} Ex{ π′(x(t)) } dt,

where the first equality follows from Fubini's Theorem (bringing the expectation operator inside the integral), and the last equality follows from (2). By (A3), we know the integral exists. Since the derivative of the profit function is continuous by assumption, the same proof is applicable.
We now prove the particular solution is bounded by a polynomial. Consider the first integral of (6). By assumption, the profit function is bounded by a polynomial, so there exist L > 0 and λ ≥ 1 such that π(y) ≤ L(1 + y^λ) for all y. Multiply this by e^{−R1 y} and integrate to get

∫_x^∞ e^{−R1 y} π(y) dy ≤ L ∫_x^∞ e^{−R1 y} dy + L ∫_x^∞ e^{−R1 y} y^λ dy.

Now perform integration by parts repeatedly on the second integral of the RHS, yielding:

e^{R1 x} ∫_x^∞ e^{−R1 y} π(y) dy ≤ (L/R1)[ x^λ + (λ/R1) x^{λ−1} + (λ(λ−1)/R1²) x^{λ−2} + ··· + λ!/R1^λ + 1 ].

Doing the same for the second integral of (6), we have

e^{R2 x} ∫_{−∞}^x e^{−R2 y} π(y) dy ≤ (L/R2)[ x^λ + (λ/R2) x^{λ−1} + (λ(λ−1)/R2²) x^{λ−2} + ··· + λ!/R2^λ + 1 ].

It follows that VP(x) is bounded by a polynomial. To show VP′(x) is bounded by a polynomial, we first take the derivative of (6). Since the profit function is continuous by assumption, Theorem 6.20 in Rudin (1976, p. 133) states that

(d/dx) ∫_x^∞ e^{−R1 y} π(y) dy = −e^{−R1 x} π(x)   and   (d/dx) ∫_{−∞}^x e^{−R2 y} π(y) dy = e^{−R2 x} π(x).

Using the chain rule,

VP′(x) = (2/(σ²(R1 − R2))) [ R1 e^{R1 x} ∫_x^∞ π(y) e^{−R1 y} dy + R2 e^{R2 x} ∫_{−∞}^x π(y) e^{−R2 y} dy ],

so the same proof as that for VP(x) is applicable. Note that we only used the continuity and boundedness of the profit function, and not that of its derivative. The above calculations prove the result when the profit function is non-negative everywhere. To extend the result to the case where the profit function takes on negative values, and hence show that the particular solution and its derivative are bounded in absolute value by a polynomial, one simply breaks up the integrals in (6) and the above expression into their positive and negative components. Using the same integration by parts procedure, one then shows that each component is bounded by a polynomial from above and below. This follows from the assumption that the profit function is bounded in absolute value by a polynomial.
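The repeated integration by parts behind these bounds can be spot-checked numerically via the exact identity ∫_x^∞ e^{−Ry} y^n dy = (e^{−Rx}/R)[x^n + (n/R)x^{n−1} + ··· + n!/R^n] for integer n, which is the sum the bound iterates. A minimal sketch (the values of R, n, and x in the test are arbitrary choices, not from the paper):

```python
import math

def tail_integral_closed_form(R, n, x):
    """Repeated integration by parts:
    int_x^inf e^{-R y} y^n dy
      = (e^{-R x}/R) * [x^n + (n/R) x^{n-1} + ... + n!/R^n]."""
    total, coeff = 0.0, 1.0
    for k in range(n + 1):
        total += coeff * x ** (n - k) / R ** k   # coeff = n!/(n-k)!
        coeff *= n - k
    return math.exp(-R * x) / R * total

def tail_integral_simpson(R, n, x, cutoff=60.0, steps=200_000):
    """Composite Simpson approximation of the same integral on [x, cutoff];
    the integrand is negligible beyond the cutoff for these parameters."""
    h = (cutoff - x) / steps
    integrand = lambda y: math.exp(-R * y) * y ** n
    acc = integrand(x) + integrand(cutoff)
    for i in range(1, steps):
        acc += integrand(x + i * h) * (4 if i % 2 else 2)
    return acc * h / 3.0
```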
We now prove that VP(x) is strictly concave using (5). Consider two initial conditions x1, x2. Let x̄ = θx1 + (1 − θ)x2, where 0 < θ < 1. Suppose the Brownian motion process has initial condition x̄. Then since the profit function is strictly concave,

π(x̄ − gt + σw(t)) > θ π(x1 − gt + σw(t)) + (1 − θ) π(x2 − gt + σw(t)).

This inequality holds for all possible realizations of the Wiener process, so we may take the conditional expectation of both sides, preserving the relation to get:

Ex̄{π(x(t))} > θ Ex1{π(x(t))} + (1 − θ) Ex2{π(x(t))}.

Multiply through by e^{−rt} and integrate over time. By Fubini's Theorem (Harrison, 1985, p. 131), we may take the expectation operator outside the integral. The inequality becomes:

Ex̄{ ∫_0^∞ e^{−rt} π(x(t)) dt } > θ Ex1{ ∫_0^∞ e^{−rt} π(x(t)) dt } + (1 − θ) Ex2{ ∫_0^∞ e^{−rt} π(x(t)) dt }.

Therefore, VP(x) is strictly concave. We now prove VPP(x) is strictly concave. According to the definition VPP(x) ≡ Ex{ ∫_0^T e^{−rt} π(x(t)) dt }, we have that VPP(x) = ∫_0^T e^{−rt} Ex{π(x(t))} dt, where T is a stopping time, so the same proof follows through as did for VP(x). Moreover, from the equations defining VPP(x), we find that VPP(s) = VPP(S̄) = 0, so it is single-peaked with a unique maximum that lies in (s, S̄). □

Proof of Theorem 3. We begin by proving the first claim. Before doing so, we must derive a couple of equations using the method proposed by Dixit (1991) to obtain the smooth-pasting condition at the return point. Differentiate with respect to S the value-matching condition V(s) = V(S̄) using the general solution (4):

(∂c1/∂S)[e^{R1 s} − e^{R1 S̄}] = (∂c2/∂S)[e^{R2 S̄} − e^{R2 s}].

Since s < S̄ and R1 > 0, R2 < 0, the partial derivatives have the same sign. Now consider the effect of S on the value function V(x):

∂V(x)/∂S = (∂c1/∂S) e^{R1 x} + (∂c2/∂S) e^{R2 x}.

Since the partial derivatives have the same sign, the entire value function is shifted up or down.
So the optimal choice of S involves setting the above expression equal to zero, implying:

∂c1/∂S = ∂c2/∂S = 0.   (A.6)

We will prove the constants of the homogeneous solution are strictly positive by ruling out all the other possible cases. Consider the smooth-pasting condition V′(S) = 0, which is VP′(S) = −c1 R1 e^{R1 S} − c2 R2 e^{R2 S}. Since this holds as an identity, we may differentiate it with respect to S and preserve the relation. Using the result (A.6), we are left with VP″(S) = −c1 R1² e^{R1 S} − c2 R2² e^{R2 S}. Suppose both constants are zero. Then the RHS is equal to zero, which cannot be since the particular solution is strictly concave according to Theorem 2. Moreover, if c1 = 0, c2 < 0, then the RHS is strictly positive, which again is not possible. Similarly, we can rule out the cases c1 < 0, c2 = 0 and c1 < 0, c2 < 0. Ruling out the other cases involves more work. We will look at the particular and homogeneous solutions of the Bellman ODE and analyze their properties under each scenario. In each case, the smooth-pasting conditions will yield a contradiction. Consider the case c1 > 0, c2 < 0. Let f(x) equal the derivative of the homogeneous solution: f(x) ≡ c1 R1 e^{R1 x} + c2 R2 e^{R2 x}. We shall study the number of intersections between f(x) and g(x) ≡ −VP′(x). Take the derivative of f(x) and set it equal to zero, to obtain x = (R1 − R2)^{−1} ln[−c2 R2²/(c1 R1²)]. Since c1 > 0, c2 < 0, the argument of the log function is positive, so f′(x) = 0 has a unique, real solution. Now take the second derivative of f(x), to obtain f″(x) = c1 R1³ e^{R1 x} + c2 R2³ e^{R2 x}. Since c1 > 0, c2 < 0, R1 > 0, R2 < 0, the entire RHS is strictly positive. Collating these results, we have shown that f(x) is strictly convex, with a unique minimum. Now consider g(x). Since VP(x) is strictly concave by Theorem 2, g′(x) > 0. By Theorem 2, g(x) is bounded by a polynomial, implying that g(x) < f(x) for both large and small x since f(x) grows exponentially.
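The stated properties of f in the case c1 > 0, c2 < 0 are easy to check numerically. A minimal sketch; the parameter values and the choice c1 = 1, c2 = −1 are illustrative assumptions, with R1, R2 computed as the roots of the characteristic equation of the homogeneous Bellman ODE (consistent with R1 + R2 = 2g/σ² used in the proof of Theorem 6):

```python
import math

# Illustrative parameters (assumptions, not the paper's calibration);
# R1 > 0 and R2 < 0 are the roots of (1/2)*sigma2*R^2 - g*R - r = 0.
r, g, sigma2 = 0.10, 0.05, 0.15
disc = math.sqrt(g * g + 2.0 * sigma2 * r)
R1, R2 = (g + disc) / sigma2, (g - disc) / sigma2
c1, c2 = 1.0, -1.0            # the case c1 > 0, c2 < 0 being ruled out

f  = lambda x: c1 * R1 * math.exp(R1 * x) + c2 * R2 * math.exp(R2 * x)
f1 = lambda x: c1 * R1 ** 2 * math.exp(R1 * x) + c2 * R2 ** 2 * math.exp(R2 * x)
f2 = lambda x: c1 * R1 ** 3 * math.exp(R1 * x) + c2 * R2 ** 3 * math.exp(R2 * x)

# Unique critical point claimed in the proof
xstar = math.log(-c2 * R2 ** 2 / (c1 * R1 ** 2)) / (R1 - R2)

grid = [-5.0 + 0.01 * k for k in range(1001)]
# f' should change sign exactly once: f is strictly convex with one minimum
sign_changes = sum(1 for u, v in zip(grid, grid[1:])
                   if (f1(u) < 0) != (f1(v) < 0))
```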
Putting together all these properties, it follows that f(x) and g(x) will intersect an even number of times, if at all. The smooth-pasting conditions require that there be exactly three points of intersection, so we cannot have that c1 > 0, c2 < 0. If c1 > 0, c2 = 0, then f(x) is strictly increasing and strictly convex, so there will be at most two points of intersection, which also cannot happen. Finally, consider the case c1 < 0, c2 > 0. As before, f′(x) = 0 has a unique, real solution, but now f(x) is strictly concave with a unique maximum. Therefore, g(x) > f(x) for both large and small x. So as before, we will get an even number of intersections, if at all, which cannot be. If c1 = 0, c2 > 0, then f(x) is strictly increasing and strictly concave, so there will be at most two points of intersection, which is also ruled out. We now prove that the global maximum m of the profit function lies within the inaction region. Using (10), we define the function η(x) ≡ V(s)[ψ1(x) + ψ2(x)] = V(x) − VPP(x). Because rψ1(x) + gψ1′(x) − (1/2)σ²ψ1″(x) = rψ2(x) + gψ2′(x) − (1/2)σ²ψ2″(x) = 0, it follows that η(x) has the same property: rη(x) + gη′(x) − (1/2)σ²η″(x) = 0. Moreover, due to the definitions of ψ1(x) and ψ2(x), we find that η(x) is single-signed in the region [s, S̄], and η(s) = η(S̄) = V(s). If η(x) is not constant, then it has an extremum point in the interval, call it e. Evaluating the ODE at this extremum point e, we find rη(e) = (1/2)σ²η″(e), implying that η″(x) has the same sign as η(x) at the extremum. It follows that η(x) and VPP(x) behave in the same fashion. Because Theorem 2 showed VPP(x) is single-peaked with a unique maximum that lies within the inaction region, the result follows immediately. □

Proof of Theorem 4. The continuity within the inaction region of the value function and its derivative follow directly from Theorem 2.
By Dumas (1991), they are continuous at the barrier points; outside the inaction region, they are continuous since the value function is constant by definition. Outside the interval [s, S̄], V′(x) is zero (and hence bounded) by construction. Since the interval [s, S̄] is compact, and V′(x) is continuous over that interval, V′(x) is bounded over [s, S̄] by Theorem 4.15 in Rudin (p. 89). Therefore, V′(x) is bounded over the whole real line. The value function has an extremum point within the inaction region by the Mean-Value Theorem (since it is continuous), and it is unique by the smooth-pasting condition, so it equals S. To show S is a maximum, we prove V(x) is increasing immediately to the right of s, and decreasing immediately to the left of S̄. Since V(x) = V(S) − B for all x outside the inaction region, this will hence demonstrate that V(x) ≥ V(S) − B for all x, further implying the value function is single-peaked. We prove V(x) is increasing immediately to the right of s by showing that, for small ε > 0, V(s + ε) ≥ V(s). Using (4), this will hold if

VP(s + ε) + c1 e^{R1(s+ε)} + c2 e^{R2(s+ε)} ≥ VP(s) + c1 e^{R1 s} + c2 e^{R2 s}.

Rewriting this, we find that V(s + ε) ≥ V(s) if

VP(s + ε) − VP(s) + c1 e^{R1 s}[e^{R1 ε} − 1] ≥ c2 e^{R2 s}[1 − e^{R2 ε}].

Now do a Taylor expansion of VP(s + ε) around s:

VP(s + ε) = VP(s) + ε VP′(s) + (1/2) ε² VP″(s).

Since we chose ε small, we set the second-order term equal to zero. So the inequality becomes:

ε VP′(s) + c1 e^{R1 s}[e^{R1 ε} − 1] ≥ c2 e^{R2 s}[1 − e^{R2 ε}].

Now consider the smooth-pasting condition at s, given by V′(s) = 0. Using (4) again, this is VP′(s) = −c1 R1 e^{R1 s} − c2 R2 e^{R2 s}. Plug this equation above to get

c1 e^{R1 s}[e^{R1 ε} − 1 − εR1] ≥ c2 e^{R2 s}[1 + εR2 − e^{R2 ε}].   (A.7)

So if we can prove that this holds, then V(s + ε) ≥ V(s). Consider the function e^x − (1 + x). We argue that it is non-negative for all x. This will be so if and only if x ≥ ln(1 + x). Define y = 1 + x, f(y) = ln y.
Since the natural log function is concave, its first-order Taylor expansion (the tangent line) around any point always lies above the function itself; hence, doing the expansion around unity proves the claim. Since e^x − (1 + x) is non-negative for all x, the term in brackets on the LHS of (A.7) is positive, and that on the RHS is negative. By Theorem 3, c1 > 0, c2 > 0, so the entire LHS of (A.7) is positive, and the entire RHS is negative, so the inequality holds. To prove V(x) is decreasing immediately to the left of S̄, one uses a similar procedure. We show that for small ε > 0, the following holds: V(S̄ − ε) ≥ V(S̄). Using (4), this will hold if

VP(S̄ − ε) − VP(S̄) + c1 e^{R1 S̄}[e^{−R1 ε} − 1] ≥ c2 e^{R2 S̄}[1 − e^{−R2 ε}].   (A.8)

Dropping second-order terms as ε is chosen small, a Taylor approximation around S̄ yields

VP(S̄ − ε) = VP(S̄) − ε VP′(S̄).   (A.9)

The smooth-pasting condition at S̄ implies

−ε VP′(S̄) = ε c1 R1 e^{R1 S̄} + ε c2 R2 e^{R2 S̄}.   (A.10)

Plugging (A.10) and (A.9) into (A.8) leads to

c1 e^{R1 S̄}[e^{−εR1} + εR1 − 1] ≥ c2 e^{R2 S̄}[1 − εR2 − e^{−εR2}].   (A.11)

We argue that e^{−x} + x − 1 ≥ 0 for all x. This will be so if −x ≥ ln(1 − x). Let y = 1 − x, f(y) = ln y. Then since the natural log function is concave, we find that f(y) ≤ f(1) + f′(1)(y − 1). Evaluating these functions, we get ln(1 − x) ≤ −x, which is equivalent to −x ≥ ln(1 − x). Hence, the term in brackets on the LHS of (A.11) is positive, and that on the RHS is negative. Since the constants are positive by Theorem 3, the entire LHS is positive, and the entire RHS is negative. It follows that (A.11) holds, implying V(S̄ − ε) ≥ V(S̄) also holds. □

Proof of Theorem 5. By Theorem 1, we must show the value function V(x) satisfies conditions (P1)–(P4). By Theorem 4, both V(x) and V′(x) are continuous and bounded, implying (P1) and (P2) are satisfied. Now consider (P3), which requires V(x) ≥ sup_y {V(y) − B}. By Theorem 4, S is the unique maximum of V(x), so the RHS equals V(S) − B. Hence, (P3) is also satisfied since V(x) ≥ V(S) − B by Theorem 4.
By construction, the value function satisfies the Bellman ODE over the inaction region, which in our case becomes I = {x: V(x) > V(S) − B} = (s, S̄). Therefore, (P4) is satisfied, completing the proof. □

Proof of Theorem 6. To derive the stationary distribution, we approximate the controlled Brownian motion process z(t) by a random walk, so that we may use standard Markov chain techniques. Suppose the length of a time interval is dt and the size of a jump is dz = σ√dt. The probability p of an upward jump is given by

  p = (1/2)(1 − (g/σ)√dt) = (1/2)(1 − (g/σ²) dz).

Let q = 1 − p be the probability of a downward jump. Let f(z) denote the density of the controlled process. Consider a state z that lies in the set (s, S) ∪ (S, S̄) (the union of the two open sets). It can be reached either by jumping up from z − dz or down from z + dz, so we have

  f(z) = p f(z − dz) + q f(z + dz).

Re-arranging terms and using the definitions of p and dz, we get

  0 = [f(z + dz) − f(z)] − [f(z) − f(z − dz)] + (g/σ²) dz {[f(z + dz) − f(z)] + [f(z) − f(z − dz)]}.

Now divide this by (dz)² and take the limit as dz → 0. We get the ODE f″(z) = −(2g/σ²) f′(z). As it is a term that appears throughout the calculations, define α = −2g/σ², and note that R1 + R2 = −α. The general solution of the ODE is f(z) = A + C e^{αz}, where the constants A and C remain to be determined. They will be given by boundary conditions that we now derive. Once the process hits either s or S̄, it instantly jumps to S, so there will be no mass at the barriers: f(s) = f(S̄) = 0. Consider the return point S. It can be reached in four ways: an upward jump from S − dz; a downward jump from S + dz; an upward jump from S̄ − dz; or a downward jump from s + dz (each of the latter two hits a barrier and triggers an instantaneous reset to S). Therefore, we have

  f(S) = p f(S − dz) + q f(S + dz) + p f(S̄ − dz) + q f(s + dz).
Re-arranging terms, we get the following:

  f(S) − f(S − dz) = [f(S + dz) − f(S)] + [f(s + dz) − f(s)] − [f(S̄) − f(S̄ − dz)]
    − (g/σ²) dz {[f(S − dz) − f(S + dz)] − [f(s + dz) − f(s)] − [f(S̄) − f(S̄ − dz)]}.

Now divide this by dz and take the limit as dz → 0. Since (dz)² = σ²dt, the entire last term converges to zero. Letting f′−(z), f′+(z) denote the left- and right-hand derivatives of f(z) respectively, this becomes

  f′−(S) = f′+(S) + f′+(s) − f′−(S̄).

Finally, since f(z) is a density, it must integrate to one: ∫_s^S̄ f(z) dz = 1. We thus get five equations in four unknowns, whose unique solution is given in the text.

We now turn to deriving the expected waiting time in the inaction region. As before, let x(t) denote our Brownian motion given by (1). Define a process u(t) associated with x(t) as follows: u(t) = −x(t)/g − t. We claim u(t) is a martingale with respect to x(t). According to Karlin and Taylor (1975, p. 239), we must show that E{u(t + s) | x(r), 0 ≤ r ≤ t} = u(t) for all t > 0 and s > 0. Using the definition (2) for x(t), this can be shown easily:

  E[u(t + s) | x(r), 0 ≤ r ≤ t] = −(1/g) E[x(t + s) | x(t)] − (t + s)
    = −(1/g)[x(t) − gs] − (t + s) = u(t).

Let T be the first time x(t) reaches either s or S̄. Then by the Optional Sampling Theorem for martingales (Theorem 3.2 of Karlin and Taylor, 1975, p. 261), E[u(T)] = E[u(0)] = −x(0)/g, where by assumption s < x(0) < S̄. From the definition of u(t), we have E[u(T)] = −(1/g)E[x(T)] − E[T]. Plugging in the previous result, we have E[T] = x(0)/g − (1/g)E[x(T)]. Now it remains to calculate E[x(T)]. Let p be the probability that x(t) hits the upper barrier before the lower barrier. In Theorem 5.2 of Karlin and Taylor (1975, p. 361), p is given as

  p = (e^{−αx(0)} − e^{−αs}) / (e^{−αS̄} − e^{−αs}),

where −α = 2g/σ² as before. Then the expectation is E[x(T)] = pS̄ + (1 − p)s, which yields the result given in the text.
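The objects in the proof of Theorem 6 can be cross-checked numerically. The sketch below (with illustrative values of g, σ, s, S̄, and x(0) that are assumptions for the check, not the paper's calibration) compares the exact gambler's-ruin formulas for the approximating random walk against the continuum expressions for the hitting probability and E[T], and verifies by finite differences that f(z) = A + C e^{αz} solves f″(z) = −(2g/σ²) f′(z):

```python
import math

# Illustrative parameters (hypothetical): drift -g, volatility sigma,
# band s < x0 < S_bar, and alpha = -2g/sigma^2 as in the text.
g, sigma = 0.5, 1.0
s, S_bar, x0 = -1.0, 1.0, 0.0
alpha = -2.0 * g / sigma**2

# Continuum formulas from the proof of Theorem 6.
p_hit = (math.exp(-alpha * x0) - math.exp(-alpha * s)) / \
        (math.exp(-alpha * S_bar) - math.exp(-alpha * s))
E_xT = p_hit * S_bar + (1.0 - p_hit) * s
E_T = x0 / g - E_xT / g

# Approximating random walk: step dz = sigma*sqrt(dt), up-probability
# p = (1 - (g/sigma^2) dz)/2.  Exact gambler's-ruin absorption formulas.
dt = 1e-4
dz = sigma * math.sqrt(dt)
p = 0.5 * (1.0 - (g / sigma**2) * dz)
q = 1.0 - p
N = round((S_bar - s) / dz)          # upper barrier, in steps above s
k = round((x0 - s) / dz)             # starting point, in steps above s
r = q / p
p_hit_rw = (1.0 - r**k) / (1.0 - r**N)
steps = k / (q - p) - (N / (q - p)) * (1.0 - r**k) / (1.0 - r**N)
E_T_rw = steps * dt

assert abs(p_hit_rw - p_hit) < 1e-3   # hitting probability matches
assert abs(E_T_rw - E_T) < 1e-3       # expected waiting time matches

# f(z) = A + C e^{alpha z} solves f''(z) = -(2g/sigma^2) f'(z):
A, C, h = 2.0, -1.5, 1e-3
f = lambda z: A + C * math.exp(alpha * z)
z = 0.3
f1 = (f(z + h) - f(z - h)) / (2 * h)          # central first derivative
f2 = (f(z + h) - 2 * f(z) + f(z - h)) / h**2  # central second derivative
assert abs(f2 - (-2.0 * g / sigma**2) * f1) < 1e-4
```

As dt shrinks, the discrete absorption probability and expected absorption time of the walk converge to the continuum formulas, which is the limit the proof takes.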
□

References

Abel, A., Eberly, J., 1994. A unified model of investment under uncertainty. American Economic Review 84, 1369–1384.
Bar-Ilan, A., Sulem, A., 1995. Explicit solution of inventory problems with delivery lags. Mathematics of Operations Research 20, 709–720.
Bertola, G., Caballero, R., 1990. Kinked adjustment costs and aggregate dynamics. In: NBER Macroeconomics Annual. MIT Press, Cambridge, MA.
Billingsley, P., 1995. Probability and Measure. Wiley.
Blinder, A., 1991. Why are prices sticky? Preliminary results from an interview study. American Economic Review 81, 89–96.
Blinder, A., Canetti, E., Lebow, D., Rudd, J., 1998. Asking About Prices: A New Approach to Understanding Price Stickiness. Russell Sage Foundation, New York, NY.
Caplin, A., Leahy, J., 1991. State-dependent pricing and the dynamics of money and output. Quarterly Journal of Economics 106, 683–708.
Caplin, A., Leahy, J., 1997. Aggregation and optimization with state-dependent pricing. Econometrica 65, 601–625.
Caplin, A., Spulber, D., 1987. Menu costs and the neutrality of money. Quarterly Journal of Economics 102, 703–725.
Constantinides, G., Richard, S., 1978. Existence of optimal simple policies for discounted-cost inventory and cash management in continuous time. Mathematics of Operations Research 26, 620–636.
Cox, D.R., Miller, H.D., 1965. The Theory of Stochastic Processes. Chapman and Hall.
Danziger, L., 1983. Price adjustments with stochastic inflation. International Economic Review 24, 699–707.
Dixit, A., 1991. A simplified treatment of the theory of optimal control of Brownian motion. Journal of Economic Dynamics and Control 15, 657–673.
Dixit, A.K., Pindyck, R.S., 1994. Investment Under Uncertainty. Princeton Univ. Press, Princeton, NJ.
Dumas, B., 1991. Super contact and related optimality conditions. Journal of Economic Dynamics and Control 15, 675–685.
Edwards, C.H., Penney, D.E., 1993. Elementary Differential Equations. Prentice Hall.
Fleming, W.H., Rishel, R.W., 1975. Deterministic and Stochastic Optimal Control. Springer-Verlag.
Harrison, M.J., 1985. Brownian Motion and Stochastic Flow Systems. Krieger.
Harrison, M., Sellke, T., Taylor, A., 1983. Impulse control of Brownian motion. Mathematics of Operations Research 8, 454–466.
Karatzas, I., Shreve, S.E., 1991. Brownian Motion and Stochastic Calculus. Springer-Verlag.
Karlin, S., Taylor, H.M., 1975. A First Course in Stochastic Processes. Academic Press.
Levy, D., Bergen, M., Dutta, S., Venable, R., 1997. The magnitude of menu costs: Direct evidence from large US supermarket chains. Quarterly Journal of Economics 112, 791–825.
Richard, S., 1977. Optimal impulse control of a diffusion process with both fixed and proportional costs of control. SIAM Journal of Control and Optimization 15, 79–91.
Rudin, W., 1976. Principles of Mathematical Analysis. McGraw–Hill.
Scarf, H., 1959. The optimality of (S, s) policies in the dynamic inventory problem. In: Arrow, K., Karlin, S., Suppes, P. (Eds.), Mathematical Methods in Social Sciences. Stanford Univ. Press, Palo Alto, CA, pp. 196–202.
Sheshinski, E., Weiss, Y., 1977. Inflation and costs of price adjustment. Review of Economic Studies 44, 287–303.
Sheshinski, E., Weiss, Y., 1983. Optimum pricing policy under stochastic inflation. Review of Economic Studies 50, 513–529.
Tsiddon, D., 1993. The (mis)behavior of the aggregate price level. Review of Economic Studies 60, 889–902.