A Utility Model of Learning How to Consume∗
Philip H. Dybvig∗∗ , Bong-Gyu Jang† , and Hyeng Keun Koo‡
October 2013.
Abstract
This paper proposes a utility model in which agents require effort to learn how to
consume effectively. In this model, there is an ideal utility function of consumption that
requires skill to achieve. At each time, there is a range of consumption levels for which
the agent can consume at the full potential described by this ideal utility function, and
consuming outside this range generates less than the potential utility. There is an optimal
policy for expending effort to move the boundaries of the range of consumption levels at
which the agent is skilled at consuming. When the range is narrow, the presence of
this learning induces a kind of risk aversion in the large, and makes the indirect utility
function more concave than it would otherwise be. When the utility loss of consuming
outside the comfortable zone is relatively small or the cost of learning is sufficiently large,
the agent consumes near the optimal consumption level, spending little effort in learning.
In the opposite case, however, the agent never consumes outside the comfortable zone and
expends relatively large effort on learning.
∗ Preliminary versions of this paper were presented at Southwestern University of Finance and Economics (China),
Université Paris Nord (France), Kyoto University Economic Research Institute (Japan), and the 2013 Annual Meeting
of the Asian Econometric Society. This research was supported by the WCU (World Class University) program
(R31-20007) and by the Basic Science Research Program (2012R1A1A2038735) through the National Research
Foundation of Korea funded by the Ministry of Education, Science and Technology.
∗∗ Washington University in Saint Louis, USA, E-mail: [email protected].
† Department of Industrial and Management Engineering, POSTECH, Korea, E-mail: [email protected].
‡ Graduate Department of Financial Engineering, Ajou University, Korea, E-mail: [email protected].
1 Introduction
The mechanism which connects consumption decisions is not that of rational
planning but of learning and habit formation. (Duesenberry 1949)
Dynamic models of consumption choice lie at the heart of modern economic analysis.
They are related to the consumption function, which plays a central role in macroeconomic
policy analysis, and they are closely linked with asset pricing because they determine portfolio demand
and the resulting equilibrium asset pricing relationship.
Since the pioneering research of Modigliani and Brumberg (1955) and Friedman (1957), dynamic consumption choice has mostly been modeled as the result of optimizing time-separable von Neumann-Morgenstern utility functions. This modeling paradigm is now called the permanent income/life-cycle theory.
There are, however, challenges to the life-cycle/permanent income theory. For example, Carroll (2002) claims that the life-cycle theory does not explain the consumption pattern of the rich, and he proposes an alternative explanation based on the capitalist spirit. A more serious challenge came unexpectedly from research on asset pricing. Hansen and Singleton (1982) tested the validity of the restrictions on asset pricing imposed by optimization of time-separable von Neumann-Morgenstern utility functions, and strongly rejected them. Mehra and Prescott (1985) and Weil (1989) calibrated to US historical data a general equilibrium model in which a representative individual maximizes utility of lifetime consumption, similar to the individual in the permanent income hypothesis, and found three anomalies: the historical equity premium was too high, the historical risk-free rate was too low, and the volatility of stock returns was too large to be reconciled reasonably with the theoretical model.
The asset pricing puzzles motivated researchers to go beyond the life-cycle/permanent income theory and to probe more realistic models of consumption behavior. Constantinides' model of habit formation was one notable example among many such works (Constantinides 1990). He modeled a consumer who suffers a utility loss if consumption falls short of a benchmark level, which is a weighted average of past consumption rates, and showed that the equity premium puzzle could be resolved by considering such a representative agent. Dybvig (1995) shows in a partial equilibrium model that the puzzlingly low correlation of consumption and stock returns, which was the main reason for the past rejection of the asset pricing restrictions as in Hansen and Singleton (1982), can be justified if the representative agent does not tolerate any decline in consumption, an extreme form of habit formation. Campbell and Cochrane (2000) show that all three puzzles can, at least partially, be answered by considering external habit formation.
Much earlier in the investigation of consumer behavior, before the research of Modigliani and Brumberg (1955) and Friedman (1957), Duesenberry (1949) had already understood the unrealistic aspects of explanations based on rational planning with time-separable utility functions, as is explicit in his remark quoted at the head of our Introduction. In his seminal book, he went on to say:
At any moment a consumer already has a well-established set of consumption
habits... Suppose a man suffers a 50 per cent reduction in his income and expects
this reduction to be permanent. Immediately after the change he will tend to act
in the same way as before... In retrospect he will regret some of his expenditures.
In the ensuing periods the same stimuli as before will arise, but eventually he will
learn to reject some expenditures and respond by buying cheap substitutes for the
goods formerly purchased. (p. 24)
In this paper we propose a theoretical model which captures these realistic aspects of consumer behavior. In our model there exists a comfortable zone of consumption, which is bounded by upper and lower barriers. When the agent consumes outside the comfortable zone he/she suffers a utility loss, but he/she can enlarge the comfortable zone by learning. Learning also incurs a utility loss. Thus, when the agent has a positive or negative income (or wealth) shock, the agent often faces a tradeoff between two options. The first is to consume outside the comfortable zone, near the level that would be optimal if the comfortable zone were the whole space. The second is to consume within the comfortable zone while enlarging it by learning. We show that both cases can occur depending upon parameter values. When the utility loss of consuming outside the comfortable zone is relatively small or the cost of learning is sufficiently large, the agent consumes near the optimal consumption level, spending little effort on learning. In the opposite case, however, the agent never consumes outside the comfortable zone and expends relatively large effort on learning. The result is similar to Keller and Rady's (1999) conclusion that when a monopolist learns about the environment by observing a series of orders, two cases emerge depending on parameter values: producing a myopically optimal quantity without much effort to learn, or producing an amount significantly different from the myopically optimal level in an attempt to explore and learn. Our model, however, does not exhibit an extreme bifurcation around a set of parameter values as theirs does. We also show that relative risk aversion (RRA) can be quite different from that implied by the period utility function. Introducing a comfortable zone and learning generates a rich pattern of consumption and risk taking.
Our model opens new possibilities for empirical research. For example, the existence of the two different types of consumption patterns can be tested against the data. If a large proportion of individuals exhibit the second pattern, i.e., that of always consuming within the comfortable zone, then the empirical rejection of the asset pricing models can potentially be justified. Dybvig's (1995) model of intolerance for decline in consumption can already justify it. However, his preference is an extreme and idealized type of habit formation and has limited empirical relevance. We propose a more realistic model in this paper.
The paper proceeds as follows. Section 2 explains our model. Section 3 studies an optimal consumption and investment choice problem in a standard financial market environment and gives an analytic solution to the problem. Section 4 discusses the case where the agent's baseline felicity function has constant relative risk aversion. Section 5 concludes, and the Appendix contains technical results.
2 A Utility Model in Continuous Time
We consider an infinitely-lived economic agent whose objective is to maximize
\[
E\left[\int_0^\infty e^{-\rho t}\Big\{\Big(U(c_t) - \bar K\big(U(c_t) - U(\bar L_t)\big)^+ - \underline K\big(U(\underline L_t) - U(c_t)\big)^+\Big)dt - \alpha\bar K\,dU(\bar L_t) + \beta\underline K\,dU(\underline L_t)\Big\}\right], \tag{1}
\]
where $\bar L_t$ is non-decreasing in $t$, $\underline L_t$ is non-increasing in $t$, and $0 \le \bar K \le 1$, $\underline K \ge 0$, $\alpha > 0$, and $\beta > 0$ are constants. $U(c)$ is a strictly increasing and strictly concave felicity function of consumption, $c_t$ is the agent's consumption rate at $t$, and $\underline L_t \le c_t \le \bar L_t$ is the agent's comfortable zone of consumption, outside of which the agent pays the utility penalties $\bar K(U(c_t) - U(\bar L_t))^+$ and $\underline K(U(\underline L_t) - U(c_t))^+$. The two terms, $\alpha\bar K\,dU(\bar L_t)$ and $-\beta\underline K\,dU(\underline L_t)$, represent the utility costs of making efforts to enlarge the comfortable zone. They can be interpreted as costs of learning at uncomfortably rich or poor levels. When consuming slightly beyond a boundary level, either $\bar L_t$ or $\underline L_t$, the agent faces a tradeoff between accepting the utility penalty of consuming in an uncomfortable zone and bearing the cost of learning about the new environment.
The model builds upon a model by Dybvig and Rogers (2013). In their model there is a non-decreasing anticipation level, which provides utility of hope, but the agent suffers a penalty if consumption falls short of the anticipation level. Our model differs from theirs in the following sense: we have two levels, $\bar L_t$ and $\underline L_t$, which define the comfortable zone of consumption, and the penalty terms have a natural interpretation as learning in an unfamiliar environment.
The financial market environment is as follows. There are two assets: one risky asset (a stock) and one riskless asset (a bond). The risky asset price, $S_t$, satisfies
\[
dS_t = \mu S_t\,dt + \sigma S_t\,dB_t,
\]
where $B_t$ is a Wiener process, and $\mu$ and $\sigma$ are the mean and standard deviation of the return on the risky asset. The bond price, $P_t$, evolves according to the equation
\[
dP_t = rP_t\,dt,
\]
where $r$ is the risk-free rate. The risk premium on the risky asset is assumed to be positive, i.e., $\mu - r > 0$. We assume the investment opportunity set is constant, i.e., $\mu$, $\sigma$ and $r$ are constant.
The agent's wealth, $w_t$, evolves according to the following dynamics:
\[
dw_t = (rw_t + (\mu - r)\pi_t - c_t)\,dt + \sigma\pi_t\,dB_t, \quad w_0 = w,
\]
where $\pi_t$ is the agent's investment (in dollars) in the risky asset.
The agent's objective is to maximize (1) subject to
\[
w_t \ge 0 \quad \text{for all } t \ge 0.
\]
This constraint precludes arbitrage opportunities in the financial market and makes the agent's problem well-defined (see Dybvig and Huang 1988).
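For convenience, we define $\tilde\alpha \equiv \rho\alpha$, $\tilde\beta \equiv \rho\beta$, and
\[
\begin{aligned}
G(c, \bar L, \underline L) &\equiv U(c) - \bar K\big(U(c) - U(\bar L)\big)^+ - \underline K\big(U(\underline L) - U(c)\big)^+ - \tilde\alpha\bar K U(\bar L) + \tilde\beta\underline K U(\underline L)\\
&= \min\big\{U(c),\ (1 - \bar K)U(c) + \bar K U(\bar L),\ (1 + \underline K)U(c) - \underline K U(\underline L)\big\} - \tilde\alpha\bar K U(\bar L) + \tilde\beta\underline K U(\underline L).
\end{aligned} \tag{2}
\]
Note that $G$ is a concave function of $c$ whose gradient jumps at $c = \bar L$ and at $c = \underline L$. Integrating the learning-cost terms by parts, the expectation in (1) can now be rewritten as
\[
E\left[\int_0^\infty e^{-\rho t}G(c_t, \bar L_t, \underline L_t)\,dt\right] + \alpha\bar K U(\bar L_0) - \beta\underline K U(\underline L_0).
\]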
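The equality of the two lines in (2) can be checked numerically. The following is a minimal sketch, assuming a CRRA felicity with R = 2 and illustrative barrier and penalty values; the function names (`crra`, `G_penalty`, `G_min`) and the parameter dictionary are hypothetical and are not part of the paper.

```python
import math

def crra(c, R=2.0):
    """CRRA felicity U(c) = c^(1-R)/(1-R); an illustrative choice of U."""
    return math.log(c) if R == 1.0 else c ** (1.0 - R) / (1.0 - R)

def G_penalty(c, L_up, L_lo, K_up, K_lo, a_t, b_t, U=crra):
    """First line of (2): felicity minus both penalties, plus the barrier terms."""
    pen_up = K_up * max(U(c) - U(L_up), 0.0)   # consuming above the upper barrier
    pen_lo = K_lo * max(U(L_lo) - U(c), 0.0)   # consuming below the lower barrier
    return U(c) - pen_up - pen_lo - a_t * K_up * U(L_up) + b_t * K_lo * U(L_lo)

def G_min(c, L_up, L_lo, K_up, K_lo, a_t, b_t, U=crra):
    """Second line of (2): the same function written as a minimum of three branches."""
    inner = min(U(c),
                (1.0 - K_up) * U(c) + K_up * U(L_up),
                (1.0 + K_lo) * U(c) - K_lo * U(L_lo))
    return inner - a_t * K_up * U(L_up) + b_t * K_lo * U(L_lo)

# Illustrative values only (assumed): lower barrier 2.35, upper barrier 9.35,
# a_t and b_t stand for alpha-tilde and beta-tilde.
params = dict(L_up=9.35, L_lo=2.35, K_up=0.5, K_lo=0.5, a_t=0.634, b_t=0.634)
grid = [0.5, 2.35, 4.0, 9.35, 20.0]
gaps = [abs(G_penalty(c, **params) - G_min(c, **params)) for c in grid]
```

The two forms agree whenever the lower barrier does not exceed the upper barrier and the upper penalty coefficient lies in [0, 1], which is what makes the minimum pick the right branch in each region.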
Thus, we can restate the agent's problem as follows:
Problem 1 Given $\bar L$, $\underline L$, and $w$, obtain
\[
V(w, \bar L, \underline L) \equiv \sup E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}G(c_t, \bar L_t, \underline L_t)\,dt\right],
\]
subject to $\bar L_t$ being non-decreasing and $\underline L_t$ being non-increasing, where the supremum is taken over all admissible consumption and investment processes $(c, \pi)$ that keep the wealth process $w$ nonnegative. The dynamics of the wealth process are given by
\[
dw_t = (rw_t + (\mu - r)\pi_t - c_t)\,dt + \sigma\pi_t\,dB_t.
\]
Here, $E^{w,\bar L,\underline L}[\cdot] \equiv E[\,\cdot \mid w_0 = w, \bar L_0 = \bar L, \underline L_0 = \underline L]$.
3 Optimal Consumption and Learning
In this section we derive a solution to Problem 1. The derivation is similar to the one by Dybvig and Rogers (2013). There is, however, a significant technical difference between this paper and theirs: we need to determine two boundaries rather than one. We will show that the two boundaries of the comfortable zone can be determined independently, except that one is linked to the other through the agent's initial wealth level.
As in Dybvig and Rogers (2013), we rely on the martingale method, and hence the optimal consumption and boundary processes are expressed by using the state price density
\[
\xi_t = \exp\Big\{-\Big(r + \frac{1}{2}\kappa^2\Big)t - \kappa B_t\Big\}, \tag{3}
\]
where $\kappa$ is the market price of risk defined as
\[
\kappa \equiv \frac{\mu - r}{\sigma}. \tag{4}
\]
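As a quick illustration of (3) and (4), the sketch below evaluates the state price density for a given Brownian value and checks by Monte Carlo that its mean at time t is close to exp(-r t), the price of a riskless discount bond. The market parameters are the baseline values quoted in Section 4.2; the function names, sample size and seed are arbitrary assumptions.

```python
import math
import random

def state_price_density(t, B_t, r, kappa):
    """xi_t = exp(-(r + kappa^2/2) t - kappa B_t), as in (3)."""
    return math.exp(-(r + 0.5 * kappa ** 2) * t - kappa * B_t)

def multiplier(t, B_t, lam, rho, r, kappa):
    """Lambda_t = lam * exp(rho t) * xi_t, the process used throughout Section 3."""
    return lam * math.exp(rho * t) * state_price_density(t, B_t, r, kappa)

mu, r, sigma = 0.0957, 0.0317, 0.2081   # baseline market parameters from Section 4.2
kappa = (mu - r) / sigma                 # market price of risk, as in (4)

rng = random.Random(0)                   # fixed seed, for reproducibility only
t, n = 1.0, 200_000
mc_mean = sum(state_price_density(t, rng.gauss(0.0, math.sqrt(t)), r, kappa)
              for _ in range(n)) / n
bond_price = math.exp(-r * t)            # theoretical value of E[xi_t]
```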
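We now impose the following standing assumption on $U$, which allows an ordinary Merton problem to be well-posed.1
Assumption 1 For all $\lambda > 0$,
\[
U'(0) = \infty, \quad U'(\infty) = 0, \quad E^{w,\bar L,\underline L}\left[\int_0^\infty \xi_t I(\lambda e^{\rho t}\xi_t)\,dt\right] < \infty,
\]
where $I(\cdot) \equiv (U')^{-1}(\cdot)$.
1 The first two conditions in the assumption are the Inada conditions. We can relax the Inada conditions, treat the case $U'(0) < \infty$ and $U'(\infty) \ge 0$, and derive a solution to the agent's problem. This generalization does not add much economic intuition, so we confine our attention in this paper to the case where the Inada conditions are satisfied.
Theorem 3.1 (a) The solution to Problem 1 is characterized by a unique parameter $\lambda > 0$, in terms of which the optimal consumption process $c^*$ and barrier processes $\bar L^*$ and $\underline L^*$ are given by
\[
\begin{aligned}
\Lambda_t &= \lambda e^{\rho t}\xi_t,\\
\underline\Lambda_t &= \inf_{0\le s\le t}\Lambda_s,\\
\bar\Lambda_t &= \sup_{0\le s\le t}\Lambda_s,\\
\bar\eta_t &= \Big(\frac{\underline\Lambda_t}{\bar z^*}\Big) \wedge U'(\bar L_0),\\
\underline\eta_t &= \Big(\frac{\bar\Lambda_t}{\underline z^*}\Big) \vee U'(\underline L_0),\\
\bar L^*_t &= I(\bar\eta_t),\\
\underline L^*_t &= I(\underline\eta_t),
\end{aligned} \tag{5}
\]
\[
c^*_t = \begin{cases}
I\Big(\dfrac{\Lambda_t}{1 - \bar K}\Big), & \Lambda_t \le (1 - \bar K)\bar\eta_t,\\[1.5ex]
\bar L^*_t, & (1 - \bar K)\bar\eta_t \le \Lambda_t \le \bar\eta_t,\\[1ex]
I(\Lambda_t), & \bar\eta_t \le \Lambda_t \le \underline\eta_t,\\[1ex]
\underline L^*_t, & \underline\eta_t \le \Lambda_t \le (1 + \underline K)\underline\eta_t,\\[1ex]
I\Big(\dfrac{\Lambda_t}{1 + \underline K}\Big), & \Lambda_t \ge (1 + \underline K)\underline\eta_t.
\end{cases} \tag{6}
\]
The constants $\bar z^*$ and $\underline z^*$ are determined by optimal stopping problems described below, which do not depend on $U$.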
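(b) Define
\[
J(\Lambda, \bar L, \underline L) \equiv E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\tilde G(\Lambda_t, \bar L^*_t, \underline L^*_t)\,dt \,\Big|\, \Lambda_0 = \Lambda, \bar L^*_0 = \bar L, \underline L^*_0 = \underline L\right],
\]
where $\tilde G$ is the convex dual of $G$. The value function $V$ of Problem 1 is given by
\[
V(w, \bar L, \underline L) = \min_{\lambda > 0}\big\{J(\lambda, \bar L, \underline L) + \lambda w\big\}, \tag{7}
\]
and there exists a unique $\lambda^*$ which minimizes the right-hand side of equation (7). Moreover, $\lambda^*$ satisfies
\[
w = -\frac{\partial J(\lambda^*, \bar L, \underline L)}{\partial\Lambda}, \tag{8}
\]
and generally we have
\[
w_t = -\frac{\partial J(\Lambda_t, \bar L_t, \underline L_t)}{\partial\Lambda} \tag{9}
\]
with $\Lambda_0 = \lambda^*$. Therefore, the solution to Problem 1 is characterized by the process $c^*$ in (6) and the barrier processes $\bar L^*$ and $\underline L^*$ in (5), with $\lambda = \lambda^* > 0$.
The parameter $\lambda^*$ satisfies the budget constraint
\[
E^{w,\bar L,\underline L}\left[\int_0^\infty \xi_t c^*_t\,dt\right] = w. \tag{10}
\]
The optimal investment in the risky asset, $\pi^*_t$, is given by
\[
\pi^*_t = \frac{\kappa}{\sigma}\Lambda_t\frac{\partial^2 J(\Lambda_t, \bar L_t, \underline L_t)}{\partial\Lambda^2} = -\frac{\kappa}{\sigma}\left(\Lambda_t\frac{\partial^2 J(\Lambda_t, \bar L_t, \underline L_t)}{\partial\Lambda^2}\Big/\frac{\partial J(\Lambda_t, \bar L_t, \underline L_t)}{\partial\Lambda}\right)w_t. \tag{11}
\]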
PROOF of (a): As in Dybvig and Rogers (2013), the proof proceeds in the following steps:
• Write down the Lagrangian form of the problem.
• Optimize over $c$, taking $\bar L$ and $\underline L$ as given.
• Optimize over non-decreasing adapted $\bar L$ and over non-increasing adapted $\underline L$.
• Verify that the candidate solution is optimal.
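1. The Lagrangian Form of the Problem
The financial market is dynamically complete, and hence the problem can be stated in the following Lagrangian form:
\[
\sup_{c,\bar L,\underline L}\Psi(c, \bar L, \underline L, \lambda) \equiv \sup_{c,\bar L,\underline L}E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\{G(c_t, \bar L_t, \underline L_t) - \Lambda_t c_t\}\,dt\right] + \lambda w_0, \tag{12}
\]
where
\[
\Lambda_t = \lambda e^{\rho t}\xi_t.
\]
2. Optimizing over c
From the Lagrangian, the agent's optimal consumption can be derived from the first-order condition
\[
\frac{\partial G}{\partial c}(c^*_t, \bar L_t, \underline L_t) = \Lambda_t. \tag{13}
\]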
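We know that
\[
\frac{\partial G}{\partial c}(c, \bar L, \underline L) = \begin{cases}
(1 + \underline K)U'(c), & c < \underline L,\\
\big[U'(c), (1 + \underline K)U'(c)\big], & c = \underline L,\\
U'(c), & \underline L < c < \bar L,\\
\big[(1 - \bar K)U'(c), U'(c)\big], & c = \bar L,\\
(1 - \bar K)U'(c), & c > \bar L,
\end{cases} \tag{14}
\]
where the intervals at $c = \underline L$ and $c = \bar L$ are the sets of supergradients at the two kinks. From this and the first-order condition we derive the expression for optimal consumption in Theorem 3.1.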
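The five branches of the consumption rule (6) can be written as a short function. This is a sketch under stated assumptions: CRRA felicity, so that the inverse marginal utility is I(y) = y^(-1/R), and the marginal utilities at the two current barriers passed in directly as `eta_up` and `eta_lo`. All names and numerical values are hypothetical.

```python
def inv_marginal(y, R):
    """I(y) = (U')^{-1}(y) = y^(-1/R) for CRRA felicity (assumed form of U)."""
    return y ** (-1.0 / R)

def optimal_consumption(Lam, eta_up, eta_lo, K_up, K_lo, R):
    """Consumption rule (6).

    eta_up = U'(upper barrier) and eta_lo = U'(lower barrier), so eta_up <= eta_lo.
    Requires 0 <= K_up < 1 and K_lo >= 0.
    """
    if Lam <= (1.0 - K_up) * eta_up:
        return inv_marginal(Lam / (1.0 - K_up), R)   # above the zone, penalty accepted
    if Lam <= eta_up:
        return inv_marginal(eta_up, R)               # pinned at the upper barrier
    if Lam <= eta_lo:
        return inv_marginal(Lam, R)                  # unconstrained, inside the zone
    if Lam <= (1.0 + K_lo) * eta_lo:
        return inv_marginal(eta_lo, R)               # pinned at the lower barrier
    return inv_marginal(Lam / (1.0 + K_lo), R)       # below the zone, penalty accepted

# Illustrative barriers (assumed): lower 2.35, upper 9.35, with R = 2.
R, K_up, K_lo = 2.0, 0.5, 0.5
eta_up, eta_lo = 9.35 ** (-R), 2.35 ** (-R)
lams = [eta_up * 0.1 * k for k in range(1, 400)]
path = [optimal_consumption(x, eta_up, eta_lo, K_up, K_lo, R) for x in lams]
```

Consumption is continuous and non-increasing in the multiplier, with two flat stretches where the first-order condition (13) is satisfied by a supergradient at a kink of G.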
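3. Optimizing over $\bar L$ and $\underline L$.
By using the convex dual function $\tilde G$ of $G$, defined as
\[
\tilde G(\Lambda, \bar L, \underline L) \equiv G(c^*, \bar L, \underline L) - \Lambda c^*, \tag{15}
\]
the Lagrangian can now be written as
\[
\sup_{c,\bar L,\underline L}\Psi(c, \bar L, \underline L, \lambda) \equiv \sup_{\bar L,\underline L}E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\tilde G(\Lambda_t, \bar L_t, \underline L_t)\,dt\right] + \lambda w_0. \tag{16}
\]
The convex dual function satisfies
\[
\frac{\partial\tilde G}{\partial\bar L} + \tilde\alpha\bar K U'(\bar L) = \begin{cases}
\bar K U'(\bar L), & \Lambda < (1 - \bar K)U'(\bar L),\\
U'(\bar L) - \Lambda, & (1 - \bar K)U'(\bar L) \le \Lambda \le U'(\bar L),\\
0, & \Lambda > U'(\bar L),
\end{cases} \tag{17}
\]
\[
\frac{\partial\tilde G}{\partial\underline L} - \tilde\beta\underline K U'(\underline L) = \begin{cases}
0, & \Lambda < U'(\underline L),\\
U'(\underline L) - \Lambda, & U'(\underline L) \le \Lambda \le (1 + \underline K)U'(\underline L),\\
-\underline K U'(\underline L), & \Lambda > (1 + \underline K)U'(\underline L),
\end{cases} \tag{18}
\]
and
\[
\frac{\partial^2\tilde G}{\partial\bar L\,\partial\underline L} = 0.
\]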
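This can be rewritten as
\[
\begin{aligned}
\frac{\partial\tilde G}{\partial\bar L} &= -\tilde\alpha\bar K U'(\bar L) + \Big(\big(U'(\bar L) - \Lambda\big)^+ \wedge \big(\bar K U'(\bar L)\big)\Big)\\
&= U'(\bar L)\left(-\tilde\alpha\bar K + \Big(\Big(1 - \frac{\Lambda}{U'(\bar L)}\Big)^+ \wedge \bar K\Big)\right),
\end{aligned} \tag{19}
\]
\[
\begin{aligned}
\frac{\partial\tilde G}{\partial\underline L} &= \tilde\beta\underline K U'(\underline L) + \Big(-\big(\Lambda - U'(\underline L)\big)^+ \vee \big(-\underline K U'(\underline L)\big)\Big)\\
&= U'(\underline L)\left(\tilde\beta\underline K - \Big(\Big(\frac{\Lambda}{U'(\underline L)} - 1\Big)^+ \wedge \underline K\Big)\right).
\end{aligned} \tag{20}
\]
Note that
\[
\begin{aligned}
Q &\equiv E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\tilde G(\Lambda_t, \bar L_t, \underline L_t)\,dt\right]\\
&= E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\Big\{\tilde G(\Lambda_t, \bar L_t, \underline L_0) + \int_{\underline L_0}^{\underline L_t}\frac{\partial\tilde G}{\partial\underline L}(\Lambda_t, \bar L_t, y)\,dy\Big\}dt\right]\\
&= E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\Big\{\tilde G(\Lambda_t, \bar L_0, \underline L_0) + \int_{\bar L_0}^{\bar L_t}\frac{\partial\tilde G}{\partial\bar L}(\Lambda_t, x, \underline L_0)\,dx\right.\\
&\qquad\left. + \int_{\underline L_0}^{\underline L_t}\Big(\frac{\partial\tilde G}{\partial\underline L}(\Lambda_t, \bar L_0, y) + \int_{\bar L_0}^{\bar L_t}\frac{\partial^2\tilde G}{\partial\bar L\,\partial\underline L}(\Lambda_t, x, y)\,dx\Big)dy\Big\}dt\right]\\
&= E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\tilde G(\Lambda_t, \bar L_0, \underline L_0)\,dt\right]\\
&\qquad + E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\int_{\bar L_0}^\infty 1_{\{x \le \bar L_t\}}\frac{\partial\tilde G}{\partial\bar L}(\Lambda_t, x, \underline L_0)\,dx\,dt\right]\\
&\qquad + E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\int_0^{\underline L_0}1_{\{y \ge \underline L_t\}}\Big(-\frac{\partial\tilde G}{\partial\underline L}(\Lambda_t, \bar L_0, y)\Big)dy\,dt\right]\\
&\equiv Q_0 + Q_1 + Q_2,
\end{aligned}
\]
where $1_A$ is the characteristic function of the set $A$. The first term, $Q_0$, is determined by $\Lambda_0$, $\bar L_0$, $\underline L_0$ and is independent of $\bar L_t$ and $\underline L_t$ for $t > 0$.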
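Define
\[
f(z) \equiv E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\big\{-\tilde\alpha\bar K + ((1 - \Lambda_t)^+ \wedge \bar K)\big\}dt \,\Big|\, \Lambda_0 = z\right]. \tag{21}
\]
Then, writing $\tau_x$ for the first time at which $\bar L_t$ reaches $x$, $Q_1$ can be expressed as
\[
\begin{aligned}
Q_1 &= E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\int_{\bar L_0}^\infty 1_{\{x \le \bar L_t\}}\frac{\partial\tilde G}{\partial\bar L}(\Lambda_t, x, \underline L_0)\,dx\,dt\right]\\
&= \int_{\bar L_0}^\infty E^{w,\bar L,\underline L}\left[\int_{\tau_x}^\infty e^{-\rho t}\frac{\partial\tilde G}{\partial\bar L}(\Lambda_t, x, \underline L_0)\,dt\right]dx\\
&= \int_{\bar L_0}^\infty U'(x)E^{w,\bar L,\underline L}\big[e^{-\rho\tau_x}f(\Lambda_{\tau_x}/U'(x))\big]dx.
\end{aligned} \tag{22}
\]
As in Dybvig and Rogers (2013), maximization of $Q_1$ over non-decreasing $\bar L$ can be regarded as a family of optimal stopping problems
\[
\sup_{\tau_x}E^{w,\bar L,\underline L}\big[e^{-\rho\tau_x}f(\Lambda_{\tau_x}/U'(x))\big],
\]
one for each $x$. There is essentially one optimal stopping problem,
\[
\bar f(z) \equiv \sup_\tau E^{w,\bar L,\underline L}\big[e^{-\rho\tau}f(\Lambda_\tau) \,\big|\, \Lambda_0 = z\big], \tag{23}
\]
since the multiplicative factor $U'(x)$ can be absorbed into the initial condition for $\Lambda$. As shown in Appendix A.2, the optimal stopping time for the problem is of the form
\[
\tau^* \equiv \inf\{t : \Lambda_t \le \bar z^*\} \tag{24}
\]
for some positive constant $\bar z^*$. The optimal stopping time $\tau_x$ is then simply
\[
\tau_x \equiv \inf\{t : \Lambda_t \le \bar z^* U'(x)\}, \tag{25}
\]
from which it follows that the stopping times $\tau_x$ increase with $x$. Hence, for each $x > \bar L_0$, the optimal upper barrier $\bar L^*$ must satisfy the equality of events
\[
\{\bar L^*_t > x\} = \{\tau_x < t\} = \{\underline\Lambda_t < \bar z^* U'(x)\} = \Big\{I\Big(\frac{\underline\Lambda_t}{\bar z^*}\Big) > x\Big\}. \tag{26}
\]
It follows that the constructed $\bar L^*$ is non-decreasing and may be expressed explicitly as
\[
\bar L^*_t = \max\Big\{\bar L_0, I\Big(\frac{\underline\Lambda_t}{\bar z^*}\Big)\Big\} = I\Big(\frac{\underline\Lambda_t}{\bar z^*} \wedge U'(\bar L_0)\Big). \tag{27}
\]
This proves the form (5) of $\bar L^*$ in Theorem 3.1. Note that the choice of $\bar L^*$ does not depend upon the level of $\underline L$ except possibly through $\Lambda_0$, the initial Lagrange multiplier.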
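The explicit form (27) says that the upper barrier moves only when the multiplier reaches a new running minimum that is low enough. A minimal sketch of this update along a discrete path follows, assuming CRRA felicity so that I(y) = y^(-1/R); the threshold value `z_star`, the path and the function name are made-up illustrations, not values from the paper.

```python
def upper_barrier_path(Lam_path, L0, z_star, R):
    """Upper barrier from (27): max(L0, I(running_min(Lam) / z_star)), with I(y) = y^(-1/R)."""
    barriers, running_min = [], float("inf")
    for lam in Lam_path:
        running_min = min(running_min, lam)           # running infimum of Lambda
        candidate = (running_min / z_star) ** (-1.0 / R)
        barriers.append(max(L0, candidate))           # the barrier never decreases
    return barriers

# Hypothetical inputs: a short multiplier path, initial barrier 9.35, threshold 0.8, R = 2.
Lam_path = [0.020, 0.015, 0.018, 0.005, 0.010]
barriers = upper_barrier_path(Lam_path, L0=9.35, z_star=0.8, R=2.0)
```

In this example the barrier stays at its initial level until the multiplier drops to 0.005, jumps at that step, and then stays at the new level when the multiplier rebounds.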
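Next we define
\[
g(z) \equiv E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\big\{-\tilde\beta\underline K + ((\Lambda_t - 1)^+ \wedge \underline K)\big\}dt \,\Big|\, \Lambda_0 = z\right]. \tag{28}
\]
The proof proceeds similarly to the proof for the choice of $\bar L^*$ above. Writing $\tau_y$ for the first time at which $\underline L_t$ reaches $y$, we have
\[
\begin{aligned}
Q_2 &= E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\int_0^{\underline L_0}1_{\{y \ge \underline L_t\}}\Big(-\frac{\partial\tilde G}{\partial\underline L}(\Lambda_t, \bar L_0, y)\Big)dy\,dt\right]\\
&= \int_0^{\underline L_0}E^{w,\bar L,\underline L}\left[\int_{\tau_y}^\infty e^{-\rho t}\Big(-\frac{\partial\tilde G}{\partial\underline L}(\Lambda_t, \bar L_0, y)\Big)dt\right]dy\\
&= \int_0^{\underline L_0}U'(y)E^{w,\bar L,\underline L}\big[e^{-\rho\tau_y}g(\Lambda_{\tau_y}/U'(y))\big]dy,
\end{aligned} \tag{29}
\]
so maximization of $Q_2$ over non-increasing $\underline L$ can be regarded as a family of optimal stopping problems
\[
\sup_{\tau_y}E^{w,\bar L,\underline L}\big[e^{-\rho\tau_y}g(\Lambda_{\tau_y}/U'(y))\big],
\]
one for each $y$. There is essentially one optimal stopping problem,
\[
\underline g(z) \equiv \sup_\tau E^{w,\bar L,\underline L}\big[e^{-\rho\tau}g(\Lambda_\tau) \,\big|\, \Lambda_0 = z\big], \tag{30}
\]
since the multiplicative factor $U'(y)$ can be absorbed into the initial condition for $\Lambda$. As shown in Appendix A.4, the optimal stopping time for the problem is of the form
\[
\tau^* \equiv \inf\{t : \Lambda_t \ge \underline z^*\} \tag{31}
\]
for some positive constant $\underline z^*$. The optimal stopping time $\tau_y$ is then simply
\[
\tau_y \equiv \inf\{t : \Lambda_t \ge \underline z^* U'(y)\}, \tag{32}
\]
from which it follows that the stopping times $\tau_y$ decrease with $y$. Hence, for each $y < \underline L_0$, the optimal lower barrier $\underline L^*$ must satisfy the equality of events
\[
\{\underline L^*_t < y\} = \{\tau_y < t\} = \{\bar\Lambda_t > \underline z^* U'(y)\} = \Big\{I\Big(\frac{\bar\Lambda_t}{\underline z^*}\Big) < y\Big\}. \tag{33}
\]
It follows that the constructed $\underline L^*$ is non-increasing and may be expressed explicitly as
\[
\underline L^*_t = \min\Big\{\underline L_0, I\Big(\frac{\bar\Lambda_t}{\underline z^*}\Big)\Big\} = I\Big(\frac{\bar\Lambda_t}{\underline z^*} \vee U'(\underline L_0)\Big). \tag{34}
\]
This proves the form (5) of $\underline L^*$ in Theorem 3.1. Note that the choice of $\underline L^*$ does not depend upon the level of $\bar L$ except possibly through $\Lambda_0$, the initial Lagrange multiplier.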
4. Verifying the optimality
Following the arguments in Dybvig and Rogers (2013), the verification is straightforward.
PROOF of (b): Equation (7) is a consequence of duality theory (see, e.g., Rockafellar 1964). Equations (8) and (9) are consequences of equation (7).
Applying Itô's lemma to equation (9) and comparing the result with the wealth evolution equation, we get equation (11).
Q.E.D.
From equations (5) and (6) in Theorem 3.1, we see that $c^*_t > \bar L^*_t$ under the condition that $\Lambda_t < (1 - \bar K)\bar\eta_t$. The condition holds with positive probability if $\bar z^* < 1 - \bar K$, which is true whenever $\alpha > \alpha^*$, where $\alpha^*$ is a constant defined in Proposition A.2 in the Appendix. Similarly, we can derive a condition for $c^*_t < \underline L^*_t$.
4 The Case with CRRA Utility
In this section we consider the case with CRRA utility:
\[
U(c) \equiv \frac{c^{1-R}}{1 - R} \tag{35}
\]
for a constant $R > 0$, which is called the coefficient of relative risk aversion. When $R = 1$ the right-hand side of (35) is not well-defined. However, as $R$ approaches 1, the von Neumann-Morgenstern preference characterized by the above cardinal utility function approaches a preference represented by $U(c) = \log c$.
4.1 Derivation of the Solution
Notice that, from the proof of Theorem 3.1,
\[
J = Q = Q_0 + Q_1 + Q_2.
\]
We first compute $Q_0$, which is defined as
\[
Q_0(\Lambda, \bar L, \underline L) \equiv E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\tilde G(\Lambda_t, \bar L, \underline L)\,dt\right].
\]
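We know that
\[
\tilde G(\Lambda, \bar L, \underline L) = \begin{cases}
\dfrac{R}{1-R}(1 - \bar K)^{\frac{1}{R}}\Lambda^{\frac{R-1}{R}} + \bar K(1 - \tilde\alpha)U(\bar L) + \tilde\beta\underline K U(\underline L), & \Lambda \le (1 - \bar K)U'(\bar L),\\[1.5ex]
(1 - \tilde\alpha\bar K)U(\bar L) + \tilde\beta\underline K U(\underline L) - \Lambda\bar L, & (1 - \bar K)U'(\bar L) \le \Lambda \le U'(\bar L),\\[1.5ex]
\dfrac{R}{1-R}\Lambda^{\frac{R-1}{R}} - \tilde\alpha\bar K U(\bar L) + \tilde\beta\underline K U(\underline L), & U'(\bar L) \le \Lambda \le U'(\underline L),\\[1.5ex]
(1 + \tilde\beta\underline K)U(\underline L) - \tilde\alpha\bar K U(\bar L) - \Lambda\underline L, & U'(\underline L) \le \Lambda \le (1 + \underline K)U'(\underline L),\\[1.5ex]
\dfrac{R}{1-R}(1 + \underline K)^{\frac{1}{R}}\Lambda^{\frac{R-1}{R}} - \underline K(1 - \tilde\beta)U(\underline L) - \tilde\alpha\bar K U(\bar L), & (1 + \underline K)U'(\underline L) \le \Lambda.
\end{cases} \tag{36}
\]
By the Feynman-Kac theorem, $Q_0(\Lambda, \bar L, \underline L)$ satisfies the following linear differential equation:
\[
\mathcal{L}Q_0(\Lambda_t, \bar L, \underline L) + \tilde G(\Lambda_t, \bar L, \underline L) = 0,
\]
where
\[
\mathcal{L} \equiv \frac{1}{2}\kappa^2\Lambda^2\frac{d^2}{d\Lambda^2} + (\rho - r)\Lambda\frac{d}{d\Lambda} - \rho.
\]
Let the constants $N_+ > 1$ and $N_- < 0$ be the two roots of the quadratic equation
\[
\iota(x) \equiv \frac{1}{2}\kappa^2 x(x - 1) + (\rho - r)x - \rho = 0.
\]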
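Then,
\[
Q_0(\Lambda, \bar L, \underline L) = \begin{cases}
\dfrac{R}{1-R}(1 - \bar K)^{\frac{1}{R}}\gamma_M^{-1}\Lambda^{\frac{R-1}{R}} + \dfrac{\bar K(1 - \tilde\alpha)U(\bar L)}{\rho} + \dfrac{\tilde\beta\underline K U(\underline L)}{\rho} + A_1\Lambda^{N_+}, & \Lambda \le (1 - \bar K)\bar L^{-R},\\[2ex]
\dfrac{(1 - \tilde\alpha\bar K)U(\bar L)}{\rho} + \dfrac{\tilde\beta\underline K U(\underline L)}{\rho} - \dfrac{\bar L}{r}\Lambda + A_2\Lambda^{N_+} + B_2\Lambda^{N_-}, & (1 - \bar K)\bar L^{-R} \le \Lambda \le \bar L^{-R},\\[2ex]
\dfrac{R}{1-R}\gamma_M^{-1}\Lambda^{\frac{R-1}{R}} - \dfrac{\tilde\alpha\bar K U(\bar L)}{\rho} + \dfrac{\tilde\beta\underline K U(\underline L)}{\rho} + A_3\Lambda^{N_+} + B_3\Lambda^{N_-}, & \bar L^{-R} \le \Lambda \le \underline L^{-R},\\[2ex]
\dfrac{(1 + \tilde\beta\underline K)U(\underline L)}{\rho} - \dfrac{\tilde\alpha\bar K U(\bar L)}{\rho} - \dfrac{\underline L}{r}\Lambda + A_4\Lambda^{N_+} + B_4\Lambda^{N_-}, & \underline L^{-R} \le \Lambda \le (1 + \underline K)\underline L^{-R},\\[2ex]
\dfrac{R}{1-R}(1 + \underline K)^{\frac{1}{R}}\gamma_M^{-1}\Lambda^{\frac{R-1}{R}} - \dfrac{\underline K(1 - \tilde\beta)U(\underline L)}{\rho} - \dfrac{\tilde\alpha\bar K U(\bar L)}{\rho} + B_5\Lambda^{N_-}, & (1 + \underline K)\underline L^{-R} \le \Lambda,
\end{cases} \tag{37}
\]
where
\[
\gamma_M \equiv r + \frac{\rho - r}{R} + \frac{1}{2}\frac{R - 1}{R^2}\kappa^2.
\]
Here, $A_1$, $A_2$, $B_2$, $A_3$, $B_3$, $A_4$, $B_4$ and $B_5$ are constants, which can be determined by the $C^1$ conditions on the function $Q_0$ with respect to its first variable $\Lambda$ at the four breakpoints.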
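The factors $\gamma_M^{-1}$, $1/\rho$ and $1/r$ in (37) can be read off from how the operator $\mathcal{L}$ acts on powers of $\Lambda$. The short derivation below is a worked check added for the reader, not a step taken from the original text; it uses only the definitions of $\mathcal{L}$, $\iota$ and $\gamma_M$ given above.

```latex
% L acts on a power of Lambda by multiplication with iota(p):
\mathcal{L}\Lambda^{p}
  = \Big(\tfrac12\kappa^2 p(p-1) + (\rho - r)p - \rho\Big)\Lambda^{p}
  = \iota(p)\,\Lambda^{p}.
% Three special exponents appear in the source term \tilde G:
\iota(0) = -\rho, \qquad
\iota(1) = -r, \qquad
\iota\!\Big(\tfrac{R-1}{R}\Big)
  = -\tfrac12\kappa^2\tfrac{R-1}{R^2} + (\rho - r)\tfrac{R-1}{R} - \rho
  = -\gamma_M .
% Hence a source term c\,\Lambda^{p} in \tilde G is matched by the particular
% solution -c\,\Lambda^{p}/\iota(p): constants are divided by \rho, the linear
% term by r, and the power (R-1)/R by \gamma_M. The homogeneous solutions are
% \Lambda^{N_\pm} because \iota(N_\pm) = 0.
```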
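Now we compute $Q_1$, defined as
\[
Q_1(\Lambda, \bar L, \underline L) \equiv \int_{\bar L}^\infty U'(x)\sup_{\tau_x}E^{w,\bar L,\underline L}\big[e^{-\rho\tau_x}f(\Lambda_{\tau_x}/U'(x))\big]dx = \int_{\bar L}^\infty U'(x)\bar f\Big(\frac{\Lambda}{U'(x)}\Big)dx.
\]
As shown in the proof of Proposition A.2,
\[
\bar f(z) = \begin{cases}
f(\bar z^*)\Big(\dfrac{z}{\bar z^*}\Big)^{N_-}, & z \ge \bar z^*,\\[1.5ex]
f(z), & z \le \bar z^*,
\end{cases}
\]
for $f$ defined in (21), and an explicit form of $f$ is given in Proposition A.1 in the Appendix. Thus, $Q_1$ has the form
\[
Q_1 = \begin{cases}
\displaystyle\int_{\bar L}^\infty x^{-R}f(\bar z^*)\Big(\frac{\Lambda x^R}{\bar z^*}\Big)^{N_-}dx = -(\bar z^*)^{-N_-}f(\bar z^*)\frac{\bar L^{RN_- - R + 1}}{RN_- - R + 1}\Lambda^{N_-}, & \Lambda \ge \bar z^*\bar L^{-R},\\[2.5ex]
\displaystyle\int_{\bar L}^{(\bar z^*/\Lambda)^{1/R}}x^{-R}f(\Lambda x^R)\,dx + \int_{(\bar z^*/\Lambda)^{1/R}}^\infty x^{-R}f(\bar z^*)\Big(\frac{\Lambda x^R}{\bar z^*}\Big)^{N_-}dx &\\[2.5ex]
\displaystyle\quad = \int_{\bar L}^{(\bar z^*/\Lambda)^{1/R}}x^{-R}f(\Lambda x^R)\,dx - (\bar z^*)^{-N_-}f(\bar z^*)\frac{(\bar z^*)^{N_- - 1 + 1/R}}{RN_- - R + 1}\Lambda^{\frac{R-1}{R}}, & \Lambda \le \bar z^*\bar L^{-R}.
\end{cases} \tag{38}
\]
Notice that (40) reveals
\[
\int_{\bar L}^{(\bar z^*/\Lambda)^{1/R}}x^{-R}f(\Lambda x^R)\,dx = C_1\Lambda^{\frac{R-1}{R}} + C_2
\]
for some constants $C_1$ and $C_2$.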
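Next we find $Q_2$, which is defined as
\[
Q_2(\Lambda, \bar L, \underline L) \equiv \int_0^{\underline L}U'(y)\sup_{\tau_y}E^{w,\bar L,\underline L}\big[e^{-\rho\tau_y}g(\Lambda_{\tau_y}/U'(y))\big]dy = \int_0^{\underline L}U'(y)\underline g\Big(\frac{\Lambda}{U'(y)}\Big)dy.
\]
As shown in the proof of Proposition A.4,
\[
\underline g(z) = \begin{cases}
g(z), & z \ge \underline z^*,\\[1ex]
g(\underline z^*)\Big(\dfrac{z}{\underline z^*}\Big)^{N_+}, & z \le \underline z^*,
\end{cases}
\]
for $g$ defined in (28), which has an explicit form as seen in Proposition A.3 in the Appendix. Thus, $Q_2$ becomes
\[
Q_2 = \begin{cases}
\displaystyle\int_{(\underline z^*/\Lambda)^{1/R}}^{\underline L}y^{-R}g(\Lambda y^R)\,dy + \int_0^{(\underline z^*/\Lambda)^{1/R}}y^{-R}g(\underline z^*)\Big(\frac{\Lambda y^R}{\underline z^*}\Big)^{N_+}dy &\\[2.5ex]
\displaystyle\quad = \int_{(\underline z^*/\Lambda)^{1/R}}^{\underline L}y^{-R}g(\Lambda y^R)\,dy + (\underline z^*)^{-N_+}g(\underline z^*)\frac{(\underline z^*)^{N_+ - 1 + 1/R}}{RN_+ - R + 1}\Lambda^{\frac{R-1}{R}}, & \Lambda \ge \underline z^*\underline L^{-R},\\[2.5ex]
\displaystyle\int_0^{\underline L}y^{-R}g(\underline z^*)\Big(\frac{\Lambda y^R}{\underline z^*}\Big)^{N_+}dy = (\underline z^*)^{-N_+}g(\underline z^*)\frac{\underline L^{RN_+ - R + 1}}{RN_+ - R + 1}\Lambda^{N_+}, & \Lambda \le \underline z^*\underline L^{-R}.
\end{cases} \tag{39}
\]
Notice that we can obtain
\[
\int_{(\underline z^*/\Lambda)^{1/R}}^{\underline L}y^{-R}g(\Lambda y^R)\,dy = D_1\Lambda^{\frac{R-1}{R}} + D_2
\]
for some constants $D_1$ and $D_2$ from (48).
By equations (37), (38) and (39), we have an explicit solution $J(\Lambda, \bar L, \underline L)$ to our problem. Also, from equations (8) and (9), the optimal values $\lambda^*$ and $\Lambda^*_t$ of the Lagrange multiplier can be calculated. Furthermore, equation (11) yields the optimal investment in the risky asset at time $t$.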
4.2 Numerical Examples
In this subsection we explain properties of optimal consumption and investment by using numerical examples. Throughout this subsection, we use the following parameter values as the baseline case: $r = 3.17\%$, $\mu = 9.57\%$, $\sigma = 20.81\%$ for the parameters describing the financial market, and $\rho = r$, $\bar K = 0.5$, $\underline K = \bar K$, $\alpha = 20$, $\beta = \alpha$, $R = 2$, $w = 100$, $\bar L = w\cdot\gamma_M + 5$, $\underline L = w\cdot\gamma_M - 2$ for the parameters related to the individual's characteristics. Here $\gamma_M$ is the optimal consumption-to-wealth ratio in Merton's (1971) model, which equals 4.35% in the baseline case. We have chosen the parameters to match historical data for the US financial market: $r = 3.17\%$ is the average rate from rolling over 1-month Treasury bills over the period 1926 to 2009 (source: Table 5.2 of Bodie et al. 2011), and $\mu = 9.57\%$, $\sigma = 20.81\%$ are the average return and volatility of US large stocks over the period 1926 to 2009 (source: Table 5.3 of Bodie et al. 2011).
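The derived baseline quantities can be reproduced directly from the stated parameters. The sketch below computes the market price of risk, the Merton consumption-to-wealth ratio, the two roots of the quadratic ι(x) = 0 from Section 4.1, and the two initial barriers; the variable names are illustrative choices.

```python
import math

# Baseline parameters as stated in the text
mu, r, sigma = 0.0957, 0.0317, 0.2081
rho, R, w = r, 2.0, 100.0

kappa = (mu - r) / sigma                                   # market price of risk, (4)
gamma_M = r + (rho - r) / R + 0.5 * (R - 1.0) / R ** 2 * kappa ** 2

# Roots of iota(x) = 0.5*kappa^2*x*(x-1) + (rho - r)*x - rho = 0
a = 0.5 * kappa ** 2
b = rho - r - a
disc = math.sqrt(b * b + 4.0 * a * rho)
N_plus = (-b + disc) / (2.0 * a)
N_minus = (-b - disc) / (2.0 * a)

L_up = w * gamma_M + 5.0        # baseline upper barrier
L_lo = w * gamma_M - 2.0        # baseline lower barrier
merton_stock_share = (mu - r) / (R * sigma ** 2)   # Merton's constant portfolio weight
```

With these inputs the Merton ratio rounds to the 4.35% quoted in the text, so the baseline comfortable zone runs from roughly 2.35 to 9.35 around a Merton consumption level of about 4.35.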
4.2.1 Relative Risk Aversion
The agent's coefficient of relative risk aversion (RRA) can be defined by using the value function:
\[
\Gamma \equiv -\Big(w\cdot\frac{\partial^2 V}{\partial w^2}\Big)\Big/\Big(\frac{\partial V}{\partial w}\Big).
\]
$\Gamma = R$ if the comfortable zone is the whole space.
Figure 1 shows the RRA $\Gamma$ for various levels of initial wealth $w$ under the baseline parameters. Note that the RRA in our model is larger than $R$. That is, the existence of a comfortable zone of consumption makes the agent more risk averse than in its absence. The figure is U-shaped: the more extreme (either larger or smaller) the agent's wealth, the larger the risk aversion he/she exhibits. If wealth becomes either large or small, then the optimal consumption is near or beyond the boundary and the agent becomes more risk averse. This implies that the agent's risk aversion is time-varying as his/her wealth changes over time.
[Insert Figure 1 here.]
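The statement that Γ = R when the comfortable zone is the whole space can be checked numerically from the dual side. The sketch below is an illustration under assumptions: it takes the barrier-free dual value to be the power term of the middle branch of (37), and it uses the identity Γ = -J'(Λ)/(Λ J''(Λ)), which follows from V'(w) = Λ and w = -J'(Λ) in (7) and (8) but is not written out in the text. The helper names and the evaluation point are arbitrary.

```python
mu, r, sigma, rho, R = 0.0957, 0.0317, 0.2081, 0.0317, 2.0
kappa = (mu - r) / sigma
gamma_M = r + (rho - r) / R + 0.5 * (R - 1.0) / R ** 2 * kappa ** 2

def J_no_barrier(Lam):
    """Assumed barrier-free dual value: R/(1-R) * Lam^((R-1)/R) / gamma_M."""
    return R / (1.0 - R) * Lam ** ((R - 1.0) / R) / gamma_M

def d1(F, x, h):
    """Central first difference."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

def d2(F, x, h):
    """Central second difference."""
    return (F(x + h) - 2.0 * F(x) + F(x - h)) / h ** 2

Lam = 0.05                      # arbitrary evaluation point for the multiplier
h = 1e-4 * Lam
J1, J2 = d1(J_no_barrier, Lam, h), d2(J_no_barrier, Lam, h)

wealth = -J1                                # equation (9)
stock = kappa / sigma * Lam * J2            # equation (11)
Gamma = -J1 / (Lam * J2)                    # RRA expressed through the dual value
consumption = Lam ** (-1.0 / R)             # I(Lam) inside the comfortable zone
```

In this limiting case the stock share `stock / wealth` equals κ/(σΓ), which reduces to Merton's (µ - r)/(Rσ²), and `consumption / wealth` equals γ_M.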
4.2.2 Optimal Stockholdings
Figures 2 and 3 give the optimal portfolio proportion as a function of the level of a boundary. Here we take extreme cases and assume $\underline L = 0$ for Figure 2 and $\bar L = \infty$ for Figure 3. Thus, there exists only one barrier in these examples.
The optimal stock-to-wealth ratio in our model is smaller than the one in Merton's (1971) model. The agent in our model is more risk averse than an agent in Merton's, due to the existence of the comfortable zone. Thus, the ratio is smaller in our model. Moreover, the ratio in our model gets closer from below to the one in Merton's as $\bar L$ ($\underline L$) gets larger (smaller, respectively). This is consistent with the fact that our model converges to Merton's as $\underline L \to 0$ and $\bar L \to \infty$.
[Insert Figure 2 and 3 here.]
Figures 4 and 5 show the optimal stock-to-wealth ratio for the benchmark case. They show the same patterns as in Figures 2 and 3, respectively. If we compare Figure 4 (Figure 5) with Figure 2 (Figure 3), we see that the optimal stock-to-wealth ratio decreases as $\underline L$ ($\bar L$) increases (decreases, respectively). This implies that the agent in our model is likely to invest more in the risky asset if he/she has a wider comfortable zone.
[Insert Figure 4 and 5 here.]
4.2.3 Optimal Consumption
Figures 6 and 7 show the consumption-to-wealth ratio as a function of a barrier. Here we take the extreme cases $\underline L = 0$ and $\bar L = \infty$, so that there exists only one barrier.
For the case where $\underline L = 0$ ($\bar L = \infty$), the optimal consumption-to-wealth ratio is larger (smaller) than in Merton's (1971) model. It approaches the ratio in Merton's model from above (below) as $\bar L$ ($\underline L$) gets larger (smaller, respectively). Intuitively, in the presence of the upper (lower, resp.) barrier the agent consumes at a level higher (lower, resp.) than the optimal level in its absence, in anticipation of the possibility of reaching the boundary.
[Insert Figure 6 and 7 here.]
Figures 8 and 9 show the optimal consumption-to-wealth ratio for the benchmark case. Both graphs of the ratio intersect the so-called Merton line, the optimal consumption-to-wealth ratio of the Merton model. Comparing Figure 8 (Figure 9) with Figure 6 (Figure 7), we find that the optimal consumption-to-wealth ratio decreases (increases) as $\underline L$ ($\bar L$) increases (decreases, respectively). This implies that an individual is likely to consume more if he/she has a wider comfortable zone.
[Insert Figure 8 and 9 here.]
5 Conclusion
This paper has proposed a utility model in which agents require effort to learn how to consume effectively. There is a comfortable zone of consumption, and an agent suffers a utility loss if he/she consumes at a level outside the comfortable zone. The agent can expand the comfortable zone by paying learning costs. Two cases emerge as optimal behavior: the agent either consumes beyond the comfortable zone while exerting relatively little effort to learn, or always consumes within the zone and exerts a large amount of effort to learn. Testing the implications of the model empirically will be an interesting topic for future research.
Appendix
A Solution of the Optimal Stopping Problems
A.1 Explicit form of f
We present here an explicit form of $f$, which was defined as
\[
f(z) \equiv E^{w,\bar L,\underline L}\left[\int_0^\infty e^{-\rho t}\{-\tilde\alpha\bar K + ((1 - \Lambda_t)^+ \wedge \bar K)\}dt \,\Big|\, \Lambda_0 = z\right].
\]
Proposition A.1 The function $f$ has the explicit form
\[
f(z) = \begin{cases}
(1 - \tilde\alpha)\bar K\rho^{-1} + A^f_1 z^{N_+}, & 0 \le z \le 1 - \bar K,\\[1ex]
(1 - \tilde\alpha\bar K)\rho^{-1} - r^{-1}z + A^f_2 z^{N_+} + B^f_2 z^{N_-}, & 1 - \bar K \le z \le 1,\\[1ex]
-\tilde\alpha\bar K\rho^{-1} + B^f_3 z^{N_-}, & z \ge 1,
\end{cases} \tag{40}
\]
and the four constants $A^f_1$, $A^f_2$, $B^f_2$, and $B^f_3$ satisfy
\[
r\rho(N_+ - N_-)\begin{pmatrix}A^f_1\\ A^f_2\\ B^f_2\\ B^f_3\end{pmatrix} = \begin{pmatrix}
(1 - (1 - \bar K)^{1-N_+})(rN_- + \rho(1 - N_-))\\
rN_- + \rho(1 - N_-)\\
-(1 - \bar K)^{1-N_-}(rN_+ + \rho(1 - N_+))\\
(1 - (1 - \bar K)^{1-N_-})(rN_+ + \rho(1 - N_+))
\end{pmatrix}. \tag{41}
\]
PROOF. A slight modification of Proposition 1 in Dybvig and Rogers (2013) completes the proof. Note that using Itô's formula we can obtain a differential equation for $f$:
\[
-\tilde\alpha\bar K + ((1 - z)^+ \wedge \bar K) - \rho f(z) + \frac{1}{2}\kappa^2 z^2 f''(z) + (\rho - r)zf'(z) = 0.
\]
The solution has the explicit form (40) since $f$ must be bounded at zero and at infinity. Moreover, we can derive the relationship (41) by using the $C^1$ conditions on $f(z)$ at $z = 1 - \bar K$ and $z = 1$.
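The coefficients in (41) can be checked against the $C^1$ conditions numerically. The following sketch evaluates the piecewise form (40) with the coefficients from (41) under the baseline parameters and measures the jump in value and in slope at the two breakpoints. It is a verification aid with hypothetical function names, not code from the paper; `a_t` stands for α̃ = ρα.

```python
import math

def roots(kappa, rho, r):
    """Roots N_+ > 1 and N_- < 0 of 0.5*kappa^2*x*(x-1) + (rho-r)*x - rho = 0."""
    a = 0.5 * kappa ** 2
    b = rho - r - a
    disc = math.sqrt(b * b + 4.0 * a * rho)
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)

def f_coeffs(K, rho, r, Np, Nm):
    """Constants A1, A2, B2, B3 from (41)."""
    d = r * rho * (Np - Nm)
    up = r * Nm + rho * (1.0 - Nm)
    dn = r * Np + rho * (1.0 - Np)
    return ((1.0 - (1.0 - K) ** (1.0 - Np)) * up / d,
            up / d,
            -(1.0 - K) ** (1.0 - Nm) * dn / d,
            (1.0 - (1.0 - K) ** (1.0 - Nm)) * dn / d)

def f(z, K, a_t, rho, r, Np, Nm):
    """Piecewise form (40)."""
    A1, A2, B2, B3 = f_coeffs(K, rho, r, Np, Nm)
    if z <= 1.0 - K:
        return (1.0 - a_t) * K / rho + A1 * z ** Np
    if z <= 1.0:
        return (1.0 - a_t * K) / rho - z / r + A2 * z ** Np + B2 * z ** Nm
    return -a_t * K / rho + B3 * z ** Nm

def c1_gap(z0, h, **kw):
    """Change in value across z0 and difference of one-sided slopes (both near 0 if C^1)."""
    left, mid, right = f(z0 - h, **kw), f(z0, **kw), f(z0 + h, **kw)
    return right - left, (right - mid) / h - (mid - left) / h

mu, r, sigma, rho, K, alpha = 0.0957, 0.0317, 0.2081, 0.0317, 0.5, 20.0
Np, Nm = roots((mu - r) / sigma, rho, r)
kw = dict(K=K, a_t=rho * alpha, rho=rho, r=r, Np=Np, Nm=Nm)
gap_low = c1_gap(1.0 - K, 1e-6, **kw)
gap_one = c1_gap(1.0, 1e-6, **kw)
```

Both gaps shrink with the step size `h`, which is what one expects if the value and first derivative match across the breakpoints; a wrong coefficient would leave a gap that does not vanish as `h` decreases.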
A.2
Solving the Optimal Stopping Problem for f¯
Recall the optimal stopping problem for f¯ is
]
[
f¯(z) ≡ sup E w,L,L e−ρτ f (Λτ )Λ0 = z ,
(42)
τ
then the following is true.
Proposition A.2 The optimal stopping problem (42) has a solution which is given by taking
\[
\tau = \tau^* \equiv \inf\{t : \Lambda_t \le z^*\},
\]
where $z^*$ is the unique solution to $H(z) = 0$, where H is defined as
\[
r\rho H(z) =
\begin{cases}
\left(1-(1-K)^{1-N_+}\right)\left(rN_- + \rho(1-N_-)\right)z^{N_+} - rN_-(1-\tilde{\alpha})K, & 0 \le z \le 1-K, \\
\left(rN_- + \rho(1-N_-)\right)z^{N_+} + \rho(N_- - 1)z - rN_-(1-\tilde{\alpha}K), & 1-K \le z \le 1, \\
rN_-\tilde{\alpha}K, & z \ge 1.
\end{cases}
\tag{43}
\]
Moreover, $z^* > 1-K$ if and only if $\alpha < \alpha^*$, where
\[
\alpha^* \equiv \frac{\left(N_-(\rho - r) - \rho\right)(1-K)^{N_+} + \rho(1-N_-)(1-K) + rN_-}{r\rho K N_-}.
\tag{44}
\]
PROOF. Using the same argument as in Proposition 2 of Dybvig and Rogers (2013), we know
that $\bar{f}(z) = f(z)$ for $z \le z^*$ and that $\bar{f}(z)$ satisfies the differential equation
\[
\frac{1}{2}\kappa^2 z^2 \bar{f}''(z) + (\rho - r) z \bar{f}'(z) - \rho \bar{f}(z) = 0
\]
for $z \ge z^*$.
Since $\bar{f}(z)$ must be bounded, $\bar{f}(z) = Az^{N_-}$ for $z \ge z^*$ and a suitable constant A, and
the continuity of $\bar{f}(z)$ at $z = z^*$ tells us that $A = (z^*)^{-N_-} f(z^*)$; consequently,
$\bar{f}(z) = f(z^*)\left(z/z^*\right)^{N_-}$ for $z \ge z^*$. If we hold z fixed, then such $\bar{f}$ attains its maximum
if we choose $z^*$ to maximize $f(z^*)\left(1/z^*\right)^{N_-}$; namely, we get the maximum at the point where
\[
H(z) \equiv -N_- f(z) + z f'(z) = 0,
\tag{45}
\]
and we can easily check that H satisfies (43).
Using the fact that
\[
rN_\pm + \rho(1-N_\pm) = \frac{1}{2}\kappa^2 N_\pm(N_\pm - 1),
\tag{46}
\]
we know that H is a decreasing function in the interval $[0, 1-K]$ with $H(0) > 0$, and that it is
a negative constant function in the interval $[1, \infty)$. Also, we can easily verify that it is
a strictly decreasing function in the middle interval if we use the fact that
\[
N_+ N_- = -\frac{2\rho}{\kappa^2}.
\tag{47}
\]
Thus the uniqueness of $z^*$ is proven. The last statement is true since $H(1-K) > 0$ if and
only if $\alpha < \alpha^*$.
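Because H in (43) is continuous and decreasing, with $H(0) > 0$ and $H(1) < 0$ when $0 < \tilde{\alpha} < 1$, the threshold $z^*$ can be located by bisection. The Python sketch below is a hedged illustration, not the authors' procedure. It assumes that $N_\pm$ are the roots of $\frac{1}{2}\kappa^2 N(N-1) + (\rho - r)N - \rho = 0$ (consistent with (46) and (47)), it works directly with $\tilde{\alpha}$ as it appears in (43), and the parameter values and the names `roots`, `H_f`, and `solve_z_star` are hypothetical.

```python
import math


def roots(r, rho, kappa):
    """Return (N+, N-), assumed roots of 0.5*kappa^2*N*(N-1) + (rho-r)*N - rho = 0."""
    a = 0.5 * kappa ** 2
    b = rho - r - a
    disc = math.sqrt(b * b + 4.0 * a * rho)
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)


def H_f(z, r, rho, kappa, K, alpha_t):
    """H(z) from (43); alpha_t stands for alpha-tilde."""
    n_plus, n_minus = roots(r, rho, kappa)
    c_minus = r * n_minus + rho * (1.0 - n_minus)
    if z <= 1.0 - K:
        val = ((1.0 - (1.0 - K) ** (1.0 - n_plus)) * c_minus * z ** n_plus
               - r * n_minus * (1.0 - alpha_t) * K)
    elif z <= 1.0:
        val = (c_minus * z ** n_plus + rho * (n_minus - 1.0) * z
               - r * n_minus * (1.0 - alpha_t * K))
    else:
        val = r * n_minus * alpha_t * K
    return val / (r * rho)


def solve_z_star(r, rho, kappa, K, alpha_t, iterations=200):
    """Bisection on [0, 1]; requires 0 < alpha_t < 1 so that H changes sign."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if H_f(mid, r, rho, kappa, K, alpha_t) > 0.0:
            lo = mid  # H is still positive, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    args = (0.02, 0.05, 0.3, 0.5, 0.3)
    z_star = solve_z_star(*args)
    print("z* =", z_star, " H(z*) =", H_f(z_star, *args))
    print("z* > 1 - K:", z_star > 1.0 - args[3])
```

Bisection is used here only because it needs nothing beyond the sign change and monotonicity established in the proof; any bracketing root finder would serve equally well.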
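A.3 Explicit form of g
We present an explicit form of g defined as
\[
g(z) \equiv E^{w,\underline{L},\overline{L}}\left[\int_0^\infty e^{-\rho t}\left\{-\tilde{\beta}K + \left((\Lambda_t - 1)^+ \wedge K\right)\right\}dt \,\middle|\, \Lambda_0 = z\right].
\]
Using the same argument as in Proposition A.1, we can obtain the following.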
Proposition A.3 The function g has an explicit form of
\[
g(z) =
\begin{cases}
-\tilde{\beta}K\rho^{-1} + A_1^g z^{N_+}, & 0 \le z \le 1, \\
-(1+\tilde{\beta}K)\rho^{-1} + r^{-1}z + A_2^g z^{N_+} + B_2^g z^{N_-}, & 1 \le z \le 1+K, \\
(1-\tilde{\beta})K\rho^{-1} + B_3^g z^{N_-}, & z \ge 1+K,
\end{cases}
\tag{48}
\]
where $N_+ > 1$ and $N_- < 0$ are described in Proposition A.1 and the four constants $A_1^g$, $A_2^g$, $B_2^g$,
and $B_3^g$ satisfy
\[
r\rho(N_+ - N_-)
\begin{pmatrix} A_1^g \\ A_2^g \\ B_2^g \\ B_3^g \end{pmatrix}
=
\begin{pmatrix}
\left(1-(1+K)^{1-N_+}\right)\left(rN_- - \rho(N_- - 1)\right) \\
(1+K)^{1-N_+}\left(-rN_- + \rho(N_- - 1)\right) \\
rN_+ + \rho(1-N_+) \\
\left(1-(1+K)^{1-N_-}\right)\left(rN_+ + \rho(1-N_+)\right)
\end{pmatrix}.
\tag{49}
\]
A.4 Solving the Optimal Stopping Problem for $\bar{g}$
The optimal stopping problem for $\bar{g}$ is
\[
\bar{g}(z) \equiv \sup_{\tau} E^{w,\underline{L},\overline{L}}\left[e^{-\rho\tau} g(\Lambda_\tau) \,\middle|\, \Lambda_0 = z\right].
\tag{50}
\]
Proposition A.4 The optimal stopping problem (50) has a solution which is given by taking
\[
\tau = \tau^* \equiv \inf\{t : \Lambda_t \ge z^*\},
\]
where $z^*$ is the unique solution to $H(z) = 0$, where H satisfies
\[
r\rho H(z) =
\begin{cases}
rN_+\tilde{\beta}K, & 0 \le z \le 1, \\
-\left(rN_+ + \rho(1-N_+)\right)z^{N_-} - \rho(N_+ - 1)z + rN_+(1+\tilde{\beta}K), & 1 \le z \le 1+K, \\
-\left(1-(1+K)^{1-N_-}\right)\left(rN_+ + \rho(1-N_+)\right)z^{N_-} - rN_+(1-\tilde{\beta})K, & z \ge 1+K.
\end{cases}
\tag{51}
\]
Moreover, $z^* < 1+K$ if and only if $\beta < \beta^*$, where
\[
\beta^* \equiv \frac{\left(N_+(r - \rho) + \rho\right)(1+K)^{N_-} + \rho(N_+ - 1)(1+K) - rN_+}{r\rho K N_+}.
\tag{52}
\]
PROOF. Using an argument similar to that of Proposition 2 of Dybvig and Rogers (2013), we
know that $\bar{g}(z) = g(z)$ for $z \ge z^*$ and that $\bar{g}(z)$ satisfies the differential equation
\[
\frac{1}{2}\kappa^2 z^2 \bar{g}''(z) + (\rho - r) z \bar{g}'(z) - \rho \bar{g}(z) = 0
\]
for $z \le z^*$.
Since $\bar{g}(z)$ must be bounded, $\bar{g}(z) = Az^{N_+}$ for $z \le z^*$ and a suitable constant A, and the
continuity of $\bar{g}(z)$ at $z = z^*$ tells us that $A = (z^*)^{-N_+} g(z^*)$; consequently, $\bar{g}(z) = g(z^*)\left(z/z^*\right)^{N_+}$
for $z \le z^*$. If we hold z fixed, then such $\bar{g}$ attains its maximum if we choose $z^*$ to maximize
$g(z^*)\left(1/z^*\right)^{N_+}$; namely, we get the maximum at the point where
\[
H(z) \equiv -N_+ g(z) + z g'(z) = 0,
\]
and we can easily check that H satisfies (51).
Note that H is a positive constant function in the interval $[0, 1]$, a decreasing function in
the interval $[1+K, \infty)$ with $\lim_{z\to\infty} H(z) < 0$ (use (46) for the proof), and a strictly decreasing
function in the middle interval (use (47) for the proof). Thus the uniqueness of $z^*$ is clear.
The last statement is true since $H(1+K) < 0$ if and only if $\beta < \beta^*$.
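The mirror-image threshold for $\bar{g}$ can be located in the same way, except that the root now lies to the right of $z = 1$ and the search interval has no natural upper end. The Python sketch below is illustrative and rests on stated assumptions: $N_\pm$ are taken to be the roots of $\frac{1}{2}\kappa^2 N(N-1) + (\rho - r)N - \rho = 0$ (consistent with (46) and (47)), $\tilde{\beta}$ is used exactly as it appears in (51), and the parameter values and the names `roots`, `H_g`, and `solve_z_star_g` are hypothetical. The upper bracket is found by doubling, which terminates when $0 < \tilde{\beta} < 1$ because H then has a negative limit.

```python
import math


def roots(r, rho, kappa):
    """Return (N+, N-), assumed roots of 0.5*kappa^2*N*(N-1) + (rho-r)*N - rho = 0."""
    a = 0.5 * kappa ** 2
    b = rho - r - a
    disc = math.sqrt(b * b + 4.0 * a * rho)
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)


def H_g(z, r, rho, kappa, K, beta_t):
    """H(z) from (51); beta_t stands for beta-tilde."""
    n_plus, n_minus = roots(r, rho, kappa)
    c_plus = r * n_plus + rho * (1.0 - n_plus)
    if z <= 1.0:
        val = r * n_plus * beta_t * K
    elif z <= 1.0 + K:
        val = (-c_plus * z ** n_minus - rho * (n_plus - 1.0) * z
               + r * n_plus * (1.0 + beta_t * K))
    else:
        val = (-(1.0 - (1.0 + K) ** (1.0 - n_minus)) * c_plus * z ** n_minus
               - r * n_plus * (1.0 - beta_t) * K)
    return val / (r * rho)


def solve_z_star_g(r, rho, kappa, K, beta_t, iterations=200):
    """Bracket the root by doubling, then bisect; requires 0 < beta_t < 1."""
    lo, hi = 1.0, 2.0
    while H_g(hi, r, rho, kappa, K, beta_t) > 0.0:
        hi *= 2.0  # H is positive at 1 and eventually negative
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if H_g(mid, r, rho, kappa, K, beta_t) > 0.0:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    args = (0.02, 0.05, 0.3, 0.5, 0.3)
    z_star = solve_z_star_g(*args)
    print("z* =", z_star, " H(z*) =", H_g(z_star, *args))
    print("z* < 1 + K:", z_star < 1.0 + args[3])
```

Comparing the sign of `H_g` at $1+K$ with the location of the computed root gives a quick numerical check of the last statement of Proposition A.4 for a given $\tilde{\beta}$.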
References
Zvi Bodie, Alex Kane, and Alan J. Marcus, 2011, Investments and Portfolio Management:
Global Edition, McGraw-Hill Companies, Inc.
Christopher D. Carroll, 2002, “Portfolios of the Rich” in Luigi Guiso, Michael Haliassos, and
Tullio Jappelli, ed.: Household Portfolios (The MIT Press).
John Y. Campbell and John H. Cochrane, 2000, “Explaining the Poor Performance of Consumption-based Asset Pricing Models”, The Journal of Finance, 55(6), 2863-2878.
George M. Constantinides, 1990, “Habit Formation: A Resolution of the Equity Premium
Puzzle”, The Journal of Political Economy, 98(3), 519-543.
James S. Duesenberry, 1949, Income, Saving, and the Theory of Consumer Behavior, Harvard
University Press.
Philip H. Dybvig, 1995, “Dusenberry’s Ratcheting of Consumption: Optimal Dynamic Consumption and Investment Given Intolerance for any Decline in Standard of Living”, The
Review of Economic Studies, 62(2), 287-313.
Philip H. Dybvig and Chi-fu Huang, 1988, “Nonnegative Wealth, Absence of Arbitrage, and Feasible
Consumption Plans”, The Review of Financial Studies, 1(4), 377-401.
Philip H. Dybvig and L.C.G. Rogers, 2013, “High Hopes and Disappointment”, working paper.
Milton Friedman, 1957, A Theory of the Consumption Function, Princeton University Press.
Lars P. Hansen and Kenneth Singleton, 1982, “Generalized Instrumental Variables Estimation of Nonlinear Rational Expectations Models”, Econometrica, 50, 1269-1286.
Godfrey Keller and Sven Rady, 1999, “Optimal Experimentation in a Changing Environment”, The Review of Economic Studies, 66(3), 475-507.
Rajnish Mehra and Edward Prescott, 1985, “The Equity Premium Puzzle”, Journal of Monetary Economics, 15, 145-161.
Robert C. Merton, 1971, “Optimum Consumption and Portfolio Rules in a Continuous-Time
Model”, Journal of Economic Theory, 3, 373-413.
J. Michael Harrison, 1985, Brownian Motion and Stochastic Flow Systems, John Wiley &
Sons.
Franco Modigliani and Richard Brumberg, 1955, “Utility Analysis and the Consumption
Function” in K. Kurihara, ed.: Post Keynesian Economics (G. Allen, London).
R. Tyrrell Rockafellar, 1964, “Duality Theorems for Convex Functions”, Bulletin of the American Mathematical Society, 70, 189-192.
Philippe Weil, 1989, “The Equity Premium Puzzle and the Risk-free Rate Puzzle”, Journal
of Monetary Economics, 24(3), 401-421.
Figure 1: The relative risk aversion Γ as a function of initial wealth w for the benchmark case
Figure 2: Optimal stock-to-wealth ratio as a function of L for the case where L = 0
Figure 3: Optimal stock-to-wealth ratio as a function of L for the case where L = ∞
Figure 4: Optimal stock-to-wealth ratio as a function of L for the baseline case
Figure 5: Optimal stock-to-wealth ratio as a function of L for the baseline case
Figure 6: Optimal consumption-to-wealth ratio as a function of L for the case where L = 0
Figure 7: Optimal consumption-to-wealth ratio as a function of L for the case where L = ∞
Figure 8: Optimal consumption-to-wealth ratio as a function of L for the baseline case
Figure 9: Optimal consumption-to-wealth ratio as a function of L for the baseline case