RECURSION - Research School of Computer Science

RECURSION
Week 6 Laboratory for Introduction to Programming and Algorithms
Uwe R. Zimmer based on material by James Barker
Pre-Laboratory Checklist
✓ Skills: You can write any conditional expression.
✓ Knowledge: You know how conditions can be based upon
pattern matching, boolean guards, or a mixture of both.
✓ You have read the laboratory text below.
Objectives
In this lab, you will learn how to understand, design and implement recursive functions. You will
make use of the submission testing and peer reviewing system frequently this week.
Interlude: Recursion
So far, you have learnt more about the structure of lists, and written some functions that use
the first few elements of a list. The last exercise from last lab already went a bit further and
evaluated a whole list of tokens without making much of a point of it. Yet this time you’ll learn
in more general terms to go through a list, without making any kind of assumptions about how
many elements there are. A great first example is the sum function, which returns the total of
all elements in a (numeric) list. Here is an alternative definition of the sum function over lists of
Integers (the ' at the end of sum' is needed to distinguish our function from the built-in function sum):
sum' :: [Integer] -> Integer
sum' list =
   case list of
      []    -> 0
      x: xs -> x + sum' xs
Before you look at what’s returned by this function, check out the patterns which this function
attempts to match the input against. Those are the two most generic patterns for lists and the
two of them together will match against any list. Convince yourself that this is the case. Go
back to the interlude about list patterns in the previous lab, if in doubt.
If you’re paying attention, you will have also noticed that the sum' function is defined in terms
of itself. This is called a recursive definition, as it uses recursion: expressing something in
terms of itself. At first glance, this probably seems confusing and unintuitive, but it turns out
that recursion is an extremely powerful idea. Many formal definitions of computation rely heavily on the idea of recursion, and it is fundamental to how Haskell (and all other functional or
logical programming languages) operate. Programming languages of different paradigms (like
1 | ANU College of Engineering and Computer Science
March 2015
imperative languages) will offer other forms of iteration as well, but almost all of them also offer
recursion [1]. The chances that you will ever use a programming language which does not support
recursion are slim.
So what does the above function definition actually say? Well, if the input list is empty, then the
sum of that list is 0 – this makes reasonable sense. If the input list is non-empty, then the sum
of that list is calculated by adding the head element to the sum of the tail, i.e. the rest of the list.
Although it makes sense, it’s instructive to repeatedly expand an expression involving sum' to
see what happens:
sum' [1, 2, 3]
= 1 + sum' [2, 3]
= 1 + 2 + sum' [3]
= 1 + 2 + 3 + sum' []
= 1 + 2 + 3 + 0
= 1 + 2 + 3
= 1 + 5
= 6
You notice that the problem is transformed into a simpler expression which is only evaluated
after we found the simple case (or base case), here sum' [], which we can solve without recursing.
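To make the nesting in this expansion visible, the following sketch builds the fully expanded expression as text instead of evaluating it. The name sum_expansion is hypothetical and not part of the lab material; it uses the same two list patterns as sum'.

```haskell
-- Hypothetical helper (not part of the lab material):
-- writes out the expression which sum' would have to evaluate.
sum_expansion :: [Integer] -> String
sum_expansion list = case list of
   []    -> "0"                                                -- base case
   x: xs -> "(" ++ show x ++ " + " ++ sum_expansion xs ++ ")"  -- step case
```

For instance, sum_expansion [1, 2, 3] gives "(1 + (2 + (3 + 0)))": the innermost addition can only be carried out once the empty list has been reached.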
Here is another way of doing it, which does the calculations while progressing through the list.
As we did in lectures already, I also replaced Integer with a placeholder a which represents all
numerical types, just by adding the constraint (Num a) =>. More on this in the next lab; for
this lab you can just read a as “type standing in for any numerical type”:
sum'' :: (Num a) => [a] -> a
sum'' list = case list of
   []       -> 0
   [x]      -> x
   x: y: zs -> sum'' ((x + y): zs)
This would expand into the following sequence of expressions:
sum'' [1, 2, 3]
= sum'' [3, 3]
= sum'' [6]
= 6
As the function calculated intermediate results while going through the list, we already have the
final result when we reach the end of the list.
Certainly looks shorter, but did it actually make it faster? Type:
:set +s
in your GHCi environment to enable statistical displays with every evaluation. Now try to run
sum' [(1::Integer) .. 1000000] as well as sum'' [(1::Integer) .. 1000000]. Do it a few
times to see how reliable your measurements are. What do your measurements tell you?
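Before comparing timings, it may be worth confirming that both versions actually compute the same result. The following is a minimal sketch which repeats the two definitions from above so that it can be loaded on its own; the name both_agree is hypothetical, and the built-in sum serves as the reference.

```haskell
sum' :: [Integer] -> Integer
sum' list = case list of
   []    -> 0
   x: xs -> x + sum' xs

sum'' :: (Num a) => [a] -> a
sum'' list = case list of
   []       -> 0
   [x]      -> x
   x: y: zs -> sum'' ((x + y): zs)

-- Hypothetical check: True if both versions agree with the built-in sum.
both_agree :: [Integer] -> Bool
both_agree list = sum' list == sum list && sum'' list == sum list
```

Evaluating both_agree [1 .. 1000] in GHCi should give True; if it does not, the timings are not worth comparing yet.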
Sometimes it is not possible to use one of the existing parameters to accumulate a result, and
thus a function with an additional parameter is introduced as an internal helper function. This
might then look like this:
[1] The programming languages which do not support recursion are either early versions of some mainstream
languages (like Cobol in its original form from the ‘60s) or contemporary, specialized languages for high integrity
systems, which might limit or forbid recursion (as they also forbid while-loops and other “unbound” primitives).
sum''' :: (Num a) => [a] -> a
sum''' list = sum_with_extra_parameter 0 list
   where
      sum_with_extra_parameter accumulator shrinking_list = case shrinking_list of
         []    -> accumulator
         x: xs -> sum_with_extra_parameter (accumulator + x) xs
This would expand into the following sequence of expressions (sum_with_extra_parameter
replaced with swep):
sum''' [1, 2, 3]
= swep 0 [1, 2, 3]
= swep 1 [2, 3]
= swep 3 [3]
= swep 6 []
= 6
Did this last version improve or degrade performance? Run
sum''' [(1::Integer) .. 1000000]. Any better? Any worse?
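If you would like to watch the extra parameter at work, the sketch below records the value of the accumulator at every step instead of returning only the last one. The names accumulator_steps and steps are hypothetical; the structure mirrors the helper function of sum'''.

```haskell
-- Hypothetical variation of sum''': returns every intermediate accumulator value.
accumulator_steps :: (Num a) => [a] -> [a]
accumulator_steps list = steps 0 list
   where
      steps accumulator shrinking_list = case shrinking_list of
         []    -> [accumulator]                               -- final value
         x: xs -> accumulator: steps (accumulator + x) xs     -- remember, then step
```

For example, accumulator_steps [1, 2, 3] gives [0,1,3,6], which are exactly the first arguments of swep in the expansion of sum''' [1, 2, 3].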
Note that recursive functions do not have to involve lists at all. The canonical example of a simple recursive function is the factorial, i.e.
$$\forall n \in \mathbb{N}_0 : \quad n! = \prod_{k=1}^{n} k = n \cdot (n-1) \cdot (n-2) \cdots 2 \cdot 1$$
You can express factorials recursively:
$$\forall n \in \mathbb{N}_0 : \quad n! = \begin{cases} 1 & ;\ n = 0 \\ n \cdot (n-1)! & ;\ n > 0 \end{cases}$$
Which translates almost one-to-one into a recursive factorial function in Haskell:
import Integer_Subtypes
factorial :: Natural -> Natural
factorial n = case n of
   0 -> 1
   _ -> n * factorial (n - 1)
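Integer_Subtypes appears to be a module supplied with this course. If you want to try the function outside the lab environment, the following sketch assumes that the Natural type from Numeric.Natural (part of GHC's base library) is an acceptable stand-in; it is not the course's module, and its error messages will differ.

```haskell
import Numeric.Natural (Natural)  -- assumption: stand-in for the course's Integer_Subtypes

factorial :: Natural -> Natural
factorial n = case n of
   0 -> 1                          -- base case
   _ -> n * factorial (n - 1)      -- step case: one step closer to 0
```

With this in place, factorial 5 evaluates to 120.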
Always nice of course to see a beautifully working solution, but even more interesting is to understand what traps I just jumped over without telling you. So let’s look into what I might have
gotten wrong here:
Why do I include the case where n = 0? Simple: it tells the function where to stop recursing.
For example, imagine that I’d defined
forgot_base_case :: Natural -> Natural
forgot_base_case n = n * forgot_base_case (n - 1)
Let’s try a sample evaluation out:
forgot_base_case 3
= 3 * forgot_base_case (3 - 1)
= 3 * (2 * forgot_base_case (2 - 1))
= 3 * (2 * (1 * forgot_base_case (1 - 1)))
= 3 * (2 * (1 * (0 * forgot_base_case (0 - 1))))
*** Exception: 0 ´-´ 1: Out of range for Natural
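Without a base case, nothing tells the function where to stop. With an unbounded type such as
Integer this expansion would never end, and the computer would never be able to provide an answer. This
is called a non-terminating program … one of the possibilities when your computer freezes up
on you, while flooring the processors at the same time. (Here the Natural type happens to abort
the evaluation with a range exception as soon as the argument would drop below zero, but we still
get no answer.) However, including the case for n = 0
provides a point at which expansion ends, and the result can be computed: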
factorial 3
= 3 * factorial (3 - 1)
= 3 * (2 * factorial (2 - 1))
= 3 * (2 * (1 * factorial (1 - 1)))
= 3 * (2 * (1 * factorial 0))
= 3 * (2 * (1 * 1))
= 6
This case, n = 0, is called a base case. Now we have a terminating program, which is usually
what we want – but not always … try to think of programs which are not supposed to terminate.
The other case is called a step case. (If you studied proof by mathematical induction in high
school, these terms should be familiar to you. This is not at all accidental: recursion and induction are two sides of the same coin, and complement each other perfectly.) Yet, the step case
can drop the ball as well … let’s have a look:
stepping_on_the_spot :: Natural -> Natural
stepping_on_the_spot n = case n of
   0 -> 1
   _ -> n * stepping_on_the_spot n
What happens here?
stepping_on_the_spot 3
= 3 * stepping_on_the_spot 3
= 3 * (3 * stepping_on_the_spot 3)
= 3 * (3 * (3 * stepping_on_the_spot 3))
= ...
Again we end with a non-terminating program. What else can go bad? Have a look at this
function:
stepping_in_the_wrong_direction :: Natural -> Natural
stepping_in_the_wrong_direction n = case n of
   0 -> 1
   _ -> n * stepping_in_the_wrong_direction (n + 1)
I leave it to you to figure out what happens here. Yet those three traps are the most common
mistakes with recursive functions (or with any form of iteration really). Usually those traps are
not quite as obvious as they are here, yet knowing those problem cases intimately will help you
avoid them from the beginning.
Relating back to the sum''' example: Could we also calculate the factorial value while we
recurse rather than while we return from the base case? Have a look at this version, where we
accumulate the final answer in an extra parameter:
factorial' :: Natural -> Natural
factorial' n = factorial_in_parameter n 1
   where
      factorial_in_parameter :: Natural -> Natural -> Natural
      factorial_in_parameter x fac = case x of
         0 -> fac
         _ -> factorial_in_parameter (x - 1) (x * fac)
Do you expect this version to differ in performance, similar to what you measured for the different sum implementations? Give it a try!
When the final answer is already at hand when the base case is selected (meaning the base
case already returns the final answer), then such a recursive function is called tail-recursive.
Many compilers apply optimization if they detect this pattern (basically delete all the code and
memory which would be otherwise needed to “hand the result back” through all the recursive
stages which got us there), yet the practical effect will differ depending on a number of other
factors (which we need to address later). Your Haskell compiler optimises in multiple other ways
which may outperform tail-recursive structures.
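The difference between the two shapes can be seen side by side in the following sketch, which counts the elements of a list in both styles. The names count, count' and count_from are hypothetical and chosen only for this illustration.

```haskell
-- Not tail-recursive: the "1 +" still has to be done
-- after the recursive call has handed back its result.
count :: [a] -> Integer
count list = case list of
   []    -> 0
   _: xs -> 1 + count xs

-- Tail-recursive: the recursive call is the last thing the step case does,
-- and the base case returns the final answer directly.
count' :: [a] -> Integer
count' list = count_from 0 list
   where
      count_from so_far rest = case rest of
         []    -> so_far
         _: xs -> count_from (so_far + 1) xs
```

Both functions give the same results, for instance 5 for the input "hello"; whether one of them is faster on your machine is something to measure rather than to assume.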
To be of any use at all, every recursive function must include a base case – otherwise, its expansion will never terminate – and a step case which takes it one step closer to the base case
with every function call. Return to the sum' function above. Here, the base case is the case
when the input is an empty list. The sole step case is the case where the input is a non-empty
list.
Recursive functions are a universal concept which you will find everywhere, once you recognize them as recursive. Try the following function and predict the result before you actually run it
with happy_creatures Man:
data Creatures = Salmon | Puffin | Fox | Bear | Man
   deriving (Eq, Enum, Show)
happy_creatures :: Creatures -> String
happy_creatures creature = case creature of
   Salmon -> "the " ++ (show creature) ++ " who is always happy"
   _      -> "the " ++ (show creature) ++ " who dreams of eating "
             ++ happy_creatures (pred creature)
             ++ " which makes the " ++ (show creature) ++ " happy"
We have only scratched the surface of what can be done with recursive functions. In labs and
assignments to come, you will learn to make increasingly complex uses of this technique.
Exercise 1: List Product
Write a function product' :: (Num a) => [a] -> a, which multiplies all elements inside a list
such that for instance:
product' [(1 :: Integer), 2, 3] = 6
product' [(3 :: Integer)] = 3
product' [] = 1
Submit your solution under “Lab 6 List Product” on the SubmissionApp
(http://cs.anu.edu.au/SubmissionApp) and don’t forget the apostrophe at the end of your function name.
Exercise 2: Lift them up
Write a function convert_to_upper_case :: String -> String. This function should take an
input of type String, and return the same String with every character converted to upper-case.
The technique you use to do this will be similar, but not identical, to that used for sum'. Hint:
you will need to use both the attach operator “:” and the toUpper function in the Data.Char
module. You can, of course, load this module using the familiar import syntax.
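As a reminder of how the two ingredients from the hint fit together, here is a small sketch which changes only the first character of a String. The name capitalise_first is hypothetical, and this is deliberately not a solution to the exercise.

```haskell
import Data.Char (toUpper)

-- Hypothetical helper: upper-cases the first character only
-- and attaches it to the unchanged rest with the ":" operator.
capitalise_first :: String -> String
capitalise_first text = case text of
   []    -> []
   c: cs -> toUpper c: cs
```

For example, capitalise_first "abc" gives "Abc".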
Submit your solution under “Lab 6 Upper Cases” on the SubmissionApp
(http://cs.anu.edu.au/SubmissionApp) to see whether all checks out.
Exercise 3: 3 esicrexE
Write a function invert which accepts any list and returns a list with the same elements, yet in
reverse order. So your function should respond like this:
invert [(1 :: Integer) .. 10] = [10,9,8,7,6,5,4,3,2,1]
invert "abc" = "cba"
Reminder: you can concatenate two lists with the ++ operator. This might come in handy when
you implement this function in a short way.
Make sure your first shot works for the above examples and submit your solution under “Lab 6
Invert” on the SubmissionApp (http://cs.anu.edu.au/SubmissionApp) to see whether all checks
out.
Exercise 4: Fast version
Now also test your function invert for:
last (invert [(1 :: Integer) .. 100000])
(last being a predefined function to pick out the last element of a list.)
If you find yourself hoping that it will get the job done before the lab finishes, then you need to
improve your function in this exercise (this should not take longer than 0.1 seconds). Otherwise
you get a freebie and can submit the same function again as its own fast version under “Lab
6 Invert (fast)” on the SubmissionApp (http://cs.anu.edu.au/SubmissionApp).
Exercise 5: Pack the rucksack
Write a function which accepts a list of natural numbers and a target sum:
rucksack :: [Natural] -> Natural -> [[Natural]]
such that all subsequences of the original list which add up to the target sum are returned. For
instance (the subsequence with the larger start number is listed first):
rucksack [3,7,5,9,13,17] 30 = [[13,17],[3,5,9,13]]
This is a simplified version of the well known knapsack problem. Submit under “Lab 6
Rucksack” on the SubmissionApp (http://cs.anu.edu.au/SubmissionApp) (and also include
“import Integer_Subtypes” at the beginning of your submission).
Make Sure You Logout
to Terminate Your Session!
Outlook
Next week you will write functions which will work for multiple types (polymorphic functions).