Math 365 Homework 3 Due Oct. 9, 2014 Grady Wright

Math 365
Grady Wright
Homework 3
Due Oct. 9, 2014
1. Gaussian Elimination. (10 points) Solve the following linear system of equations by hand
using Gaussian Elimination with partial pivoting.
( 2x
+4x3 +3x4
=4
1
−2x1
+2x3 −13x4 = 40
x1
+15x2 +2x3 −4.5x4 = 29
−4x1 +5x2 −7x3 −10x4
=9
The process you use to solve the system should follow exactly the process shown in the lecture
slides (except now with partial pivoting). You will not receive full credit if you deviate from
the standard algorithm. Show all your steps, and describe in words what you are doing in each
step in terms of elementary row operations. When you complete the problem, sit back and be
thankful we have computers that can be programmed to solve these types of problems.
√
2. Statics. (7 points) NCM, problem 2.3. Please also explain where the factor α = 1/ 2 comes
from in the set of equations.
3. Cholesky factorization. (8 points) NCM 2.5(a)
4. Timing a linear solve. (30 points) In this problem, you will investigate the cost of solving
linear systems and compare your results to the theoretical computational cost we have discussed
in class.
Before starting this problem, you will need the following information about the laptop or PC
you are working on.
• The amount of RAM in your computer. Typical values are betweem 4 and 12 GB (gigabytes).
• The clock speed of your PC or laptop. A typical value is 2.5 GHz (gigahertz).
(a) Experimental setup.
Determine a matrix size that you can comfortably fit into
your available RAM. For example, if you have a 4 GB machine, you should be able to
comfortably store a matrix that occupies about 100MB. Store this value in a variable
“Mb”. Use the following information to compute a maximum matrix dimension N that
you can store in Mb megabytes of memory.
•
•
•
•
A megabyte has 1024 kilobytes
A kilobyte is 1024 bytes
A floating point number is 8 bytes.
An N × N matrix contains N 2 floating point numbers.
Call the N you compute ’nmax’ (Nmax ).
(b) Calibrate the timing experiment. We need to “calibrate” our experimental set up by
timing an operation whose flop count is easy to compute. Create two random matrices A
and B each of size Nmax × Nmax . Using the Matlab functions tic and toc, determine
how much time (seconds) it takes to compute the product AB. Determine the number of
floating point operations (additions and multiplications) it takes to compute the Nmax ×
Nmax matrix-matrix product. Use this number to estimate the number of floating point
operations per second (’flops’) your computer can carry out. Call this flop rate ’flops’.
Compare this number to the theoretical number obtained using your computer’s clock
speed. The following information might be useful.
Figure 1: Timing plot for problem 3. The x-axis is the matrix size N (N × N matrix), and the
y-axis is the time (in seconds) as measured by tic and toc for solving a linear system of size N .
• A gigahertz is 109 hertz.
• A hertz is equal to one clock cycle.
• A typical microprocessor (CPU) can compute 4 flops per clock cycle.
If you have a dual or quad core machine, you may also want to investigate whether Matlab
is making use of multiple cores.
(c) Time a dense linear solve. Create an integer sequence of values Nvec between, say,
N = 100 and the Nmax you found above. To generate this sequence, it is a good idea to
use the logspace command.
Using a for loop, loop over the entries Ni of the sequence. In each pass through the loop,
create a random matrix A of size Ni × Ni , and a random right hand side vector b of size
Ni × 1. Then, using tic and toc, time how long it takes to solve the system Ax = b
using the backslash operator. Store this time as the ith entry in a vector ’lutimes’.
(d) On a log-log plot, plot the time values you found above (stored in lutimes) verses the
Ni in your sequence of N values. On the same set of axis, plot the curve of the theoretical
time estimated by the operation count we discussed in class. Be sure to use the flop rate
’flops’ you computed above to get a time (in seconds) from an operation count. The two
plots should be very close. Be sure to label your plots. Use the Matlab legend command
to identify each curve that you get. Your plot should look something like the figure shown
in Figure 1.
(e) (Extra Credit - 10 points) Visit the website http://www.top500.org and answer the
following questions.
• Using the value from the ’Linpack Benchmark’, what is the performance of the world’s
fastest supercomputer, measured in GFlops (= 109 flops)?
• How much faster is the world’s fastest computer than your computer or laptop?
• How recently would your computer have made it on the Top 500 list of the world’s
fastest computers?
• When can we expect to see an exaflop machine?
2
The ’Linpack Benchmark’ solves a dense linear system of equations, very much like what
you have done for this problem.
5. Countercurrent Exchangers. (30 points). Countercurrent exchangers occur both in nature
and in industry. As an example, consider an n-stage countercurrent exchanger for extracting
a chemical from a fluid as illustrated in the figure below. In this example, the fluid (such as
water) contains a mass fraction xin of some chemical as it enters the extractor on the top left
at a mass flow rate of W . At the same time, a solvent containing a mass fraction yin of the
same chemical enters the extractor on the bottom left at a mass flow rate of S. At each stage
of the reactor, the chemical in the fluid is removed and transferred to the solvent by some
chemical process. Assuming no chemical is created or destroyed in any stage, we can use the
conservation of mass (mass balance) to model this process. This yields the following set of
linear equations for the mass fraction of the chemical in the solvent and fluid at any stage i.
W xi−1 + Syi+1
{z
}
|
=
Total mass fraction entering stage i
W xi + Syi
| {z }
,
i = 2, 3, . . . , n − 1
Total mass fraction leaving stage i
The equations for stage 1 and n are given as
W xin + Sy2 = W x1 + Sy1
W xn−1 + Syin = W xn + Syn
Suppose that at each stage yi is proportional to xi , i.e. yi = mxi , where m is some constant
that depends on the target chemical. Then the above system of equations can be rewritten
just in terms of xi , i = 1, . . . , n as
−(W + Sm)x1 + Smx2 = −W xin ,
W xi−1 − (W + Sm)xi + Smxi+1 = 0
(i = 2, 3, 4, . . . , n − 1),
W xn−1 − (W + Sm)xn = −Syin .
Chemicallyrich fluid
Chemicallyclean fluid
Stage n
Stage n-1
Stage i
Stage 2
Stage 1
Chemicallyclean solvent
Chemicallyrich solvent
Figure 2: Schematic of the countercurrent exchanger for extracting a chemical from a fluid with
mass fraction xin .
(a) Write a Matlab function that computes the mass fraction of the chemical in the fluid
stream at each of the n stages (i.e. xi , i = 1, 2, . . . , n). Your function should take as
input W (in kg/hr), S (in kg/hr), xin , yin , m, and n and should output xi , i = 1, . . . , n.
It should use the spdiags function in Matlab to form the coefficient matrix and then
3
use the backslash operator \ to solve the system. As output, the function should return
both the mass fraction in the fluid and in the solvent. Your function should contain no
explicit looping structures. Turn-in a print out of your code.
(b) Write another Matlab function that performs the same task as part (a), but that uses
the tridisolve function from the NCM book to solve the linear system instead of the
spdiag function combined with the backslash operator.
(c) Show that your functions from part (a) and (b) produce the same output by solving the
n-stage countercurrent chemical extraction problem with W = 145 kg/hr, S = 27 kg/hr,
xin = 0.15, yin = 0, m = 7, and n = 51. To compare the results of the two functions,
compute the sum of the absolute value of the differences between the mass fraction values
returned by both functions.
(d) Using your function from either part (a) or (b), make an appropriate plot of the mass
fraction vs. the stages of the extraction for the problem in part (c). Include the mass
fraction of the chemical in the fluid and in the solvent on the same plot. An appropriate
plot means that I can see precisely how the mass fraction decays as the water flows
through the stages.
(e) By what percentage has the original chemical in the fluid been reduced by at the end of
the extraction procedure using the parameters from part (c). What is the mass fraction
of the chemical in the fluid and the solvent at Stage 26?
6. Heat transfer.(30 points) Imagine that we have a metal rod of length L (m) whose temperature at each end is held fixed at 20C o (293.15 K) and which is being heated by a flame held
at the midpoint. We can approximate the steady state temperature at equally spaced points
along the rod using the following model. We divide the rod into N intervals of equal length h
as shown in the figure below. This partitioning of the rod gives us N + 1 equally spaced points
xj = hj,
h=
L
,
N
j = 0, 1, 2, . . . N
The steady state temperature Tj (in Kelvin) at interior points xj in the rod can be modeled
as
Tj+1 + Tj−1
+ h2 f (xj ),
j = 1, 2, . . . , N − 1,
Tj =
2
where T0 = TN = 293.15K. The flame f (x) is given by
2 !
S
x − L/2
f (x) =
exp −
2ρcp β
ε
where
• S is the heating rate for the flame (W/m3 )
• β is the thermal diffusivity of the metal (m2 /s)
• cp is the heat capacity of the metal (J/(kg · K))
• ρ is the density of the metal (kg/m3 )
• ε is a scaling factor specifying the flame width (m)
(a) Using either the tridisolve or \spdiags function in Matlab , set up a linear system
to solve for the steady state temperature distribution in the rod of length L = 1 and
N = 201.
(b) Solve the system for a thermal diffusivity of β = 10−3 , heat rate S = 1 × 104 , heat
capacity of cp = 200, density of ρ = 10 and a scaling factor ε = 5 × 10−2 .
(c) Plot the resulting temperature distribution as a function of x. Be sure to add labels and
a legend to your graph.
(d) How close to the center of the rod could you put your hand with out getting burned?
Give the value of x and indicate where this point is on the graph from the previous part.
Note that you need to find the approximate temperature at which skin burns.
4
h
L
20°=
=20°
Figure 3: Schematic for the heated metal rod.
7. Cholesky factorization and sparse matrices. (15 points) As we discussed in class and
in problem 2.5 from the book, if A is a symmetric positive definite matrix then A can be
decomposed into the form A = LLT , where L is a lower triangular matrix (or A = RT R, where
R is and upper triangular matrix). This decomposition is called the Cholesky decomposition.
For this problem you will use the Cholesky function built into Matlab (chol), which is a very
sophisticated implemenation that efficiently handles the decomposition of a sparse symmetric
positive definite matrix by exploiting any structure formed by the non-zero entries in the
matrix.
(a) Download the file “bcsstk36.mat” from the course web page and save it to your working
directory in Matlab . This file contains a sparse matrix that arises from a structural
analysis of an automobile shock absorber assembly1 . Load the matrix into Matlab using
the following series of commands:
>>load bcsstk36;
>>A = Problem.A;
Use the spy command in Matlab to generate a plot showing the sparsity pattern of
the matrix A. Compute the sparsity ratio of the A matrix by comparing the number of
non-zero entries in A (use the nnz command) to the total number of entries in A. Report
this number along with some descriptive text in the title of the spy plot of A.
(b) Compute the Cholesky decomposition of the matrix from part (a) using the “lower” option
in the chol function (this is faster for sparse matrices than the default setting). Plot the
sparsity pattern of the lower triangular matrix from the decomposition. Compute the
amount of “fill-in” from the Cholesky decomposition. The “fill-in” can be defined as the
ratio of the number of non-zero entries in the Cholesky decomposition to the number of
non-zero entries in the original matrix. Report this number along with some descriptive
text in the title of the spy plot of the lower triangular matrix.
(c) What you would like to happen is for the “fill-in” from the Cholesky decomposition to
be as small as possible since this translate directly into a more computationally efficient
method for solving the underlying linear system involving the matrix A. For any given
sparse symmetric positive definite matrix, the “fill-in” that occurs from the Cholesky
decomposition depends on how the rows and columns of the matrix are ordered. Several
algorithms exist for permuting the rows and columns to try and minimize the “fill-in”. All
these algorithms are based on results from Graph Theory. One of the more popular (and
better) of these algorithms is the Symmetric Approximate Minimum Degree Permutation
method. Matlab contains a function for this method called symamd. Your goal is to
use this function on the original matrix A from part (a) to compare the “fill-in” from
the Cholesky decomposition of the permuted A matrix to the “fill-in” from the original
one. Create plots for the sparsity pattern of the permuted A matrix and its Cholesky
decomposition. In the latter include the “fill-in”.
1 see
http://www.cise.ufl.edu/research/sparse/matrices/Boeing/bcsstk36.html
5