M Grell, PHY221: ‘Topics in classical physics’

Contents

Intro: What is classical physics?
1. Dimensional analysis
   Units vs. dimensions
   Dimensional analysis
   Dimensionless groups
2. The harmonic oscillator
   Simple and damped harmonic oscillator
   Driven oscillator
   Coupled oscillators
3. Waves
   General description, wave equation
   Waves on strings, in fluids, and in solids
   Intensity of waves
   Transmission and reflection of waves
   Dispersion of waves
   Lightwaves in matter
4. Fictitious forces
   Coordinate systems and frames of reference
   Outer product, right hand rules, pseudovectors
   Inertial and rotating frames of reference
   Fictitious forces
5. Mechanics à la Lagrange
   Lagrange in a nutshell
   The Lagrangian formalism
   Cyclic coordinates
   Using the Lagrangian: Simple examples
   Using the Lagrangian: Advanced examples
   The 2-body problem

Recommended reading:
Fowles, ‘Analytical Mechanics’, or the newer edition, Fowles and Cassiday, ‘Analytical Mechanics’
Goldstein, ‘Classical Mechanics’
For dimensional analysis, e.g. J F Douglas, ‘An introduction to dimensional analysis for engineers’, Pitman 1969

What is ‘Classical Physics’?

There are (at least) three ways of defining classical physics:

Via the areas of physics it does or doesn’t apply to, like engineering mechanics, electrical engineering, geometric and wave optics, thermodynamics, and relativity, but NOT quantum mechanics, quantum optics, solid state physics, ...

Or, by philosophical debate on underlying concepts and assumptions on the nature of reality, which we will avoid here.

Or, we can just state a number of assumptions on which this course is based:
- 3-dimensional space with no curvature.
- Time, t, ticks away evenly in the same way for all (‘absolute time’).
- Space and time are separate phenomena that provide an arena for physical events, but are not themselves involved.
- There is no limit to the velocity a body can have, and mass is independent of a particle’s velocity.
- All bodies at the same time have precisely defined location and momentum.
- Bodies interact through forces only.

You will have noticed that we exclude not only quantum phenomena, but also relativity. Relativity IS part of classical physics, but just like optics or thermodynamics, it gets a course of its own. The ‘weirdest’ of the assumptions is the final one.

Exercise: What quantum mechanical interactions between particles are NOT forces?

Practically, these are justified assumptions for bodies that are much bigger and heavier than elementary particles, and move much slower than the speed of light. This still covers a lot of the real world, e.g. all of mechanical engineering.

1 Dimensional analysis

Units vs dimensions

A key difference between equations in physics and equations in mathematics is that physical equations usually relate quantities that have units as well as numbers. Mathematically, 3 + 2 = 5, end of story. However, if you have 3 apples and 2 pears, you have neither 5 apples nor 5 pears. You cannot sensibly add or equate quantities that have different units (strictly: different dimensions, see later). This gives us a consistency check on all physical equations we may have derived, or recalled from memory: If the equation adds, subtracts, or equates quantities of different units, we must have gone wrong.

Exercise: Can we multiply or divide quantities of different units?

Exercise: You remember the equation for the centrifugal acceleration is EITHER $a_{cf} = \omega r^2$, OR $a_{cf} = \omega^2 r$. $\omega$ is the angular velocity, r the distance from the axis of rotation. Which equation is correct?

This consistency check is probably the single most powerful tool you have at your disposal when sitting exams. Always carry units as well as numbers through your calculations, and check if units come out right.
If not, there’s a mistake somewhere, and often you may get a clue as well: if you try to calculate a length, but units work out as $\mathrm{m}^2$ or $\mathrm{m}^{-1}$, you probably forgot a root or an inversion somewhere.

A generalisation of the concept of units is that of dimension. Note that ‘dimension’ can have different meanings even within physics; the one you encounter here may be different from what you are familiar with. ‘Dimension’ in the sense used here does NOT mean one of the 3 directions of classical space. Instead, best look at an example:

Exercise: Spot the odd one out in the following list: meters, cm, nanometers, kilometres, feet, seconds, inches, miles.

Apart from the second, all are units of length. We therefore assign the quality or ‘dimension’ length [L] to any quantity that is measured in a unit of length, whichever unit that may be. You may prefer imperial or metric, micrometers or miles; all of these still have something in common, which the ‘second’ has not. That ‘something’ is the dimension L. Similarly, we introduce the dimension ‘time’ [T] for anything measured in seconds, hours, days, years, or any other unit of time, and the dimension ‘mass’ [M] for anything measured in kg, micrograms, etc. (Careful with imperial units here, they don’t clearly distinguish between mass and weight. A good reason to avoid them altogether.)

From these three dimensions, we can make up the dimensions of less basic quantities such as velocity, or force. Velocity is distance/time, so we assign to it the dimension L/T or $LT^{-1}$. Again, we may prefer km/h, or m/s, as units of velocity, but the dimension will still be L/T. Note that differentials make no difference here: If we define velocity as $v = \Delta x/\Delta t$, or $v = dx/dt$, it still has units m/s (or km/h), and dimensions L/T.
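The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not part of the course material: a dimension is stored as a tuple of exponents of (M, L, T), and the helper names `mul`, `div` and `can_add` are made up for this sketch.

```python
# A dimension is stored as the exponents of (M, L, T),
# e.g. velocity L/T becomes (0, 1, -1). The names below are illustrative only.
MASS = (1, 0, 0)
LENGTH = (0, 1, 0)
TIME = (0, 0, 1)


def mul(d1, d2):
    """Dimension of a product: exponents add."""
    return tuple(a + b for a, b in zip(d1, d2))


def div(d1, d2):
    """Dimension of a quotient: exponents subtract."""
    return tuple(a - b for a, b in zip(d1, d2))


def can_add(d1, d2):
    """Adding or equating is only sensible for identical dimensions."""
    return d1 == d2


velocity = div(LENGTH, TIME)  # L T^-1, whether measured in m/s or km/h
print(velocity)                                # (0, 1, -1)
print(can_add(velocity, div(LENGTH, TIME)))    # True: same dimension
print(can_add(LENGTH, TIME))                   # False: length vs. time
```

Note that the unit (m/s or km/h) never enters: only the exponents of M, L and T are tracked, which is exactly the distinction between units and dimensions.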
Acceleration has dimensions $LT^{-2}$. Force is given by F = ma, so multiply the dimensions of mass and acceleration to get the dimensions of force; work (energy) is W = Fs (force x distance). When you know either a defining equation, or a quantity’s units, you can work out the dimensions.

Exercise: What are the dimensions of acceleration, force, energy, density?

More dimensions are required when e.g. electrical charges are involved, but we will limit ourselves to non-electrical phenomena here.

With the concept of ‘dimension’, an equation that equates 2 physical quantities with different units can still be correct, as long as it equates quantities with the same dimensions: 1 m/s = 3.6 km/h is a sensible equation. Units are different, but dimensions are the same. The equation is ‘dimensionally consistent’.

Dimensional analysis

The technique of dimensional analysis takes the consistency check of equations to the next level. Rather than just checking if an equation can possibly be right, the demand for being consistent with respect to dimension can give us a tool to (sometimes) derive equations from dimensional considerations alone.

Let us look at the example of the angular frequency $\omega$ of the simple pendulum. From ‘proper’ reasoning (setting up an equation of motion, solving it under the assumption of small amplitude), we know $\omega = (g/l)^{1/2}$. Instead, let’s try the following: We simply list the quantities we think may affect $\omega$: length of the pendulum (l), mass of the pendulum bob (m), acceleration due to gravity (g). Now, we assume we can make up the correct equation for $\omega$ simply by taking the relevant quantities l, m, g to (unknown) powers, and multiplying these powers:

(eq. 1.1) \[ \omega = l^a m^b g^c \]

Now, we determine the unknown powers a, b, c by the demand that the equation has to be dimensionally consistent. We simply replace the quantities $\omega$, l, m, g by their dimensions, and write eq. 1.1 as a dimensional equation:

(eq. 1.2) \[ T^{-1} = L^a M^b \left(\frac{L}{T^2}\right)^c = L^{a+c} M^b T^{-2c} \]

wherein we have used the dimension of acceleration, $L/T^2$. Now, from comparing the left and right side of eq. 1.2, we derive a set of equations for a, b, c. No factor L appears on the left side, which means it has power zero; it then must have power zero on the right side as well: a + c = 0. Same for M, so b = 0. T has power -1 on the left side, so it must have -1 on the right side, too: -2c = -1. We have the following set of equations for a, b, c, which is easy to solve:

(eq. 1.3) \[ a + c = 0, \quad b = 0, \quad -2c = -1 \;\Rightarrow\; b = 0;\; c = \tfrac{1}{2};\; a = -\tfrac{1}{2} \]

Substituting a, b, c back into eq. 1.1, we get:

(eq. 1.4) \[ \omega = l^{-1/2} g^{1/2} = \sqrt{\frac{g}{l}} \]

Which we know to be the correct equation, at least in the limit of small amplitudes, from the ‘proper’ derivation. Like pulling a rabbit out of a hat. In particular, we have shown that the pendulum period does NOT depend on the mass of the bob, as Galileo already observed.

One fly in the ointment: with dimensional analysis we can never derive any factors in equations that have no dimension (or unit). So, there could be a factor 2, or $\pi$, or $\sqrt{5}$ in front of $(g/l)^{1/2}$, which we can’t find by dimensional reasoning. The unknown numerical factor happens to be one in the present example, but we won’t be that lucky every time.

In fact, the problem is a bit more serious than that. There is one more variable, the amplitude of the pendulum, which may well affect $\omega$. However, we have not even mentioned it for the purpose of dimensional analysis.

Exercise: Why not?

Pendulum amplitude, $\varphi_{max}$, is an angle. The ‘units’ of an angle are ‘radian’ (rad), but if you recap the definition of rad, you’ll find it is a length divided by another length, hence it has no units or dimensions at all. Quantities that have no units are called ‘dimensionless’. That does not mean that $\varphi_{max}$ cannot affect $\omega$, but it means that dimensional analysis cannot tell us if or how. So, we should recast eq. 1.4:

(eq. 1.5) \[ \omega = f(\varphi_{max}) \sqrt{\frac{g}{l}} \]

Wherein $f(\varphi_{max})$ is an unknown function, which we cannot determine from dimensional reasoning. All we know is that $f(\varphi_{max})$ has to be dimensionless, just like its argument, $\varphi_{max}$. As it happens here, for small amplitude, $f(\varphi_{max}) \to 1$; $f(\varphi_{max})$ thus becomes ‘invisible’ for small amplitudes.

We have seen in a nutshell what dimensional analysis does: It splits a problem into a dimensional and a non-dimensional part. It then solves (or, as we will see, sometimes only partly solves) the dimensional problem. Dimensional analysis is therefore also sometimes called nondimensionalisation.

Dimensionless groups

Dimensional analysis is particularly useful in situations where a full theory is not yet established (take note, that is almost a euphemism for ‘research’). A practically very important application of dimensional analysis is in fluid dynamics, which is e.g. concerned with the drag force, F, experienced in the movement of objects in water or air, or the flow of fluids through pipes. Predicting the drag force on an object of known size and shape is exceedingly difficult; many say, harder than quantum mechanics. While a general solution remains elusive, dimensional analysis can significantly reduce the amount of experimental work required to study drag. Unlike in the example of the period of the pendulum, we will not be able to solve even the dimensional part of the hydrodynamic drag problem completely by dimensional analysis, but we will be able to reduce the complexity of the problem by combining some of the relevant quantities into so-called dimensionless groups. The punchline is that the number of dimensionless groups will be smaller than the number of relevant quantities.

Hydrodynamics assumes a totally submerged body (or completely filled pipe) in an incompressible fluid, which is approximately true for liquids.
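As a brief aside before the drag problem is worked through: the exponent matching used for the pendulum (eqs. 1.1 to 1.3) is nothing but a small system of linear equations, one equation per dimension. The following sketch solves it with exact fractions. The helper `solve_linear` is written for this illustration only (a plain Gauss-Jordan elimination that assumes a unique solution exists); it is not a method taken from the course.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination with exact fractions (assumes a unique solution)."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        aug[col] = [v / aug[col][col] for v in aug[col]]
        # Remove this unknown from all other rows.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Unknowns (a, b, c) in omega = l^a m^b g^c; one row per dimension.
coefficients = [
    [1, 0, 1],    # L:  a + c = 0
    [0, 1, 0],    # M:  b     = 0
    [0, 0, -2],   # T: -2c    = -1
]
right_hand_side = [0, 0, -1]

a, b, c = solve_linear(coefficients, right_hand_side)
print(a, b, c)  # -1/2 0 1/2
```

The same routine would fail for a problem with more unknown powers than dimensions, because then no unique solution exists; that situation is precisely what leads to dimensionless groups.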
The relevant quantities that determine drag force are the density of the fluid, $\rho$, the relative velocity between fluid and craft (or pipe), v, the viscosity of the fluid, $\eta$, and the size of the object, l. For compressible fluids (e.g., air: ‘aerodynamics’), the compressibility $\kappa$ also comes into it, but for simplicity, we will work the example of the incompressible fluid here.

Exercise: List the dimensions of force, density, velocity, viscosity, and size.

We assume the drag force to be a function of all the relevant quantities in the form:

(eq. 1.6) \[ F = A\, l^a \rho^b \eta^c v^d \]

Wherein A is a dimensionless factor that will depend on the (dimensionless!) shape of the craft or object. Eq. 1.6 translates into the dimensional equation

(eq. 1.7) \[ MLT^{-2} = L^a [ML^{-3}]^b [ML^{-1}T^{-1}]^c [LT^{-1}]^d \;\Rightarrow\; \begin{cases} 1 = b + c \\ 1 = a - 3b - c + d \\ -2 = -c - d \end{cases} \]

These are 3 equations for 4 unknowns, so there cannot be a complete solution. However, it is possible to express 3 of them in terms of a single, final unknown. Of course, it is somewhat arbitrary which power you leave as the remaining unknown, and different choices lead to different dimensionless groups. Go with intuition: Here, c offers itself.

Exercise: Why pick c as the final unknown, not a, b, or d?

It is then easy to express a, b, and d all in terms of c:

(eq. 1.8) \[ b = 1 - c, \quad a = 2 - c, \quad d = 2 - c \]

Now, substitute a, b, and d by their respective expressions in terms of c in eq. 1.6:

(eq. 1.9) \[ F = A\, l^{2-c} \rho^{1-c} \eta^{c} v^{2-c} \;\Rightarrow\; \frac{F}{\rho v^2 l^2} = A \left(\frac{\rho v l}{\eta}\right)^{-c} \]

Where we have moved all known powers of $\rho$, v, l to the ‘F’ side of the equation and kept the unknown powers on the other side. If all is well with our technique, $(\rho v l/\eta)$ should be dimensionless, otherwise there would be trouble raising it to the unknown power c: we might end up with bizarre dimensions, what if $c = \sqrt{17}$? If $(\rho v l/\eta)$ is dimensionless, then the left-hand side of eq. 1.9 must also be dimensionless.

Exercise: Check directly that $\rho v l/\eta$ and the left-hand side of eq. 1.9 are both dimensionless.

We have rewritten eq. 1.6 in different, simpler terms, with variables lumped into so-called dimensionless groups, sometimes just called numbers: The Newton number $Ne = F/(\rho v^2 l^2)$, and the Reynolds number $Re = \rho v l/\eta$. Note we started with force as a function of four unknown powers, but finished with one number (Ne) as an unknown power of only one variable, Re:

(eq. 1.9) \[ Ne = A\, Re^{-c} \]

Finally, we slightly bend the rules and assume that Ne may be a general (and potentially complicated) function of Re, not just a power (a point that even the textbooks on dimensional analysis gloss over, so we won’t dwell on it). We arrive at eq. 1.10:

(eq. 1.10) \[ Ne = A\,\Phi(Re) \]

Dimensional analysis cannot tell us anything about A, which is down to the shape of the craft, and hence has no dimensions. But A will depend on shape only, not on size, velocity, or anything else. Also, dimensional analysis has not given us, and cannot give us, the unknown function $\Phi$. But it does significantly reduce the effort required if we want to measure $\Phi$ with systematic experiments, because we have reduced the number of independent variables. If we have measured the hydrodynamic behaviour of, say, a water pipeline, we then don’t have to repeat it for an oil pipeline of the same shape (it would usually be a cylinder). The relationship between Ne and Re will be the same, only you have to calculate Ne and Re with different density, viscosity, velocity, size.

A most useful aspect of dimensional analysis is that it allows experimentation with down-scaled models, rather than full-size objects. As long as the shape of the object is kept the same, dimensionless equations such as eq. 1.10 remain valid; just use a different length l to calculate Ne and Re. The potential savings in effort and cost are obvious, think e.g. of ship or aircraft design. Dimensional analysis tells you how to ‘translate’ experimental conditions, and results, between model and full size.

Of course, the inverse, square, root, or other power of a dimensionless quantity will again be dimensionless.
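How the ‘translation’ between model and full size might look in practice is sketched below, under the assumptions of this section (same shape, fully submerged, incompressible fluid). The function names and all numerical values are hypothetical and chosen only for illustration: the model is run at the speed that gives the same Re as the full-size object, and the measured model force is then converted via the common value of Ne.

```python
def reynolds(rho, v, l, eta):
    """Reynolds number Re = rho * v * l / eta."""
    return rho * v * l / eta


def newton(force, rho, v, l):
    """Newton number Ne = F / (rho * v^2 * l^2)."""
    return force / (rho * v**2 * l**2)


def matching_model_speed(re_target, rho_model, l_model, eta_model):
    """Speed at which the model reaches the target Reynolds number."""
    return re_target * eta_model / (rho_model * l_model)


def full_scale_force(ne, rho_full, v_full, l_full):
    """Drag force on the full-size object, given the shared Newton number."""
    return ne * rho_full * v_full**2 * l_full**2


# Hypothetical numbers: a 10 m object moving at 2 m/s in water, and a
# 1 m model of the same shape tested in the same water.
rho, eta = 1000.0, 1.0e-3          # assumed SI values for water
l_full, v_full, l_model = 10.0, 2.0, 1.0

re_full = reynolds(rho, v_full, l_full, eta)
v_model = matching_model_speed(re_full, rho, l_model, eta)

force_model = 500.0                # made-up 'measured' drag on the model, in N
ne = newton(force_model, rho, v_model, l_model)
force_full = full_scale_force(ne, rho, v_full, l_full)

print(v_model, force_full)
```

With the same fluid for model and full size, matching Re forces the model to move faster by the scale factor, which is one reason why model tests are often done in a different fluid.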
Since powers of dimensionless quantities are again dimensionless, there is some ambiguity in the definition of a dimensionless group like the Reynolds number. Re here happens to emerge in the form $Re = \rho v l/\eta$ in our analysis, and we took it as it was, but one could also settle for 1/Re, or $Re^2$, or, say, $2\pi Re$, which would also be dimensionless; and an unknown power of Re is also an unknown power of $Re^2$, or $Re^{-1}$ (just a different one, but still unknown). There is room for manoeuvre here. Some dimensionless groups are convention for historic reasons, but often a particular form of dimensionless group is chosen because it has a clear physical interpretation. An example is the Mach number in aerodynamics, which can be interpreted as the ratio of the speed of an object to the speed of sound. However, the Mach number does not immediately emerge in that form from dimensional analysis. Instead, a different dimensionless group emerges, and the Mach number is a power of that group.

Fr and Oh

Wikipedia lists a table with about 100 entries under ‘dimensionless quantity’. Although introduced here as ‘classical physics’, any scientific or engineering equation must be dimensionally consistent, and hence, there will be scope to apply dimensional analysis e.g. in quantum mechanics.

Exercise: Show that the Schrödinger equation is dimensionally consistent.

Two important examples:

The drag experienced by ships is different again from both hydrodynamics and aerodynamics, as it is largely caused by the surface waves a ship generates as it moves. William Froude pioneered the testing of ship models in the 19th century. Because he did not know dimensional analysis yet, Froude experimented with models of the same shape but different size and found a ‘law of comparison’ (scaling). Later, his law was supported by dimensional analysis, which introduced a dimensionless number that essentially compares the vessel’s speed to the speed of the waves it generates.
This number is now called the Froude number (Fr), although Froude did not strictly formulate his ‘law of comparison’ in dimensionless terms.

In 1936, W Ohnesorge wrote on the "Formation of drops by nozzles and the breakup of liquid jets", wherein he combined a liquid’s viscosity, density, droplet size, and surface tension into a dimensionless group now known as the Ohnesorge number, Oh. Oh is highly relevant e.g. for the design of inkjet printers and Diesel injection nozzles.

2 The harmonic oscillator

Linear harmonic oscillator

Oscillations are common in both classical and quantum mechanical systems. We will here discuss mechanical oscillations, but the developed concepts can readily be generalised, e.g. to electrical oscillators. All oscillators contain elements that can store and release energy, e.g. springs (store potential energy) and masses (store kinetic energy), or capacitors (store electrical energy) and coils (store magnetic energy). Such oscillators will go on forever, and that is the first case we will discuss. Realistically, however, oscillators will also dissipate energy, e.g. in a mechanical dashpot or in an electric resistor. We will return to that situation later.

The simplest oscillator is a body of mass m attached to an ideal spring, with the mass being able to move only in the direction of the spring’s long axis, which we choose to call the x-axis (note we are at liberty to do that). Such an oscillator is called a linear harmonic oscillator (LHO), sometimes also ‘simple harmonic oscillator’. An ‘ideal’ spring is a spring that has zero mass of its own, and responds to stretching or compression away from its equilibrium length with a restoring force, $F_{res} = -k(x - x_0)$, that points back towards the equilibrium point, $x_0$ (Hooke’s law). k is known as the ‘spring constant’, and is a characteristic of the spring; it can be very different for different springs.

Exercise: What are the dimensions of k?
We are at liberty to place the origin of the coordinate system we are using at any point that we find convenient, say, $x_0$. (Note we do NOT have to choose the point where the spring is anchored as origin!) We can therefore always, without loss of generality, say $x_0 = 0$, and $F_{res} = -kx$.

Pause here and question how important or general the assumption of a linear force law, $F_{res} = -kx$, is. In the real world, there are few bodies that are literally connected to springs, but there are many oscillators; the ‘spring’ is a model for all sorts of restoring forces. Is it sensible to assume restoring forces are linear? Couldn’t they be quadratic, root, exponential, ...? Are we just conveniently picking something that is mathematically easy to handle, at the expense of losing the generality of our approach? Even some springs are deliberately designed to have force laws other than Hooke’s law (‘progressive springs’ in vehicle suspension).

As a general oscillator, assume a mass that initially rests in a local minimum $x_0$ of a general potential energy function, V(x). As above, we can place the origin of our coordinate system so that $x_0 = 0$, and we are at liberty to gauge our potential energy so that $V(x_0 = 0) = 0$. (Note, a force is the negative derivative of a potential energy, hence adding or subtracting any constant to a potential does not change forces.) What restoring force will the mass experience when it is displaced by a small distance from 0? To answer that, we use the first few terms of a Taylor expansion of the potential energy around its local minimum at $x_0 = 0$ as an approximation:

(eq. 2.1) \[ V(x) = V(0) + x \left.\frac{dV}{dx}\right|_{x=0} + \frac{1}{2} x^2 \left.\frac{d^2V}{dx^2}\right|_{x=0} + \dots \]

Therein, V(0) is a constant, which we have just ‘gauged’ to be zero, and $dV/dx\,(x=0) = 0$, because we have assumed we have a minimum at x = 0, and at a minimum the derivative is 0. The potential energy therefore, in the approximation of small deflection, x, scales quadratically with x.
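A quick numerical illustration of eq. 2.1 follows. The potential V(x) = 1 - cos(x) is a made-up example (in arbitrary units) with a minimum at x = 0 and V(0) = 0; the second derivative at the minimum is estimated with a central finite difference, which is a choice made for this sketch only.

```python
import math


def potential(x):
    """Hypothetical example potential with a minimum at x = 0 and V(0) = 0."""
    return 1.0 - math.cos(x)


def second_derivative_at_zero(v, h=1e-3):
    """Central finite-difference estimate of d2V/dx2 at x = 0."""
    return (v(h) - 2.0 * v(0.0) + v(-h)) / h**2


curvature = second_derivative_at_zero(potential)


def quadratic_term(x):
    """The x^2 term of the Taylor expansion, eq. 2.1."""
    return 0.5 * curvature * x**2


def relative_error(x):
    """How far the quadratic term is off from the full potential at x."""
    return abs(quadratic_term(x) - potential(x)) / potential(x)


for x in (0.05, 0.2, 1.5):
    print(f"x = {x}: V = {potential(x):.6f}, "
          f"quadratic term = {quadratic_term(x):.6f}, "
          f"relative error = {relative_error(x):.2%}")
```

For the small deflections the quadratic term reproduces the potential almost perfectly; for the large deflection it is visibly off, which marks the limit of the small-deflection approximation.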
To a first approximation (that is, for small deflection), this quadratic scaling holds for every realistic potential energy. A potential energy that scales quadratically with deflection is called harmonic. The negative derivative of the potential with respect to x is the restoring force, hence for the restoring force of the harmonic potential, we find $F_{res} = -kx$.

Exercise: Relate k in $F_{res} = -kx$ to the potential energy. (Hint: Look at the Taylor expansion of the potential.) Why is k always positive?

This explains the importance of the so-called harmonic oscillator. For small amplitudes, every oscillator is harmonic; that is why $F_{res} = -kx$ is much more than just a convenient assumption.

Now, we apply the Newtonian equation of motion, F = ma, to our model spring, with m the mass of the body and a, the acceleration, equal to $d^2x/dt^2$. This leads to the following differential equation:

(eq. 2.2) \[ m \frac{d^2x(t)}{dt^2} = -kx(t) \;\rightarrow\; \frac{d^2x(t)}{dt^2} + \frac{k}{m}\,x(t) = 0 \]

Exercise: What are the dimensions of k/m?

Differential equations are notorious throughout physics, and often very hard to solve (no worries, not in this case). A differential equation ‘describes’ a function, in this case x(t). The functions that answer the description are called the solutions of the differential equation. While it is often a hard, and mathematical rather than physical, task to find all solutions of a differential equation, it is easy to test if a ‘candidate’ is or isn’t a solution: Enter it into the equation, and see if the ‘=’ sign holds true. If it does, you have found a particular solution to the diff. eqn. There often are several different (i.e., linearly independent) particular solutions, e.g. eqn. 2.2 has two. The general solution of a differential equation is a function containing several parameters that encompasses all particular solutions as special cases.
In a specific situation, the parameters of the general solution have to be chosen to find a particular solution that is consistent with the initial and/or boundary conditions of a system, e.g. its position and velocity at t = 0.

Eqn. 2.2 is an example of a linear and homogeneous differential equation. ‘Linear’ means that the function x and all its derivatives enter the equation linearly, not with a power or root or log or anything else. (Note, the ‘square’ in $d^2x/dt^2$ stands for second derivative, not first derivative squared!) ‘Homogeneous’ means the right-hand side is zero, rather than a function of the variable, t. Eq. 2.2 is known as a ‘second order’ differential eqn., because the highest derivative of the function is the second derivative. A linear diff. eqn. has as many linearly independent particular solutions as its order, i.e. eqn. 2.2 has two.

Linear homogeneous diff. eqn.s are among the most benign. It is a property of linear homogeneous differential equations that a ‘linear combination’ of solutions is again a solution. If a set of n linearly independent particular solutions to a linear diff. eqn. of order n is known, the general solution of that linear diff. eqn. can be constructed by linear combination of all particular solutions.

Exercise: From the defining properties of ‘linear’ diff. eqn.s, show that linear combinations of solutions are again solutions!

You can easily confirm two particular solutions of eqn. 2.2: $x(t) = A\sin(\omega_0 t)$, and $x(t) = B\cos(\omega_0 t)$, wherein $\omega_0 = (k/m)^{1/2}$ is known as the angular frequency of the harmonic oscillator.

Exercise: Show that $x(t) = A\sin(\omega_0 t)$ and $x(t) = B\cos(\omega_0 t)$ are solutions of eqn. 2.2. What are the SI units of A and B? Why is simply $\sin(\omega_0 t)$ or $\cos(\omega_0 t)$ a mathematically, but NOT physically, acceptable solution?

Exercise: Use dimensional analysis to derive $\omega_0 = (k/m)^{1/2}$. Initially, consider the possibility that $\omega_0$ may depend on amplitude as well as k, m; dimensional analysis will prove that it does not.
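The ‘test a candidate’ idea can also be tried numerically. The sketch below integrates eqn. 2.2 step by step and compares the result with the candidate $x(t) = B\cos(\omega_0 t)$. Note the assumptions: the velocity-Verlet stepping scheme is a standard numerical method swapped in for this illustration and is not part of the course, and the values of m, k, B and the time step are arbitrary.

```python
import math


def simulate_oscillator(m, k, x0, v0, dt, steps):
    """Integrate m x'' = -k x with velocity-Verlet steps; returns [(t, x), ...]."""
    x, v = x0, v0
    a = -k * x / m                      # acceleration from eqn. 2.2
    trajectory = [(0.0, x)]
    for n in range(1, steps + 1):
        x += v * dt + 0.5 * a * dt**2   # advance position
        a_new = -k * x / m              # acceleration at the new position
        v += 0.5 * (a + a_new) * dt     # advance velocity with averaged acceleration
        a = a_new
        trajectory.append((n * dt, x))
    return trajectory


# Arbitrary illustrative values (SI units assumed).
m, k = 0.5, 2.0
omega0 = math.sqrt(k / m)
B = 0.1                                 # released at rest from x = B

trajectory = simulate_oscillator(m, k, x0=B, v0=0.0, dt=1e-3, steps=10000)

# Largest difference between the numerical result and B cos(omega0 t).
max_deviation = max(abs(x - B * math.cos(omega0 * t)) for t, x in trajectory)
print(max_deviation)
```

The deviation stays tiny compared to the amplitude B over several periods, and it shrinks further when the time step is reduced. This is a plausibility check, not a substitute for entering the candidate into the equation by hand.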
As we have confirmed two particular solutions of a 2nd order linear diff. eqn., we can construct the general solution of the harmonic oscillator:

(eq. 2.3) \[ x(t) = A\cos(\omega_0 t) + B\sin(\omega_0 t) \]

This is mathematically equivalent to

(eq. 2.4) \[ x(t) = X_{max}\cos(\omega_0 t + \varphi) \]

Exercise: Show that the above eqn.s 2.3 and 2.4 are equivalent, and give the relation between A, B and $X_{max}$, $\varphi$.

Note that eq. 2.4 is more convenient, as it allows the amplitude $X_{max}$ of the oscillation to be read off directly. The harmonic oscillator perpetually undergoes periodic (sinusoidal) motion, with an amplitude $X_{max}$, depending on how much energy it had in the beginning, and a phase, $\varphi$. The oscillator ‘repeats’ itself with frequency $f = \omega_0/2\pi$, or period T = 1/f. Oscillators are clocks. During oscillation, energy is converted back and forth between potential and kinetic energy, with maximum potential energy and zero kinetic energy when $x = X_{max}$, and maximum kinetic energy and zero potential energy at x = 0. The following relations hold:

(eq. 2.5) \[ v_{max} = \omega_0 X_{max}, \qquad a_{max} = \omega_0^2 X_{max}, \qquad W = V_{max} = T_{max} = \tfrac{1}{2} k X_{max}^2 = \tfrac{1}{2} m v_{max}^2 \]

Wherein $v_{max}$ is the maximum velocity and $a_{max}$ the maximum acceleration of the oscillator, W is the oscillator energy, T is the kinetic energy, and V is the potential energy (do not confuse V and v here: potential energy vs. velocity, nor the kinetic energy T with the period T). Note that the phase angle, $\varphi$, is absent from eq. 2.5.

Exercise: Derive all of eq. 2.5 from 2.4.

The above exercise confirms that eq. 2.4 is the more convenient form of representing the oscillation, because the maximum amplitude $X_{max}$ is directly linked to the oscillator’s energy.

Mathematically, that’s it; as physicists, we can do better. Just as we can call any minimum of a potential, $x_0$, the origin of a coordinate system, hence $x_0 = 0$, we can call any time, $t_0$, the ‘beginning’ of time, hence t = 0. Obviously, not the beginning of all time, but the beginning of our timekeeping of a particular observation.
So we can always make it so that our oscillator starts at maximum amplitude, $X_{max}$ (or at zero amplitude, but non-zero velocity; but I prefer the previous convention: Pull your body away from equilibrium to some amplitude, $X_{max}$, and let go. The moment you let go you call t = 0). By that convention, we always have $\varphi = 0$. For a single oscillator, $\varphi$ is a rather meaningless concept. It becomes meaningful only when we compare two oscillators, which may or may not be in step with each other. Since it is physically meaningless, $\varphi$ is absent from eq. 2.5.

Damped harmonic oscillator

The harmonic oscillator goes on forever; no real oscillator does. Our model fails to take ‘damping’ into account. Damping is introduced conceptually in the form of a dashpot that displays loss of energy via friction. A real oscillator may not literally contain a dashpot, but e.g. there may be air resistance in mechanical oscillators, or ohmic electrical resistance in an electrical oscillator. The dashpot is assumed to exhibit friction, that is a force, $F_f$, that always points in the opposite direction to the current velocity v = dx/dt (that much is not controversial) and, in magnitude, is proportional to velocity with a constant c:

(eq. 2.6) \[ F_f = -cv \]

Exercise: What are the dimensions of c?

There was a good justification for harmonic forces ($F_{res} = -kx$). $F_f = -cv$ is much less well justified; more generally, one should assume that the magnitude of the friction force scales as $v^n$. Different types of friction are known with different powers n. Stokes friction (slow movement in a highly viscous medium) indeed shows n = 1, but there are other friction laws, such as Coulomb friction (friction of a dry body on a dry surface, n = 0), Newton friction (fast movement in a low viscosity medium, n = 2), and Reynolds friction (between lubricated solid bodies, n = 1/2). For now, we stick with $F_f = -cv$ for our further discussion.
We have to add the force due to friction into the oscillator’s equation of motion, leading to the following extended diff. eqn.:

(eq. 2.7) \[ m \frac{d^2x}{dt^2} = -c\frac{dx}{dt} - kx \;\rightarrow\; \frac{d^2x(t)}{dt^2} + 2\gamma \frac{dx(t)}{dt} + \omega_0^2\, x(t) = 0 \]

Wherein we define the damping factor $\gamma = c/2m$, and as before, $\omega_0^2 = k/m$. It looks weird at first to define $\gamma = c/2m$, and then have $2\gamma$ in the equation, but you’ll soon see why.

Exercise: Look at eqn. 2.7. Why did we insist on the sometimes unphysical assumption n = 1? What type of diff. eqn. is eqn. 2.7 as long as n = 1, but not for n ≠ 1?

The extended differential eqn. is again linear, homogeneous, and 2nd order, like eq. 2.2. However, the presence of the first as well as the second derivative in eq. 2.7 means that neither sin nor cos alone can be a solution. Instead of guessing 2 particular solutions, we will employ a standard, systematic approach that is known to solve homogeneous linear diff. eqn.s with constant coefficients of all orders.

The one-size-fits-all approach to such homogeneous linear diff. eqn.s is to start with the ‘Ansatz’ (educated guess) that all particular solutions are exponential, of the form x(t) = A exp(at), but keeping in mind that there may be several sets (A, a) that solve the eqn.; in fact, as many as the order of the diff. eqn. Note how much easier it is to take the derivative of the exponential rather than of sin/cos: Differentiation means ‘multiply by a’. The exponential function itself doesn’t change; sin/cos do when you take the derivative!

Exercise: How can exponentials be related to sin/cos?

Exercise: Take the 1st and 2nd derivative of x(t) = A exp(at), enter them into eqn. 2.7, and cancel what you can, to get an eqn. for a.

If the ‘guess’ x(t) = A exp(at) is entered into eqn. 2.7, and you cancel all you can, you find that A exp(at) is indeed always a solution, as long as a fulfils the following equation:

(eq. 2.8) \[ a^2 + 2\gamma a + \omega_0^2 = 0 \]

Eq. 2.8 is known as the characteristic equation of the linear diff. eqn. 2.7.
Note that the characteristic equation is no longer a differential equation, but a conventional (‘algebraic’) equation. The characteristic equation is always of the same order as the diff. eqn. was, here 2nd order = quadratic. Eqn. 2.8 can be solved by the standard method for quadratic equations, yielding two a’s:

(eq. 2.9) \[ a_{1/2} = -\gamma \pm \sqrt{\gamma^2 - \omega_0^2} \]

Inspection of 2.9 reveals that the a’s may well be complex numbers, namely in the case $\omega_0 > \gamma$, which in fact is quite common. In that case, eqn. 2.9 shows that $a_1$, $a_2$ will be complex conjugates of each other. The benefit of having a standardised route to solving all linear homogeneous diff. eqn.s far outweighs biting the bullet of complex numbers.

Exercise: Brush up on complex numbers, in particular the meaning of ‘complex conjugate’, and exp(ix) = cos(x) + i sin(x).

Solutions of eqn. 2.7 hence may be of different types, depending on whether the quantity under the root in eqn. 2.9 is larger than, equal to, or smaller than zero, that is $\gamma^2 = c^2/4m^2$ (>/=/<) $\omega_0^2 = k/m$. Another way of writing this is $c^2$ (>/=/<) 4mk. We see that the type of solution depends on the relative magnitude of the squared friction constant, c, which quantifies dissipation of energy, and spring constant k times mass m, that is, the two quantities that quantify the amount of energy stored in an oscillator.

1st case: $c^2/4m^2 = \gamma^2 > \omega_0^2 = k/m$ ($c^2 > 4mk$)

In this case, both roots $a_{1/2}$ of the quadratic eqn. 2.8 are real, and both < 0. This case is known as ‘overdamping’. Damping is so strong that the ‘oscillator’ no longer oscillates, as you will see from entering $a_{1/2}$ into the solution ‘Ansatz’:

(eq. 2.10) \[ x(t) = A_1 \exp(a_1 t) + A_2 \exp(a_2 t) \]

With $a_{1/2}$ given by 2.9. Since $a_1$, $a_2$ are both < 0, the exponentials decay to zero for large times. The only unknowns in the general solution eq. 2.10 are $A_1$, $A_2$, which have to be fitted to the system’s specific initial conditions.

Exercise: Show that the initial conditions $x(0) = x_0$, $v(0) = v_0$ specify $A_1$ and $A_2$ as $A_1 = (a_2 x_0 - v_0)/(a_2 - a_1)$, and $A_2 = x_0 - A_1$.
For $v_0 = 0$, $A_1 = a_2 x_0/(a_2 - a_1)$, $A_2 = a_1 x_0/(a_1 - a_2)$. Note that if v(0) = 0, then x(t) never changes sign: x(0) is the largest deflection in modulus the system will ever have; from then on, it decays, but it never changes sign. (If v(0) ≠ 0, x(t) may change sign once, but no more than once!) The overdamped ‘oscillator’ does not oscillate. The system ‘creeps’ to zero. This is the case most different from the original, undamped oscillator we had discussed first. Friction, quantified by c, has the upper hand over energy storage, quantified by k and m.

2nd case: $c^2/4m^2 = \gamma^2 < \omega_0^2 = k/m$ ($c^2 < 4mk$)

Now, there is a negative number under the root in eq. 2.9. Consequently, $a_1$ and $a_2$ are complex conjugates ($a_2 = a_1^*$) with Re{$a_1$} = Re{$a_2$} = -γ. We also introduce the quantity $\omega_d$ via $\omega_d^2 = \omega_0^2 - \gamma^2$, with the index ‘d’ for ‘damped’. This gives the general solution

(eq. 2.11) $a_{1/2} = -\gamma \pm i\omega_d$ and $x(t) = \exp(-\gamma t)\left[A_+ \exp(i\omega_d t) + A_- \exp(-i\omega_d t)\right]$

This looks confusing: x(t) has to be real, so how do we make sure of that when eq. 2.11 contains imaginary exponents? It is easy to show that eq. 2.11 always returns a real number, as long as $A_+$, $A_-$ are both complex numbers themselves, and conjugate to each other: $A_- = A_+^*$. This also implies that there are only 2 independent parameters in $A_+$, $A_-$ (not 4, as there would be in two independent complex numbers). Hence, 2 initial conditions (x(0), v(0)) are again sufficient to determine both $A_+$ and $A_-$.

Exercise: Show that eq. 2.11 will always return a real number as long as $A_- = A_+^*$.

The ‘real’ nature of eqn. 2.11 is clearly visible when it is written in the alternative, mathematically identical form 2.12:

(eq. 2.12) $x(t) = X_{max}(0)\exp(-\gamma t)\cos(\omega_d t + \phi) = X_{max}(0)\exp(-t/\tau)\cos(\omega_d t + \phi)$

Wherein $X_{max}$, ϕ can be related to $A_+$, $A_-$.

Exercise: Derive the relation between $X_{max}$, ϕ and $A_+$, $A_-$.

Eq. 2.12 describes an oscillation, similar to the undamped oscillator (cf. eqn. 2.4), not ‘creep’ to zero.
The angular frequency of this oscillation is $\omega_d < \omega_0$, smaller than the angular frequency of the corresponding undamped oscillator. For very weak damping, $\gamma \ll \omega_0$, $\omega_d$ is very close to $\omega_0$. Again, if we assume oscillation starts at maximum amplitude and zero velocity, then ϕ ≈ 0 (for weak damping). The main difference between eq. 2.4 and 2.12 is that in 2.4, the oscillator amplitude always remains at $X_{max}$, because in the absence of damping, the oscillator doesn’t lose energy. The amplitude of the damped oscillator, described by eqn. 2.12, decays over time with a time constant τ = 1/γ, $X_{max}(t) = X_{max}(0)\exp(-t/\tau)$, because the oscillator loses energy due to damping. We have an oscillation with somewhat lower angular frequency, folded into an exponential decay ‘envelope’. This is quite different from the ‘creep’ observed in the overdamped case. When just referring to a ‘damped oscillator’, we usually mean the scenario described by eq. 2.12.

Exercise: Show that for weak damping, $\omega_d \approx \omega_0 - \gamma^2/(2\omega_0)$.

Exercise: Show that the time constant for the loss of the energy stored in the oscillator is half of the time constant for amplitude decay.

There is a 3rd possibility in eqn. 2.9, which marks the borderline between the two cases discussed above:

3rd case: $c^2/4m^2 = \gamma^2 = \omega_0^2 = k/m$ ($c^2 = 4mk$)

Now, energy storage (4mk) and dissipation ($c^2$) are balanced into a ‘stalemate’: The root in eq. 2.9 is zero. This is known as ‘critical’ damping. Now, the characteristic eqn. 2.8 has only one solution, which is real and negative: $a_1 = a_2 = -\gamma$. In quantum mechanics, this is called ‘degeneracy’. At first glance that may seem to simplify matters, but it doesn’t: The characteristic eqn. provides only one particular solution, but we need two particular solutions to make up the general solution. It takes a bit of mathematical trickery to come up with another, linearly independent particular solution.
We won’t go there… but it can be shown that the general solution in critical damping is given by:

(eq. 2.13) $x(t) = (At + B)\exp(-\gamma t)$

With $a_1 = a_2 = -\gamma$.

Exercise: Show that for $\gamma = \omega_0$, eq. 2.13 is a solution of eq. 2.7.

Again, there are no oscillations, only ‘creep’ towards zero. The time it takes to approach 0 is as short as possible without oscillation, with time constant τ = 1/γ. Practically, critical or near-critical damping often is desirable, as it is the fastest return to equilibrium without oscillations.

An important example of near-critically damped oscillators are vehicle suspension systems. Wheels are linked to springs to soften the blows from potholes etc. However, with springs alone, your car would soon hop along the road like a bouncy ball. Therefore, in parallel to the springs, a car has ‘shock absorbers’, that is dashpots with damping, c. Shock absorbers should be strong enough (c large enough) to make your car overcritically damped, but on the other hand, the car should ‘creep’ back to equilibrium position quickly. Ideally, therefore, you should be precisely at critical damping. Since the mass of the car may change, engineers tend to err on the safe side and somewhat overdamp the suspension, but when you overload the car you may cross the critical boundary: Overloaded cars tend to swing a few times after a pothole blow. Don’t overdo it.

So far the ‘canonical’ treatment of the damped oscillator, which you find in many textbooks. But note, it all relies on the assumption that damping force is proportional to v. If it is not (and there are a number of friction laws with powers n ≠ 1 of velocity) the resulting diff. eqn. is no longer linear, and the standard procedure introduced above does not apply. In the case of weak damping, approximate solutions to the nonlinearly damped oscillator can be found though; no details here. The prediction is that such an oscillator will still undergo decaying harmonic oscillations, but the ‘decay envelope’ is not exponential.
The following table covers a few examples:

Table 1
n      Shape of envelope
0      Linear
1/2    Parabolic
1      Exponential
2      Hyperbolic

Note that the approximate treatment of the weakly linearly damped (n = 1) oscillator predicts an exponential shape for the decay envelope, which we know to be correct from the precise treatment. So it seems the method of approximate treatment is reliable.

Whatever the decay law, every practical oscillator is somewhat damped, and will not go on forever… unless we ‘drive’ it.

The driven harmonic oscillator

In this chapter we introduce the extremely important concept of resonance. This is important well beyond classical physics. Resonance is particularly interesting for weakly damped oscillators. We will, in due course, make the assumption of weak damping, $\gamma \ll \omega_0$, to keep equations simple.

Let’s go back to the diff. eqn. 2.7 of the ‘free’ (i.e., not driven) harmonic oscillator. The absence of any external, driving force results in the ‘0’ on the right-hand side. To describe an oscillator that is driven (or ‘forced’) by a (time-dependent) external force, $F_{ext}(t)$, we introduce $F_{ext}(t)$ on the right-hand side:

(eq. 2.14) $\frac{d^2x(t)}{dt^2} + 2\gamma \frac{dx(t)}{dt} + \omega_0^2 x(t) = \frac{1}{m} F_{ext}(t)$

(We need to introduce m on the left-hand side, or 1/m on the right-hand side, to keep the dimensions correct. The left-hand side is an acceleration, not a force.) In mathematical terminology, introducing a function of time on the right-hand side of the differential eqn. means we go from a ‘homogeneous’ diff. eqn. to an ‘inhomogeneous’ diff. eqn.

Exercise: Show that for a constant external force, $F_{ext} = F_0$ (not a function of t), the solution of the ‘driven’ oscillator eqn. is exactly the same as that for the damped ‘free’ oscillator, apart from the fact that the mass no longer oscillates around the origin (the equilibrium of the spring under no force), but around $x_{eq} = F_0/k$, the equilibrium of the spring stretched by $F_0$.
The above exercise shall convince you that an external force that is constant is a ‘boring’ scenario. Also, forces that forever increase will at some point break the spring, while forces that forever decrease will at some point become the same as zero (or constant) force. That’s hardly what we mean by ‘driving’ the oscillator! So we should consider a periodic time-dependent external force, which still leaves us a lot of choice, so we need to think what driving forces may be sensible. The choice of external force we will discuss is:

(eq. 2.15) $F_{ext}(t) = F_0 \exp(i\omega t)$

That is, a harmonic driving force (written in complex exponential form for mathematical convenience). Therein, ω is an arbitrary drive frequency, not to be confused with $\omega_0$ or $\omega_d$. $\omega_0$, $\omega_d$ are determined by the properties of the oscillator, i.e. how big k, m, c are. ω isn’t! If you drive your oscillator by an external motor, you may have a dial to choose ω, completely independently of k, m, c.

We choose harmonic periodic forces, as they are the most ‘basic’ periodic function. The free oscillator undergoes harmonic motion, so it appears natural to drive it by one. In fact, the driver may be another oscillator, or a passing wave. More rigorously, one argues on the basis of Fourier’s theorem. This basically says that every periodic function can be decomposed into a superposition of harmonic oscillations of different angular frequency, amplitude, and phase. So, if we find a general solution for the driven oscillator eqn. in response to a harmonic external force, we have solved the general problem for an oscillator driven by any periodic external force: We only need to decompose the driving force into its harmonic components à la Fourier, then calculate the response of the oscillator to every harmonic component, and add up all responses. That may be difficult technically, but conceptually, the solution for the harmonic driving force is a complete solution of the problem.
Soon, we’ll see that due to the nature of ‘resonance’ we can in fact ignore most of the Fourier components apart from those near what we’ll call ‘resonance frequency’. So, we describe the driven oscillator by the inhomogeneous, linear differential equation:

(eq. 2.16) $\frac{d^2x(t)}{dt^2} + 2\gamma \frac{dx(t)}{dt} + \omega_0^2 x(t) = \frac{F_0}{m}\exp(i\omega t)$

The following theorem applies to inhomogeneous linear diff. eqn.s: The general solution of an inhomogeneous diff. eqn. is a particular solution of the full inhomogeneous eqn., plus the general solution of the corresponding homogeneous diff. eqn.

Exercise: Show that if you have a solution of the full inhg. diff. eqn., and add to that a solution of the corresponding homogeneous diff. eqn., the resulting function will still be a solution of the full, inhg. diff. eqn.

Physically, that means that whatever particular solution of the inhomogeneous eqn. we find, the resulting motion may still be superimposed by the harmonic oscillations of the ‘free’ (not driven) oscillator, which will be with angular frequency $\omega_d$, while, as we will see, the particular solution of the driven oscillator will usually not be with $\omega_d$. This is an irritation we could do without. And we will: In all that follows, we assume the absence of these ‘free’ oscillations. We note that ‘free’ oscillations always decay due to damping, over a timescale τ = 1/γ: free oscillations are transient. We simply assume our driven oscillator has been driven for a long time, $t \gg 1/\gamma$, so that all ‘free’ oscillations have died away. Keep in mind, however, that shortly after switching the driver on, your oscillator may behave differently; let it ‘swing in’ first to reach its ‘steady state’!

Now, ‘all’ we need is to find a special solution of the above inhg. diff. eqn. Easier said than done, you may say… so let’s try our old trick again, the trial solution or ‘Ansatz’: Guess a solution, and show it’s correct.

Exercise: Try to guess a sensible trial solution.
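The claim that ‘free’ oscillations are transient can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the original notes: it integrates the driven oscillator equation with a real drive $(F_0/m)\cos(\omega t)$ (the real part of the complex drive), uses a hand-written fourth-order Runge-Kutta step, and all parameter values and function names are hypothetical. Two runs that differ only in their initial deflection differ by a solution of the homogeneous equation, so their difference should die away on the timescale 1/γ.

```python
import math

def rk4_driven(x0, v0, gamma, omega0, f0_over_m, omega, t_end, dt=1e-3):
    """Integrate x'' + 2*gamma*x' + omega0^2*x = (F0/m)*cos(omega*t) up to t_end."""
    def accel(t, x, v):
        return f0_over_m * math.cos(omega * t) - 2.0 * gamma * v - omega0**2 * x

    t, x, v = 0.0, x0, v0
    for _ in range(int(round(t_end / dt))):
        # Classic RK4 stages for the first-order system (x' = v, v' = accel).
        k1x, k1v = v, accel(t, x, v)
        k2x = v + 0.5 * dt * k1v
        k2v = accel(t + 0.5 * dt, x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x = v + 0.5 * dt * k2v
        k3v = accel(t + 0.5 * dt, x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x = v + dt * k3v
        k4v = accel(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
    return x, v

def deflection_difference(t_end, gamma=0.5, omega0=3.0, f0_over_m=1.0, omega=2.0):
    """|x_a - x_b| at t_end for two runs starting at x(0) = 0 and x(0) = 1, both at rest."""
    xa, _ = rk4_driven(0.0, 0.0, gamma, omega0, f0_over_m, omega, t_end)
    xb, _ = rk4_driven(1.0, 0.0, gamma, omega0, f0_over_m, omega, t_end)
    return abs(xa - xb)

# Shortly after switch-on the two runs still differ; after t >> 1/gamma they agree.
print(deflection_difference(0.1), deflection_difference(30.0))
```

With γ = 0.5 the transient timescale is 1/γ = 2, so t = 30 is deep in the steady state, while at t = 0.1 the two runs are still far apart.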
(eq. 2.17) $x(t) = A(\omega)\exp(i(\omega t - \phi(\omega)))$

So, we assume that the response of the harmonic oscillator to harmonic drive is a harmonic oscillation. Also we assume the resulting oscillator amplitude may depend on ω, in a way that, hopefully, our equation will allow us to work out, and that there possibly is a phase ϕ between the drive and the oscillator response. Remember, for the free oscillator, we can dismiss phase as arbitrary: if only we choose the arbitrary point in time we call ‘0’ in a suitable way, we can always make the phase 0. For the driven oscillator, however, phase is meaningful.

Exercise: Discuss why ‘phase’ is a meaningful concept for driven, but not for free oscillators.

All these assumptions, I hope, are quite intuitive, but there is one less obvious assumption in the trial solution eqn. 2.17: We prescribe that the oscillator responds with the same frequency ω as the drive frequency, not with its ‘own’ frequency $\omega_d$, or any other. Remember, for the damped oscillator without driver, we did not prescribe a frequency: $\omega_d$ results from the maths, it’s not force-fed into the ‘Ansatz’ at the beginning! A reasoning why that has to be so is that if the driver is linked to the oscillator by a rigid drive shaft, driver and driven have to oscillate with the same frequency so that the drive shaft doesn’t have to stretch. Of course, the driver could be linked with a rubber band, or it could be a wind or a wave or a magnetic field… so it may be a bit iffy in those scenarios. The maths works out with this assumption, so go along with it. In the end, those who feel uneasy about prescribing a frequency to the oscillator that it may not like will have the last laugh.

So, let’s take the first and second derivative of our ‘Ansatz’ and enter into the inhg. diff eqn.
2.16:

$x(t) = A(\omega)\exp(i(\omega t - \phi))$
$\frac{dx}{dt} = i\omega A(\omega)\exp(i(\omega t - \phi)) = i\omega x(t)$
$\frac{d^2x}{dt^2} = -\omega^2 x(t)$

See how helpful it is to write harmonics as exp(iωt): differentiation with respect to t is multiplying by iω, it doesn’t get simpler than that! Now, let’s substitute dx/dt, d²x/dt² into eq. 2.16:

(eq. 2.18) $\left[-\omega^2 + 2i\omega\gamma + \omega_0^2\right] A(\omega)\exp(i(\omega t - \phi)) = \frac{F_0}{m}\exp(i\omega t) \;\Rightarrow\; \left[-\omega^2 + 2i\omega\gamma + \omega_0^2\right] A(\omega) = \frac{F_0}{m}\exp(i\phi)$

The time-dependent term exp(iωt) cancels, and we are left with an algebraic rather than a differential equation, so the ‘Ansatz’ was a success.

Exercise: In eqn. 2.18, there are 2 unknowns: name them! How can one equation be enough to solve for two unknowns?

If we apply exp(ix) = cos x + i sin x to the right-hand side of eqn. 2.18, it splits into two equations: Both real and imaginary parts have to be equal simultaneously. Equations in complex numbers are two equations formulated in a single line.

(eq. 2.19) $A(\omega)\left[\omega_0^2 - \omega^2\right] = \frac{F_0}{m}\cos\phi$ and $2\omega\gamma A(\omega) = \frac{F_0}{m}\sin\phi$

Exercise: From eqns. 2.19, find separate expressions for A(ω) and ϕ.

This leads to the following result for A(ω) and ϕ:

(eq. 2.20) $\tan\phi = \frac{2\gamma\omega}{\omega_0^2 - \omega^2}$ and $A(\omega) = \frac{F_0/m}{\sqrt{(\omega_0^2 - \omega^2)^2 + 4\gamma^2\omega^2}}$

Exercise: In the limit ω → 0, the ‘harmonic’ driving force becomes a constant, $F_0$. Show that the above eqn. for A(ω) reproduces the previous result for constant force, i.e. $A(0) = F_0/k$.

Eq. 2.20 is the key equation for the driven oscillator, which we now will discuss. The most important question, of course, is: at what ω is A(ω) as large as possible?

Exercise: Find the maximum of A(ω).

The standard procedure to find a maximum is to set dA(ω)/dω = 0. You’ll appreciate that this can be done, although it is rather technical… Here, only the result:

(eq. 2.21) $\omega_{max} = \sqrt{\omega_0^2 - 2\gamma^2} = \sqrt{\omega_d^2 - \gamma^2} = \omega_r$

The property of A(ω) to display a maximum when driven with $\omega_r$ is called resonance. $\omega_r$ is called resonance frequency, hence the index r. For weak damping, it is close to, but slightly smaller than, $\omega_d$, the frequency of the free oscillator.
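The location of the maximum can also be checked without any calculus, simply by scanning the amplitude formula of eq. 2.20 over a grid of drive frequencies. This is a rough sketch only; the function names, the grid spacing, and the values of $\omega_0$, γ and $F_0/m$ are hypothetical choices made for illustration.

```python
import math

def amplitude(omega, omega0, gamma, f0_over_m=1.0):
    """Steady-state amplitude A(omega) of the driven oscillator, eq. 2.20."""
    return f0_over_m / math.sqrt((omega0**2 - omega**2) ** 2
                                 + 4.0 * gamma**2 * omega**2)

def resonance_frequency(omega0, gamma):
    """omega_r = sqrt(omega0^2 - 2*gamma^2), eq. 2.21 (needs omega0^2 > 2*gamma^2)."""
    return math.sqrt(omega0**2 - 2.0 * gamma**2)

omega0, gamma = 10.0, 1.0          # hypothetical oscillator, arbitrary units

# Brute-force scan of drive frequencies from 0.001 to 20 in steps of 0.001.
grid = (i * 1e-3 for i in range(1, 20001))
omega_peak = max(grid, key=lambda w: amplitude(w, omega0, gamma))

print(omega_peak, resonance_frequency(omega0, gamma))
```

The scanned peak should agree with eq. 2.21 to within the grid spacing, and `amplitude(0.0, omega0, gamma)` returns $(F_0/m)/\omega_0^2 = F_0/k$, the static deflection.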
Resonance is the key phenomenon in the physics of the driven oscillator; sometimes a driven oscillator is even called a ‘resonator’.

Exercise: Recap the meaning of the different ω’s: $\omega_0$, $\omega_d$, $\omega_r$, and ω without index. Do not confuse them!

Now we know the location of the ‘resonance peak’, at $\omega = \omega_r$, but how high is the amplitude $A_{max}$ at resonance? To calculate $A_{max} = A(\omega_r)$, simply enter $\omega_r$ into the general equation for A(ω). You will find:

(eq. 2.22) $A_{max} = A(\omega_r) = \frac{F_0/m}{2\gamma\omega_d}$

(Remember the definition of $\omega_d$, the frequency of the freely oscillating damped harmonic oscillator.) Eq. 2.22 still contains the somewhat arbitrary $F_0$. To put $A_{max}$ into perspective, we calculate $A(\omega_r)/A(0)$, that is the amplitude at resonant drive, divided by the amplitude under the effect of the same force, $F_0$, applied statically. Since $A(0) = F_0/k = F_0/(m\omega_0^2)$,

(eq. 2.23) $\frac{A(\omega_r)}{A(0)} = \frac{\omega_0^2}{2\gamma\omega_d} \approx \frac{\omega_0}{2\gamma} = \frac{\sqrt{km}}{c} = Q$

Eq. 2.23 is only sensible for sub-critically damped oscillators, and the approximation is valid for weakly damped oscillators, $\gamma \ll \omega_0$. As of now, we will usually make the assumption of weak damping, $\gamma \ll \omega_0$.

Eq. 2.23 introduces the quantity Q, which is known as the quality of the oscillator. Eq. 2.23 gives Q as the relative magnitude of the two energy-storing elements, mass (m) and spring (k), to the energy-dissipating element, dashpot (c). However, Q is a key general concept that applies to all oscillators, not just mechanical oscillators. Of course, for an electrical oscillator we cannot calculate Q from k, m, c, because it doesn’t have a mass or spring or dashpot. Instead, Q can be extracted from measured oscillator data (we will see how in due course) without breaking it down into its components and measuring those separately. Q is so important precisely because it is not specific to a particular type of oscillator. You cannot directly compare e.g. c in a mechanical oscillator to resistance R in an electric oscillator: to begin with, they have different units.
But you can compare Q of mechanical and electrical oscillators.

Exercise: Show that Q is dimensionless!

At resonance, amplitude is Q times higher than when the same oscillator is subject to the same force applied constantly, rather than with frequency $\omega_r$. The energy stored in the oscillator is proportional to amplitude squared, hence the oscillator fully ‘swung in’ at resonance contains $Q^2$ times more energy than the same oscillator ‘driven’ with zero frequency. This gives you a glimpse of the potential destructive power of resonance. It also shows one way of extracting Q from measured data: All you need are A(0) and $A_{max}$, then use $Q = A_{max}/A(0)$.

With hindsight, the concept of ‘quality’ can be applied to the freely decaying, damped oscillator as well: according to eq. 2.23, all you need to calculate Q are k, m, c, and all of those are present for the freely decaying oscillator. Q is a property of the oscillator, not the driver. Freely decaying oscillations allow another way of measuring Q very simply, without measurement of k, m, c.

Exercise: Show that a freely decaying oscillator undergoes Q/π oscillations until its amplitude has decayed to 1/e of its initial amplitude.

Deflect the oscillator to an arbitrary initial value, mark 1/e of this value on a measuring scale, and count the number of oscillations, N, until the oscillator has decayed to 1/e of the original amplitude. Multiply that number by π, and you have measured Q:

(eq. 2.24) $Q = \pi N$

The above also tells us how long we have to wait until a driven oscillator has reached its ‘steady state’. It takes N oscillations for transient oscillations to decay to 1/e, which will take time TQ/π, with T the period of the oscillator.

In a similar way, you could look at the decay of an electrical oscillator on an oscilloscope screen, and determine Q simply by counting. The electrical oscillator has no m, k, or c (instead, capacitors, inductances, and resistors).
Still, Q is well defined and easily measured, even if you do not exactly know how the oscillator works or what parts it contains. Q of a musical instrument is known as its sustain, e.g. a high sustain guitar will keep ringing long after it has been plucked.

‘Sharpness’ of resonance

We have found the maximum of A(ω) at $\omega = \omega_r$, but how ‘sharp’ is the resonance, i.e. what size of frequency ‘mismatch’ between drive frequency and resonance, $\omega - \omega_r$, can we tolerate to still get a considerable response from the oscillator? To answer this quantitatively, we find the ‘Full Width Half Maximum’ (FWHM) of the resonance function, A(ω). We define the half-width frequencies, $\omega_{FWHM}$, via $A(\omega_{FWHM}) = \frac{1}{2} A(\omega_r)$. So, by definition, at $\omega_{FWHM}$ (there will be two such frequencies), the amplitude response of the driven oscillator is half as high as the resonance amplitude. Groan, two more ω’s with yet another meaning.

Exercise: In the limit of weak damping ($\omega_r \approx \omega_0$), find the $\omega_{FWHM}$’s from the defining equation.

There are two solutions:

(eq. 2.25) $\omega_{FWHM} = \omega_0 \pm \sqrt{3}\,\gamma$

Hence, the FWHM of the resonance peak is given by

(eq. 2.26) $\Delta\omega_{FWHM} = 2\sqrt{3}\,\gamma$ or $\frac{\Delta\omega_{FWHM}}{\omega_r} = \frac{\sqrt{3}}{Q}$

With Q the quality of the oscillator. (Always remember, for weak damping, $\omega_0 \approx \omega_d \approx \omega_r$.)

Note that resonance curves sometimes are plotted as intensity or energy (rather than amplitude) against frequency, intensity being proportional to the square of amplitude. The FWHM frequencies of a plot of intensity or energy against frequency relate to Q slightly differently, namely,

(eq. 2.27) $\frac{\Delta\omega_{FWHM,Intensity}}{\omega_r} = \frac{1}{Q}$

Eq. 2.26 or 2.27 give a third way of extracting Q directly from measured data: Finding the FWHM from a graph of A(ω) is easy, without having to measure m, c, k separately, and again, we are not restricted to mechanical oscillators.
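The weak-damping widths of eqs. 2.26 and 2.27 can be compared with the widths read off the full resonance curve. The following is a hedged sketch under stated assumptions: the amplitude formula is that of eq. 2.20 with $F_0/m = 1$, the half-maximum points are located by a simple hand-written bisection, the helper names and parameter values are hypothetical, and the bracketing assumes Q is well above 2 so that A(0) lies below the chosen level.

```python
import math

def amplitude(omega, omega0, gamma):
    """Steady-state amplitude A(omega) of eq. 2.20 with F0/m set to 1."""
    return 1.0 / math.sqrt((omega0**2 - omega**2) ** 2 + 4.0 * gamma**2 * omega**2)

def bisect(f, a, b, tol=1e-12):
    """Find a sign change of f between a and b by interval halving."""
    fa = f(a)
    for _ in range(200):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if (fa < 0.0) == (fm < 0.0):
            a, fa = mid, fm
        else:
            b = mid
        if b - a < tol:
            break
    return 0.5 * (a + b)

def width_at_fraction(omega0, gamma, fraction):
    """Full width of A(omega) at the given fraction of its peak value.

    fraction = 0.5 gives the amplitude FWHM; fraction = 1/sqrt(2) gives the
    FWHM of the intensity, which is proportional to amplitude squared.
    """
    omega_r = math.sqrt(omega0**2 - 2.0 * gamma**2)
    level = fraction * amplitude(omega_r, omega0, gamma)
    shifted = lambda w: amplitude(w, omega0, gamma) - level
    lower = bisect(shifted, 0.0, omega_r)          # crossing below resonance
    upper = bisect(shifted, omega_r, 3.0 * omega0) # crossing above resonance
    return upper - lower

omega0, gamma = 100.0, 1.0                  # hypothetical, Q = omega0/(2*gamma) = 50
fwhm_amplitude = width_at_fraction(omega0, gamma, 0.5)
fwhm_intensity = width_at_fraction(omega0, gamma, 1.0 / math.sqrt(2.0))
print(fwhm_amplitude, 2.0 * math.sqrt(3.0) * gamma)   # compare with eq. 2.26
print(fwhm_intensity, 2.0 * gamma)                    # compare with eq. 2.27
```

For Q = 50 the numerically determined widths should differ from the weak-damping formulas only by a fraction of a percent; the agreement worsens as Q is lowered, which is a direct way of seeing where the approximation $\omega_r \approx \omega_0$ stops being good.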
Q is a universal and dimensionless measure that allows us to compare the relative merit of different types of oscillators without (mentally or really) breaking them down into springs, masses, dashpots, or whatever. In fact, people commonly talk about Q in the context of atomic emission spectra. Plot the light emission intensity against frequency, find maximum and FWHM, calculate Q. (Note: Light intensity is proportional to the square of the amplitude, hence eq. 2.27 should be used.) This is light coming from an atom: No springs, masses and dashpots, but Q remains a useful concept.

Exercise: Read up what a Q-switch is in the context of laser physics.

If you had an intuitive tummy ache when we mathematically forced the driven oscillator to respond with the drive frequency ω, rather than e.g. with its ‘own’ frequency, $\omega_d$, you can relax now. Postulating response with ω is correct, otherwise the maths wouldn’t have worked out; after all, we have successfully solved the inhomogeneous differential eqn. But the oscillator gets its own back: Many oscillators have Q larger than 1000, which means they won’t respond with substantial amplitude unless driven with a frequency that is (roughly) within a fraction of 1/1000th or less of $\omega_d$. If you try to drive the oscillator with a frequency that is substantially different from $\omega_d$, it will respond with very small amplitude. Mathematically, you force the oscillator to respond with ‘your’ frequency ω, not with the oscillator’s own $\omega_d$ or $\omega_r$. Practically, however, the oscillator forces you to drive it at or near $\omega_r$, otherwise it just won’t oscillate much. Eq. 2.27 tells us that the higher Q gets, the better we have to match the resonance frequency.
A table of a few typical Qs:

Table 2: Quality for typical oscillators
Oscillator                 Q
Piano string               3000
Microwave cavity           $10^4$
Electron shell of atom     $10^7$
Nuclear γ-transition       $10^{12}$…$10^{13}$

The extremely high Q (extremely narrow spectral lines) of nuclear γ-transitions is the basis of physics’ most sensitive spectroscopy, Mössbauer spectroscopy (Nobel Prize 1961). Because Q is so high, Mössbauer spectroscopy can measure the tiny frequency shift from the energy loss of a γ-photon flying ‘uphill’ in Earth’s gravity, thus confirming one of the predictions of general relativity. It can also measure the very small shifts that occur in the energy levels inside γ-active atomic nuclei, resulting from different chemical bonds the respective atom may be engaged in. Prominent example: Oxidation states of iron (Fe).

Phase angle

We briefly discuss the behaviour of the phase angle as a function of drive frequency. Since $\tan\phi = 2\gamma\omega/(\omega_0^2 - \omega^2)$, we see that in the limit $\omega \ll \omega_0$, tan ϕ approaches zero from positive values; hence, for ω → 0, ϕ → 0: At low frequency, oscillator response is in phase with the driver (e.g., like sine and sine). For very large ω, $\omega \gg \omega_0$, tan ϕ approaches zero from negative values, hence ϕ → π: At high frequencies, oscillator response is 180 deg. out of phase with the drive (like sine / -sine). At $\omega = \omega_0 \approx \omega_r$, tan ϕ → infinity, hence ϕ = 90 deg. At resonance, drive and response are out of phase by 90 deg., e.g. like sine / cosine.

Exercise: Show that the power, P, the driver delivers to the oscillator has a maximum at resonance. Use P = Fv (power = force x velocity) to do so. Reasoning will do, no calculation required to answer this!
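The three limits of the phase angle discussed above can be reproduced in a few lines. This is only an illustrative sketch with hypothetical names and values. It takes sin ϕ and cos ϕ as proportional to $2\gamma\omega$ and $\omega_0^2 - \omega^2$ respectively (the two equations that result from splitting the complex amplitude equation into imaginary and real parts) and uses the two-argument arctangent, so that ϕ lands on the correct branch between 0 and π.

```python
import math

def phase_lag(omega, omega0, gamma):
    """Phase lag phi of the response behind the drive, in radians (0 <= phi <= pi).

    sin(phi) is proportional to 2*gamma*omega and cos(phi) to omega0^2 - omega^2,
    so atan2 picks the branch directly.
    """
    return math.atan2(2.0 * gamma * omega, omega0**2 - omega**2)

omega0, gamma = 10.0, 0.1     # hypothetical weakly damped oscillator

for omega in (0.01, omega0, 1000.0):
    # Far below, at, and far above omega0.
    print(omega, math.degrees(phase_lag(omega, omega0, gamma)))
```

A plain `math.atan` of the ratio $2\gamma\omega/(\omega_0^2 - \omega^2)$ would return negative angles for $\omega > \omega_0$, which is why the sketch uses `math.atan2` instead.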
Feedback

In a feedback loop, an oscillator is driven by a driver, as before, only now the drive frequency is not set externally: the driver’s frequency itself is set by the oscillator, so it is synchronised, and the system stabilises at the oscillator’s $\omega_r$ and a constant amplitude, so the oscillation does not decay until the driver runs out of power. Frequency stabilisation is not perfect; relative tolerance is proportional to the width of the resonance curve, i.e. 1/Q. A typical application is in clocks.

A very accessible example of feedback is the so-called Accutron, an electromechanical clock popular in the 1960s and 1970s. A tuning fork that resonates at 360 Hz drives a mechanical clockwork that turns the hands of the clock. The fork is made of magnetic material and is itself driven by the AC magnetic field of two induction coils, which periodically receive a current pulse from a drive transistor. Feedback is facilitated by a third coil that acts as pickup. The vibrating fork induces an AC current in the pickup coil, which is driven into the transistor’s base, that is its ‘control’ input. In this way, the oscillator synchronises its driver. The Accutron was the first electronic clock, one of the first commercial products using a transistor, and went into space on early satellites, e.g. Telstar.

Modern electronic clocks use an oscillating quartz crystal instead of a tuning fork. Via a phenomenon known as piezoelectricity, the quartz couples mechanical oscillations to an electric oscillator circuit, and stabilises the electric oscillator at the resonant frequency of the mechanical oscillations of the quartz. In terms of electronics, the drive and pickup of a quartz oscillator is ‘capacitative’ (via electric fields), while for the Accutron, it is ‘inductive’ (via magnetic fields). Systems similar to the Accutron, albeit with capacitative pickup, are making a comeback in microelectromechanical systems (MEMS), which essentially are micrometer-sized tuning forks.
Exercise: Why do we need to couple the electric circuit to a mechanical oscillator to make an accurate clock? Why not just build a purely electric oscillator?

Feedback is an extremely important phenomenon both in electrical engineering and in nature, and can get extremely unpredictable or ‘chaotic’ when feedback is time-delayed (e.g. population dynamics in predator/prey ecosystems). It is a challenge to traditional scientific thinking that works ‘linearly’, from input = cause to output = effect. In feedback, output is returned to input, and you have to start thinking in self-consistent loops instead.

2.4 Coupled oscillators

Often, oscillators interact with each other, i.e. they are coupled. Oscillations are then communicated between them. A typical example is the atoms in a crystal. Atoms have clearly defined equilibrium positions, but due to thermal motion, will not sit still at these positions, but oscillate around them. A first approximation to thermal oscillations might be that atoms are bound to their equilibrium positions by harmonic forces, and indeed an early theory of heat capacity in crystals assumed just that (Einstein theory of heat capacity). However, if you think about it, atoms will rather be bound by harmonic forces to their neighbours, not their equilibrium positions: the ‘spring’ will be a chemical bond, and the bond is between atom and atom, not between atom and lattice site. Hence, their oscillations will be coupled. That led to a treatment of heat capacity based on coupled oscillations by Debye, which gives much better agreement with measured data.

As a very simple example of coupled oscillators, let us discuss two harmonic oscillators with the same spring constant k, and mass m. Now, we introduce a third spring, k’, that is parallel to the first two, and links the two masses. k’ couples the two previously independent oscillators.
We call $x_1$, $x_2$ the displacement of mass 1/2 from its equilibrium position (note we use different origins for $x_{1/2}$ but call the same direction ‘positive’). We assume one-dimensional motion along the springs only, and neglect damping. The motion of the coupled oscillators is described by a pair of coupled differential equations:

(eq. 2.28) $m\ddot{x}_1 + kx_1 - k'(x_2 - x_1) = 0$ and $m\ddot{x}_2 + kx_2 + k'(x_2 - x_1) = 0$

Both equations begin like the eq. of the independent oscillator, but add another force that is due to the coupling spring. k’ ‘mixes’ the two equations. $x_{1/2}$ are no longer independent of each other, because they appear in each other’s equations. Note the equations, while coupled, are still linear, and damping has been neglected. Let’s try to solve them again with an ‘Ansatz’, $x_{1/2} = A_{1/2}\exp(i\omega t)$. That means, we assume oscillations will be harmonic. Amplitudes may be different, and there may be a phase shift between the two oscillations; mathematically, this would be indicated by complex $A_{1/2}$. What frequency ω the coupled oscillation will have, we have to work out. So, enter the ‘Ansatz’ into eq. 2.28, recalling $d^2x_{1/2}/dt^2 = -\omega^2 x_{1/2}$.

(eq. 2.29) $-m\omega^2 A_1 + kA_1 + k'A_1 - k'A_2 = 0$ and $-m\omega^2 A_2 + kA_2 + k'A_2 - k'A_1 = 0$

or

$\begin{pmatrix} -m\omega^2 + k + k' & -k' \\ -k' & -m\omega^2 + k + k' \end{pmatrix} \begin{pmatrix} A_1 \\ A_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$

Again, all the ‘oscillatory’ exp(iωt) parts can be cancelled from the equation, and we are left with an algebraic rather than a differential equation. Only, as we had coupled differential equations, we end up with a ‘coupled’ algebraic equation, that is, two equations with three unknowns, $A_1$, $A_2$, ω. The alternative version of eq. 2.29 re-writes the coupled equations in matrix form.

Exercise: Recall the rules of multiplying a matrix and a vector. Show that the two ways of writing eq. 2.29 are equivalent.
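The matrix form of eq. 2.29 lends itself to a quick numerical experiment. The sketch below is an illustration under assumptions, not the method of these notes: the function names, the values of m, k, k’ and the trial frequencies and amplitude vectors are all hypothetical picks. It builds the 2x2 matrix for a given ω, multiplies it with a trial amplitude vector, and evaluates the determinant, so that one can see for which trial values the product vanishes.

```python
import math

def coupling_matrix(omega, m, k, k_c):
    """2x2 matrix of eq. 2.29 for drive-free harmonic motion at angular frequency omega.

    k_c stands for the coupling spring constant k' of the text.
    """
    d = -m * omega**2 + k + k_c
    return [[d, -k_c], [-k_c, d]]

def mat_vec(M, A):
    """Product of a 2x2 matrix with a 2-component amplitude vector."""
    return [M[0][0] * A[0] + M[0][1] * A[1],
            M[1][0] * A[0] + M[1][1] * A[1]]

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

m, k, k_c = 1.0, 1.0, 0.5     # hypothetical values, arbitrary units

# Trial (omega, amplitude vector) pairs chosen for illustration only.
trials = [(1.0, [1.0, 1.0]),
          (math.sqrt(2.0), [1.0, -1.0]),
          (math.sqrt(1.5), [1.0, 1.0])]
for omega, A in trials:
    M = coupling_matrix(omega, m, k, k_c)
    print(omega, det2(M), mat_vec(M, A))
```

For the first two trial pairs both the determinant and the matrix-vector product come out as zero (up to rounding), while for the third neither does: a non-zero amplitude vector can only satisfy eq. 2.29 at special values of ω.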
We use the matrix form, because there is a well-rehearsed standard procedure for solving it, which basically is a souped-up version of the rule that a product is zero when one of its factors is zero. So, we want the product of a matrix and a vector to equal the zero vector. One way of making the product zero is to make the A-vector zero (i.e., $A_1 = A_2 = 0$). That means the oscillator has zero amplitude, i.e. it is at rest. That is the so-called trivial solution, not quite what we had in mind. The other possibility is that the matrix somehow is ‘zero’. Only, we can’t make all elements of the matrix 0: the off-diagonal elements, for example, are -k’ ≠ 0. Instead, we need the determinant of the matrix to equal 0. In a 2x2 matrix, the determinant is the product of the diagonal elements minus the product of the off-diagonal elements:

(eq. 2.30) $(-m\omega^2 + k + k')^2 - k'^2 = 0$

Eq. 2.30 is a characteristic equation again, namely a quadratic equation for the only unknown, ω. (Strictly, it’s a quadratic eqn. for $\omega^2$, but as ω has to be positive, just solve for $\omega^2$ and only consider the positive roots.)

Exercise: Solve eq. 2.30.

Eq. 2.30 has two solutions for ω, namely:

(eq. 2.31) $\omega_S = \sqrt{\frac{k}{m}}$ and $\omega_A = \sqrt{\frac{k + 2k'}{m}}$

The first solution, $\omega_S$, is the same as for the uncoupled oscillator. We can substitute this solution back into eqn. 2.29 to mathematically work out a relationship between $A_1$ and $A_2$, but I hope as physicists, we can do better, simply by interpreting the result. $\omega_S$ does not contain k’. The only way the coupling spring k’ can possibly not matter is that it never gets stretched! Both masses oscillate in phase with the same amplitude, hence $A_1 = A_2$.

Exercise: Show that eq. 2.29 is satisfied when $\omega = \omega_S$ and $A_1 = A_2$.

We will call this solution the symmetric solution, hence index S. The other solution, $\omega_A$, contains k’ (in fact, weighs it double compared to k), so the coupling spring will get stretched. We can substitute ωA into eq.
2.29 to find the relation between A1 and A2 mathematically.

Exercise: Find the relation between A1 and A2 for ω = ωA.

But we can again use our intuition: the coupling spring k’ is weighted double, compared to k, in ωA. The obvious reason is that it is stretched or compressed twice as far as k. That means the two masses will oscillate 180° out of phase; in other words, A1 = −A2. We call this the antisymmetric solution, hence index A. The frequency of the antisymmetric solution is higher, because now the coupling spring contributes to the restoring force. E.g., if we assume k’ = k, then ωA = √3 ωS.

Together, the two solutions we have found for the coupled oscillator are known as normal modes. Of course, we recall that superpositions of solutions of linear differential equations again are solutions. Hence, the coupled oscillator can undergo a motion that is neither purely symmetric nor purely antisymmetric. However, every motion of the oscillator can be broken down into its (in this case, two) normal modes. The normal modes are a bit like the ‘unit vectors’ of a vector space: not every vector is a unit vector, but every vector can be expressed as a linear combination of the unit vectors. Just like different unit vectors are ‘linearly independent’, the normal modes never mix. If a motion of the coupled oscillator can be broken down at any point in time into, say, 70% symmetric and 30% antisymmetric, then it will always remain like that. In other words, no energy is exchanged between normal modes. Energy may be exchanged between the masses, though.

Applications of the ‘coupled oscillator’ concept include the vibrations of molecules, and of 3-dimensional crystals, consisting of many coupled oscillators (but with a small number of different masses and springs). Of course, strictly this requires quantum mechanical treatment.
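The two normal modes of the simple two-mass system can be spot-checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the numerical values for m, k and k’ are assumptions chosen for illustration only. It evaluates the frequencies of eq. 2.31 and confirms that the left-hand sides of eq. 2.29 vanish for A1 = A2 at ωS and for A1 = −A2 at ωA.

```python
import math

def normal_mode_frequencies(m, k, k_c):
    """Frequencies from eq. 2.31; k_c stands for the coupling spring k'."""
    omega_s = math.sqrt(k / m)               # symmetric mode, k' drops out
    omega_a = math.sqrt((k + 2 * k_c) / m)   # antisymmetric mode
    return omega_s, omega_a

def residual(m, k, k_c, omega, a1, a2):
    """Left-hand sides of the two lines of eq. 2.29 for given amplitudes."""
    d = -m * omega ** 2 + k + k_c            # diagonal element of the matrix
    return (d * a1 - k_c * a2, -k_c * a1 + d * a2)

# Hypothetical values: unit mass, coupling spring as stiff as the outer ones.
m, k, k_c = 1.0, 4.0, 4.0
w_s, w_a = normal_mode_frequencies(m, k, k_c)

print("omega_S, omega_A, ratio:", w_s, w_a, w_a / w_s)
print("symmetric residual:    ", residual(m, k, k_c, w_s, 1.0, 1.0))
print("antisymmetric residual:", residual(m, k, k_c, w_a, 1.0, -1.0))
```

Both residuals come out as zero up to rounding, and for k’ = k the frequency ratio reproduces the factor √3 quoted above.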
The conclusions about breaking down oscillations into normal modes nevertheless carry through to the quantum mechanical treatment, except that the resulting normal modes are quantised, i.e. they can only carry integer multiples of a basic unit of energy. Quantised normal modes in a crystal are known as phonons. In thermodynamics, normal modes are also known as degrees of freedom.

Like in our very simple example, normal modes in molecules or crystals will either be symmetric or antisymmetric. The symmetry of the mode has important consequences: a symmetric mode never has an oscillating dipole moment associated with it, while an antisymmetric mode may have. Therefore, antisymmetric modes can couple to electromagnetic radiation (i.e., they can absorb or emit radiation, typically in the infrared (IR)); symmetric modes can’t. Similarly, the symmetric / antisymmetric phonon modes in crystals are called ‘acoustic’ and ‘optical’ modes, respectively. Infrared spectroscopy is an important tool in analytical chemistry, as the precise location of resonances in the IR spectrum reveals the presence, and concentration, of certain chemical groups, but IR spectroscopy is ‘blind’ to symmetric modes. Highly symmetric, small molecules do not have an antisymmetric mode, hence don’t absorb or emit IR. Examples: N2, O2, the most common molecules in our atmosphere. However, CO2 does have an antisymmetric mode, and does absorb in the IR: global warming!

The ‘equipartition theorem’ of thermodynamics says that in thermodynamic equilibrium, all degrees of freedom on average have the same energy. This can only be true if there is a way of distributing energy between the degrees of freedom = normal modes. But we concluded earlier that normal modes don’t mix. How do these two statements fit together? Normal modes strictly don’t mix as long as the forces are precisely harmonic, F = −kx. Practically, that is never true.
In particular at large x, every real ‘spring’ will have anharmonic contributions to the restoring force, i.e. contributions that are not exactly proportional to displacement. Anharmonic contributions to the force law do allow the ‘mixing’ of normal modes, that is, the transfer of energy between them. Still, coupling will be weak, and the exchange of energy will be ‘slow’, which means it takes much longer than the period of the oscillation.

Exercise: If binding forces in crystals were precisely (not just approximately) harmonic, crystals would have zero thermal expansion. Explain that!

Parametric oscillator

Another important aspect is the parametric oscillator. A mechanical example is a playground swing. Did you ever wonder how you can get yourself to swing high on a playground swing when you sit on it, feet off the ground? The trick is that you periodically change the ‘parameters’ of the oscillator: by moving your centre of gravity, you effectively change the length of the swing’s ropes, and hence ω0 and ωr! In this way, you can amplify the amplitude of pre-existing oscillations (and a small amplitude oscillation always pre-exists, e.g. from when you sat down, or because the wind blows you a little). The key ‘trick’ is that you have to change the ‘parameter’ with twice the frequency of the average ω0. Think how you swing your legs forward and back during every single swing cycle. It takes a while to learn that, which is why young children want pushing. The optical parametric oscillator is a key tool in modern laser physics, enabling frequency mixing and photon entanglement.

3 Waves

Definitions and Terms

Assume a medium of infinite size that is at rest and in equilibrium. For simplicity’s sake, we will assume a one-dimensional medium, and treat waves in 3 dimensions ‘anecdotally’ rather than precisely. The medium can be ‘deflected’ or ‘distorted’ somehow away from its equilibrium state. Initial deflection is described by a function y = f(x,t=0).
(Note: Often it is useful to write f(x,t=0) = A g(x,t=0), wherein A is a constant physical quantity (or a vector) with units appropriate to the respective deflection, and g a dimensionless mathematical function.) There will be an energy associated with this distortion, typically scaling with A². In many cases, the initial deflection will not be stationary, nor will it simply return to equilibrium. Instead, it will travel along the medium. Such a travelling distortion is a wave. Waves are of course closely related to oscillations; the difference is that oscillations don’t travel. There is, however, a hybrid between the two, the so-called ‘standing wave’, which we will discuss briefly later.

If the deflection is parallel to the direction the wave propagates, we speak of a longitudinal wave. If the deflection is orthogonal to the direction the wave travels, the wave is called transversal. Note there are two directions orthogonal to any given direction of propagation; therefore, there are two possible, mutually perpendicular directions for the distortion. Which of these two possible directions the distortion has is known as the polarisation of the wave. (Careful: polarisation is a term that even within physics has more than one meaning. It is not to be confused with the polarisation of a dielectric.)

Let’s illustrate a travelling distortion. At first, we assume it does not change in shape as it travels, only in location. The deflection that was peaked at x = 0 for t = 0 peaks at Δx a short time Δt later, while its shape has not changed; it has travelled with velocity c = Δx/Δt. From this simple observation we can reach an important conclusion: the function f(x,t) that describes the deflection must be of either of the two forms:

(eq. 3.1)
$$y = f(x - ct) \quad\text{or}\quad y = f(x + ct)$$

Wherein the ‘−’ describes a wave travelling from left to right, and the ‘+’ describes a wave that travels from right to left.
In either case, f is a function of x and t only in their combination x − ct or x + ct, not of x and t separately or in any other combination.

Exercise: Only one of the following three describes a wave: f1 = A/(x + ct²), f2 = B/(x + ct)², f3 = C exp(at) sin(x + ct). Which?

Mathematically, f is a function of only one variable (x ± ct), not two (x and t). Physically, this ‘lumping’ of two variables into one precisely captures the most important aspect of a wave, namely, that it travels in space and time. To observe a wave, you can take a snapshot at one fixed time of the wave in all space. Since x ranges over all space (−∞ to ∞), you get the full range of x ± ct, and hence the ‘full picture’ of the wave. Or, you can observe at a fixed point in space, and let the wave wash over you for a long time (strictly speaking, forever). You then again get the full range of the combined variable x ± ct, hence the full picture of the wave. Waves travel, so you don’t have to.

The description of a wave as f(x ± ct) leaves a loose end to tie up: the combined variable, (x ± ct), has a unit.

Exercise: What are the dimensions of x + ct?

It is not generally convenient to feed a variable with units into a mathematical function. We know sin(π), but what is sin(π meters)? exp(seconds)? In other words, mathematical functions like sine or log or exponential assume a dimensionless argument. We therefore introduce a constant, k, with dimension L⁻¹, to multiply with x ± ct before feeding it into the wave function. Surely, a function of k(x ± ct) is a function of x ± ct, since k is a constant. Further, we introduce the definition ck = ω. So, we take the product of our constant k with c, and call it ω. Note, ω = ck cannot be ‘derived’ somehow. It is a definition of ω, hence always true, and shall be memorised!

Exercise: What are the dimensions of ω?

It is of course no coincidence that ω gets the same symbol as the angular frequency of an oscillator.
We will later see that c usually is a function of k, and therefore, ω will be as well: ω(k) = c(k)k. But still, ω = ck. Now, we can write our wave function in the form f(kx − ωt), or f(kx + ωt), with the dimensionless group kx ± ωt. The group kx ± ωt that combines time and space into a single variable is so important to the description of waves that we give it a name: kx ± ωt is known as the phase (ϕ) of the wave. Points in space that have equal phase are known as wave fronts. The wave function has the same value along a wave front, because by definition, it has the same argument. (Note, wave fronts are a sensible concept in 2 or 3 spatial dimensions, not in 1.)

To interpret the meaning of k, ω, and c, we now assume one specific form for the wave function, namely,

(eq. 3.2)
$$f(kx - \omega t) = A\sin(kx - \omega t)$$

Note it is a big misconception to believe that waves necessarily must be sinusoidal. Our initial sketch was a finite-size ‘hump’ with a single peak, not even periodic! Nevertheless, it is instructive to discuss waves in terms of sine functions. The deeper reason why waves are commonly treated as sinusoidal waves is Fourier’s theorem. According to Fourier, every function can be decomposed into a superposition of sine (and cosine) waves of different ω (this is known as Fourier analysis). So, at least in principle, the problem of wave propagation can be considered solved if we can describe the propagation of sine (and cosine) waves: take whatever wave function we have (e.g., a ‘hump’), decompose it into its sine/cosine components, let these components propagate, and in the end, add them up again (this is known as Fourier synthesis). This may be awkward, and indeed, issues arise when waves of different ω propagate with different c. This is known as dispersion, and we will discuss it in some detail later. However, at least conceptually, the complete problem is solved if it is solved for sinusoidal waves; that is why ‘sine’ and ‘wave’ go together so well.
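The role of the phase can be made concrete with a small numerical sketch. It is an illustration only: the helper names, the ‘hump’ shape and the values of k and c are assumptions, not taken from the notes. The point is that any function of kx − ωt, sinusoidal or not, has the same value at (x, t) and at (x + cΔt, t + Δt) once ω = ck, because the phase is unchanged.

```python
import math

def travelling(shape, x, t, k, omega):
    """Evaluate a wave f(kx - omega*t) travelling from left to right."""
    return shape(k * x - omega * t)

def hump(phase):
    """A single, non-periodic peak (hypothetical example shape)."""
    return math.exp(-phase ** 2)

k = 2.0          # wave number in 1/m (assumed value)
c = 3.0          # phase velocity in m/s (assumed value)
omega = c * k    # the definition omega = c k

x, t, dt = 0.7, 0.2, 0.05
for shape in (math.sin, hump):
    before = travelling(shape, x, t, k, omega)
    after = travelling(shape, x + c * dt, t + dt, k, omega)
    # Moving along with the wave leaves the phase, and so the value, unchanged.
    print(shape.__name__, before, after)
```

The two printed values agree for both shapes (up to rounding), which is the statement that a point of constant phase moves with velocity c.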
We know that mathematically, the sine is periodic with 2π. On the other hand, physically, sine waves are periodic in space with a wavelength, λ, and periodic in time with a period, T. Let’s equate the mathematical and physical periodicity. First, periodicity in space:

(eq. 3.3)
$$k(x + \lambda) - \omega t = kx - \omega t + 2\pi$$

That is, if we move on in space by wavelength λ, the phase has to move on by 2π. We can cancel most of the terms in eq. 3.3 to arrive at the important equation:

(eq. 3.4)
$$k = \frac{2\pi}{\lambda}$$

This gives meaning to the constant k we had previously introduced only to fix the unit problem. k is confusingly called wave number although it has a unit. For three-dimensional waves, k becomes a vector, the so-called wave vector. The modulus of wave vector k will again equal 2π/λ, but k also has a direction, namely, the direction the wave propagates into. In the phase of the wave, the product kx is replaced by the scalar product k·x (or k·r) between wave vector and location vector. A completely analogous reasoning can be applied to periodicity with time, T, resulting in eq. 3.5.

Exercise: Apply the above reasoning to periodicity with time, T, to derive eq. 3.5.

(eq. 3.5)
$$\omega = \frac{2\pi}{T}$$

ω is known as angular frequency, just as it was for oscillations. That is why it even gets the same letter, ω. From the previous definition, ω = ck, we now arrive at

(eq. 3.6)
$$c = \frac{\omega}{k} = \frac{\lambda}{T} = \lambda f$$

c is known as the phase velocity of the wave, and f = 1/T is the frequency. (Not to be confused with the wave function f = f(kx ± ωt).) You see that ω = ck is just another way of saying c = λf. Note that a phase is a mathematical, not a physical object. Phase is dimensionless; it has neither mass, momentum, nor energy. The phase velocity is therefore not the speed of any physical object. On the other hand, waves very much are physical objects; in particular, they do transport momentum and energy. Phase velocity can therefore not be the last word on how fast a wave travels.
Phase velocity does describe how fast a sine wave travels, another good reason for using sine waves as examples when exploring the properties of waves. But for other waves, we need to think harder. Prepare for a revisit later.

The wave equation

We have already reached a rather advanced description of waves. To give this description a sounder footing in theory, we are again looking for a differential equation. Remember, differential equations are descriptions of functions; the solution is a function that answers to the description. We already have established what functions are solutions: all those that have as their argument time and space only in the combination of the phase ϕ, kx − ωt or kx + ωt. So, here we are looking for a differential equation that describes a specific argument, rather than a specific function. The following partial differential equation fits the bill:

(eq. 3.7)
$$\frac{\partial^2 f}{\partial t^2} = c^2\frac{\partial^2 f}{\partial x^2}$$

Eq. 3.7 holds true for any function, f, as long as it is a function of the phase, ϕ. To show that, we need to apply the chain rule to f = f(ϕ), and ϕ = kx ± ωt:

(eq. 3.8)
$$\frac{\partial f}{\partial t} = \frac{\partial f}{\partial \phi}\frac{\partial \phi}{\partial t} = \pm\omega\frac{\partial f}{\partial \phi} \quad\text{and}\quad \frac{\partial f}{\partial x} = \frac{\partial f}{\partial \phi}\frac{\partial \phi}{\partial x} = k\frac{\partial f}{\partial \phi}$$
hence:
$$\frac{\partial^2 f}{\partial t^2} = \omega^2\frac{\partial^2 f}{\partial \phi^2} \quad\text{and}\quad \frac{\partial^2 f}{\partial x^2} = k^2\frac{\partial^2 f}{\partial \phi^2}$$

Substitute the bottom line of eq. 3.8 into 3.7, and you will see ∂²f/∂ϕ² will cancel out, whatever f is. Use c = ω/k to find eq. 3.7 is fulfilled. Eq. 3.7 prescribes the argument of the functions to be the phase; it does not prescribe any particular function. Exactly what we ordered! Eq. 3.7 is known as the wave equation, one of the key equations of physics. In three dimensions, the wave equation is extended to

(eq. 3.9)
$$\frac{\partial^2 f}{\partial t^2} = c^2\left(\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}\right) = c^2\nabla^2 f = c^2\Delta f$$

with the Laplace operator, Δ, which is defined as the square of the del operator. Note that even the 3-dimensional wave equation usually is an oversimplification: e.g., like oscillators, real waves are usually damped, hence travel only a finite distance.
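Returning to the claim that eq. 3.7 holds for any function of the phase: this can be spot-checked numerically for a non-sinusoidal shape. The sketch below is an illustration under stated assumptions (a Gaussian ‘hump’, arbitrary values for k and c, and a simple central finite-difference estimate of the second derivatives); it is a plausibility check, not a proof.

```python
import math

def hump(phase):
    """A single, non-periodic peak; any smooth function of the phase would do."""
    return math.exp(-phase ** 2)

def second_derivative(g, u, h=1e-3):
    """Central finite-difference estimate of g''(u)."""
    return (g(u + h) - 2.0 * g(u) + g(u - h)) / h ** 2

k, c = 2.0, 3.0      # assumed wave number (1/m) and phase velocity (m/s)
omega = c * k        # definition omega = c k

def f(x, t):
    """Wave function f(kx - omega*t) built from the hump shape."""
    return hump(k * x - omega * t)

x0, t0 = 0.5, 0.1    # arbitrary point in space and time
d2f_dt2 = second_derivative(lambda t: f(x0, t), t0)
d2f_dx2 = second_derivative(lambda x: f(x, t0), x0)

# Left- and right-hand side of eq. 3.7; they agree up to discretisation error.
print(d2f_dt2, c ** 2 * d2f_dx2)
```

The two numbers agree to within the small error of the finite-difference approximation, for a function that is neither periodic nor sinusoidal.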
To describe damping, we would again have to add first derivative terms to 3.7 or 3.9, but we won’t do that here. Also, not all waves are described by 3.7: for example, the namesake of the entire discipline, waves on the surface of water, are described by the much more difficult Korteweg-de Vries (KdV) equation. Sine waves are not solutions of the KdV equation (ouch!) and can at best serve as an approximation for very low amplitudes. Still, the terms and concepts we have developed and will develop in this chapter (ω, k, phase velocity, group velocity, dispersion) can be applied to water (and other) waves.

You may question if the exercise of finding a differential equation is worthwhile when we already know the solution. The solution is dead simple: any function that is a function of the phase is a wave. Why do I need the differential equation? Why find a question when we already know the answer? The reason why we want eq. 3.7 is that it provides a bridge between the general and the specific. Given any specific medium, will it sustain waves? The answer is: whenever the possible ‘deflections’ of the medium (fields, pressures, strains, …) can be shown to satisfy an equation of the general form 3.7, that is, there is a relation between the second derivative with respect to space and the second derivative with respect to time, then this will be a wave-sustaining medium. Of course, for any specific medium, we will not end up with an equation that says c² between the two second derivatives, but some constant that depends on the properties of the specific medium. That means we have found c for that medium!

Exercise: According to Maxwell, in vacuum, magnetic induction B and electric field E are related by the following two equations (simplified to 1 dimension):

$$\frac{\partial E}{\partial x} = -\frac{\partial B}{\partial t}, \qquad \frac{\partial B}{\partial x} = -\mu_0\varepsilon_0\frac{\partial E}{\partial t}$$

Manipulate these two equations so that you find a relation between the second derivative with respect to time and the second derivative with respect to space for E.
Show that it is a wave equation, and give the phase velocity of the corresponding waves.

We will discuss examples of media that can sustain waves. But first, another loose end to tie up. The waves we have discussed so far are waves in infinite media. That means eq. 3.7 has initial conditions (f(x, t=0)), but no boundary conditions: an infinite medium has no boundaries. Quite a different situation arises when we discuss a finite medium with boundaries, e.g. solid walls. This adds a boundary condition, f(x=boundary, t) = 0, to eq. 3.7. A differential equation is only complete when initial and boundary conditions are specified, and indeed, the addition of a boundary condition completely changes the character of the solutions of eq. 3.7. Eq. 3.7 plus boundary conditions leads to so-called standing waves. The term ‘standing wave’ is an oxymoron, since the defining property of a wave is that it travels, but the term is commonly used.

Exercise: A philosophical dispute: Your friend says you can see wild animals in the zoo, in fact, they have a lion. You say, no you can’t. Sure they have a lion, but… Base your reasoning on a definition of the term ‘wild’, and how the zoo’s boundary conditions affect the nature of the lion.

To cope with the presence of boundaries in mathematical terms, you are often encouraged to solve the wave equation by separation of variables. That is, assume your wave function can be written as f(x,t) = T(t)X(x), the product of one function T(t) of time only, and another function X(x) of location only. Lo and behold, separation of variables leads to solutions, but you see that this approach is a head-on collision with our previous demand that waves are described by functions that combine location and time into a single variable, the phase. Funny that separation of variables works at all, when eq. 3.7 was specifically designed to give only solutions that are functions of the combined variable, phase.
Physically, the apparent contradiction can be reconciled by describing the standing wave as the sum of two travelling waves (remember the superposition principle: it applies here as well). The two travelling waves have the same amplitude; only, one travels from left to right (call this the incoming wave), and the other from right to left (you can understand that as the reflection of the incoming wave at the boundary). That is, we superimpose one solution with phase in the form kx − ωt with another with phase kx + ωt. You can separate variables to solve eq. 3.7, and it works, but you don’t have to:

Exercise: Use the mathematical identity sin(kx − ωt) + sin(kx + ωt) = 2 sin(kx) cos(ωt) to come to terms with the standing wave problem. How can apparently separated variables nevertheless be interpreted as phases?

Let us now consider some examples of wave-carrying media.

Waves on strings

A piece of string is held under tension, σ = F0/A (F0 force, A cross-section of string). The string’s material has density ρ. We pluck the string at time t = 0, giving it an initial deflection y(x, t=0) = f(x). In 1st year, you have already discussed that this leads to waves propagating on the string with phase velocity, c:

(eq. 3.10)
$$c = \sqrt{\frac{\sigma}{\rho}} = \sqrt{\frac{F_0}{\mu}}$$

Wherein µ = ρA is the mass/length of the string. Of course, on strings of finite length, we will get standing waves again, with wavelength given by the length of the string.

Exercise: Verify that the units of √(σ/ρ) are m/s.

Exercise: Discuss musical string instruments using eq. 3.10, c = λf, and λ = 2L (wherein L is the length of a vibrating string). How do you tune the guitar with ‘tuning pegs’ (screws) that tighten strings? How do you play it by ‘fretting’ the strings, i.e. shortening L? Why does a bass guitar or double bass have thicker and longer strings than a lead guitar or violin?

Pressure waves in gases and liquids (‘sound’)

Sound waves are an important example of waves; we even have a sense that picks them up.
In a fluid (gas or liquid), sound waves are always longitudinal waves, i.e. the fluid is compressed or expanded along the direction of the wave’s propagation. Assume the gas is contained in a linear tube of cross-section A, wherein it may move forwards or back with velocity, v. We look at a small volume of gas (or liquid), V = Al, between locations x and x+l. At location x, the gas has pressure p(x), and moves with velocity v(x) along the tube (i.e., longitudinally). At x+l, pressure is p(x+l), velocity is v(x+l). Since the pressure at x and x+l may be different, an overall force F acts on V: F = −AΔp ≈ −Al dp/dx = −V dp/dx, assuming l to be small compared to the length scale of pressure change, l << λ, so that Δp ≈ l dp/dx. (We’ll see that l cancels out in the end, so we can always assume it ‘small enough’.) This force will accelerate the mass of gas contained in the volume; acceleration again will be resisted by inertia, F = ma = ρV dv/dt, wherein v is the velocity of the gas. Hence,

(eq. 3.11)
$$\frac{\partial v}{\partial t} = -\frac{1}{\rho}\frac{\partial p}{\partial x}$$

On the other hand, velocity may also be different at x and x+l, hence volume V changes with time, too. Within Δt, length l changes by Δl = (v(x+l) − v(x))Δt = ΔvΔt ≈ l dv/dx Δt. Hence, ΔV = AΔl = V dv/dx Δt, or ΔV/V = dv/dx Δt. The link between this and eq. 3.11 is given by the fact that every volume change causes a pressure change. Volume change and pressure change are linked by the compressibility κ, which is defined as κ = −(1/V) dV/dp, from which follows Δp ≈ −(1/κ) ΔV/V. Compressibility is much larger for gases than for liquids (try to compress a liquid! Remember that in hydrodynamics, liquids are treated as ‘incompressible’).

Exercise: There are two compressibilities, ‘adiabatic’ and ‘isothermal’. Revise the Carnot cycle to remember the meaning of adiabatic and isothermal. The compressibility we need here is the adiabatic compressibility. Why? For the ideal gas, adiabats are given by p ∝ V^(−γ), with the adiabatic exponent, γ. Calculate adiabatic κ for the ideal gas.
We substitute ΔV/V = dv/dx Δt into Δp ≈ −(1/κ) ΔV/V to get Δp = −(1/κ) dv/dx Δt, or

(eq. 3.12)
$$\frac{\partial v}{\partial x} = -\kappa\frac{\partial p}{\partial t}$$

Exercise: Combine eqs. 3.11, 3.12 into a single equation.

Now, take the derivative of 3.11 with respect to x, and of 3.12 with respect to t. The left-hand sides will then be equal, because the order of differentiation does not matter. Consequently, equate the right-hand sides:

(eq. 3.13)
$$\frac{\partial^2 p}{\partial t^2} = \frac{1}{\kappa\rho}\frac{\partial^2 p}{\partial x^2}$$

Compare to 3.7: you see 3.13 is a wave equation for the pressure in the tube. Such waves are known as sound, or acoustic waves. The point of having a general wave equation is illustrated nicely, too. Not only can we tell that fluids can sustain pressure waves, but we can also immediately read off their phase velocity as c = 1/√(κρ). As an example, the adiabatic compressibility of water is κ = 5×10⁻¹⁰ m²/N, and the density of water is 1000 kg/m³. Consequently, sound in water has phase velocity 1.4 km/s.

Exercise: You are assigned to measure the compressibility of a number of hydraulic oils. Given that the compressibility of liquids is extremely low, this is far from easy to do directly. Propose a simple method that avoids use of high pressure rams and ultra-strong vessels.

Exercise: Assuming that phase velocity c in fluids is controlled by compressibility and density, use dimensional analysis to show c = 1/√(κρ). Interpret the dimensionless Mach number from aerodynamics, M = v√(κρ), with v being the velocity of the moving object.

Exercise: Show that for an ideal gas, c ∝ √T, with temperature T. (This is not easy. You need to calculate the adiabatic compressibility of an ideal gas, using the adiabatic equation, p ∝ V^(−γ).) The root-mean-square velocity of molecules in gases also is proportional to √T. Interpret why the phase velocity of sound in a gas is coupled to the rms velocity of gas molecules. It is different for solids. Why?

Mechanical waves in solids

A similar reasoning as for pressure in fluids can be applied to the mechanical tension (σ, also called ‘stress’) in solids.
We find that a solid can also sustain longitudinal mechanical waves, which are also called sound, or acoustic waves. The relevant mechanical property that replaces the inverse of compressibility is the elastic modulus E (also known as Young’s modulus) of the material. E is defined similarly to a spring constant (but has different dimensions!). When a certain tension (stress) σ is applied to a solid of length l, it will stretch by δl; the relative elongation δl/l is called ‘strain’. How big δl is depends on E as follows:

(eq. 3.14)
$$\frac{\delta l}{l} = \frac{\sigma}{E}$$

Eq. 3.14 serves as the defining equation for E. You see it can be rearranged as σ = E δl/l, which compares to Δp = −(1/κ) ΔV/V (apart from the sign). The phase velocity of sound in solids is given by

(eq. 3.15)
$$c = \sqrt{\frac{E}{\rho}}$$

Solids can also sustain another type of wave, known as shear waves. Shear waves are transversal rather than longitudinal waves, and therefore can be polarised. Shear is defined as a ‘sideways’ deformation of a body, rather than expansion or compression. A force F is applied along the surface, A, of a body, not normal to it. F/A = τ is known as shear stress. As a result of shear stress, the body tilts sideways (‘shears’) by a small angle, α. The size of α relates to τ via a property of the solid called shear modulus, G:

(eq. 3.16)
$$\alpha = \frac{\tau}{G}$$

τ, G have the same dimensions as σ, E, respectively, but the direction of the force involved is different. Shear waves propagate with phase velocity c given by:

(eq. 3.17)
$$c = \sqrt{\frac{G}{\rho}}$$

Some materials may display anisotropic shear modulus, i.e. shear modulus is different for the two possible polarisations of shear waves. However, for almost all materials, shear modulus is smaller than elastic modulus, typically E/3 < G < E/2. Consequently, shear waves have lower phase velocity than longitudinal mechanical waves. Shear waves are unique to solids; fluids (liquids or gases) have zero shear modulus and therefore cannot sustain shear waves.

Exercise: Discuss why liquids and gases have zero shear modulus.
Discuss why G < E.

Longitudinal and transversal waves in solids are collectively known as elastic waves. An important example is seismic waves: ‘earthquakes’. A seismic event in the Earth’s mantle generates both longitudinal and shear waves. In general, propagation of seismic waves is quite complicated, because modulus and density depend on depth below ground, the Earth’s mantle is solid but the core is liquid, and waves can travel along the surface as well as through the bulk. But often, a quake hits you twice: the longitudinal ‘primary’ or ‘P’ wave first, then the secondary ‘S’ wave, which is a shear wave. The S-wave usually is more destructive (why?). When a quake hits your building, but does not destroy it, be prepared for a more destructive second hit coming in a few seconds. Get out immediately! Seismic waves hitting the sea bed from below can cause tsunamis. Seismic waves originating from a ‘shallow’ centre, having low frequency and with little surface wave contribution, are the telltale sign of a nuclear test rather than a natural earthquake.

Exercise: Is it the P-wave or the S-wave that causes a tsunami?

Intensity of waves

We will now study the very important behaviour of waves that encounter the boundary between 2 media. Waves transport energy and momentum. The density of the energy flux, that is, energy transported through unit area in unit time, is called the intensity I of the wave. The question is, what fraction of intensity will be transmitted, and what fraction will be reflected, at the boundary?

Exercise: Give the SI units, or the dimensions, of intensity.

To calculate intensity, we first define the energy density w, energy/volume, of a wave. Intensity is then given by how fast the energy density propagates. A harmonic wave, similar to an oscillator, can store energy in two ways: either as kinetic energy (e.g., a moving mass of gas, or piece of string), or as potential energy (e.g., as a compressed gas, or a stretched piece of string).
As the wave propagates, it continuously ‘swaps’ one form of energy for the other; however, at every point in space, the sum of the two is the same at all times. E.g., at the node of pressure, you will have an antinode of velocity (maximum velocity), and vice versa. The energy density w can therefore be calculated in two ways, either from the velocity amplitude (all energy = kinetic), or the pressure amplitude (all energy = potential). Let us develop the concept of intensity using the example of sound waves. For sound waves,

(eq. 3.18)
$$w = \frac{1}{2}\rho v_0^2$$
$$w = \frac{1}{2}\kappa\,\Delta p^2$$

The first line in 3.18 is simply the kinetic energy/volume of the oscillating mass. Density replaces mass, because we are talking of energy density, and v0 is the velocity amplitude, that is, the fastest velocity that the oscillating particle has. Note this has nothing to do with the phase velocity, c (nor has it anything to do with the group velocity we will introduce later). Phase velocity tells us how fast the phase (a mathematical, not a physical, object) propagates along the medium. v0 tells us the maximum velocity of the particles that make up the medium as they oscillate, ‘surfing’ the wave. The second line in 3.18 is the potential energy density of a compressed gas, with Δp the pressure amplitude, that is, the maximum difference between pressure at a given time and place, and the equilibrium or ‘background’ pressure of the respective gas or fluid in the absence of a sound wave.

Exercise: Derive w = ½ κΔp².

Since the two forms of energy have to be equal due to energy conservation, we can equate the two expressions for w in eq. 3.18 to establish a relation between velocity amplitude and pressure amplitude:

(eq. 3.19)
$$\frac{\Delta p}{v_0} = \sqrt{\frac{\rho}{\kappa}} = c\rho = Z$$

Wherein we have used c = 1/√(κρ). The product cρ is known as the impedance Z of the wave-carrying medium (not a property of the wave!); please memorise Z = cρ. Impedance can be generalised to all wave-carrying media.
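As a numerical illustration of eqs. 3.18 and 3.19, the following sketch uses the values for water quoted earlier in the notes (κ = 5×10⁻¹⁰ m²/N, ρ = 1000 kg/m³). The velocity amplitude v0 is a hypothetical value chosen only for the example, and the variable names are assumptions.

```python
import math

kappa = 5e-10    # adiabatic compressibility of water in m^2/N (from the text)
rho = 1000.0     # density of water in kg/m^3 (from the text)

c = 1.0 / math.sqrt(kappa * rho)          # phase velocity, c = 1/sqrt(kappa*rho)
z_from_c = c * rho                        # impedance as Z = c * rho
z_from_ratio = math.sqrt(rho / kappa)     # impedance as Z = sqrt(rho/kappa)

v0 = 1e-3                                 # velocity amplitude in m/s (hypothetical)
delta_p = z_from_c * v0                   # pressure amplitude from eq. 3.19

w_kinetic = 0.5 * rho * v0 ** 2           # first line of eq. 3.18
w_potential = 0.5 * kappa * delta_p ** 2  # second line of eq. 3.18

print("c =", c, "m/s")
print("Z =", z_from_c, "=", z_from_ratio, "kg m^-2 s^-1")
print("w =", w_kinetic, "=", w_potential, "J/m^3")
```

Both ways of writing Z give the same number, and the kinetic and potential expressions for the energy density agree once Δp and v0 are linked by Z.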
Impedance is always given by the ratio of the two quantities that characterise the different forms in which the wave stores energy. For sound waves, it has dimensions M L⁻² T⁻¹, but not necessarily for other waves. Electromagnetic waves store energy either as electric field or as magnetic field. The ratio between the two field amplitudes E0 and H0 is given by:

(eq. 3.20)
$$\frac{E_0}{H_0} = \sqrt{\frac{\mu\mu_0}{\varepsilon\varepsilon_0}} = Z$$

In vacuum, µ = ε = 1, and the impedance for electromagnetic waves in vacuum is given by

(eq. 3.21)
$$Z_{vac} = \sqrt{\frac{\mu_0}{\varepsilon_0}} = 376.7\,\Omega$$

Note the different unit! In materials, usually µ ≈ 1, but ε > 1, hence Z < Zvac. Also, please note that Zvac has units Ohm (Ω), but it is not an ohmic resistance, as in V = RI.

Exercise: What is the ohmic resistance of vacuum?

Exercise: A coaxial cable used to link oscilloscopes to electronic circuits is specified ‘50 Ω’. This is NOT an ohmic resistance (e.g., cables of different lengths are all specified 50 Ω, but ohmic resistance is proportional to length). What does the ‘50 Ω’ stand for?

Finally, we calculate intensity as the velocity with which energy density propagates. Again, we will assume a harmonic wave. For harmonic waves, speed of propagation is given by the phase velocity, c. Just to remind you, that is not always precisely true for other waveforms; we’ll return to that later. Consequently,

(eq. 3.22)
$$I = wc$$

We now combine eqs. 3.18, 3.19, and 3.22 into

(eq. 3.23)
$$I = wc = \frac{1}{2}\rho c v_0^2 = \frac{1}{2}Z v_0^2$$

Transmission and reflection of waves

The reason to introduce a quantity called impedance, rather than just leaving the product cρ in eq. 3.19, is that impedance is the quantity that controls the transmission and reflection of waves that encounter the boundary between two different media. Assume a wave encounters the boundary between two media, which have different impedances Z1 and Z2, respectively. Part of the wave’s intensity may transmit from one medium to the other, part may be reflected.
Probably without realising it, you have already met two special cases of this situation, namely Z2 = 0 and Z2 → ∞. You know these as ‘loose’ end and ‘fixed’ end.

Exercise: What fraction of the wave’s intensity is reflected / transmitted at a loose end, and at a fixed end?

Here, we discuss the general situation, with 0 < Z2 < ∞, e.g. a thin rope linked to a thicker one, neither dangling loose nor linked to a solid wall. We call the incoming, transmitted, and reflected intensities Ii/t/r, and the respective velocity amplitudes vi/t/r. From energy conservation,

(eq. 3.24)
$$I_i = I_r + I_t \;\Leftrightarrow\; \frac{1}{2} Z_1 v_i^2 = \frac{1}{2} Z_1 v_r^2 + \frac{1}{2} Z_2 v_t^2 \;\Rightarrow\; Z_1 (v_i^2 - v_r^2) = Z_2 v_t^2$$

where we have used the relation between intensity, impedance, and velocity amplitude, eq. 3.23. Also, at the interface, deflection has to be continuous, that is equal in both media (if it weren’t, that would mean the interface has ruptured, something we assume not to happen). As deflection is equal at zero time, the only way it can be equal at all times is if the velocity amplitudes at the interface satisfy eq. 3.25:

(eq. 3.25)
$$v_i + v_r = v_t$$

Note that vi and vr may have opposite sign. Between eqs. 3.24 and 3.25, all velocity amplitudes can be eliminated to express both Ir and It in terms of Ii.

Exercise: Between eqs. 3.24 and 3.25, eliminate all velocity amplitudes and express both Ir and It in terms of Ii. Hints: First, divide 3.24 by 3.25 to get a new equation that links vt, vr, and vi. Then, between (3.25) and the new equation, express vt and vr in terms of vi. Enter these expressions for vr and vt into the expressions for Ir and It.

We then define the transmission coefficient t, and the reflection coefficient r, as the ratio of transmitted and reflected intensity, respectively, to incoming intensity. Overall, we find:

(eq. 3.26)
$$t = \frac{I_t}{I_i} = \frac{4 Z_1 Z_2}{(Z_1 + Z_2)^2}$$
$$r = \frac{I_r}{I_i} = \frac{(Z_2 - Z_1)^2}{(Z_1 + Z_2)^2}$$

We see that t and r are given by the impedances of the media. It is eq. 3.26 that makes the concept of ‘impedance’ so important.
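Eq. 3.26 is easy to put to work numerically. The following is a minimal sketch only; the function name and the impedance values (in arbitrary but equal units) are made up for illustration.

```python
def reflection_transmission(Z1, Z2):
    """Intensity reflection and transmission coefficients (r, t) from eq. 3.26.

    Z1 is the impedance of the medium the wave arrives in, Z2 that of the
    medium behind the boundary. Both must be finite and not both zero.
    """
    denominator = (Z1 + Z2) ** 2
    r = (Z2 - Z1) ** 2 / denominator
    t = 4.0 * Z1 * Z2 / denominator
    return r, t

# Illustrative values: the second medium has three times the impedance of the first.
r, t = reflection_transmission(1.0, 3.0)
print("r =", r, " t =", t)
```

For this particular pair of impedances, a quarter of the incoming intensity is reflected and three quarters are transmitted. Only the ratio Z2/Z1 matters, so the units of the impedances drop out.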
The reflection and transmission of light at interfaces between transparent media is controlled by a very similar equation; you have to replace the impedances by the refractive indices. (NB all these equations apply to ‘normal incidence’; it is more intricate under an angle.)

Exercise: What fraction of light is reflected at the interface between air (nair = 1) and a glass window (nglass = 1.56)? What fraction is reflected at the interface glass-to-air? What percentage of daylight do you lose behind closed double-glazed windows?

Let us look at a number of special cases, and check that eq. 3.26 reproduces what we already know. Firstly, we check whether r + t = 1 always holds. That is energy conservation, in other words.

Exercise: Show that r + t = 1 for all Z1 and Z2.

When Z1 = Z2, we do not actually have a boundary; hence, there should be no reflection, and full transmission.

Exercise: Check that for Z1 = Z2, r = 0, and t = 1.

Exercise: What t, r do you expect for Z2 = 0, and Z2 → ∞? Check that eq. 3.26 gives the expected results.

Although r = 1, t = 0 for both Z2 = 0 and Z2 → ∞, there is an important physical difference between the two cases. For Z2 → ∞, vt has to remain zero at all times; otherwise, some intensity would be transmitted, but we know t = 0. In other words, there has to be a node at the interface. Consequently, vr = -vi, which means the reflected wave is 180 deg. (π) out of phase with the incoming wave (remember we are discussing sine waves only!). Even when Z2 is not infinite, when a wave is reflected at the interface to a higher impedance medium, the reflected part of the wave always undergoes a ‘phase jump’ of π. For Z2 → 0, however, no phase jump is required, because a zero impedance medium carries no intensity, whatever the velocity amplitude.

Exercise: If there are 3 media with two successive junctions, and you wish to calculate total transmission, do you add the transmission coefficients at either junction, or multiply them, or something else?
Exercise: When Z1 and Z2 are swapped (e.g., the wave approaches the same junction from the other medium), how do r and t change?

Dispersion of waves

Back to the ‘hump’ we had used in the very beginning. A frequent observation is that such ‘humps’ or ‘wave packets’ spread on propagation. That is, as the peak propagates in space, the shape of the ‘hump’ becomes broader and flatter. A typical example is thunder: The closer you are to a lightning strike, the louder and briefer the thunder is, a loud ‘bang’. Thunder from a distance, however, is a much longer grumble. We understand that intensity will reduce with distance for 3-dimensional waves (I ~ 1/r^2), but why is it a lengthy grumble, rather than a brief, albeit not so loud, bang?

This is clearly something that will not happen to pure sine waves, and our discussion now has to go beyond them. Following Fourier again, our ‘hump’ can be described by a superposition of sine waves of different frequency. The broadening of the hump can be described by a phenomenon called dispersion: Phase velocity, c, is not a constant; instead, c = c(k), hence ω = c(k)k = ω(k). Consequently, the different Fourier components propagate with different phase velocities: That is how the hump spreads. There is only one ‘medium’ that shows no dispersion at all, that is vacuum for light waves. All mechanical waves will experience dispersion at high frequencies, e.g. elastic modulus or compressibility becomes frequency dependent at high frequencies. Light in matter also experiences dispersion, which is evident from the dependency of the refractive index on wavelength; we will discuss that in detail later.

Group velocity

How fast does a wave packet move? The packet is the superposition of sine or cosine waves of different k. Please do not confuse this with interference, which is the superposition of waves with the same k but different origin, or direction.
Interference is one of the most important topics in wave physics, so important that it is comprehensively dealt with in optics and in quantum mechanics, but not here. An example of the superposition of waves with slightly different k (and slightly different angular frequency: ω = ck) is the phenomenon of so-called beats.

Asking how fast the packet moves has no off-the-cuff answer. We need a new concept. We define the velocity of the wave packet as the velocity of its peak. The distance the peak shifts, divided by the time taken, is what we will call the group velocity vG of the wave packet. Group velocity is how fast the wave packet transports energy, or how fast a signal (= information) is transmitted. Note that information needs to be transmitted as a wave packet: a permanent sine wave has no beginning or end, and contains no information. You see that the group velocity, not the phase velocity, is the velocity of a physical object.

We now need a physical reasoning to work out how fast the peak of a packet moves. We cannot just pick the phase velocity of one of the different harmonic components, because different components have different k, and different k have different phase velocity. We consider the simplest possible ‘group’ or wave packet, the superposition of 2 sine waves: the first has angular frequency ω and wavenumber k; the second, ω+Δω and k+Δk, with Δk << k, Δω << ω.

Exercise: How can you work out the position of the peak, x, as a function of time, t?

The peak of this two-wave packet is located at the point xp(t) where the two waves are ‘in phase’:

(eq. 3.27)
$$k x_p(t) - \omega t = (k + \Delta k) x_p(t) - (\omega + \Delta\omega) t \;\Rightarrow\; x_p(t)\,\Delta k = t\,\Delta\omega \;\Rightarrow\; x_p(t) = \frac{\Delta\omega}{\Delta k}\, t$$

Eq. 3.27 tells us that the location of the peak, xp, moves with velocity Δω/Δk. Hence, we define the group velocity vG as:

(eq. 3.28)
$$v_G = \frac{dx_p(t)}{dt} = \frac{\Delta\omega}{\Delta k} \cong \frac{d\omega}{dk}$$

That is, group velocity is the derivative of angular frequency with respect to k. Compare to phase velocity, which is the ratio of angular frequency to k.
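The difference between the ratio ω/k and the derivative dω/dk can be made concrete numerically. The sketch below is an illustration under stated assumptions, not part of the derivation: the function names are made up, the derivative in eq. 3.28 is approximated by a small finite difference Δω/Δk, and the two relations ω(k) are examples chosen here. The first is a linear relation ω = ck with an assumed c = 340 m/s; the second is the textbook relation ω = √(gk) for gravity waves on deep water, which is not otherwise discussed in these notes.

```python
import math

def phase_velocity(omega, k):
    """Phase velocity c = omega / k."""
    return omega(k) / k

def group_velocity(omega, k, relative_step=1e-6):
    """Group velocity, approximating d(omega)/dk by a central difference (eq. 3.28)."""
    dk = relative_step * k
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)

def omega_linear(k, c=340.0):
    """omega = c * k with constant c (assumed value in m/s)."""
    return c * k

def omega_deep_water(k, g=9.81):
    """omega = sqrt(g * k): gravity waves on deep water (example chosen for this sketch)."""
    return math.sqrt(g * k)

k = 2.0  # wavenumber in 1/m, arbitrary illustrative choice
for name, omega in [("linear", omega_linear), ("deep water", omega_deep_water)]:
    print(name, ": c =", phase_velocity(omega, k), " vG =", group_velocity(omega, k))
```

For the linear relation the two velocities coincide; for the square-root relation the group velocity comes out at half the phase velocity, so the peak of a packet lags behind the individual wave crests.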
To calculate group velocity, we first need to express ω as a function of k. ω(k) is known as the dispersion relation for the medium. Dispersion relations play an important role in many areas of physics, and the concept ‘survives’ into quantum mechanics. Quanta carry an energy proportional to their angular frequency, E = ħω, while their wave vector is directly proportional to the momentum, p = ħk. Hence, the plot of energy vs momentum is nothing else but the dispersion relation, ω vs k, with both axes multiplied by a constant, ħ. For example, the dispersion relation for electrons (‘matter waves’) in a solid is known as the solid’s band structure.

It is instructive to write the dispersion relation in the form ω(k) = c(k)k. Remember, phase velocity c is defined as ω/k; therefore, ω(k) = c(k)k interprets dispersion as a k-dependent phase velocity. In the absence of dispersion, c is not a function of k, but a constant; consequently, vG = dω/dk = ω/k = c: Group velocity equals phase velocity, no dispersion, wave packets keep their shape. Constant phase velocity can be a good approximation over a limited range of k; however, the only ‘medium’ that shows no dispersion at all for any k is vacuum as a medium for light waves. You may therefore question our previous equations for the phase velocity of waves on strings, and of sound waves, and rightly so. These are approximations for ‘small’ k, i.e. long wavelength, λ.

Exercise: Roughly at what wavelength do you think the relation c = √(E/ρ) for elastic waves in solids breaks down?

In general, all media other than vacuum will display dispersion, and vG ≠ c. Establishing the dispersion relation is an important and non-trivial exercise in the description of a medium. Note that some dispersion relations contain a point where vG = 0: A wave can stand still after all, without being confined by boundaries. Note how this is different from what we discussed previously as a ‘standing wave’. Application: Mirrorless lasing.
Exercise: Recap the different definitions and meanings of the three velocities relevant for waves: c, v0, vG.

Dispersion of light in matter

We now discuss in depth one example of a dispersion relation, namely the dispersion of light waves in a transparent medium. Light waves in transparent matter can be described as the interaction of a wave, which acts as driver, with stationary oscillators, that is the atoms of the medium. We will see that in this context, we get surprisingly far by describing atoms with classical means. Maxwell’s equations of electromagnetism relate electric and magnetic fields to each other as follows:

(eq. 3.29)
$$\frac{\partial B}{\partial t} = -\frac{\partial E}{\partial x}$$
$$\frac{\partial E}{\partial t} = -\frac{1}{\varepsilon\varepsilon_0\mu\mu_0}\frac{\partial B}{\partial x}$$

Therein ε is the dielectric constant of the medium, and µ the magnetic permeability. We have assumed an isotropic and linear medium, i.e. one where ε and µ do not depend on direction or amplitude. In the same way as we had done for eqs. 3.11 and 3.12, these two equations can be combined into one equation of the form of the wave equation for either E or B. The corresponding phase velocity is given by:

(eq. 3.30)
$$c = \frac{1}{\sqrt{\varepsilon\varepsilon_0\mu\mu_0}} = \frac{1}{\sqrt{\varepsilon_0\mu_0}}\,\frac{1}{\sqrt{\varepsilon\mu}} = \frac{c_0}{n}$$

Wherein c0 is the phase velocity of light in vacuum, c0 ≈ 3 × 10^8 m/s, c is the phase velocity of the light wave in the medium, and n = √(εµ) is called the refractive index of the medium. Dispersion of light waves is usually described by a wavelength-dependent refractive index n(λ).

Exercise: We had previously introduced dispersion as an ω that depends on k. Show that knowledge of n(λ) is equivalent to knowledge of ω(k); hence, the term dispersion relation is appropriate for n(λ).

We will now attempt to describe the qualitative behaviour of n as a function of λ, or rather, first as a function of ω. We assume µ = 1, which is almost precisely true for almost all transparent media. Hence, n = √ε. That leaves us to describe the dielectric constant ε as a function of the angular frequency of the electromagnetic wave, ω.
We assume that all matter is made of electrically charged particles that are bound together, not an unreasonable assumption even within the framework of classical physics. When an electric field E is applied to matter, positively and negatively charged particles will be displaced in opposite directions, resulting in dipole moments p = ex (e charge, x displacement). The density of dipole moments is known as electric polarisation, P.

(eq. 3.31)
$$P = \frac{N}{V}\sum_i e_i x_i$$

Wherein N/V is the number of atoms per volume, ei is the charge of subatomic particle i, and xi is the displacement from equilibrium position of the same particle as a result of the applied electric field, E. The sum adds all subatomic dipole moments in one atom or molecule; the pre-factor N/V counts how many atoms or molecules there are in a given volume. Note this ‘polarisation’ means something rather different from the polarisation of transversal waves. The dielectric constant ε is defined in terms of the polarisation P and the electric field E:

(eq. 3.32)
$$P = (\varepsilon - 1)\,\varepsilon_0 E$$

Exercise: What are the SI units or dimensions of ε?

With the help of eq. 3.31, we now calculate the polarisation resulting from an oscillating electric field, and then equate to eq. 3.32 to obtain an expression for ε. We assume that every charged particle (numbered by a running index, i) is bound to its equilibrium position by a harmonic ‘spring’, and consequently can undergo undamped harmonic oscillations with resonance frequency ω0,i. An electromagnetic wave of amplitude Emax and frequency ω acts like a driver to this sub-atomic oscillator, which therefore responds with driven oscillations of the same frequency as E, and an amplitude xi given by the resonance function, eq. 2.20, setting γ = 0 and F0 = eiEmax. We then substitute that amplitude into eq. 3.31, and lump all particles that have the same resonance frequency into one set.
Sets are labelled by a new index, α, and the number of subatomic particles in each set is given by zα:

(eq. 3.33)
$$x_{max} = \frac{e_i E_{max}/m_i}{\omega_{0,i}^2 - \omega^2} \;\Rightarrow\; P_{max} = \frac{N E_{max}}{V}\sum_i \frac{e_i^2}{m_i(\omega_{0,i}^2 - \omega^2)} = \frac{N E_{max} e^2}{V m_e}\sum_\alpha \frac{z_\alpha}{\omega_\alpha^2 - \omega^2}$$

Therein, me stands for the mass of an electron. Note that we now assume the subatomic particles to be either electrons or protons, and neglect the protons.

Exercise: Why is it justified to neglect the contribution of the neutrons? Why is it justified to neglect the protons?

Still, there may be different ‘kinds’ of electron, with different resonance frequencies, ωα. Of course, both ωα and zα are quantities that ask for an interpretation, a question that cannot be answered within classical physics.

Exercise: Attempt an interpretation of ωα and zα.

In the ‘real’ physics of atoms, a ‘resonance frequency’ ωα corresponds to an atomic transition, and zα is the ‘oscillator strength’ of the transition. Atomic transitions imply the absorption of electromagnetic waves, which would lead us into trouble; we therefore will apply our theory only to frequencies well away from resonance. Feeding eq. 3.33 into 3.32 allows us to extract ε = n^2 as a function of ω:

(eq. 3.34)
$$\varepsilon(\omega) = n^2(\omega) = 1 + \frac{N e^2}{\varepsilon_0 V m_e}\sum_\alpha \frac{z_\alpha}{\omega_\alpha^2 - \omega^2}$$

As long as resonances are far apart, eq. 3.34 describes a set of singularities (‘poles’) at the respective ωα, with change of sign, and n settling back to near 1 far away from resonances. The poles are not observed in reality; they are an artefact of neglecting damping initially.

Exercise: If resonances correspond to absorptions in the ‘real’ physics of atoms, what does ‘damping’ correspond to?

We therefore shall take our results seriously only well away from the resonances. A schematic representation of n^2 vs ω is shown in the lecture. For ω smaller than ωα, n increases with increasing ω (i.e., for wavelengths longer than the absorption wavelength, n decreases with increasing wavelength). This is known as normal dispersion.
Recall Snell’s law, and you’ll find this explains how a prism can split white light into its colours: Blue light is refracted most strongly. For a typical crown glass, the refractive index between red and blue differs by (0.01…0.02); for diamond it’s 0.044, which gives diamonds their ‘fire’. Anomalous dispersion (n decreases with increasing ω) indicates you are approaching an absorption band. Absorption bands have to be avoided in optics, because absorption means your lenses etc. are no longer transparent. If you wish to push optical methods into the UV, you have to look for materials with very high ωα (very short absorption wavelength). Popular choice: calcium fluoride.

Exercise: Why push optics into the UV? Say, 192 nm?

Beyond the highest-frequency absorption band, materials have a refractive index slightly smaller than 1. Therefore, X-rays show rather unusual refraction at the air/material interface: Air or vacuum becomes the optically dense medium. An application of this unusual situation is grazing incidence X-ray diffraction, which is a sensitive probe for surface properties of a sample, rather than bulk properties as with conventional X-ray diffraction.

Exercise: Discuss total reflection for a material with n < 1.

Traditionally, refractive index is expressed as a function of the vacuum wavelength of the wave, λ = 2πc0/ω, instead of ω. Replacing ω in favour of λ in 3.34 leads to eq. 3.35:

(eq. 3.35)
$$n^2(\lambda) = 1 + \sum_\alpha \frac{A_\alpha \lambda^2}{\lambda^2 - \lambda_\alpha^2}$$

Wherein all constants have been lumped into Aα.

Exercise: Discuss the difference between light wavelength in vacuum, and light wavelength in a medium with n > 1. Is there a difference between the angular frequencies in vacuum, and in the medium? Justify why it is smart to discuss the above subject in terms of ω first, and switch to λ only at the very end.

Eq. 3.35 is essentially the same as the empirical Sellmeier equation, only Sellmeier restricted the sum to 3 contributions.
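To get a feeling for the numbers, eq. 3.35 can be evaluated directly. The sketch below is a minimal illustration: the function name is made up, and the single coefficient pair (A = 1.25, absorption wavelength 0.10 µm) is a hypothetical choice, not fitted to any real glass. It is only meaningful well away from the resonances, as discussed above.

```python
import math

def refractive_index(wavelength, terms):
    """Refractive index from eq. 3.35.

    wavelength: vacuum wavelength, in the same length unit as the resonance wavelengths.
    terms: list of (A_alpha, lambda_alpha) pairs, one pair per resonance.
    """
    n_squared = 1.0
    for amplitude, resonance_wavelength in terms:
        n_squared += amplitude * wavelength**2 / (wavelength**2 - resonance_wavelength**2)
    return math.sqrt(n_squared)

# Hypothetical single UV resonance (wavelengths in micrometres), illustration only.
ILLUSTRATIVE_TERMS = [(1.25, 0.10)]

n_blue = refractive_index(0.45, ILLUSTRATIVE_TERMS)
n_red = refractive_index(0.65, ILLUSTRATIVE_TERMS)
print("n(blue, 0.45 um) =", round(n_blue, 4))
print("n(red,  0.65 um) =", round(n_red, 4))
print("difference       =", round(n_blue - n_red, 4))
```

With these made-up coefficients, n is larger for blue than for red, i.e. normal dispersion, and the difference between the two lands in the range quoted above for crown glass. Passing three pairs instead of one gives the Sellmeier form.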
Sellmeier’s equation gives a very good fit to measured refractive index data in transparent media, e.g. optical glasses. An even simpler form emerges if we allow only one absorption, typically in the UV:

(eq. 3.36)
$$n^2(\lambda) = 1 + A\,\frac{\lambda^2}{\lambda^2 - \lambda_{abs}^2}$$

I have downloaded refractive index data for a particular optical glass (Glass type BK7 from Dow Corning).

Exercise: Re-write eq. 3.36 so that the data should fall onto a straight line.

When the data are plotted in the appropriate form, a very good straight line results.

In optics, even what we call ‘normal’ dispersion is a problem, because it leads to chromatic aberration: A lens will not focus white light into a single point. The focal length of a lens depends on the refractive index n of the glass. For normal dispersion, n is larger for blue than for red, so the blue component of white light has a shorter focal length than the red component. This is a nuisance for the design of precise optical instruments. Chromatic aberration was ‘defeated’ by physicist E Abbé, in collaboration with glass technologist C F Schott, working in the factory of Carl Zeiss in Jena, Germany, in the 19th century. Abbé’s ‘apochromats’ combine a strongly focussing (converging) lens made of glass with weak dispersion, and a weakly diverging lens of glass with strong dispersion.

Exercise: Discuss how an apochromat results in a device that, overall, focuses light without chromatic aberration.

With the help of apochromats, Abbé pushed optical microscopy to the theoretical limit of resolution (diffraction limit). Abbé’s work fitted like a glove to the then emerging discipline of microbiology. In 1882, Robert Koch reported on the discovery of Mycobacterium tuberculosis, made with a Carl Zeiss microscope. In those days, about one in seven Europeans died of tuberculosis. Medicine as we know it today is unthinkable without Abbé’s microscope.
4 Fictitious forces

Coordinate systems and frames of reference

A coordinate system is a system that mathematically describes the location of points in space. Coordinate systems are man-made to describe physical events, but they are NOT themselves part of physical reality. The term ‘coordinate system’ is a mathematical term, which is often used synonymously with the physical term ‘frame of reference’. However, it is not quite the same. A coordinate system is a specific way of locating a given point in space. We will soon provide the 3 most common examples: Cartesian, spherical, and cylindrical coordinate systems. Every coordinate system has an origin (point of reference), which can be chosen arbitrarily. The term ‘frame of reference’ refers to the state of motion of the origin of the coordinate system (and sometimes, the state of rotation of the coordinate system). So, we can use different coordinate systems in the same frame of reference, or we can use the same coordinate system in different frames of reference.

While the location of an origin can be chosen arbitrarily, not all choices of origin are equally convenient; you should always seek to exploit arbitrariness to make the most convenient choice. This is a guideline which we will reiterate many times throughout this, and the next, chapter. For example, the potential energy of a loaded spring is often described as V = 1/2 k(x - x0)^2. Therein, the origin is chosen as the point where the spring is anchored. Since the unstretched spring has zero energy at non-zero length, l, the other end of the unstretched spring will be located at position x = x0 = l. This description is correct, but inconvenient. The choice of origin is arbitrary; therefore, it is smarter to locate the origin at the free, rather than the anchored, end of the unstretched spring. When you use that origin of the coordinate system, the energy is simply described by V = 1/2 kx^2. Having removed x0 from the equation will simplify all further calculations.
In 3-dimensional space, we need 3 coordinates to specify a point. The most common coordinate systems are:

Cartesian coordinates

Cartesian coordinates are based on 3 unit vectors; many different conventions to name them are common: i, j, k; x, y, z; ex, ey, ez; e1, e2, e3. Be prepared to encounter any of them. Unit vectors by definition have modulus 1, and are mutually orthogonal. In terms of the ‘inner’ or ‘scalar’ or ‘dot’ product this property can be written as:

(eq. 4.1)
$$\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}$$

Wherein δij is Kronecker’s symbol, equal to 0 for i ≠ j, equal to 1 for i = j. In the lingo of linear algebra, the unit vectors of a Cartesian system form a ‘complete orthonormal set’ (CONS).

Eq. 4.1 still leaves a lot of arbitrariness, or freedom, in the definition of the unit vectors. The direction of the first unit vector (say, ex) is completely arbitrary. So, when we say ‘assume a body moves with constant velocity along the x-direction…’, what we mean is ‘assume a body moves with constant velocity into an arbitrary direction, which for the sake of convenience, we call the x-direction of a coordinate system…’. Note that the first formulation appears to be much less general: what if the body happens to move into the y-direction instead? Until you realise that it is you giving names to directions. What you call the first direction, for the first body, is irrelevant. Only when 2 or more bodies are involved do you have to take care which is x, y, and z.

Once the x-direction is specified, we need to pick the y-direction so that it is orthogonal to the x-direction. This still leaves the freedom of rotating the y-direction around the x-direction into any convenient direction. When the x- and y-directions are specified, the z-direction is uniquely specified apart from its sign: it can still point ‘up’ or ‘down’. The convention is to choose the right-handed coordinate system (1st right hand rule: Thumb x, index finger y, middle finger z of the right hand).
Exercise: When a right-handed Cartesian system is rotated around an arbitrary axis, is the resulting system right- or left-handed, or does it depend on axis and amount of rotation? If a right-handed Cartesian system is ‘inverted’ by reversing the sign of each unit vector (x, y, z → -x, -y, -z, i.e., they start at the same origin but point into the opposite direction as before), is the resulting system right- or left-handed?

Spherical coordinates

First, we define an arbitrary z-axis. Remember ‘arbitrary’ means convenient, e.g. in a rotating body, it will be the axis of rotation. The ‘positive’ direction of this z-axis is defined by a 2nd right hand rule: Point the fingertips of the right hand into the direction of rotation, then the stretched thumb points into the +z direction. Most screws are threaded with a right-handed thread, hence the mnemonic ‘Righty tighty, lefty loosey’. A point in space is characterised by 3 spherical coordinates: the distance, r, of that point from the origin, the angle θ ‘down’ from the z-axis, and the angle φ of rotation around the z-axis in positive sense, as defined by the 2nd right hand rule. Note that the definition of a z-axis is crucial, but z itself is not a coordinate. The zero of φ is another arbitrary definition. The longitude/latitude system of specifying locations on earth is essentially a spherical coordinate system, with r = Rearth = constant taken for granted, and longitude φ = 0 defined by the location of Greenwich. Only, zero latitude is defined as the equator, when it ‘should’ be the North pole, so the θ scale runs from +90° to -90° (+/- called North/South instead), not from 0 to 180°.

Cylindrical coordinates

Define again an arbitrary z-axis. The coordinates of a point in space are the z-coordinate of the point, as it would be in a Cartesian system, the distance ρ from the z-axis (NOT the distance from the origin!), and an angle of rotation φ around the z-axis, as in spherical coordinates.
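The three coordinate systems describe the same points, so one can always translate between them. The following minimal sketch converts spherical and cylindrical coordinates into Cartesians. It assumes the common convention that φ = 0 lies along the x-axis (remember, the zero of φ is arbitrary), angles in radians, and longitude counted positive towards the East; the function names are made up for this illustration.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """(r, theta, phi) -> (x, y, z); theta is measured 'down' from the z-axis."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def cylindrical_to_cartesian(rho, phi, z):
    """(rho, phi, z) -> (x, y, z); rho is the distance from the z-axis."""
    return rho * math.cos(phi), rho * math.sin(phi), z

def latitude_longitude_to_angles(latitude_deg, longitude_deg):
    """Geographic latitude/longitude in degrees -> spherical angles (theta, phi) in radians.

    Latitude is counted from the equator, theta from the North pole,
    hence theta = 90 degrees - latitude. East longitude is taken as positive (assumption).
    """
    theta = math.radians(90.0 - latitude_deg)
    phi = math.radians(longitude_deg)
    return theta, phi

# A point on the equator at the Greenwich meridian, on a sphere of radius 1.
theta, phi = latitude_longitude_to_angles(0.0, 0.0)
print(spherical_to_cartesian(1.0, theta, phi))

# A point at distance 2 from the z-axis, a quarter turn around it, at height 3.
print(cylindrical_to_cartesian(2.0, math.pi / 2, 3.0))
```

Note that ρ and r only agree for points in the plane z = 0; elsewhere r^2 = ρ^2 + z^2.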
Exercise: Just like Cartesian coordinates, spherical and cylindrical coordinates have 3 unit vectors, that is vectors of unit length pointing in the direction of increasing respective coordinate, with the other 2 coordinates fixed. Visualise the unit vectors er/θ/φ for spherical coordinates, and the unit vectors ez/ρ/φ of cylindrical coordinates.

Plus, any other system that uniquely specifies points in space can be used as a coordinate system. Which begs the question, which one to choose? Or, why is it that by default everybody uses Cartesians? Answer: Cartesians are the best coordinate system for handling vectors.

Exercise: Why?

In Cartesians, and only in Cartesians, the unit vectors are equal at every point in space.

Exercise: Go back to the exercise before the previous one and convince yourself that in general, unit vectors in spherical and cylindrical coordinates are different at different points.

This is essential when adding vectors, and calculating the dot product.

Exercise: Imagine what happens if you want to add, or dot-multiply, two vectors that use different unit vectors.

It is revolting. Whenever you write down a vector, you usually imply a Cartesian coordinate system, because handling vector operations is all but impossible in any other system. Which, in turn, begs the question why the other coordinate systems are in use at all. Spherical coordinates in particular (and sometimes, cylindrical coordinates) are well adapted to the symmetry of some physical situations. Both electrical and gravitational fields of point charges (masses) have ‘spherical symmetry’, that is, the force depends on distance, r, only (assuming the arbitrary but convenient choice of origin to coincide with the position of the charge or mass). The two other coordinates (θ, φ) do not appear in the formula at all. If you want to express electrical or gravitational forces in Cartesian coordinates, you need all three coordinates, since r = √(x^2 + y^2 + z^2).
We will return to these so-called ‘central forces’ in chapter 5. So, there is a dilemma. Electrical and gravitational forces are most conveniently expressed in spherical coordinates, but forces are vectors. As soon as you want to calculate the movement of a body under the force, you have to enter the force into Newton’s equation of motion, F = ma. To handle vectors, you would prefer Cartesian coordinates. We will solve this dilemma in chapter 5.

The outer product

Before moving on, we have to revisit the most awkward of all vector operations, even in Cartesians: The outer product between two vectors, c = a × b. The direction of the outer product, c, is defined to be perpendicular to the directions of both a and b, that is, perpendicular to the plane defined by a and b.

Exercise: If a and b are parallel, they do not define a plane. So, the direction of c is not specified. What then?

When a and b are parallel, no direction for c is defined; consequently, c is required to be zero. For non-zero c, the sign of the direction of c is still unspecified. The convention is again that the 1st right hand rule applies. A direct, surprising and important consequence of this definition is that the outer product is anticommutative: a × b = -b × a.

Exercise: Show that the right hand rule implies that the outer product is anticommutative.

That is not maths as you know it: other operations, such as addition, multiplication, and the dot product of vectors, are commutative: a + b = b + a, etc. The strangeness of the outer product does not end here. The definition for the direction of c also implies that the outer product will work for 3-dimensional vector spaces only. In 2-dimensional spaces, all vectors a, b will be in the same plane, but there is no direction perpendicular to that plane the resulting c could point into. In 4 or more dimensions, there are 2 or more directions perpendicular to any given plane, and c again would not be specified. Bizarre! But the worst is yet to come.
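Before going on, the properties stated so far can be checked numerically for a concrete pair of vectors. The sketch below assumes the standard component formula for the outer product in Cartesians (it appears as eq. 4.2 later in this chapter); the helper names and the example vectors are arbitrary choices for illustration. A numerical check for one pair of vectors is of course no substitute for a proof.

```python
def cross(a, b):
    """Outer (cross) product of two 3-component vectors, component by component."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Inner (dot) product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

# Arbitrary example vectors.
a = (1.0, 2.0, 3.0)
b = (-2.0, 0.5, 4.0)
c = cross(a, b)

print("a x b     =", c)
print("b x a     =", cross(b, a))                   # opposite sign: anticommutative
print("a.c, b.c  =", dot(a, c), dot(b, c))           # both zero: c is perpendicular to a and b
print("a x (2a)  =", cross(a, tuple(2 * x for x in a)))  # parallel vectors: zero
```

Swapping the order of a and b flips the sign of every component, the dot products of c with a and with b vanish, and the product of two parallel vectors is the zero vector.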
In theoretical physics, the term ‘vector’ is defined as an object that changes its representation in a coordinate system in a specific way when the coordinate system is transformed (rotated or inverted). In particular, when a coordinate system is ‘inverted’, i.e. the unit vectors are ‘flipped’ from ex, ey, ez to -ex, -ey, -ez, then the representation of a vector also has to change sign: a → -a under coordinate inversion. Note carefully how I say the representation of the vector changes sign. The vector itself does not, because it is part of physical reality, which is not affected by our conventions of how to represent it. The same is true for another vector, b, of course.

Exercise: What happens to the sign of c = a × b under coordinate inversion?

Since both the representations of a and b flip their sign under coordinate inversion, the representation of their outer product does not! The result of the operation ‘outer product’ between two vectors therefore does not qualify as a vector, because its representation does not behave as a vector should under coordinate system inversions (there is no difference in the behaviour under coordinate system rotations, but an inversion can never be achieved by any rotation or sequence of rotations). For such objects, the term pseudovector has been coined.

Exercise: Is the outer product between two pseudovectors a vector or a pseudovector? Is the outer product between a vector and a pseudovector a vector or a pseudovector?

So much for the mathematical properties of pseudovectors, which often confuse on first contact. What does it mean in physical terms? Outer products and pseudovectors always appear when we discuss rotation, rather than linear motion. The physical difference between a vector and a pseudovector is that the arrow tip of a vector defines a direction: Flip vector to -vector, and you reverse movement from left to right into movement from right to left (or up/down, forward/backward).
The arrow tip of a pseudovector specifies a sense of rotation: Flip pseudovector to -pseudovector, and you reverse anticlockwise rotation into clockwise rotation. Typical examples are angular velocity ω, angular momentum L, torque T, and also, the magnetic induction, B.

Exercise: Recall the definitions of L and T. What ‘rotates’ to generate a magnetic field?

Exercise: Compare the reflection of a vector (say, an arrow) in a mirror to the vector itself when the vector either points at the mirror, or runs parallel to the mirror. Then, look at the reflection of a clock face with a moving hand for seconds, when the clock faces the mirror, or stands parallel to it. Discuss the differences.

Electric fields are vectors: E-field lines begin and end on charges, hence they have a direction. B-field lines are always closed loops, as there are no magnetic charges where B-field lines begin or end. Closed loops have no ‘direction’, but they do have a sense of rotation (clockwise or anticlockwise).

Quantitatively, the outer product is characterised by the following eq. 4.2:

(eq. 4.2)
$$|\mathbf{a} \times \mathbf{b}| = |\mathbf{a}|\,|\mathbf{b}| \sin\angle(\mathbf{a},\mathbf{b})$$
$$\mathbf{a} \times \mathbf{b} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \times \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}$$
$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$$

wherein ∠(a,b) means the angle between vectors a and b, and the last line is the outer product written as the determinant of a matrix.

Exercise: Show that the component-by-component representation of a × b agrees with all the properties of the outer product we had introduced: a × b is perpendicular to both a and b; a × b does not change sign when both a and b do change sign; a × b = -b × a; and a × b = 0 when a and b are parallel.

Frames of reference

A physical ‘frame of reference’ is defined by the state of motion of the origin of a coordinate system. There are two fundamentally different types of frames of reference, namely, inertial and non-inertial frames of reference. The difference goes back directly to Newton’s law of inertia.
An inertial frame of reference is one that is not accelerated, i.e., neither does its origin change its state of motion, nor do its coordinate axes rotate. Note that there still are an infinite number of inertial frames: The above says nothing about the state of constant motion that the origin may have, as long as it is not accelerated. Assume two frames of reference, A (using Cartesian coordinates x, y, z) and B (using Cartesian coordinates x’, y’, z’, parallel to x, y, z), that are both not accelerated, but whose origins move relative to each other with constant velocity v along the x/x’ axis (recall what we said before about how to choose axes). As far as the laws of mechanics are concerned, the following two points of view are logically completely equivalent: A says, I am resting, B moves with +v in the x direction; and B says, I am resting, A is moving with −v in the x’ direction. There is no experiment in mechanics that could be used to decide which of the two is ‘right’. Both therefore must be taken equally seriously and must be served equally by whatever physical description we come up with: The Galilean principle of relativity states that the laws of mechanics must be the same in both A and B. In other words, acceleration is absolute, but velocity is relative. Einstein went one step further, claiming that all laws of physics (not just mechanics) must be the same in all inertial frames. All of special relativity can be derived from that apparently small extension of the Galilean principle. But not in this course.

Within an inertial frame of reference, all forces are ‘real’ forces, that is, forces are the result of interactions between bodies (or between bodies and force fields; every force field is generated by another body). Non-inertial frames of reference, on the other hand, are characterised by apparent or fictitious forces, that is, forces that do not originate from other bodies.
For example, if the origin of a frame of reference, C, were to accelerate in the x direction, a body that initially rests at the origin would get left behind in the −x direction. If I take frame C seriously, and insist on the law of inertia, F = ma, I must conclude that a force accelerates the body. But this force has no physical origin, like gravitation or an electric field. Also, there is no ‘reaction’ force (remember ‘equal and opposite’, another of Newton’s principles!). It is therefore a fictitious force, which results from the use of a non-inertial frame of reference. Of course, we understand where the force ‘really’ comes from: The inertia of the body.

In this course, we will discuss rotating frames of reference, and the peculiar fictitious forces that occur in them. One wonders why we should bother at all: after what we said up to here, you would be forgiven for dismissing non-inertial frames altogether. Forces of no physical origin that violate the law of inertia!? Ban them by banning non-inertial frames, problem solved. While this is a logically consistent point of view, it is not very practical.

Exercise: Why is there a practical need to consider non-inertial frames of reference?

We happen to live on a rotating frame of reference. If we insist that e.g. buildings are fixed, nonmoving points of reference, we need to accept that we will experience the fictitious forces that are peculiar to rotating frames of reference. Let us therefore discuss the fictitious forces in a rotating frame of reference, but without acceleration of the origin.

Fictitious forces in rotating frames of reference

We will discuss rotating frames from two points of view: Firstly, looking at them from ‘outside’, using an inertial frame of reference, A, and secondly, from ‘inside’ the rotating frame, B. Assume B is not linearly accelerated but only rotates.
Rotation is characterised by the pseudovector ω, which specifies firstly, an axis of rotation, which we will call the z-axis, and secondly, a ‘speed’ or angular velocity of rotation, given by ω = 2π/τ, with the period of rotation, τ. This is similar to, but not exactly the same as, the angular frequency of an oscillator or wave. We also assume dω/dt = 0, i.e., there is no angular acceleration. Within such a frame of reference, two fictitious forces arise, which are known as centrifugal force and Coriolis force.

Centrifugal force

Assume, from A’s point of view, somebody stands at the origin and swings a mass around him/herself on a string of length R with angular velocity, ω. The mass changes direction all the time, therefore, the motion is accelerated.

Exercise: Show that ω = v/R, with v being the linear velocity of the mass. What is the direction of the mass’s acceleration?

A is an inertial frame, therefore, acceleration requires a real force. This real force is provided by the person standing in the centre, and is known as centripetal force. However, due to the reaction principle, there is an equal and opposite force that the swinging body applies to the person standing in the centre; this is known as centrifugal force. Centrifugal force results from the inertia of the rotating body. We use A’s frame to calculate direction and magnitude of the centrifugal force. Consider the position of the body at 2 points in time, 0 (defined arbitrarily), and briefly later, at Δt. Velocity is always tangential, but in the limit Δt → 0, the direction of Δv becomes radial. Looking at the triangles defined by r(0), r(Δt), and Δr; and v(0), v(Δt), and Δv, we find these are similar triangles. Hence:

\[ \frac{\Delta r}{r} = \frac{\Delta v}{v} \;\Rightarrow\; \frac{\Delta r/\Delta t}{r} = \frac{\Delta v/\Delta t}{v} \;\Rightarrow\; \frac{v}{r} = \frac{a}{v} \;\Rightarrow\; a_{Cf} = \frac{v^2}{r} = \omega^2 r \qquad \text{(eq. 4.3)} \]

\[ F_{Cf} = m\,a_{Cf} = m\frac{v^2}{r} = m\,\omega^2 r \]

Centrifugal acceleration equals ω²r, with ω = v/r. Centrifugal force on an object equals centrifugal acceleration times the mass of the object.
Centrifugal acceleration always points perpendicularly away from the axis of rotation. Note that ‘r’ here may be rather different from ‘r’ in a spherical coordinate system, where it stands for the distance from the origin. ‘r’ in the centrifugal acceleration equals the distance from the axis of rotation (z-axis), like ‘ρ’ in a cylindrical coordinate system. The direction of centrifugal acceleration also equals the radial unit vector in cylindrical, not spherical, coordinates (eρ not er). This is true even when you look at centrifugal acceleration on the surface of a sphere, e.g., Earth. There will be no centrifugal acceleration at the North/South pole, because these are located on the axis of rotation: zero r. The direction of centrifugal force on Earth is in general not exactly opposite to gravity, which points radially to the centre.

Exercise: Where on Earth would you experience maximum centrifugal force? Where is centrifugal force exactly opposite to gravity?

Exercise: At the Earth’s equator, what fraction of g is compensated by the centrifugal acceleration? How short would a day have to be to compensate gravity completely?

Centrifugal force also applies to bodies moving on a curved trajectory within an inertial frame of reference. The radius, r, that goes into eq. 4.3 then would be the radius of curvature (R) of the trajectory. R is the radius of a circle that can be fitted smoothly to a given point of the trajectory. If the trajectory is known, e.g. in the form y(x), there is a mathematical formula for calculating R at any point, but it isn’t pretty. Centrifugal force acts as it would in a rotating frame of reference that spins with ω = v/R, wherein v is the velocity of the body moving along the trajectory. For noncircular (and in particular, non-closed) trajectories, the definition of an angular velocity, ω, may seem questionable. Best avoid it and use aCf = v²/R.
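As an illustration of aCf = v²/R on a curved trajectory, here is a minimal Python sketch. It assumes the standard calculus result for the radius of curvature of a curve y(x), R = (1 + y’²)^(3/2) / |y’’|, which these notes mention but do not write out. The function names and the parabola used as a test trajectory are hypothetical choices for this example only.

```python
import math

def radius_of_curvature(dy_dx, d2y_dx2):
    """Radius of curvature of a trajectory y(x) at one point.

    Uses the standard formula R = (1 + y'^2)^(3/2) / |y''|
    (an assumption of this sketch; it is not derived in the notes).
    """
    if d2y_dx2 == 0:
        return math.inf  # locally straight: infinite radius of curvature
    return (1.0 + dy_dx**2) ** 1.5 / abs(d2y_dx2)

def centrifugal_acceleration(v, R):
    """Magnitude a_Cf = v^2 / R for speed v and radius of curvature R."""
    return v**2 / R

# Hypothetical trajectory: parabola y = x^2 / (2 p) with p = 20 m,
# so y' = x / p and y'' = 1 / p everywhere.
p = 20.0
R_vertex = radius_of_curvature(0.0, 1.0 / p)   # at x = 0 the slope is zero
R_side = radius_of_curvature(1.0, 1.0 / p)     # at x = p the slope is one
a_vertex = centrifugal_acceleration(10.0, R_vertex)  # speed 10 m/s at the vertex

print(R_vertex, R_side, a_vertex)
```

The same speed gives a smaller centrifugal acceleration further up the parabola, where the trajectory is less sharply curved, and none at all on a straight segment.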
However, in general, R is not constant along a trajectory, and velocity v may also change, e.g. if the trajectory runs downhill in the presence of gravity. Therefore, centrifugal force changes as well; it may in fact be zero at some points, and change direction as well as magnitude. You can experience that in a rollercoaster ride.

Coriolis Force

Centrifugal force acts on objects that are stationary within the rotating frame, as well as moving ones. However, there is another fictitious force that acts only on objects that move within the rotating frame. Assume two observers, A and B, located in the centre of a disc that rotates with ω. A uses an inertial frame, while B uses the non-inertial frame of the rotating disc. Fixed to the edge of the disc is a football goal. It appears stationary for B, but rotates with ω for A. B aims a shot dead straight at goal, and misses! No wonder, thinks A. The ball flies dead straight, but the goal rotates away. But for B, the goal is stationary. A mysterious force of no obvious origin pulls the ball away sideways. This fictitious force is known as Coriolis force. Note how this force only arises when objects move within the non-inertial frame.

Exercise: What is the relation between the direction of the Coriolis force, the direction of movement within the rotating frame, and the direction of the axis of rotation (i.e., direction of ω)?

Let us discuss the Coriolis force quantitatively. Within a short time interval Δt, the ball travels radially outwards by x = vΔt, where v is the radial velocity of the object, as observed by both A and B. However, from A’s point of view, within Δt, the disc rotates by angle α = ωΔt. From B’s point of view, the disc is stationary, but the ball is deflected sideways (tangentially) by y = x tanα ≈ xα = vωΔt², using α << 1, which we can always make sure of by choosing Δt short enough.
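The small-angle estimate y ≈ vωΔt² can be checked numerically. The sketch below is one possible way to do so, not part of the original derivation: it takes the ball’s straight-line motion in A’s inertial frame and re-expresses the same point in axes that rotate with the disc. The function name and the values of v and ω are hypothetical, chosen only for illustration, and the disc is assumed to rotate anticlockwise.

```python
import math

def position_in_rotating_frame(v, omega, t):
    """Ball position at time t as seen by B on the rotating disc.

    In A's inertial frame the ball moves in a straight line along x
    with speed v. B's axes are rotated anticlockwise by omega*t
    relative to A's, so the same point has different components for B.
    """
    x, y = v * t, 0.0                        # straight line in the inertial frame
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (c * x + s * y, -s * x + c * y)   # components along B's rotated axes

v, omega = 20.0, 0.5      # illustrative values: 20 m/s and 0.5 rad/s
deflections = {}
for dt in (0.01, 0.02):
    _, y_rot = position_in_rotating_frame(v, omega, dt)
    deflections[dt] = abs(y_rot)
    # compare the exact sideways deflection with the estimate v*omega*dt^2
    print(dt, abs(y_rot), v * omega * dt**2)
```

For these short times the exact sideways deflection agrees with vωΔt² to leading order in α, and doubling Δt quadruples it. The negative sign of the sideways coordinate means that, on a disc rotating anticlockwise, B sees the ball drift to the right of its direction of motion.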
Since tangential deflection is proportional to Δt², B interprets it as the result of an acceleration (constantly accelerated motion gives distance ∝ (time)², as in s = ½gt²). Comparing y = vωΔt² with s = ½at² shows that the magnitude of this acceleration is 2vω. Hence, we get for the Coriolis acceleration, and Coriolis force:

\[ a_{co} = 2 v \omega, \qquad F_{co} = m\,a_{co} = 2 m v \omega \qquad \text{(eq. 4.4)} \]

Eq. 4.4 is only correct if the movement, v, is perpendicular to the axis of rotation, ω, and it does not give the direction of the Coriolis acceleration. If you have answered the exercise above, you may be able to guess the general form of the Coriolis acceleration:

\[ \mathbf{a}_{co} = 2\,\mathbf{v} \times \boldsymbol{\omega}, \qquad |\mathbf{a}_{co}| = 2 v \omega \sin\angle(\mathbf{v}, \boldsymbol{\omega}) \qquad \text{(eq. 4.5)} \]

i.e., Coriolis acceleration is given by twice the outer product of v and ω. Eq. 4.5 defines the direction (via the right hand rule) as well as the magnitude of the Coriolis acceleration. Note it is important to remember the order in an outer product, as it is anticommutative.

Exercise: What is the Coriolis acceleration for an object that moves parallel to the axis of rotation?

Exercise: Accelerations and forces are vectors, not pseudovectors. How can it be that the Coriolis acceleration then is given by an outer product; doesn’t that give us a pseudovector?

It is most instructive to discuss Coriolis forces for objects that move along the surface of Earth (or slightly above, like clouds).

Exercise: Assume Earth were a perfect sphere of radius RE that rotates with ω = 2π/(23 h 56 min) around the North Pole / South Pole axis. Work out the direction of the Coriolis force on an object that moves along the equator, an object that crosses the equator moving due North or due South, and an object that moves across the North Pole.

Exercise: How fast, and in what direction, would you have to move along the equator for the Coriolis acceleration to compensate acceleration due to gravity, g?

Exercise: A high-jumper runs up with 10 m/s ground speed before ‘taking off’. He tries to optimise performance by using the Coriolis force to his advantage.
Where on Earth should he jump, and in what direction should he choose to run up? Assuming his personal best without Coriolis assistance is 2.40 m, can he expect to improve by 1 cm? Compare to the centrifugal force. Also, compare to an athlete of equal ability who chooses sports facilities at 3 km above sea level to reduce gravity, but runs up due north. Who gains more? What if you combine the three effects? Intriguingly, Bob Beamon (long-)jumped a sensational 8.90 m at the Mexico City Olympics in 1968, setting one of the longest-lasting athletics world records ever. Check if that is a ‘suspicious’ location with regards to height, Coriolis effect, and centrifugal force. Assuming he ran up in the ‘right’ direction (which I don’t know), do you think his world record breaking leap was substantially assisted by choice of location, or simply an exceptional performance of an outstanding athlete at the peak of his career?

Coriolis forces make an important contribution to weather phenomena on Earth. As air masses move towards an area of low pressure, the Coriolis force deflects them sideways, resulting in a cyclone. Cyclones are anticlockwise in the northern hemisphere, clockwise in the southern hemisphere.

Exercise: There is another force in physics that only acts on moving bodies, always is perpendicular to the direction of movement, and is quantified by an outer product. Which? How ‘real’ or ‘fictitious’ is that force?

Complete expression for fictitious forces

Our treatment of fictitious forces assumed a rotating frame that is not linearly accelerated, and does not change angular velocity. This introduces the two most important fictitious forces, centrifugal and Coriolis. However, a complete general expression can be derived that drops these assumptions. Here it is:

\[ \mathbf{F} - m\,\mathbf{a}_0 - 2m\,\boldsymbol{\omega} \times \mathbf{v}' - m\,\dot{\boldsymbol{\omega}} \times \mathbf{r}' - m\,\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}') = m\,\mathbf{a}' \qquad \text{(eq. 4.6)} \]

Therein, primed (’) quantities are quantities measured in the non-inertial frame, quantities without prime are measured in the corresponding inertial frame. Eq. 4.6 is the expression that corresponds to the simple F = ma in an inertial frame. F is the real force acting on the body, all other contributions are fictitious forces. a0 is the linear acceleration of the origin of the non-inertial frame. The next term is the Coriolis force; note that v is primed, corresponding to motion within the non-inertial frame, while ω is not primed. The following term is another fictitious force known as transverse force, which is present when there is an angular acceleration dω/dt ≠ 0, i.e. the rotating frame changes angular velocity. Finally, the centrifugal force. Note the double outer product, which firstly, clarifies the direction, and secondly, calculates the distance of r’ to the axis of rotation. All this equals ma’, that is, mass times the acceleration as observed in the non-inertial frame.

5 Mechanics à la Lagrange

Newton formulated the most basic equation of motion in mechanics,

\[ \mathbf{F}_i = m_i\,\mathbf{a}_i = m_i\,\ddot{\mathbf{r}}_i, \qquad \text{Note: } \mathbf{F}_i = \mathbf{F}_i(\mathbf{r}_1, \ldots, \mathbf{r}_n) \qquad \text{(eq. 5.1)} \]

wherein every body involved in the system gets a number, index i, running from one to n. This describes the acceleration of body i under the influence of the force acting on it. To completely formulate the equation of motion for a set of interacting bodies, we still need to know the forces. These will generally depend on the distance of body i from all other bodies, which is a rather important footnote to eq. 5.1. Forces will also depend on the masses and possibly electrical charges of all bodies involved. Solving or ‘integrating’ the equation of motion means being able to predict ri(t) for all bodies in the system for all future times. This also requires initial conditions (locations and velocities at t = 0).
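To make the ‘footnote’ to eq. 5.1 concrete, the sketch below steps eq. 5.1 forward in time numerically for bodies that attract each other by gravity. This is a numerical approximation, not the analytical integration the text refers to. The velocity-Verlet stepping scheme, the choice of units with G = 1, the restriction to two dimensions, and all function names are assumptions made for this example.

```python
import math

G = 1.0  # gravitational constant in arbitrary units (assumption of this sketch)

def accelerations(masses, positions):
    """a_i = F_i / m_i, where F_i depends on the positions of ALL bodies."""
    n = len(masses)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            d3 = (dx * dx + dy * dy) ** 1.5
            acc[i][0] += G * masses[j] * dx / d3
            acc[i][1] += G * masses[j] * dy / d3
    return acc

def integrate(masses, positions, velocities, dt, steps):
    """Advance all bodies together by `steps` velocity-Verlet steps."""
    pos = [list(p) for p in positions]
    vel = [list(v) for v in velocities]
    acc = accelerations(masses, pos)
    for _ in range(steps):
        for i in range(len(masses)):
            for k in range(2):
                vel[i][k] += 0.5 * dt * acc[i][k]   # half kick
                pos[i][k] += dt * vel[i][k]         # drift
        acc = accelerations(masses, pos)            # forces at the NEW distances
        for i in range(len(masses)):
            for k in range(2):
                vel[i][k] += 0.5 * dt * acc[i][k]   # second half kick
    return pos, vel

def total_energy(masses, pos, vel):
    """Kinetic plus gravitational potential energy of the whole system."""
    T = sum(0.5 * m * (v[0] ** 2 + v[1] ** 2) for m, v in zip(masses, vel))
    V = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            d = math.hypot(pos[i][0] - pos[j][0], pos[i][1] - pos[j][1])
            V -= G * masses[i] * masses[j] / d
    return T + V

# Initial conditions: two equal masses on a circular orbit about their midpoint.
masses = [1.0, 1.0]
start_pos = [(0.5, 0.0), (-0.5, 0.0)]
speed = math.sqrt(0.5)
start_vel = [(0.0, speed), (0.0, -speed)]

e_start = total_energy(masses, start_pos, start_vel)
end_pos, end_vel = integrate(masses, start_pos, start_vel, dt=1e-3, steps=1000)
e_end = total_energy(masses, end_pos, end_vel)
print(e_start, e_end)
```

Note that the accelerations have to be recomputed for all bodies after every step, because every body has moved and all mutual distances have changed.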
When practically trying to integrate equations of motion, it very quickly becomes apparent that 5.1 is a deceptively simple formulation of a highly convoluted problem. The complication arises from the intertwined nature of the problem: Forces depend on distances between bodies, but forces also accelerate bodies, which means bodies change velocity and location, and hence their mutual distances. Also, all is reciprocal; remember actio = reactio. You won’t get one body moving through an array of other bodies, which generate forces to accelerate the body in question, but themselves sit still. The other bodies will experience an equal force of opposite direction, and will therefore also accelerate. The differential equation 5.1 takes a ‘snapshot’ at one fixed time, that is, at fixed distances. This brings formal order and simplicity at any one given moment in time, but only a fool could believe that this mathematical trick can reduce the necessary complexity of the problem.

It turns out that eq. 5.1 can always be integrated analytically (at least for gravitational and electromagnetic forces) when n = 2. For larger n, analytical solutions exist only sometimes under specific assumptions (e.g. all masses equal, or one or more masses are negligible). But in general, even the ‘three body problem’ has no analytic solution! This is a problem in principle, and no re-formulation of eq. 5.1 can make it go away. Other concepts need to be introduced to come to grips with many-body problems, concepts that do not rely on tracking the motion of every single body, like thermodynamics and statistical physics.

However, there also is an unnecessary complication that is implied in eq. 5.1. Eq. 5.1 is a vector equation, and for reasons we discussed in chapter 4, is therefore best formulated and tackled in Cartesian coordinates. However, by far the most common forces, gravitational and electrostatic, depend on distance only, and are therefore best formulated in terms of spherical coordinates.
Clash! To resolve this clash, we will follow in the footsteps of Joseph Louis de Lagrange. Lagrange reformulated Newton’s equation of motion, 5.1, in a way that does not involve forces, but energies instead. Energies are scalars, not vectors, and therefore do not ‘demand’ any particular coordinate system. We can instead use whatever coordinates are most appropriate to the problem. We’ll soon see that these need not necessarily be Cartesian, spherical, or cylindrical, and often, we’ll find that we need fewer than 3 coordinates to describe the state of a mechanical system. We’ll also see that we can often very quickly identify so-called ‘constants of the motion’.

Lagrange in a nutshell

To begin, I’ll show you that you can treat a familiar mechanical problem with energies rather than forces: The free-falling body. Of course, we know that F = mg, with the initial conditions x(0) = v(0) = 0, leads to x(t) = ½gt². But we can also look at the energies: The free-falling body converts potential energy (V = -mgx, assuming x is positive downhill) into kinetic energy (T = ½mv²). Energy conservation applies, so T + V = const., and initially, x(0) = v(0) = 0, so T + V = 0 or T = -V. This we can turn into an equation for x that can be solved, like so:

\[
\begin{aligned}
T = -V \;&\Rightarrow\; \tfrac{1}{2} m \left(\frac{dx}{dt}\right)^2 = m g x \\
&\Rightarrow\; \frac{dx}{dt} = \sqrt{2g}\,\sqrt{x} \\
&\Rightarrow\; \sqrt{2g}\,dt = \frac{dx}{\sqrt{x}} \\
&\Rightarrow\; \sqrt{2g} \int dt = \int \frac{dx}{\sqrt{x}} \\
&\Rightarrow\; \sqrt{2g}\,t = 2\sqrt{x} \\
&\Rightarrow\; 2 g t^2 = 4x \\
&\Rightarrow\; x = \tfrac{1}{2} g t^2
\end{aligned}
\qquad \text{(eq. 5.2)}
\]

So we can derive the correct result by looking at energies only, without ever referring to a force at all. The Lagrangian formalism is a generalised, systematic version for the treatment of mechanical problems along the lines shown in eq. 5.2. The key assumption is that of energy conservation, so we will only consider conservative forces here, i.e. forces that conserve mechanical energy (kinetic plus potential energy). Treatment of non-conservative forces, e.g. friction, with Lagrangian methods is more difficult, and we shall not go there.
The Lagrangian formalism For a systematic approach, we need a few definitions first. Lagrangian mechanics works with any set of coordinates, as long as they are sufficient to completely specify the state of the mechanical system in question (location and orientation of every body in the system). In the absence of any constraints, this will be 3 coordinates for every (point- like) body, plus possibly more to specify a non- point- like body’s orientation. However, when the movement of a body is constrained, fewer than 3 coordinates may be necessary to specify its location. The minimum number of coordinates required to completely describe the state of a mechanical system is known as the degrees of freedom, N, of the system. N is independent of what particular coordinate system you choose to use- and may be surprisingly small. It may not be immediately obvious, but these are in fact closely related to the degrees of freedom you heard of in thermodynamics, and the normal modes of coupled oscillators. There are a lot of simple mechanical systems with only a single degree of freedom. Exercise: A pendulum, a cylinder rolling down an inclined plane, an Atwood machine- how many degrees of freedom do these systems have? Since we do not wish to constrain what kind of coordinates we use, we will speak as of now of generalised coordinates, qi. By definition, for a system with N degrees of freedom, we will need N generalised coordinates to completely describe its state. Usually, we could also find another set of coordinates that uses more than N generalised coordinates. Exercise: Describe a swinging pendulum with one, and with two, coordinates. However, a set of more than N coordinates is ‘wasteful’, the coordinates are not independent. We could always eliminate one or more of the coordinates to reduce the number down to N. Exercise: Reduce the two coordinates you have chosen to describe the pendulum to one. 
Note that generalised coordinates do not necessarily have dimension L, e.g. they may be angles. In future, we will always assume that our set of coordinates contains as many, but no more, coordinates than the degrees of freedom, N. A set of generalised coordinates, q1,…,qN, that completely describes the state of a mechanical system using no more coordinates than the degrees of freedom is called a holonomic set. Still, there may be more than one possible holonomic set. E.g. for the unconstrained motion of one single point-like body in space, Cartesian, spherical, and cylindrical coordinates are all holonomic sets.

The Lagrangian approach describes mechanical systems in terms of their energies, assuming conservation of overall energy. Just like we needed to formulate V and T for the free-falling body to set up eq. 5.2, we now need to formulate the two types of mechanical energy, potential (V) and kinetic (T), in terms of the holonomic set of generalised coordinates we have chosen.

Potential energy, V

Since the mechanical system is completely characterised by its holonomic set of generalised coordinates, it is always possible to express its potential energy as a function of the generalised coordinates, and possibly, time:

\[ V = V(t, q_1, \ldots, q_N) \qquad \text{(eq. 5.3)} \]

In many cases, V does not explicitly depend on time, but including it here gives us the option e.g. to describe potential energies of charged particles in an AC electrical field. However, V will usually still depend on time, albeit not explicitly: Some or all generalised coordinates will change with time, and since V depends on the generalised coordinates, it will also change with time. However, a potential energy does not depend on velocities. Note also that while generalised coordinates may have unusual units (or no units, e.g. they may be angles), V is an energy, with units Joule. Often, generalised coordinates are chosen so that eq. 5.3 takes a simple form.
E.g., a potential that depends on distance only will be described in spherical coordinates, so it will depend on r only, not on θ, φ. This may, however, sometimes lead to more complicated expressions for kinetic energy; we will see an example for that soon. Sometimes it may be smarter to use generalised coordinates that make calculation of kinetic energy simpler, at the expense of a more complicated expression for V. Note that potential energies are always specified apart from an arbitrary constant. The theoretically most satisfying choice of that constant is so that the potential energy is zero when all the bodies of the system are ‘infinitely’ far apart from each other, e.g. two bodies that interact by gravity will have zero ‘interaction’ (= force) as distance goes to infinity. It is just as well to call the potential energy at infinite distance ‘zero’ then. At any finite distance, potential energy will then always be negative. However, this normalisation is not always desirable or even possible- e.g. there is no ‘infinite’ distance for the pendulum. It also is not necessary (with a few exceptions), and adopting different conventions may be more convenient. E.g., we may wish to call potential energy ‘zero’ at the lower resting point of the pendulum. Or, we may wish to call potential ‘zero’ at time zero. All of these conventions are acceptable. Exercise: Write down the potential energy of a pendulum as a function of a (single) holonomic coordinate, the angle φ it makes to the vertical axis. We will see later that adding any constant to V results in the same equations of motion, hence it is physically irrelevant and can be chosen at our convenience. But note, other than for normalisation to zero at infinite distance, potential energy can then be positive as well as negative! Still, potential energies always count negative ‘downhill’: A movement downhill means potential energy becomes either less positive, or more negative. 
Getting the direction of potential energy wrong is one of the two major pitfalls of the Lagrangian formalism. If there is more than one body involved in the system, you have to add the potential energies of all bodies in the system to get V.

Kinetic energy, T

We also need to formulate kinetic energy, T, in terms of our generalised coordinates. Generalised coordinates, qk, may change with time. We define the generalised velocity as a generalised coordinate’s derivative with respect to time, dqk/dt. We can then formulate the kinetic energy of a mechanical system in terms of its generalised coordinates and generalised velocities:

\[ T = T(q_1, \ldots, q_N;\ \dot{q}_1, \ldots, \dot{q}_N) \qquad \text{(eq. 5.4)} \]

Note kinetic energy never contains time explicitly, but of course, it does implicitly. I wish to stress here immediately that ‘generalised velocities’ can be a rather misleading term, and great care needs to be exercised when calculating T from generalised velocities. In fact this is usually the hardest part in applying the Lagrangian formalism, and the other of its two major pitfalls. Firstly, you should note that the unit of a generalised velocity is not necessarily m/s, i.e. generalised velocities are not always ‘proper’ velocities. For example, if we use the angle φ of a pendulum as its generalised coordinate, then its generalised velocity is dφ/dt. This is in fact the pendulum bob’s angular velocity, with units 1/s. You cannot simply slot this angular velocity into the place of ‘v’ in the familiar formula T = ½mv² for kinetic energy. To begin with, angular velocity does not have the proper units for a velocity, so what you would calculate does not have the units of energy. But T is an energy, and like V, has units Joule. In the case of the pendulum bob, the problem is easy to fix.

Exercise: Express the kinetic energy of a pendulum bob in terms of dφ/dt.

We simply have to multiply dφ/dt by the length of the pendulum rod, l.
However, it is not always that simple, in particular in systems that require more than one generalised coordinate. Imagine the pendulum rod were not rigid, but ‘springy’, so it can extend and contract along its length, as well as swing: N = 2. Then, l would not be a constant, but you would use it as another generalised coordinate. You see that the tangential velocity of the bob, l dφ/dt, then depends on a generalised coordinate (l) as well as on a generalised velocity (dφ/dt). In the familiar equation T = ½mv², with v in Cartesian coordinates, T depends on velocities only, not on coordinates (x, y, z). When we use generalised coordinates, T may depend on generalised coordinates as well as generalised velocities. Eq. 5.4 is formulated to account for that. Also, for the pendulum with ‘springy’ rod, there will be another component in the velocity of the pendulum bob, in the direction of the extending or contracting rod. This velocity equals dl/dt. The direction of the two generalised velocities will change as generalised coordinates change; however, in this case, fortunately, the direction of dl/dt is always perpendicular to the direction of dφ/dt. If, and only if, the velocities are always perpendicular to each other, you can calculate the ‘proper’ v² simply by adding the squares of the velocity components: Here, v² = (l dφ/dt)² + (dl/dt)², and T = ½m[(l dφ/dt)² + (dl/dt)²].

However, it is not always or generally true that different generalised velocities are perpendicular to each other. An example you will discuss is a pendulum suspended under a moving cart: The velocity of the pendulum bob has contributions from both the change of angle, and the linear movement of the cart. However, these two components are not perpendicular to each other. You can then NOT add the squares of generalised velocity components (even if you ‘fix’ dφ/dt by multiplying by l first) to get the square of the velocity that goes into T = ½mv².
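The difference between the two cases can be made visible with a few lines of Python. The sketch below is only an illustration under stated assumptions: the bob position is written in Cartesian coordinates (x horizontal, y upwards, φ measured from the downward vertical), differentiated by the chain rule, and the resulting ‘proper’ v² is compared with the naive sum of squares. The function names are hypothetical, and for the cart only the bob’s kinetic energy is computed, with the cart assumed to move horizontally.

```python
import math

def springy_T_sum_of_squares(m, l, l_dot, phi_dot):
    """T = 1/2 m [(l dphi/dt)^2 + (dl/dt)^2], as quoted in the text."""
    return 0.5 * m * ((l * phi_dot) ** 2 + l_dot ** 2)

def springy_T_cartesian(m, l, phi, l_dot, phi_dot):
    """Same bob, but via x = l sin(phi), y = -l cos(phi) and the chain rule."""
    vx = l_dot * math.sin(phi) + l * phi_dot * math.cos(phi)
    vy = -l_dot * math.cos(phi) + l * phi_dot * math.sin(phi)
    return 0.5 * m * (vx ** 2 + vy ** 2)

def cart_bob_T_cartesian(m, l, phi, x_dot, phi_dot):
    """Bob under a cart moving with x_dot: position (x + l sin(phi), -l cos(phi))."""
    vx = x_dot + l * phi_dot * math.cos(phi)
    vy = l * phi_dot * math.sin(phi)
    return 0.5 * m * (vx ** 2 + vy ** 2)

def cart_bob_T_naive(m, l, x_dot, phi_dot):
    """WRONG in general: adds squares as if the components were perpendicular."""
    return 0.5 * m * (x_dot ** 2 + (l * phi_dot) ** 2)

# Illustrative numbers only
m, l, phi = 2.0, 1.5, 0.4
l_dot, phi_dot, x_dot = 0.3, 1.2, 0.8

springy_gap = springy_T_cartesian(m, l, phi, l_dot, phi_dot) - springy_T_sum_of_squares(m, l, l_dot, phi_dot)
cart_gap = cart_bob_T_cartesian(m, l, phi, x_dot, phi_dot) - cart_bob_T_naive(m, l, x_dot, phi_dot)
print(springy_gap, cart_gap)
```

For the springy pendulum the two results agree, because the radial and tangential components are perpendicular. For the bob under the cart they differ by a cross term proportional to the product of the two generalised velocities, which vanishes only at instants when the two velocity contributions happen to be perpendicular.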
In other words, generalised velocities are not (always) components of a Cartesian vector, although the term ‘generalised velocity’ may mislead you to believe they were. Beware! Nevertheless, the flexibility to choose a coordinate system tailored to a specific problem far outweighs these difficulties. You will appreciate that if you can choose between different holonomic sets of generalised coordinates, some of them may make it easier than others to calculate T, just like some may make it easier than others to calculate V. Unfortunately, it may not be the same set that simplifies both of them.

Unlike for V, there is never a question mark over the sign of T. In classical physics, T is always positive, or zero. However, T may be different in different frames of reference, since masses may be considered at rest, or in motion, depending on which frame of reference you choose to adopt. Often it is helpful to treat mechanical systems in the frame of reference wherein the system’s ‘centre of mass’ is at rest. In the centre-of-mass frame, T has its minimum possible value. However, no law of physics tells you which frame of reference is ‘right’; in fact, Galileo’s (and more so, Einstein’s) principle of relativity tells us that all frames are equally valid. You must, however, be consistent: Once you have chosen a particular frame, you must stick with it from beginning to end, and also interpret your results within that frame.

If there is more than one body involved in the mechanical system, you have to add the kinetic energy contributions of all bodies to calculate T. Even when there is only one body, and only one degree of freedom, T may have more than one contribution, e.g. from translation and rotation.

The Lagrangian

Once we have calculated V and T in generalised coordinates / generalised velocities, we can write down the Lagrangian, L, of the system:

\[ L = L(t, q_1, \ldots, q_N;\ \dot{q}_1, \ldots, \dot{q}_N) = T(q_1, \ldots, q_N;\ \dot{q}_1, \ldots, \dot{q}_N) - V(t, q_1, \ldots, q_N) \qquad \text{(eq. 5.5)} \]

Or, in brief, L = T - V.

Exercise: What are the SI units of a Lagrangian?

The Lagrangian equation of motion

With the help of L, Lagrange formulated an alternative to Newton’s equation of motion, eq. 5.1, which does not contain vectors and is therefore much more flexible with respect to the choice of coordinates:

\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} = \frac{\partial L}{\partial q_i} \qquad \text{(eq. 5.6)} \]

There is one equation for every i, so in total, there are as many equations as there are degrees of freedom. This set of equations is logically equivalent to Newton’s equation of motion, and can be derived from it mathematically; we won’t do that here. In analogy to eq. 5.1, $\partial L/\partial q_i$ is called a generalised force, and $\partial L/\partial \dot{q}_i$ a generalised momentum, sometimes written as $p_i$. However, the same caution has to be applied to generalised forces and momenta as to generalised velocities: Different generalised forces and momenta are not necessarily perpendicular to each other in direction, and cannot be interpreted as components of a Cartesian vector. There is no need to memorise these terms at all. What matters is that you can formulate (differential) equations for the generalised coordinates. Solving these differential equations is ‘integrating’ the equation of motion, just as much as integrating 5.1 is.

Exercise: Introduce the momentum, p = mv, into eq. 5.1 to highlight the analogy that leads to the terms ‘generalised momentum’ and ‘generalised force’.

Exercise: Revise the difference between a partial and a total differential.

A partial derivative ‘looks’ for a variable explicitly only, so $\partial L/\partial q_i$ ‘looks’ for $q_i$ in the Lagrangian, but NOT for anything else, e.g. $\dot{q}_i$. A total differential, on the other hand, looks for explicit as well as implicit dependency on a variable, e.g. d/dt in 5.6 ‘looks’ for t explicitly (which often may be absent!), as well as t implicitly, via the time-dependent coordinates $q_i$.
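As a sanity check of eq. 5.6, one can evaluate both sides numerically for a system with one degree of freedom. The sketch below is an illustration, not a derivation: it uses the free-falling body from eq. 5.2 (T = ½mv², V = -mgx, so L = ½mv² + mgx) and approximates all derivatives by central finite differences. The function name, the step size h, and the values of m and g are assumptions of this example.

```python
def lagrange_residual(L, q, t, h=1e-3):
    """d/dt(dL/dq_dot) - dL/dq along a trial trajectory q(t).

    L(t, q, q_dot) is the Lagrangian; q(t) is a candidate solution.
    A result of (nearly) zero means q(t) satisfies eq. 5.6 at time t.
    """
    def q_dot(s):
        return (q(s + h) - q(s - h)) / (2 * h)

    def dL_dqdot(s):
        # partial derivative: vary q_dot only, keep t and q fixed
        return (L(s, q(s), q_dot(s) + h) - L(s, q(s), q_dot(s) - h)) / (2 * h)

    def dL_dq(s):
        # partial derivative: vary q only, keep t and q_dot fixed
        return (L(s, q(s) + h, q_dot(s)) - L(s, q(s) - h, q_dot(s))) / (2 * h)

    # total time derivative: evaluated along the trajectory at t - h and t + h
    ddt_dL_dqdot = (dL_dqdot(t + h) - dL_dqdot(t - h)) / (2 * h)
    return ddt_dL_dqdot - dL_dq(t)

m, g = 2.0, 9.81  # illustrative values

def free_fall_L(t, x, x_dot):
    # L = T - V with T = 1/2 m x_dot^2 and V = -m g x (x positive downhill)
    return 0.5 * m * x_dot ** 2 + m * g * x

residual_true = lagrange_residual(free_fall_L, lambda t: 0.5 * g * t ** 2, t=1.0)
residual_wrong = lagrange_residual(free_fall_L, lambda t: 0.25 * g * t ** 2, t=1.0)
print(residual_true, residual_wrong)
```

The known solution x = ½gt² gives a residual of essentially zero, while a trial trajectory with half the acceleration does not. Note that the time derivative is taken along the trajectory, so it picks up the implicit time dependence that enters through q(t) and its time derivative.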
In other words, you have to apply the chain rule.

What the Lagrangian equation of motion in general cannot do is to separate the generalised coordinates / velocities between the N equations of motion. The derivative of L with respect to the i-th generalised velocity / i-th generalised coordinate may still contain generalised velocities / coordinates other than the i-th. We will still have coupled differential equations that reflect the coupled and reciprocal nature of interactions between bodies. If a system cannot in principle be integrated using the Newtonian eq. of motion, it cannot be integrated by the Lagrangian equation either, since the latter is logically equivalent to, and can be strictly derived from, the Newtonian equation. However, a system that can be solved in principle can usually be solved more easily in the Lagrangian formalism.

Since the Lagrangian formulation is logically equivalent to Newtonian mechanics, the Lagrangian formalism does not represent conceptually different physics. Lagrange is still rooted in the ‘clockwork universe’, just as much as Newton. You cannot derive either relativity or quantum mechanics from the Lagrange equation, just as you cannot derive them from the Newtonian equation of motion. New physics needs new concepts, not just mathematical transformations. However, it is fair to say that the founders of quantum mechanics borrowed heavily from the ideas of Lagrange, and from a closely related energy-based (rather than force-based) re-formulation of mechanics that was introduced by Hamilton (the Hamiltonian H is given by H = T + V, rather than the Lagrangian L = T - V). The founding equation of quantum mechanics, Schrödinger’s equation, is formulated in terms of the Hamiltonian directly, that is in terms of energies. You may have noticed that in quantum mechanics, you rarely ever hear the term ‘force’.

Cyclic coordinates

Before discussing examples, one more general concept.
In some cases, the Lagrangian may not depend on a particular generalised coordinate, $q_i$ (it may still depend on the corresponding generalised velocity). In other words, $\partial L / \partial q_i = 0$. Such a generalised coordinate is called cyclic or ignorable (I will call it cyclic). The significance of a generalised coordinate being ‘cyclic’ becomes apparent when you enter $\partial L / \partial q_i = 0$ into eq. 5.6, and you will immediately see that this implies $\partial L / \partial \dot{q}_i = \mathrm{const}$. In other words, every cyclic coordinate directly gives you a constant of the motion.

Often, you will find a ‘constant of the motion’ reproduces a well-known conservation law, e.g. momentum or angular momentum. Constants of the motion are exceedingly helpful: Firstly, one of the equations of motion has solved itself. Secondly, often the constant can be fed into the remaining, unsolved equations, and simplify them. Finding a cyclic coordinate is ace.

You will recall that usually, there are different possible choices of sets of generalised coordinates, and you are free to pick whichever appears most suitable, e.g. because it makes either T or V easy to formulate. However, there is no ‘conservation law’ for the number of cyclic coordinates: You may find that one particular choice of coordinate set gives you a cyclic coordinate, while others don’t. If you can find a particular set that gives you a cyclic coordinate, while another doesn’t, that is the strongest possible reason in favour of that set. Remember, with Newton you have no choice: Vectors dictate Cartesians. The real advantage of Lagrange over Newton is that Lagrange affords you the freedom to adopt whichever set of generalised coordinates gives you one (or more) cyclic coordinates, which makes solving the equations so much easier. How to find a set that gives you a cyclic coordinate, or whether that is possible at all: there is no simple rule, but trying to keep both T and V (or at least, one of them) as simple as possible is a start.
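The test for a cyclic coordinate can also be tried out numerically. The following is a minimal sketch, not part of the course material: the helper `partial`, the example Lagrangian (a particle of mass m in a vertical plane under uniform gravity, with y counted positive upwards) and all numerical values are assumptions chosen purely for illustration. It estimates $\partial L / \partial q_i$ by finite differences; a result of zero flags the coordinate as cyclic, and the matching $\partial L / \partial \dot{q}_i$ is then the conserved generalised momentum.

```python
def partial(f, args, index, eps=1e-6):
    """Central finite-difference estimate of the partial derivative of f
    with respect to args[index], holding all other arguments fixed."""
    up = list(args)
    down = list(args)
    up[index] += eps
    down[index] -= eps
    return (f(*up) - f(*down)) / (2 * eps)


def lagrangian(x, y, x_dot, y_dot, m=1.0, g=9.81):
    # Hypothetical example system (an assumption for illustration):
    # L = T - V with T = m(x_dot^2 + y_dot^2)/2 and V = m g y, y pointing up.
    return 0.5 * m * (x_dot**2 + y_dot**2) - m * g * y


# Arbitrary state (x, y, x_dot, y_dot) at which the derivatives are evaluated.
state = (0.3, 1.2, 2.0, -0.5)

dL_dx = partial(lagrangian, state, 0)  # vanishes: x does not appear in L, so x is cyclic
dL_dy = partial(lagrangian, state, 1)  # equals -m g: y is not cyclic
p_x = partial(lagrangian, state, 2)    # generalised momentum m * x_dot belonging to x

print("dL/dx =", dL_dx)
print("dL/dy =", dL_dy)
print("p_x   =", p_x)
```

For this assumed system, x is cyclic and the conserved quantity is the horizontal momentum, while y is not cyclic because gravity makes L depend on height.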
Using the Lagrangian: Simple examples

The discussion of Lagrangian mechanics so far is general and abstract, and the key equation, eq. 5.6, was not even derived properly. Lagrange invented his approach to simplify mechanics, not to confuse it; you are forgiven if you are not quite convinced of that so far. In this section, we will apply the formalism to some simple examples of mechanical systems which you are already familiar with. We will arrive at familiar results and well-known physics. I hope that this will convince you as much as a formal derivation that eq. 5.6 is firstly, valid, and secondly, represents a simplification of analytical mechanics.

Single body in empty space

Because there is only one body, no force or potential applies, since that would require a second body. Hence, V = const., and we take the liberty of calling this V = 0. In the absence of any potential, there is nothing to suggest which particular coordinate system is most convenient. We therefore ‘default’ to Cartesians as generalised coordinates, because that makes T simple to calculate. We say the body moves along the x-direction of a Cartesian coordinate system (remember from our discussion of coordinate systems that this really means, the body moves into an arbitrary direction). The kinetic energy is given by $T = \frac{1}{2} m v^2 = \frac{1}{2} m \dot{x}^2$, and the Lagrangian by $L = T - V = T = \frac{1}{2} m \dot{x}^2$. This Lagrangian depends on $\dot{x}$ only, but not on x; in Lagrange lingo, x is cyclic. Hence, we get a constant of the motion:

$\frac{\partial L}{\partial \dot{x}} = \frac{\partial}{\partial \dot{x}} \left( \frac{1}{2} m \dot{x}^2 \right) = m \dot{x} = \mathrm{const.}$

So we re-discover the familiar conservation of momentum: A body that experiences no forces continues forever moving into the same direction with constant velocity, just like Newton says.

Free fall under gravity, g

We have already treated this problem by looking at energy alone, but without the Lagrangian formalism. You’ll see it is easier now.
Gravity establishes a potential: the body’s potential energy depends on the height of the body, h, above an (arbitrary) zero level. h fully characterises the state of the body, so N = 1, and we take h as the single generalised coordinate to describe this system. We again take the liberty to call V = 0 and h = 0 at t = 0, because constants in V do not matter. If we define the positive (+h) direction as pointing downhill, then the potential energy is given by V = -mgh.

Exercise: Confirm that within the chosen conventions, V = -mgh counts potential energies negative downhill.

Kinetic energy is given by $T = \frac{1}{2} m \dot{h}^2$. Hence, $L = T - V = \frac{1}{2} m \dot{h}^2 + mgh$, and we write the Lagrangian eqn. of motion as:

(eq. 5.7) $\frac{d}{dt} \frac{\partial L}{\partial \dot{h}} = m \ddot{h} = \frac{\partial L}{\partial h} = mg \;\Rightarrow\; \ddot{h} = g$

Note we have simply taken the general equation, 5.6, and applied it to the current, specific situation: ‘The holonomic set of generalised coordinates $q_i$’ here is just the single coordinate, h.

So, Lagrange tells us the free-falling body undergoes constantly accelerated motion straight down. If we assume the body rests at t = 0, eqn. 5.7 has the familiar solution $h(t) = \frac{1}{2} g t^2$. As +h points straight down by definition, the body falls downhill, as expected.

Of course, which way we choose h to point is arbitrary. If we had chosen h positive uphill, then we would have to write potential energy with a plus: V = +mgh.

Exercise: Confirm that under the changed convention, V = mgh counts potential energies negative downhill.

Now, the Lagrangian equation will give us $\ddot{h} = -g$. The equation flips the sign of $\ddot{h}$, but we had initially flipped the meaning of what ‘positive’ h is. We get the same physics: the body falls down, not up. The choice of convention is up to us, but we need to clarify it in the very beginning and formulate V consistently within the chosen convention.

The harmonic oscillator

As another simple example, we discuss the linear harmonic oscillator (LHO).
We need one generalised coordinate, x, the position of the oscillator’s mass. We choose the origin of x to be the location where the oscillator’s spring is neither stretched nor compressed. The kinetic energy of the oscillating mass is given by $T = \frac{1}{2} m \dot{x}^2$, the potential is given by $V = \frac{1}{2} k x^2$. Note the positive sign of the potential: x = 0 corresponds to minimum potential energy, and all deflections of the spring, be it stretching or compression, lead to higher potential energy. Since x is squared for V, both directions (+/- x) lead to positive V, and on this occasion, it doesn’t really matter which x-direction you call positive.

Exercise: Formulate the Lagrangian of the LHO, and slot it into the equation of motion, 5.6, to confirm that we get eq. 2.2, exactly the same equation for the LHO we had derived in chapter 2, only then, we started with the Newtonian equation of motion.

The Lagrangian formalism returns exactly the same differential equation 2.2 as we had derived using Newtonian (force-based) reasoning. Of course, once we have exactly the same equation of motion, we will have exactly the same solution(s).

The physical pendulum

We have already discussed the use of an angle, α, as generalised coordinate for the physical pendulum, and we have shown that the potential energy is given by $V = -mgl \cos\alpha$, after removing an arbitrary constant. The speed at which the bob moves is given by $l \dot{\alpha}$. Kinetic energy, T, is therefore given as $T = \frac{1}{2} m l^2 \dot{\alpha}^2$. Therefore, the Lagrangian of the pendulum is

(eq. 5.8) $L = T - V = \frac{1}{2} m l^2 \dot{\alpha}^2 + mgl \cos\alpha$

and the Lagrangian equation of motion is:

(eq. 5.9) $\frac{d}{dt} \frac{\partial L}{\partial \dot{\alpha}} = m l^2 \ddot{\alpha} = \frac{\partial L}{\partial \alpha} = -mgl \sin\alpha \;\Rightarrow\; \ddot{\alpha} + \frac{g}{l} \sin\alpha = 0$

In the limit of small amplitudes, $\sin\alpha \approx \alpha$, and eq. 5.9 reduces again to the diff. eqn of the simple harmonic oscillator, eq. 2.2, only with g/l replacing k/m. For small amplitudes, the pendulum will undergo harmonic oscillations with angular frequency $\omega = \sqrt{g/l}$, a result that you will know from elementary mechanics.

I hope the fact that the Lagrangian formalism reliably reproduces known results inspires confidence. It is not at all hard to use; in fact, the equation for the pendulum is not that easy to derive when working with forces.

Using the Lagrangian: Advanced examples

To familiarise you with the new method, let us now look at a few worked examples of the Lagrangian formalism applied to situations where you may not yet know the result.

Cylinder rolling down an inclined plane

A homogeneous cylinder (radius R, mass M) rolls down an inclined plane (constant inclination angle, α) under the influence of gravity. We assume there is no slippage between the rim of the cylinder and the plane. This is an example where we have two contributions to T: Translational kinetic energy ($T_{trans}$) from the downhill movement of the cylinder’s centre, and rotational kinetic energy ($T_{rot}$) due to the cylinder rolling. Also, of course, there will be a potential energy due to the cylinder’s centre dropping.

First, we need to clarify N. Here, the no-slippage assumption is critical: It means that rotation and translation are related. When the cylinder makes a full turn (2π), it progresses downhill by 2πR, so rotation and translation are not independent. If you know how many times the cylinder has turned, you know where it is, or vice versa. The system has only one degree of freedom!

Next, we have to decide which generalised coordinate to use. There are at least three possibilities, each with its merits: We could use the distance (d) by which the cylinder has progressed down the plane. This choice makes it particularly easy to calculate $T_{trans}$, as the derivative of d with respect to time is the linear velocity of the cylinder. Or, we could use the angle, φ, by which the cylinder has turned. Note the difference between angle, φ, which is a generalised coordinate and will change with time, and the inclination angle, α, which is a constant, not a coordinate.
Choosing φ would make it particularly simple to calculate rotational kinetic energy, $T_{rot}$, as the derivative of φ with respect to t is the angular velocity of the cylinder’s rotation. Finally, we could choose the height, h, by which the cylinder has dropped vertically from its initial position. This would make it particularly easy to calculate potential energy, V.

We see that a system with N = 1 may yet be described by a number of different generalised coordinates. However, the big difference between this, and a system with N = 3, is that for an N = 3 system, generalised coordinates would be independent of each other, and you would need all 3 of them simultaneously. Not here! As soon as you know one, you can work out the others.

Exercise: Express h in terms of d (as well as constants of the system). Express d in terms of φ (as well as constants). Express φ in terms of h (as well as constants).

We could pick any one of the above 3, but one will do! I’ll work with d; if you wish you may pick another and work an alternative approach as an exercise. When the cylinder has travelled downhill by distance d, it has turned by angle φ = d/R, and its centre-of-mass has dropped by h = d sin α. Hence, we can calculate all the relevant energies in terms of d, or its derivative with respect to time:

(eq. 5.10)
$T_{trans} = \frac{1}{2} M \dot{d}^2$
$T_{rot} = \frac{1}{2} I \dot{\varphi}^2 = \frac{1}{4} M R^2 \left( \frac{\dot{d}}{R} \right)^2 = \frac{1}{4} M \dot{d}^2$
$T = T_{trans} + T_{rot} = \frac{3}{4} M \dot{d}^2$
$V = -Mgh = -Mgd \sin\alpha$

Herein, rotational kinetic energy is first expressed generally, in terms of moment-of-inertia I and angular velocity, and then the known I for a cylinder, $I = \frac{1}{2} M R^2$, is applied.

Exercise: Review the concepts of rotational kinetic energy, angular momentum, and moment-of-inertia.

Exercise: If it were a block sliding down the plane without friction, rather than a rolling cylinder, the Lagrangian would have no contribution from rotation. Work that (simpler) case.

Of course, if you don’t know I (e.g. maybe the cylinder is not homogeneous, or you just don’t know), you can leave the unknown I in the equation for rotational kinetic energy; it just means you have one more constant of the system. Note also the sign of V. We count d positive as the cylinder progresses down the plane, so we need the ‘minus’ to make sure V is negative downhill. Now we can formulate L, and the equation of motion:

(eq. 5.11)
$L = T - V = \frac{3}{4} M \dot{d}^2 + Mgd \sin\alpha$
$\frac{d}{dt} \frac{\partial L}{\partial \dot{d}} = \frac{3}{2} M \ddot{d} = \frac{\partial L}{\partial d} = Mg \sin\alpha \;\Rightarrow\; \ddot{d} = \frac{2}{3} g \sin\alpha$

From this, you can easily work out d(t), given the initial conditions. So the result is, the cylinder’s centre undergoes constantly accelerated motion, similar to the free-falling body, but the acceleration is less than g. There are two reasons for that: Firstly, it doesn’t fall straight down, but at the plane’s angle of inclination, and secondly, the free-falling body won’t start rotating, but the cylinder does.

Exercise: Attempt to solve the above problem Newton-style, without Lagrangian. You may stop as soon as you admit that the Lagrangian is so much easier.

Pendulum mounted under a cart

This exercise illustrates the possible difficulties of calculating T, and how to deal with them. A cart of mass M runs on frictionless and zero-inertia wheels in the x-direction of a horizontal plane, with gravity, g, acting downwards. The horizontal plane is slitted along the x-direction, and a massless pendulum rod of length l is suspended through the slit under the cart. At the end of the rod is a pendulum bob of mass m.

Exercise: How many degrees of freedom does this system have? Pick the corresponding number of sensible generalised coordinates.

This system requires 2 coordinates to describe. We shall use the horizontal position of the cart, D, and the angle of the pendulum rod to the vertical, α. The overall potential energy, V, is equal to that of the previous pendulum, $V = -mgl \cos\alpha$, since the cart on its horizontal plane has constant potential energy.
The simple form of V, depending on only one of the two generalised coordinates, justifies the choice of generalised coordinates.

The kinetic energy, T, however, is more difficult to calculate. The first contribution is that of the linear motion of the cart, $T_{cart} = \frac{1}{2} M \dot{D}^2$. The pendulum bob, however, undergoes a combination of linear and rotational motion: As the cart moves horizontally, so does the bearing of the pendulum, hence the bob has a velocity component that is the same in magnitude ($\dot{D}$) and direction as that of the cart. However, angle α may also change, leading to rotational motion, with a velocity that is given by $l \dot{\alpha}$. But note: The direction of that motion makes angle α to the direction of the linear velocity ($\dot{D}$). The two components of the bob’s velocity hence are not usually orthogonal to each other, and $T_{bob}$ cannot be calculated by adding the squares of the two velocity components, since that would only be correct if they were orthogonal.

The systematic approach to this situation is to find the relation between the Cartesian coordinates of the pendulum bob, (x, y), and the generalised coordinates, D, α, then calculate the velocities in Cartesian coordinates, and finally, $T_{bob} = \frac{1}{2} m (\dot{x}^2 + \dot{y}^2)$. Simple geometry gives us:

(eq. 5.12)
$x = D + l \sin\alpha$
$y = -l \cos\alpha$

Then, we calculate the velocities $\dot{x}$, $\dot{y}$ of the pendulum bob in terms of the generalised coordinates. D and α are coordinates, hence will both depend on time, and we need to use the chain rule where appropriate:

(eq. 5.13)
$\dot{x} = \dot{D} + l \dot{\alpha} \cos\alpha$
$\dot{y} = l \dot{\alpha} \sin\alpha$

Unlike (D, α), (x, y) are Cartesian, and we can add their squares for T:

(eq. 5.14)
$T_{bob} = \frac{1}{2} m (\dot{x}^2 + \dot{y}^2)$
$= \frac{1}{2} m (\dot{D}^2 + 2 \dot{D} \dot{\alpha} l \cos\alpha + l^2 \dot{\alpha}^2 \cos^2\alpha + l^2 \dot{\alpha}^2 \sin^2\alpha)$
$= \frac{1}{2} m (\dot{D}^2 + 2 \dot{D} \dot{\alpha} l \cos\alpha + l^2 \dot{\alpha}^2)$

The full Lagrangian is therefore:

(eq. 5.15)
$L = T - V = T_{cart} + T_{bob} + mgl \cos\alpha$
$= \frac{1}{2} (m + M) \dot{D}^2 + ml (\dot{D} \dot{\alpha} + g) \cos\alpha + \frac{1}{2} m l^2 \dot{\alpha}^2$

It wasn’t pretty, but if you think it is easier to treat the system with forces rather than the Lagrangian, try it. The Lagrangian 5.15 cannot be fully solved. But…

Exercise: Spot the good news in 5.15.

5.15 is independent of D (it depends on $\dot{D}$, but not D). Hence, D is a cyclic coordinate.

Exercise: Find the constant of the motion corresponding to the cyclic coordinate, D. Try to interpret the constant in terms of known conservation laws.

The central force problem

We will now apply the Lagrangian formalism to central forces, that is forces that depend only on the distance (r) of a body from the origin of the force. Central forces are very common (gravitation, electrical forces), and control the movement of heavenly bodies, like orbits in the solar system, as well as e.g. Rutherford scattering. We will also limit ourselves to isotropic central forces, that is central forces that act only in the direction of the connecting line between the body and the centre of the force, that is the radial direction, $e_r$.

Exercise: Do you know a central force that is not isotropic? Argue why, due to symmetry, there should not be such a thing as a non-isotropic central force. There is a non-isotropic central force lurking in the general expression for fictitious forces (which?), but that is, well, fictitious.

Since force, and potential energy, depend on r only, this is obviously a situation where we will use r as a generalised coordinate. Potential energy can then be written as V = V(r). Of course, V(r) is known in a specific situation, e.g. for gravity or electric forces, it scales inversely with r. But we won’t go into details; the most important feature of V is that it depends only on r.
To begin with, we prove a general theorem that applies to all central motion, completely without Lagrangian concepts.

Theorem: Movement of a point-like object under an isotropic central force is confined to a plane.

Proof: Orbital angular momentum, L, is a pseudovector given by L = r × p = m r × v, with r, p the location and momentum of the body. (Don’t confuse pseudovector L with the Lagrangian, L.) The direction of L is therefore perpendicular to the plane defined by r and v, that is the plane in which the motion takes place at the moment. Angular momentum is a conserved quantity here, because an isotropic central force acts along r and therefore exerts no torque about the centre. Conservation of angular momentum applies to its direction as well as its magnitude, and conservation of L’s direction means conservation of the plane of movement.

The above proof assumes that L is the only form of angular momentum, and that we can ignore or neglect spin, S. But it is total angular momentum, J = L + S, that is conserved, not L or S separately. That’s why textbooks often assume ‘point-like’ bodies under central forces: points don’t spin, hence S = 0, and J = L. Only, real objects aren’t point-like. But while S may not be zero or negligible, there usually is very little exchange between L and S, so it is a fair assumption that L is conserved separately. L and S can be exchanged by so-called tidal forces, but for tidal forces to be significant, bodies have to come close to the centre of the force, ‘close’ meaning close compared to the body’s own size. So ‘point-like’ is playing it safe, but even in the Earth/Moon system, exchange between L and S is very slow, albeit not completely absent; that is how the moon always shows us the same face. In quantum mechanics it’s a different story: even point-like particles (electrons) can have a spin, and an associated magnetic field, which interacts with the magnetic field due to L. This is known as LS coupling (or, spin-orbit coupling), an important phenomenon in atomic and molecular physics.
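The theorem can be checked numerically with a short sketch. Everything specific in it is an assumption made for illustration only: an attractive inverse-square force with strength k = 1 acting on a unit mass, the starting values, the time step, and the use of the velocity-Verlet integration scheme. The point-like body starts with a velocity that has components along both y and z, yet the angular momentum vector r × v keeps its initial value and the body never leaves the plane perpendicular to it.

```python
import math


def cross(a, b):
    """Outer (vector) product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def accel(r, k=1.0, m=1.0):
    # Assumed force law for illustration: attractive inverse-square force,
    # F = -k r / |r|^3, always directed along the radius vector.
    dist = math.sqrt(sum(c * c for c in r))
    return tuple(-k * c / (m * dist**3) for c in r)


def integrate(r, v, dt=1e-3, steps=20000):
    """Velocity-Verlet integration; returns the list of (r, v) after each step."""
    a = accel(r)
    history = []
    for _ in range(steps):
        v_half = tuple(vi + 0.5 * dt * ai for vi, ai in zip(v, a))
        r = tuple(ri + dt * vi for ri, vi in zip(r, v_half))
        a = accel(r)
        v = tuple(vi + 0.5 * dt * ai for vi, ai in zip(v_half, a))
        history.append((r, v))
    return history


r0 = (1.0, 0.0, 0.0)
v0 = (0.0, 0.8, 0.3)          # deliberately not confined to the x-y plane
L0 = cross(r0, v0)            # angular momentum per unit mass at t = 0

history = integrate(r0, v0)

# Largest change of any component of r x v over the whole run.
max_L_drift = max(
    max(abs(li - l0i) for li, l0i in zip(cross(r, v), L0))
    for r, v in history
)
# Largest component of r along L0; zero means r stays in the plane normal to L0.
max_out_of_plane = max(
    abs(sum(ri * li for ri, li in zip(r, L0))) for r, _ in history
)

print("max drift of angular momentum:", max_L_drift)
print("max excursion out of the orbital plane:", max_out_of_plane)
```

Both printed numbers should be at the level of floating-point rounding. The orbit itself is only approximate, but this particular integration scheme adds velocity changes along r only, so it does not disturb r × v.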
Movement in a plane has only 2 degrees of freedom, hence we need 2 (not 3) generalised coordinates to describe the location of a body. One of those, clearly, will be r, the distance from the origin of the central force. The second will be an angle, θ, which the current radial direction makes with some reference direction, e.g. the radial direction at t = 0.

Now, under these assumptions, let us calculate T, and formulate the Lagrangian. We have to consider two components, radial and tangential, that is in the direction of increasing r, and in the direction of increasing θ. Fortunately, unlike in the example of the cart-mounted pendulum, the two components are always mutually perpendicular, therefore $v^2 = v_r^2 + v_\theta^2 = v_x^2 + v_y^2$, with no ‘mixed’ components. Similar to the physical pendulum, $v_\theta$ is given by the product of radius and angular velocity, $\dot{\theta}$. We therefore have

(eq. 5.16)
$v_r = \dot{r}$
$v_\theta = r \dot{\theta}$
$T = \frac{1}{2} m v^2 = \frac{1}{2} m (\dot{r}^2 + r^2 \dot{\theta}^2)$
$V = V(r)$
$\Rightarrow L = T - V = \frac{1}{2} m v^2 - V(r) = \frac{1}{2} m (\dot{r}^2 + r^2 \dot{\theta}^2) - V(r)$

This is the general Lagrangian for the central force problem, where we have not yet specified a particular form for V(r).

Exercise: Spot the cyclic coordinate in 5.16.

From this Lagrangian, we can derive two equations of motion by the standard Lagrangian method. First, we spot the cyclic coordinate, θ, and derive the constant of the motion:

(eq. 5.17) $\frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta} = \mathrm{const.}$

To interpret eq. 5.17, we show that the identified constant of the motion is nothing else but the orbital angular momentum:

(eq. 5.18)
$\mathbf{L} = \mathbf{r} \times \mathbf{p} = m \mathbf{r} \times \mathbf{v} = m (r \mathbf{e}_r) \times (\dot{r} \mathbf{e}_r + r \dot{\theta} \mathbf{e}_\theta)$
$\mathbf{e}_r \times \mathbf{e}_r = 0$
$\mathbf{e}_r \times \mathbf{e}_\theta = \mathbf{e}_z$
$\Rightarrow \mathbf{L} = m r^2 \dot{\theta} \mathbf{e}_z$

where we have explicitly worked out the outer product between non-Cartesian unit vectors. We see, the Lagrangian constant of the motion is the modulus of angular momentum. The Lagrangian formalism correctly reproduces the known law of angular momentum conservation.
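The identification of the constant of the motion with angular momentum can be cross-checked with a few lines of code. This is a minimal sketch; the function names and the sample values are made up for illustration. It converts a state given in plane polar coordinates to Cartesian components, evaluates the z-component of m r × v there, and compares it with $m r^2 \dot{\theta}$.

```python
import math


def lz_from_cartesian(m, r, theta, r_dot, theta_dot):
    """z-component of m * (r x v), with position and velocity converted
    from plane polar to Cartesian components first."""
    x = r * math.cos(theta)
    y = r * math.sin(theta)
    # Time derivatives of x and y (chain rule, r and theta both depend on t).
    vx = r_dot * math.cos(theta) - r * theta_dot * math.sin(theta)
    vy = r_dot * math.sin(theta) + r * theta_dot * math.cos(theta)
    return m * (x * vy - y * vx)


def lz_from_polar(m, r, theta_dot):
    """Generalised momentum belonging to theta: m * r^2 * theta_dot."""
    return m * r**2 * theta_dot


# Arbitrary sample states (m, r, theta, r_dot, theta_dot), illustration only.
samples = [
    (1.0, 2.0, 0.3, 0.5, 1.5),
    (2.5, 0.7, 2.0, -1.2, 0.4),
    (0.1, 3.0, -1.0, 0.0, -2.0),
]

max_difference = max(
    abs(lz_from_cartesian(m, r, th, rd, thd) - lz_from_polar(m, r, thd))
    for m, r, th, rd, thd in samples
)
print("largest mismatch between the two expressions:", max_difference)
```

The two expressions agree to rounding error whatever the radial velocity is, which reflects the fact that the radial part of v drops out of the outer product with r.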
Exercise: Show that angular momentum conservation is equivalent to Kepler’s second law of planetary orbits: ‘The radius vector sweeps out equal areas in equal times’. Note that we have not yet specified the precise form of V(r), so Kepler’s second law would be valid for any central force.

The other Lagrangian equation of motion comes from the other generalised coordinate, r:

(eq. 5.19) $\frac{d}{dt} \frac{\partial L}{\partial \dot{r}} = m \ddot{r} = \frac{\partial L}{\partial r} = m r \dot{\theta}^2 - \frac{\partial V(r)}{\partial r}$

At this point, we would have to specify the precise form of V(r). For example, for gravity or electric forces, the potential energy V(r) has a 1/r law, giving the familiar $1/r^2$ law for the force.

We will not go into the details here, because the problem becomes mathematically difficult without being particularly enlightening physically. But for a 1/r type potential energy, eqn. 5.19 can be solved. The solution relies heavily on using the constant of the motion we have identified in eqn. 5.17, as well as some mathematical trickery: a clever substitution, and repeated use of the chain rule. The constant of the motion is used to eliminate time from 5.19, so that the result appears in the form of an orbit, r(θ), rather than r(t) and θ(t). The resulting orbits are curves known as ‘conic sections’ in classical geometry: Circle, ellipse, parabola, hyperbola. Which of those we get still depends on initial conditions, and the nature of the force law. For a repulsive force, e.g. between equal electric charges, only the open hyperbola is possible; the prominent example is Rutherford scattering. But for attractive forces, we may get closed orbits such as circle or ellipse, as well as open curves. Whether the orbit is closed or open depends on the balance of kinetic vs potential energy: If T is larger than the modulus of V, the orbit will be open, but if T is smaller than the modulus of V, it will be closed (if they are exactly equal, it’s a parabola).
However, on this occasion, it is important to calibrate V and T ‘properly’ before comparing them: obviously, adding or subtracting an arbitrary constant from V could tip the balance between T and V either way. So here we need to make sure that V is normalised ‘properly’, i.e. $V(r) \to 0$ for $r \to \infty$.

For example, the orbit of the Earth around the sun is an ellipse with eccentricity e = 0.0167. Eccentricity measures how similar or different an ellipse is from a circle. The limiting cases are e = 0, which means the ellipse is a circle, and e = 1, which corresponds to a parabola. The low eccentricity of the Earth’s orbit means it is close to a perfect circle, but not quite.

The two-body problem

The above treatment of a single body moving in a central field is a mathematically well-defined problem, but it is unphysical: It does not consider the origin of the central field, and a single body can never fulfil Newton’s ‘equal and opposite forces’ (action = reaction). A force (or, potential energy) is always generated by another body, e.g. the sun in the solar system. This other body will take the reaction force. But then, the other body will also experience an acceleration. We cannot justify our implicit assumption that the centre of the central force is a fixed point in space.

Exercise: Under what circumstances is the assumption of a resting centre of the central force approximately justified? Under what circumstances is it particularly bad?

The simplest physically realistic problem is the two-body problem. So it looks like we are back to the drawing board: formulate the problem in a realistic way, that is in terms of the interaction of two bodies rather than the interaction of one body with a fixed central force. That means we need more generalised coordinates (a set of coordinates for each body is required), a more difficult form of the Lagrangian, and probably an even worse set of differential equations than 5.19, which may turn out to be completely intractable…

Fortunately, it turns out this will not be necessary, and we can fix it: It is possible to ‘translate’ the two-body central force problem into an equivalent single-body problem. What we need to do is pick a clever choice of inertial frame of reference, namely the one wherein the centre-of-mass of the system is at rest. We arrive at an equation like 5.19, only the mass of the body has to be replaced by the so-called reduced mass, µ, defined as $\mu = m_1 m_2 / (m_1 + m_2)$.

Exercise: Show that when one of the two masses is much larger than the other, then the reduced mass is approximately equal to the smaller of the two masses.

So we can reduce the two-body problem to an equivalent single-body problem, if we transform into the centre-of-mass system. However, it is impossible to reduce the three-body problem in a similar way. It still simplifies matters somewhat if it is formulated in the centre-of-mass frame of reference, but we end up with a set of coupled differential equations for which there is no analytical solution.

This is of course the situation in our solar system: The sun is orbited by a number of planets. The approach to such a system is one of approximation, using the fact that the mass of the sun is far bigger than that of any of the planets: First, we solve the orbit for every planet around the sun, pretending the other planets were absent. We get perfectly periodic orbits: Ellipses that close into themselves after a year. Careful with the term ‘year’: it means the period of an orbit, hence will be different for different planets. Then, we introduce planet-planet interactions as a perturbation to the perfectly periodic orbits.
You will encounter perturbation theories in quantum mechanics as well, in similar situations: When one part of the Hamiltonian is weak compared to the other, you first neglect it and solve the problem without, then re-introduce it and consider its impact on the solutions of the unperturbed problem. But historically, the first perturbation theories were developed for planetary orbits.

The consequence of planet-planet interactions is that the planetary orbits do not perfectly close. In other words, the Earth describes a slightly different ellipse every year. The eccentricity of the Earth’s orbit varies between 0.005 and 0.058 on a timescale of several 100,000 years. This is one of a number of irregularities in the Earth’s orbit that are collectively known as Milankovitch cycles. In a classic paper [Science Vol. 194, 4270 (1976)], Hays, Imbrie, and Shackleton have compared Milankovitch cycles to the Earth’s climate history and confirmed what Milankovitch had first speculated: Milankovitch cycles are the ‘pacemaker of the ice ages’. Of course, results like this have been cited by those who deny that the rapid climate change we are currently experiencing is man-made. The overall consensus among climate researchers is that while Milankovitch cycles may explain past ice ages and warm periods, the current global warming is much faster than any climate change in the last 1.5 million years, and hence cannot be blamed on Milankovitch cycles.

PHY221 - Essential equations

$\omega = \frac{2\pi}{T} = 2\pi f; \quad f = \frac{1}{T}$

Harmonic oscillator: $\ddot{x} + \omega_0^2 x = 0$

$\omega_0 = \sqrt{k/m}$ (LHO); $\omega_0 = \sqrt{g/l}$ (Pendulum)

$x(t) = X_{max} \cos(\omega_0 t + \phi); \quad v_{max} = \omega_0 X_{max}; \quad a_{max} = \omega_0^2 X_{max}$

Damped oscillator: $x(t) = X_{max} \exp(-\gamma t) \cos(\omega_d t + \phi)$

Driven oscillator: $x(t) = A(\omega) \exp(i(\omega t - \phi(\omega)))$

$A(\omega_{res}) = Q A(0); \quad Q = \pi N$ (N oscillations for amplitude → 1/e)

$c = \lambda f; \quad k = \frac{2\pi}{\lambda}; \quad \omega(k) = k c(k); \quad v_{group} = \frac{d\omega(k)}{dk}$

$Z = c\rho; \quad r + t = 1$

$a_{Cf} = \omega^2 r = \frac{v^2}{r}; \quad F_{Cf} = m a_{Cf}$ (meaning of r; direction of force!)

$\mathbf{a}_{Cor} = 2 \mathbf{v} \times \boldsymbol{\omega}; \quad \mathbf{F}_{Cor} = m \mathbf{a}_{Cor}$ (right-hand rule!)

$L = T - V$

$V = -mgh$ (uniform gravitation, h counted positive downhill); $T = \frac{1}{2} m v^2$ (linear motion); $T = \frac{1}{2} I \omega^2$ (rotation)

$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_k} = \frac{\partial L}{\partial q_k}$
© Copyright 2024