INSTITUT FOR NATURFAGENES DIDAKTIK, KØBENHAVNS UNIVERSITET

One-dimensional regression in high school
Jeanette Kjølbæk
Master's thesis, June 2015
IND's student series no. 39

INSTITUT FOR NATURFAGENES DIDAKTIK, www.ind.ku.dk
All publications from IND are available via the website.

IND's student series:
- Ellen Berg Jensen: 15-åriges viden om klimaforskelle (2007)
- Martin Sonnenborg: The Didactic Potential of CAS (2007)
- Philipp Lorenzen: Hvem er de nye studenter? Baggrund, interesse & uddannelsesstrategi (2010)
- Karina Søgaard og Sarah Kyhn Buskbjerg: Galoisteori i Gymnasiet (2007)
- Ana Hesselbart: Mathematical reasoning and semiosis (2007)
- Julian Tosev: Forskningslignende situationer (2007)
- Niels Nørskov Laursen: En Covarians-tilgang til Variabelssammenhænge i gymnasiet (2007)
- Katja Vinding Petersen: Lyd og Liv (2007)
- Jesper Bruun: Krop og computer i fysikundervisning (2008)
- Jakob Svendsen: Matematiklærerens forberedelse (2009)
- Britta Hansen: Didaktik på tværs af matematik og historie (2009)
- Nadja Ussingkær: En didaktisk undersøgelse af brudte lineære funktioner i rammerne af tysk fritidsmatematik (2009)
- Thomas Thrane: Design og test af RSC-forløb om vektorfunktioner og bevægelse
- Flemming Munch Hansen: Samspil omkring differentialregningens elementer i gymnasiets matematik og fysik (2009)
- Hasan Ademovski og Hatice Ademovski: Proportionalitet på mellemtrinnet - Design af didaktiske situationer baseret på stofdidaktisk analyse (2009)
- Signe Ougaard: Logiske strukturer i matematisk analyse på gymnasieniveau. Et forløb om kvantorer og ε-δ-definition af grænseværdi (2010)
- Jesper Winther Sørensen: Abstrakt algebra i gymnasiet - design, udførelse og analyse af undervisning i gruppeteori (2010)
- Sofie Stoustrup: En analyse af differentialligninger på A-niveau i STX ud fra den antropologiske didaktiske teori (2010)
- Jan Henrik Egballe Heinze: Eksponentialfunktioner i STX (2010)
- Mette Beier Jensen: Virtuelgalathea3.dk i biologiundervisningen i gymnasiet (2010)
- Servet Dönmez: Tosprogede elever og matematik i gymnasiet (2011)
- Therese Røndum Frederiksen: Designing and implementing an engaging learning experience about the electric sense of sharks for the visitors at Danmarks Akvarium (2011)
- Sarah Elisabeth Klein: Using touch-tables and inquiry methods to attract and engage visitors (2011)
- Line Kabell Nissen: Matematik for Sjov – Studie- og forskningsforløb i en eksperimentel kontekst (2011)
- Jonathan Barrett: Trigonometriske Funktioner i en Gymnasial Kontekst – en epistemologisk referencemodel (2011)
- Rune Skalborg Hansen: Et studie i læringsopfattelse og -udbytte – Om fysik C kursister på Københavns VUC (2011)
- Astrid Camilus: Valideringssituationer i undervisningsforløb om differentialligninger og radioaktivitet (2012)
- Niven Adel Atie: Didaktiske situationer for fuldstændiggørelse af kvadratet i andengradsligningen (2013)
- Morten C. B. Persson: Kvantekemi i gymnasiet - Tilrettelæggelse, udførelse og evaluering af et undervisningsforløb (2013)
- Sofie Birch Jensen: Køn, evaluering og The Force Concept Inventory (2013)
- Simone Gravlund Nielsen: Når børn forsker i matematik (2013)
- Henrik Egholm Wessel: Smartphones as Scientific Instruments in Inquiry Based Science Education (2013)
- Nicole Koefoed: Et didaktisk design om definition, eksistens og eksakt værdi af bestemt integral (2013)
- Trine Louise Brøndt Nielsen: From Master's programme to labour market – A study on physics graduates' experience of the transition to the labour market (2013)
- Rie Hjørnegaard Malm: Becoming a Geologist – Identity negotiations among first year geology students (2013)
- Mariam Babrakzai Zadran: Gymnasiealgebra i et historisk perspektiv – Matematiske organisationer i gymnasiealgebra (2014)
- Marie Lohmann-Jensen: Flipped Classroom – andet end blot en strukturel ændring af undervisningen? (2014)
- Jeppe Willads Petersen: Talent – Why do we do it? (2014)
- Jeanette Kjølbæk: One-dimensional regression in high school (2015)

IND's student series consists of master's theses and bachelor projects written at or in affiliation with the Department of Science Education. They often concern educational questions that are of interest also outside the walls of the university. They are therefore published in electronic form, naturally subject to the authors' consent. These are student works, not final research publications. See the whole series at: www.ind.ku.dk/publikationer/studenterserien/

ONE-DIMENSIONAL REGRESSION IN HIGH SCHOOL
Jeanette Kjølbæk
A Master Thesis in Mathematics
Department of Science Education, University of Copenhagen
Submitted: 1 June 2015

Name: Jeanette Kjølbæk
Title: One-dimensional regression in high school. A Master Thesis in Mathematics
Department: Department of Science Education, University of Copenhagen
Supervisor: Carl Winsløw
Time frame: December 2014 - June 2015

ABSTRACT
The development of calculators and IT tools means that students can easily solve mathematical tasks without knowing the intermediate calculations or the mathematics behind the techniques. An unfortunate consequence of this is that the mathematical theory behind these techniques has been given lower priority. This is the case for the teaching of one-dimensional regression in high school, where many students learn the instrumented techniques for carrying out regression without knowing what is to be found or how it is found. This thesis investigates the theory of one-dimensional regression in the mathematical community and the possibilities for working more theoretically with one-dimensional regression in high school. Initially, the external didactic transposition was analyzed.
This was done by analyzing and presenting the theory of regression in the mathematical community and describing how regression is included in the curricula, written exams and textbooks. The presentation and descriptions were based on the Anthropological Theory of Didactics (ATD). Based on the analysis, four main questions were highlighted and an epistemological reference model (ERM) was developed. The second part of the thesis concerns the internal didactic transposition and focuses on the design and evaluation of the teaching sequence. The planned and the realized didactic process were analyzed using ATD and the ERM. Finally, the possibilities and obstacles of working more theoretically with regression were presented and discussed. The analysis of the external didactic transposition showed that one technological-theoretical discourse of linear regression is based on mathematics well known to students in high school. This discourse was applied in the design of the teaching sequence. Further, it was found that students are presented with only a minimum of technological-theoretical discourse and do not learn how to determine the best class of functions. The technological-theoretical discourse of linear regression and the question of the best class of functions were tested in a high school class. It turned out that the students had a sensible basis for reaching the technological discourse. The technical work needed to reach the theoretical level was difficult; especially the algebraic notation and the rules of summation were found to be challenging. The determination of the best class of functions was found to work in practice.

ACKNOWLEDGEMENTS
This thesis marks the end of my education at the Department of Mathematical Sciences and the Department of Science Education at the University of Copenhagen. The thesis combines my interests in didactics and mathematics. It is a thesis in the didactics of mathematics, where didactic theory is used to design and evaluate a teaching sequence. First of all, I would like to thank Anne Tune Svendsen for the opportunity to test my teaching sequence and for executing the teaching sequence in her class. A great thank you to my supervisor, Carl Winsløw, for competent and thorough supervision throughout the process. Thank you for always being at my disposal with excellent advice. A big thank you to my sister, Louise Kjølbæk, for proofreading the thesis. Finally, thanks to the students and employees at the Department of Science Education for their advice and for making the thesis office a great place to be.

CONTENTS
1 Introduction
  1.1 Objectives
  1.2 Structure of the thesis
2 The Anthropological Theory of Didactics
  2.1 Praxeologies
    2.1.1 Mathematical praxeologies and organizations
    2.1.2 Didactic praxeologies and moments
  2.2 The didactic transposition
    2.2.1 Level of codetermination

I The external didactic transposition and the epistemological reference model
3 1st research question
4 The scholarly knowledge
  4.1 Historical perspective
  4.2 Regression
    4.2.1 Types of regression
    4.2.2 Definition of the best linear model
    4.2.3 Definition of the best model
  4.3 The method of least squares
    4.3.1 Elementary algebra
    4.3.2 Multivariable calculus
    4.3.3 Linear algebra
  4.4 Statistical approach
    4.4.1 Maximum likelihood estimation
    4.4.2 Random variable and orthogonal projection
  4.5 The quality of a model
    4.5.1 Visual examination
    4.5.2 The correlation coefficient
    4.5.3 The coefficient of determination
  4.6 Non-linear regression
5 The knowledge to be taught
  5.1 Curricula
  5.2 Written exams
    5.2.1 The type of tasks
    5.2.2 The methods and techniques
  5.3 Textbooks
    5.3.1 Gyldendals Gymnasiematematik
    5.3.2 Lærebog i matematik
    5.3.3 Hvad er matematik?
6 Epistemological reference model
  6.1 Q1: The type of model ($F_i$)
  6.2 Q2: The best model ($f \in F_i$)
  6.3 Q3: The best linear model ($f \in F_L$)
  6.4 Q4: Evaluate the model

II The internal didactic transposition: the teaching sequence
7 2nd research question
8 Design of the teaching sequence
  8.1 The class
  8.2 Mathematical organizations
    8.2.1 MO belonging to Q3
    8.2.2 MO belonging to Q1
  8.3 The specific aims
  8.4 Didactic principles
  8.5 Description of the planned didactic process
    8.5.1 Module 1: Definition of best linear model
    8.5.2 Module 2: The proof of best linear model
    8.5.3 Module 3: The type of model
9 A priori analysis
  9.1 Module 1
  9.2 Module 2
  9.3 Module 3
10 A posteriori analysis
  10.1 Methodology
  10.2 Analysis of the realized teaching sequence
    10.2.1 Q3: The best linear model $f \in F_L$
    10.2.2 Q1: The type of model $F_i$
11 Discussion
  11.1 Redesign
  11.2 The rationale for teaching more theoretically
12 Conclusion
List of references

III Appendix
A Appendix
  A.1 The tasks
  A.2 The tasks with solutions
  A.3 Tables of the didactic process
  A.4 The planned didactic process
  A.5 The realized didactic process
  A.6 The students' solutions
  A.7 The transcriptions in Danish

LIST OF FIGURES
Figure 1: The research methodology and structure of the thesis
Figure 2: A mathematical praxeology
Figure 3: Connection between MOs
Figure 4: The didactic transposition
Figure 5: Level of didactic codetermination
Figure 6: Examples of vertical, horizontal or perpendicular errors
Figure 7: Projection of v to W
Figure 8: Projection of $\vec{b}$ to C(A)
Figure 9: Examples of scatter plot and residual plot
Figure 10: Anscombe's quartet
Figure 11: Exam exercise from 1980
Figure 12: Exam exercise from 2013
Figure 13: Task of type TD and TQD from 1998
Figure 14: Illustration of the LSQ method in "Hvad er matematik? - C"
Figure 15: Illustration of projection in "Hvad er matematik? A"
Figure 16: ERM
Figure 17: Data set in an example
Figure 18: Scatter plot and residual plot of the linear and exponential model
Figure 19: Illustration of MOs to Q3
Figure 20: Scatter plot of the data set in exercises 1 and 2
Figure 21: Illustration of displacement of a data set
Figure 22: The representation of the data set in exercise 9
Figure 23: Outline of the planned and realized didactic process
Figure 24: The plots of the transformed data set
Figure 25: The rules of sum written on the blackboard
Figure 26: A student's solution to task 1a
Figure 27: A student's solution to tasks 2c and 2d
Figure 28: The linear model in task 11a
Figure 29: Plots of the displacement, when the graph window is set

LIST OF TABLES
Table 1: The properties of the three methods to make linear regression
Table 2: Exam exercises regarding regression in 1975-1999 and 2009-2013

ACRONYMS
ATD  Anthropological Theory of Didactics
MO   Mathematical organization
ERM  Epistemological reference model
LSQ  Least squares
D    Data set $D = \{(x_1, y_1), \ldots, (x_n, y_n)\}$
$F_L$  The class of linear functions $F_L = \{f(x) = ax + b \mid a, b \in \mathbb{R}\}$
$F_E$  The class of exponential functions $F_E = \{f(x) = b \cdot a^x \mid a \in \mathbb{R}_+, b \in \mathbb{R}\}$
$F_P$  The class of power functions $F_P = \{f(x) = b \cdot x^a \mid a, b \in \mathbb{R}\}$
$F_i$  The collection of the three classes of functions (linear, exponential and power functions); $F_i$ with $i \in \{L, E, P\}$

1 INTRODUCTION

1.1 Objectives
The development of calculators and IT tools over the last 10-15 years has had a great influence on mathematics in high school, not only on the techniques used to solve exercises but also on the mathematical tasks and content. This has enabled the inclusion of new mathematical themes in the curricula, and one of these themes is one-dimensional regression, which is mandatory for all students in high school. Using calculators and IT tools, students can now easily solve many tasks without knowing the intermediate calculations, and they can furthermore solve tasks that would have been very painstaking by hand. Unfortunately, a consequence is that the mathematics behind the techniques has been given a lower priority.

The motivation for investigating one-dimensional regression is the frequently imprecise approach to the theme, where the teacher focuses on the instrumented techniques for determining the best linear, exponential or power model, without presenting to the students what is meant by "best" or how the models are determined. This approach has the unfortunate consequence that the instrumented techniques become mysterious processes that the students cannot explain or justify. Winsløw (2003) calls this a black box effect. This effect is of course undesirable, and recently it has been discussed in the LMFK magazine (the members' magazine for the union of teachers in mathematics, physics and chemistry) how the theory of regression can be presented to avoid the black box effect. In the articles¹ the possibilities and obstacles of teaching linear regression are discussed, and four different proofs of the formula for the best linear model are presented.

Thus, it is interesting to consider the didactic problem of teaching one-dimensional regression in high school and to investigate the possibilities of designing a teaching sequence containing theoretical work with one-dimensional regression, in order to avoid the black box effect. An important part of the didactic problem lies in the elaboration of a mathematical knowledge complete enough to allow the development of theoretical insight into regression. To elaborate this mathematical knowledge, the knowledge in the mathematical community and in high school will be analyzed, and the relevant knowledge will be transformed in order to design the teaching sequence. The teaching sequence will be tested in a high school class.

¹ Nr. 1/Januar: Fodgængerversion af lineær regression. Nr. 2/Marts: Regression med mindste kvadraters metode; Mere om lineær regression. Nr. 3/Maj: De mindste kvadraters metode.

1.2 Structure of the thesis
Figure 1: The research methodology and structure of the thesis

Figure 1 illustrates the research method used and the structure of the thesis. Initially, the main elements of the Anthropological Theory of Didactics (ATD) are described (chapter 2). These elements constitute the theoretical basis of the thesis. This is followed by part I, which concerns the external transposition and the epistemological reference model (ERM). First, the first research question, about the bodies of knowledge and the transposition, is presented (chapter 3).
In chapter 4, the scholarly knowledge about one-dimensional regression in the mathematical community is presented; afterwards, the knowledge to be taught is described by analyzing curricula, exams and textbooks in sections 5.1, 5.2 and 5.3. The analysis of the external transposition constitutes the basis of the ERM, which is presented in chapter 6. The first part is the basis for the second part, which concerns the internal transposition, that is, the design and the evaluation of the teaching sequence. Part II opens with the second research question, about the possibilities and obstacles of working more theoretically with regression (chapter 7). This is followed by a description of the design in chapter 8 and the a priori analysis in chapter 9. These chapters describe the reflections behind the design and contain detailed descriptions of the planned didactic process. The realized didactic process and the students' work are evaluated in the a posteriori analysis in chapter 10. Finally, the teaching sequence is discussed in chapter 11. The thesis is finalized with a conclusion in chapter 12.

Navigation
References in the thesis are marked with different colors. Brown marks references to the list of references; click on the reference and you will jump to the entry. Green references lead to a web page in the same way. Blue references lead to other parts of the thesis, for example chapters, sections, subsections, theorems and definitions.

2 THE ANTHROPOLOGICAL THEORY OF DIDACTICS

Didactics of mathematics is the science of study and aid to study mathematics. Its aim is to describe and characterize the study processes (or didactic processes) in order to provide explanations and solid responses to the difficulties of people (students, teachers, parents, professionals, etc.) studying or helping others to study mathematics. (Bosch and Gascón, 2006, p. 60)

The theoretical basis of the thesis is ATD, a research programme developed by Yves Chevallard, which studies the phenomena related to teaching and learning. ATD postulates that mathematics is a human activity, produced, taught, learnt and practiced in educational institutions. One of the main points in ATD is that what is taught and learnt in an educational institution is influenced by the nature and origin of the mathematical knowledge and by the institutional setting. Thus, to study didactic phenomena, the knowledge outside the classrooms and outside the specific institution has to be considered. In ATD, the notions of didactic transposition and levels of codetermination are used to study the conditions and constraints coming from society and institutions. To describe mathematical activity, ATD proposes a general epistemological model, where mathematics is described by praxeologies. According to ATD, a mathematical activity comprises two related aspects: the study process in which mathematical knowledge is constructed, and the result of the construction, the mathematical knowledge itself. These two components are described by didactic and mathematical praxeologies, respectively.

2.1 Praxeologies
Didactic and mathematical praxeologies are structured in two interrelated components, praxis and logos. The praxis or "know-how" is the way of doing used to solve a problem or task, while the logos or "knowledge" refers to the thinking about and justification of the way of doing.
In every praxeology the two components occur simultaneously and affect each other, since no praxis can be executed without being justified or explained (logos), and the justification or explanation (logos) influences the way of doing (praxis) (Bosch and Gascón, 2006, p. 59).

2.1.1 Mathematical praxeologies and organizations
Like any other praxeology, a mathematical praxeology is composed of a practical and a theoretical block. The practical block is made up of types of problems or tasks ($T$) to be studied as well as the technique ($\tau$) to solve them. The types of tasks can be anything from solving an equation, determining coefficients in a model or evaluating a model, to making a proof. Solving a task requires work with the task, and the technique refers to the way of working. The technique characterizes the type of tasks in the practical block. The type of tasks and the technique constitute the practical block $[T / \tau]$ of a praxeology. This means that a specific task ($t$) can belong to several practical blocks, depending on which technique is used to solve it. For instance, there are different techniques to solve the task $t$: Given the function $f(x) = 2x + 3$ and the value $f(x_0) = 5$, determine $x_0$:

$\tau_A$: Isolate $x_0$ algebraically: $x_0 = \frac{f(x_0) - b}{a}$.
$\tau_G$: Draw $f(x)$ as a graph and read off $x_0$ where $f(x) = f(x_0)$.
$\tau_I$: Use solve in a computer algebra system (CAS).

As the technique characterizes the type of tasks, and the technique $\tau_I$ can be used to determine an unknown variable in all classes of functions, several tasks belong to the practical block $[T / \tau_I]$. In contrast, the technique $\tau_A$ is more specific, and fewer tasks belong to the practical block characterized by this technique. The technique used to solve a task is, according to ATD, influenced by the institutional and historical settings.

The logos or theoretical block includes the discourses necessary to describe, explain and justify the technique, and it is formed by two levels: technology ($\theta$) and theory ($\Theta$). The technology explains and justifies the technique, while the theory is an argument which explains and justifies the technology. The technology and the theory constitute the theoretical block $[\theta / \Theta]$ of a praxeology. The union of the two blocks (figure 2) constitutes the four basic components of a mathematical praxeology. A praxeology is usually described by a four-tuple $[T / \tau / \theta / \Theta]$.

Figure 2: A mathematical praxeology is the union of the two main components: praxis (type of tasks $T$, technique $\tau$) and logos (technology $\theta$, theory $\Theta$)

The two blocks are inseparable and are developed together. Solving a task consists in the construction of praxeologies, bringing up ways of doing (praxis) and ways of justifying (logos). Development of the practical block raises new technological needs that influence the theoretical block. Similarly, developments in the theoretical block influence the praxis, as the description and justification of tasks can lead to new types of tasks or techniques. According to ATD, mathematical activities consist in activating praxeologies (old knowledge) to construct or reconstruct mathematical praxeologies to solve problems and tasks (Rodríguez et al., 2008, p. 291). Solving problems or tasks results in the construction of mathematical praxeologies, and which praxeologies are constructed or reconstructed depends on the task and the technique used to solve it.
Mathematical praxeologies, and in particular collections of mathematical praxeologies, can be classified by the notion of mathematical organization (MO), which is a collection of praxeologies arranged into a sequence of increasing complexity. The classification is based on the technique, technology or theory and forms a sequence of MOs of increasing complexity. The simplest MO is the punctual MO, generated by a unique type of tasks with the related technique. A punctual MO is the same as a mathematical praxeology. In a punctual MO the theoretical block is often ignored or implicitly assumed (Garcia et al., 2006, p. 227). The collection of praxeologies sharing a technological discourse forms a local MO. The local MO is characterized by its technology and is made up of practical blocks sharing technology and theory. Just as a task can belong to several mathematical praxeologies, a punctual MO can belong to different local MOs, depending on the technology used to describe and justify the technique. Going a step further, a regional MO is obtained by connecting several local MOs having the theoretical discourse in common. The most complex MO is the global MO, which is a collection of regional MOs. The connection between the MOs is shown in figure 3.

The notions of mathematical praxeologies and MOs are used to describe and analyze mathematical knowledge, but as mathematical knowledge is never a definitive construction, neither are the praxeologies and MOs. According to Bosch and Gascón (2006), it is possible to find knowledge (logos) that is not used to solve problems, because it is not known or it has been forgotten what types of tasks it can help to solve. Furthermore, techniques (praxis) that are not described also exist. Praxeologies and MOs are the result of a complex and ongoing process, where new tasks and mathematics are studied in a study process. The study is itself a human activity, and thus it can be modelled in terms of praxeologies, called didactic praxeologies.

Figure 3: The connection of punctual, local and regional MOs. The figure illustrates how punctual MOs can be collected in local MOs and how these local MOs can be collected in a regional MO.

In fact, there is no mathematical praxeology without a study process that engenders it but, at the same time, there is no study process without a mathematical praxeology in construction. Process and product are the two sides of the same coin. (Garcia et al., 2006, p. 227)

2.1.2 Didactic praxeologies and moments
The study process, or didactic process, constitutes together with the mathematical praxeologies the two aspects of mathematical activity. The didactic process is, in ATD, modeled in terms of praxeologies, called didactic praxeologies. Didactic praxeologies refer to any activity related to setting up or developing praxeologies and can therefore be used to model the didactic processes of both students and teachers. A student's didactic praxeology is a model of how students study and learn mathematical praxeologies, whereas a teacher's didactic praxeology is a model of how the teacher helps students to acquire the mathematical praxeologies, for instance by setting up the teaching (Barbé et al., 2005, p. 239).

The study process can be organized into six moments, each of which characterizes a different dimension of the studied MO. The moment of the first encounter is the first meeting with the type of tasks or questions of the process. The first encounter should motivate the development of the MO at stake.
The second moment, the exploratory moment, concerns the exploration of the type of tasks and the elaboration of techniques to solve them. The third moment, the technological-theoretical moment, consists of the construction of the technological-theoretical environment related to the technique. In this moment the students describe and justify the techniques, which can generate a need for new techniques. The fourth moment, the technical moment, concerns the technical work with the task, in which a technique is improved or new techniques are developed to make the technique more reliable. In the fifth moment, the moment of institutionalization, the elaborated praxeology is identified, and finally, in the moment of evaluation, the praxeology is evaluated by investigating the limitations of the technique (Barbé et al., 2005, pp. 238-239).

In a study process the six moments do not have a chronological or linear structure, and some moments can occur simultaneously, as the moments are closely interrelated. Each moment has a specific function in the study process, and a study process able to carry out all the moments leads to the construction of a local MO generated by the initial question or task (Barbé et al., 2005, p. 239). The teacher's task is to construct a didactic process that enables the students to carry out all the moments and thereby establish the MO at stake. The possible ways of constructing the didactic process can be described by means of didactic praxeologies, since didactic praxeologies, like every praxeology, include a set of problematic didactic tasks, didactic techniques, and didactic technologies and theories. A classic example of a teacher's didactic task is how to set up and teach an MO under the particular conditions given to the teacher. In this thesis, the didactic task "How to establish an MO around linear regression" is considered. The technique for a didactic task is the way the teacher solves the task, for instance the tools the teacher applies (e.g. textbooks and tasks) and the way the teacher teaches. ATD postulates that didactic tasks and the corresponding techniques can be set up and described by considering the six didactic moments, for instance: How to manage the first encounter? Which specific tasks can help the students explore techniques? How to introduce the technological-theoretical environment? (Rodríguez et al., 2008, p. 292). The didactic technologies and theories are the discourses that explain and justify the techniques. For instance, classroom teaching can be justified by the perspective that "students learn what the teacher explains clearly" (Garcia et al., 2006, p. 228), while the discourse behind applying inquiry-based teaching is that students learn most if they construct the knowledge themselves through experiments.

It is important to notice that the didactic techniques and the technological-theoretical discourse used in mathematical activity are mainly determined by institutions and historical tradition, for which reason didactic research has to go beyond what is identifiable in classrooms. The teachers' practice (and the students') is strongly conditioned by restrictions of both mathematical and didactical origin. Thus, according to ATD, to analyze and describe the MOs and the didactic praxeologies, the conditions and constraints coming from society and from scholarly and educational institutions have to be considered by examining the didactic transposition and the levels of codetermination (Bosch and Gascón, 2006).
2.2 The didactic transposition
ATD postulates that the mathematical and didactic praxeologies occurring in educational institutions are mainly determined by conditions and constraints that cannot be reduced to a specific classroom. To analyze the knowledge of regression and the conditions and constraints on teaching regression, it is important to study the origin of the theory and the rationale of the knowledge. For this, the didactic transposition and the levels of codetermination can help.

The process of didactic transposition refers to the transformation through which a piece of mathematical knowledge is transposed from the mathematical community until it is actually learned in a given institution. The knowledge taught in school is a production generated outside school that is made teachable through transposition. The transposition is a complex process consisting of three steps. Figure 4 illustrates the steps of the didactic process and shows how the knowledge is distinguished into four components: the scholarly knowledge, the knowledge to be taught, the taught knowledge and the learned knowledge (Bosch and Gascón, 2006, p. 56).

Figure 4: The didactic transposition

The first component is the knowledge produced in the mathematical community; this knowledge is developed at universities and in professional practice. The process of transposition starts with a selection of bodies of scholarly knowledge to be transmitted. This knowledge then undergoes a process of deconstruction and reconstruction to make it teachable in school (Bosch and Gascón, 2006, p. 53). This process is carried out by the noosphere, which consists of politicians, mathematicians and members of the teaching system. The products of the process are the curricula, official programmes, exams, textbooks, recommendations to teachers, etc. These products constitute the knowledge to be taught. The knowledge to be taught sets up the mathematical content taught in the teaching institutions. Based on the knowledge to be taught, the teacher constructs and teaches concrete knowledge in the classroom, which constitutes the taught knowledge. The last transformation takes place in the classroom when the students transform the taught knowledge into learned knowledge. The taught knowledge will be identical for the students in a class, whereas the learned knowledge will differ, since students learn differently.

The didactic transposition can also be divided into two parts, the external and the internal transposition, to differentiate the three institutions: the mathematical community, the educational system and the classroom (Winsløw, 2011, p. 537). The external transposition is the first transformation, going from the mathematical community to the educational system, where the noosphere constitutes the knowledge to be taught. The internal transposition is the second transposition, from the educational system to the classroom. This transposition is called internal, since the teacher can influence this step through the proposed activities, tests, resources and the teacher's discourse. The didactic transposition underlines that mathematical praxeologies always arise in an institutional setting and that the knowledge differs between institutions (Bosch and Gascón, 2014, p. 70). This means that the knowledge in a specific institution has to be considered when doing research.
For this reason, it makes no sense to apply the scholarly knowledge as reference knowledge when considering knowledge in high school, as the transposition imposes many limitations on the knowledge found in the class. As researchers will always belong to at least one of the three institutions, ATD postulates the necessity of an ERM, so that the researchers avoid being subject to the institution examined (Bosch and Gascón, 2006, p. 57). The ERM has to be elaborated from empirical data from all three institutions, to ensure that no institution becomes more dominant than the others. The collection of knowledge in an ERM should let the researchers answer questions occurring when analyzing didactic problems. Some relevant questions in this thesis are: What is the object called "regression"? Where does it come from? How does it vary from one institution to another, both in time and in institutional space? What kinds of praxeologies is regression made of in the different institutional settings? These questions are part of studying what is taught and learnt in the classrooms about regression. In the thesis these questions will be answered to illuminate the bodies of knowledge existing about regression in the mathematical community and in high school. As in any act of modeling, models are used to simplify and structure data (Winsløw, 2011, p. 8), and in this thesis the data relevant for high school will be collected in an ERM, which will be used in the design and evaluation of the teaching sequence.

2.2.1 Level of codetermination
ATD points out that the taught knowledge in school is determined through the didactic transposition and the frames in the educational institutions. Consequently, the conditions and constraints affecting the mathematical and didactic transposition occurring in the classroom have to be considered. Chevallard suggests a hierarchy of determination that can help identify these restrictions at the different levels. In the hierarchy two kinds of restrictions are identified. The generic restrictions are the restrictions coming from the levels above the discipline, and the specific restrictions arise from the levels below the discipline (the knowledge to be taught). The two kinds of restrictions may help to identify conditions and constraints that influence and define the taught knowledge but go beyond what is identifiable in the classroom (Barbé et al., 2005, p. 239).

The hierarchy is structured as a sequence from the most generic level (the civilisation) to the most specific (the subject), see figure 5. The highest level is the civilisation, followed by the society, the school and the pedagogy. These levels restrict the possible ways of organizing the didactic process and the kinds of MOs that can be studied. For instance, restrictions at the school level are the structure of time and classes, the textbooks used and the CAS tools available.

Figure 5: Level of didactic codetermination

The generic levels influence the lower levels. The specific restrictions are arranged in the levels of discipline, domain, sector, theme and subject. The last two levels, subject and theme, are, according to Garcia et al. (2006), the only levels the teachers can influence, as all the levels above are determined by the curricula and the educational authorities. This didactic phenomenon is called the thematic autism of the teacher (Barbé et al., 2005, p. 257)
and refers to the fact that the teacher only has to decide how to organize the study of an MO at the level of the theme. The thematic autism of the teacher has the unfortunate consequence that school mathematics rarely works at levels higher than the themes, as the teachers' didactic problem is restricted to the selection of tasks to be studied. This means that teachers in a specific institution do not have to decide which domain a mathematical subject or theme belongs to, because the decision has already been taken. For instance, regression belongs to two different domains depending on the institution investigated: in high school the theme belongs to the domain of functions, while at university it is taught in the domain of statistics. Restrictions at the lower levels, especially from the curricula, mean that themes, sectors and domains are only slightly connected in high school (Barbé et al., 2005, p. 257). As a result, the students do not form regional or global MOs. The four lowest levels in the hierarchy correspond to the four different kinds of MOs (Barbé et al., 2005, p. 257): the lowest level in the hierarchy (the subject) corresponds to a mathematical praxeology, as a type of tasks with a technique is examined; the theme is formed by the types of tasks and techniques sharing a technology, i.e. a local MO; the sector is formed by a collection of themes sharing a theory, thus similar to a regional MO; and the last level below the discipline (the domain) is constituted by a collection of regional MOs. The restrictions coming from the different levels influence the teacher's didactic process and consequently the students' MO. In this thesis, ATD will be used to investigate the MOs about regression and to analyze the conditions and constraints related to regression in high school.

Part I: THE EXTERNAL DIDACTIC TRANSPOSITION AND THE EPISTEMOLOGICAL REFERENCE MODEL

To the didactician [the concept of didactic transposition] is a tool that allows to stand back, question the evidence, erode simple ideas, get rid of the tricky familiarity on the object of study. It is one of the instruments of the rupture didactics should make to establish itself as a proper field; it is the reason why the "entrance through the knowledge" into didactic problems passes from power to action: because the "knowledge" becomes problematic through it and, from now on, can figure as a term in the formulation of problems (either new or reformulated ones), and in their solution. (Bosch and Gascón, 2006, p. 55)

3 1st RESEARCH QUESTION

According to ATD, every didactic problem has to consider the way the mathematical knowledge is constructed and take into account the conditions and restrictions imposed by the institutional setting, since what students actually learn in an institution depends on the transposition. The transposition leads to conditions and restrictions that influence what teachers teach and what students learn in high school; therefore, before designing the teaching sequence about regression, the mathematical knowledge in the different institutions will be analyzed. In this part the external transposition of regression will be analyzed, and based on the analysis an ERM will be elaborated. The ERM will be the basis of the design and the evaluation of the teaching sequence in part two. This part will investigate:

• Which bodies of knowledge regarding regression exist in the mathematical community?
• Which elements of scholarly knowledge are relevant in relation to high school?
• Which bodies of scholarly knowledge are transposed to high school, and how are they described in the knowledge to be taught?

4 THE SCHOLARLY KNOWLEDGE

This chapter concerns the first step in the didactic transposition, the scholarly knowledge. In this chapter the bodies of knowledge about one-dimensional regression existing in the mathematical community will be presented.

4.1 Historical perspective: the derivation of the least squares method and regression
The least squares (LSQ) method was one of the earliest statistical methods and became a very important statistical method in the nineteenth century. Adrien-Marie Legendre published the method for the first time in 1805. The reason for introducing the method was the problem of determining the relationship between objects (mostly the orbits of comets) based on various measurements, when more data points than unknown parameters were given (the problem of finding the best approximate solution to a system of m linear equations in n unknowns, m > n). The method was first published in an appendix to a work about the orbits of comets, and the publication immediately had a widespread effect. Within ten years the method had become a standard method for solving astronomical and geodetical problems. A few years later (1808-1809), Robert Adrain and Carl Friedrich Gauss also published the method, and since Gauss went far beyond Legendre, the development and discovery are mostly assigned to him (Katz, 2009, p. 820).

Legendre's treatment of the method lacked a formal consideration of probability, for which reason the accuracy of the method could not be determined. In contrast, in his publication Theory of the Motion of Heavenly Bodies from 1809, Gauss justified the theoretical basis of the method by linking it to probability. In addition, Gauss provided algorithms to calculate the estimates, the method of Gaussian elimination. In the publication, the assumptions that formed the basis of his arguments and methods can be found. The three principles are (Waterhouse, 1990, pp. 43-45):

1) If several direct measurements of a quantity, all equally reliable, are given, then the arithmetic mean of the measured values should be used as the estimate of the actual value of the quantity.
2) The parameter values that seem most likely, given the measurements, should be chosen (what Gauss means by "most likely" is described below).
Consequently the probability of all the errors was given by the product of the m joint functions (ei ): m Y i=1 (ei ) = hm ⇡-(1/2)m e-h 2 (e2 +...+e2 ) m 1 Q Pm 2 Gauss proved that m i=1 (ei ) is maximal (principle 2), when i=1 ei is minimal (Katz, 2009). Using the three principles, Gauss justified the method of LSQ. Initial, Galton used the word reversion (Katz, 2009, p. 826) Even through the LSQ method is the earliest form of regression, Legendre or Gauss did not use the term regression to describe the work. First around 1870-1880 the term was used to describe relation between biological phenomena, studied by Francis Galton. Analyses of Galton’s work show that he did not use the LSQ method to determine the relation between the empirical data. Galton simply plotted the points and estimated the line by eye. In the last decade of the nineteenth century George Udny Yule showed how to calculate the regression line using the LSQ method (Katz, 2009, p. 826). 4.2 regression The problem which regression is used to solve is the problem of determine the underlying relation of some measurements, where the measurements reflect the reality with some errors. The idea is that any phenomenon can be describe by a relation or a theory and that 4.2 regression the measurements are reflections of some ideal underlying relation. In real life the measurements will be random spread around the model that describe the relation. Regression is used to model the underlying relation based on the measurements. This way to consider regression agrees with the etymology of the word. Regression is compound of the verb regress and the suffix ion. To regress means to move backward or go back and the suffix -ion denote an action (Dictionary.com, 2015). Thus, regression can be interpreted as the method to go back from the measurements and determine the underlying relation, which produces the measurements. Regression analysis is the determination of a best mathematical model by identifying the relationship between a dependent and one or more independent variables. Making regression analysis two problems have to be considered: The determination of the class of functions that has to be used to model the data set (the type of model) and estimation of the parameters given the class of functions (the specific model). The two problems are interrelated and influence each other, however the techniques to solve the problems are quite different. Determine the specific model require construction of mathematical praxeologies, while determination of the class of functions requires construction of mathematical praxeologies mixed with non-mathematical object. 4.2.1 Types of regression There are four types of regression: Simple linear regression, simple non-linear regression, multiple linear regression and multiple nonlinear regression. The type of regression is classified by the number of independent variables and the relation between the variables. If the data set D = {(x1 , y1 ), . . . , (xn , yn )} is simple, the function is of one independent variable. In the thesis the notation D will refer to a simple data set. Simple linear regression refers to determination of the best line f 2 FL = {f(x) = ax + b|a, b 2 R} for a given data set D. Simple non-linear regression refers to determination of the best model of one independent variable given a non-linear class of functions. In this thesis the class of exponential functions FE = {f(x) = bax |a 2 R+ , b 2 R} and power functions FP = {f(x) = bxa |a, b 2 R} will be considered. 
The notation $F_i$, with $i \in \{L, E, P\}$, will refer to the collection of the three classes of functions. The multiple regression models are functions of two or more independent variables. The models can be either linear or non-linear in the parameters. The class of functions determines the type of regression done; for example, exponential regression refers to determining the best $f \in F_E$. Later, it will be described how exponential regression is converted into a problem of simple linear regression. This thesis will primarily consider simple linear regression. As the method of simple linear regression is used to determine other models than the simple linear ones, the connection and application to multiple linear models and simple non-linear models will be illustrated and reviewed.

4.2.2 Definition of the best linear model
So far, no clear definition of "best model" has been stated, since no unique definition exists. Studying the mathematical community, several definitions and justifications of "best model" can be found, and in the following some of the definitions and the related methods will be presented to show the complexity of interpreting "best model". The complexity of interpreting "best" is illustrated in the case of linear models. It seems that the definition of the best model depends on the discipline examined. In statistics the best model is defined as the most probable model given a data set, whereas in the discipline of mathematics the best model minimizes some errors (vertical, perpendicular, etc.).

In mathematics two very different approaches to linear regression can be found. The geometric approach is an inaccurate approach, where the model is determined by eye and ruler. The technological-theoretical discourse of the technique is defective, because no common criterion or description of how to draw the line can be given. As it is natural for the mind to consider orthogonal distances, most commonly the lengths of the perpendicular distances will be minimized using this method. The method is vague, as it is based on individual preference, which means that several "best" models within a given class of functions can be determined. The algebraic approach is more rigorous, because the sum to minimize can be stated explicitly and consequently the technological-theoretical discourse can be described. In the algebraic approach, different quantities to sum can be found. The quantities are characterized by the type of error and by the size of the error. Which quantity is minimized affects which method is used to minimize the sum.

The size of the errors
To avoid positive errors cancelling out negative errors, the error terms have to be made positive. This can be obtained by using the absolute values of the errors ($|e_k|$) or by squaring the errors ($e_k^2$). Using the absolute values, the errors are weighted by their distances, which means that an error twice as large is weighted twice as much. Squaring the errors, large errors are weighted more; for instance, an error twice as large is weighted four times as much. Using the absolute size of the errors, outliers in a data set influence the model to a lesser extent than when using the squared errors (Weisstein, 2015).
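The following small numerical illustration is not from the thesis; it is a sketch in plain Python, with made-up data, of the point just made: with one outlier present, the sum of absolute errors prefers the line through the bulk of the data, while the sum of squared errors prefers a line pulled towards the outlier.

```python
# Sketch (not from the thesis): absolute vs. squared errors on a data set
# lying roughly on y = x, plus the single outlier (5, 10).
data = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 10)]

def errors(a, b):
    """Vertical errors y_k - (a*x_k + b) for the line y = a*x + b."""
    return [y - (a * x + b) for x, y in data]

def sum_abs(a, b):
    return sum(abs(e) for e in errors(a, b))

def sum_sq(a, b):
    return sum(e * e for e in errors(a, b))

# Line A (y = x) follows the bulk of the data; line B (y = 2x - 2) is the
# least squares line for this data set, pulled towards the outlier.
print(sum_abs(1, 0), sum_abs(2, -2))  # 5 < 6: absolute errors prefer y = x
print(sum_sq(1, 0), sum_sq(2, -2))    # 25 > 10: squared errors prefer y = 2x - 2
```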
Figure 6: The best linear model calculated by minimize the sum of the squares of the (A): The vertical errors, (B): The horizontal errors, (C): The perpendicular errors. The figure are retrieved 23/2 2015 from http://elifesciences.org/content/2/ e00638 The figure illustrates that the choice of dependent and independent variable is important when making regression, as the model differ depending on whether x is the independent or dependent variable. However, often the relation of the variable is clear when modeling data. For example, consider the variable pulse and workload. Workload has an effect on pulse, while the reverse makes no sense, thus pulse has to be the dependent variable. Determine the best line the errors have to minimal, why it makes sense to minimize the errors in the variables by the perpendicular distance. Using the perpendicular distance, the shortest distance from the point to the line is used, which means that the errors in both the independent and dependent variables be taken into account. Because both variables are taken into account, the determination of the distance can be difficult if the variables are measured in different units (for instance pulse and workload), since calculations of perpendicular distances require that the units of the variables are equal. Thus one variable has to be rescaled, before the distances can be determined. For that reason using the perpendicular distances require lot of calculation. Usually, when consider measurements the errors in the independent variable is few or zero, why it makes sense only to minimize the er- 23 24 the scholar knowledge rors in the dependent variable (vertical errors) and not the errors in the independent variable (horizontal errors). The method The size of the errors and the type of errors that should be minimized affect the method used to minimize the sum. In the following three different methods, which can be used to determine the best line will shortly be commented. The least absolute deviation method: Minimize the sum of the absolute values of the vertical errors n X |yk - (axk + b)| k=1 The linear LSQ method: Minimize the sum of the squared vertical errors n X yk - (axk + b) 2 k=1 The total LSQ method: Minimize the sum of squared perpendicular errors. In the formula the variables are assumed to be measured in the same units. n X (yk - (axk - b))2 1 + b2 k=1 The least absolute deviation method can be helpful to used when all observations have to be given equal emphasis, as the method is resistant to outliers in data (Siemsen and Bollen, 2007, p. 228). The method has the disadvantaged that no unique solution to a data set exists, as infinitely many lines to one data set can occur1 . In addition, the method is not simple to use, because the sum is non differentiable. The linear LSQ method is simple to use and the method will always give one unique solution. The unique solution and the method will be described in 4.3. The total LSQ method is easy to use when the variables are measured in the same units, but in the cases were they are not, the method is not straightforward, as firstly the variables have to be rescaled. Because of the rescaling different lines can be found. If the errors in the independent variable is zero, the method is equivalence with the linear LSQ method. Table 1 sum up the properties of the three methods. 1 Consider the three points (0,0), (3,2), (3,6). 
Finally, the statistical interpretation of the best model is commented on. In statistics, the best linear model is defined as the model that maximizes the likelihood function given some distribution of the errors (as Gauss did). The maximum likelihood estimation can be performed when the distribution of the errors belongs to a parametric family of probability distributions. Assuming the errors are independent and normally distributed, the linear model determined by the estimation is identical to the model determined by the linear LSQ method. If the errors instead follow a Laplace distribution, the linear model determined by maximum likelihood estimation is equivalent to the linear model determined by the least absolute deviation method. By analyzing the distribution of the errors, the relevant method to use in a given situation can thus be determined and justified.

The most commonly applied method to determine the best linear model found in the mathematical community is the linear LSQ method, which minimizes the vertical errors. The rationales for using this method are:

• Determination of the linear model by the linear LSQ method is simple.
• Assuming the errors are independent and normally distributed, the model determined by the linear LSQ method is also the most probable model.
• In many applications, an error twice as large as another error is more than twice as bad. Squaring the errors accounts for this.

In the rest of the thesis, the best linear model refers to the model $f \in F_L$ determined by the linear LSQ method, unless otherwise mentioned.

4.2.3 Definition of the best model

The criterion for the best linear model can be generalized to the non-linear and multiple models, but in contrast to the linear case, the LSQ method is not simple to use when the models are non-linear. For instance, consider the simple non-linear function $f(x, \beta)$, where $\beta$ is a vector of $m$ parameters. Using the LSQ method, the sum $S(\beta)$ to be minimized is

$$S(\beta) = \sum_{k=1}^{n} \big(y_k - f(x_k, \beta)\big)^2$$

The minimum of $S(\beta)$ occurs in a stationary point, i.e. $\nabla S(\beta) = 0$. The partial derivatives are

$$\frac{\partial S}{\partial \beta_j} = -2\sum_{k=1}^{n} \big(y_k - f(x_k, \beta)\big)\,\frac{\partial f(x_k, \beta)}{\partial \beta_j}$$

Since $\partial f(x_k, \beta)/\partial \beta_j$ is a function of both the independent variable and the parameters, the parameters cannot be determined directly. This implies that no direct method to determine the parameters (i.e. the model) exists. Consequently, the non-linear model is found by approximating it by linear models, and the problem is mostly solved by iterative refinement. As the technique of iterative refinement is demanding and complex, other techniques are used to model non-linear functions whenever possible. One such technique is linearization, where a non-linear data set is transformed into a linear domain and the squared errors (in the linear domain) are then minimized by the linear LSQ method. This technique can be applied to regression models that can be transformed to be linear in the parameters, for instance exponential and power functions. The technique is discussed in section 4.6.

Other non-linear models can be considered as linear in the parameters and can easily be solved by either the simple or the multiple linear LSQ method (the technique to solve multiple linear models is shortly described in section 4.3.3). Examples of such models are polynomials, $f(x) = \beta_0 + \beta_1 \log(x)$ and $f(x) = \beta_0 + \beta_1 \sin(x) + \beta_2 \sin(2x)$.
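As an illustration of the iterative refinement mentioned above (not part of the thesis; the data are invented and SciPy's generic non-linear least squares routine is used as a stand-in for the iterative methods referred to), the following sketch fits an exponential model directly in the original variables.

```python
import numpy as np
from scipy.optimize import curve_fit

# Invented data roughly following y = 2 * 1.5^x with noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.6, 6.5, 10.4, 15.8])

def model(x, b, a):
    """Exponential model y = b * a^x, non-linear in the parameter a."""
    return b * a ** x

# curve_fit minimizes sum (y_k - f(x_k, beta))^2 by iterative refinement,
# starting from an initial guess for the parameters.
params, _ = curve_fit(model, x, y, p0=[1.0, 2.0])
b_hat, a_hat = params
print(b_hat, a_hat)
```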
4.3 The method of least squares

The method of LSQ is a standard approach to the approximate solution of overdetermined systems. Given a data set $D = \{(x_k, y_k),\ k = 1, \dots, n\}$ and a class of functions, the task is to find the model $f(x)$ which minimizes the sum

$$S(f) = \sum_{k=1}^{n} \big(y_k - f(x_k)\big)^2$$

This means finding a model $f(x)$ such that $\{(x_k, f(x_k)),\ k = 1, \dots, n\}$ is closest to the data set $D$, where closest is measured as the sum of squared vertical distances from the model to the data points. The LSQ method can be applied to all classes of functions, but as described in section 4.2 the method is only simple when the class of functions is linear. Thus the method will be described for the linear class of functions $F_L = \{f(x) = ax + b \mid a, b \in \mathbb{R}\}$. Using the method of linear LSQ, a unique linear model can be determined by solving the task

$t_M$: Minimize $\sum_{k=1}^{n} \big(y_k - (a x_k + b)\big)^2$.

In the following, the theorem giving the linear model and several techniques to solve task $t_M$ (prove the theorem) will be presented. Many techniques are needed to solve the task, but in each proof one technique is essential, and this technique will be highlighted. The essential techniques belong to the mathematical domains of elementary algebra, multivariable calculus and linear algebra. For shortness, the following notation will be used:

$$\bar{x} = \frac{1}{n}\sum_{k=1}^{n} x_k, \qquad \bar{y} = \frac{1}{n}\sum_{k=1}^{n} y_k$$

$$S_x = \sum_{k=1}^{n} (x_k - \bar{x}), \qquad S_x^2 = \sum_{k=1}^{n} (x_k - \bar{x})^2, \qquad S_y = \sum_{k=1}^{n} (y_k - \bar{y}), \qquad S_y^2 = \sum_{k=1}^{n} (y_k - \bar{y})^2$$

$$S_{xy} = \sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y}), \qquad T_{x^2} = \sum_{k=1}^{n} x_k^2, \qquad T_{xy} = \sum_{k=1}^{n} x_k y_k$$

Note that $S_x^2$ and $S_y^2$ denote the sums of squared deviations, not the squares of $S_x$ and $S_y$.

Theorem 4.3.0.1 (Linear LSQ). Let $D$ with $n \geq 2$ be given and assume not all $x_k$ are equal. The linear model $y = ax + b$ which minimizes the sum $S(a, b) = \sum_{k=1}^{n} \big(y_k - (a x_k + b)\big)^2$ is given by

$$a = \frac{S_{xy}}{S_x^2} = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}{\sum_{k=1}^{n} (x_k - \bar{x})^2}, \qquad b = \bar{y} - a\bar{x}$$
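As a quick sanity check of the closed form in Theorem 4.3.0.1 (a minimal sketch with invented data, assuming NumPy), $a$ and $b$ can be computed directly from the formulas and compared with a standard library fit, which minimizes the same sum of squared vertical errors.

```python
import numpy as np

# Invented data set D.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.9])

Sxy = np.sum((x - x.mean()) * (y - y.mean()))   # S_xy
Sx2 = np.sum((x - x.mean()) ** 2)               # S_x^2
a = Sxy / Sx2
b = y.mean() - a * x.mean()

# np.polyfit with degree 1 solves the same linear LSQ problem.
a_ref, b_ref = np.polyfit(x, y, deg=1)
assert np.isclose(a, a_ref) and np.isclose(b, b_ref)
print(a, b)
```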
4.3.1 Elementary algebra

The following proof is based on elementary algebra and the technique is completing the square. The technique is well known to students in the second and third year of high school, where it is taught in relation to quadratic polynomials². In this proof and the subsequent ones the following rules of summation are used:

$$\sum_{k=1}^{n} a_k \pm \sum_{k=1}^{n} b_k = \sum_{k=1}^{n} (a_k \pm b_k), \qquad \sum_{k=1}^{n} c \cdot a_k = c\sum_{k=1}^{n} a_k, \qquad \sum_{k=1}^{n} c = nc$$

The proof is based on Winsløw (2015).

² The technique is used to convert a quadratic polynomial of the form $ax^2 + bx + c$ to the form $a(x + h)^2 + k$, with $h, k$ constants.

4.3.1.1 1st proof of Theorem 4.3.0.1

Proof. Given $D$ with $n \geq 2$ and not all $x_k$ equal, determine $a, b$ that minimize the sum $S(a, b) = \sum_{k=1}^{n} \big(y_k - (a x_k + b)\big)^2$.

First note that $S_x = \sum_{k=1}^{n}(x_k - \bar{x}) = \sum_{k=1}^{n} x_k - n\bar{x} = n\bar{x} - n\bar{x} = 0$ and similarly $S_y = \sum_{k=1}^{n}(y_k - \bar{y}) = 0$. Since not all $x_k$ are equal, $S_x^2 > 0$. Let $D = \bar{y} - b - a\bar{x}$. Then

$$S(a, b) = \sum_{k=1}^{n} (y_k - a x_k - b)^2 = \sum_{k=1}^{n} \big(\bar{y} + (y_k - \bar{y}) - a(\bar{x} + x_k - \bar{x}) - b\big)^2 = \sum_{k=1}^{n} \big(D + (y_k - \bar{y}) - a(x_k - \bar{x})\big)^2$$
$$= \sum_{k=1}^{n} \Big(D^2 + 2D(y_k - \bar{y}) - 2aD(x_k - \bar{x}) + (y_k - \bar{y})^2 - 2a(x_k - \bar{x})(y_k - \bar{y}) + a^2(x_k - \bar{x})^2\Big)$$
$$= nD^2 + S_y^2 - 2aS_{xy} + a^2 S_x^2 \qquad \text{(using } S_x = S_y = 0\text{)}$$
$$= nD^2 + S_x^2\Big(a^2 - 2a\frac{S_{xy}}{S_x^2} + \Big(\frac{S_{xy}}{S_x^2}\Big)^2\Big) - \frac{(S_{xy})^2}{S_x^2} + S_y^2 = nD^2 + S_x^2\Big(a - \frac{S_{xy}}{S_x^2}\Big)^2 - \frac{(S_{xy})^2}{S_x^2} + S_y^2$$

The first two terms depend on the parameters $a, b$, while the last two terms are constants. Both parameter-dependent terms are non-negative and can be made zero simultaneously, so $S(a, b)$ is minimal exactly when both are zero.

Since $S_x^2 > 0$: $S_x^2\big(a - \tfrac{S_{xy}}{S_x^2}\big)^2 = 0 \iff a = \tfrac{S_{xy}}{S_x^2}$. Since $n \neq 0$: $nD^2 = 0 \iff D = 0 \iff b = \bar{y} - a\bar{x}$.

The sum $S(a, b)$ is thus minimal when

$$a = \frac{S_{xy}}{S_x^2} = \frac{\sum_{k=1}^{n}(x_k - \bar{x})(y_k - \bar{y})}{\sum_{k=1}^{n}(x_k - \bar{x})^2}, \qquad b = \bar{y} - a\bar{x}$$

In the domain of elementary algebra many different techniques for the proof can be found, and the technique used influences the didactic process and the mathematical praxeology constructed. In the following proof another technique in the domain of elementary algebra is used: the problem is restricted to a centralized data set, using that $\min f(a, b) = \min \varphi(a) + \min \psi(b)$ when $f(a, b) = \varphi(a) + \psi(b)$. The proof is designed by me based on Key (2005) and Nielsen (2015).

4.3.1.2 2nd proof of Theorem 4.3.0.1

Proof. Given $D$ with $n \geq 2$ and not all $x_k$ equal, determine $a, b$ that minimize the sum $S(a, b) = \sum_{k=1}^{n} \big(y_k - (a x_k + b)\big)^2$.

Let $\{(h_k, v_k) \mid k = 1, \dots, n\}$ be the data set with $h_k = x_k - \bar{x}$ and $v_k = y_k - \bar{y}$. The data set is centralized, since

$$\sum_{k=1}^{n} h_k = \sum_{k=1}^{n}(x_k - \bar{x}) = \sum_{k=1}^{n} x_k - n\bar{x} = n\bar{x} - n\bar{x} = 0$$

and similarly $\sum_{k=1}^{n} v_k = 0$. The best line $v = ah + b$ of the centralized data set minimizes

$$f(a, b) = \sum_{k=1}^{n}(v_k - ah_k - b)^2 = \sum_{k=1}^{n}\big(v_k^2 - 2ah_k v_k - 2bv_k + a^2 h_k^2 + 2abh_k + b^2\big)$$
$$= \sum_{k=1}^{n} v_k^2 - 2a\sum_{k=1}^{n} h_k v_k - 2b\sum_{k=1}^{n} v_k + a^2\sum_{k=1}^{n} h_k^2 + 2ab\sum_{k=1}^{n} h_k + nb^2 = \sum_{k=1}^{n} v_k^2 - 2a\sum_{k=1}^{n} h_k v_k + a^2\sum_{k=1}^{n} h_k^2 + nb^2$$

The sum $f(a, b)$ can thus be written as the sum of two second degree polynomials $\varphi(a) + \psi(b)$ in $a$ and $b$, respectively. Using $\min f(a, b) = \min \varphi(a) + \min \psi(b)$, the polynomials $\varphi(a)$ and $\psi(b)$ are minimized separately. Since $\sum_{k=1}^{n} h_k^2 > 0$, the polynomial

$$\varphi(a) = a^2\sum_{k=1}^{n} h_k^2 - 2a\sum_{k=1}^{n} h_k v_k + \sum_{k=1}^{n} v_k^2$$

is minimal for $a = \dfrac{\sum_{k=1}^{n} h_k v_k}{\sum_{k=1}^{n} h_k^2}$, and $\psi(b) = nb^2$ is minimal for $b = 0$. The sum $f(a, b)$ is therefore minimized when $a = \dfrac{\sum_{k=1}^{n} h_k v_k}{\sum_{k=1}^{n} h_k^2}$ and $b = 0$.

Since a translation does not change the slope of the line, the best line $y = ax + b$ for the data set $D$ has

$$a = \frac{\sum_{k=1}^{n} h_k v_k}{\sum_{k=1}^{n} h_k^2} = \frac{\sum_{k=1}^{n}(x_k - \bar{x})(y_k - \bar{y})}{\sum_{k=1}^{n}(x_k - \bar{x})^2}$$

The intercept $b$ follows from $v = ah \iff y - \bar{y} = a(x - \bar{x}) \iff y = a(x - \bar{x}) + \bar{y}$, thus $b = \bar{y} - a\bar{x}$.
4.3.2 Multivariable calculus

The following proof is based on multivariable calculus. The technique is to determine the global minimum of a function of two variables. The technique requires knowledge of functions of two variables, more precisely: how to differentiate a function of two variables (both first and second derivatives), how to calculate extreme values, i.e. the second derivative test, and a method to determine whether the points are local or global. Before the proof, a definition and a theorem from first year calculus at university are stated (Kro, 2003). The proof is based on the definition and the theorem.

Definition 4.3.2.1 (Stationary point). Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function of $n$ variables. A point $p$ is called stationary if $\nabla f(p) = 0$.

Theorem 4.3.2.1 (Second derivative test). Let $f(x, y)$ be a $C^2$-function and assume that $(a, b)$ is a stationary point of $f$. Let $Hf(a, b)$ be the Hessian of $f$ in $(a, b)$ and let $D = \det(Hf(a, b))$ be its determinant.

• If $D < 0$, then $(a, b)$ is a saddle point.
• If $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(a, b) > 0$, then $(a, b)$ is a local minimum.
• If $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(a, b) < 0$, then $(a, b)$ is a local maximum.
• If $D = 0$, the test gives no conclusion.

The following proof is mainly based on Miller (2006).

4.3.2.1 3rd proof of Theorem 4.3.0.1

Proof. Given $D$ with $n \geq 2$ and not all $x_k$ equal, determine $a, b$ that minimize the sum $S(a, b) = \sum_{k=1}^{n} \big(y_k - (a x_k + b)\big)^2$.

First the stationary points of the function $S(a, b)$ are determined. They are given by $\nabla S(a, b) = 0$, i.e. $\frac{\partial S}{\partial a}(a, b) = 0$ and $\frac{\partial S}{\partial b}(a, b) = 0$. The derivatives of the function are:

$$\frac{\partial S}{\partial a} = \sum_{k=1}^{n} 2(y_k - a x_k - b)(-x_k) = -2\sum_{k=1}^{n}\big(y_k x_k - a x_k^2 - b x_k\big) = -2\big(T_{xy} - aT_{x^2} - bn\bar{x}\big)$$

$$\frac{\partial S}{\partial b} = \sum_{k=1}^{n} 2(y_k - a x_k - b)(-1) = -2\sum_{k=1}^{n}\big(y_k - a x_k - b\big) = -2n\big(\bar{y} - a\bar{x} - b\big)$$

Setting the derivatives equal to 0 and dividing by $-2$ gives

$$T_{xy} - aT_{x^2} - bn\bar{x} = 0, \qquad n(\bar{y} - a\bar{x} - b) = 0$$

Rearranging the terms gives a linear system of two equations in the two unknowns $(a, b)$:

$$T_{xy} = aT_{x^2} + bn\bar{x}, \qquad n\bar{y} = n(a\bar{x} + b)$$

The solution is $b = \bar{y} - a\bar{x}$ together with

$$T_{xy} = aT_{x^2} + (\bar{y} - a\bar{x})n\bar{x} \iff T_{xy} - n\bar{x}\bar{y} = a\big(T_{x^2} - n\bar{x}^2\big) \iff a = \frac{T_{xy} - n\bar{x}\bar{y}}{T_{x^2} - n\bar{x}^2} = \frac{\sum_{k=1}^{n} x_k y_k - n\bar{x}\bar{y}}{\sum_{k=1}^{n} x_k^2 - n\bar{x}^2}$$

Using $S_{xy} = \sum_{k=1}^{n}(x_k - \bar{x})(y_k - \bar{y}) = \sum_{k=1}^{n} x_k y_k - n\bar{x}\bar{y}$ and $S_x^2 = \sum_{k=1}^{n}(x_k - \bar{x})^2 = \sum_{k=1}^{n} x_k^2 - n\bar{x}^2$ gives $a = S_{xy}/S_x^2$.

The function therefore has exactly one stationary point, namely $a = S_{xy}/S_x^2$, $b = \bar{y} - a\bar{x}$. Since the function has only one stationary point and $S(a, b) \geq 0$, a local minimum is also a global minimum, so it is enough to show that the stationary point is a local minimum. By Theorem 4.3.2.1 the stationary point is a local minimum if $\det(HS(a, b)) > 0$ and $\frac{\partial^2 S}{\partial a^2}(a, b) > 0$. The second derivatives are

$$\frac{\partial^2 S}{\partial a^2}(a, b) = 2\sum_{k=1}^{n} x_k^2 = 2T_{x^2}, \qquad \frac{\partial^2 S}{\partial b^2}(a, b) = 2n, \qquad \frac{\partial^2 S}{\partial b\,\partial a}(a, b) = 2\sum_{k=1}^{n} x_k = 2n\bar{x}$$

and the determinant of the Hessian is

$$\det(HS(a, b)) = \begin{vmatrix} 2T_{x^2} & 2n\bar{x} \\ 2n\bar{x} & 2n \end{vmatrix} = 4nT_{x^2} - (2n\bar{x})^2 = 4n\big(T_{x^2} - n\bar{x}^2\big) = 4n\sum_{k=1}^{n}(x_k - \bar{x})^2 > 0$$

Since $\frac{\partial^2 S}{\partial a^2}(a, b) = 2\sum_{k=1}^{n} x_k^2 > 0$, the stationary point corresponds to a local and thus a global minimum of $S(a, b)$. It is proved that $S(a, b)$ has its global minimum when $a = S_{xy}/S_x^2$ and $b = \bar{y} - a\bar{x}$.
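The computations in the 3rd proof can be checked symbolically. The following sketch (not from the thesis; a tiny invented data set, assuming SymPy) solves $\nabla S = 0$ and inspects the Hessian as in Theorem 4.3.2.1.

```python
import sympy as sp

a, b = sp.symbols('a b', real=True)

# Tiny invented data set.
xs = [1, 2, 3, 4]
ys = [2, 3, 5, 6]

S = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Stationary point: solve grad S = 0 for (a, b).
sol = sp.solve([sp.diff(S, a), sp.diff(S, b)], [a, b], dict=True)[0]

# Second derivative test: Hessian determinant and S_aa are both positive.
H = sp.hessian(S, (a, b))
print(sol)                       # slope S_xy / S_x^2 and intercept ybar - a*xbar
print(H.det() > 0, H[0, 0] > 0)  # True True -> the stationary point is a minimum
```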
4.3.3 Linear algebra

The last technique to prove Theorem 4.3.0.1 is based on linear algebra. Using this technique the data are treated as vectors, and the technique is to use orthogonal projection to minimize the length of the residual vector. Before the proof, a definition and two theorems that the proof is based on are given.

Definition 4.3.3.1 (Orthogonal projection). Let $V$ be a Euclidean space with a vector $v$ and let $W$ be a linear subspace of $V$. A vector $w \in W$ is the orthogonal projection of $v$ onto $W$ if

• $w \in W$
• $(v - w) \perp W$.

The vector $w$ is denoted $\operatorname{proj}_W v$.

Theorem 4.3.3.1 (The approximation theorem). Let $V$ be a Euclidean space with a vector $v$ and let $W$ be a linear subspace of $V$. Then $\operatorname{proj}_W v$ is the vector in $W$ closest to $v \in V$, and it is the only vector with this property:

$$\|v - \operatorname{proj}_W v\| \leq \|v - w\| \quad \forall w \in W$$

Proof. Let $w$ be an arbitrary vector in $W$.

Figure 7: Projection of $v$ onto $W$

$$\|v - w\|^2 = \|v - \operatorname{proj}_W v + \operatorname{proj}_W v - w\|^2 = \|v - \operatorname{proj}_W v\|^2 + \|\operatorname{proj}_W v - w\|^2 + 2\langle v - \operatorname{proj}_W v,\ \operatorname{proj}_W v - w\rangle = \|v - \operatorname{proj}_W v\|^2 + \|\operatorname{proj}_W v - w\|^2$$

using that $\operatorname{proj}_W v - w \in W$ (see figure 7) and consequently, by Definition 4.3.3.1, $v - \operatorname{proj}_W v \perp \operatorname{proj}_W v - w$, i.e. $\langle v - \operatorname{proj}_W v, \operatorname{proj}_W v - w\rangle = 0$. Since $\|\operatorname{proj}_W v - w\|^2 \geq 0$ we have $\|v - \operatorname{proj}_W v\| \leq \|v - w\|$ for all $w \in W$.

Next it is shown that $\operatorname{proj}_W v$ is unique. Write $v = \operatorname{proj}_W v + m$ for a vector $m \in W^\perp$. Assume $\operatorname{proj}_W v$ is not unique and write $v = p + n$ with $p \in W$, $n \in W^\perp$. Then $\operatorname{proj}_W v + m = p + n$, i.e. $\operatorname{proj}_W v - p = n - m$. Since $\operatorname{proj}_W v - p \in W$ and $n - m \in W^\perp$, we get $\langle \operatorname{proj}_W v - p, \operatorname{proj}_W v - p\rangle = \langle \operatorname{proj}_W v - p, n - m\rangle = 0$, so $\operatorname{proj}_W v = p$ and the projection is unique.

The next theorem uses Theorem 4.3.3.1 to give the best approximate solution of an overdetermined system. In simple linear regression with $n$ data points we have a system of $n$ equations with two unknowns $(a, b)$, and the proof of the best linear model (Theorem 4.3.0.1) is a special case of the following theorem.

Theorem 4.3.3.2 (The best approximate solution to an overdetermined system). Let $A\vec{x} = \vec{b}$, with $A$ an $n \times k$ matrix, $\vec{x} \in \mathbb{R}^k$, $\vec{b} \in \mathbb{R}^n$ and $A \neq 0$, $\vec{x} \neq \vec{0}$, $\vec{b} \neq \vec{0}$. The best solution $\vec{x}^{\,*}$ to the system is given by

$$\vec{x}^{\,*} = (A^T A)^{-1} A^T \vec{b}$$

The proof of Theorem 4.3.3.2 is mainly based on Khan Academy.

Proof. Let $A\vec{x} = \vec{b}$, with $A$ an $n \times k$ matrix, $\vec{x} \in \mathbb{R}^k$, $\vec{b} \in \mathbb{R}^n$, and write

$$A = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_k \end{bmatrix}, \qquad \vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix}, \qquad \vec{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$

If a solution to $A\vec{x} = \vec{b}$ exists, it fulfils $\vec{a}_1 x_1 + \vec{a}_2 x_2 + \dots + \vec{a}_k x_k = \vec{b}$. This means that there must exist a linear combination of the column vectors of $A$ equal to $\vec{b}$, i.e. $\vec{b}$ must lie in the column space of $A$, denoted $C(A)$. If $\vec{b} \notin C(A)$, then $A\vec{x} \neq \vec{b}$ and the system cannot be solved. In this case a solution $\vec{x}^{\,*}$ that gets $A\vec{x}^{\,*}$ as close as possible to $\vec{b}$ is determined, i.e. a solution $\vec{x}^{\,*}$ such that

$$\|A\vec{x}^{\,*} - \vec{b}\| \leq \|A\vec{x} - \vec{b}\| \quad \forall \vec{x} \in \mathbb{R}^k$$

Since $A\vec{x}^{\,*} \in C(A)$, $\vec{b} \in \mathbb{R}^n$ and $C(A)$ is a linear subspace of $\mathbb{R}^n$, $\operatorname{proj}_{C(A)}\vec{b}$ is the vector in $C(A)$ closest to $\vec{b}$ (Theorem 4.3.3.1). Hence $\|A\vec{x}^{\,*} - \vec{b}\|$ is minimized when $A\vec{x}^{\,*} = \operatorname{proj}_{C(A)}\vec{b}$, see figure 8.

Figure 8: Projection of $\vec{b}$ onto $C(A)$

By Definition 4.3.3.1,

$$A\vec{x}^{\,*} - \vec{b} \perp C(A) \iff A\vec{x}^{\,*} - \vec{b} \in C(A)^\perp \iff A\vec{x}^{\,*} - \vec{b} \in N(A^T)$$

where $N(A^T)$ denotes the null space of $A^T$. The best solution is therefore given by

$$A\vec{x}^{\,*} - \vec{b} \in N(A^T) \iff A^T(A\vec{x}^{\,*} - \vec{b}) = 0 \iff A^T A\vec{x}^{\,*} = A^T\vec{b} \iff \vec{x}^{\,*} = (A^T A)^{-1} A^T\vec{b}$$

In the last equivalence it is used that the column vectors of $A$ are linearly independent, which implies that $A$ has full column rank and thus that $(A^T A)^{-1}$ exists.

In the proof the best approximate solution is obtained by minimizing $\|A\vec{x}^{\,*} - \vec{b}\|$. Let $A\vec{x}^{\,*} = v$; then

$$\|A\vec{x}^{\,*} - \vec{b}\| = \|v - \vec{b}\| = \sqrt{(v_1 - b_1)^2 + (v_2 - b_2)^2 + \dots + (v_n - b_n)^2}$$

For that reason the technique is equivalent to the linear LSQ method and can be used to prove the best linear model (Theorem 4.3.0.1).
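A minimal sketch of Theorem 4.3.3.2 (invented data, assuming NumPy; not part of the thesis) computes the best approximate solution from the normal equations and compares it with a least-squares solver.

```python
import numpy as np

# Overdetermined system A x = b: a 5x2 design with no exact solution (invented data).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0],
              [1.0, 5.0]])
b = np.array([2.2, 2.8, 4.1, 4.9, 6.3])

# Best approximate solution x* = (A^T A)^{-1} A^T b, solved without an explicit inverse.
x_star = np.linalg.solve(A.T @ A, A.T @ b)

# The same solution from a generic least-squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_star, x_lstsq)
print(x_star)   # [intercept, slope]
```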
4.3.3.1 4th proof of Theorem 4.3.0.1

The proof is a special case of Theorem 4.3.3.2 with $X$ an $n \times 2$ matrix, $\vec{y} \in \mathbb{R}^n$ and $\vec{p} \in \mathbb{R}^2$.

Proof. Given $D$ with $n \geq 2$ and not all $x_k$ equal, let

$$X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \qquad \vec{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \qquad \vec{p} = \begin{bmatrix} b \\ a \end{bmatrix}$$

From Theorem 4.3.3.2 the best solution to $X\vec{p} = \vec{y}$ is given by $\vec{p} = (X^T X)^{-1}X^T\vec{y}$. We want to determine $\vec{p}$:

$$X^T = \begin{bmatrix} 1 & 1 & \dots & 1 \\ x_1 & x_2 & \dots & x_n \end{bmatrix}, \qquad X^T X = \begin{bmatrix} n & \sum_{k=1}^{n} x_k \\ \sum_{k=1}^{n} x_k & \sum_{k=1}^{n} x_k^2 \end{bmatrix} = \begin{bmatrix} n & n\bar{x} \\ n\bar{x} & T_{x^2} \end{bmatrix}$$

$$(X^T X)^{-1} = \frac{1}{nT_{x^2} - (n\bar{x})^2}\begin{bmatrix} T_{x^2} & -n\bar{x} \\ -n\bar{x} & n \end{bmatrix}$$

$$(X^T X)^{-1}X^T = \frac{1}{nT_{x^2} - (n\bar{x})^2}\begin{bmatrix} T_{x^2} - x_1 n\bar{x} & \dots & T_{x^2} - x_n n\bar{x} \\ nx_1 - n\bar{x} & \dots & nx_n - n\bar{x} \end{bmatrix}$$

$$(X^T X)^{-1}X^T\vec{y} = \frac{1}{nT_{x^2} - (n\bar{x})^2}\begin{bmatrix} y_1\big(T_{x^2} - x_1 n\bar{x}\big) + \dots + y_n\big(T_{x^2} - x_n n\bar{x}\big) \\ y_1\big(nx_1 - n\bar{x}\big) + \dots + y_n\big(nx_n - n\bar{x}\big) \end{bmatrix} = \begin{bmatrix} \dfrac{n\bar{y}T_{x^2} - T_{xy}n\bar{x}}{nT_{x^2} - (n\bar{x})^2} \\[2ex] \dfrac{nT_{xy} - n^2\bar{x}\bar{y}}{nT_{x^2} - (n\bar{x})^2} \end{bmatrix}$$

The parameters $a, b$ are thus

$$\vec{p} = \begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} \dfrac{n\bar{y}T_{x^2} - T_{xy}n\bar{x}}{nT_{x^2} - (n\bar{x})^2} \\[2ex] \dfrac{nT_{xy} - n^2\bar{x}\bar{y}}{nT_{x^2} - (n\bar{x})^2} \end{bmatrix} = \begin{bmatrix} \bar{y} - \bar{x}a \\[1ex] \dfrac{S_{xy}}{S_x^2} \end{bmatrix}$$

In the rewriting it is used that

$$b = \frac{n\bar{y}T_{x^2} - T_{xy}n\bar{x}}{nT_{x^2} - (n\bar{x})^2} = \frac{\bar{y}T_{x^2} - T_{xy}\bar{x}}{T_{x^2} - n\bar{x}^2} = \frac{\bar{y}T_{x^2} - \bar{y}n\bar{x}^2 - T_{xy}\bar{x} + \bar{y}n\bar{x}^2}{T_{x^2} - n\bar{x}^2} = \frac{\bar{y}\big(T_{x^2} - n\bar{x}^2\big)}{T_{x^2} - n\bar{x}^2} - \bar{x}\,\frac{T_{xy} - \bar{y}n\bar{x}}{T_{x^2} - n\bar{x}^2} = \bar{y} - \bar{x}a$$

and

$$a = \frac{nT_{xy} - n^2\bar{x}\bar{y}}{nT_{x^2} - (n\bar{x})^2} = \frac{T_{xy} - n\bar{x}\bar{y}}{T_{x^2} - n\bar{x}^2} = \frac{S_{xy}}{S_x^2}$$

The proof of simple linear regression is thus a special case of Theorem 4.3.3.2. The theorem can also be used to determine the best multiple linear model (multiple linear regression). Let $f$ be a multiple linear function of $k$ variables, so that $f(x_1, \dots, x_k) = a_0 + a_1 x_1 + \dots + a_k x_k$, and let $n$ data points in $\mathbb{R}^{k+1}$ be given: $(x_{11}, \dots, x_{1k}, y_1), \dots, (x_{n1}, \dots, x_{nk}, y_n)$. Let $X$ be an $n \times (k+1)$ matrix, $\vec{y} \in \mathbb{R}^n$ and $\vec{p} \in \mathbb{R}^{k+1}$:

$$X = \begin{bmatrix} 1 & x_{11} & \dots & x_{1k} \\ 1 & x_{21} & \dots & x_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \dots & x_{nk} \end{bmatrix}, \qquad \vec{p} = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_k \end{bmatrix}, \qquad \vec{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

The parameters $\vec{p}$ are given by $\vec{p} = (X^T X)^{-1}X^T\vec{y}$ (Theorem 4.3.3.2).

The four proofs of Theorem 4.3.0.1 differ in complexity. The first two proofs are based on mathematical knowledge that can be found in high school, whereas the last two proofs are based on mathematics that is not taught in high school. Making the last two proofs teachable in high school while keeping their power would require the implementation of new mathematical domains (linear algebra, functions of two variables) in high school. This shows that these proofs cannot be implemented in high school without reducing their power or justification.
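Returning to the multiple linear model mentioned above, the following sketch (invented data, assuming NumPy; not part of the thesis) builds the design matrix $X$ and solves the normal equations, and shows how a polynomial is handled the same way by treating $x$ and $x^2$ as separate independent variables.

```python
import numpy as np

# Invented data: y depends on two independent variables x1 and x2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([0.5, 1.5, 1.0, 2.5, 2.0, 3.0])
y  = np.array([3.1, 5.2, 5.9, 8.8, 8.4, 11.1])

# Design matrix with a column of ones for the intercept a0.
X = np.column_stack([np.ones_like(x1), x1, x2])

# p = (X^T X)^{-1} X^T y gives the coefficients (a0, a1, a2).
p = np.linalg.solve(X.T @ X, X.T @ y)
print(p)

# A polynomial y = a0 + a1*x + a2*x^2 is fitted the same way,
# with x and x^2 as the two "independent variables".
Xpoly = np.column_stack([np.ones_like(x1), x1, x1 ** 2])
print(np.linalg.solve(Xpoly.T @ Xpoly, Xpoly.T @ y))
```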
4.4 Statistical approach

4.4.1 Maximum likelihood estimation

The statistical approach to simple linear regression is based on normally distributed stochastic variables and their probability density functions. Let $\{y = \beta + \alpha(x - \bar{x}) \mid \alpha, \beta \in \mathbb{R}\}$ be the class of linear functions describing the linear relation between the variables. Using maximum likelihood estimation, the parameters that make the observed values ($D$) most probable are estimated (Ditlevsen and Sørensen, 2009).

The basis of the statistical linear model $y(x) = \beta + \alpha(x - \bar{x})$ is independent stochastic variables $Y_1, \dots, Y_n$, each normally distributed with mean $\beta + \alpha(x_k - \bar{x})$ and variance $\sigma^2$, i.e. $Y_k \sim N(\beta + \alpha(x_k - \bar{x}), \sigma^2)$. Under the following assumptions the model determined will be the most probable:

• The mean of $Y_i$ is a linear function of $x_i$
• The $Y_i$ are independent and normally distributed
• The variance of the residuals is constant across observations (homoscedasticity)

(Ditlevsen and Sørensen, 2009, p. 105)

Assuming the $Y_k$ to be independent normally distributed stochastic variables, the most probable values of the parameters are found by maximizing the likelihood function. The likelihood function is the joint density of the $y_k$ regarded as a function of the parameters $\alpha, \beta, \sigma^2$:

$$L_y(\alpha, \beta, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left(-\frac{1}{2\sigma^2}\sum_{k=1}^{n}\big(y_k - \beta - \alpha(x_k - \bar{x})\big)^2\right)$$

The estimates which maximize the likelihood function are denoted $\hat{\alpha}, \hat{\beta}, \hat{\sigma}^2$ and fulfil $L_y(\hat{\alpha}, \hat{\beta}, \hat{\sigma}^2) \geq L_y(\alpha, \beta, \sigma^2)$ for all $(\alpha, \beta, \sigma^2) \in \mathbb{R} \times \mathbb{R} \times [0, \infty[$.

Maximizing the likelihood function is simplified to the problem of maximizing the log-likelihood function; as $\log(x)$ is a strictly increasing function, the transformation does not affect the optimum. The log-likelihood function is

$$\log\big(L_y(\alpha, \beta, \sigma^2)\big) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{k=1}^{n}\big(y_k - \beta - \alpha(x_k - \bar{x})\big)^2$$

The log-likelihood function is maximal in $(\alpha, \beta)$ when $\sum_{k=1}^{n}\big(y_k - \beta - \alpha(x_k - \bar{x})\big)^2$ is minimal. Thus the problem is reduced to the LSQ method, and the method to determine $\alpha, \beta$ turns out to be the same as the method described in section 4.3 (Ditlevsen and Sørensen, 2009). After determining $\alpha, \beta$, the variance can be determined, since the function is then reduced to a function of one variable. The estimates of the parameters are

$$\hat{\alpha} = \frac{S_{xy}}{S_x^2}, \qquad \hat{\beta} = \bar{y}, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{k=1}^{n}\big(y_k - \bar{y} - \hat{\alpha}(x_k - \bar{x})\big)^2$$

The estimators are independent and their distributions are

$$\hat{\alpha} \sim N(\alpha, \sigma^2/S_x^2), \qquad \hat{\beta} \sim N(\beta, \sigma^2/n), \qquad \frac{n\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-2}$$

Using the maximum likelihood function, the linear model is justified as the most probable model. This justification has a high epistemological value when considering and arguing for the definition of the best linear model, as it seems rational that the best linear model is the most probable one. However, the technique of maximum likelihood estimation is difficult to transpose into something simple while keeping its functional character, since both the maximum likelihood technique and the LSQ method have to be used.
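The reduction of maximum likelihood estimation to the LSQ method under normal errors can be illustrated numerically. The sketch below (invented, simulated data; assuming NumPy and SciPy) maximizes the normal log-likelihood by minimizing its negative and compares the slope and intercept with the LSQ closed form from Theorem 4.3.0.1.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 1.2 + 0.8 * (x - x.mean()) + rng.normal(0, 0.5, size=x.size)   # simulated data

def neg_log_likelihood(theta):
    alpha, beta, log_sigma = theta                # log_sigma keeps sigma positive
    sigma2 = np.exp(2 * log_sigma)
    resid = y - beta - alpha * (x - x.mean())
    return 0.5 * x.size * np.log(2 * np.pi * sigma2) + np.sum(resid ** 2) / (2 * sigma2)

mle = minimize(neg_log_likelihood, x0=[0.0, 0.0, 0.0]).x

# LSQ estimates on the centred parametrization y = beta + alpha*(x - xbar).
alpha_lsq = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta_lsq = y.mean()
print(mle[:2], alpha_lsq, beta_lsq)   # the ML estimates of (alpha, beta) match the LSQ ones
```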
4.4.2 Random variables and orthogonal projection

In the following, another statistical approach to simple linear regression is shown. The approach is based on a vector space of random variables and orthogonal projection. The best linear model (Theorem 4.3.0.1) is proved by considering a linear space of random variables and using orthogonal projection to show that the best predictor of $Y$ given $X$ is the conditional expectation $E(Y|X)$. The proof of the best linear model is a special case of a more general theorem, which is stated first. The theorems and proofs are mainly based on Kachapova and Kachapov (2010) and Grimmett and Stirzaker (2001).

Theorem 4.4.2.1 (Projection theorem). Let $W$ be a Euclidean space of random variables with scalar product given by $(X, Y) = E(X \cdot Y)$. Let $Y$ satisfy $E(Y^2) < \infty$ and let $M \in W$. The following two statements are equivalent:

$$E\big((Y - M)V\big) = 0 \quad \forall V \in W$$
$$\|Y - M\| \leq \|Y - V\| \quad \forall V \in W$$

Theorem 4.4.2.2. Let $X$ and $Y$ be random variables and suppose $E(Y^2) < \infty$. Let $W = \{f(X) \mid f : \mathbb{R} \to \mathbb{R},\ E(f(X)^2) < \infty\}$. The best predictor of $Y$ given $X$ is the conditional expectation $E(Y|X)$.

Proof. The best predictor $\hat{Y}$ of $Y$ has to fulfil that $\|Y - \hat{Y}\|$ is minimal. We have to prove that $E(Y|X) \in W$ and that $E\big((Y - E(Y|X))V\big) = 0$ for all $V \in W$ (Theorem 4.4.2.1).

First $E(Y|X) \in W$ is proved. Using the Cauchy–Schwarz inequality and the tower property $E\big(E(Y^2|X)\big) = E(Y^2)$:

$$E\big(E(Y|X)^2\big) \leq E\big(E(Y^2|X)\big) = E(Y^2)$$

Since $E(Y^2) < \infty$, it follows that $E\big(E(Y|X)^2\big) < \infty$ and so $E(Y|X) \in W$.

Now $E\big((Y - E(Y|X))V\big) = 0$ for all $V \in W$ is proved. Let $V = f(X) \in W$. Then

$$E\big((Y - E(Y|X))V\big) = E(Y \cdot f(X)) - E\big(E(Y|X) \cdot f(X)\big) = E(Y \cdot f(X)) - E(Y \cdot f(X)) = 0$$

using the property $E\big(r(X) \cdot E(Y|X)\big) = E\big(r(X) \cdot Y\big)$ for all $r : \mathbb{R} \to \mathbb{R}$. By Theorem 4.4.2.1, $\|Y - E(Y|X)\| \leq \|Y - V\|$ for all $V \in W$.

Theorem 4.4.2.2 gives that the best predictor is the conditional expectation. In the following the theorem is used in the special case where $W = \{f(X) = aX + b \mid a, b \in \mathbb{R}\}$.

Proof. Let $W = \{aX + b \mid a, b \in \mathbb{R}\}$ and let the vector in $W$ closest to $Y$ be $\beta + \alpha X$ for some $\alpha, \beta \in \mathbb{R}$ (Theorem 4.4.2.1). Let $\varepsilon = Y - \beta - \alpha X$. Then $E(\varepsilon \cdot 1) = 0$ and $E(\varepsilon \cdot X) = 0$, since $\varepsilon \in W^\perp$ and $1, X \in W$. This implies

$$E\big((Y - \varepsilon) \cdot 1\big) = E(Y \cdot 1) - E(\varepsilon \cdot 1) = E(Y), \qquad E\big((Y - \varepsilon) \cdot X\big) = E(Y \cdot X) - E(\varepsilon \cdot X) = E(Y \cdot X)$$

This leads to a system of two linear equations. Using $Y - \varepsilon = \beta + \alpha X$ gives

$$E(\beta + \alpha X) = E(Y) \iff \beta + \alpha E(X) = E(Y)$$
$$E(\beta X + \alpha X^2) = E(Y \cdot X) \iff \beta E(X) + \alpha E(X^2) = E(Y \cdot X)$$

The solution of the linear system is $\beta = E(Y) - \alpha E(X) = \mu_Y - \alpha\mu_X$ together with

$$\big(E(Y) - \alpha E(X)\big)E(X) + \alpha E(X^2) = E(Y \cdot X) \iff \alpha\big(E(X^2) - E(X)^2\big) = E(Y \cdot X) - E(Y)E(X) \iff \alpha = \frac{E(Y \cdot X) - E(Y)E(X)}{E(X^2) - E(X)^2} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X^2}$$

where $\mu_Y, \mu_X$ are the means, $\mathrm{Cov}(X, Y)$ the covariance and $\sigma_X^2$ the variance. Substituting $\mu_Y, \mu_X, \mathrm{Cov}(X, Y), \sigma_X^2$ with $\bar{y}, \bar{x}, S_{xy}, S_x^2$, the linear model becomes the model from Theorem 4.3.0.1.

4.5 The quality of a model

After a model has been determined, a natural question is: how well does the model represent the data set? To answer this question the model has to be evaluated. As several ways to estimate the quality of a model exist, only the simplest methods will be presented; the methods presented are equivalent to the methods in the high school textbooks.

4.5.1 Visual examination

A simple, though not quantitatively precise, way to evaluate models is to conduct a visual examination of the data and the error distribution by use of plots. A scatter plot can be used to study the relationship between two variables, since the plot provides a good indication of the nature of the relationship. Using a scatter plot, a linear relation is easy to recognize, whereas non-linear relations are more difficult to recognize because some non-linear relations look like each other (figure 9). As some non-linear functions can be transformed to be linear, plotting the transformed data can help recognize the relation. For instance, exponential and power functions become linear in the variables under a semi-log and a log-log transformation, respectively (see section 4.6).

The distribution of the errors can be investigated in residual plots. A residual plot is used to investigate obvious deviations from randomness between a data set and a model; when making models, the errors are assumed to be constant across observations. Figure 9 shows examples of scatter plots and residual plots.

Figure 9: Examples of scatter plots and residual plots for (a) an exponential model and (b) a power model. In (a) the errors are random, while the variance of the errors in (b) is systematic.

4.5.2 The correlation coefficient

The most familiar correlation coefficient is the Pearson correlation coefficient, which measures the strength of a linear relation between two variables. Two strongly related definitions of the Pearson correlation coefficient appear: the population version and the sample version.

Definition 4.5.2.1 (Sample Pearson correlation coefficient). The correlation coefficient $r \in [-1, 1]$ is given by

$$r = \frac{\sum_{k=1}^{n}(x_k - \bar{x})(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{n}(x_k - \bar{x})^2}\sqrt{\sum_{k=1}^{n}(y_k - \bar{y})^2}} = \frac{S_{xy}}{\sqrt{S_x^2 \cdot S_y^2}}$$

The correlation coefficient is a measure of the strength and direction of a simple linear relation between two variables $x$ and $y$. If $r < 0$ the slope of the linear model is negative, while $r > 0$ corresponds to a positive slope. A value of 0 implies that there is no linear correlation between the variables, while $|r| = 1$ implies that the linear model describes the relationship perfectly. When $|r|$ is close to 1 the variables are strongly related.
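A minimal numerical sketch of Definition 4.5.2.1 (invented data, assuming NumPy; not part of the thesis) computes $r$ from the formula and compares it with a library routine.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # invented data
y = np.array([1.8, 4.1, 5.9, 8.2, 9.8])

Sxy = np.sum((x - x.mean()) * (y - y.mean()))
r = Sxy / np.sqrt(np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2))

# np.corrcoef returns the correlation matrix; the off-diagonal entry is r.
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
print(r)
```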
The formula for the correlation coefficient is complex, and from the formula it is not intuitive why the coefficient measures strength and direction. Using vectors can help make sense of the formula. Consider the data set $D$ as vectors $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$. Then the correlation coefficient is the cosine of the angle $\theta$ between the centred vectors $z = (x_1 - \bar{x}, \dots, x_n - \bar{x})$ and $w = (y_1 - \bar{y}, \dots, y_n - \bar{y})$:

$$\cos(\theta) = \frac{z \cdot w}{\|z\| \cdot \|w\|} = \frac{\sum_{k=1}^{n}(x_k - \bar{x})(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{n}(x_k - \bar{x})^2}\sqrt{\sum_{k=1}^{n}(y_k - \bar{y})^2}} = \frac{S_{xy}}{\sqrt{S_x^2 S_y^2}}$$

Knowledge of vectors and linear algebra can be used to justify why the coefficient measures strength and direction. If the two centred vectors are orthogonal, then $r = \cos(\theta) = 0$, while two vectors pointing roughly in the same direction are closely related: the angle between them is small and $\cos(\theta)$ is close to 1.

The correlation coefficient can be useful if the data have a linear relation, but the coefficient cannot be used blindly, as its value does not characterize the relationship. Anscombe's quartet (figure 10) illustrates how four very differently distributed data sets can have equal correlation coefficients. For instance, Set 3 is perfectly linear except for one outlier, while Set 2 is non-linear. The example confirms that the correlation coefficient has to be combined with a visual examination before a model can be evaluated.

Figure 10: Anscombe's quartet: four sets of data with the same correlation coefficient r = 0.816. Retrieved 18/2 2015 from http://2.bp.blogspot.com/_IFzDPHUxHI0/SG0ocfCh01I/AAAAAAAAADI/VAqSLJd0dLc/s400/anscombe_quartet.gif

4.5.3 The coefficient of determination

The coefficient of determination measures (like the correlation coefficient) the strength of a model.

Definition 4.5.3.1 (The coefficient of determination of simple linear models). The coefficient of determination $r^2 \in [0, 1]$ is given by

$$r^2 = \frac{\sum_{k=1}^{n}(a x_k + b - \bar{y})^2}{\sum_{k=1}^{n}(y_k - \bar{y})^2}$$

In the case of linear models, the coefficient of determination is the square of the correlation coefficient, and the formula for $r^2$ can be inferred from the correlation coefficient:

$$r^2 = \frac{(S_{xy})^2}{S_x^2 S_y^2} = \left(\frac{S_{xy}}{S_x^2}\right)^2 \cdot \frac{S_x^2}{S_y^2} = a^2 \cdot \frac{S_x^2}{S_y^2} = \frac{a^2\sum_{k=1}^{n}(x_k^2 + \bar{x}^2 - 2x_k\bar{x})}{S_y^2} = \frac{\sum_{k=1}^{n}(a x_k - a\bar{x})^2}{S_y^2} = \frac{\sum_{k=1}^{n}(a x_k + b - \bar{y})^2}{\sum_{k=1}^{n}(y_k - \bar{y})^2}$$

where the last step uses $b = \bar{y} - a\bar{x}$, so that $a x_k - a\bar{x} = a x_k + b - \bar{y}$ (Winsløw, 2015).

The domain $[0, 1]$ of the coefficient of determination can be proved by using that $S(a, b) \geq 0$ and that $S(a, b) = S_y^2 - \frac{(S_{xy})^2}{S_x^2}$ when $S(a, b)$ is minimized (a result from the 1st proof):

$$r^2 = \frac{(S_{xy})^2}{S_x^2 S_y^2} = \frac{\frac{(S_{xy})^2}{S_x^2}}{S_y^2} = \frac{\frac{(S_{xy})^2}{S_x^2}}{S(a, b) + \frac{(S_{xy})^2}{S_x^2}} \leq 1$$

(Winsløw, 2015).

Generalizing the formula for the coefficient of determination to all classes of functions yields

$$r^2 = \frac{\sum_{k=1}^{n}\big(f(x_k) - \bar{y}\big)^2}{\sum_{k=1}^{n}(y_k - \bar{y})^2}$$

The sum in the numerator is called the explained sum, as it measures the squared distance between the model and the mean, while the sum in the denominator is called the total sum, as it is the sum of squared distances between the observed values and the mean. Thus, the coefficient of determination can be described as the proportion of the variation in the dependent variable $y$ which the independent variable $x$ can explain. If $r^2 = 0.8$, the $x$-variable can explain 80% of the variation in $y$, while the remaining 20% must be explained by other factors. An $r^2$ value close to 1 indicates a strong relation.
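The relation between the explained sum, the total sum and the correlation coefficient can be checked with a short sketch (invented data, assuming NumPy; not part of the thesis).

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # invented data
y = np.array([1.8, 4.1, 5.9, 8.2, 9.8])

a, b = np.polyfit(x, y, deg=1)                     # linear LSQ model
explained = np.sum((a * x + b - y.mean()) ** 2)    # explained sum
total = np.sum((y - y.mean()) ** 2)                # total sum
r2 = explained / total

r = np.corrcoef(x, y)[0, 1]
assert np.isclose(r2, r ** 2)   # for the linear LSQ model, r^2 is the squared correlation
print(r2)
```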
The coefficient of determination is a useful tool to express how certain predictions can be made; however, using the quantity to choose and justify the class of functions used to model a data set requires caution. Firstly, a perfect model can be found for every data set using the class of polynomials, and a model that perfectly fits a data set does not imply that the type of model (the class of functions) is useful for describing the actual relation. Further, it is not possible to decide the class of functions based on the $r^2$-value alone, which Anscombe's quartet was an example of. Consequently, to choose and justify the class of functions used to model a data set, the phenomenon behind the data has to be considered. Thus the determination of the class of functions requires the construction of mathematical praxeologies mixed with non-mathematical objects. As non-mathematical objects have to be considered to evaluate the class of functions, the criteria for a good model differ and no general criterion can be set up. If the purpose is to demonstrate theoretical laws, $r^2$ values of 0.9–0.95 are interpreted as small values, while values of 0.6–0.7 are high when verifying relations in social science, as many other factors influence the relation.

4.6 Non-linear regression

As demonstrated earlier, it is difficult to use the LSQ method to make non-linear regression, because no direct method to minimize the sum exists. Because of that, the determination of non-linear models is reduced to tasks which are simpler to solve. The non-linear functions which can be transformed to be linear in the parameters are handled by the linear LSQ method on the transformed data set. Two non-linear functions which can be transformed to be linear are the exponential and the power functions, and therefore the best exponential and power models are found by linearization and linear LSQ on the transformed data.

Exponential regression

An exponential function is linearized by use of the properties of the logarithm:

$$y = b \cdot a^x \iff \log(y) = \log(b \cdot a^x) \iff \log(y) = \log(b) + x \cdot \log(a)$$

If $D$ is the original data set, the transformed data set is $\{(x_k, \log(y_k)) \mid k = 1, \dots, n\}$. Linear regression on the transformed data set gives a linear model $\log(y) = Ax + B$. Using that $\log(x)$ and $10^x$ are inverse functions, the model is transformed back to an exponential model by

$$\log(y) = Ax + B \iff y = (10^A)^x \cdot 10^B = a^x \cdot b, \quad \text{with } a = 10^A \text{ and } b = 10^B$$

The exponential model $y = b \cdot a^x$ thus minimizes

$$\sum_{k=1}^{n}\Big(\log(y_k) - \big(\log(b) + x_k\log(a)\big)\Big)^2$$

and is given by

$$\log(a) = \frac{\sum_{k=1}^{n}(x_k - \bar{x})\big(\log(y_k) - \overline{\log(y)}\big)}{\sum_{k=1}^{n}(x_k - \bar{x})^2}, \qquad \log(b) = \overline{\log(y)} - \log(a)\cdot\bar{x}$$
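A minimal sketch of the linearization technique for exponential regression (invented data, assuming NumPy; not part of the thesis) fits a line to $(x_k, \log_{10} y_k)$ and transforms the coefficients back.

```python
import numpy as np

# Invented data roughly following y = 3 * 1.4^x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.0, 6.1, 8.0, 11.9, 16.2])

# Linearize: log10(y) = log10(b) + x * log10(a), then fit a straight line.
A, B = np.polyfit(x, np.log10(y), deg=1)

a = 10 ** A   # growth factor
b = 10 ** B   # initial value
print(a, b)   # exponential model y = b * a^x, best in the linearized sense
```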
Power regression

Similarly, the best power model $f \in F_P$ is determined by use of a logarithm transformation and linear regression. The power function is transformed by

$$y = b \cdot x^a \iff \log(y) = \log(b \cdot x^a) \iff \log(y) = \log(b) + a \cdot \log(x)$$

The last expression expresses $\log(y)$ as a linear function of $\log(x)$. The data set $D$ is transformed to $\{(\log(x_k), \log(y_k)) \mid k = 1, \dots, n\}$. Linear regression on the transformed data set gives a linear model $\log(y) = A \cdot \log(x) + B$, which is transformed back to a power model by

$$\log(y) = A\log(x) + B \iff y = x^A \cdot 10^B = x^a \cdot b, \quad \text{with } a = A \text{ and } b = 10^B$$

The power model $y = b \cdot x^a$ thus minimizes

$$\sum_{k=1}^{n}\Big(\log(y_k) - \big(\log(b) + a\log(x_k)\big)\Big)^2$$

and is given by

$$a = \frac{\sum_{k=1}^{n}\big(\log(x_k) - \overline{\log(x)}\big)\big(\log(y_k) - \overline{\log(y)}\big)}{\sum_{k=1}^{n}\big(\log(x_k) - \overline{\log(x)}\big)^2}, \qquad \log(b) = \overline{\log(y)} - a\cdot\overline{\log(x)}$$

The linearization of the functions corresponds to working with straight lines on logarithmic paper, since exponential and power functions form straight lines on semi-log and log-log paper, respectively. The use of logarithmic paper to make exponential and power regression is described in the next chapter.

The technique of linearization is also applied to other types of non-linear regression, for instance logarithmic or reciprocal functions. Another type of transformation is to transform a simple non-linear function into a multiple linear function and use multiple linear regression. This transformation is applied in polynomial regression: the polynomial $f(x) = a_0 + a_1 x + \dots + a_k x^k$ is treated as a linear function in $k$ variables by considering $x, x^2, \dots, x^k$ as independent variables.

The problem of minimizing the LSQ sum for non-linear functions is thus, whenever possible, simplified to problems which are easier to solve. However, the transformation of non-linear data requires caution, since the model found by the simpler method is not the best model in the "strict" sense (minimizing $\sum_{k=1}^{n}(y_k - f(x_k))^2$). Making exponential regression with the strict LSQ method, the sum $\sum_{k=1}^{n}(y_k - B \cdot A^{x_k})^2$ is minimized, and this sum differs from the sum minimized by use of linearization and the linear LSQ method. The two models will generally differ, because the points are not weighted equally: after the logarithmic transformation, deviations at small function values count relatively more.

In the rest of the thesis, the LSQ method will also refer to the method used to determine the best exponential and power models. The best exponential model is defined as the function $f \in F_E$ which minimizes $\sum_{k=1}^{n}\big(\log(y_k) - (\log(b) + x_k\log(a))\big)^2$. The best power model is defined as the function $f \in F_P$ which minimizes $\sum_{k=1}^{n}\big(\log(y_k) - (\log(b) + a\log(x_k))\big)^2$.

5 The knowledge to be taught

In this chapter, the knowledge to be taught about simple regression in high school will be examined. The chapter highlights the conditions and constraints crucial for teachers' didactic processes and the mathematical praxeologies taught. The knowledge to be taught is composed of many sources; some of them are common to all high schools, e.g. the curricula and exams, whereas sources such as textbooks and materials differ between high schools. The common sources, the curricula and the exams, are analyzed to illustrate the generic (purpose, aim, didactic principles) and specific (specific tasks, themes) restrictions. In the analysis, old exams and curricula are included to highlight how the MOs about regression have evolved over time. Afterwards, the content of three series of textbooks is presented to illustrate the MOs taught in high school. The examples and citations from the sources are translated from Danish.

5.1 Curricula

The curricula state the reasons to teach a discipline, what should be taught, what should be learned and the didactic principles. These elements set up constraints for teachers and influence what MOs are taught and how. The conditions and constraints in high school are analyzed through the curricula of STX B-level from 1999 and 2013.
The analysis focuses mainly on the curriculum from 2013, since the curriculum influences the taught knowledge nowadays. In the curriculum from 2013 the application of mathematics in real life is emphasized as being very important when teaching and learning mathematics and are also highlighted as a reason to learn mathematics. Mathematics is based on abstraction and logical thinking and comprise a lot of method to modeling and problem treatment. Mathematics is essential in many professions, in science and technology, in medicine and ecology, in economy and social sciences and as a basis for policy making [...] Mathematics as a science has evolved in a continuous interaction between application and theory building. (STXBekendtgørelsen, 2013, Section 1.1) In the curriculum the necessity and application of mathematics are clearly pointed out, as well as the theoretical part of mathematics, 49 50 the knowledge to be taught where the students should learn to "practice simple reasons and proofs" (STXBekendtgørelsen, 2013, Section 2.1) to justify the methods and mathematics used. Some aims of the teaching in mathematics is that the students "know about how mathematics can contribute to understand, formulate and solve problems within different disciplines" (STXBekendtgørelsen, 2013, Section 1.2) and further learn "the important aspects of mathematics interaction with culture, science and technology" (STXBekendtgørelsen, 2013, Section 1.2). These aims demonstrate that the application of mathematics has to be central when doing mathematics. The specific domains, sectors, themes and subjects that have to be taught can be found in the core substance. In the curricula from 1999 and 2013 regression belongs to the domain of functions. - [...] and the use of semi log paper have to be mentioned. In relation to the treatment of power functions the use of log-log paper has to be mentioned. The facilities of the calculator to do regression must be commented. (STXBekendtgørelsen, 1999, Section 11.2) - the notation f(x), the characteristics of the following functions: Linear function, polynomial, exponential-, power and logarithm functions and the characteristic properties of these functions graph, application of regression. (STXBekendtgørelsen, 2013, Section 2.2) The two extracts from the curricula indicate that the MOs about regression have evolved in time. In 1999, regression is mentioned in relation with logarithm paper, which could be an indicator of that regression is taught in relation with this theme. Nowadays logarithm paper is no longer a part of the core substance. Furthermore, it seems that the technique has change, as the calculator only has to be commented in 1999, which indicate that the students had other techniques than the instrumented to make regression. In the students’ study process of mathematical knowledge the students have to develop academic aims. The most relevant in relation to regression are: - Apply simple functions in modeling of given data, make simulation and extrapolation and reflect about the idealization and range of the models - Practice simple reasons and proofs - Demonstrate knowledge about the use of mathematics in selected areas, including knowledge of use in the treatment of a more complex problem - Use of IT-tools to solve mathematical problems 5.2 written exams (STXBekendtgørelsen, 2013, Section 2.1) Finally, it is worth mentioned how the curricula restrict the generic levels (in the level of determination) of the teacher. 
In the curriculum from 2013 it is clearly that the application of mathematics has to be in focus, which means that the teacher has to "consider emphasis on mathematical applications" (STXBekendtgørelsen, 2013, Section 3.1), and that "mathematics has to be mixed up with other disciplines" (STXBekendtgørelsen, 2013, Section 3.4). Furthermore, the teacher has to set up the students’ study process in a way that let the students learn mathematics in an interaction between "by hand" and "by CAS", such that "the calculator and IT become essential tools in the students acquisition of notation and problem solving" (STXBekendtgørelsen, 2013, Section 3.3). 5.2 written exams The written exam in mathematics is the final written examination that test students’ knowledge for the purpose of grading the students’ abilities in written mathematics. The exam exercises be an indicator of what students have to learn and consequently teachers make use of the exercises as guideline of what to teach1 . Since only "application of regression" is part of the curricula, the mathematical knowledge of regression is rarely examined in the oral exam. Thus the knowledge of regression examined can be found in the written exams. In the following, exam exercises and evaluations reports are analyzed and the types of tasks, methods and techniques (related to regression) appearing in the exam exercises are presented. The analysis is based on all levels of exam exercises from 1975-1999, 2009-2013 and evaluation reports from 2002-2011. 5.2.1 The type of tasks The first exercise in regression appeared in an exam set from august 1980. The exercise was remarkable because until 1985 no other exercises regarding regression were found. After 1985 exercises in regression became more frequently, but not in the same extent as today, see table 2. The lay out of the exercises from 1980-1999 and 2009-2013 are very similar (see figure 11 and 12), with a text explaining the context (phenomenon) of the data set D, a table with data and two-four tasks. In the following I have listed the types of tasks found in the exams without take into account the technique used to solve the tasks, since 1 Studies have shown that exam exercises play an important role in designing teaching (Svendsen, 2009, p. 74) 51 52 the knowledge to be taught Year Number of exercises/ Linear Exponential Power Number of exam set regression regression regression 1975 - 1979 0/30 0 0 0 1980 - 1984 1/30 0 1 0 1985 - 1989 6/33 0 4 2 1990 - 1994 5/24 0 3 2 1995 - 1999 7/38 0 5 2 2009 - 2013 37/38 7 17 13 Table 2: Exam exercises regarding regression in 1975-1999 and 2009-2013. The second column shows the ratio of exercise and number of exam set. The last three columns describe the type of regression. I cannot deduce the techniques from the exams. The types of task in the exams are: TRL : Given a data set D. Determine the best model f(x) within a given class FL = {f(x) = ax + b |a, b 2 R} TRE : Given a data set D. Determine the best model f(x) within a given class FE = {f(x) = bax |a 2 R+ , b 2 R} TRP : Given a data set D. Determine the best model f(x) within a given class FP = {f(x) = bxa |a, b 2 R} TPL : Given a data set D and a class of functions (FE or FP ). Draw the data set D in a suitable coordinate system (semi-log or log-log) TQD : Given a data set D and a class of functions Fi (FL or FE or FP ). Evaluate the model f(x) for a data set D TQP : Given a point P = (x0 , y0 ) 2 / D and a model f(x). Comment the range of the model. TXM : Given a model f(x) and a value of f at x0 . 
Determine x0 TXD : Given a data set D, one class of functions Fi (FL or FE or FP ) and a value of f at x0 . Determine x0 TYM : Given a model f(x) and x0 . Determine f(x0 ) TYD : Given a data set D, one class of functions Fi (FL or FE or FP ) and x0 . Determine f(x0 ) In figure 11, examples of tasks belonging to TQD , TYD , TXM are presented. In the exercise from 1980 (figure 11) the class of functions has to be examined (task TQD ) before the model is determined. The model is determined in relation to the second task (TYD ). The last task is of type TXM . Figure 12 illustrates examples of tasks of types TRL , TYM , TXM . 5.2 written exams The valuation prices of a specific danish stamp appear in the following table. Year 1972 1974 1976 1978 The valuation price in kr. 175 225 300 400 a) Explain that the price trend approximately can be describe by a function of the type f(x) = a · bx It is assumed that the price trend continues. b) What is the valuation price in 1990? c) When is a valuation at 1000 kr. obtain? Figure 11: Exercise from 1980 containing tasks of types TQD , TYD , TXM (Petersen and Vagner, 2003) The table show the assessment of the number of traffic casualty in the first half-year of the years 2007-2012 from the Roads Directorate. Years 2007 2008 2009 2010 2011 2012 Traffic casualty 195 190 161 110 107 82 In a model, is assumed that the development in the number of traffic casualty can be described by a function of the type y = ax + b where y is the number of traffic casualty in the first half-year and x is the time (measured in years after 2007). a) Use the data in the table to determine the coefficients a and b b) Use the model to determine the number of traffic casualty in the first half-year of 2013. c) Determine that year, where the number of traffic casualty in the first half-year are 50 people in accordance to the model. Source: FDM Figure 12: Exercise from 2013 (Undervisningsministeriet, 2013) containing tasks of type TRL , TYM , TXM The problem in the first three types of tasks (TRL , TRE , TRP ) is to determine the model given a class of functions, i.e. respectively linear, exponential and power regression. The listed types of tasks appear in the exam from 1980-1999 + 20092013, however not all in the whole period, since the type and frequency of tasks have changed. The types of tasks TPL , TQD , TXD , TYD occur only in the exams before 1999, whereas linear regression (TRL ) only occurs in the recent exams, see table 2. In all exercises at least one task of types TXM , TXD , TYM , TYD is in- 53 54 the knowledge to be taught cluded. The four types of tasks have in common that a variable is estimated by intra- or extrapolation. Besides prediction of values and determination of models, the exercises contain also types of tasks (TQD , TQP , TPL ) about the quality of the model. In exams from 1980-1999 the quality of the class of functions is considered in several exercise (task of type TQD , TPL ) and mostly the class of functions is evaluated before the specific model is determined (figure 13). In the exams from 2009-2013, the task of type TQP is the only task of type concerning the quality of the model. At waterworks the operating costs D (measure in 1000) by treatment of water depends on the treated volume of water x (measured in m3 /hour). The scheme shows some corresponding values of x and D. 
Volume of water x(m3 /hour) 30 50 80 140 200 Operating costs 55 91 143 248 350 Draw the information from the scheme in a suitable coordinate system and explain hereby that the operating costs D approximately can be described by a function of the volume of water x of the type D = bxa Figure 13: Types of tasks TPL and TQD from 1998 (Petersen and Vagner, 2003) 5.2.2 An instrumented technique is a set of rules and methods in a technological environment that is used for solving a specific type of problem. (Drijvers and Gravemeijer, 2005, p. 169) The methods and techniques In rest of the thesis three kind of techniques is used to distinguish between the way of solving a task. Graphic techniques: Indicate that a task is solved by use of graphs, for instance determine the best model by eye and ruler, make scatter plot or residual plot Algebraic techniques: Refer to solving tasks by use of algebra, for instance determine the best model by use of theorem 4.3.0.1 or calculate the quantities r, r2 by the formulae in definition 4.5.2.1 or 4.5.3.1. Instrumented techniques: Refer to solving tasks by use of tools with the intermediate calculations invisible for the students, for instance commandoes to do regression or calculation of the coefficient of determination. It is not possible to deduce the techniques from the exam, but examination of other exercises and the evaluation reports indicates the techniques used. In exams from 1980-1999 students got tasks where the formula of exponential and power functions had to be determined from straight lines on logarithm paper. It seems reasonable to infer that the technique also had been applied to make regression. This indicates that students made regression by the following technique: 5.2 written exams 1) Plot the points at logarithmic paper (semi-log or log) (Task TPL ) 2) Draw the best line by eye 3) Determine the formula of the linear model using two points 1 (a = yx22 -y -x1 , b = y1 - ax1 ) 4) Transform the linear model back to a model in the variable x and y (cf. section 4.6). Using the graphic technique the approach is geometric (section 4.2). Step 1) and 4) can be explained by the theory of logarithm, while step 2) and 3) cannot be justified by mathematics. The evaluation report supported that the technique described was the technique used in the exams from 1980-1999. In the evaluation report is written: "the students should use regression or calculate from reading of two points on a line drawn in technical paper" (Grøn, 2004). By the accessibility and development of calculators in the end of the 20th century, the first signs of the use of linear LSQ method and instrumented techniques are found in high school. The development caused that an instrumented technique was mentioned in the curricula (STXBekendtgørelsen, 1999, section 11.2), however the instrumented technique did not replace the graphic technique immediately (Grøn, 2004). Since the instrumented technique to make regression is based on the LSQ method, not only two techniques, but also two methods (LSQ and the geometric) were actually taught in a period. The two methods were both accepted until 2007 where the first exam, after the new reform in 2005, was hold. With the reform in 2005 the LSQ method became mandatory and is nowadays the only acceptable method. The analysis of the evaluation reports shows that an evolution in methods and techniques had happened in the period 1999-2005. New techniques to solve regression occur and consequently new MO was taught. 
The tasks were now solved by other techniques and so new practical blocks were taught. The LSQ method has been implemented in the curriculum from 2005 and exercises regarding regression have since been a stable and permanent part of the written exam. The exercises related to regression is only solved by the instrumented technique nowadays and thus the tasks can be solved without knowledge of the intermediate calculations or the method behind. 55 56 the knowledge to be taught 5.3 textbooks In this section, the content of regression in textbooks is presented. The following three points will be presented: • How is the problem described and what tasks are related to the problem • Which techniques are presented to solve the problem • How is the solution justified or proved 5.3.1 Gyldendals Gymnasiematematik The serie of Gyldendals Gymnasiematematik consists of four books, of which only the first book (c-level) describes regression (Schomacker et al., 2010). The book presents regression in the domain of functions and mainly through examples. Regression is introduced by problematize how a relation between a data set can be describe, when the class of functions is unknown. The problem is TR : Given a data set D. Determine the best model f(x) within a class of functions Fi . The technique to solve tasks of type TR is shown by five examples (all linear models). The technique presented is: Initial make scatter plot. Solve TRL at CAS. Plot the data together with the model. In the book, linear regression is described as "the method to determine the formula of the straight line that describe the location of the points best possible" (Schomacker et al., 2010, p. 54). No further description or explanation about the method or what best means is included. Besides task of type TR , TRL , tasks of type TXM , TYM are presented, but not solved. Exponential and power regression are presented and described by use of an exam exercise of respectively type TRE and TRP . The technique presented is the instrumented. In the presentation of task TR for non-linear data it is described how the phenomenon have to be considered when modeling the data. The book presents the types of tasks TR , TRL , TRE , TRP , TXM , and TYM and the instrumented technique without presenting the technological - theoretical discourse. 5.3.2 Lærebog i matematik Lærebog i matematik is a serie of four books at A level. Regression is presented in the first book (Brydensholt and Ebbesen, 2010) in the context of functions and the linear LSQ method is described and proved in the second book (Brydensholt and Ebbesen, 2011) in relation to differentiation. The first book presents the problem of how measured data has to be 5.3 textbooks modeled by approximate a class of functions (TRL , TRE , TRP ) to the data. The technique, which are instrumented, is shown through an example. In the example, the data is plotted and a figure of the data with the model are presented, but the visual examination is not described. The technique is described as: "Find the constants a and b by a statistical method, called linear regression" (Brydensholt and Ebbesen, 2010, p. 181). However, the book did not explain the statistical method or make further explanation of the method. The exponential and power regression are presented through examples of tasks of type TRE and TRP without explaining or justifying the technique. 
The technique to solve TRE is described as: "We use all the data in the table and determine the constants a and b by exponential regression" (Brydensholt and Ebbesen, 2010, p. 194). After presenting exponential regression the book presents that exponential functions became straight lines in semi-log paper, but no connection to exponential regression is made. The presentation of power regression and power function at log-log paper is similarly. In the second book, the method of linear LSQ is described and the best linear model is defined. The formula of the parameters in the linear model f(x) = ax + b is presented in a theorem (Brydensholt and Ebbesen, 2011, p. 106) Given the points (x1 , y1 ), (x2 , y2 ), . . . , (xN , yN ). The best straight line describing the points is the equation y = ax + b with P P P P P N xy - ( x)( y) y-a x P P a= , b= N x2 - ( x)2 N The theorem is proved by multivariable calculus. The parameters are proved by partial derivatives (like 3th proof), but the proof lacks to show that the parameters minimize the function. The authors state that the last part is omitted, since it "requires knowledge of function with more reel variables to determine the extremum" (Brydensholt and Ebbesen, 2011, p. 106). This shows that the proof in the transformation to be teachable in high school lose its power and rational, since the students’ knowledge limits what can be done. 5.3.3 Hvad er matematik? The serie of Hvad er matematik? consists of three books (A, B, C). In the serie, regression is presented at C-level in the domain of functions and at A-level in a supplementary section, named Regressionsmodeller. The C level book has a long introduction to the problem and describes what regression is used to. The book presents the problem of finding 57 58 the knowledge to be taught an ideal model that describes data and describes that the way to solve the problem is to go back (the etymology) and find the model that the data set reflect. The book describes how data points in real life are spread randomly around an ideal line (describing the phenomenon) and the problem is to determine the formula of the ideal line, called the regression line. It is explained that the technique (instrumented) is based on a method to calculate the best model and this method depends on a criterion of best model. For a linear model the regression line and the method are defined in a separate box. Definition: Regression line The line that best fit the data points is called the regression line and is obtained by making linear regression. The best is determined by the least squares method. (Grøn et al., 2014, p. 59) In the book the linear LSQ method is illustrated by figure 14 and it is described that the line that minimize the sum of the squares is the regression line. Figure 14: The illustration of the LSQ method in Hvad er matematik? - C (Grøn et al., 2014, p. 60) After linear regression and the linear LSQ method is presented, methods to evaluate models are presented and discussed. The techniques; coefficient of determination and residual plot are presented and further, it is described how other disciplines have to be considered, since "even through r2 is close to 1, [..], we cannot with certainty know that there is a genuine casual relationship" (Grøn et al., 2014, p. 60). The problem presented in the sections about exponential and power regression is similar with the linear, however the definition for best model is not specified. 
In the C-level book, for instance, the best power model is described as: "Best is based, as in linear and exponential regression, on a decision about how we measure this deviation. The measure of the deviation is, however, more complicated compared to linear regression" (Grøn et al., 2014, p. 188). In the section about logarithms an exercise is included which should let the students study the connection between non-linear regression (exponential and power) and linear regression.²

² The exercise consists of four tasks, where the first task is of type T_RE. In the second task the transformed data has to be calculated. In the third task the linear model of the transformed data is determined, and in the last task the exponential model is compared with the linear model.

In the A-level book (Grøn et al., 2013) the formula for the best line is proved, and the coefficient of determination, hypothesis tests, multiple linear regression and non-linear regression are presented. The linear LSQ method is proved by simplifying the sum and using the results of the simplified problem; the proof relies only on elementary algebra. The linear LSQ method is further explained using linear algebra and projection. Firstly, the t that minimizes Σ_{i=1}^n (x_i − t)² is determined (theorem 1) (Grøn et al., 2013, p. 402).

Theorem 1: The mean is the number with the smallest variation
Given data {x_1, ..., x_n}, the mean x̄ = (x_1 + ... + x_n)/n gives the lowest variation
Var(t) = (1/n)((x_1 − t)² + (x_2 − t)² + ... + (x_n − t)²)

Theorem 1 is proved in two ways: using the vertex of a second-degree polynomial and by projection. The technique using projection is: Let x be the n-dimensional vector of {x_1, ..., x_n} and d the n-dimensional vector given by d = (1, 1, ..., 1). The sum is equal to (x − t·d)². Since x and d are vectors, t·d is a position vector and e = x − t·d is the vector that links the vectors x and t·d, see figure 15.

Figure 15: Illustration of projection in Hvad er matematik? A (Grøn et al., 2013, p. 404)

As (x − t·d)² = e², the sum is minimal when the length of e is minimized, thus t is determined by minimizing the length of e. The projection of x onto d minimizes the length of e, hence t is determined by:

x_d = t·d = (x·d / d²)·d,  t = x·d / d² = (x_1 + x_2 + ... + x_n) / (1² + ... + 1²) = (x_1 + x_2 + ... + x_n) / n = x̄

(Grøn et al., 2013, p. 403-404)

After this the best proportional model y = kx is presented (theorem 2). The proof of the theorem is left to the students.

Theorem 2: The best proportionality
Given data (x_1, y_1), ..., (x_n, y_n). The value of k that minimizes the variation of the model y = kx is k = (x·y)/x². The variation is given by:
Var(k) = (1/n)((y_1 − k x_1)² + (y_2 − k x_2)² + ... + (y_n − k x_n)²)

The two theorems are used to prove theorem 3 (the best linear model) (Grøn et al., 2013, p. 409).

Theorem 3: The best straight line
For the data (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) the best straight line y = ax + b is given by

a = ((1/n)((y_1 − ȳ)(x_1 − x̄) + ... + (y_n − ȳ)(x_n − x̄))) / ((1/n)((x_1 − x̄)² + ... + (x_n − x̄)²)),  b = ȳ − a x̄

i.e. a is the mean of (y − ȳ)(x − x̄) divided by the mean of (x − x̄)². The values of a, b minimize the variation of the straight line

Var(a, b) = (1/n)((y_1 − a x_1 − b)² + ... + (y_n − a x_n − b)²)

The proof is divided into two steps, based on theorem 1 and theorem 2 respectively. First, a is treated as a constant a_0 and b is found by minimizing Var(a_0, b). Then Var(a, b) is minimized by substituting b into Var(a, b). In summarized form the proof is:
Var(a_0, b) = (1/n)((y_1 − a_0 x_1 − b)² + ... + (y_n − a_0 x_n − b)²) = (1/n)((z_1 − b)² + ... + (z_n − b)²) with z_i = y_i − a_0 x_i.

Using theorem 1, this implies b = z̄ = ȳ − a_0 x̄. Rewrite the straight line: y = ax + b = ax + ȳ − a x̄ = a(x − x̄) + ȳ. Then

Var(a, b) = (1/n)((y_1 − a(x_1 − x̄) − ȳ)² + ... + (y_n − a(x_n − x̄) − ȳ)²)
= (1/n)(((y_1 − ȳ) − a(x_1 − x̄))² + ... + ((y_n − ȳ) − a(x_n − x̄))²)
= (1/n)((v_1 − a u_1)² + ... + (v_n − a u_n)²)

with v_i = y_i − ȳ and u_i = x_i − x̄. Using theorem 2, this implies a = (v·u)/u², i.e. the mean of (y − ȳ)(x − x̄) divided by the mean of (x − x̄)². (Grøn et al., 2013, p. 408-409)

The algebraic proof is followed by an explanation of how the best line can be interpreted geometrically using vectors. Let x = (x_1, x_2, ..., x_n), y = (y_1, y_2, ..., y_n) and d = (1, 1, ..., 1). Then Var(a, b) = (1/n)(y − ax − bd)² = (1/n) e² with e = y − (ax + bd). Here e is the vector that links the vector ax + bd with y. The length of e is smallest when e is orthogonal to the plane spanned by x and d, i.e. when ax + bd is the orthogonal projection of y onto that plane. Because determining e requires knowledge of orthogonal projection in n-dimensional space, the proof is not completed.

6 EPISTEMOLOGICAL REFERENCE MODEL

In this chapter an ERM is elaborated based on the scholarly knowledge and the knowledge to be taught. The knowledge is structured in four main questions, clearly connected and interdependent. Common to the four questions is that a data set D is given. The four questions are:

Q1: What class of functions F_i is suitable to model the data?
Q2: Given a class of functions F_i. What do we mean by the best model f ∈ F_i?
Q3: Special case of Q2. What do we mean by the best linear model f ∈ F_L?
Q4: Given a model f. What is the quality of the model?

The connection between the questions is illustrated by arrows in figure 16. The figure shows how answering one question may require a movement to another question. For instance, question Q2 may lead to Q3 and a movement back to Q2, as exponential and power regression are reduced to a problem of linear regression by linearization.

Figure 16: The four questions in the ERM. In the figure the connections between the questions are illustrated by arrows.

Before the MOs belonging to the four questions are described, an example of the tasks and techniques related to the different questions is presented, and the connection between the questions is illustrated.

Example of tasks, techniques and the connection between Q1-Q4

Based on the number of road victims in Denmark in 1979-1999, predict the number of road victims in Denmark in 2010. The data is:

Year  Victims   Year  Victims   Year  Victims
1979  9267      1986  8301      1993  8427
1980  8477      1987  7357      1994  8105
1981  8546      1988  7321      1995  8523
1982  8427      1989  7266      1996  8672
1983  8105      1990  9267      1997  8301
1984  8523      1991  8477      1998  7357
1985  8672      1992  8546      1999  7321

Source: Faktalink.dk

To solve the task, several techniques are used. First, the relation between the variables (Q1) can be investigated by use of a scatter plot (figure 17).

Figure 17: The relation between year (after 1979) and number of road accident victims

The scatter plot shows a clearly decreasing tendency, but whether the relation is linear or non-linear is difficult to conclude from the plot. To determine the class of functions (Q1), the candidate classes of functions have to be evaluated (Q4), which requires that the specific models f (Q2) are determined. The linear model f ∈ F_L is determined (a task of type T_RL) by techniques belonging to the MOs in Q3; a sketch of these calculations is given below.
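The following Python sketch illustrates the computations behind this example, assuming x is measured as years after 1979 (as in figure 17) and writing the exponential model as f(x) = b·a^x; the resulting coefficients and predictions need not match the rounded values quoted in the text exactly.

import numpy as np

# Road-victims data from the table above; x = years after 1979 (as in figure 17).
years = np.arange(1979, 2000)
victims = np.array([9267, 8477, 8546, 8427, 8105, 8523, 8672,
                    8301, 7357, 7321, 7266, 9267, 8477, 8546,
                    8427, 8105, 8523, 8672, 8301, 7357, 7321])
x = years - 1979

# Q3: best linear model by least squares (instrumented technique).
a_lin, b_lin = np.polyfit(x, victims, 1)

# Q2 -> Q3: best exponential model f(x) = b * a**x by linearization,
# i.e. linear least squares on (x, ln y).
a_log, b_log = np.polyfit(x, np.log(victims), 1)
a_exp, b_exp = np.exp(a_log), np.exp(b_log)

# Predictions for 2010 (the question posed in the example).
x_new = 2010 - 1979
print("linear prediction 2010:     ", a_lin * x_new + b_lin)
print("exponential prediction 2010:", b_exp * a_exp ** x_new)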
Using the algebraic techniques, the exponential model (a task of type T_RE belonging to Q2) is determined by linearization and the linear LSQ method (a technique belonging to Q3). The models can then be evaluated (Q4) using the coefficient of determination, plots of data and model, and residual plots. The coefficients of determination (0.94 and 0.92) show that the mathematical fit is good for both models. Examining the scatter and residual plots does not help to reject either of the models, since the residuals look similar for both models (figure 18).

Figure 18: Scatter plot and residual plot of the linear model (a) and the exponential model (b)

To determine the class of functions (Q1), and consequently the best model of the data, the class of functions that is most reasonable with respect to predicting the phenomenon is selected. Predicting road victims by the linear model means that no road accidents will occur by the end of 2015, while a more reasonable number of 2308 road victims is predicted at the same time using the exponential model. Thus the exponential model f ∈ F_E is used to make predictions.

6.1 q1: the type of model (F_i)

The question What class of functions F_i is suitable to model the data? is the starting point in modeling when the class of functions F_i is unknown. The task related to the question is T_R: Given D. Determine the best model f(x) within classes of functions F_i. Answering the question requires that MOs are mixed with non-mathematical objects, since a mathematical relation does not imply a causal relation. This mix of mathematical and non-mathematical objects means that no unique solution and technique for the task exists, and consequently no general MO belonging to the task can be made. Even though no unique solution exists, some classes of functions are more reasonable than others. Several techniques exist to determine a reasonable class (or classes) of functions, and which to use depends on the specific data set. Some techniques are:

• Make plots (scatter plot, logarithm plot or residual plot)
• Consider the quality of the model f ∈ F_i
• Use the properties of the class of functions (growth, differentiation)
• Consider the purpose of the model
• Consider the phenomenon behind the data

One technique is to make plots, since a linear relation is easy to recognize (a small sketch of this is given at the end of this section). If the data follow a linear model (on either a scatter plot or logarithm paper) and the relation seems feasible with the phenomenon the data come from, the class of functions can be determined from graphs. If it is not possible to determine the relation between the variables from graphs, the possible classes of functions can be evaluated by considering the quality of the models. This requires that one or more models are determined (Q2) and afterwards evaluated (Q4). Using growth or differentiation of functions can also help determine the model (examples are given in the a priori analysis). Furthermore, the purpose of the model has to be considered to solve Q1, since a class of functions can describe data very well in a restricted range but not be useful for predictions (cf. the example). Independent of the technique or techniques used, the phenomenon behind the data always has to be considered.
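As an illustration of the plot technique, and not a method prescribed by the ERM, the sketch below checks how close the data lie to a straight line after the transformations that correspond to semi-log and log-log paper. The helper name straightness and the data set are my own inventions.

import numpy as np

def straightness(u, v):
    """Absolute correlation coefficient of (u, v); close to 1 means 'looks like a line'."""
    return abs(np.corrcoef(u, v)[0, 1])

# Invented, strictly positive data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.4, 4.9, 10.1, 19.8, 40.5, 79.9])

print("scatter plot (linear?):  ", straightness(x, y))
print("semi-log (exponential?): ", straightness(x, np.log(y)))
print("log-log (power?):        ", straightness(np.log(x), np.log(y)))

A value close to 1 only indicates that the transformed points lie close to a line; as stressed above, the phenomenon and the purpose of the model still have to be considered.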
6.2 q2: the best model (f ∈ F_i)

The question What do we mean by the best model f ∈ F_i? deals with the interpretation and definition of the best model given a class of functions F_i. The types of tasks related to the question are T_RL, T_RE and T_RP (section 5.2.1). The three tasks can be solved by three kinds of techniques:

• The graphic technique τ^G: plot the data and draw the best line (plotted on semi-log and log-log paper respectively for the non-linear models)
• The algebraic technique τ^A: use the theorem of the best model (in the linear case theorem 4.3.0.1; similar theorems exist for exponential and power regression)
• The instrumented technique τ^I: use a specific push-button technique (linear regression, exponential regression and power regression)

As the technique determines the mathematical praxeology, each type of task belongs to three different mathematical praxeologies. The technological-theoretical discourse is constituted by the knowledge that explains and justifies the technique. The practical block [T_RE / τ^A_RE] is justified by the definition of the best exponential model, which in turn is justified by linearization (the properties of the logarithm) and the linear LSQ method. The practical block [T_RP / τ^A_RP] is justified by the definition of the best power model, which in turn is justified by linearization (the properties of the logarithm) and the linear LSQ method. The theoretical blocks of [T_RE / τ^I_RE] and [T_RP / τ^I_RP] are lacking, since the intermediate calculations are hidden and the users do not know what the IT software does. The theoretical blocks belonging to [T_RE / τ^G_RE] and [T_RP / τ^G_RP] are justified by linearization and the imprecise definition of the best linear model (best line by eye). If the connection between the mathematical praxeologies is not learnt, the MOs developed will be punctual. Because the theoretical block of a punctual MO is often ignored or implicitly assumed, the definition of the best non-linear model and the connection between Q2 and Q3 are not learnt.

6.3 q3: the best linear model (f ∈ F_L)

This question is a special case of Q2, and the type of task related to the question is T_RL. As described in Q2, three techniques for this type of task occur, and so three different practical blocks can be made. Since the graphic technique is explained and justified by visualization and an individual sense perception of small errors, no formal mathematics can be used to justify the technique. Hence the technological-theoretical discourse of [T_RL / τ^G_RL] cannot be justified by formal mathematics. Common to the algebraic and instrumented techniques is the formal mathematics of the linear LSQ method, since both techniques can be justified by this method. But even though the formal mathematics behind the two techniques is equivalent, the technological-theoretical discourses differ, since the instrumented technique is not justified by mathematics. The algebraic technique using theorem 4.3.0.1 is based on formal mathematics, and since the theorem can be proved in several domains, several technological-theoretical discourses exist for the practical block [T_RL / τ^A_RL]. Figure 19 illustrates the different technological and theoretical discourses. The definition of the best linear model constitutes the technology, and the domain in which the theorem is proved constitutes the theory. The technological discourse θ_M concerns minimization, and the best linear model in this discourse is defined as the model f ∈ F_L which minimizes S(f) = Σ_{k=1}^n (y_k − f(x_k))². To θ_M three theoretical discourses exist, because the minimum of S(f) can be proved in the domain of elementary algebra, multivariable calculus or linear algebra (the linear-algebra view is sketched below).
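The following is a minimal sketch of the linear-algebra discourse, my own illustration rather than the textbook presentation: the best line corresponds to the orthogonal projection of y onto the plane spanned by x and d = (1, ..., 1), computed here by solving the normal equations on invented data.

import numpy as np

# Invented data for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 2.9, 5.1, 7.2, 8.8])
d = np.ones_like(x)

# Projection of y onto span{x, d}: solve M^T M c = M^T y,
# where M has columns x and d and c = (a, b).
M = np.column_stack([x, d])
a, b = np.linalg.solve(M.T @ M, M.T @ y)
print("projection:", a, b)

# The residual e = y - (a*x + b*d) is orthogonal to both x and d (up to rounding).
e = y - (a * x + b * d)
print("e.x =", e @ x, " e.d =", e @ d)

# Same result as the instrumented technique.
print("polyfit:   ", np.polyfit(x, y, 1))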
Figure 19: Illustration of the techniques for tasks of type T_RL and the technological-theoretical discourses belonging to the techniques. If no discourse is illustrated, the technological-theoretical discourse is not justified by formal mathematics.

The technological discourse θ_S is the statistical approach, where the best model is defined as the most probable model f ∈ F_L given D. In this discourse the theorem is proved within the same theoretical discourses as those belonging to θ_M; however, the notation differs. For instance, the conditional expectation is used to denote the projection (the proof in section 4.4.2), but the proof is based on projection of random variables, i.e. the theory of projection in the domain of linear algebra. The technological-theoretical discourses for the algebraic technique can be used to justify and explain the calculation of the best linear model in the instrumented technique. Thus one of the technological-theoretical discourses of [T_RL / τ^A_RL] has to be developed to let the students know the definition of the best model, which can then be used to justify the instrumented technique. In the C-level textbooks only the practical blocks [T_RL / τ^I_RL], [T_RE / τ^I_RE] and [T_RP / τ^I_RP] of the MOs are presented. The textbook Lærebog i matematik - Bind 2 presents the MO [T_RL / τ^A_RL / θ_M / Θ_MC], while Hvad er matematik? - A presents the MOs [T_RL / τ^A_RL / θ_M / Θ_EA] and [T_RL / τ^A_RL / θ_M / Θ_LA]. The practical blocks from Q2 and Q3 can be collected in local and regional MOs depending on the technological discourse. For instance, the practical blocks [T_RL / τ^A_RL], [T_RE / τ^A_RE] and [T_RP / τ^A_RP] can be integrated in a local MO with the technology: minimize S(f) for f ∈ F_L, since the transformed functions belong to F_L. Further, the punctual MOs of [T_RL / τ^G_RL], [T_RE / τ^G_RE] and [T_RP / τ^G_RP] share the imprecise definition of the best linear model by eye, and hence these punctual MOs can be integrated in a local MO sharing that technology.

6.4 q4: evaluate the model

This question deals with the evaluation of a model. The quality of a model is evaluated in connection with predictions, to know how certain the predictions are, and in relation to Q1 to compare models in different classes of functions and justify the choice of class of functions. To estimate the quality of a model, the best model f ∈ F_i has to be determined (solve Q2 or Q3). As described in section 4.5, several techniques to evaluate models exist, and several techniques have to be combined to evaluate a model, since statistical quantities do not characterize the relationship (cf. Anscombe's quartet) and visual examination does not measure the quality. The grand type of task, Given D and a class of functions F_i. Evaluate the model, can be solved by several techniques. The techniques relevant for high school are (a sketch of the quantitative ones follows this list):

τ_PS: Draw the data set D and the model f(x) in a coordinate system
τ_PL: Draw the data set D in a suitable coordinate system (semi-log or log-log)
τ_PR: Make a plot of the residuals
τ_Qr: Determine the correlation coefficient
τ_Qr²: Determine the coefficient of determination
τ_QP: Given an actual point P = (x_0, y_0) about the phenomenon, where (x_0, y_0) ∉ D, calculate f(x_0) and compare f(x_0) and y_0 to evaluate the model (cf. 5.2.1)

Besides the concrete mathematical techniques to evaluate a model, the phenomenon the data come from has to be considered, since a mathematical relation does not imply a causal relation.
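The sketch below illustrates the quantitative techniques τ_Qr² and τ_PR in miniature, on invented data; it is only an illustration of the standard computations, not a procedure taken from the thesis material.

import numpy as np

# Invented data and a linear fit.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.8, 7.2, 8.9, 11.2, 12.8])

a, b = np.polyfit(x, y, 1)
fitted = a * x + b
residuals = y - fitted          # what a residual plot would show against x

ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot        # coefficient of determination

print("residuals:", np.round(residuals, 3))
print("r^2 =", round(r2, 4))

As the text stresses, a high r² alone does not settle Q4; the residual pattern and the phenomenon still have to be examined.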
For instance, using the linear model to predict road victims (example 6), the predicted number becomes negative after 2015, which is inconsistent with reality. In the ERM, proper mathematical technological-theoretical discourses first appear in the arrow from Q2 to Q3, and subsequently in Q3 and Q4. This is due to the fact that Q2 is justified by the theory belonging to Q3, while Q1 is mainly justified by knowledge of the phenomenon. Linearization (Q2 ↔ Q3), the linear LSQ method (Q3) and the statistical quantities (Q4), i.e. advanced mathematical theory, constitute the theoretical blocks of the MOs.

Part II
THE INTERNAL DIDACTIC TRANSPOSITION: THE TEACHING SEQUENCE

How can we reorganize or modernize the teaching of one-dimensional regression in ways that make use of the affordances of technology, while presenting regression in a more complete and satisfactory way than as a set of modeling tools?

7 2ND RESEARCH QUESTION

In part I the relevant scholarly knowledge and the result of the transformation of the scholarly knowledge into something teachable in high school were presented. The process of transposition simplifies the knowledge, and one can argue that it is over-simplified. The students learn the techniques to determine the best model in a given class of functions without actually knowing what is meant by the best model. Very few of the C-level textbooks define what is meant by the best model or include a description of the technique to find the best model, the technological discourse θ_M of Q3. This implies that the students get a minimum of technological-theoretical discourse, which is needed to explain and justify the praxis. If students only have a minimum of technological-theoretical discourse, the techniques, which are instrumented, appear as black boxes, with even the intermediate calculations being invisible to the students. Based on the ERM, I will design a teaching sequence introducing high school students to the technological-theoretical discourse of regression, with the purpose of investigating the possibilities for working more theoretically with one-dimensional regression in high school. The teaching sequence will be observed, and the data collected will be used to answer the following questions:

1. What are the possibilities and obstacles in establishing the MO [T_RL / τ^A_RL / θ_M / Θ_EA] about the best linear model (Q3 in the ERM)?
2. What are the possibilities and obstacles in establishing an MO around how to choose a suitable class of functions (Q1 in the ERM)?

8 DESIGN OF THE TEACHING SEQUENCE

Based on the ERM and the knowledge of the external didactic transposition, I designed a teaching sequence. This chapter contains a description of the class, the aims, the MOs, the didactic principles and the considerations and choices I made in the process of making the design. Before reading this and the following chapters, I recommend that you have a look at the content and exercises of the teaching sequence in appendix A.1, A.3 and A.4.

8.1 the class

The teaching was conducted in a second-year class at Gladsaxe Gymnasium. In the class, the mathematical skills are very uneven, with about an equal number of weak and strong students. Based on their mathematical skills, their usual teacher has divided the class into two groups. The teaching was only carried out in the group where the skills were highest; hence the teaching was given to 15 students. The decision was made by the teacher, because the other group needed more time for exam repetition.
The teaching was placed at the beginning of a repetition sequence, hence it should feel natural for the students to work with regression again. The students learned about regression in their first year, and the teaching builds on that. The students have been taught the instrumented techniques to make regression and the technique to determine the coefficient of determination. The students have briefly been introduced to the linear LSQ method, but have not worked with the technological-theoretical discourses related to Q2 and Q3. Because the students have learnt about linear regression, the encounter with Q3 will not be a real first encounter. However, in the design it is characterized as a first encounter, since the students have not considered the initial question of the process. The students have not been taught the questions related to Q1. The students have been taught logarithmic paper and are familiar with the fact that exponential and power functions become straight lines on semi-log and log-log paper respectively. The time frame for the teaching sequence was three modules, each of 90 minutes. The teaching was performed by the usual teacher to keep the situation as natural as possible. Using the usual teacher, the psychological aspect of a new teacher was avoided, since the students know the teacher and vice versa. Further, I could observe the class. On the other hand, I could not control the didactic process in detail. To keep the teaching as close to the design as possible, I met with the teacher three times before the teaching.

8.2 mathematical organizations

The design is divided into two parts. The first part (modules 1 and 2) deals with the question "What do we mean by the best linear model f ∈ F_L?" (Q3), and the second part (module 3) treats the question "What class of functions F_i is suitable to model the data?" (Q1). The reason that Q3 was taught before Q1 is that answering Q1 requires that Q2 and Q3 are answered. Thus, it is natural to teach Q3 before Q1 to let the students actually know how the best model f ∈ F_i is determined. Another reason was to let the students realize, in the simple case of linear models, that determining the best model f ∈ F_i requires a definition of best, and that this definition can be formulated mathematically. This should let the students search for a definition of the best class of functions F_i and let them realize that no such definition can be made.

8.2.1 MO belonging to Q3

In the first part the students should develop an MO belonging to Q3. Due to the fact that regression belongs to the domain of functions in the curriculum, and that students in high school do not know the mathematics belonging to the statistical discourse, the technological discourse taught in the design will be θ_M. I have decided that the students should work with the theoretical discourse Θ_EA, because this discourse can be taught without a previous introduction to new mathematical domains. The theoretical discourse has to be reached through work with the task t_M: Minimize S(a, b) = Σ_{k=1}^n (y_k − (a x_k + b))². In the design I have decided to let the students deduce the formula for a and b (theorem 4.3.0.1) in a way similar to the second proof described in the scholarly knowledge; however, some smaller changes in the proof are made to make the study process simpler. The specific changes are described later. I have chosen to use the second proof, since the technique used to minimize the sum is, to a certain extent, related to the way the students usually minimize functions.
This means that the problem of minimizing a function of two variables can be linked to the problem of minimizing a function of one variable. This should let the students reconstruct old knowledge about optimization. The technique to solve the task has the disadvantage that the transformation of the data (centralization) makes the proof longer. Furthermore the proof requires knowledge of displacement. Thus the simplification of the task is not free of charge, since new knowledge has to be built; however, the displacement also has advantages in relation to other MOs. The displacement can explain the transformation of the x-variable¹, and furthermore it can be linked to the transformation of exponential and power functions into linear functions. In contrast to the second proof, the first proof described in the scholarly knowledge is much shorter; however, I have decided not to use this proof, because the technique of completing the square is not simple. I suppose that the students will have trouble justifying why the terms ȳ − ȳ and x̄ − x̄ should be inserted. Further, that technique would be difficult to link to other MOs.

¹ In several tasks the students have to transform the x-variable, but the students are not presented with the theory behind the transformation.

Didactic process to reach the theoretical level

In the design, I have made an exercise which should help the students solve task t_M. The exercise is divided into two steps, which again are divided into subtasks. The first step is to rewrite the sum S(a, b) so that each term depends on only one of the variables, i.e. S(a, b) = φ(a) + ψ(b). The second step is to minimize the sum by minimizing each of the functions, min S(a, b) = min φ(a) + min ψ(b). The second step is the essential part, since the problem can be linked to knowledge belonging to the students' MOs. The reason for the division into two steps is to make the purpose of the subtasks clear to the students and to highlight the chain of reasoning used to solve t_M. The division of task t_M into subtasks should help the students solve the task by themselves, since the subtasks can be solved by applying known techniques and drawing on established praxeologies. Some techniques belong to old knowledge, while other techniques have to be established in the preceding teaching. In the exercise the students are guided through the different subtasks. For instance, I have chosen that the students have to explain why (y_k − (a x_k + b))² = y_k² − 2a x_k y_k − 2b y_k + a² x_k² + 2ab x_k + b², instead of letting them expand the expression without knowing the result. Knowing the result is supposed to help the students expand the expression, since they know what to find. In other subtasks I have made hints that should help the students see which techniques to use. To make the students' study process of the proof as simple as possible while keeping the accuracy, I have made some changes in relation to the second proof described in the scholarly knowledge. In the second part of the proof, the sums are substituted with letters to simplify the expression. This has the advantage that the expression does not contain the notation x_k, y_k. In the exercise I have chosen to use the notation x_k, y_k instead of h_k, v_k in the case of centralized data, since the students normally denote points by (x_1, y_1), (x_2, y_2) and describe linear functions with the symbolic expression y = ax + b.
Since the abstract calculations and the algebra are already difficult, I think changing the letters would only cause further confusion. I am aware that using x_k and y_k can cause problems when generalizing the formula, as similar notation (x_k, y_k) will be used to describe both the centralized data and a general data set.

8.2.2 MO belonging to Q1

In module 3 the students have to consider how to model a data set D without knowing the class of functions F_i. As described in the ERM, no specific MO can be set up for this question. Because of time, I have restricted the classes of functions to the three classes the students are familiar with. Even though the classes of functions are restricted, I have designed exercises that should let the students discover that the phenomenon behind the data has to be considered when making models, and that more than one model can be reasonable for a data set. In the design, the exercises are constructed such that the students have to solve tasks related to Q2, Q3 and Q4 in order to answer Q1. The phenomena included in the exercises are selected such that the students should be able to relate to them. This should ensure that the phenomena behind the data can be included in the solutions of the tasks. Because of the time frame, I have decided not to present the theoretical blocks belonging to Q2 and Q4. The students are supposed to use the techniques belonging to these questions, but the techniques will not be justified. It is planned that the connection between Q2 and Q3 should be briefly institutionalized.

8.3 the specific aims

The aim of the design is to investigate the possibilities for working more theoretically with regression in high school, especially Q1 and Q3. The purpose is to give the students a precise description of the best linear model f ∈ F_L and let the students construct the theoretical discourse through the work with theorem 4.3.0.1. The first two modules should let the students develop MOs related to Q3. In module 3 the students have to work with questions related to Q1 and Q4. The intention is that the students learn to justify their choice of model by including the quality of the model (Q4) and the phenomenon. Below are listed the specific aims of the design:

1. Define the best line (the technological discourse θ_M of Q3)
2. Explain the chain of reasoning for the task t_M: Minimize Σ_{k=1}^n (y_k − (a x_k + b))² (theoretical discourse Θ_EA)
3. Determine F_i and explain the choice of F_i (Q1)
4. Demonstrate knowledge about modeling and reflect on the idealization and range of the models (Q4)

The students are already acquainted with the last aim, since it is part of the curriculum (cf. 5.1).

8.4 didactic principles

In this section three general didactic principles used in the design and execution of the teaching are presented.

Inquiry instruction

The teaching sequence is designed such that the students, through work with concrete data and numeric examples, should be able to generalize from numeric and graphic examples to general abstract definitions and notation. The inquiry instruction should allow the students to examine prior mathematical objects and generate assumptions based on the examples. I use this approach because it might let the students reflect on and interpret their results (Lawson, 2009). Before the students can solve task t_M, the techniques to solve the task have to be part of the students' praxeological equipment.
In the design I have chosen that the students have to develop the techniques and knowledge by working with concrete examples, which should then be generalized. In contrast, the knowledge could have been institutionalized directly, but since definitions and techniques are easier to apply after having experience with their use (Winsløw, 2013, p. 157), I have included exercises that do not belong to the theme of regression. The teaching is designed such that the students work in a numeric or graphic setting and from these settings generalize to algebraic definitions and techniques. This should entail that the students work with the praxis before the technological-theoretical discourse is developed (Lawson, 2009). Using the inquiry instruction, the students should be able to link the new mathematical objects to discoveries, ideas and patterns that they have made themselves, and so comprehension and retention should become greater (Lawson, 2009).

Method of working

The teaching is designed to be a mixture of class teaching and group work. It is planned that the students have to develop techniques and construct knowledge in group work, and that in class teaching the knowledge should be institutionalized. A considerable part of the teaching is planned to be in groups in order to develop the mathematical objects through discussions. In the students' first encounter with Q3 and in the work with Q1, the group work is essential, as the students should discuss the meaning of the best f ∈ F_L and the best F_i. Class teaching and class discussions are used to validate exercises and institutionalize mathematical knowledge. The teaching is planned to include the students as much as possible, such that the students participate actively in almost all activities. Thus the teacher should include the students in the validations and institutionalizations. In the summaries of the exercises, it is planned that the students should be the most active, while the teacher should hold a humble position, controlling the summaries and answering questions.

IT-tool

Pragmatic value: the efficiency for solving tasks.
Epistemic value: the insight the IT-tool provides into the mathematical objects and theories to be studied. (Gyöngyösi et al., 2011, p. 2)

In the teaching, the IT-tool will be TI-Nspire, because that is what the students normally use. I have planned that TI-Nspire should be an essential tool in the students' acquisition of problem solving, because of its high pragmatic value when making graphs and regression. The use of TI-Nspire should make it possible for the students to focus on the principles and theory, such that the best linear model f ∈ F_L, the transformation and the best class of functions F_i can be explored at a higher theoretical level (Gyöngyösi et al., 2011, p. 2-3). When applying TI-Nspire, the techniques and mathematics are invisible to the students, and therefore calculations have to be done by hand in exercise 3 (about sums) to ensure that the students can discover and justify the rules of sums. I have planned that the students, in their first encounter with Q3, should make the best line using a graphic technique. Since it is possible to insert a line and move it in TI-Nspire, TI-Nspire is used to investigate the best line. Both the pragmatic and epistemic value of this technique are high, since the students themselves have to reflect on where the line should be placed, but at the same time they can try out several lines. Furthermore the symbolic expression is determined without the need for calculations by hand. In accordance with the curriculum, I have designed the teaching such that IT becomes an important tool in the students' work with conceptualization and problem solving (STX-Bekendtgørelsen, 2013, section 3.3).
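As a plain-Python analogue of the "insert a line and move it" activity (TI-Nspire itself is used in the teaching; this sketch only illustrates the idea on invented data), one can compare the sum of squared deviations S(a, b) for a line chosen by eye with the least-squares line.

import numpy as np

# Invented data set for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 3.1, 4.9, 6.2, 6.8])

def S(a, b):
    """Sum of squared vertical deviations from the line y = a*x + b."""
    return float(np.sum((y - (a * x + b)) ** 2))

# A line 'drawn by eye' versus the least-squares line.
print("by eye  S(1.0, 1.5) =", round(S(1.0, 1.5), 3))
a_lsq, b_lsq = np.polyfit(x, y, 1)
print("LSQ     S(a, b)     =", round(S(a_lsq, b_lsq), 3))

Moving the line by eye corresponds to changing a and b and watching S(a, b); the LSQ line is the one for which no further move can make S smaller.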
8.5 description of the planned didactic process

In the following, ATD and the ERM are used to describe the didactic process of the teaching. The exercises and the specific schemes for each module are included in appendix A.1, A.3 and A.4. In appendix A.3 the problem/task, the techniques, the technology/theory and the didactic moments of each episode are summarized in schemes, while the time, activity, representation, method of working and the roles of the teacher and the students can be viewed in appendix A.4.

8.5.1 Module 1: Definition of the best linear model and introduction to sums and centralized data

The goal of the module is to let the students explore techniques to solve T_RL and let the students specify the question "What do we mean by the best linear model?". This should give rise to a definition of the best linear model (θ_M). Further, the students should work with sums, means and displacement. According to ATD, any didactic process requires a first encounter with the MO in question, and the question "What do we mean by the best linear model?" should make up the first encounter with the MO. At the beginning of the module, the teacher will introduce the aim of the modules.

Exercise 1 + 2

The first two exercises are an attempt to let the students work with the question of Q3 by solving the exercises. The exercises will constitute the first and an explanatory moment and should motivate the work with regression in the teaching. In the teaching the students have to elaborate techniques to determine the best line based on the data sets (figure 20) and formulate criteria for the best line. Afterwards the students have to discuss the criteria for the line and specify what they mean by the best line. The purpose of the exercises is that the students realize that the question has to be specified, since their techniques are insufficient (more than one best line can be found). The specification should form the basis for the technological discourse of Q3. After the students have made criteria, the criteria should be discussed in class, and based on the students' criteria the teacher should introduce the best line f ∈ F_L. The institutionalization should let the question be specified to the task t_M. The technological discourse θ_M should give rise to the technical moment, where a technique to minimize the sum has to be developed. In the introduction to the technological discourse, the teacher should motivate the work with sums.

Figure 20: Scatter plots of the data sets in exercise 1 (a) and exercise 2 (b)

The following exercises (exercises 3-5) concern sums, means and displacement. The aim of the exercises is to let the students develop and construct the techniques that they have to apply in the technical moment of Q3.

Exercise 3

The intention with exercise 3 is to let the students deduce the rules of sums by calculating sums and comparing them. In the exercise, the students have to elaborate techniques to calculate sums in a numeric setting and afterwards discover and explain the connection between the calculated sums. The connection should let the students develop new techniques (rules of sums) that are used to solve 3j.
The teacher can start the institutionalization of the rules of sums before the students have finished the exercise, if the teacher judges that they do not discover the connections by themselves. In the institutionalization the results of the sums have to be filled into the table on the blackboard, and the connection between the sums should be validated. The connection between the sums has to be generalized and the rules of sums institutionalized.

Exercise 4 + 5

In exercise 4 the students will re-encounter the mean and the techniques to calculate the mean. Based on the numeric calculations (tasks 4a + b), the students have to generalize the technique and write a definition of the mean using the summation symbol. The exercise is included because the students have to apply the definition to solve task t_M. Exercise 5 concerns displacement of data. The exercise is included to let the students learn that displacement of data does not change the interconnection of the data and hence not the slope of the best line. The students have not worked with displacement before; however, the knowledge should be accessible because of the visual representation. Since the representation is graphic, the students should be able to get an intuitive idea of the data as a configuration of points that can be moved. The last two tasks (5j-k) are included in case some students finish before time is up. After the students have worked with the two exercises, the students' work will be validated. The definition of the mean and the technique to make a displacement should be institutionalized. This should be followed by the definition of centralized data. The teacher should finish by proving that every data set can be centralized (task 5j). If time permits, the teacher makes a summary of the mathematical objects of the module: the definition of the best line, the rules of sums and centralized data. The objects are written on a slide. In the establishment of the praxeology to displace data, vectors will not be introduced, as they would be a new mathematical object. Instead I have chosen to let the students work with the displacement in a graphic setting in TI-Nspire. The graphic representation should justify that shifting the data by a constant does not change the interconnection between the points and hence not the slope of the best line.

8.5.2 Module 2: The proof of the best linear model

In the second module the students should produce a new, reliable technique for tasks of type T_RL and establish a technological-theoretical discourse that can be used to justify and validate the instrumented technique τ^I_RL. In the module the students should develop the formula in theorem 4.3.0.1 by solving task t_M. The module will begin by revisiting the essential points from module 1. Then the teacher will introduce the task t_M: Determine a and b that minimize Σ_{k=1}^n (y_k − (a x_k + b))². Using figure 21, the teacher should recall the technique to centralize data (subtracting (x̄, ȳ)) and emphasize that centralizing the data does not change the slope of the best line. The students should solve task t_M by assuming x̄ = ȳ = 0. This work should constitute the technical moment of Q3. The students should work in groups with the different subtasks, which should let them develop a technique to determine the best line. As the essential step occurs in the second part, the teacher has to ensure that the students do not spend too long on the first part.
If many problems arise in the first part, the teacher can institutionalize the subtasks (1a-1c), after which the students may continue with the second part. Since the techniques should be well known to the students, the role of the teacher is to help the students by recalling techniques and explaining the algebraic notation.

Figure 21: The graphic representation used to illustrate and justify the displacement of a data set. The figure shows how the interconnection between the points is preserved.

After the students have worked with task t_M, the teacher will go through the task and generalize the technique. The teacher will use figure 21 to show and justify how the technique can be generalized.

Exercise 6

Given that there is sufficient time, exercise 6 can be done after the presentation of the theorems. The exercise consists of a task of type T_RL which has to be solved by both the algebraic and the instrumented technique. The exercise should let the students get insight into the calculations behind the instrumented technique and let them validate the instrumented technique.

Introduction to the best exponential and power model

If time permits, the module can be ended with this introduction; otherwise it is moved to module 3. The part concerning the best line should be ended by introducing the best exponential and power model. The teacher has to draw a parallel to the linear case and explain that the best exponential and power model can be calculated by formulas similar to the linear case. The teacher has to illustrate graphically in TI-Nspire how the data can be transformed to lie on a line, and subsequently the relation between the linear model and the exponential or power model.

Homework

The students have to solve exercises 7-9 at home. The purpose of the exercises is to let the students recall the techniques to solve the tasks, since knowing the techniques should make it possible to work with technological-theoretical questions about Q1 in the last module. In the exercises, technological questions related to both Q1 and Q4 appear. The students have to consider the range and utility of the models in tasks 7d and 9b, while task 9a (a task of type T_R) should constitute the students' first encounter with Q1. The data set in exercise 9 is represented graphically instead of in a table, to let the students visualize the development of the data.

8.5.3 Module 3: The type of model

The intention with the module is that the students learn that no criterion exists to determine the best class of functions F_i, that some classes of functions are nevertheless more useful than others, and further that the phenomenon has to be considered. In the module the students have to bring old techniques into play to evaluate the models and use them to justify their choice of model. The module should start with a follow-up on the homework, where two students have to present exercises 8 and 9. If the students do not explain their choice of model in exercise 9, the teacher should ask about their choice. For the students who have not done their homework, this will constitute the first encounter with Q1. Exercise 9 should be used as motivation to consider the best classes of functions in the following exercises.

Exercise 10 + 11

The purpose of exercises 10 and 11 is that the students should realize that it is far from simple to determine the class of functions to model data, and that the phenomenon has to be taken into account when determining the class of functions.
The activities are supposed to constitute the technological-theoretical and the technical moment in the students' didactic process, since the students both have to determine specific models f ∈ F_i and work with technological-theoretical questions, as the choice of model has to be explained. Exercise 10 contains tasks of type T_R, T_XM and T_QP. The purpose is to let the students discover that data can fit more than one model and that models have a restricted range. I hope that the students choose different classes of functions, so that a discussion about which classes of functions to use can be started in the group. Otherwise the teacher has to ask about the classes of functions that were not investigated. In exercise 11 the students have to solve tasks of types T_RL, T_RP and T_YM. The exercise should illustrate that the model with the highest coefficient of determination is not necessarily the best model. In this exercise it should be clear that it is unrealistic to use a linear model when the phenomenon is taken into account. When the students have discussed the exercises in groups, the exercises are presented in class. In the presentation of the exercises, the teacher has to draw attention to the arguments for the choices. It is important that the teacher emphasizes the real-world complications and consequently that no model can describe real life perfectly.

Exercise 12 + 13

The aim of exercise 12 is that the students learn that, when answering Q1, the purpose (interpolation/extrapolation) of the model has to be considered. In the exercise two different models have to be used to predict the number of victims in 1995 and 2016 respectively. I have chosen that the students should extrapolate to the year 2016, since the predicted number of victims is then negative. This clearly shows that the linear model is unrealistic in domains far from the data set. Given that there is sufficient time, the students will work with exercise 13, which is quite different from the previous ones, since the class of functions needed to model the data does not belong to F_i. This exercise should constitute a moment of evaluation, since the students should realize that not all data can be modeled by the three classes of functions they have at their disposal. The module has to be finished with final remarks on the techniques to choose the class of functions. Finally, the teacher should underline how mathematics can be used to justify the best model f ∈ F_i given the class of functions, but not the best class of functions F_i.

9 A PRIORI ANALYSIS

This chapter contains an analysis of the exercises in the teaching. The techniques, solutions and explanations that can occur are described. The a priori analysis makes up the focus points of the observation and of the a posteriori analysis. For each exercise a reference to the specific exercise (in the appendix) is included in the headline.

9.1 module 1

First encounter and the explanatory moment: Exercise 1 + 2 (A.1)

In the first exercises the students have to make the best line and discuss their techniques. Because the lines are based on intuition and visual impression, it is difficult to make the technique used explicit. The work with the lines should let the students make criteria for the best line and in that way give their first answer to the question at stake. The arguments in tasks 1c + 2c and the criteria for the best line in task 2e can be:
1. The best line minimizes Σ_{k=1}^n |y_k − (a x_k + b)|
2. The best line minimizes Σ_{k=1}^n |x_k − (y_k − b)/a|
3. The best line minimizes Σ_{k=1}^n |y_k − (a x_k + b)| / √(1 + a²) (the perpendicular distances)
4. The best line minimizes Σ_{k=1}^n (y_k − (a x_k + b))²
5. The best line has the same number of points below and above the line
6. The best line goes through the points (x_i, y_i) and (x_j, y_j), where y_i ≤ y_k ≤ y_j for k ∈ {1, ..., n}
7. The best line goes through the mean (x̄, ȳ)
8. The best line goes through two points from the data set D
9. If we know the exact value of (0, y), the best line has to go through this point.

My hypothesis is that criteria 5 and 8 will be the most frequent, as these criteria seem reasonable and furthermore can be read off the plots. The criteria involving sums (1-4) are expected to be rare, as they are not visible or intuitive. However, it is possible that some students will suggest these criteria, because they were introduced to the LSQ method in their first year. The institutionalization of the technological argument θ_M may give rise to questions about the criterion, for instance "Why do we minimize the squared vertical distances and not just the absolute distances?", since the criterion is not intuitive or logical.

The rules of sums: Exercise 3 (A.1)

In this exercise the students have to develop techniques to calculate sums and afterwards generalize the techniques. The tasks (a-h) can be solved directly by summing the terms. However, which terms to sum will not be obvious, as the symbol Σ_{k=1}^n has only been encountered in relation to the χ²-test. In the first two tasks (summing a constant), the students can get into trouble, since no variable k is included. The students have to realize what the index of summation k = 1 to n means for the sum and which terms to sum. Once it has been established how to deal with the index in order to sum the terms, the students can apply the same technique in the following tasks. In task 3i the students have to recognize the connection between the sums. Some should be simple, for instance Σ_{k=1}^4 5k = 5 Σ_{k=1}^4 k, as the equivalent results are side by side in the table, while the connection for the sum of two terms, for instance Σ_{k=1}^4 (2 + 5k) = 58, will be more difficult to recognize, as two results (Σ_{k=1}^4 2 = 8 and Σ_{k=1}^4 5k = 50) have to be added to discover the connection. The last task (3j) can only be solved if the relations (rules of sums) are discovered, since the sum would be painstaking to compute by hand. Using the rules of sums, the technique is to split the sum up and move the constant 2 in front of the sum.

The formula of the mean: Exercise 4 (A.1)

The first two tasks can be solved by techniques (calculating a mean) that the students already have. After this the students have to generalize the technique and set up the definition of the formula of the mean for the values (x_1, ..., x_n). This will be far from simple, and both cognitive and didactical obstacles are expected to occur. Firstly, the algebra can be an obstacle, because the numbers have to be represented by x_1, ..., x_n and the total number of them by n. Secondly, the summation symbols can cause trouble if the knowledge about the index and summation was not established in the previous exercise. Examples of definitions that can occur are: Σ_{k=1}^n x/n, Σ_{k=1}^n x_n/n, and (1/n) Σ_{k=1}^n x_k.
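For completeness, here is a tiny Python sketch (with made-up values, mirroring the kind of numeric checks in exercises 3 and 4) of the sum rules and the mean formula discussed above; the final line anticipates the centralization property treated under exercise 5 below.

# Numeric checks of the rules of sums (exercise 3 style) and the mean formula (exercise 4 style).
ks = range(1, 5)                     # k = 1, ..., 4
print(sum(5 * k for k in ks), "=", 5 * sum(ks))                               # sum(5k) = 5*sum(k)
print(sum(2 + 5 * k for k in ks), "=", sum(2 for k in ks) + sum(5 * k for k in ks))

xs = [3.0, 7.0, 8.0, 2.0]            # made-up data
mean = sum(xs) / len(xs)             # the mean as (1/n) * sum(x_k)
print("mean =", mean)
print(sum(x - mean for x in xs))     # sum(x_k - mean) = 0, cf. tasks 5j-k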
Displacement of data: Exercise 5 (A.1)

In this exercise the students have to investigate the connection between displacement of the data and the best line, through a graphical representation of the displacement and a comparison of the symbolic expressions of the three linear models. The techniques to solve the exercise require knowledge of the commands in TI-Nspire to subtract a constant from a data set, make plots and make linear regression, which are commands the students are supposed to know. After plotting the data, the students have to see that subtracting (x̄, ȳ) from the data displaces the data (x̄ to the left and ȳ down) without changing the relative location of the points. The activity should let the students observe that "It is like taking the whole configuration of points and moving it together" or "The relative location of the points is the same, but the points are moved to the neighbourhood of (0, 0)". After having plotted the data and calculated the linear models, it should be clear from a comparison of the linear models that the slopes are equal but the intercept changes. The students should be able to make arguments such as "The slopes of the three models are equal, but the intercept changes" or "Displacement of the data does not change the slope, as the relative location of the points is the same; however, the intercept changes as the points are displaced vertically and horizontally". The last two tasks (j-k) are to prove that Σ_{k=1}^n (x_k − x̄) = 0 and similarly Σ_{k=1}^n (y_k − ȳ) = 0. The tasks require that the techniques developed in the previous exercises (3-4) are recalled and combined, which is far from simple because of the algebraic setting. The techniques that have to be used and combined to solve the tasks are:

1. Σ_{k=1}^n (x_k − x̄) = Σ_{k=1}^n x_k − Σ_{k=1}^n x̄
2. Σ_{k=1}^n x̄ = n x̄. The students have to realize that x̄ represents a constant. As x normally represents a variable, it can be a hurdle to identify x̄ as a constant.
3. The formula of the mean: (1/n) Σ_{k=1}^n x_k = x̄ ⇔ Σ_{k=1}^n x_k = x̄ n

9.2 module 2

The proof (A.1)

In the second module the students have to reinvest the techniques learnt in the first module as well as former knowledge to prove the theorem. Because of the algebraic setting and the representation of constants and variables, the mathematical objects and techniques have to be considered from a different angle than the students usually do. The task t_M: Minimize S(a, b) = Σ_{k=1}^n (y_k − (a x_k + b))² is divided into the following subtasks:

Assume x̄ = ȳ = 0.
1a) Explain why (y_k − (a x_k + b))² = y_k² − 2a x_k y_k − 2b y_k + a² x_k² + 2ab x_k + b²
1b) Use the rules of sums to split up the sum, put a and b in front of the sums and sum over b
1c) Simplify the expression by using the assumptions

The expression is rewritten to a²A − 2aB + C + nb².

2a) Explain that A, B and C are constants
2b) Divide the expression into two second-degree polynomials
2c) Minimize the expression by minimizing each polynomial
2d) Write down the values of a and b

In class: Generalize by substituting x_k and y_k with x_k − x̄ and y_k − ȳ.

Step 1

The first task (1a) can be solved using either the algebraic technique of the quadratic identities¹ or the algebraic technique of multiplying term by term.

1. (y_k − (a x_k + b))² = y_k² + (a x_k + b)² − 2 y_k (a x_k + b) = y_k² + (a² x_k² + b² + 2a x_k b) − 2 y_k (a x_k + b)
2. (y_k − (a x_k + b))² = (y_k − (a x_k + b)) · (y_k − (a x_k + b))
3. (y_k − (a x_k + b))² = (y_k − a x_k − b) · (y_k − a x_k − b)

The algebraic technique of quadratic identities (1.) requires that the students first recognize the two terms, y_k and (a x_k + b), and next use the identity on (a x_k + b)². As the students normally use the quadratic identity on a squared bracket with two single terms, some problems can occur in identifying that (a x_k + b) has to be treated as one term.
Using the algebraic technique of multiplying term by term (2.-3.), the expression can be expanded directly. The students have to be careful with the signs of the terms and the notation when expanding the expression. The expanded form has nine terms, which can be simplified to six terms. In the next task (1b) the students have to reinvest the rules of sums to rewrite the expression. Firstly, the technique

Σ_{k=1}^n (y_k ± x_k) = Σ_{k=1}^n y_k ± Σ_{k=1}^n x_k

has to be applied to divide the expression into six sums. Then the techniques

Σ_{k=1}^n c x_k = c Σ_{k=1}^n x_k  and  Σ_{k=1}^n c = c · n

have to be applied. Using the techniques requires that x_k and y_k are treated as indexed variables, such that 2, a and b can be put outside the sums. A problem that can occur is to identify a and b as constants with respect to the sums, since they are variables in the expression. A specific problem can be to rewrite Σ_{k=1}^n b². The students have to know that Σ_{k=1}^n b² = b² Σ_{k=1}^n 1 to simplify the sum to nb². In the last task (1c) the students have to realize that x̄ = 0 is equivalent to Σ_{k=1}^n x_k = 0 (and similarly for ȳ = 0). This requires that the students know:

x̄ = 0 ⇔ (1/n) Σ_{k=1}^n x_k = 0 ⇔ Σ_{k=1}^n x_k = 0

¹ The quadratic identities are: (a + b)² = a² + b² + 2ab and (a − b)² = a² + b² − 2ab.

Step 2

In this step the students have to minimize the expression from step 1. The step can pose a series of problems, since the techniques and mathematical objects have to be used in other settings than usual. First of all, the symbols will be complicated, since x_k and y_k are constants while a and b are variables, the reverse of the formula y = ax + b. Secondly, the constants a, b and c in the standard notation of second-degree polynomials are not the same as the variables a and b in the expression. Thus the a and b occurring in the formula of the vertex are not equivalent to the a and b in the expression. To minimize these obstacles, the expression

a² Σ_{k=1}^n x_k² − 2a Σ_{k=1}^n y_k x_k + Σ_{k=1}^n y_k² + nb²

is rewritten to a²A − 2aB + C + nb² in the task. The rewriting avoids the use of x_k and y_k and simplifies the expression. The relations A = Σ_{k=1}^n x_k², B = Σ_{k=1}^n x_k y_k and C = Σ_{k=1}^n y_k² can be observed by comparing the expressions. Because of the notation (x_k and y_k), it can be challenging for the students to explain why A, B and C are constants. The students have to identify (x_k, y_k) as given points (i.e. constants) to realize that the sums are constants, and furthermore that a sum of constants is a constant. The task should let the students make explanations such as "The sums are constants, since x_k, y_k are given data points, and summing constants gives a constant" or "The x_k and y_k are constants, thus the sum of these is a constant". In the next task (2b), the students have to realize that the sum S(a, b) can be divided into two functions φ(a) + ψ(b). Due to the standard notation of second-degree polynomials, the use of a and b as variables and A, B, C as constants will be a cognitive obstacle for the students, since a, b and c normally represent constants. Because C is independent of the variables, the expression can be divided in two ways:

1. φ(a) = a²A − 2aB + C and ψ(b) = nb²
2. φ(a) = a²A − 2aB and ψ(b) = C + nb²

It does not matter that two ways to divide the expression S(a, b) exist; actually it can be an advantage, because it can give occasion for a discussion where the essential point of the task (splitting up the polynomial) can be highlighted.
The step of splitting up the polynomial is crucial for solving the task, because the problem of minimizing the function $S(a, b)$ is reduced to a problem of minimizing functions of one variable. The division of the expression in two ways also has a mathematical quality, since the work does not appear as one unique long screed that the students have to discover or guess; the students are shown that a proof can have small variations without it making any difference.

In the following task (2c) the two polynomials have to be minimized. Firstly, the students have to realize that
\[
\min_{a,b} S(a, b) = \min_{a,b}\bigl(\varphi(a) + \psi(b)\bigr) = \min_a \varphi(a) + \min_b \psi(b) .
\]
The polynomials can be minimized either by use of differential calculus or by the formula of the vertex. Using differential calculus, the students have to solve $\varphi'(a) = 0$ and $\psi'(b) = 0$. Using the formula of the vertex, $-\frac{b}{2a}$, the symbols can be an obstacle, since $a$ and $b$ in the formula are not equivalent to the $a$ and $b$ in the polynomials. For instance, if $\varphi(a) = a^2 A - 2aB + C$, the students should recognize that $A$ is the coefficient of the squared variable and $-2B$ is the coefficient of the first-power term, and so set up $a = \frac{-(-2B)}{2A} = \frac{B}{A}$. After having determined the values, the students have to justify that the values minimize the polynomials. The value of $b$ should be clear. The value of $a$ can be explained using second degree polynomials or monotonicity. The argument using second degree polynomials is: since $x_k^2 > 0$, the sum $A = \sum_{k=1}^{n} x_k^2 > 0$, so the parabola opens upward and the value of $a$ minimizes the polynomial. Using monotonicity is much more complicated, since $\varphi'(a) = 2aA - 2B$, which means that the monotonicity has to be checked for both $B > 0$ and $B < 0$. The argument is:
If $B > 0$, then $\frac{1}{2}\frac{B}{A} < \frac{B}{A} < 2\frac{B}{A}$. This gives $\varphi'\bigl(\tfrac{1}{2}\tfrac{B}{A}\bigr) = 2\cdot\tfrac{1}{2}\tfrac{B}{A}\cdot A - 2B = B - 2B = -B < 0$ and $\varphi'\bigl(2\tfrac{B}{A}\bigr) = 2\cdot 2\tfrac{B}{A}\cdot A - 2B = 4B - 2B = 2B > 0$.
If $B < 0$, then $2\frac{B}{A} < \frac{B}{A} < \frac{1}{2}\frac{B}{A}$. This gives $\varphi'\bigl(2\tfrac{B}{A}\bigr) = 2\cdot 2\tfrac{B}{A}\cdot A - 2B = 4B - 2B = 2B < 0$ and $\varphi'\bigl(\tfrac{1}{2}\tfrac{B}{A}\bigr) = 2\cdot\tfrac{1}{2}\tfrac{B}{A}\cdot A - 2B = B - 2B = -B > 0$.
Thus $a = \frac{B}{A}$ is a minimum.

Exercise 6 (A.1)
In the exercise the students have to solve $T_{RL}$ using the technique developed in task $t_M$. The calculation of $\sum_{k=1}^{n}(y_k - \bar{y})(x_k - \bar{x})$ and $\sum_{k=1}^{n}(x_k - \bar{x})^2$ is hard to do by hand, but using commands in TI-Nspire the sums can be calculated and inserted in the formula. The students should realize the equivalence of the instrumented and the algebraic technique and consequently link the techniques.
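As a computational check of this equivalence, the formula obtained from task $t_M$ after the generalization step, $a = \frac{\sum_{k=1}^{n}(x_k - \bar{x})(y_k - \bar{y})}{\sum_{k=1}^{n}(x_k - \bar{x})^2}$ and $b = \bar{y} - a\bar{x}$, can be compared with an instrumented linear regression. A minimal sketch (not part of the teaching material), using the age/length data from exercise 1 in appendix A.1:

```python
# The algebraic technique from task t_M versus an instrumented linear regression,
# using the killer-whale age/length data from exercise 1 (appendix A.1).
import numpy as np

age = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
length = np.array([310, 365, 395, 424, 440, 500, 578, 580, 582], dtype=float)

xbar, ybar = age.mean(), length.mean()
a = np.sum((age - xbar) * (length - ybar)) / np.sum((age - xbar) ** 2)
b = ybar - a * xbar                          # displacing the centred line back

a_tool, b_tool = np.polyfit(age, length, 1)  # the "black box" regression command
print(a, b)            # algebraic technique
print(a_tool, b_tool)  # instrumented technique: the same values (up to rounding)
```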
9.3 module 3
The last module constitutes a didactic process about the class of functions ($F_i$), where the students should work with $Q_1$ (and to some extent $Q_4$). The students' expectations of what happens in mathematics can be an epistemological obstacle in the work with $Q_1$, since the exercises do not have one unique solution that is correct with logical necessity, which is what the students are accustomed to.

Exercise 7 + 8 (A.1)
The exercises contain tasks of types $T_{RL}$, $T_{YM}$, $T_{QP}$, $T_{RP}$ and $T_{XM}$. The techniques to make regression and extrapolation should be well known to the students. However, task 7a is not simple, since a transformation of the independent variable has to be made before the linear regression is made. This transformation is not straightforward, because the years of the world records are not continuous; hence not all years from 1998-2007 are included in the table. Subtraction of $(1998, 0)$ is the technique to transform the data set. Even if the students do not make the transformation, the following tasks can still be solved correctly. If the students do not transform the data, this can give occasion to discuss transformation (displacement) and to make a connection to the previous work (exercise 5 and task $t_M$).

Exercise 9 (A.1)
The exercise constitutes the students' first encounter with a task of type $T_R$: Determine the best model $f(x)$ within the classes of functions. From the graphic representation of the data (figure 22), the absolute change in the dependent variable ($\Delta y$) can be read to increase with the year. For that reason it should be clear that the data cannot follow a linear model. Whether the class of functions in $F_i$ has to be exponential or power cannot be determined from the representation.

Figure 22: The representation of the data set in exercise 9.

One technique ($\tau_{Qr^2}$) is to determine the coefficient of determination. Using exponential regression the students will get a good model ($r^2 = 0.998$) whether they use $x =$ year or $x =$ year $- 1996$. Making power regression, different scenarios can occur:
1. Using $x =$ year, the model will be $f(x) = 0\cdot x^{491.409}$, i.e. $f(x) = 0$. The reason is that $x^{491.409}$ is enormous for $x > 1$, so the other parameter is rounded down to 0.
2. Using $x =$ year $- 1996$ (transforming the data by subtracting $(1996, 0)$) gives the error message "Domain error. An argument must be in a specified domain", since $\ln(0)$ is undefined.
3. Using $x =$ year $- 1995$ (transforming the data by subtracting $(1995, 0)$) gives a reasonable model with $r^2 = 0.899$.
The first two scenarios can give rise to a discussion about the technique to determine $f \in F_P$, where the practical impact of the data can be emphasized: neither the original data, which lie far away from the centre $(0, 0)$, nor the data transformed by subtracting $(1996, 0)$, for which $\ln(0)$ is undefined, can be used. To determine $f \in F_P$, the data has to be transformed so that it lies close to the centre $(0, 0)$ (technique 3). Evaluating the power model, it can be found that the coefficient of determination is lower than in the exponential model, which could let the students make arguments such as "The best model is the exponential one ($f(x) = 1.59\cdot 1.28^x$), since the coefficient of determination is higher than for the power model". Using the property of growth of the exponential function ($f(x + k) = b\cdot a^x\cdot a^k$), the students can argue that the size by which the capacity changes depends on the size of the capacity. It seems reasonable that installed windmills prompt others to install windmills within the domain of the data set, since the data describe the period (1996-2011) in which many windmills were set up. Thus, considering the properties of growth together with the phenomenon, an exponential model is suitable to describe the data set. Another technique to argue for the exponential model is to use the differential equation $\frac{f'(x)}{f(x)} = \frac{b\cdot a^x\cdot\ln(a)}{b\cdot a^x} = \ln(a)$, but the students do not have the didactical conditions for using this technique.
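The three scenarios can be reproduced outside TI-Nspire. A minimal sketch, with made-up capacity values (roughly following the exponential model quoted above) and assuming that the power fit is computed by the usual log-log linearisation, which is presumably also what produces the ln(0) error in scenario 2:

```python
# Illustrative data only: why power regression behaves differently for x = year,
# x = year - 1996 and x = year - 1995 (the three scenarios of exercise 9).
import numpy as np

years = np.arange(1996, 2012)
capacity = 1.59 * 1.28 ** (years - 1996)    # made-up values, roughly exponential

def power_fit(x, y):
    """Fit y = b * x^a by linear regression on (ln x, ln y)."""
    a, ln_b = np.polyfit(np.log(x), np.log(y), 1)
    return np.exp(ln_b), a

print(power_fit(years, capacity))         # scenario 1: b rounds to 0, enormous exponent
# power_fit(years - 1996, capacity)       # scenario 2: breaks down, since ln(0) is undefined
print(power_fit(years - 1995, capacity))  # scenario 3: a reasonable power model
```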
Exercise 10 (A.1)
In exercise 10 the students have to solve tasks of types $T_R$, $T_{YM}$ and $T_{QP}$. In the first task (10a) the students can use a scatter plot to reject a linear model. The following techniques can be used to investigate the best $f \in F_E$ and $g \in F_P$:
1. $\tau_{Qr^2}$: Using the coefficient of determination, the students will find similar values for both models
2. $\tau_{PS}$: Plotting the data and the models, both models will be found to fit the data
3. Properties of the functions: $f(x + k) = b\cdot a^{x+k} = b\cdot a^x\cdot a^k = f(x)\cdot a^k$ and $g(x\cdot k) = b\cdot (xk)^a = b\cdot x^a\cdot k^a = g(x)\cdot k^a$
The techniques can let the students make arguments such as:
A1 "The exponential model is best, since the coefficient of determination is slightly higher"
A2 "Both models can be used, as they both fit the data. The coefficients of determination are similar, so I cannot select which is best"
A3 "I will choose the exponential model. It seems reasonable that infant mortality decreases by the same percentage, since the contribution to minimizing infant mortality will decrease when the number gets smaller."
In the last task (10d) the students will find (whether they use the exponential or the power model) that the predicted value is 7.5 lower than the actual value (the predicted value is 2.4, the actual value is 9.9). This should give rise to a discussion of the model. Answering the task requires that the phenomenon of infant mortality is considered. The task should let the students set up arguments such as: "The model underestimates by 7.5; hence the model cannot be used to predict infant mortality in a domain far from the original data" and "We see that the model underestimates the infant mortality by 7.5. This is due to the fact that it is assumed that the infant mortality per mille continues to decrease by about 10%, which is unrealistic. The drop in infant mortality in the period 1933-1973 was a result of developments in healthcare and the treatment of disease."

Exercise 11 (A.1)
The first three tasks can be solved by use of techniques well known to the students. In the last task (11d) the relation between distance and time has to be discussed. To justify the class of functions, different qualitative arguments can be suggested, depending on the technique:
A1 Coefficient of determination: Both models have $r^2 \approx 1$ (linear model: $r^2 = 0.9998$, power model: $r^2 = 0.9993$). This shows that both models fit the data well and so can be used to describe the data.
A2 The average speed for the marathon will be 297 s/km, i.e. $3600/297 \approx 12.12$ km/h, using the linear model and 328.9 s/km, i.e. $\approx 10.95$ km/h, using the power model. Using the data point (25, 7413), the runner's average speed at 25 km is 296.5 s/km. This means that the runner would have to run with the same average speed for the marathon as for 25 km if the linear model is used. Using the power model, the average speed for the marathon will be about 1.2 km/h slower than the average speed at 25 km. Since the average speed decreases when the distance increases, the power model is selected.
A3 A table of time per km shows that the speed is not constant. The longer the distance, the more time is used per km, i.e. the slower the speed.

Distance (km):      0.5    1     3      5      7      10     15     20     25
Time per km (s/km): 196    215   259.7  272.8  284.1  286.9  289.4  295.2  296.5

Since the time per km is not constant, the relation between distance and time is not linear; hence the time for the marathon cannot be predicted by a linear model.
A4 The maximal speed is the same (independent of the distance) using the linear model. Using a power model, the time per km, $\frac{f(x)}{x} = b\cdot x^{a-1}$ (assuming $a > 1$), increases when the distance $x$ increases, so the maximal speed decreases. Therefore the maximal speed depends on the distance. Since a runner's maximal speed decreases when the distance increases, the linear model is unrealistic.
A5 The maximal speed for runners at a given distance decreases when the distance increases, so the power model seems most reasonable.
Using their knowledge about functions and races, the students may realize that it is inappropriate to use the linear model. The students should realize that the class of functions is determined in an interplay between the phenomenon behind the data and mathematics.
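The properties listed as technique 3 for exercise 10 also suggest a simple data-based check that distinguishes the two non-linear classes: for equal steps in $x$ an exponential model multiplies $y$ by a fixed factor, while for equal factors in $x$ a power model does. A small sketch with illustrative numbers only:

```python
# Illustrative numbers only: using f(x+k) = f(x)*a^k and g(x*k) = g(x)*k^a to tell
# an exponential data set from a power data set.
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0])   # x is doubled in each step
y_exp = 3 * 1.5 ** x                 # exponential: y = b * a^x
y_pow = 3 * x ** 1.5                 # power:       y = b * x^a

print(y_exp[1:] / y_exp[:-1])   # not constant, since the steps x_{i+1} - x_i are not equal
print(y_pow[1:] / y_pow[:-1])   # constant (2**1.5), since the factors x_{i+1} / x_i are equal
```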
Exercise 12 (A.1)
Like in exercise 10, the students can get a good idea of how the data behave by making a scatter plot. Using the coefficient of determination, a linear model seems reasonable ($r^2 = 0.947$) to apply for interpolation (task 12a). Extrapolating the linear model, the students will determine a negative number of victims. Once the students have found out that the linear model is unreasonable, they are supposed to investigate the classes of exponential and power functions. The techniques that can be used are:
1. $\tau_{Qr^2}$: The coefficient of determination. The coefficients of determination are similar ($r^2 = 0.924$, $r^2 = 0.923$)
2. Properties of the functions: Both classes of functions tend to zero ($\lim_{x\to\infty} ba^x = 0$ for $0 < a < 1$ and $\lim_{x\to\infty} bx^a = 0$ for $a < 0$) without becoming negative when the independent variable increases, in accordance with the phenomenon
3. Properties of growth: Since the effort and actions made to minimize road victims will decrease concurrently with the number of road victims, it is likely that the number of road victims decreases by the same percentage each year
Extrapolating the two models, the students will find respectively 2262 and 2278 road victims in 2016, which will let the students make arguments such as "The number of road victims in 2016 will be about 2270. Using the exponential model the number will be 2262 and using the power model the number will be 2278" or "The number will be 2262, as the exponential model fits the data best. However, the power model also seems reasonable to model the data. The numbers only differ by 16 in the two models."

Exercise 13 (A.1)
The first task (13a) will be difficult for the students, since they have not learnt about logistic growth. Only when the students have classified the type of growth as logistic can logistic regression be made. Subsequently the logistic model can be used to solve tasks b + c. The exercise can also be solved by techniques belonging to the students' equipment; however, the technique is not as simple as logistic regression. The technique is to divide the data into two data sets described by two different models: the number of rabbits in the first eight years ($0 \leq x \leq 8$) can be described by an exponential model, while the number of rabbits after 12 years ($x \geq 12$) is almost constant, hence can be modeled by a constant model.

10 A POSTERIORI ANALYSIS

10.1 methodology
In this section, the methodology for the data collection and the data material are briefly described. In the following section the data material will be analyzed corresponding to the focus points set up in the 2nd research question and based on the a priori analysis. The methodology for the data collection and the type of data collected were based on the 2nd research question, since the data and the subsequent analysis of the data should make it possible to answer the 2nd research question. To investigate the possibilities and obstacles for the students' theoretical work with regression, data about the students' techniques, their explanations of the techniques and their arguments for the class of functions ($F_i$) chosen were needed. To answer the first question in the 2nd research question, data about the students' techniques and solutions to the subtasks in $t_M$ was collected.
The techniques were used to clarify which praxeologies the students applied, while the calculations and the solutions illustrated the constraints and obstacles the students had when using the techniques. Since the techniques and solutions to each exercise appear in the students' handouts¹, the handouts were collected. In the handouts, boxes for the calculations were included and the students were told to write their solutions in these boxes.
¹ The students were given the exercises on paper, and the papers were collected in a handout.

To answer the second question in the 2nd research question, the students' techniques, considerations and arguments to determine the class of functions, and their approach to tasks of type $T_R$, were collected. The arguments can be found in the students' solutions and in their oral discussions. For that reason, each group was sound-recorded to collect data about their arguments and explanations. This gave insight into the discussions where the explanations were made and different arguments were discussed. The sound recordings further gave data about the obstacles that occurred and the students' interaction with the teacher on the different exercises. The sound recordings made up an important part of the data material.

The sound recordings were supported by observations from the classroom. During the group work, I walked around between the three groups and took notes about the students' approach to the different exercises and the obstacles that occurred, and noted the techniques used in TI-Nspire. During class teaching I noted how many students participated and took pictures of the blackboard. For instance, I observed which class of functions the students started modeling with and which technique they used to decide the class of functions. In the observations I focused on the techniques and the justifications occurring. To organize the observations, a table with the following column headings was used:

Activity | Didactic moment | Tasks and techniques | Technological/theoretical elements | Observed activity

The following data were collected about the teaching sequence:
- Sound recordings of the group work
- Sound recordings of the classroom teaching
- The students' solutions to the exercises and their notes
- Notes made during the teaching by observing

Sound recordings were made of three groups in three modules, each of length 95 min with a break of 5 min. After the teaching, approximately 13.5 hours of sound recordings, observation notes and 14 sheets of the students' solutions had been collected. In the analysis of the data I listened to the sound recordings and transcribed the points worth discussing, for instance an interesting explanation, a dialogue about the techniques, spontaneous remarks about the study process, etc. In the transcripts the students have been anonymized by using the letters A to E within each group instead of their names. Similarly, the students are named with letters in the sound recordings from the class teaching. Therefore student A in one dialogue may not be the same as student A in another dialogue. The transcripts used in the thesis have been translated; the transcripts in Danish can be found in appendix A.7. Translating the transcripts posed challenges, because it was difficult to translate spoken Danish into spoken English. Because of that, the dialogues (and so the words) presented in the thesis are literally not the students' own anymore. In the translation, the meaning of the words is interpreted in such a way that what the students meant is intelligible (Van Nes et al., 2010).
The students' solutions were examined, and the different techniques, solutions and explanations for each task were drawn together; see appendix A.6. The collection of the techniques, solutions and explanations made it possible to compare what actually happened in each exercise with what could have happened, as described in the a priori analysis.

10.2 analysis of the realized teaching sequence
In this section the students' didactic process as it occurred in the teaching will be analyzed. Since the planned activities took longer than expected, the designed teaching was altered in the process. An outline of the planned and realized didactic process is sketched in figure 23. A detailed description of the realized didactic process and the figure of the outline can be found in appendix A.5.

Figure 23: Outline of the planned and realized didactic process. Each string illustrates a module and each module is divided into the time spent on the activities. The colour illustrates the didactic process. The numbers refer to the exercises.

Figure 23 illustrates that the technical work with the techniques of the technical moment and the institutionalization constituted a greater part of the work with $Q_3$ than planned, because the technical work was difficult. This meant that the students did not do all the technical work by themselves; thus the theoretical discourse of $Q_3$ was mainly institutionalized. The focus of the analysis is on the students' didactic processes with respectively $Q_3$ and $Q_1$. In the analysis the students' solutions and quotations are italicized, and references to the students' solutions and to the dialogues (before translation) are made in square brackets.

10.2.1 $Q_3$: The best linear model $f \in F_L$
First encounter and the exploratory moment
The students' first encounter and the exploratory moment with $Q_3$ worked smoothly. The students elaborated techniques to make the best line and had intuitive and qualitative criteria for the line they made. It turned out that the students had difficulties explaining why the line they had made was the best, as comments such as "It is actually difficult to explain" [1] and "It is indeed very good. But I don't know why" [1] show. The students' criteria for the best line turned out to be very similar. The most frequent criteria were [A.6]:
• I have placed the line in the middle of all the points
• I chose this line, since it hits as many points as possible. Further, I put the beginning of the line at the intersection of the y and x axes
• I have counted the number of dots on each side of the line and they are equal
The first and last criteria are related to criterion 5 in the a priori analysis, while the second criterion can be related to criterion 9 in the a priori analysis. It is encouraging that the students were on track to make mathematical criteria for the best line. The students' ideas and intentions about the best line, called the students' protomathematics², indicate that the students had a sensible basis, which can be used to specify the question of the best linear model.
In the class, the students discussed the techniques they had elaborated and their solutions to the task. This led the students to adjust their lines and to state their criteria more exactly. The following dialogue is from a discussion where student C adjusted the line after discussing the techniques.
Student B: But what are your reasons for this line?
Student C: That it should hit as many points as possible Student A: Hit, so hit them Student C: Yes, like this [Student C showed what she meant by hit on the screen] Student A: So it does not matter, it just had to hit Student B: What about the one that sits in the middle Student D: There must be some which are outside Student C: I might as well have put it that way a little more down here. It would probably be more right Student A: Well, I have written that there has to be an equal number of points the same side. So in same distance from the line Student C: But. If you place it here. Then it is really far away from this one. From this point. [2] Few students were rather close to the definition of best line, as they include mean, sum or distances in their criteria. One student argued: "Since there are many points in east and west I make the line based on 2 Protomathematics is a set of skills and knowledge that a group or an individual knows and uses in life. Protomathematics offering fertile sources for mathematization without itself being mathematics. (Thomas, 1996, p. 11-13) 10.2 analysis of the realized teaching sequence what I think is the mean of the points"[A.6]. From the criterion it is not clear if the student refered to the point of gravity or if the criteria was similar to criteria 7 in the a priori analysis. Another student proposed that "All the points on the one side, their total distance has to be roughly the same as the other side."[3], perhaps an idea of summing the distances, like criteria 1-3 in the a priori analysis. Even though the criteria was loosely formulated, the inclusion of "sum" and "distances" indicate that the student had a rather strong intuition about summing a distance, and consequently useful ideas that could be used in the technological-theoretical moment. The technological-theoretical moment The discussions in the groups (when solving exercise 1+2) indicate that the students found out that the question had to be defined precisely to be answered. For instance one student said: "So you know you can not use the eye. It is [ makes disappointed sound]"[4]. The soundrecordings and the observation indicate that the students had realized the problem with using graphic techniques and the need of a more precise technique to determine the best line. It is however not possible to say if the students realized the need of a definition and a technique because of the exercises or if it is due to their previous work with regression (they know one "best" line exists). Because the students had relevant answers to the question, the specification of the question was based on these answers. The teacher institutionalized the definition of best linear model and the technology ✓M . The experimental approach to encountering linear regression seems reasonable to include to let the students construct the technological discourse (Aim 1 of the design). The technical moment The technical moment consists of developing a technique to minimize tM . As described in the a priori analysis the students had to solve task tM and reach the theoretical level ⇥EA through different subtasks. It was intended that the subtasks were based on old knowledge, in form of well-known techniques and praxeologies, but it turned out that the techniques were not to well establish. The students did not develop the algebraic techniques of sums and mean and they did not realize the connection between displacement and the symbolic expression of the best linear model. 
The last should have been the basis for generalizing the result of the technical work to a technique. Displacement of data In exercise 5, the students did not succeed to make a connection be- 103 104 a posteriori analysis tween displacement and the symbolic expression of the best linear model. It appeared that the students spent long time and effort to calculate the transformed data (subtract (x, y)), even though the teacher had recalled the technique. One reason, that the technological-theoretical level of displacement was not reached, was that the displacement of the data was invisible in Nspire. Nspire determines the values of the axis based on the data set, which meant that the points in all three cases were plotted the (a) Original data set (b) Vertical displacement (c) Centralized data set Figure 24: The scatterplots of the three data set. The scatterplots are similar except the axis, which did it difficult to discover the displacement. same place and only the axis were changed, see figure 24. I must admit that the problem was not something I had foreseen would be an obstacle to the students’ work, since otherwise I would have solved it (a redesign of the exercise is described in the discussion). Based on observation and the sound recordings it is clear that the students had difficulty explaining what they saw, as nothing happened with the location of the points, but only with the axis. The teacher had to help the students by asking about the coordinates before the students realized that the coordinates had changed. The following conversation shows how the teacher helped the students to state what happened. Student A: In c. Then you had to make those calculation we have made and insert them in a plot. But on our plots it is the same as, you know the same points The teacher: When you have subtracted the mean Student A: Yes, like this. We plot them The teacher: So then you plot right now Student A: The points are the same now. Is that the point? The teacher. Yes [ interruption of student B] Students B: But see the number here 10.2 analysis of the realized teaching sequence 105 The teacher: But they are not identical. What is the difference? Student B: It goes in minus Student A: Yes yes. The y and the x values The teacher: Yes, so you say it is the same. What do you mean? Student A: I just think that the points are in the same location, but Student B: Just on different, they are located to Student C: If you set it up in a different coordinate system so would they not be located the same places [8] The dialogue indicates that the students were aware that the relative location of the points was similar, but the students missed to realize that subtracting (x, y) implies that the points were moved in the neighbourhood of (0, 0) (A priori analysis ex:5). Because of time constraints, the students did not compare the symbolic expression of the three linear models and so they lack to discover that transforming the data did not change the slope of the linear model. The knowledge was institutionalized by the teacher. The obstacles described caused that the current function with the activity was not linked to questions about regression (monumentalism), hence the purpose of the exercise was missing. The students also said that the exercise was about "writing a lot of numbers in Nspire"[9] and "calculating the mean and difference" [9]. The subtasks in the proof The subtasks to solve tM turned out to be very difficult for the students. 
For that reason, the students only did some of the subtasks, by themselves, while the teacher institutionalized the others. In the following, the essential possibilities and obstacles that occurred in the subtasks are described. In the first module the technique and theory of summation and mean were handled. The students’ solutions and the observations show that the students elaborated techniques to calculate sums in the numeric setting. The technique to sum a constant was found to be the only obstacle. One group found the technique with help from the teacher, while the other two groups elaborated the technique by discussing the examples in the handout. Student A: Student C: Student A: Student C: Student A: Student B: Student A: Student D: I do not quite understand. What is k. k equal 1. The index This is what it starts from No, it is not Or it increases by one No. Try to look at the examples Is it not just that you have to put two up four times Yes - I also think that So it is 2 + 2 + 2 + 2 [5] In the exercise about mean, it became clear that the theory of sum- Monumentalism refer to the construction of praxeologies without knowing its current functions or why it was once built (Rodríguez et al., 2008, p. 292) 106 a posteriori analysis mation was not to well established and the students had challenges using the notation in the algebraic setting. Student B: We must make. The number five which are above the summation have to be n. And then we have to say Student A: No Student B: Yes Student A: What is written below is n Student B: Yes, it also comes to be called n, but this above will also be called n. [6] The praxeologies developed about mean and sum appeared to be insufficient when the techniques had to be used in the proof. The students were only able to apply the rule Pn Pn Pn k=1 (xk ± yk ) = k=1 xk ± k=1 yk , but not the other two rules, which the following is an example of: Student B: Can we use number one? [ The students refer to the rules of sums written on the blackboard (figure 25)] The teacher: Yes, where will you use number one? Student B: Is it not that where it says k? The teacher: k? Student C: No Student A: Also, I will use, yes number one as he said. But there were we have a constant. And this are for instance there where two stand before the letter The teacher: Yes, that is correct. Two is a constant. Are there other letters, I know it is letters, but are there other letters that are constants up here? If we try to think. What are our variables and what are our constants? Student A: Those which are raised to the second. So that is a to the second power [7] Figure 25: The rules of sum written at the black board As expected, the students could not explain why a and b could be put outside the sum. This illustrates that the students’ praxeologies were not stable to variation in the notation, since they were unable to 10.2 analysis of the realized teaching sequence use the algebraic rules of sum when the ostensive changes3 . The more stable theory, the less sensitive is a technique for variation, which indicate a lack of the technological-theoretical discourse in the students praxeologies. In short, the algebraization of the rules and the technological questions did not lead to an establishment of a fluid interrelation between the technical and technological-theoretical level of sums and mean. This implies that the students did not have the conditions to use the algebraic techniques (rules of sums, definition of mean) in the proof. 
2 The expansion of yk - (axk + b) was found to be based on wellknown techniques. The students’ solutions showed that all three techniques described in the a priori analysis were used. Some students made small errors when using the techniques (figure 26), but the task (1a) was manageable. Figure 26: A student’s solution to task 1a). The technique is multiplication of the brackets. Note the student wrote axn · axn = ax2n [A.6] In addition to the problems caused by the new technologies of sum and mean, the students also had difficulties with the algebraic symbols used. In task 2a) the students had difficulty realizing that the sums were constants. It was obvious that the notation of xk and yk was in conflict with the way the students usually use the notation x and y. The representation of x and y as variables and a, b and c as constants is linked to the students’ application of them in the various mathematical context of functions (their ostensives), but here the students had to reverse the representation, which they did not manage to do. Due to the fact that no students recognized xk and yk as constants, the 3 Any praxeology is activated through the manipulation of ostensives and the evocation of non-ostensives. The ostensives are symbols, words, gestures and representations connected to the abstract mathematical concepts. For instance ax2 + bx + c is an ostensive connected to second degree polynomials. (Arzarello et al., 2008) 107 108 a posteriori analysis teacher institutionalized that the points were constants. The discussion after the teacher’s help and the students’ solutions indicate that the students realized that xk and yk represented the data and so be constants. However, no students made arguments like the arguments in the a priori analysis, since they not argued for that a sum of constants are a constant. In task 2b, the students did not master the definition of second degree polynomials in an algebraic setting, where the symbols differ from the standard notation. Student E: Can the a, b, c be the same as x. So ax2 + bx + c. Boom The teacher: Yes. What could then be the x? Student E: It could be a, b, c. Big a, b, c The teacher: The big a, b, c were what? Student B + E (in chorus): They were constants The teacher: Yes, so what should vary here? Student E: The little a, b, c Student B: The a, b have to vary [10] Firstly it is worth mentioning that the students did not understand the formulation of the task (2b). Listening to the students’ discussions, it seems that the formula; sum of two second degree polynomials caused the barrier. Two variables in the same function and furthermore that the variables were named a and b were a general problem for all students. The dialogue shows how student E first suggested to let A, B and C (the constants) be the variable and later the a, b and c (from the standard notation) be the variables. It was obvious that all the students were confused about the use of symbols and that they needed help from the teacher to identify the variables (see dialogue 11). It turned out that the students had hard time using their praxeologies about second degree polynomials when the ostensives changes. In the students’ solutions both ways to divide the expression S(a, b) can be found, thus the potential to discuss the step S(a, b) = (a) + (b) was available, but unfortunately neither the students nor the teacher did. Because the division was not discussed, the mathematical quality of the proof was missing (cf. a priori analysis). 
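For reference, the intended outcome of tasks 2c-2d (cf. the a priori analysis) is $a = \frac{-(-2B)}{2A} = \frac{B}{A}$ and $b = 0$, since $\psi(b) = nb^2 \geq 0$ with equality exactly when $b = 0$.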
Because of time constraints, only one group solved task 2c. Based on that group and the observation made in the institutionalization, I infer that the task was manageable. During the institutionalization of task 2c), it was surprising that several students were able to participate in the discussion and that the students suggested both techniques (differential calculus, vertex) to solve the task. Using the formula of vertex, the students succeed to link the formal definition of vertex to the specific case, see figure 27 (cf. a priori analysis). In addition, one student explained that the 10.2 analysis of the realized teaching sequence values minimize the polynomials (using polynomials cf. a priori analysis), which illustrates that the students was able to link the subtask P 2 to the previous tasks, as the sum (A = n k=1 xk ) was applied. The fact that a great part of the students’ work was based on alge- Figure 27: A student’s solution to task 2c + 2d).[A.6] braic techniques, which were not sufficiently stable, it did not succeed to let the students by themselves solve task tM . For that reason a part of the technical work with the subtasks was skipped and was only institutionalized. Judging from the observation, the students only mastered to focus on the techniques in the subtasks and consequently they did not link the subtasks in a didactic process about the task tM . This caused that the students did not realize that the technique to solve tM was to split up the expression in two expressions with each one variable and minimize each of these (Aim 2 of the design). 10.2.1.1 Institutionalization and evaluation In the first two modules the institutionalization occurred continuously in the teaching and it turned out that the institutionalization constituted for a larger part than planned (figure 23) because the technological moment proved to be difficult. In relation to the MO of Q3 , the students did not reach the theoretical discourse by themselves, but only in an interplay between the technical work with the subtasks and institutionalization from the teacher. Based on the comments made after institutionalization of the theorem, it seems that at least few of the students had realized that they had developed a general technique to make linear regression and were able to establish a connection between the instrumented techniques to make linear regression and the technological-theoretical discourse [✓M /⇥EA ]. Student A: So this is not the way we have to calculate it. It is thus a proof The teacher: Fortunately, Nspire does it for you 109 110 a posteriori analysis Several students: Yes please Students A: But this is the reasons for it Student B (to student A): So now you know what Nspire does The teacher: So each time you make regression in Nspire, linear regression in Nspire, then it calculate the value of a in this way and similarly with the value of b [12] In the teaching no evaluation of the MO was made. In the redesign, examples of tasks to evaluate the MO are suggested. 10.2.2 Q1 : The type of model Fi 10.2.2.1 The first encounter with Q1 The first encounter with Q1 , turned out to be in the class, since only one student (student B) had made the homework. Student B had only made exponential regression hence the three scenarios about power regression (described in the a priori analysis) did not occur. In spite of the fact that the other students had not solved the exercise, they however questioned the model (exponential model) that student B suggested. 
The other students suggested that the data could also be modeled by the class of power functions. Student A: How did you know that it is an exponential model? Students B: You can see it on that [points at the figure (figure 22) in the exercise] Student C: Why not a power? Student B: No. That is how it is The teacher: Why not? Student B: It just is an exponential function Student D: But why? Student E: Can you not remember the rules for it? [Silence talk in the groups] Student E: It looked best The teacher: How could you find out if it is a power or exponential? [No students came with an answer ] The teacher: It is a good discussion you have started. It is actually what we will talk about in this module. How do we find out which model to use, when we get some numbers. [13] The students realized that they had a problem, since neither of them could justify their choice of model. This motivated the students to work with questions related to Q1 , as they were curious to found out the technique to determine the class of functions. I will not comment on the exploratory moment, as the students did not elaborate new techniques in the work with Q1 . The students developed their existing techniques and included the phenomena behind the data to solve the tasks. 10.2 analysis of the realized teaching sequence 10.2.2.2 The technical moment The technical moment related to Q1 turned out to be feasible, because the students mastered the instrumented techniques (make plots, regressions and calculate the coefficient of determination). Dealing with the technical work, it was surprising that the students evaluated the instrumented technique to make linear regression. For instance, one student discovered a limitation of the instrumented technique, when the students determined the linear model in exercise 11. The students worked in TI-Nspire (figure 28), when the following dialogue occurred. Student C: It does not make sense to me Student D: It does make sense Student C: So, I start at minus second. That does not make sense. But I can figure out how to do it Student D: Try to exchange x and y. Some time is x and Student C: No Student D: But what is it that make no sense? Student C: No no. If you zoom in. It takes the mean of it. It knows not better [When the students said it, the student referred to Nspire] Student C: Then the write determine b in the linear model. I just have to write zero because it looks like this [Silence] Student C: Yes yes, but I think about zero. I start at zero. If we think logically [Later in the discussion] Student B [ask student C]: Why do you understand that the b-values have to be negative? Student C: Logically you think it has to be zero. But you can say. You have to consider that Nspire do not know what the numbers must be [15] This illustrates that student C, during the process of determining the Figure 28: The linear model in TI-Nspire that the students discuss in the dialogue. Student C is confused about the intercept with the yaxis (0, -103.5), since it did not fit with the real life 111 112 a posteriori analysis linear model, wanted to justify the model but was confused when the phenomenon was taken into account, since the symbolic expression of the linear model did not fit the reality (the intersection should be 0). The student reflected about the technique and realized that TI-Nspire did not take the phenomenon into account. The example illustrates how the students made a connection between the model f 2 FL (Q3 ) and the evaluation of the model (Q4 ). 
10.2.2.3 The technological-theoretical moment It turned out that the students were able to made qualitative arguments when choosing between a non-linear and linear class of functions, but had some trouble including the phenomenon to choose between the non-linear classes of functions (FE and FP ). In exercise 10, it turned out that the students only used the two techniques ⌧Qr2 and ⌧PS (see the a priori analysis). The students missed to use the properties of exponential and power function (technique 3 in the a priori analysis), hence they could not include the phenomenon of infant mortality in their arguments. The students’ arguments were like A1 and A2 in the a priori analysis. For instance, the students argued; From the data it is difficult to decide if it is a power or exponential function. But in this case I choose the exponential function f(x) = b · ax The observations and solutions indicate that the students were a bit confused that both models fit the data very well and wanted answers to which were best, however they realized, after discussing in the class, that more models can fit data. It is worth noting that the students were able to include the phenomenon in task 10d, which indicate that the students were able to relate to the phenomenon. Thus it is not possible to conclude, if the students did not considered the phenomenon in the first task (10a) or if they were unable to include the phenomenon in their arguments, because their knowledge about the properties of functions was insufficient. Dealing with the quality of the model in task 10d the students were able to discuss the predicted value and did rather well arguments for the restricted range of the model. For instance, one student argued: When we look at the model at our graph, then we can see that it decrease fast the years we have get. And we cannot expect that it will continue to decrease, so we have to consider the model against the reality. What realistically will happen. And it cannot continue to decrease all the time. Thus it be correct that the mortality rate has been 9.9 and not 2.4 [14] The students’ solutions to exercise 11 illustrate as well, that the students linked the data with the phenomenon and included the knowledge of run to justify their choice of model. For instance, one student argued; I will use the power function, since you can not run with the same 10.2 analysis of the realized teaching sequence speed in 42km. Even through the coefficient of determination are higher for the linear function [A.6]. Several of the arguments included the phenomenon and were similar to argument A5 in the a priori analysis. Based on the students’ solutions and the observations I infer that the students learn that the class of functions cannot be determined solely by the coefficients of determination. It was obvious that the students had no trouble to include the phenomenon in this case, since the two models look very different (very different properties). Few students concluded that none of the models could be used to model the phenomenon as it really occurs. One students explained: No, you did not move with the same speed. I will say that when running you start with faster speed and then you finished and then you finish with a spurt. So I would not actually say any of them [16]. This shows that the students realized the real world complication and that no model can describe real life perfect. 
The technical work and the technological-theoretical questioning result in that the students realized that no mathematical criteria of best class of functions can be made and to determine the best class of functions, the phenomenon and the purpose with the model has to be considered. In the students’ study process of Q1 , I observed that the students went forth and back between the four questions (Q1 -Q4 ) in the ERM and were able to establish an interrelation between the different praxeologies to answer Q1 . As should be clear from the analysis the students’ work with Q1 had the intended effect. The students used the evaluation of the models (Q4 ) and the phenomena to answer Q1 and links Q1 and Q4 . The students were able to determine the class of functions, but had trouble including the properties of the functions and the phenomenon to select between the non-linear models. The students rather well explain the choice between linear and non-linear models (Aim 3 of the design). Further the students were able to reflect about the range of the model (Aim 4 of the design). 113 11 DISCUSSION 11.1 redesign After analyzing the teaching, an interesting question is: What should be done different next time the design is used? In the following, suggestions to improvements of the design will be discussed. Next time the design will be used, it is important that more time is available. It was obvious that there was not enough time to establish the knowledge of sum and mean and further to solve the task tM . Another time, the students should be introduced to the sum notation and work with sums and mean before working with regression, such that the rationale of learning the techniques will be higher. In the design the praxeologies were only short visited. The praxeologies could be taught in relation to descriptive statistics. In addition, inclusion of more tasks like task 3j, where the tasks are solved by applying the rules of sums will hopefully make the technical work with task tM easier. In a redesign exercise 3 + 4 should be ommited. The knowledge of displacement turned out to be not to well established because the graphic representation failed (figure 24). Thus in a redesign, the graph window has to be fixed (figure 29). The window can be fixed by set the options of the coordinate system (the interval of the x and y variable). The exercise should be altered to include the options. For instance 5c should be: Make a plot of xk and yk - y. Adjust the coordinate system such that x 2 [-5, 20] and y 2 [-40; 280]. Explain what happens with the location of the data relative to the plot in 5a). Fixing the graph window should let the students visualize the displacement and be sufficiently to justify the connection between displacement and the symbolic expression of the line. In addition, it will be advantageous to use the knowledge of displacement to reconstruct the students praxeologies about regression. Several students know the technique to transform the independent variable when making regression (for instance from year to years after 2007) without knowing the technology. Thus, implementing the following exercise, the students should be able to justify the technique 115 116 discussion (a) Original data set (b) Vertical displacement (c) Centralized data set Figure 29: Plots of the displacement, when setting the graph window. The plots clearly show how the location of the data changes, while the mutual location are similar. of displacement. 
The exercise could be: The table shows the number of traffic casualty in the first half-year of the years 2007-2012 from the Roads Directorate. Year 2007 2008 2009 2010 2011 2012 Traffic casualty 195 190 161 110 107 82 Source: FDM It is assumed, that the development in the number of traffic casualty can be described by a linear model. a) Use the data of the table to determine a and b in the linear model y = ax + b, where y is the number of traffic casualty in the first half-year and x is the year b) Use the data of the table to determine a and b in the linear model y = ax + b, where y is the number of traffic casualty in the first half-year and x is years after 2007 c) Compare the models and explain the meaning of the parameter b in the two models. In the design, the students should work with task tM by themselves and this should not be changed in a redesign, since I believe the students are able to solve the subtasks, if the task and notation are throughly introduced (and the knowledge about sum and mean are established). Thus next time the task will be solved, the introduction should be altered. First, the teacher should explain the symbols and notations included in the tasks to ensure that the students are aware of the notation used. Second, the teacher has to problematize the task of minimizing the function S(a, b) and let the students discuss how to solve this problem. First after the students have discussed how to solve the problem, the technique should be presented. The intro- 11.2 the rationale for teaching more theoretically duction to the notation should help the students with the subtasks, especially 2a and 2b, where it turned out that the notation were the main obstacles. Furthermore the problematization of the task should help the students to connect the subtasks. As described in the analysis, I had not planned a moment of evaluation in the design. If a moment of evaluation should be included it could be done by recalling the students’ criteria from the first encounter and investigate to what extent these first incomplete techniques fit the general technique. Another possibility it to let the students work with concrete data, where the techniques had some limitations, for instance data set with outliers. The last change I would made is related to tasks of type TR , since it turned out the students had trouble choosing between the class of exponential and power functions. The new exercise should include data, for which the students have the conditions to include the phenomenon to make qualitative arguments for their choice. An example of such a phenomenon is a car’s braking distance. A car’s braking distance depends on the speed and can be modeled with a power model (f(x) = k · x2 ). Using a phenomenon like that, the students should be able to make qualitative arguments for their choice. Maybe the students know that double the speed quadruple the braking length, else they can found the information online. 11.2 the rationale for teaching more theoretically Investigation of the curricula showed that the students only have to learn to apply regression and for that reason it is interesting to consider the question: Why teaching more theoretically in regression? In the following, reasons to teach the technological-theoretical discourse of Q3 will be discussed. Consider the rationale to teach the technological-theoretical discourse of regression is closely related to what the purpose with mathematics in high school should be and what the curricula reasonable can and should provide. 
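As a quick illustration of why the braking-distance phenomenon serves this purpose, here is a minimal sketch (not part of the designed material), assuming data of the form $f(x) = k\cdot x^2$ with an arbitrary $k$:

```python
# Illustrative sketch only: data of the form f(x) = k*x^2 (here with a made-up
# k = 0.008) gives an exponent close to 2 under power regression, and doubling
# the speed quadruples the braking length.
import numpy as np

speed = np.array([30, 50, 70, 90, 110, 130], dtype=float)   # km/h, illustrative values
braking = 0.008 * speed ** 2                                 # metres, made-up constant k

a, ln_b = np.polyfit(np.log(speed), np.log(braking), 1)      # power regression via log-log
b = np.exp(ln_b)
print(a, b)                            # exponent close to 2, b close to k
print((b * 100 ** a) / (b * 50 ** a))  # close to 4: double the speed, quadruple the length
```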
The development of CAS has in the recent years had an influence on the content of mathematics and the way mathematics is taught. A major issue that are emerged in relation with the development of CAS is the question about in which extent the students have to be presented for the theory behind the instrumented technique and what the reason to present the knowledge should be. In the teaching sequence it turned out that the students had rather well ideas of the best line and the question of "best line" seems to 117 118 discussion motivate the students to search for an answer. Based on the analysis I conclude that the technological discourse can be learned without major troubles and that the discourse can be reach at all level. Nevertheless, it can be discussed whether it is important for the students to learn the discourse, since the students can make the line without the knowledge. However, the knowledge is relevant when evaluate models and when the students treat problems in other disciplines. The a posteriori analysis also shows, how a student was able to justify the model and be aware of the limitation with the technique, because the students knew the discourse of the technique. Further it let the students reflect and consider the problem of linear regression and make the students able to justify the instrumented technique. The a posteriori analysis concerning the theoretical discourse highlights didactic restrictions coming from different levels. It turned out that the obstacles were not related to the subject-theme level of regression, but came from other subject-theme levels and higher level. Thus, teaching more theoretical in regression cannot be restricted to the subject-theme level of regression, but will require a reconstruction of other themes, for instance sums, means and transformation. For that reason it will not be simple to include the discourse in the teaching and it can be discussed whether the theory helps the students to reach the aims in the curricula. In addition, it can be difficult to motivate the work, since the students never will use the formula. Nonetheless, the work can be justified, since it let the students connect different techniques coming from different themes and make it possible to reconstruct praxeologies, for instance about optimization. Furthermore, students learn to make mathematical reasoning and the task illustrate how mathematics problems are solved by simplification. In short, rather well arguments for including both the technological and the theoretical discourse can be made, however teaching the technological discourse can be argued to be sufficient to avoid the black box effect of regression. 12 CONCLUSION The aim of the thesis was to examine the bodies of knowledge regarding regression and transform this knowledge to design a teaching sequence investigating the possibilities to work more theoretically with one-dimensional regression to avoid the black box effect. The analysis of the scholar knowledge showed that two technological discourses and three theoretical discourses exist about linear regression, but only the technological discourse about minimization and the theoretical discourse of elementary algebra are based on mathematics well known to students in high school. The other discourses require knowledge that goes beyond the mathematics presented in the curricula and consequently were not relevant in the design of the teaching sequence. 
Analyzing the knowledge to be taught showed that the students only is presented for a minimum of the technologicaltheoretical discourses and reading the curricula showed that the students only have to learn to apply regression. The analysis of the external didactical transposition resulted in four main questions which constitute the ERM. The ERM had a double function in the thesis; it was basis for the design of the teaching sequence and it enabled to make a priori and a posteriori analysis. In the teaching sequence two of the questions were taught: What do we mean by best linear model? What class of functions in Fi is suitable to model the data? The analysis of the teaching highlighted some interesting possibilities and obstacles in relation to teach more theoretically with regression. The teaching showed that it was possible to let the students consider the question of best linear model, and that the students had a sensible basis to reach the technological discourse. The technical work to reach the theoretical discourse turned out to be a challenge, where the didactic restrictions coming from levels beyond the subject-theme were the main obstacles. Especially the rules of sum and the algebraic notation was found to be obstacles in the students’ didactic process. The work with the question of best class of functions appeared to be based on praxeologies the students had. The students were able to made qualitative arguments when choosing between non-linear and linear classes of functions, but had some trouble including the phenomenon to choose between the non-linear classes of functions. 119 120 conclusion Discussing the rationale for teaching the technological-theoretical discourse of linear regression rather well arguments for including both the technological and the theoretical discourse can be made, however teaching the technological discourse can be argued to be sufficient to avoid the black box effect of regression. BIBLIOGRAPHY Arzarello, F., Bosch, M., Gascón, J., and Sabena, C. (2008). The ostensive dimension through the lenses of two didactic approaches. ZDM, 40(2):179–188. Barbé, J., Bosch, M., Espinoza, L., and Gascón, J. (2005). Didactic restrictions on the teacher’s practice: The case of limits of functions in spanish high schools. In Beyond the apparent banality of the mathematics classroom, pages 235–268. Springer. Bosch, M. and Gascón, J. (2006). Twenty-five years of the didactic transposition. ICMI Bulletin, 58:51–63. Bosch, M. and Gascón, J. (2014). Introduction to the anthropological theory of the didactic (atd). In Networking of Theories as a Research Practice in Mathematics Education, pages 67–83. Springer. Brydensholt, M. and Ebbesen, G. R. (2010). Lærebog i matematik: Bind 1. Systime, 1. edition. Brydensholt, M. and Ebbesen, G. R. (2011). Lærebog i matematik: Bind 2. Systime, 1. edition. Dictionary.com (2015). Regression. Retrieved 22/1 2015 at http:// dictionary.reference.com/browse/regression?s=t. Ditlevsen, S. and Sørensen, H. (2009). En introduktion til statistik, chapter 6: Lineær regression, pages 85–109. Københavns universitet: Institut for matematiske fag. Drijvers, P. and Gravemeijer, K. (2005). Computer algebra as an instrument: Examples of algebraic schemes. In The didactical challenge of symbolic calculators, pages 163–196. Springer. Garcia, F. J., Pérez, J. G., Higueras, L. R., and Casabó, M. B. (2006). Mathematical modelling as a tool for the connection of school mathematics. ZDM, 38(3):226–246. Grimmett, G. R. and Stirzaker, D. R. (2001). 
Probability and Random Processes, chapter 7.9: Prediction and conditional expectation, pages 343–346. Oxford University Press, 3. edition. Grøn, B. (2004). Evalueringsrapport: Matematik studentereksamen og hf. Grøn, B., Felsager, B., Bruun, B., and Lyndrup, O. (2013). Hvad er matematik? A. Lindhardt and Ringhof, 1. edition. Grøn, B., Felsager, B., Bruun, B., and Lyndrup, O. (2014). Hvad er matematik? C. Lindhardt and Ringhof, 2. edition. Gyöngyösi, E., Solovej, J. P., and Winsløw, C. (2011). Using CAS based work to ease the transition from calculus to real analysis. In Pytlak, M., Rowland, T., and Swoboda, E., Proceedings of CERME, 7:2002–2011. Kachapova, F. and Kachapov, I. (2010). Orthogonal projection in teaching regression and financial mathematics. Journal of Statistics Education, 18(1):1–18. Katz, V. J. (2009). A History of Mathematics: An Introduction, chapter 23: Probability and Statistics in the Nineteenth Century, pages 818–832. Addison-Wesley, 3rd edition. Key, E. S. (2005). A painless approach to least squares. College Mathematics Journal, pages 65–67. Khanacademy (2014). The least squares approximation for otherwise unsolvable equations (video). Retrieved 11/12 2014 at https://www.khanacademy.org/math/linear-algebra/alternate_bases/orthogonal_projections/v/linear-algebra-least-squares-approximation. Kro, A. (2003). Funksjoner af flere variable. Introduktion til Matematik. Københavns Universitet: Institut for Matematiske Fag. Lawson, A. E. (2009). Teaching inquiry science in middle and secondary schools, chapter 5-7, pages 81–129. Sage. Miller, S. J. (2006). The method of least squares. Mathematics Department, Brown University, Providence, RI, 2912:1–7. Nielsen, J. A. (2015). Regression med mindste kvadraters metode. LMFK, (2):28. Petersen, P. B. and Vagner, S. (2003). Studentereksamensopgaver i matematik 1806-2000. Matematiklærerforeningen. Rodríguez, E., Bosch, M., and Gascón, J. (2008). A networking method to compare theories: metacognition in problem solving reformulated within the anthropological theory of the didactic. ZDM, 40(2):287–301. Schomacker, G., Clausen, F., and Tolnø, J. (2010). Gyldendals Gymnasiematematik C Grundbog. Gyldendal Uddannelse, 2. edition. Siemsen, E. and Bollen, K. A. (2007). Least absolute deviation estimation in structural equation modeling. Sociological Methods & Research, 36(2):227–265. STXBekendtgørelsen (1999). Bilag 23 - Matematik B, 1999. Retrieved 15/1 2015 at http://fc.silkeborg-gym.dk/fag/matematikweb/fagbilag.htm. STXBekendtgørelsen (2013). BEK nr 776, Bilag 36 - Matematik B, 2013. Retrieved 15/1 2015 at https://www.retsinformation.dk/Forms/R0710.aspx?id=152507#Bil36. Svendsen, J. (2009). Matematiklærerens forberedelse. Observationer af samspillet mellem gymnasielærerens forberedelse og undervisning. Master's thesis, Institut for Naturfagenes Didaktik, Københavns Universitet. Thomas, R. (1996). Proto-mathematics and/or real mathematics. For the Learning of Mathematics, pages 11–18. Undervisningsministeriet (24 May 2013). Matematik B - Studentereksamen. Van Nes, F., Abma, T., Jonsson, H., and Deeg, D. (2010). Language differences in qualitative research: is meaning lost in translation? European Journal of Ageing, 7(4):313–316. Waterhouse, W. C. (1990). Gauss's first argument for least squares. Archive for History of Exact Sciences, 41(1):41–52. Weisstein, E. W. (2015). Least squares fitting. MathWorld - A Wolfram Web Resource. Retrieved 23/1 2015 at http://mathworld.wolfram.
com/LeastSquaresFitting.html. Winsløw, C. (2003). Semiotic and discursive variables in cas-based didactical engineering. Educational Studies in Mathematics, 52(3):271– 288. Winsløw, C. (2011). Anthropological theory of didactic phenomena: Some examples and principles of its use in the study of mathematics education. Un panorama de la TAD. An overview of ATD. CRM Documents, 10:533–551. Winsløw, C. (2013). Didaktiske elementer. En indføring i matematikkens og naturfagenes didaktik. Biofolia, 4. edition. Winsløw, C. (2015). Fodgængerversion af lineær regression. LMFK, (1):21. 123 Part III APPENDIX A APPENDIX a.1 the tasks bedste rette linje Opgave 1: Tabellen viser sammenhørende værdier af alder og længde for en population af spækhuggere. Alder 1 2 3 4 5 6 7 8 9 Cm 310 365 395 424 440 500 578 580 582 a) Lav et plot af punkterne. Kan gøres ved kommandoerne 3: Data, 9: Hurtiggraf. Klik på akserne for at tilføje de to variable. b) Indtegn den linje, du synes passer bedst til punkterne. Linjen tegnes ved hjælp af kommandoen 4: Undersøg data, 2: Tilføj flytbare linjer. Hvilken forskrift har linjen? c) Begrund dit valg af linje: hvorfor valgte du netop den linje i b)? Prøv at formulere din begrundelse på en måde, så du kan være sikker på, at andre ville finde frem til præcis samme linje som dig, hvis de hørte din begrundelse. Opgave 2: På en skole har man undersøgt alder og højde for nogle børn. De sammenhørende værdier mellem alder og højde kan ses i tabellen. I kan hente tabellen på lectio. Alder Højde Alder Højde Alder Højde 6 108 9 150 12 148 6 123 10 132 13 143 7 123 10 143 13 169 7 138 10 150 14 148 8 128 11 144 14 167 8 139 11 156 15 148 9 122 12 141 15 177 127 128 list of references a) Lav et plot af punkterne. b) Indtegn den linje, du synes passer bedst til punkterne. Hvilken forskrift har linjen? c) Begrund dit valg af linje: hvorfor valgte du netop den linje i b)? Prøv at formulere din begrundelse på en måde, så du kan være sikker på, at alle vil finde frem til præcis samme linje som dig, hvis de hører din begrundelse. Er dine begrundelser de samme som i forrige opgave? Hvorfor/hvorfor ikke? De næste spørgsmål skal besvares i grupperne. d) Sammenlign forskrifterne for den bedste linje i opg. 1 + 2. Har I fået den samme? Diskuter jeres overvejelser og begrundelser fra 1c) og 2c). e) Diskuter i gruppen hvad I forstår ved den bedste rette linje. Find kriterier for den bedste linje. Summer Man benytter en særlig notation til at opskrive summer komP pakt ved at bruge et specielt symbol, nemlig sumtegnet . Dette tegn er det græske bogstav sigma. For eksempel kan vi skrive summen af fire tal som 4 X xk = x1 + x2 + x3 + x4 k=1 Indekset k = 1 og 4 angiver at vi skal sætte 1, 2, 3, 4 ind på k’ plads. Mere generelt gælder at n X xk = x1 + x2 + x3 + . . . + xn k=1 hvor n er et naturligt tal. Tallet n betyder at vi skal summe op til xn . Vi får derfor n led. I det første eksempel var n = 4, så vi summede 4 led. Herunder ses eksempler på, hvordan man regner med summer: 3 X 4 = 4 + 4 + 4 = 12 k=1 5 X k=1 k = 1 + 2 + 3 + 4 + 5 = 15 list of references Opgave 3 Udregn summerne. Skriv udregningen til højre på papiret. Skriv resultatet i tabellen. P a) Udregn 4k=1 2 b) Udregn c) Udregn d) Udregn e) Udregn P4 k=1 3 P4 2 k=1 k P4 k=1 4k P4 k=1 5k f) Udregn 5 g) Udregn h) Udregn P4 k=1 2 2 P4 k=1 k P4 k=1 (2 + 5k) P4 2 k=1 (5k + k ) P4 k=1 3 P4 2 k=1 k P4 k=1 4k P4 2 k=1 5k 5 P4 k=1 k P4 k=1 (2 + 5k) P4 2 k=1 (5k + k ) i) Betragt summerne og dine resultater i tabellen. 
Redegør for sammenhængen mellem nogle af summerne og formuler disse mere generelt. j) Udregn når det oplyses at P100 k=1 100 X (2k2 + k) k=1 k2 = 338350 og Regneregler for summer Vi kan summe en konstant n X k=1 P100 k=1 k = 5050 (1) c = n·c Man kan sætte en konstant ude for summationstegnet n X k=1 c · xk = c n X k=1 xk (2) 129 130 list of references Man kan dele summer op n X (xk + yk ) = k=1 n X (xk - yk ) = k=1 n X k=1 n X k=1 xk + xk - n X k=1 n X yk (3) yk (4) k=1 Opgave 4 I skal nu arbejde videre med datasættet fra opgave 2 med alder og højde. a) Beregn gennemsnittet af alderen i datasættet fra opgave 2. Gør rede for hvad dette tal betyder. b) Beregn gennemsnittet af højden i datasættet fra opgave 2. Gør rede for hvad dette tal betyder. Symbolet x bruges til at beskrive gennemsnittet af x-værdier (x1 , x2 , . . . , xn ) i et datasæt. Tilsvarende bruges y til at beskrive gennemsnittet af y-værdier (y1 , y2 , . . . , yn ). Eksempelvis kan vi skrive gennemsnittet af 1, 2, 3, 4, 5 som 5 1+2+3+4+5 1X x= = k 5 5 k=1 c) Opstil en generel formel til at udregne gennemsnittet (x) af xværdierne (x1 , x2 , . . . , xn ) ved brug af sumtegn. d) Opstil en generel formel til at udregne gennemsnittet (y) af yværdierne (y1 , y2 , . . . , yn ) ved brug af sumtegn. Opgave 5 a) Plot datasættet fra opgave 2. Indtegn linjer, der viser gennemsnittet af x og y. Linjen, der viser gennemsnittet af x indtegnes ved 4: Undersøg data, 8: Plot værdi. Linjen, der viser gennemsnittet af y indtegnes ved 4: Undersøg data, 4: Plot funktion. Funktionen, der skal plottes, er y = y. b) Bestem den bedste rette linje for datasættet plottet i a). Skriv forskriften for linjen list of references c) Tilføj en kolonne til datasættet, hvor du udregner yk - y for hvert punkt i datasættet. Lav et plot af xk og yk - y. Forklar hvad der er sket med placeringen af datasættet i forhold til plottet i a). d) Bestem den bedste rette linje for datasættet plottet i c). Skriv forskriften for linjen P e) Udregn n k=1 (yk - y). f) Tilføj en ny kolonne til datasættet, hvor du udregner xk - x for hvert punkt i datasættet. Lav et plot af xk - x og yk - y. Forklar hvad der er sket med placeringen af datasættet i forhold til plottet i a) og c). g) Bestem den bedste rette linje for datasættet plottet i f). Skriv forskriften for linjen h) Sammenlign forskrifterne bestemt i b), d) og g). Gør rede for sammenhængen mellem forskrifterne P i) Udregn n k=1 (xk - x). j) Vis, ved brug af regnereglerne for summer og formlen for gennemsnittet (fra opg. 4c), at det altid gælder at n X (xk - x) = 0 k=1 k) Vis, ved brug af regnereglerne for summer og formlen for gennemsnittet (fra opg. 4d), at det altid gælder at n X (yk - y) = 0 k=1 Bestemme bedste rette linje for et centraliseret datasæt I skal bestemme den bedste rette linje y = ax + b for et datasæt (x1 , y1 ), (x2 , y2 ), . . . , (xn , yn ), bestående af n punkter, hvor n > 2. Vi antager at datasættet er centraliseret, dvs. at n n 1 X 1 X x= xk = 0 og y = yk = 0 n n k=1 k=1 Vi skal bestemme a og b, så kvadratsummen ◆2 n ✓ X yk - axk + b k=1 131 132 list of references bliver mindst mulig. Trin 1) Det første trin består i at omskrive summen, så at vi kan minimere denne i trin 2. 2 a) Gør rede for at yk - (axk + b) = y2k - 2axk yk - 2byk + a2 x2k + 2abxk + b2 P P 2 2 b) Vi har at n = n k=1 yk - (axk + b) k=1 yk - 2axk yk - 2byk + a2 x2k + 2abxk + b2 . Brug regnereglerne for summer (1-4) til at dele summen op, sætte a, b foran summationen og summe over b. 
c) Brug antagelsen om at gennemsnittet af x og y er 0 til at simplificere udtrykket fra b). Trin 2) I trin 1) fandt vi at ◆2 n ✓ n n n X X X X yk - axk + b = a2 x2k - 2a xk yk + yk + nb2 k=1 k=1 k=1 k=1 Vi skal nu finde de værdier for a og b, som minimere kvadratsummen. Summen n n n X X X 2 2 a xk - 2a xk yk + yk + nb2 k=1 k=1 k=1 kan vi skrive som a2 A - 2aB + C + nb2 a) Gør rede for hvad A, B, C svarer til i summen og forklar hvorfor A, B, C er konstanter. I udtrykket er A, B, C konstanter, som ikke afhænger af de to variable a, b. b) Udtrykket a2 A - 2aB + C + nb2 kan betragtes som en sum af to andengradspolynomier med hhv. a og b som variable. Skriv de to andengradspolynomier som udtrykket består af. c) Bestem a og b, så a2 A - 2aB + C + nb2 bliver mindst mulig. Brug fra b) at udtrykket kan betragtes som to andengradspolynomier, som hver skal minimeres. d) Opskriv værdierne for a og b, som du fandt i c). Disse værdier gør P 2 kvadratsummen n mindst mulig. k=1 yk - axk + b Opgave 6 Betragt datasættet fra opgave 2 med alder og højde. list of references a) Brug formlen for a og b til at bestemme den bedste rette linje. Benyt dine udregninger fra i går, hvor du udregnede yk - y, xk - x for alle punkter. b) Bestem forskriften for datasættet i TI-Nspire. Er forskriften ens? Opgave 7: Verdensrekorder Tabellen nedenfor viser tiderne for verdensrekorderne i maratonløb (målt i sekunder) for mænd i perioden 1998-2007. År 1998 1999 2002 2003 2007 Tid (sek) 7565 7542 7538 7495 7466 Udviklingen af verdensrekorderne kan beskrives ved en lineær model f(x) = ax + b, hvor x angiver antal år efter 1998 og f(x) verdensrekorden i sekunder. a) Bestem a og b i modellen b) Hvad kan vi forvente at verdensrekorden bliver i 2014 ifølge modellen? c) Verdensrekorden blev i 2014 slået, så den nu er på 7377. Hvordan passer det med b)? Benyt oplysningen til at kommentere på modellens rækkevidde. Opgave 8: Dykkersyge Hvis en dykker opholder sig et (længere) stykke tid på en vis dybde, skal han stige langsomt op til overfladen igen for at forhindre dykkersyge, dvs. at der opstår livsfarlige bobler af kuldioxyd i blodet. Der er en sammenhæng mellem dybde og det antal minutter, som dykkerne kan opholde sig på den pågældende dybde uden at få dykkersyge. Den kan ses i følgende tabel x (m) 10 12 14 16 18 20 22 25 ... 40 42 y (min) 219 147 98 72 56 45 37 29 ... 9 8 I en model antages det, at dybden og tiden kan beskrives med en potens funktion. a) Opstil en model, der beskriver sammenhængen mellem dybde og antal minutter. Bestem forskriften for modellen. b) Benyt modellen til at udregne hvor længde dykkeren kan opholde sig på en dybde på 30m. 133 134 list of references c) Benyt modellen til at udregne den dybde, dykkeren kan dykke, hvis han gerne vil dykke så dybt som muligt og opholde sig på dybden i 20min. Opgave 9: Vindmølleenergi Figuren herunder viser udviklingen på verdensplan i kapaciteten af vindmølleenergi. a) Opstil en model, der beskriver sammenhængen mellem årstal og kapaciteten af vindmølleenergi. Bestem forskriften for modellen. b) Benyt modellen til at beregne kapaciteten af vindmølleenergi i 2014 og sammenlign med den faktiske kapacitet af vindmølleenergi i 2014, som var 369.6 MW. Opgave 10: Spædbørnsdødelighed Spædbørnsdødeligheden angiver, hvor mange promille af de levendefødte, der dør inden for det første leveår. Tabellen nedenfor er hentet fra Danmarks Statistik og den viser udviklingen i spædbørnsdødeligheden i perioden 1933-1973. 
Årstal 1933 1938 1943 1948 1953 1958 1963 1968 1973 Spædbørnsdødelighed 71.4 60.0 48.4 37.8 27.4 23.0 19.6 15.4 11.5 a) Lav et plot af datasættet. Indfør passende variable og opstil en model, der beskriver sammenhængen mellem årstal og spædbørnsdødeligheden i perioden 1933-1973. Forklar dit valg af model. b) Bestem parameterne i modellen fra a). list of references c) Forudsig på baggrund af modellen hvor stor en promille af de levendefødte, der dør inden for det første leveår i 2008. d) Det oplyses at spædbørnsdødeligheden i 2008 var 9.9 promille. Hvordan passer det med modellen? Benyt oplysningen til at kommentere modellens rækkevidde. Opgave 11: Løb Tabellen viser sammenhørende værdier af løbedistance og den tid, der tager en kondiløber at løbe distancen hurtigst muligt. Løbedistance (km) 0.5 1 3 5 7 10 15 20 25 Løberens tid på distancen (sek) 98 215 779 1364 1989 2869 4341 5904 7413 På baggrund af disse data vil kondiløberen gerne udregner hvor lang tid det vil tage at løbe et maraton (42.195km). Nogle eksperter mener at kondiløberens tid som funktion af distancen kan beskrives ved en lineær model f(x) = ax + b, hvor x er distancen (målt i km) og f(x) er kondiløberens tid (målt i sekunder). a) Bestem a, b i den lineære model og brug denne model til at udregne hvor lang tid kondiløberen vil være om at løbe et maraton. Andre eksperter mener at kondiløberens tid som funktion af distancen skal beskrives ved en potens model f(x) = b · xa , hvor x er distancen (målt i km) og f(x) er kondiløberens tid (målt i sekunder). b) Bestem a, b i potens modellen og brug denne model til at udregne hvor lang tid kondiløberen vil være om at løbe et maraton. c) Er kondiløberens tid på et maraton den samme ved de to modeller? d) Hvilken model ville du bruge til at bestemme kondiløberens tid? Skriv begrundelserne ned. Diskuter dine begrundelser med gruppen Opgave 12: Trafikulykker Hvert år opgøres antallet af trafikulykker i Danmark. Antallet af trafikulykker i Danmark i årene 1979-1999 kan ses i tabellerne herunder. I 1995 blev antallet af trafikulykker ikke opgjort, derfor mangler der data for dette år. Datasættet ligger på lectio. 135 136 list of references År Antal ulykker År Antal ulykker År Antal ulykker 1979 9267 1986 8301 1993 8427 1980 8477 1987 7357 1994 8105 1981 8546 1988 7321 1995 ... 1982 8427 1989 7266 1996 8672 1983 8105 1990 9267 1997 8301 1984 8523 1991 8477 1998 7357 1985 8672 1992 8546 1999 7321 a) Bestem på baggrund af tabellen hvor mange trafikulykker der ca. har været i 1995. Skriv din model, dine overvejelser om valg af model samt dine udregninger. b) Forudsig på baggrund af tabellen antallet af trafikulykker i 2016. Skriv din model, dine overvejelser om valg af model samt dine udregninger. Opgave 13: Kaniner på en øde ø På en øde ø uden kaniner blev der sat 50 vilde kaniner ud 1. januar 2000. I de efterfølgende år talte man hvert år antallet af kaniner på øen d. 1. januar. Tabellen herunder viser hvordan antallet af vilde kaniner på øen vokser med tiden, målt i år efter 2000. Datasættet ligger på lectio. År Antal kaniner År Antal kaniner 1 100 8 17030 2 240 9 26010 3 450 10 30070 4 1000 11 32500 5 2000 12 33810 6 4100 13 33890 7 9050 14 33870 a) Hvilken sammenhæng er der mellem tiden og antal kaniner. b) Bestem ud fra data i tabellen hvor mange kaniner der har været på øen 1. juli 2006. Skriv din model, udregninger og overvejelser. c) Bestem ud fra data i tabellen hvor mange kaniner der vil være på øen 1. januar 2016 Skriv din model, udregninger og overvejelser. 
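As a supplement to the tasks in Section A.1, the following minimal Python sketch computes the best line for the marathon data in Opgave 7 with the formulas obtained by centralizing the data set, a = sum((xk - xbar)(yk - ybar)) / sum((xk - xbar)^2) and b = ybar - a*xbar, as in the solution to Opgave 6. The variable names are illustrative only; the output should agree with the model f(x) = -10.626x + 7561.58 reported in the solutions in Section A.2.

# A minimal sketch: the least-squares line for the marathon world records
# from Opgave 7, with x = years after 1998 and y = time in seconds.
x = [0, 1, 4, 5, 9]
y = [7565, 7542, 7538, 7495, 7466]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Centralize the data and apply the formula derived in the worksheet.
a = sum((xk - xbar) * (yk - ybar) for xk, yk in zip(x, y)) / sum((xk - xbar) ** 2 for xk in x)
b = ybar - a * xbar

print(a, b)  # approximately -10.626 and 7561.58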
list of references a.2 the tasks with solutions 137 Bedste rette linje Opgave 1: Tabellen viser sammenhørende værdier af alder og længde for en population af spækhuggere. Alder 1 2 3 4 5 6 7 8 9 Cm 310 365 395 424 440 500 578 580 582 a) Lav et plot af punkterne. Kan gøres ved kommandoerne 3: Data, 9: Hurtiggraf. Klik på akserne for at tilføje de to variable. b) Indtegn den linje, du synes passer bedst til punkterne. Linjen ved hjælp af kommandoen 4: Undersøg data, 2: Tilføj flytbare linjer. Hvilken forskrift har linjen? Mange muligheder for svar. Min linje: f (x) = 37.1x + 281 c) Begrund dit valg af linje: hvorfor valgte du netop den linje i b)? Prøv at formulere din begrundelse på en måde, så du kan være sikker på, at andre ville finde frem til præcis samme linje som dig, hvis de hørte din begrundelse. Forskellige begrundelser afhængigt af hvordan eleverne har lagt linjen. Beskrevet i opgave 2e) Opgave 2: På en skole har man undersøgt alder og højde for nogle børn. De sammenhørende værdier mellem alder og højde kan ses i tabellen. I kan hente tabellen på lectio. Alder 6 6 7 7 8 8 9 Højde 108 123 123 138 128 139 122 Alder 9 10 10 10 11 11 12 Højde 150 132 143 150 144 156 141 Alder 12 13 13 14 14 15 15 Højde 148 143 169 148 167 148 177 a) Lav et plot af punkterne. b) Indtegn den linje, du synes passer bedst til punkterne. Hvilken forskrift har linjen? Mange muligheder for svar. Min linje: f (x) = 4.76x + 92 c) Begrund dit valg af linje: hvorfor valgte du netop den linje i b)? Prøv at formulere din begrundelse på en måde, så du kan være sikker på, at alle vil finde frem til præcis samme linje som dig, hvis de hører din begrundelse. Er dine begrundelser de samme som i forrige opgave? Hvorfor/hvorfor ikke? Forskellige begrundelser afhængigt af hvordan eleverne har lagt linjen. Beskrevet i opgave 2e) De næste spørgsmål skal besvares i grupperne. d) Sammenlign forskrifterne for den bedste linje i opg. 1 + 2. Har I fået den samme? Forklar hver af jeres overvejelser og begrundelser fra 1c) og 2c).ă Forskrifterne vil være forskelligt fra gruppe til gruppe. Fælles for grupperne vil være at de har fundet forskellige forskrifter, som de skal begrunde. e) Diskuter hvad I forstår ved den bedste rette linje. Find kriterier for den bedste linje. P • Minimere den horisontale afstand nk=1 (xk x) P • Minimere den vertikale afstand nk=1 (yk y) • Minimere den vinkelrette afstand • Minimere den vertikale afstanden numerisk Pn |yk Pn k=1 • Minimere kvadratet af de ovenstående afstande • Lige mange punkter på hver side af linjen y| k=1 (yk y)2 • Skal gå igennem den mindste og største værdi • Skal gå igennem gennemsnittet af værdierne • Skal gå igennem to punkter • Fastlægge skæringen (b-værdien) hvis vi kender denne Summer Man benytter en særlig notation til Pat opskrive summer kompakt ved at bruge et specielt symbol, nemlig sumtegnet . Dette tegn er det græske bogstav sigma. For eksempel kan vi skrive summen af fire tal som 4 X xk = x1 + x2 + x3 + x4 k=1 Indekset k = 1 og 4 angiver at vi skal sætte 1, 2, 3, 4 ind på k’ plads. Mere generelt gælder at n X xk = x1 + x2 + x3 + . . . + xn k=1 hvor n er et naturligt tal. Tallet n betyder at vi skal summe op til xn . Vi får derfor n led. I det første eksempel var n = 4, så vi summede 4 led. Herunder ses eksempler på, hvordan man regner med summer: 3 X 4 = 4 + 4 + 4 = 12 k=1 5 X k=1 k = 1 + 2 + 3 + 4 + 5 = 15 Opgave 3 Udregn summerne. Skriv udregningen til højre på papiret. Skriv resultatet i tabellen. 
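Before the worked solutions to Opgave 3 below, a minimal Python sketch that evaluates some of the sums directly and illustrates rules (2) and (3) for sums numerically; it uses only the sums stated in the worksheet.

# A minimal sketch: direct evaluation of sums from Opgave 3.
K = range(1, 5)  # k = 1, 2, 3, 4

print(sum(2 for k in K))               # 2 + 2 + 2 + 2 = 8
print(sum(k ** 2 for k in K))          # 1 + 4 + 9 + 16 = 30
print(sum(4 * k ** 2 for k in K))      # 120 = 4 * 30   (rule (2): constants move outside the sum)
print(sum(5 * k for k in K))           # 50 = 5 * (1 + 2 + 3 + 4)
print(sum(2 + 5 * k for k in K))       # 58 = 8 + 50    (rule (3): the sum can be split)
print(sum(5 * k + k ** 2 for k in K))  # 80 = 50 + 30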
a) Udregn b) Udregn 4 X 4 X 2 k=1 4 X k=1 4 X 3 k=1 c) Udregn d) Udregn e) Udregn 4 X k2 k=1 4 X g) Udregn h) Udregn 4 X 4k 2 k=1 4 X 5k 4 X k 8 k=1 2 P4 12 k=1 5k = 5 · 1 + 5 · 2 + 5 · 3 + 5 · 4 = 50 5 k=1 k=1 4 X (2 + 5k) k=1 4 X 4k 2 = 4 · 12 + 4 · 22 + 4 · 32 + 4 · 42 = 120 k=1 4 X k = 5(1 + 2 + 3 + 4) = 5 · 10 = 50 (2 + 5k) = (2 + 5) + (2 + 10) + (2 + 15) + (2 + 20) = 58 (5k + k 2 ) k=1 P4 k 2 = 12 + 22 + 32 + 42 = 30 k=1 4 X k=1 f ) Udregn 5 · 3 = 3 + 3 + 3 + 3 = 12 k=1 4 X k=1 4 X 2=2+2+2+2=8 k=1 4 X (5k + k 2 ) = (5 + 12 ) + (10 + 22 ) + (15 + 32 ) + (20 + 42 ) = 80 k=1 3 P4 30 k=1 k2 P4 k=1 120 4k 2 P4 50 k=1 5k P 5 4k=1 k 50 P4 58 k=1 (2 + 5k) P4 80 k=1 (5k + k2) i) Betragt summerne og dine resultater i tabellen. Redegør for sammenhængen mellem nogle af summerne. Diskuter i grupperne og forklar sammenhængene. 4 X a) k=1 4 X b) k=1 n X c) k=1 n X d) e) k=1 4 X f) k=1 n X g) k=1 4 X (a + e) h) k=1 4 X (c + e) k=1 n X 2=2·4 3=3·4 c=n·c 2 4k = 4 · 5k = 5 4 X k2 k=1 4 X k k=1 n X cxk = c xk k=1 (2 + 5k) = 2 4 X 2+ k=1 4 X (5k + k ) = (xk + yk ) = k=1 k=1 n X k=1 4 X 5k k=1 5k + xk + 4 X k=1 n X k2 yk k=1 j) Udregn når det oplyses at 100 X k=1 2 P100 k=1 100 X (2k 2 + k) k=1 k 2 = 338350 og (2k + k) = 100 X 2 2k + k=1 100 X =2 k=1 k2 + 100 X k=1 100 X k=1 P100 k=1 k = 5050 k k = 2 · 338350 + 5050 = 681750 Regnereglerne for summer er: Vi kan summe en konstant n X k=1 (1) c=n·c Man kan sætte en konstant ude for summationstegnet n n X X c · xk = c xk k=1 (2) k=1 Man kan dele summer op. n X k=1 n X k=1 (xk + yk ) = (xk yk ) = n X k=1 n X k=1 xk + xk n X k=1 n X yk (3) yk (4) k=1 Opgave 4 I skal nu arbejde videre med datasættet fra opgave 2 med alder og højde. a) Beregn gennemsnittet af alderen i datasættet fra opgave 2. Gør rede for hvad dette tal betyder. P Bruger sum i TI-Nspire. 21 k=1 xk = 220 Kan også bruge =mean(alder). x = 220 = 10.4762 21 Dette betyder at børnene, som er blevet undersøgt, i gennemsnittet er 10.5 år. b) Beregn gennemsnittet af højden i datasættet fra opgave 2. Gør rede for hvad dette tal betyder. P Bruger sum i TI-Nspire. 21 k=1 yk = 2991 Kan også bruge =mean(højde). x = 2991 = 142.429 21 Dette betyder at børnene, som er blevet undersøgt, i gennemsnittet er 142.4 cm høje Symbolet x bruges til at beskrive gennemsnittet af x-værdier (x1 , x2 , . . . , xn ) i et datasæt. Tilsvarende bruges y til at beskrive gennemsnittet af y-værdier (y1 , y2 , . . . , yn ). Eksempelvis kan vi skrive gennemsnittet af 1, 2, 3, 4, 5 som 5 1+2+3+4+5 1X x= = k 5 5 k=1 c) Opstil en generel formel til at udregne gennemsnittet (x) af x-værdierne (x1 , x2 , . . . , xn ) ved brug af sumtegn. n x1 + x2 + . . . + xn 1X x= = xk n n k=1 d) Opstil en generel formel til at udregne gennemsnittet (y) af y-værdierne (y1 , y2 , . . . , yn ) ved brug af sumtegn. n y1 + y2 + . . . + yn 1X y= = yk n n k=1 Opgave 5 a) Plot datasættet fra opgave 2. Indtegn linjer, der viser gennemsnittet af x og y. Linjen, der viser gennemsnittet af x indtegnes ved 4: Undersøg data, 8: Plot værdi. Linjen, der viser gennemsnittet af y indtegnes ved 4: Undersøg data, 4: Plot funktion. Funktionen, der skal plottes, er y = y. b) Bestem den bedste rette linje for datasættet i a). Skriv forskriften for linjen. Forskriften for linjen er: y = 4.755x + 92.61 c) Tilføj en kolonne til datasættet, hvor du udregner yk y for hvert punkt i datasættet. Lav et plot af xk og yk y. Forklar hvad der er sket med placeringen af datasættet i forhold til plottet i a). Kolonnen med yk y kaldes forskely. 
Den udregnes ved at skrive skrive = b1 - $ c $4 og trække kolonnen ned. På plottet kan vi se at punkterne er blevet forskudt vertikalt. Gennemsnittet af punkterne er nu 0 for y-værdierne. Plottet ligner det for før, men værdierne på y-aksen er ændret. d) Bestem den bedste rette linje for datasættet plottet i c). Skriv forskriften for linjen y = 4.755x 49.8147 Trukket y = 142.429 fra. P e) Udregn nk=1 (yk y). Udregnes ved sum(d1:d21). Giver i TI-Nspire 3E 11 . f) Tilføj en ny kolonne til datasættet, hvor du udregner xk x for hvert punkt i datasættet. Lav et plot af xk x og yk y. Forklar hvad der er sket med placeringen af datasættet i forhold til plottet i a) og c). Kolonnen med xk x kaldes forskelx. Den udregnes ved at skrive skrive = a1 - $ c $3 og trække kolonnen ned. På plottet kan vi se at punkterne er blevet forskudt horisontalt i forhold til c). Gennemsnittet af punkterne er nu 0 for både x- og y-værdierne. I forhold til plottet i a) er punkterne forskudt både horisontalt og vertikalt. g) Bestem den bedste rette linje for datasættet i f). Skriv forskriften for linjen. y = 4.755x h) Sammenlign forskrifterne bestemt i b), d) og g). Gør rede for sammenhængen mellem forskrifterne. a) y = 4.755x + 92.61 c) y = 4.755x 49.8147 g) y = 4.755x Hældningen (a) er den sammen for forskrifterne, men skæringen med y-aksen (b) ændrer sig ved forskydning. P i) Udregn nk=1 (xk x). Udregnes ved sum(e1:e21). Giver i TI-Nspire 1E 11 . j) Vis, ved brug af regnereglerne for summer og formlen for gennemsnittet (fra opg. 4c), at det altid gælder at n X (xk x) = 0 k=1 n X (xk x) = k=1 HUSK fra opg. 4c): x = 1 n n X n X xk k=1 k=1 = nx Pn k=1 x nx = 0 xn , nx = Pn k=1 xn k) Vis, ved brug af regnereglerne for summer og formlen for gennemsnittet (fra opg. 4d), at det altid gælder at n X (yk y) = 0 k=1 n X (yk k=1 HUSK fra opg. 4d): y = 1 n y) = n X n X yk k=1 = ny Pn k=1 yn , ny = y k=1 ny = 0 Pn k=1 yn Bestemme bedste rette linje for et centraliseret datasæt I skal bestemme den bedste rette linje y = ax + b for et datasæt (x1 , y1 ), (x2 , y2 ), . . . , (xn , yn ), bestående af n punkter, hvor n P 2. ViPantager at datasættet er centraliseret, dvs. at x = n1 nk=1 xk = 0 og y = n 1 k=1 yk = 0. n Vi skal bestemme a og b, så kvadratsummen n ✓ X yk axk + b k=1 ◆2 bliver mindst mulig. Trin 1) Det første trin består i at omskrive summen, så at vi kan minimere denne i trin 2. a) Gør rede for at yk Mulige teknikker: yk (axk + b) 2 (axk + b) = yk 2 = yk2 axk b 2byk + a2 x2k + 2abxk + b2 2axk yk 2 = yk axk b · yk axk b = yk2 yk axk yk b axk yk + a2 x2k + axk b = yk2 2axk yk 2byk + a2 x2k + 2abxk + b2 yk (axk + b) 2 byk + baxk + b2 = (yk )2 + (axk + b)2 2yk (axk + b) = yk2 + a2 x2k + b2 + 2axk b 2yk (axk + b) = yk2 + a2 x2k + b2 + 2abxk 2ayk xk 2byk P P 2 b) Vi har at nk=1 yk (axk + b) = nk=1 yk2 2axk yk 2byk + a2 x2k + 2abxk + b2 . Brug regnereglerne for summer (1-4) til at dele summen op, sætte a, b foran summationen og summe over b. n X yk (axk + b) 2 = k=1 = n X yk2 + k=1 = n X n X k=1 yk2 +a 2 k=1 = n X k=1 a2 x2k + n X k=1 yk2 + a2 n X k=1 x2k + n ✓ X k=1 n X 2 yk2 b + k=1 n X + a2 x2k n X 2 + b + 2abxk 2abxk k=1 2 b + 2ab k=1 x2k + nb2 + 2ab n X 2ayk xk k=1 n X xk 2a k=1 n X k=1 xk 2ayk xk n X 2byk k=1 n X yk xk 2b k=1 2a 2byk n X k=1 yk xk n X k=1 2b n X k=1 yk yk ◆ c) Brug antagelsen om at gennemsnittet af x og y er 0 til at simplificere udtrykket fra b). 
n X yk2 +a k=1 = n X k=1 2 n X x2k 2 + nb + 2ab k=1 yk2 +a 2 n X x2k + nb2 k=1 n X xk k=1 n X 2a 2a n X yk xk k=1 2b n X yk k=1 yk xk k=1 Trin 2) I trin 1) fandt vi at n ✓ X yk axk + b k=1 ◆2 = a2 n X x2k 2a k=1 n X k=1 xk yk + n X yk + nb2 k=1 Vi skal nu finde de værdier for a og b, som minimere kvadratsummen. Summen n n n X X X 2 2 a xk 2a xk yk + yk + nb2 k=1 kan vi skrive som k=1 a2 A k=1 2aB + C + nb2 a) Gøre rede for hvad A, B, C svarer til i summen og forklar hvorfor A, B, C er konstanter. A= n X x2k k=1 B= n X xk yk k=1 C= n X yk2 k=1 xk , yk er konstanter og derfor er summen af disse også konstanter. I udtrykket er A, B, C konstanter, som ikke afhænger af de to variable a, b. b) Udtrykket a2 A 2aB + C + nb2 kan betragtes som en sum af to andengradspolynomier med hhv. a og b som variable. Skriv de to andengradspolynomier som udtrykket består af. Summen kan skrives som en sum af to polynomier. Mulighed 1) a2 A 2aB + C og nb2 . Mulighed 2) a2 A 2aB og C + nb2 c) Bestem a og b, så a2 A 2aB + C + nb2 bliver mindst mulig. Brug fra b) at udtrykket kan betragtes som to andengradspolynomier, som hver skal minimeres. For at summen bliver mindst mulig skal hver af polynomierne være mindst mulige. Teknik 1: Toppunktformel a2 A 2aB + C har toppunkt når a = (2A2B) = B . A Da A > 0 er dette et minimum. nb2 er mindst mulig, når b = 0. Da n 2 er dette et minimum. Teknik 2: Differentiering: 2aA 2B = 0 , 2aA = 2B , a = B A B B Hvis B > 0, så er 12 B < < 2 . Dette giver at: A A A 2 12 B A 2B = B 2B = B < 0 og 2 · 2B A 2B = 4B 2B = 2B > 0. A A B Derfor er a = A er et minimum. Hvis B < 0, så er 2 B <B < 12 B Dette giver at: A A A B 2 · 2 A A 2B = 4B 2B = 2B < 0 og 2 12 B A 2B = B 2B = B > 0. A B Derfor er a = A er et minimum. 2nb = 0 , b = 0 Da n 2 er 2nb < 0 for b 2] 1; 0[ og 2nb > 0 for b 2]0; 1[, dvs. b = 0 er et minimum ✓ ◆2 Pn d) Opskriv værdierne for a og b som gør kvadratsummen k=1 yk axk + b mindst mulig. Vi har fundet at kvadratsummen er minimeret, når Pn yk xk B b = 0 og a = = Pk=1 n 2 A k=1 xk Opgave 6 Betragt datasættet fra opgave 2 med alder og højde. a) Brug dine formler for a og b til at bestemme forskriften for den bedste rette linje. Benyt dine udregninger fra i går, hvor du udregnede yk y, xk x for alle punkter. P Først udregnes nk=1 (yk y)(xk x). Dette gøres i kolonne f ved kommandoen = d1 · e1. Herefter udregnes summen ved = sum(f 1 : f 21). I kolonne g udregnes (xk x)2 og bagefter findes summen ved brug af sum. a = 785.714 = 4.755 165.238 b = 142.429 4.755 · 10.4762 = 92.6147 Forskriften for den rette linje er y = 4.755x + 92.61 b) Bestem forskriften for datasættet i TI-Nspire. Er forskriften ens? Der er to teknikker til at finde forskriften i TI-Nspire. 4) Statistisk, 1) Statistiske beregninger, lineær regression. 3) Data, 9) Hurtiggraf, 6) Regression, Vis lineær. Begge teknikker giver forskriften y = 4.755x + 92.61, som er den samme som blev bestemt i a). Opgave 7: Verdensrekorder Tabellen nedenfor viser tiderne for verdensrekorderne i maratonløb (målt i sekunder) for mænd i perioden 1998-2007. År 1998 1999 2002 2003 2007 Tid (sek) 7565 7542 7538 7495 7466 Udviklingen af verdensrekorderne kan beskrives ved en lineær model f (x) = ax + b, hvor x angiver antal år efter 1998 og f (x) verdensrekorden i sekunder. a) Bestem a og b i modellen Model: f (x) = 10.626x + 7561.58, hvor f (x) angiver sekunder og x år efter 1998. b) Hvad kan vi forvente at verdensrekorden bliver i 2014 ifølge modellen? f (2014 1998) = f (16) = 7391.56 c) Verdensrekorden blev i 2014 slået, så den nu er på 7377. 
Hvordan passer det med c)? Benyt oplysningen til at kommentere på modellens rækkevidde. Ifølge modellen skulle verdensrekorden i 2014 være på 7391.56sek. Dette stemmer fint overens med virkeligheden, da verdensrekorden her var på 7377sek, dvs. en forskel på 14.56sek. Dette viser at modellen godt kan bruge til at forudsige tiden i 2014. Opgave 8: Dykkersyge Hvis en dykker opholder sig et (længere) stykke tid på en vis dybde, skal han stige langsomt op til overfladen igen for at forhindre dykkersyge, dvs. at der opstår livsfarlige bobler af kuldioxyd i blodet. Der er en sammenhæng mellem dybde og det antal minutter, som dykkerne kan opholde sig på den pågældende dybde uden at få dykkersyge. Den kan ses i følgende tabel x (m) 10 12 14 16 18 20 22 25 ... 40 42 y (min) 219 147 98 72 56 45 37 29 ... 9 8 I en model antages det, at dybden og tiden kan beskrives med en potens funktion. a) Opstil en model, der beskriver sammenhængen mellem dybde og antal minutter. Bestem forskriften for modellen. Sammenhængen mellem dybde og antal minutter kan beskrives ved en potens model f (x) = bxa , hvor f (x) angiver antal minutter, dykkeren kan opholde sig på en dybde x. x angiver dybden i meter. Forskriften for modellen er: f (x) = 42529x 2.289 b) Benyt modellen til at udregne hvor længe dykkeren kan opholde sig på en dybde på 30m. Vi kender dybden x = 30m. f (30) = 17.7min c) Benyt modellen til at udregne den dybde, dykkeren kan dykke, hvis han gerne vil dykke så dybt som muligt og opholde sig på dybden i 20min. Vi kender tiden y = 20min. 1 20 2.289 f (x) = 20 , 42529x 2.289 = 20 , x = 42529 = 28.43m Eleverne kan også bruge solve(f (x) = 20, x) Opgave 9: Vindmølleenergi Figuren herunder viser udviklingen på verdensplan i kapaciteten af vindmølleenergi. a) Opstil en model, der beskriver sammenhængen mellem årstal og kapaciteten af vindmølleenergi. Bestem forskriften for modellen. På grafen er det svært at se om udviklingen af vindmølleenergi vokser eksponentielt eller potens, så vi prøver at lave begge modeller til at beskrive sammenhængen mellem årstal og kapaciteten af vindmølleenergien. Eksponentiel model: f (x) = bax , hvor f (x) er vindmølleenergi målt i MW og x er år efter 1996. g(x) = bax , hvor g(x) er vindmølleenergi målt i MW og x er årstal. f (x) = 6.476 · 1.278x med r2 = 0.998 g(x) = 1.589 · E 212 · 1.278x med r2 = 0.998 Potens model: f (x) = bxa , hvor f (x) er vindmølleenergi målt i MW og x er årstal. g(x) = bxa , hvor g(x) er vindmølleenergi målt i MW og x er år efter 1996 h(x) = bxa , hvor h(x) er vindmølleenergi målt i MW og x er år efter 1995 (for at undgå 0) f (x) = 0 · 491.409x med r2 = 0.998. Error (pga. ln(0) ) g(x) = 2, 759 · 1, 405x med r2 = 0.899. På baggrund af r2 -værdierne vælger vi at at beskrive sammenhængen med en eksponentiel model. Dette virker også realistisk i forhold til virkeligheden, pga. en smitteeffekt som påvirker folk til at købe vindmøller. Derfor vil størrelsen hvormed vindmølleenergien ændrer sig være proportional med x vindmølleenergien, b·ab·a·ln(a) = ln(a). x b) Benyt modellen til at beregne kapaciteten af vindmølleenergien i 2014 og sammenlign med den faktiske kapacitet af vindmølleenergi i 2014, som var 369.6M W . f (2014 1996) = f (18) = 535.4M W g(2014) = 535.4M W I følge modellen vil kapaciteten af vindmølleenergi i 2014 være 535.4M W . Dette stemmer ikke overens med den faktiske kapacitet, da denne er 369.6M W . Modellen overestimerer kapaciteten med 535.4 369.6M W = 165.8M W . 
Når vi anvender modellen til at forudsige vindmølleenergien forudsætter vi at vindmøllekapaciteten fortsætter med at stige med 27,8% pr. år, hvilket ikke er realistisk i længden, da kapaciteten af vindmølleenergi på et tidspunkt må stagnere. Modellen kan derfor ikke bruges til at forudsige vindmølleenergien i 2014. Opgave 10: Spædbørnsdødelighed Spædbørnsdødeligheden angiver, hvor mange promille af de levendefødte, der dør inden for det første leveår. Tabellen nedenfor er hentet fra Danmarks Statistik og den viser udviklingen i spædbørnsdødeligheden i perioden 1933-1973. Årstal Spædbørnsdødelighed 1933 71.4 1938 60.0 1943 48.4 1948 37.8 1953 27.4 1958 23.0 1963 19.6 1968 15.4 1973 11.5 a) Lav et plot af datasættet. Indfør passende variable og opstil en model, der beskriver sammenhængen mellem årstal og spædbørnsdødeligheden i perioden 1933-1973. Forklar dit valg af model. Hvis vi plotter datasættet ser vi at datasættet aftager, men det er svært at afgøre om det skal beskrives med eksponentiel eller potens funktion. Eksponentiel: På plottet af datasættet kan vi se at modellen passer fint på datasættet. Eksponentiel model har r2 = 0.996. Model: f (x) = bax , hvor f (x) angiver dødeligheden i promille og x angiver år efter 1933. Potens: På plottet af datasættet kan vi se at modellen passer fint på datasættet. Potens model med r2 = 0.996. Model: f (x) = bxa , hvor f (x) angiver dødeligheden i promille og x angiver årstal. På baggrund af graferne og forklaringsgraden kan vi ikke afgøre hvilken model, der bedst beskriver sammenhængen. Hvis vi forholder os til virkeligheden vælges eksponentiel modellen, da midlerne som bruges til at forbedre spædbørnsdødeligheden må formodes at afhænge af spædbørnsdødeligheden. Derfor jo mindre spædbørnsdødeligheden er, desto færre midler bruges på at formindske denne. Derfor vil størrelsen, hvormed spædbørnsdødeligheden ændres formodes at være proportional med spædbørnsdødeligheden. b·ax ·ln(a) = ln(a). b·ax b) Bestem parameterne i modellen fra a). Eksponentiel model f (x) = 73.58 · 0.955x c) Forudsig på baggrund af modellen hvor stor en promille af de levendefødte, der dør inden for det første leveår i 2008. Eksponentiel: f(2008-1933)=f(75) = 2.402. I følge modellerne vil 2.402 promille af de levendefødte dø inden for det første leveår i 2008 d) Det oplyses at spædbørnsdødeligheden i 2008 var 9.9 promille. Hvordan passer det med modellen? Benyt oplysningen til at kommentere modellens rækkevidde. Hvis vi bruger modellen til at forudsige spædbørndødeligeheden i 2008 underestimeres antallet af spædbørn der dør. I virkeligheden var der 9.9 promille, mens vi ved modellen beregnede 2.4, dvs. en forskel på 7.5. Når vi anvender modellen til at forudsige spædbørndødeligheden i 2008 antages vi at den vil falde hvert år med 4,5%. Dette er ikke realistisk at den vil fortsætte med det så mange år efter de faktiske tal. Modellen kan derfor bruges til at beskrive spædbørnsdødeligheden i 1933-1973 og få år efter, men ikke til at lave ekstrapolation så mange år. Opgave 11: Løb Tabellen viser sammenhørende værdier af løbedistance og den tid, der tager en kondiløber at løbe distancen hurtigst muligt. Løbedistance (km) Løberens tid på distancen (sek) 0.5 98 1 215 3 779 5 1364 7 1989 10 2869 15 4341 20 5904 25 7413 På baggrund af disse data vil kondiløberen gerne udregner hvor lang tid det vil tage at løbe et maraton (42.195km). 
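The solution below compares a linear and a power model for these data. As a supplement, here is a minimal Python sketch of power regression by linearization, the approach used in the teaching sequence: taking logarithms turns f(x) = b*x^a into a linear relation between ln(x) and ln(y), to which the least-squares line is applied. The result should be close to the power model f(x) = 219.91*x^1.11 given below; small differences are due to rounding.

# A minimal sketch: power regression by linearization for the runner's data.
import math

x = [0.5, 1, 3, 5, 7, 10, 15, 20, 25]                   # distance in km
y = [98, 215, 779, 1364, 1989, 2869, 4341, 5904, 7413]  # time in seconds

u = [math.log(xk) for xk in x]
v = [math.log(yk) for yk in y]

ubar = sum(u) / len(u)
vbar = sum(v) / len(v)

# Least-squares line in the (ln x, ln y) coordinates.
a = sum((uk - ubar) * (vk - vbar) for uk, vk in zip(u, v)) / sum((uk - ubar) ** 2 for uk in u)
b = math.exp(vbar - a * ubar)

print(a, b)             # approximately 1.11 and 220
print(b * 42.195 ** a)  # predicted marathon time, roughly 13900 seconds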
Nogle eksperter mener at kondiløberens tid som funktion af distancen kan beskrives ved en lineær model f (x) = ax + b, hvor x er distancen (målt i km) og f (x) er kondiløberens tid (målt i sekunder). a) Bestem a, b i den lineære model og brug denne model til at udregne hvor lang tid kondiløberen vil være om at løbe et maraton. Modellen er givet ved: f (x) = 299.47x 103.54, hvor f (x) angiver tiden i sek og x angiver distancen i km. f (42.195) = 12532.5sek Andre eksperter mener at kondiløberens tid som funktion af distancen skal beskrives ved en potens model f (x) = b · xa , hvor x er distancen (målt i km) og f (x) er kondiløberens tid (målt i sekunder). b) Bestem a, b i potens modellen og brug denne model til at udregne hvor lang tid kondiløberen vil være om at løbe et maraton. Modellen er givet ved: f (x) = 219.91x1.11 , hvor f (x) angiver tiden i sek og x angiver distancen i km. f (42.195) = 13879.9sek c) Er kondiløberens tid på et maraton den samme ved de to modeller?. Hvis nej, hvor stor er forskellen. Lineær model: f (42.195) = 12532.5sek Potens model: f (42.195) = 13879.9sek Forskel: 13879.9 12532.5sek = 1347.4sek ⇡ 22min d) Hvilken model ville du bruge til at bestemme kondiløberens tid? Skriv begrundelserne ned. Diskuter dine begrundelser med gruppen. Lineær model: r2 = 0.9998 Potens model: r2 = 0.9993 Viden om løb: Tiden angiver hvor lang tid personen er om at løbe en given distance hurtigst muligt. Vi ved at desto længere man løber, desto langsommere løber man pr. km. Ved en lineær model antager vi at den maksimale hastighed er den samme uanset distancen, hvilket ikke er realistisk. Den lineære model giver at tiden øges med 299.47sek hver gang personen løber 1km mere. De faktiske tider pr. km for de forskellige distancer er: Løbedistance (km) Løberens hastighed (sek/km) 0.5 196 1 215 3 259.7 5 272.8 7 284.1 10 286.9 15 289.4 20 295.2 25 296.5 I tabellen kan vi se at hastigheden ikke er konstant og at det faktisk kun er ved de længere distancer (15-25) at sek/km stemmer nogenlunde overens med modellen. Ved en potens model antages det at hastigheden ikke er konstant og at den maksimale hastighed falder desto længere man løber. Ved at bruge potens modellen ses det at løberen vil løbe med 13879.9 = 328.9sek/km, dvs. omkring 42.195 30sek langsommere pr. km end ved 25km på et marathon. På grafen kan vi også se at tiden bliver overestimeret en smule. Da den lineære model ikke er realistisk i forhold til løb vil jeg bruge potensmodellen til at forudsige løberens tid på marathon. Opgave 12: Tilskadekommende i trafikken Hvert år opgøres antallet af tilskadekommende i trafikken i Danmark. Antallet af tilskadekommende i Danmark i årene 1979-1999 kan ses i tabellerne herunder. I 1995 blev antallet af tilskadekommende i trafikken ikke opgjort, derfor mangler der data for dette år. År 1979 1980 1981 1982 1983 1984 1985 Antal ulykker 9267 8477 8546 8427 8105 8523 8672 År 1986 1987 1988 1989 1990 1991 1992 Antal ulykker 8301 7357 7321 7266 6396 6231 6031 År 1993 1994 1995 1996 1997 1998 1999 Antal ulykker 5735 5661 ... 5339 4424 4071 4217 a) Bestem på baggrund af tabellen hvor mange tilskadekommende i trafikken der ca. har været i 1995. Skriv din model, dine overvejelser om valg af model samt dine udregninger. På scatterplottet kan vi se at antallet af tilskadekommende ser ud til at aftage lineært, så jeg har lavet lineær regression. Ved lineær regression fås modellen f (x) = 256.6x + 9407.8, hvor f (x) er antal tilskadekommende og x antal år efter 1979. Ved lineær regression fås r2 = 0.95. 
Antallet af tilskadekommende i trafikken i 1995 er f (1995 1979) = f (16) = 5301.5, hvilket stemmer overens med tabellen. b) Forudsig på baggrund af tabellen antallet af tilskadekommende i trafikken i 2016. Skriv din model, dine overvejelser om valg af model samt dine udregninger. Antallet af tilskadekommende i trafikken, hvis den lineære udvikling fortsætter er: f (2016 1979) = f (37) = 87.9 Dette tal er urealistisk, da det er negativt. Det er heller ikke realistisk at antallet af tilskadekommende i trafikken vil fortsætte med at aftage lineært efter 1999, da vi så på et tidspunkt får 0 eller et negativt antal tilskadekommende. Derfor kan den lineære modellen ikke bruges til ekstrapolation. Vores antal af tilskadekommende i 1995 er meget realistisk hvis vi ser på antallet i 1994 og 1996. Modellen kan derfor anvendes til at beskrive årstallene mellem 1979-1999 og i en kort periode efter. For at udregne antallet i 2016 vil vi derfor modellere data med en anden model. På plottet ser vi at data aftager med en lineær tendens, men vi forventer ikke at de fortsætter med at aftage lineært, da antallet på et tidspunkt må stagnere. Både en eksponentiel og potens model vil stagnere, så antallet af tilskadekommende vil aftage år for år, men aldrig blive negativt. Ved eksponentiel regression fås: f (x) = 9886.6 · 0.961x hvor f (x) er antal tilskadekommende og x år efter 1979. Forklaringsgraden for modellen er 0.924. Ved denne model er antallet af tilskadekommende i 2016 f (37) = 2262. Ved potens regression fås: g(x) = 2.0096 · E 265 · x 79.2696 hvor g(x) er antal tilskadekommende og x er årstallet. Forklaringsgraden for modellen er 0.923. Ved denne model er antallet af tilskadekommende i 2016 g(2016) = 2278. Afhængigt af hvilken model vi vælger vil forskellen kun være på 16. Da forklaringsgraden er ens for begge modeller og de begge stagnere (ved at det er svært at undgå tilskadekommende i trafikulykker) er der ingen kvalitative argumenter som kan bruges til at afgøre hvilken model vi skal bruge. Da modellerne forudsiger antallet til at være henholdsvis 2262 og 2278 kan vi forudsige at antallet af tilskadekommende i 2016 vil være omkring 2270. Den lineære model beskriver data fra 1979-1999 godt (højest r2 -værdi) og kan derfor bruges til interpolation, mens en eksponentiel eller potens model er bedre egnet til extrapolation uden for modellens rækkevidde. Opgave 13: Kaniner på en øde ø På en øde ø blev der sat 50 vilde kaniner ud 1. januar 2000. I de efterfølgende år talte man hvert år antallet af kaniner på øen d. 1. januar. Tabellen herunder viser hvordan antallet af vilde kaniner på øen vokser med tiden, målt i år efter 2000. År 1 2 3 4 5 6 7 Antal kaniner 100 240 450 1000 2000 4100 9050 År 8 9 10 11 12 13 14 Antal kaniner 17030 26010 30070 32500 33810 33890 33870 a) Hvilken sammenhæng er der mellem tiden og antal kaniner. På scatterplottet kan vi se at antallet af kaniner vokser eksponentielt indtil 8-9år, hvorefter at udvikling mindskes og efter ca. 12 år er antallet af kaniner stabilt på ca. 33800-33900. Kan beskrives ved logistisk vækst (hørt dette fra biologi?) 34089.8 Logistisk regression (d=0): f (x) = 1+3509.79·e 1.02543x . b) Bestem ud fra data i tabellen hvor mange kaniner der har været på øen 1. juli 2006. Skriv din model, udregninger og overvejelser. Kaninerne blev sat ud 1. januar 2000, så 1. juli 2006 svarer til x = 6.5. Ved den logistiske model (d=0) får vi at antallet er f (6.5) = 6228.89. 
Da den første del af udviklingen kan beskrives ved en eksponentiel funktion, laver jeg eksponentiel regression på data fra de første 8 år og bruger denne model til at bestemme antallet af kaniner 1. juli 2006. Modellen, der kan bruges til at beskrive udviklingen de første 8 år er givet ved: f (x) = 51.6 · 2.08x med r2 = 0.999. Dette viser at udviklingen de første 8 år kan beskrives ved en eksponentiel model. Antallet af kaniner d. 1 juli 2006 er da f (6.5) = 5985.99 c) Bestem ud fra data i tabellen hvor mange kaniner der vil være på øen 1. januar 2016 Skriv din model, udregninger og overvejelser. Ifølge tabellen og grafen kan vi se at antallet af kaniner er stabilt efter 12år på ca. 33800-33900, så antallet af kaniner efter 16 år (2016) vil være i dette interval. Vi kan beskrive de sidste år (år 12-14) med en lineær model som har hældning 0. Udregning ved logistisk model: f (16) = 34080.8. Dette stemmer overens med hvad vi forventede ud fra data. 162 list of references a.3 tables of the didactic process Description of the mathematical praxeologies and the didactic process in module 1-3. Module 1 Episode Problem Task Introduction to best line Q3 : What do we mean by best line? TRL Q3 : What do we mean by best line? Exercise 1+2 Summary of exercise 1+2. Definition of best line Like above Exercise 3 Calculate sums Summary of exercise 3 Exercise 4 + 5ai Calculate sums Summary of exercise 4 + 5a-i Exercise 5j (5k) Calculate mean/sums. Displacement of data TRL Like above Prove Pn 0 Summary k=1 (xk x) = Technique of task Technology Theory of specific activity Didactic moment of MO 1) with best line Draw the best line by eye. Calculate sums by hand Calculate sums. Draw dataset I Technique ⌧RL Like above Pn x = n1 k=1 xk Rules for sums Explain their choice of the line. Specify what best means Definition of best line. ✓P M : Minimize n k=1 (yk (axk + b))2 The rules sums Changing data do change slope of the for the not the line Definition of mean. Change of data do not change the slope Subtract x, y centralize the dataset D Definition of best line Rules of sums Data can always be centralized 1) with best line 2) Elaborate techniques 3) Technological discourse 4) Technical work with techniques to the technical moment 4) Rules of sums 4) Technical work with techniques to the technical moment 4) Technique to displace data 4) Technique to centralize data Module 2 Episode Summary module 1 Problem Task Technique of Formula of best line for centralized data Summary of the proof Minimize Pn k=1 (yk (axk + b))2 Part 1) Rewrite the sum Part 2) Minimize the sum of two polynomials Like above Formula of best line for a data set Minimize Pn k=1 (yk (axk + b))2 Exercise 6 TRL Introduction to best exponential/power model Q2 : What do we mean by best exp/power model? Q2 : How can we determine the models? 
Technology Theory Didactic moment Definition and results from module 1 4) Techniques to technical work 4) Develop technique Theorem of best line for centralized data Theorem of best line for data set 5) Institutionalization Realize that the result of the two techniques are identical 6) Evaluation of the technique and the instrumented technique 1) with best exp/power model 5) of definition and techniques min(s(a,b)) = min( (a)) + min( (b)) Multiplication Quadratic identity Rules of sums Differentiation Formula of vertex Like above 1) Displacement of data 2) Theorem of centralized data A ⌧RL : Use the formula I ⌧RL : TI-Nspire Transformation / linearization Linear regression Definition of best exp/power model by linearization 5) Institutionalization Module 3 Episode Problem Task Technique Technology Theory Didactic moment Summary of exercise 7-9 TRL , TR , TRP , TY M , TXM , TQP ⌧ I : TI-Nspire I I ⌧RL , ⌧RP , solve 1) with Q1 , 1) reencounter with old techniques Exercise 10 +11 Related to Q1 , Q4 . 10) TR , TY M , TQP 11) TRL , TRP , TY M Nspire. ExamI I ples: ⌧RL , ⌧RP , Scatterplot r2 Calculate speed Summary of exercise 10 + 11 Like above Like above Realize that models can describe data well in a restricted range, but not useful for predictions Restricted range for predictions. The phenomenon are relevant for choice of Fi . Like above Exercise 12 + 13 Related to Q1 , Q4 . 12) TR , TY D , 13) TR , TY D The same as in exercise 10+11. Regression of restricted data set Logistics regression Summary of exercise 12 + 13 Like above Like above Summary module 3 Q1 and Q4 of As in exercise 10 + 11. The model with highest r2 are not necessarily the best. New class of functions Choice of Fi depends on the purpose. Not solely use r2 to determine Fi . Investigate the phenomenon. The previous technological and theoretical elements 3) specify what best mean. 4) Develop techniques to choose Fi 3) State precise what best mean. 4) Technique to choice best. 5) Institutionalization of best. 3) Explain best Fi , 4) Develop techniques 5) Institutionalization of Q1 and Q4 5) Institutionalization of Q1 and Q4 166 list of references a.4 the planned didactic process Lektion 1 Tid Aktivitet 5min 3min Introduktion til forløb. Introduktion til bedste linje og regression. 13min Opgave 1 + 2 12min Repræsentation /arbejdsform Vise datasæt. S: Hvilken linje er bedst? Motivation til forløb og timen. Grafisk via plots. Tegne bedste linje. Individuel/gruppe Lærer Fortælle om emne, Lytte, stille spørgsmål dataindsamling Introducerer problem Lytte, stille spørgsmål. om hvad vi mener med bedst? Observere, lytte til diskussioner, identificerer gode argumenter. Opklare spørgsmål Opsamling opg. 1 Grafisk illustration af 1) Italesat kriterier for +2. kriterier. Brug af TI- bedste linje. Input fra Introduktion til Nspire til at vise linjer. elever. MKM + summer. Algebraisk og grafisk 2) Definition af repræsentation af bedste linje. MKM. MKM. 3) Forklare summen af kvadreret fejl. Forklare sumtegn. 15min Opgave 3 5min 6min PAUSE Opsamling på opg. 3. Institutionaliserin g af regneregler. 20min Opgave 4 + 5a-i Elev Tegne bedste linje. Overveje og diskutere kriterier. Definere bedste linje. 1) Eleverne gennemgår kriterier og forslag til bedste linje. 2) Genkalde definition af bedste linje. 3) Lytte, stille spørgsmål. Numerisk. Udregne Observere. Afklare Udregne summer. summer. Finde spørgsmål. Lytte til Diskutere sammenhænge/nye diskussioner. sammenhænge/nye teknikker. Algebraisk - Identificerer grupper, teknikker i grupper. 
skrive regleregler for der finder Hvis en gruppe finder summer. sammenhængene. regneregler skrives Individuel/gruppe. denne på tavlen. Algebraisk. Valideres via numeriske eksempler. Regneregler skrives op. Regneregler for summer institutionaliseres. Regneregler opskrives via input fra eleverne. Aktiv deltage med formulering af regneregler/teknikker fundet i opg. 3. Forklare hvorfor regnereglerne gælder. Udregning af gns. Overvære eleverne for Arbejde med summer Generalisering af at få fornemmelse for i TI-Nspire. Opstille udtryk. Udregning og forståelse og generel formel. grafisk repræsentation problemer. Forståelse for ændring af ændret datasæt. Identificere grupper af datasæt - se grafisk. Gruppearbejde. med gode Diskussion af besvarelser/forklaring forflytning. er. Lektion 1 Opsamling 4 + 5a- Fælles gennemgang af i. Introduktion til 4 + 5a-i. Grafisk centraliseret illustration af datasæt. gennemsnit + forflytning. Nyt datasæt. 8min Gennemgang af 5j (5k) 5min 3min 95min Gennemgå formel for Enkelte elever: gennemsnit. Sikre Gennemgå løsning af forklaring af flytning 5a-f. De andre tjekker er tilstrækkelig. udregninger, stiller Supplere med spørgsmål. Forklare spørgsmål. Pointe: hvad der sker. Flytte datasæt, så det bliver centraliseret. Algebraisk: Bevis for at Gennemgå hvorfor Deltage aktivt i summen af sum er 0. Stille beviset ved at bidrage centraliseret data er 0. spørgsmål til eleverne - med ideer til inddrage regneregler udledning. for summer og formel for gns. Huske at give eleverne tid til huske regneregler + gns. Opsamling. Gennemgå vigtige Pointere vigtige Definition af pointer, som er samlet pointer. Motivation bedste model på et slide. for næste time. (arbejde videre med i morgen). Bruge summer, centraliseret data. Lytte, stille spørgsmål, tage noter. Lektion 2 Repræsentation /arbejdsform Tid Aktivitet 5min Opsamling på Vigtige pointer skrives vigtige pointer fra på tavlen. lektion 1, som skal bruges i beviset. Introduktion til bevis. 5min Introduktion til bevis af bedste rette linje 43min Udlede formlen Algebraisk. Udlede for bedste rette beviset via trin. linje for et Gruppearbejde. centraliseret datasæt. 1) Omskrive sum 2) Minimere sum 5min 15min PAUSE Gennemgang af Algebraisk. Trinnene bevis for et og mellemregningerne centraliseret er skrevet og vises datasæt. trinvis. Kan ske over to omgange (først trin 1, derefter trin 2), hvis det er nødvendigt at opbryde forrige aktivitet. Bevis for ikkeGrafisk, algebraisk. centraliseret data. Kobling ml. bevis og opg. 5. 7min Grafisk. Algebraisk Lærer Elev Spørger eleverne til vigtige pointer og skriver disse på tavlen. Sikre at følgende genkaldes: 1)Bedste linje (MKM) 2) Centraliseret data 3) Regneregler for summer Introducere opg. Om formel for a,b. Skrive problem op. Grafisk vise hvordan datasæt kan rykkes. Starte med at bestemme for centraliseret data. Hjælpe eleverne med hints ved at refererer til allerede kendt viden. Validere resultaterne i trinnene. Observere om der er behov for gennemgang efter trin 1. Eleverne genkalder og gengiver pointer fra lektion 1. Spørge eleverne om forklaringer til de beviset (skrevet på forhånd). Validere og forklare trinnene/udregninger ne. Vigtigt at pointere de to trin. Forklare at vi har en formel til at beregne bedste linje. S: Hvad hvis data ikke er centraliseret? Eleverne får tid til at tænke over muligheder? Bruge formel fra før, da vi kan centralisere data. Vise forskydning grafisk. I alm. gælder vise formel. Forklare udregningerne og udledningen af formlerne. 
Lytte, stille opklarende spørgsmål. Lytte, stille spørgsmål. Arbejde i grupper på egen hånd (i det omfang det er muligt). Udlede formlen via trin. Opdage formlen for parameterne a og b for et centraliseret datasæt. Besvare spørgsmål. Tid til at tænke over løsning. Bruge viden fra lektion 1. Deltage i det omfang, de kan. Lektion 2 8min Opgave 6 7min Introduktion til Grafisk via eksempel i bedste TI-Nspire. Vise eksponentiel/pote transformationen. ns model. Linearisering. 95min Udregning af bedste rette linje via formel. Individuel/par Lade eleverne arbejde på egen hånd. Kan bidrage med teknikker (kommandoer) til at udregne sum, produkt og kvadrat. Arbejde med formlen udledt i forrige aktivitet. Opdage at formlen og CAS giver det samme. Validere selv deres udregning. Vise hvordan data kan transformeres, så de bliver lineære. Genkalde viden om eks/pot funktioner og logaritmer. Finde den bedste lineære model af transformeret data (ændre datasæt) Lytte, stille spørgsmål. Opdage/introduceres for sammenhæng mellem eksponentiel, potens og lineær regression. Lektion 3 Tid Aktivitet Repræsentation /arbejdsform 12min Opsamling på TI-Nspire opg. 7 - 9 (lektier) 3min Introducere problemet med valg af model. 27min Opgave 10 + 11 Numerisk / grafisk. Arbejde i TI-Nspire. Grupper. 10min Opsamling på opg. 10 + 11 TI-Nspire 5min 25min PAUSE Opgave 12 + 13 8min Opsamling på opg. 12 + 13 Lærer Elev Udvælge 1-2elever, 1-2 elever gennemgår som gennemgår opg. 8-opgaverne. 9. Validere Andre: Stille besvarelserne. Stille opklarende spørgsmål, supplere spørgsmål, tjekke og præcisere svar. resultater. Læreren gennemgår at Lytte, stille spørgsmål model typen sjældent er givet. Arbejde med at bestemme typen af model. Besvare opklarende spørgsmål. Observere elevernes arbejde. Udvælge elever med gode overvejelser/ besvarelser, som gennemgår efterfølgende. Validere besvarelserne. Supplere og præcisere svar. Bidrage med andre muligheder for at afgøre model. Arbejde med valg af model. Diskutere modeller og overvejelser om valg af model. 1-2 elever gennemgår. De andre bidrager med kommentarer, forslag og deres overvejelser samt stiller spørgsmål. Besvare opklarende Arbejde med valg af spørgsmål. model. Diskutere Hjælpe/give hints med modeller og opg. 13. Observere overvejelser om valg elevernes arbejde. af model. Udvælge elever med gode overvejelser/ besvarelser. TI-Nspire Validere besvarelserne. Supplere og præcisere svar. Pointere at der aldrig findes en rigtig model. Kommer an på formål. 1-2 elever gennemgår. De andre bidrager med kommentarer, forslag og deres overvejelser samt stiller spørgsmål. Lektion 3 5min 95min Opsamling på valg Væsentlige teknikker til Inddrage eleverne. af model at afgøre model skrives Understrege pointer. på tavlen. Kommer an på formål med modellen. Aktiv deltagelse. Bidrage med teknikker til at vælge model. Noterer vigtige pointer. Plan for undervisning Inden undervisningsforløbet bestemmes grupper, som eleverne arbejder i. Eleverne vil både komme til at arbejde individuelt og i grupper igennem forløbet. Diskussioner og overvejelser vil altid foregå i grupperne. Eleverne vil til hvert model få udlevet opgaver til dagens model, hvor jeg har indsat tekstbokse, som de kan skrive deres besvarelser i (se opgaverne). Efter opgaverne vil der være en tom side, hvor eleverne kan skrive noter osv. i. Tekstboksene og noterne gør det muligt for mig at dokumentere og analysere elevernes arbejde efterfølgende. Da det er svært at lave en tidsplan over forløbet har læreren mulighed for at ændre undervejs. 
I de enkelte moduler vil jeg skrive mulige ændringer, såsom at undlade opgaver, lave opsamling før, selv gennemgå opgaver og lignende justeringer.

Modul 1: Torsdag d. 9/4 8.00-9.35

Formål/indhold
• Kriterier for bedste rette linje, som giver anledning til definition af mindste kvadraters metode (MKM)
• Introduktion til summer og arbejde med summer. Regneregler for summer
• Centralisering af datasæt ved at forskyde dette

Introduktion til forløb
Jeg introducerer kort forløbet. Jeg fortæller om emnet, regression, og hvad de skal arbejde med i de tre moduler. Jeg fortæller om dataindsamling (optagelser af deres samtaler ved diktafoner og billeder af opgaveark, observationer).

Introduktion til bedste linje og regression
Læreren fortæller at vi skal repetere regression og stiller spørgsmålet om hvad vi mener med bedste linje. Læreren fortæller at vi de næste to dage (modul 1 + 2) skal arbejde med matematikken omkring regression, så vi kan forstå hvordan TI-Nspire udregner den bedste rette linje. Læreren gør det klart at TI-Nspire selvfølgelig finder den bedste rette linje ud fra nogle kriterier, og at der ligger matematik bag. Det er vigtigt at læreren pointerer at vi kan forklare udregningerne med matematik og derfor også hvad TI-Nspire gør. Derefter fortæller læreren grupperne til eleverne, og opgave 1-2 udleveres. Eleverne arbejder individuelt med opgave 1 + 2a-c). Herefter arbejder eleverne i grupperne (2d-e).

Opsamling på opgave 1 + 2
Når læreren fornemmer at eleverne har arbejdet med opg. 1-2 og har fået diskuteret og overvejet hvad de forstår ved bedste rette linje, starter læreren en opsamling på klassen. Ved opsamlingen har læreren datasættet for opg. 2 plottet i TI-Nspire, så læreren grafisk kan illustrere nogle af kriterierne for bedste rette linje. Læreren udvælger forskellige grupper og spørger hvilke kriterier de har. De forskellige kriterier diskuteres. Kriterier som vertikale, horisontale og vinkelrette afstande kan illustreres på tavlen. Hvis eleverne har få kriterier, kan læreren fremsætte nogle kriterier, som eleverne kan diskutere (se mulige kriterier herunder). Forhåbentlig vil nogle elever fremsætte kriteriet om at kvadrere de vertikale afstande, idet eleverne tidligere har arbejdet med lineær regression og er blevet kort introduceret til MKM. Dette kriterie kan illustreres via TI-Nspire.

Mulige kriterier og "begrundelser" for hvorfor de ikke bruges:
• Minimere den horisontale afstand $\sum_{k=1}^{n} (x_k - x)$ - Ved dette minimeres fejl i x-værdierne. Ofte måler vi en fejl i y-værdien (eks: kender årstal, alder, osv.).
• Minimere den vertikale afstand $\sum_{k=1}^{n} (y_k - y)$ - Måler fejl i y-værdierne.
• Minimere den vinkelrette afstand - God ide (både fejl i x og y). Rigtig svær at beregne.
• Minimere den vertikale afstand numerisk $\sum_{k=1}^{n} |y_k - y|$ - Svær at beregne. Kan give flere svar.
• Minimere kvadratet af de ovenstående afstande $\sum_{k=1}^{n} (y_k - y)^2$ - Undgår at positive og negative værdier udligner hinanden. Simpel løsning, som er let at regne.
• Lige mange punkter på hver side af linjen - Eksempel med 1 outlier.
• Skal gå igennem den mindste og største værdi - Eksempel med 1 outlier. Linjen ligger helt skævt.
• Skal gå igennem gennemsnittet af værdierne - Gør den også med MKM.
• Skal gå igennem to punkter - Vise hvor skævt det bliver ved at tage to punkter fra datasæt opg. 2. Vigtigt at vi anvender alle punkter.
• Fastlægge skæringen (b-værdien) hvis vi kender denne - Der kan altid være en vis usikkerhed.
Hvis eleverne ikke selv fremsætter kriteriet om MKM, genkalder læreren dette kriterie hos eleverne og præsenterer $\sum_{k=1}^{n} e_k^2$, hvor $e$ er den vertikale afstand. Læreren illustrerer kvadraterne for datasættet i opg. 2 og forklarer hvad sumtegnet betyder. Herefter skrives definitionen op:

Givet datasæt $(x_1, y_1), \ldots, (x_n, y_n)$. Den bedste rette linje $y = ax + b$ findes ved at minimere $\sum_{k=1}^{n} (y_k - y)^2 = \sum_{k=1}^{n} (y_k - (ax_k + b))^2$.

Med det valgte kriterie kan vi forklare og retfærdiggøre linjen ved brug af matematik. Læreren fortæller at eleverne nu skal arbejde med summer, idet vi finder den bedste rette linje ved at summe fejlene. Eleverne arbejder herefter med opg. 3 omkring summer. Hvis elever opdager regneregler for summer, skrives disse på tavlen. Imens eleverne arbejder med opgave 3, skriver læreren tabellen med summer på tavlen.

Opsamling på opgave 3
De sidste spørgsmål (i + j) vil være krævende for eleverne, og derfor kan denne opsamling startes før i + j) er regnet, hvis læreren vurderer at eleverne ikke selv når frem til regnereglerne for summer. Læreren spørger eleverne om resultaterne til 3a-h, og de skrives ind i tabellen. Hvis nogle elever har fundet sammenhænge, står disse på tavlen, og eleverne forklarer disse til resten af klassen. Hvis ikke, kan læreren spørge til sammenhængen mellem nogle af summerne (a-b, c-d, e-f, g-(a+e), h-(e+c)). Denne opsamling kan være mere eller mindre lærerstyret afhængig af hvordan eleverne klarer opgaverne og tiden. Hvis eleverne ikke når opg. 3j), tages denne i samspil med eleverne på tavlen efter at regnereglerne er gennemgået. Det er vigtigt at eleverne bliver introduceret for regnereglerne, da de skal bruge disse i beviset (modul 2).

Regneregler for summer
Vi kan summe en konstant: $\sum_{k=1}^{n} c = n \cdot c$
Man kan sætte en konstant uden for summationstegnet: $\sum_{k=1}^{n} c \cdot x_k = c \sum_{k=1}^{n} x_k$
Man kan dele summer op: $\sum_{k=1}^{n} (x_k + y_k) = \sum_{k=1}^{n} x_k + \sum_{k=1}^{n} y_k$ og $\sum_{k=1}^{n} (x_k - y_k) = \sum_{k=1}^{n} x_k - \sum_{k=1}^{n} y_k$

Regnereglerne skal skrives op på tavlen. Eleverne vil få regnereglerne på papir, så det er ikke vigtigt at disse noteres. Læreren udleverer opgave 4 + 5, som eleverne herefter arbejder med.

Opsamling på opgave 4 + 5a-i
Opgave 4 gennemgås ved at læreren spørger eleverne, som forklarer hvordan gennemsnittet regnes (opg. a+b), og herefter fortæller læreren den generelle formel for gennemsnit. Læreren understreger at hvis $\bar{x} = 0$, er $\sum_{k=1}^{n} x_k = 0$. Opgave 5 gennemgås ved at læreren har udvalgt 2 elever, som har gode besvarelser/forklaringer. Eleverne viser via TI-Nspire hvordan gennemsnittet flyttes i de forskellige situationer. Læreren supplerer med forklaringer og sikrer at eleverne indser at ændringen af datasættet medfører en forflytning af datasættet, og at hældningen på den bedste rette linje er uændret. Læreren gentager og illustrerer hvis nødvendigt at datasættet flyttes vertikalt med $\bar{y}$ og horisontalt med $\bar{x}$. Læreren introducerer at vi faktisk ændrer datasættet fra $(x_1, y_1), \ldots, (x_n, y_n)$ til $(x_1 - \bar{x}, y_1 - \bar{y}), \ldots, (x_n - \bar{x}, y_n - \bar{y})$ ved forskydningen. Det nye datasæt kaldes et centraliseret datasæt, da gennemsnittet af dette vil være i $(0, 0)$.

Gennemgang af opgave 5j (samme med 5k)
Læreren beviser at $\sum_{k=1}^{n} (x_k - \bar{x}) = 0$. Dette gøres ved at inddrage eleverne aktivt i udledningen. Læreren skriver summen op og spørger eleverne hvordan vi kan udlede dette. Det er vigtigt at læreren giver eleverne lidt tid til at tænke. Forhåbentlig vil eleverne genkalde regnereglen for at summe en konstant og formlen for gennemsnit. Hvis ikke, må læreren gennemgå udledningen. Der forklares hvordan 5k tilsvarende kan bevises.
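[Editorial illustration.] The sum rules and the centralisation identity that the plan asks the teacher to institutionalise are easy to check numerically. The sketch below uses an invented dataset and verifies that a constant can be pulled out of a sum, that sums split over addition, and that $\sum_{k=1}^{n}(x_k - \bar{x}) = 0$ once the mean has been subtracted.

```python
# Numerical check of the sum rules and of sum(x_k - mean) = 0 (invented data).
xs = [8, 9, 10, 11, 12, 13]
ys = [128, 133, 140, 147, 150, 156]
n = len(xs)

mean_x = sum(xs) / n                       # formula from opgave 4c: mean = (1/n) * sum(x_k)

# Rule 1: sum of a constant: sum_{k=1}^n c = n*c
assert sum(5 for _ in range(n)) == n * 5
# Rule 2: a constant factor can be moved outside the sum
assert sum(5 * x for x in xs) == 5 * sum(xs)
# Rule 3: a sum of terms can be split
assert sum(x + y for x, y in zip(xs, ys)) == sum(xs) + sum(ys)

# Centralisation (opgave 5j): subtracting the mean makes the sum zero
print(sum(x - mean_x for x in xs))         # prints 0.0 (up to rounding)
```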
Læreren understreger at denne sum altid vil være 0, og at vi derfor altid kan centralisere et datasæt ved at trække gennemsnittet fra. Hvis tiden er knap, gennemgås 5k ikke.

Opsamling på modul 1
Læreren gennemgår de vigtigste pointer fra dagens modul, som er skrevet på et slide.
• Definition af bedste lineære model: Givet datasæt $(x_1, y_1), \ldots, (x_n, y_n)$. Den bedste rette linje $y = ax + b$ findes ved at minimere $\sum_{k=1}^{n} (y_k - y)^2 = \sum_{k=1}^{n} (y_k - (ax_k + b))^2$
• Regneregler for summer: Vi kan summe en konstant, $\sum_{k=1}^{n} c = n \cdot c$. Man kan sætte en konstant uden for summationstegnet, $\sum_{k=1}^{n} c \cdot x_k = c \sum_{k=1}^{n} x_k$. Man kan dele summer op, $\sum_{k=1}^{n} (x_k + y_k) = \sum_{k=1}^{n} x_k + \sum_{k=1}^{n} y_k$ og $\sum_{k=1}^{n} (x_k - y_k) = \sum_{k=1}^{n} x_k - \sum_{k=1}^{n} y_k$.
• Et datasæt $(x_1, y_1), \ldots, (x_n, y_n)$ med $\bar{x} = 0$ og $\bar{y} = 0$ kaldes et centraliseret datasæt. Et datasæt kan centraliseres ved at forskyde datasættet vertikalt med $\bar{y}$ og horisontalt med $\bar{x}$.

Modul 2: Fredag d. 10/4 8.00-9.35

Formål/indhold
• Formlen for bedste rette linje for et centraliseret datasæt
• Formlen for bedste rette linje for et ikke-centraliseret datasæt
• Introduktion til bedste eksponentiel- og potensmodel.

I dette modul er det vigtigt at eleverne har tid til at arbejde med formlen for bedste rette linje. Jeg har lavet en enkelt opgave, hvor eleverne skal bruge formlen til at beregne bedste rette linje, men da fokus er på formlen og beviset, skal denne kun udleveres og laves, hvis der er tid til det. Læreren må afgøre hvorvidt der er tid til denne. Læreren skal prioritere opgave 6 højere end introduktionen til eksponentiel- og potensmodeller, da denne kan udskydes til modul 3.

Opsamling på vigtige pointer fra modul 1
Modul 2 indledes med en opsamling af de vigtigste pointer fra gårsdagens undervisning. Læreren spørger eleverne til vigtige pointer fra modul 1, og disse skrives på tavlen. Læreren skal sikre at pointerne fra opsamlingen på modul 1 kommer på tavlen, da eleverne skal bruge regnereglerne til at udlede formlen for bedste rette linje.

Introduktion til bevis af bedste rette linje
Læreren introducerer efterfølgende opgaven om at udlede formlen for bedste rette linje for et datasæt. Læreren pointerer hvorfor vi finder denne, og læreren kan motivere eleverne ved at trække en parallel til formlen for to punkter og forklare at vi nu skal finde en formel for a og b, når vi har mere end to punkter. Læreren understreger at TI-Nspire bruger denne formel til at bestemme den bedste rette linje (lineær regression), intet mystisk. Læreren illustrerer grafisk hvordan vi kan flytte datasættet, så det bliver centraliseret, og fortæller at dette gør det lettere at finde a og b. Læreren introducerer at eleverne nu skal finde forskriften for a, b, når datasættet er centraliseret. Læreren giver eleverne opgaven med trin 1) for forskrift for ret linje. Eleverne skal i grupperne arbejde med trin 1 i opgaven. Når eleverne er færdige med trin 1), udleveres trin 2), som eleverne herefter arbejder med. Læreren skal være opmærksom på at redegørelsen for kvadratet kan være krævende for eleverne, da de normalt kun udregner kvadratet af toleddede størrelser. Læreren kan hjælpe eleverne ved enten at huske dem på hvordan man ganger parenteser ud eller at bruge kvadratsætningen på $y_k$ og $(ax_k + b)$. Læreren har mulighed for at reagere undervejs og skal observere om der er behov for at gennemgå trin 1), hvor summen omskrives, inden eleverne arbejder videre med trin 2), hvor summen minimeres.
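[Editorial illustration.] Step 1 of the proof, expanding $(y_k - (ax_k + b))^2$ and splitting the sum, can be checked symbolically. The sketch below uses SymPy and is my own addition; the teaching material does the expansion by hand.

```python
# Symbolic check of step 1 of the proof: expand the squared residual.
import sympy as sp

a, b, x, y = sp.symbols('a b x_k y_k')

expanded = sp.expand((y - (a * x + b))**2)
print(expanded)
# The expansion is a^2*x_k^2 + 2*a*b*x_k - 2*a*x_k*y_k + b^2 - 2*b*y_k + y_k^2.
# Summing over k and using the centralisation assumption sum(x_k) = sum(y_k) = 0,
# the terms 2ab*sum(x_k) and 2b*sum(y_k) vanish and sum(b^2) = n*b^2, leaving
#   a^2 * sum(x_k^2) - 2a * sum(x_k*y_k) + sum(y_k^2) + n*b^2,
# i.e. the form a^2*A - 2a*B + C + n*b^2 that step 2 minimises.
```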
Hvis læreren vurderer at eleverne har brug for at få valideret trin 1), inden der arbejdes videre med trin 2), kan hun gøre dette. Udledningen af formlen behøver ikke at blive gennemgået samlet, men kan deles op i de to trin.

Gennemgang af bevis for centraliseret data
Udledningen af formlen for a og b gennemgås enten samlet eller i to omgange, afhængig af hvad læreren vurderer. Udledningen foregår ved at læreren spørger eleverne om forklaringer til beviset. Beviset er skrevet på forhånd, og de enkelte skridt vises løbende, mens eleverne forklarer udregningerne. Læreren validerer og trækker paralleller til gårsdagens og tidligere kendt viden. Læreren understreger at vi nu har fundet en formel til at beregne den bedste linje for et centraliseret datasæt. Læreren skriver på tavlen:

Givet et datasæt $(x_1, y_1), \ldots, (x_n, y_n)$ hvor $\bar{x} = 0$ og $\bar{y} = 0$, er den bedste rette linje $y = ax + b$ givet ved
$a = \frac{\sum_{k=1}^{n} x_k y_k}{\sum_{k=1}^{n} x_k^2}$ og $b = 0$

Udledning af formel for ikke-centraliseret data
Efter gennemgangen af beviset for centraliseret data går læreren tilbage til det generelle datasæt, hvor data ikke er centraliseret. Læreren viser igen datasættet fra introduktionen og spørger eleverne hvordan vi kan bruge formlen for et centraliseret datasæt til at bestemme a, b for et generelt datasæt. Læreren giver eleverne 1-2 min til at tænke over dette. Forhåbentlig vil nogle elever komme med forslag om at vi kan rykke datasættet, så det bliver centraliseret, og læreren kan tage udgangspunkt i dette ved at spørge ind til hvordan datasættet så ændres fra $(x_k, y_k)$ til $(x_k - \bar{x}, y_k - \bar{y})$. Herefter skal læreren illustrere via en skitse at hældningen på datasættet ikke ændres, og derved forklare at vi kan bruge formlen fra før med de nye værdier for datasættet. Derefter viser læreren hvordan vi kan bestemme b.

Forklaring: Betragt datasættet $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ med n punkter, hvor $n \geq 2$. Vi ved at for et centraliseret datasæt er forskriften for $y = ax + b$ givet ved
$a = \frac{\sum_{k=1}^{n} x_k y_k}{\sum_{k=1}^{n} x_k^2}$ og $b = 0$
For et vilkårligt datasæt $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, som har tyngdepunkt i $(\bar{x}, \bar{y})$, kan vi centralisere datasættet, så det får tyngdepunkt i $(0, 0)$. Vi centraliserer datasættet ved at forskyde det både vertikalt og horisontalt. Det centraliserede datasæt bliver da $(x_1 - \bar{x}, y_1 - \bar{y}), (x_2 - \bar{x}, y_2 - \bar{y}), \ldots, (x_n - \bar{x}, y_n - \bar{y})$. Forskydningen af det nye datasæt ændrer ikke på hældningen for den bedste rette linje, så hældningen bliver:
$a = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}{\sum_{k=1}^{n} (x_k - \bar{x})^2}$
Vi ved at $b = 0$ for det centraliserede datasæt. Vi har derfor at
$y - \bar{y} = a(x - \bar{x}) \Leftrightarrow y = a(x - \bar{x}) + \bar{y}$
Dette giver at $b = \bar{y} - a\bar{x}$.

Det er vigtigt at læreren trækker paralleller til modul 1 og illustrerer grafisk hvordan et datasæt kan centraliseres, for at opnå forståelse for at vi nu har udledt formlen for a og b i det generelle tilfælde. Læreren forklarer at vi nu har fundet en formel for a og b, når vi har mere end 2 punkter, og skriver afslutningsvis på tavlen:

For et datasæt $(x_1, y_1), \ldots, (x_n, y_n)$ er den bedste rette linje $y = ax + b$ givet ved
$a = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}{\sum_{k=1}^{n} (x_k - \bar{x})^2}$ og $b = \bar{y} - a\bar{x}$

Læreren fortæller at vi nu har forklaret matematisk hvordan TI-Nspire udregner den bedste rette linje, og at vi faktisk også selv kan udregne den. Læreren vurderer herefter, afhængigt af tiden, om opgave 6 skal regnes eller droppes.

Introduktion til bedste eksponentiel-/potensmodel
Timen afsluttes med at læreren forklarer at der også eksisterer tilsvarende formler for eksponentiel og potens regression, men at vi ikke skal bestemme disse.
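[Editorial illustration.] The formula the teacher writes on the board can be checked directly against a standard least-squares fit. In the sketch below the data are invented and numpy.polyfit stands in for TI-Nspire's linear regression; the derived formula and the fitted coefficients agree up to rounding.

```python
# Check that the derived formula gives the same line as a least-squares fit.
import numpy as np

x = np.array([8.0, 9.0, 10.0, 11.0, 12.0, 13.0])      # invented data
y = np.array([128.0, 133.0, 140.0, 147.0, 150.0, 156.0])

x_bar, y_bar = x.mean(), y.mean()

# Formula from the proof: a = sum((x_k - x_bar)(y_k - y_bar)) / sum((x_k - x_bar)^2),
# b = y_bar - a * x_bar
a = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b = y_bar - a * x_bar

a_fit, b_fit = np.polyfit(x, y, 1)                     # degree-1 least-squares fit
print(a, b)          # same values, up to rounding,
print(a_fit, b_fit)  # as the built-in regression
```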
Læreren viser, via eksempel, hvordan data der tilnærmelsesvis kan beskrives ved en eksponentiel/potens model kan transformeres til at blive lineær. Læreren viser hvordan data transformeres i TI-Nspire og laver plot af data. Læreren inddrager eleverne ved at spørge om de kan forklare/huske, hvorfor transformeringen af data giver en lineær sammenhæng. Herefter forklarer læreren at modellen kan findes ved lineær regression på det transformerede datasæt. Læreren skal pointere at ved at ændre datasættet kan vi opnå en lineær sammenhæng, som vi kan lave lineær regression på. Denne introduktion kan rykkes til modul 3, hvis der ikke er tid til det i slutningen af timen. Læreren giver eleverne opgave 7-9 for som lektie. Modul 3: Mandag d. 13/4 9.50-11.25 Formål/indhold • Valg af model (formål) • Matematik kan ikke udelukkende bruges til at vælge model. Inddrage viden omkring datasæt. Ingen modeller er rigtige, men nogle er brugbare Opsamling på opgave 7-9 Eleverne forventes at have lavet opg. 7-9 hjemme. Læreren udvælger to elever, som hhv. gennemgår opg. 8 og 9. Læreren supplere med besvarelser og præciserer svar. Herefter introducerer læreren problemet med bestemmelse af model. Læreren udleverer opgave 10-11 og eleverne arbejder med disse. Opsamling på opgave 10 + 11 Opsamling på opgave 10+11 starter, når læreren fornemmer at eleverne er igennem opgaverne, hvis opgaverne tager kortere tid end afsat. Læreren udvælger to grupper, som gennemgår opg. 10 og 11. Læreren validerer elevernes besvarelser og præciserer svarene. Læreren skal understrege at selvom en model passer godt på data i et interval kan vi ikke være sikre på at modellen kan bruges til at lave fremskrivninger (10d) og der derfor også skal være faglig viden. Læreren skal sikre at eleverne får diskuteret og begrundet valg af model i opg. 11. Mulige begrundelser til at vælge model i opg. 11: • Data følger tilnærmelsesvis en lineær tendens • Begge modeller passer godt på data, da vi har r2 = 0.999 for begge modeller. • Beregning af løberens hastighed sek/km. Dette viser at løberens hastighed ikke er konstant, hvormed en lineær model ikke kan bruges. • Viden om løb: Når distancen øges mindskes hastigheden. Derfor kan den lineær model ikke bruges, da denne forudsætter at hastigheden er uafhængig af distancen. Vi har at tiden øges med 299.47sek pr. km. • Potens model. Ved en potens model vil hastigheden ikke være konstant og vi har at løberens hastighed øges med antallet af kilometer, som vi må forvente. • Ved den lineære model får vi at løberen vil løbe med 297sek/km på maraton, dvs. at han kan løbe et maraton med samme hastighed som 25km. Ved potens model vil løberen løbe med 328.9sek/km på maraton, dvs. løbe over 30sek langsommere hver km end ved 25km. Hvis eleverne ikke selv inddrager r2 -værdien skal læreren inddrage denne. Eleverne arbejder herefter videre med opgave 12+13. Opsamling på opgave 12 + 13 Læreren kan påbegynde opsamling på opgave 12 + 13, når hun fornemmer at eleverne har arbejdet med opgave 13 og er gået i stå med denne. Det forventes at eleverne går i stå med opgave 13, da denne ikke ligner tidligere opgaver og at eleverne ikke kender til model som kan beskrive denne form for udvikling. Læreren udvælger to elever som gennemgår opgave 12. Opsamlingen foregår ligesom ved opgave 10+11. Ud fra lærerens observation af arbejdet med opgave 13 vælger læreren om eleverne skal gennemgå denne eller om hun selv gennemgår den ved at inddrage eleverne. 
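[Editorial illustration.] The linearisation demonstrated in TI-Nspire, and the use of $r^2$ when choosing between model types, can be sketched as follows with invented data: an exponential model $y = b \cdot a^x$ becomes linear in $(x, \ln y)$, a power model $y = b \cdot x^a$ becomes linear in $(\ln x, \ln y)$, and the $r^2$ of the two transformed linear fits can then be compared. This is my own sketch of the idea, not a description of the CAS's internals.

```python
# Fit exponential and power models via log-transforms and compare r^2
# of the two transformed linear fits (invented data).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([3.1, 4.4, 6.5, 9.2, 13.6, 19.8])

def linear_r2(u, v):
    """Fit v = a*u + c by least squares and return (a, c, r^2)."""
    a, c = np.polyfit(u, v, 1)
    resid = v - (a * u + c)
    r2 = 1 - np.sum(resid**2) / np.sum((v - v.mean())**2)
    return a, c, r2

# Exponential: y = b*a^x  <=>  ln y = ln b + x*ln a  (linear in x)
k_exp, lnb_exp, r2_exp = linear_r2(x, np.log(y))
print("exponential: a =", np.exp(k_exp), "b =", np.exp(lnb_exp), "r2 =", r2_exp)

# Power: y = b*x^a  <=>  ln y = ln b + a*ln x  (linear in ln x)
a_pow, lnb_pow, r2_pow = linear_r2(np.log(x), np.log(y))
print("power:       a =", a_pow, "b =", np.exp(lnb_pow), "r2 =", r2_pow)
```

For this invented dataset the exponential fit has the higher $r^2$, mirroring the kind of comparison the students are asked to make in opgave 10-11; as the plan stresses, the $r^2$ comparison alone does not settle the choice of model.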
Det er vigtigt at læreren får pointeret at der aldrig findes en rigtig model til at beskrive et datasæt, men at nogle modeller selvfølgelig er mere brugbare end andre. Det er vigtigt at læreren understreger at formålet med modellen er vigtigt, når man vælger model, ved at inddrage opgave 10 + 12.

Opsamling på valg af model
Læreren samler op på de væsentlige teknikker og redskaber til at vælge model, når denne ikke kendes, ved at inddrage eleverne. De forskellige teknikker, overvejelser og muligheder for at afgøre valg af model skrives på tavlen.
• Hvad er formålet med modellen?
• Hvad ved vi om data? (viden om data)
• Plot (logaritme-plot)
• Forklaringsgraden r²
• Ekstrapolation / interpolation
Læreren understreger endnu en gang at der ikke findes noget kriterie for at vælge hvilken type af model der bedst egner sig til at beskrive et datasæt, men når vi har bestemt hvilken type af model, så kan vi begrunde med matematikken hvilken der er bedst.

a.5 the realized didactic process

Modul 1: 11 elever (4 fraværende). Der dannes tre grupper. Opsamling på opg. 4 + 5a-i, gennemgang af 5j-k + opsamling på modul 1 udskydes.
Tid    Aktivitet
8min   Introduktion til bedste rette linje
20min  Opgave 1 + 2
12min  Opsamling opg. 1 + 2. Introduktion til bedste rette linje
12min  Opgave 3
10min  Opsamling opg. 3
8min   Pause
7min   Læreren gennemgår teknik til 4a+b + opg. 5
18min  Opgave 4 + 5
Opgaver der ikke nås/droppes: 3i+j (3j til modul 2).

Modul 2: 14 elever (1 fraværende, 4 nye elever). De nye elever fordeles i de eksisterende grupper. Der går 5min før timen starter. Gennemgang af 5j-k blev droppet. Gennemgang af bevis samt introduktion til bedste eksponentiel/potens model udskydes til modul 3. Opgave 6 droppes.
Tid    Aktivitet
11min  Opgave 5
5min   Opsamling på modul 1
9min   Opsamling opg. 5. Introduktion til bevis
20min  Bevis. Trin 1
7min   Pause
15min  Opsamling trin 1
22min  Trin 2
Opgaver der ikke nås/droppes: 5g-5k, dropper 5j-k, b+c (kun 1 gruppe når til c).

Modul 3: 14 elever. 3 grupper. Opgave 8 blev ikke gennemgået, da kun to havde lavet lektier.
Tid    Aktivitet
20min  Gennemgang af bevis (trin 2 for centraliseret data) samt det generelle tilfælde
6min   Introduktion til bedste eksponentiel/potens model
8min   Opsamling på opg. 9
24min  Opgave 10
7min   Pause
18min  Opgave 11
6min   Opsamling opg. 11 + forløb
Opgaver der ikke nås/droppes: Opgave 12 + 13 blev droppet, da aktiviteter fra modul 2 blev medtaget.
a.6 the students' solutions

Bedste rette linje

Opgave 1:
a) Lav et plot af punkterne. Kan gøres ved kommandoerne 3: Data, 9: Hurtiggraf. Klik på akserne for at tilføje de to variable.
b) Indtegn den linje, du synes passer bedst til punkterne. Linjen indtegnes ved hjælp af kommandoen 4: Undersøg data, 2: Tilføj flytbare linjer. Hvilken forskrift har linjen?
Mange svarmuligheder: f(x) = 34x + 276, m(x) = 36,6x + 274, f(x) = 37.9x + 275, y = 38,5x + 273, f(x) = 37x + 275, f(x) = 37.1x + 281, f(x) = 33x + 301, f(x) = 37x + 282, 34.9x + 278, f(x) = 38.05x + 270
c) Begrund dit valg af linje: hvorfor valgte du netop den linje i b)? Prøv at formulere din begrundelse på en måde, så du kan være sikker på, at andre ville finde frem til præcis samme linje som dig, hvis de hørte din begrundelse.
Begrundelser:
• Har prøvet at sætte den nogenlunde i midten af punkterne samt at den skulle starte i 0
• Går gennem grafens startpunkt (venstre hjørne) for at kunne få udviklingen fra år 0, hvor der allerede er en længde.
• Grunden til at jeg placerede linjen der, er fordi jeg har rettet den til at den starter i (0,0).
• Jeg valgte denne linje, da jeg synes den passer bedst til punkterne. Jeg fandt hældningen (a) først, så den lå på 38,5.
• At den rette linje skulle være så tæt på flest punkter som muligt.
• Jeg valgte denne linje, da det var der, hvor flest punkter blev ramt. Desuden satte jeg "starten" af linjen ved skæringen af y- og x-aksen. 5 punkter bliver her ramt. Vi havde glemt at den ikke skal være i (0,0).
• Linjen er sat efter at skabe det tætteste gennemsnit af alle punkterne. Hvilket vil sige man prøver at berøre alle punkter i en vis grad.
• Jeg tænkte at linjen skulle ramme så mange punkter som muligt, hvilket betød at linjen ikke rammer punkterne helt, men snitter dem.
• Jeg valgte linjen, da det var den linje jeg synes beskrev sammenhængen mellem punkterne bedst, og desuden er den placeret i midten af punkterne.

Opgave 2:
a) Lav et plot af punkterne.
b) Indtegn den linje, du synes passer bedst til punkterne. Hvilken forskrift har linjen?
Mange svarmuligheder: m(x) = 8.69x + 50, f(x) = 9x + 49, f(x) = 8.42x + 51, 8.63x + 48, f(x) = 9.53x + 46, f(x) = 9x + 48 (eget gæt: f(x) = 7.43x + 65), f(x) = 8.85x + 51, f(x) = 3.63x + 102 ("linjen har jeg placeret i midten af alle punkterne").
c) Begrund dit valg af linje: hvorfor valgte du netop den linje i b)? Prøv at formulere din begrundelse på en måde, så du kan være sikker på, at alle vil finde frem til præcis samme linje som dig, hvis de hører din begrundelse. Er dine begrundelser de samme som i forrige opgave? Hvorfor/hvorfor ikke?
Begrundelser:
• Har prøvet at lave en linje som passer bedst muligt. Lagt den nogenlunde i midten, samt den skærer i 0.
• 3 underliggende punkter blev ramt, og derfor valgte jeg denne. Intet punkt ligger nemlig langt væk. Starten blev sat ved skæringen af y- og x-aksen.
• Nogenlunde samme begrundelse som før, men da der er mange punkter i øst og vest, laves (sættes) linjen her mere ud fra hvad jeg mener er gennemsnittet af punkterne.
• Linjen har jeg placeret i midten af alle punkterne.
• Igen har jeg prøvet at lægge linjen, så den ligger tættest på alle punkterne.
De næste spørgsmål skal besvares i grupperne. d) Sammenlign forskrifterne for den bedste linje i opg. 1 + 2. Har I fået den samme? Forklar hver af jeres overvejelser og begrundelser fra 1c) og 2c).ă • Ikke de samme, men nogen der minder om hinanden. Vi har samme fremgangsmåde • I opgave 1 der har vi meget ens ligninger. Samme i opgave 2. • Næ- kun en havde ret e) Diskuter hvad I forstår ved den bedste rette linje. Find kriterier for den bedste linje. • Den der er i midten. Den der er tættest på flest mulige punkter. • Den linje der er i midten og rammer flest mulige punkter. • Tættest på flest mulige punkter (nogenlunde i midten) Noter: N-spire finder regression ved at finde den mindste sum af kvadraterne (axk + b))2 = mindst mulige kvadrat sum. Man skal minimere. P (yk Summer Opgave 3 Udregn summerne. Skriv udregningen til højre på papiret. Skriv resultatet i tabellen. a) Udregn 4 X 2=2+2+2+2=8 k=1 b) Udregn 4 X 3 = 3 + 3 + 3 + 3 = 12 k=1 c) Udregn d) Udregn e) Udregn 4 X k 2 = 12 + 22 + 32 + 42 = 30 k=1 4 X k=1 4 X 4k 2 = 4 · 12 + 4 · 22 + 4 · 32 + 4 · 42 = 120 5k = 5 · 1 + 5 · 2 + 5 · 3 + 5 · 4 = 50 k=1 f ) Udregn 5 · g) Udregn h) Udregn 4 X 4 X (2 + 5k) = (2 + 5 · 1) + (2 + 5 · 2) + (2 + 5 · 3) + (2 + 5 · 4) = 58 k=1 4 X k=1 k = 5(1 + 2 + 3 + 4) = 50 k=1 (5k + k 2 ) = (5 + 12 ) + (10 + 22 ) + (15 + 32 ) + (20 + 42 ) = 80 i) Betragt summerne og dine resultater i tabellen. Redegør for sammenhængen mellem nogle af summerne. Diskuter i grupperne og forklar sammenhængene. j) Udregn når det oplyses at P100 k=1 100 X (2k 2 + k) k=1 k 2 = 338350 og P100 k=1 k = 5050 Opgave 4 I skal nu arbejde videre med datasættet fra opgave 2 med alder og højde. a) Beregn gennemsnittet af alderen i datasættet fra opgave 2. Gør rede for hvad dette tal betyder. 10.47 ⇠ 10.5år. At gennemsnits alderen er 10.5 år. Tallet er den gennemsnitlige alder af de 21 personer b) Beregn gennemsnittet af højden i datasættet fra opgave 2. Gør rede for hvad dette tal betyder. 142.43. At gennemsnits højden er 142.43. Den gennemsnitlige højde afhængig af alderen af de 21 personer. c) Opstil en generel formel til at udregne gennemsnittet (x) af x-værdierne (x1 , x2 , . . . , xn ) ved brug af sumtegn. Pn x k x = Pk=1 n n xP = k=1 xk /n n 1 x n Pk=n k n 1 k=1 n Pxnk 1 x = n k=1 xk d) Opstil en generel formel til at udregne gennemsnittet (y) af y-værdierne (y1 , y2 , . . . , yn ) ved brug af sumtegn. Pn y k y = Pk=1 n n y= Pn k=1 yk /n 1 y n Pk=n k n 1 k=1 n Pyk y = n1 nk=1 yk Opgave 5 a) Plot datasættet fra opgave 2. Indtegn linjer, der viser gennemsnittet af x og y. Linjen, der viser gennemsnittet af x indtegnes ved 4: Undersøg data, 8: Plot værdi. Linjen, der viser gennemsnittet af y indtegnes ved 4: Undersøg data, 4: Plot funktion. Funktionen, der skal plottes, er y = y. b) Bestem den bedste rette linje for datasættet i a). Skriv forskriften for linjen. f (x) = 4.755x + 92.6 f (x) = 4.75504x + 92.6138 f (x) = 4, .6x + 99 y = 4.20x + 100 f (x) = 4.15x + 99 c) Tilføj en kolonne til datasættet, hvor du udregner yk y for hvert punkt i datasættet. Lav et plot af xk og yk y. Forklar hvad der er sket med placeringen af datasættet i forhold til plottet i a). • Det er rykket længere ned på koordinatsystemet. • Centraliseret datasæt. • Når man trækker yk y går datasættet ned i koordinatsystemet. Når man trækker xk x går datasættet mod venstre. • Punkterne ligger på samme måde. De er bare forskudt. • De ligger på samme måde, men er blevet forskudt. 
• Prikkerne har ikke ændret udseende, men koordinatsættet er gået i minus • Man kan se på plottet at det starter i negativ ligeledes som på grafen • De ligger stadig ens i forhold til hinanden, men de er rykket i grafen. d) Bestem den bedste rette linje for datasættet plottet i c). Skriv forskriften for linjen f (x) = 4.755x 49.81 f (x) = 5.4x 0.83 y = 5.5x + 0.9 f (x) = 3.72x 0.1 P e) Udregn nk=1 (yk y). =0 Udregnet til 3.1E 11 ⇡ 0 e 11 3P ⇡0 n y) = 3E 11 = 0 k=1 (yk f) Tilføj en ny kolonne til datasættet, hvor du udregner xk x for hvert punkt i datasættet. Lav et plot af xk x og yk y. Forklar hvad der er sket med placeringen af datasættet i forhold til plottet i a) og c). Placeringen er centraliseret (er placeret omkring 0) g) Bestem den bedste rette linje for datasættet i f). Skriv forskriften for linjen. h) Sammenlign forskrifterne bestemt i b), d) og g). Gør rede for sammenhængen mellem forskrifterne. P i) Udregn nk=1 (xk x). j) Vis, ved brug af regnereglerne for summer og formlen for gennemsnittet (fra opg. 4c), at det altid gælder at n X (xk x) = 0 k=1 k) Vis, ved brug af regnereglerne for summer og formlen for gennemsnittet (fra opg. 4d), at det altid gælder at n X (yk y) = 0 k=1 Bestemme bedste rette linje for et centraliseret datasæt I skal bestemme den bedste rette linje y = ax + b for et datasæt (x1 , y1 ), (x2 , y2 ), . . . , (xn , yn ), bestående af n punkter, hvor n P 2. ViPantager at datasættet er centraliseret, dvs. at x = n1 nk=1 xk = 0 og y = n 1 k=1 yk = 0. n Vi skal bestemme a og b, så kvadratsummen n ✓ X yk axk + b k=1 ◆2 bliver mindst mulig. Trin 1) Det første trin består I at omskrive summen, så at vi kan minimere denne i trin 2. a) Gør rede for at yk Mulige teknikker: yk (axk + b) 2 (axk + b) = yk 2 = yk2 2axk yk (axk + b) · yk 2byk + a2 x2k + 2abxk + b2 (axk + b) = yk axk b · yk axk b = yk2 yk axk yk b axk yk + a2 x2k + axk b = yk2 2axk yk 2byk + a2 x2k + 2abxk + b2 byk + baxk + b2 yk (axk + b) · yk (axk + b) regne direkte på denne 2 2 yk + (axk + b) 2yk (axk + b) 2 2 2 2 = yk + a xk + b + 2axk 2yk axk + 2yk b (yk (yk yk 2 (axk + b)) · (yk (axk + b)) axk b) · (yk axk b) axkyk byk axyyk 2 + axk 2 + axkb Herunder ses de forskellige besvarelser: ykb + axkb + b2 P P 2 b) Vi har at nk=1 yk (axk + b) = nk=1 yk2 2axk yk 2byk + a2 x2k + 2abxk + b2 . Brug regnereglerne for summer (1-4) til at dele summen op, sætte a, b foran summationen og summe over b. c) Brug antagelsen om at gennemsnittet af x og y er 0 til at simplificere udtrykket fra b). Trin 2) a) Gøre rede for hvad A, B, C svarer til i summen og forklar hvorfor A, B, C er konstanter. • Viser med streger hvad der er A, B, C. A, B, C er konstanter, da det er de givne tal. • Har skrevet A, B, C under de tre summer. De er konstanter fordi de ikke afhænger af a og b og fordi punkterne er fastlagt. P P P • A = nk=1 x2k , C = nk=1 yk , B = nk=1 xk yk . for at gøre det lettere at regne med. • De er konstanter fordi de ikke afhænger af a og b. Punkterne er fastlagt, derfor er det også en konstant. • De er konstanter, fordi de ikke afhænger af de to variabler a,b og fordi x og y er fastlagt. • Vi har fået opgivet nogle værdier, så derfor er de konstanter. • Konstanterne A, B, C svarer til de værdier som bliver opgivet som datasæt. • x og y er variable, mens A, B, C er konstanter, fordi disse er givet i et datasæt. • Fordi punkterne er fastlagt. b) Udtrykket a2 A 2aB + C + nb2 kan betragtes som en sum af to andengradspolynomier med hhv. a og b som variable. Skriv de to andengradspolynomier som udtrykket består af. 
• a2 A • 1) a2 A 2aB. nb2 + C. 2aB + C. 2) nb2 • Vi kan variere a og b. Ene polynomier: a2 A 2aB +C. Andet polynomier: nb2 . • Vi kan variere a og b. a2 A 2 • Aa + 2Ba + C = 0 • Aa2 2Ba = 0 2aB + C. 2 nb + C = 0 nb2 + C = 0 • a2 A 2aB omskrevet til ) Aa2 + 2Ba + 0 = 0 og nb2 + C omskrevet til ) nb2 + c = 0. c) Bestem a og b, så a2 A 2aB + C + nb2 bliver mindst mulig. Brug fra b) at udtrykket kan betragtes som to andengradspolynomier, som hver skal minimeres. b=0 fordi det skal være mindst muligt. b=0. Toppunktformlen: 2ab ) 2B :B 2A A b=0 Toppunktsformlen 2B =B 2A PA xkyk a= xk2 b=0 Toppunktsformlen: 2B =B 2A PA a = Pxxk 2yk b 2a b 2a k d) Opskriv værdierne for a og b som gør kvadratsummen mindst mulig. Pn k=1 ✓ yk axk + b ◆2 Opgave 7: Verdensrekorder a) Bestem a og b i modellen b) Hvad kan vi forvente at verdensrekorden bliver i 2014 ifølge modellen? c) Verdensrekorden blev i 2014 slået, så den nu er på 7377. Hvordan passer det med c)? Benyt oplysningen til at kommentere på modellens rækkevidde. Opgave 8: Dykkersyge a) Opstil en model, der beskriver sammenhængen mellem dybde og antal minutter. Bestem forskriften for modellen. b) Benyt modellen til at udregne hvor længe dykkeren kan opholde sig på en dybde på 30m. c) Benyt modellen til at udregne den dybde, dykkeren kan dykke, hvis han gerne vil dykke så dybt som muligt og opholde sig på dybden i 20min. Opgave 9: Vindmølleenergi a) Opstil en model, der beskriver sammenhængen mellem årstal og kapaciteten af vindmølleenergi. Bestem forskriften for modellen. b) Benyt modellen til at beregne kapaciteten af vindmølleenergi i 2014 og sammenlign med den faktiske kapacitet af vindmølleenergi i 2014, som var 369.6M W . Opgave 10: Spædbørnsdødelighed Besvarelser: a) Model: bax . Den eksponentielle funktion har den højeste r2 -værdi. b) Ved regression a = 0.96, b = 1.48. c) ba2008 d) Det passer ikke. a) Eksponentiel regression passer bedst, da r2 -værdien er tættest på 1. b) a = 0.955 og b = 1.484 · e40 c) f (x) = 1.484 · e40 · 0.9552008 d) Modellen afviger en smule og er derfor ikke helt troværdig a) Forklaringsgraden er næsten lige så god for dem begge. Derfor har jeg bare valgt den eksponentielle funktion f (x) = b · ax . b) f (x) = 1.484 · 0.955x c) Promillen for børnedødelighed i 2008 er på 2.402. d) Dårligt - modellen vil ikke nødvendigvis være realistisk. Det kan ikke fortsætte med at falde så meget og det er ikke præcist at bruge det til at forudsige noget. a) Ud fra datasættet er det svært at bestemme om det er en potens eller eksponentiel funktion. Men i dette tilfælde vælges der eksponentiel funktion f (x) = b · ax . b) Tager det ud fra år efter 1933. f (x) = 73.58 · (0.96)x c) 73.58 · (0.96)7 5 = 3.44 a) f (x) = 1.48368·1040 ·(0.955397)x . Vi tjekkede forklaringsgraden. De passede begge godt, men den eksponentielle passede best. Jo tættere på 1, jo bedre passer den. b) Se a) c) x sættes til 2008 og det beregnes at være 2.40192 d) Det passer dårligt. Modellen er et udtryk for en sammenhæng der ikke nødvendigvis er. Opgave 11: Løb Besvarelser: a) a = 299.5 og b = 103.5 b) a = 1.1, b = 219.9. 13879.9 a) f (x) = 299.467x 103.53.f (x) 12532.5sek.208.875min = 3.48timer. b) 4.55timer. c) Nej. = 299.467 · 42.195 103.54 = a) 299.467x +-103.544 Et maraton tager 12474.1sek b) f (x) = 219 · x1.1076 . c) Potens funktion - 13809sek. Lineær funktion - 12474.1sek. d) Jeg ville bruge potens funktion, da der i den lineære skal vedkommende løbe med en konstant fart. Hvilket nok ikke er så realistisk. a) a = 299.467, b = 103.544. 
f (x) = 299.467 · 45.195 + ( 103.544) = 12532.5sek tager et maraton. b) a = 219.913, b = 1.1076. f (x) = 219.913x1.1076 . Maraton = 13880sek. c) Nej! 13880 12532.5 = 1347.5sek forskel på de to modeller. d) Potensfunktionen, da er mere virkelighedstro, da der er svingende hastighed. a) f (x) = 299.467x 103.544. Tid om et maraton: 12532.5sek. b) f (x) = 219.913 · x1,1076 . Sekunder for et maraton: 13879.9 c) Nej det er den ikke. d) Vi ville bruge potensfunktionen, da man ikke kan holde den samme fart i 42km. Selvom at forklaringen passer bedst for den lineære funktion. 202 list of references a.7 the transcriptions in danish 1. Gruppe 1 diskuterer valg af linje i opg. 1 C: Hvilken forskrift har I fået for jeres linje? (intet svar) D: Den er faktisk meget god. Men jeg ved ikke hvorfor. Vi har prøvet at tage hensyn til at der er sådan nogenlunde samme mængde på begge sider. Altså med sådan afstand. B: Altså det er ikke et punkt er det. A: Jo. D: Men du kan se at der er to som er rimelig langt væk fra.. Fra vores linje. Og der er også to her der ligger rimelig meget på. A: Det faktisk svært at begrunde. 2. Gruppe 2 diskuterer valg af linje i opg. 2 B: Men hvad er jeres begrundelser for at I har valgt den? C: At den skulle ramme så mange punkter som muligt. A: Ramme, altså ramme dem. C: Ja, sådan her. A: Så det er lige meget den skulle bare ramme. B: Hvad med den der sidder i midten? D: Der må være nogle der er uden for. C: Man kunne måske godt have lagt den sådan lidt mere hernede. Det ville nok være mere rigtigt. A: Altså jeg har skrevet det der med at der skal være nogenlunde lige mange punkter med samme side. Altså med samme afstand til linjen. C: Men også bare. Hvis man ligger den her. Så er de jo bare virkelig langt væk fra denne her. Fra det her punkt. 3. Gennemgang af opgave 2 på tavlen. Eleverne kommer med kriterier C: At afstanden fra punkterne til linjerne skal være nogenlunde lige stor på begge sider L: Så tænker du noget med at den her afstand ind til linjerne skal C: Ja, altså alle punkterne tilsammen skal have nogenlunde samme afstand. C: Altså alle punkterne på den ene side, deres samlede afstand skal være nogenlunde den samme som den anden side. L: Ja, så den samlede afstand. C: Ja, skal være lige stor på begge sider. L: Ja. 4. Kommentarer i gruppe 1, da læreren viser den bedste rette linje C: Arg B: Det kan du godt glemme alt om. A: Fuck. C: Så ved man at man kan ikke bruge øjemål. Det er (laver en list of references skuffende lyd) B: Øjemål. Dårligt. 5. Gruppe 1 arbejder med opg. 3 A: Jeg forstår ikke helt. Hvad er k. K lig 1. Indekset. C: Det er det det starter fra. A: Nej det er det ikke. C: Eller det stiger med en. A: Nej. Prøv at kigge på eksemplerne B: Er det så ikke bare at man skal sætte 2 op fire gange. A: Jo det vil jeg også skyde. D: Så det er 2 + 2 +2 +2 6. Gruppe 1 diskuterer opg. 4c B: Vi skal lave. Det 5 tal der står over sumtegnet om til n. Og så skal vi sige. A: Nej B: Jo A: Det der står under er n. B: ja, det kommer også til at hedde n, men det der står over kommer også til at hedder n. C: Men er det ikke bare sumtegn så ovenover n og så det der dytelyt. Og så k. Og så dividere man 7. Gennemgang af del 1 i beviset. L: Okay. Vi har fået delt det op på hvert sit sumtegn. Vi har brugt en regneregel. Nu ser det sådan her ud. Hvad tænker I så. Hvilke regneregler kan vi så bruge nu? [Forklare de to regneregler] B: Kan vi ikke bruge et’eren? L: Jo, hvor vil du bruge et’eren henne. B: Er det ikke der hvor der står k L: k? C: Nej. A: Altså jeg vil bruge, ja nr. 
1 som han siger. Men der hvor vi har en konstant. Og det er for eksempel der hvor der står 2 foran et bogstav. L: Ja, det er rigtigt. 2 er en konstant. Er der andre bogstaver, jeg ved godt det er bogstaver, men er der andre bogstaver der er konstanter heroppe? Hvis vi lige prøver at tænke. Hvad er vores variable og hvad er vores konstanter? A: Dem der står i anden. Så det er a i anden. 8. Gruppe 2 arbejder med forflytning af datasættet A: Der er sq da ikke sket noget. [ Kalder på læren ] A: I c. Så skulle man øhh lave de der udregninger vi har lavet og 203 204 list of references indsætte dem med et plot. Men på vores giver det det samme som altså de samme prikmæssigt. L: Når I har trukket gennemsnittet fra. A: Ja, altså sådan her. Vi tegner dem. L: Så dem I plotter nu A: Prikkerne ligger jo ens nu. Er det meningen? L: Ja B: Men prøv at se tallene her. L: Men de er ikke helt ens. Hvad er forskellen? B: Den går i minus A: Ja ja, y og x værdien’s værdier. L: Ja. Så det at du siger det er de samme. Hvad mener du med det? A: jeg mener bare at prikkerne ligger samme sted, men L: Ja B: Bare på forskellige, de ligger i forhold til C: Hvis du satte det op i et andet koordinatsystem så ville de ikke ligge de samme steder. 9. Starten på opsamlingen af opg. 5 L: Opgave 5. Hvad går den ud på? A: Med at skrive en masse tal ind i Nspire. L: Ja. Det har I gjort. Men hvad går opg. 5 ud på. Hvad har I fundet ud af? A: Udregnet gennemsnit og forskel. L: Så gennemsnittet af hvad A: Gennemsnittet af alder og højde L: Ja, så vi har. Gennemsnittet af alderen og gennemsnittet af højden. Hvad skulle vi bruge det til? A: Til at indsætte nogle plots. I.. L: Ja ? og øhh? A: Vi er ikke nået så langt endnu. L: Har du noget at supplere med elev B? Eller hvad ville du at sige? B: Nej, øm. Nææ. Jo, så har vi bestemt. Den lineære funktion for den. L: Ja, for hvad. B: For plottet, for punkterne. L: Ja, så B: Ja, vi har lavet lineær regression. L: Alder og højde. B: Ja. L: Ja, så hvis vi har her. Alderen er vores x’er, højden er vores y’er. Så ligger vores datasæt et eller andet sted heroppe (viser på tavlen). Det har I alle fundet Klassen:Ja L: Hvad bruger vi det her gennemsnit til? Hvad har I brugt list of references gennemsnittet til. A: Udregnet forskellen. L: Så forskel mellem x og gennemsnittet. Det samme med y. Og nu skriver jeg det med lille k hernede. Taget alle x?erne og fratrukket gennemsnittet. L: De her to forskelle. Hvad har I brugt dem til? D: Til at lave et nyt plot/graf. L: Ja, så har vi lavet et nyt plot. Så har vi trukket gennemsnittet fra i x’erne og y’erne. Hvordan ser det nye plot ud? E: Magen til. Bare i minus. L: I minus? F: Noget af det. L: Noget af det er i minus? F: Halvdelen er ca. i minus L: (Læreren viser på tavlen hvordan datasættet er flyttet) L: Hvordan ser datasættet ud? Hvad er det centraliseret om? Elev mumler. 0 G: 0. L: Ja, 0.0 komma hvad? Klassen. 0 L: Ja, så vi har forskudt vores datasæt fra at være heroppe til at komme herned (viser på tavlen). 10. Gruppe 2 diskuterer bevis opg. 2b B: Jeg forstår ikke engang spørgsmålet. Som en sum af to andengrads polynomier. E: Kan de der a,b og c ikke være det samme som x. Altså ax i anden + bx + c. Bum L: Ja. Hvad kunne så være x. E: Det kunne være a, b og c. E: Store A, B og C. L: Store A,B, C var hvad? B + E: Det var konstanter. L: Ja, så hvad er det der skal variere her? E: Det skal lille a, b og c. B: Variere a og b. L: Hvilket kunne være et polynomium? 
B: Skal man se bort fra konstanterne L: Må man se bort fra konstanterne når man laver et polynomium C + E: Nej. L: Så hvad kunne det være polynomium være? Tænk på a’erne som det I plejer at have som x’ere. 11. Gruppe 1 diskuterer bevis 2b B: Jeg kan ikke forstå hvorfor det er to andengradspolynomier. Jeg kan godt se ideen med at det er et, men jeg kan ikke se hvor- 205 206 list of references dan det skal blive til to. B: Men jeg kan ikke få det til at passe. B: vi forstår ikke hvordan det er to andengradspolynomier? C: For der mangler et C? L: Hvis I nu tænker på a og b, som de variable. Det er dem vi normalt kender som x’er. Hvordan kan I så dele det der udtryk op, så a’erne for sig og b’erne for sig. Og nu snakker jeg om de små bogstaver. Kan I dele det op på en måde så a’erne er for sig og b’erne er for sig. C: Dele det op. L: I får at vide at der skal være to andengradspolynomier B: Ja L: så I skal have to udtryk. Hvad kunne det ene udtryk hedde? Noget med a’erne. De led hvor små a’er i. B: a i anden + nb i anden L: a’erne og b’erne må ikke være sammen. For det skal være to andengradspolynomier og de har dem som variable. Så I skal tænke på at de er x’er nu. Som normale x?er i et andengradspolynomium. E: Så bare sig. B: jamen så er der jo 3. E: De små a’er er x’er. dvs. stort A f.eks. lille a i anden ? 2aB. E: + C. Nej det må jeg ikke. L: Jo. + C kan du også. For den kan høre med til begge for den afhænger ikke af noget. E: Okay. [Læren viser på tavlen ] L: Hvad er der tilbage? C: nb i anden. E: Ja. B: Det kan jo godt passe. B: Store C må du ikke bruge to gange. Så kun med til det ene polynomium. L: Kan I se det. I kor: Ja! Got it. E: Jeg kan godt forstå det. Det er bare forvirrende at man skal bytte om på tallene. Men det er til at forstå. 12. Kommentarer efter beviset er gennemgået A: Så det er ikke sådan her vi skal gøre når vi skal beregne det. Det er bare et bevis! L: Heldigvis gør Nspire det her for jer. Flere: ja tak Men det er det her, der ligger til grunds for det. B (i gruppen til A): Så nu ved du hvad Nspire laver. L: Så hver gang I trykker regression i Nspire, lineær regression i Nspire, så regner den a værdien på den her måde og b værdien list of references på den her måde. A: Det er et bevis! L: Lige præcis. 13. 2 elever gennemgår opg. 9. Kun den ene (B) har lavet opgaven hjemme. A: Hvorfra ved I at det er en eksponentiel model? B: Det kan man se på den der (peger på figuren) C: Hvorfor ikke en potens? B: Nej. Sådan er det bare. L: Hvorfor ikke? Det er gode spørgsmål B: Det er bare en eksponentiel funktion D: Jamen hvorfor? E: Kan I ikke huske reglerne for det? [Snak i grupperne] E: Det så pænest ud. L: Hvordan ville I finde ud af om det var en potens eller eksponentiel? [larm] L: Det er nogle rigtige diskussioner I har gang i. Det er faktisk det vi skal snakke om i den her time. Hvordan finder vi ud af hvilken model vi skal bruge når vi har nogle tal. 14. Gennemgang af opg. 10 L: Hvordan har I grebet opg. 10 an? A: Til at starte med kiggede vi lidt på den der forklaringsgrad. Så vi lavede regression på eksponentiel og potens funktion. Ens forklaringsgrad. Bruge begge modeller. Valg eksponentiel, da den passe lige lidt bedre. [ Opgave b + c bliver gennemgået ] L: Hvad siger I til d’eren? B: Dårligt C: Når vi kigger på modellen på vores hurtiggraf så kan vi se at den falder sådan rimelig meget de der årstal der er blevet oplyst. Og vi kan ikke forvente at den bliver ved med at falde, så vi bliver nødt til at sætte modellen op mod virkeligheden. 
Realistisk kommer til at ske. Og det kan jo ikke blive ved med at gå ned hele tiden. Derfor kan det godt passe at dødeligheden har været 9.9 og ikke 2.4. 15. Gruppe 3 arbejder med opg. 11a C: Har I lavet den lineær funktion? D: Jeg er ikke kommet ind endnu. C: Det giver ikke mening for mig. D: Det gør det da. C: Altså jeg starter på minus det der sekunder. Det giver ikke mening. Men jeg kan godt finde ud af at gøre det D: Prøv at bytte om på x og y. Så tid er x og 207 208 list of references C: Nej. D: Hvad er det der ikke giver mening C: Nej nej. Hvis du zoomer ind. Den tager jo gennemsnittet af det. Den ved jo ikke bedre. C: Så står der bestem b i den lineære model. Skal jeg ikke bare skrive nul fordi sådan ser den ud C: Ja ja, men jeg tænker bare sådan nul. Jeg starter på 0. Altså hvis vi tænker logisk. L: Det rigtigt. ( Forklare) [Diskutere videre] B: Til elev C - Hvorfor forstod du at b-værdien skulle være minus (lineære model) C: Logisk tænker man at den skal være 0. Men man kan sige. Man skal tænke på at Nspire ved jo ikke hvad tallet skal være. B: Nej C: Så du skal bare skrive det Nspire siger. 16. Gruppe 3 diskuterer opg. 11d L: Hvilken model vil I vælge? A: Jeg tænker umiddelbart.. Umiddelbart potens E: Lineær L: Hvorfor? E: Fordi tallene ser mest realistiske ud L: Ser det mest realistisk ud. Hvad med ud i fremtiden? Bevæger man sig med.. C: Nej, man bevæger sig ikke med samme hastighed.. Man altså. Jeg vil sige at når man løber starter man med hurtigere fart og så slutter man af og så slutter man af med en slutspurt. Så jeg vil faktisk ikke sige nogen af dem. L: nej, okay. Så det kan godt være at ingen af modeller er specielt gode til maraton. E: Det rigtigt nok. B: Men ifølge den der r værdi så passer de jo godt. colophon This document was typeset using the typographical look-and-feel classicthesis developed by André Miede. The style was inspired by Robert Bringhurst’s seminal book on typography “The Elements of Typographic Style”. classicthesis is available for both LATEX and LYX: http://code.google.com/p/classicthesis/ Happy users of classicthesis usually send a real postcard to the author, a collection of postcards received so far is featured here: http://postcards.miede.de/ Final Version as of June 27, 2015 (classicthesis myversion).