Gender Differences in Scientific Literacy of HKPISA 2006: A Multidimensional Differential Item Functioning and Multilevel Mediation Study

WONG, Kwan Yin

A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Education in Education

The Chinese University of Hong Kong
February 2012

Thesis Assessment Committee
Professor CHUNG, Yue-ping Stephen (Chair)
Professor HO, Sui-chu Esther (Thesis Supervisor)
Professor CHEUNG, Sin-pui Derek (Committee Member)
Professor EPSTEIN, Joyce L. (External Examiner)

ABSTRACT

This study investigates gender differences in the scientific literacy of 15-year-old students in Hong Kong and the impact of these differences on students’ motivation to pursue science education and careers (Future-oriented Science Motivation). The data were collected in 2006 through the Programme for International Student Assessment in Hong Kong (HKPISA). A total of 4,645 students were randomly selected from 146 secondary schools, including government, aided and private schools, by a two-stage stratified sampling method.

HKPISA 2006, like most other large-scale international assessments, presents its assessment framework in multidimensional subscales. To meet the requirements of this multidimensional framework, this study deployed new approaches to model and investigate gender differences in the cognitive and affective latent traits of scientific literacy: multidimensional differential item functioning (MDIF) and multilevel mediation (MLM). Compared with a t-test of mean score differences, MDIF improves the precision of each subscale’s measure at item level, so that gender differences in science performance can be estimated more accurately.
In the light of the Expectancy-value Model of Achievement-related Choices of Eccles et al. (1983) (Eccles’ Model), MLM examines the pattern of gender effects on Future-oriented Science Motivation mediated through cognitive and affective factors.

For the MLM investigation, Single-Group Confirmatory Factor Analysis (Single-Group CFA) was used to confirm the applicability and validity of six affective factors originally prepared by the OECD: Science Self-concept, Personal Value of Science, Interest in Science Learning, Enjoyment of Science Learning, Instrumental Motivation to Learn Science and Future-oriented Science Motivation. Multiple-Group CFA was then used to verify the measurement invariance of these factors across gender groups. The Single-Group CFA results confirmed that five of the six affective factors, all except Interest in Science Learning, had strong psychometric properties in the Hong Kong context. The Multiple-Group CFA results also confirmed the measurement invariance of these factors across gender groups.

The findings suggest that 15-year-old boys consistently outperformed girls in most cognitive dimensions, the exception being identifying scientific issues. Boys also had higher affective learning outcomes than girls, and the effect sizes of the gender differences in affective learning outcomes were larger than those in cognitive outcomes. The MLM study reveals that gender effects on Future-oriented Science Motivation are mediated through affective factors, including Science Self-concept, Enjoyment of Science Learning, Interest in Science Learning, Instrumental Motivation to Learn Science and Personal Value of Science. Girls are significantly affected by the negative impacts of these mediating factors, and so is their Future-oriented Science Motivation. The MLM results were consistent with the predictions of Eccles’ Model. Overall, the CFA and MLM results provide strong support for the cross-cultural validity of Eccles’ Model.
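In light of these findings, recommendations for reducing gender differences in science achievement and Future-oriented Science Motivation are made for science education participants, teachers, parents, curriculum leaders, examination bodies and policy makers.

Exploring Gender Differences in the Scientific Literacy of Hong Kong Students from PISA 2006: A Multidimensional Differential Item Functioning and Multilevel Mediation Study

ABSTRACT (CHINESE VERSION)

The purpose of this study is to examine gender differences in scientific literacy among 15-year-old students in Hong Kong and how these differences affect boys’ and girls’ motivation to choose science for further study and careers.

The data were drawn from the Programme for International Student Assessment conducted in Hong Kong in 2006. The sample of 4,645 students was drawn from 146 schools, including government, aided and private schools, by two-stage stratified random sampling.

Like other large-scale international assessments, the Programme for International Student Assessment adopts a multidimensional item framework. This study uses two methods suited to that framework and to the sample structure, multidimensional differential item functioning (MDIF) and multilevel mediation (MLM), to understand gender differences in the scientific literacy (cognitive and affective) of 15-year-old boys and girls and how these differences affect their motivation to choose science for further study and careers. Compared with the commonly used mean-difference t-test, MDIF improves the precision of each subscale and can therefore estimate gender differences in scientific literacy more effectively and accurately. MLM, grounded in the expectancy-value theory of Eccles (1983), analyses how these gender differences affect boys’ and girls’ motivation to choose science-related study paths and occupations.

To carry out the MLM study, Single-Group Confirmatory Factor Analysis (Single-Group CFA) was first used to validate the six affective factors constructed by the Organisation for Economic Co-operation and Development (OECD), namely Science Self-concept, Personal Value of Science, Interest in Science Learning, Enjoyment of Science Learning, Instrumental Motivation to Learn Science and Future-oriented Science Motivation, in order to assess the feasibility and validity of using these factors, which originated in Western societies, in local research. Local data were then used to adjust the structure of the six affective factors. Finally, Multiple-Group CFA was used to determine whether these factor structures apply to both boys and girls (that is, a measurement invariance test).

The Single-Group CFA results show that, apart from the Interest in Science Learning factor, which required relatively substantial modification, the other five factors have good psychometric properties. The Multiple-Group CFA results show that all six affective factors pass the measurement invariance test; that is, the six factor structures apply to both boys and girls.

The findings show that, except for the competency of identifying scientific issues, 15-year-old boys in Hong Kong outperform girls in cognitive dimensions of science such as explaining phenomena scientifically and using scientific evidence. Boys also show better affective development in science than girls, with effect sizes larger than those in the cognitive dimensions.

The MLM results agree with the predictions of Eccles’ expectancy-value theory: boys and girls differ markedly in their future-oriented motivation for subject choice in further study and for careers, and these differences influence their intended choices mainly indirectly, through the affective factors (mediators). With respect to these factors, girls’ motivation to choose science as a future study path and career is clearly weaker than boys’.

Overall, the CFA and MLM results support the cross-cultural validity of Eccles’ expectancy-value theory, which originated in Western societies; the results from Hong Kong’s Chinese society are broadly consistent with Western findings.

Finally, on the basis of these results, the author offers practicable recommendations to science education practitioners, teachers, parents, curriculum developers, policy makers and examination bodies, in the hope of reducing gender differences in science career planning among boys and girls in Hong Kong.

ACKNOWLEDGEMENTS

I would like to thank my supervisor, Professor HO, Sui-chu Esther, and my advisors Professor CHEUNG, Sin-pui Derek, Professor CHUNG, Yue-ping Stephen, Professor YIP, Din-yan, Professor EPSTEIN, Joyce L.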
and Professor TSANG, Wing-kwong for their professional advice and insightful comments in the course of my doctoral studies. I would also like to thank the Hong Kong Centre for International Student Assessment for providing me with all kinds of assistance in completing this thesis.

I would like to express my sincere gratitude to my mentors, Mr. KWONG Tat Hay and Mr. CHOW King Wah. Without their continuous support and encouragement throughout the course, I would never have finished this thesis. Special thanks go to my brother, Mr. WONG Kwan Yeh, and my friend, Mr. CHOI Sze Wai, who spent many hours reading the drafts and making suggestions.

Last but not least, I am grateful to my dearest mother for her love and support. Many thanks for her care in all these years.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ABBREVIATIONS

CHAPTER ONE: INTRODUCTION
1.1 Background of the study
  1.1.1 Gender-equity in global context of education
  1.1.2 Gender differences in science performance and affective learning outcomes
  1.1.3 Gender differences in variability of science performance
  1.1.4 PISA background
1.2 Weaknesses of previous gender studies
  1.2.1 Weaknesses of measurement models based on total score
  1.2.2 Weaknesses of unidimensional measurement models
  1.2.3 Strength of multidimensional IRT models
  1.2.4 Strength of multilevel models
1.3 Research questions
1.4 Significance of the study
  1.4.1 For gender-equity educational policies in Hong Kong
  1.4.2 For local economic growth
  1.4.3 For gender-inclusive science curriculums, assessments & teachers’ training
  1.4.4 For academic discourse in gender-equity
1.5 Structure of the thesis
1.6 Summary

CHAPTER TWO: LITERATURE REVIEW
2.1 Defining scientific literacy by historical review
  2.1.1 Cognitive domain of scientific literacy
  2.1.2 Affective domain of scientific literacy
    2.1.2.1 Taxonomy of affective domain elements in science education
    2.1.2.2 Science self-concept
    2.1.2.3 Motivation in science learning
2.2 Gender differences in scientific literacy
  2.2.1 Defining gender: the nature versus nurture debate
  2.2.2 Gender differences in cognitive learning outcomes
  2.2.3 Gender differences in affective learning outcomes
  2.2.4 Gender differences in science educational and occupational choices
2.3 Factors attributing gender differences
  2.3.1 Biological contributions
    2.3.1.1 Evolutionary psychology perspectives
    2.3.1.2 Brain structural perspectives
    2.3.1.3 Brain functional perspectives
    2.3.1.4 Hormonal perspectives
  2.3.2 Sociocultural contributions
    2.3.2.1 Gender-role
    2.3.2.2 Schooling and family conditions
  2.3.3 Item characteristics attributing to gender differences
    2.3.3.1 Scientific content
    2.3.3.2 Item format
  2.3.4 Expectancy-value model of achievement-related choices in science
    2.3.4.1 Self-concept of ability as mediator of gendered choices
    2.3.4.2 Subjective task values as mediators of gendered choices
2.4 Local research on gender differences in scientific literacy
  2.4.1 Gender differences in science performance
  2.4.2 Gender differences in affective domain
2.5 Summary

CHAPTER THREE: RESEARCH DESIGN AND METHODS
3.1 PISA 2006 database
3.2 Conceptual framework of present study
3.3 Conceptualization and operationalization of scientific literacy
  3.3.1 Cognitive domain
  3.3.2 Affective domain
    3.3.2.1 Science Self-concept
    3.3.2.2 Personal Value of Science
    3.3.2.3 Interest and Enjoyment of Science Learning
    3.3.2.4 Motivation to Learn Science
3.4 Conceptualization and operationalization of Parental SES
3.5 Multidimensional Differential Item Functioning (MDIF)
  3.5.1 The item response (IRT) model
    3.5.1.1 DIF model for gender differences studies
    3.5.1.2 Effect size by DIF
    3.5.1.3 Item fit statistics
3.6 Model testing in SEM
3.7 Summary

CHAPTER FOUR: GENDER DIFFERENCES IN STUDENTS’ COGNITIVE & AFFECTIVE LEARNING OUTCOMES
4.1 Gender differences in students’ cognitive outcomes
  4.1.1 Gender differences in science performance dimensions
    4.1.1.1 Gender differences in science performance dimensions measured by MSD
    4.1.1.2 Gender differences in science performance dimensions measured by MDIF
  4.1.2 Gender differences in content domains
    4.1.2.1 Gender differences in content domains measured by MSD
    4.1.2.2 Gender differences in content domains measured by MDIF
  4.1.3 Gender differences in item formats
  4.1.4 Gender variability in science performance
    4.1.4.1 Gender variability measured by variance ratio (B/G)
    4.1.4.2 Gender variability measured by number of students against each ability estimate
4.2 Gender differences in students’ affective learning outcomes measured by MSD
4.3 Gender differences in science achievement related choices measured by MSD
4.4 Gender differences in students’ affective learning outcomes measured by DIF
4.5 Gender differences in science achievement related choices measured by DIF
4.6 Summary

CHAPTER FIVE: THE FINDINGS BY EXPECTANCY-VALUE MODEL OF ACHIEVEMENT-RELATED CHOICES
5.1 Pearson correlations between affective factors and gender
5.2 Gender differences by revised Expectancy-value Model
  5.2.1 Grouping homogeneity
  5.2.2 Mediation effect of Science Performance
  5.2.3 Mediation effect of Science Self-concept
  5.2.4 Mediation effect of Interest in Science Learning
  5.2.5 Mediation effect of Enjoyment of Science Learning
  5.2.6 Mediation effect of Interest and Enjoyment of Science Learning
  5.2.7 Mediation effect of Attainment Value
  5.2.8 Mediation effect of Utility Value
  5.2.9 Mediation through Attainment Value and Utility Value
  5.2.10 Full models of gender effects on Future-oriented Science Motivation
5.3 Summary

CHAPTER SIX: CONCLUSIONS AND IMPLICATIONS
6.1 Database and data analysis
6.2 Major findings
  6.2.1 Multidimensional DIF model
  6.2.2 Multilevel Mediation using Expectancy-Value Model
6.3 Revisiting conceptual model
6.4 Implications for policy and practice
  6.4.1 Implications for policy makers
  6.4.2 Implications for school administrators, teachers and textbook authors
  6.4.3 Implications for parents and students
6.5 Limitations and recommendations for future research
  6.5.1 Limitations of the study
  6.5.2 Recommendations for future research

Appendix A: Handling missing values
Appendix B: Booklet effects
Appendix C: Wright map for science performance dimensions
Appendix D: Gender differences in scientific performance measured by MDIF
References
LIST OF TABLES

Table 1.1  Gender differences in Nobel Laureates from 1901 to 2009
Table 2.1  Characteristics of Scientific Literacy: The 1960s
Table 2.2  Characteristics of Scientific Literacy: The 1970s
Table 2.3  Characteristics of Scientific Literacy: The 1980s
Table 2.4  A multidimensional and hierarchical model of scientific literacy
Table 2.5  Content Summary for the National Science Education Standards and Benchmarks for Science Literacy
Table 2.6  Scientific competency framework of PISA 2006
Table 2.7  Klopfer’s taxonomy of affective behaviors in science education
Table 2.8  A range of components in affective domains of scientific literacy
Table 2.9  Gender differences in science performance at elementary schools
Table 2.10  Gender differences in science performance at high schools
Table 2.11  Gender differences in affective learning outcomes at elementary schools
Table 2.12  Gender differences in affective learning outcomes at high schools
Table 2.13  Gender differences in science educational and occupational choices at high schools
Table 2.14  Gender differences in science educational and occupational choices at universities
Table 2.15  Trends in average science performance by gender - 1995 through 2007 (Grade 4)
Table 2.16  Trends in average science performance by gender - 1995 through 2007 (Grade 8)
Table 3.1  Data collection method
Table 3.2  Demographic features of the participating students
Table 3.3  Distribution of PISA 2006 Scientific Literacy items (knowledge domains by competency)
Table 3.4  A summary of procedure to conduct multi-group invariance test across gender groups
Table 3.5  Item parameters for Science Self-concept
Table 3.6  Model fit for Science Self-concept
Table 3.7  Measurement invariance test across the gender group for Science Self-concept
Table 3.8  Item parameters for Personal Value of Science
Table 3.9  Model fit for Personal Value of Science
Table 3.10  Measurement invariance test across the gender group for model of Personal Value of Science
Table 3.11  Item parameters for Interest in Science Learning
Table 3.12  Item parameters and scale reliability for Enjoyment of Science Learning
Table 3.13  Model fit and estimated latent correlations for Interest in and Enjoyment of Science Learning
Table 3.14  Measurement invariance test across the gender group for two-dimensional model of Interest in and Enjoyment of Science Learning
Table 3.15  Item parameters for Instrumental Motivation to Learn Science
Table 3.16  Item parameters for Future-oriented Science Motivation
Table 3.17  Model fit and estimated latent correlations for motivation to learn science
Table 3.18  Measurement invariance test across the gender group for two-dimensional model of motivation to learn science
Table 3.19  Model fit for socioeconomic status
Table 4.1  Gender differences in scientific competency
Table 4.2  Summary of items showing statistically significant gender DIF for different science performance dimensions
Table 4.3  Gender differences in content domains
Table 4.4  Summary of items showing statistically significant gender DIF for item content
Table 4.5  Summary of items showing statistically significant gender DIF for item format
Table 4.6  Gender variance ratio on the PISA scale
Table 4.7  Gender differences in affective learning outcomes (WLE scores)
Table 4.8  Gender differences in Future-oriented Science Motivation (WLE scores)
Table 4.9  Gender differences in affective learning outcomes (PV scores)
Table 4.10  Gender differences in Future-oriented Science Motivation (PV scores)
Table 5.1  Correlations among gender (Girl), affective factors and Science Performance
Table 5.2  Mediation effect of science performance
Table 5.3  Mediation effect of Science Self-concept
Table 5.4  Mediation effect of Interest in Science Learning (Model 4a), Enjoyment of Science Learning (Model 4b) and Interest and Enjoyment of Science Learning (Model 4c)
Table 5.5  Mediation effect of Attainment Value (Model 5a), Utility Value (Model 5b) and Attainment Value and Utility Value (Model 5c)
Table 5.6  Full model of gender effects on Future-oriented Science Motivation
Table A1  EM Correlations matrix of SES
Table A2  Descriptive Statistics of the results of multiple imputation of SES
Table A3  Hong Kong estimated booklet effects in logits
Table A4  Internationally estimated booklet effects
Table A5  Gender DIF items for Closed Constructed Response (CCR)
Table A6  Gender DIF items for Multiple Choice (MC)
Table A7  Gender DIF items for Complex Multiple Choice (CMC)
Table A8  Gender DIF items for Open Response (OR)

LIST OF FIGURES

Figure 1.1  Effect of gender equality on global economic competitiveness
Figure 1.2  OECD Social Institutions and Gender Index (SIGI)
Figure 1.3  The trend of day school first attempters in science subject choice by gender in HKCEE 2001-2009
Figure 2.1  Gender differences in science
Figure 2.2  Expectancy-value model of achievement-related choices
Figure 3.1  Revised Expectancy-value Model of Achievement-related Choices in Science
Figure 3.2  A second-order CFA model of INTSCIEHKG
Figure 3.3  A graphical representation of within-item and between-item multidimensionality
Figure 3.4  Item fit statistics for the three science performance dimensions: Explaining Phenomena Scientifically (EPS), Identifying Scientific Issues (ISI) and Using Scientific Evidence (USE)
Figure 4.1  Gender variability on different science performance level
Figure 4.2  Gender variance ratio on different science performance level
Figure 4.3  Science performance: Number of boys and girls at each ability estimate
Figure 4.4  Explaining Phenomena Scientifically (EPS): Number of boys and girls at each ability estimate
Figure 4.5  Identifying Scientific Issues (ISI): Number of boys and girls at each ability estimate
Figure 4.6  Using Scientific Evidence (USE): Number of boys and girls at each ability estimate
Figure 4.7  Item characteristic curves for “Learning advanced science topics would be easy for me” (ST37Q01)
Figure 5.1  Gender effect on Science Performance and Future-oriented Science Motivation (Model 1)
Figure 5.2  Mediation effect of Science Performance (Model 2)
Figure 5.3  Mediation effect of Science Self-concept (Model 3)
Figure 5.4  Mediation effect of Interest in Science Learning (Model 4a)
Figure 5.5  Mediation effect of Enjoyment of Science Learning (Model 4b)
Figure 5.6  Mediation effect of Interest and Enjoyment of Science Learning (Model 4c)
Figure 5.7  Mediation effect of Attainment Value (Model 5a)
Figure 5.8  Mediation effect of Utility Value (Model 5b)
Figure 5.9  Mediation effect of Attainment Value and Utility Value (Model 5c)
Figure 5.10  Full model (Model 6a) of gender differences in Future-oriented Science Motivation
Figure 5.11  Full model (Model 6b) of gender differences in Future-oriented Science Motivation
Figure 6.1  Revised Conceptual Model for Studying Gendered Educational and Occupational Trajectories in Science
Figure A1  Missing value pattern of SES and related factors
Figure A2  Wright map for the three dimensions: Explaining Phenomena Scientifically (EPS), Identifying Scientific Issues (ISI) and Using Scientific Evidence (USE) in science performance

ABBREVIATIONS

AAI  Academic Achievement Index
CCR  Closed Constructed Response
CDC  Curriculum Development Council
CEDAW  Convention on the Elimination of All Forms of Discrimination against Women
CFA  Confirmatory Factor Analysis
CFI  Comparative Fit Index
CGSS  Child’s General Self Schemata
CMC  Complex Multiple Choice
DAC  Development Assistance Committee
ΔCFI  Change of Comparative Fit Index
DFID  UK Department for International Development
DIF  Differential Item Functioning
DSS  Direct Subsidy Scheme
Δχ²  Change of Chi-square
EAP  Expected a posteriori
EPS  Explaining Phenomena Scientifically
ERGEEA  Enforcement Rules for the Gender Equity Education Act
ESS  Earth and Space Systems
FIML  Full information maximum likelihood
GSES  Guidelines on Sex Education in Schools
HKCEE  Hong Kong Certificate of Education Examination
HKEAA  Hong Kong Examinations and Assessment Authority
HKPISA  Programme for International Student Assessment in Hong Kong
IEA  International Association for the Evaluation of Educational Achievement
INSTSCIE  Instrumental Motivation to Learn Science
INTSCIE  Interest in Science Learning
IRT  Item Response Theory
ISI  Identifying Scientific Issues
JOYSCIE  Enjoyment of Science Learning
KAS  Knowledge About Science
KOS  Knowledge Of Science
LS  Living Systems
MAR  Missing at random
MC  Multiple Choice
MCAR  Missing completely at random
MCMC  Markov chain Monte Carlo
MDG3  Millennium Development Goals
MDIF  Multidimensional Differential Item Functioning
MGMI  Multi-group Measurement Invariance
MI  Measurement Invariance
MML  Marginal Maximum Likelihood
MNAR  Missing not at random
MRCMLM  Multidimensional random coefficients multinomial logit model
MSD  Mean Score Difference
MVN  Multivariate Normal Distribution
NELS  National Educational Longitudinal Study of the Eighth Grade Class
NLSY  National Longitudinal Study of Youth
NNFI  Non-Normed Fit Index
NSTA  National Science Teachers Association
OECD  Organisation for Economic Co-operation and Development
OR  Open Response
PERSCIE  Personal Value of Science
PISA  Programme for International Student Assessment
PS  Physical Systems
RMR  Root Mean Square Residual
RMSEA  Root Mean Square Error of Approximation
SAT  US Scholastic Aptitude Test
SCIEFUT  Future-oriented Science Motivation
SCSCIE  Science Self-concept
SDT  Self-determination theory
SEL  Scientific Explanations
SEM  Structural Equation Modeling
SEQ  Scientific Enquiry
SIGI  OECD Social Institutions and Gender Index
SOD  Sex Discrimination Ordinance
SP  Science Performance
SRMR  Standardized Root Mean Square Residual
SSPA  Secondary School Places Allocation
STEM  Science, technology, engineering and mathematics
STS  Science, technologies and society
TIMSS  Trends in International Mathematics and Science Study
TLI  Tucker Lewis Index
TS  Technology Systems
UGC  University Grants Committee of the Hong Kong Special Administrative Region
UNDP  United Nations Development Programme
UNESCAP  United Nations Economic and Social Commission for Asia and the Pacific
UNESCO  United Nations Educational, Scientific and Cultural Organization
URCMLM  Unidimensional random coefficients multinomial logit model
USE  Using Scientific Evidence
WFMS  Weighted Fit Mean Square Error

CHAPTER ONE
INTRODUCTION

1.1 Background of the study

Over the past two decades, more girls than boys have entered university and graduated with a bachelor’s degree. This trend is similar across the developed OECD (Organisation for Economic Co-operation and Development) countries and Hong Kong (OECD, 2009a; UGC, 2009; UGC, 2010). International assessments such as the Trends in International Mathematics and Science Study (TIMSS) 2009 and the Programme for International Student Assessment (PISA) 2006 reported that girls are catching up with boys in science performance. Empirical evidence indicates no significant gender differences in science performance at grade 4, grade 8 and grade 9 (e.g. Ho et al., 2008; Martin et al., 2008). Yet the number of girls taking science subjects (physics, chemistry and biology) in senior secondary studies, and their performance (grades A-C) in these subjects in the two public examinations, the Hong Kong Certificate of Education Examination (HKCEE) and the Hong Kong Advanced Level Examination, were well below those of boys (HKEAA, 2009a; HKEAA, 2009b). The same pattern persists in university course selection and in occupational choices after secondary education (Census and Statistics Department, 2006; UGC, 2009; UGC, 2010). In short, girls tend to select non-science educational programs and careers.

In summary, significant gender differences in science performance have been disappearing in recent years, while gender-segregated educational and occupational choices in science continue. What, then, are the major factors that influence girls’ motivation to choose science as their educational and career goals? Are there gender differences in these factors that might lead to the observed gendered patterns in educational and occupational choices related to science?

To answer these two major questions, we have to find out how and to what extent gender differences influence students’ science performance and affective learning outcomes in Hong Kong. Data from the PISA 2006 database will be used to address these two questions.

PISA is a triennial study that collects data from 15-year-old students about their cognitive and affective learning outcomes inside and outside school. It is a project initiated and coordinated by the OECD. In its third cycle, PISA 2006, three subject domains were assessed, with scientific literacy as the major domain and mathematics and reading as minor domains. The enriched sets of cognitive and affective factors collected for the major domain enable us to examine gender differences in scientific literacy and their subsequent impact on achievement-related choices in science. (To distinguish the cognitive from the affective components of scientific literacy, science performance (SP) is used to signify the cognitive performance of scientific literacy in the subsequent chapters.) In the later sections, the background of gender-equity educational policy and theoretical perspectives on gender differences in Hong Kong will be discussed.

1.1.1 Gender-equity in global context of education

Gender equity as human right

The Convention on the Elimination of All Forms of Discrimination against Women (CEDAW), adopted in 1979, provides a comprehensive framework to guide all rights-based action for gender equity, including that of the United Nations Development Programme (UNDP). UNDP supports capacity building among its national partners to adopt measures that advance women’s rights and take account of the full range of their contributions to development, as a basis for achieving MDG3 (DFID, 2007; UNDP, 2007).

“It is impossible to realize our goals while discriminating against half the human race” (Kofi Annan, 2006)

At the United Nations World Summit of 2005, the Millennium Development Goals (MDG3) reaffirmed gender equity and women’s empowerment, including indicators and concrete targets related to girls’ education and gender mainstreaming. The United Nations Educational, Scientific and Cultural Organization (UNESCO) identifies five critical areas of concern in gender mainstreaming (UNESCO, 2000):
1. equal access to education for women and girls;
2. women’s contribution to peace;
3. women’s access to the media, and their image in the media;
4. women’s contribution to the management of natural resources and environmental protection;
5. the girl-child with regard to access to education and literacy.

In brief, gender equity in education is one of the main focuses of the UN Millennium Project Task Force on Education and Gender Equality in the coming decade (MPTG, 2005).

Gender equity for economic development

The OECD focuses more on gender equality as a vital goal for social and economic development and growth in a global economy, which is about effectiveness in improving people’s lives in a sustainable way. According to the Economic and Social Survey of Asia and the Pacific 2007, persistent gender inequality and discrimination against women, through restrictions on access to employment and education alone, are estimated to cost between USD 58 billion and USD 77 billion per year in the Asia-Pacific region (UNESCAP, 2007).
Similarly, a correlation study by the World Economic Forum, using the Global Gender Gap Index 2009 and the Global Competitiveness Index 2009-2010 for 134 countries, confirmed the correlation between gender equality and countries’ level of competitiveness (see Figure 1.1). Although correlation does not imply causality, the finding is consistent with human capital theory: women represent half of the human population, and their empowerment means a more efficient use of human resources in national economic development and growth (Zahidi et al., 2009). Mahony (1988) argues that achieving national competitiveness in the global economy through schooling is the central theme of British educational policy, and that national prosperity depends on high levels of knowledge and skills in an increasingly service-led economy.

Figure 1.1: Effect of gender equality on global economic competitiveness (axis label: Gender Equality, 0.00 – 1.00 scale)

The Development Assistance Committee (DAC) Guiding Principles for Aid Effectiveness on Gender Equality and Women’s Empowerment (2008) further elaborate the principle of gender equality to cover the cultural, religious and social factors that heavily influence gendered subject choice. Specifically, more emphasis is placed on ‘traditional’ subjects for girls, while boys receive more encouragement to study subjects such as science and mathematics. Gender bias in curricula at all educational levels reinforces stereotypes about the roles of girls and boys. To promote and monitor gender equality worldwide, the OECD Social Institutions and Gender Index (SIGI) (see Figure 1.2) was established as a new composite measure of gender equality in 102 non-OECD countries. Among those countries, Hong Kong ranked 20th and China 83rd in 2009. In other words, Hong Kong has better gender equality than China.
Figure 1.2: OECD Social Institutions and Gender Index (SIGI)

Promoting gender equity in education

United States

Title IX of the Education Amendments of 1972, an amendment of the original Civil Rights Act enacted in 1964, says that

No person in the United States shall, on the basis of sex, be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any education program or activity receiving Federal financial assistance (Mink, 1972)

Title IX prohibits sex discrimination in science and mathematics education as well as in other aspects of academic life. However, Title IX gives only a very broad description of gender equity in education, whereas the National Science Teachers Association (NSTA), the professional body for science education, presents clear guidelines to science teachers and school administrators in its position statement on gender equity in science education (NSTA, 2006). The guidelines cover four areas: teaching; selecting science curriculum; assessing student performance; and helping students prepare for further study and careers.

In the classroom, science teachers must
1. Implement varied and effective research-based teaching and assessment strategies that align with the learning styles of all students.
2. Ensure that all students are in a learning environment that encourages them to participate fully in class discussions and science activities and investigations.

In developing and implementing professional development and teacher preparation programs, science teachers, administrators, teacher educators, and policy makers must
3. Ensure that discussions about research-based issues related to the pedagogy of gender equity are an integral part of professional development and teacher education programs.
4. Be aware of their own deep-seated beliefs so that they can ensure that their beliefs do not interfere with objective science teaching.

In selecting science curriculum, science teachers, administrators, and community members must
5. Select only those curriculum materials that promote gender inclusiveness through their text, illustrations, and graphics.
6. Select only those curriculum materials that present culturally diverse male and female role models working in all disciplines and at all levels of science.

In developing assessment tools, science teachers, administrators, and evaluators must
7. Design and implement varied kinds of assessment models so that all students, regardless of their learning style, can be assessed fairly in science.
8. Provide administrative support for the development and use of a range of assessment tools that promote gender equity.

In helping students prepare for careers, guidance counselors and science teachers must
9. Encourage all students to consider science and science-related careers by exposing them to a range of school and community activities.
10. Provide all students with the most recent information about the kinds of opportunities available in the sciences, as well as the preparation necessary to attain such careers.

Taiwan

The Gender Equity Education Act was promulgated and implemented in 2004. It required all authorities and schools to promote and adopt gender-inclusive curricula, teaching, and assessments. All curricula shall cover gender equity education elements, such as affective education, sex education, and gay and lesbian education, and comply with the principle of gender equity. Teachers shall maintain gender equity consciousness, and “shall encourage students to take courses in fields that are not traditionally affiliated with their gender” (Ministry of Education, 2005). In 2005, the Enforcement Rules for the Gender Equity Education Act (ERGEEA) further clarified professional practices for gender equity in schools:

1. Curricula:
(1) Pre-service training of staff members, orientation training of new staff members, in-service program and preparation program for candidates of educational administrators as prescribed in Article 15 of the Act.
(2) Curricula and activities provided to students as prescribed in the first paragraph of Article 17.
2. Instruction:
(1) Develop innovative teaching methods related to gender equity education.
(2) Enhance teachers’ competence in gender equity education pedagogies.
3. Assessments:
(1) Cognition, affection, and practice of the concept of gender equity.
(2) Diverse and non-gender-biased methods of assessment such as observation, operation tasks, performances, oral exams, written exams, assignments, learning progress portfolio, research reports etc.
(Ministry of Education, 2006, p. 1-2)

Hong Kong

Locally, there is no clear policy to address the issue of gender equity in education; the focus is on sex education rather than sex equity in education. The Education Department and the Curriculum Development Council published the Guidelines on Sex Education in Schools (GSES) in 1997 and advised secondary schools to incorporate sex education elements into various subjects such as General Studies, Science, Biology, Social Studies, Ethics and Religious Studies, and Home Economics. With the introduction of the curriculum reform in 2001, which placed emphasis on holistic education, cross-curriculum programs in civic education, moral education, sex education, health education and environmental education have all been integrated into moral and civic education.
Moral and civic education stresses cultivating students’ positive values and attitudes, helping them develop a healthy lifestyle, acquire life skills to deal with daily life and social problems, learn how to face the challenges of growing up, and deal with doubts and perplexities about sex, for example dating and courtship, gender awareness, and sexual harassment; it does not, however, include gender equity. Currently, under law, gender equity in education is governed by the Sex Discrimination Ordinance (SDO), one of the anti-discrimination ordinances. The SDO only ensures that boys and girls have an equal chance of accessing educational resources at system level. For example, before 2001 the Secondary School Places Allocation (SSPA) system systematically scaled boys’ scores up and girls’ scores down in order to fit a pre-set allocation scheme. This limited girls’ access to prestigious schools territory-wide and had a negative impact on girls’ educational outcomes (Tang, 2006). In June 2001 the High Court found the system in violation of the SDO; the gender-disparity practice was abolished and the system was amended to give boys and girls equal access to these top schools. Apart from the GSES and the SDO, in practice no clear gender equity policies or guidelines are implemented at the level of school curricula, school-based assessment, initial teacher education or the individual student. Local activist groups, for instance the Association Concerning Sexual Violence Against Women and the Gender Research Centre at the Chinese University of Hong Kong, as well as commentators, continue to complain that the Hong Kong Special Administrative Region Government devotes inadequate attention and effort to adopting gender-equity education in local schools, from kindergartens and primary schools to secondary schools and tertiary institutes.
1.1.2 Gender differences in science performance and affective learning outcomes

Gender differences have been a hot and contentious issue in Western education since the Women’s Liberation Movement in the United States in the 1960s and early 1970s. Gender is widely recognized as a key social identity that influences an individual’s educational experiences and achievements. The most influential and widely cited research findings, by Maccoby and Jacklin in 1974, suggested that girls have lower self-esteem and achievement motivation and are better rote learners than boys, whereas boys have a higher level of cognitive processing than girls. They also concluded that gender differences are well established in these domains: girls have higher verbal ability than boys, while boys have greater visual-spatial and mathematical abilities than girls. However, in “The Gender Similarities Hypothesis”, published in 2005, Hyde highlighted the evidence for gender similarities: a meta-analysis of 5000 studies of psychological gender differences, covering approximately 7 million people, showed that 80% of the effect sizes⁴ were small or close to zero in the United States. Contradicting evidence from British GCSE results suggests that boys are significantly underachieving in comparison with girls in all subjects (Department for Education and Skills, 2004). Boys’ underperformance first hit the headlines in the mid-1990s, after the mandatory national curriculum introduced in England and Wales in 1988 required boys and girls to take the same core subjects for the first time. Girls very quickly caught up with and even slightly outstripped boys in GCSE science examinations (Epstein et al., 1998; Francis & Skelton, 2005; Department for Education and Skills, 2009). This trend extends up to degree level, where more girls than boys have gained good degree awards for a decade.
⁴ Effect size of gender differences was evaluated with Cohen’s d: d = 0.2 is small, d = 0.5 is medium and d ≥ 0.8 is large. A Cohen’s d smaller than 0.2 is considered a negligible effect size in the social science context and in conventional clinical practice (Cohen, 1988).

The zero-sum-game view of education became a moral panic among British journalists, commentators and policy-makers. It justifies a greater focus and expenditure on meeting boys’ needs at the expense of girls, and vice versa. It prompted an acknowledgment of prevalent gender inequities and disparities in relation to science (Calabrese, 1998). The two IEA (International Association for the Evaluation of Educational Achievement) studies in the 1970s and 1980s also reported that gender differences in science performance consistently favoured boys and that the differences intensified with age and grade level of schooling (Keeves, 1986, 1992). Similar patterns were observed in many countries participating in the Trends in International Mathematics and Science Study (TIMSS), where girls’ underachievement in science is particularly distinct towards the end of the education system (Law, 1996a; Mullis et al., 1998; Robitaille & Beaton, 2002). Likewise, the same complication persists locally: Yung et al. (2006) reported a consistent trend of boys significantly outperforming girls in science from TIMSS 1995 to TIMSS 2003, whereas this trend was not found in TIMSS 2007 or in PISA 2000 to PISA 2006 (Ho et al., 2003, 2005, 2008; Martin et al., 2008). Previous studies indicate that gender differences can be underestimated when a test contains some items biased towards boys and others biased towards girls, so that the gender effects cancel out. Gender differences at item level can thus be masked when only the overall performance of boys and girls is examined (Cole, 1997).
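The cancellation effect described by Cole (1997) can be illustrated with a minimal Python sketch. The item names and percentages below are hypothetical values chosen for illustration and are not PISA data. The sketch compares raw percentages only; it is not a DIF analysis, which would condition on students' latent ability.

```python
# Hypothetical percentages of correct answers per item (illustrative
# values only, not taken from PISA or any other dataset).
ITEMS = {
    "item_1": {"boys": 70, "girls": 60},  # favours boys
    "item_2": {"boys": 55, "girls": 65},  # favours girls
    "item_3": {"boys": 62, "girls": 52},  # favours boys
    "item_4": {"boys": 48, "girls": 58},  # favours girls
}


def item_gaps(items):
    """Return the boys-minus-girls gap (percentage points) for each item."""
    return {name: p["boys"] - p["girls"] for name, p in items.items()}


def mean_score_gap(items):
    """Return the average boys-minus-girls gap across all items."""
    gaps = item_gaps(items)
    return sum(gaps.values()) / len(gaps)


if __name__ == "__main__":
    for name, gap in item_gaps(ITEMS).items():
        print(f"{name}: gap = {gap:+d} percentage points")
    print(f"mean gap across items = {mean_score_gap(ITEMS):+.1f}")
```

In this constructed example every item shows a gap of ten percentage points, yet the mean gap is zero, so a comparison of overall scores alone would suggest no gender difference.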
McCrae (2009), the principal research fellow and leader of the Mathematics and Science test development team at the Australian Council for Educational Research, who managed framework and item development for the PISA 2006 science assessment, admitted that

Percentage of score points to be assigned to the knowledge of science components of the assessment was determined by the PISA Governing Board (PGB) before the field trial, in June 2004, to be about 60%. This decision had a far-reaching consequence in terms of overall gender differences in the PISA 2006 science results, as boys generally outperformed girls on knowledge of science items and the situation was reversed for knowledge about science items.⁵ (OECD, 2009b, p. 44)

From the analysis⁶ of the competency domains of scientific literacy in PISA 2000 and PISA 2006, boys perform better than girls in explaining phenomena scientifically and using scientific evidence, but perform less satisfactorily in identifying scientific issues. Yip et al. (2004) alert us that a test dominated by gendered items may lead to a different conclusion about the gender effect on science performance.

⁵ Knowledge of science refers to knowledge about the natural world, and knowledge about science refers to scientific enquiry and scientific explanations.
⁶ PISA 2006 includes cognitive and affective learning outcomes of scientific literacy in the assessment framework, while PISA 2000 and PISA 2003 include the cognitive domain only.

Though gender differences in various content areas fluctuate over the years because of different test designs, boys are generally dominant in physical and earth sciences, while girls perform equally well in biology and chemistry (Law, 1996a; Yip et al., 2004; Yung, 2006). Girls usually perform less well than boys in areas that are not commonly covered in formal school science curricula, even though they are more industrious in the school subject (Mullis et al., 2000).
Research findings also suggest that there is a large gender gap on multiple-choice and true-false questions, on which boys outperform girls to a statistically significant degree (Yip et al., 2003; Yung et al., 2006). Gender differences in guessing tendencies are robust: boys tend to guess, while girls tend to skip items rather than attempt them (Gershon & Sinai, 1991). It is hypothesized that boys’ superior performance on multiple-choice questions is partly attributable to their higher guessing tendency. Several research findings also support the view that the multiple-choice format favours boys whereas the open-response format favours girls (Liu, 2009; Bolger & Kellaghan, 1990; Bell & Hay, 1987; Murphy, 1982). Arnot et al. (1998, p. 28) suggest that boys show greater adaptability to more traditional approaches to learning which require the memorisation of abstract, unambiguous facts and rules that have to be acquired quickly. They also appear to be more willing to sacrifice deep understanding, which requires effort, for correct answers achieved at speed. Girls, by contrast, appear to prefer open-ended tasks which are related to real situations and tend to respond in ways that are collaborative and provide a broader context (Francis & Skelton, 2005). These gendered differences in learning style contribute significantly to item-level gender bias in assessment. The education system in Hong Kong is characterized as examination-led: what happens in the classroom is largely dictated by what happens in the public examination hall. Although competition for tertiary education has dropped considerably in recent years, the emphasis on examination for selection purposes is still much heavier than in many other countries (Yung, 2002). The differential performance of boys and girls in science subject domains thus has strong implications for the equity of assessments in public examinations and for subsequent educational opportunities.
Though gender differences in science performance have narrowed over the past few decades, the percentage of girls selecting science and engineering courses, in particular the physical sciences, at both higher secondary and university levels remains substantially lower than that of boys (Halpern et al., 2007). However, the direct social and economic impact of the overall underrepresentation of females in science, technology, engineering and mathematics (STEM) has not been fully explored. So why do so few females enter the STEM workforce after graduation? Ho et al. (2008) suggested that instrumental motivation to learn science was one of the key affective measures that determine students’ course selection, career choice and academic performance. Kelly’s (1988) three-year longitudinal study of the course choices of 1472 British secondary students showed a strong correlation between subject choice and attitude to science in junior secondary school. In the United States, Tai et al. (2006) analyzed the STEM career expectations of eighth-grade students (about 13 years old) for the years 1988 through 2000 using the National Education Longitudinal Study of 1988 (NELS:88). Their results indicated that students with expectations for a science-related career were 3.4 times more likely to earn physical science and engineering degrees than students without similar expectations. These findings suggest that affective learning outcomes are critical components in assessing one’s achievement in scientific literacy. In sum, the Western literature on gender differences presents a paradox: gender differences in educational outcomes may have narrowed substantially over the past few decades, yet gender differences in future career orientation in the field of science have not yet been well examined in the local context.
1.1.3 Gender differences in variability of science performance

For countries and regions with technology-driven economies, economic development and growth depend heavily on individuals with high-level scientific competencies (i.e. Levels 5 and 6 of the PISA proficiency levels in science performance) to bring new technology and innovation, while basic competence (i.e. Level 2) is essential to enable individuals to absorb and adopt new technology in their workplaces. Communities with a large number of scientifically literate citizens benefit more in economic growth than those with fewer literate individuals (Hanushek & Woessmann, 2007). Historically, Ellis (1934) and Galton (1969) demonstrated that boys were intellectually and educationally more variable than girls. More recent commentaries by Noddings (1992) and Feingold (1992) come to the same conclusion that gender differences in the variability of science performance do exist. The variability hypothesis attempts to explain why men are found more often than women at both extremes, in the ranks of genius and among the least able. For example, from 1901 to 2009, of the 187 Nobel Laureates in Physics, 157 in Chemistry and 195 in Physiology or Medicine, only 2 (1.1%), 4 (2.5%) and 10 (5.1%) respectively were women (see Table 1.1).

Table 1.1: Gender differences in Nobel Laureates from 1901 to 2009

Nobel Prize               Women         Men            No. of Prize
Natural Science
  Physics                 2 (1.1%)      185 (98.9%)    187
  Chemistry               4 (2.5%)      153 (97.5%)    157
  Physiology / Medicine   10 (5.1%)     185 (94.9%)    195
  Sub-total               16 (3.0%)     523 (97.0%)    539
Social Science
  Economic Sciences       1 (1.6%)      63 (98.4%)     64
  Peace                   12 (10.0%)    85 (70.8%)     120*
  Literature              12 (11.3%)    94 (88.7%)     106
  Sub-total               25 (8.6%)     242 (83.4%)    290
Total                     41 (4.9%)     765 (92.3%)    829

* Remark: Of the 120 Nobel Prizes in Peace, 97 were awarded to individuals and 23 to organizations.
(Source: Nobel Foundation, 2009)

To explain the clustering of males in the top ranks, Hedges and Nowell (1995) found that males were dominant at the higher end of the distributions of attainment in science literacy and mathematics. The pattern was observed in the US Scholastic Aptitude Test (SAT), the National Longitudinal Study of Youth (NLSY) and the National Educational Longitudinal Study of the Eighth Grade Class (NELS). Machin and Pekkarinen (2008) extended these gender variance studies to a wider sample of countries participating in PISA 2003, and their results suggested that boys’ greater variability in learning outcomes is a robust phenomenon. The same phenomenon was observed in the Hong Kong PISA studies. At the 95th percentile, boys outperformed girls with statistical significance except in 2003. At the 5th percentile, girls outperformed boys from PISA 2000 to 2006 (Yip et al., 2004; Ho et al., 2003; Ho et al., 2008).

1.1.4 PISA background

A number of large-scale international assessments have been launched since the 1960s, for example the IEA’s Third International Mathematics and Science Study and the OECD’s PISA. The IEA survey, TIMSS, focuses on trends in students’ learning outcomes in knowledge and skills broadly aligned with the science curricula of participating countries. PISA, on the other hand, focuses on measuring how well students can apply their knowledge and skills of science to real-life problems. PISA is thus designed to represent learning outcomes at age 15, rather than to be a direct measure of attained curriculum knowledge (OECD, 2006). The OECD started the first cycle of PISA in 2000 and has repeated it every three years. However, the main focus of the first two cycles, in 2000 and 2003, was on reading and mathematics.
The scope and number of items on the cognitive domain of scientific literacy were very limited in these two cycles of assessment, whereas scientific literacy was the major domain of PISA 2006 and the total number of relevant items increased from 35 in 2000 to 108 in 2006. In addition, affective learning outcomes such as Enjoyment of Science Learning (JOYSCIE), Future-oriented Science Motivation (SCIEFUT), Interest in Science Learning (INTSCIE), Instrumental Motivation to Learn Science (INSTSCIE), Personal Value of Science (PERSCIE) and Science Self-concept (SCSCIE) were also included as part of the assessment in PISA 2006 (OECD, 2006). The assessment framework of PISA 2006 is therefore more complete and valid for evaluating students’ overall performance in scientific literacy.

1.2 Weaknesses of previous gender studies

1.2.1 Weaknesses of measurement models based on total score

As mentioned by Cole (1997), gender differences in achievement can be distorted when total scores or mean scores are compared. Liu (2006) further elaborated that gender differences can be underestimated if boys and girls are each favored by some items in the assessment, because the gender effects cancel out when the total or mean score is compared. Secondly, such comparisons do not take into account the assessment structure, for example the dimensionality and components of scientific literacy. Therefore, the relative strengths and weaknesses of boys and girls in particular areas of science remain unclear.

1.2.2 Weaknesses of unidimensional measurement models

Educational and psychological measurements in large-scale assessments such as PISA and TIMSS tend to cover a large variety of latent traits in a short period of time by having multiple short subtests, one for each distinct latent trait.
Unfortunately, such an approach suffers from imprecision because of short test lengths and the unidimensional assumptions of the measurement models (Cheng et al., 2009; Wu, 2008). The problem is well known as the bandwidth-fidelity dilemma (Cronbach & Gleser, 1965). Bandwidth refers to the amount of information that can be contained in a message, while fidelity refers to the precision of the information conveyed. In short, the higher the precision of a given test, the narrower the range of information that can be gathered in a session of limited time and items, and vice versa. Murphy (1993) suggested that there was an inevitable trade-off between attaining a high degree of precision in the measurement of any one attribute (or characteristic) and obtaining information about a large number of characteristics.

1.2.3 Strength of multidimensional IRT models⁷

In most measurement models, a set of items is used to measure each distinct latent trait under the unidimensionality assumption, and the latent traits in each subtest are then analyzed separately. However, this “composite” unidimensional approach often violates the test’s claimed subtest structure when the traits are highly correlated (Wang, 1994; Adams, Wilson & Wang, 1997). Even when the dimensions underlying the subtests are not highly correlated, the unidimensional model can introduce bias in ability estimation (Folk & Green, 1989). The person measures⁸ are also less reliable and attenuated, since the correlations between latent traits are always underestimated in unidimensional models, which is a precision issue (Cheng et al., 2009). A number of studies have worked around the problem of precision in subtest measurement with unidimensional models; for example, Wainer, Sheehan and Wang (2000) used other subtest scores to improve the measurement of an individual subtest score.
Davey and Hirsch (1991) call this method the consecutive approach; it ignores the multidimensionality of the dataset and fails to take advantage of the correlations between latent traits to improve reliability.

⁷ Details of multidimensional IRT models will be discussed in Chapter 3.
⁸ The person measure refers to the ability estimate from the IRT model. The precision of the person measure increases with the number of items in a given test.

More importantly, modern assessments like PISA aim to measure not only students’ achievement but also the conceptual understanding demonstrated by performance on each dimension (Adams, Wilson & Wang, 1997). Understanding students’ performance in each domain of science is critical in science education, so that remedial action can be taken to address relatively weak areas of the curriculum. Unidimensional models provide no solution to the above issues and create problems in reporting students’ achievement when the test score is a summative result of multiple subtests. The multidimensional model resolves these problems by taking advantage of the correlations between latent traits. Thus, in assessments consisting of a larger number of subtests, measurement precision is enhanced (Wang, Chen & Cheng, 2004). Using a multidimensional model, Wu (2008) even demonstrated that the precision of reading score estimates at individual student level improved when mathematics and science scores were used together. Apart from improving measurement precision, multidimensional models also address the validity issue by modeling the multidimensional structure of the subtests; for example, a test with several unidimensional subtests can be modeled with “between” item multidimensionality⁹ in the ConQuest software (Adams, Wilson & Wang, 1997).

1.2.4 Strength of multilevel models

Individual-level random sampling is not always possible, for practical or ethical reasons, in field experiments.
⁹ Refer to Chapter 3 for details.

Practically, social and behavioral science research usually collects data with a nested, multilevel or hierarchical structure. Therefore, any attempt to understand learning outcomes and behavior patterns at the individual level alone can severely handicap one’s ability to elucidate the underlying social processes which are constantly shaping students’ behavior patterns. These social processes work at the level of social groupings in which people change simultaneously over time (Heck et al., 2010). Individuals from the same cluster, for example the same classroom or school, may resemble each other not only in terms of outcomes but also in terms of compliance behavior (Jo et al., 2008). Modeling and analyzing multilevel data with single-level techniques will therefore be misleading. Conventionally, the intraclass correlation (ICC) is used to estimate the proportion of the total variance ($\sigma_b^2 + \sigma_w^2$) in multilevel data that lies between clusters. The intraclass correlation coefficient for outcome Y is:

\[
\mathrm{ICC}_Y = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2}
\]

where $\sigma_b^2$ is the between-cluster variance and $\sigma_w^2$ is the within-cluster variance. The larger the ICC, the more homogeneous the members within each group are and the more the groups can differ from each other. Simulation studies demonstrate that the larger the ICC, the more likely standard errors are to be underestimated if the clustering effects of sampling are ignored in estimation (e.g. Jo et al., 2008). In other words, the multistage stratified sampling design violates the key assumptions of single-level multiple regression models (e.g. that simple random sampling provides independent errors). The variances and standard errors are then underestimated, leading to biased and erroneous conclusions.
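As an illustration of the ICC definition above, the following minimal Python sketch computes the ICC from given variance components and from raw clustered data. The one-way ANOVA estimator for equally sized clusters is a textbook choice assumed here for illustration; it is not the estimation procedure used in this study, and the function names are hypothetical.

```python
from statistics import mean


def icc_from_components(sigma_b2, sigma_w2):
    """ICC = between-cluster variance / (between + within variance)."""
    total = sigma_b2 + sigma_w2
    if total <= 0:
        raise ValueError("total variance must be positive")
    return sigma_b2 / total


def anova_icc(clusters):
    """Estimate the ICC by one-way ANOVA (assumes equally sized clusters)."""
    k = len(clusters)       # number of clusters, e.g. schools
    n = len(clusters[0])    # observations per cluster, e.g. students
    if k < 2 or n < 2 or any(len(c) != n for c in clusters):
        raise ValueError("need at least 2 clusters of equal size >= 2")
    grand_mean = mean(x for c in clusters for x in c)
    cluster_means = [mean(c) for c in clusters]
    # Mean squares between and within clusters.
    msb = n * sum((m - grand_mean) ** 2 for m in cluster_means) / (k - 1)
    msw = sum(
        (x - m) ** 2 for c, m in zip(clusters, cluster_means) for x in c
    ) / (k * (n - 1))
    # Between-cluster variance estimate, truncated at zero.
    sigma_b2 = max((msb - msw) / n, 0.0)
    return icc_from_components(sigma_b2, msw)
```

For example, `icc_from_components(1.0, 3.0)` returns 0.25, meaning that a quarter of the total variance lies between clusters.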
To recap, multidimensional and multilevel techniques are essential for processing the PISA 2006 datasets, because the science performance items were built upon a multidimensional framework and two-stage sampling was deployed. The data collected are by nature multidimensional and multilevel, so unidimensional and single-level multiple regression models are not applicable to the current research.

1.3 Research questions

Gender equity in education is not only a key issue in human development and social justice; it is also related to economic development and growth in the long run (Mahony, 1988; Tse, 1998; Zahidi et al., 2009). In order to achieve better gender equity in science education, this study looks at the following key questions to establish the current status of gender-equity education in Hong Kong: Are boys and girls doing equally well in scientific literacy? In particular, what kinds of items, in terms of item format and content, display gender differential item functioning¹⁰ (DIF)? How large is gender variability in Hong Kong secondary schools? To what extent, and how, are gender effects on achievement-related choice mediated through the cognitive and affective domains of scientific literacy? What can we do to reduce gender inequity in terms of curriculum design, classroom practices, initial teacher training, assessment and educational policies? In sum, the purpose of this study is to explore gender effects on the learning outcomes and the science-related educational and occupational choices of 15-year-olds (S.1-S.4) in Hong Kong secondary schools. The following research questions are addressed in this study:

¹⁰ Differential item functioning (DIF) occurs when people from different groups, for example gender groups, with the same latent trait have a different probability of giving a certain response on a questionnaire or test.

1. Are there any gender differences in students’ cognitive outcomes?
1.1 Is there any gender difference in science performance?
1.2 Is there any gender variability in science performance?
1.3 Which item content domains show substantial gender differences?
1.4 What types of item format show DIF and gender-item interaction?
2. Are there any gender differences in students’ affective outcomes?
2.1 Are there any gender differences in students’ self-concept?
2.2 Are there any gender differences in students’ interest and enjoyment values?
2.3 Are there any gender differences in students’ attainment value and utility value towards science?
2.4 Are there any gender differences in future-oriented science motivation?
3. To what extent, and how, are gender effects on achievement-related choice mediated through the cognitive and affective domains of science?
3.1 Is there any mediation effect of students’ science self-concept on their achievement-related choice after controlling for parental SES and science performance?
3.2 Is there any mediation effect of students’ interest-enjoyment value on their achievement-related choice after controlling for parental SES and science performance?
3.3 Is there any mediation effect of students’ attainment value and utility value on their achievement-related choice after controlling for parental SES and science performance?
3.4 Is there any mediating effect of students’ cognitive and affective domains of science on their achievement-related choice?

1.4 Significance of the study

1.4.1 For gender-equity educational policies in Hong Kong

In both TIMSS and PISA assessments, gender differences are always among the top areas of study in international reports and citations. This indicates the strong faith of the Western world that gender-equity education is critical to basic human rights and integrity, social justice and progress, manpower development and economic growth in facing the challenges of globalization in the 21st century.
In contrast, gender equity in Hong Kong is at present realized through legislation, and gender-equity policies in education are still at a very preliminary stage of development. Local activists, professionals and journalists criticize the present government gender-equity policies as obsolete and as tackling only the elementary issues of gender disparity in terms of access to education. While most Hong Kong people are satisfied with the 9-year free, compulsory and universal basic education offered since 1978, the reality of unfairness under universal basic education is essentially ignored (Tse, 1998). The most recent review of 9-year compulsory education, by the Board of Education in 1997, once again placed its key focus on “formal equality of opportunity” (Coleman, 1968), namely (1) education for all, (2) assessment and (3) allocation systems, rather than on educational inequity.

The sub-committee recommends that Hong Kong should continue with its present policy of offering free and compulsory education to all children of the relevant age for nine years. (Board of Education, 1997)

As mentioned by the Development Assistance Committee (DAC) Expert Group on Women in Development of the OECD, monitoring, reporting and evaluation are critical processes for assessing and improving development practices and impacts (DAC, 1999, p. 35). The current study presents a unique opportunity to investigate gender effects on scientific literacy and to provide grounds for gender-equity policies, gender-differential policies, and non-gender-biased curriculum development and assessment in local schools.

1.4.2 For local economic growth

Klasen’s (1999) cross-country regression analysis demonstrated that gender inequity in education had a direct impact on economic growth by lowering the average quality of human capital. The empowerment of women and their participation in STEM are vital to sustainable, people-centred development and uphold respect for women’s human rights (OECD, 2006; DAC, 1999).
According to the 2006 population by-census in Hong Kong, however, the figure for girls whose field of education at the highest level attended was pure science was about 20% lower than that for boys (Census and Statistics Department, 2006). Although the statistics from PISA 2006 showed that the overall science performance of both sexes was similar (Ho et al., 2008), the same gendered pattern can also be detected in HKCEE subject choices from 2001 to 2009 (see Figure 1.3). These results suggest that affective outcomes in scientific literacy are crucial to future-oriented motivation in subject choice and career orientation.

Figure 1.3: The trend of day school first attempters in science subject choice by gender in HKCEE 2001-2009 (axes: Subject choice (%) against Examination year; series: Biology, Chemistry and Physics, each plotted separately for girls and boys)

To career masters at schools and government officers in manpower and economic development, the discourse of gender-equity educational policy can be collapsed into economic policy (Ball and Gewirtz, 1997; Ball, 1999). The research findings will be valuable for both educational and economic policy makers seeking to raise girls’ involvement in science learning in post-compulsory education and in science careers, which is considered imperative in the increasingly technological workplace of the 21st century.

1.4.3 For gender-inclusive science curriculums, assessments & teachers’ training

Currently, the major curriculum documents and assessment guides, for example the Science Education Key Learning Area Curriculum Guide (Primary 1 - Secondary 3) and the Combined Science Curriculum and Assessment Guide (Secondary 4 - 6), are gender-blind.
The gender effects on classroom learning and school-based assessment in science domains are basically ignored, and gender-equity courses in initial professional teacher training are often offered only as electives (Chan et al., 2009). Without adequate and persuasive evidence from empirical research, the current gender-insensitive curriculum design and implementation in science education and formal teacher training are hard to revise, improve and monitor continuously. For the benefit of both boys and girls, the results of this study attempt to alert participants in science education, such as school heads, science teachers and curriculum officers, to pay attention to gender disparity issues in their daily work.

1.4.4 For academic discourse in gender-equity research

The Rasch model and DIF are usually employed to study the effect of item position and item format on students’ performance. The application of multidimensional item response modeling to study the gender effect on PISA 2003 Mathematics performance was led by Professor Mark Wilson at the University of California, Berkeley and his student, Lydia Ou Liu, at the Educational Testing Service, Princeton. Such an application is new in the field of educational assessment and gender study. For the mediation study, the mediators were selected based on the well-developed Eccles’ (1983) expectancy-value model of achievement-related choices, which has seldom been used outside North America to study gender differences in Asian populations. This study deploys a similar methodology to study the gender effect on scientific literacy in PISA 2006, which will benefit the local research communities by testing the applicability of such methods and models to the Hong Kong situation.

1.5 Structure of the thesis

This thesis is divided into six chapters. Chapter one has provided an overview of the research background, the limitations of previous gender studies and a possible solution.
Then, the research questions of the present study and its significance have been stated. Chapter two reviews the literature on gender differences in science performance and affective learning outcomes. It also endeavors to cast the theoretical standpoints in connection with gender and achievement, from sex role theory to psychoanalytic theory, and from evolutionary-biological to social constructionist perspectives. The chapter attempts to summarize the gender differences in scientific literacy and some common theories used in the literature to explain them. Chapter three presents the conceptual frameworks and methodology used to address the research questions. More specifically, Mean Score Difference (MSD), Multidimensional Differential Item Functioning (MDIF) and Multilevel Mediation (MLM) are used to examine the gender differences at mean score level, item level and system level respectively. MDIF is used for gender-biased item analysis, while MLM investigates the underlying relationship between gender differences in affective learning outcomes and gendered educational and occupational choices related to science. Chapter four discusses the results on gender differences in scientific literacy in terms of the cognitive and affective outcomes using MSD and MDIF. Chapter five looks into the gender differences in achievement-related choices using Eccles et al.’s (1983) model and MLM. Chapter six summarizes all the major findings of the study, examines the implications for policy and practice for schools, families, examination bodies and education authorities, reflects on the limitations of the present study, and makes recommendations for future research.

1.6 Summary

Chapter one started with an overview of the research background of the gender equity issues in the field of science education. Then, the limitations of previous gender studies and a possible solution to these limitations were discussed. Lastly, the research questions and significance of the study were listed.
In the next chapter, the related literature and theoretical frameworks about gender differences in science will be examined.

CHAPTER TWO LITERATURE REVIEW

To assess gender differences in scientific literacy, two fundamental but controversial concepts, “scientific literacy” and “gender”, have to be clarified. The purpose of this chapter is to examine these two concepts and to review critically the factors contributing to gender differences in scientific literacy.

2.1 Defining scientific literacy by historical review

2.1.1 Cognitive domain of scientific literacy

1950s

Scientific literacy has been widely recognized as the goal of science education since the 1950s (Hurd, 1958; Hurd, 1970; McCurdy, 1958; Rockefeller, 1958). However, its definition is not universally accepted in either the science or the education community. Its meaning swings like a pendulum with science education reforms in the United States and the Western world (DeBoer, 2000). The differences in its meanings and interpretations may give the public a general impression that scientific literacy is an ill-defined, diffuse and controversial concept (Champagne & Lovitts, 1989). To define scientific literacy, the meaning of “literacy” has to be clarified first. The common understanding of “literacy” is the ability to read and write, but its meaning changes over time. Being literate refers to being able to master the processes required to interpret culturally important information (deCastell & Luke, 1986), while OECD (2003) defines “literacy” as the capacity of students to analyze, reason and communicate effectively as they pose, solve and interpret problems in a variety of subject matter areas. Literacy can be further classified into inert literacy and liberating literacy. Inert literacy refers to the capacity of people to read a passage or sign a document, whereas liberating literacy means that people can read freely and widely in search of whatever information and knowledge they choose (Cremin, 1988).
Liberating literacy is therefore more widely accepted in a society that offers free access to materials that open people’s minds and allows them to explore new ideas and aspirations (Bybee, 1997b). Based on this perspective, John Dewey (1916) framed the early goal of science education as instrumental in nurturing independence of thought in all students and their ability to act independently of arbitrary authority, enabling them to participate more fully and effectively in an open democratic society. The function of schooling is to prepare students for adulthood in a democratic society, and therefore enabling students to carry out independent scientific inquiries and investigations in the laboratory becomes an essential component of scientific literacy in the school curriculum.

Whatever natural science may be for the specialist, for educational purposes it is knowledge of the conditions of human action. (Dewey, 1916, p. 228)

This functionalist perspective in the sociological account of scientific literacy is still prevalent in most European countries today; the European Commission (1995) White Paper on Education and Training stressed “the importance of adequate scientific awareness – not simply in the mathematical sense – to ensure that democracy can function properly. Democracy functions by majority decision on major issues which, because of their complexity, require an increasing amount of background knowledge. … At the moment, decisions in this area are all too often based on subjective and emotional criteria, the majority lacking the general knowledge to make an informed choice. Clearly this does not mean turning everyone into a scientific expert, but enabling them to fulfill an enlightened role in making choices which affect their environment and to understand in broad terms the social implications of debates between experts. There is similarly a need to make everyone capable of making considered decisions as consumers. (pp.
11-12)”

Some science educators suggest that without critical public engagement in, and debate about, the possible applications and implications of scientific advances, such as stem cell research and avian flu vaccination programs, the public may distrust scientific expertise and place unnecessary restrictions on future research and technological development. Such restrictions will in turn hinder the scientific and technological innovations and breakthroughs that may bring solutions to a plethora of issues our contemporary society is now facing (Osborne, 2007; Millar, 1996a).

1960s

In the 1960s, the focus of scientific literacy in the Western world shifted to the content knowledge of various scientific academic disciplines. Most curriculum content concentrated on teaching abstract models and scientific concepts of the natural world as organized by scientists. The tremendous driving force behind such science curriculum reform in this period was the direct competition between the United States and the former Soviet Union, in particular in areas strongly related to science and technology, such as nuclear weapons and space exploration. The key stimulus was the former Soviet Union’s successful launch of Sputnik I in October 1957. In 1961, President John F. Kennedy declared a goal of landing a man on the Moon and returning him safely to the Earth. This US national goal was translated into support for science programs that encouraged students to enter careers in science and engineering. The goal of science education in the period was thus to prepare young people to enter the fields of science and engineering, and the linkage between science and the daily applications of science in society was seldom mentioned. Diane Ravitch (1983) opined that it was pedagogically imprudent to focus so heavily on the structure of the disciplines at the expense of the interests and developmental needs of learners. The characteristics of scientific literacy in the 1960s can be found in Table 2.1.
Table 2.1: Characteristics of Scientific Literacy: The 1960s (Source: Bybee, 1997b, pp. 53-54)

National Science Teachers Association, Theory into Practice (1964): Conceptual Schemes
(1) All matter is composed of units called fundamental particles; under certain conditions these particles can be transformed into energy and vice versa.
(2) Matter exists in the form of units that can be classified into hierarchies of organizational levels.
(3) The behavior of matter in the universe can be described on a statistical basis.
(4) Units of matter interact. The basis of all ordinary interactions are electro-magnetic, gravitational, and nuclear forces.
(5) All interacting units of matter tend toward equilibrium states in which the energy content (enthalpy) is a minimum and the energy distribution (entropy) is most random. In the process of attaining equilibrium, energy transformations or matter transformations occur; nevertheless, the sum of energy and matter in the universe remains constant.
(6) One of the forms of energy is the motion of units of matter. Such motion is responsible for heat and temperature and for the states of matter: solid, liquid, and gaseous.
(7) All matter exists in time and space, and since interactions occur among its units, matter is subject in some degree to changes with time. Such changes may occur at various rates and in various patterns.

Paul Hurd and James Gallagher (1966)
(1) Appreciate the socio/historical development of science.
(2) Aware of the ethos of modern science.
(3) Understand and appreciate the social and cultural relationships of science.
(4) Recognise the social responsibility of science.

National Science Teachers Association, Theory into Practice (1964): Processes of Science
(1) Science proceeds on the assumption, based on centuries-old experience, that the universe is not capricious.
(2) Scientific knowledge is based on observation of samples of matter that are accessible to public investigation in contrast to purely private inspection.
(3) Science proceeds in a piecemeal manner, even though it also aims at achieving a systematic and comprehensive understanding of various sectors or aspects of nature.
(4) Science is not, and will probably never be, a finished enterprise, and there remains much more to be discovered about how things in the universe behave and how they are interrelated.
(5) Measurement is an important feature of most branches of modern science because the formulation, as well as the establishment, of laws are facilitated through the development of quantitative distinctions.

Milton Pella (1967)
(1) Interrelationships between science and society
(2) Ethics of science
(3) Nature of science
(4) Conceptual knowledge
(5) Science and technology
(6) Science in the humanities

1970s – 1980s

In the 1970s, however, the science, technology and society (STS) movement revised the goal of science education ‘to develop scientifically literate individuals who understand how science, technology, and society influence one another and who are able to use this knowledge in their everyday decision-making’ (NSTA, 1982; Solomon, 1993; Solomon & Aikenhead, 1994) (see Table 2.2 and Table 2.3). The National Science Teachers Association (NSTA) (1991) suggested that a scientifically and technologically literate person can demonstrate both intellectual capability and attributes in science:

Intellectual
1. uses concepts of science and of technology, as well as an informed reflection of ethical values, in solving everyday problems and making responsible decisions in everyday life, including work and leisure;
2. locates, collects, analyses, and evaluates sources of scientific and technological information and uses these sources in solving problems, making decisions, and taking actions;
3.
distinguishes between scientific and technological evidence and personal opinion and between reliable and unreliable information;
4. offers explanations of natural phenomena testable for their validity;
5. applies scepticism, careful methods, logical reasoning, and creativity in investigating the observable universe;
6. defends decisions and actions using rational argument based on evidence; and
7. analyses interactions among science, technology and society.

Attitudinal
8. displays curiosity about the natural and human-made world;
9. values scientific research and technological problem solving;
10. remains open to new evidence and the tentativeness of scientific/technological knowledge; and
11. engages in science/technology for excitement and possible explanations.

Societal
12. recognizes that science and technology are human endeavours;
13. weighs the benefits/burdens of scientific and technological development;
14. recognizes the strengths and limitations of science and technology for advancing human welfare; and
15. engages in responsible personal and civic actions after weighing the possible consequences of alternative options.

Interdisciplinary
16. connects science and technology to other human endeavours e.g. history, mathematics, the arts, and the humanities; and
17. considers the political, economic, moral and ethical aspects of science and technology as they relate to personal and global issues.

Table 2.2: Characteristics of Scientific Literacy: The 1970s (Source: Bybee, 2008 p. 87)

Michael Agin (1974)
(1) Science and Society
(2) Ethics of Science
(3) Nature of Science
(4) Knowledge of the Concepts of Science
(5) Science and Technology
(6) Science and the Humanities

Victor Showalter (1974)
(1) Nature of Science
(2) Concepts in Science
(3) Processes of Science
(4) Values of Science
(5) Science and Society
(6) Interest in Science
(7) Skills Associated with Science

Benjamin Shen (1974)
(1) Practical Science Literacy
(2) Civic Science Literacy
(3) Cultural Science Literacy

Table 2.3: Characteristics of Scientific Literacy: The 1980s (Source: Bybee, 2008 p. 88)

National Science Teachers Association, Science-Technology-Society: Science Education for the 1980s (NSTA, 1982)
(1) Scientific and technological process and inquiry skills
(2) Scientific and technological knowledge
(3) Skills and knowledge of science and technology in personal and social decisions
(4) Attitudes, values, and appreciation of science and technology
(5) Interactions among science-technology-society via context of science-related societal issues

National Commission on Excellence in Education, A Nation at Risk (NCEE, 1983)
1. Concepts, laws, and processes of the physical and biological sciences
2. Methods of scientific inquiry and reasoning
3. Applications of knowledge to everyday life
4. Social and environmental implications of scientific and technological development

Improving Indicators of the Quality of Science and Mathematics Education in Grades K-12 (Murnane & Raizen, 1988)
1. The nature of the scientific worldview
2. The nature of the scientific enterprise
3. Scientific habits of mind
4. Science and human affairs

American Association for the Advancement of Science, Science for All Americans (AAAS, 1989)
1. The nature of science
2. The nature of mathematics
3. The nature of technology
4. The physical setting
5. The living environment
6. The human organism
7. Human society
8. The designed world
9. The mathematical world
10. Historical perspectives
11. Common themes
12. Habits of mind

It is clear that a more realistic framework and long-term developmental goal of scientific literacy can be visualized with a single comprehensive model. Bybee (1997a) achieved this developmental goal by proposing a model that consists of four dimensions: nominal, functional, conceptual and procedural, and multidimensional scientific literacy (see Table 2.4).

Table 2.4: A multidimensional and hierarchical model of scientific literacy

Nominal Scientific Literacy
(1) Identifies terms, questions, as scientific but demonstrates incorrect topics, issues, information, knowledge, or understanding.
(2) Has misconceptions of scientific concepts and processes.
(3) Gives inadequate and inappropriate explanations of scientific phenomena.
(4) Expresses scientific principles in a naive manner.

Functional Scientific Literacy
(1) Uses scientific vocabulary.
(2) Defines scientific terms correctly.
(3) Memorizes technical words.

Conceptual and Procedural Scientific Literacy
(1) Understands conceptual schemes of science.
(2) Understands procedural knowledge and skills of science.
(3) Understands relationships among the parts of a science discipline and the conceptual structure of the discipline.
(4) Understands organizing principles and processes of science.

Multidimensional Scientific Literacy
(1) Understands the unique qualities of science.
(2) Differentiates science from other disciplines.
(3) Knows the history and nature of science disciplines.
(4) Understands science in a social context.

The first two dimensions focus on the ability to use scientific language to describe and explain observations, while the third dimension requires higher cognitive skills to understand scientific concepts and processes. The last and highest dimension, multidimensional scientific literacy, consists of the sociocultural dimensions of science and technology: the history, nature and social context of science.
An attractive feature of Bybee’s multidimensional and hierarchical model is that it captures the key ideas of earlier frameworks for scientific literacy at the right dimensions (Pella, O’Hearn, & Gala, 1966; Agin, 1974; Showalter, 1974; Murnane & Raizen, 1988). Moreover, it incorporates two essential standards for measuring achievement in scientific literacy, the National Science Education Standards and the Benchmarks for Science Literacy (see Table 2.5).

Table 2.5: Content Summary for the National Science Education Standards and Benchmarks for Science Literacy (Source: Bybee, 2008 p. 88)

National Science Education Standards
Unifying Concepts and Processes
Science as Inquiry
Physical Science
Life Science
Earth and Space Science
Science and Technology
Science in Personal and Social Perspectives
History and Nature of Science

Benchmarks for Science Literacy
The Nature of Science
The Nature of Mathematics
The Nature of Technology
The Physical Setting
The Living Environment
The Human Organism
Human Society
The Designed World
The Mathematical World
Historical Perspectives
Common Themes
Habits of Mind

From the review above, the assessment framework of PISA 2006 science is in line with Bybee’s multidimensional definition (see Table 2.6), and puts strong emphasis on the second view of science education: to prepare future literate citizens with the ability to apply their scientific knowledge to daily life and to make informed decisions in real-life contexts, rather than a specialist education that makes them very proficient within their specialist domain but gives them no broad education about science.
Table 2.6: Scientific competency framework of PISA 2006 (OECD, 2006)

Dimensions 11 and their sub-dimensions

Identifying scientific issues (Conceptual and Procedural Scientific Literacy)
(1) Recognising issues that it is possible to investigate scientifically
(2) Identifying keywords to search for scientific information
(3) Recognising the key features of a scientific investigation

Explaining phenomena scientifically (Nominal Scientific Literacy)
(1) Applying knowledge of science in a given situation
(2) Describing or interpreting phenomena scientifically and predicting changes
(3) Identifying appropriate descriptions, explanations, and predictions

Using scientific evidence (Multidimensional Scientific Literacy)
(1) Interpreting scientific evidence and making and communicating conclusions
(2) Identifying the assumptions, evidence and reasoning behind conclusions
(3) Reflecting on the societal implications of science and technological developments

11 The terms in brackets are components of Bybee’s (1997a) multidimensional and hierarchical model.

2.1.2 Affective domain of scientific literacy

2.1.2.1 Taxonomy of affective domain elements in science education

In the past 30 years, the affective learning outcomes of science education have been considered one of the key components of scientific literacy among science educators, and substantial work has been done in the science education research community (Baker & Doran, 1975; Choppin & Frankel, 1976; Osborne et al., 2003). However, their content is not well defined. A typical taxonomy of affective behaviors in science was worked out by Klopfer (1971) (see Table 2.7).
Table 2.7: Klopfer’s taxonomy of affective behaviors in science education
(1) the manifestation of favourable attitude toward science and scientists;
(2) the acceptance of scientific enquiry as a way of thought;
(3) the adoption of scientific attitudes;
(4) the enjoyment of science learning experiences;
(5) the development of interest in science and science-related activities; and
(6) the development of an interest in pursuing a career in science or science related work.

Later studies extended Klopfer’s (1971) taxonomy to cover a wider range of components such as attitudes, values, beliefs, interests and motivation in affective measurements (see Table 2.8, Simpson et al., 1994 and Osborne et al., 2003). The following section reviews some key affective factors that are used in this study.

Table 2.8: A range of components in affective domains of scientific literacy
(1) the perception of the science teacher;
(2) anxiety toward science;
(3) the value of science;
(4) self-esteem in science;
(5) motivation towards science;
(6) enjoyment of science;
(7) attitudes of peers and friends towards science;
(8) attitudes of parents towards science;
(9) the nature of the classroom environment;
(10) achievement in science; and
(11) fear of failure on course.

2.1.2.2 Science self-concept

According to Wigfield and Karpathian (1991), self-concept is defined as individuals’ affective reactions to their characteristics, and overall evaluation of themselves as persons. Shavelson et al. (1976) studied a number of definitions of self-concept. They came to the conclusion that self-concept is a continual process of reinforcement by evaluative inferences and that it reflects both cognitive and affective responses. Scheirer and Kraut (1979) pointed out that self-concept is a complex construct composed of descriptive, evaluative, comparative, and affective elements.
Pajares (1996) maintained that self-concept includes competence judgments coupled with evaluative reactions and feelings of self-worth. Markus and Nurius (1986) viewed self-concept as “a system of affective-cognitive structures about the self that lends structure and coherence to the individuals’ self-relevant experiences”. Modern theorists argue that self-concept is a multidimensional and subject-specific construct rather than a global measurement of self-related experiences (Harter, 1990; 1998; Marsh, 1993). For example, if a researcher is interested in the relation of self-concept to science performance, then he or she should measure self-concept in this domain, rather than just using a general self-concept measure. Students should be asked to respond to the item “I learn school science topics quickly” rather than “I learn school subjects quickly”. In summary, first, science self-concept consists of two components, the affective and the cognitive evaluation of self-experiences. Second, self-concept is a multidimensional and domain-specific construct.

2.1.2.3 Motivation in science learning

The word “motivation” is derived from the Latin “movere”, which means “to move”. Motivation refers to an internal state that arouses, directs, and sustains students’ behaviour. Motivation can be further divided into intrinsic motivation and extrinsic motivation. According to self-determination theory (SDT), intrinsic motivation is only possible when individuals freely choose their own actions; that is, they are self-determined (Ryan & Deci, 2000). Extrinsic motivation, on the other hand, applies when individuals act for an external purpose, such as receiving a reward.

Interest in science learning

Interest is more specific than intrinsic motivation, which is a broader motivational characteristic (Hidi & Harackiewicz, 2001). For practical reasons, interest can be divided into individual and situational interest.
Individual interest is a relatively stable evaluation of certain domains, while situational interest is an emotional state aroused by specific features of an activity or a task. Feeling-related and value-related valences are two distinguishable aspects of individual interest (Schiefele, 2001). Feeling-related valences refer to the feelings that are associated with an object or an activity itself, such as involvement, stimulation, or flow. Value-related valences refer to the attribution of personal significance or importance to an object.

Enjoyment of science learning

According to Csikszentmihalyi (1990, 1996), enjoyment refers to “flow” activities that provide students with a feeling of creative accomplishment and satisfaction. Flow is a state of deep absorption in an activity that is intrinsically enjoyable. So, enjoyment of science learning is defined as total engagement in a science learning activity that is intrinsically enjoyable (Shernoff et al., 2003).

Personal value of science

The personal value of science is defined as students’ beliefs about the value of science in terms of the relevance and importance of science around them, both now and in the future (PISA, 2006). This definition of personal value is identical to attainment value in the Eccles et al. (1983) Expectancy-value Model for Achievement-related Choices. Attainment value in Eccles’ (1983) model refers to the needs, personal values, and explicit motives that an activity fulfills. As they grow up, individuals develop an image of who they are and what they would like to be.

Instrumental motivation to learn science

Instrumental motivation is defined as motivation that derives from future goals. It is a type of extrinsic motivation to learn school science and get good grades in it for the sake of future careers and studies. The activities engaged in have a utility value when they are perceived as instrumental for achieving other goals in the near or distant future (Eccles & Wigfield, 2002).
2.2 Gender differences in scientific literacy

2.2.1 Defining gender: the nature versus nurture debate

The complexity of using the terms ‘sex’ versus ‘gender’ in educational and social research is connected with the ‘nature/nurture’ argument. The use of ‘sex’ or ‘gender’ depends on political standpoints and beliefs about whether the origin of sex differences is inherent and biological, socially produced, or a mixture of both (Francis & Skelton, 2005). Traditionally, ‘sex’ is defined as the biological distinction between males and females based upon their genetic composition (males have XY chromosomes while females have XX chromosomes), reproductive anatomy and physiology. However, many research findings from feminist psychologists and social constructionists demonstrate that gender difference is nurtured through a complex system of social classification and hierarchy (Crawford & Unger, 1994; Crawford et al., 1995; Unger & Crawford, 1996). In between, the proponents of ‘brain sex’ differences and socio-biologists argue that it is extremely difficult to isolate biological influences from sociocultural and environmental factors and that the two influences are reciprocal in nature (Halpern et al., 2007). These authors choose to use a biopsychosocial model (Halpern, 2000, 2004) to explain ‘sex’ differences as arising from the complex interaction of biological and sociocultural/environmental variables.

Socialization practices are undoubtedly important, but there is also good evidence that biological sex differences play a role in establishing and maintaining cognitive sex differences. (Halpern, 2000 p. xvii)

In short, the distinction between ‘sex’ and ‘gender’ is intertwined with the debate over nature or nurture as the key factor determining the differences in cognitive and affective learning outcomes between boys and girls. In this thesis, the influence of sociocultural factors on gender differences in science literacy was evaluated.
The term “gender” has been chosen to emphasize sociocultural effects on gender differences.

2.2.2 Gender differences in cognitive learning outcomes

A typical technique used in gender difference studies is meta-analysis, a statistical method for aggregating the findings of all available studies that address the same question (Hedges & Becker, 1986). If there is a gender difference in science abilities, meta-analysis is able to indicate which gender achieves more highly and the magnitude of the difference. For each gender study, Cohen’s (1988) effect size d for the gender difference is calculated with the following formula:

$$d = \frac{M_{boys} - M_{girls}}{\sqrt{\left(SD_{boys}^{2} + SD_{girls}^{2}\right)/2}}$$

where $M_{boys}$ is the mean score for boys, $M_{girls}$ is the mean score for girls, and $\sqrt{\left(SD_{boys}^{2} + SD_{girls}^{2}\right)/2}$ is the pooled standard deviation. The value of d measures how far apart the male and female means are in standardized units, and indicates the magnitude (effect size) of the gender difference across all the research findings (Hyde, 2005). A positive value of d indicates that males score higher, and a negative value that females score higher. According to Cohen (1988), the effect size is small, medium and large if d is less than or equal to 0.2, equal to 0.5, and equal to or larger than 0.8 respectively. Hyde (1990) suggested that meta-analysis across all the available studies is a better alternative for investigating gender differences: findings on gender differences are notoriously inconsistent across studies, and meta-analysis overcomes this problem by aggregating many studies involving tens of thousands and even millions of participants. This provides more reliable research findings than any individual study. Using the meta-analysis method, Hyde (2005) found that out of 128 valid effect sizes, 124 (about 97%) were similar. The effect sizes for science-related gender studies ranged from +0.19 (spatial visualization) to +0.73 (mental rotation).
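The effect-size calculation described above can be sketched in a few lines of Python. This is only a minimal illustration and not part of the original analysis: the function names `cohens_d` and `describe_effect` and the sample scores are hypothetical, and labelling every value between Cohen's benchmarks of 0.2 and 0.8 as "intermediate" is an assumption made here for convenience.

```python
import math
from statistics import mean, stdev


def cohens_d(boys, girls):
    """Cohen's d with the pooled SD taken as sqrt((SD_boys^2 + SD_girls^2) / 2)."""
    pooled_sd = math.sqrt((stdev(boys) ** 2 + stdev(girls) ** 2) / 2)
    return (mean(boys) - mean(girls)) / pooled_sd


def describe_effect(d):
    """Label |d| against Cohen's (1988) benchmarks of 0.2 and 0.8.

    Treating everything between the two benchmarks as "intermediate"
    is an assumption of this sketch, not a rule taken from the text.
    """
    size = abs(d)
    if size <= 0.2:
        return "small"
    if size >= 0.8:
        return "large"
    return "intermediate"


# Hypothetical test scores, invented purely for illustration.
boys_scores = [52.0, 55.0, 61.0, 58.0, 49.0]
girls_scores = [50.0, 54.0, 60.0, 57.0, 47.0]

d = cohens_d(boys_scores, girls_scores)
direction = "boys higher" if d > 0 else "girls higher" if d < 0 else "no difference"
print(f"d = {d:.2f} ({describe_effect(d)}, {direction})")
```

A positive result means boys score higher, matching the sign convention in the text. The sketch averages the two variances equally, as the formula does; when the two groups differ greatly in size, a size-weighted pooled standard deviation is sometimes used instead.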
Based on these findings, Hyde (2005) proposed the gender similarities hypothesis:

The gender similarities hypothesis holds that males and females are similar on most, but not all, psychological variables. That is, men and women, as well as boys and girls, are more alike than they are different. (Hyde, 2005 p.581)

Halpern (2000), however, argued that some cognitive tasks did show sex differences and that some of these differences were lost in the aggregated results of meta-analysis. Halpern disagreed with Hyde's labelling of effect sizes as small or large, affirming that small differences might accumulate into very big differences in the long run. In Valian’s (1998) analysis of females’ slow advancement in academia and other professions, she also showed how small differences were compounded over time to create big differences.

Gender differences in science performance

Using the same meta-analysis method as Hyde, this section summarizes the gender studies of science in the past few decades.

Table 2.9 summarizes the gender studies of science performance in elementary school education. Out of these twenty four studies, thirteen (54%) show that boys had an advantage in biological science and physics. Significant gender differences were not found in general science, geology or earth sciences and chemistry (Becker, 1989). According to Cohen (1988), the effect sizes of gender differences in these studies were relatively small (Cohen’s d = 0.02 to 0.43; see footnote 12).
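As an illustration of how the effect sizes reported in the tables of this section are obtained, the formula for Cohen's d given in Section 2.2.2 can be sketched in a few lines of Python. This is a minimal sketch rather than the procedure used in any of the cited studies: the function names and the sample means and standard deviations are hypothetical, and the cut-offs in `label_effect_size` are one assumed reading of Cohen's (1988) benchmarks (below 0.5 labelled small, from 0.5 up to 0.8 medium, 0.8 and above large).

```python
import math


def cohens_d(mean_boys, mean_girls, sd_boys, sd_girls):
    """Cohen's d with the pooled SD defined as sqrt((SD_boys^2 + SD_girls^2) / 2)."""
    pooled_sd = math.sqrt((sd_boys ** 2 + sd_girls ** 2) / 2)
    return (mean_boys - mean_girls) / pooled_sd


def label_effect_size(d):
    """Label |d| with assumed cut-offs based on Cohen's 0.2 / 0.5 / 0.8 benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    return "small"


# Hypothetical scores, not taken from any study cited in this chapter.
d = cohens_d(mean_boys=550, mean_girls=500, sd_boys=100, sd_girls=100)
print(d, label_effect_size(d))
```

With this sign convention a positive d means that boys scored higher and a negative d that girls scored higher, matching the +ve and -ve codes used in the tables.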
Table 2.9: Gender differences in science performance at elementary schools

Study | Region | Grade | Area of study | Effect size | Results
Ashbaugh (1968) | Georgia | 5 | Geology | +ve | Boys outperformed girls
Ashbaugh (1968) | Georgia | 4 | Geology | 0.43 | Boys outperformed girls
Ashbaugh (1968) | Georgia | 6 | Geology | 0.26 | Boys outperformed girls
Law (1997) | HK | 4 | General science | +ve | Boys outperformed girls significantly; greatest gender difference internationally
Keeves & Kotte (1996) | IEA Science | 4 | General science | +ve | Boys outperformed girls significantly
Tamir & Amir (1975) | Israel | 1 | Physical science | 0.36 | Boys outperformed girls
Tamir & Amir (1975) | Israel | 2 | Physical science | 0.16 | Boys outperformed girls
Hsu (2008) | Taiwan | 6 | General science | NA | No significant gender differences
Marjoribanks (1976) | UK | 6 | Physical science | -0.12 | Girls outperformed boys
Allen (1970) | USA | 1 | Physical science | -ve | Girls outperformed boys
Allen (1975) | USA | 5 | Biology | +ve | Boys outperformed girls
Dimitrov (1999) | USA | 5 | Physical science | +ve | Significant gender differences for the high ability students; no significant gender differences at the low and medium ability level; significance of the interaction, gender x format x ability
Anderson & Butts (1980) | USA | 6 | Electricity | +ve | Boys outperformed girls
Allen (1973) | USA | 3 | Physical science | 0.42 | Boys outperformed girls
Bridgham (1969) | USA | 3 | Electrostatics | 0.42 | Boys outperformed girls
Fuller, May & Butts (1979) | USA | 3 | Life cycles | 0.37 | Boys outperformed girls
Shrigley (1972) | USA | 6 | Earth science | 0.24 | Boys outperformed girls; no significant differences in scores of boys and girls
Scott & Siegel (1965) | USA | 6 | Science concepts | 0.12 | Boys outperformed girls
Skinner (1967) | USA | 5 | Geology | 0.02 | Boys outperformed girls
Wallach & Kogan (1966) | USA | 5 | General science | -0.03 | Girls outperformed boys
Scott & Siegel (1965) | USA | 4 | Science concepts | -0.05 | Girls outperformed boys
Scott & Siegel (1965) | USA | 5 | Science concepts | -0.18 | Girls outperformed boys
Bowyer & Linn (1978) | USA | 6 | Science literacy | -0.15 | Girls outperformed boys

Note: The codes for effect size are as follows: positive (+ve) difference, boys outperformed girls; negative (-ve) difference, girls outperformed boys. An effect size that cannot be determined is denoted as “NA”.

Footnote 12: Absolute values of Cohen’s d are reported here.

Table 2.10 summarizes the gender studies of science performance at high schools. Out of twenty eight studies, twenty two (79%) show that boys consistently outperformed girls in general science and science subject domains from grade 8 to 12. Girls outperformed boys in three studies (11%). The effect sizes of gender differences in these studies were all relatively small (Cohen’s d = 0.12 to 0.42).
Table 2.10: Gender differences in science performance at high schools Study Region Keeves (1975) Australia Keeves & Kotte (1996) Walberg (1969) IEA Science Canada Hart (1978) Canada Grade Area of study Science 7 performance General 8 science 12 Physics BSCS 12 Biology Yip et al (2004) HK 10 Ho et al (2005) HK 10 Ho et al (2008) HK 10 General science General science General science Effect size 0.32 0.42 0.12 Boys outperformed girls +ve Boys outperformed girls; Boys scored higher than girls at the higher percentiles (75th and above) +ve Boys outperformed girls; +ve Boys outperformed girls; Law (1997) HK 8 Mullis et al (2000) HK 8 General science +ve Yung et al (2006) HK 8 General science +ve Martin et al (2008) HK 8 General science +ve Tamir & Kempa (1975) Israel 10 Tamir (1974) Tamir (1974) Israeli Israeli 12 12 Tamir (1976) Israeli 12 36 Boys outperformed girls Boys outperformed girls significantly Boys outperformed girls +ve General science Phys, Chemistry & Biology Botany Zoology BSCS Biology Results +ve Boys achieve significantly more than girls; Greatest gender difference internationally Boys outperformed girls; No significant differences in achievement of boys and girls; Girls improved significantly from 1995 to 1999 while boys showed non-significant improvement Boys outperformed girls significantly; Girls improved significantly from 1995 to 2003 while boys showed no improvement Boys outperformed girls; Both girls and boys improved significantly from 1995 to 2007 NA No significant gender differences +ve -ve Boys outperformed girls Girls outperformed boys 0.12 Boys outperformed girls Hsu (2008) Taiwan 7-8 Hsu (2008) Taiwan 9-10 Strope & Braswe11 (1966) USA 13 Kruglak (1970) USA 13 USA 6-8 USA 11 USA 11 USA 8 McDuffie & Beehier (1978) Field & Cropley (1969) Ogden & Brewster (1977) Babikian (1971) General science General science NA NA Astronomy concepts Freshman physics Science performance General science Science performance Physical science Science performance 
+ve No significant gender differences No significant gender differences Men did better than women in learning astronomy facts and concepts. +ve Boys outperformed girls -ve Girls outperformed boys 0.38 Boys outperformed girls 0.36 Boys outperformed girls 0.36 Boys outperformed girls Ogden & USA 11 0.28 Boys outperformed girls Brewster (1977) Doran & Sellers USA 10 Biology 0.22 Boys outperformed girls (1978) Lynch et al Physical 0.22 Boys outperformed girls USA 8 (1979) science Marek (1981) USA 10 Biology 0.18 Boys outperformed girls Thomas & Snider USA 8 Chemistry 0.13 Boys outperformed girls (1969) Sieveking & College USA 13 -0.12 Girls outperformed boys Savitsky (1969) chemistry

Note: The codes for effect size are as follows: positive (+ve) difference, boys outperformed girls; negative (-ve) difference, girls outperformed boys. An effect size that cannot be determined is denoted as “NA”.

To sum up, boys had better science performance than girls. Boys tended to outperform girls in biology and physics at elementary level, and boys’ advantage in physics (Cohen’s d = 0.36 to 0.42) persisted through high school. However, no significant gender differences were found in general science and chemistry at either elementary or high school level. It should be noted that all the effect sizes obtained from these studies were relatively small (Cohen’s d = 0.02 to 0.43).

2.2.3 Gender differences in affective learning outcomes

In the past two decades, numerous studies of gender differences in affective learning outcomes have been conducted. Patrick et al (2009) investigated kindergarten boys and girls in the United States and found that boys in regular classrooms liked science more than girls. Boys were also found to have higher science self-concept than girls in American elementary schools and German high schools (e.g. Andre et al., 1999; Rudasill & Callahan, 2010; Häussler & Hoffmann, 2002).
Table 2.11: Gender differences in affective learning outcomes at elementary schools

Study | Region | Grade | Area of study | Effect size | Results
Patrick, Mantzicopoulos & Samarapungavan (2009) | USA | 5 | Interest in science learning | +ve | Boys in regular classrooms reported liking science more than did girls
Jones, Howe & Rua (2000) | USA | 6 | Interest in science | +ve | Boys reported significantly more interest in learning about the listed science topics than girls
Andre, Whigham, Hendrickson, & Chambers (1999) | USA | 4-6 | Science self-concept | +ve | Boys had higher science self-concepts than girls
Rudasill & Callahan (2010) | USA | 5-12 | Self-perception of ability in science | +ve | Boys had higher self-perceptions of ability in science than girls

Note: The codes for effect size are as follows: positive (+ve) difference, boys outperformed girls; negative (-ve) difference, girls outperformed boys.

In terms of motivation to learn science, personal value of science and enjoyment of science learning, various research findings suggested that boys had higher values in this affective domain than girls at high schools (e.g. Salta & Tzougraki, 2004; Weinburgh, 2000; Cheung, 2008; Cheung, 2009b). A summary of these research findings can be found in Table 2.11 and Table 2.12. Out of these thirty studies, twenty five (83%) show that boys had higher interest and enjoyment in science learning, personal value of science, science self-concept and motivation to learn science than girls. Girls showed higher motivation than boys to learn science in three (10%) international studies. Unlike cognitive development in science, the results in Table 2.11 and Table 2.12 also suggest that boys on average develop better affective learning outcomes in science than girls early in elementary school. The same pattern persists throughout high school education. The effect sizes of these studies range from small (Cohen’s d = 0.08) to medium (Cohen’s d = 0.60).
Table 2.12: Gender differences in affective learning outcomes at high schools

Study | Region | Grade | Area of study | Effect size | Results
Häussler & Hoffmann (2002) | Germany | 7 | Physics related self-concept | +ve | Boys were significantly higher than girls in physics related self-concept
Häussler & Hoffmann (2002) | Germany | 7 | Physics-related interest | +ve | Boys were significantly higher than girls in physics related interest
Salta & Tzougraki (2004) | Greece | 11 | Importance of chemistry | 0.08 | No significant difference in the level of interest, usefulness, and importance attributed to chemistry between boys and girls
Salta & Tzougraki (2004) | Greece | 11 | Usefulness of chemistry | 0.10 | No significant difference in the level of interest, usefulness, and importance attributed to chemistry between boys and girls
Salta & Tzougraki (2004) | Greece | 11 | Interest in chemistry | 0.11 | No significant difference in the level of interest, usefulness, and importance attributed to chemistry between boys and girls
Cheung (2008) | HK | 10 | Personal value of science | 0.11 - 0.33 | Boys had significantly higher personal value of science than girls
Cheung (2008) | HK | 10 | Self-concept in science | 0.45 - 0.59 | Boys had significantly higher self-concept in science than girls
Cheung (2008) | HK | 10 | Interest in science | 0.30 - 0.43 | Boys had significantly higher interest in science than girls
Cheung (2008) | HK | 10 | Enjoyment of science learning | 0.39 - 0.52 | Boys had significantly higher enjoyment of science learning than girls
Cheung (2008) | HK | 10 | Instrumental motivation to learn science | 0.20 - 0.37 | Boys had significantly higher instrumental motivation to learn science than girls
Cheung (2009b) | HK | 10-12 | Attitude toward chemistry lesson | +ve | Boys like chemistry theory lessons more than girls at grade 10-11
Steinkamp & Maehr (1984) | International | NA | Motivation to learn botany | -0.60 | Girls were significantly more motivated to learn botany
Steinkamp & Maehr (1984) | International | NA | Motivation to learn chemistry | -0.31 | Girls were significantly more motivated to learn chemistry
Steinkamp & Maehr (1984) | International | NA | Motivation to learn biology | -0.28 | Girls were significantly more motivated to learn biology
Steinkamp & Maehr (1984) | International | NA | Importance of science | 0.13 | Boys placed significantly higher value on science than girls
Steinkamp & Maehr (1984) | International | NA | Enjoyment of science learning | 0.14 | Boys had significantly higher enjoyment of science learning than girls
Steinkamp & Maehr (1984) | International | NA | Motivation in geology | 0.14 | Boys were significantly more motivated to learn geology
Steinkamp & Maehr (1984) | International | NA | Motivation to learn general science | 0.20 | Boys were significantly more motivated to learn general science
Steinkamp & Maehr (1984) | International | NA | Motivation to learn physical science | 0.35 | Boys were significantly more motivated to learn physical science
Weinburgh (2000) | USA | 6-8 | Science self-concept | +ve | Boys had higher self-concept in science than girls
Weinburgh (2000) | USA | 6-8 | Enjoyment of science learning | +ve | Boys had significantly higher enjoyment of science learning than girls
Weinburgh (2000) | USA | 6-8 | Motivation to learn science | +ve | Boys had higher motivation to learn science than girls
Bryan, Glynn & Kittleson (2011) | USA | 9-10 | Intrinsic motivation to learn science | 0.34 | Boys had significantly higher intrinsic motivation to learn science
Lee (1998) | USA | 16-18 | Interested in science | +ve | Boys were more interested in science than girls
National Center for Education Statistics (1997) | USA | 7 & 10 | Enjoyment of science learning | NA | 7th- and 10th-grade boys and girls were equally likely to enjoy mathematics and science

Note: The codes for effect size are as follows: positive (+ve) difference, boys outperformed girls; negative (-ve) difference, girls outperformed boys. An effect size that cannot be determined is denoted as “NA”.

2.2.4 Gender differences in science educational and occupational choices

Eccles and co-workers have conducted a large number of gender studies on science educational and occupational choices in the past thirty years. They found that girls’ disadvantages in science-related achievement choices were highly correlated with affective learning outcomes in science (e.g. Frome, Alfeld, Eccles & Barber, 2006). High school boys were also more likely than girls to choose careers in science-related fields in the United States, Israel and Hong Kong (Jacobs, 2006; Friedler & Tamir, 1999; Nagy et al., 2006; Census and Statistics Department, 2006). The European Commission (2009) reported that females in scientific research remained a minority, accounting for 30% of researchers in the European Union in 2006. Similarly, girls had lower motivation to choose advanced science courses at high schools and universities. The University Grants Committee of Hong Kong (UGC, 2011) reported a consistently lower course enrolment rate for girls than for boys in science disciplines from sub-degree to postgraduate levels, except at taught postgraduate level. A summary of the findings can be found in Table 2.13 and Table 2.14.
Table 2.13: Gender differences in science educational and occupational choices at high schools

Study | Region | Grade | Area of study | Effect size | Results
Nagy, Trautwein, Baumert, Köller & Garrett (2006) | Germany | 10-12 | Effects of academic self-concept and intrinsic value of biology in course choices | -ve | More girls than boys selected advanced biology course in grade 12
Ho et al (2008) | HK | 10 | Future-oriented motivation to learn science and have science careers | +ve | Boys had significantly higher future-oriented motivation to learn science and have science careers than girls
Friedler & Tamir (1990) | Israeli | 5-12 | Science course taking pattern at elementary and secondary levels | +ve | Boys selected more science courses, and displayed greater interest in science careers than girls
Mau (2003) | USA | 8 | Factors that influence persistence in science and engineering career aspirations | +ve | Males were more likely to persist in science and engineering career aspirations than females
National Center for Education Statistics (1997) | USA | 8 | A gap in career aspirations of boys and girls in science or engineering | +ve | Boys were more likely to aspire to be scientists and engineers
National Center for Education Statistics (1997) | USA | 9-12 | Science course taking pattern in high school | NA | Female students were just as likely as male students to take advanced science courses in high school; physics is the exception
Dunteman, Wisenbaker & Taylor (1978) | USA | 12 | Sex differences in college science program participation | +ve | Boys were more likely than girls to major in science
Frome, Alfeld, Eccles & Barber (2006) | USA | 12 | Effect of intrinsic value of physical science on young women’s occupational aspirations | +ve | Females were more likely than males to drop out of occupations in traditionally male-dominated fields
Jacobs, Chhin & Bleeker (2006) | USA | 12 | Parents’ expectations and young adult children’s gender-typed occupational choices | +ve | Boys were more likely than girls to choose careers in science-related fields

Note: The codes for effect size are as follows: positive (+ve) difference, boys outperformed girls; negative (-ve) difference, girls outperformed boys. An effect size that cannot be determined is denoted as “NA”.

Out of nine gender studies related to science educational and occupational choices at high schools, seven (78%) show that boys were more likely than girls to choose careers and studies in science-related fields (see Table 2.13). Only one study (11%), in Germany, shows that more girls than boys tended to enroll in advanced biology courses at grade 12.

Table 2.14: Gender differences in science educational and occupational choices at universities

Study | Region | Grade | Area of study | Effect size | Results
UGC (2011) | HK | Sub-degree | Science course taking pattern at sub-degree level | +ve | Boys were more likely to take science course than girls
UGC (2011) | HK | undergrad | Science course taking pattern at postsecondary level | +ve | Boys were more likely to take science course than girls
UGC (2011) | HK | postgrad | Science course taking pattern at taught postgraduate level | -ve | Girls were more likely to take taught postgraduate science course than boys
UGC (2011) | HK | postgrad | Science course taking pattern at research postgraduate level | +ve | Boys were more likely to take research postgraduate science course than girls
European Commission (2009) | Europe | postgrad | Females in scientific research | NA | Females in scientific research remain a minority, accounting for 30% of researchers in the EU in 2006
Lee (1998) | USA | undergrad | Interest in becoming a biologist | -0.50 | Girls were significantly more interested in becoming biologists than boys
Lee (1998) | USA | undergrad | Interest in becoming a chemist | 0.18 | Boys were significantly more interested in becoming chemists than girls
Lee (1998) | USA | undergrad | Interest in becoming a scientist | 0.32 | Boys were significantly more interested in becoming scientists than girls
Lee (1998) | USA | undergrad | Interest in becoming a physicist | 0.63 | Boys were significantly more interested in becoming physicists than girls
Webb, Lubinski, & Benbow (2005) | USA | undergrad | Course taking pattern of medical science | -0.39 | More girls took medical science as majors than boys
Webb, Lubinski, & Benbow (2005) | USA | undergrad | Course taking pattern of biological science | -0.26 | More girls took biological science as majors than boys
Webb, Lubinski, & Benbow (2005) | USA | undergrad | Course taking pattern of chemistry | -0.01 | More girls took chemistry as majors than boys
Webb, Lubinski, & Benbow (2005) | USA | undergrad | Course taking pattern of earth science | 0.01 | No gender difference in taking Earth science as majors
Webb, Lubinski, & Benbow (2005) | USA | undergrad | Course taking pattern of physical science | 0.34 | More boys took physical science as majors than girls
National Science Foundation (2010) | USA | undergrad | Female share of science and engineering bachelor’s degrees, by field in 2007 | NA | Males earned a majority of bachelor’s degrees awarded in engineering, computer sciences, and physics. Females earned half or more of bachelor’s degrees in psychology, agricultural sciences, biological sciences and chemistry
National Center for Education Statistics (1997) | USA | undergrad | Science course enrollment pattern at postsecondary level | NA | At postsecondary level, women were less likely than men to earn a degree in physical sciences and computer sciences; the exception is in life science degrees
National Science Foundation (2010) | USA | undergrad | Representation of females in science and engineering | NA | Number of women in science and engineering occupations rising from 12% to 27% between 1980 and 2007
National Center for Education Statistics (1997) | USA | postgrad | Science course enrollment pattern at postgraduate level | +ve | Males were more likely than women to earn master’s degrees in science

Note: The codes for effect size are as follows: positive (+ve) difference, boys outperformed girls; negative (-ve) difference, girls outperformed boys. An effect size that cannot be determined is denoted as “NA”.

Similar to the high school situation, nine out of eighteen studies (50%) show that boys were more likely than girls to enroll in science programs at sub-degree and degree levels in Hong Kong, Europe and the United States (see Table 2.14). However, five out of eighteen studies (28%) suggest that girls were more likely than boys to take science programs in Hong Kong (taught postgraduate programs) and the United States (undergraduate programs). The effect sizes of gender differences found in these studies range from small (Cohen’s d = 0.01) to medium (Cohen’s d = 0.63).

To sum up, boys were more likely than girls to make educational and occupational choices related to science at both high school and university levels. The effect sizes of gender differences on these choices ranged from small (Cohen’s d = 0.01) to medium (Cohen’s d = 0.63). Hong Kong girls followed a gendered educational and occupational choice pattern similar to that of their Western counterparts and tended to choose non-science oriented careers and educational programs.

2.3 Factors attributing gender differences

The debate over whether gender differences are naturally inherent, arising through genetics and natural selection, or socioculturally constructed (or a mixture of the two paradigms) is long-standing (Halpern, 2000). These two completely different views of gender differences are referred to as the nature versus nurture debate.
For pure academic studies, the debate depends on research findings which can demonstrate whether gender is socioculturally constructed or whether gender differences are inherent and inevitable. The following section provides a review of possible explanations for gender differences from the two camps, social constructionists and socio-biologists. It serves two purposes here: (1) to provide an overview of the problem and the underlying factors influencing gender differences, which may be strongly connected with the Hong Kong educational context, and (2) to provide a theoretical starting point which guides the selection or development of an appropriate model for this study out of all these possible perspectives.

2.3.1 Biological contributions

In contrast to sociocultural theories, researchers in cognitive and biological sciences look into the scientific aspects of ‘sex’ differences in cognitive performance in terms of the structural, functional, psychological, evolutionary and developmental differences between male and female brains, as well as hormonal effects on the cognitive development of boys and girls. The regular clinical approaches deployed in these studies include magnetic resonance imaging (MRI), X-ray computed tomography (CT scan) and positron emission tomography (PET), which allow cognitive and behavioral psychologists and neurobiologists to map and distinguish structural and physiological differences between the brains of the two sexes, such as physical brain size, blood flow rate and glucose metabolic patterns of neurons in various brain areas.

2.3.1.1 Evolutionary psychology perspectives

Darwin (1871) proposed the ‘theory of natural selection’, with ‘sexual selection’ as a subset of the theory, to explain the natural mechanism involving competition among members of the same sex over mates and selective choice of mating partners.
Evolutionary psychologists suggest that male-male competition for reproduction, dominance hierarchies, control of ecologically rich territories and hunting behaviours in evolutionary history supported the development of the male brain for large-scale navigation (Chagnon, 1988). On this account, this is why mental rotation tasks, which require simultaneously maintaining a three-dimensional object in working memory while transforming it, display very large gender differences, with Cohen’s d ranging from 0.63 to 0.77 (Loring-Meier & Halpern, 1999; Masters & Sanders, 1993). Contemporary male visuospatial abilities are the result of males’ roles as hunters and fighters, which required the ability to construct and track projectile movement in three-dimensional space. The hypothesis is consistent with the prediction of Darwin’s theory of sexual selection (Halpern et al., 2007). The cognitive capacity and behavior differences between the sexes were therefore well developed millions of years ago to ensure the survival and propagation of the human race (Francis & Skelton, 2005).

2.3.1.2 Brain structural perspectives

By the mid 1800s, the most popular argument for the superior cognitive abilities of males was that males have larger and heavier brains than females. Taking the absolute size of the brain as a direct indicator of intellectual capacity was problematic, since it implied that animals with larger brains were cleverer than human beings. Moreover, the ratio of brain weight to body weight was actually larger in females than in males, which provided opposite evidence.
Similar hypotheses to explain males’ superior spatial abilities in science and mathematics, such as males having a larger cortex surface, more convolutions (Mosedale, 1978), and higher volumes of white matter and cerebrospinal fluid in the brain (Blatter et al., 1995; Gur et al., 1999), all failed to produce significant and consistent empirical evidence (Halpern et al., 2007). A bigger brain is therefore not necessarily better, and size is not related to intelligence. Gur et al’s (1999) spatial tests suggest that females may achieve a higher level of spatial performance by using different strategies from males. In contrast to the controversial evidence of brain size differences between males and females, substantial evidence suggests that females have a larger corpus callosum, which supports the idea of greater connectivity between the two hemispheres for faster language processing in females (Halpern et al., 2007). This might partially explain why females outperform males on open-ended response items in TIMSS and PISA assessments, which demand better language ability.

2.3.1.3 Brain functional perspectives

Other researchers attempted to explain the differences by examining functional differences of the brain. Two direct estimates of regional brain activity are the changes in blood flow rate and glucose consumption rate in response to scientific tasks. Gur et al (2000) suggested that males’ better performance on difficult items of visuospatial tasks was strongly connected with more focal activation of right visual-association areas, which results in more lateralization of cognitive abilities (i.e. reliance on one hemisphere to process visuospatial tasks such as mental rotation). When females handle similar tasks, they recruit additional brain regions in both hemispheres for distributed processing. As a result, recruitment of brain regions becomes more distributed and bilateral in females than in males as the complexity of the task increases (Kucian et al., 2005).
A similar situation was demonstrated in a 3-D virtual maze task: females deployed parietal and prefrontal activation, which takes more effort, whereas males relied upon automatic retrieval of geometric-navigation cues in the hippocampus only (Grön et al., 2000). The blood flow rate and glucose metabolic rate in females are higher than those of males, and magnetic resonance imaging (MRI) patterns confirm more bilateralization of cognitive abilities in females (Gur et al., 1995; Murphy et al., 1996).

2.3.1.4 Hormonal perspectives

Administration of the male hormone testosterone to female-to-male transsexuals before sex-change surgery improved their spatial cognition (Van Goozen, Cohen-Kettenis, Gooren, Frijda, & Van de Poll, 1994, 1995). The pattern of improvement was later confirmed by a 3-D spatial-ability test for these individuals (Voyer et al., 1995; Slabbekoorn et al., 1999). Testosterone improved the spatial cognition of female-to-male transsexuals significantly in the first three months (Cohen’s d = 0.56), but there was no further improvement after seven months of treatment. For male-to-female transsexuals, androgen suppression did not decrease spatial performance, suggesting prenatal effects of androgen on these abilities. The results were consistent with postnatal activation theory (Voyer et al., 1995).

2.3.2 Sociocultural contributions

In contrast to evolutionary psychologists and neurobiologists, social scientists focus on sociocultural factors in seeking to explain observed cognitive and affective differences between females and males (Unger, 1981). Blickenstaff (2005) argued strongly against hypotheses of inborn genetic differences as an explanation of gender differences among students. Based on socialization theory, Blickenstaff (2005) counter-proposed an evaluation of eight alternative hypotheses for gender differences in science and STEM from females’ perspective (p. 371-372):

1. Girls’ lack of academic preparation for a science major/career.
2. Girls’ poor attitude toward science and lack of positive experiences with science in childhood.
3. The absence of female scientists/engineers as role models.
4. Science curricula are irrelevant to many girls.
5. The pedagogy of science classes favors male students.
6. A ‘chilly climate’ exists for girls/women in science classes.
7. Cultural pressure on girls/women to conform to traditional gender roles.
8. An inherent masculine worldview in scientific epistemology.

2.3.2.1 Gender-role

Traditionally, the perception of science and scientists is widely stereotyped as masculine or male dominated. Gender role stereotyping is an attempt to explain gendered patterns of achievement in terms of the socialization process. Children acquire their knowledge of gender by observing and mimicking the behavior of same-sex members of their family, friends and local communities, as well as from gender stereotype messages in the public media (Ngai, 1995; Sharpe, 1976). Steele and Aronson (1995) held that stereotype threat occurs when, in the presence of certain contextual cues, a person believes that he or she belongs to a group stereotyped as inferior in a given ability. The contextual cues could discourage females from aspiring to and pursuing science education and careers and from taking leadership roles (Schmader, 2004). As a long term effect, stereotype threat can decrease females’ opportunities of being accepted into science educational programs whose admission requirements emphasize test scores. According to Rosenthal and Rubin (1982), an effect accounting for only 4% of the variance in scores is associated with a difference of 60% versus 40% in the proportion of each group performing above average. For example, suppose that to gain admission to a medical program at university, an individual must attain at least an average score in a public examination.
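The 60% versus 40% figure attributed to Rosenthal and Rubin (1982) can be checked with a short calculation. The sketch below assumes the usual form of their display, commonly called the binomial effect size display, in which the correlation r is the square root of the proportion of variance explained and the two groups' rates of scoring above the cut-off are taken as 0.5 + r/2 and 0.5 - r/2. The function name `besd_rates` is hypothetical and introduced only for illustration.

```python
import math


def besd_rates(variance_explained):
    """Convert a proportion of variance explained into two group 'success' rates.

    Assumes the binomial effect size display: r = sqrt(variance explained),
    and the rates are 0.5 + r/2 and 0.5 - r/2.
    """
    if not 0.0 <= variance_explained <= 1.0:
        raise ValueError("variance_explained must lie between 0 and 1")
    r = math.sqrt(variance_explained)
    return 0.5 + r / 2, 0.5 - r / 2


# An effect explaining 4% of the variance corresponds to r = 0.2.
higher_rate, lower_rate = besd_rates(0.04)
print(round(higher_rate, 2), round(lower_rate, 2))
```

Under these assumptions an effect that explains 4% of the variance gives r = 0.2, so the two rates are 0.5 + 0.1 and 0.5 - 0.1, which is the 60% versus 40% split quoted in the text.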
A 4% variability in scores will result in 60% of one group and 40% of the other group qualifying for admission. Therefore, though the effects of stereotype threat may sometimes seem “small”, they can have substantial real-world consequences.

2.3.2.2 Schooling and family conditions

Simpson and Oliver (1985) demonstrated that school was a significant source of variation in science performance and affective outcomes. Students’ academic achievement could also be mediated strongly by their family background, such as socioeconomic status (SES) and parental involvement (Ho and Willms, 1996). Yang (1996) from Taiwan integrated the views of others and came up with a model with six dimensions to explain gender differences in science interest: social culture (sex role), individual cognition, individual affect, education environment (school), family background and nature of science (see Figure 2.1).

Figure 2.1: Gender differences in science (Source: Yang, 1996 p. 56)
[Figure placeholder. Labels in the diagram: Social culture, Sex role, Family background, Schooling, Gender differences in science interest, Science knowledge, Nature of science, Attitude toward science, Individual affect, Self-efficacy, Science achievement, Ability, Individual cognition.]

1. Social culture (of gender) includes the sex role formation process through different views and evaluations of gender behavior from public media, laws, history, custom, thought and philosophy.
2. Individual cognition includes students’ science ability and performance.
3. Individual affect includes attitude toward science, interest in science and personal value of science.
4. Education environment (school) includes students’ gender, school type, classroom activities, teacher (ability and gender), curriculum and teaching kits.
5. Family background includes parental socioeconomic status (SES), parents’ careers, parents’ expectations of the child’s success and child-rearing style.
6. Nature of science includes scientific method, science knowledge, logical thinking in science, experimentation, scientific analysis and scientists.

2.3.3 Item characteristics attributing to gender differences

2.3.3.1 Scientific content

Gender differences in science assessments can vary considerably with content knowledge, in favour of either boys or girls (Kahle & Lake, 1983; Johnson, 1987; Murphy, 1991). Many studies (e.g. Walding, 1994) pinpointed that content, both conceptual understanding of science and practical aspects of science, has implications for gender differences in science attainment. Kelly (1978) examined gender differences in the 14-year-old population in the first IEA science survey conducted in the developed countries. The results suggested gender differences within branches of science: boys’ advantage was clearest in physical systems, intermediate in chemical systems and smallest in biological ones. Similar conclusions were reached in other research (e.g. Erickson & Ericken, 1984; Jovanovic, Solano-Flores & Shavelson, 1994).

2.3.3.2 Item format

For the same test length, multiple-choice assessments often show higher reliability than open-ended ones (Mazzeo et al., 1992), which explains why multiple-choice tests are commonly used in public examinations (e.g. HKCEE) and large-scale international assessments such as TIMSS and PISA. However, consistently over the years, boys have been reported to be favored by multiple-choice tests while girls are favored by open-ended items (Hoste, 1982; Stobart, Elwood & Quinlan, 1992; Walding, et al., 1994). Tests consisting mainly of multiple-choice or mainly of open-ended formats can generate test bias against either sex (Shepard, 1993). After examining the results of General Certificate of Education (GCE) examinations in the UK, Murphy (1978, 1982) explained that females are favored by open-response items (e.g. essay) since these items demand higher verbal ability, whereas close-response items (e.g. multiple-choice) do not require it but focus solely on problem solving, an area in which males can do better. Another common explanation for girls’ poorer performance in multiple-choice tests is their lack of confidence and their reluctance to take risks and to respond in novel situations by guessing. In the USA, the National Assessment of Educational Progress (NAEP) includes an ‘I Don’t Know’ (IDK) option in the multiple-choice items of science domains to estimate more accurately the knowledge of respondents from different groups. Sherman (1974) examined the results of NAEP and found that girl candidates tended to choose IDK as the item response more often than their male counterparts. Overall, Gipps and Murphy (1994) stated that factors within the assessment itself, for example item format and response mode, were among the influential sources of invalidity in assessment. They recommended that those responsible for designing public examinations ensure that a range of assessment techniques is used so that there is no bias against one particular group of candidates.

2.3.4 Expectancy-value model of achievement-related choices in science

Development of an interest in pursuing a career in science or science-related work has been well recognized as one of the important aspects of affective behaviors in science education (Klopfer, 1971). To understand gender differences in educational and occupational tendencies and choices, one must understand the sociocultural influences on those choices. A well-developed model of achievement-related choices has been proposed by Eccles et al. (1983) (see Figure 2.2).

Figure 2.2: Expectancy-value model of achievement-related choices (Eccles et al., 1983).

As indicated by the model, an individual student chooses his or her future academic programmes and careers based on two criteria, namely (1) expectations for success and (2) subjective task values.
Eccles and her colleagues conceptualize expectation for success as one’s belief in achieving the tasks when he or she takes on the challenges. Subjective task values were defined in terms of four essential motivational components: (1) the utility value of the task in achieving either one’s short- or long-range goals or obtaining external rewards; (2) the intrinsic interest in, and enjoyment of, the task; (3) the attainment value, that is, the needs, personal values, and explicit motives that an activity fulfills; and (4) the cost of engaging in the activity.

How can the Eccles et al. (1983) model help us to understand gendered choices of academic courses in high school, majors in university, and careers? If boys and girls hold different expectations for success in science-related tasks and attach different values to success at these tasks, then gender differences in achievement-related choices in science occur. In the past two decades, research findings have indicated that self-concept of ability and subjective task values act as major mediators of gendered choices in the Eccles et al. (1983) model (see Jacobs, 2005 for a review). The next two sections will review these two mediators.

2.3.4.1 Self-concept of ability as mediator of gendered choices

The effect of self-concept of ability on achievement-related choices has been discussed extensively (e.g. Eccles et al., 1983; Nagy et al., 2006; Meece et al., 2006). These authors suggested that self-concepts of ability are a critical predictor and mediator of gendered choices. Nagy et al. (2006) found that at grade 10, German boys outperformed girls on the mathematics and biology achievement tests, and reported higher math self-concepts and intrinsic values. Girls scored higher on the biology self-concept and intrinsic value scales. Boys were twice as likely as girls to enroll in advanced mathematics at grade 12. The reverse pattern was found for biology.
The gender differences at grade 10 mediated the gender effect on course enrollment in grade 12. Jacobs (2005) reported gender differences in both self-concepts and career aspirations along traditional gender-typed lines. Girls were underrepresented in the high science ability self-concept cluster in the United States. As a result, they were less likely than boys to aspire to careers in fields related to physical science. However, they were more likely than boys to aspire to health- and biology-related careers. In short, the current literature supports the theory that science self-concept acts as a mediator of gendered choices in science education and careers.

2.3.4.2 Subjective task values as mediators of gendered choices

Evidence of subjective task values as mediators of gendered educational choices can be found in Dunteman, Wisenbaker, and Taylor’s (1978) longitudinal study in the United States. They investigated the link between personal values of science and selection of a university major in science using the National Longitudinal Study of over 20,000 high school seniors from 1200 high schools with 18 seniors per school. They concluded that students who were high on thing-orientation and low on person-orientation were more likely to select a math or a science major. Boys tended to be thing-oriented while girls were more likely to be person-oriented. This gendered tendency was reflected in their personal value of science. Hence, boys were more likely than girls to select science majors in college studies. Eccles et al. (1999) conducted a longitudinal study of about 1,000 European-American, middle- and working-class adolescents from southeastern Michigan in the United States. The study provided additional evidence that subjective task values mediated gendered occupational choices. They assessed how high school seniors attach different values to a wide array of occupations and occupational characteristics.
Their results indicated that individuals who valued helping others were predicted not to enter a physical science-related profession. They concluded that males and females differed, in a gender-stereotypic fashion, in the value they attached to the different career characteristics and self-perceptions. These differences could explain a sizeable amount of the gendered variance in career choices. In other words, the above findings suggest that subjective task values are important mediators of gender-differential choices in science.

2.4 Local research on gender differences in scientific literacy

2.4.1 Gender differences in science performance

After an extensive Internet search using Scholar.google, Eric, Sagepub, Wiley, JSTOR, Informaworld, Springerlink, PsycINFO, ProQuest and Hong Kong Academic Library Link (HKALL), only three research articles (Keyes, 1983; Lin, 2009; Yip et al., 2004) and six reports (Law 1996a, 1996b, 1997; Ho et al., 2003, 2005, 2008; Yung et al., 2006) were found to address directly the gender differences in cognitive performance in science domains in Hong Kong. The findings of these three local gender studies and six reports are summarized as follows.

Keyes (1983) conducted gender research on Hong Kong Chinese adolescents to test the hypothesis of sex differences in patterns of cognitive ability by variation in sex-role identification. She discovered that males had better spatial ability while females were better at confluent production. Based on these results, she concluded that biological sex differences were the best predictor of a male or female pattern of performance. Lin (2009) from Taiwan used structural equation modeling (SEM) to investigate three factors, family, school and self, in accounting for the gender differences in PISA 2006 science performance of students from Taiwan, Japan, South Korea and Hong Kong. She reported that both family and school factors had less influence on science performance than the student self factor.
Of all the local studies on gender differences in science performance, the most comprehensive was carried out by Yip et al. (2004). The study covers gender differences in various domains of science knowledge, practical and communication skills and item formats. Boys were reported to outperform girls on items involving earth and physical science, understanding of scientific knowledge and the closed response format. Girls, on the other hand, tended to perform better on items involving recognizing questions and identifying scientific evidence. In terms of gender diversity, boys in higher ability groups consistently achieved higher than girls.

For TIMSS assessments, Hong Kong students had relatively large gender differences in science performance at fourth grade (Law, 1997). Law (1996) also reported that, at eighth grade, the gender differences in science performance were the largest amongst the participating countries meeting the sampling and participation requirements. In both cases, boys significantly outperformed their girl counterparts. Gender differences were found to be smallest in life sciences and largest in earth and physical sciences. Girls tended to perform better than boys on open response items. In 2006, Yung et al. reported that boys outperformed girls in science content domains. Significant gender differences were consistently found in earth science and physics. Across the ability areas (factual knowledge, conceptual understanding, and analysis and reasoning), boys significantly outperformed girls in every area. However, a significant reduction in gender differences was found over the 1995 to 2003 assessment period. During this period, girls’ performance improved faster than that of boys. The same trend of improvement occurred between 2003 and 2007, and by TIMSS 2007 no significant difference between boys’ and girls’ achievement in science was reported (see Table 2.15 & Table 2.16) (Martin et al., 2008).
Table 2.15: Trends in average science performance by gender - 1995 through 2007 (Grade 4) (Source: TIMSS 2007 International Science Report, Martin et al., 2008 p. 60)

Combining the three PISA reports (Ho et al., 2003, 2005, 2008), no statistically significant gender differences were reported in the overall scores of the 15-year-olds at school. However, boys tended to do better in understanding concepts and explaining phenomena scientifically, while girls were better at recognizing questions and identifying evidence. Boys did better than girls on closed items while girls outperformed boys on open-response items. For content knowledge, the results provided no consistent support for the claim that boys do better than girls in physical science and biological science. However, the variability of boys’ science scores was always larger than that of girls, apparently at both the upper and lower extremes (Machin & Pekkarinen, 2008).

Table 2.16: Trends in average science performance by gender – 1995 through 2007 (Grade 8) (Source: TIMSS 2007 International Science Report, Martin et al., 2008 p. 61)

The inconsistency in the gender differences reported by TIMSS and PISA over the decade may be caused by how the assessments are constructed. Tests with a higher proportion of open-ended items and tests dominated by closed items may lead to opposite conclusions. The composition of the tests at item level is therefore crucial to the final conclusion about gender differences. Such results raise questions about the validity of the outcome measure and suggest that conclusions about group differences and about correlates of achievement depend heavily on specific features of the items that make up the test (Hamilton, 1998). Still, more sophisticated techniques such as DIF are seldom used to investigate the item response patterns of boys and girls.
Similarly, the causes of the differences, namely contextual factors, have not yet been fully explored at the system level, even though the gender differences seem to have been disappearing in recent years.

2.4.2 Gender differences in affective domain

The development of affective learning outcomes has become more important in local schools. The Curriculum Development Council (CDC, 1988) stated clearly that one of the broad aims of the secondary 1 to 3 science syllabus was “to develop curiosity and interest in science”. Recently, more emphasis has been put on the affective learning outcomes of science education: the Science Education Key Learning Area Curriculum Guide for Primary 3 - Secondary 3 (CDC, 2002) recommended that schools adopt strategies to nurture students’ interest in school science learning:

It is important to nurture students’ interest in science learning. Students are generally intrigued by new things. They are interested in problems that puzzle them and have a natural urge to find solutions to settle them. Organizing the curriculum around problems or phenomena that puzzle students helps motivate students to learn. Rather than relying solely on textbooks, teachers of General Studies and science subjects are encouraged to make use of hands-on exploratory learning activities to develop students’ interest in science. It is essential that students participate in a wide range of activities to develop enjoyment in the process of science learning. (Science Education KLA Guide, p.7)

In response to this trend in curriculum development and evaluation, Cheung (2009a) used the Attitude toward Chemistry Lessons Scale (ATCLS) to investigate the interaction effect between grade level and gender at secondary 4 and 5. The results indicate that boys in secondary 4 and secondary 5 like chemistry theory lessons more than girls do.
In contrast, girls’ attitude toward chemistry laboratory work remains more or less the same from secondary 4 to secondary 7, while boys’ attitude declines over the same period (Cheung, 2009b). Gender differences in affective domains are usually not the focus of the local reports. This is so even though some research has indicated that the correlation between attitude toward science and achievement can be as high as 0.50 for boys and 0.55 for girls, accounting for 25-30% of the variance in science performance (Weinburgh, 1995). The only study of gender differences in the affective domain was that of Cheung (2009b). Affective outcomes and science performance were positively correlated for each gender at each grade level from secondary 1 to 5.

2.5 Summary

In chapter two, the definitions of scientific literacy in the cognitive and affective domains commonly used in the science education community, and the evolving nature of those definitions, have been discussed. Basically, there is no consensus about the definition of scientific literacy in the science education community. PISA 2006 adopted Bybee’s multidimensional definition of scientific literacy for the cognitive domain and Klopfer’s taxonomy of affective behaviours in science education for the affective domain (OECD, 2006; Bybee, 1997b; Klopfer, 1976). “Gender” differences were then defined with respect to sociocultural and biological contributions. The literature review showed clear gender differences in science. The differences in science performance have narrowed in the past few decades. However, the gender gaps in affective learning outcomes have remained large, in particular in the educational and occupational trajectories related to science. Girls tended to choose non-science-oriented careers and educational programs. Secondly, factors contributing to gender differences, i.e. biological and sociocultural contributions, were included as part of the literature review.
Again, there is no agreement about the “nature and nurture” of gender differences in science. Girls’ disadvantages in science performance have been reduced and have become insignificant in recent years. However, the gender differences in affective learning outcomes and future-oriented motivation have remained large and significant. The Eccles et al. (1983) Expectancy-value Model of Achievement-related Choices was used to illustrate the gendered pattern in educational and occupational choices in science. Thirdly, other factors contributing to gender differences, for example item characteristics, were reviewed. Previous literature suggested that closed response items tended to favor boys while open response items tended to favor girls. Finally, a brief literature review on local gender studies in science was conducted. From this review, we might conclude that there is a limited understanding of the relationship between science performance, affective learning outcomes and gendered choices in science careers and education in the local context. In the coming chapters, we attempt to answer this question: “How, and to what extent, do the gender differences in students’ science performance and affective learning outcomes influence their intention to choose science-oriented careers and educational opportunities?” In chapter 3, the major factors will be conceptualized for later investigation in chapter 4 and chapter 5.

CHAPTER THREE
RESEARCH DESIGN AND METHODS

The study deployed quantitative research methods, mainly Multidimensional Differential Item Functioning (MDIF) and Multilevel Mediation (MLM), to explore the gender differences at item level and system level. The following sections will discuss the data collection method, research framework, research methods and analysis techniques.

3.1 PISA 2006 database

To increase the quality of the data collected, the PISA survey used a two-stage stratified sampling procedure.
In the first stage of the PISA 2006 main study, conducted in Hong Kong between May and June 2006, a stratified sample of 150 local schools was drawn. Schools were stratified by type (government, aided, international, and independent schools under the Direct Subsidy Scheme (DSS)) and by three levels of student achievement (high, medium and low). The level of achievement is determined by a composite index, the Academic Achievement Index (AAI), derived from individual students’ school performance and territory-wide assessment. The sample represents 5.7% of the 15-year-old school students of the target population in 2006 (Ho et al., 2008). The overall distribution of the participating schools is shown in Table 3.1.

Table 3.1: Participating school distribution in PISA 2006 in Hong Kong. (Source: Ho et al., 2008, p. 7)
Type of School | Student Academic Intake | Total Number of Schools | Number of Schools Participated
Government | High Ability | 17 | 6
Government | Medium Ability | 7 | 2
Government | Low Ability | 10 | 3
Government | N/A | 2 | 0
Aided | High Ability | 128 | 46
Aided | Medium Ability | 125 | 46
Aided | Low Ability | 126 | 35
Aided | N/A | 1 | 0
Independent # | Local/DSS | 43 | 7
Independent # | International | 27 | 1
Total | | 486 | 146
# There is no intake information about independent schools.

Table 3.2 shows the demographic features of the 4645 students in the sample. Most of the students were from secondary four or grade 10 (64.1%), and the second largest proportion was from secondary three or grade 9 (24.4%); together these two grades accounted for about 90% of the sample. The proportions of females (50.6%) and males (49.4%) in the sample were about equal. Most of the students were born in Hong Kong (75.5%) while the rest were from other areas (23.4%). In terms of immigrant status, students who were not born in Hong Kong but had at least one parent born in Hong Kong were classified as native. Students were classified as first generation if neither the student nor his or her parents were born in Hong Kong. The third group of students was classified as second generation if the student was born in Hong Kong while the parents were born in other places.
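The three-way classification just described can be written as a small decision rule. The sketch below is illustrative only and is not taken from the PISA data files: the function name and its boolean arguments are hypothetical, and it assumes that any student with at least one Hong Kong-born parent counts as native regardless of the student's own place of birth. Students with any missing birthplace information are simply left unclassified here, which is a simplifying assumption.

```python
from typing import Optional


def immigrant_status(student_born_in_hk: Optional[bool],
                     mother_born_in_hk: Optional[bool],
                     father_born_in_hk: Optional[bool]) -> str:
    """Classify a student as native, second-generation or first-generation.

    Hypothetical helper: argument names are illustrative and do not
    correspond to PISA variable names.
    """
    # Assumption: any missing birthplace leaves the student unclassified.
    if None in (student_born_in_hk, mother_born_in_hk, father_born_in_hk):
        return "missing"
    # At least one parent born in Hong Kong: native.
    if mother_born_in_hk or father_born_in_hk:
        return "native"
    # Both parents born elsewhere, student born in Hong Kong.
    if student_born_in_hk:
        return "second-generation"
    # Student and both parents born elsewhere.
    return "first-generation"


if __name__ == "__main__":
    print(immigrant_status(False, True, False))   # native
    print(immigrant_status(True, False, False))   # second-generation
    print(immigrant_status(False, False, False))  # first-generation
```

Putting the parental check first keeps the rule consistent with the description above, in which a student born outside Hong Kong is still native if a parent was born locally.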
According to this PISA definition, 55.6% of the students were natives, while first-generation and second-generation students made up 18.7% and 24.4% respectively.

Table 3.2: Demographic features of the participating students (Source: Ho et al., 2008, p. 9)
Category | Group | Number of Participating Students | Proportion (%)
Grade/Form | 7/S1 | 107 | 2.3
Grade/Form | 8/S2 | 421 | 9.1
Grade/Form | 9/S3 | 1134 | 24.4
Grade/Form | 10/S4 | 2978 | 64.1
Grade/Form | 11/S5 | 5 | 0.1
Grade/Form | Total | 4645 | 100
Gender | Female | 2351 | 50.6
Gender | Male | 2294 | 49.4
Gender | Total | 4645 | 100
Place of Birth | Hong Kong | 3509 | 75.5
Place of Birth | Non-Hong Kong | 1089 | 23.4
Place of Birth | Data Missing | 47 | 1.0
Place of Birth | Total | 4645 | 100*
Immigrant Status | Native | 2581 | 55.6
Immigrant Status | Second-Generation | 1134 | 24.4
Immigrant Status | First-Generation | 869 | 18.7
Immigrant Status | Data Missing | 61 | 1.3
Immigrant Status | Total | 4645 | 100
* The proportions in the “place of birth” category do not sum to exactly 100 because of rounding.

3.2 Conceptual framework of present study

To answer the research question of how, and to what extent, gender effects on achievement-related choices are mediated through the cognitive and affective domains of science, a new model has been constructed for the present study based on the well-developed Expectancy-value Model of Achievement-related Choices of Eccles et al. (1983) (see Figure 3.1). The Eccles et al. (1983) model was chosen for the present study for two reasons. Firstly, it was intentionally designed to analyze gender-segregated choices in the STEM areas (Halpern et al., 2007). Secondly, it captures the major affective factors, which are well known to be socially, culturally, and psychologically influenced (Eccles, 2011). There are four types of variables in the model: independent variable, mediators, dependent variable and control variable. The independent variable is gender (Girls). The mediators include Science Self-concept (SCSCIE), Enjoyment of Science Learning (JOYSCIE), Interest in Science Learning (INTSCIE), Instrumental Motivation to Learn Science (INSTSCIE), Personal Value of Science (PERSCIE) and Science Performance (SP) [13]. The dependent variable is Future-oriented Science Motivation (SCIEFUT).
Parental SES acts as a control variable in the model. SCSCIE is placed under Child’s general self schemata, to reflect one’s perceptions and expectations for success in scientific literacy. JOYSCIE and INTSCIE are grouped under Subjective Task Value to reflect one’s motivation to learn school science. PERSCIE is put under Attainment Value to reflect one’s needs, personal values, and explicit motives that an activity fulfills. INSTSCIE is placed under Utility Value to reflect the usefulness of a task in facilitating the achievement of goals or in obtaining any immediate or long-term rewards. JOYSCIE, INTSCIE, PERSCIE and INSTSCIE are thus grouped under Subjective Task Value. Eccles (2011) mentioned that gender-role socialization could lead males and females to place different Subjective Task Values on various long-range goals and activities. If one places success in one’s gender role as the key component of his or her identity, then activities that fulfill this role have higher value, and vice versa.

General value of science and self-efficacy in science are two affective factors available in the PISA 2006 survey that are not included in the revised model, so as to keep the original psychometric properties of the measurement models in the Eccles et al. (1983) model [14]. Parental SES and Science Performance are two new components included in the model which are not found in the Eccles et al. (1983) Expectancy-value Model of Achievement-related Choices. However, these two components are important control variables for estimating the mediated effects more accurately (Bradley & Corwyn, 2002; Meece et al, 2006).

[13] Science performance (SP) refers to the cognitive performance of scientific literacy.
Figure 3.1: Revised Expectancy-value Model of Achievement-related Choices in Science
[Figure placeholder. Independent variable: stable child characteristic, Girl (STF Gender). Control variable: cultural milieu, Parental SES. Mediators: child’s general self schemata, Science Self-concept (SCSCIE); subjective task value, comprising Interest in Science Learning (INTSCIE), Enjoyment of Science Learning (JOYSCIE), Attainment Value (PERSCIE) and Utility Value (INSTSCIE); Science Performance (SP). Dependent variable: achievement related choice, Future-oriented Science Motivation (SCIEFUT).]

[14] The definition of self-efficacy in science in PISA 2006 is different from self-concept of one’s abilities in the Eccles et al. (1983) model. General value of science is not a part of subjective task value in the model.

3.3 Conceptualization and operationalization of scientific literacy

3.3.1 Cognitive domain

As stated in chapter two, the assessment framework of the cognitive domains in scientific literacy focuses on the practical aspects of the scientific knowledge, skills, competencies and other attributes embodied in individuals that are relevant to personal, social and economic well-being, rather than on the national curricula of the participating countries and areas. The criteria for item selection for PISA 2006 were:
(1) Scientific situations or contexts in which scientific knowledge and the use of scientific processes are applied. The framework identifies three main areas: Science in Life and Health, Science in Earth and Environment, and Science in Technology.
(2) Scientific knowledge or concepts, which constitute the links that aid understanding of related phenomena.
(3) Scientific processes, centered on the ability to acquire, interpret and act upon evidence. Three such processes present in PISA relate to: i) describing, explaining and predicting scientific phenomena, ii) understanding scientific investigation, and iii) interpreting scientific evidence and conclusions (OECD, 2006).
After the trial study, 108 assessment items were kept for the main study with reference to the following assessment framework (see Table 3.3).

Table 3.3: Distribution of PISA 2006 science performance items (knowledge domains by competency). (Source: PISA 2009b)
Knowledge domain | Identifying scientific issues | Explaining phenomena scientifically | Using scientific evidence | Total
Knowledge of science: physical systems | | 15 | 2 | 17 (13%)
Knowledge of science: living systems | | 24 | 1 | 25 (23%)
Knowledge of science: earth and space systems | | 12 | 0 | 12 (11%)
Knowledge of science: technology systems | | 2 | 6 | 8 (7%)
Sub-total (knowledge of science) | | | | 62 (57.4%)
Knowledge about science: scientific enquiry | 24 | | 1 | 25 (23%)
Knowledge about science: scientific explanations | 0 | | 21 | 21 (19%)
Sub-total (knowledge about science) | | | | 46 (42.6%)
Total | 24 (22%) | 53 (49%) | 31 (29%) | 108

3.3.2 Affective domain

This section outlines the procedure used to conceptualize and operationalize the affective domain factors in this study. For each factor, four stringent steps are used to verify its suitability for the multilevel mediation study. Firstly, the internal consistency (Cronbach’s α) of the affective factors (or measurement models, in SEM terminology) was examined. The criterion was a Cronbach’s α value [15] equal to or above 0.7, which indicates that the data collected are of high reliability (Henson, 2001). Secondly, multiple imputation was used to impute the missing values. The detailed procedure for handling missing values can be found in Appendix A. Thirdly, confirmatory factor analysis (CFA) within structural equation modeling (SEM) was used to (1) confirm the theoretically expected item dimensionality of the data collected from the student questionnaire, and (2) make any necessary adjustment of the dimensional structure to fit the Hong Kong context (Kaplan, 2000; Brown, 2006). The CFA produces a series of model fit indices to validate the goodness-of-fit between the model and the data collected. The most common index used to check the model fit is the χ2 goodness-of-fit test.
The χ2 goodness-of-fit test compares the expected and observed values to determine how well an experimenter’s predictions fit the data. The limitation of the test is its sensitivity to sample size: for tests involving large samples, such as TIMSS and PISA, it results in a rejection of the null hypothesis even when the factor model is appropriate (DeCoster, 1998). Since the sample size is very large (N=4645) in the present study, the χ2 goodness-of-fit statistic will be reported for the affective factors but will not be used for assessing their model fit. Rather, the degree of model fit is assessed against the following four indexes: Root Mean Square Error of Approximation (RMSEA), Root Mean Square Residual (RMR), Comparative Fit Index (CFI) and Tucker Lewis Index (TLI) or Non-Normed Fit Index (NNFI). RMSEA values below 0.05 indicate a close model fit whereas values over 0.10 are usually interpreted as unacceptable model fit. RMR values should be less than 0.05. Values of CFI and NNFI between 0.90 and 0.95 indicate an acceptable model fit, and values greater than 0.95 indicate a good model fit.

Finally, multiple-group CFA is used to confirm measurement invariance (MI) across gender groups, i.e. that the same measurement scale is comparable across boys and girls. Reise et al (1993) mentioned that it is misleading to make comparisons across groups if the measurement scale and trait scores are not comparable. The procedure to conduct MI across gender groups is (Byrne, 1994; Vandenberg & Lance, 2000):
(1) Pattern invariance
(2) Unconstrained (baseline model)
(3) Measurement weights (factor loadings)
(4) Structural covariances
(5) Measurement residuals (or uniqueness)
A summary of these model tests is presented in Table 3.4.

[15] The range of Cronbach’s α is 0.0 to 1.0. The higher Cronbach’s α, the better the internal consistency.
Table 3.4: A summary of procedure to conduct multi-group invariance test across gender groups
Step | Model | Description | Purpose | Comparative model (MI?)
1 | Whole sample | Model fit with all sample | Localization of measurement models, i.e. fitting local data | --
Pattern invariance
2 | (1a) Boys’ sample | Model fit with boys’ sample | Confirming pattern invariance of boys’ sample | --
3 | (1b) Girls’ sample | Model fit with girls’ sample | Confirming pattern invariance of girls’ sample | --
Gender invariance (constrained model)
4 | (2) Unconstrained | Model fit with boys’ and girls’ sample with no constraint | Acting as a baseline model for model comparison with (3), (4) and (5) | --
5 | (3) Measurement weights | Factor loadings are set invariant across gender | Checking gender equivalence of factor loadings of measurement models | (3) versus (2)
6 | (4) Structural covariances | Factor loadings and covariances are set invariant across gender | Checking gender equivalence of factor loadings and covariances of measurement models | (4) versus (3)
7 | (5) Measurement residuals | Factor loadings, covariances and uniqueness are set invariant across gender | Checking gender equivalence of factor loadings, covariances and uniqueness of measurement models | (5) versus (4) & (5) versus (2)

As pointed out by Cheung and Rensvold (2002), the most commonly used goodness-of-fit index in SEM is the χ2 statistic. The χ2 statistic is problematic because of its functional dependence on the sample size N. For practical reasons, Cheung and Rensvold (2002) proposed using the change in CFI (ΔCFI) rather than the χ2 difference (Δχ2) to evaluate MI. Empirical evidence supports the view that ΔCFI might be more accurate than the Δχ2 test because it is insensitive to sample size but more sensitive to lack of invariance [16] (Meade, Johnson, & Braddy, 2008). Cheung and Rensvold (2002) recommended ΔCFI=0.01 as a cutoff point for the MI test. In other words, ΔCFI ≤ 0.01 suggests an invariant situation, and the null hypothesis of invariance should not be rejected.
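The ΔCFI decision rule stated above amounts to a single comparison between two nested models. The following is a minimal sketch of that rule, not the procedure of any particular SEM package: the function names are hypothetical, and the CFI values in the usage lines are invented for illustration rather than taken from this study.

```python
CFI_CUTOFF = 0.01  # cutoff recommended by Cheung and Rensvold (2002)


def delta_cfi(cfi_less_constrained: float, cfi_more_constrained: float) -> float:
    """Drop in CFI when moving from the less to the more constrained model."""
    # Rounded to three decimals, the precision at which CFI is usually reported.
    return round(cfi_less_constrained - cfi_more_constrained, 3)


def is_invariant(cfi_less_constrained: float,
                 cfi_more_constrained: float,
                 cutoff: float = CFI_CUTOFF) -> bool:
    """Invariance is retained when the drop in CFI does not exceed the cutoff."""
    return delta_cfi(cfi_less_constrained, cfi_more_constrained) <= cutoff


if __name__ == "__main__":
    # Hypothetical CFI values, for illustration only.
    print(is_invariant(0.975, 0.970))  # True: a drop of 0.005 is within the cutoff
    print(is_invariant(0.975, 0.950))  # False: a drop of 0.025 exceeds the cutoff
```

Each comparison listed in the last column of Table 3.4, for example (3) versus (2), would correspond to one call of this kind.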
As the sample size of the current study is large (N=4645), Δχ2 represents an excessively stringent test of MI (Cudeck & Brown, 1983; MacCallum, Roznowski & Necowitz, 1992). ΔCFI is therefore more practical for testing multi-group MI (Cheung and Rensvold, 2002) and will be reported for the subsequent measurement models in this study.

¹⁶ The measurement model is not applicable across groups.

3.3.2.1 Science Self-concept

CFA on Science Self-concept

In Table 3.5, six items for SCSCIE were used to assess students’ perceptions of their ability in science. The items were inverted for IRT scaling so that more positive WLE scores on this index indicate higher levels of SCSCIE (OECD, 2009b). The value of Cronbach’s α was 0.929, indicating that the data were of high reliability.

Table 3.5: Item parameters for Science Self-concept

Item parameters for science self-concept (SCSCIE)

| Item | How much do you agree with the statements below? (Strongly agree / Agree / Disagree / Strongly disagree) |
|---|---|
| ST37Q01 | a) Learning advanced <school science> topics would be easy for me |
| ST37Q02 | b) I can usually give good answers to <test questions> on <school science> topics |
| ST37Q03 | c) I learn <school science> topics quickly |
| ST37Q04 | d) <School science> topics are easy for me |
| ST37Q05 | e) When I am being taught <school science>, I can understand the concepts very well |
| ST37Q06 | f) I can easily understand new ideas in <school science> |
| Scale reliability (Cronbach’s α) | 0.929 |

Table 3.6 shows the results of the CFA for the model of SCSCIE. The model fit indices, RMSEA (0.048), RMR (0.005), CFI (0.996) and TLI (0.992), were satisfactory for SCSCIE.
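To illustrate how RMSEA relates to the χ2 statistic, the sketch below uses a common point-estimate formula, RMSEA = sqrt(max(χ2 − df, 0) / (df (N − 1))). Two things are assumed rather than stated in the text: that the SEM software used in this study applies this formula, and that all N = 4645 cases entered the model. Under these assumptions, the χ2 = 83.327 (df = 7) reported for SCSCIE gives a value that rounds to the reported 0.048.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a model chi-square.

    Uses sqrt(max(chi2 - df, 0) / (df * (n - 1))); software packages may
    differ slightly (for example, dividing by n instead of n - 1).
    """
    if df <= 0:
        raise ValueError("RMSEA is undefined for a model with df <= 0")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Values reported for the SCSCIE model; N is assumed to be the full sample.
scscie_rmsea = rmsea(chi2=83.327, df=7, n=4645)
print(round(scscie_rmsea, 3))
```

The same function also shows why χ2 alone is a poor guide with large samples: the numerator grows with the misfit, but the denominator scales it by the sample size.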
Table 3.6: Model fit for Science Self-concept

| Model | χ2 (df) | p | RMSEA | RMR | CFI | TLI | Accept model |
|---|---|---|---|---|---|---|---|
| SCSCIE | 83.327 (7) | 0.000 | 0.048 | 0.005 | 0.996 | 0.992 | Yes |
| Model fit (criteria) | --- | >0.050 | <0.080 | <0.050 | >0.900 | >0.900 | --- |

Multiple-group CFA on Science Self-concept

Table 3.7 shows that ΔCFI was less than the cutoff (0.01) for the constrained models; therefore boys and girls had similar measurement weights (factor loadings), structural covariances and measurement residuals for the model. The results suggest that the measurement scale for SCSCIE was comparable across the gender groups.

Table 3.7: Measurement invariance test across gender groups for Science Self-concept

| Model | χ2 (Δ) | df (Δ) | CFI (Δ)# | RMR | TLI | RMSEA | Invariant? (ΔCFI ≤ 0.01) |
|---|---|---|---|---|---|---|---|
| Whole sample | 83.327 | 7 | 0.996 | 0.005 | 0.992 | 0.048 | -- |
| Pattern invariance | | | | | | | |
| (1a) Boys’ sample | 52.570 | 7 | 0.996 | 0.007 | 0.990 | 0.053 | -- |
| (1b) Girls’ sample | 32.443 | 7 | 0.998 | 0.004 | 0.995 | 0.039 | -- |
| Gender invariance (constrained model) | | | | | | | |
| (2) Unconstrained | 85.013 | 14 | 0.997 | 0.006 | 0.993 | 0.006 | -- |
| (3) Measurement weights | 101.171 (16.158)** | 19 (5) | 0.996 (0.001) | 0.011 | 0.994 | 0.011 | Yes, (3) versus (2) |
| (4) Structural covariances | 111.968 (10.797)** | 20 (1) | 0.996 (0.001) | 0.028 | 0.993 | 0.028 | Yes, (4) versus (3) |
| (5) Measurement residuals | 192.445 (80.477)*** | 26 (6) | 0.992 (0.004) (0.005) | 0.032 | 0.991 | 0.032 | Yes, (5) versus (4) and (5) versus (2) |
| Model fit (criteria) | -- | -- | >0.900 | <0.050 | >0.900 | <0.080 | -- |

Note: **p<0.01, ***p<0.001; # cutoff for (Δ) = 0.01 (Cheung and Rensvold, 2002)

3.3.2.2 Personal Value of Science

CFA on Personal Value of Science

In Table 3.8, five items for PERSCIE were used to assess the extent to which students value the contribution of science to their own personal development. The items were inverted for IRT scaling so that positive WLE scores on this index indicate students’ positive perceptions of PERSCIE (OECD, 2009b). The value of Cronbach’s α was 0.795, indicating that the data were of high reliability.
Table 3.8: Item parameters for Personal Value of Science

Item parameters for personal value of science (PERSCIE)

| Model | Item | How much do you agree with the statements below? (Strongly agree / Agree / Disagree / Strongly disagree) |
|---|---|---|
| OECD model | ST18Q03 | c) Some concepts in <broad science> help me see how I relate to other people |
| | ST18Q05 | e) I will use <broad science> in many ways when I am an adult |
| | ST18Q07 | g) <Broad science> is very relevant to me |
| | ST18Q08 | h) I find that <broad science> helps me to understand the things around me |
| | ST18Q10 | j) When I leave school there will be many opportunities for me to use <broad science> |
| Scale reliability (Cronbach’s α) | | 0.795 |

Table 3.9 shows the results of the CFA for the one-dimensional model of the PERSCIE items. The model fit indices, RMSEA (0.057), RMR (0.010), CFI (0.992) and TLI (0.979), were satisfactory for PERSCIE.

Table 3.9: Model fit for Personal Value of Science

| Model | χ2 (df) | p | RMSEA | RMR | CFI | NNFI (TLI) | Accept model |
|---|---|---|---|---|---|---|---|
| PERSCIE | 63.452 (4) | 0.000 | 0.057 | 0.010 | 0.992 | 0.979 | Yes |
| Model fit (criteria) | --- | >0.050 | <0.080 | <0.050 | >0.900 | >0.900 | --- |

Multi-group CFA on Personal Value of Science

Table 3.10 shows that ΔCFI was less than the cutoff (0.01) for the constrained models; therefore boys and girls had similar measurement weights (factor loadings), structural covariances and measurement residuals for the model. The results suggest that the measurement scale for PERSCIE was comparable across the gender groups.

Table 3.10: Measurement invariance test across gender groups for model of Personal Value of Science

| Model | χ2 (Δ) | df (Δ) | CFI (Δ)# | RMR | TLI | RMSEA | Invariant? (ΔCFI ≤ 0.01) |
|---|---|---|---|---|---|---|---|
| Whole sample | 63.452 | 4 | 0.992 | 0.010 | 0.979 | 0.057 | -- |
| Pattern invariance | | | | | | | |
| (1a) Boys’ sample | 32.154 | 4 | 0.991 | 0.012 | 0.979 | 0.055 | -- |
| (1b) Girls’ sample | 30.745 | 4 | 0.993 | 0.008 | 0.982 | 0.053 | -- |
| Gender invariance (constrained model) | | | | | | | |
| (2) Unconstrained | 62.899 | 8 | 0.992 | 0.010 | 0.981 | 0.038 | -- |
| (3) Measurement weights | 68.206 (5.307) | 12 (4) | 0.992 (0.000) | 0.012 | 0.987 | 0.032 | Yes, (3) versus (2) |
| (4) Structural covariances | 68.223 (0.017) | 13 (1) | 0.992 (0.000) | 0.012 | 0.988 | 0.030 | Yes, (4) versus (3) |
| (5) Measurement residuals | 135.465 (67.242)*** | 19 (6) | 0.983 (0.009) (0.009) | 0.020 | 0.983 | 0.036 | Yes, (5) versus (4) and (5) versus (2) |
| Model fit (criteria) | -- | -- | >0.900 | <0.050 | >0.900 | <0.080 | -- |

Note: ***p<0.001; # cutoff for (Δ) = 0.01 (Cheung and Rensvold, 2002)

3.3.2.3 Interest and Enjoyment of Science Learning

CFA on Interest and Enjoyment of Science Learning

In Table 3.11, eight items for INTSCIE were used to assess the extent to which students are interested in learning topics in different science disciplines. The items were inverted for IRT scaling so that more positive WLE scores on this index indicate higher levels of INTSCIE (OECD, 2009b). The value of Cronbach’s α was 0.833, indicating that the data were of high reliability.

Table 3.11: Item parameters for Interest in Science Learning

Item parameters for interest in science learning (INTSCIEHKG)

| Model (HK model) | Item | How much interest do you have in learning about the following <broad science> topics? |
|---|---|---|
| Physical science (PHYSCIHKG) | ST21Q01 | a) Topics in physics |
| | ST21Q02 | b) Topics in chemistry |
| Biological science (BIOSCIHKG) | ST21Q03 | c) The biology of plants |
| | ST21Q04 | d) Human biology |
| Deleted | ST21Q05 | e) Topics in astronomy |
| | ST21Q06 | f) Topics in geology |
| Scientific method (SCIMETHKG) | ST21Q07 | g) Ways scientists design experiments |
| | ST21Q08 | h) What is required for scientific explanations |
| Scale reliability (Cronbach’s α) | | 0.833 |

In Table 3.12, five items in JOYSCIE were used to assess the extent to which students like doing specific scientific tasks. The items were inverted for IRT scaling so that positive WLE scores on this index indicate higher levels of JOYSCIE (OECD, 2009b).
The value of Cronbach’s α was 0.904, indicating that the data were of high reliability.

Table 3.12: Item parameters and scale reliability for Enjoyment of Science Learning

Item parameters for enjoyment of science learning (JOYSCIE)

| Item | How much do you agree with the statements below? (Strongly agree / Agree / Disagree / Strongly disagree) |
|---|---|
| ST16Q01 | a) I generally have fun when I am learning <broad science> topics |
| ST16Q02 | b) I like reading about <broad science> |
| ST16Q03 | c) I am happy doing <broad science> problems |
| ST16Q04 | d) I enjoy acquiring new knowledge in <broad science> |
| ST16Q05 | e) I am interested in learning about <broad science> |
| Scale reliability (Cronbach’s α) | 0.904 |

Table 3.13 shows the results of the CFA for the two-dimensional model of INTSCIE and JOYSCIE. The model fit indices, RMSEA (0.104), RMR (0.049), CFI (0.899) and TLI (0.857), were not satisfactory for the Hong Kong sample. The lack of fit was mostly due to correlated error terms between interest items about similar topics (e.g. biology of plants and human biology) (OECD, 2009b). In the CFA of the Hong Kong sample, the astronomy item (factor loading = 0.45) and the geology item (factor loading = 0.43) were deleted from the final model, as they were not part of the local school science curriculum for 15-year-old students in Hong Kong. Including these two items caused model misfit, which was reflected in the RMSEA value (0.170) and other model fit indices (e.g. TLI = 0.718). Figure 3.2 shows a second-order CFA model of INTSCIEHKG¹⁷, which groups items ST21Q01 and ST21Q02 into physical science (PHYSCIHKG), ST21Q03 and ST21Q04 into biological science (BIOSCIHKG), and ST21Q07 and ST21Q08 into scientific method (SCIMETHKG) in response to the correlated error terms between interest items on similar topics. The final model includes a second-order factor, INTSCIEHKG, which accounts for the hierarchical structure of the multiple factors.
The estimated latent correlation between INTSCIEHKG and JOYSCIE was high (0.85) and significant.

Table 3.13: Model fit and estimated latent correlations for Interest in and Enjoyment of Science Learning

| Model | χ2 (df) | p | RMSEA | RMR | CFI | TLI | Latent correlation between the two factors | Accept model |
|---|---|---|---|---|---|---|---|---|
| INTSCIE | 2697.940 (20) | 0.000 | 0.170 | 0.074 | 0.798 | 0.718 | -- | No |
| INTSCIEHKG | 62.685 (40) | 0.012 | 0.011 | 0.007 | 0.999 | 0.999 | -- | Yes |
| JOYSCIE | 179.003 (11) | 0.000 | 0.057 | 0.009 | 0.994 | 0.991 | -- | Yes |
| INTSCIE & JOYSCIE | 3248.604 (64) | 0.000 | 0.104 | 0.049 | 0.899 | 0.857 | 0.81** | No |
| INTSCIEHKG & JOYSCIE | 424.927 (40) | 0.000 | 0.046 | 0.015 | 0.987 | 0.985 | 0.85** | Yes |
| Model fit (criteria) | --- | >0.050 | <0.080 | <0.050 | >0.900 | >0.900 | --- | --- |

Note: **Correlation is significant at the 0.01 level (2-tailed).

Figure 3.2: A second-order CFA model of INTSCIEHKG.

¹⁷ The HKG subscript is used to denote the localized measurement model for the Hong Kong context.

Multi-group CFA on Interest and Enjoyment of Science Learning

Table 3.14 shows that ΔCFI was less than the cutoff (0.01) for the constrained models; therefore boys and girls had similar measurement weights, structural covariances and measurement residuals for the model. The results suggest that the measurement scale for the two-dimensional model of INTSCIE and JOYSCIE was comparable across the gender groups.

Table 3.14: Measurement invariance test across gender groups for two-dimensional model of Interest in and Enjoyment of Science Learning

| Model | χ2 (Δ) | df (Δ) | CFI (Δ)# | RMR | TLI | RMSEA | Invariant? (ΔCFI ≤ 0.01) |
|---|---|---|---|---|---|---|---|
| Whole sample | 424.927 | 40 | 0.987 | 0.015 | 0.985 | 0.046 | -- |
| Pattern invariance | | | | | | | |
| (1a) Boys’ sample | 211.726 | 40 | 0.987 | 0.015 | 0.983 | 0.043 | -- |
| (1b) Girls’ sample | 251.040 | 40 | 0.986 | 0.015 | 0.980 | 0.047 | -- |
| Gender invariance (constrained model) | | | | | | | |
| (2) Unconstrained | 462.766 | 80 | 0.987 | 0.015 | 0.981 | 0.032 | -- |
| (3) Measurement weights | 514.473 (51.707) | 87 (7) | 0.985 (0.002) | 0.020 | 0.981 | 0.033 | Yes, (3) versus (2) |
| (4) Structural covariances | 518.843 (4.37) | 92 (5) | 0.985 (0.000) | 0.021 | 0.981 | 0.032 | Yes, (4) versus (3) |
| (5) Measurement residuals | 619.123 (100.28)*** | 106 (14) | 0.985 (0.000) (0.002) | 0.021 | 0.982 | 0.032 | Yes, (5) versus (4) and (5) versus (2) |
| Model fit (criteria) | -- | -- | >0.900 | <0.050 | >0.900 | <0.080 | -- |

Note: ***p<0.001; # cutoff for (Δ) = 0.01 (Cheung and Rensvold, 2002)

3.3.2.4 Motivation to Learn Science

CFA on Motivation to Learn Science

In Table 3.15, five items for INSTSCIE were used to assess the extent to which students believe that learning science in school is useful and worthwhile for future studies or job prospects. The items were inverted for IRT scaling so that positive WLE scores on this index indicate higher levels of INSTSCIE (OECD, 2009b). The value of Cronbach’s α was 0.937, indicating that the data were of high reliability.

Table 3.15: Item parameters for Instrumental Motivation to Learn Science

Item parameters for instrumental motivation to learn science (INSTSCIE)

| Item | How much do you agree with the statements below? (Strongly agree / Agree / Disagree / Strongly disagree) |
|---|---|
| ST35Q01 | a) Making an effort in my <school science> subject(s) is worth it because this will help me in the work I want to do later on |
| ST35Q02 | b) What I learn in my <school science> subject(s) is important for me because I need this for what I want to study later on |
| ST35Q03 | c) I study <school science> because I know it is useful for me |
| ST35Q04 | d) Studying my <school science> subject(s) is worthwhile for me because what I learn will improve my career prospects |
| ST35Q05 | e) I will learn many things in my <school science> subject(s) that will help me get a job |
| Scale reliability (Cronbach’s α) | 0.937 |

In Table 3.16, four items for future-oriented science motivation (SCIEFUT) were used to assess the extent to which students intend to continue to learn science and take up a science-related career. The items were inverted for IRT scaling so that positive WLE scores on this index indicate higher levels of motivation to learn science and take up a science-related career in the future (OECD, 2009b). The value of Cronbach’s α was 0.931, indicating that the data were of high reliability.

Table 3.16: Item parameters for Future-oriented Science Motivation

Item parameters for future-oriented science motivation (SCIEFUT)

| Item | How much do you agree with the statements below? (Strongly agree / Agree / Disagree / Strongly disagree) |
|---|---|
| ST29Q01 | a) I would like to work in a career involving <broad science> |
| ST29Q02 | b) I would like to study <broad science> after <secondary school> |
| ST29Q03 | c) I would like to spend my life doing advanced <broad science> |
| ST29Q04 | d) I would like to work on <broad science> projects as an adult |
| Scale reliability (Cronbach’s α) | 0.931 |

Table 3.17 shows the results of the CFA for a two-dimensional model of motivation to learn science. The fit was satisfactory for the Hong Kong sample (RMSEA = 0.074). The latent correlation between the two factors is 0.71.
Table 3.17: Model fit and estimated latent correlations for motivation to learn science

| Model | χ2 (df) | p | RMSEA | RMR | CFI | NNFI (TLI) | Latent correlation between INSTSCIE/SCIEFUT | Accept model |
|---|---|---|---|---|---|---|---|---|
| INSTSCIE | 57.702 (4) | 0.000 | 0.054 | 0.005 | 0.997 | 0.993 | --- | Yes |
| SCIEFUT | 22.025 (1) | 0.000 | 0.069 | 0.004 | 0.999 | 0.992 | --- | Yes |
| INSTSCIE & SCIEFUT | 691.651 (26) | 0.000 | 0.074 | 0.019 | 0.983 | 0.976 | 0.71** | Yes |
| Model fit (criteria) | --- | >0.050 | <0.080 | <0.050 | >0.900 | >0.900 | --- | --- |

Note: **Correlation is significant at the 0.01 level (2-tailed).

Multi-group CFA on Motivation to Learn Science

Table 3.18 shows that ΔCFI was less than the cutoff (0.01) for the constrained models; therefore boys and girls had similar measurement weights, structural covariances and measurement residuals for the model. The results suggest that the measurement scale for the model of motivation to learn science was comparable across the gender groups.

Table 3.18: Measurement invariance test across the gender groups for two-dimensional model of motivation to learn science

| Model | χ2 (Δ) | df (Δ) | CFI (Δ)# | RMR | TLI | RMSEA | Invariant? (ΔCFI ≤ 0.01) |
|---|---|---|---|---|---|---|---|
| Whole sample | 691.651 | 26 | 0.983 | 0.019 | 0.976 | 0.074 | -- |
| Pattern invariance | | | | | | | |
| (1a) Boys’ sample | 358.780 | 26 | 0.982 | 0.019 | 0.975 | 0.075 | -- |
| (1b) Girls’ sample | 360.789 | 26 | 0.983 | 0.017 | 0.976 | 0.074 | -- |
| Gender invariance (constrained model) | | | | | | | |
| (2) Unconstrained | 719.570 | 52 | 0.982 | 0.018 | 0.975 | 0.053 | -- |
| (3) Measurement weights | 750.289 (30.719)*** | 59 (7) | 0.982 (0.000) | 0.021 | 0.978 | 0.050 | Yes, (3) versus (2) |
| (4) Structural covariances | 758.030 (7.741) | 62 (3) | 0.981 (0.001) | 0.026 | 0.978 | 0.049 | Yes, (4) versus (3) |
| (5) Measurement residuals | 982.677 (224.647)*** | 71 (9) | 0.976 (0.005) (0.006) | 0.030 | 0.975 | 0.053 | Yes, (5) versus (4) and (5) versus (2) |
| Model fit (criteria) | -- | -- | >0.900 | <0.050 | >0.900 | <0.080 | -- |

Note: ***p<0.001; # cutoff for (Δ) = 0.01 (Cheung and Rensvold, 2002)

3.4 Conceptualization and operationalization of Parental SES

The parental SES of the students was modeled by three factors: highest parental occupational status (HISEI), educational level of mother (MISCED) and educational level of father (FISCED). These factors were extracted from the students’ questionnaire. The CFA result in Table 3.19 suggests that the parental SES model fitted the collected data perfectly (RMSEA = 0.000). Note that a one-factor model with three indicators is just-identified (df = 0), so this fit is exact by construction.

Table 3.19: Model fit for socioeconomic status

| Model | χ2 (df) | p | RMSEA | RMR | CFI | TLI | Accept model |
|---|---|---|---|---|---|---|---|
| SES | 0.000 (0) | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | Yes |
| Model fit (criteria) | --- | >0.050 | <0.080 | <0.050 | >0.900 | >0.900 | --- |

3.5 Multidimensional Differential Item Functioning (MDIF)

This section describes the method used to address, at the item level, the research questions on gender differences in science performance and affective learning outcomes.

3.5.1 The item response (IRT) model

To study gender differences in science performance at the item level, the IRT model must meet two requirements. First, it should be flexible enough to handle the dichotomously and polytomously scored PISA items, which are arranged in a multidimensional structure. Second, it should be able to isolate the items displaying differential item functioning (DIF).
Adams and Wilson (1996) proposed a generalized approach that brings Rasch models, such as the simple logit model (Rasch, 1960), the rating scale and partial credit models (Andrich, 1978; Masters, 1982), the linear logistic test model (Fisher, 1983) and multifaceted models (Linacre, 1994), into a single generalized logit model called the unidimensional random coefficients multinomial logit model (URCMLM). Given that θ is the latent variable, the probability of a response in category j of item i can be modeled as:

$$P(X_{ij} = 1; \mathbf{A}, \mathbf{b}, \boldsymbol{\xi} \mid \theta) = \frac{\exp(b_{ij}\theta + \mathbf{a}'_{ij}\boldsymbol{\xi})}{\sum_{k=1}^{K_i} \exp(b_{ik}\theta + \mathbf{a}'_{ik}\boldsymbol{\xi})},$$

where K_i is the number of possible response categories of item i. Thus, the URCMLM is flexible enough to handle both the dichotomously and polytomously scored PISA items and DIF. The design matrix A and the scores b_ik allow it to accommodate the rating scale and partial credit models; for example, the category indicators 0, 1, 2 and 3 may be used to indicate respondents’ choices or different levels of performance. However, as discussed in chapter two, assuming that the items in PISA 2006 are unidimensional is invalid, since the assessment is made up of several unidimensional sub-scales, and the multidimensional version of the URCMLM has to be used (see Wilson & Hoskens, 2005). It is a direct extension of the URCMLM that covers a set of D latent traits, locating respondents in a D-dimensional latent space.
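The category probabilities of the URCMLM can be sketched in a few lines. The helper names below and the partial credit parametrization of the linear terms (the negative cumulative sum of step difficulties, with the first category as reference) are assumptions made for illustration, and the step difficulties are hypothetical values, not estimates from this study.

```python
import math


def category_probabilities(theta, scores, linear_terms):
    """P(category k) = exp(b_k*theta + a_k'xi) / sum_u exp(b_u*theta + a_u'xi).

    scores       -- the category scores b_k (e.g. 0, 1, 2, 3)
    linear_terms -- the already-evaluated products a_k'xi, one per category
    """
    logits = [b * theta + a for b, a in zip(scores, linear_terms)]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def partial_credit_terms(step_difficulties):
    """a_k'xi for a partial credit item: minus the cumulative step difficulties.

    The first category is the reference category and gets a term of 0.
    """
    terms = [0.0]
    running = 0.0
    for delta in step_difficulties:
        running += delta
        terms.append(-running)
    return terms


# Dichotomous (simple logit) item with a hypothetical difficulty of 0.0.
rasch = category_probabilities(1.0, [0, 1], partial_credit_terms([0.0]))

# Polytomous item with four categories and hypothetical step difficulties.
pcm = category_probabilities(0.5, [0, 1, 2, 3],
                             partial_credit_terms([-0.5, 0.3, 1.0]))
print([round(p, 3) for p in rasch])
print([round(p, 3) for p in pcm])
```

With two categories the function reduces to the simple logit model, which is the sense in which one generalized model covers both dichotomous and polytomous items.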
The multidimensional random coefficients multinomial logit model (MRCMLM) is formulated as follows:

$$P(X_{nik} = 1 \mid \boldsymbol{\theta}_n, \boldsymbol{\xi}) = \frac{\exp(\mathbf{b}'_{ik}\boldsymbol{\theta}_n + \mathbf{a}'_{ik}\boldsymbol{\xi})}{\sum_{u=1}^{K_i} \exp(\mathbf{b}'_{iu}\boldsymbol{\theta}_n + \mathbf{a}'_{iu}\boldsymbol{\xi})},$$

where X_nik = 1 if person n’s response to item i is in category k and 0 otherwise (1 ≤ i ≤ I, 1 ≤ k ≤ K_i, 1 ≤ n ≤ N, with a reference category fixed to zero for model identification); θ_n is a D × 1 ability parameter vector of person n, with one element for each dimension d (1 ≤ d ≤ D); b'_ik is a 1 × D scoring vector for category k of item i; ξ is a p × 1 item parameter vector; and a'_ik is a 1 × p vector specifying the linear combination of the p elements of ξ for each response category. ξ is a fixed unknown parameter vector while θ_n is a random parameter vector. The elements of θ_n are assumed to follow a multivariate normal (MVN) distribution (Liu, 2006 p.48-49).

Wu, Adams & Wilson (1997) implemented the MRCMLM in the ConQuest IRT software, which makes multidimensional rating scale and partial credit model analysis possible. In this study, ConQuest was used to estimate the parameters of the MVN distribution by the marginal maximum likelihood (MML) method and each person’s ability by expected a posteriori (EAP) estimation. A three-dimensional model was used to analyze science performance based on its three competency domains. The items of each domain were assigned to one of the three dimensions following the between-item multidimensionality model (Wang, Wilson & Adams, 1997) (see Figure 3.3).

Figure 3.3: A graphical representation of within-item and between-item multidimensionality. [Two panels, "Between-item Multidimensionality" and "Within-item Multidimensionality", each linking items 1 to 9 to latent dimensions 1 to 3.]

3.5.1.1 DIF model for gender differences studies

Differential item functioning (DIF) analysis is a method in IRT for testing whether an item is biased in favour of a particular group of examinees after controlling for their underlying abilities.
According to Paek (2002) and Liu (2006), in the absence of gender-based DIF the probability of a given response to an item satisfies:

$$P(X \mid \theta, g = \text{male}) = P(X \mid \theta, g = \text{female})$$

for all values of θ, where X is the observed item response, θ is the unobserved latent trait to be measured and g is the group membership indicator. However, if gender-based DIF exists, the URCMLM for DIF estimation of dichotomous item i will be:

$$\operatorname{logit}[P(X_{ni} = 1 \mid \theta_n, g)] = \theta_n - \beta_i + \gamma_i G$$

where X_ni is the response of person n to item i, β_i is the item parameter indicating the difficulty of item i and γ_i is the item DIF parameter. If G is 1 then g is the control group, while if G is 0 then g is the experimental group. To make the DIF more comparable, -1 and 1 are assigned to the control group and the experimental group respectively. Hence, γ_i captures the difference in item difficulty between the two gender groups; under this coding the two logits differ by 2γ_i:

$$\operatorname{logit}_{\text{male}} = \theta_n - \beta_i + \gamma_i, \qquad \operatorname{logit}_{\text{female}} = \theta_n - \beta_i - \gamma_i$$

The gender DIF effect can then be modeled as:

$$P(X = 1 \mid \theta_n, \beta_i, \gamma_g) = \frac{\exp(\theta_n - \beta_i - \gamma_g X_{ik} Z_g)}{1 + \exp(\theta_n - \beta_i - \gamma_g X_{ik} Z_g)},$$

where θ_n is the unobserved latent ability of student n, assumed to be normally distributed; β_i is the difficulty of item i; X_ik is an item variable with value 1 when k is 1 and 0 otherwise; Z_g is the group indicator for male and female students; and γ_g is the random variable used to estimate DIF. Both γ_g and θ_n follow a bivariate normal distribution, and μ_control is assumed to be (0,0) if there is no DIF. If μ_expt is significantly different from zero, there is a mean DIF effect between the two genders (Liu, 2006). ConQuest fits the above model and computes the DIF parameter γ_g for each item. The model statement (the command statement for the calculation) for performing the DIF analysis is:

    model item - gender + item*gender + item*step;

The model statement contains four terms, which result in the estimation of four sets of parameters.
The first term, ‘item’, results in the estimation of a set of item difficulty parameters; the second term, ‘-gender’, results in the estimation of the mean abilities of male and female students across all the items; the third term, ‘item*gender’, results in the estimation of DIF; and the last term, ‘item*step’, results in the use of the partial credit model in the estimation. The negative sign in front of the gender term ensures that the gender parameters are presented more naturally, with a higher number corresponding to a higher mean ability and vice versa.

3.5.1.2 Effect size by DIF

Paek (2002) studied the relationship between the traditional DIF size and the DIF estimates produced by ConQuest. The DIF effect size of an item i is taken as twice the DIF parameter estimated by ConQuest, and is classified as negligible (Class A DIF) if |2γi| < .426, intermediate (Class B DIF) if .426 ≤ |2γi| < .638 and large (Class C DIF) if |2γi| ≥ .638. The classification of all the science performance items therefore gives an indication of the size of gender differences across the items and competency dimensions.

3.5.1.3 Item fit statistics

If an item fits the expected model, its weighted fit mean square (WFMS) statistic provided by ConQuest has an expected value of 1 (Wright and Masters, 1982). Statistically, the mean square (MNSQ) statistic is a χ2 statistic divided by its degrees of freedom. For partial credit and rating scale models, WFMS values between 0.7 and 1.3 are often considered the acceptable range for item fit (Wright, Linacre, Gustafson, & Martin-Löf, 1994). In this study, an item with a WFMS outside this range is considered poorly fitting. Figure 3.4 below shows that all the 108 items for the cognitive domain of scientific literacy have good item fit statistics.
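The DIF parametrization of Section 3.5.1.1 and the effect-size classes of Section 3.5.1.2 can be sketched together as follows. This is not the ConQuest estimation itself: the function names, the group coding (+1 for boys, -1 for girls, matching the two logit equations) and the γ values are assumptions for illustration only.

```python
import math


def p_correct(theta, beta, gamma, group_code):
    """Rasch probability with logit = theta - beta + gamma * group_code.

    group_code is assumed to be +1 for boys and -1 for girls.
    """
    logit = theta - beta + gamma * group_code
    return 1.0 / (1.0 + math.exp(-logit))


def classify_dif(gamma):
    """Effect-size class based on |2*gamma| with the thresholds from Paek (2002)."""
    size = abs(2.0 * gamma)
    if size < 0.426:
        return "A"  # negligible
    if size < 0.638:
        return "B"  # intermediate
    return "C"      # large


# Hypothetical DIF estimates for three items.
for item, gamma in [("item 1", 0.10), ("item 2", 0.25), ("item 3", -0.40)]:
    boys = p_correct(theta=0.0, beta=0.0, gamma=gamma, group_code=+1)
    girls = p_correct(theta=0.0, beta=0.0, gamma=gamma, group_code=-1)
    print(f"{item}: class {classify_dif(gamma)}, "
          f"P(boys)={boys:.3f}, P(girls)={girls:.3f}")
```

For students of equal ability, a positive γ raises the boys' probability of success and lowers the girls', and the gap between the two logits is 2γ, which is why the classes are defined on |2γ| rather than |γ|.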
Figure 3.4: Item fit statistics for the three science performance dimensions: Explaining Phenomena Scientifically (EPS), Identifying Scientific Issues (ISI) and Using Scientific Evidence (USE). [Plot of fit statistics (WFMS), vertical axis from 0.7 to 1.3, with separate series for EPS, ISI and USE items.]

3.6 Model testing in SEM

Single mediation models

The MLM SEM model testing followed a procedure similar to that used by MacKinnon (2008). An MLM SEM model is a path model that specifies a hypothesized causal chain between independent (X), mediating (M) and dependent (Y) variables, taking into account the nested structure of multilevel data: X→Mx→Y, where X is Girl, Y is Future-oriented Science Motivation and Mx is the individual mediator in this study. A statistical mediation analysis tests the X→Mx→Y relation. The gender effect was said to be mediated if (i) X had a statistically significant effect on the hypothesized dependent variable (Y) in the absence of the mediator (Mx), (ii) X had a statistically significant effect on the hypothesized mediator (Mx), (iii) the hypothesized mediator had a statistically significant effect on the dependent variable (Y), and (iv) the mediated effect was statistically significant after controlling for parental SES (MacKinnon, 2008).

Multiple mediator models

The revised Expectancy-value Model of Achievement-related Choices in Science (see Figure 3.1) incorporates all the mediator models above in a stepwise manner. The mediated gender effects in the multiple mediation model were calculated as αM*βM for each of the six mediators, i.e. the effect of gender on mediator M multiplied by the effect of M on the outcome. The RMSEA, RMR, CFI and TLI were used to evaluate the overall fit of the models. MacKinnon (2008) suggested that mediation model testing should start with single mediator models and then proceed to more complicated mediator models, as this helps researchers to understand what is truly going on with the mediation.

3.7 Summary

In chapter three, the data collection method was described.
Secondly, the conceptual framework and the procedure used to conceptualize the cognitive and affective factors were stated. Thirdly, the research methods, MDIF at item level and multilevel mediation at system level, were put forward. Finally, the internal consistency and the CFA model fit of the affective factors were validated for boys and girls.

CHAPTER FOUR

GENDER DIFFERENCES IN STUDENTS’ COGNITIVE & AFFECTIVE LEARNING OUTCOMES

In this chapter, the gender differences in students’ cognitive and affective learning outcomes are examined. Firstly, gender differences in cognitive outcomes, namely the overall score in science performance (SP) and its three competencies, Explaining Phenomena Scientifically (EPS), Identifying Scientific Issues (ISI) and Using Scientific Evidence (USE), are investigated by two methods: Mean Score Difference (MSD)¹⁸ and Multidimensional Differential Item Functioning (MDIF)¹⁹. Secondly, SP content domains and item formats are investigated by MDIF. Thirdly, gender variability is analyzed using the gender variance ratio and the distribution pattern of boys and girls at each ability estimate. Finally, the gender differences in affective learning outcomes, namely Enjoyment of Science Learning (JOYSCIE), Future-oriented Science Motivation (SCIEFUT), Interest in Science Learning (INTSCIE), Instrumental Motivation to Learn Science (INSTSCIE), Personal Value of Science (PERSCIE) and Science Self-concept (SCSCIE), are analyzed using MSD.

¹⁸ An SPSS macro program provided by the PISA 2006 data analysis manual (OECD, 2009d) was used to measure the mean gender difference and the standard error of the difference.
¹⁹ MDIF is for measuring gender differences at item level.

4.1 Gender differences in students’ cognitive outcomes

4.1.1 Gender differences in science performance dimensions

To study the gender differences in students’ cognitive outcomes, section 4.1.1.1 will use MSD to investigate the gender differences in mean scores.
Section 4.1.1.2 will use MDIF to examine the gender differences at item level.

4.1.1.1 Gender differences in science performance dimensions measured by MSD

Table 4.1 below shows the overall performance, as well as the performance in each dimension of SP, for boys and girls using the mean score difference method. There was no significant gender difference in the overall achievement of SP. However, boys performed significantly better than girls on Explaining Phenomena Scientifically (EPS) (20.79, p<0.001), while girls significantly outperformed boys on Identifying Scientific Issues (ISI) (14.66, p<0.05). Boys performed slightly better than girls on Using Scientific Evidence (USE), but the difference in this area of competency was not statistically significant.

These results deviate from the Hong Kong PISA 2006 Report published in 2008. Ho et al (2008) suggested that there were no significant gender differences in EPS, ISI and USE in PISA 2006, based on a comparison of the mean percentage correct between boys and girls. However, those comparisons captured neither the measurement error nor the sampling error of the two-stage sampling process. As a result, biased estimates of the standard errors could have been produced. The current study follows the data analysis procedure of PISA 2006 (i.e. Fay’s Balanced Repeated Replication (BRR) method) to compute the standard errors of the differences among plausible values (OECD, 2009b p.157-160); therefore, the attenuation in mean differences due to measurement error and sampling error could be avoided (Wu, 2005).
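The z statistics and effect sizes reported in Table 4.1 can be reproduced from the published means, standard deviations and standard errors. The sketch below does this for the EPS row; the function names are illustrative, and the standard error of the difference (4.55) is taken as reported rather than recomputed with the BRR procedure.

```python
import math


def z_statistic(difference, standard_error):
    """z = difference divided by its standard error."""
    return difference / standard_error


def cohens_d(mean_a, mean_b, sd_a, sd_b):
    """Cohen's d using the root mean square of the two standard deviations."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_a - mean_b) / pooled_sd


# EPS row of Table 4.1 (boys versus girls).
boys_mean, boys_sd = 559.79, 97.29
girls_mean, girls_sd = 539.00, 88.60
se_difference = 4.55  # reported standard error of the difference

z = z_statistic(boys_mean - girls_mean, se_difference)
d = cohens_d(boys_mean, girls_mean, boys_sd, girls_sd)
print(round(z, 2), round(d, 2))
```

Under these inputs the values round to z = 4.57 and d = 0.22, matching the table.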
Table 4.1: Gender differences in scientific competency

| Competency | Boys (N=2294) Mean (SE/SD) | Girls (N=2351) Mean (SE/SD) | Difference Boys-Girls (SE) | z | d |
|---|---|---|---|---|---|
| Science performance | 545.61 (3.46/95.12) | 538.91 (3.47/88.73) | 6.70 (4.85) | 1.38 | 0.07 |
| Explaining phenomena scientifically | 559.79 (3.50/97.29) | 539.00 (3.27/88.60) | 20.79 (4.55) | 4.57*** | 0.22 |
| Identifying scientific issues | 520.37 (4.07/103.17) | 535.03 (4.48/97.98) | -14.66 (5.90) | -2.48* | 0.15 |
| Using scientific evidence | 543.58 (3.76/102.93) | 541.18 (3.98/94.32) | 2.40 (5.49) | 0.44 | 0.02 |

Note: *p<.05; ***p<.001; the z statistic was calculated by dividing the difference by its standard error. The effect size d of the difference was calculated with Cohen’s equation (Cohen, 1988):

$$d = \frac{\text{Mean}_{boys} - \text{Mean}_{girls}}{\sqrt{(SD_{boys}^2 + SD_{girls}^2)/2}}$$

The results in Table 4.1 indicate that most of the effect sizes²⁰ (Cohen’s d) were below 0.20, except for “Explaining phenomena scientifically”. However, small effect sizes can have long-term systematic implications (Cole, 1997a, 1997b). A relatively large effect size was found in the EPS competency domain (Cohen’s d = 0.22), which requires students to (1) apply science knowledge in a given situation, (2) describe or interpret phenomena scientifically and predict changes, and (3) make appropriate descriptions, explanations and predictions.

²⁰ Cohen’s d = 0.2 is small, d = 0.5 is medium and d ≥ 0.8 is large. A Cohen’s d smaller than 0.2 is considered a negligible effect size in the social science context and conventional clinical practices (Cohen, 1988).

The above results were consistent with the findings of Yip et al (2004), who used the Hong Kong PISA 2000 dataset. Using the mean difference method, Yip et al (2004) found that items assessing students’ understanding of scientific knowledge or concepts (i.e. EPS) were in favor of boys. The result was also the same as the findings of Le (2009) from PISA 2006, where the data of all the 50 participating countries were used.
Le found that EPS items are clearly in favor of boys while ISI and USE items tend to favor girls.

4.1.1.2 Gender differences in science performance dimensions measured by MDIF

The next analysis used MDIF to investigate the gender differences with the overall abilities of students taken into account. Table 4.2 shows the items with statistically significant gender DIF for the three science performance dimensions after controlling for the overall abilities of boys and girls (see Appendix D). In the first column, the items are classified according to the category system²¹ proposed by Paek (2002). Out of 108 science performance items, 52 items (48%) showed statistically significant gender differences: 29 items favored girls and 23 items favored boys. Although more items favored girls, most of these items had negligible effect sizes (Class A items). The number of items with intermediate effect sizes (Class B items) favoring boys was equal to the number favoring girls. Likewise, the numbers of items with large effect sizes (Class C items) favoring each sex were equal.

²¹ The item DIF γi estimated by the ConQuest software has a negligible effect size for |2γi|<0.426 (Class A DIF), an intermediate effect size (Class B DIF) for 0.426≤|2γi|<0.638 and a large effect size (Class C DIF) for |2γi| ≥0.638.

Table 4.2: Summary of items showing statistically significant gender DIF for different science performance dimensions

| Classification of DIF items (Effect size) | Item dimension | Favouring girls | Sub-total for girls | Favouring boys | Sub-total for boys | Total |
|---|---|---|---|---|---|---|
| Class A (Small) | EPS | 12 | 24 | 11 | 18 | 42 |
| | ISI | 3 | | 2 | | |
| | USE | 9 | | 5 | | |
| Class B (Intermediate) | EPS | 2 | 2 | 2 | 2 | 4 |
| | ISI | 0 | | 0 | | |
| | USE | 0 | | 0 | | |
| Class C (Large) | EPS | 2 | 3 | 1 | 3 | 6 |
| | ISI | 0 | | 0 | | |
| | USE | 1 | | 2 | | |
| Total | | | 29 | | 23 | 52 |

If the items displaying gender DIF are sorted according to science performance dimensions, the total numbers of items with intermediate effect sizes and with large effect sizes favoring each sex were the same.
In contrast to the mean score difference method, the Hong Kong results estimated by MDIF showed that a large portion of the items (52 out of 108 items, or 48%) displayed statistically significant gender differences at item level (see the sixth and seventh columns of the tables in Appendix D). This result is consistent with a number of earlier studies. For example, Hamilton (1999) revealed that American boys had a significant advantage on 12th-grade constructed-response science DIF items in the National Education Longitudinal Study of 1988. Using the gender DIF method and a sample of 50 participating countries, Le (2009) found a large number of items showing significant gender DIF among the PISA 2006 science items.

4.1.2 Gender differences in content domains

This section examines gender differences in each content domain of science performance. The domains are Knowledge About Science (KAS), which includes Scientific Enquiry (SEQ) and Scientific Explanations (SEL), and Knowledge Of Science (KOS), which covers Earth and Space Systems (ESS), Living Systems (LS), Physical Systems (PS) and Technology Systems (TS). Section 4.1.2.1 uses MSD to investigate gender differences at mean score level. Section 4.1.2.2 uses MDIF to examine gender differences at item level.

4.1.2.1 Gender differences in content domains measured by MSD

Table 4.3 shows the students' performance by content domain, including KAS and KOS. Girls scored marginally higher than boys in KAS, without statistical significance (p=0.596). On the other hand, boys scored significantly higher than girls in the KOS sub-domains, in particular PS (p<0.001). The results followed the traditional rule of thumb that boys tend to perform better than girls in earth sciences and physical sciences (Schmidt et al., 1997; Yung et al., 2006), and physical sciences displayed the largest gender difference among all the content domains (Cohen's d = 0.35).
Surprisingly, the results also suggested that boys significantly outperformed girls on LS (p<0.05). This result deviated from many previous findings that girls performed better than boys, or that there was no statistically significant gender difference, in LS (Jovanovic et al., 1994; Mullis, Martin, Fierros, Goldberg & Stemler, 2000; Yip et al., 2004).

Table 4.3: Gender differences in content domains

Content domain (see footnote 22)   Boys (N=2294)           Girls (N=2351)         Difference        z         d
                                   Mean (SE/SD)            Mean (SE/SD)           Boys-Girls (SE)
Knowledge About Science (KAS)      540.27 (3.44/101.23)    542.83 (3.48/94.61)     -2.55 (4.79)    -0.53      0.03
Knowledge Of Science (KOS)
  Earth & Space Systems (ESS)      532.78 (3.65/97.29)     517.63 (3.21/93.54)     15.15 (4.86)     3.12*     0.16
  Living Systems (LS)              563.56 (3.37/98.71)     551.90 (3.23/91.82)     11.66 (4.81)     2.42*     0.12
  Physical Systems (PS)            562.65 (3.35/100.03)    528.93 (3.41/93.14)     33.72 (4.77)     7.08***   0.35

Note: *p<.05; ***p<.001.

4.1.2.2 Gender differences in content domains measured by MDIF

Table 4.4 shows the number of items with statistically significant gender DIF by item content after controlling for the overall abilities of boys and girls (see Appendix D). The items were then classified according to the category system proposed by Peak (2002). When the items displaying gender DIF are sorted within Knowledge Of Science (KOS), more items from PS favored boys while more items from LS, ESS and TS favored girls. However, most of the ESS and TS items were from Class A, with small effect sizes. In terms of KAS, the numbers of items displaying gender DIF that favored each sex were similar.

With respect to item content, our results generally agreed with prior findings on gender differences. Becker (1989) conducted a meta-analysis of gender differences in science content and showed that males had significant advantages in studies of biology, general science and physics, but significant differences were not found for studies of mixed science content or of geology and earth sciences.
Similarly, advantages were found for 8th-grade and 10th-grade boys on physical science. Small gender differences favoring boys were also found in life science tests in the United States (Burkam, 1997).

Footnote 22: The PISA 2006 database does not provide plausible values for Scientific Enquiry, Scientific Explanations and Technology Systems.

Table 4.4: Summary of items showing statistically significant gender DIF for item content

                                                    Number of items
Classification of DIF items   Content   Item content       Favoring   Sub-total   Favoring   Sub-total   Total
(Effect size)                 domain    (see footnote 23)  girls      for girls   boys       for boys
Class A (Small)               KAS       SEQ                4           8          2           6          42
                                        SEL                4                      4
                              KOS       ESS                5          16          0          12
                                        LS                 8                      4
                                        PS                 0                      8
                                        TS                 3                      0
Class B (Intermediate)        KAS       SEQ                0           0          0           0           4
                                        SEL                0                      0
                              KOS       ESS                0           2          0           2
                                        LS                 1                      1
                                        PS                 1                      1
                                        TS                 0                      0
Class C (Large)               KAS       SEQ                0           0          0           1           6
                                        SEL                0                      1
                              KOS       ESS                0           3          1           2
                                        LS                 2                      0
                                        PS                 0                      0
                                        TS                 1                      1
Total                         KAS                                      8                      7          52
                              KOS                                     21                     16
                              All                                     29                     23

Footnote 23: Knowledge About Science: Scientific Enquiry (SEQ), Scientific Explanations (SEL); Knowledge Of Science: Earth and Space Systems (ESS), Living Systems (LS), Physical Systems (PS), Technology Systems (TS).

4.1.3 Gender differences in item formats

This section examines gender differences across item formats, including Multiple Choice (MC), Complex Multiple Choice (CMC), Closed Constructed Response (CCR) and Open Response (OR). MDIF was used to examine the gender differences at item level. Table 4.5 shows the number of items with statistically significant gender DIF for each item format after controlling for the overall abilities of boys and girls (see Appendix D). The first column classifies items according to the category system proposed by Peak (2002). The results indicated that boys were generally favored by Closed Constructed Response (CCR) items while girls were favored by Open Response (OR) items.
The results were consistent with previous studies on gender differences with respect to item formats (Le, 2009; Yip et al., 2004; Bolger & Kellaghan, 1990; Mazzeo et al., 1993; Cole, 1997b; Hamilton, 1999; Zenisky et al., 2004). Multiple choice and closed response items tended to favor boys while open-response and contextualized items tended to favor girls (Le, 2009; Yip et al., 2004).

Table 4.5: Summary of items showing statistically significant gender DIF for item format

                                        Number of items
Classification of DIF items   Item     Favoring   Sub-total   Favoring   Sub-total   Total
                              format   girls      for girls   boys       for boys
Class A (Small)               MC        7         24          7          18          42
                              CMC       5                     4
                              CCR       0                     2
                              OR       12                     5
Class B (Intermediate)        MC        2          2          2           2           4
                              CMC       0                     0
                              CCR       0                     0
                              OR        0                     0
Class C (Large)               MC        0          3          0           3           6
                              CMC       1                     2
                              CCR       0                     1
                              OR        2                     0

Note: Multiple Choice (MC), Complex Multiple Choice (CMC), Closed Constructed Response (CCR) and Open Response (OR).

4.1.4 Gender variability in science performance

4.1.4.1 Gender variability measured by variance ratio (B/G)

The variance of boys' and girls' scores was computed at each science performance level defined on the PISA scale. The results are shown in Table 4.6. In the last column of Table 4.6, the variability of boys and girls is compared in the form of a variance ratio (B/G), and the trend is presented in Figure 4.2. The results indicated that girls' variance was smaller than that of boys at every level of science performance and stayed fairly constant at around 1400 units for all science performance levels except level 0, where the largest variance for girls was found. The boys' variance was more volatile across levels and peaked at level 0 and level 6.
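The variance ratio (B/G) is simply the boys' variance divided by the girls' variance at each performance level. As a minimal sketch, the snippet below recomputes the ratios from the variances listed in Table 4.6; the variable names are illustrative and not part of the original analysis.

```python
# Average PV1-PV5 variances for performance levels 0 to 6, as listed in Table 4.6.
boys_variance = [2505.3, 1619.0, 1606.8, 1479.5, 1449.9, 1534.9, 1845.3]
girls_variance = [1752.7, 1376.9, 1455.1, 1416.3, 1449.5, 1359.7, 1337.6]

# Variance ratio (B/G), rounded to one decimal place as in the table.
variance_ratio = [round(b / g, 1) for b, g in zip(boys_variance, girls_variance)]

for level, ratio in enumerate(variance_ratio):
    print(f"Level {level}: B/G = {ratio}")
```

A ratio above 1 means that boys' scores are more spread out than girls' scores at that level.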
Table 4.6: Gender variance ratio on the PISA scale

Science performance level    Boys' variance#   Girls' variance#   Variance ratio (B/G)
Level 0 - below 335          2505.3            1752.7             1.4
Level 1 - 335.0 to 409.4     1619.0            1376.9             1.2
Level 2 - 409.5 to 484.0     1606.8            1455.1             1.1
Level 3 - 484.1 to 558.6     1479.5            1416.3             1.0
Level 4 - 558.7 to 633.2     1449.9            1449.5             1.0
Level 5 - 633.3 to 707.9     1534.9            1359.7             1.1
Level 6 - Above 707.9        1845.3            1337.6             1.4

Remark#: Average value of PV1 to PV5 variances.

Figure 4.1 and Figure 4.2 show the gender variability and the gender variance ratio, respectively, at different science performance levels. The results show that boys' scores always varied more than girls'. Using several American national norms of standardized test batteries, Feingold (1992) found that males were consistently more variable than females in intellectual abilities. Our results were thus in line with Feingold's (1992) hypothesis on gender variability.

Figure 4.1: Gender variability on different science performance level [chart; x-axis: science performance level; y-axis: variance in PVs; series: boys' variance, girls' variance]

Figure 4.2: Gender variance ratio on different science performance level [chart; x-axis: science performance level; y-axis: variance ratio (Boys/Girls)]

4.1.4.2 Gender variability measured by number of students against each ability estimate

This section examines gender variability by counting the number of students at each ability estimate. Figure 4.3 shows the number of boys and girls at each ability estimate for science performance. More boys were located in the left tail of the distribution, between -2.5 logits and -0.5 logits. Girls outnumbered boys from -0.5 logits to 0.5 logits and at about 1.0 logit. However, boys outnumbered girls between 2.3 logits and 2.6 logits. This shows that, at the bottom of the distribution, more boys than girls did poorly.
In the middle part of the distribution, girls tended to outnumber boys. In the upper portion of the distribution, more boys than girls did well. This finding is consistent with the greater variability of boys than girls at different science performance levels (see section 4.1.4.1).

Figure 4.3: Science performance: Number of boys and girls at each ability estimate [chart; x-axis: ability estimate (logits); y-axis: number of students; series: girls, boys]

Figure 4.4, Figure 4.5 and Figure 4.6 show the number of boys and girls at each ability estimate for EPS, ISI and USE. For the EPS dimension, girls predominated between -0.5 logits and 1.5 logits while boys outnumbered girls between 1.5 logits and 2.0 logits. More boys than girls achieved higher marks, except in the highest ability zone, with ability estimates above 2.0 logits. This shows that more boys could demonstrate a higher level of competency in the EPS dimension.

Figure 4.4: Explaining Phenomena Scientifically (EPS): Number of boys and girls at each ability estimate [chart; x-axis: ability estimate (logits); y-axis: number of students; series: girls, boys]

For the ISI dimension, girls outnumbered boys almost everywhere between -1.2 logits and 3.2 logits, except in the region near 1.8 logits. Consistent with the earlier study by Yip et al. (2004), the predominance of girls in ISI was reproduced. Girls demonstrated higher scientific skills (Yip et al., 2004), whereas boys had better conceptual understanding in science (e.g. Machin & Pekkarinen, 2008).

Figure 4.5: Identifying Scientific Issues (ISI): Number of boys and girls at each ability estimate [chart; x-axis: ability estimate (logits); y-axis: number of students; series: girls, boys]

For the USE dimension, boys predominated in the upper region (1.0 logit or above) while girls predominated in the lower to middle region (from -1.2 logits to 1.0 logit).
Such results suggest that more boys displayed a higher level of competency in the USE dimension.

Figure 4.6: Using Scientific Evidence (USE): Number of boys and girls at each ability estimate [chart; x-axis: ability estimate (logits); y-axis: number of students; series: girls, boys]

Overall, the four distributions are multimodal, with a large degree of overlap between the numbers of boys and girls at each ability estimate. Our results did not support Machin and Pekkarinen's (2008) hypothesis that more male than female high achievers occupy the right tails of the distributions. In short, apart from the greater diversity in ability among boys than among girls at the left and right tails of the distributions mentioned above, male low achievers outnumbered female low achievers in all the dimensions of science performance. This pattern of boys' underachievement at the left tails of the distribution is increasingly identified as an international trend and issue. From this point of view, the gender differences appear to be growing in favor of girls rather than diminishing (Francis & Skelton, 2005).

4.2 Gender differences in students' affective learning outcomes measured by MSD

This section examines the gender differences in students' affective learning outcomes in science. Table 4.7 shows the gender differences in Science Self-concept (SCSCIE), Interest in Science Learning (INTSCIE), Enjoyment of Science Learning (JOYSCIE), Personal Value of Science (PERSCIE), Instrumental Motivation to Learn Science (INSTSCIE) and Future-oriented Science Motivation (SCIEFUT).
Table 4.7: Gender differences in affective learning outcomes (WLE scores)

Affective factor   Boys (N)   Girls (N)   Boys                 Girls                Difference        z          d
                                          Mean (SE/SD)         Mean (SE/SD)         Boys-Girls (SE)
SCSCIE             2280       2346        -0.03 (0.02/0.02)    -0.47 (0.03/0.02)    0.44 (0.04)       12.23***   0.47
INTSCIE            2282       2346         0.33 (0.02/0.03)     0.06 (0.03/0.02)    0.27 (0.04)        7.21***   0.28
JOYSCIE            2282       2346         0.54 (0.02/0.89)     0.21 (0.02/0.87)    0.33 (0.03)       10.14***   0.37
PERSCIE            2283       2346         0.50 (0.02/0.91)     0.44 (0.02/0.86)    0.16 (0.03)        4.75***   0.18
INSTSCIE           1811       1905         0.28 (0.02/0.94)     0.05 (0.02/0.91)    0.23 (0.03)        7.85***   0.25

Note: ***p<0.001. The z statistic was calculated by dividing the difference by its standard error.

Overall, boys' SCSCIE was significantly higher than that of girls (0.44, p<0.001). The effect size of SCSCIE (Cohen's d = 0.47) was medium. The results were consistent with previous findings in other countries. Häussler and Hoffmann (2000) found that, in Germany, boys' physics-related self-concept was higher than their general school-related self-concept while the opposite was true for girls. Using data from the National Education Longitudinal Study of 1988, Reis and Park (2001) found that high-achieving 12th-grade boys had higher SCSCIE than their female counterparts in the United States.

Similarly, boys' INTSCIE was significantly higher than that of girls (0.27, p<0.001). The effect size of INTSCIE was small (Cohen's d = 0.28). The results were in line with previous studies. For example, Evans (2002) conducted a gender study of interest and knowledge acquisition in science learning in the United States, Taiwan and Japan. Girls' scores were found to be lower than those of boys on every interest item regardless of cultural background.

In general, boys had significantly higher JOYSCIE than girls (0.33, p<0.001). The effect size (Cohen's d = 0.37) was small. Our findings on JOYSCIE were in good agreement with earlier studies.
Weinburgh (1994, 2000) found that boys in the United States were more positive in their JOYSCIE, motivation in science and self-concept of science.

Although boys had significantly higher PERSCIE (or Attainment Value in Eccles' model) than girls (0.16, p<0.001), the effect size (Cohen's d = 0.18) was smaller than in the other affective domains. The results suggested that girls tended to isolate "science identity" from their conception of their own identity, ideals or competence in the science domain (Wigfield, 1994). These results were aligned with prior findings. Using data from the 2004 Student Achievement Indicators Programme, Adamuti-Trache and Sweet (2009) found that more Canadian boys than girls possessed a high self-concept in science. Boys were more likely than girls to place a higher value on math and science. Simpkins and Davis-Kean (2005) also found that American boys had significantly higher PERSCIE than girls at high school.

The results also indicated that boys had significantly higher INSTSCIE (or utility value in Eccles' model) than girls (0.23, p<0.001). In other words, girls were less motivated than boys to learn science in order to improve their career prospects. However, the effect size (Cohen's d = 0.22) was rather small in the social science context. Our results were consistent with the Australian study by Ainley et al. (2008), who reported that boys had higher values than girls in JOYSCIE, instrumental motivation, future orientation to study or work in science, science self-efficacy and science self-concept.

4.3 Gender differences in science achievement related choices measured by MSD

Table 4.8 shows gender differences in students' SCIEFUT. Boys' overall mean SCIEFUT was significantly higher than that of girls (p<0.001). Lau (1997) found that gender differences in subject choices became explicit when S.3 students were asked to choose their stream of studies.
Table 4.8: Gender differences in Future-oriented Science Motivation (WLE scores)

Boys (N=2279)         Girls (N=2344)       Difference        z          d
Mean (SE/SD)          Mean (SE/SD)         Boys-Girls (SE)
0.46 (0.017/0.87)     0.12 (0.02/0.91)     0.34 (0.03)       12.96***   0.38

Note: ***p<0.001. The z statistic was calculated by dividing the difference by its standard error.

Though gender differences in science have declined over the years, there continues to be a gulf between the numbers of boys and girls who pursue university degrees in engineering, physical sciences and computer sciences (UGC, 2009; Stumpf & Stanley, 1996; Bae & Smith, 1996). The gender differences in this respect lead to gender-segregated career paths and further studies in science (Eccles, 2011). This phenomenon has been described as "I can but I don't want to" (Jacobs et al., 2005). The persistence of gender differences in science educational and career choices suggests that affective learning outcomes are much more important than cognitive achievement in science (Linver, Davis-Kean, & Eccles, 2002).

4.4 Gender differences in students' affective learning outcomes measured by DIF

This section examines the gender differences in affective outcomes estimated by DIF; the results are presented in Table 4.9. The results were consistent with those estimated by MSD (see Table 4.7) and with Cheung's (2008) findings. Boys had significantly (p<0.001) higher learning outcomes in all science-related affective domains.
Table 4.9: Gender differences in affective learning outcomes (PV scores)

Affective factor   Boys (N)   Girls (N)   Boys       Girls      Difference   SE     z          d
                                          Estimate   Estimate   Boys-Girls
SCSCIE             2280       2346        0.61       -0.61      1.23         0.04   16.57***   0.48
INTSCIE            2282       2346        0.16       -0.16      0.32         0.02    8.94***   0.29
JOYSCIE            2282       2346        0.47       -0.47      0.94         0.03   14.27***   0.41
PERSCIE            2283       2346        0.18       -0.18      0.36         0.03    6.52***   0.22
INSTSCIE           1811       1905        0.31       -0.31      0.62         0.04    7.92***   0.23

Note: ***p<0.001. The z statistic was calculated by dividing the estimate by its standard error.

To illustrate how DIF provides more information about gender differences in affective domains than the traditional MSD (e.g. Cheung, 2008), a typical item characteristic curve, for the SCSCIE item "Learning advanced science topics would be easy for me (ST37Q01)", was examined (see Figure 4.7). Given a boy and a girl with the same overall SCSCIE level, the girl is likely to report 0.5 logits lower than the boy on this item. In other words, it was significantly harder for girls to believe that they could learn advanced science topics at school.

Figure 4.7: Item characteristic curves for Learning advanced science topics would be easy for me (ST37Q01)

4.5 Gender differences in science achievement related choices measured by DIF

Table 4.10 shows gender differences in students' SCIEFUT estimated by DIF. Boys' mean estimate in SCIEFUT was significantly higher than that of girls (p<0.001). The results were consistent with those estimated by MSD (see Table 4.8) and with Cheung's (2008) findings.

Table 4.10: Gender differences in Future-oriented Science Motivation (PV scores)

Boys (N=2279)   Girls (N=2344)   Difference   SE     z          d
Estimate        Estimate         Boys-Girls
0.54            -0.54            1.09         0.04   15.08***   0.40

Note: ***p<0.001. The z statistic was calculated by dividing the estimate by its standard error.

In short, the results obtained from the DIF method were in good agreement with MSD. In addition, the DIF method revealed the item response behaviour of the different affective factors.
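The z statistics and effect sizes reported for the mean score differences in this chapter follow directly from the summary statistics in the tables. The sketch below recomputes both for the EPS row of Table 4.1; the function names are illustrative, and the pooled standard deviation follows the Cohen (1988) equation given in the note to Table 4.1.

```python
import math


def z_statistic(difference: float, standard_error: float) -> float:
    """z = mean difference divided by its standard error."""
    return difference / standard_error


def cohens_d(mean_boys: float, mean_girls: float,
             sd_boys: float, sd_girls: float) -> float:
    """Cohen's d with the root mean square of the two group SDs as denominator."""
    pooled_sd = math.sqrt((sd_boys ** 2 + sd_girls ** 2) / 2)
    return (mean_boys - mean_girls) / pooled_sd


# EPS row of Table 4.1: boys 559.79 (SD 97.29), girls 539.00 (SD 88.60),
# difference 20.79 with standard error 4.55.
eps_z = z_statistic(20.79, 4.55)
eps_d = cohens_d(559.79, 539.00, 97.29, 88.60)

print(round(eps_z, 2), round(eps_d, 2))  # 4.57 0.22
```

Because the tabulated inputs are already rounded, recomputed values can differ from the published ones in the last decimal place for some rows.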
4.6 Summary

To sum up, six major findings emerged from the present study. First of all, multidimensional differential item functioning (MDIF) provided a more sensitive method for examining gender differences than the mean difference method. Overall, significant gender differences were found at item level in science performance but not at mean score level.

Secondly, significant gender differences were found in two dimensions of science performance: Explaining Phenomena Scientifically (EPS) and Identifying Scientific Issues (ISI). EPS favored boys while ISI favored girls. However, the effect sizes were small in these science performance dimensions. Our results were not consistent with the two Hong Kong PISA reports published in 2005 and 2008. Based on the PISA 2003 and PISA 2006 datasets, those reports suggested that there were no significant gender differences in science performance in Hong Kong. However, our results indicated that there were significant gender differences in two dimensions of science performance. The discrepancies originated from the differing sensitivity of the methods used in the estimations. Mean score differences and standard errors based on the replicate procedure (see footnote 24) will be a more reliable alternative in gender studies.

Footnote 24: PISA 2006 used Fay's Balanced Repeated Replication (BRR) method to produce the repeated subsamples, or replicate samples, so as to overcome the overestimation of standard errors of mean differences.

Thirdly, the gender variance ratio (B/G) was always greater than one at all science performance levels. This highlights the fact that boys' science performance varied more than that of girls at low, medium and high achieving levels. In addition, among low achievers, there were more boys than girls at the left tails of the EPS, ISI and USE distributions. However, among high achievers, there was no evidence that more boys than girls were found at the right tails of the EPS, ISI and USE distributions.
Fourthly, significant gender differences were found in three domains of "Knowledge Of Science" but not in "Knowledge About Science". Boys outperformed girls on Knowledge Of Science, namely Earth and Space Systems, Living Systems and Physical Systems, and the largest effect size was found for Physical Systems. There was no significant gender difference on "Knowledge About Science".

Fifthly, regarding item format, more items with gender DIF favored girls, in particular the Open Response items. In contrast, Closed Constructed Response items favored boys.

Sixthly, regarding the affective domain of scientific literacy, boys had significantly higher values than girls in all affective factors, including Enjoyment of Science Learning, Interest in Science Learning, Instrumental Motivation to Learn Science, Personal Value of Science and Science Self-concept. Moreover, a significant gender difference was found in Future-oriented Science Motivation. The effect sizes in the affective domain were larger than those in the cognitive domain, and all were in favor of boys.

Overall, the results indicated that boys have higher cognitive and affective learning outcomes than girls. Moreover, the gender differences in the science-related educational and career choices of boys and girls were significant. In particular, the current cognitive and affective learning outcomes of scientific literacy might affect students' career choices. The gender differences in science performance in Hong Kong follow the trends of most Western countries such as the US, the UK, Germany and Australia. However, among high achievers, there was no evidence that more boys than girls were found at the right ends of the distributions. In the next chapter, the Eccles et al. (1983) Expectancy-value Model of Achievement-related Choices will be used to investigate the sociocultural effects on gender differences in future-oriented educational and career choices.
CHAPTER FIVE
THE FINDINGS BY EXPECTANCY-VALUE MODEL OF ACHIEVEMENT-RELATED CHOICES

In this chapter, the gender effect on achievement-related choices and performance is examined. The expectancy-value model of achievement-related choices is deployed to scrutinize to what extent, and how, cognitive and affective factors mediated gender effects and affected students' achievement-related choices. The affective factors are Enjoyment of Science Learning, Future-oriented Science Motivation, Interest in Science Learning, Instrumental Motivation to Learn Science, Personal Value of Science and Science Self-concept.

5.1 Pearson correlations between affective factors and gender

Table 5.1 shows the correlations among gender (Girl), Science Performance (SP) and the affective factors, namely Enjoyment of Science Learning (JOYSCIE), Future-oriented Science Motivation (SCIEFUT), Interest in Science Learning (INTSCIE), Instrumental Motivation to Learn Science (INSTSCIE), Personal Value of Science (PERSCIE) and Science Self-concept (SCSCIE). The results indicated that being a girl was negatively correlated with all the affective factors (correlation: -0.094 to -0.235, p<0.01) and with SP (correlation: -0.041, p<0.01). In other words, girls had relatively lower affective and cognitive learning outcomes than boys.

Table 5.1: Correlations among gender (Girl), affective factors and Science Performance

            SES     Girl    SP         INSTSCIE   INTSCIE    PERSCIE    SCSCIE     SCIEFUT    JOYSCIE
JOYSCIE                                                                                       -
SCIEFUT                                                                            -          0.634**
SCSCIE                                                                  -          0.534**    0.563**
PERSCIE                                                      -          0.365**    0.449**    0.511**
INTSCIE                                           -          0.461**    0.507**    0.541**    0.692**
INSTSCIE                               -          0.516**    0.509**    0.546**    0.676**    0.574**
SP                          -          0.200***   0.311***   0.214***   0.238***   0.234***   0.353***
Girl                -       -0.041**   -0.124**   -0.134**   -0.094**   -0.235**   -0.190**   -0.182**
SES         -       0.029   0.220**    0.011      0.097**    0.072**    0.085*     0.030*     0.100**

Note: **p<0.01, ***p<0.001.
Abbreviations: Enjoyment of Science Learning (JOYSCIE); Future-oriented Science Motivation (SCIEFUT); Gender (Girl); Interest in Science Learning (INTSCIE); Instrumental Motivation to Learn Science (INSTSCIE); Plausible value for Science Performance (SP); Personal Value of Science (PERSCIE); Science Self-concept (SCSCIE); Parental Social Economic Status (SES).

5.2 Gender differences by revised Expectancy-value Model

The following sections deploy the revised Eccles et al. (1983) model to address one key question: to what extent, and how, are gender effects on achievement-related choice mediated through the cognitive and affective domains of science? Eccles (1987) argued that successful intervention in gender inequality requires a thorough knowledge of the gender role socialization processes linked to these psychological variables. Therefore, the mediation effects of Science Performance, Science Self-concept, Enjoyment of Science Learning, Interest in Science Learning, Instrumental Motivation to Learn Science and Personal Value of Science are examined.

5.2.1 Grouping homogeneity

The intra-class correlation (ICC) is a measure of grouping homogeneity: the proportion of the total variance that lies between groups. It approaches 1 when there is little variation within groups relative to the variation between groups. It approaches 0 when the between-groups variance is negligible relative to the within-groups variance, indicating that the clustering effect is negligible. The larger the number of clusters, the weaker the grouping homogeneity, as the sampling approaches random sampling. The ICC value (0.007) for SCIEFUT is statistically insignificant for the schools (see footnote 25). This result suggests that SCIEFUT does not vary much between schools, and therefore factors at the student level are the focus of further analysis.

5.2.2 Mediation effect of Science Performance

For Model 1, Girl, parental SES and SP were included in the estimation. Model 1 examines the gender effect on SCIEFUT and SP after controlling for parental SES.
The model fit indices discussed in chapter 3, RMSEA (0.067), CFI (0.979), TLI (0.961) and SRMR (0.017), were reasonable for a good-fitting model (see footnote 26 and Figure 5.1).

Footnote 25: The two-stage sampling with complex data structure was handled with 'TYPE IS COMPLEX' in Mplus. It specifies that the data are complex and clustered into groups of schools.

Footnote 26: The most commonly reported fit indicator is χ2. The χ2 statistic is not reported for Model 1 and subsequent models because of its over-sensitivity to large sample sizes.

Figure 5.1: Gender effect on Science Performance and Future-oriented Science Motivation (Model 1) [path diagram; independent variable: Girl; control variable: Parental SES; dependent variables: Science Performance (SP) and the achievement-related choice Future-oriented Science Motivation (SCIEFUT); paths: (A) Girl → SCIEFUT -0.191***, (B) Girl → SP -0.048*, (C) SES → SCIEFUT 0.045**, (D) SES → SP 0.223***]
Note: *p<0.05, **p<0.01, ***p<0.001; RMSEA=0.067, CFI=0.979, TLI=0.961, SRMR=0.017

As the model fit is deemed adequate, the estimates are presented in Table 5.2, where parental SES shows significant positive effects on SP (0.223, p<0.001) and on Future-oriented Science Motivation (0.045, p<0.01). Girl, however, had significant negative effects on SP (-0.048, p<0.05) and SCIEFUT (-0.191, p<0.001).

Building on Model 1, Model 2 included SP as a mediator between Girl and SCIEFUT (see Figure 5.2). The model fit indices are the same as those of Model 1 (see Figure 5.1). Therefore, Model 2 is also an adequate-fitting model. After adding SP as a mediator between Girl and SCIEFUT, the direct effect of parental SES on SCIEFUT became insignificant (-0.010, p=0.542). The effect of parental SES on SCIEFUT was mediated totally through SP (0.055, p<0.01; see Table 5.2). In addition, SP partially mediated the negative effect of Girl on SCIEFUT (-0.012, p<0.05; see Table 5.2). Compared with boys, girls were essentially less motivated (-0.179, p<0.001) toward participating in science-related education and employment in the future.
Their cognitive performance in science (-0.048, p<0.05) was also not as good as that of boys.

Figure 5.2: Mediation effect of Science Performance (Model 2) [path diagram; independent variable: Girl; control variable: Parental SES; mediator: Science Performance (SP); dependent variable: the achievement-related choice Future-oriented Science Motivation (SCIEFUT); paths: (A) Girl → SCIEFUT -0.179***, (B) Girl → SP -0.048*, (C) SES → SCIEFUT -0.010(ns), (D) SES → SP 0.223***, (E) SP → SCIEFUT 0.248***]
Note: *p<0.05, ***p<0.001; RMSEA=0.067, CFI=0.979, TLI=0.961, SRMR=0.017

The disadvantage of girls in their future education and career choices in the field of science can be explained partly by their lower performance in science. Consistent with Eccles (2011), gendered socialization is one possible explanation of why fewer girls select science-related educational programmes and vocations. Overall, our results agree with previous findings that parental SES has strong and positive effects on students' cognitive ability and their future career choices (Ho, 1997; Davis-Kean, 1999).

Table 5.2: Mediation effect of Science Performance

                                      Model 1                 Model 2
                                      Estimate (SE)           Estimate (SE)
Student Background
  (D) SES → SP                        0.223*** (0.019)        0.223*** (0.019)
  (C) SES → SCIEFUT                   0.045** (0.017)         -0.010 (ns) (0.016)
Mediator: Science Performance
  (B) Girl → SP                                               -0.048* (0.024)
  (E) SP → SCIEFUT                                            0.248*** (0.017)
Indirect Effect of SES
  SES → SP → SCIEFUT                                          0.055*** (0.006)
Indirect Effect of Girl
  Girl → SP → SCIEFUT                                         -0.012* (0.006)
  Total Indirect Effect of Girl                               -0.012* (0.006)
Direct Effect of Girl
  (B) Girl → SP                       -0.048* (0.024)
  (A) Girl → SCIEFUT                  -0.191*** (0.014)       -0.179*** (0.014)
Total Effect of Girl                  -0.191*** (0.014)       -0.191*** (0.014)
Model fit (criteria)
  RMSEA (<0.080)                      0.067                   0.067
  CFI (>0.900)                        0.979                   0.979
  TLI (>0.900)                        0.961                   0.961
  SRMR (<0.080)                       0.017                   0.017

Note: *p<0.05, **p<0.01, ***p<0.001.

5.2.3 Mediation effect of Science Self-concept

Model 3 included SCSCIE as a mediator between Girl and SCIEFUT.
The model fit indices, RMSEA (0.048), CFI (0.978), TLI (0.972) and SRMR (0.033), were reasonable for a good-fitting model (see Figure 5.3).

[Figure 5.3: Mediation effect of Science Self-concept (Model 3). Path diagram linking Girl (stable child characteristic) and Parental SES (control variable) to Future-oriented Science Motivation (SCIEFUT, achievement-related choice) through the mediators Science Self-concept (SCSCIE, Child’s General Self Schemata) and Science Performance (SP). Note: *p<0.05, ***p<0.001; RMSEA=0.048, CFI=0.978, TLI=0.972, SRMR=0.033]

The results in Table 5.3 indicate that the direct effect of Girl was reduced by 32%, from -0.179 to -0.121 (see footnote 27), when SCSCIE was included in the analysis. SCSCIE is a significant mediating factor for the gender effects on SCIEFUT (-0.060, p<0.001). Its mediating effect ((F)×(G) = -0.060) was about six times that of SP ((B)×(E) = -0.011). These results suggest that SCSCIE, as a component of Child’s General Self Schemata (CGSS) (see footnote 28), is a critical mediator of gender effects on SCIEFUT.

Footnote 27: A negative sign indicates smaller values for girls than for boys. Absolute values were used for calculating the percentage change in the direct effect of Girl before and after adding affective mediator(s).
Footnote 28: Child’s General Self Schemata (CGSS) is a component of the Eccles et al (1983) model; please refer to Chapter 3 for details.

Table 5.3: Mediation effect of Science Self-concept

                                   Model 2             Model 3
                                   Estimate (SE)       Estimate (SE)
Student Background
  (D) SES→SP                       0.223*** (0.019)    0.223*** (0.017)
  (C) SES→SCIEFUT                  0.045** (0.017)     -0.019(ns) (0.014)
Mediator: Science Self-concept
  (F) Girl→SCSCIE                  n/a                 -0.168*** (0.015)
  (G) SCSCIE→SCIEFUT               n/a                 0.356*** (0.018)
Mediator: Science Performance
  (B) Girl→SP                      -0.048* (0.024)     -0.048* (0.024)
  (E) SP→SCIEFUT                   0.248*** (0.017)    0.219*** (0.016)
Indirect Effect of SES
  SES→SP→SCIEFUT                   0.055*** (0.006)    0.049*** (0.006)
Indirect Effect of Girl
  Girl→SCSCIE→SCIEFUT              n/a                 -0.060*** (0.007)
  Girl→SP→SCIEFUT                  -0.012* (0.006)     -0.011* (0.005)
  Total Indirect Effect of Girl    -0.012* (0.006)     -0.071*** (0.009)
Direct Effect of Girl
  (A) Girl→SCIEFUT                 -0.179*** (0.014)   -0.121*** (0.015)
Total Effect of Girl               -0.191*** (0.014)   -0.192*** (0.014)
Model fit (criteria)
  RMSEA (<0.080)                   0.067               0.049
  CFI (>0.900)                     0.979               0.979
  TLI (>0.900)                     0.961               0.971
  SRMR (<0.080)                    0.017               0.033
Note: *p<0.05, **p<0.01, ***p<0.001.

Our results were compatible with previous findings that the higher the science self-concept, the higher the future-oriented science motivation. Nagy et al (2006) found that subject-based self-concept at grade 10 could predict subsequent course choices at grade 12 in Germany. Similarly, Simpkins and Davis-Kean (2005) found that European American students with higher self-concepts in science were more likely than those with moderate or low self-concepts to take physical science courses in high school. So, girls’ disadvantage in future choices of science studies and careers can also be explained partially by their lower SCSCIE in the Hong Kong context.

5.2.4 Mediation effect of Interest in Science Learning

Model 4a included INTSCIE as a mediator between Girl and SCIEFUT. The model fit indices, RMSEA (0.062), CFI (0.955), TLI (0.939) and SRMR (0.071), were reasonable for a good-fitting model (see Figure 5.4). The results in Table 5.4 indicate that the direct effect of Girl was reduced by 68%, from -0.179 to -0.057, when INTSCIE was included in the analysis for SCIEFUT. This was because INTSCIE partially mediated the gender effects on SCIEFUT (-0.131, p<0.001). The indirect effect of Girl on SCIEFUT through SP became non-significant (-0.003, p=0.075) after adding INTSCIE as a mediator of gender differences.
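The percentage reductions quoted above follow from the absolute-value rule described in footnote 27. The short Python sketch below reproduces that arithmetic for Models 3 and 4a; the function name and the dictionary layout are illustrative assumptions and are not part of the original Mplus analysis.

```python
def direct_effect_reduction(baseline: float, with_mediator: float) -> float:
    """Percentage drop in the absolute direct effect once a mediator is added."""
    return (abs(baseline) - abs(with_mediator)) / abs(baseline) * 100.0


# Direct effect of Girl on SCIEFUT in Model 2 (no affective mediator).
BASELINE_DIRECT = -0.179

# Direct effects reported after adding one affective mediator.
direct_with_mediator = {
    "Model 3 (SCSCIE)": -0.121,
    "Model 4a (INTSCIE)": -0.057,
}

for model, effect in direct_with_mediator.items():
    reduction = direct_effect_reduction(BASELINE_DIRECT, effect)
    print(f"{model}: direct effect reduced by {reduction:.0f}%")
```

The same function can be applied to any later model by passing its reported direct effect of Girl.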
[Figure 5.4: Mediation effect of Interest in Science Learning (Model 4a). Path diagram linking Girl (stable child characteristic) and Parental SES (control variable) to Future-oriented Science Motivation (SCIEFUT, achievement-related choice) through the mediators Science Interest (INTSCIE, subjective task value) and Science Performance (SP). Note: *p<0.05, **p<0.01, ***p<0.001; RMSEA=0.062, CFI=0.955, TLI=0.939, SRMR=0.071]

The results suggest that INTSCIE alone could explain 68% (-0.131/-0.192×100%) of the gender effects on SCIEFUT. Secondly, the mediating effect of SP was non-significant in the presence of INTSCIE. In other words, enhancing girls’ cognitive performance in science is unlikely to improve their inclination to choose science-related education programmes and careers once student science interest is taken into account. Therefore, improving girls’ INTSCIE appears to be more important for overcoming the disadvantage of girls in SCIEFUT.

5.2.5 Mediation effect of Enjoyment of Science Learning

Model 4b included JOYSCIE as a mediator between Girl and SCIEFUT. The model fit indices, RMSEA (0.067), CFI (0.963), TLI (0.948) and SRMR (0.085), were reasonable, although SRMR was slightly higher than the criterion (SRMR<0.080) (see Figure 5.5). The results in Table 5.4 indicate that the direct effect of Girl was reduced by 65%, from -0.179 to -0.063, when JOYSCIE was included in the analysis for SCIEFUT. This was because JOYSCIE partially mediated the gender effects on SCIEFUT (-0.125, p<0.001). As a result of the addition of JOYSCIE, the indirect effect of Girl through SP became non-significant (-0.002, p=0.095). This pattern is similar to that of the indirect effect of interest in science learning.
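Each specific indirect effect discussed in this section is the product of two path coefficients, for example (J)×(K) for JOYSCIE. The sketch below recomputes the indirect effects for Models 4a and 4b from the path estimates and standard errors listed in Table 5.4. The standard error uses the first-order delta-method (Sobel) approximation, which is an assumption made here for illustration; the thesis does not state which estimator its software used, and the function name is hypothetical.

```python
import math


def indirect_effect(a: float, se_a: float, b: float, se_b: float):
    """Product-of-coefficients indirect effect with a first-order Sobel test.

    a, se_a: estimate and SE of the path predictor -> mediator
    b, se_b: estimate and SE of the path mediator -> outcome
    Returns (estimate, standard error, z statistic, two-sided p-value).
    """
    estimate = a * b
    se = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return estimate, se, z, p_value


# Paths taken from Table 5.4: (H), (I) for Model 4a and (J), (K) for Model 4b.
paths = {
    "Girl -> INTSCIE -> SCIEFUT (Model 4a)": (-0.193, 0.020, 0.680, 0.012),
    "Girl -> JOYSCIE -> SCIEFUT (Model 4b)": (-0.189, 0.015, 0.664, 0.011),
}

for label, (a, se_a, b, se_b) in paths.items():
    est, se, z, p = indirect_effect(a, se_a, b, se_b)
    print(f"{label}: {est:.3f} (SE {se:.3f}), z = {z:.2f}, p = {p:.2g}")
```

Under this approximation the recomputed estimates and standard errors round to the values reported in Table 5.4, which suggests the reported indirect effects are simple products of the two paths.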
[Figure 5.5: Mediation effect of Enjoyment of Science Learning (Model 4b). Path diagram linking Girl (stable child characteristic) and Parental SES (control variable) to Future-oriented Science Motivation (SCIEFUT, achievement-related choice) through the mediators Science Enjoyment (JOYSCIE, subjective task value) and Science Performance (SP). Note: *p<0.05, **p<0.01, ***p<0.001; RMSEA=0.067, CFI=0.963, TLI=0.948, SRMR=0.085]

Table 5.4: Mediation effect of Interest in Science Learning (Model 4a), Enjoyment of Science Learning (Model 4b) and Interest and Enjoyment of Science Learning (Model 4c)

                                   Model 2             Model 4a            Model 4b            Model 4c
                                   Estimate (SE)       Estimate (SE)       Estimate (SE)       Estimate (SE)
Student Background
  (D) SES→SP                       0.223*** (0.019)    0.223*** (0.019)    0.223*** (0.019)    0.223*** (0.019)
  (C) SES→SCIEFUT                  0.045** (0.017)     -0.033** (0.012)    -0.026* (0.012)     -0.033** (0.004)
Mediator: Interest in Science Learning
  (H) Girl→INTSCIE                 n/a                 -0.193*** (0.020)   n/a                 -0.190*** (0.020)
  (I) INTSCIE→SCIEFUT              n/a                 0.680*** (0.012)    n/a                 0.490*** (0.036)
Mediator: Enjoyment of Science Learning
  (J) Girl→JOYSCIE                 n/a                 n/a                 -0.189*** (0.015)   -0.184*** (0.015)
  (K) JOYSCIE→SCIEFUT              n/a                 n/a                 0.664*** (0.011)    0.243*** (0.036)
Mediator: Science Performance
  (B) Girl→SP                      -0.048* (0.024)     -0.048* (0.024)     -0.048* (0.024)     -0.048* (0.024)
  (E) SP→SCIEFUT                   0.248*** (0.017)    0.068*** (0.015)    0.046** (0.015)     0.013(ns) (0.015)
Indirect Effect of SES
  SES→SP→SCIEFUT                   0.055*** (0.006)    0.015*** (0.004)    0.007* (0.003)      0.003(ns) (0.003)
Indirect Effect of Girl
  Girl→INTSCIE→SCIEFUT             n/a                 -0.131*** (0.014)   n/a                 -0.093*** (0.012)
  Girl→JOYSCIE→SCIEFUT             n/a                 n/a                 -0.125*** (0.010)   -0.045*** (0.008)
  Girl→SP→SCIEFUT                  -0.012* (0.006)     -0.003(ns) (0.002)  -0.002(ns) (0.001)  -0.001(ns) (0.001)
  Total Indirect Effect of Girl    -0.012* (0.006)     -0.135*** (0.014)   -0.128*** (0.011)   -0.139*** (0.013)
Direct Effect of Girl
  (A) Girl→SCIEFUT                 -0.179*** (0.014)   -0.057*** (0.014)   -0.063*** (0.012)   -0.051*** (0.012)
Total Effect of Girl               -0.191*** (0.014)   -0.192*** (0.014)   -0.191*** (0.015)   -0.190*** (0.014)
Model fit (criteria)
  RMSEA (<0.080)                   0.067               0.062               0.067               0.055
  CFI (>0.900)                     0.979               0.955               0.963               0.955
  TLI (>0.900)                     0.961               0.939               0.948               0.945
  SRMR (<0.080)                    0.017               0.071               0.085               0.078
Note: *p<0.05, **p<0.01, ***p<0.001.

5.2.6 Mediation effect of Interest and Enjoyment of Science Learning

Model 4c included both INTSCIE and JOYSCIE as mediators between Girl and SCIEFUT. The model fit indices, RMSEA (0.055), CFI (0.955), TLI (0.945) and SRMR (0.078), were reasonable for an adequate-fitting model (see Figure 5.6).

[Figure 5.6: Mediation effect of Interest and Enjoyment of Science Learning (Model 4c). Path diagram linking Girl (stable child characteristic) and Parental SES (control variable) to Future-oriented Science Motivation (SCIEFUT, achievement-related choice) through the mediators Science Interest (INTSCIE) and Science Enjoyment (JOYSCIE), both subjective task values, and Science Performance (SP). Note: *p<0.05, ***p<0.001; RMSEA=0.055, CFI=0.955, TLI=0.945, SRMR=0.078]

The results in Table 5.4 indicate that the direct effect of Girl was reduced by 72%, from -0.179 to -0.051, when INTSCIE and JOYSCIE were included in the analysis for SCIEFUT. This was because INTSCIE and JOYSCIE partially mediated the gender effects on SCIEFUT (-0.139, p<0.001). Besides, the indirect effect of Girl through SP was found to be non-significant (-0.001, p=0.414). The indirect effect of parental SES through SP was also found to be non-significant (0.003, p=0.362).

The results were consistent with past research findings that the disadvantage of girls’ motivation in science careers and studies might originate from lower INTSCIE and JOYSCIE.
This is consistent with Kelly and Smail’s (1986) gender study of 11-year-old UK students, which found that girls who endorsed sex stereotypes showed less interest than boys in learning science, which might affect their long-term plans to work in a science-related career or to study advanced science in the future.

The results also indicate that children from families with higher parental SES were less motivated than their poorer counterparts to choose science-oriented education programmes and careers after secondary school education. In other words, the higher the parental SES, the lower the children’s SCIEFUT in Hong Kong. In a global sense, although rich societies like Hong Kong have extra economic and social resources to support their 15-year-old children to achieve more highly in science, these resources do not necessarily encourage more adolescents to participate in science-oriented programmes and careers in the future.

To recap, our findings show that gender differences due to the affective domain override those due to the cognitive domain, and imply that gender equity programmes focusing on improving females’ cognitive achievement might not be able to resolve the problem of gender-segregated achievement-related choices in science. Conversely, gender equity programmes targeted at raising girls’ affective outcomes, INTSCIE as well as JOYSCIE, are more likely to address girls’ disadvantage in their future careers and education opportunities in science.

5.2.7 Mediation effect of Attainment Value

Model 5a included PERSCIE (or Attainment Value in the Eccles et al (1983) model) as a mediator between Girl and SCIEFUT. The model fit indices, RMSEA (0.068), CFI (0.960), TLI (0.939) and SRMR (0.055), were reasonable for a good-fitting model (see Figure 5.7). The results in Table 5.5 indicate that the direct effect of Girl was reduced by 11%, from -0.179 to -0.159, when PERSCIE was included in the analysis for SCIEFUT.
PERSCIE is a significant mediating factor for the gender effects on SCIEFUT (-0.029, p<0.001). As a result, PERSCIE alone could explain 15% (-0.029/-0.196×100%) of the gender effects on SCIEFUT.

[Figure 5.7: Mediation effect of Attainment Value (Model 5a). Path diagram linking Girl (stable child characteristic) and Parental SES (control variable) to Future-oriented Science Motivation (SCIEFUT, achievement-related choice) through the mediators Personal Value of Science (PERSCIE, subjective task value: attainment value) and Science Performance (SP). Note: *p<0.05, ***p<0.001; RMSEA=0.068, CFI=0.960, TLI=0.939, SRMR=0.055]

5.2.8 Mediation effect of Utility Value

Model 5b included INSTSCIE (or Utility Value in the Eccles et al (1983) model) as a mediator between Girl and SCIEFUT. The model fit indices, RMSEA (0.054), CFI (0.977), TLI (0.968) and SRMR (0.053), were reasonable for a good-fitting model (see Figure 5.8).

[Figure 5.8: Mediation effect of Utility Value (Model 5b). Path diagram linking Girl (stable child characteristic) and Parental SES (control variable) to Future-oriented Science Motivation (SCIEFUT, achievement-related choice) through the mediators Instrumental Motivation to Learn Science (INSTSCIE, subjective task value: utility value) and Science Performance (SP). Note: *p<0.05, ***p<0.001; RMSEA=0.054, CFI=0.977, TLI=0.968, SRMR=0.053]

The results in Table 5.5 indicate that the direct effect of Girl was reduced by 8%, from -0.179 to -0.164, when INSTSCIE was included in the analysis for SCIEFUT. INSTSCIE is also a significant mediating factor for the gender effects on SCIEFUT (-0.022, p<0.01).
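The share of the gender effect "explained" by a mediator, as in the 15% quoted for PERSCIE, is the specific indirect effect divided by the total effect of Girl. A minimal Python sketch of that ratio follows; the function name is an illustrative assumption, and the inputs are the estimates reported for Model 3 (SCSCIE) and Model 5a (PERSCIE).

```python
def proportion_mediated(indirect: float, total: float) -> float:
    """Share of the total effect carried by one indirect path, in percent."""
    if total == 0:
        raise ValueError("total effect must be non-zero")
    return indirect / total * 100.0


# (specific indirect effect of Girl, total effect of Girl) as reported.
reported = {
    "SCSCIE (Model 3)": (-0.060, -0.192),
    "PERSCIE (Model 5a)": (-0.029, -0.196),
}

for mediator, (indirect, total) in reported.items():
    share = proportion_mediated(indirect, total)
    print(f"{mediator}: {share:.0f}% of the gender effect on SCIEFUT")
```

Because the ratio is built from rounded coefficients, the result should be read as approximate.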
Table 5.5: Mediation effect of Attainment Value (Model 5a), Utility Value (Model 5b) and Attainment Value and Utility Value (Model 5c)

                                   Model 2             Model 5a            Model 5b            Model 5c
                                   Estimate (SE)       Estimate (SE)       Estimate (SE)       Estimate (SE)
Student Background
  (D) SES→SP                       0.223*** (0.019)    0.223*** (0.019)    0.223*** (0.019)    0.223*** (0.019)
  (C) SES→SCIEFUT                  0.045** (0.017)     -0.022(ns) (0.015)  -0.011(ns) (0.015)  -0.019(ns) (0.014)
Mediator: Attainment Value
  (L) Girl→PERSCIE                 n/a                 -0.076*** (0.020)   n/a                 -0.076*** (0.020)
  (M) PERSCIE→SCIEFUT              n/a                 0.375*** (0.018)    n/a                 0.273*** (0.020)
Mediator: Utility Value
  (N) Girl→INSTSCIE                n/a                 n/a                 -0.051*** (0.015)   -0.051** (0.015)
  (O) INSTSCIE→SCIEFUT             n/a                 n/a                 0.430*** (0.023)    0.343*** (0.023)
Mediator: Science Performance
  (B) Girl→SP                      -0.048* (0.024)     -0.048* (0.024)     -0.048* (0.024)     -0.048* (0.024)
  (E) SP→SCIEFUT                   0.248*** (0.017)    0.159*** (0.018)    0.162*** (0.016)    0.107*** (0.016)
Indirect Effect of SES
  SES→SP→SCIEFUT                   0.055*** (0.006)    0.035*** (0.005)    0.036*** (0.005)    0.023*** (0.004)
Indirect Effect of Girl
  Girl→PERSCIE→SCIEFUT             n/a                 -0.029*** (0.008)   n/a                 -0.021*** (0.006)
  Girl→INSTSCIE→SCIEFUT            n/a                 n/a                 -0.022** (0.007)    -0.017** (0.006)
  Girl→SP→SCIEFUT                  -0.012* (0.006)     -0.008* (0.004)     -0.008* (0.004)     -0.005(ns) (0.003)
  Total Indirect Effect of Girl    -0.012* (0.006)     -0.036*** (0.010)   -0.030** (0.009)    -0.043*** (0.010)
Direct Effect of Girl
  (A) Girl→SCIEFUT                 -0.179*** (0.014)   -0.159*** (0.014)   -0.164*** (0.014)   -0.152*** (0.013)
Total Effect of Girl               -0.191*** (0.014)   -0.196*** (0.014)   -0.193*** (0.014)   -0.195*** (0.014)
Model fit (criteria)
  RMSEA (<0.080)                   0.067               0.068               0.054               0.050
  CFI (>0.900)                     0.979               0.960               0.977               0.971
  TLI (>0.900)                     0.961               0.939               0.968               0.961
  SRMR (<0.080)                    0.017               0.055               0.053               0.059
Note: *p<0.05, **p<0.01, ***p<0.001.

5.2.9 Mediation through Attainment Value and Utility Value

Model 5c included both PERSCIE and INSTSCIE as mediators between Girl and SCIEFUT. The model fit indices, RMSEA (0.050), CFI (0.971), TLI (0.961) and SRMR (0.059), were reasonable for an adequate-fitting model (see Figure 5.9).
[Figure 5.9: Mediation effect of Attainment Value and Utility Value (Model 5c). Path diagram linking Girl (stable child characteristic) and Parental SES (control variable) to Future-oriented Science Motivation (SCIEFUT, achievement-related choice) through the mediators Personal Value of Science (PERSCIE, attainment value), Instrumental Motivation to Learn Science (INSTSCIE, utility value) and Science Performance (SP). Note: *p<0.05, ***p<0.001; RMSEA=0.050, CFI=0.971, TLI=0.961, SRMR=0.059]

The results in Table 5.5 indicate that the direct effect of Girl was reduced by 15%, from -0.179 to -0.152, when PERSCIE and INSTSCIE were included in the analysis for SCIEFUT. This was because PERSCIE and INSTSCIE partially mediated the gender effects on SCIEFUT (-0.021, p<0.001, and -0.017, p<0.01, respectively). The indirect effect of Girl through SP was also found to be non-significant (-0.005, p=0.053).

The results suggest that PERSCIE and INSTSCIE could explain about 20% ((-0.021-0.017)/-0.195×100%) of the gender effects on SCIEFUT. Secondly, the mediating effect of SP was found to be non-significant in the presence of PERSCIE and INSTSCIE.

Our findings were consonant with Chow and Salmela-Aro’s (2011) results. They stated that Finnish boys had higher task value, Attainment Value and Utility Value and dominated the high-math-and-science group, while girls dominated the low-math-and-science group at secondary school. The high-math-and-science group reported a stronger tendency to enrol in science-related programmes after the completion of compulsory education.

In sum, our findings support the view that girls’ disadvantage in their future careers and education opportunities in science can be accounted for in part by their lower Attainment Value and Utility Value in science. However, the effect size might be slightly smaller than that of the “general self schemata”.
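The fit criteria applied throughout this section (RMSEA<0.080, CFI>0.900, TLI>0.900, SRMR<0.080) can be checked mechanically. The Python sketch below applies them to the indices reported for Models 4b and 5c; the dictionary layout and the function name are assumptions made for illustration only.

```python
import operator

# Cut-offs used in the tables of this chapter: index -> (comparison, threshold).
FIT_CRITERIA = {
    "RMSEA": (operator.lt, 0.080),
    "CFI": (operator.gt, 0.900),
    "TLI": (operator.gt, 0.900),
    "SRMR": (operator.lt, 0.080),
}


def check_fit(indices: dict) -> dict:
    """Return, for each fit index, whether the reported value meets its criterion."""
    return {
        name: compare(indices[name], threshold)
        for name, (compare, threshold) in FIT_CRITERIA.items()
    }


model_4b = {"RMSEA": 0.067, "CFI": 0.963, "TLI": 0.948, "SRMR": 0.085}
model_5c = {"RMSEA": 0.050, "CFI": 0.971, "TLI": 0.961, "SRMR": 0.059}

for name, indices in (("Model 4b", model_4b), ("Model 5c", model_5c)):
    failed = [index for index, ok in check_fit(indices).items() if not ok]
    print(name, "meets all criteria" if not failed else f"misses: {', '.join(failed)}")
```

With these inputs only the SRMR of Model 4b falls outside its criterion, which matches the caveat noted in Section 5.2.5.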
5.2.10 Full models of gender effects on Future-oriented Science Motivation

Model 6a and Model 6b included all five affective mediators: SCSCIE, JOYSCIE, INTSCIE, INSTSCIE and PERSCIE. Model 6a represents the revised Eccles et al (1983) model and had no direct effect of Girl (Girl→SCIEFUT), while Model 6b was proposed based on the empirical evidence of the current study (i.e. it had a direct effect of Girl on SCIEFUT). The model fit indices, RMSEA (0.044), CFI (0.947), TLI (0.940/0.941) and SRMR (0.057), were basically the same for the two models. The fit indices were also reasonable for good-fitting models (see Figure 5.10 and Figure 5.11).

The results in Table 5.6 (Model 6a) indicate that SCSCIE, JOYSCIE, INTSCIE, INSTSCIE and PERSCIE are significant mediating factors for gender effects on SCIEFUT (-0.142, p<0.001). Since there is no direct effect of Girl (Girl→SCIEFUT) in Model 6a, the total mediating effect of affective factors is slightly higher than that in Model 6b (-0.135, p<0.001).
[Figure 5.10: Full model (Model 6a) of gender differences in Future-oriented Science Motivation. Path diagram linking Girl (stable child characteristic) and Parental SES (cultural milieu, control variable) to Future-oriented Science Motivation (SCIEFUT, achievement-related choice) through Science Self-concept (SCSCIE, Child’s general self schemata), Interest in Science Learning (INTSCIE), Enjoyment of Science Learning (JOYSCIE), Attainment Value (PERSCIE) and Utility Value (INSTSCIE) (subjective task value), and Science Performance (SP). Note: *p<0.05, ***p<0.001; RMSEA=0.044, CFI=0.947, TLI=0.940, SRMR=0.057]

[Figure 5.11: Full model (Model 6b) of gender differences in Future-oriented Science Motivation. Same layout as Figure 5.10, with an added direct path (A) from Girl to SCIEFUT. Note: *p<0.05, **p<0.01, ***p<0.001; RMSEA=0.044, CFI=0.947, TLI=0.941, SRMR=0.057]

Table 5.6: Full model of gender effects on Future-oriented Science Motivation

                                   Model 2             Model 6a            Model 6b
                                   Estimate (SE)       Estimate (SE)       Estimate (SE)
Student Background
  (D) SES→SP                       0.223*** (0.019)    0.223*** (0.019)    0.223*** (0.019)
  (C) SES→SCIEFUT                  0.045** (0.017)     -0.034** (0.010)    -0.031** (0.010)
Direct Effect of Girl
  Girl→SP                          -0.048* (0.024)     -0.049* (0.024)     -0.048* (0.024)
  Girl→SCIEFUT                     -0.179*** (0.014)   n/a                 -0.055*** (0.011)
Indirect Effect of Girl
  Girl→SCSCIE→SCIEFUT              n/a                 -0.008** (0.001)    -0.007** (0.003)
  Girl→JOYSCIE→SCIEFUT             n/a                 -0.036*** (0.007)   -0.036*** (0.007)
  Girl→INTSCIE→SCIEFUT             n/a                 -0.079*** (0.011)   -0.072*** (0.010)
  Girl→PERSCIE→SCIEFUT             n/a                 -0.009** (0.001)    -0.009** (0.003)
  Girl→INSTSCIE→SCIEFUT            n/a                 -0.011*** (0.003)   -0.011*** (0.003)
  Total Indirect Effect of Girl    -0.012* (0.006)     -0.142*** (0.012)   -0.135*** (0.012)
Total Effect of Girl               -0.191*** (0.014)   -0.142*** (0.012)   -0.189*** (0.014)
Model fit (criteria)
  RMSEA (<0.080)                   0.067               0.044               0.044
  CFI (>0.900)                     0.979               0.947               0.947
  TLI (>0.900)                     0.961               0.940               0.941
  SRMR (<0.080)                    0.017               0.057               0.057
Note: *p<0.05, **p<0.01, ***p<0.001. Model 6a and Model 6b were tested with 5 plausible values (PV1SCIE to PV5SCIE) provided by PISA 2006, and PV1SCIE was used for the final analysis.

The results in Table 5.6 (Model 6b) indicate that the direct effect of Girl was reduced by 69%, from -0.179 to -0.055, when SCSCIE, JOYSCIE, INTSCIE, INSTSCIE and PERSCIE were included in the analysis for SCIEFUT. SCSCIE, JOYSCIE, INTSCIE, INSTSCIE and PERSCIE are significant mediating factors for the gender effects on SCIEFUT (-0.135, p<0.001). This explains over 71% (-0.135/-0.189×100%) of gender effects on Future-oriented Science Motivation. The results also show that the mediating effect of SCSCIE, INTSCIE and JOYSCIE significantly overrides that of PERSCIE and INSTSCIE (-0.156, p<0.001; see footnote 29).

5.3 Summary

In sum, six major findings emerged from this mediation study. Firstly, children from richer families were less motivated than their poorer counterparts to choose science-oriented education programmes and careers after secondary school education. In other words, the higher the parental SES was, the lower the children’s Future-oriented Science Motivation.

Secondly, Figure 5.10, Figure 5.11 and Table 5.6 have demonstrated a method for testing multiple mediations simultaneously.
The multiple mediation study shown in Model 6b makes it possible for us to compare the relative magnitudes of the specific indirect effects associated with the mediators. The order of magnitude was: INTSCIE (-0.072), JOYSCIE (-0.036), SCSCIE (-0.011), PERSCIE (-0.009) and INSTSCIE (-0.007). This finding suggests that gender effects are mediated through Interest and Enjoyment of Science Learning and Science Self-concept more than through Attainment Value and Utility Value. In fact, “Interest and Enjoyment of Science Learning” was the most influential affective mediator, followed by “Science Self-concept”, which gives educators some indication of the order and magnitude of the interventions to be pursued.

Footnote 29: The difference between the (SCSCIE, INTSCIE, JOYSCIE) mediation and the (PERSCIE, INSTSCIE) mediation was estimated with the MODEL CONSTRAINT statement in Mplus: Difference = (F*G + H*I + J*K) - (L*M + N*O), where F to O are the path coefficients of Model 6b.

Thirdly, in Model 6b both direct and indirect paths were significant (p<0.001), and the results partially supported Halpern’s (2000, 2004) biopsychosocial hypothesis. The hypothesis recognizes the combined effects of biological, psychological and social influences on gender development. However, the total indirect effect (psychological and social effects) (-0.135, p<0.001) prevailed over the direct effect (biological effect) (-0.055, p<0.001) for gender differences and explains over 71% of gender effects on Future-oriented Science Motivation. Sociocultural conditions of individuals were still clearly the dominant factors influencing the development of gender differences in achievement-related choices in science education and careers.

Fourthly, all the indirect effects for girls on affective learning outcomes were significant and negative (p<0.001).
It is a paradox in late modern societies that social literacy allows females more freedom to choose their future life while persistent gender role stereotypes obstruct their educational and occupational trajectories related to science, and consequently restrict their social mobility. As suggested by Leung (2011), the major task of career guidance for girls is not to “help her to make a career decision”, but to “empower her to overcome the various social structural barriers limiting her choices” and to promote “social justice in gender-biased educational and career choices”.

Fifthly, gender differences in Future-oriented Science Motivation cannot be diminished by improving female science cognitive achievement. The mediation effect of Science Literacy is non-significant when the affective factors are taken into the estimation.

Sixthly, parental SES is like a double-edged sword. It helped students to improve their science performance because of extra family resources. At the same time, it discouraged students from families with higher parental SES from choosing science as their future studies and careers. Parents and teachers should reflect on “Why our wealthy students tend not to choose science as their future studies and careers”.

All the models studied so far are in line with the revised Eccles et al (1983) expectancy-value model of achievement-related choices, and the results provided us with more insight into the problem of “gender-segregated science related educational and career choices” in the Hong Kong context. In the next chapter, I will sum up the major findings of the study, examine the implications for policy and practice at school, families, examination bodies and education authorities, discuss limitations of the study, and recommend future research.
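The decomposition reported for Model 6b in this summary can be checked with a few lines of arithmetic: the five specific indirect effects should add up to the total indirect effect, and the total indirect effect plus the direct effect should approximate the total effect of Girl. The Python sketch below uses only the rounded estimates quoted in this chapter, so small rounding differences are expected; the variable names are illustrative.

```python
# Rounded estimates reported for Model 6b.
specific_indirect_effects = (-0.072, -0.036, -0.011, -0.009, -0.007)
direct_effect = -0.055          # Girl -> SCIEFUT
reported_total_effect = -0.189  # total effect of Girl

total_indirect = sum(specific_indirect_effects)
recomputed_total = direct_effect + total_indirect
share_indirect = total_indirect / reported_total_effect * 100.0

print(f"Total indirect effect: {total_indirect:.3f}")
print(f"Direct + indirect:     {recomputed_total:.3f} (reported total {reported_total_effect:.3f})")
print(f"Share of the gender effect carried by the mediators: {share_indirect:.1f}%")
```

The recomputed total differs from the reported total effect only in the third decimal place, which is consistent with rounding of the individual coefficients.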
CHAPTER SIX
CONCLUSIONS AND IMPLICATIONS

This chapter sums up the major findings of the study, examines the implications for policy and practice at school, families, examination bodies and education authorities, discusses limitations of the study, and makes recommendations for future research.

6.1 Database and data analysis

The study focuses on three major questions: (1) Are there any gender differences in science performance? (2) Are there any gender differences in affective learning outcomes of scientific literacy? (3) What are the mediation effects of gender differences through cognitive and affective factors on future-oriented science motivation?

The data were obtained from Hong Kong PISA 2006, which covers 4,645 students, representing 5.7% of the 15-year-old population, selected from 146 local schools by two-stage random sampling. All the student demographic and affective variables were retrieved from the student questionnaire, while the students’ cognitive performance in science was taken from 13 assessment booklets.

The study deployed two key quantitative research methods, Multidimensional Differential Item Functioning (MDIF) and Multilevel Mediation (MLM). MDIF was used to investigate gender effects at the item level, while MLM was used to study the mediation effects of affective variables on students’ future-oriented science motivation. Confirmatory factor analysis (CFA) was first used to test the goodness of fit of the measurement models. In other words, the OECD measurement models were fitted with local data and then the models were adjusted until the model fits were acceptable. Next, the “Multi-group Measurement Invariance” (MGMI) test was conducted to assess the applicability of all the measurement models across the genders. Once this procedure had finished, the MLM was conducted.
It is important to note that although MLM is an advanced statistical technique for structural equation modelling, causal relationships between the latent variables and gender effects might not be established. In the following section, the major findings of the current study are presented.

6.2 Major findings

6.2.1 Multidimensional DIF model

1. The multidimensional DIF is both more robust and more accurate in estimating the gender differences in science performance.

When the gender effect was modelled as a facet in the item model, the gender differences in science performance were statistically significant at the item level, while no gender differences could be detected using Mean Score Difference (MSD). One of the reasons for the robust result is that multidimensional DIF caters for test validity and reliability in the same model. The ConQuest software goes one step further by drawing on information from the other dimensions within the same matrix to improve the precision of measurement. These features supplement traditional MSD. In short, the MDIF was a method with sufficient sensitivity to detect the gender differences at the item level.

2. In general, small effect sizes for gender differences were identified across (i) competency dimensions, (ii) content and (iii) item formats.

Overall, small but statistically significant gender differences in science performance were found. The effect sizes detected in this study were, in most cases, less than 0.20, which is considered small (i.e. little impact) in education and social science studies. The largest effect size (0.35) was found in physical systems under knowledge of science. Findings from this study support the gender similarities hypothesis in cognitive performance proposed by Hyde (2005), namely that the effect sizes of gender differences are small.

3.
Boys and girls showed differential advantages in science performance competencies, content domains and item formats.

The MSD results on science performance competencies show that boys outperformed girls on Explaining Phenomena Scientifically (EPS) and Using Scientific Evidence (USE), while girls surpassed boys on Identifying Scientific Issues. The MSD results on content domains show that boys outperformed girls on the three domains of “Knowledge Of Science” but not on “Knowledge About Science”. Boys did better than girls on Earth and Space Systems, Living Systems and Physical Systems, and the largest effect size was found for Physical Systems. There was no significant gender difference on “Knowledge About Science”.

The MDIF results show that boys did better than girls on most item formats, including multiple choice, complex multiple choice and closed constructed response, but not on open response. However, the MDIF results indicate that more items were in favour of girls, after controlling for the average abilities of boys and girls.

In short, the MSD and MDIF results were consistent with most previous gender studies in science. Boys showed advantages in the EPS and USE competencies and in the content domains, in particular physical systems. However, the effect size for physical systems (0.35) was small. Regarding the gender differences on item format, girls were favoured by open response items while boys were favoured by closed constructed response items at the item level.

4. Boys had higher variability than girls at all science performance levels, and the gender variance ratios (B/G) were highest at the tails of the distributions.

More boys than girls were found to be low achievers in science (left ends of the distributions). In contrast to the traditional expectation, the findings did not support the notion that more boys occupied the right ends of the distributions. The numbers of high achievers in this region were in fact alike for boys and girls.
However, boys outnumbered girls at the left ends of the distributions for the three dimensions “Explaining Phenomena Scientifically”, “Identifying Scientific Issues” and “Using Scientific Evidence”. Boys’ underachievement is an issue in Western countries like the UK and the US. Hong Kong schools are now facing a similar problem.

Besides the concern of underachievement, boys’ cognitive outcomes varied much more than those of girls at all levels of science performance. The gender variance ratio (B/G) was highest at both ends of the distributions. These results were consistent with the existing literature (e.g. Feingold, 1992).

To sum up, there is no apparent evidence that high-achieving boys outnumbered high-achieving girls at the right ends of the distributions. The greater number of underachieving boys at the left ends of the distributions, and boys’ higher cognitive variability in science performance, have implications for classroom teaching and learning. These are elaborated in a later section.

5. Boys had higher affective learning outcomes than girls.

Findings from this study suggested that there were significant gender differences in affective learning outcomes. Boys showed statistically significantly higher levels of Enjoyment of Science Learning (JOYSCIE), Interest in Science Learning (INTSCIE), Instrumental Motivation to Learn Science (INSTSCIE), Personal Value of Science (PERSCIE), Science Self-concept (SCSCIE) and Future-oriented Science Motivation (SCIEFUT) than girls. The three largest effect sizes (Cohen’s d > 0.20) in affective learning outcomes were SCSCIE (0.47), SCIEFUT (0.38) and JOYSCIE (0.37). In Hong Kong, it is important to note that girls tended to report significantly lower Science Self-concept, Future-oriented Science Motivation and Interest in Science Learning than boys.
As a whole, compared with boys, girls had clear disadvantages in terms of confidence and future-oriented motivation to participate in science-related educational programmes or careers after secondary school education.

6.2.2 Multilevel Mediation using Expectancy-Value Model

Most past gender studies focused on regression analyses of gender effects on science performance and on the correlation between gender and attitudes toward science. The current study explored further the underlying mediation mechanism of gender effects on Future-oriented Science Motivation using SEM. The following section summarizes the major findings of the study.

6. The OECD affective constructs in science and the Eccles et al (1983) model were both applicable to local schools after a localization process.

The affective measurement models were all developed on the basis of the OECD (Organisation for Economic Co-operation and Development) countries and, except for Japan and South Korea, these countries are non-Asian. In addition, the model fits were obtained with the international calibration sample of OECD countries only, so the deployed measurement models may not reflect the genuine psychometric characteristics of Hong Kong students.

To apply these measurement models locally, all measurement models or constructs deployed in this study were first validated with CFA using AMOS software. For poor-fitting measurement models such as “Interest in Science Learning”, two misfitting items, topics in astronomy and topics in geology, were removed from the model because the local science curriculum does not cover them. Secondly, the split of general science interest into physical science, biological science and scientific method by second-order CFA confirmed a perfect fit of the modified measurement model with respect to local data.

The second step was to validate the measurement invariance (suitability) of the constructs across the two gender groups, boys and girls.
Results of the multiple-group CFA confirmed the validity of the measurement models for local males and females. For the structural models, the revised Eccles et al (1983) model was used to conceptualize the interactions among subjective expectations of success and the personal value of available educational and career choices. The final structural model fitted the local PISA data well, suggesting that the Eccles et al (1983) model is not only useful for understanding gender differences in Western science education but also applicable to the Hong Kong context. Most past local applications of the model focused on physical education (e.g. Pang & Ha, 2010). The current study extended and validated the model’s applicability to the study of gendered educational and vocational choices related to science education in Hong Kong.
To sum up, the CFA results suggested that most of the OECD measurement models fit the local data well. However, modifications were often necessary to ensure that all the measurement models reflected the actual psychometric properties of the local context. The results of the structural models also confirmed the applicability of the Eccles et al (1983) model to local science education.
Comparison of the mediation effects of cognitive and affective domains
7. Cognitive achievement was an insignificant mediator of gender differences on future-related achievement choices, while affective factors were important intervening mediators of gender differences.
The multiple mediator models used in this study allowed us to compare directly the relative magnitudes of specific indirect effects. In this study, cognitive achievement was found to be a strong mediator of gender differences provided that the affective mediators were excluded from the model. However, its mediating effect reduced to zero in a multiple mediator model that included affective mediators.
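The idea of comparing specific indirect effects can be sketched in a few lines of Python. This is a single-level, product-of-coefficients illustration only, not the multilevel SEM estimated in the study. All path coefficients below are hypothetical values chosen to show the arithmetic, and the coding of gender (1 = girl) is likewise an assumption.

```python
def specific_indirect_effects(a_paths, b_paths):
    """Product-of-coefficients indirect effect (a_k * b_k) for each mediator k."""
    return {name: a_paths[name] * b_paths[name] for name in a_paths}


# Hypothetical paths: gender -> mediator (a) and mediator -> SCIEFUT (b).
a_paths = {"SP": 0.10, "SCSCIE": -0.40, "JOYSCIE": -0.30}
b_paths = {"SP": 0.00, "SCSCIE": 0.20, "JOYSCIE": 0.50}
direct_effect = -0.05  # hypothetical direct path from gender to SCIEFUT

indirect = specific_indirect_effects(a_paths, b_paths)
total_indirect = sum(indirect.values())
total_effect = direct_effect + total_indirect

for name, value in indirect.items():
    print(f"{name}: {value:+.3f}")
print(f"total indirect = {total_indirect:+.3f}, total effect = {total_effect:+.3f}")
```

In this made-up example the cognitive mediator (SP) contributes nothing once the affective mediators are in the model, because its b path is zero. That is the pattern described in finding 7, although the study's actual estimates are not reproduced here.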
To sum up, the mediating effect of cognitive achievement was negligible in a multiple mediator model. The gender differences in future-oriented motivation in science can probably be reduced by raising females’ affective learning outcomes in science rather than their cognitive performance.
Comparison of the mediation effects of self-schema and subject task values
8. “Interest in Science Learning” and “Enjoyment of Science Learning” were the most influential intervening mediators of gender differences.
Built upon the Eccles et al (1983) model, the mediators of gender effects on future-oriented science motivation, in descending order of strength, are: Interest in Science Learning, Enjoyment of Science Learning, Science Self-concept (self-schema), Personal Value of Science (attainment value) and Instrumental Motivation to Learn Science (utility value). The results suggest that Interest in and Enjoyment of Science Learning dominate the mediation effects of gender differences and have a decisive role in shaping students’ future orientation towards studies and career paths in science. Science self-concept is considered to be a very strong predictor of gendered course choices and career orientation in Western societies. Surprisingly, its effect was overshadowed by Interest in and Enjoyment of Science Learning in the Hong Kong context. There is at least one possible explanation for this unexpected result: Hong Kong students tended to underestimate their science self-concept more than their Western counterparts because of the more demanding science curriculum (Lam, Cheng, Lai, Leung & Tsoi, 1996) and the humble nature of Asian students. Personal Value of Science and Instrumental Motivation to Learn Science have similar and weak mediation effects on gender differences. The present junior secondary school science curriculum puts little emphasis on the nature of science (or knowledge about science in PISA terminology), and it is hard for students to build up a clear picture of how the science enterprise works.
Consequently, there is no simple way for junior secondary school students to recognize the importance of science to their own identity under the existing science curriculum. To recap, Interest in Science Learning and Enjoyment of Science Learning are the two most influential intervening mediators of gender differences.
6.3 Revisiting conceptual model
The revised Eccles et al (1983) model, the conceptual model, was used for the multilevel mediation (MLM) study and is presented in Chapter 2. In the model, affective factors (or psychosocial factors) are the major mediators of gender differences. However, the empirical evidence of this study partially supports the biopsychosocial model of gender differences, which recognizes the joint impact of biological, psychological and social influences (Halpern, 2000, 2004). The new model in Figure 6.1 includes the possibility of a direct effect of biological influence on Future-oriented Science Motivation. However, it is also possible that some affective factors not covered in the present study have significant effects. This is worth studying in the future.
Figure 6.1: Revised Conceptual Model for Studying Gendered Educational and Occupational Trajectories in Science
[Path diagram not reproduced. Labelled elements: independent variable: Gender (stable child characteristic); control variable: SES (cultural milieu); mediators: Science self-concept (SCSCIE; child’s general self schemata), Interest in Science Learning (INTSCIE), Enjoyment of Science Learning (JOYSCIE), attainment value (PERSCIE), utility value (INSTSCIE) and Science Performance (SP), with a “Subjective task value” grouping label; dependent variable: Future-oriented Science Motivation (SCIEFUT; achievement related choice).]
Key: Solid lines (⎯) indicate significant path coefficients; dotted lines (----) indicate insignificant path coefficients.
6.4 Implications for policy and practice
This section discusses the implications for policy makers, school leaders, teachers, parents and students.
6.4.1.
Implications for policy makers
As of today, there is no clear gender equity educational policy in Hong Kong. The enforcement of gender equity at school level is still limited to the formal equality of opportunity of access to education of the 1970s. The research findings suggested that girls were significantly less motivated than boys to study science and to work in science fields after secondary education. However, they did not support the view that improving girls’ performance in science will encourage more girls to enter science careers and enrol in science programmes after secondary school education. It appears that the problem of the “leaking science pipeline” cannot simply be resolved by offering equal opportunity of schooling. One possible reason lies in the implementation of science curricula at school level. Quihuis (2002) found that female high achievers in science class were more likely to suffer from negative affect and to worry about confirming stereotypes, while male high achievers were more likely to experience positive affect and enjoy the curriculum in science classes. Moreover, textbooks and teaching materials were found to transmit gender role ideologies at schools (EOC, 1999). It is common for male role models, such as male scientists, to appear more frequently in science textbooks than female ones. Gender role beliefs and stereotypes are believed to influence the development of children’s self-concepts, their perceptions of the value of various activities and also their expectations for success in science (Eccles, 2011).
In fact, practical and comprehensive intervention strategies can be found in Taiwan. Since 2005, the Enforcement Rules for the Gender Equity Education Act have clarified the professional practices, including pre-service teacher training, school curricula, teaching materials, assessments and classroom instruction, for gender equity in Taiwan schools.
Local government officials, Legislative Council members and education leaders from local institutions should consider proposing and implementing similar schemes. As a short-term measure, the Education Bureau could incorporate subject-level affective domain measurement by gender into the School Development and Accountability Framework. This will alert local school administrators to the importance of gender equity policy at school level.
6.4.2. Implications for school administrators, teachers and textbook authors
Consistent with previous studies, our results indicate that interest and enjoyment in science learning are the two most influential factors in motivating girls to learn science and select science-related careers in the future. Science teachers should bring more hands-on laboratory experiences and projects with real-life contexts into daily lessons, helping students not only to learn but also to enjoy and become interested in science. Girls were found to enjoy learning science when they could apply what they had learnt to their lives or to the world around them (Quihuis, 2002). Recent research evidence has demonstrated that a context-based science-technology-society (STS) instruction approach can cultivate positive attitudes toward science in both girls and boys and reduce the gender difference in affective learning outcomes (Bennett, Lubben, & Hogarth, 2007).
Science teachers can inspire students, particularly girls, about science by sharing their own interest in science and providing students with a captivating curriculum. Girls are more likely to become interested in science when they can establish a good relationship with their science teachers (AAUW, 1991). Friendly teachers, who are easy to approach with science questions, also help girls in learning science and encourage them to consider science for their future studies and careers.
Our findings indicate that girls reported a significantly lower Science Self-concept than boys. Classroom instruction and assessments, therefore, should focus more on raising students’ Science Self-concept. Formative assessment is more appealing to girls, while competition and comparison of grades in science lessons will reduce their Science Self-concept and discourage them from learning science. Science teachers can help by promoting a cooperative learning atmosphere and discouraging competition in classroom learning.
Curriculum leaders and science panel heads at schools should realize that subject content coverage and high performance in public examinations are not the only indicators of success for science curricula. High affective learning outcomes are also a critical criterion for evaluating the implemented curricula. Yet many school teachers overlook the issue of students’ diversity in science affective development. Extra effort is placed on drilling public examination materials, while time for interest and enjoyment in science learning is either ignored or reduced.
As the findings indicate that boys have higher diversity in science performance, schools receiving more boys should be aware of greater learning diversity in the classroom. School-level science curricula should be enriched with more interesting hands-on experiments and exciting project work. Curriculum time should be increased accordingly to cater for boys’ greater diversity in learning.
One reason for the underrepresentation of females in the science field was the lack of instrumental motivation (utility values) to learn science. Careers teachers at schools or tertiary institutions may help by improving careers information about science. School-level science careers exhibitions and enterprise partnership schemes in science will provide careers-related exposure and experiences for girls and probably retain more girls in the science stream.
Apart from subject knowledge, science textbook authors are recommended to include historical and current events and scientists’ stories of major scientific discoveries, for example, the discovery of polonium and radium by Marie and Pierre Curie and the discoveries concerning neurotrophic factors and their potential for the treatment of neurodegenerative diseases by Nancy Ip, to raise the attainment value that students attach to scientific work. Girls may realize that men and women are equally important in many scientific discoveries and innovations.
6.4.3 Implications for parents and students
According to Eccles and her colleagues, parents’ gender stereotypes provide boys and girls with differential socialization experiences. Girls’ self schema and subjective task values will subsequently be reduced. Parents can help not only by treating their sons and daughters equally in terms of the types and frequency of science activities and games, but also by encouraging their daughters to participate in male-dominated science to the same extent as their sons. Parents can also act as positive science role models and teach their children science by reading science literature and watching or listening to science programmes with their children and discussing what they have learnt from these activities.
6.5 Limitations and recommendations for future research
6.5.1 Limitations of the study
PISA, like other international assessments such as TIMSS, has one natural limitation: the database contains only one wave of data. To understand the educational and vocational trajectories of students, it is essential to have a longitudinal design that follows the 15-year-olds for a longer period of time.
The second limitation of the study is its reliance upon career intentions rather than actual future educational and career choices. It would be an important extension for future research to cover the actual educational and career choices of the students.
Thirdly, there was no separation of biological and physical sciences in the PISA measurement models. The CFA of “Interest in Science Learning” in this study and other literature (e.g. Nagy et al., 2006) suggest that biological and physical sciences have gender-differential properties. Physical science usually demands higher mathematical abilities, and girls typically show more anxiety toward mathematics than boys (Eccles, 1984).
Given the importance of educational and vocational decisions for personal, social and economic development, it is worthwhile to study the gender-related differences in this respect in great detail. Unfortunately, in Hong Kong, there are limited studies concerning gender differences in science, in particular longitudinal studies about girls’ underrepresentation in STEM careers. The current study can only provide a snapshot of the problem “I can but I don’t want to” in science education. More longitudinal studies and in-depth case studies about gender development in STEM rather than science are essential for a comprehensive understanding of the problem. A knowledge base for development, intervention and policy concerning gender differences in occupational outcomes in science will help lead to a future solution to the problem.
6.5.2 Recommendations for future research
Given that the effect sizes for gender differences in the affective domains of science are larger than those in the cognitive domains, it will be meaningful to explore the origin and development of the gender differences in science affective learning outcomes by qualitative methods such as ethnography. Moreover, the sensitive techniques used in this study, such as MDIF, will be very useful for item-level analysis of other PISA competency domains.
Secondly, the Eccles Expectancy-Value Model has rich sets of psychological and sociocultural constructs related to gender differences that have not yet been fully explored in this study, for example, parental stereotypes and gender-differential performance in science education. The information gathered will be very useful for parents, social psychologists and social educationalists.
Finally, comparative educational studies using the Eccles Expectancy-Value Model can be conducted using cross-country PISA data. It will be useful to understand the differential sociocultural contributions or regional effects of different countries on gender differences in STEM and to make use of these understandings to reduce the gender differences in science education. Better still, the model has been examined and shown to be feasible in Hong Kong.
Appendix A: Handling missing values
Missing values in the PISA datasets can pose a challenge and even a threat to subsequent data analysis in confirmatory factor analysis (CFA) and multilevel regression analysis. Because they may seriously jeopardize the validity of the results, it is important to handle missing data in an appropriate way. There are three types of missing values: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR) (Rubin, 1976). The traditional tactics for dealing with missing values are listwise deletion, pairwise deletion, mean substitution and various regression-based estimations. These tactics are considered unacceptable because they produce biased parameter estimates (Heck, Thomas et al., 2010). Up to now, full information maximum likelihood (FIML) and multiple imputation (MI) are the two approaches considered acceptable in the literature (Peugh & Enders, 2004). Lam (2005) investigated the missing values of SES in the HKPISA 2000+ datasets and found that they did not follow an MCAR pattern.
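One reason the traditional tactics are regarded as biased can be seen in a small Python sketch of mean substitution. The SES-like values below are hypothetical, and `None` marks a missing response. Filling every gap with the observed mean leaves the mean unchanged but shrinks the variance, which in turn distorts any statistic that depends on spread.

```python
from statistics import mean, variance


def mean_substitute(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed_values = [v for v in values if v is not None]
    fill = mean(observed_values)
    return [fill if v is None else v for v in values]


# Hypothetical standardized SES scores with three missing responses.
ses = [-1.2, None, 0.3, 0.8, None, -0.4, 1.5, None]

observed = [v for v in ses if v is not None]
completed = mean_substitute(ses)

print(f"observed:  mean = {mean(observed):.3f}, variance = {variance(observed):.3f}")
print(f"completed: mean = {mean(completed):.3f}, variance = {variance(completed):.3f}")
```

Multiple imputation avoids this artificial shrinkage by drawing imputed values with random variation instead of a single constant.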
Little’s MCAR test (Little, 1988) on the PISA 2006 dataset leads to a similar conclusion. The null hypothesis of Little’s MCAR test is that the data are MCAR. Data are MCAR when the pattern of missing values does not depend on the data values. The SES missing values are not MCAR because the p value is less than .05 (see Table A1). Listwise deletion therefore cannot be applied to the SES missing values.
Table A1: EM correlation matrix of SES
          MISCED   FISCED   HISEI   SES
MISCED    1
FISCED    .593     1
HISEI     .497     .524     1
SES       .840     .853     .803    1
Little’s MCAR test: Chi-Square = 41.950, DF = 9, Sig. = .000
The result of Little’s MCAR test suggests that multiple imputation is essential to recover and complete the dataset for SES. The same procedure was used to analyse the missing value patterns of the other factors before conducting multiple imputation.
The second way to look at the missing values is to check whether they follow a monotone or nonmonotone pattern. The multiple imputation module of IBM SPSS Statistics 19 was used to find the pattern of the missing values in SES. The missing pattern of SES is shown in Figure A1. Pattern 1 represents cases that have no missing values, while Patterns 2, 3, 4 and 5 represent cases that have missing values on SES. In short, the missing values of SES do not display a monotone pattern, and a nonmonotone multiple imputation method should be used.
Figure A1: Missing value pattern of SES and related factors
The first method used for multiple imputation was Markov chain Monte Carlo (MCMC), which can handle a nonmonotone pattern; the output is shown in Table A2.
Table A2: Descriptive statistics of the results of multiple imputation of SES
Data                             N      Mean        Std. Deviation   Minimum      Maximum
Original Data                    4385   .0000000    1.00000000       -2.0397540   3.2021793
Imputed Values                   260    -.2047680   1.05584122       -3.4862281   3.0354896
Complete Data After Imputation   4645   -.0114617   1.00419286       -3.4862281   3.2021793
The second method used for multiple imputation was Bayesian imputation, which was used to generate missing values using CFA. For this method, however, the original PISA 2006 dataset with missing values was used for model fit before missing data generation.
IBM SPSS Statistics 19 script to conduct multiple imputation of SES
* Impute Missing Data Values.
MULTIPLE IMPUTATION PV1MATH PV2MATH PV3MATH PV4MATH PV5MATH
    PV1READ PV2READ PV3READ PV4READ PV5READ
    PV1SCIE PV2SCIE PV3SCIE PV4SCIE PV5SCIE
    PV1INTR PV2INTR PV3INTR PV4INTR PV5INTR
    PV1SUPP PV2SUPP PV3SUPP PV4SUPP PV5SUPP
    PV1EPS PV2EPS PV3EPS PV4EPS PV5EPS
    PV1ISI PV2ISI PV3ISI PV4ISI PV5ISI
    PV1USE PV2USE PV3USE PV4USE PV5USE SES
  /IMPUTE METHOD=FCS MAXITER=10 NIMPUTATIONS=1 SCALEMODEL=LINEAR INTERACTIONS=NONE
    SINGULAR=1E-012 MAXPCTMISSING=NONE
  /MISSINGSUMMARIES NONE
  /IMPUTATIONSUMMARIES MODELS DESCRIPTIVES
  /OUTFILE IMPUTATIONS=ImputedSES.sav.
Appendix B: Booklet effects
To adjust for the booklet effect (or position effect of items in different booklets), the booklet difficulty parameters were first estimated like other items in ConQuest. The calibration model statement is:
Model item + item * step + booklet;
The results of the calibration are presented in Table A3. The item fit is good (all weighted fit mean square, MNSQ, values are close to 1), and the booklet effect is relatively small for most of the booklets owing to the adoption of the Balanced Incomplete Block (BIB) test design in the PISA booklets. However, the booklet effect is still significant at p < 0.001. A positive value indicates a booklet that is harder than the average, while a negative value indicates a booklet that is easier than the average. For example, the estimated average achievement level of students taking Booklet 2 is 0.194 logit higher than the overall average.
Likewise, the estimated average achievement level of students taking Booklet 3 is 0.131 logit higher than the overall average, and the average achievement level of students taking Booklet 12 is 0.203 logit lower than the overall average. Booklet 2 (-0.194 logit) is the easiest booklet and Booklet 12 (0.203 logit) is the most difficult one; the difference between them is 0.397 logit, about half a grade. These differences in the estimated average achievement level are attributed to the booklets themselves rather than to differing abilities of the students. To correct for this, the booklet difficulty estimates were added to adjust the proficiencies of students who responded to particular booklets.
As recommended by the PISA technical report (OECD, 2009b), internationally estimated booklet parameters are the more desirable option from the perspective of cross-national consistency. Table A4 summarizes the booklet effects from the international calibration. A comparison of Table A3 and Table A4 shows that the Hong Kong estimated booklet effects are closely aligned with the internationally calibrated booklet effects. The use of students’ ability estimates, the plausible values, directly from the PISA database for the gender difference study is therefore valid.
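The comparison described above can be checked with a short Python sketch that takes the booklet estimates from Table A3 (Hong Kong) and Table A4 (international). The Pearson correlation is used here only as one convenient summary of how closely the two sets align; it is an assumption of this sketch, not a statistic reported in the study.

```python
from math import sqrt

# Booklet effects in logits, keyed by booklet number (Tables A3 and A4).
hk = {1: -0.029, 2: -0.194, 3: -0.131, 4: 0.045, 5: 0.034, 6: 0.064, 7: -0.136,
      8: 0.041, 9: 0.004, 10: 0.198, 11: 0.078, 12: 0.203, 13: -0.177}
intl = {1: -0.033, 2: -0.214, 3: -0.220, 4: -0.068, 5: 0.017, 6: 0.072, 7: -0.213,
        8: 0.189, 9: 0.002, 10: 0.229, 11: 0.130, 12: 0.219, 13: -0.112}


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


easiest = min(hk, key=hk.get)        # most negative estimate = easiest booklet
hardest = max(hk, key=hk.get)        # most positive estimate = hardest booklet
spread = hk[hardest] - hk[easiest]   # gap between hardest and easiest booklet

booklets = sorted(hk)
r = pearson_r([hk[b] for b in booklets], [intl[b] for b in booklets])

print(f"easiest: Booklet {easiest}, hardest: Booklet {hardest}, spread = {spread:.3f} logit")
print(f"Pearson r between Hong Kong and international estimates = {r:.2f}")
```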
Table A3: Hong Kong estimated booklet effects in logits
Booklet   Estimate   SE      Weighted fit MNSQ   CI             T
1         -0.029     0.052   1.01                (0.85, 1.15)   0.1
2         -0.194     0.055   0.95                (0.85, 1.15)   -0.7
3         -0.131     0.055   1.00                (0.85, 1.15)   0.0
4         0.045      0.054   1.03                (0.85, 1.15)   0.4
5         0.034      0.051   1.06                (0.85, 1.15)   0.8
6         0.064      0.055   1.04                (0.85, 1.15)   0.5
7         -0.136     0.060   0.99                (0.85, 1.15)   -0.2
8         0.041      0.055   1.08                (0.85, 1.15)   1.0
9         0.004*     0.053   0.96                (0.85, 1.15)   -0.5
10        0.198      0.054   1.01                (0.85, 1.15)   0.2
11        0.078      0.053   0.94                (0.85, 1.15)   -0.9
12        0.203      0.054   0.99                (0.85, 1.15)   -0.1
13        -0.177     0.062   1.02                (0.85, 1.15)   0.2
Note: An asterisk next to a parameter estimate indicates that it is constrained
Separation Reliability = 0.834
Chi-square test of parameter equality = 64.53, df = 12, Sig Level = 0.000
Table A4: Internationally estimated booklet effects
Booklet   Science domain
1         -0.033
2         -0.214
3         -0.220
4         -0.068
5         0.017
6         0.072
7         -0.213
8         0.189
9         0.002
10        0.229
11        0.130
12        0.219
13        -0.112
Source: OECD, 2009b p. 221
Appendix C: Wright map for science performance dimensions
The Wright map in Figure A2 shows that the PISA 2006 science items cover the students’ ability distributions in the three dimensions quite well, except in the upper portion of the distributions. For example, there is only one item (item 77: S519Q03) in the higher ability region of the Identifying Scientific Issues (ISI) distribution. For a more accurate estimation, more items that demand higher-order cognitive ability are necessary to fill these gaps and achieve a continuum of item difficulty.
Figure A2: Wright map for the three dimensions: Explaining Phenomena Scientifically (EPS), Identifying Scientific Issues (ISI) and Using Scientific Evidence (USE) in science performance
[Wright map not reproduced. Three panels (EPS, ISI, USE) plot the distribution of student ability estimates against numbered item difficulties on a common logit scale running from about 4 to -3.]
Note: Each ‘X’ represents 30.5 cases
Appendix D: Gender differences in scientific performance measured by MDIF
Table A5: Gender DIF items for Closed Constructed Response (CCR)
Item code S413Q06 S416Q01 S421Q01 S421Q02 S421Q03 Boys’ DIF Girls’ DIF Weighted | 2DIF | Estimate Estimate Competency item fit (DIF class) (SE) (SE) (girls/boys) 0.84 0.017 -0.017 CCR PS EPS 0.90 (0.041) (0.041) 1.02 0.126** -0.126** 0.25 CCR SEL USE 1.09 (0.047) (0.047) (Class A) 1.03 0.143*** -0.143*** 0.29 CCR PS EPS 0.98 (0.042) (0.042) (Class A) 1.20 -0.036 0.036 CCR LS EPS 1.16 (0.042) (0.042) 0.90 0.348*** -0.348*** 0.70 CCR ESS EPS 0.93 (0.044) (0.044) (Class C)
Note: *p<0.05, **p<0.01, ***p<0.001; DIF estimates with negative values mean relatively easier with reference to the opposite gender Item format Item content
Table A6: Gender DIF items for Multiple Choice (MC)
Item code Item format Item content
Competency S213Q02 MC TS EPS S256Q01 MC PS EPS S268Q01 MC SEQ ISI S268Q06 MC LS EPS S304Q02 MC PS EPS S326Q03 MC SEL USE S408Q01 MC LS EPS S408Q05 MC SEQ ISI S413Q05 MC TS USE S415Q02 MC ESS EPS S425Q02 MC SEL USE Weighted item fit (girls/boys) 1.07 0.87 1.21 0.95 0.94 1.05 0.97 1.01 0.93 0.86 1.07 1.06 0.93 0.87 0.97 1.00 1.14 1.13 0.95 0.92 0.97 0.93 152 Girls’ DIF Estimate & (SE) 0.040 (0.051) -0.310 (0.052) 0.061 (0.046) 0.315*** (0.042) 0.293*** (0.044) 0.021 (0.047) -0.103** (0.044) 0.094** (0.044) -0.109** (0.048) -0.011 (0.047) 0.065 (0.048) Boys’ DIF Estimate & (SE) -0.040 (0.051) 0.310 (0.052) -0.061 (0.046) -0.315*** (0.042) -0.293*** (0.044) -0.021 (0.047) 0.103** (0.044) -0.094** (0.044) 0.109** (0.048) 0.011 (0.047) -0.065 (0.048) | 2DIF | (DIF class) 0.62 (Class B) 0.63 (Class B) 0.59 (Class B) 0.21 (Class A) 0.19 (Class A) 0.22 (Class A) - S425Q05 MC SEQ ISI S426Q03 MC ESS EPS S426Q05 MC ESS EPS S428Q01 MC SEL USE S428Q03 MC SEL USE S437Q01 MC PS EPS S437Q03 MC PS EPS S437Q04 MC PS EPS S438Q02 MC SEQ ISI S447Q02 MC SEQ ISI S447Q03 MC SEQ ISI S447Q04 MC SEQ ISI S456Q02 MC LS EPS S465Q02 MC ESS EPS S465Q04 MC ESS EPS S466Q05 MC SEL USE S476Q01 MC LS EPS S476Q02 MC LS EPS S476Q03 MC LS EPS S477Q02 MC LS EPS S477Q03 MC LS EPS 0.98 1.04 0.99 0.98 0.90 0.90 0.92 0.96 0.83 0.84 1.08 0.86 1.06 1.01 0.98 0.96 0.98 1.02 0.97 0.87 1.01 0.98 0.95 1.01 1.22 1.22 1.02 0.98 1.03 1.01 1.05 1.09 1.10 0.96 0.99 0.91 0.97 0.85 1.01 0.90 0.95 0.8 153 -0.051 (0.047) -0.102** (0.043) 0.034 (0.044) 0.212*** (0.048) 0.089 (0.056) 0.118** (0.050) 0.104** (0.042) 0.050 (0.043) -0.008 (0.047) -0.070 (0.044) 0.052 (0.044) -0.027 (0.044) -0.227*** (0.044) -0.101* (0.042) -0.103* (0.041) 0.199*** (0.047) 0.094* (0.044) 0.206*** (0.043) -0.031 (0.046) -0.113* (0.045) 0.01 (0.049) 0.051 (0.047) 0.102** (0.043) -0.034 (0.044) -0.212*** (0.048) -0.089 (0.056) -0.118** (0.050) -0.104** (0.042) -0.050 (0.043) 0.008 (0.047) 0.070 (0.044) -0.052 (0.044) 0.027 (0.044) 0.227*** (0.044) 
0.101* (0.042) 0.103* (0.041) -0.199*** (0.047) -0.094* (0.044) -0.206*** (0.043) 0.031 (0.046) 0.113* (0.045) -0.010 (0.049) 0.20 (Class A) 0.42 (Class A) 0.24 (Class A) 0.21 (Class A) - 0.45 (Class B) 0.20 (Class A) 0.21 (Class A) 0.40 (Class A) 0.19 (Class A) 0.41 (Class A) 0.23 (Class A) - S478Q01 MC S485Q03 MC S498Q03 MC S508Q03 MC S521Q02 MC S521Q06 MC Note: 1.26 -0.098* 0.098* 0.20 1.20 (0.041) (0.041) (Class A) 0.83 0.029 -0.029 PS USE 0.88 (0.053) (0.053) 1.03 -0.008 0.008 SEQ ISI 1.01 (0.044) (0.044) 1.03 0.074 -0.074 SEQ ISI 1.01 (0.045) (0.045) 1.18 0.073 -0.073 PS EPS 1.02 (0.043) (0.043) 1.02 -0.059 0.059 PS EPS 0.87 (0.048) (0.048) *p<0.05, **p<0.01, ***p<0.001; DIF estimates with negative values mean relatively easier with reference to the opposite gender LS EPS Table A7: Gender DIF items for Complex Multiple Choice (CMC) Item code Item format Item content Competency S213Q01 CMC SEQ ISI S269Q04 CMC PS EPS S326Q04 CMC LS EPS S408Q04 CMC LS EPS S413Q04 CMC TS USE S415Q07 CMC SEQ ISI S415Q08 CMC SEQ ISI S426Q07 CMC SEQ ISI S438Q01 CMC SEQ ISI S456Q01 CMC LS EPS S458Q02 CMC LS USE S466Q01 CMC SEQ ISI S466Q07 CMC SEQ ISI Weighted item fit (girls/boys) 0.89 0.92 1.08 1.04 0.88 0.89 1.16 1.16 1.07 1.01 0.86 1.00 0.93 1.02 1.1 1.26 0.93 1.17 1.05 1.11 1.11 1.08 1.02 1.12 0.88 1.18 154 Girls’ DIF Estimate & (SE) -0.082 (0.045) 0.201*** (0.034) 0.116** (0.042) -0.031 (0.042) 0.077 (0.047) -0.109* (0.048) 0.004 (0.047) -0.021 (0.045) 0.045 (0.052) 0.030 (0.043) -0.148** (0.047) 0.051 (0.047) -0.012 (0.053) Boys’ DIF Estimate & (SE) 0.082 (0.045) -0.201*** (0.034) -0.116** (0.042) 0.031 (0.042) -0.077 (0.047) 0.109* (0.048) -0.004 (0.047) 0.021 (0.045) -0.045 (0.052) -0.030 (0.043) 0.148** (0.047) -0.051 (0.047) 0.012 (0.053) | 2DIF | (DIF class) 0.40 (Class A) 0.23 (Class A) 0.22 (Class A) 0.30 (Class A) - S478Q02 S478Q03 S493Q01 S493Q03 S495Q01 S495Q02 S495Q04 S498Q02 S508Q02 S510Q01 S514Q04 S519Q02 S524Q06 S527Q01 S527Q03 S527Q04 1.05 0.005 -0.005 0.99 
(0.047) (0.047) 1.00 -0.125** 0.125** CMC LS EPS 0.93 (0.046) (0.046) 1.04 -0.355*** 0.355*** CMC LS EPS 1.02 (0.044) (0.044) 1.01 -0.072 0.072 CMC LS EPS 0.92 (0.047) (0.047) 1.00 0.009 -0.009 CMC SEL USE 1.03 (0.047) (0.047) 1.09 0.258*** -0.258*** CMC SEL USE 1.07 (0.048) (0.048) 1.08 -0.106* 0.106* CMC SEQ ISI 1.09 (0.044) (0.044) 1.08 0.116** -0.116** CMC SEQ ISI 1.11 (0.044) (0.044) 1.07 0.060 -0.060 CMC SEQ ISI 1.00 (0.045) (0.045) 1.08 0.083* -0.083* CMC PS EPS 1.12 (0.042) (0.042) 0.88 -0.146** 0.146** CMC TS USE 0.98 (0.050) (0.050) 1.15 0.05 -0.050 CMC PS EPS 1.13 (0.041) (0.041) 1.06 0.216*** -0.216*** CMC TS USE 1.02 (0.049) (0.049) 1.07 0.139 -0.139 CMC SEL USE 1.03 (0.257) (0.257) 1.11 0.073 -0.073 CMC ESS EPS 1.14 (0.044) (0.044) 1.07 -0.038 0.038 CMC ESS EPS 1.06 (0.316) (0.316) Note: *p<0.05, **p<0.01, ***p<0.001; DIF estimates with negative values mean relatively easier with reference to the opposite gender CMC SEL USE 155 0.25 (Class A) 0.71 (Class C) 0.52 (Class C) 0.21 (Class A) 0.23 (Class A) 0.17 (Class A) 0.29 (Class A) 0.43 (Class C) - Table A8: Gender DIF items for Open Response (OR) Item code Item format Item content Competency S114Q03 OR SEL USE S114Q04 OR SEL USE S114Q05 OR ESS EPS S131Q02 OR SEL USE S131Q04 OR SEQ ISI S268Q02 OR LS EPS S269Q01 OR ESS EPS S269Q03 OR LS EPS S304Q01 OR PS USE S304Q03a OR TS USE S304Q03b OR TS EPS S326Q01 OR SEL USE S326Q02 OR SEL USE S408Q03 OR LS EPS S425Q03 OR LS EPS S425Q04 OR SEQ USE S426Q01 OR ESS EPS S428Q05 OR LS EPS S437Q06 OR PS EPS S438Q03 OR SEQ ISI S447Q05 OR SEL USE Weighted item fit (girls/boys) 0.86 0.93 1.03 0.98 1.02 1.07 0.94 0.94 0.93 0.98 0.99 0.97 0.86 0.85 0.89 0.83 0.90 0.97 1.05 0.95 0.85 0.8 0.96 0.92 0.89 0.87 1.03 0.98 1.00 1.00 0.95 1.01 0.93 1.19 0.92 0.92 1.03 0.85 0.92 0.95 0.92 156 Girls’ DIF Estimate & (SE) 0.026 (0.051) -0.015 (0.034) -0.196*** (0.042) -0.098* (0.047) -0.113* (0.044) -0.116** (0.041) -0.011 (0.045) -0.171*** (0.046) 0.094* (0.047) -0.10* (0.047) 0.044 
(0.043) -0.201*** (0.053) -0.189*** (0.052) -0.133** (0.043) 0.049 (0.041) -0.174*** (0.047) 0.016 (0.052) 0.019 (0.041) -0.037 (0.048) -0.08 (0.044) -0.031 Boys’ DIF Estimate & (SE) -0.026 (0.051) 0.015 (0.034) 0.196*** (0.042) 0.098* (0.047) 0.113* (0.044) 0.116** (0.041) 0.011 (0.045) 0.171*** (0.046) -0.094* (0.047) 0.10* (0.047) -0.044 (0.043) 0.201*** (0.053) 0.189*** (0.052) 0.133** (0.043) -0.049 (0.041) 0.174*** (0.047) -0.016 (0.052) -0.019 (0.041) 0.037 (0.048) 0.08 (0.044) 0.031 | 2DIF | (DIF class) 0.39 (Class A) 0.20 (Class A) 0.23 (Class A) 0.23 (Class A) 0.34 (Class A) 0.19 (Class A) 0.20 (Class A) 0.40 (Class A) 0.38 (Class A) 0.27 (Class A) 0.35 (Class A) - S458Q01 OR S465Q01 OR S477Q04 OR S485Q02 OR S485Q05 OR S493Q05 OR S495Q03 OR S498Q04 OR S508Q04 OR S510Q04 OR S514Q02 OR S514Q03 OR S519Q01 OR S519Q03 OR S524Q07 OR Note: 0.87 (0.032) (0.032) 0.96 0.135** -0.135** LS EPS 0.90 (0.041) (0.041) 1.18 -0.056 0.056 SEL USE 1.21 (0.035) (0.035) 1.07 -0.264*** 0.264*** LS EPS 0.92 (0.048) (0.048) 1.10 0.115*** -0.115*** PS EPS 1.11 (0.032) (0.032) 0.96 0.015 -0.015 SEQ ISI 0.90 (0.036) (0.036) 0.96 0.003 -0.003 LS EPS 0.89 (0.042) (0.042) 0.87 0.109* -0.109* SEL USE 0.87 (0.047) (0.047) 1.17 -0.165*** 0.165*** SEL USE 1.10 (0.035) (0.035) 0.90 0.005 -0.005 SEQ ISI 0.93 (0.044) (0.044) 0.89 0.156*** -0.156*** PS EPS 0.92 (0.042) (0.042) 0.87 -0.22*** 0.22*** TS USE 0.94 (0.057) (0.057) 0.98 -0.093* 0.093* ESS EPS 0.97 (0.041) (0.041) 1.08 -0.068 0.068 SEL USE 0.97 (0.035) (0.035) 1.06 0.108 -0.108 SEQ ISI 1.01 (0.218) (0.218) 0.91 0.048 -0.048 SEL USE 0.87 (0.047) (0.047) *p<0.05, **p<0.01, ***p<0.001; DIF estimates with negative values mean relatively easier with reference to the opposite gender 157 0.27 (Class A) 0.53 (Class C) 0.23 (Class A) 0.22 (Class A) 0.33 (Class A) 0.31 (Class A) 0.44 (Class C) 0.19 (Class A) - References AAUW. (1991). Shortchanging girls. Shortchanging America. 
Girls and Science: an International Study of Sex Differences in School Science Achievement. Stockholm: Almquist & Wicksell International. Kelly, A. (1988). The Customer is Always Right: Girls’ and Boys’ Reactions to Science Lessons. School Science Review, 69(249), 662-676. Keyes, S. (1982). Sex differences in cognitive abilities and sex-role stereotypes in Hong Kong Chinese adolescents. Sex Roles, 9(8), 852-870. Klasen, S. (1999). Does Gender Inequity Reduce Growth and Development? Evidence from Cross-Country Regressions. The World Bank Development Research Group/ Poverty Reduction and Economic Management Network. Report Background Papers No.7. Washington, DC: World Bank. http://www.worldbank.org/gender/prr. Retrieved on 03-12-2008. 169 Klopfer, L. E. (1971). Evaluation of learning in science, In B. S. Bloom, J. T. Hastings & G. F. Madaus (Eds.), Handbook on Formative and Summative Evaluation of Student Learning, New York: McGraw-Hill, 559-641. Klopfer, L. E. (1976), A structure for the affective domain in relation to science education, Science Education, 60, 299-212. Koballa, T. R. Jr. (1988). Attitude and related concepts in science education. Science Education, 72, 115-126. Kollmuss A. & Agyeman J. (2002). Mind the Gap: why do people act environmentally and what are the barriers to pro-environmental behavior? Environmental Education Research, 8(2), 229-260. Koslow, M. J. & Nay, M. A. (1976). An approach to measuring scientific attitude. Science Education, 60, 147-172. Krajcik, J., Blumenfeld, P. C., Marx, R. W., Bass, K. M., Fredricks, J. & Soloway, E. (1998). Inquiry in Project-Based Science Classrooms: Initial Attempts by Middle School Students. Journal of the Learning Sciences, 7(2), 212-25. Krogh, L. B. & Thomsen, P. V. (2005). Studying students’ attitudes towards science from a cultural perspective but with a quantitative methodology: border crossing into the physics classroom. International Journal of Science Education, 27(2), 281202. Krosnick, J. 
A., Judd, C. M. & Wittenbrink, B. (2005). The measurement of attitudes. In D. Albarracin, B. T. Johnson & M. P. Zanna (Eds.), The handbook of attitudes (pp. 21-76). Mahwah, NJ: Lawrence Erlbaum Associates. Kruglak, H. (1970). Pre- and post-Sputnik physics background of college freshmen- II. Journal of Research in Science Teaching, 7, 41-42. Kucian, K., Loenneker, T., Dietrich, T., Martin, E. & von Aster, M. (2005). Gender differences in brain activation patterns during mental rotation and number related cognitive tasks. Psychology Science, 47, 112-121. LaForgia, J. (1988). The affective domain related to science education and its evaluation. Science Education, 72, 407-421. Lam, C. C., Cheng, K. M., Lai, Winnie Y. W., Leung Frederick K. S. & Tsoi, H. S. (1996). Preparation of Students for Tertiary Education: Final Report. Hong Kong: University Grants Committee. Lam, Y. P. (2005). Effects of Family Social Capital on Hong Kong Students’ Literacy. Unpublished EdD Dissertation, The Chinese University of Hong Kong. Lau, W. L. (1997). A study of subject choices among third year secondary school pupils. Unpublished M. Ed. dissertation. Hong Kong: University of Hong Kong. Law, N. (1996a). Science and mathematics achievements at the junior secondary level in Hong Kong. Hong Kong: TIMSS Hong Kong Study Centre. Law, N. (1996b). Science and mathematics achievements at the junior secondary level in Hong Kong: a summary report for Hong Kong in the Third International Mathematics and Science Study (TIMSS). Hong Kong, Faculty of Education University of Hong Kong. Law, N. (1997). Science and mathematics achievements at the mid-primary level in Hong Kong a summary report for Hong Kong in the Third International Mathematics and Science Study (TIMSS). Hong Kong, Faculty of Education University of Hong Kong. Le, L. T. (2009). Investigating Gender Differential Item Functioning across Countries and Test Languages for PISA Science items. International Journal of Testing, 9, 2, 122-133. 
170 Lee, J. D. (1998). Which kids can “become” scientists? Effects of gender, self-concepts, and perceptions of scientists. Social Psychology Quarterly, 61, 199-222. Lee, K. (2008). Making Environmental Communications Meaningful to Female Adolescents: A Study in Hong Kong. Science Communication, 20(2), 147-176. Lee, K. (2009). Gender differences in Hong Kong adolescent consumers’ green purchasing behavior, Journal of Consumer Marketing, 26(2), 87-96. Lee, V. E., Smith, J. B. & Croninger R. G. (1997). How High School Organization Influences the Equitable Distribution of Learning in Mathematics and Science. Sociology of Education. 70(2), 128-15. Leung, S. A. (2011). Chapter 1: Understanding Educational and Career Transitions: A Belief Review of Implications from Career Theories. Hong Kong Association of Careers Masters and Guidance Masters. Likert, R. (1922). A technique for the measurement of attitudes. Archives of Psychology, 140, 1-55. Linnenbrink, E. A. & Pintrich, P. R. (2002). Achievement goal theory and affect: An asymmetrical bidirectional model. Educational Psychologist, 27, 69-78. Lin, H. F. (2009). The Study of Junior High School Students’ Scientific Literacy: The Cases of Taiwan, Japan, Korea and Hong Kong in PISA2006. Journal of Educational Research and Development, 5(4), 77-108. Linacre, J. M. (1994). Many-Facet Rasch Measurement. Chicago. MESA Press. Linver, M., Davis-Kean, P. E. & Eccles, J. S. (2002, March). Influences of gender on academic achievement. Paper presented at the Society for Research on Adolescence, New Orleans, LA. Lips, H. M. (1992). Gender- and science-related attitudes as predictors of college Students’ academic choices. Journal of Vocational Behavior, 40(1), 62-81. Little, R. J. A. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 82, 1198-1202. Liu O. L. (2006). 
Evaluating differential gender performance on large-scale math assessments: A multidimensional Rasch modeling and mixture approach. Unpublished doctoral dissertation. University of California, Berkeley. Liu O. L. & Wilson M. (2009). Gender Differences and Similarities in PISA 2002 Mathematics: A Comparison between the United States and Hong Kong. International Journal of Testing, 9 (1), 20-40. Loring-Meier, S. & Halpern, D. F. (1999). Sex differences in visualspatial working memory: Components of cognitive processing. Psychonomic Bulletin & Review, 6, 464-471. Lowery, L. J., Bowyer & Padilla, M. J. (1980). The science curriculum improvement study and student attitudes, Journal of Research in Science Teaching, 17, 227-255. Lynch, P. P., Benjamin, P., Chapman, T., Holmes, R., McCammon, R., Smith, A. & Symmons, R. (1979). Scientific language and the high school pupil. Journal of Research in Science Teaching, 16, 251-357. MacCallum, R. C., Roznowski, M. & Necowitz, L. B. (1992). Model modifications in covariance structure analysis: The problem of capitalization on chance. Psychological Bulletin, 111, 490-504. Maccoby, E. E. & Jacklin, C. N. (1974). The Psychology of Sex Differences. Stanford, Calif.: Stanford University Press. 171 Machin S. & Pekkarinen T. (2008). Assessment. Global sex differences in test score variability. Science, 222, 1221-1222. Mahony, P. (1998) Girls will be Girls and Boys will be First, in J. Elwood, D. Epstein, V. Hey, and J. Maw (Eds.), Failing Boys? Issues in Gender and Achievement. Buckingham: Open University Press. Mackinnon, David Peter (2008). Introduction to statistical mediation analysis. Lawrence Erlbaum and Associates Marek, E. A. (1981). Correlations among cognitive development, intelligence quotient, and achievement of high school biology students. Journal of Research in Science Teaching, 18, 9-14. Marjoribanks, K. (1976). School attitudes, cognitive ability, and academic achievement. Journal of Educational Psychology, 68, 652-660. 
Markus, H. & Nurius, P. (1986). Possible selves. American Psychologist, 41,954-969. Marsh, H. W. (1992). Academic self-concept: Theory, measurement, and research. In J. Suls (Ed.), Psychological perspectives on the self (vol. 4 pp. 59-98). Hillsdale, NJ: Erlbaum. Martin, M.O., Mullis, I.V.S. & Foy, P. (with Olson, J.F., Erberber, E., Preuschoff, C. & Galia, J.) (2008). TIMSS 2007 International Science Report: Findings from IEA’s Trends in International Mathematics and Science Study at the Fourth and and Eighth Grades, Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College. Martin, M.O., Mullis, I.V.S. & Foy, P. (with Olson, J.F., Erberber, E., Preuschoff, C. & Galia, J.) (2008). TIMSS 2007 International Science Report: Findings from IEA’s Trends in International Mathematics and Science Study at the Fourth and Eighth Grades, Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College. Masters, G. N. 1982. A Rasch model for partial credit scoring. Psychometrika, 47, 149-74. Masters, M. S. & Sanders, B. (1992). Is the gender difference in mental rotation disappearing? Behavior Genetics, 22, 227-241. Mattern, N. and C. Schau (2002). Gender differences in science attitude-achievement relationships over time among white middle-school students. Journal of Research in Science Teaching, 29(4), 224-224. Mau, W. C. (2002). Factors that influence persistence in science and engineering career aspirations. Career Development Quarterly, 51, 224 - 242. Mazzeo, J., Schmitt, A. P. & Bleistein, C. A. (1992). Sex-related performance differences on constructed-response and multiple-choice sections of advanced placement examinations, College Board Report No. 92-7, ETS RR No 92-5, New York, College Entrance Exam. ERIC, ED385543. Mazzeo, J., Schmitt, A. P. & Bleistein, C. A. (1993). Sex-related performance differences on constructed-response and multiple-choice sections of advanced placement examinations (College Board Report No. 92-7). 
New York: College Entrance Examination Board. McCrae, B. J. (2009) PISA 2006 test development and design. In R. Bybee & B. McCrae (Eds.), PISA Science 2006: Implications for Science Teachers and Teaching, pp. 27-28. Arlington, VA: NSTA Press. McCurdy, R. (1958). Toward a population literate in science. The Science Teacher, 25, 266 - 268. 172 McDuffie, T. E., Jr. & Beehler, C. (1978). Achievement-workstyle relationships in ISCS Level I. Journal of Research in Science Teaching, 15, 485-490. Meade, A. W., Johnson, E. C. & Braddy, P. W. (2008). Power and sensitivity of alternative fit indices in test of measurement invariance. Journal of Applied Psychology, 92, 568-592. Meece J. L., Glienke B. B. & Burg S. (2006). Gender and motivation, Journal of School Psychology, 44, 251-272. Menis, J. (1982). Attitudes towards chemistry as compared with those towards mathematics, among tenth grade pupils (aged 15) in high level secondary schools in Israel, Research in Science and Technological Education, 1, 85-191 Millar, R. (1996a). Towards a science curriculum for public understanding. School Science Review, 77(280), 7-18. Millar, R. (2006b). Twenty first century science: Insights from the design and implementation of a scientific literacy approach in school science. International Journal of Science Education, 28(12), 1499-1521. Ministry of Education (2005). Gender Equity Education Act. http://www.gender.edu.tw. Retrieved on 26-06-2008. Ministry of Education (2006). Enforcement Rules for the Gender Equity Education Act. http://law.moj.gov.tw/Eng/news/news_detail_ch.aspx?id=1911. Retrieved on 26-06-2008. Mink, P. T. (1972). Title IX of the Education Amendments of 1972 / Patsy T. Mink Equal Opportunity in Education Act. http://en.wikipedia.org/wiki/ Title_IX#cite_note-0. Retrieved on 18-7-2010. Mislevy, R. J. (1984). Estimating latent distributions. Psychometrika, 49, 359-381. Mislevy, R. J. (1991). Randomization-based inference about latent variables from complex samples. 
Psychometrika, 56, 177-196. Mislevy, R. J., Beaton, A. E., Kaplan, B. & Sheehan, K. M. (1992). Estimating population characteristics from sparse matrix samples of item responses. Journal of Educational Measurement, 29, 122-161. Mosedale, S. S. (1978). Science corrupted: Victorian biologists consider “the woman question.” Journal of the History of Biology, 11, 1-55. Moyer, R. H. (1977). Environmental attitude assessment: another approach. Science Education, 61, 247-256. Mullis, I. V., Martin, M.O., Beaton, A. E., Gonzalez, E. J., Kelly, D. L. & Smith, T. A. (1998). Mathematics and science achievement in the final year of secondary school. Boston: Center for the Study of Testing, Evaluation and Educational Policy, Boston College. Mullis, I. V., Martin, M. O., Fierros, E. G., Goldberg, A. L. & Stemler, S. E. (2000). Gender differences in achievement: IEA’s Third International Mathematics and Science Study. Chestnut Hill, MA: Boston College. Munby, H. (1997). Issues of validity in science attitude measurement. Journal of Research in Science Teaching, 24(4), 227-241. Murnane, R., & Raizen, S. (1988). Improving Indicators of the Quality of Science and Mathematics Education in Grades K-12. Washington, DC: National Academy Press. Murphy, C. & Beggs, J. (2005). Primary Science in the UK: A Scoping Study. Final Report to the Wellcome Trust. London: Wellcome Trust. 173 Murphy, D. G., DeCarli, C., McIntosh, A. R., Daly, E., Mentis, M. J., Pietrini, P., Szczepanik J., Schapiro M. B., Grady C.L., Horwitz B. & Rapoport S. I. (1996). Sex differences in human brain morphometry and metabolism: An in vivo quantitative magnetic resonance imaging and positron emission tomography study on effect of aging. Archives General Psychiatry, 53(7), 585-594. Murphy, K. R. (1992). Honesty in the Workplace, Brooks/Cole Publishing, Pacific Grove, CA. Murphy, P. (1991). Gender differences in pupils’ reactions to practical work. In B. Woolnough (Ed.), Practical science. 
Milton Keynes: Open University Press. Murphy, P. & Whitelegg, E. (2006). Girls in the Physics Classroom: A Review of Research of Participation of Girls in Physics. London: Institute of Physics. Murphy, R. J. L. (1978). Sex Differences in Examination Performance: do these reflect differences in ability or sex-role stereotypes? Educational Review, 20(2), 259-262. Murphy, R. J. L. (1982). Sex differences in objective test performance. British Journal of Educational Psychology, 52, 212-219. Murphy, R. J. L. (1982). Sex differences in objective test performance. British Journal of Educational Psychology, 52, 212-219. Muthén, L. K., & Muthén, B. O. (2007). Mplus user’s guide (5th ed.). Los Angeles: Muthén & Muthén. Nagy, G., Trautwein, U., Baumert, J., Köller, O. & Garrett, J. (2006). Gender and course selection in upper secondary education: Effects of academic self-concept and intrinsic value. Educational Research and Evaluation, 12(4), 222-245. Nagy, G., Trautwein, U., Baumert, J., Köller, O. & Garrett, J. (2006). Gender and course selection in upper secondary education: Effects of academic self-concept and intrinsic value. Educational Research and Evaluation, 12(4), 323-345. National Center for Education Statistics. (1997). Findings from condition of education 1997, No. 11:Women in Mathematics and science (NCES 97-982). Washington, DC: Author. National Science Foundation (2010). National Science Board: Science and Engineering Indicators - 2010. Arlington, VA: Author. National Science Foundation (2004). Women, Minorities, and Persons with Disabilities in Science and Engineering. Arlington, Va.: National Science Foundation, May 2004. http://www.nsf.gov/sbe/srs/wmpd/start.htm. Retrieved on 12-12-2008. Nettles, M. T. & Millett, C. M. (2006). Three magic letters: Getting to Ph.D. Baltimore: Johns Hopkins University Press. Ngai, S. K. (1995). Gender and schooling: A study of gender role socialization in a Primary School. 
Unpublished MEd Dissertation, The University of Hong Kong. Nobel Foundation (2009). http://nobelprize.org. Retrieved on 12-12-2008. Noddings, N. (1992). Variability: A pernicious hypothesis. Review of Research in Education. 62:85-88. Norman K. Denzin and Yvonna S. Lincoln (2002). Introduction: the discipline and practice of qualitative research, In N. K. Denzin & Y. S. Lincoln (Eds.), The Landscape of Qualitative Research: Theories and Issues, Sage. NSTA (National Science Teachers Association) (1971). NSTA position statement on school science education for the 70’s. The Science Teacher, 28, 46 -51. NSTA (National Science Teachers Association) (1982). Science-technology-society: Science education for the 1980s. Washington, D.C. NSTA (National Science Teachers Association) (1991). Position statement. Washington DC: National Science Teachers Association. 174 NSTA (National Science Teachers Association) (2006). NSTA position statement on Gender Equity in Science Education. http://www.nsta.org/pdfs/ PositionStatement_GenderEquity.pdf. Retrieved on 18-7-2010. OECD (Organisation for Economic Co-operation and Development) (2005). Paris Declaration and Accra Agenda for Action. Paris: OECD. OECD (Organisation for Economic Co-operation and Development) (2006). Assessing scientific, reading and mathematical literacy: A framework for PISA 2006. Paris: OECD. OECD (Organisation for Economic Co-operation and Development) (2009a). Education at a Glance. Paris: OECD. OECD (Organisation for Economic Co-operation and Development) (2009b). PISA 2006 technical report. Paris: OECD. OECD (Organisation for Economic Co-operation and Development) (2009c). Green at Fifteen How 15-year-olds perform in environmental science and geoscience in PISA 2006. Paris: OECD. OECD (Organisation for Economic Co-operation and Development) (2009d). PISA Data Analysis Manual SPSS® SECOND EDITION. Paris: OECD. Ogden, W. R. & Brewster, P. M. (1977). 
An analysis of cognitive style profiles and related science achievement among secondary school students. ERIC Document Reproduction Service No. ED 129 610. Osborne J., Simon S. & Tytler R. (2009). Attitude toward science: An Update. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, California, April 12-17. http://www.kcl.ac.uk/content /1/c6/05/82/69/AttitudesToAttitudesTowardScience.pdf. Retrieved on 12-8-2010. Osborne, J. (2007). Science education for the twenty first century. Eurasia Journal of Mathematics, Science & Technology Education, 2(2), 172-184. Osborne, J., Simon, S. & Collins, S. (2002). Attitude toward science: A review of the literature and its implications. International Journal of Science Education, 25(9), 1049-1079. Oskamp, S. & Schultz P. W. (2005). Attitudes and opinions. Mahwah, N.J., L. Erlbaum Associates. Paek, I. (2002). Investigations of differential item functioning: comparisons among approaches, and extension to a multidimensional context. Unpublished doctoral dissertation. UC Berkeley. Pajares, F. (1996). Self-efficacy beliefs in academic settings. Review of Educational Research, 66, 542-578. Pang B. & Ha A. S. C. (2010).Subjective task value in physical activity participation: The perspective of Hong Kong schoolchildren European Physical Education Review 16, 223-235 Patrick H., Mantzicopoulos P. & Samarapungavan A. ( 2009). Motivation for learning science in kindergarten: Is there a gender gap and does integrated inquiry and literacy instruction make a difference, Journal of Research in Science Teaching, 46, 2. Pedretti, E. & Forbes (2000). From curriculum rhetoric to classroom reality, STSE education. Orbit, 21 (2), 29-41. Pell, T. & Jarvis, T. (2001). Developing attitude to science scales for use with children of ages from five to eleven years. International Journal of Science Education, 23(8), 847862. 175 Pella, M. O., O’Hearn G. T. & Gale C. W. (1966). Referents to Scientific Literacy. 
Journal of Research in Science Teaching, 4, 199-208. Peugh, J. L. & Enders C. K. (2004), Missing Data in Educational Research: A Review of Reporting Practices and Suggestions for Improvement. Review of Educational Research, 74(4), 525-556. Pintrich, P. R. & Marx, R. W. & Boyle R. A. (1992). Beyond Cold Conceptual Change: The Role of Motivational Beliefs and Classroom Contextual Factors in the Process of Conceptual Change. Review of Educational Research, 62(2), 167-199. Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40, 879-891. Quihuis G. (2002). Understanding the factors that influence high science achievers’ academic choices and intent to pursue or opt out of the hard sciences. Unpublished Ph.D. dissertation. Stanford University. Ramsden, J. M. (1998). Mission impossible?: Can anything be done about attitudes to science? International Journal of Science Education, 20, 125-127. Randall, R. E. (1975). A study of the perceptions and attitudes of secondary school. Dissertation Abstracts International, 25, 5152A. Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. (Copenhagen, Danish Institute for Educational Research), expanded edition (1980) with foreword and afterword by B.D. Wright. Chicago: The University of Chicago Press. Ravitch, D. (1982). The troubled crusade. New York: Basic Books. Reis, S. M. & Park, S. (2001). Gender differences in high-achieving students in math and science. Journal for the Education of the Gifted, 25, 52-73. Reise S. P., Widaman K. F. & Pugh R. H. (1992). Confirmatory factor analysis and item response theory: Two approaches for exploring measurement invariance. Psychological bulletin, 114(2), 552-566. Rennie, L. J. & Parker, L. H. (1987). Scale dimensionality and population heterogeneity: Potential problems in the interpretation of attitude data. 
Journal of Research in Science Teaching, 24, 567-577. Robitaille, D. F. & Beaton A. E. (2002). Secondary analysis of the TIMSS data. Dordrecht; Boston, Kluwer Academic Publishers. Rockefeller B. F. (1958). The pursuit of excellence: Education and the future of America. In Prospect for America: Report Number of the Rockefeller Panel Reports. Garden City, NY: Doubleday. Rosenthal, R. & Rubin, D.B. (1982). Further meta-analytic procedures for assessing cognitive gender differences. Journal of Educational Psychology, 74, 708-712. Rubin, D. B. (1976). Inference and missing data. Biometrika, 62, 581-592. Rubin, D. B. (1987). Multiple imputations for non-response in surveys. New York, New York: John Wiley & Sons. Rudasill, K. M. & Callahan, C. M. (2010). Academic self-perceptions of ability and course planning among academically advanced students. Journal of Advanced Academics, 21, 200-229. Ryan, R. M. & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist, 55, 68-78. Salta, K. & Tzougraki, C. (2004). Attitudes Toward Chemistry Among 11th Grade Students in High Schools in Greece. Science Education, 88, 525- 547. 176 Scantlebury, K. & Baker, D. (2007). Gender issues in science education research: Remembering where the difference lies. In S. Abell & N. Lederman (Eds.), Handbook of research on science education (pp. 257286). Mawhah, New Jersey: Lawrence Erlbaum. Schibeci, R. A. (1984), Attitudes to Science: An Update, Studies in Science Education, 11, 26-49. Schibeci, R. A. (1989), Home, school and peer group influences on student attitudes and achievement in science, Science Education, 72(1), 12-24. Schiefele, U. (2001). The role of interest in motivation and learning. In J. M. Collis & S. Messick (Eds.), Intelligence and personality: Bridging the gap in theory and measurement (pp. 162-194). Mahwah, NJ: Erlbaum. Schmader T., Johns M., and Barquissau M. (2004). 
The costs of accepting gender differences: The role of stereotype endorsement in women’s experience in the math domain. Sex Roles, 50, 825-850. Schmidt, W. H., Raizen, S. A., Britton, E. D., Bianchi, L. J. and Wolfe, R. G. (1997) Many visions, many aims, Vol. 2. Dordrecht: Kluwer Academic Publishers. Schreiner, C. & Sjøberg, S. (2007). Science education and youth’s identity construction two incompatible projects? In D. Corrigan, J. Dillon & R. Gunstone (Eds.), The Reemergence of Values in the Science Curriculum. Rotterdam: Sense Publishers. Schreiner, C. (2006). Exploring a ROSEgarden: Norwegian youth’s orientations towards science - seen as signs of late modern identities. University of Oslo, Oslo. Scott, N. C., Jr. & Siegel, I. W. (1965). Effects of inquiry training in physical science on creativity. ERIC Document Reproduction Service No. ED002700. Selim, M. A. & Shrigley, R. L. (1983). The group-dynamics approach: a socio-psycho-logical approach for testing the effect of discovery and expository teaching on the science achievement and attitude of young Egyptian students. Journal of Research in Science Teaching, 20, 213-224. Sharpe, S. (1976) Just Like a Girl. Harmondsworth: Penguin. Shavelson, R. J., Hubner, J. J. & Stanton, G. C. (1976). Self-concept: Validation of construct interpretations. Review of Educational Research, 46, 407-441. Shepard, L. (1992). Evaluating Test Validity, in: L. Darling-Hammond, Review of Research in Education. Washington DC, American Research Association. Sherman, S. W. (1974). Multiple Choice Test Bias Uncovered by Use of an ‘I Don’t Know’ Alternative. ERIC Document Reproduction Service No. ED 121824. Shernoff, D. J., Csikszentmihalyi, M., Schneider, B. & Shernoff, E. S. (2002). Student engagement in high school classrooms from the perspective of flow theory. School Psychology Quarterly, 18(2), 158-176. Showalter, V. (1974). What Is Unified Science Education? Program Objectives and Scientific Literacy. Prism II-2: 1-6 Shrigley, R. L. 
(1972). Sex difference and its implications on attitude and achievement in elementary school science. School Science and Mathematics, 72, 789-792. Shrigley, R. L., Koballa, T. R. & Simpson, R. D. (1988). Defining attitude for science educators, Journal of Research in Science Teaching, 25(8), 659-678. Sieveking, N. A. & Savitsky, J. S. (1969). Evaluation of an achievement test prediction of grades, and composition of discussion groups in college chemistry. Journal of Research in Science Teaching, 6, 274-276. 177 Simpkins, S. D. & Davis-Kean, P. E. (2005). The intersection between self-concepts and values: Links between beliefs and choices in high school. In J. E. Jacbos & S. D. Simpkins (Eds.), Leaks in the pipeline to math, science, and technology careers (Number 110, pp. 31-47). San Francisco, CA: Jossey-Bass. Simpkins, S. D. & Davis-Kean, P. E.(2005). The intersection between self-concepts and values: Links between beliefs and choices in high school. In J. E. Jacbos & S. D. Simpkins (Eds.), Leaks in the pipeline to math, science, and technology careers (Number 110, pp. 31-47). San Francisco, CA: Jossey-Bass. Simpson, R. D. & J. Steve Oliver (1990). A summary of major influences on attitude toward and achievement in science among adolescent students. Science Education, 74(1): 1-18. Simpson, R. D. & Oliver, J. S. (1985). Attitudes toward science and achievement motivation profiles of male and female science students in grade six through ten. Science Education, 69, 511-526. Simpson, R. D., Koballa, T. R. Jr., Oliver, J. S. & Crawley, F. E. (1994). Research on the affective dimension of science learning. In D. L. Gabel (Ed.), Handbook of research on science teaching and learning (pp. 211-224). New York: Macmillan. Sjøberg, L. (1982). Interest, achievement and vocational choice, European Journal of Science Education, 5, pp. 299-207. Sjøberg, S. & Schreiner, C. (2005). How do learners in different cultures relate to science and technology? 
Results and perspectives from the project ROSE. Asia Pacific Forum on Science Learning and Teaching, 6(2), 116. Skinner, R., Jr. (1967). Inquiry sessions: An assist for teaching science via instructional television in the elementary schools. Journal of Research in Science Teaching, 5, 246-250. Slabbekoorn, D., Van Goozen, S. H. M., Megens, J., Gooren, L. J. G., & Cohen-Kettenis, P. T. (1999). Activating effects of cross-sex hormones on cognitive functioning: A study of short-term and longterm hormone effects in transsexuals. Psychoneuroendocrinology, 24, 432-447. Solomon, J. & Aikenhead, G. (1994). STS Education: International Perspectives in Reform. New York: Teacher’s College Press. Solomon, J. (1992). Teaching Science, Technology & Society. Philadelphia, CA: Open University Press Spall, K., Dickson, D. & Boyes, E. (2004). Development of school students’ constructions of biology and physics. International Journal of Science Education, 26(7), 787-802. Squiers, S. M. M. (1982). An analysis of attitudes of high school seniors towards science and scientists in a southern metropolitan high school (Doctoral dissertation, Auburn University, 1982). Dissertation Abstracts International, 11, 02A. Stables, A. & Wikeley, F. (1997). Changes in preference for and perceptions of relative importance of subjects during a period of educational reform. Educational Studies, 22(2), 292-402. Stables, A. (1990). Differences between pupils from mixed and single-sex schools in their enjoyment of school subjects and in their attitudes to science and to school. Educational Review, 42(2), 221-220. Starr, J. W. & Nicholl, C. (1975). Creativity and achievement in Nuffield Physics. British Journal of Educational Psychology, 45, 222-226. 178 Steele C. M. & Aronson J. (1995). Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69, 797-811. Steinkamp, M. W. & Maehr, M. L. (1984). 
Gender differences in motivational orientations toward achievement in school science: A quantitative synthesis. American Educational Research Journal, 21, 29-59.
Stobart, G., Elwood, J. & Quinlan, M. (1992). Gender Bias in Examinations: How Equal Are the Opportunities? British Educational Research Journal, 18(2), 261-376.
Strope, M. B. & Braswell, A. L. (1966). A comparison of factual teaching and conceptual teaching in introductory college astronomy. Journal of Research in Science Teaching, 4, 95-97.
Stumpf, H. & Stanley, J. (1996). Gender related differences on the College Board’s advanced placement and achievement tests, 1982-1992. Journal of Educational Psychology, 88, 353-364.
Tai, R. H., Qi Liu, C., Maltese, A. V. & Fan, X. (2006). Planning Early for Careers in Science. Science, 212, 1142-1145.
Tamir, P. & Amir, R. (1975). Teaching science to first and second grade pupils in Israel by the audio-tutorial method. Science Education, 59, 29-49.
Tamir, P. (1974). Botany and zoology-A curriculum problem. Journal of Research in Science Teaching, 11, 5-16.
Tamir, P. (1976). Factors which influence student achievement in high school biology. Journal of Research in Science Teaching, 12, 529-545.
Tang, R. (2006). Welcoming Speech for Opening Ceremony Challenges and Possibilities in Gender Equity Education: The Second International Conference in the Asia-Pacific Region. http://www.eoc.org.hk/eoc/upload/200672714262044022.doc. Retrieved on 15-12-200
Thomas, B. & Snider, B. (1969). The effects of instructional method upon the acquisition of inquiry skills. Journal of Research in Science Teaching, 277-286.
Tikka, P., Kuitunen, M. & Tynys, S. (2000). Effects of educational background on students’ attitudes, activity levels, and knowledge concerning the environment. Journal of Environmental Education, 21(2), 12-19.
Tse, K. C. (1998). Differential Educational Opportunities in Hong Kong: A Review (in Chinese). Hong Kong: the Hong Kong Institute of Educational Research.
UGC (University Grants Committee) (2009). First-year Student Intakes (Headcount) of UGC-funded Programmes by Level of Study and Sex. http://www.ugc.edu.hk/eng/ugc/stat/student_head.htm. Retrieved on 12-12-2009.
UGC (University Grants Committee) (2011). First-year Student Intakes (Headcount) of UGC-funded Programmes. http://www.ugc.edu.hk/eng/ugc/stat/student_head.htm. Retrieved on 12-12-2009.
UNDP (United Nations Development Programme) (2007). Empowered and Equal: Gender Equality Strategy 2008-2011. New York: United Nations. http://www.undp.org/women/docs/Gender-Equality-Strategy-2008-2011.pdf. Retrieved on 12-12-2009.
UNESCAP (United Nations Economic and Social Commission for Asia and the Pacific) (2007). Economic and Social Survey of Asia and the Pacific 2007. New York: United Nations. http://www.unescap.org/survey2007/download/01_Survey_2007.pdf. Retrieved on 18-02-2010.
UNESCO (United Nations Educational, Scientific and Cultural Organization) (2000). Gender equality and equity. New York: United Nations. http://unesdoc.unesco.org/images/0012/001211/121145e.pdf. Retrieved on 6-4-2010.
UNESCO (United Nations Educational, Scientific and Cultural Organization) (2002). UNESCO and the International decade of education for Sustainable development (2005-2015). UNESCO International Science, Technology & Environmental Education Newsletter, vol. XXVIII, no. 1-2, UNESCO, Paris.
UNESCO (United Nations Educational, Scientific and Cultural Organization) (2005). International Implementation Scheme for the UN Decade of Education for Sustainable Development. UNESCO, Paris.
Unger, R. & Crawford, M. (1996). Women and gender: A feminist perspective (second edition). New York: McGraw-Hill.
Unger, R. K. (1981). Female and male: Psychological perspectives. New York: Harper & Row.
Valian, V. (1998). Why so slow? The advancement of women. Cambridge, MA: MIT Press.
Van Goozen, S. H. M., Cohen-Kettenis, P. T., Gooren, L. J. G., Frijda, N. H. & Van de Poll, N. E. (1994). Activating effects of androgens on cognitive performance: Causal evidence in a group of female-to-male transsexuals. Neuropsychologia, 22, 1152-1157.
Van Goozen, S. H. M., Cohen-Kettenis, P. T., Gooren, L. J. G., Frijda, N. H. & Van de Poll, N. E. (1995). Gender differences in behaviour: Activating effects of cross-sex hormones. Psychoneuroendocrinology, 20, 242-262.
Van Liere, K. D. & Dunlap, R. E. (1980). The Social Bases of Environmental Concern: A Review of Hypotheses, Explanations and Empirical Evidence. Public Opinion Quarterly, 44, 181-97.
Vandenberg, R. J. & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 2, 4-70.
Voyer, D., Voyer, S. & Bryden, M. P. (1995). Magnitude of sex differences in spatial abilities: A meta-analysis and consideration of critical variables. Psychological Bulletin, 117, 250-270.
Wainer, H., Sheehan, M. & Wang, X. (2000). Some paths toward making praxis scores more useful. Journal of Educational Measurement, 27, 112-140.
Walberg, H. J. (1969). Physics, femininity, and creativity. Developmental Psychology, 1, 47-54.
Walding, R., Fogliani, C., Over, R. & Bain, J. (1994). Gender differences in response to questions on the Australian National Chemistry Quiz. Journal of Research in Science Teaching, 21, 822-846.
Walkerdine, V. (1990). Schoolgirl Fictions. London: Verso.
Wallach, M. A. & Kogan, N. (1966). Modes of thinking in young children. New York: Holt, Rinehart & Winston.
Wallston, B. S. & O’Leary, V. E. (1981). Sex and gender make a difference: The differential perceptions of women and men. In L.
Wheeler (Ed.), Review of Personality and Social Psychology (Vol. 2, pp. 9-41). Beverly Hills, CA: Sage.
Wang, W. (1994). Implementation and application of the multidimensional random coefficients multinomial logit. Unpublished doctoral dissertation. University of California, Berkeley.
Wang, W. C. (2000). The simultaneous factorial analysis of differential item functioning. Methods of Psychological Research Online, 5(1), 51-76.
Wang, W. C., Chen, P. H. & Cheng, Y. Y. (2004). Improving measurement precision of test batteries using multidimensional item response models. Psychological Methods, 9, 116-126.
Wareing, C. (1981). Cognitive style and developing scientific attitudes in the SCIS classroom. Journal of Research in Science Teaching, 18, 72-77.
Waugh, R. F. (2002). Creating a scale to measure motivation to achieve academically: Linking attitudes and behaviours using Rasch measurement. British Journal of Educational Psychology, 72, 65-86.
Webb, R. M., Lubinski, D. & Benbow, C. P. (2002). Mathematically facile adolescents with math/science aspirations: New perspectives on their educational and vocational development. Journal of Educational Psychology, 94, 785-794.
Weinburgh, M. (1995). Gender Differences in Student Attitudes toward Science: A Meta-Analysis of the Literature from 1970 to 1991. Journal of Research in Science Teaching, 32(4), 387-398.
Weinburgh, M. H. & Engelhard, G., Jr. (1991). Gender, prior academic performance and beliefs as predictors of attitudes toward biology laboratory experiences. Paper presented at Georgia Educational Research Association conference, Decatur, GA.
Weinburgh, M. H. (1994). Achievement, grade level, and gender as predictors of student attitudes toward science. Paper presented at the Distinguished Paper Session of the annual meeting of the American Association of Educational Research. New Orleans.
Weinburgh, M. H. (2000). Gender, ethnicity, and grade level as predictors of middle school students’ attitudes toward science.
ERIC, ED442662.
Weisberg, J. S. (1970). The use of visual advance organizers for learning earth science concepts. Journal of Research in Science Teaching, 7, 161-165.
Weller, F. (1922). Attitudes and skills in elementary science. Science Education, 17, 90-97.
Wigfield, A. (1994). The role of children’s achievement values in the self-regulation of their learning outcomes. In D. H. Schunk & B. J. Zimmerman (Eds.), Self-regulation of learning and performance: Issues and educational applications (pp. 101-124). Mahwah, NJ: Erlbaum.
Wigfield, A., Eccles, J., Mac Iver, Reuman, D. & Midgley, C. (1991). Transitions at early adolescence: Changes in children’s domain-specific self-perceptions and general self-esteem across the transition to junior high school. Developmental Psychology, 27, 552-563.
Wilson, M. (2005). Constructing Measures: An Item Response Modeling Approach. Mahwah, NJ: Erlbaum.
Wilson, M. & Hoskens, M. (2005). Multidimensional Item Responses: Multimethod-Multitrait Perspectives. In S. Alagumalai, D. D. Curtis & N. Hungi (Eds.), Applied Rasch Measurement: A Book of Exemplars, 4, 287-207. Springer Netherlands.
Wooley, J. K. (1978). Factors affecting students’ attitudes and achievement in an astronomy computer-assisted instruction programme. Journal of Research in Science Teaching, 15, 172-178.
Wright, B. D. & Masters, G. N. (1982). Rating Scale Analysis. Chicago: MESA Press.
Wright, B. D., Linacre, J. M., Gustafson, J. E. & Martin-Löf, P. (1994). Reasonable mean-square fit values. Rasch Measurement Transactions, 8(2), 270.
Wu, M. L. (2005). The Role of Plausible Values in Large-Scale Surveys. Studies in Educational Evaluation, 31, 114-128.
Wu, M. L. (2008). Improving the measurement of Reading scores by using mathematics and science scores. Keynote address presented at the PROMS (Pacific Rim Objective Measurement Symposium) 2008 in Tokyo.
Wu, M. L., Adams, R. J. & Wilson, M. R. (1997).
ConQuest: Multi-Aspect Test Software [Computer program]. Camberwell: Australian Council for Educational Research.
Yager, R. E. (1982). Elementary science teachers - take a bow. Science and Children, 20, 20-22.
Yang, L. L. (1996). Analysis of gender differences in science interests (in Chinese). Taipei: Wen Jing.
Yip, D. Y. & Ho, S. C. (2002). Assessment of Scientific Literacy of Hong Kong Students in PISA 2000. Education Journal, 21(1), 117-122.
Yip, D. Y. (2002). Scientific Literacy of Hong Kong Students in PISA 2000: An Analysis of Performance on the Released Items. Education Journal, 21(2), 141-159.
Yip, D. Y., Chiu, M. M. & Ho, S. C. (2004). Hong Kong Student Achievement in OECD-PISA Study: Gender Differences in Science Content, Literacy Skills and Test Item Formats. International Journal of Science and Mathematics Education, 2(1), National Science Council, Taiwan, 91-106.
Young, D. J. & Fraser, B. J. (1994). Gender differences in science achievement: Do school effects make a difference? Journal of Research in Science Teaching, 31, 857-871.
Yung, B. H. W. (2002). Same assessment, different practice: Professional consciousness as a determinant of teachers’ practice in a school-based assessment scheme. Assessment in Education, 9(1), 101-121.
Yung, B. H. W. (2006). Learning from TIMSS: Implications for Teaching and Learning Science at the Junior Secondary Level. Education and Manpower Bureau, Hong Kong (China): Government Logistics Department.
Zahidi, S., Rao, S. P., Cuenod, M., Bekhouche, Y., Kar, I. & Prasad, G. (2009). The India Gender Gap Review 2009. World Economic Forum, Geneva, Switzerland. http://www.weforum.org/pdf/gendergap/IGGR09.pdf. Retrieved on 1-4-2010.
Zajonc, R. B. (1980). Feeling and thinking: Preferences need no inferences. American Psychologist, 25, 151-175.
Zee, V. E. & Minstrell, J. (1997). Using Questioning to Guide Student Thinking. Journal of the Learning Sciences, 6(2), 227-269.
Zelezny, L. C., Chua, P. P. & Aldrich, C. (2000).
Elaborating on Gender Differences in Environmentalism. Journal of Social Issues, 56, 442-457.
Zenisky, A. L., Hambleton, R. K. & Robin, F. (2004). DIF detection and interpretation in large-scale science assessments: informing item writing practices. Educational Assessment, 9(1-2), 61-78.
Zimmermann, L. (1996). Knowledge, affect, and the environment: 15 years of research (1979-1992). The Journal of Environmental Education, 27, 41-44.
Zusho, A., Pintrich, P. R. & Coppola, B. (2002). Skill and will: the role of motivation and cognition in the learning of college chemistry. International Journal of Science Education, 25(9), 1081-1094.