Choices and Decisions for Test Instruments in Cognitive and Achievement Assessment
Region 4 Education Service Center, Houston, Texas
Thursday 7 June 2012

Ron Dumont, Ed.D., NCSP
Professor, Director: School of Psychology
Fairleigh Dickinson University
1000 River Road – TWH101
Teaneck, NJ 07666
201-692-2464
[email protected]

John Willis, Ed.D., SAIF
Senior Lecturer in Assessment – Rivier College
419 Sand Hill Road
Peterborough, NH 03458-1616
(603) 924-0993
[email protected]

TALKING POINTS

LD means a disorder in one or more of the basic psychological processes involved in understanding or in using language. You cannot identify an LD without specifying a disorder and showing how it impairs schoolwork.

Severity of LD is not measured by the severity of the weakness in basic process(es). It is measured by the severity of the impact on academic achievement. Real-life academic achievement is often more important than achievement test scores. Careful, diagnostic assessment of achievement is often the core of the evaluation.

The "exclusions" are not as important as most people think, as long as the disorder in basic process(es) and impact on academic achievement have been documented thoroughly and properly. Students may have LD as a disability secondary to another disability, even intellectual handicap.

Global intelligence measured by total IQ scores (GCA, MPC, BCA, etc.) is usually not a helpful construct for understanding the cognitive functioning of students with specific learning disabilities. Official definitions of LD do not define "intellectual ability" even as "intelligence," much less as an "intelligence test score." Cognitive ability factors are usually the most helpful level of analysis. The factor structure of a test for persons with certain, specific disabilities may be very different from the factor structure for the norming sample.

It is almost always better to adopt an appropriate test than to adapt an inappropriate one for a student with a severe disability.
Parents, teachers, and the students themselves make important contributions to the evaluation, and they must be included in the process. Examiners should elicit genuine referral questions, not just "issues" to be answered by the evaluation. Interviews and questionnaires are essential parts of a complete evaluation. To ensure the valuable contributions of parents, teachers, and students, evaluation results must be as clear as possible. Jargon and statistics must be defined very clearly.

The total evaluation must be integrated, which is not achieved with a staple.

LD identification is a professional judgment by a team, not an exercise in arithmetic. However, any arithmetic involved should be accurate.

Tests and scores are not interchangeable. A student's age- and grade-equivalent scores (which are horrible statistics) will not come out in the same rank order as the student's standard scores. Age-based and grade-based norms differ, and often both must be reported. Discontinuities between fall, winter, and spring norms can be dramatic. The same performance yields very different scores on different tests. The same grade equivalent yields different standard scores on different tests.

Reading tests use an extraordinary variety of methods for assessing reading. They are not interchangeable. Often, several reading tests must be used for a complete picture. Reading fluency and study skills are important.

Math tests almost always require limits-testing. Test scores that correctly penalize "math-fact" errors and misread operation signs must not be confused with the reality of the student's math skills.

Writing must be assessed carefully. The best formal written expression tests have many flaws. Writing samples may be needed in addition to tests.

Achievement testing should include detailed descriptions of actual skills, gaps, and weaknesses. It is useless simply to print a table of test scores and describe the scores in words.
Tests that combine separate skills in single scores (e.g., reading decoding and reading comprehension, or math calculation and math applications) are as useful as a second handle on a snow shovel.

Discrepancy formulae are statistical nightmares. The absence of some discrepancy should not be used to exclude children from LD classification. Discrepancies should be thought of as presumptive in nature, not exclusive.

Norms are much more important than most people think. Norms are worse than most people think.

Validity and reliability matter. Validity for specific purposes and reliability over realistic spans of time are rarely documented.

Diagnoses are not political or economic decisions.

Relative and transient weaknesses must be taken seriously.

Examiners must use the best instruments available. Inadequate tests should be used only when they are absolutely necessary and the best existing for the purpose.

All disabilities, including LD, can be seen as mismatches between learning style and instruction. Changes in circumstances can alter the need for special education. Evaluation processes that always or never recommend highly restrictive placements are equally suspect. Fads in diagnosis and treatment must be avoided.

Evaluations must be a careful, thoughtful, thorough process, whether initial or re-evaluations.

Concrete recommendations, individually planned for the student, are an important goal of the evaluation. Stock, boilerplate recommendations are not much help. A useful evaluation with recommendations does not cost much more than a useless one without recommendations. Computers don't write reports. People do.

Evaluations should be individual and humanistic, consider multiple intelligences, reflect reality beyond test scores, and accept the possibility of improvement in areas of weakness as well as potential stability of individual patterns of strengths and weaknesses.
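One reason the discrepancy formulae mentioned above are statistical nightmares is regression toward the mean: a simple IQ-achievement difference overstates discrepancies for high-IQ students and understates them for low-IQ students, while a regressed difference compares achievement with the score predicted from the correlation between the two tests. A minimal sketch, in which the scores and the correlation of .60 are made-up illustrative values (real values must come from the tests' manuals or co-norming data):

```python
# Illustrative sketch of "simple" vs. "regressed" ability-achievement
# discrepancies, with both scores on the IQ metric (mean 100, SD 15).
# The correlation r is an invented example value, not a real statistic.

MEAN = 100.0

def simple_discrepancy(iq: float, achievement: float) -> float:
    """Raw difference between scores; ignores regression toward the mean."""
    return iq - achievement

def regressed_discrepancy(iq: float, achievement: float, r: float) -> float:
    """Difference between achievement predicted from IQ and observed
    achievement; the prediction regresses toward the mean by factor r."""
    predicted = MEAN + r * (iq - MEAN)
    return predicted - achievement

iq, ach, r = 130.0, 110.0, 0.60
print(simple_discrepancy(iq, ach))        # 20.0
print(regressed_discrepancy(iq, ach, r))  # 8.0
```

With r = .60, an IQ of 130 predicts achievement of 118, so the same observed score of 110 yields a 20-point simple discrepancy but only an 8-point regressed one; the choice of formula, not the student, produces the difference.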
There is often a huge gap between the science of LD identification and the social policy involved with the identification of LD. Best practice and educational law are often in conflict. There is often a distinct difference between an evaluation for classification and an evaluation for diagnosis of educational difficulties.

HOW TO BE AN EVEN BETTER EVALUATOR: GETTING MORE FROM EVALUATIONS

I. CONTEXT OF THE EVALUATION

A. Test results must be placed in context, including cautious, skeptical interpretation of:
1. The student's actual, real-life performance in school;
2. The student's educational history;
3. The student's personal, familial, and medical history;
4. Reports and referral questions from therapists, parents, and the student; actively solicit reports and questions;
5. The student's self-reports;
6. Previous evaluations; and
7. Concurrent findings of other evaluators.
8. All data must be integrated, which is not achieved with a staple.

B. Commonalities must be given due weight; disparities must be acknowledged and discussed.

C. The evaluation and preliminary conclusions may be done "blind," but the final decisions must take into account the above considerations.

II. TEST STANDARDIZATION AND NORMS

A. Do not alter standardized tests, especially normed, standardized tests.
1. Obey basal and ceiling rules, and make sure we have them right, since they vary from test to test.
2. Read test instructions and items verbatim; give demonstrations as instructed.
a. Do not coach, help, or teach except as instructed.
b. Do not ad lib.
c. Do not trust our memories.
d. Practice until you can deliver your lines in a smooth, conversational tone.
e. Do not give unauthorized feedback.
f. Adhere to time limits, timed presentations, and timed delays.
3. Learn and practice a test to complete mastery before using it. Obtain qualified supervision when using a new test.
4. If we "test the limits," explain what we did, why, and how.
Make absolutely certain that results of limit-testing cannot possibly be confused with valid scores. Do not test limits in ways that will influence later items, e.g., by providing extra practice.
5. For students with severe or low-incidence disabilities, try to adopt appropriate tests rather than adapt inappropriate ones.
6. Test students in their specific native language (e.g., Puerto Rican vs. Castilian Spanish or American Sign Language vs. Signed English) with tests normed in those languages.
7. Consider consequences of taking subtests out of context.

B. Pay attention to test norms [we are responsible for using appropriate tests].
1. Make sure that tests are normed on a genuinely representative sample:
a. Sufficiently large;
b. Appropriately stratified and randomized (or exhaustive):
i. sexes
ii. geographic regions
iii. racial and ethnic groups
iv. disabilities
v. income and educational levels
vi. other germane variables
vii. interactions of these variables
c. National, international, or appropriate, clearly specified subgroup;
d. Truthfully and completely presented in the test manual;
e. Recent;
f. The appropriate test level for the student;
g. Age-based vs. grade-based norms [we often need both for a complete understanding]; and
h. Consider differences in norms for sexes, races, regions, incomes, and other variables.
2. When the best-normed test in existence for the purpose is not very well normed:
a. Look again.
b. In the report, clearly explain the problems and the probable consequences.

C. Errors in norms tables:
1. Printed norms.
2. Computerized norms.

D. Princess Summerfallwinterspring.

E. Be skeptical of publishers' claims.

III. RELIABILITY

A. Standard error of measurement (SEm).
1. Consistently use 90% (1.65 or 1 2/3 SEm) or 95% (1.96 or 2 SEm) confidence bands, even if it is difficult for you.
2. Explain the meaning of the confidence band clearly.
3. Be certain we understand it ourselves.
4. Believe it; recognize that a test score was obtained once, at a specific time and place.
5. Recognize and explain that the confidence band does not include errors and problems with test administration and conditions.
6. Distinguish between reliability and validity.
7. Distinguish between standard error of measurement (SEm) and standard error of estimate (SEest).
8. Be sure to use the correct confidence band for the appropriate score: raw, W, standard score, percentile rank, etc.

B. Determine (and worry about) how reliability data were obtained [we are responsible for using appropriate tests].
1. How large were the samples? [They are often very small.]
2. Are there reliability data for students similar to the student being tested?
3. Are we using internal consistency measures to estimate test-retest reliability?
4. Are the time intervals comparable to those we are concerned with?

C. Cut-off scores.

D. Be skeptical of publishers' claims.

IV. VALIDITY

A. Validity for what purposes?
B. Validity for what groups?
C. Determine (and worry about) how validity data were obtained [we are responsible for using appropriate tests].
1. How large were the samples? [They are often very small.]
2. Are there validity data for students similar to the student being tested?
3. What were the criterion measures? Are we using a closed circle of validating tests against other very similar tests?
4. Are the time intervals comparable to those we are concerned with?
D. Construct validity.
E. Standard error of estimate (SEest).
F. "Incremental validity."
G. Interpret tests only in ways for which validity has been established.
H. Keep a record of all test data for follow-up, establishing trends and understanding how the test works in the local situation.

V. SCORING

A. Use two straight-edges on the norms table and, if necessary, photocopy the table and draw lines and circle numbers.
B. Check accuracy of tables by inspecting adjacent scores.
C. Read table titles and headings aloud while scoring.
D. Recheck all scores.
E. Check them again.
F. Get someone else to check them.
G. Score by both age and grade norms, if available, and compare the results.
H. Record the student's name and the date on all sheets of paper.
I. Check the student's birthdate and age with the student. Calculate the age correctly by the rules for the particular test.
J. Make sure we (not just business managers) are on publishers' mailing lists.
K. Perform thought experiments with tables, e.g., What if the student had made two lucky or unlucky guesses? What if the student were 6 months older or younger?
L. Record all responses verbatim.
M. Keep raw data for future use.
N. Use consistent notations for correct and incorrect answers, no responses, "I don't know" responses ("I have no clue"), and examiner's questions. Make sure the examinee cannot determine which notations you are making from the number or direction of pencil strokes.
O. Use protractors and templates consistently, carefully, and correctly. If uncertain, have someone check your work.
P. Follow computer-scoring instructions slavishly, including the sequence in which you turn on the CPU and peripheral equipment.
Q. Check accuracy of computer results by occasionally hand scoring.
R. Make sure you have the latest version of the scoring program, that you know of any new glitches in it, and that you have the protocols that go with that version.
S. Understand and clearly explain differences among standard scores, scaled scores, normal curve equivalents, percentile ranks, and other scores.
T. Use age-equivalent ("mental age") and grade-equivalent scores sparingly, if at all; explain them and their myriad limitations clearly; and make sure they have some relationship to reality. Bracket them with 90% or 95% confidence bands, just as you do standard scores.

VI. MEASURES OF COGNITIVE ABILITY

A. What to use.
B. What not to use.
C. When to use them.

VII. MEASUREMENT OF MEMORY

A. What to use.
B. What not to use.
C. When to use them.
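The reliability and scoring points above (confidence bands, SEm vs. SEest, and the differences among standard scores, scaled scores, NCEs, and percentile ranks) reduce to small pieces of arithmetic that are easy to get wrong at the table. A hedged sketch, assuming the conventional metrics (standard scores M = 100, SD = 15; scaled scores M = 10, SD = 3; NCEs M = 50, SD = 21.06); the SEm of 3 points and reliability of .90 are invented examples, and real values belong to the specific test and score:

```python
import math

# Sketch of the psychometric arithmetic behind the scoring and
# reliability points above, under conventional score metrics.
# Example SEm and reliability values are invented, not from any manual.

def z_from_standard(ss: float) -> float:
    """z score from a standard score (mean 100, SD 15)."""
    return (ss - 100.0) / 15.0

def scaled_from_z(z: float) -> float:
    """Scaled score metric: mean 10, SD 3."""
    return 10.0 + 3.0 * z

def nce_from_z(z: float) -> float:
    """Normal curve equivalent metric: mean 50, SD 21.06."""
    return 50.0 + 21.06 * z

def percentile_from_z(z: float) -> float:
    """Percentile rank from the standard normal CDF."""
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def confidence_band(score: float, sem_value: float,
                    z_crit: float = 1.96) -> tuple:
    """95% band (z_crit = 1.96) or 90% band (z_crit = 1.645) around an
    obtained score. Does not cover errors of administration or conditions."""
    return (score - z_crit * sem_value, score + z_crit * sem_value)

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - rxx)."""
    return sd * math.sqrt(1.0 - reliability)

def see(sd: float, reliability: float) -> float:
    """Standard error of estimate (for predicted scores):
    SD * sqrt(1 - rxx**2). Not interchangeable with the SEm."""
    return sd * math.sqrt(1.0 - reliability ** 2)

z = z_from_standard(85.0)                     # -1.0
print(scaled_from_z(z))                       # 7.0
print(round(nce_from_z(z), 1))                # 28.9
print(round(percentile_from_z(z)))            # 16
lo, hi = confidence_band(85.0, 3.0)
print(round(lo, 2), round(hi, 2))             # 79.12 90.88
print(round(sem(15.0, 0.90), 2))              # 4.74
print(round(see(15.0, 0.90), 2))              # 6.54
```

For example, a standard score of 85 corresponds to a scaled score of 7, an NCE of about 29, and the 16th percentile; with an SEm of 3, its 95% band runs from about 79 to 91, and a reliability of .90 on a 15-point SD gives an SEm of about 4.7 but an SEest of about 6.5, which is why the two must not be confused.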
VIII. MEASUREMENT OF LANGUAGE

A. What to use.
B. What not to use.
C. When to use them.

IX. MEASUREMENT OF ACHIEVEMENT

A. Assess all relevant skills with appropriate instruments or procedures.
B. Use tests with sufficient numbers of items.
C. Fill in gaps in skills assessed.
D. Follow up hypotheses.

X. SEVERE DISCREPANCY

A. Between what and what? Hope and experience?
B. Reality vs. test scores.
C. Co-normed tests and linked tests.
D. How to understand and make appropriate choices:
1. Simple difference.
2. Regressed difference.
3. Grade-equivalent differences.
4. NCE differences.

XI. INTERPRETATION OF EVALUATION RESULTS

A. Distinguish clearly between different tests, clusters, factors, subtests, and scores with similar titles.
1. e.g., "Reading Comprehension" is not the same skill on different reading tests.
2. e.g., "Processing Speed" is not the same ability on different intelligence tests.
B. Explain with words and pictures all the statistics we use in our reports.
C. Explain differences between different statistics for different tests that we combine in our reports.
D. Explain the names (e.g., "Below Average") for the statistics we use in our reports.
E. Explain differences between different names for the same scores on various tests that we combine in our reports.
F. Distinguish clearly between findings and implications.
G. Interpretation and recommendations require understanding of the disabilities and the programs, not merely of the tests.
H. Identification of a disability is a reasoned, clinical judgment, not an exercise in arithmetic.
I. Offer specific, detailed recommendations and give a rationale for each.
J. Beware of fads in diagnoses and recommendations, both general and our own.
K. Eschew boilerplate.
L. Use computer reports to help interpret data and plan reports. Do not include or retype the actual printouts in reports.
M. Remember that students' skills in related areas may differ dramatically and unexpectedly.
N. Use tests that distinguish between skills rather than lumping them together. For example, a combined reading score based on both oral reading and reading comprehension is about as useful as a second handle on a shovel.
O. A full scale, composite, or total IQ score is not kismet.
P. If it happens, it must be possible.
Q. Resist pressures to lie.
R. Explain the mechanism of the disability.
1. For example, a specific learning disability is a disorder in one or more of the basic psychological processes involved in understanding or in using language, spoken or written, which may manifest itself in an imperfect ability to listen, speak, think, read, write, spell, or do mathematical calculations. So tell us about the student's disorder(s).
2. Similarly, a serious emotional disturbance must be based on a psychological disorder. So specify, define, and explain the disorder, not just the behaviors.
S. Report genuinely germane observations from test sessions, but be clear (in our own minds as well as in our reports) that behavior in a test session may be unique to that test session and may never be seen in any other context.
T. Pay attention to our own observations. If we cite the student's boredom or fatigue, we should not hit the Autotext button for "Test results are assumed to be valid." Explain why we did not use more and shorter test sessions. [Why didn't we?]
U. There is no such thing as a routine triennial reevaluation.
V. Consider practice effects when tests are readministered. Consider differential practice effects on different subtests. How many WISCs are too many?
W. Severity of an educational disability is measured by its impact on school functioning and achievement, not by scores on diagnostic tests.
X. Evaluate the entire pattern of the student's abilities, not merely weaknesses.
Y. Revisit the verbatim record of the student's actual responses before accepting "canned" interpretations from the manual, handbook, or computer printout.
For instance, WISC-IV Comprehension measures Social Studies achievement as well as "social comprehension," Picture Completion almost never measures the "ability to distinguish essential from nonessential details," and young children can earn high scores for "verbal abstract reasoning" with entirely concrete responses to Similarities.

Z. Base conclusions and recommendations on multiple sources of convergent data (not just test scores).

All of this information is either taken or adapted from Table 2.1, pp. 32-41, in D. P. Flanagan, K. S. McGrew, & S. O. Ortiz (2000), The Wechsler Intelligence Scales and Gf-Gc Theory: A Contemporary Approach to Interpretation (Boston: Allyn & Bacon), which was slightly changed from Table 1-1, pp. 15-19, in K. S. McGrew & D. P. Flanagan (1998), The Intelligence Test Desk Reference (ITDR): Gf-Gc Cross-Battery Assessment (Boston: Allyn & Bacon). "Most all definitions were derived from Carroll [J. B. Carroll (1993) Human Cognitive Abilities: A Survey of Factor-Analytic Studies (Cambridge, Eng.: Cambridge University Press)]. Two-letter factor codes (e.g., RG) are from Carroll (1993a). Information in this table was adapted from McGrew [K. S. McGrew (1997), Analysis of the major intelligence batteries according to a proposed comprehensive Gf-Gc framework in D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (1997), Contemporary Intellectual Assessment: Theories, Tests, and Issues (New York: The Guilford Press)] with permission from Guilford Press" (Flanagan, McGrew, & Ortiz, 2000, p. 41).

For the most recent information, please see D. P. Flanagan, S. O. Ortiz, & V. Alfonso (2007), Essentials of Cross-Battery Assessment (2nd ed.) (New York: Wiley & Sons); D. P. Flanagan, S. O. Ortiz, V. Alfonso, & J. T. Mascolo (2006), Achievement test desk reference (ATDR-II): A guide to learning disability identification (2nd ed.) (New York, NY: Wiley); and N. Mather and L.
Jaffe (2002), Woodcock-Johnson III: Recommendations, reports, and strategies (New York, NY: Wiley).

This information is provided only as an illustrative guide to information in the publications cited above and cannot stand alone, so do not copy this information. You will need the explanations and worksheets in at least one of the cited references. Not all of these classifications may agree with those in the cited sources, so – again – it is absolutely necessary to use one or more of those sources.

Fluid Intelligence Gf [Note that this is reasoning, not math knowledge, Gq.]
  General Sequential Reasoning (RG) [deduction]: KAIT Logical Steps; WJ III Analysis-Synthesis; LIPS-R Picture Context; LIPS-R Visual Coding; UNIT Cube Design
  Induction (I): Raven's Progressive Matrices; WAIS/WISC Matrix Reasoning; DAS Matrices; DAS Picture Similarities; DAS Sequential and Quantitative Reasoning (also RQ); KAIT Mystery Codes; WJ III Concept Formation; WJ III Number Matrices; LIPS-R Classification; LIPS-R Design Analogies; LIPS-R Repeated Patterns; LIPS-R Sequential Order; RIAS Odd-Item Out
  Quantitative Reasoning (RQ): DAS Sequential & Quantitative Reasoning (also I)
  Piagetian Reasoning (RP)
  Speed of Reasoning (RE)

Comprehension/Knowledge Gc [VL emphasizes vocabulary over broader language development.]
  Language Development (LD): OWLS Oral Expression; WISC/WAIS/WPPSI Comprehension (also K0); WISC/WAIS/WPPSI Similarities (also VL); DAS Similarities; DAS Verbal Comprehension (also LS); RIAS Guess What (also K0 and VL); RIAS Verbal Reasoning (also VL); RIAS Verbal Memory (also Gsm MS); WJ III Memory for Sentences (also Gsm MS); WRAML Story Memory (also Glr MM)
  Lexical Knowledge (VL): EVT (also LD); EOWPVT, CREVT; PPVT-III (also K0 and LD), ROWPVT, CREVT; WISC/WAIS/WPPSI Vocabulary (also LD); DAS Word Definitions (also LD); DAS Naming Vocabulary (also LD); WJ III Picture Vocabulary (also K0); WJ III Verbal Comprehension (also LD); RIAS Guess What (also K0 and LD); RIAS Verbal Reasoning
  Listening Ability (LS): OWLS Listening Comprehension; WJ III Oral Comprehension; NEPSY Comprehension of Instructions
  General (verbal) Information (K0): WISC/WAIS/WPPSI Information; WJ III General Information; WISC/WAIS/WPPSI Comprehension (also LD); WISC/WAIS/WPPSI Picture Completion (also Gv CF); RIAS What's Missing (also Gv CF); PPVT-III (also VL and LD); WJ III Picture Vocabulary (also VL); RIAS Guess What (also LD and VL)
  Information about Culture (K2): KAIT Famous Faces; K-ABC Faces and Places; WJ-R Humanities
  General Science Information (K1): WJ-R Science
  Geography Achievement (A5): WJ-R Social Studies
  Communication Ability (CM)
  Oral Production and Fluency (OP)
  Grammatical Sensitivity (MY)
  Foreign Language Proficiency (KL)
  Foreign Language Aptitude (LA)
  CHC Theory and Cross-Battery Knowledge (CHC CBA)

Visual Processing Gv [The current versions of CHC theory do not handle drawing tests very well. Carroll (1993) did not have many, if any, drawing tests (e.g., VMI, Bender) in his massive data set.]
  Spatial Relations (SR): WISC/WAIS/WPPSI Block Design (also Vz); DAS Pattern Construction; LIPS-R Figure Rotation (also Vz); WRAML Visual Learning (also MV & Gsm MS); WJ III Spatial Relations (also Vz); WISC/WAIS Object Assembly (also CS)
  Visual Memory (MV): WRAML Finger Windows (also Gsm MS); WRAML Visual Learning (also SR & Gsm MS); DAS Recall of Designs; DAS Recognition of Pictures; KAIT Memory for Block Designs; WJ III Picture Recognition; LIPS-R Immediate Recognition; LIPS-R Forward Memory; WRAML Picture Memory; WRAML Design Memory; RIAS Nonverbal Memory
  Visualization (Vz): WPPSI-R Geometric Design (also P2); LIPS-R Form Completion (also SR); LIPS-R Matching; LIPS-R Paper Folding; LIPS-R Figure Rotation (also SR); WISC/WAIS/WPPSI Block Design (also SR); VMI-5 (also P2?); Bender Visual-Motor Gestalt (also P2?); DAS Block Building; DAS Matching Letter-Like Forms; WJ III Spatial Relations (also SR); WJ III Block Rotation
  Closure Speed (CS): WPPSI/WAIS Object Assembly (also SR); WJ III Visual Closure
  Flexibility of Closure (CF): LIPS-R Figure-Ground; RIAS What's Missing (also Gc K0); WISC/WAIS/WPPSI Picture Completion (also Gc K0)
  Spatial Scanning (SS): WPPSI Mazes; WISC-III PI Elithorn Mazes; Porteus Mazes
  Serial Perceptual Integration (PI): K-ABC Magic Window
  Length Estimation (LE)
  Perceptual Illusions (IL)
  Perceptual Alternations (PN)
  Imagery (IM): TAT

Auditory Processing Ga [Current versions of CHC theory need more sophistication in dealing with phonological awareness.]
  Phonetic Coding: Analysis (PC:A): GFW Auditory Discrimination; GFTA Test of Articulation; CTOPP; WJ III Incomplete Words; WJ III Sound Awareness
  Phonetic Coding: Synthesis (PC:S): CTOPP; WJ III Sound Blending
  Speech Sound Discrimination (US): WJ III Auditory Attention (also U3); WJ III Sound Patterns - Voice (also U3); Wepman Auditory Discrimination Test
  Resistance to Auditory Stimulus Distortion (UR): WJ III Auditory Attention and GFW Selective Attention
  Memory for Sound Patterns (UM)
  General Sound Discrimination (U3): WJ III Auditory Attention (also US)
  Temporal Tracking (UK)
  Musical Discrimination & Judgment (U1, U9): WJ III Sound Patterns - Music; Seashore Music Appreciation Test
  Maintaining and Judging Rhythm (U8)
  Sound-Intensity/Duration Discrimination (U6)
  Sound-Frequency Discrimination (U5)
  Hearing & Speech Threshold Factors (UA, UT, UU)
  Absolute Pitch (UP)
  Sound Localization (UL)

Short-Term Memory Gsm
  Memory Span (MS): WISC/WAIS/CTOPP/TAPS/etc. Digit Span; DAS Recall of Digits; K-ABC Number Recall; K-ABC Word Order; WJ III Memory for Words; WRAML Number-Letter Memory; WJ III Memory for Sentences (also Gc LD); WRAML Sentence Memory (also Gc LD); WRAML Finger Windows (also Gv MV); WRAML Visual Learning (also Gv SR & MV); RIAS Verbal Memory (also Gc LD)
  Working Memory (MW): WISC/WAIS Letter-Number Sequencing; WJ III Numbers Reversed; WJ III Auditory Working Memory
  Learning Abilities (L1)

Long-Term Storage and Retrieval Glr [Glr involves the ability to store and retrieve information, not the amount of information that is stored (Gc). Glr tasks involve controlled learning. General information tests, for example, cannot distinguish information that was forgotten from information that was never known at all.]
  Associative Memory (MA): KAIT Rebus Learning and Delayed Recall; WJ III Memory for Names and Delayed Recall; WJ III Visual-Auditory Learning (also MM); WJ III Visual-Auditory Learning Delayed (also MM); LIPS-R Delayed Recognition; LIPS-R Associated Pairs (also MM); LIPS-R Delayed Pairs (also MM); WRAML Sound-Symbol
  Meaningful Memory (MM): WRAML Story Memory (also Gc LS); WJ III Story Memory and Delayed Recall; WMS-III Logical Memory II; CMS Stories 2; WJ III Visual-Auditory Learning (also MA); WJ III Visual-Auditory Learning Delayed (also MA); LIPS-R Associated Pairs (also MA); LIPS-R Delayed Pairs (also MA)
  Free Recall Memory (M6): DAS Recall of Objects; WRAML Verbal Learning
  Ideational Fluency (FI): WJ III Retrieval Fluency
  Associational Fluency (FA)
  Expressional Fluency (FE)
  Naming Facility (NA): WJ III Rapid Picture Naming; CTOPP
  Word Fluency (FW)
  Figural Fluency (FF): NEPSY Design Fluency
  Figural Flexibility (FX)
  Sensitivity to Problems (SP)
  Originality/Creativity (FO)
  Learning Abilities (L1)

Processing Speed Gs
  Perceptual Speed (P): WISC/WAIS Symbol Search (also R9); WJ III Visual Matching (also R9); WJ III Cross Out; LIPS-R Attention Sustained
  Rate of Test Taking (R9): WISC/WAIS Digit Symbol-Coding; WJ III Pair Cancellation; WISC Symbol Search (also P); WJ III Visual Matching (also P); DAS Speed of Information Processing (also Gt R7)
  Number Facility (N)

Decision/Reaction Time or Speed Gt
  Simple Reaction Time (R1)
  Choice Reaction Time (R2)
  Semantic Processing Speed (R4)
  Mental Comparison Speed (R7): DAS Speed of Information Processing (also R9); WJ III Decision Speed
  Speed of Making Errors (RXXXXXX)

Quantitative Knowledge Gq
  Mathematical Knowledge (KM)
  Mathematical Achievement (A3)

Reading/Writing Grw
  Reading Decoding (RD)
  Reading Comprehension (RC)
  Verbal (printed) Language Comprehension (V)
  Cloze Ability (CZ)
  Spelling Ability (SG)
  Writing Ability (WA)
  English Usage Knowledge (EU)
  Reading Speed (RS)

Please see the attached grids showing characteristics of some commonly used achievement tests. The performance characteristics of achievement tests are especially important. For example, an untimed reading comprehension test with only one or two sentences per item may give very different results from a timed reading test with several sentences per item, and a test of writing a story may yield a very different impression than one of writing individual sentences.

Flanagan, D. P., Ortiz, S. O., Alfonso, V., & Mascolo, J. T. (2006). Achievement test desk reference (ATDR-II): A guide to learning disability identification (2nd ed.). Hoboken, NJ: Wiley.
A serious treatment of achievement tests from the standpoint of CHC theory (a computerized version of the book's CHC test classifications is available for download at http://www.crossbattery.com/ and at http://sites.google.com/site/dumontwillisontheweb/home/files).

References

Alessi, G. (1988). Diagnosis diagnosed: A systematic reaction. Professional School Psychology, 3(2), 145-151.
A provocative and important article. Why do we diagnose children as LD when there are other factors that should be explored? Examiners readily acknowledge in theory, but almost never cite in practice, such causes of underachievement as poor instruction, defective school management policies, or inadequate curriculum.

Bracken, B. A. (1988). Ten psychometric reasons why similar tests produce dissimilar results. Journal of School Psychology, 26(2), 155-166.
If you ever administer more than one test to a child, you have discovered that not all tests give the same results. Dr. Bracken explains 10 of the most common reasons for such dissimilar results.

Breaux, K. C., & Frey, F. E. (2009). Assessing writing skills using correct–incorrect word sequences: A national study. Poster session, National Association of School Psychologists Conference. Retrieved from http://psychcorp.pearsonassessments.com/hai/images/products/wiat-iii/WIAT-III_NASP_Poster.pdf

Brunnert, K. A., Naglieri, J. A., & Hardy-Braz, S. T. (2008). Essentials of WNV assessment. Hoboken, NJ: Wiley.

Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge, Eng.: Cambridge University Press.

Carroll, J. B. (1997). The three-stratum theory of cognitive abilities. In D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporary intellectual assessment (ch. 7, pp. 122-130). New York, NY: Guilford Press.

Cheramie, G. M., Goodman, B. J., Santos, V. T., & Webb, E. T. (2007). Teacher perceptions of psychological reports submitted for emotional disturbance eligibility.
Journal of Education and Human Development, 1(2). Article accessed through www.scientificjournals.org. Cheramie, G. M., Parks, L., & Schuler, A. (2011). Math problem-solving: Applying a processing model to LD determination. In N. Mather & L. E. Jaffe (Eds.) Comprehensive evaluations: Case reports for psychologists, diagnosticians, and special educators (pp. 356-371). New York, NY: Wiley. Cheramie, G. M., Stafford, M. E., & Mire, S. S. (2008) The WISC-IV General Ability Index in a non-clinical sample. Journal of Education and Human Development, 2(2). Article accessed through www.scientificjournals.org. Cohen, S. A. (1971). Dyspedagogia as a cause of reading retardation: Definition and treatment. In B. Bateman (Ed.), Learning Disorders (Vol. 4, pp. 269-291). Seattle, WA: Special Child. Dehn, M. J. (2006). Essentials of processing assessment. New York: Wiley. Dumont, R., Willis, J. O., & Elliott, C. D. (2008). Essentials of DAS-II assessment. Hoboken, NJ: Wiley. Dumont, R., Willis, J, & McBride, G. (2001). Yes, Virginia, there is a severe discrepancy clause, but is it too much ado about something? The School Psychologist, APA Division of School Psychology, 55 (1), 1, 4-13, 15. Elliott, C. D. (2007). Differential Ability Scales 2nd edition introductory and technical handbook. San Antonio: The Psychological Corporation. A valuable text on cognitive assessment in general, not only the DAS-II. There is a brief appendix with a clear, concise, helpful explanation of Item Response Theory. Embretson, S. E., & Hershberger, S. L. (Eds.) (1999). The new rules of measurement: What every psychologist and educator should know. Mahwah, NJ: Lawrence Erlbaum. Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum. Farrall, M. L. (2012). Reading assessment: Linking language, literacy, and cognition. New York, NY: Wiley. The definitive text on reading assessment. Don't assess reading without it. Fiorello, C. A., Hale, J. 
B., McGrath, M., Ryan, K., & Quinn, S. (2002). IQ interpretation for children with flat and variable test profiles. Learning and Individual Differences, 13, 115-125.
Flanagan, D. P. (2001). Assessment Service Bulletin Number 1: Comparative features of major intelligence batteries: Content, administration, technical features, interpretation, and theory. Itasca, IL: Riverside Publishing.
Flanagan, D. P., & Harrison, P. L. (Eds.) (2012). Contemporary intellectual assessment: Theories, tests, and issues (3rd ed.). New York: Guilford Press. Thirty-six chapters of authoritative theoretical and practical information on the history of intelligence assessment; CHC (Gf-Gc), PASS, triarchic successful intelligence, and multiple-intelligences theories; newest Wechsler scales, Differential Ability Scales 2nd ed., Kaufman Assessment Battery for Children 2nd ed., Stanford-Binet 5th ed., Woodcock-Johnson III Normative Update, Das-Naglieri Cognitive Assessment System, Universal Nonverbal Intelligence Test, Reynolds Intellectual Assessment Scales, and NEPSY-II; 13 contemporary interpretive approaches and their relevance for intervention; cognitive assessment in different populations; and contemporary and emerging issues. Many of the chapters are written by the authors of the tests and developers of the theories.
Flanagan, D. P., & Kaufman, A. S. (2009). Essentials of WISC-IV assessment (2nd ed.). Hoboken, NJ: Wiley.
Flanagan, D. P., Ortiz, S. O., & Alfonso, V. (2007). Essentials of cross-battery assessment (2nd ed.). Hoboken, NJ: Wiley. See also http://www.crossbattery.com/. The word on the McGrew, Flanagan, and Ortiz integrated cross-battery method and on practical application of CHC theory to assessment.
Flanagan, D. P., Ortiz, S. O., Alfonso, V., & Mascolo, J. T. (2006). Achievement test desk reference (ATDR-II): A guide to learning disability identification (2nd ed.). New York, NY: Wiley.
A serious treatment of achievement tests from the standpoint of CHC theory (a computerized version of the book's CHC test classifications is available for download at http://www.crossbattery.com/ and at http://sites.google.com/site/dumontwillisontheweb/home/files).
Floyd, R. (2002). The Cattell-Horn-Carroll (CHC) Cross-Battery Approach: Recommendations for school psychologists. Communiqué, 30(5), 10-14.
Floyd, R. G., Evans, J. J., & McGrew, K. S. (2003). Relations between measures of Cattell-Horn-Carroll (CHC) cognitive abilities and mathematics achievement across the school-age years. Psychology in the Schools, 40(2), 155-171.
Goldman, J. J. (1989). On the robustness of psychological test instrumentation: Psychological evaluation of the dead. In G. G. Ellenbogen (Ed.), The primal whimper: More readings from the Journal of Polymorphous Perversity. New York: Ballantine, Stonesong Press. Essential information on testing the dead. See http://alpha.fdu.edu/~dumont/psychology/McGee.htm
Hale, J. B., & Fiorello, C. A. (2001). Beyond the academic rhetoric of 'g': Intelligence testing guidelines for practitioners. The School Psychologist, 55, 113-117, 131-135, 138-139.
Hale, J. B., & Fiorello, C. A. (2004). School neuropsychology: A practitioner's handbook. New York: The Guilford Press. Readable and valuable for evaluators who are not neuropsychologists or neurosurgeons.
Hale, J. B., Hoeppner, J. B., & Fiorello, C. A. (2002). Analyzing Digit Span components for assessment of attention processes. Journal of Psychoeducational Assessment, 20(2), 128-143.
Horn, J. L., & Blankson, N. (2005). Foundation for better understanding of cognitive abilities. In D. P. Flanagan & P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (2nd ed., pp. 41-68). New York: Guilford Press.
International Reading Association (1982). Misuse of grade equivalents: Resolution passed by the Delegates Assembly of the International Reading Association, April 1981.
Reading Teacher, January, p. 464.
Jaffe, L. E. (2009). Development, interpretation, and application of the W score and the relative proficiency index (Woodcock-Johnson III Assessment Service Bulletin No. 11). Rolling Meadows, IL: Riverside Publishing. Retrieved from http://www.riverpub.com/products/wjIIIComplete/pdf/WJ3_ASB_11.pdf
Kaufman, A. S. (2009). IQ testing 101. New York: Springer Publishing. A wonderfully accessible but informative review of the history and practice of intelligence testing. Recommended for both evaluators and their spouses who work in other fields of endeavor.
Kaufman, A. S., & Kaufman, N. L. (Eds.) (2001). Specific learning disabilities and difficulties in children and adolescents: Psychological assessment and evaluation. Cambridge, England: Cambridge University Press.
Kaufman, A. S., Lichtenberger, E. O., Fletcher-Janzen, E., & Kaufman, N. L. (2005). Essentials of KABC-II assessment. Hoboken, NJ: Wiley.
Lichtenberger, E. O., Mather, N., Kaufman, N. L., & Kaufman, A. S. (2004). Essentials of assessment report writing. New York, NY: Wiley.
Lichtenberger, E. O., & Breaux, K. C. (2010). Essentials of WIAT-III and KTEA-II assessment. New York, NY: Wiley.
Lichtenberger, E. O., & Kaufman, A. S. (2009). Essentials of WAIS-IV assessment. New York, NY: Wiley.
Lyman, H. B. (1997). Test scores and what they mean (6th ed.). Boston: Allyn and Bacon. Tremendously valuable, tightly focused, very readable discussion of the statistics used in reporting test scores. Highly recommended.
Mather, N., & Jaffe, L. E. (2004). Woodcock-Johnson III: Reports, recommendations, and strategies (with CD). New York, NY: Wiley. An extraordinarily thorough and helpful treatment of the WJ III, including many very useful forms, sample reports, 158 pages of specific, practical recommendations, and 85 pages of detailed explanations of teaching strategies, which evaluators can (and should) use in their reports.
Even if you never use the WJ III, this book is extremely useful with any assessment interpretation.
Mather, N., & Jaffe, L. E. (Eds.) (2010). Comprehensive evaluations: Case reports for psychologists, diagnosticians, and special educators. New York, NY: Wiley. Fifty-eight sample evaluation reports with commentaries by the authors addressing a wide variety of disabilities and referral concerns. An invaluable resource for anyone writing psychological, neuropsychological, or educational evaluations.
Mather, N., Wendling, B. J., & Roberts, R. (2009). Writing assessment and instruction for students with learning disabilities (2nd ed.). San Francisco: Jossey-Bass. Don't even think of testing or teaching writing skills without this book, which contains detailed, specific, helpful information and advice and 210 figures and exhibits of student writing samples with commentary.
Mather, N., Wendling, B. J., & Woodcock, R. W. (2001). Essentials of WJ III Tests of Achievement assessment. New York, NY: Wiley.
McBride, G. M., Dumont, R., & Willis, J. O. (2004). Response to response to intervention legislation: The future for school psychologists. The School Psychologist, 58(3), 86-91, 93.
McBride, G. M., Dumont, R., & Willis, J. O. (2011). Essentials of IDEA for assessment professionals. New York: Wiley. Don't get sued without it. http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470873922.html http://www.amazon.com/Essentials-IDEA-Assessment-Professionals-Psychological/dp/0470873922
McCallum, S., Bracken, B., & Wasserman, J. (2001). Essentials of nonverbal assessment. Hoboken, NJ: Wiley. General principles, UNIT, LIPS-R, and other instruments. Includes interpretive worksheets.
McGrew, K. S., Flanagan, D. P., Keith, T. Z., & Vanderwood, M. (1997). Beyond g: The impact of Gf-Gc specific cognitive abilities research on the future use and interpretation of intelligence tests in the schools. School Psychology Review, 26, 189-210.
National Research Council (2002).
Mental retardation: Determining eligibility for Social Security benefits. Committee on Disability Determination for Mental Retardation. D. J. Reschly, T. G. Myers, & C. R. Hartel (Eds.). Division of Behavioral and Social Sciences and Education. Washington, D.C.: National Academy Press. http://www.nap.edu/catalog.php?record_id=10295. Free at http://www.nap.edu/openbook.php?record_id=10295&page=1. Despite the apparently narrow focus of the title, this volume is an excellent compendium of recent information about the diagnosis of mental retardation, produced by the 16-member Committee of the National Research Council. There are informative chapters on intellectual assessment, adaptive behavior assessment, the relationship of intelligence and adaptive behavior, and differential diagnosis. Highly recommended for evaluators and teams making identifications of mental retardation or differential identifications of mental retardation and specific learning disabilities.
O'Neill, A. M. (1995). Clinical inference: How to draw meaningful conclusions from tests. New York, NY: Wiley. A valuable antidote to the increasing mechanization and depersonalization of assessment. Highly recommended.
Ortiz, S. O., & Flanagan, D. P. (2002a). Cross-Battery Assessment revisited: Some cautions concerning "Some Cautions" (Part I). Communiqué, 30(7), 32-34. See Floyd (2002), Ortiz & Flanagan (2002b), Watkins, Glutting, & Youngstrom (2002), and Watkins, Youngstrom, & Glutting (2002).
Ortiz, S. O., & Flanagan, D. P. (2002b). Cross-Battery Assessment revisited: Some cautions concerning "Some Cautions" (Part II). Communiqué, 30(8), 36-38. See above.
Pierce, A., Miller, G., Arden, R., & Gottfredson, L. S. (2009). Why is intelligence correlated with semen quality? Biochemical pathways common to sperm and neuron function, and their vulnerability to pleiotropic mutations. Communicative & Integrative Biology, 2(5), 385-387.
Prifitera, A., Saklofske, D. H., & Weiss, L. G. (Eds.) (2008).
WISC-IV: Clinical assessment and intervention (2nd ed.). Burlington, MA: Academic Press (Elsevier). Tons of good information, including norms for the Cognitive Proficiency Index (CPI) as well as the General Ability Index (GAI).
Psychological Corporation (2010). WIAT®-III Essay Composition: "Quick Score" for Theme Development and Text Organization. Retrieved from http://psychcorp.pearsonassessments.com/hai/images/Products/WIAT-III/WIATIII_Quick_Scoring_Guide.pdf Good example of the importance of frequently revisiting Web pages for all the tests we use.
Raiford, S. E., Weiss, L. G., Rolfhus, E., & Coalson, D. (2005/2008). General Ability Index. WISC-IV Technical Report #4 (updated December 2008). San Antonio, TX: Pearson Education. Retrieved September 28, 2009 from http://psychcorp.pearsonassessments.com/NR/rdonlyres/1439CDFE-6980-435F-93DA05888C7CC082/0/80720_WISCIV_Hr_r4.pdf
Reynolds, C. R. (1997). Forward and backward memory span should not be combined for clinical analysis. Archives of Clinical Neuropsychology, 12, 29-40.
Reynolds, C. R., & Fletcher-Janzen, E. (Eds.) (2007). Encyclopedia of special education (3rd ed.). New York, NY: Wiley. Everything you'd want to know.
Roid, G. H., & Barram, R. A. (2004). Essentials of Stanford-Binet Intelligence Scales (SB5) assessment. New York, NY: Wiley.
Sattler, J. M. (2008). Assessment of children: Cognitive foundations (5th ed.). San Diego: Jerome M. Sattler, Publisher. THE book on assessment.
Sattler, J. M., & Dumont, R. P. (2004). Assessment of children: WISC-IV and WPPSI-III supplement. San Diego, CA: Jerome M. Sattler, Publisher.
Sattler, J. M., & Hoge, R. (2006). Assessment of children: Behavioral, social, and clinical foundations (5th ed.). San Diego: Jerome M. Sattler, Publisher. Comprehensive treatment of educational and psychological assessment of children emphasizing particular disabilities, behavior assessment, and other clinical topics. Essential reference, along with the companion Cognitive Foundations.
The valuable appendices include many extremely helpful semistructured interviews.
Sattler, J. M., & Ryan, J. J. (2009). Assessment with the WAIS-IV. San Diego, CA: Jerome M. Sattler, Publisher.
Schrank, F. A., & Flanagan, D. P. (2003). WJ III clinical use and interpretation: Scientist-practitioner perspectives. New York: Academic Press.
Schrank, F. A., Flanagan, D. P., Woodcock, R. W., & Mascolo, J. T. (2001). Essentials of WJ III cognitive abilities assessment. New York, NY: Wiley.
Schultz, M. K. (1988). A comparison of standard scores for commonly used tests of early reading. Communiqué, 17(4), 13. Specific data are obsolete, but the issue remains fresh as the morning dew.
Stanovich, K. E. (1986). Matthew effects in reading: Some consequences of individual differences in the acquisition of literacy. Reading Research Quarterly, 21, 360-407. Important paper frequently misquoted and misinterpreted.
Stanovich, K. E. (2009). What intelligence tests miss: The psychology of rational thought. New Haven: Yale University Press. Provocative and only slightly unfair critique. Worth reading.
Steingart, S. K. (2004). The web-connected school psychologist: The busy person's guide to school psychology on the Internet (2nd ed.). Longmont, CO: Sopris West. Two versions: inexpensive with a really valuable CD and extremely inexpensive without the CD. Companions to Sandy's wonderful Web site http://www.schoolpsychology.net/, all of which we highly recommend. In addition to a huge, tremendously useful collection of thoughtfully categorized and helpfully annotated Internet addresses for resources relevant to school psychology, the book includes a clear, detailed, practical tutorial for taking advantage of the Internet's resources and a glossary. Don't go online without it.
Sternberg, R. J., & Kaufman, S. B. (Eds.) (2011). The Cambridge handbook of intelligence. New York, NY: Cambridge University Press.
Watkins, M. W., Glutting, J., & Youngstrom, E. (2002).
Cross-battery cognitive assessment: Still concerned. Communiqué, 31(2), 42-44. See Floyd (2002), Ortiz & Flanagan (2002a, 2002b), and Watkins, Youngstrom, & Glutting (2002).
Watkins, M. W., Youngstrom, E. A., & Glutting, J. J. (2002). Some cautions regarding Cross-Battery Assessment. Communiqué, 30(5), 16-20. See Floyd (2002), Ortiz & Flanagan (2002a, 2002b), and Watkins, Glutting, & Youngstrom (2002).
Weiss, L. G., Saklofske, D. H., Prifitera, A., & Holdnack, J. A. (Eds.) (2006). WISC-IV: Advanced clinical interpretation. Burlington, MA: Academic Press (Elsevier).
Weiss, L. G., Saklofske, D. H., Coalson, D., & Raiford, S. E. (Eds.) (2010). WAIS-IV: Clinical use and interpretation. Burlington, MA: Academic Press (Elsevier). Includes norms for the Cognitive Proficiency Index (CPI) to contrast with the General Ability Index (GAI).
Wendling, B. J., & Mather, N. (2009). Essentials of evidence-based academic interventions. New York: Wiley.
Willis, J. O., & Dumont, R. P. (2002). Guide to identification of learning disabilities (3rd ed.). Available (paper) from [email protected] and (CD) from [email protected]
Willis, J. O., & Dumont, R. (2006). And never the twain shall meet: Can response to intervention and cognitive assessment be reconciled? Psychology in the Schools, 43(8), 901-908.
Willis, J. O., Dumont, R., & Kaufman, A. S. (2011). Factor analytic models of intelligence. In R. J. Sternberg & S. B. Kaufman (Eds.), The Cambridge handbook of intelligence (pp. 39-57). New York, NY: Cambridge University Press.
Zirkel, P. A. (2009). Legal eligibility of students with learning disabilities: Consider not only RTI but also §504. Learning Disability Quarterly, 32(2), 51-53. Stable URL: http://www.jstor.org/stable/27740356

LOGICAL STEPS IN DETERMINATION OF A SPECIFIC LEARNING DISABILITY

1. Is there a problem with academic performance? Problems may be subtle or difficult to document, but if there are no academic problems at all, there is no educational disability.
[A problem with an important life function other than academic performance might trigger an identification under Section 504 of P.L. 93-112 or the Americans with Disabilities Act (ADA).] Pay close attention to reports of problems that do not cause low marks even though they interfere with learning. For example, the teacher might already be providing an informal program of special education; marks might be based 25% on attendance, 50% on simply turning in homework regardless of quality, and 25% on class participation; or marks might be based on an erroneous perception of the student's academic potential.

A. Does the student have low scores on group or individual achievement tests?
1. Look at any history of test scores. [You may want to use the attached forms for recording past test scores.] Be cautious, though, with tests that are used so frequently that the expected growth from test to retest is less than the 90% confidence band or even the SEm. Check the tables.
2. Look at the pattern of strengths and weaknesses on the test scores. Some group tests offer item analyses. Even though the norm-referenced tests do not function well as criterion-referenced measures, those analyses may contain useful information.

B. Is the student receiving low or failing marks in a class?
1. Again, track the history of class marks. [You may want to use the attached form for recording past marks.]
2. Try to determine the basis for the student's marks. High marks might be based on special marking considerations.

C. Is the student working much too hard or much too long to earn adequate marks?
1. Parents may be the best source of this information. A parent interview is essential. We need to know also what the parents would like to learn from the evaluation.
2. Be sure to interview the student. Sometimes it helps to obtain a copy of the report card and discuss it in detail with the student. What does the student want to learn from the evaluation?
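The 90% confidence band mentioned above can be computed from a test's standard error of measurement. The sketch below is purely illustrative (not part of the handout); the SD, reliability coefficient, and score values are invented for the example, and real values should come from each test's manual.

```python
import math

# Hypothetical illustration: a 90% confidence band around an observed
# standard score, built from the standard error of measurement (SEM).
# SD = 15 and reliability rxx = .90 are assumed example values.

def standard_error_of_measurement(sd, rxx):
    """SEM = SD * sqrt(1 - rxx)."""
    return sd * math.sqrt(1.0 - rxx)

def confidence_band(observed, sd=15.0, rxx=0.90, z=1.64):
    """Approximate 90% band (z of about 1.64) around an observed score."""
    half_width = z * standard_error_of_measurement(sd, rxx)
    return observed - half_width, observed + half_width

low, high = confidence_band(100)
# With SD = 15 and rxx = .90, the SEM is about 4.74, so the band is
# roughly 92 to 108 -- wider than a year's expected growth on some
# frequently readministered tests, which is the caution above.
```

The point of the arithmetic is the one made in the text: when the expected test-to-retest growth is smaller than this band, an apparent gain or loss may be measurement noise.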
D. Is the teacher making extraordinary adaptations or accommodations for the student?
1. Teacher interviews are essential. We need to know what has been done, what is being done, how well those interventions have worked and are working, and what specific things the teachers would like to learn from the evaluation.
2. The classroom observation is often more useful for observing the teaching and the environment than for observing the student.

E. Is there a notably deficient specific area of performance (e.g., tests, homework, note-taking, etc.)?

F. Is there another indication of insufficient academic performance?

2. Has a Problem Solving Model been instituted that involved gathering data on the student's ability to respond to scientific, research-based interventions? The law has moved away from using discrepancy models to identify children with specific learning disabilities. The school is not required to determine whether the child has a severe discrepancy between achievement and intellectual ability to find that the child has a specific learning disability and needs special education services. The school may use response to intervention to determine whether the child responds to scientific, research-based intervention as part of the evaluation process. (Section 1414(b)(6))
1. Have the interventions been validated as scientific and research-based?
2. Did the interventions use rigorous data analyses that were adequate to test the stated hypotheses and justify the general conclusions drawn?
3. Did the interventions employ empirical methods that draw on observation or experiments?
4. Did the interventions use objective procedures that relied on measurements or observational methods that provide reliable and valid data across evaluators and observers, across multiple measurements and observations, and across studies by the same or different investigators?
5. Were the interventions evaluated and accepted by peer-reviewed journals or approved panels of independent experts?

3. Are there one or more disorders in basic psychological processes involved in understanding or in using language, spoken or written? [See http://alpha.fdu.edu/~dumont/psychology/basic_disorders.htm] This step follows next in a logical sequence, but determination of any disorder(s) may not be clear until completion of psychological, educational, speech and language, occupational therapy, physical therapy, vision, hearing, or other evaluations. There should be multiple, convergent confirmations of any disorders.

Can each disorder be observed or inferred from academic performance?
1. Again, we need to consider all aspects and all measures of academic performance.
2. We are looking for possible cause-and-effect relationships between basic processes and academic performance. There needs to be a real-life connection between our hypotheses and what is actually happening with the student's performance in school.

Can each disorder be documented through assessment?
1. Once we have documented the deficient achievement and are looking for possible reasons, it becomes more permissible to use poorly normed and completely informal measures and observations. Formal assessment of ability and achievement levels needs to be done, at least in part, with extremely well-normed, reliable instruments that are valid for their intended purposes, but exploring within the area of deficient achievement may (and sometimes, given the state of the art, must) be done with less statistical rigor. The disorders need to be demonstrated clearly, reliably, and convincingly, but not always as test scores. The severity of a learning disability is measured by the severity of its impact on achievement, not by the severity of any basic-process disorder.
2. The McGrew, Flanagan, and Ortiz Integrated Cattell-Horn-Carroll (CHC) Cross-Battery Approach is a very useful framework for considering many, though not all, basic-process disorders. [See http://www.crossbattery.com/]

3. Can the Team make a logical argument that each identified disorder manifests itself in an imperfect ability to listen, think, speak, read, write, spell, or do mathematical calculations? It is not enough simply to specify deficient achievement and a disorder. There needs to be a logical, cause-and-effect relationship between the two.
A. As noted above, we need to demonstrate how the purported basic-process disorder is impairing the carefully documented achievement area. This demonstration will require a thorough analysis of the student's academic skills. A low test score or low class mark is not enough. We need to show the mechanisms operating in the deficient achievement area(s). Examples of misaligned math problems worked left-to-right and bottom-to-top might, for instance, demonstrate the effect of a visual perception problem on math. The assumption that a visual perception problem impaired listening comprehension might be more difficult to justify unless, for example, we could show how deficient visual imagery was interfering with the listening comprehension.
B. Research evidence can be cited to show relationships between certain basic processes (e.g., phonological abilities or rapid naming) and certain areas of achievement (e.g., reading decoding). [See http://www.crossbattery.com/ for some examples.]
C. Some clearly identifiable disorders have no discernible effect on achievement. Simply finding a disorder does not establish a learning disability (e.g., JOW's severe rhythm disorder greatly impairs his singing, dancing, and clapping in time to music, but the effect on academic achievement is trivial, only diminishing his appreciation of poetry).
D. It is the disorder in the basic psychological process that distinguishes a specific learning disability from the disabilities and disadvantages ruled out in the federal regulations [300.7(c)(10)] for learning disabilities (". . . learning problems that are primarily the result of visual, hearing, or motor disabilities, of mental retardation, of emotional disturbance, or of environmental, cultural, or economic disadvantage").
E. It is essential, as much as possible, to distinguish learning disabilities from dyspedagogia and apedagogia [300.541(1): "The child does not achieve commensurate with his or her age and ability levels in one or more of the areas listed in paragraph (a)(2) of this section, if provided with learning experiences appropriate for the child's age and ability levels"].

[Note that Steps 4 through 6 involve determination of Learning Disabilities based upon the "severe discrepancy model." If the determination is being made without examining aspects of severe discrepancy, go to Step 7.]

4. What is the best estimate of the student's actual intellectual ability? See Mark 4:25. The Team must not allow a disorder to depress estimates of both intelligence and achievement and then mindlessly conclude there is no discrepancy between the two. For example, verbal and visual-spatial learning disabilities, respectively, will depress verbal (Gc) and visual-spatial (Gv) intelligence measures. For another example, a disorder in quantitative knowledge (Gq) would depress the WISC Arithmetic and Verbal IQ scores and DAS Sequential & Quantitative and Nonverbal (fluid) Scale scores. Logically, the intelligence test should be chosen only after the basic-process disorders have been delineated. The McGrew, Flanagan, and Ortiz Integrated Cattell-Horn-Carroll CHC Cross-Battery Approach can be a very useful framework for considering intellectual abilities. [See http://www.crossbattery.com/]
A. Which scales, factors, or subtests on intelligence tests are likely to be depressed by the disorder or disorders?
B. Which intelligence test, scales, or factors would be likely to yield an estimate of actual intellectual ability uncontaminated by the disorder or disorders?
C. What is the best estimate of the student's actual intellectual ability based on those measures?
D. Have we considered at least all of the broad abilities in the McGrew, Flanagan, and Ortiz Integrated Cattell-Horn-Carroll CHC theory? It is not prudent, for example, to use a test, such as the WISC-III, that omits fluid reasoning unless we supplement it with a measure of that ability.

5. Is there a severe discrepancy between the student's level of intellectual ability (4. C.) and the student's achievement in one or more of the following areas? Remember that achievement and even ability may be assessed by means other than test scores (1. B. - 1. F.). Maintain a bias in favor of reality. Achievement tests must be chosen thoughtfully. For example, a very brief achievement test is not a valid measure of academic performance for a student with a short attention span, and an untimed, silent reading test will not pick up problems with reading fluency. Do not obsess over formulae. Some data will not fit formulae. The Team must apply reasoned, professional judgment, not simply indulge in an exercise in arithmetic. By our interpretation of federal law and by most state laws, it is not lawful to deny services to a student who truly has a learning disability simply because of the results of a statistical exercise. A statistical comparison of ability and achievement must use only one set of norms (e.g., national age or grade) [See http://alpha.fdu.edu/~dumont/psychology/age_vs_grade_based_scores.htm] and should use predicted achievement scores rather than simple differences [http://alpha.fdu.edu/~dumont/psychology/Severe_Discrepancy_Discrepancies.htm, http://alpha.fdu.edu/psychology/Determining_predicted_ach.htm, http://www.crossbattery.com/]. Remember that these achievement areas have many components, including, for example, vocabulary or factual knowledge, fluency, and independence. Few, if any, achievement tests cover all aspects of the requisite skills. Do not use tests on which the student receives very low or nearly perfect raw scores; find tests on which the student passes and fails several items [http://alpha.fdu.edu/~dumont/psychology/McGee.htm].
A. oral expression;
B. listening comprehension;
C. written expression;
D. basic reading skill;
E. reading fluency;
F. reading comprehension;
G. mathematics calculation; or
H. mathematics reasoning.

6. Are the discrepancies caused primarily by the disorders? There is absolutely nothing in IDEA to suggest that a student cannot have a learning disability in addition to other disorders. However, the particular discrepancy ("learning problems") in question must not be primarily the result of a vision, hearing, or motor disability, of mental retardation, of emotional disturbance, or of environmental, cultural, or economic disadvantage [300.7(c)(10)(ii)], even if one or more of those disorders or disadvantages may be causing other, separately identified learning problems. For example, a child with mental retardation might also have a specific learning disability in math with extremely low achievement severely discrepant from low predicted achievement because of a disorder in working memory. Similarly, a deaf or blind child might have unusual difficulty learning American Sign Language or Braille because of visual perceptual weaknesses.
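The recommendation to compare achievement with predicted achievement rather than with a simple IQ-minus-achievement difference can be sketched with the standard regression-to-the-mean formula. This is an illustrative sketch only, not the handout's procedure: the ability-achievement correlation (r = .60) and the 1.75-standard-error cutoff below are assumed example values, and actual correlations and cutoffs vary by test pair and by state rules.

```python
import math

# Illustrative regression-based (predicted-achievement) discrepancy check.
# MEAN/SD describe standard scores; r and the cutoff are assumed examples.

MEAN, SD = 100.0, 15.0

def predicted_achievement(ability, r):
    """Regression toward the mean: predicted = mean + r * (ability - mean)."""
    return MEAN + r * (ability - MEAN)

def is_severe_discrepancy(ability, achievement, r, cutoff=1.75):
    """True if achievement falls more than `cutoff` standard errors of
    estimate below the achievement predicted from the ability score."""
    se_estimate = SD * math.sqrt(1.0 - r * r)   # SD * sqrt(1 - r^2)
    return achievement < predicted_achievement(ability, r) - cutoff * se_estimate

# With r = .60, an ability score of 120 predicts achievement of 112, not 120,
# so a simple-difference formula would over-identify high-scoring students
# (and under-identify low-scoring ones).
```

The design point is the one in the text: because ability and achievement are imperfectly correlated, expected achievement regresses toward the mean, and simple-difference formulas systematically misclassify students at both ends of the ability distribution.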
If we have been careful in our identification and analysis of the disorder(s), we should be able to separate them and their effects from the effects of disadvantages and other disabilities.

7. Does the student require special modifications of, or accommodations in, the educational program in order to achieve at levels commensurate with age and ability (4. C.)? Here is the crucial issue for identification under Section 504 or the ADA. The needed accommodations or modifications should be more than we would routinely ask of a teacher of moderate skill, experience, and dedication.

8. Does the student require a uniquely designed program of special instruction in order to achieve at levels commensurate with age and ability (4. C.)? This is the crucial issue for identification of an educational disability. If the student does not require a uniquely designed program of special instruction, but meets the other criteria, the identification should probably be under Section 504 rather than the Individuals with Disabilities Education Act.

Some Useful Assessment Addresses

Best Two Sources we have Found – Always Start Here
http://www.schoolpsychology.net/  Sandra Steingart's School Psych Resources Online
http://www.interventioncentral.org  Jim Wright's Intervention Central

Assessment (approximately alphabetical)
http://alpha.fdu.edu/~dumont/psychology/  Dumont/Willis on the Web
http://cie.ed.asu.edu/  Current Issues in Ed, Arizona State U. online journal
http://dynamicassessment.com  Dynamic Assessment web site
http://edpsychassociates.com/Watkins3.html  Marley Watkins: articles, software, tests, ASCA, etc.
http://facpub.stjohns.edu/~ortizs/spwww.html  Samuel Ortiz's School Psychology Homepage
http://faculty.education.uiowa.edu/dlohman/  papers by David Lohman
http://groups.yahoo.com/group/IAPCHC  IAP list home page – lots of valuable documents
http://learninfreedom.org/iqbooks.html  reviews of many books on cognitive assessment
http://pareonline.net/Home.htm  Practical Assessment, Research, and Evaluation
http://sites.google.com/site/dumontwillisontheweb/home/files  Dumont Excel test templates
http://www.apa.org/science/programs/testing/fair-code.aspx  Code of Fair Testing Practices in Education
http://www.asaif.net/  Assoc. of Specialists in Assmt. of Intell. Functioning
http://www.bartleby.com/141/index.html  Strunk's Elements of Style online
http://www.crossbattery.com/  Official Cross-Battery Web Site
http://www.dynamicassessment.com/  Dynamic Assessment Web Site
http://www.ed.gov/offices/OCR/archives/testing/index1.html  extensive OCR Guide for High-Stakes Testing
http://www.ets.org/disabilities/documentation/documenting_learning_disabilities/  ETS rules for documenting disabilities
http://www.fcrr.org  Florida Center for Reading Research
http://www.iapsych.com/  Institute for Applied Psychometrics (IAP)
http://my.ilstu.edu/~wjschne/  Joel Schneider's information, scoring programs, & tests
http://www.indiana.edu/~intell/  large Web site on intelligence testing
http://www.interventioncentral.org/  Jim Wright's Intervention Central
http://www.iteachilearn.com/cummins/bicscalp.html  BICS vs. CALP, ESL assessment
http://www.ldonline.org/  LD OnLine
http://www.lighthouse.org/  information about visual impairments
http://www.lowvision.org/  Low Vision Gateway – lots of good information
http://www.natd.org/  National Association of Test Directors
http://www.oswego.edu/~mcdougal/web_site_4_11_2005/index.html  graphing tools: McDougal, Clark, & Wilson
http://psychcorp.pearsonassessments.com/hai/images/ageCalculator/agecalculator.htm  Chronological age calculator
http://www.psy.gla.ac.uk/~steve/hawth.html  Steve Draper on Hawthorne, Pygmalion, placebo
http://www.ssa.gov/disability/professionals/bluebook/  Disability Evaluation for Social Security
http://www.sedl.org/reading/rad/  SEDL Reading Assessment Database
http://www.ux1.eiu.edu/~glcanivez/  Gary Canivez: articles, software, etc.
http://www.whatworks.ed.gov/  U.S.D.O.E. What Works Clearinghouse (WWC)
http://www.wordfinding.com/  Word Finding difficulty – Diane German

Information on Specific Tests
http://psychcorp.pearsonassessments.com/haiweb/cultures/en-us/productdetail.htm?pid=015-8338820&Community=CA_Psych_AI_Ability  DAS-II information
http://www.ctcinsight.com/  Insight DVD-administered group CHC ability test
http://www.nwea.org/  Northwest Evaluation Association NWEA MAP
http://www.pearsonassessments.com/HAIWEB/Cultures/en-us/Productdetail.htm?Pid=015-8984-609  WIAT-III
http://psychcorp.pearsonassessments.com/haiweb/cultures/en-us/productdetail.htm?pid=015-8979044&Community=CA_Psych_AI_Ability  WISC-IV Technical Reports, etc.
http://psychcorp.pearsonassessments.com/haiweb/cultures/en-us/productdetail.htm?pid=015-8980808&Community=CA_Psych_AI_Ability  WAIS-IV information
http://www.riverpub.com/products/sb5/resources.html  Stanford-Binet Assessment Service Bulletins
http://www.riverpub.com/products/wjIIIComplete/resources.html  WJ III Assessment Service Bulletins
http://www4.parinc.com/Products/Product.aspx?ProductID=RIAS  RIAS PowerPoint

Law
http://idea.ed.gov/  IDEA one-stop shopping site
http://dww.ed.gov/  U.S. DOE Doing What Works
http://www.ed.gov/News/  U.S. DOE updates
http://www.ed.gov/about/offices/list/ocr/504faq.html  FAQ about Section 504
http://www.ed.gov/offices/OCR/archives/pdf/TestingResource.pdf  OCR on high-stakes testing
http://www.ed.gov/policy/gen/guid/fpco/index.html  Family Policy Compliance Office (FPCO): FERPA
http://www.edpubs.org/webstore/Content/search.asp  ordering page for Dept. of Ed. publications
http://www2.ed.gov/policy/gen/guid/fpco/doc/ferpa-hipaa-guidance.pdf  Joint FERPA & HIPAA Guidance
http://www.tea.state.tx.us/  Texas Education Agency
http://www.wrightslaw.com/  excellent, parent-oriented sped law web site

Achievement Test Grid

Tests with similar purposes and even similar names are not interchangeable. Differences in norms (who was tested, where, and how long ago), content, format, and administration rules (e.g., time limits, bonus points for speed, rules for determining which items are administered to each examinee) can yield very different scores for the same individual on two tests that superficially seem very similar to each other. See, for example, Bracken (1988) and Schultz (1988). Tests that purport to measure the same general ability or skill may sample different component skills. For example, if a student has word-finding difficulties, a reading comprehension test that requires recall of one specific correct word to complete a sentence with a missing word (cloze technique) might be much more difficult than an otherwise comparable reading comprehension test that offers multiple choices from which to select the correct missing word for the sentence (maze technique) or a test with open-ended comprehension questions. Similarly, for a student with adequate reading comprehension but weak short-term memory, the difference between answering questions about a reading passage that remains in view and answering questions after the passage has been removed could make a dramatic difference in scores.
The universal question of students taking classroom tests – "Does spelling count?" – certainly applies to interpretation of formal, normed tests of written expression. Such differences in format make little difference in average scores for large groups of examinees, so achievement-test manuals usually report robust correlations between various achievement tests despite differences in test format and specific skills sampled (see, for example, McGrew, 1999, or the validity section of any test manual). [These issues also apply to tests of cognitive ability, where they also receive insufficient attention.] Examiners need to select achievement tests that measure the skills relevant to the referral concerns and to avoid or carefully interpret tests that measure some ability (such as word-finding or memory) that would distort direct measurement of the intended achievement skill. When selecting or interpreting a test, think about the actual demands imposed on the examinee, the referral concerns and questions, and what you know about your examinee's skills and weaknesses.

The tables below are based on those developed by Sara Brody (2001) and are being corrected over time by Ron Dumont, Laurie Farr Hanks, Melissa Farrall, Sue Morbey, and John Willis. They are only rough summaries, but they at least illustrate the issue of test content and format and may possibly help guide test selection. For extensive information on achievement tests, please see the test manuals, the publishers' Web pages for the tests, the Buros Institute Test Reviews Online, and the Achievement Test Desk Reference (Flanagan, Ortiz, Alfonso, & Mascolo, 2006).
__________________________________
Bracken, B. A. (1988). Ten psychometric reasons why similar tests produce dissimilar results. Journal of School Psychology, 26(2), 155-166.
Brody, S. (Ed.) (2001). Teaching reading: Language, letters & thought (2nd ed.). Milford, NH: LARC Publishing, P.O. Box 801, Milford, NH 03055 (603-880-7691; http://www.larcpublishing.com).
Flanagan, D. P., Ortiz, S. O., Alfonso, V., & Mascolo, J. T. (2006). Achievement test desk reference (ATDR-II): A guide to learning disability identification (2nd ed.). Hoboken, NJ: Wiley.
McGrew, K. S. (1999). The measurement of reading achievement by different individually administered standardized reading tests: Apples and apples, or apples and oranges? IAP Research Report No. 1. Clearwater, MN: Institute for Applied Psychometrics.
Schultz, M. K. (1988). A comparison of standard scores for commonly used tests of early reading. Communiqué, 17(4), 13.
Buros Institute of Mental Measurements (n.d.). Test reviews online. Retrieved from http://buros.unl.edu/buros/jsp/search.jsp

A SAMPLING OF A FEW READING TESTS 1

[The grid on this page compares the following tests: Woodcock-Johnson III,† Oral & Written Language Scales (OWLS-II), Gray Oral Reading Test 5th ed., Gray Silent Reading Test, Wechsler Individual Achievement Test-III, Kaufman Test of Ed. Achievement-II,6† Test of Reading Comprehension 4th ed., Peabody Individual Ach. Test-Rev. NU, Nelson-Denny Reading Test,9 Gates-MacGinitie Reading Tests 4th ed., Slosson Oral Reading Test-Rev,8 Wide Range Achievement Test-4, and Diagnostic Assessments of Reading. Columns cover word-list reading (timed or untimed), nonsense-word reading, oral reading accuracy, reading speed, reading vocabulary, oral- and silent-reading comprehension (open-ended questions, multiple choice, or modified cloze), listening comprehension, spelling, phonemic skills, and age or grade norms. The cell-by-cell entries are not legible in this copy.]

Italics = time limits
mod. cloze = modified cloze method, in which the student orally supplies a missing key word in the passage to demonstrate comprehension.
underscored subtests require the student to answer from memory without the item available for review.

Dumont, Farr Hanks, Farrall, Morbey, & Willis 3/7/12

1 The organization of these tables is borrowed from Table 11.1, p. 308, in Brody, S. (Ed.) (2001). Teaching reading: Language, letters & thought (2nd ed.). Milford, NH: LARC Publishing, P.O. Box 801, Milford, NH 03055 (603-880-7691; http://www.larcpublishing.com). The selection and sequence are entirely arbitrary, whimsical, and based on personal experience. The listing of categories for tests is as accurate as we could make it, but there may be errors. We would be grateful for corrections and suggestions for improvement.
2 Norms in one-month intervals.
3 A wide variety of reading tasks with letters, digraphs, blends, diphthongs, silent –e, r-controlled vowels, syllables, polysyllabic words, etc.
4 Special norms for examinees who cannot read the first (easiest) grade-appropriate passage and must drop back to easier passages. Interpret very carefully.
5 Grade norms are seasonal.
6 Comprehensive Form; the Brief Form combines oral word-list reading and comprehension in one score, which we do not find useful for diagnosis.
‡ Unusual multiple multiple-choice format.
7 For one, the examinee must rearrange sentences in logical order. For the other, the examinee answers 5 multiple-choice questions about each passage.
8 Multiple-choice matching from memory of one of four pictures to each printed sentence after the sentence is removed.
9 The examinee marks the word he or she is reading 60 seconds after the starting time.
10 Norms available for extended time.

A Sampling of Tests Measuring Aspects of Phonological Awareness
Melissa Farrall, Ph.D. & Sara Brody, Ed.D.
TESTS
Test of Auditory-Perceptual Skills – Revised (TAPS-R): Auditory Word Discrimination subtest: identifying whether two words spoken by the examiner are SAME or DIFFERENT.
Test of Phonological Awareness (TOPA): marking pictures of orally presented words that are distinguished by the same or different sound in the word-final position.
Rosner Test of Auditory Analysis Skills (TAAS): Say "cowboy" without the "cow." Say "picnic" without the "pic." Say "cart" without the /t/. Say "blend" without the /bl/.
Lindamood Auditory Conceptualization Test (LAC-3): using colored blocks and felt squares to represent differences or changes in sequences of speech sounds in syllables.
Woodcock-Johnson III (WJ III)‡: Incomplete Words, Sound Blending, Auditory Attention, Auditory Working Memory, Rapid Picture Naming, Word Attack, Spelling of Sounds, Sound Awareness.
Goldman-Fristoe-Woodcock Auditory Skills Test Battery (GFW): many auditory tasks as well as reading and spelling nonsense words. Very, very old norms.
Roswell-Chall Auditory Blending Test: blending sequences of sounds spoken by the examiner.
The Phonological Abilities Test (PAT) by Muter, Hulme, & Snowling
The Phonological Awareness Test 2 (TPAT-2) by Robertson & Salter
Comprehensive Test of Phonological Processing (CTOPP): tests of phonological awareness, memory, & rapid naming.
Kaufman Test of Educational Achievement – II (KTEA-II)‡ by Alan & Nadine Kaufman
Differential Ability Scales II (DAS-II): Phonological Processing and Rapid Naming subtests

[The grid marks which of the following skills each test samples: rapid naming, word discrimination, rhyming, segmentation, isolation, deletion, substitution, blending, and graphemes. The individual check marks are not legible in this copy.]

This table is adapted (stolen) from Brody, S. (Ed.) (2001). Teaching reading: Language, letters & thought (2nd ed.). LARC Publishing, P.O. Box 801, Milford, NH 03055 (603-880-7691; http://www.larcpublishing.com), with extensive revisions by Melissa L. Farrall.
The selection and sequence of tests are entirely arbitrary, whimsical, and based on personal experience. The listing of categories for tests is as accurate as we could make it, but there may be errors. We would be grateful for corrections and suggestions for improvement.
‡ A new edition is in development.

A SAMPLING OF A FEW WRITING TESTS

[The grid on this page compares Oral & Written Language Scales (OWLS-II), Test of Written Language 3rd ed., Test of Written Language 4th ed., Wechsler Individual Ach. Test-III, Woodcock-Johnson III,‡ Peabody Individ. Ach. Test-Rev., Kaufman Test of Ed. Ach.-II,13‡ Test of Written Spelling 3rd ed., Goldman-Fristoe-Woodcock,14 and Wide Range Achievement Test-4. Columns cover spelling of words, spelling of nonsense words, writing vocabulary, writing dictated sentences, writing constrained sentences, editing in context, story writing from an oral or picture prompt (with content, syntax, punctuation, and holistic scores), writing speed, and age or grade norms. The cell-by-cell entries are not legible in this copy.]

Italics = time limits

Dumont, Farr Hanks, Farrall, Morbey, & Willis, 8/1/11. The organization of these tables is borrowed from Table 11.1, p. 308, in Brody, S. (Ed.) (2001). Teaching reading: Language, letters & thought (2nd ed.). LARC Publishing, P.O. Box 801, Milford, NH 03055 (603-880-7691; http://www.larcpublishing.com). The selection and sequence of tests are entirely arbitrary, whimsical, and based on personal experience. The listing of categories for tests is as accurate as we could make it, but there may be errors. We would be grateful for corrections and suggestions for improvement.
* Scores are very strongly influenced by the amount written in 15 minutes.
† The written expression test has several different types of items yielding a total score.
** There is a scoring system provided for analyzing stories and essays written separately from the formal test.
*** Part of the scoring of the Writing Samples test.
11 Norms by one-month intervals.
12 Seasonal grade norms.
13 Comprehensive and Brief Forms.
14 The norms are excessively old, but the Goldman-Fristoe-Woodcock Auditory Skills Test Battery was "way ahead of its time" (Kevin McGrew) and is worth studying.

A SAMPLING OF A FEW MATH TESTS

[The grid on this page compares Kaufman Test of Ed. Ach.-II,15‡ Wechsler Individual Ach. Test-III, Peabody Indiv. Ach. Test-Rev. NU, Woodcock-Johnson III,‡ Wide Range Achieve. Test-4, and KeyMath 3. Columns cover paper-and-pencil computation, mental arithmetic (applications without paper and pencil), applications with paper and pencil, math vocabulary, math fluency, norms for calculator use, norms for corrected misreadings of operations signs, and age or grade norms. The cell-by-cell entries are not legible in this copy.]

Italics = time limits

The organization of these tables is borrowed from Table 11.1, p. 308, in Brody, S. (Ed.) (2001). Teaching reading: Language, letters & thought (2nd ed.). LARC Publishing, P.O. Box 801, Milford, NH 03055 (603-880-7691; http://www.larcpublishing.com). The selection and sequence of tests are entirely arbitrary, whimsical, and based on personal experience. The listing of categories for tests is as accurate as we could make it, but there may be errors. We would be grateful for corrections and suggestions for improvement. Dumont, Farr Hanks, Farrall, Morbey, & Willis 8/1/11.

15 Comprehensive Form; the Brief Form combines computation and applications in one score we do not find useful for diagnosis. A new edition is in development.
16 Seasonal grade norms.
17 Norms at one-month intervals.
‡ A new edition is in development.

Summary of Design Characteristics for Several Cognitive Assessment Batteries

CAS (ages 5 – 17): Full Scale; Planning, Attention, Simultaneous, and Successive scales; 12 (sub)tests (3 per scale); approximately 60 minutes.
CTONI (ages 6 – 18-11): Nonverbal Intelligence Quotient, Pictorial Nonverbal Intelligence Quotient, and Geometric Nonverbal Intelligence Quotient; 6 (sub)tests (3 use pictured objects, 3 use geometric designs); approximately 60 minutes.
DAS-II (ages 2-6 to 17-11): General Conceptual Ability (GCA); Verbal, Nonverbal Reasoning, and Spatial clusters; 20 (sub)tests (10 core, 10 diagnostic); 30 to 60 minutes.
KAIT (ages 11 to 85): Composite IQ; Fluid and Crystallized scales; 10 (sub)tests (4 per scale and 2 memory recall); approximately 40 minutes.
KABC-II (ages 3 to 18): Nonverbal Index, Mental Processing Index or Fluid Crystallized Index; Sequential and Simultaneous Processing, Learning Ability or Long-term Retrieval, Short-term Memory, Fluid Reasoning, Crystallized Ability, and Visual-Spatial Processing; 18 (sub)tests (core or supplementary); 30 to 70 minutes.
K-BIT 2 (ages 4 – 90): IQ Composite; Crystallized (Verbal) and Fluid (Nonverbal) scales; 2 (sub)tests (1 per scale); approximately 20 minutes.
Leiter-R (ages 2 to 20-11): Full Scale; Fundamental Visualization or Spatial Visualization, Fluid Reasoning, Attention, and Memory; 20 (sub)tests (4 reasoning, 6 visualization, 8 memory, 2 attention); 25 to 90 minutes.
RIAS (ages 3 to 90): Composite Intelligence Index; Verbal and Nonverbal Intelligence, Composite Memory Index; 6 (sub)tests (2 Verbal, 2 Nonverbal, 2 memory); 20 to 35 minutes.
SIT-R3 (ages 2 – 90): Crystallized Verbal Intelligence; 6 areas; 10 to 20 minutes.
SB5: Full Scale IQ; Fluid Reasoning, Knowledge, Quantitative Reasoning, Visual-Spatial Processing, and Working Memory; 10 (sub)tests (5 Verbal, 5 Nonverbal); approximately 60 minutes.
TONI-3 (ages 6 to 90): Nonverbal Intelligence; 1 test; 15 to 20 minutes.
UNIT (ages 5 – 17): Full Scale IQ; Memory, Reasoning, Symbolic, and Nonsymbolic Quotients; 6 (sub)tests; 15 to 45 minutes.
WASI-II (ages 6 – 89): Full Scale-4 and Full Scale-2; Verbal Comprehension and Perceptual Reasoning; 4 (sub)tests; 15 to 30 minutes.
WAIS-IV (ages 16 – 89): Full Scale; Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes; 15 (sub)tests (4 verbal comprehension, 5 perceptual reasoning, 3 working memory, 3 processing speed); approximately 70 minutes.
WISC-IV (ages 6 – 16-11): Full Scale IQ; Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes; 15 (sub)tests (5 verbal comprehension, 4 perceptual reasoning, 3 working memory, 3 processing speed); approximately 70 minutes.
WPPSI-IV (ages 2:6 – 7:6): Full Scale; 16 (sub)tests; 30 to 50 minutes.
WJ III (ages 2 – 90): General Intellectual Ability; Long-term Retrieval, Short-term Memory, Processing Speed, Fluid Reasoning, Comprehension-Knowledge, Visual-Spatial Processing, and Auditory Processing; 20 (sub)tests (2 per scale plus 6 supplementary); 40 to 90 minutes.

Note: CAS = Cognitive Assessment System; CTONI = Comprehensive Test of Nonverbal Intelligence; DAS-II = Differential Ability Scales – Second Edition; KAIT = Kaufman Adolescent and Adult Intelligence Test; KABC-II = Kaufman Assessment Battery for Children, Second Edition; K-BIT 2 = Kaufman Brief Intelligence Test, Second Edition; Leiter-R = Leiter International Performance Scale-Revised; RIAS = Reynolds Intellectual Assessment Scales; SIT-R3 = Slosson Intelligence Test Revised-Third Edition for Children and Adults; SB5 = Stanford-Binet Intelligence Scale, Fifth Edition; TONI-3 = Test of Nonverbal Intelligence-Third Edition; UNIT = Universal Nonverbal Intelligence Test; WASI-II = Wechsler Abbreviated Scale of Intelligence – Second Edition; WAIS-IV = Wechsler Adult Intelligence Scale, Fourth Edition; WISC-IV = Wechsler Intelligence Scale for Children, Fourth Edition; WPPSI-IV = Wechsler Preschool and Primary Scales of Intelligence – Fourth Edition; WJ III =
Woodcock-Johnson III Tests of Cognitive Abilities

Measures of Phonological Processing Found on a Selection of Tests

Auditory Discrimination Test – 2nd Ed.: Word Pairs – same/different
The Phonological Awareness Test: Rhyming: discrimination & production; Segmentation: sentences, syllables, phonemes; Isolation: initial, final, & medial sounds; Deletion: syllables & phonemes; Substitution: with and without manipulatives; Blending: syllables & phonemes; Decoding: nonword reading
Comprehensive Test of Phonological Processing: Deletion, Reversal, Blending, Segmenting, Rapid Naming
Diagnostic Achievement Battery-3rd Ed.: Phonemic Analysis subtest: deletion
Wechsler Individual Achievement Test-2nd Ed.: Pseudoword Decoding: nonword reading
Gray Oral Reading Tests-Diagnostic: Decoding; Word Attack
Woodcock Diagnostic Reading Battery: Word Attack: nonword reading; Incomplete Words: closure; Sound Blending
Illinois Test of Psycholinguistic Abilities-3rd Ed.: Sound Deletion: deletion; Rhyming Sequences: rhyming; Sound Decoding: nonword reading; Sound Spelling: nonword spelling
Woodcock-Johnson III Tests of Achievement: Word Attack: nonword reading; Spelling of Sounds: nonword spelling; Sound Awareness: rhyming, deletion, substitution, reversal
Kaufman Test of Educational Achievement-II: Phonological Awareness: rhyming, matching, blending, segmenting, and deleting; Nonsense Word Decoding: nonword reading; Decoding Fluency: speed of nonword reading
Woodcock-Johnson III Tests of Cognitive Abilities: Sound Blending; Incomplete Words: closure; Auditory Attention
NEPSY: Phonological Processing: match phoneme to picture of whole word; Sound-picture: deletion, substitution; Speeded Naming: rapid naming
Woodcock-Johnson III Diagnostic Supplement: Sound Patterns-Voice: word pairs (nonwords) – discrimination
Process Assessment of the Learner: Rhyming; Syllables: segmenting; Phonemes: deletion; Rimes: deletion; Pseudoword Decoding: nonword reading; RAN Words, Digits, Words and Digits: rapid naming
Woodcock Reading Mastery
Tests-Revised (NU): Word Attack: nonword reading
Test of Language Development-3rd Ed. (Primary): Phonemic Analysis: segmenting
Test of Phonological Awareness: Kindergarten – Initial Sound: same/different; Early Elementary – Final Sound: same/different
Test of Phonological Awareness Skills: Rhyming, Closure, Sequencing, Deletion
Test of Word Reading Efficiency: Phonetic Decoding: nonword reading

HISTORY OF REPORT-CARD MARKS

Grade/Year/Semester:
Reading
Writing
Spelling
English
Math
Social Studies/History
Science
Health
Music
Art
Physical Education
Computer
Key:
Notes:
Adapted from Willis, J. O. & Dumont, R. P. (2002). Guide to identification of learning disabilities (3rd ed.). http://alpha.fdu.edu/psychology/

PREVIOUS AND PRESENT INDIVIDUAL ACHIEVEMENT TEST SCORES

Test: Date: Age: Grade: Age (A) or Grade (G) Norms:
Oral Reading: words aloud from a list; reading nonsense words aloud (phonics); reading speed/fluency; accuracy in reading paragraphs aloud
Reading Vocabulary: synonyms; antonyms and synonyms; analogies; antonyms, synonyms, and analogies
Reading Comp: written multiple-choice questions; oral multiple-choice questions; oral open-ended questions; oral words to fill in blanks (cloze); matching pictures to sentences
Spelling: writing dictated words; writing dictated nonsense words; spelling accuracy in story or essay; multiple-choice spelling test
Writing: writing mechanics/style; proofreading; writing sentences according to directions; writing fluency
Story/Essay Writing: vocabulary; syntax/grammar; punctuation/writing mechanics/style; theme/content; holistic assessment
Math: paper-and-pencil computation; applications – oral without paper and pencil; applications – with paper and pencil; orally presented multiple-choice; multiple-choice presented in writing
Notes: * silent reading scores are reported as:

PREVIOUS AND PRESENT GROUP ACHIEVEMENT TEST SCORES

Test: Date: Grade:
Reading Vocabulary
Reading Comprehension
Word Study Skills/Word Analysis
Total Reading
Language Mechanics
Language Expression
Spelling
Total Language
Math Computation
Math Concepts
Math Applications
Total Math
Science
Social Studies
Environment/Science & Soc. Studies
Study Skills
Total Battery
Aptitude/Cognitive Ability Test
Notes:

Ten Top Problems with Normed Achievement Tests for Young Children
Ron Dumont, Fairleigh Dickinson University
John O. Willis, Rivier College

1. There is no agreed-upon preschool or kindergarten curriculum at national, state, and even, in some cases, local levels. It is difficult to sample a curriculum that does not exist. For higher grades, there is at least some commonality among the many curricula at a given grade level. The same skill may be placed at very different levels. See, for example, http://alpha.fdu.edu/~dumont/psychology/WR_vs_LWI.htm, http://alpha.fdu.edu/~dumont/psychology/NO_VS_CALC.htm, and http://alpha.fdu.edu/~dumont/psychology/MR_VS_AP.htm

2. Young children are often inconsistent in their responses, which would argue for more items to increase reliability.

3. But young children have short attention spans and they fatigue easily, which requires fewer items.

4. Sampling works well for a large domain. For example, if a child is expected to have a reading vocabulary of 3,000 words, it is pretty easy to estimate reading skill with a 25-word test. However, if a child is expected to have a reading vocabulary of 10 words, your 25-word test could, by pure chance, easily sample all 10 or none of them, giving an inflated or depressed estimate of the child's reading ability. Similarly, many achievement tests for young children have only a few letter-naming items, rather than 53. If a child knows ten to sixteen letters, a ten-item test could easily hit or miss all of them by pure chance. If you test a child on Monday, and the teacher teaches the vowels on Tuesday, that could be the difference between a score of zero and a score of five on the ten-item test.

5.
Young children develop new skills so rapidly that norms tables should be divided by weeks, not three, four, or six months. The difference between ages 10-0-0 to 10-6-29 and 10-7-0 to 10-11-29 may be trivial, but the difference between 4-0-0 to 4-6-29 and 4-7-0 to 4-11-29 is tremendous.

6. Item format matters a lot more for younger children. Most ten-year-olds don't care whether an addition problem is presented horizontally or vertically, but five-year-olds do. The space between lines on writing paper and the presence or absence of a dotted midline is a deal-breaker for most kindergarten students.

7. Item gradients are necessarily very steep for younger children. There aren't any clearly defined steps between not being able to write the letter M and being able to write it.

8. Norming samples are also a huge problem at the preschool level. If you carefully sample geographic regions, parental education and income, and other germane variables, you can be fairly safe in using whatever public and private schools (in the right proportions) are available. However, at the preschool level, there is a huge difference between the Mary Poppins School of Unfettered Self-Expression and Free Play and the John Stuart Mill Preschool of Relentless Academic Pressure. Low-income kids from the JSMPRAP are likely to score higher on academic achievement tests than rich kids from the MPSUSEFP. A truly representative national sample (especially with only 100 to 300 kids per year of age) is virtually unobtainable.

9. There often is insufficient floor for young children on achievement tests. See, for example, http://alpha.fdu.edu/~dumont/psychology/McGee.htm.

10.
Consequently, formal and informal criterion-based tests (with exhaustive sampling, e.g., all 53 letters, all sums up to ten, etc.), curriculum-based measurement, classroom observations, and work samples are likely to be much more informative than normed tests up to at least a mid-second-grade level of achievement (regardless of actual grade placement).

Bracken, B. A. (1988). Ten psychometric reasons why similar tests produce dissimilar results. Journal of School Psychology, 26(2), 155-166.
Bracken, B. A., & Walker, K. C. (1997). The utility of intelligence tests for preschool children. In D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (pp. 484-502). New York: Guilford Press.
Schultz, M. K. (1988). A comparison of standard scores for commonly used tests of early reading. Communiqué, 17(4), 13.

MNEMONICS FOR FIVE ISSUES IN THE IDENTIFICATION OF LEARNING DISABILITIES TAKEN FROM THE THREE SYNOPTIC GOSPELS OF THE NEW TESTAMENT OF THE KING JAMES VERSION OF THE BIBLE, FIRST BY KEITH STANOVICH AND LATER, IN RESPECTFUL AND SINCERE IMITATION, BY RON DUMONT AND JOHN WILLIS
http://alpha.fdu.edu/~dumont/psychology/mnemonics_for_five_issues.htm

Keith Stanovich's Matthew Effects

Matthew 25:29 "For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken away even that which he hath."

A student who gets off to a slow start in reading for any reason is likely to keep falling farther behind, rather than catching up, as other students continue to get more reading done per unit of time and keep progressing.

Stanovich, K. E. (1986). Matthew effects in reading: Some consequences of individual differences in the acquisition of literacy. Reading Research Quarterly, 21, 360-407.

This is a really important concept! First, a slow start in reading is a regular education emergency. Regular education needs to check progress and needs to intervene promptly.
Second, not all weak readers have specific learning disabilities. It might be a cumulative Matthew effect. They need immediate intervention as soon as the weakness is belatedly discovered, but the intervention may not be special education. Did I mention that this is really important?

The Mark Penalty

Mark 4:25 "For he that hath, to him shall be given: and he that hath not, from him shall be taken even that which he hath."

"In this situation, both measures – 'ability' and 'achievement' – are depressed by the same disorder. Therefore, the distinction between 'achievement and intellectual ability' is rendered meaningless by the contamination of both areas." (Willis, J. O. & Dumont, R. P. (2002). Guide to identification of learning disabilities (3rd ed., pp. 131-132). Peterborough, NH: Authors. http://alpha.fdu.edu/~dumont/psychology)

Not only do blind and deaf children tend to get artificially low full scale or composite scores on mental ability tests, but so do children with visual perception (Gv) and auditory perception (Ga) weaknesses (as well as children with other basic-process weaknesses). Then it falsely appears that they have limited intelligence rather than a specific learning disability.

The Luke Composite Effect

Luke 8:18 "Take heed therefore how ye hear: for whosoever hath, to him shall be given; and whosoever hath not, from him shall be taken even that which he seemeth to have."

Total or composite scores will be more extreme (farther from the mean) than the average of the component scores (unless all of the component scores are perfectly correlated). [See, for example, McGrew, K. S. (1994). Clinical interpretation of the Woodcock-Johnson Tests of Cognitive Ability-Revised. Boston: Allyn and Bacon.] On the WISC-IV, VCI 65, PRI 65, WMI 65, PSI 65 = FSIQ 57, and VCI 134, PRI 135, WMI 135, PSI 136 = FSIQ 144.
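The arithmetic behind the Luke Composite Effect can be sketched with the classical composite-score formula: the z score of a sum of k component scores is divided by the standard deviation of that sum, which grows more slowly than k whenever the components are less than perfectly intercorrelated, so the composite lands farther from the mean than the average component. This is only an illustration, not the Wechsler scoring procedure; the FSIQ values above come from the WISC-IV norm tables, and the average intercorrelation of .50 used here is an assumed, illustrative value.

```python
import math

def composite_standard_score(scores, r_avg, mean=100.0, sd=15.0):
    """Classical-test-theory sketch of a composite standard score.

    scores: component standard scores (mean 100, SD 15)
    r_avg:  assumed average intercorrelation among the components
    The z score of the sum of k components is the sum of the component
    z scores divided by the SD of that sum, sqrt(k + k*(k-1)*r_avg).
    """
    k = len(scores)
    z_sum = sum((s - mean) / sd for s in scores)
    sd_of_sum = math.sqrt(k + k * (k - 1) * r_avg)
    return round(mean + sd * (z_sum / sd_of_sum))

# Four index scores of 65 (each about 2.3 SD below the mean) yield a
# composite noticeably *below* 65 -- more extreme than any component.
low = composite_standard_score([65, 65, 65, 65], r_avg=0.50)
high = composite_standard_score([134, 135, 135, 136], r_avg=0.50)
print(low, high)
```

With the assumed intercorrelation of .50, this sketch gives a composite in the mid-50s for the four 65s and about 144 for the four scores in the mid-130s, close to the tabled WISC-IV values quoted above; the published values differ slightly because they come from empirical norms rather than this formula.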
The Luke Jeopardy

Luke 19:26 "For I say unto you, That unto every one which hath shall be given; and from him that hath not, even that he hath shall be taken away from him."

Students who have one known disability are very likely to have additional disabilities. It is essential not to overlook other possible disabilities and weaknesses, nor to attribute all problems automatically to the initial diagnosis.

The Other Matthew Effect

Matthew 13:12 "For whosoever hath, to him shall be given, and he shall have more abundance: but whosoever hath not, from him shall be taken away even that he hath."

If a student has a major disability, any additional disabilities or weaknesses ("The Luke Jeopardy") are likely to have more severe effects on the student's functional abilities than they would in isolation.

Raw Scores of Zero
Ron Dumont, Fairleigh Dickinson University, & John Willis, Rivier College

We are severely allergic to zero raw scores. Jeri J. Goldman (1989) is the pioneer in the study of zero raw scores. We have tried to contribute to this vital field with the Evaluation of Sam McGee at http://alpha.fdu.edu/~dumont/psychology/McGee.htm.

A zero raw score can sometimes mean that the student lacks an essential skill that is not the intended target of the test. The WJ III test of Editing is a good example, used by the publisher, Riverside, in communications with WJ III purchasers. Editing is intended to measure a specific editing skill in kids who can read. If the student cannot read (especially if it is because nobody has tried to teach this four-year-old how to read yet), then what should have been a test of proofreading skill inadvertently becomes just another reading test.

More important, though, is that zero raw scores near the bottom end of a test's age range can produce insanely high scores (for example, in Goldman's and our dead students). Another problem is that you don't know whether it was a high zero or a low one.
Say, for example, the zero raw score yields a standard score of 65. Was the student almost capable of getting one item correct and leaping to a standard score of 78? Or, if the test had had sufficient bottom, would the student still have gotten a zero, reflecting a true ability at a standard-score level of 20? Sampling error becomes a horrendous problem with zero raw scores. We should worry when scores are based on only a few correct responses, much less zero. Suppose a beginning first grader can read 25 words at sight and has no other word attack skills. By sheer chance, that kid's set of 25 words might include 10 words on the WJ III, 5 on the KTEA-II, and none on the WIAT-III (or any other combination). The closer you are to the bottom of the test, the more sampling error will bite you. Schultz's (1988) comparison of standard scores for commonly used tests of early reading, discussed above, found that different achievement tests yielded widely different scores for the same test performance, e.g., a standard score of 98 on one test and 65 on another. A zero raw score might mean that the student simply missed the whole point of the exercise, even though the student was reasonably competent in the underlying ability that the test was intended to measure. A child might, for example, be able to demonstrate verbal abilities pretty well on an analogies test, but not on a similarities one. Daniel (1986) found, for another example, that block-design tests with flat tiles and with three-dimensional cubes measured the same construct in sixth-grade children. However, the 3-D cubes might confuse some younger children who could have demonstrated their abilities with flat tiles (Elliott, 1990, p. 48). Even if a zero raw score should happen, by accident, to produce a valid standard score, you certainly would not have much data to work with.
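The sight-word example can be pushed one step further. If the words a child happens to know and the words a test happens to sample are treated as draws from a common pool, the overlap follows a hypergeometric distribution, and a zero raw score becomes surprisingly likely. The pool size (500 words) and item count (15) below are invented purely for illustration:

```python
from math import comb

def p_overlap(k, known=25, pool=500, items=15):
    """Probability that exactly k of the test's sampled words fall among
    the child's known words (hypergeometric distribution)."""
    return comb(known, k) * comb(pool - known, items - k) / comb(pool, items)

# Chance this hypothetical 15-item test samples none of the child's
# 25 known words, yielding a zero raw score despite real skill:
print(round(p_overlap(0), 2))  # 0.46 -- nearly a coin flip
```

Under these invented numbers, nearly half of such children would earn a raw score of zero on one test while earning several raw-score points on another test sampling different words, which is exactly the sampling-error problem described above.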
We strongly recommend that, if a score comes within about one SEm of a zero raw score, you find some other measure of that ability that has more bottom.

Daniel, M. H. (1986, April). Construct validity of two-dimensional and three-dimensional block design. Paper presented at the annual convention of the National Association of School Psychologists, Hollywood, FL.

Elliott, C. D. (1990). Differential Ability Scales introductory and technical handbook. San Antonio, TX: The Psychological Corporation.

Goldman, J. J. (1989). On the robustness of psychological test instrumentation: Psychological evaluation of the dead. In G. G. Ellenbogen (Ed.), The primal whimper: More readings from the Journal of Polymorphous Perversity. New York: Ballantine, Stonesong Press.

Schultz, M. K. (1988). A comparison of standard scores for commonly used tests of early reading. Communiqué (Newsletter of the National Association of School Psychologists), 17(4), 13.

Executive Functions as Basic Psychological Processes

For reasons that are not clear to us, people keep emailing us to ask whether "executive functions" qualify as "basic psychological processes" for the purpose of identifying a specific learning disability under IDEA. Although we have already replied to 35 of the 37 people in the English-speaking world who have the slightest interest in our opinion, here it is for the other two.

1. Executive functions are perfectly respectable basic psychological processes. They are basic. They are psychological. They are processes. What more could you ask?

2. "Executive functions" should be written in the plural because there are lots of them, and an individual may have different levels of functioning in different functions.

3. Cognitive ability tests (even ones that tap many different cognitive abilities, such as the Wechslers, DAS, KABC, Stanford-Binet, or WJ) do not measure all or even close to all potential basic psychological processes.
Even the ones designed to produce a profile of strengths and weaknesses focus on tests that primarily measure intelligence and only secondarily measure other processes. A "flat profile" on a test of cognitive ability does not rule out anything (well, OK, it does rule out a jagged profile on that administration of that test). In fact, one reason that prefrontal lobotomies remained popular as long as they did was that lobotomies usually did not make significant differences in IQ scores on the Wechsler and Binet scales! [Goldstein, K. (1950). Prefrontal lobotomy: Analysis and warning. Scientific American, 182(2), 44-47.] When people are constructing an IQ test, one thing they usually do is throw out any candidate subtests that do not have at least reasonably high correlations with psychometric g.[18] A subtest that does not measure intellectual ability usually does not make the cut. [An exception is the DAS-II, which puts high-g-loading subtests in its "Core" battery to compose a General Conceptual Ability (GCA) score, but then adds a bunch of "Diagnostic" subtests that yield valuable information but do not necessarily have particularly high g loadings. The Diagnostic subtests are not included in the GCA. You can do something somewhat similar by separately considering the GAI and CPI scores on the WISC-IV or WAIS-IV.] Cecil Reynolds and others have made the point on the NASP listserv (and in many publications that I cannot cite from memory) that profile analysis on an IQ test is risky at best because the g loadings of the subtests make them contaminated measures of anything else they might be measuring. That is the reason, for example, that an OT may find horrendous weaknesses with visual perception, but the SAIF or psychologist may report only mild problems (if any) on cognitive ability subtests, such as WISC Block Design or DAS Pattern Construction, that are "supposed to" tap visual perceptual skills.
The cognitive ability subtests primarily measure intelligence and only secondarily measure other specific abilities. This issue would argue for using dedicated tests (such as the ones with funny names employed by OTs, SLPs, and neuropsychologists) to measure abilities other than intelligence, and using IQ tests that are tightly focused on measuring g. I agree with the observation, but still prefer to begin my assessment with a profile-producing measure, such as the DAS-II, KABC-II, or Wechsler scales, and then follow up leads with the specialized tests, as recommended by many authorities.

4. The Office of Special Education Programs and the Office of Special Education and Rehabilitative Services (OSEP and OSERS) have repeatedly made it clear that the law does not require documentation of a processing disorder for identification of a specific learning disability. Legally (unless state law requires otherwise), the process weakness may simply be inferred. [McBride, G. M., Dumont, R., & Willis, J. O. (2011). Essentials of IDEA for assessment professionals (pp. 79, 80, 86, 88-92). New York: Wiley.]

5. However, we believe that best practice does call for documentation of a disorder in one or more of the basic psychological processes involved in understanding or in using language, spoken or written, that may manifest itself in the imperfect ability to listen, think, speak, read, write, spell, or to do mathematical calculations. In fact, we believe we need not only to clearly name and define the specific weak process (not merely vaguely calling it some sort of process or executive function disorder, but specifying one or more impaired specific processes such as, for example, spatial perception, auditory working memory, fine-motor coordination, task initiation, multi-tasking, self-monitoring and evaluation, or planning), but also to explain how that weakness impairs achievement.

[18] General, global, overall intelligence. On most IQ tests, the total score (e.g., Wechsler Full Scale IQ) attempts to measure this psychometric g. Factor analytic techniques can yield a more sophisticated estimate of g. For a very readable discussion of these issues, I recommend Alan S. Kaufman's IQ Testing 101 (New York, NY: Springer, 2009, ISBN 978-0-8261-0629-2), which is totally accessible to readers with no background in psychological assessment, but held the attention of, and provided new information for, someone who has been doing IQ testing as a hobby for 43 years.

6. Achievement may be measured by normed test scores. However, it may also, legally and logically, be measured by informal assessments, classroom performance, or academic grades. Normed achievement tests do not even attempt to assess many essential academic skills, such as working through multiple drafts to produce a ten-page essay; reading, comprehending, and remembering a 40-page chapter in a science or history text; or sustaining attention to, understanding, taking notes on, and remembering a 40-minute lecture. High scores on even the best of the normed achievement tests do not rule out possible academic problems.

7. Many of the important executive functions involve activities that last for many minutes, hours, days, weeks, or months. By their nature, most executive functions are issues of consistent, typical performance, not the peak performance targeted by most tests of cognitive ability and academic achievement. There are some useful direct tests of some executive functions, such as the NEPSY-II and the Delis-Kaplan Executive Function System (D-KEFS). Memory tests, such as the TOMAL-2 and WRAML2, include some executive functions. There are also procedures, such as the Rey-Osterrieth Complex Figure, the WISC-IV Integrated, or the Advanced Clinical Solutions for the WAIS-IV and WMS-IV, which can be helpful. You can also learn a lot about a student's executive functions by observing the student's approach to the tasks.
For example, if the student is struggling to correctly orient the last of the identical patterned cubes to copy a Block Design, does the student rotate it once, rotate it again, and then, instead of rotating it once more into the correct orientation, flip the block over to start again from scratch with an identical face? Does the student work very slowly and carefully on the timed Coding subtest, but quickly and impulsively on the untimed, multiple-choice Picture Concepts and Matrix Reasoning subtests? Does the student misunderestimate the difficulty of repeating dictated series of digits and then devote more mental energy to repeating other series backwards, thereby recalling more digits backward than forward? Or does the student fail to learn from experience?

However, we also need to rely on the observations of parents, teachers, and therapists who interact with the student for months or years. Standardized, normed questionnaires allow us to collect and quantify those observations. Most authorities consider it best practice to first use a broad-spectrum questionnaire, such as the BASC-2, and then a more narrowly focused one, such as the Behavior Rating Inventory of Executive Function (BRIEF). The risk of beginning an assessment with a narrowly focused test or questionnaire is Maslow's hammer problem.[19] Many of us recall the widespread use of school readiness tests that identified maturity and immaturity. Although thoughtful examiners knew better and used their knowledge appropriately, the formal test scores perforce classified intellectual disability, cerebral palsy, deafness, blindness, and childhood schizophrenia as "immaturity." It is imprudent to prematurely narrow our choices in the assessment process.

8. Finally, as with any assessment process, we want to understand and explain how the disorder actually impairs academic achievement or performance. We want to understand the mechanism.
For one example, it might be a stretch to attribute poor math fluency to a weakness in planning. The burden of proof would be on the evaluator suggesting that cause-and-effect relationship. However, if the student was given a low rating from parents and an even lower one from teachers on the BRIEF Plan/Organize scale,[20] was observed to plan poorly on written expression and block design tasks in the assessment, and scored low on Elithorn Mazes on the WISC-IV Integrated and planning tasks on the KABC-II, D-KEFS, or NEPSY-II, you might make a pretty strong argument that poor planning ability helped explain the student's confused, incomplete, and late homework; the lack of practice caused by the inadequate homework; and the resulting lack of skill and fluency on tasks the homework was designed to reinforce. You might also be able to demonstrate how poor planning interfered with computation of multiple-step math computation examples and solution of multiple-step math story problems, with writing assignments, and with the planning of experiments in the science lab (including the one that required evacuation of the school last October).

[19] To the man who only has a hammer, everything he encounters begins to look like a nail. – Abraham H. Maslow

[20] "ability to manage current and future-oriented task demands. . . . anticipate future events, set goals, and develop appropriate steps ahead of time to carry out a task or activity. . . . bring order to information and to appreciate main ideas of key concepts"

SCORES NOT USED WITH THE TESTS IN THIS REPORT (GIVEN FOR REFERENCE)

When a new test is developed, it is normed on a sample of hundreds or thousands of people. The sample should be like that for a good opinion poll: female and male, urban and rural, different parts of the country, different income levels, etc. The scores from that norming sample are used as a yardstick for measuring the performance of people who then take the test.
This human yardstick allows for the difficulty levels of different tests. The student is being compared to other students on both difficult and easy tasks. You can see from the illustration below that there are more scores in the middle than at the very high and low ends. Many different scoring systems are used, just as you can measure the same distance as 1 yard, 3 feet, 36 inches, 91.4 centimeters, 0.91 meter, or 1/1760 mile.

PERCENTILE RANKS (PR) simply state the percent of persons in the norming sample who scored the same as or lower than the student. A percentile rank of 50 would be Average – as high as or higher than 50% and lower than the other 50% of the norming sample. The middle half of scores falls between percentile ranks of 25 and 75.

STANDARD SCORES ("quotients" on some tests) have an average (mean) of 100 and a standard deviation of 15. A standard score of 100 would also be at the 50th percentile rank. The middle half of these standard scores falls between 90 and 110.

SCALED SCORES ("standard scores" on some tests) are standard scores with an average (mean) of 10 and a standard deviation of 3. A scaled score of 10 would also be at the 50th percentile rank. The middle half of these scaled scores falls between 8 and 12.

V-SCALE SCORES have a mean of 15 and a standard deviation of 3. A v-scale score of 15 would also be at the 50th percentile rank and in Stanine 5. The middle half of v-scale scores falls between 13 and 17.

T SCORES have an average (mean) of 50 and a standard deviation of 10. A T score of 50 would be at the 50th percentile rank. The middle half of T scores falls between approximately 43 and 57.
[Normal curve illustration: scores pile up in the middle of the distribution, as the percents in the first row show.]

Percent in each range:  2.2%          6.7%     16.1%    50%      16.1%     6.7%     2.2%
Standard Scores:        69 and below  70-79    80-89    90-109   110-119   120-129  130 and up
Scaled Scores:          1-3           4-5      6-7      8-11     12-13     14-15    16-19
V-Scale Scores:         1-8           9-10     11-12    13-16    17-18     19-20    21-24
T Scores:               29 and below  30-36    37-42    43-56    57-62     63-69    70 and up
z-scores:               below -2.00   -2.00 to -1.34   -1.33 to -0.68   -0.67 to 0.66   0.67 to 1.32   1.33 to 1.99   2.00 and up
Percentile Ranks:       2 and below   3-8      9-24     25-74    75-90     91-97    98 and up

Classification labels for those same ranges (lowest to highest):
Wechsler: Extremely Low / Borderline / Low Average / Average / High Average / Superior / Very Superior
WRAT4: Lower Extreme / Low / Below Average / Average / Above Average / Superior / Upper Extreme
DAS & VMI: Very Low / Low / Below Average / Average / Above Average / High / Very High
RIAS: Significantly Below Average / Moderately Below Average / Below Average / Average / Above Average / Moderately Above Average / Significantly Above Average
Stanford-Binet: Moderately Impaired (40-54) and Mildly Impaired (55-69) / Borderline / Low Average / Average / High Average / Superior / Gifted (130-144) and Very Gifted (145-160)

Classification labels for tests using other breakpoints:
Woodcock-Johnson: Very Low (69 and below) / Low (70-79) / Low Average (80-89) / Average (90-110) / High Average (111-120) / Superior (121-130) / Very Superior (131 and up)
Pro-Ed: Very Poor (69 and below) / Poor (70-79) / Below Average (80-89) / Average (90-110) / Above Average (111-120) / Superior (121-130) / Very Superior (131 and up)
OWLS-II: Deficient (69 and below) / Below Average (70-84) / Average (85-115) / Above Average (116-130) / Superior (131 and up)
KTEA-II: Lower Extreme (69 and below) / Below Average (70-84) / Average (85-115) / Above Average (116-130) / Upper Extreme (131 and up)
WIAT-III: Very Low (below 55) / Low (55-69) / Below Average (70-84) / Average (85-115) / Above Average (116-130) / Superior (131-145) / Very Superior (146 and up)
Leiter: Severe Delay (30-39) / Very Low or Moderate Delay (40-54) / Mild Delay (55-69)
Vineland Adaptive Levels: Low (70 and below) / Moderately Low (71-85) / Adequate or Average (86-114) / Moderately High (115-129) / High (130 and up)
PPVT-4: Extremely Low (70 and below) / Moderately Low (71-85) / Average (86-114) / Moderately High (115-129) / Extremely High (130 and up)
CELF-4: Very Low (70 and below) / Low (71-77) / Borderline (78-85) / Average (86-114) / Above Average (115 and up)
Stanines: 1 (73 and below) / 2 (74-81) / 3 (82-88) / 4 (89-96) / 5 (97-103) / 6 (104-111) / 7 (112-118) / 8 (119-126) / 9 (127 and up)

Adapted from Willis, J. O., & Dumont, R. P., Guide to identification of learning disabilities (1998 New York State ed.). Acton, MA: Copley Custom Publishing, 1998, p. 27. Also available at http://alpha.fdu.edu/psychology/test_score_descriptions.htm.

RELATIVE PROFICIENCY INDEXES (RPI) show the examinee's level of proficiency (accuracy, speed, or whatever is measured by the test) at the level at which peers are 90% proficient. An RPI of 90/90 would mean that, at the difficulty level at which peers were 90% proficient, the examinee was also 90% proficient. An RPI of 95/90 would indicate that the examinee was 95% proficient at the same level at which peers were only 90% proficient. An RPI of 75/90 would mean that the examinee was only 75% proficient at the same difficulty level at which peers were 90% proficient.
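All of the score metrics above are linear transformations of the same z-score, percentile ranks follow from the normal curve, and the RPI can be computed from the examinee's W-score distance from the age- or grade-level reference point. A sketch (the Rasch scaling constant of 9.1024 W units per logit follows the Jaffe (2009) bulletin cited below; treat the `rpi` function as our illustration of the idea, not the published scoring algorithm):

```python
import math

# Means and standard deviations for the score metrics defined above.
METRICS = {"standard": (100, 15), "scaled": (10, 3), "T": (50, 10), "v": (15, 3)}

def convert(score, from_metric, to_metric):
    """Re-express a score in another metric via the shared z-score."""
    m1, sd1 = METRICS[from_metric]
    m2, sd2 = METRICS[to_metric]
    return m2 + sd2 * (score - m1) / sd1

def percentile_rank(score, metric="standard"):
    """Percent of the norming sample at or below this score (normal curve)."""
    m, sd = METRICS[metric]
    return 100 * 0.5 * (1 + math.erf((score - m) / sd / math.sqrt(2)))

def rpi(w_diff):
    """Expected proficiency (out of 100) at the difficulty level where average
    age- or grade-peers are 90% proficient, given the examinee's W-score
    distance from that reference point; assumes the Rasch logistic model."""
    odds = 9 * math.exp(w_diff / 9.1024)
    return 100 * odds / (1 + odds)

print(round(convert(70, "standard", "T")))  # 30
print(round(percentile_rank(90)))           # 25
print(round(rpi(0)))                        # 90, i.e., an RPI of 90/90
```

Note how the middle half of standard scores (90-110) maps onto percentile ranks 25-75, as stated above, and how an examinee at exactly the peer reference point gets an RPI of 90/90.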
RPI                Proficiency with Age- or Grade-Level Tasks    Age- or Grade-Level Tasks will be:
100/90             Very Advanced                                 Extremely Easy
98/90 to 100/90    Advanced                                      Very Easy
95/90 to 98/90     Average to Advanced                           Easy
82/90 to 95/90     Average                                       Manageable
67/90 to 82/90     Limited to Average                            Difficult
24/90 to 67/90     Limited                                       Very Difficult
3/90 to 24/90      Very Limited                                  Extremely Difficult
0/90 to 3/90       Negligible                                    Impossible

Adapted from Jaffe, L. E. (2009). Development, interpretation, and application of the W score and the relative proficiency index (Woodcock-Johnson III Assessment Service Bulletin No. 11). Rolling Meadows, IL: Riverside Publishing. http://www.riverpub.com/products/wjIIIComplete/pdf/WJ3_ASB_11.pdf

Random Comments on the DAS-II and WISC-IV

Table 8.10, p. 169 in the DAS-II Introductory and Technical Handbook (Elliott, 2007) shows the WISC-IV and DAS-II scores obtained by 202 children of ages 6-0 through 16-11 tested about three weeks apart (1 to 72 days) with the DAS-II first. The sample is small compared to standardization samples, but fairly large for studies of this kind. Elliott notes (2007, p. 168) that the DAS-II scores are slightly lower than the WISC-IV ones, as would be expected from the Flynn Effect (e.g., Flynn, 1998) and possible learning of test procedures by taking the DAS-II first. I don't know why a counterbalanced order was not used. I would add that the WISC-IV Index score that is highest compared to other WISC-IV Index scores and DAS-II Cluster scores was, as would be expected from the Flynn Effect, Perceptual Reasoning (mean = 106.2). The other WISC-IV Index and DAS-II Cluster scores ranged from 98.8 to 101.2 (FSIQ 103.1). When an individual student had notably different scores on the WISC-III and DAS, the culprit was usually the DAS Nonverbal (fluid) Reasoning Cluster.
In most cases, the WISC-III Verbal Comprehension and DAS Verbal Ability scores would be close to one another, and the WISC-III Perceptual Organization and the DAS Spatial Ability scores would also hang together. A high or low Nonverbal Reasoning Cluster score would usually cause any noteworthy difference between the child's WISC-III FSIQ and DAS GCA scores (see, for example, Dumont, Cruse, Price, & Whelley, 1996). The WISC-IV dropped Picture Completion from the Perceptual Index (which changed from a 4-subtest Perceptual Organization Index to a 3-subtest Perceptual Reasoning Index) and added two subtests that appear to be measures more of fluid reasoning (Gf) than of visual-spatial thinking (Gv) (see Farrall's "Myth of the WISC-III/WISC-IV Retest"). I would speculate that the WISC-IV Perceptual Reasoning Index might align more closely with the DAS-II Special Nonverbal Composite, rather than showing the closer relationship to Spatial Ability than to Nonverbal Reasoning that we saw with the WISC-III and DAS.

Correlations of DAS-II composites with the corresponding WISC-IV Index and with FSIQ, with mean scores, from Elliott (2007, Table 8.10):
Verbal: .73 with VCI; .67 with FSIQ; DAS-II mean 100.0
Nonverbal Reasoning: .71 with PRI; .74 with FSIQ; mean 99.6
Spatial: .72 with PRI; .69 with FSIQ; mean 100.2
Special Nonverbal Composite: .77 with PRI; .78 with FSIQ; mean 100.0
Working Memory: .74 with WMI; .69 with FSIQ; mean 99.0
Processing Speed: .53 with PSI; .49 with FSIQ; mean 100.5
School Readiness: .72 with VCI; .74 with PRI; .53 with WMI; .48 with PSI; .80 with FSIQ; mean 98.8
GCA: .73 with VCI; .80 with PRI; .62 with WMI; .49 with PSI; .84 with FSIQ; mean 99.8
WISC-IV means: VCI 101.2, PRI 106.2, WMI 100.5, PSI 100.0, FSIQ 103.1

The DAS-II Verbal Ability and WISC-IV Verbal Comprehension have very similar word-defining subtests. The greatest difference is the 2-1-0 WISC-IV and 1-0 DAS-II scoring. In the comparison of the two tests (Elliott, 2007), Vocabulary and Word Definitions correlated .64, with the WISC-IV mean scaled score (10.3) only very slightly higher, relative to its mean of 10, than the DAS-II mean T score (49.6) relative to its mean of 50. Both instruments have subtests of explaining how different words can be alike. The DAS-II uses three words and 1-0 scoring for most items (and in Elliott's study was given first), but the WISC-IV uses only two words per item and 2-1-0 scoring for most items. The correlation between the two was .63 with extremely close scores (DAS-II mean T score 50.5 and WISC-IV mean scaled score 10.5). The WISC-IV Verbal Comprehension Index, however, includes a third subtest: Comprehension. In Elliott's (2007) study, the WISC-IV Comprehension had correlations of .39 with the DAS-II Verbal Comprehension, .57 with Naming Vocabulary, .46 with Word Definitions, and .43 with Verbal Similarities. I might anticipate that a divergent Comprehension score on the WISC-IV could cause a notable difference between the Verbal Comprehension Index and the Verbal Ability Cluster. The difference between 2-1-0 and 1-0 scoring could separate WISC-IV and DAS-II verbal scores for some children. Also, some children probably profit from the three words on the DAS-II Verbal Similarities, because they can pass an item even if they do not know one of the words, but other children may be confused by so many choices.

The WISC-IV Working Memory Index appears to be roughly equal parts simple short-term memory (Digits Forward and easier LNS items) and working memory (Digits Backward and higher LNS items). It correlated .74 with the DAS-II Working Memory cluster, which includes mostly working memory tasks: Digits Backward (but not Digits Forward) and Recall of Sequential Order.

Correlations of DAS-II memory subtests (rows) with WISC-IV Digits Forward (DF), Digits Backward (DB), Letter-Number Sequencing (LNS), and the Working Memory Index (WMI):

                            DF    DB    LNS   WMI
Recall of Digits Forward    .56   .46   .39   .61
Recall of Digits Backward   .45   .63   .49   .66
Recall of Sequential Order  .43   .58   .52   .66
Working Memory cluster      .49   .69   .58   .74

The DAS-II Processing Speed includes Rapid Naming as well as the more traditional Speed of Information Processing. The WISC-IV still uses the paper-and-pencil Coding and Symbol Search. However, the verbal-response Rapid Naming has very slightly higher correlations with the paper-and-pencil WISC-IV subtests than does the DAS-II paper-and-pencil Speed of Information Processing.

Correlations of DAS-II speed measures (rows) with WISC-IV Coding (CD), Symbol Search (SS), and the Processing Speed Index (PSI):

                                 CD    SS    PSI
Speed of Information Processing  .38   .31   .40
Rapid Naming                     .42   .35   .45
Processing Speed cluster         .49   .41   .53

Dumont, R., Cruse, C., Price, L., & Whelley, P. (1996). The relationship between the Differential Ability Scales and the WISC-III for students with learning disabilities. Psychology in the Schools, 33, 203-209.

Elliott, C. D. (2007).
Differential Ability Scales (2nd ed.) introductory and technical handbook. San Antonio, TX: Pearson.

Farrall, M. L. The myth of the WISC-III/WISC-IV retest: The apples and oranges effect. Retrieved 23 May 2007 from http://alpha.fdu.edu/psychology/melissa_farrall_WISCIV.htm

Flynn, J. R. (1998). IQ gains over time: Toward finding the causes. In U. Neisser (Ed.), The rising curve: Long-term gains in IQ and related measures (pp. 25-66). Washington, DC: American Psychological Association.
© Copyright 2024