Unit 11 Testing and Individual Differences
Learning Targets Module 61 Assessing Intelligence 61-1 Describe the characteristics of an intelligence test, and distinguish between achievement and aptitude tests. 61-2 Discuss when and why intelligence tests were created, and explain how today’s tests differ from early intelligence tests. 61-3 Describe the normal curve, and explain standardization, reliability, and validity.
What is an intelligence test? a method for assessing an individual’s mental aptitudes and comparing them with those of others, using numerical scores Psychologists classify intelligence tests as either achievement tests, intended to reflect what you have learned, or aptitude tests, intended to predict your ability to learn a new skill.
What is the difference between an achievement test and an aptitude test?
Achievement test: Exams covering what you have learned in this course are achievement tests. Examples include the AP exam, chapter or unit tests in your courses, and final exams in college.
Aptitude test: A college entrance exam, which seeks to predict your ability to do college work, is an aptitude test. Examples include the SAT or ACT, or career tests that help predict what future job might best fit your interests.
Aptitude and achievement tests. What achievement or aptitude tests have you taken? In your opinion, how well did these tests reflect what you’d learned or predict what you were capable of learning? Talk with your partner.
1. What Would You Answer? Which of the following is the best example of an aptitude test? A. Atul answers questions about the rules of the road. B. Mr. Anderson’s AP psychology test covers the material from the current unit. C. Sherjeel takes the ACT for college admission. D. Jeffrey is required to translate 50 Mandarin sentences for his final exam. E. Lucy and Meghan discuss what they might study in college.
Interpret the graph. Use your understanding of statistics to explain the data on the graph above.
Thinking critically. Research indicates that there is a strong positive correlation between SAT scores and intelligence scores. Many consider the modern SAT to be more of an achievement test, measuring the rigor of courses taken in high school, the access to preparation courses, and other social factors.
Consider the quote below. Plato, a pioneer of the individualist tradition, wrote more than 2000 years ago in The Republic that “no two persons are born exactly alike; but each differs from the other in natural endowments, one being suited for one occupation and the other for another.” Do you agree with Plato? Talk about it.
AP Exam Tip Become familiar with the key contributors in intelligence testing and be able to identify how they differ (e.g., Galton, Binet, Terman, Wechsler and Stern).
How were individual differences in mental abilities historically researched? English scientist Francis Galton was fascinated with measuring human traits. Galton wondered if it might be possible to measure “natural ability” and to encourage those of high ability to mate with one another. He devised methods to measure “intellectual strengths” based on such things as reaction time, sensory acuity, muscular power, and body proportions.
What were the results of Galton’s research? Galton’s quest for a simple intelligence measure failed, and the measurements he gathered did not correlate with intelligence. Galton did, however, leave the field of psychology with statistical techniques that are still used, the phrase nature and nurture, and the belief in the inheritance of genius.
How did Alfred Binet contribute to the field? French psychologist Alfred Binet (1857-1911) was commissioned by the French government to design fair and unbiased intelligence tests to administer to French schoolchildren.
What was Binet’s assumption about intellectual development? Binet and his student, Théodore Simon, began by assuming that all children follow the same course of intellectual development but that some develop more rapidly. A “dull” child should score much like a typical younger child, and a “bright” child like a typical older child. Thus, their goal became measuring each child’s mental age, the level of performance typically associated with a certain chronological age.
What is meant by mental age? Binet assumed the average 9-year-old has a mental age of 9. Children with below-average mental ages, such as 9-year-olds who perform at the level of typical 7-year-olds, would struggle with age-appropriate schoolwork. Although such a child has a chronological age of 9, Binet would say the child has a mental age of 7.
How did Binet test for mental age? To measure mental age, Binet and Simon theorized that mental aptitude, like athletic aptitude, is a general capacity that shows up in various ways. They tested a variety of reasoning and problem-solving questions on Binet’s two daughters, and then on “bright” and “backward” Parisian schoolchildren. Items answered correctly could then predict how well other French children would handle their schoolwork.
How were Binet’s tests modified by Lewis Terman? Stanford University professor Lewis Terman modified Binet’s tests for use as a numerical measure of inherited intelligence. Adapting some of Binet’s original items, adding others, and establishing new age norms, Terman extended the upper end of the test’s range from teenagers to “superior adults.” Terman also gave his revision the name today’s version retains—the Stanford-Binet. For Terman, intelligence tests revealed the intelligence with which a person was born.
What is the intelligence quotient (IQ) and how was it derived? From such tests, German psychologist William Stern derived the famous term intelligence quotient, or IQ. The IQ was simply a person’s mental age divided by chronological age and multiplied by 100 to get rid of the decimal point. IQ was defined as the ratio of mental age (ma) to chronological age (ca) multiplied by 100 (thus, IQ = ma/ca × 100). On contemporary intelligence tests, the average performance for a given age is assigned a score of 100.
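As a rough illustration of Stern's ratio formula, the Python sketch below computes IQ as mental age divided by chronological age, times 100. The function name `ratio_iq` is an assumption made for this example, and the adult case at the end is hypothetical; only the 9-year-old with a mental age of 7 comes from the slides.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Stern's ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# A 9-year-old who performs like a typical 7-year-old (the mental-age example).
print(round(ratio_iq(7, 9)))    # 78
# A child whose mental age matches chronological age scores 100.
print(round(ratio_iq(9, 9)))    # 100
# Hypothetical adult: a 40-year-old who performs like a typical 20-year-old.
print(round(ratio_iq(20, 40)))  # 50
```

The last case hints at why the ratio suits children better than adults: mental age does not keep rising in step with chronological age, so the formula would assign steadily lower scores to adults whose abilities have not declined.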
What were the limits of IQ calculating? The original IQ formula worked fairly well for children but not for adults. Most current intelligence tests, including the Stanford-Binet, no longer compute an IQ in this manner.
How did the Army utilize the intelligence tests? With Terman’s help, the U.S. government developed new tests to evaluate both newly arriving immigrants and World War I army recruits—the world’s first mass administration of an intelligence test. The Army Alpha and Beta tests (the Beta being the version for illiterate or non-English-speaking recruits) were intended to measure verbal and numerical abilities, the ability to follow directions, and general knowledge. To some psychologists, the results indicated the inferiority of people not sharing their Anglo-Saxon heritage.
What were the problems with the early intelligence tests? Sweeping judgments based on intelligence test scores became an embarrassment to most of those who championed testing. Lewis Terman came to appreciate that test scores reflected not only people’s innate mental abilities but also their education, native language, and familiarity with the culture assumed by the test. Abuses of the early intelligence tests, such as in immigrant screening, remind us that science can be value-laden.
What intelligence test did David Wechsler design? Psychologist David Wechsler created what is now the most widely used individual intelligence test, the Wechsler Adult Intelligence Scale (WAIS), together with a version for school-age children (the Wechsler Intelligence Scale for Children [WISC]) and another for preschool children (the WPPSI). (Evers et al., 2012)
What are some of the subtests of the WAIS? Recognizing similarities; vocabulary; letter-number sequencing; block design (use four blocks to make the image shown).
What information does a WAIS provide? The WAIS yields not only an overall intelligence score, as does the Stanford-Binet, but also individual scores for verbal comprehension, perceptual organization, working memory, and processing speed. Striking differences among these individual scores can provide clues to cognitive strengths or weaknesses. For example, a low verbal comprehension score combined with high scores on other subtests could indicate a reading or language disability.
What three criteria must an intelligence test meet to be accepted?
Standardized: To make scores meaningful, they are compared to a pretested sample population.
Reliable: The test gives consistent scores no matter who takes it or when they take the test.
Valid: The test measures or predicts what it is supposed to.
What is the normal curve? If a graph is constructed of test-takers’ scores, the scores typically form a bell-shaped pattern called the bell curve, or normal curve.
How is the normal curve defined? the bell-shaped curve that describes the distribution of many physical and psychological attributes Most scores fall near the average, and fewer and fewer scores lie near the extremes.
What is a characteristic of a normal curve distribution? Remember that in a normal distribution the mean, median, and mode are all the same and at the center.
What is another characteristic of the normal curve? About 68% of scores fall within 1 standard deviation of the mean. About 95% of scores fall within 2 standard deviations of the mean. About 99.7% of scores fall within 3 standard deviations of the mean.
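The percentages on this slide can be checked directly. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `share_within` is an assumption made for this example.

```python
from statistics import NormalDist

# The standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```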
What does the test score indicate? For both the Stanford-Binet and Wechsler scales, a score indicates whether that person’s performance fell above or below the average.
How is an intelligence score derived using the normal curve? A performance higher than all but 2.5% of all scores earns an intelligence score of 130. A performance lower than 97.5% of all scores earns an intelligence score of 70.
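To see how a score maps onto a position on the curve, the sketch below converts between intelligence scores and percentiles. It assumes a mean of 100 and a standard deviation of 15, the scaling used by the Wechsler tests; the slides state only the mean. The helper names `percentile` and `score_at` are hypothetical.

```python
from statistics import NormalDist

# Assumption: mean 100 and standard deviation 15 (the Wechsler scaling).
iq_scale = NormalDist(mu=100, sigma=15)


def percentile(score: float) -> float:
    """Percentage of test-takers expected to score below the given score."""
    return iq_scale.cdf(score) * 100


def score_at(pct: float) -> float:
    """Score that sits at the given percentile."""
    return iq_scale.inv_cdf(pct / 100)


print(f"{percentile(130):.1f}")  # 97.7
print(f"{percentile(70):.1f}")   # 2.3
print(f"{score_at(97.5):.1f}")   # 129.4
```

Under these assumptions the 2.5% figure is a rounding of roughly 2.3%, because it treats two standard deviations as covering exactly 95% of scores.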
How do the tests remain standardized? To keep the average score near 100, the Stanford-Binet and Wechsler scales are periodically restandardized. A person who takes the WAIS, 4th ed., is compared with a standardization sample who took the test during 2007, not with David Wechsler’s initial 1930s sample.
Thinking critically: Why is there a need to restandardize tests? If you compared the performance of the most recent standardization sample with that of the 1930s sample, do you suppose you would find rising or declining test performance? Discuss with your partner.
What is the Flynn effect? It turns out that intelligence test performance has improved. This worldwide phenomenon is called the Flynn effect, in honor of New Zealand researcher James Flynn who first calculated its magnitude. The average person’s intelligence test score in 1920 was —by today’s standard— only a 76.
What is reliability and how is it determined? Reliability is the extent to which a test yields consistent results, and it can be assessed three ways:
Split-half: scores on two halves of the test (even items vs. odd items) are compared.
Alternative form: varying versions of the test are given and results are compared.
Test-retest: the same test is readministered and results are compared.
The higher the correlation between the two scores, the higher the test’s reliability.
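The split-half approach can be sketched in a few lines of Python. Everything here is illustrative: the function names are assumptions, and the `responses` table is made-up data (1 = correct, 0 = incorrect), not results from any real test.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """Correlate each person's odd-item total with their even-item total."""
    odd_totals = [sum(person[0::2]) for person in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(person[1::2]) for person in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Made-up responses: one row per test-taker, one column per item.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
]
print(f"split-half r = {split_half_reliability(responses):.2f}")
```

The same `pearson_r` helper would serve for test-retest or alternative-form reliability: pass it the scores from the first administration and the scores from the second.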
2. What Would You Answer? If the same test yields consistent results upon retesting, it can be said to have a high degree of A. reliability. B. validity. C. content validity. D. predictive validity. E. normal curve.
What is validity? the extent to which a test measures or predicts what it is supposed to For example, if your environmental science teacher spent several weeks discussing global warming trends, then gave an assessment on that subject, the test would be valid if it contained questions on global warming trends.
What is the difference between content validity and predictive validity?
Content validity: the extent to which a test samples the behavior that is of interest. For example, the road test for a driver’s license has content validity because it samples the tasks a driver routinely faces.
Predictive validity: the success with which a test predicts the behavior it is designed to predict. For example, some academic aptitude tests can predict success in school at certain ages.
When can predictive validity yield little information? Consider a correlation between football linemen’s body weight and their success on the field. Note how insignificant the relationship becomes when the range of weight is narrowed to 280 to 320 pounds.
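The lineman example can be imitated with a small simulation. The sketch below is built entirely on synthetic data: the weight range, the slope linking weight to a "success" score, and the noise level are arbitrary assumptions chosen only to show the pattern, not estimates from real players.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic players: success rises with weight, plus random noise (assumed values).
weights = [rng.uniform(180, 340) for _ in range(5000)]
success = [0.05 * w + rng.gauss(0, 3) for w in weights]

full_r = pearson_r(weights, success)

# Keep only the players in the narrow 280-to-320-pound band.
narrow = [(w, s) for w, s in zip(weights, success) if 280 <= w <= 320]
narrow_r = pearson_r([w for w, _ in narrow], [s for _, s in narrow])

print(f"all weights:       r = {full_r:.2f}")
print(f"280 to 320 pounds: r = {narrow_r:.2f}")
```

The underlying relationship is identical in both cases; only the range of weights examined differs, and the correlation in the narrow band comes out much weaker.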
3. What Would You Answer? Which of the following can be used to demonstrate that only about 2 percent of the population scores at least two standard deviations above the mean on an intelligence test? A. reliability test B. aptitude test C. predictive validity test D. test-retest procedure E. normal curve
The limits of prediction. As the range of data under consideration narrows, its predictive power diminishes.
Think about it. Are you working to the potential reflected in your standardized test scores? What, other than your aptitude, is affecting your school performance? Write down your thoughts.
Learning Target 61-1 Review Describe the characteristics of an intelligence test, and distinguish between achievement and aptitude tests. An intelligence test assesses people’s mental aptitudes and compares them with those of others, using numerical scores. Achievement tests are designed to assess what you have learned.
Learning Target 61-1 Review cont. Describe the characteristics of an intelligence test, and distinguish between achievement and aptitude tests. Aptitude tests are designed to predict what you can learn. The WAIS (Wechsler Adult Intelligence Scale), an aptitude test, is the most widely used intelligence test for adults.
Learning Target 61-2 Review Discuss when and why intelligence tests were created. In the late 1800s, Francis Galton, who believed that genius was inherited, attempted but failed to construct a simple intelligence test. His hope had been to identify those with exceptional abilities and encourage them to reproduce.
Learning Target 61-2 Review part II Discuss when and why intelligence tests were created. In France in 1904, Alfred Binet, who tended toward an environmental explanation of intelligence differences, started the modern intelligence-testing movement by developing questions to measure children’s mental age and thus predict progress in the school system. Binet hoped his test would be used to improve children’s education rather than to limit their opportunities.
Learning Target 61-2 Review part III Discuss when and why intelligence tests were created, and explain how today’s tests differ from early intelligence tests. During the early twentieth century, Lewis Terman of Stanford University revised Binet’s work for use in the United States. Terman believed intelligence is inherited, and he thought his modified version of Binet’s test, the Stanford-Binet, could help guide people toward appropriate opportunities.
Learning Target 61-2 Review part IV Discuss when and why intelligence tests were created, and explain how today’s tests differ from early intelligence tests. During this period, intelligence tests were sometimes used to document scientists’ misguided assumptions about the innate inferiority of certain ethnic and immigrant groups.
Learning Target 61-3 Review Describe the normal curve, and explain standardization, reliability, and validity. The distribution of test scores often forms a normal (bell-shaped) curve around the central average score, with fewer and fewer scores at the extremes. Standardization establishes a basis for meaningful score comparisons by giving a test to a representative sample of future test-takers. Reliability is the extent to which a test yields consistent results (on two halves of the test, or when people are retested).
Learning Target 61-3 Review cont. Describe the normal curve, and explain standardization, reliability, and validity. Validity is the extent to which a test measures or predicts what it is supposed to measure. A test has content validity if it samples the pertinent behavior (as a driving test measures driving ability). It has predictive validity if it predicts a behavior it was designed to predict. (Aptitude tests have predictive ability if they can predict future achievements.)