STRENGTHS AND WEAKNESSES OF NEPSY-II

…starting and stopping points, scores were determined using empirically based findings.

Despite the aforementioned strengths, the NEPSY-II does have several potential weaknesses. As can be seen in Rapid Reference 5.2, a number of subtests were excluded from the lengthy standardization battery as a strategy to facilitate obtaining the normative data. These subtests were excluded partly to shorten the time required for a child to complete the standardization battery, but also because the development team determined that these tasks required no modifications and likely would show little change in normative values. Consequently, these subtests (i.e., Design Fluency, Oromotor Sequences, Repetition of Nonsense Words, Manual Motor Sequences, Route Finding, Imitating Hand Positions) were not included in the standardization version and retained their normative scores from the 1998 version of the NEPSY. Despite these efforts, no data are provided from either the pilot or tryout phases of the NEPSY-II to suggest that the 1998 versions of these tasks are equivalent to the 2007 versions. This creates a potential problem for interpretation and, in many respects, undermines one of the major benefits of a battery of tasks (i.e., all tasks normed on the same population). If these versions are not equivalent in their normative data, then unknown error variance will be introduced when these subtests are compared with the newly normed subtests in a profile of scores. How this may affect interpretation is not known.

Second, the use of 50 children per age band is relatively weak, especially for the preschool years, where there is significantly more potential error variance in test data given the relatively lower reliability of test scores in younger populations. Further, the test developers provide little empirical data to support the use of six-month intervals when splitting the normative data by age; it is unclear whether this choice was in line with empirical data showing developmental changes on the tasks over time.
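To make the sample-size concern concrete, the sketch below shows how much sampling error a normative mean carries with 50 children per age band. It is illustrative only; it assumes the scaled-score metric of mean 10 and SD 3 and a simple random-sampling model, not the actual norm-construction procedures.

    # Illustrative sketch: sampling error of a normative mean with n = 50 per band.
    # Assumes scaled scores with population SD 3 and simple random sampling;
    # real norm construction is more involved.
    import math

    sd = 3.0                              # assumed subtest scaled-score SD
    n = 50                                # children per age band
    se_mean = sd / math.sqrt(n)           # standard error of the band's mean
    ci_95 = 1.96 * se_mean                # half-width of a 95% confidence interval

    print(f"SE of the normative mean: {se_mean:.2f} scaled-score points")
    print(f"95% CI half-width:        {ci_95:.2f} scaled-score points")
    # Roughly 0.42 and 0.83 points, respectively, which is non-trivial when
    # profiles are interpreted at the level of 1-point scaled-score differences.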

PSYCHOMETRIC PROPERTIES

Reliability

The area of reliability is critical in test construction; not only does it determine a test's capability to replicate results, but it also sets the upper limit on validity (Bracken, 1992). A summary of the strengths and weaknesses of the reliability of the NEPSY-II is presented in Rapid Reference 5.3.
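The sense in which reliability caps validity can be illustrated with the classical attenuation bound: an observed validity coefficient cannot exceed the square root of the product of the two measures' reliabilities. The snippet below is a simple numerical illustration with hypothetical reliability values, not figures from the NEPSY-II manual.

    # Classical test theory bound: r_xy <= sqrt(r_xx * r_yy).
    # The reliabilities below are hypothetical, chosen only to illustrate the bound.
    import math

    def max_validity(r_xx: float, r_yy: float) -> float:
        """Upper limit on the correlation between two measures given their reliabilities."""
        return math.sqrt(r_xx * r_yy)

    r_test = 0.80       # hypothetical reliability of a NEPSY-II-like subtest
    r_criterion = 0.90  # hypothetical reliability of an external criterion

    print(f"Maximum observable validity coefficient: {max_validity(r_test, r_criterion):.2f}")
    # With r_xx = .80 and r_yy = .90 the observed correlation cannot exceed ~.85,
    # no matter how strongly the underlying constructs are related.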


Rapid Reference 5.3

Strengths and Weaknesses of NEPSY-II Reliability

Strengths
• Most subtests were adequate to high for internal consistency estimates.
• Good inter-rater reliability for Clocks, Design Copying, Memory for Names, Theory of Mind, Word Generation, Visual Memory Delayed, and Visual Motor Precision, ranging from .93 to .99.
• Test-retest reliability across seven age groups revealed little change in scores, with test intervals ranging from 12 to 51 days (M = 21 days).
• Focused on improving the test floors, with nearly all subtests across age bands having at least a 2 standard deviation limit; more tasks at the 3.0 to 4.5 ages (5–13) do not reach this criterion, while 0–1 from ages 9.0 to 16.9.
• Standard error of measurement ranged from .85 to 2.18 across subtests and ages.
• Reliability of subtests was relatively stable for both typical and impaired samples.

Weaknesses
• Lowest reliability coefficients were achieved on Response Set Total Correct, Inhibition Total Errors, Memory for Designs Spatial, Total Score, and Delay Total.
• Practice effects most noted on Memory for Designs, Memory for Faces, and Inhibition.
• Less focus on improved ceilings, which may place a limit on interpreting neurocognitive strengths; at present, from ages 3 to 5.9 there are 0–5 subtests, from ages 6.0 to 9.9 there are 3–8 subtests, and from 10.0 to 16.9 there are 11–17 subtests.

For the NEPSY-II primary and process scaled scores, most subtests have adequate to high internal consistency estimates, and the standard error of measurement for all of the subtests ranged from 0.85 to 2.18 across the age ranges. Temporal stability of the NEPSY-II subtests across seven age groups (i.e., ages 3 to 4, 5 to 6, 7 to 8, 8 to 9, 9 to 10, 11 to 12, and 13 to 16) revealed little change in scores over an average three-week interval (range 12 to 51 days), suggesting that there was little practice effect across a relatively short time frame. This is critical for a test for which alternate-form reliability is not available, and it provides an evidence-based foundation for using many of the subtests in various types of intervention studies. The largest score differences were noted on the Memory for Designs, Memory for Faces, and Inhibition subtests. In general, the highest reliability coefficients were achieved for the Comprehension of Instructions, Design Copying, Fingertip Tapping, Imitating Hand Positions, List Memory, Memory for Names, Phonological Processing, Picture Puzzles, and Sentence Repetition subtests. The lowest reliability coefficients were noted for the Response Set Total Correct, Inhibition Total Errors, Memory for Designs Spatial and Total Scores, and Memory for Designs Delayed Total Score variables. Given the potential applications of the NEPSY-II, it also is important to note that the reliability of the subtests was relatively stable for both typical and impaired samples.
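As a concrete illustration of what a standard error of measurement in the 0.85 to 2.18 range implies for interpretation, the sketch below builds a 95% confidence band around an observed scaled score. The particular score and SEM values are hypothetical, not drawn from the manual's tables.

    # Confidence band around an observed scaled score, given the subtest's SEM.
    # Score and SEM values are hypothetical; look up the actual SEM for the
    # child's age band and subtest in the test manual.
    def confidence_band(observed: float, sem: float, z: float = 1.96):
        """Return the (lower, upper) bounds of a ~95% confidence band."""
        return observed - z * sem, observed + z * sem

    for sem in (0.85, 2.18):                 # smallest and largest SEMs reported
        lo, hi = confidence_band(observed=7.0, sem=sem)
        print(f"SEM {sem:.2f}: scaled score 7 -> {lo:.1f} to {hi:.1f}")
    # With the largest SEM, a scaled score of 7 is consistent with true scores
    # spanning roughly 2.7 to 11.3, which reaches well into the average range.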

The NEPSY-II also examined inter-rater agreement across all cases ascertained for the standardization and clinical validity studies by employing two independent scorers. For the more objective types of subtests (e.g., Comprehension of Instructions), agreement rates were quite high, ranging from .98 to .99. This also speaks to the integrity of the data included in the standardization phase. Scoring for several of the subtests requires clinical judgment (e.g., Clocks, Design Copying) or implementation of specific scoring rules (e.g., Memory for Names, Word Generation), thus necessitating the generation of inter-rater reliability estimates. For these subtests, inter-rater agreement ranged from .93 (Word Generation) to .99 (Memory for Names, Theory of Mind). These findings suggest that these specific subtests will require more scoring attention from trainers and administrators of the NEPSY-II, but that a high degree of scoring reliability can be achieved on them.
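One simple way to gauge inter-rater agreement of the kind reported above is to correlate two independent scorers' totals across a set of protocols; the manual's exact procedure may differ, and the scores below are invented purely for illustration.

    # Minimal sketch: Pearson correlation between two independent scorers' totals.
    # The protocol scores are fabricated for illustration; this is not NEPSY-II data.
    from statistics import correlation  # Python 3.10+

    scorer_a = [14, 22, 9, 18, 25, 11, 20, 16]
    scorer_b = [15, 22, 8, 18, 24, 11, 21, 16]

    r = correlation(scorer_a, scorer_b)
    print(f"Inter-scorer correlation: {r:.3f}")
    # Values in the .93-.99 range, as reported for the judgment-based subtests,
    # indicate that two trained scorers rank and scale protocols almost identically.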

Finally, although not directly related to reliability, test floors and ceilings can have a direct effect on reliability by potentially restricting the range of available scores. For the NEPSY-II, the test developers devoted significant resources to this issue, particularly to improving the test floors. This is absolutely critical for a test that is designed to uncover neurocognitive weaknesses across different presenting problems and disorders. In this regard, nearly all of the NEPSY-II subtests across the age bands have at least a two standard deviation floor (i.e., the lowest obtainable score falls at least two standard deviations below the normative mean). There is a difference across the age range, however, with all or nearly all of the subtests having at least this floor from ages 9.0 to 16.9, but with 5 to 13 of the subtests not reaching this floor in the youngest (3.0 to 4.5) age bands. In general, the number of subtests having at least a two standard deviation floor increases with age. Conversely, the test developers were not as focused on providing at least a two standard deviation ceiling on the subtests. Although this is consistent with the NEPSY-II's emphasis on detecting weaknesses, it is unfortunate, as it may place a limit on determining the presence of neurocognitive strengths; consequently, the use of neurocognitive strengths to facilitate intervention may be limited. This limitation is apparent in the NEPSY-II normative data: from ages 3.0 to 5.9, only 0 to 5 subtests meet this criterion; from ages 6.0 to 9.9, only 3 to 8 subtests meet it; and from ages 10.0 to 16.9, only 11 to 17 subtests meet it.
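The two-standard-deviation criterion discussed above can be checked mechanically: on the scaled-score metric (mean 10, SD 3), a subtest has an adequate floor if its lowest attainable scaled score is 4 or less, and an adequate ceiling if its highest attainable score is 16 or more. The sketch below applies that rule to hypothetical subtest ranges; it is not a reproduction of the NEPSY-II norm tables.

    # Floor/ceiling check on the scaled-score metric (mean 10, SD 3).
    # A 2-SD floor requires a minimum attainable score <= 4; a 2-SD ceiling
    # requires a maximum attainable score >= 16. Ranges below are hypothetical.
    MEAN, SD = 10, 3

    def floor_ok(min_score: int) -> bool:
        return min_score <= MEAN - 2 * SD

    def ceiling_ok(max_score: int) -> bool:
        return max_score >= MEAN + 2 * SD

    hypothetical_subtests = {
        "Subtest A (age 3)": (2, 13),   # adequate floor, inadequate ceiling
        "Subtest B (age 12)": (1, 19),  # adequate floor and ceiling
    }
    for name, (lo, hi) in hypothetical_subtests.items():
        print(f"{name}: floor ok={floor_ok(lo)}, ceiling ok={ceiling_ok(hi)}")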

Validity

As noted in the test manual, “contemporary definitions of validity describe lines of evidence of validity as opposed to different types of validity” (p. 79). Lines of validity evidence may be the most important aspect of a test such as the NEPSY-II, where interpretation issues are critical to the ultimate clinical utility of the test (American Educational Research Association et al., 1999). According to the Standards for Educational and Psychological Testing (1999), key lines of validity evidence include content, construct, and criterion-related evidence. Strengths and weaknesses of the NEPSY-II for these lines of validity are presented in Rapid Reference 5.4.

Rapid Reference 5.4

Strengths and Weaknesses of NEPSY-II Validity

Strengths
• Content, concurrent, and construct validity issues all adequately addressed.
• Subtests have strong theoretical (Lurian) and evidence-based foundations.
• Subtest intercorrelations fit a multitrait-multimethod model.
• Correlations with intellectual batteries (WISC-IV, DAS) and other cognitive batteries (NEPSY) are moderate to strong.
• Correlations with achievement batteries (e.g., WIAT-II) are moderate to strong.
• Correlations with specific neurocognitive batteries (e.g., DKEFS, CMS, BBCS) are moderate to strong.
• Correlations with the Devereux Scales of Mental Disorders show specific relationships with Autism (Comprehension of Instructions) and Conduct Disorder (Affect Recognition).
• Correlations with an ADHD scale show relationships between the Inhibition subtest and the Focus Cluster.
• Employed 10 special group studies (e.g., ADHD, RD, MD, TBI, ASD, etc.).

Weaknesses
• Although the NEPSY-II is driven by a subtest model, users are still left with the issue of how tests are clustered, especially at different developmental epochs.
• No subtest specificity estimates were provided to reinforce the interpretive strength of the NEPSY-II.
• No relationships with adaptive behavior as measured by the ABAS-II.
• Research criteria not employed for many of the special group studies, leaving them too heterogeneous and likely not generalizable to the larger contemporary research corpus.
• The special groups were not compared to show differential profiles.

Content validity (i.e., do the subtests adequately sample the targeted constructs of interest?) for the NEPSY-II had the benefit of the 1998 NEPSY upon which to base its modifications. The 1998 NEPSY was based on Lurian neuropsychological theory while capitalizing on recent advances in the field of child neuropsychology. For the NEPSY-II, this theoretical foundation remained, but the research that utilized the 1998 version of the NEPSY was reviewed for its relevance to test revision and specific modifications. The pilot and tryout phases of test development further facilitated the examination of specific items within subtests, as well as the subtests proper, with a particular focus on content gaps. Following the standardization phase, additional analyses were conducted to determine the adequacy of content at the item level, content biases, and associated psychometric properties. This process also extended to the examination of children's responses, such that typical and atypical responses were considered with respect to whether the item or subtest was capturing the intended information. Taken together, these procedures produced a battery of tasks that adequately sample the targeted constructs of interest.


Construct validity pertains to the internal structure of a test, particularly with respect to the interrelationships of its subtests or components. This is important given the theoretical neurocognitive domains espoused by the NEPSY-II, as it drives how the components of the test are viewed for interpretation. The NEPSY-II presents an interesting challenge for construct validity in that, while there appears to be an overarching set of neuropsychological domains (e.g., Language, Visuospatial Processing, Social Perception), the test is not designed to provide scores for these domains. As such, and in accordance with the guidance in the Clinical and Interpretive Manual, the administration and interpretation of the NEPSY-II should be guided by the subtests. The 1998 version of the NEPSY did not present any factor analysis data, but two subsequent reports addressed this issue via exploratory (Stinnett, Oehler-Stinnett, Fuqua, & Palmer, 2002) and confirmatory (Mosconi, Nelson, & Hooper, 2008) factor analytic methods. Consistent with its subtest-focused philosophy, the NEPSY-II continues to emphasize the subtests and their interrelationships. The test developers hypothesized a subtest intercorrelation pattern wherein the subtests within a domain would correlate more highly with one another than with subtests in other domains. This multitrait-multimethod model for construct validity was supported in both the normative and clinical samples, with the correlations within many of the domains being higher in the clinical samples. Of note, subtests within the Language domain produced the highest intercorrelations, and many of these subtests also were more highly correlated with verbally based subtests in the other neurocognitive domains. The test developers cite this pattern of correlations as support for the structure of the NEPSY-II.
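The intercorrelation pattern the developers describe, with subtests correlating more strongly within a domain than across domains, can be summarized from any subtest correlation matrix along the lines sketched below. The matrix and domain assignments here are fabricated stand-ins, not NEPSY-II values.

    # Compare average within-domain vs. between-domain subtest correlations.
    # The correlation matrix and domain labels are fabricated for illustration.
    import numpy as np

    subtests = ["Lang1", "Lang2", "Lang3", "VS1", "VS2", "Mem1"]
    domains  = ["Language", "Language", "Language", "Visuospatial", "Visuospatial", "Memory"]
    corr = np.array([
        [1.00, 0.62, 0.58, 0.31, 0.28, 0.35],
        [0.62, 1.00, 0.55, 0.29, 0.30, 0.33],
        [0.58, 0.55, 1.00, 0.27, 0.25, 0.38],
        [0.31, 0.29, 0.27, 1.00, 0.57, 0.30],
        [0.28, 0.30, 0.25, 0.57, 1.00, 0.26],
        [0.35, 0.33, 0.38, 0.30, 0.26, 1.00],
    ])

    within, between = [], []
    n = len(subtests)
    for i in range(n):
        for j in range(i + 1, n):
            (within if domains[i] == domains[j] else between).append(corr[i, j])

    print(f"Mean within-domain r:  {np.mean(within):.2f}")
    print(f"Mean between-domain r: {np.mean(between):.2f}")
    # Within-domain correlations exceeding between-domain correlations is the
    # pattern the developers cite as support for the NEPSY-II's structure.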

Although the NEPSY-II is driven by a subtest model, with the authors and test developers arguing that this is the best approach for interpreting the data, users are still left to wonder how these subtests cluster within and across domains, and across developmental epochs. As noted earlier, there were at least two efforts to examine the factor structure of the 1998 version of the NEPSY (Mosconi et al., 2008; Stinnett et al., 2002), with mixed support being provided. Stinnett and colleagues (2002) conducted an exploratory principal axis factor analysis using the correlation matrix for the 5- to 12-year-old children from the standardization sample (n = 800) and found that it yielded a one-factor solution—a language/comprehension factor—that accounted for only 24.9% of the variance. Results also indicated that numerous subtests cross-loaded on multiple factors in the two-, three-, and four-factor solutions, and that the same 11 to 12 subtests loaded on the first factor in each of these models. These findings suggested that the NEPSY five-domain model was not supported, but Stinnett and colleagues (2002) suggested that confirmatory factor analysis would provide more convincing evidence regarding the test's structural validity.

In that regard, Mosconi and colleagues (2008), using the standardization sample from the 1998 version of the NEPSY, conducted confirmatory factor analyses for ages 5 through 12, as well as for the younger (5 to 8 years, n = 400) and older (9 to 12 years, n = 400) age bands. These latter analyses were important for exploring possible differences in test structure at different developmental epochs. Using four standard fit indices, results indicated that a five-factor model was less than adequate for the entire sample, and it produced negative error variances for the younger and older age groups, making the solutions for the two subgroups statistically inadmissible. A four-factor model without the Executive Function/Attention domain subtests produced satisfactory fit statistics for the entire sample and the younger group, but did not fit the data as well for the older group. In contrast to Stinnett and colleagues' (2002) findings, a one-factor model did not fit well for the full sample. These results indicated that the structure of the 1998 NEPSY was not invariant across development, with the four-factor model best fitting the data for the younger age group and for the entire school-age sample.
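For readers less familiar with the "standard fit indices" referenced in these confirmatory analyses, the sketch below computes two of the most common ones, RMSEA and CFI, from chi-square statistics. The chi-square and degrees-of-freedom values are invented to show the arithmetic; they are not taken from Mosconi et al. (2008).

    # Two common CFA fit indices computed from chi-square statistics.
    # All chi-square and df values below are invented to illustrate the formulas.
    import math

    def rmsea(chi2: float, df: int, n: int) -> float:
        """Root mean square error of approximation (values <= ~.06 suggest good fit)."""
        return math.sqrt(max(chi2 - df, 0) / (df * (n - 1)))

    def cfi(chi2: float, df: int, chi2_null: float, df_null: int) -> float:
        """Comparative fit index (values >= ~.95 suggest good fit)."""
        d_model = max(chi2 - df, 0)
        d_null = max(chi2_null - df_null, d_model)
        return 1 - d_model / d_null if d_null > 0 else 1.0

    n = 800   # size of the school-age standardization sample discussed above
    print(f"RMSEA: {rmsea(chi2=265.0, df=98, n=n):.3f}")
    print(f"CFI:   {cfi(chi2=265.0, df=98, chi2_null=2400.0, df_null=120):.3f}")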

It is unfortunate that analyses of this kind were not conducted for the NEPSY-II, or at least were not presented in the NEPSY-II Clinical and Interpretive Manual, as such structural information would support data-reduction strategies for research and would provide some sense of linkage to the theoretical model that underlies the NEPSY-II. In support of the test developers' contentions, however, it is likely that different factor structures would be present across different clinical groups, much as is seen in the pattern of subtest intercorrelations between the clinical samples and the normative sample; this question will require ongoing examination.

Finally, with respect to construct validity, no data are provided regarding subtest specificity estimates. Subtest specificity estimates index the proportion of a subtest's variance that is both reliable and unique to that subtest. For any assessment tool where subtest-level interpretation is possible, subtests with low specificity (i.e., specificity below .25, or not exceeding the subtest's proportion of error variance; McGrew & Murphy, 1995) should not be interpreted as measuring a specific function. Given the strong emphasis on utilization and interpretation of the NEPSY-II at the subtest level, one would expect empirical data to have been provided on each subtest's assessment of unique variance, with specificity estimates examined across each age block encompassed by the NEPSY-II. This would have facilitated interpretation and utilization of the subtests in the way they were intended, but with empirical evidence as to their ability to measure a specific function.
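The specificity logic described here can be made explicit: a subtest's specificity is commonly estimated as its reliability minus the proportion of variance it shares with the other subtests (the squared multiple correlation), and it is considered interpretable when it is at least .25 and exceeds the subtest's error variance. The sketch below uses hypothetical values to show the computation.

    # Subtest specificity = reliability - variance shared with the other subtests.
    # Interpretable (per the criterion cited in the text) if specificity >= .25
    # and specificity > error variance (1 - reliability). Values are hypothetical.
    def specificity(reliability: float, r_squared_other: float):
        spec = reliability - r_squared_other
        error = 1.0 - reliability
        interpretable = spec >= 0.25 and spec > error
        return spec, error, interpretable

    for name, rel, r2 in [("Hypothetical subtest A", 0.88, 0.45),
                          ("Hypothetical subtest B", 0.72, 0.55)]:
        spec, err, ok = specificity(rel, r2)
        print(f"{name}: specificity={spec:.2f}, error={err:.2f}, interpretable={ok}")
    # A: .43 specific vs .12 error -> safe to interpret as a distinct function.
    # B: .17 specific vs .28 error -> should not be interpreted in isolation.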

Criterion-related validity was determined primarily through concurrent validity studies (i.e., examinations of the relationship of the NEPSY-II with other tests measuring similar constructs). The NEPSY-II was concurrently administered with a wide range of measures, including intellectual batteries (e.g., Wechsler Intelligence Scale for Children—Fourth Edition; WISC-IV); achievement batteries (e.g., Wechsler Individual Achievement Test—Second Edition; WIAT-II); specific neuropsychological measures (e.g., Delis-Kaplan Executive Function System; DKEFS); behavior rating scales (e.g., Devereux Scales of Mental Disorders); and adaptive behavior measures (e.g., Adaptive Behavior Assessment System—Second Edition; ABAS-II). Each concurrent administration of the NEPSY-II and another measure included a sufficient number of participants to yield relatively stable validity coefficients.
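"Relatively stable validity coefficients" is largely a matter of sample size: the confidence interval around a correlation narrows as n grows. The sketch below uses the standard Fisher z approximation with illustrative values, not figures from the NEPSY-II validity studies, to show how precision depends on the number of concurrently tested participants.

    # 95% confidence interval for a correlation via the Fisher z transformation.
    # The r and n values are illustrative, not taken from the NEPSY-II studies.
    import math

    def corr_ci(r: float, n: int, z_crit: float = 1.96):
        z = math.atanh(r)                     # Fisher z transform
        se = 1.0 / math.sqrt(n - 3)
        return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

    for n in (30, 60, 120):
        lo, hi = corr_ci(r=0.55, n=n)
        print(f"n={n:3d}: r = .55, 95% CI [{lo:.2f}, {hi:.2f}]")
    # The interval shrinks markedly with larger samples, which is why the size of
    # each concurrent-validity sample matters for the stability of the coefficients.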

As can be seen in Rapid Reference 5.4, the NEPSY-II correlated in a moderate to strong fashion with the intellectual and achievement batteries—the latter being particularly important for children referred for a variety of learning problems. When specific neurocognitive batteries were examined, the NEPSY-II subtests that were most similar to the constructs tapped by each battery generally correlated in a moderate to strong manner.

For example, the NEPSY-II Memory and Learning subtests correlated most highly with selected subtests from the Children's Memory Scale; the NEPSY-II Attention and Executive Function subtests correlated most highly with selected subtests from the DKEFS; and the NEPSY-II Language subtests correlated most highly with the Bracken Basic Concept Scale—Third Edition Receptive and Expressive scales. The NEPSY-II subtests did not correlate with the Children's Communication Checklist. Given the potential use of the NEPSY-II with children with emotional/behavioral disturbance and intellectual disabilities, correlations with several behavioral measures (e.g., Devereux Scales of Mental Disorders, Brown Attention-Deficit Disorder Scales, Adaptive Behavior Assessment System-II) also were examined. For the Devereux, the NEPSY-II Comprehension of Instructions and Affect Recognition subtests showed moderate negative correlations with the Autism and Conduct Disorder scales, respectively. NEPSY-II Affect Recognition also correlated with the Devereux Externalizing Composite score. For the Brown scales, the Focus Cluster correlated moderately and negatively with the NEPSY-II Inhibition-Switching Combined scaled score, reflecting declining inhibitory control with increasing ADHD symptoms. None of the NEPSY-II subtests correlated significantly with the Adaptive Behavior Assessment System-II, perhaps an indication of the NEPSY-II's lack of association with ecological, day-to-day behaviors. These findings support a convergent validity line of evidence for the NEPSY-II; however, it is important to note that these patterns of correlation may change depending on the age of the child, the presenting clinical condition, and the method of assessment (e.g., parent ratings, clinician ratings), and this will require additional examination as the NEPSY-II begins to be employed in a variety of clinical settings.

The special group studies conducted with the NEPSY-II are noteworthy in that the test developers employed 10 different clinical conditions (i.e., Reading Disability, Math Disability, Traumatic Brain Injury, Autism Spectrum Disorder, Attention-Deficit/Hyperactivity Disorder, Language Disorders, Intellectual Disability, Asperger's Disorder, Hearing Impairment, and Emotional Disturbance). Comparison groups were derived from the normative sample and matched on chronological age, gender, race, and parent education level. These studies hold promise for determining the differential sensitivity of the NEPSY-II to the neuropsychological profiles that can be manifested by specific disorders.

Findings from these special group studies generally support the clinical utility of the NEPSY-II in the assessment of children referred for different conditions and disorders. More specifically, the special groups typically differed from the typically developing comparison groups on variables where they would be expected to deviate, as well as on a wide range of other variables. For example, the special group study of children with ADHD showed significant differences on NEPSY-II subtests assessing attention and executive functions, verbal memory, and sensorimotor abilities. Similarly, children with language disorders showed significant differences on NEPSY-II subtests measuring language-related functions, and children with intellectual disabilities and autism spectrum disorders performed more poorly than the comparison groups on nearly all of the NEPSY-II subtests. The separate examination of the newly added Theory of Mind subtest in the Autism Spectrum Disorder versus control and Asperger's Disorder versus control comparisons also provided support for the use of this subtest with these types of clinical referrals.
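Group differences of the kind summarized above are usually expressed as effect sizes. A minimal sketch of that comparison, using a pooled-SD Cohen's d with invented scores for a hypothetical special group and its matched comparison group, follows.

    # Cohen's d (pooled SD) for a special group vs. a matched comparison group.
    # The scaled scores below are fabricated; they are not NEPSY-II study data.
    from statistics import mean, stdev

    special  = [6, 7, 5, 8, 6, 7, 9, 6, 5, 7]        # hypothetical clinical group
    controls = [10, 11, 9, 12, 10, 8, 11, 10, 9, 10]  # hypothetical matched controls

    def cohens_d(a, b):
        sa, sb = stdev(a), stdev(b)
        pooled = (((len(a) - 1) * sa**2 + (len(b) - 1) * sb**2) / (len(a) + len(b) - 2)) ** 0.5
        return (mean(a) - mean(b)) / pooled

    print(f"Cohen's d = {cohens_d(special, controls):.2f}")
    # Large negative values indicate the special group scored well below its
    # matched controls; the small group sizes used in these studies (10 to 55)
    # make such estimates imprecise, a concern raised in the following paragraph.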

While the test developers should be commended on the inclusion of the various clinical disorders and conditions, and the findings generally support the separation of these groups from the typical group, there are several concerns with respect to these clinical studies that require mention. First, many of the studies employed small sample sizes, with group sizes ranging from 10 (Traumatic Brain Injury) to 55 (ADHD). Second, although inclusion and exclusion criteria for participation in these studies are provided in Appendix F of the