
- Contents
- Series Preface
- Acknowledgments
- RATIONALES UNDERLYING NEPSY AND NEPSY-II
- NEPSY DEVELOPMENT
- NEPSY-II REVISION: GOALS AND DEVELOPMENT
- COMPREHENSIVE REFERENCES
- CONCLUDING REMARKS
- APPROPRIATE TESTING CONDITIONS
- TYPES OF ASSESSMENTS
- ASSESSING CHILDREN WITH SPECIAL NEEDS
- OTHER ADMINISTRATION CONSIDERATIONS
- SUBTEST-BY-SUBTEST RULES OF ADMINISTRATION
- COMPUTER SCORING
- PREPARATORY TO SCORING
- ORDER OF SCORING
- STEP-BY-STEP SCORING
- TESTS WITH COMPLEX RECORDING AND/OR SCORING
- QUICK-SCORING: DESIGN COPY GENERAL (DCG)
- DESIGN COPYING PROCESS (DCP) SCORING
- OVERVIEW OF SUBTEST SCORES
- SUMMARIZING NEPSY-II SCORES
- CONCLUDING REMARKS
- GOALS OF INTERPRETATION AND IMPLEMENTATION OF GOALS
- STEP-BY-STEP INTERPRETATION OF NEPSY-II PERFORMANCE
- INTRODUCTION
- TEST DEVELOPMENT
- STANDARDIZATION
- PSYCHOMETRIC PROPERTIES
- ADMINISTRATION AND SCORING
- INTERPRETATION
- OVERVIEW OF STRENGTHS AND WEAKNESSES
- THE NEPSY-II REFERRAL BATTERIES
- DEVELOPMENTAL DISORDERS AND NEPSY-II
- EVIDENCE OF RELIABILITY IN NEPSY-II
- CONVENTIONS FOR REPORTING RESULTS
- RELIABILITY PROCEDURES IN NEPSY-II
- CONCLUDING REMARKS
- CASE STUDY #1: GENERAL REFERRAL BATTERY
- CLINICAL IMPRESSIONS AND SUMMARY
- PRELIMINARY DIAGNOSIS
- RECOMMENDATIONS
- DIAGNOSIS
- Appendix: NEPSY-II Data Worksheet
- References
- Annotated Bibliography
- About the Authors
- Index

Five
STRENGTHS AND WEAKNESSES OF NEPSY-II
Stephen R. Hooper
INTRODUCTION
Test development is an incredibly arduous process that requires significant foresight and planning, attention to detail, and ultimately, the recognition that there will be strengths and weaknesses regardless of the Herculean effort to produce a quality measurement tool. The NEPSY-II (Korkman, Kirk, & Kemp, 2007) is no different. Although the authors and test developers had the experiences and lessons learned from the original NEPSY (Korkman, Kirk, & Kemp, 1998) upon which to base many of their insights, thoughts, and decisions, the process of modification requires as much foresight, planning, and attention to detail as new test construction. The NEPSY, and its current version the NEPSY-II, maintain the distinction of being the first well-normed and standardized neuropsychological batteries for children and adolescents. This distinction is significant given the need for such a tool in the field of child neuropsychology, but such tests should receive detailed critique with respect to their ultimate utility in both clinical and research endeavors, and with respect to their appropriate application in the larger evaluation process.
Although it is recognized that specific strengths and weaknesses of a measure will surface once users have had sufficient time to use the tool, this chapter provides an initial detailed examination of the specific strengths and weaknesses that are present in the NEPSY-II as it moves into its real-time phase. Specific strengths and weaknesses are highlighted for: Test Development, Standardization, Psychometric Properties (i.e., Reliability, Validity), Test Administration and Scoring, and Interpretation.
TEST DEVELOPMENT
For this revision, the test developers took into account several key pieces of information: evidence-based findings in the field of child neuropsychology, child development, and related neuroscience fields; customer feedback; author experiences with the NEPSY; and pilot data in the early phases of revision. In addition, there were four specific goals for the NEPSY-II: (1) to improve domain coverage across a wider age range; (2) to improve clinical and diagnostic utility; (3) to improve psychometric properties; and (4) to improve its ease of administration and, ultimately, its usability. The developers have addressed each of these goals in the Clinical and Interpretive Manual and, in general, have been successful in meeting them. Additionally, the developers are commended for engaging in an iterative process of subtest inclusion and elimination via pilot and tryout phases prior to the national standardization. It was during the tryout phase that the extension of the NEPSY-II into the adolescent years was examined, with the data suggesting additional subtest modifications (e.g., lowering test floors and raising ceilings), although only 45 adolescents were included in the tryout phase. The developers also followed the test construction guidelines espoused by the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999). (See Rapid Reference 5.1.)
Similar to its predecessor, the NEPSY-II continues to be based on the theoretical foundation of Luria (1966), wherein designated brain functions correspond with selected assessment tasks. While it is not clear how other concepts from Lurian theory, such as functional systems and zones of proximal development, are integrated into the NEPSY-II, the domains provided are clearly multidimensional and representative of the broad range of neurocognitive functions espoused by most neuropsychological models. Consistent with the NEPSY, the NEPSY-II includes the domains of Attention and Executive Functioning, Language, Memory and Learning, Sensorimotor, and Visuospatial Processing. The NEPSY-II also includes the new domain of Social Perception. Although the manual mentions a range of neuropsychological studies documenting the inclusion of selected dimensions within each domain, there is no clear model underlying any of the domains; such a model might have strengthened the theoretical underpinnings of the battery. Also, it is important to note that the neuropsychological domains are not empirically derived, or even statistically independent. Although the manual clearly notes this and, indeed, notes that the intent was not to create empirically derived domains, the user is left to determine how specific subtests relate within and across domains and, importantly, how these relationships may change over the age range proposed by the NEPSY-II. Based on available research on the NEPSY, including information provided in the original NEPSY Manual, the NEPSY-II does not provide individual domain or composite scores. Although a variety of different types of scores can be generated from the NEPSY-II, the use of subtests to guide interpretation of results is emphasized.

Rapid Reference 5.1
Strengths and Weaknesses of NEPSY-II Test Development

| Strengths | Weaknesses |
| --- | --- |
| Theoretically driven domains of function. | Domains are not empirically derived or necessarily independent. |
| Each domain is multidimensional in its conceptualization. | Although theoretically derived, the use of specific models of each domain may have contributed to a different battery of tasks. |
| No domain scores derived. | Little available data on preschool tasks. |
| New subtests included based on clinical sensitivity and utility, including new tasks measuring social cognition, executive functions, and visual-spatial abilities. | |
| Developed for use with neurodevelopmental and neurological disorders. | |
| Expanded age range of 3 through 16 years. | |
| Proposes eight different referral batteries to facilitate subtest selection, which replaces standard administration order. | |
| Pilot and tryout phases prior to national standardization. | |
| Followed APA guidelines for test construction. | |
Further, the developers not only modified and expanded subtests within the domains to be more representative of different dimensions of a neurocognitive domain, but also extended the number of domains to include subtests measuring social cognition (i.e., Affect Recognition, Theory of Mind). This latter addition is innovative with respect to its inclusion in such assessment batteries: it will assess an important brain function to a certain degree, be applicable to certain clinical groups (e.g., Autism Spectrum Disorders), and provide an additional avenue for assessment from the neuroscience literature that has not been routinely available to most clinicians. The addition of new subtests reflecting various aspects of executive functioning (e.g., Animal Sorting, Clocks, Inhibition); language (e.g., Word Generation); memory (e.g., Memory for Designs); and visual-spatial abilities (e.g., Geometric Puzzles, Picture Puzzles) also is noteworthy in the NEPSY-II. The extension of tasks into the middle adolescent years (i.e., up to age 16.9 years) required the addition of new items to a number of subtests (e.g., Arrows, Design Copying), but clearly will serve its users well with respect to having options for neuropsychological testing in this developmental period.
As in its early development, and as with the NEPSY, the NEPSY-II utilized available data to showcase its utility with both neurological and neurodevelopmental disorders. The utility with neurological disorders continues the long-standing use of such tests and procedures with frank neurological conditions, but the inclusion of neurodevelopmental disorders (conditions that presume neurological involvement) clearly expands the overall application of this test to a wide variety of potential clients. To facilitate this process, the manual proposes eight different referral batteries to guide subtest selection, and the developers designed the standardization in such a fashion as to reinforce this process, thus replacing the need for a standard administration order.
Finally, the NEPSY and the NEPSY-II represent significant efforts to extend neuropsychological testing into the preschool years. Quite frankly, outside of selected intellectual batteries and single test approaches, there are few batteries that have attempted to provide neuropsychological measurement for such a young population. This should facilitate increased precision with respect to description of neurocognitive functions for a wide variety of children with both neurological and neurodevelopmental disorders and, subsequently, prescription of specific treatment strategies and interventions. It is important to note, however, that few empirical data were provided with respect to the utility of the preschool battery for the NEPSY, either from the publisher or from independent investigators, and the same seems to be true with respect to the release of the NEPSY-II. The field will need to determine the ultimate clinical utility of the preschool version of this test.
STANDARDIZATION
With few exceptions, the standardization of the NEPSY-II was well conceived and nicely executed. The normative sample was based on the most recently available census data, with excellent concurrence with the census figures. The normative sample was well-matched on race, parent education, and geographic region of the country, with the minority representation being particularly noteworthy across nearly all of the stratification cells. The NEPSY-II also was stratified by chronological age and evenly divided by gender. (See Rapid Reference 5.2.)

Rapid Reference 5.2
Strengths and Weaknesses of NEPSY-II Standardization

| Strengths | Weaknesses |
| --- | --- |
| Normative sample extracted from most recent census data; has excellent match with 2003 census figures, particularly for minority representation and parent education. | Several subtests are not renormed and continue to employ 1998 norms (e.g., Design Fluency, Imitative Hand Positions, Route Finding). |
| Stratified by age, race, parent education, and geographic region; sample evenly divided by gender. | The use of 50 children per age band is relatively weak, especially for preschool years. |
| Procedures for quality assurance of examiners in standardization administration. | Little empirical data provided for use of six-month intervals in the age-splitting of the normative data. Was this in line with the empirical data showing developmental changes? |
| Examined flexibility systematically via four different administration orders. | |
| Inclusion of Contrast Scores that are based on the normative data and not regression or simple discrepancy methods. | |
The NEPSY-II was standardized on 1,200 children across the ages of 3 through 16 years. Of the 29 possible subtests included in the standardization, 17 were administered to children ages 3 to 4 years; children 5 to 6 years of age were administered 22 subtests and 2 delay tasks; children ages 6 to 12 years were given 23 subtests and 2 delay tasks; and the adolescents, ages 13 to 16 years, received 24 subtests and 3 delay tasks. Adjustments to the final composition of subtests were made following the standardization process (e.g., deletion of several subtests due to low clinical sensitivity and/or administration challenges), and scoring nuances were determined. Of note with respect to the scoring nuances, the NEPSY-II provides new scores for selected subtests, Contrast Scores, which are a quantification of the difference between one measure and a comparison measure. These scores are new to the NEPSY-II relative to the 1998 NEPSY, and their development utilized a normed contrast/comparison variable wherein normative data on the specific variable are produced by controlling for the ability of the children on the control variable. As such, this score does not assume equal base rates across different ages and ability levels, and the variances can be different across the range of the control variable.
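The idea of norming one variable conditional on a control variable can be illustrated with a small Python sketch. This is not the publisher's procedure: the function names, the tiny made-up normative sample, and the choice to group children by exact control score are all assumptions for illustration. The only convention borrowed from standard practice is the scaled-score metric (mean 10, SD 3, range 1 to 19).

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_contrast_norms(norm_sample):
    """Group a (hypothetical) normative sample by control score.

    norm_sample: iterable of (control_score, target_score) pairs.
    Returns {control_score: (mean, sd)} of the target score at each level,
    so each control level carries its own mean and its own variance.
    """
    by_level = defaultdict(list)
    for control, target in norm_sample:
        by_level[control].append(target)
    return {level: (mean(vals), pstdev(vals)) for level, vals in by_level.items()}


def contrast_scaled_score(norms, control, target):
    """Express the target score relative to peers with the same control score."""
    level_mean, level_sd = norms[control]
    if level_sd == 0:
        raise ValueError("no variability at this control level")
    z = (target - level_mean) / level_sd
    # Scaled-score metric: mean 10, SD 3, bounded to 1..19.
    return max(1, min(19, round(10 + 3 * z)))


# Made-up normative data: (control scaled score, target scaled score).
sample = [(8, 6), (8, 8), (8, 10), (12, 8), (12, 12), (12, 16)]
norms = build_contrast_norms(sample)

# The same target score of 10 is read differently at each control level.
print(contrast_scaled_score(norms, 8, 10))   # 14: high for children with control = 8
print(contrast_scaled_score(norms, 12, 10))  # 8: low for children with control = 12
```

Because the mean and SD are estimated separately at each control level, the sketch does not assume a constant spread across the control range, which is the property that separates this approach from a simple difference score.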
One modification asserted for the NEPSY-II was to provide increased flexibility in the order of subtest administration. Although this was present in the 1998 version of the NEPSY, it was not clear how different orders of subtest administration would affect scores in a larger test profile. For the NEPSY-II, the test developers examined the flexibility systematically via four different administration orders during the standardization process. Because no order effects were found, they were able to document that the subtests in the NEPSY-II could be administered in a variety of orders without affecting results. This should facilitate subtest administration, particularly in difficult-to-test samples of children (e.g., child psychiatric disorders, ADHD-Hyperactive Type), and this benefit will extend into both the clinical and research realms. While it is recognized that all possible test orders could not be assessed given the possible number of permutations and combinations, this is one of the first times that a test developer has examined the evidence for order effects to such an extent, and it was critical to the NEPSY-II given the expressed option of a flexible subtest administration order.
Throughout the standardization process, the developers of the NEPSY-II made significant efforts in addressing the quality of the data being obtained. Specifically, they ensured that the examiners had testing experience with children, with the majority of the examiners having professional credentials to conduct psychological testing. Examiners were then trained using a videotape of NEPSY-II administration and scoring procedures, and needed to pass a quiz with 90% accuracy on these procedures. Ongoing follow-up also was provided via monthly Internet meetings. Prior to engaging in the standardization phase, examiners were required to provide a practice case, with direct feedback being provided by the test publisher within 48 hours of submission. Once standardization testing ensued, cases were reviewed within 72 hours of receipt in the event of errors in administration and/or scoring, and a periodic newsletter was provided to all examiners indicating where commonly occurring errors appeared on the protocols. Once data were obtained, information was double-entered into the database, and all data were checked routinely for ranges, extreme values, and derivation of the