Добавил:
Upload Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Encyclopedia of SociologyVol._3

.pdf
Скачиваний:
16
Добавлен:
23.03.2015
Размер:
6.4 Mб
Скачать

NONPARAMETRIC STATISTICS

Annual Absences for Single and Married Women

Days Absent

Married Women

Single Women

Row Totals

 

 

 

fo

(fe)

 

fo

(fe)

 

 

 

Low

(0–5)

30

(25)

 

70

(75)

 

100

 

Medium (6–10)

40

(25)

 

60

(75)

 

100

 

High

(11+)

30

(50)

 

170

(150)

 

200

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Column Totals

 

 

100

 

 

 

300

 

400

 

Table 9

of the oldest test statistics) and Spearman’s ρ (one of the oldest nonparametric techniques). They are measures of association based on ordinal data.

τ is a measure of rank-order correlation. It also tests for independence of the two variables observed. It assumes at least ordinal data. The test scores of τ range between −1.00 and +1.0 by convention. A score of +1.00 indicates a perfect agreement between rankings of the two sets of observations. A score of −1.00 indicates a perfect negative correlation which means the ranking in one observation is in the reverse order of the other set of observations. All other scores fall between these two values. The test statistics for data with no ties is given below.

The test statistic:

τ =

S

 

(22)

n (n

1) / 2

 

 

 

 

 

 

Illustrative Assumptions: (1) The data are at least ordinal. (2) Variables in the study are continuous.

Hypothesis:

H0 : Mathematical and verbal test scores

are independent.

(23) H1 : Mathematical and verbal test scores

are related positively or negatively.

The number of concordant pairs is measured by the agreement between the two pairs based on their corresponding standing in their own ranking, and discordant pairs are calculated on the basis of the disagreement between them. The example given below in Table 11 consists of no tied ranks.

 

 

 

 

 

 

 

 

 

 

S = 8 − 7 = 1

τ =

S

 

=

8 − 7

=

1

= 0.067 (24)

n (n

 

 

 

 

1) / 2 6(5) / 2 15

 

S measures the difference between the pairs of observations in natural order. A number of students are ranked according to their mathematical and verbal test scores. The six students are identified by letters A to F. There is no need to know the actual scores, and if actual scores are known, they will be converted to ranks as in Table 10.

What is being measured here with τ is the degree of correspondence between the two rankings based on mathematical and verbal scores. The calculations can be simplified if one of the values is arranged in the natural ascending order from the lowest value to the highest value. The lack of correspondence between the two rankings can be measured by just using the second set of rankings which, in this case, is verbal ability.

Decision: With n = 6, the researcher fails to reject the null hypothesis H0 at 0.05 level since tau = 0.067 is smaller than tau = 0.600, which is the critical value for a two-tailed test.

Efficiency: The asymptotic relative efficiency: Both Kendall’s τ and Spearman’s ρ have an asymptotic relative efficiency of .912 compared to the Personian r (correlation coefficient) which is a parametric test and can go up to 1.12 in exponential type of distribution data.

Related Parametric Test: Personian correlation coefficient (in case of interval or ratio data).

Analogous Nonparametric Tests: Spearman’s

ρ (rho).

1968

NONPARAMETRIC STATISTICS

Mathematical (X) and Verbal (Y) Test Scores for Six Students

 

 

A

B

C

D

E

F

X

Math

9

15

8

10

12

7

Y

Verbal

10

13

9

11

4

12

Table 10

In many studies, however, the same values may occur more than once within X observations, within Y observations, or within both observa- tions—in which case there will be a tie. In the case of tied ranks, different formulas for taua, taub, tauc are used. There is, however, no general agreement in the literature about the usage of subscripts. More detailed treatment of this topic is provided in Kendall and Gibbons (1990). Kendall’s τ measures the association between two variables, and when there are more than two variables, Kendall’s coefficient of concordance can be used to measure the association among multiple variables.

Spearman’s ρ (rho) is another popular measure of a rank-correlation technique. The technique is based on using the ranks of the observations, and a Personian correlation is computed for the ranks. Spearman’s rank correlation coefficient is used to test independence between two rankordered variables. It resembles Kendall’s tau in that regard, and the two tests tend to be similar in their results of hypothesis testing.

Runs and Randomness. A run is a gambling term that in statistics refers to a sequence or an order of occurrence of event. For example, if a coin were tossed ten times, it could result in the following sequence of heads and tails: H T H H H T T H H T. The purpose of runs test is to determine whether the run or sequence of events occurs in a random order.

Another type of runs test would be to test whether the run scores are higher or lower than the median scores. A runs up-and-down test compares each observation with the one immediately preceding it in the sequence. Runs tests are not very robust, as they are very sensitive to variations in the data.

As there are no direct parallel parametric tests for testing the random order or sequence of a

series of events, the concept of power or efficiency is not really relevant in the case of runs tests.

Regression. The purpose of regression tests and techniques is to predict one variable based on its relationship with one or more other variables. The procedure most often used is to try to fit a regression line based on observed data. Though only a few simple regression techniques are included in Table 1, this is one of the more active areas in the field of nonparametric statistics.

The problems relating to regression analysis are similar to those in parametric statistics such as finding a slope, finding confidence intervals for slope coefficients, finding confidence intervals for median differences, curve fitting, interpolation, and smoothing. For example, the Brown-Mood method for finding a slope consists of graphing the observations on a scatter diagram; then a vertical line is drawn representing the median value, and points on both sides of the vertical line are joined. This gives the initial slope, which is later modified through iterative procedures if necessary. Some other tests are based on an extension of the rank order of correlation techniques discussed earlier.

Trends and Changes. Comparisons between, before, and after experiment scores are among the most commonly used procedures of research designs. The obvious question in such cases is whether there is a pattern of change and, if so, whether the trends can be detected. Demographers, economists, executives in business and industry, and federal and state departments of commerce are always interested in trends—whether downward, upward, or existing at all. The Cox-Stuart change test, or test for trend, is a procedure intended to answer these kinds of questions. It is a modification of the sign test. The procedures are very similar to the sign test.

1969

NONPARAMETRIC STATISTICS

Mathematical (X) and Verbal (Y)

Test Scores in Rank Order

 

 

(Y is similar

Y is reverse

Values

Rankings

in natural

of natural

(X, Y )

(X, Y )

order to X )

order of X )

 

 

 

 

 

 

 

 

(7, 12)

(1, 5)

1

 

4

 

(8, 9)

(2, 2)

3

 

1

 

(9, 10)

(3, 3)

2

 

1

 

(10, 11)

(4, 4)

1

 

1

 

(12, 4)

(5,1)

1

 

0

 

(15, 13)

(6, 6)

0

 

0

 

 

 

 

 

 

 

 

 

 

 

8

 

7

 

Table 11

Example: We want to test a statement in the newspaper that there is no changing trend in snowfall in Seattle between the last twenty years and the twenty years before that. The assumptions, procedures, and test procedures are similar to the sign test. These data constitute twenty sets of paired observations. If the data are not chronological, then values below the median and above the median would be matched in order. The value in the second observation in the set is compared to the first and would be coded as either positive or plus, negative or minus, and no change. A large number of positive or negative signs would indicate a trend, and the same number of positive and negative signs would indicate no trend. The distribution here is dichotomous, and, hence, a binomial test would be used. If there are seven positive signs and thirteen negative signs, it can be concluded that there is no statistically significant upward or downward trend in snowfall at the 0.05 level as p(k = 7/20/0.50) is greater, k being the number of positive signs. The asymptotic relative efficiency of the test compared to rank correlation is 0.79. Spearman’s coefficient of rank correlation and Kendall’s Tau may also be used to test for trends.

Proportion. The term ‘‘portion’’ or the term ‘‘share’’ may be used to convey the same idea as ‘‘proportion’’ in day-to-day usage. A sociologist might be interested in the proportion of working and nonworking mothers in the labor force, or in comparing the differences in the proportion of people who voted in a local election and the voter

turnout in a national election. In case of a dichotomous situation, a binomial test can be used as a test of proportion; but there are many other situations where more than two categories and two or more independent samples are involved. In such cases, the chi-square distribution may also be viewed as a test of differences in proportion between two or more samples.

Cochran’s Q test is another test of proportions that is applicable to situations with dichotomous choices or outcomes for three or more related samples. For example, in the context of an experimental situation, the null hypothesis would be that there is no difference in the effectiveness of treatments. Restated, it is a statement of equal proportions of success or failures in the treatments.

NEW DEVELOPMENTS

The field of nonparametric statistics has matured, and the growth of the field has slowed down compared to the fast pace in the 1940s and 1950s. Among the reasons for some of these changes are the development of new techniques in parametric statistics and log linear probability models, such as logit analysis for categorical data analysis, and the widespread availability of computers that perform lengthy calculations. Log linear analysis has occasionally been treated in nonparametric statistics literature because it allows the analysis of dichotomous or ordered categorical data and allows the application of nonparametric tests. As indicated earlier, it was not included in the tests discussed in this article because it is a categorical response analog to regression or variance models.

There is further research being conducted in the field of nonparametric robust statistics. The field of robust statistics is also in the process of developing into a new subspecialty of its own. At the same time, development of new tools—jack- knife, bootstrapping, and other resampling techniques that are used both in parametric and nonparametric statistics—has positively impacted the use of nonparametric techniques. Based on the contents of popular computer software, the field of nonparametric statistics cannot yet be considered part of the mainstream of statistics, although some individual nonparametric tests are widely used.

1970

NONPARAMETRIC STATISTICS

There are many new developments taking place in the field of nonparametric statistics. Only three major trends are referenced here. The scope of application of nonparametric tests is gradually expanding. Often, more than one nonparametric test may be available to test the same or similar hypotheses. Again, many new, appropriate, and powerful nonparametric tests are being developed to replace or supplement the original parametric tests. Last, new techniques such as bootstrapping are making nonparametric tests more versatile.

As in the case of bootstrapping, many other topics and subtopics in nonparametric statistics, have developed into full-fledged specialties or subspecialties—for example, extreme value statistics, which is used in studying floods and air pollution. Similarly, a large number of new techniques and concepts—such as (1) the influence curve for comparing different robust estimators, (2) M-esti- mators for generalizations that extend the scope of traditional location tests, and (3) adaptive estimation procedures for dealing with unknown dis- tribution—are being developed in the field of inferential statistics. Attention is also being directed to the developments of measures and tests with nonlinear data. In addition, attempts are being made by nonparametric statisticians to incorporate techniques from other branches of statistics, such as Bayesian statistics.

Parallel Trends in Development. There are two parallel developments in nonparametric and parametric statistics. First, the field of nonparametric statistics is being extended to develop more tests analogous to parametric tests. Recent developments in computer software include many of the new nonparametric tests and alternatives to parametric tests. Second, the field of parametric statistics is being extended to develop and incorporate more robust tests to increase the stability of parametric tests. The study of outliers, for example, is now gaining significant attention in the field of parametric statistics as well.

REFERENCES

Bradley, James B. 1968 Distribution-Free Statistical Tests. Englewood Cliffs, N.J.: Prentice-Hall.

Conover, W. J. 1999 Practical Nonparametric Statistics, 3rd ed. New York: John Wiley.

Daniel, Wayne W. 1990 Applied Nonparametric Statistics, 2nd ed. Boston: PWS-KENT.

Kotz, Samuel, and Normal L. Johnson, editors-in-chief 1982–1989 Encyclopedia of Statistical Sciences, vols. I- IX. New York: John Wiley.

Gibbons, Jean Dixon 1986 ‘‘Distribution-Free Methods.’’ In S. Kotz and N. L. Johnson, eds., Encyclopedia of Statistical Sciences, vol. 2. New York: John Wiley.

Hettmansperger, Thomas P. 1984 Statistical Inference Based on Ranks. New York: John Wiley.

Hollander, Myles, and Douglas A. Wolfe 1999 Nonparametric Statistical Methods, 2nd ed. New York: John Wiley.

Kendall, Maurice G., and Jean Dickinson Gibbons 1990

Rank Correlation Methods, 5th ed. London: Oxford University Press.

Kotz, Samuel, Norman L. Johnson, Campbell, and B. Read 1982–1989 Encyclopedia of Statistical Sciences, vols. 1–9 and supplement. New York: John Wiley.

Krauth, Joachim 1988 Distribution-Free Statistics: An Ap-

plication-Oriented Approach. Amsterdam: Elsevier.

Kruskal, William H., and Judith M. Tanur, eds. 1978

International Encyclopedia of Statistics, vols. I-II. New York: Free Press.

Neave, H. R., and P. L. Worthington 1988 DistributionFree Tests. London: Unwin Hyman.

Noether, Gottfried E. 1984 ‘‘Nonparametriccs: The Early Years-Impressions and Recollections.’’ American Statistician 38(3):173–178.

Savage I. R. 1962 Bibliography of Nonparametric Statistics. Cambridge, Mass.: Harvard University Press.

Siegel, Sidney 1956 Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.

Siegel, Sidney, and N. John Castellan, Jr. 1988 Nonparametric Statistics for the Behavioral Sciences, 2nd ed. New York: McGraw-Hill.

Sprent, P. 1989 Applied Nonparametric Statistical Methods. London: Chapman and Hall.

SUBHASH R. SONNAD

Recent advancements have accelerated developments in nonparametric and parametric statistics, narrowing many distinctions and bridging some of the differences between the two types, bringing the two fields conceptually closer.

NURSING HOMES

See Long Term Care Facilities.

1971

O

OBJECTIVITY

See Epistemology; Measurement; Scientific

Explanation.

OBSERVATION SYSTEMS

The most sensitive, sophisticated, and flexible instrument of observation available today is the human being. All the recent advances in technology have not changed this central fact; we can monitor communications, transcribe conversations using language processing software, and conduct computer-assisted content analysis of the results. Still, at some point a person who understands the social context must infer the meaning of the communications. Systematic methods of observation may vary the unit of analysis, shift the boundaries of categories, and adjust the level of judgment allowed, but sociologists are ultimately left with the basic reality of human beings watching other human beings. The role of the methodologist is to make this process more systematic.

Most sociological data are filtered through the perceptions of informants in an idiosyncratic manner. Retrospective accounts of events, opinion polls, and surveys measure the output of the social perception process. Only systematic observation, with valid and reliable instruments, provides a record of the events themselves rather than the retrospective reconstruction of the events. The more rigorously defined the categories, the more confident the researcher can be that the data

reflect the events and not just the biases and preconceptions of the informants.

While some of the systems that are described below were developed for specific purposes such as the observation of business case-study groups or the diagnosis of psychiatric patients, most attempt to capture the full range of social behavior and may thus be applied to a wide range of settings. Not included here are specialized systems that have been developed for single contexts, such as the classroom behavior of small children, the responses of subjects in tightly controlled laboratory experiments, and the evaluation of employees. One fast-food restaurant, for example, has developed a thirty-one-category observation system that managers can use to observe and evaluate counter staff. Items include ‘‘There is a smile,’’ ‘‘The bag is double folded,’’ and ‘‘Change is counted efficiently.’’

Observation systems have been used for a wide variety of purposes over the years. Early uses included psychiatric diagnosis, job placement, and basic research into group process and development. As corporate assessment centers came into widespread use for the selection of executives, early observation systems reappeared for the analysis of leaderless-group exercises. More recent applications have included research and consulting on team building and training, the evaluation of social workers, the prediction of success and failure of military cadets, the study of leadership networks in large corporations, the evaluation and treatment of problem children in the classroom, the evaluation of psychiatric interventions, the analysis of delinquent behavior, and resocialization,

1973

OBSERVATION SYSTEMS

and consultation on mergers and consolidations (Polley et al. 1988). In the past ten years, direct observation has been in decline. As a result of the high costs in terms of time and money, systematic observation has often given way to retrospective rating systems (Bales 1998). Such methods are, however, a poor substitute for direct observation.

SINGLE-CATEGORY SYSTEMS

Elliot D. Chapple introduced the interaction chronograph in 1940. It was a simple device that consisted of two telegraph keys. Observers were instructed to press key A when person A spoke and key B when person B spoke. A record of the conversation was kept on a moving paper tape. Not surprisingly, inter-rater reliabilities were nearly perfect. Twenty-five years later, the human observers wee replaced by voice-activated microphones attached to analog-digital converters (Wiens et al. 1965). That such a simple device could replace the human observer suggests that the systems were not taking full advantage of the observers’ capabilities. In reality, the decision to record such objective and basic information simply shifted the burden of interpretation from the observer to the researcher. Elliot Chapple (1940) and his successors developed elaborate schemes for interpreting the patterns of lines and blanks that appeared on their paper tapes. At the peak of its popularity, the interaction chronograph was used for everything from psychiatric diagnosis to employee placement.

MULTIPLE CATEGORY SYSTEMS

Chapple’s work serves as an important benchmark. The near-perfect reliability is achieved at the cost of validity. The observation systems that followed it generally traded off a measure of reliability for greater validity. More meaningful sets of categories will almost certainly be harder to employ with any degree of inter-rater reliability.

Interaction Process Analysis (IPA) was one of the earliest attempts to devise more meaningful categories. Bales (1950) began with an encyclopedic list of behaviors and gradually consolidated them into an elegant set of twelve basic categories. Fifty years later, his original system is still one of the most widely applied observation methods.

The IPA category list was evolving at the same time that Parsons and Bales (1955) were developing a model of family socialization. They saw family leadership in the 1940s and 1950s as divided between the father and the mother. IPA perpetuates this somewhat dated division through its primary dimension; six categories are provided for coding task-oriented behaviors and six are for coding socioemotional behaviors (Table 1). When Slater (1955) suggested that this role differentiation could be extended to a general model of effective leadership in groups, he sparked a continuing controversy. But it must be remembered that this conceptualization preceded any serious consideration of either androgyny or flexibility in sex roles.

Additional symmetries are built into IPA. Three of the six socioemotional categories carry positive affect, and each has a direct counterpart on the negative side. Task-oriented behavior is seen as the process of asking questions and offering answers, though the answers—in the form of suggestions, opinions, and orientation (or information)—may be in response to questions asked or implied. Finally, the functionalist orientation of Parsons and Bales (1955) appears in the identification of six problems faced by groups: communication, evaluation, control, decision, tension reduction, and reintegration.

Following Chapple’s lead, Bales developed and marketed a moving paper-tape recording device. The interaction recorder allowed for observation of groups rather than just dyads; the tape was divided into twelve rows and moved past a window at a constant speed so that the observer could write a code, indicating who was speaking to whom, within a category-by-time sector on the tape. This added complexity and enlarged the unit of analysis. The interaction recorder provided a continuous on-off record while IPA recorded discrete acts. The coding unity was, however, kept small. IPA coders often record two or three acts for a single sentence and are expected to record all acts. The first serious challenge to Bales’s IPA system was Borgatta’s Interaction Process Scores (IPS) system (1963). Borgatta argued that the twelve categories failed to make some crucial distinctions. His redefinition of the boundaries resulted in an eight- een-category system that had the advantage of greater precision and the disadvantages of greater

1974

OBSERVATION SYSTEMS

Interaction Process Analysis

 

Category

Emotion

Task

Problem

1.

Shows solidarity

Positive

 

Reintegration

2.

Shows tension release

Positive

 

Tension reduction

3.

Agrees

Positive

 

Decision

4.

Gives suggestion

 

Answer

Control

5.

Gives opinion

 

Answer

Evaluation

6.

Gives orientation

 

Answer

Communication

7.

Asks for orientation

 

Question

Communication

8.

Asks for opinion

 

Question

Evaluation

9.

Asks for suggestion

 

Question

Control

10.

Disagrees

Negative

 

Decision

11.

Shows tension

Negative

 

Tension reduction

12.

Shows antagonism

Negative

 

Reintegration

Table 1

SOURCE: Adapted from Bales (1950) p. 14.

complexity and a lack of symmetry. The former problem was largely solved by the availability of a self-training workbook (Borgatta and Crowther 1965). Such a manual has never been widely available for IPA.

The next logical step after categorization is the organization of categories. IPA had an implicit organization that was lacking in IPS. This, and the fact that IPA had twelve rather than eighteen categories, made IPA the easier system to learn and use. As Weick (1985) points out, however, there is a problem of ‘‘requisite variety.’’ A system for understanding a phenomenon must be at least as complex as the phenomenon itself. Weick uses the metaphor of a camera with variable focal length. In order to photograph objects at twenty different distances, the camera must have at least twenty focal settings or the pictures will not all be of equal clarity. This creates real problems for the interaction chronograph. Clearly, social behavior is more complex than Chapple’s on-off category system. Unfortunately for researchers, it is also more complex than IPA’s twelve-category system. This creates a dilemma. The mind can hold only so many categories at once; even with twelve categories, most observers tend to forget the rarer ones in an attempt to simplify their task.

MULTIDIMENSIONAL SYSTEMS

Timothy Leary (1957) proposed one of the first observation systems based primarily on dimensions rather than categories. His interpersonal diagnosis system placed sixteen categories at the compass points of a two-dimensional circumplex. The dimensions, dominance-submission and lovehate, would prove to be the two most common dimensions among the systems that followed. While intended primarily for the diagnosis of psychiatric disorders, the system also identified the ‘‘normal,’’ or less intense, variant of each behavior as well as the likely response that each behavior would generate in other people. For example, the general behavior in the direction of dominance is ‘‘manage, direct, lead’’; this behavior is likely to ‘‘obedience’’ in others. The extreme version of the category is ‘‘dominate, boss, order.’’ This falls into the larger category of behavior that is described as ‘‘managerial’’ when in the normal range and ‘‘autocratic’’ when in the abnormal range. More than a method of observation, the interpersonal diagnosis system was a remarkably well-articulated theory of interpersonal relations. Were it not for Leary’s well-publicized advocacy of lysergic acid diethylamide (LSD), the system might very well be in common

1975

OBSERVATION SYSTEMS

use today. While the research was briefly resurrected by McLemore and Benjamin (1979), it never had a major impact on either social psychology or psychiatry. Leary claimed that the system worked so well for psychiatric diagnosis that a receptionist trained in its use could provide diagnoses based on observations in the waiting room that rivaled those obtained by psychiatrists after conducting a diagnostic interview.

Leary’s two dimensions were theoretically derived. In a 1960 doctoral dissertation, Arthur Couch pioneered the application of factor analysis to interpersonal behavior, thus offering an empirical alternative for the derivation of dimensions. Couch’s six dimensions of interpersonal behavior were derived from the factor analysis of a vast amount of data on twelve groups of five undergraduates each. The individuals were given a large battery of personality tests and the twelve groups were observed participating in a wide variety of tasks across five meetings each. While the factor analysis is not independent of the original categories of measurement and observation, Couch’s data set was so exhaustive as to deserve credence.

MULTILEVEL SYSTEMS

The three dimensions of Bales and colleagues’ SYMLOG (System for the Multiple Level Observation of Groups) (1979) owe much both to IPA and to Couch’s empirical work. They also reflect the legacy of Leary’s circumplex. Dominant-submis- sive (up-down, or U-D) was the first factor from Couch’s analysis; in IPA it corresponds to total amount of talking rather than to interaction in any specific set of categories; Leary’s interpersonal diagnosis system had already identified the primary dimension as dominant-submissive. In essence, this was also the dimension that Chapple’s interaction chronograph measured. Friendly-unfriendly (Positive-negative, or P-N), Couch’s second factor, was represented in IPA by the three positive and three negative socioemotional categories, and it corresponds to love-hate in Leary’s system. Task- oriented–emotional expressive (forward-backward, or F-B), the controversial distinction from IPA, was a compromise that created one bipolar dimension out of two of Couch’s remaining factors. These three dimensions, generally referred to by the code letters shown above, define a three-di- mensional conceptual space. Following Leary, Bales

and his colleagues then defined the compass points of the space. In this case, definitions were produced for all twenty-six vectors in the three-dimen- sional space. Thus, a dominant, unfriendly, taskoriented act (UNF) is defined as ‘‘authoritarian and controlling,’’ and a submissive, unfriendly, emotional act (DNB) is described as ‘‘withdrawn and alienated.’’ This was a creative solution to the dilemma of providing requisite variety while keeping the number of categories reasonable. While there are twenty-six categories, they are organized into three dimensions, so coders can hold the three-dimensional space rather than the twenty-six categories in mind. The twenty-six category descriptions then become a reference to be used when first learning the system and trying to understand where specific behaviors fit.

In addition to the basic level of behavior described above, SYMLOG also provides definitions for the twenty-six vectors on the level of nonverbal behavior. This level is coded when unintentional messages are sent through nonverbal behaviors or when the nonverbal—or paralinguistic—cues are at variance with the overt verbal cues. In developing his descriptions of facial expression and nonverbal cues, Bales drew on the work of the eight- eenth-century French Encyclopedists. He found that Diderot and his colleagues had developed a more sophisticated understanding of facial expression and nonverbal nuance than have modern social scientists.

SYMLOG also allows for the coding of verbal content. Theoretically, the two levels of behav- ior—overt and nonverbal—could be coded without reference to the content of the message. Conversely, the content of messages could be coded from written transcripts without reference to the behaviors of the speakers. The content coding level attends only to evaluative statements. Each statement is first coded PRO (for statements in favor of something) or CON (statements against something). The content of the value statement is then coded in the same three-dimensional space that was used for behavior. Finally, the level of the value statement is recorded. Levels begin with the self and move outward: self, other, group, situation, society, and fantasy.

When the full SYMLOG system is used for coding a conversation, the result is a set of simple

1976

OBSERVATION SYSTEMS

sentences written in code. For example, Figure 1 records a brief exchange that took place from 1:23 to 1:24. Joe ordered Bill to stop contradicting him. The behavior was authoritarian (UNF) and the value statement was against negativity in Bill (CON N BIL). Bill responded in a rebellious manner (UNB) and made a value statement against Joe’s authoritarianism (CON UNF JOE). Ron intervened in a ‘‘purposeful and considerate’’ manner (UPF) and made a value statement in favor of reducing the level of conflict (PRO DP GRP). If we see contradictions between the overt and nonverbal behavior, we could add lines such as 123 JOE GRP N DN. This would indicate that underneath Joe’s authoritarianism is a note of insecurity and nervousness. (The first ‘‘N’’ in the message stands for ‘‘nonverbal,’’ the ‘‘A’’ in each of the messages in Figure 1 stands for ‘‘act.’’) Clearly, the system requires fairly extensive training of coders. Most SYMLOG coders went through a fifteen-week course that was run as a self-analytic group. During this time, they alternated between serving as group participants and retreating behind a mirror to serve as coders. While there remain some universities in which such courses are still taught, the use of this method peaked in the late 1970s and early 1980s. Most work being done with SYMLOG at this time uses a parallel system of retrospective rating which requires no special training. This has resulted in time and cost savings, but at the expense of sacrificing the dynamic moment-to-mo- ment process measurement that the original SYMLOG observation system made possible.

SYMLOG adds depth and sophistication to the coding of social interaction but again enlarges the unit of measurement. While IPA is an act-by- act coding system, SYMLOG is a ‘‘salient-act’’ coding scheme. Since it takes longer to record a SYMLOG observation, the coder must select the most important acts for recording. Clearly, this results in a loss of reliability as disagreements may result not only from two observers interpreting the same act differently but also from two observers selecting different acts for recording. Because of this, two observers are generally sufficient for IPA, but five or more are recommended for SYMLOG. Alternatively, interaction can be coded from video recordings. By going over the record repeatedly and picking up different acts with each pass, a higher percentage of acts can be picked up

by each coder, thus increasing the inter-rater reliability and decreasing the number of coders required. The only problem with this alternative is that SYMLOG does not record intensity. When multiple coders are used, more intense acts are likely to be selected as salient by more coders and will thus be weighted more heavily than the less intense acts that may be picked up by only one or two coders. Using only two coders and allowing them multiple passes in order to pick up a higher percentage of acts eliminates this implicit weighting effect.

The issue of reduced reliability is directly related to the problem of subjectivity. Moreno (1953) was an early critic of IPA; he argued that the observations of nonparticipants were meaningless because they could not possibly comprehend the life of a group to which they did not belong. While his position was extreme, it raised a difficult problem for any system that relies on the observations of outsiders. By standing outside the group, the outsider gains distance and ‘‘objectivity.’’ Unfortunately, it is not clear that objectivity has any real meaning when speaking of interpersonal interaction. IPA sidestepped the problem by providing very clear specifications of categories. SYMLOG confronts the problem directly since observers are required to make fairly strong inferences as to the meaning of behavior. The coder is instructed to take the role of the ‘‘generalized other,’’ or the ‘‘average’’ group member. When a group is polarized, this may be impossible. Half the group is likely to interpret an act in one way and the other half in a different way. This is another reason for having multiple observers. It is hoped that the various biases of the observers will cancel one another out if five or more people observe. Polley (1979) goes a step further by providing ‘‘descriptive reliabilities.’’ Instead of simply reporting a reliability figure, descriptive reliabilities allow for a systematic analysis of observer bias. It is argued that observations tell us as much about the observer as about the observed. This is particularly true when group members are trained to serve as observers, as is the case in the self-analytic groups that are often the subject of SYMLOG observations.

Both IPA and SYMLOG have been repeatedly criticized for their use of a bipolar model of the relationship between ‘‘task-oriented’’ and ‘‘emotional’’ behavior. Two of these critiques have proposed adding a fourth dimension. Hare (1976)

1977

OBSERVATION SYSTEMS

 

Time

Who

To Whom

Act/Non

Direction

Description

Pro/Con

Direction

Level

 

123

JOE

BIL

A

UNF

Stop contradicting me!

C

N

BIL

123

BIL

JOE

A

UNB

Don’t give me orders.

C

UNF

JOE

124

RON

GRP

A

UPF

I think we should all calm down.

P

DP

GRP

 

 

 

 

 

 

 

 

 

 

 

Figure 1. Sample SYMLOG Coding

went back to Parsons’ original Adaptation, Goal Attainment, Integration, Latent Pattern Maintenance (AGIL) system and concluded that the task dimension should be divided into two dimensions: serious versus expressive and conforming versus nonconforming. Wish and colleagues (1976) proposed leaving the task-emotion dimension intact, even though it did not seem to be quite bipolar, and adding a fourth dimension: intensity. Polley (1987) has argued that emotionality is already captured in the friendly-unfriendly dimension and that the third dimension should be reserved for recording conventional versus unconventional behavior. This solution contends that the polarization of task and emotion is an artifact from the 1950s and has largely lost its meaning since then. If work is defined as devoid of emotional satisfaction, then task and emotion are bipolar. As soon as we recognize the possibility of having an emotional reaction—positive or negative—to work, the two dimensions become orthogonal. As Stone points out, the task-emotion polarity ‘‘implies that most work involves sublimation of the libido, and demands impulse control. Moreover, it becomes difficult to imagine management’s orientation . . .

fostering individual creativity as some so-called excellent companies have been able to do’’ (1988, p. 18). It is becoming increasingly apparent that no ‘‘universal’’ scheme for coding interpersonal interaction exists totally independent of cultural and temporal context.

SPECIALIZED SYSTEMS

While Leary’s system was intended primarily for psychiatric diagnosis, and IPA was originally designed for the coding of case discussion groups at Harvard Business School, the methods described above represent attempts to develop comprehensive and general coding schemes. In addition to these all-purpose methods, observation systems

have been devised for somewhat more specific relationships or channels of communication.

Richard Mann’s (1967) sixteen-category system codes only one-way relationships from members to leaders. By narrowly defining the relationships to be observed, Mann was able to provide a much more detailed picture. Coding categories are provided for four types of impulse, four different expressions of affect, three variations of dependency, and five ego states. The additional sophistication is achieved by drastically reducing the number of observed relationships. If we observe a ten-person group using Mann’s leader-member relationship scheme, we are looking at nine oneway relationships. If we use one of the all-pur- pose methods, we are coding forty-five two-way relationships.

The most thoroughly studies channel of communication is probably the nonverbal. While this is a small part of the SYMLOG system, proxemic behavior has been exhaustively categorized by Hall (1963) in terms of posture, orientation of bodies, kinesthetic factors, touch code, visual code, thermal code, olfaction code, and voice loudness. The potential complexity of this mode of communication is further illustrated by the fact that Birdwhistell (1970) developed an equally elaborate system for the coding of movement. While Hall’s system codes states, Birdwhistell’s codes state-to-state transitions. Each of these coding schemes concentrates on the physical nature of nonverbal behavior. In contrast, Mehrabian’s system (1970) attempts to directly record the meaning of the behavior via a threedimensional model that closely parallels SYMLOG. Again, the difference is in whether meaning is inferred by the observer or deferred to the researcher.

The other channel of communication that has been studied in depth is content. Again, this is a small part of SYMLOG; all content that does not carry a positive or negative evaluation is ignored.

1978

Соседние файлы в предмете [НЕСОРТИРОВАННОЕ]