
Otolaryngology - Basic Science and Clinical Review
So, if the cultures of medicine and science are so different, why should members of one bother to try to learn about the other? The answer seems obvious. Each culture has something to offer that will enrich the other. Science can answer questions that will improve our ability to provide the best care to patients. Medicine can inform scientists of the clinical issues in greatest need of address and thereby enable scientists to focus on investigations of greatest relevance. The clinician-scientist becomes an ambassador between these two cultures, providing a conduit for transmission of ideas and information in both directions. A clinician-scientist becomes a valuable collaborator who can interact with “pure scientists” and “pure clinicians” to advance clinical care. Becoming a clinician-scientist, however, takes time and patience. There is no shortcut to membership in either culture. Whether the clinician spends several years full time or many years part time, development of research expertise will take as long as it does for the full-time scientist.
Clearly, not all physicians have the interest, desire, or stamina to become independent investigators. Is there a value in some less rigorous exposure to science? There are several good reasons for all physicians-in-training to have research experience. Most important is the need for clinicians to read and understand the medical literature. The only rational basis for clinical innovation is found in scientific contributions to the literature. If the clinician cannot read critically, he or she cannot decide whether to accept or reject new ideas from the research world. Critical reading depends on some familiarity with the process by which the new information was discovered. Awareness of clinical issues alone is inadequate. The critical reader must be able to assess validity of the experimental design, methodology of data collection and analysis, and interpretation of the results. Hands-on experience in conducting science is an invaluable asset in developing skills of critical reading. Another benefit of research experience during clinical training is to enable each trainee to make a lasting contribution to his or her field. Trainees may never enter another laboratory or conduct another experiment for the rest of their lives, but they will forever have the satisfaction of knowing there is an article in the literature and reprints on their shelves that represent their unique contribution to the specialty’s body of knowledge. Finally, trainees deserve an opportunity to gain exposure to as many career options as possible. Because scientific investigation may be appealing to some, trainees will benefit from a chance to try it. Those with aptitude can be encouraged to seek further training in the hope of weaving scientific investigation into the fabric of their careers.
There are many facets to a scientific career, and it is impossible to expose trainees to all aspects equally. The resources available at different institutions vary greatly. There are currently two common models for research training within residencies. One is to assign residents a particular laboratory or investigator who will then assign a project. Typically, the resident simply learns the necessary technique in use by that investigator, generates some data, and writes up the project, functioning as a rather high-level technician. The strengths of such an approach are that the residents are virtually guaranteed a publication and/or presentation, and they get some experience with the chosen techniques, collecting and managing data, and working with scientists. An alternative model is to require the residents to identify their own research question, find a suitable mentor to help frame a testable research hypothesis, and design and execute an experiment. The strength of this approach is that it is a realistic simulation of a life in science. Most of the work involved in conducting research is in the conceptualization of the project. One can hire technicians and research assistants to “turn the crank” on a project, but only the principal investigator can do the intellectual work of asking the right question and designing the experiments to answer it. Unfortunately, this approach to research training engenders a great deal of frustration and often fails to yield a completed project. Whichever model is used, residents’ training will be enriched by an opportunity to conduct a scientific investigation.
This chapter is about conducting clinical research. Conceptually, performing scientific investigation in the clinical domain is no different from performing it in the laboratory. There are the same requirements for sound research design, data collection, data analysis, and interpretation. In practice, however, there are some significant differences. Most importantly, clinical research is performed on human subjects. This imposes both practical and ethical considerations. Subject accrual, control groups, informed consent, and the generalizability of findings are but a few of the issues that arise. It is impossible to present a comprehensive course on clinical research in this brief chapter. There are many excellent books on the topic, some of which are listed in the Suggested Readings section at the end of this chapter, and the reader is strongly encouraged to seek such resources before embarking on a project. What follows here is an introduction to some of the important topics bearing on the conduct of clinical research.
WRITING A RESEARCH PROPOSAL
A written research proposal is an essential part of every research project. A typical research proposal is composed of an introduction that states the research topic, a list of specific aims or objectives of the project, a summary of the background and significance of the proposed work, a research plan detailing the method and materials, and a budget with justification of all proposed expenses. The proposal serves many functions. It will identify the research question and hypotheses. It will state the significance of the work. It is the first opportunity for the investigator to lay out the logical arguments supporting the research, enabling the investigator to visualize the project as a whole and in the context of whatever clinical questions and scientific studies have come before. It will codify details of the proposed methodology, necessary resources (financial, personnel, equipment, supplies, space, etc.), and the time frame of the study. It provides a sort of road map for the collaborators on the project to be sure they are all in agreement about the structure, work assignments, and execution of the project. It is often used as the basis for a grant application to obtain funding for the project. It is typically used as a draft version of the introduction and methods sections of any publication coming from the project. A well-written research proposal launches a project.
Research projects begin with a question. There are three criteria that a research project must meet:
(1) the topic must hold the researcher’s interest for the duration of the project, (2) the project must be doable with available resources (time, money, personnel, equipment, etc.), and (3) the project must be nontrivial. Therefore, identifying the research question is very important. Questions arising from the young researcher’s own curiosity tend to be more motivating than those assigned by others. Often a clinical dilemma, interesting or poignant case, or unexpected finding will stimulate curiosity. This curiosity should lead to the literature to learn what is already known about the topic. Initially, such reading may be in review articles or textbook chapters but rapidly moves into primary source literature. Once the relevant clinical literature is exhausted, reading progresses into related basic science that bears on the topic. Eventually, one reaches a frontier where there is no scientific foundation for clinical practice, where clinical practice is supported only by tradition, empiricism, and folklore. This boundary is where biomedical science really takes place. The immediate research question then becomes obvious: it is the next thing that needs to be learned to build a foundation for clinical practice. Articulation of this question frames the research.
A research project cannot be based on a vague or general description of the investigator’s curiosity. It must have very specific objectives that, once enumerated, guide the design of the proposed experiments. In “grant speak,” these objectives are called “specific aims.” A well-conceived project typically has anywhere from one to four specific aims. One can write a short overview in a few sentences stating the research question, background, and significance, then enumerate the specific aims. The statement, “We will see what happens when we give gentamicin eardrops to patients with external otitis” is not a specific aim. There are many different styles for writing specific aims. A particularly effective style is to state each aim in terms of the hypothesis to be tested and the method to be used for that part of the experiment. For example, “We will test the hypothesis that administration of gentamicin eardrops accelerates resolution of external otitis by comparing the results of serial ear canal cultures in a group of external otitis patients receiving gentamicin drops to a control group of external otitis patients receiving acetic acid drops.” Anyone reading the proposal, then, has a short synopsis of exactly what the investigator is trying to accomplish and how.
The background and significance of the project may best be understood by the analogy of building a pyramid. Every fact discovered by scientific research is one brick in a pyramid. Within any giant pyramid, there are a multitude of smaller pyramids. One can think of the giant pyramid as the “big picture.” For example, curing cancer is a giant pyramid that is still under construction. Embedded within it are countless other pyramids; there is one with discovery of oncogenes at its apex, another with p53 at its apex, another with cell surface marker antigens at its apex, another with sensitivity to cisplatin at its apex, etc. If you are a pyramid builder, no matter how much you want to place a brick in a specific place, you cannot do so unless someone has laid the other bricks upon which yours will rest. You cannot suspend a brick in space. Thorough and complete description of all the previous research and clinical knowledge that lead to and support your proposed project is the background. It begins several layers deep and climbs to include all the work immediately contiguous to the current proposal. It is the small pyramid that has your proposed project as its apex. The surrounding larger or giant pyramid in which your project is embedded is the significance. Whether the background and significance are written in bullet form, outline form, or narrative, they provide a chain of logic that should convince the reader that the proposed project is the necessary next important brick to lay.
Once the research proposal has stated the research question, listed the specific aims, and described the background and significance, it must give details of the method and materials, or research plan, that will accomplish the stated objectives. The research plan should state the research design: retrospective versus prospective, cohort versus case control, longitudinal versus cross-sectional, etc. (see Research Design). It must include a description of research subjects, including inclusion and exclusion criteria that will qualify them for the study. There should be a clear statement of predictor (independent) and outcome (dependent) variables and how they will be measured. There must be a description of controls and a justification for their selection. The proposed sampling methods and measurements should be described, as well as a calculation of sample size necessary to achieve statistically meaningful results. Data analysis and statistical methods must be described. It is often helpful to include a section on anticipated problems or pitfalls and how those may be dealt with if encountered. It is worthwhile to be detailed and thorough in this part of the research proposal because it can help avoid methodological flaws that could ultimately undermine the validity of the results. Fig. 13-1 is an outline of a research proposal. Completion of each section will form the basis of a complete and detailed written research proposal.

Outline of study protocol
  Title
  Research questions (objectives)
  Significance (background)
  Design:
    Subjects
      Selection criteria
      Sampling design
    Variables
      Predictor (independent)
      Outcome (dependent)
    Statistical issues
      Hypotheses and analytic approach
      Sample size and power

Figure 13-1 Research proposal outline (from Hulley and Cummings [1987]).
RESEARCH DESIGN
The design of a research study depends on the hypothesis to be tested and the resources available for realistically performing the study. There is an underlying assumption that findings in the study will be generalizable to the real world. The extent to which this is a valid assumption depends on the experimental design. If the population studied is a representative sample of the real-world population of interest, generalizability of results is enhanced. For example, if one wishes to assess the efficacy of a new antihypertensive drug in senior citizens, studying the drug’s effect in
volunteer medical students is of questionable benefit. Testing that same drug in attendees at the weekly bingo game at the local senior center would be better. However, if 90% of the bingo players are women, it may be risky to assume equivalent drug efficacy in men. The choice of experimental design seeks to balance reliability/generalizability of the study and cost in time and money.
The process begins with identification of the actual, real-world population of interest, the population to whom these results will ultimately be generalized. This enables the investigator to define inclusion criteria for suitable research subjects. Then the inclusion criteria are applied to an accessible target population that is a representative subset of the population of interest. Exclusion criteria eliminate those potential subjects who are unable to participate, who may provide bad data, or whom it would be unethical to study. The remaining pool of candidates is either studied in its entirety or must be sampled. Sampling methods include consecutive samples, probability samples (e.g., random sampling), and nonprobability samples (e.g., convenience or judgmental).
In an experimental study, research subjects identified in the manner described here are assigned by the investigator to experimental and control groups who differ only by the nature of the experimental intervention. Assignment is done by a predetermined systematic means such as randomization. In a quasi-experimental study, the investigator is not able to make assignment into experimental and control groups. Usually this is a limitation imposed by the logistics of performing the study. For example, to compare the efficacy of two different treatments, one may need to compare subjects given treatment A at one institution to subjects given treatment B at another institution. The choice of which treatment is administered at each institution is not in the investigator’s control. The two cohorts may differ in ways that undermine the validity of the study, but there is no practical alternative means of accomplishing the research.
There are many different designs, varying in complexity, reliability, and applicability. The inexperienced investigator is strongly advised to seek guidance from an expert in clinical research or consult one of the many excellent texts on the subject. It is essential that decisions of design, data collection, and data analysis be made before the project begins. Imagine a carpenter who builds a house and, only once the walls are all finished, calls in an electrician to install some lights and wall sockets. This is the situation faced when someone compiles a mass of data and then goes to the statistician for help with analysis. The experimental design, data collection method, type and amount of data collected, and statistical methods are all interrelated.

The following is an introduction to some of the most important of these factors.
RESEARCH DESIGN CATEGORIES
Research design can be broken into three general categories: experimental, quasi-experimental, and ex post facto (i.e., retrospective), in descending order of reliability.
In experimental design one group of subjects is exposed to an experimental intervention, and one or more other control groups are treated differently. The assignment of treatments is under the control of the investigator, and the assignment of subjects to the various groups is by some predetermined systematic means, such as randomization, to ensure that experimental and control groups differ only on the basis of the experimental intervention. This is the model routinely used in laboratory animal experimentation. It is the design most likely to yield unambiguous results or determine causality and least likely to be confounded by various forms of bias. Its use in clinical research is more restricted, but it is still the ideal for obtaining results of the greatest validity.
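The predetermined, systematic assignment just described can be illustrated with a short sketch. This is a hypothetical example rather than a production scheme (in a real trial the allocation sequence is generated and concealed by a statistician); it uses permuted-block randomization, which keeps the two arms balanced throughout subject accrual:

```python
import random

random.seed(42)  # fixed seed only so the illustration is reproducible

def block_randomize(n_subjects, block_size=4):
    """Permuted-block randomization: within each block, half the slots go to
    each arm in random order, so group sizes stay balanced as subjects accrue."""
    assignments = []
    while len(assignments) < n_subjects:
        block = ["experimental"] * (block_size // 2) + ["control"] * (block_size // 2)
        random.shuffle(block)  # randomize order of assignments within the block
        assignments.extend(block)
    return assignments[:n_subjects]

allocation = block_randomize(40)
print(allocation.count("experimental"), allocation.count("control"))  # prints "20 20"
```

Because every block contributes equally to both arms, the groups differ only by the intervention, not by when or how subjects entered the study.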
Quasi-experimental design, as the name implies, looks like experimental design but differs in that the investigator cannot assign subjects to an absolutely equivalent control group. For example, subjects may not be assigned randomly to two different treatments, they may vary in
age, severity of their disease, temporal factors in duration of symptoms or time since treatment, site of treatment, or any host of other variables. Internal validity of a study, the likelihood that an observed effect of intervention is truly due to that intervention, is weaker in quasi-experimental design. It is the lack of experimental controls that introduces many threats to internal validity because there may be many variables that are not appreciated or controlled by the investigator. It is the job of the investigator to try to identify a population of patients to serve as approximate controls. The more closely the control group resembles the experimental group, the stronger the inference that can be drawn from the comparison. Cohort comparison and case-control studies are examples of quasi-experimental designs that can have good reliability.
Ex post facto design refers to analysis of groups after the fact. The investigator has no control of assignment to experimental groups. In other words, the investigator has no control of independent variables because they have already occurred. Sometimes this is the only practical type of research to answer certain questions, for example, in a disease so rare that it may take many years to accrue a significant number of patients for study. However, retrospective studies are also the weakest in reliability because of the loss of investigator control of random treatment assignment and control of variables. Retrospective studies are the most likely to lead to incorrect interpretation, especially by faulty conclusion of causality, a cause-and-effect relationship that can never be ascertained retrospectively. The greatest methodological flaw is to approach ex post facto data with no specific hypothesis or prediction in mind, but to just go “data mining,” looking for statistical associations. Such associations invariably show up but are uninterpretable. At best, these associations can be considered tantalizing findings that justify a prospective experimental or quasi-experimental study. Table 13-1 (from Troidl et al. [1998]) gives a hierarchy of reliability in research methods in descending order.

TABLE 13-1 HIERARCHY OF RELIABILITY OF RESEARCH METHODS FOR ASCERTAINING DIAGNOSTIC OR TREATMENT EFFECTIVENESS

Research method (in descending order of level of reliability)

True experiments (investigator controls both allocation to groups and determination of treatment)
  Randomized concurrent controlled trial including crossover design with random order of treatment
  Historical controls only in special case of certain diagnosis and known course of events
  Randomized concurrent controlled trial with weakly randomized assignment or systematic assignment (odd/even, alternate appearance, etc.), including crossover design with systematic order of treatment
  Nonrandomized concurrent controlled trial
  Short-interval sequential trials within same institution or service
  Controls from separate institutions or services with documented attention to coordination
  Controls from separate institutions or services with poor or no attention to coordination

Nonexperimental methods
  Cohort comparison studies
  Historical controls
  Nonexperimental case-control or case-referent studies
  Series of cases (all comers to a center over a time interval)
  Large series of cases (consecutive)
  Small series of cases (consecutive)
  Isolated case reports with documentation of active surveillance
  Isolated case reports (volunteer)
  Case report
DATA COLLECTION
Research data can be acquired in two ways: they can be collected by the investigators or their agents (observational data) or contributed by the research subjects (questionnaires and surveys). In large part, the choice of data collection method depends on what is being measured. Individual measurements should be sensitive, specific, appropriate, and objective, and they should detect differences over a range of values. Questionnaire and interview instruments should be clear, accurate, and reliable, qualities that are assessed by validating the instrument in advance of using it in a research study.
The precision and accuracy of measures are paramount. Precision is defined as the consistency of repeated measures. It is a major determinant of sample size and statistical power. It is reduced by random error introduced by variability in the observer, the measurement instrument, or the research subject. Trained observers using a manual of procedures developed for the project will generate more precise results; so will automated instruments and averaging the results of repeated measures. Accuracy is defined as the degree to which a taken measurement actually reflects the true value of the variable. It is reduced by systematic error (bias) in the observer, instrument, or study subjects. Calibration, unobtrusive measures, and blinding improve accuracy.
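The distinction can be made concrete with a small simulation. In this hypothetical sketch (the true value, bias, and noise level are invented for illustration), every reading carries a fixed systematic bias, which degrades accuracy, plus random noise, which degrades precision. Averaging repeated readings shrinks the random scatter but leaves the bias untouched:

```python
import random
import statistics

random.seed(0)  # fixed seed only so the illustration is reproducible

TRUE_VALUE = 100.0  # the quantity being measured (hypothetical units)
BIAS = 3.0          # systematic error: degrades accuracy, not precision
NOISE_SD = 5.0      # random error: degrades precision

def single_reading():
    """One measurement: true value + systematic bias + random noise."""
    return TRUE_VALUE + BIAS + random.gauss(0, NOISE_SD)

def averaged_reading(n_repeats=25):
    """Average of repeated measurements: random scatter shrinks, bias remains."""
    return statistics.mean(single_reading() for _ in range(n_repeats))

singles = [single_reading() for _ in range(1000)]
averages = [averaged_reading() for _ in range(1000)]

print(f"SD of single readings:   {statistics.stdev(singles):.2f}")   # near NOISE_SD
print(f"SD of averaged readings: {statistics.stdev(averages):.2f}")  # roughly NOISE_SD/5
print(f"mean of averaged readings: {statistics.mean(averages):.1f}") # near 103, not 100
```

The averaged readings are far more precise (smaller spread) yet no more accurate: their mean still sits near the biased value of 103 rather than the true 100, which is why calibration and blinding, not repetition, are the remedies for bias.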
Questionnaires and surveys can be self-administered or administered in a structured interview. In either case, when possible one should use existing instruments whose validity is already known. New instruments must be validated before use, a laborious process best done in collaboration with someone expert in questionnaire design. Although open-ended questions are useful for uncovering new or unanticipated information, closed-ended questions are more amenable to data analysis and are usually easier to answer. Generally, questions should be independent of each other, but they may be combined into summative or cumulative scores to
gain an improved measure of abstract variables. The investigator should strive for completion of all questions in each questionnaire to avoid missing data that will complicate analysis and weaken the results.
CONTROL AND VALIDITY
Research investigators seek to control three aspects of an experiment: the strength of the intervention, the equivalency of the experimental and control groups, and the equivalency of the study population and the real-world population to whom results may be generalized. These three levels of control are relatively straightforward for the bench scientist but can be extremely challenging for the clinical investigator. Strength of intervention may be the hardest to deal with. In a drug or radiation therapy trial, this is straightforward and relates to dosage. In other clinical interventions, such as teaching and training or surgery, standardization of the intervention across subjects may be impossible.
The relationship between the study group and the outside world is called external validity. Because one ultimately hopes to make the results of the study applicable to the general population, the study population should be a representative subset of the real-world population of interest. Accurate definition of the sample frame, the potential pool of research subjects, is the first step in ensuring external validity. A further subset of the sample frame is taken as actual enrollees in the study. The sampling technique (see next section) by which subjects are chosen can also help ensure external validity by avoiding introduction of sample bias. Proper sample size must be calculated to determine how many subjects will be necessary for valid statistical analysis of results. Allocation of subjects to experimental and control groups should be random whenever possible. Nonrandom allocation introduces another potential bias that differentiates the experimental and control groups. Identification of the sample frame, sampling method, calculation of sample size, and allocation of subjects are all under the control of the investigator in an experimental study. In a quasi-experimental design, some of these factors may not be under the investigator’s control. This threatens the external validity of the research. Other threats to external validity include interaction effects of tests or measures in the study that may alter subjects’ sensitivity to the experimental intervention; interaction effects of selection bias and experimental intervention, arising from known or unknown differences between experimental and control groups and their differential sensitivity to the experimental intervention; and multiple-treatment interference, in which the effect of one treatment is not “washed out” before the next intervention is implemented.

Though not a complete list, these are some of the more common examples of threats to external validity.
The relationship between experimental and control groups within a study is called internal validity. Threats to internal validity include history, maturation, testing, instrumentation, statistical regression, selection or allocation bias, experimental mortality (dropout rate), and unanticipated interaction between any of these factors. History refers to those factors that are going on in the world outside the confines of the study that may differentially affect experimental and control groups. Maturation refers to factors attributable to the duration of the study that may affect both experimental and control groups. Testing threatens internal validity when study subjects learn to anticipate the test or gradually improve their test taking through repeated experience. Instrumentation can be a source of trouble if calibration changes over the duration of the study or if technology introduces new measures between the pre- and postintervention assessments. Statistical regression refers to the inherent variability of any measured variable among study subjects; subjects may have higher or lower scores unrelated to the experimental intervention. Nonrandom sampling or allocation is the most obvious source of bias and threat to internal validity because it reduces the equivalency of control and experimental groups. Experimental mortality is the loss of subjects before completion of the study. The rate of dropout and the cause of dropout may differ between control and experimental groups, introducing bias or incorrect interpretation. Finally, there can be interaction between these factors. For example, if control and experimental groups are allocated nonrandomly, the resultant cohorts may be differentially susceptible to maturation bias.
SAMPLING METHODS
The sampling method is the means by which investigators select a representative subset of their target population for inclusion in a research study. The method is chosen to strike a balance between the “representativeness” of the study subjects and practical considerations such as time, expense, and availability of adequate numbers of subjects. Ideally, random selection is the best way to sample. In fact, sampling can be divided into probability and nonprobability methods. Probability methods are used when the chance that an element of the population will be selected for the sample is known. This enables a random selection technique to provide a representative sample. Nonprobability methods are used when the chance that an element in the population will be selected in the sample is unknown. The sample will not be representative of the
population, and therefore research results can only be suggestive of statistical characteristics of the population.
There are four basic types of probability sampling: simple random, systematic, stratified random, and cluster sampling. Simple random sampling is applied to a homogeneous population in which every element in the population has an equal likelihood of being selected. The probability of each element may be different, but all elements are independent of each other. Simple random sampling is both simple and representative, but failure to accurately identify the entire population may introduce bias. Systematic sampling consists of selecting study subjects according to a predetermined sequence, for example, taking every fifth patient of the next 100 patients seen with the target disease. As long as the elements of interest are uniformly distributed through the sequence of patients, those chosen will be a representative random group. If, however, there is clustering of some element (e.g., all the older patients coming on the same bus from their nursing home), some elements may be over- or underrepresented in the study cohort. In stratified random sampling, the study population is subdivided into homogeneous groups based on known variables, such as age, economic status, tumor stage, and so on, to improve representation. This method works only if the researchers know which variables are necessary to achieve representativeness. Cluster sampling typically is used when the target population is infinite, potential subjects are widely scattered geographically, or there is no list identifying members of the population. Cluster sampling is often done in stages. However, as the number of sampling stages goes up, there are more chances for sampling error. Unappreciated differences between clusters or areas of sampling can introduce bias as well.
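A brief sketch may make the first three probability methods concrete. The sampling frame, the sample sizes, and the age cutoff defining the strata here are all hypothetical:

```python
import random

random.seed(7)  # fixed seed only so the illustration is reproducible

# Hypothetical sampling frame: 200 patients, each a (patient_id, age) pair
frame = [(i, random.randint(18, 90)) for i in range(200)]

# Simple random sampling: every patient in the frame equally likely to be chosen
simple = random.sample(frame, 20)

# Systematic sampling: every 5th patient in sequence, from a random starting offset
start = random.randrange(5)
systematic = frame[start::5]

# Stratified random sampling: sample within known strata (here, age) so that
# younger and older patients are both represented by design
young = [p for p in frame if p[1] < 65]
old = [p for p in frame if p[1] >= 65]
stratified = random.sample(young, 10) + random.sample(old, 10)
```

Note how stratification guarantees representation of the older patients, whereas a simple random sample merely makes it probable; this is exactly the trade-off the text describes, and it only works when the stratifying variable is known to matter.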
Nonprobability sampling methods include convenience sampling, judgment sampling, network sampling, and quota sampling. In the use of convenience sampling, subjects are selected by their accessibility. The validity of results will depend on how much the sample differs from the target population. Judgment sampling is when the researcher selects representatives of typical cases for study. The quality of the sample depends on the accuracy of the researcher in selecting representative cases. In network sampling a few subjects are chosen, and they, in turn, refer other subjects, who then refer a third tier. The networking continues until subject accrual is adequate. The technique is helpful when potential subjects may not readily make themselves known, but it has obvious bias because the subjects are not truly independent of each other. Quota sampling is similar to stratified random sampling. Bias from the differences between subgroups can be reduced by applying mathematical formulas to correct for underrepresentation. However, there is still little or no control of selection within groups.
POWER AND SAMPLE SIZE
In research studies, power is defined as the probability of concluding there was a difference when in fact there was none. Sample size refers to the number of study subjects. One objective of study design is to enroll several subjects to permit valid statistical analysis of results. It is a number that can be calculated in advance of the initiation of the project based on the details of the test hypotheses and experimental design. For analytic studies, there are four steps to estimating sample size: (1) state the null and alternative hypotheses, specifying whether one or two-tailed; (2) select an appropriate statistical test based on the type of predictor and outcome variables;
(3) estimate the effect size and its variability from pilot data or previous studies; and (4) specify appropriate values of α and β based on the importance of avoiding type 1 and type 2 errors. The actual sample size can then be looked up in statistical tables available for this purpose. For descriptive studies, the investigators must achieve a sample size that will provide a chosen confidence level and precision. The steps are (1) for a dichotomous variable, estimate the proportion of subjects with the variable of interest; for a continuous variable, estimate its standard deviation; (2) specify the desired precision (width) of the confidence interval; and (3) specify the confidence level (e.g., 95%). Sometimes the sample size is fixed or predetermined. If so, one can calculate backward to estimate the power or detectable effect size. The importance of addressing issues of sample size in advance cannot be overstated. It is this process that determines the feasibility of a project; if it is impossible to accrue enough subjects for statistical analysis, the project cannot be accomplished. In many cases, alterations in study design, such as choice of variables or statistical test, may enable performance of the research with a smaller sample size. Some common strategies include using continuous variables, more precise measurements, paired measurements, unequal group sizes, and more common outcomes. Collaboration with a statistician for sample size calculations and study design is essential.
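The four analytic steps above can be sketched for the common case of comparing two group means. This uses the standard normal (z) approximation as a simplified stand-in for the statistical tables the text mentions; the effect size, standard deviation, α, and power values are illustrative assumptions:

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_means(effect, sd, alpha=0.05, power=0.80):
    """Per-group n for a two-tailed, two-sample comparison of means.

    effect: smallest clinically important difference between group means
    sd:     estimated standard deviation of the outcome (pilot data)
    Uses n = 2 * ((z_alpha + z_beta) * sd / effect)**2, a common
    normal-approximation formula.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # guards against type 1 error
    z_beta = z.inv_cdf(power)           # guards against type 2 error
    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return ceil(n)

# Example: detect a 10-point difference, SD 20, alpha .05, power .80
print(sample_size_two_means(10, 20))  # → 63 per group
```

Raising the desired power (e.g., to 0.90) or shrinking the detectable effect increases the required n, which is why the design strategies listed above (paired measurements, more precise measures, and so on) can reduce sample size. Final calculations should still be done with a statistician.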
DATA ANALYSIS AND STATISTICAL METHODS
ROLE OF THE STATISTICIAN
As already noted, statisticians play crucial roles in research design. They are the individuals best qualified to perform power and sample size calculations. They can help choose between different types of variables and measures based on objectives of generalizability and outcome validity.
All researchers should have a passing familiarity with the basics of biostatistics to discuss the issues intelligently with their consulting statistician. All physicians should have a passing familiarity with the basics of biostatistics so they can read the medical literature critically. Providing a solid primer on biostatistics is beyond the scope of this chapter. The objective here is only to introduce the reader to some basic biostatistics terminology and concepts.
Statistics provide a means of quantifying observations. The quantities can then be compared or manipulated to provide insights into those things measured. Statistics can be used either to describe a group (of people, of numbers, of illnesses, etc.) or to assess how well characteristics of a group can be generalized. These two applications are called descriptive statistics and inferential statistics. Statistics are performed on variables, where a variable is defined as what is being observed or measured. Variables can be dependent or independent. The dependent variable is the outcome of interest. The independent variable is the intervention or what is being manipulated. Variables can be either discrete or continuous. Discrete variables can have only a limited set of values, such as gender, eye color, and number of offspring. Continuous variables, such as height and weight, fall along a continuum with units imposed by the sensitivity of the measuring technique. Discrete and continuous variables lend themselves to different types of statistical treatment.
Variables can also be grouped by type: nominal, ordinal, interval, and ratio (mnemonic NOIR). Nominal variables are named categories, such as eye color, gender, and side of lesion. Ordinal variables are the same as nominal plus ordered categories (e.g., cancer stages I–IV). Interval variables are the same as ordinal plus equal intervals; that is, the difference between numbers is meaningful, but the ratios between them are not (e.g., intelligence quotient). Ratio variables are the same as interval plus a meaningful zero; that is, the ratio between numbers is meaningful (e.g., weight). Once again, the different types of variables are amenable to analysis using different statistical methods. Parametric statistics are used for interval or ratio dependent variables. Nonparametric statistics are used for nominal or ordinal dependent variables.
DESCRIPTIVE STATISTICS
Descriptive statistics are concerned with the presentation, organization, and summarization of data. The first important description of data is the central tendency.

The best-known calculations are mean, median, and mode. The second important description is dispersion: how closely the data cluster around the measure of central tendency. Dispersion can be reported as the range, that is, the difference between the highest and lowest values of the variable. A more informative description is the standard deviation. The standard deviation (SD) defines how closely individual scores cluster around their mean: SD = √(Σx²/N), where x is the deviation of an individual measure from the mean, Σx² is the sum of the squares of all values of x in the sample, and N is the sample size. The standard error of the mean (SE) describes how close mean scores from repeated samples will be to the true (population) mean: SE = SD/√N, where SD is the standard deviation and N is the sample size. Both SD and SE assume a normal distribution, the situation in which all values are distributed symmetrically above and below the central tendency. If the distribution is asymmetric about the central tendency, the asymmetry can be described by skew and kurtosis and requires different means of expressing dispersion.
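The SD and SE formulas translate directly into code. A minimal sketch with made-up values, using the population form with N in the denominator to match the formula above (many texts use N − 1 for a sample estimate):

```python
from math import sqrt

def descriptive(values):
    """Mean, SD, and SE per the formulas above (population form, /N)."""
    n = len(values)
    mean = sum(values) / n
    # SD = sqrt(sum of squared deviations / N)
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    # SE = SD / sqrt(N)
    se = sd / sqrt(n)
    return mean, sd, se

mean, sd, se = descriptive([2, 4, 4, 4, 5, 5, 7, 9])
print(mean, sd, se)  # → 5.0 2.0 0.707...
```

Note that SD describes the spread of individual observations, while SE (smaller by a factor of √N) describes the precision of the mean itself, which is why confidence intervals are built from SE rather than SD.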
INFERENTIAL STATISTICS
Inferential statistics allow one to generalize from sample data to a larger group of subjects. One can define statistical inference as the determination of the probability (or likelihood) that a conclusion based on analysis of a sample is true. All statistical tests are based on the signal-to-noise ratio, where signal is the important relationship and noise is a measure of individual variation. There are four possible outcomes of signal-to-noise: (1) detecting a signal where there was none (false-positive), (2) detecting a signal when there was one (true-positive), (3) detecting no signal when it actually was present (false-negative), and (4) detecting no signal when there was no signal (true-negative). Statisticians define two terms, alpha (α) and beta (β), to talk about erroneous measures. Alpha is the probability of concluding that the sample came from a different population (i.e., a significant difference exists) when in fact it did not (making a type 1 error). Beta is the probability of concluding that no difference existed when in fact it did (making a type 2 error). The power of the statistic, mentioned earlier, equals 1 − β. The relationship between α and β is shown in Table 13-2.
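A small Monte Carlo sketch can make α and power (1 − β) concrete: under a true null hypothesis a "signal" is declared about 5% of the time (the type 1 error rate), whereas under a real shift it is declared much more often (the power). The sample size, shift of 0.6 SD, and number of trials here are arbitrary illustrative choices:

```python
import random
from statistics import NormalDist, mean

random.seed(1)
crit = NormalDist().inv_cdf(0.975)   # two-tailed cutoff for alpha = .05
n, trials, sd = 25, 2000, 1.0

def z_stat(true_shift):
    """z statistic for testing 'population mean = 0' with known SD."""
    sample = [random.gauss(true_shift, sd) for _ in range(n)]
    return abs(mean(sample)) / (sd / n ** 0.5)

# Alpha: fraction of "signals" detected when the true shift is zero
# (false-positives, type 1 errors).
alpha_hat = sum(z_stat(0.0) > crit for _ in range(trials)) / trials

# Power: fraction detected when a real 0.6-SD shift exists (1 - beta).
power_hat = sum(z_stat(0.6) > crit for _ in range(trials)) / trials

print(round(alpha_hat, 3), round(power_hat, 3))  # approx. 0.05 and 0.85
```

Shrinking the true shift or the sample size in this simulation lowers `power_hat` without changing `alpha_hat`, mirroring the trade-offs discussed in the sample size section above.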
When the statistical test can detect any difference between groups, regardless of direction, it is a two-tailed test. A one-tailed test specifies the direction of difference in advance. Confidence intervals define the range within which the true mean of a population falls; for example, 95% confidence intervals (±2 SE) have a 95% chance of
TABLE 13-2 RELATIONSHIP BETWEEN α AND β

                            Truth
Called               No Difference      Difference
No difference        1 − α              β
Difference           α                  1 − β
containing the true mean. If the 95% confidence intervals of two populations overlap, then the difference between them is not significant at the .05 level. Statistical significance is a precondition for consideration of clinical significance but says nothing about the actual magnitude of the effect. For example, consider a new drug that has only one-tenth the risk of ototoxicity of the drug in current use. In a large enough trial, the difference in ototoxicity between the two drugs will be highly statistically significant. However, if the risk of ototoxicity of the standard drug is a low 0.01%, the 0.001% risk of the new drug is not a clinically significant improvement.
Parametric statistical tests of significance are applied when outcome variables are interval or ratio. The best known is the t-test, a method of comparing the means of two groups. It is based on the ratio of the difference between groups to the standard error of the difference. The unpaired t-test compares the means of two independent samples; the paired t-test compares two paired observations on the same individuals or matched individuals. The t-test is not appropriate when there are more than two groups, and the unpaired form is not appropriate when individuals in one group are matched to individuals in the other. Analysis of variance (ANOVA) is used to compare among many means. It is used when the independent variable is nominal and the dependent variable(s) is/are interval or ratio. A one-way ANOVA deals with a single nominal independent variable; a factorial ANOVA deals with multiple factors in many different configurations. Regression analysis is used when there is one measured outcome/dependent variable and one or more measured independent variables, and both dependent and independent variables are interval or ratio. The Pearson correlation and multiple correlation coefficients describe the strength of the relationship between variables. Analysis of covariance (ANCOVA) combines regression and ANOVA when there is one measured dependent variable and the independent variables can be both categorical and measured.
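The t-test's definition as "the ratio of the difference between groups to the standard error of the difference" can be computed by hand. A sketch of the unpaired, pooled-variance form with made-up measurements (in practice one would use a statistics package, which also supplies the p-value):

```python
from math import sqrt

def unpaired_t(a, b):
    """Unpaired t-test: (difference of means) / (SE of the difference).

    Pooled-variance form for two independent samples; returns the
    t statistic and its degrees of freedom.
    """
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # Sums of squared deviations within each group
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    # Pooled variance uses n - 1 per group (hence na + nb - 2)
    pooled = (ssa + ssb) / (na + nb - 2)
    se_diff = sqrt(pooled * (1 / na + 1 / nb))
    return (ma - mb) / se_diff, na + nb - 2

# Hypothetical outcome scores for two independent groups
t, df = unpaired_t([5.1, 4.9, 6.0, 5.5], [4.0, 4.2, 3.8, 4.4])
print(round(t, 3), df)  # → 4.636 6
```

The larger the between-group difference relative to the within-group noise, the larger t becomes; the df value (here n₁ + n₂ − 2 = 6) determines which row of the t table the statistic is compared against.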
Nonparametric statistics are applied when the outcome/dependent variables are nominal or ordinal. The chi-square, binomial test, and Fisher exact test are all for nominal data and independent samples. The McNemar

chi-square test can be used for related samples. When data are ordinal and samples are independent, the Mann-Whitney U, median, Kruskal-Wallis, and Kolmogorov-Smirnov tests may be applicable. There are several nonparametric measures of association equivalent to the correlation coefficient. These include the contingency coefficient, phi coefficient, and Cohen's kappa coefficient for nominal data, and Spearman's rho, Kendall's tau, and Kendall's W for ordinal data. There are three advanced nonparametric techniques for handling designs in which the dependent variable involves frequencies within categories and there is more than one independent variable. The Mantel-Haenszel chi-square deals with two independent factors. Logistic regression and log-linear analysis can manage any number of independent variables. Logistic regression treats all independent variables as measured data, like multiple regression. Log-linear analysis handles the case of multiple categorical variables and estimates effects and interactions, analogous to factorial ANOVA.
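As an illustration of a nonparametric test for nominal data, the Pearson chi-square statistic for a 2 × 2 table can be computed directly from observed and expected counts. The counts below (e.g., treated vs. control, improved vs. not improved) are hypothetical:

```python
def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table of counts [[a, b], [c, d]].

    chi2 = sum over cells of (observed - expected)**2 / expected,
    with expected = (row total * column total) / grand total.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = 0.0
    for obs, row, col in ((a, row1, col1), (b, row1, col2),
                          (c, row2, col1), (d, row2, col2)):
        expected = row * col / n
        chi2 += (obs - expected) ** 2 / expected
    return chi2

chi2 = chi_square_2x2([[30, 10], [20, 20]])
print(round(chi2, 3))  # → 5.333, compared against chi-square with 1 df
```

With 1 degree of freedom the .05 critical value is 3.84, so a statistic of 5.33 would be significant at that level; small expected counts would instead call for the Fisher exact test mentioned above.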
In addition to this brief list of parametric and nonparametric statistical tools, there are a host of others, as well as multivariate techniques. There are many software packages available today that enable anyone to perform these statistical tests on a desktop computer. However, the software can neither assure the quality of the data nor prevent you from using the wrong statistic. Only the statistician can do that.
INTERPRETING AND REPORTING RESEARCH FINDINGS
As pointed out in the previous discussion, clinical research studies should be as rigorous in their statistical design as basic science research experiments. This point is especially critical when examining relatively rare disorders, where large populations may be necessary to achieve statistical significance. The process of interpreting data published in the medical literature should essentially involve reverse engineering the project. Initially, one should determine whether the results actually addressed the key research question asked. Next, one should ask whether the reported aims of the paper were achieved by the research design and whether the statistics chosen to test the null hypothesis were adequate. This same process should be followed when reporting your own research findings. In especially difficult research situations, the results of the studies should be reported in enough detail that they are suitable for later meta-analysis by other groups of investigators. If the research question is properly posed and the research project well designed, results worth publishing or results leading to a follow-up study should emerge. If these points are not considered, one's expenditure of time and resources will result in a project that neither addresses nor answers any significant questions.

SUGGESTED READINGS

Hulley SB, Cummings SR. Designing Clinical Research: An Epidemiologic Approach. Baltimore: Lippincott, Williams & Wilkins; 1987

Norman GR, Streiner DL. Biostatistics: The Bare Essentials. St. Louis: Mosby-Year Book; 1994

Okolo EN, ed. Health Research Design and Methodology. Boca Raton, FL: CRC Press; 1990

Troidl H, McKneally MF, Mulder DS, Wechsler AS, McPeek B, Spitzer WO, eds. Surgical Research: Basic Principles and Clinical Practice. 3rd ed. New York: Springer-Verlag; 1998

SELF-TEST QUESTIONS

For each question select the correct answer from the lettered alternatives that follow. To check your answers, see Answers to Self-Tests on page 716.

1. A type 2 error refers to
A. An error in calculations made during data entry
B. Mistakes made through improper study design
C. The probability of erroneously concluding that there is no difference between two groups
D. Incorrectly detecting a false-positive result

2. Parametric statistical tests include
A. Chi-square test
B. Spearman's rho
C. ANOVA
D. Kendall's W

3. Precision refers to
A. How often a given question is answered correctly
B. The consistency of repeated measures
C. The degree to which a measure is a correct metric for the question being asked
D. The independence of variables

Chapter 14
BASIC PRINCIPLES AND CURRENT APPLICATIONS OF LASERS IN HEAD AND NECK SURGERY
DANIEL B. KURILOFF
LASER PHYSICS AND TISSUE INTERACTION
THE LASER AS A SURGICAL TOOL
POWER
SPOT SIZE AND POWER DENSITY
TREATMENT TIME AND FLUENCE
PULSED DELIVERY
LASER TRANSMISSION AND INSTRUMENTATION
OPTICAL FIBERS
FLASH SCANNERS
LASER INSTRUMENTATION
LASER SAFETY CONTROL MEASURES
EDUCATION
SAFETY GUIDELINES AND CREDENTIALING
GENERAL SAFETY CONSIDERATIONS IN THE OFFICE OR OPERATING ROOM
LASER QUALITY CONTROL AND LOCKOUT FEATURES
LASER WARNING SIGNS AND BLACKOUT SHADES
EYE AND SKIN PROTECTION
LASER PLUME BIOHAZARD AND THE NEED FOR UNIVERSAL PRECAUTIONS
SMOKE EVACUATION
SPECIFIC LASER APPLICATIONS IN THE HEAD AND NECK
CUTANEOUS APPLICATIONS
LARYNGEAL AND TRACHEOBRONCHIAL APPLICATIONS
ORAL CAVITY AND OROPHARYNGEAL APPLICATIONS
LASER-ASSISTED INTRANASAL AND PARANASAL SINUS SURGERY
EAR APPLICATIONS
PHOTODYNAMIC THERAPY
SUMMARY
SUGGESTED READINGS
SELF-TEST QUESTIONS
The first visible light laser (a synthetic ruby crystal) was developed in 1960 by Theodore Maiman at the Hughes Research Laboratories (in Malibu, CA). It produced visible red light lasting only a few microseconds. Industrial applications for laser energy rapidly followed with an ensuing explosion in technological
refinements. In addition to its use as a laboratory instrument, the laser has become an integral part of the digital age (optical scanners, holographic imaging, compact audio and video discs, laser pointers, pulse oximetry, telecommunications, industrial welding and drilling, etc.).