
Reproducibility vs. Repeatability
Reproducibility is different from repeatability, where the original researchers repeat their own experiment to test and verify their results.
Reproducibility is tested by a replication study, which must be completely independent and generate consistent findings, known as commensurate results. Ideally, the replication study should use slightly different instruments and approaches, to ensure that the original result was not caused by an equipment malfunction.
If a type of measuring device has a design flaw, then it is likely that this artefact will be apparent in all models.
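This point can be illustrated with a small simulation (plain Python; the instrument bias and noise figures are invented for illustration). Repeating an experiment on the same flawed device reproduces the flaw, while an independent replication on a different instrument exposes it:

```python
import random

random.seed(42)  # fixed seed so the sketch is deterministic
TRUE_VALUE = 10.0

def measure(bias, noise=0.1, n=100):
    """Simulate n readings from an instrument with a systematic bias."""
    return [TRUE_VALUE + bias + random.gauss(0, noise) for _ in range(n)]

def mean(xs):
    return sum(xs) / len(xs)

# The original lab repeats its experiment on the same flawed device (+0.5 bias):
original = mean(measure(bias=0.5))
repeated = mean(measure(bias=0.5))

# An independent replication uses a different, unbiased instrument:
replication = mean(measure(bias=0.0))

# Repetition reproduces the flaw (both means sit near 10.5),
# while the independent replication lands near the true value of 10.0.
print(original, repeated, replication)
```

Repeatability alone cannot catch the shared design flaw; only the replication with a different instrument reveals the discrepancy.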
The Process of Replicating Research
For most of the physical sciences, reproducibility is a simple process and it is easy to replicate methods and equipment.
An astronomer measuring the spectrum of a star notes down the instruments and methodology used, and an independent researcher should be able to achieve exactly the same results. Even in biochemistry, where naturally variable living organisms are used, good research shows remarkably little variation.
However, the social sciences, ecology and environmental science are a much more difficult case. Organisms can show a huge amount of variation, making it difficult to replicate research exactly. Here, the goal is to design the experiment to be as reproducible as possible, so that the researcher can defend their position.
In addition, these sciences have to make much more use of statistics to dampen down experimental noise caused by physiological and psychological differences between the subjects.
This is one of the reasons why most social sciences accept a 95% confidence level, in contrast to the 99% confidence demanded by most physical sciences.
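The gap between those two standards can be made concrete with a short sketch (plain Python; the test statistic of 2.2 is an invented example). The same finding can count as significant under the 95% criterion yet fail the stricter 99% criterion:

```python
# Two-sided critical values of the standard normal distribution:
Z_95 = 1.960   # 95% confidence (p < 0.05), typical in the social sciences
Z_99 = 2.576   # 99% confidence (p < 0.01), typical in the physical sciences

def significant(z, critical):
    """Declare a result significant if |z| exceeds the critical value."""
    return abs(z) > critical

# A hypothetical finding with test statistic z = 2.2 (p about 0.028):
z = 2.2
print(significant(z, Z_95))  # True:  clears the social-science bar
print(significant(z, Z_99))  # False: fails the physical-science bar
```

A replication that adopts a different threshold can therefore "fail" to confirm a result without either study being wrong, which is why the chosen confidence level should be reported alongside the findings.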
Reproducibility and Generalization - a Cautious Approach
Observing due caution with the process of generalizing results helps to strengthen the case for experimental reproducibility.
In any study, there is a far smaller chance of finding confounding evidence if the claims are narrowly defined than if they are sweeping generalizations.
For example, a psychologist who found that aggression in children under the age of five increased when they watched violent TV might generalize that all children would display the same behavior.
Extending the claim to all children leaves the experiment prone to replication issues: a researcher finding that aggression did not increase in nine-year-old children would undermine the entire premise by calling the reproducibility into question.
Reproducibility is not Essential
It is important to understand that creating replicable research is not essential for validity, although it does help. Sometimes, due to sheer impracticality, temporal difficulties, or expense, this is not possible.
The Framingham Heart Study, which has tracked cardiac health across three generations of participants, has been running for over 60 years, and nobody seriously expects anyone to replicate it.
Instead, results from other studies around the world are used to build up a database of statistical evidence supporting the findings.