
The Definition of Reliability - An Example
Imagine that a researcher discovers a new drug that she believes helps people to become more intelligent, a change measured by a series of mental exercises. After analyzing the results, she finds that the group given the drug performed much better on the mental tests than the control group did.
For her results to be reliable, another researcher must be able to perform exactly the same experiment on another group of people and generate results with the same statistical significance. If repeat experiments fail, then there may be something wrong with the original research.
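A replication check of this kind comes down to asking whether a second sample supports the same statistical conclusion. The sketch below, in Python with entirely made-up scores (the drug-trial numbers and the choice of Welch's t-statistic are illustrative assumptions, not part of the original example), shows the basic comparison:

```python
from statistics import mean, stdev
from math import sqrt

def t_statistic(group_a, group_b):
    """Welch's t-statistic for two independent samples."""
    na, nb = len(group_a), len(group_b)
    va, vb = stdev(group_a) ** 2, stdev(group_b) ** 2
    return (mean(group_a) - mean(group_b)) / sqrt(va / na + vb / nb)

# Hypothetical test scores (invented for illustration)
drug_group    = [112, 118, 109, 121, 115, 117, 113, 120]
control_group = [104, 108, 101, 106, 103, 107, 105, 102]

t = t_statistic(drug_group, control_group)
# a large |t| suggests a real difference; a replication that
# produces a similar t supports the original finding
```

A repeat experiment on a fresh group of people should yield a t-statistic pointing the same way; if it does not, the original result is suspect.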
Testing Reliability for Social Sciences and Education
In the social sciences, testing reliability is a matter of comparing two different versions of the instrument and ensuring that they are similar. When we talk about instruments, it does not necessarily mean a physical instrument, such as a mass-spectrometer or a pH-testing strip.
An educational test, a questionnaire, or a scheme for assigning quantitative scores to behavior is also an instrument, of a non-physical sort. Measuring the reliability of instruments occurs in different ways.
Test-Retest Method
The Test-Retest Method is the simplest method for testing reliability, and involves testing the same subjects at a later date, ensuring that there is a correlation between the results. An educational test retaken after a month should yield the same results as the original.
The difficulty with this method is that it assumes that nothing has changed in that time period. Staying with education, if you administer exactly the same test, the student may perform much better because they remember the questions and have had time to think about them.
How many times have you left an exam and, after a couple of hours, thought: “How could I have been so stupid - I knew the answer to that one!” Of course, next time, you will get that question right, meaning that the test is unreliable.
For this reason, if you have to retake an exam, you will be faced with different questions and may be marked a little more strictly to take into account that you had extra time to revise. This is not the complete picture, because the two exams will need to be compared, to ensure that they produce the same results. This shows the importance of reliability in our lives and also highlights the fact that there is no easy way to test it.
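In practice, the test-retest comparison boils down to correlating the two sets of scores. Here is a minimal Python sketch using invented scores for the same students a month apart (the data and the `pearson_r` helper are hypothetical, for illustration only):

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

# Hypothetical scores for the same seven students, a month apart
first_sitting  = [72, 85, 90, 64, 78, 88, 70]
second_sitting = [75, 83, 92, 66, 80, 85, 73]

r = pearson_r(first_sitting, second_sitting)
# a correlation near 1.0 suggests good test-retest reliability
```

A high correlation is reassuring, but, as noted above, it cannot distinguish a genuinely reliable test from students simply remembering the questions.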
Internal Consistency
The internal consistency test compares two different versions of the same instrument, to ensure that there is a correlation and that they measure the same thing.
For example, sticking with exams, imagine that an examining board wants to test that its new mathematics exam is reliable, and selects a group of test students. For each section of the exam, such as calculus, geometry, algebra and trigonometry, they actually ask two questions, designed to measure the aptitude of the student in that particular area.
If there is a high internal consistency, and the results for the two sets of questions are similar, then the new test is likely to be reliable. Whereas the test-retest method involves two separate administrations of the same instrument, internal consistency measures two different versions at the same time.
A horribly complicated statistical formula, called Cronbach's Alpha, tests the reliability by comparing the various pairs of questions but, luckily, computer programs take care of that and spit out a single number, telling you exactly how reliable the test is!
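For the curious, the formula is less horrible than it sounds: it compares the variance of each question's scores with the variance of the students' total scores. A minimal Python sketch (with made-up item scores; `cronbach_alpha` is a hypothetical helper written for illustration, not a library function) looks like this:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha. item_scores holds one list per question,
    each containing every student's score on that question."""
    k = len(item_scores)
    totals = [sum(student) for student in zip(*item_scores)]
    item_var = sum(pvariance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Hypothetical scores: 3 questions answered by 5 students
scores = [
    [4, 5, 3, 5, 4],   # question 1
    [4, 4, 3, 5, 4],   # question 2
    [3, 5, 3, 4, 4],   # question 3
]
alpha = cronbach_alpha(scores)
# values of alpha close to 1.0 indicate high internal consistency
```

A common rule of thumb treats an alpha above roughly 0.7 as acceptable, though the exact threshold depends on the field and the stakes of the test.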