
- •Validity and reliability
- •What is Reliability?
- •What is Validity?
- •Conclusion
- •What is External Validity?
- •Psychology and External Validity The Battle Lines are Drawn
- •Randomization in External Validity and Internal Validity
- •Work Cited
- •What is Internal Validity?
- •Internal Validity vs Construct Validity
- •How to Maintain High Confidence in Internal Validity?
- •Temporal Precedence
- •Establishing Causality through a Process of Elimination
- •Internal Validity - the Final Word
- •How is Content Validity Measured?
- •An Example of Low Content Validity
- •Face Validity - Some Examples
- •If Face Validity is so Weak, Why is it Used?
- •Bibliography
- •What is Construct Validity?
- •How to Measure Construct Variability?
- •Threats to Construct Validity
- •Hypothesis Guessing
- •Evaluation Apprehension
- •Researcher Expectancies and Bias
- •Poor Construct Definition
- •Construct Confounding
- •Interaction of Different Treatments
- •Unreliable Scores
- •Mono-Operation Bias
- •Mono-Method Bias
- •Don't Panic
- •Bibliography
- •Criterion Validity
- •Content Validity
- •Construct Validity
- •Tradition and Test Validity
- •Which Measure of Test Validity Should I Use?
- •Works Cited
- •An Example of Criterion Validity in Action
- •Criterion Validity in Real Life - The Million Dollar Question
- •Coca-Cola - The Cost of Neglecting Criterion Validity
- •Concurrent Validity - a Question of Timing
- •An Example of Concurrent Validity
- •The Weaknesses of Concurrent Validity
- •Bibliography
- •Predictive Validity and University Selection
- •Weaknesses of Predictive Validity
- •Reliability and Science
- •Reliability and Cold Fusion
- •Reliability and Statistics
- •The Definition of Reliability Vs. Validity
- •The Definition of Reliability - An Example
- •Testing Reliability for Social Sciences and Education
- •Test - Retest Method
- •Internal Consistency
- •Reliability - One of the Foundations of Science
- •Test-Retest Reliability and the Ravages of Time
- •Inter-rater Reliability
- •Interrater Reliability and the Olympics
- •An Example From Experience
- •Qualitative Assessments and Interrater Reliability
- •Guidelines and Experience
- •Bibliography
- •Internal Consistency Reliability
- •Split-Halves Test
- •Kuder-Richardson Test
- •Cronbach's Alpha Test
- •Summary
- •Instrument Reliability
- •Instruments in Research
- •Test of Stability
- •Test of Equivalence
- •Test of Internal Consistency
- •Reproducibility vs. Repeatability
- •The Process of Replicating Research
- •Reproducibility and Generalization - a Cautious Approach
- •Reproducibility is not Essential
- •Reproducibility - An Impossible Ideal?
- •Reproducibility and Specificity - a Geological Example
- •Reproducibility and Archaeology - The Absurdity of Creationism
- •Bibliography
- •Type I Error
- •Type II Error
- •Hypothesis Testing
- •Reason for Errors
- •Type I Error - Type II Error
- •How Does This Translate to Science Type I Error
- •Type II Error
- •Replication
- •Type III Errors
- •Conclusion
- •Examples of the Null Hypothesis
- •Significance Tests
- •Perceived Problems With the Null
- •Development of the Null
Type I Error
The Type I error (α-error, false positives) occurs when a the null hypothesis (H0) is rejected in favor of the research hypothesis (H1), when in reality the 'null' is correct. This can be understood in terms of medical tests. For example, suppose there is a test that is used to detect a disease in a person. If a Type I error occurs in the test, it means that the test will say the person is suffering from that disease even though he is healthy.
Type II Error
Type II errors (β-errors, false negatives) on the other hand, imply that we reject the research hypothesis, when in fact it is correct. In the similar example of a medical test for a disease, if a Type-II error occurs, then it means that the test will not detect the disease in the person even though he is actually suffering from it.
Hypothesis Testing
|
|
Scientific Conclusion |
|
|
|
H0 Accepted |
H1 Accepted |
Truth |
H0 |
Correct Conclusion! |
Type 1 Error (false positive) |
H1 |
Type 2 Error (false negative) |
Correct Conclusion! |
In case of Type-I errors, the research hypothesis is accepted even though the null hypothesis is correct. Type-I errors are a false positive that lead to the rejection of the null hypothesis when in fact it may be true.
When a Type-II error occurs, the research hypothesis is not detected as the correct conclusion and is therefore passed off. In terms of the null hypothesis, this kind of an error might lead to accepting the null hypothesis when in fact it is false.
The significance level refers only to the Type-I error. This is because we ask the question "What is the probability that the correlation we observed is purely by chance?" and when this question yields an answer of below a significance level (typically 5% or 1%), we state that the result wasn't a chance process and that the parameters under study are indeed related.
Reason for Errors
Scientific experiments involve a different type of error analysis than a statistical experiment. In science, experimental errors may be caused due to human inaccuracies like a wrong experimental setup in a science experiment or choosing the wrong set of people for a social experiment.
Systematic error refers to that error which is inherent in the system of experimentation. For example, if you want to calculate the value of acceleration due to gravity by swinging a pendulum, then your result will invariably be affected by air resistance, friction at the point of suspension and finite mass of the thread.
Random errors occur because it is impossible to practically achieve infinite precision. Since the value is higher or lower in a random fashion, averaging several readings will reduce random errors.
Type I Error - Type II Error
Whilst many will not have heard of Type I error or Type II error, most people will be familiar with the terms 'false positive' and 'false negative', mainly as a medical term.
A patient might take an HIV test, promising a 99.9% accuracy rate. This means that 1 in every 1000 tests could give a 'false positive,' informing a patient that they have the virus, when they do not.
Conversely, the test could also show a false negative reading, giving an HIV positive patient the all-clear. This is why most medical tests require duplicate samples, to stack the odds up favorably. A one in one thousand chance becomes a 1 in 1 000 000 chance, if two independent samples are tested.
With any scientific process, there is no such ideal as total proof or total rejection, and researchers must, by necessity, work upon probabilities. That means that, whatever level of proof was reached, there is still the possibility that the results may be wrong.
This could take the form of a false rejection, or acceptance, of the null hypothesis.