
- Validity and Reliability
- What is Reliability?
- What is Validity?
- Conclusion
- What is External Validity?
- Psychology and External Validity - The Battle Lines are Drawn
- Randomization in External Validity and Internal Validity
- Works Cited
- What is Internal Validity?
- Internal Validity vs Construct Validity
- How to Maintain High Confidence in Internal Validity?
- Temporal Precedence
- Establishing Causality through a Process of Elimination
- Internal Validity - the Final Word
- How is Content Validity Measured?
- An Example of Low Content Validity
- Face Validity - Some Examples
- If Face Validity is so Weak, Why is it Used?
- Bibliography
- What is Construct Validity?
- How to Measure Construct Validity?
- Threats to Construct Validity
- Hypothesis Guessing
- Evaluation Apprehension
- Researcher Expectancies and Bias
- Poor Construct Definition
- Construct Confounding
- Interaction of Different Treatments
- Unreliable Scores
- Mono-Operation Bias
- Mono-Method Bias
- Don't Panic
- Bibliography
- Criterion Validity
- Content Validity
- Construct Validity
- Tradition and Test Validity
- Which Measure of Test Validity Should I Use?
- Works Cited
- An Example of Criterion Validity in Action
- Criterion Validity in Real Life - The Million Dollar Question
- Coca-Cola - The Cost of Neglecting Criterion Validity
- Concurrent Validity - a Question of Timing
- An Example of Concurrent Validity
- The Weaknesses of Concurrent Validity
- Bibliography
- Predictive Validity and University Selection
- Weaknesses of Predictive Validity
- Reliability and Science
- Reliability and Cold Fusion
- Reliability and Statistics
- The Definition of Reliability vs. Validity
- The Definition of Reliability - An Example
- Testing Reliability for Social Sciences and Education
- Test-Retest Method
- Internal Consistency
- Reliability - One of the Foundations of Science
- Test-Retest Reliability and the Ravages of Time
- Inter-rater Reliability
- Inter-rater Reliability and the Olympics
- An Example From Experience
- Qualitative Assessments and Inter-rater Reliability
- Guidelines and Experience
- Bibliography
- Internal Consistency Reliability
- Split-Halves Test
- Kuder-Richardson Test
- Cronbach's Alpha Test
- Summary
- Instrument Reliability
- Instruments in Research
- Test of Stability
- Test of Equivalence
- Test of Internal Consistency
- Reproducibility vs. Repeatability
- The Process of Replicating Research
- Reproducibility and Generalization - a Cautious Approach
- Reproducibility is not Essential
- Reproducibility - An Impossible Ideal?
- Reproducibility and Specificity - a Geological Example
- Reproducibility and Archaeology - The Absurdity of Creationism
- Bibliography
- Type I Error
- Type II Error
- Hypothesis Testing
- Reason for Errors
- Type I Error - Type II Error
- How Does This Translate to Science?
- Type I Error
- Type II Error
- Replication
- Type III Errors
- Conclusion
- Examples of the Null Hypothesis
- Significance Tests
- Perceived Problems With the Null
- Development of the Null
Cronbach's Alpha Test
The Cronbach's Alpha test not only averages the correlation between every possible combination of split halves, but also accommodates multi-level responses.
For example, a series of questions might ask the subjects to rate their response between one and five. Cronbach's Alpha gives a score of between zero and one, with 0.7 generally accepted as a sign of acceptable reliability.
The test also takes into account both the length of the test and the number of potential responses. A 40-question test with possible ratings of 1 - 5 is seen as having more accuracy than a ten-question test with three possible levels of response.
Of course, even with Cronbach's clever methodology, which makes calculation much simpler than crunching through every possible permutation, this is still a test best left to computers and statistical software.
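For illustration, here is a minimal Python sketch of the underlying alpha formula (the ratio of summed item variances to the variance of total scores); the function name and sample ratings are hypothetical, not taken from any particular statistics package.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) matrix of scores:
    alpha = (k / (k - 1)) * (1 - sum of item variances / variance of totals)
    """
    k = scores.shape[1]                          # number of items in the test
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of each subject's total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: six subjects rating four items between one and five.
ratings = np.array([
    [4, 5, 4, 5],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 2, 3, 3],
])

print(round(cronbach_alpha(ratings), 2))  # 0.7 or above suggests acceptable reliability
```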
Summary
Internal consistency reliability is a measure of how consistently the items within a test measure the same construct and deliver reliable scores. The test-retest method involves administering the same test after a period of time and comparing the results.
By contrast, measuring internal consistency reliability involves comparing responses to different items, or different versions of the same item, within a single administration of the test.
Instrument Reliability
Instrument reliability is a way of ensuring that any instrument used for measuring experimental variables gives the same results every time.
In the physical sciences, the term is self-explanatory, and it is a matter of making sure that every piece of hardware, from a mass spectrometer to a set of weighing scales, is properly calibrated.
Instruments in Research
As an example, a researcher will always test the instrument reliability of weighing scales with a set of calibration weights, ensuring that the results given are within an acceptable margin of error.
Some highly accurate balances can give false readings if they are not placed upon a completely level surface, so regular calibration against known weights is the best way to catch such errors.
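As a simple illustration of that calibration check, the sketch below compares repeated readings of a certified reference weight against an assumed tolerance; the reference mass, tolerance, and readings are all hypothetical values.

```python
REFERENCE_MASS_G = 100.000   # certified calibration weight (hypothetical)
TOLERANCE_G = 0.005          # acceptable margin of error (assumed)

readings = [100.002, 99.998, 100.004, 100.001]   # repeated measurements

errors = [abs(r - REFERENCE_MASS_G) for r in readings]
if all(e <= TOLERANCE_G for e in errors):
    print("Scales within tolerance - instrument is reliable.")
else:
    print(f"Recalibration needed; worst error = {max(errors):.4f} g")
```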
In the non-physical sciences, the definition of an instrument is much broader, encompassing everything from a set of survey questions to an intelligence test. A survey to measure reading ability in children must produce reliable and consistent results if it is to be taken seriously.
Political opinion polls, on the other hand, are notorious for producing inaccurate results, with margins of error that render them nearly unworkable.
In the physical sciences, it is possible to isolate a measuring instrument from external factors, such as environmental conditions and temporal factors. In the social sciences, this is much more difficult, so any instrument must be tested across a reasonable range of conditions to establish its reliability.
Test of Stability
Any test of instrument reliability must establish how stable the test is over time, ensuring that the same test performed on the same individual gives the same results each time.
The test-retest method is one way of ensuring that any instrument is stable over time.
Of course, there is no such thing as perfection, and there will always be some disparity and potential for regression, so statistical methods are used to determine whether the stability of the instrument falls within acceptable limits.
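One common statistical method for this is simply the correlation between the two administrations of the test; the sketch below uses hypothetical scores for ten subjects, and the 0.8 threshold is an assumed rule of thumb rather than a fixed standard.

```python
import numpy as np

# Hypothetical scores for the same ten subjects, tested twice.
test   = np.array([12, 15, 11, 18, 14, 16, 13, 17, 15, 12])
retest = np.array([13, 14, 11, 19, 15, 15, 13, 18, 14, 12])

# Pearson correlation between the two administrations; values near 1
# indicate a stable instrument (0.8 is a commonly cited rule of thumb).
r = np.corrcoef(test, retest)[0, 1]
print(f"test-retest stability: r = {r:.2f}")
```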