Validity & Reliability
in Survey Research


One of the most troublesome problems we see in survey research is a lack of understanding of validity and reliability. Granted, neither is an easy task; addressing them can be, and often should be, an arduous undertaking. Without validity and reliability, the researcher's efforts are in vain, though the researcher most often does not know it. In fact, in our experience very few people have a sound understanding of validity and reliability. This holds especially true for those attempting to address validity and reliability without first having earned their stripes in a solid PhD program with strong research design and methodology components, but it holds true for some PhDs as well. Granted, research design and methodology are often not taught thoroughly in many doctoral programs, such as engineering and other mathematical sciences programs; they are taught primarily in social science programs. But I digress.

Validity is the degree to which an instrument captures its intended constructs, while reliability is the degree to which such capture can be repeated. Neither is easy to address; note that I did not use the word "establish." I used the softer word "address," for one may establish, or estimate, a reliability coefficient by calculating it, but one does not establish validity; rather, one addresses it. And while this distinction may seem superficial, it is not.

Viewed more formally, validity is the extent to which a measure captures a specific variable, or set of variables, i.e., the extent to which a construct, or set of constructs, is operationalized. For example, in economics we are concerned with economic output; however, we must first define “economic output;” this is validity. Reliability is the extent to which a measure consistently captures its intended data. In risk engineering, for example, reliability is the ability of a system to perform its required functions consistently under stated conditions for a specified period. As another example, in psychometrics, reliability is the ability of an instrument, or test, to collect the same data consistently, time after time. However, it is possible to have reliability without validity; but technically, it is not possible to have validity without reliability. As an example, a watch that consistently runs 30 minutes behind the correct time offers no evidence of any validity as a measure of time; however, the watch is consistent and, therefore, reliable. In fact, while the time the watch keeps is “not valid,” to use the term loosely, its reliability coefficient is 1.0, or perfect.
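To make the watch example concrete, here is a minimal Python sketch using simulated, hypothetical readings; the data and variable names are illustrative assumptions only. It shows a consistently biased "watch" producing a perfect test-retest correlation even though every reading is wrong.

```python
# Illustrative sketch with simulated data: a watch that is consistently
# 30 minutes behind is perfectly reliable, yet offers no validity as a
# measure of the true time.
import numpy as np

rng = np.random.default_rng(42)
true_time = rng.uniform(0, 24 * 60, size=50)    # true times, in minutes

watch_first_reading = true_time - 30            # first reading: 30 minutes behind
watch_second_reading = true_time - 30           # repeat reading: same consistent bias

# Test-retest reliability: the correlation between repeated readings is 1.0.
reliability = np.corrcoef(watch_first_reading, watch_second_reading)[0, 1]

# Validity problem: every reading is systematically wrong by 30 minutes.
systematic_error = np.mean(watch_first_reading - true_time)

print(f"reliability coefficient: {reliability:.2f}")
print(f"systematic error (min):  {systematic_error:.1f}")
```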

In brief, there are three broad types of validity as they relate to instrument development and usage: content validity, criterion validity, and construct validity. Of the three, content validity is the easiest to address; it is often equated with face validity and merely asks the researcher to subjectively address two basic questions: 1) Does the instrument appear to measure the constructs, or variables, of the study? 2) Is the sample being measured adequate to be representative of the construct to be measured?

Criterion validity allows researchers to approximate how well constructs within their study were operationalized. A simple example may involve a researcher comparing data collected through one instrument to data collected through a different, yet similar, instrument; or an estimation of criterion validity may involve the degree to which an instrument predicts the outcome of a variable.
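As a concrete illustration of the comparison just described, the short Python sketch below correlates scores from a new instrument with scores from an established criterion measure administered to the same respondents. The data are hypothetical, and the use of the Pearson correlation as a rough validity coefficient is an assumption for illustration, not a prescribed procedure.

```python
# Hypothetical sketch: estimating criterion validity by correlating a new
# instrument's scores with an established criterion measure collected
# from the same respondents.
import numpy as np

new_instrument_scores = np.array([12, 15, 9, 20, 17, 11, 14, 18, 10, 16])
criterion_scores      = np.array([14, 16, 10, 21, 18, 12, 13, 19, 11, 17])

# The Pearson correlation between the two sets of scores serves as a rough
# criterion-validity coefficient: the closer to 1.0, the better the new
# instrument tracks the established measure.
validity_coefficient = np.corrcoef(new_instrument_scores, criterion_scores)[0, 1]
print(f"estimated criterion validity: {validity_coefficient:.2f}")
```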

Similar to criterion validity, construct validity is the degree to which an instrument measures a specific variable. Evidence of construct validity is most often considered the most difficult type of validity to address; however, it is no less important. Construct validity addresses whether an instrument accurately measures the constructs it was designed to measure. For example, human intelligence is often measured in terms of an intelligence quotient (IQ), with 100 as the population mean and 15 as the standard deviation. However, who is to define intelligence, and moreover, who is to say what does and does not represent it? Until “intelligence” is operationalized into a measurable variable, it remains merely an abstract construct of relatively little value.

As for reliability, recall that reliability is the degree to which the capture of the same, or in reality similar, data can be repeated. There are several types of reliability, each with its own method of estimation. Generally, these types include 1) inter-rater reliability, 2) test-retest reliability, 3) parallel-forms reliability, and 4) internal consistency reliability. Inter-rater reliability is used to estimate the degree to which different raters provide similar ratings of the same construct or phenomenon; test-retest reliability is used to determine the consistency of a measure over time. Researchers use parallel-forms reliability to estimate the consistency of data collected by two separate instruments in the same domain; and internal consistency reliability, or internal reliability as it is most often called, is used to estimate the consistency of collected data across the items of an instrument. As for estimating reliability, various methods are used, including:

1) Pearson Product-Moment Correlation, where
$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \, \sum_{i}(y_i - \bar{y})^2}}$
2) Cronbach's Alpha, where
$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_{i}^{2}}{\sigma_{t}^{2}}\right)$, with $k$ items, item variances $\sigma_{i}^{2}$, and total-score variance $\sigma_{t}^{2}$
3) Kuder-Richardson (KR-20 and KR-21), where
$KR_{20} = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} p_i q_i}{\sigma_{t}^{2}}\right)$
and
$KR_{21} = \frac{k}{k-1}\left(1 - \frac{k\,\bar{p}\,\bar{q}}{\sigma_{t}^{2}}\right)$, with $p_i$ the proportion answering item $i$ in the keyed direction and $q_i = 1 - p_i$
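The sketch below implements Cronbach's alpha and KR-20 directly from the formulas above, assuming a small, hypothetical matrix of item-level survey responses; the data, function names, and the use of population variances are illustrative assumptions rather than a definitive procedure.

```python
# Minimal sketch: estimating internal consistency reliability from a
# hypothetical (respondents x items) matrix of survey responses,
# following the formulas above.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0)          # variance of each item
    total_variance = items.sum(axis=1).var()    # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

def kuder_richardson_20(items: np.ndarray) -> float:
    """KR-20 for an (n_respondents, k_items) matrix of 0/1 item scores."""
    k = items.shape[1]
    p = items.mean(axis=0)                      # proportion scoring 1 on each item
    q = 1 - p
    total_variance = items.sum(axis=1).var()    # variance of total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / total_variance)

# Hypothetical data: six respondents answering four Likert-type items,
# plus a dichotomized (0/1) version of the same responses for KR-20.
likert_items = np.array([[4, 5, 4, 5],
                         [3, 3, 4, 3],
                         [5, 5, 5, 4],
                         [2, 3, 2, 3],
                         [4, 4, 5, 4],
                         [3, 2, 3, 3]])
binary_items = (likert_items >= 4).astype(int)

print(f"Cronbach's alpha: {cronbach_alpha(likert_items):.2f}")
print(f"KR-20:            {kuder_richardson_20(binary_items):.2f}")
```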

As with validity, many researchers versed in survey research propose that we cannot establish reliability; rather, we estimate it. Given such seeming ambiguity around both validity and reliability, it sometimes appears that they are impossible to gauge. However, that is not true. What matters is using sound measurement methodologies and offering evidence that validity exists, allowing the reader to interpret your methodologies, analyses, and findings. While that is possible in academia, it is not possible in consulting, where you are the expert. Consequently, it is imperative that, as the researcher being paid, you actually are an expert. Unless that holds true, clients will continue wasting money on solutions laced with nothing more than pseudoscientific methods that offer no evidence of science whatsoever.

About the Author

Herbert M Barber, Jr, PhD, PhD serves as the chief executive officer of Xicon Economics. He is a respected author, engineer, economist, researcher, and teacher. Over the last 30 years, Dr. Barber has provided advisory and consulting services in engineering economic systems as they relate to the implementation of large economic endeavors in industry and infrastructure. He is a seasoned scientific researcher with a keen understanding of the statistical and econometric effects, and the causality, that large financial and economic endeavors have on companies, governments, and economies around the world, in both developing and developed economies. Prior to assuming his role with Xicon Economics, Dr. Barber served in leadership positions with Seminole Southern, Fluor Corporation, and Jacobs Engineering. He holds five earned academic degrees in engineering and economics.

About Xicon Economics

Whether calculating the financial and economic feasibility of constructing a new manufacturing plant, analyzing policy changes in various regulatory agencies, forecasting output from a rail system, mining smart grid data to develop real options valuations, or developing advanced energy algorithms, Xicon Economics stands ready to make a financial and economic difference. In fact, every decision we render centers around increasing financial and economic output, regardless of the client with which we are working. Our expertise falls broadly into the areas of economics, research, and statistics.

