Whenever you are presented with claims that a product, service or idea is valid, carefully check whether the supporting evidence has been collected without bias and whether the experiments have been repeated by others who have no financial or ideological motivation for proving the claims true.  Engage the process of Critical Thinking summarized here.  This page describes why the methods used to collect scientifically reliable experimental (and observational) evidence are more trustworthy than uncontrolled testimonials:
An introduction to the methods of science and some terminology
The difference between Quantitative and Qualitative Data
Control Groups, Experimental Groups and Placebos
Experiments - Description
  - Open Experiment
  - Single Blind Experiment
  - Double Blind Experiment
Double Blind Experiment - Examples
  - Product Effectiveness
  - Product Comparison
Surveys
Scientific Observational Evidence

The Bottom Line

We are constantly bombarded by marketing claims (whether they are media ads or in-person demonstrations) that try to provide compelling evidence that will convince us that a product or service is safe and effective. 

It is extremely important to understand the differences between the two main types of evidence you will encounter as you try to evaluate claims and determine whether any given product or service is actually safe and effective or just a feel-good 'remedy' that does not actually work as advertised.

  • Experimental Evidence - specific, measurable (objective) information (data) about a product's performance and effectiveness.  The evidence has been collected in a controlled manner to minimize errors and bias.  This evidence reliably demonstrates that an action (use of a specific product or service) actually caused a specific outcome (an improvement in health).  The methods of evidence collection must be well documented so that anyone can repeat the experiments (or observations) and reach the same conclusions.

  • Uncontrolled Testimonials or Anecdotes - people's subjective observations, opinions and descriptions about how they believe a product or service has helped them.  These personal testimonials can be extremely passionate and compelling if the story-teller believes a product or service has cured them or a loved one of a serious health problem.  However, personal testimonials are scientifically unreliable because they have NOT been collected using processes that control for and minimize errors and bias.  Consequently, there is no way to prove that an action (use of a specific product or service) actually caused a specific outcome (an improvement in health).  Eleven specific reasons that uncontrolled testimonials provide unreliable evidence of effectiveness are discussed here.

Two other methods of obtaining evidence that are more reliable than uncontrolled testimonials (anecdotes) are discussed on this page, but evidence collected by these methods is not generally used in marketing programs. 
  • Surveys - a way to collect what amounts to testimonial data in a manner that provides some organization and control.  However, there are shortcomings that can limit the reliability of the results.
  • Scientific Observation - events are carefully observed and documented (using processes that minimize bias) to determine patterns and deduce cause and effect relationships.  Observations are used in scientific disciplines like astronomy and paleontology where experiments are impossible to conduct.  Like experiments, scientific observations can be used to describe and understand natural processes and develop predictions (hypotheses) that can be tested for accuracy by further observation or experimentation.

Introduction

This discussion will examine Experimental Evidence (collected using scientific research techniques) and my contention that: To provide reliable evidence that supports health claims about any product or service (particularly altered/enhanced water products that are allegedly energized, vortexed, structured, clustered, ionized, succussed, magnetized, oxygenated, etc.), it is necessary to control the way the evidence is collected, analyzed and presented.  This requires establishing and recording objective outcomes that can be clearly defined, tested and measured.  Three important characteristics of collecting controlled outcomes are that:
(1) all outcomes (positive and negative) are collected and clearly documented - not just outcomes that support the claim,
(2) a clear description of how the data was collected and analyzed is recorded, so anyone can evaluate the experimental process and try to reproduce and verify the results, and
(3) the process (experiments and/or observations) to collect the evidence and analyze the results is designed to minimize personal bias and expectations - which always contaminate poorly designed studies.

I discuss elsewhere how uncontrolled testimonials, which are generally solicited to help market a product or service or support other pseudoscientific claims, cannot be trusted to provide accurate, unbiased information about product effectiveness. 

If we can't necessarily trust what people tell us about their experiences with a product, though, how can we discover useful information about a product and determine whether a claim made about some product or service is accurate and trustworthy or whether it is pure marketing fantasy?  Testimonials are not all worthless, and they can actually provide useful information – but only if collection of the testimonials is controlled in a way that minimizes the potential problems outlined in the uncontrolled testimonials discussion.  The scientific experiment is one process specifically designed to collect information in a controlled manner and provide reliable information that allows an accurate evaluation of the product's effectiveness.

A brief introduction to the methods of science and some terminology:
The fundamental goal of science is to understand the natural universe in which we live.  Much of this understanding involves the discovery and description of causality (cause and effect relationships) in nature.  When natural cause and effect relationships are understood it is often possible to manipulate the causes and thus modify the effects.  If the causes of a disease are understood, for example, processes can be developed to remove or reduce exposure to those causes and/or modify the body's response (with vaccines or treatments) and minimize (or cure) the normal disease effects. 

The products of interest in this discussion (altered/enhanced water products) are marketed with claims that they can have some real, significant effects on health that are caused by REAL biological effects on the body and not just placebo responses.   If those claims are valid, there must be a way to collect valid supporting evidence using the methods outlined here.

Over the last several centuries scientists have developed a methodology (often referred to as the Scientific Method) that formalized guidelines for discovering possible cause and effect relationships (hypotheses) and then testing them to see if they can be validated.  There is lots of interesting (to me anyway) information available about how science works.  However, in the context of this very specific discussion, I will mostly focus on how to use the methods of science to collect reliable information about whether a specific product has a real effect on a person's health.  This process is often described as Evidence Based Medicine - this article provides an excellent introduction to clinical trials.  If you would like additional general information, you can read brief discussions on the goals of science and the scientific method.  Wikipedia and Berkeley also have good descriptions of the scientific method.

The key difference between the collection of uncontrolled testimonials to support a product's effectiveness and the scientific collection of information to determine whether a product is effective, is that scientists do not set out to collect only data that supports their hopes, expectations or beliefs about how a product should work.  Scientific data collection is (or should be) designed so all relevant data are collected, recorded and analyzed in a manner that minimizes ANY biases, expectations or beliefs scientists have which might influence how experiments are designed and how the data are collected, analyzed and interpreted.  Similarly the effects of the biases, expectations or beliefs of those who will use the product (the experimental subjects) should also be minimized.

It is important to realize that both the scientific researchers and the experimental subjects DO HAVE biases, expectations and beliefs about any experiment in which they are involved.  If the researchers were not interested in something about the product, and if they didn't have some expectations about the product's effects, they would not bother to conduct an experiment - if they were hired by a company to test a product, they might have a bias to produce favorable results.  If the subjects didn't have some expectations and beliefs about some claim made for a product they might not participate in the experiment. The examples below will illustrate how several scientific data collection methods can minimize the effect of all the biases everyone has and let the data 'speak for itself'.  

Other scientists judge whether an experiment is valid and has successfully demonstrated a cause and effect relationship in part by how well the experimental design, the methods and the data analysis have minimized all potential sources of bias.  High quality scientific journals strive to publish only results of studies where the effects of bias and expectation have been successfully controlled (a good starting list).  Research results submitted to these journals are reviewed by other researchers who are respected in their field, and papers are only approved for publication if all aspects of the experiment are high quality and meet stringent requirements to minimize bias and maximize objectivity, accuracy and reproducibility.  This process of Peer Review is employed by all high quality scientific journals.

In recent years, however, a number of 'journals' have been established that may have impressive names but almost no quality control measures, and their peer review processes are non-existent or poorly controlled.  These 'journals' may be business ventures that require payment from the authors to be published and/or they may collect and publish 'research' that supports very biased positions that are not accepted by the scientific community as a whole.  Those who publish these suspect 'journals', and those who depend on them to become published, will claim there are conspiracies by all the standard, high quality journals to suppress the findings of those researchers who are outside mainstream science.  They also claim traditional journals are completely biased toward traditional theories, only publish papers that support traditional scientific and medical beliefs and discriminate against anyone with novel ideas by ignoring their research.  An interesting example played out in late 2012 and early 2013 when a researcher from the Sasquatch Genome Project claimed to have sequenced Bigfoot's DNA and was unable to get her paper published in a reputable journal.  She apparently purchased rights to an online journal, DeNovo, and published the paper.  The saga can be followed here: a, b, c and d, from the Dallas Observer, the Huffington Post, io9 and the author herself.

Definitions

Quantitative and Qualitative Data:  There are two basic types of information (or data) about experimental subjects that can be collected and analyzed in scientific studies and used to form a conclusion about the product's effectiveness:

The scientific community has developed two research processes (Experiments and Surveys) that enable controlled, interactive data collection by researchers from subjects on whom products are tested.  There are several important characteristics of experiments (and to some extent, surveys) that help minimize and control for possible sources of bias.  These include:

Experiments

I am going to be honest here.  These examples are brutally detailed.  However, the detail is critical to give a non-scientist enough information to understand how one of the most important scientific tools, the controlled experiment, works, and how controlled experiments differ from the testimonial 'evidence' provided to support pseudoscientific products and services.

A. Description/Definition of Experiments
      Open
      Single Blinded
      Double Blinded
B. Examples of Double Blind Experiments
      Effectiveness Experiment (Double Blinded)
      Product Comparison (Double Blinded)
C. Summary

Specific Examples - a Double Blind Experiment will be described below.  The examples are intended as outlines: you can substitute other products that make measurable claims and actually test the claims you are interested in evaluating.  These examples will test the claims of bottled OmyGod Oxygen Water, which is marketed as containing eight times the normal amount of dissolved oxygen and has just become the rage on college campuses.  Two testable claims are made for OmyGod Oxygen Water (OmGOW):
  1)  Better taste
  2)  Extra energy and vitality

Participants in a typical research experiment or survey might include:

These examples are fairly simple and could be adapted for a high school science fair project, but they provide a basic outline of the experimental processes that are similar whether the experiment is small and trivial or a multi-million dollar, multi-year experimental study.
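
If you want to try a blinded test of your own, one step that is easy to get wrong is deciding who receives which water without the subjects or the tester knowing.  The following is only a minimal sketch in Python of one way that might be done.  The function name, the bottle codes "A" and "B", the 20 volunteers and the fixed seed are all my illustrative assumptions, not a prescribed protocol:

```python
import random


def blind_allocation(subject_ids, seed=None):
    """Randomly split subjects between two coded bottle labels.

    Returns (assignments, key).
    assignments: subject -> bottle code; this is all the tester and subjects see.
    key: bottle code -> actual contents; a third party keeps this sealed
         until every measurement has been recorded.
    """
    rng = random.Random(seed)

    # Shuffle the subjects so group membership is decided by chance,
    # not by anyone's expectations about who will respond well.
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2

    # Which code hides which water is also decided by chance.
    contents = ["oxygenated water", "plain water"]
    rng.shuffle(contents)
    key = {"A": contents[0], "B": contents[1]}

    assignments = {subject: "A" for subject in shuffled[:half]}
    assignments.update({subject: "B" for subject in shuffled[half:]})
    return assignments, key


# Hypothetical example: 20 volunteers numbered 1 to 20.
assignments, key = blind_allocation(range(1, 21), seed=42)
print(assignments)  # safe to share with the tester
# 'key' stays hidden until the results have been collected.
```

The point of returning two separate objects is that the list of codes can be handed to whoever pours and records, while the key that says what is really in each bottle is held back until the data are complete.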

I hope it is obvious how evidence collected in a controlled environment is more trustworthy than the uncontrolled testimonials used by those trying to peddle a product for which no objective controlled supporting evidence can be obtained because their product only works in the imagination and not in the real world.  It should be obvious too that surveys can't provide the experimental controls that are possible with blinded experiments.

Additional Resources:
  Experimental Design for Advanced Science Projects
  The Basics of Experimental Design

Disclaimer: The above discussion is not an exhaustive description of how scientific experiments work, nor an outline of all the specific details, controls and statistical analyses that must be managed in order to design and execute a high-quality, reproducible experiment that stands a chance at publication in a high-quality journal.  Nor are all of the potential pitfalls that can sabotage even the best-intentioned experiment mentioned.  The discussion is provided to highlight some of the more important characteristics of scientific experiments, illustrate how they can provide useful information about whether or not claims for a product (or process or procedure) are true, and provide instructions so you can conduct your own blinded tests.

Surveys

Surveys are not much different from uncontrolled testimonials in that researchers collect and record information about their subjects without manipulating the study environment.  The survey participants are simply asked questions about the topics of interest without asking the subjects to make any changes in their behavior.  Surveys are a good way to gather data to generate ideas and starting points for conducting experimental studies. 

Because surveys can quickly and relatively inexpensively reach and collect information on a large number of people, they are often the only practical way to develop a representative picture of the attitudes, beliefs and other characteristics of a large population.

Surveys are also able to collect data on a broad range of subjects that are not easily studied by scientific experiments, for example, attitudes, opinions, beliefs and values.  Ironically, these are exactly the human characteristics that can cause some of the unreliability of individual testimonials.  These are also the traits that can contribute to the failure of science.  Strongly held attitudes, opinions, beliefs and values can result in experiments and observational studies that are poorly designed and carried out only to provide 'evidence' to confirm a specific bias - not to truly understand a situation.  Surveys also provide information about relationships between people, places, and things as they exist in the real world that are difficult to study in laboratory experiments.

Surveys have several advantages over experimental studies:

  1. They are much easier to develop, less expensive and faster to conduct than experimental studies.
  2. Information can be gathered on thousands of people just by asking questions.
  3. They can be conducted remotely by phone calls, online or by mail.
  4. Information can be gathered about non-scientific topics like attitudes, opinions, beliefs and values.
  5. Like experiments, surveys can be replicated by others to check and validate results.

Surveys also have significant disadvantages compared with experimental studies:

Measurement Errors Associated With Surveys
  • Question Wording: Does the question have a consistent meaning to respondents? Problems can occur with
    • Lengthy wording: Words are unnecessarily long and complicated.
    • Length of question: Question is unnecessarily long.
    • Lack of specificity: Question does not specify the desired information.
    • Lack of frame of reference: Question does not specify what reference comparisons should be made to.
    • Vague language: Words and phrases can have different meanings to respondents.
    • Double negatives: Question uses two or more negative phrases.
    • Double barreled: Question actually asks two or more questions.
    • Using jargon and initials: Phrasing uses professional or academic discipline-specific terms.
    • Leading questions: Question uses phrasing meant to bias the response.
    • Cultural differences in meaning: Phrases or words have different meanings to different population subgroups.
  • Respondent Characteristics: Characteristics of respondents may produce inaccurate answers. These include
    • Memory recall: Problems remembering events or details about events.
    • Telescoping: Remembering events as happening more recently than when they really occurred.
    • Agreement or acquiescence bias: Tendency for respondents to “agree.”
    • Social desirability: Tendency to want to appear in a positive light and therefore providing the desirable response.
    • Floaters: Respondents who choose a substantive answer when they really do not know.
    • Fence-sitters: People who see themselves as being neutral so as not to give the wrong answer.
    • Sensitive questions: Questions deemed too personal.
  • Presentation of Questions: The structure of questions and the survey instrument may produce errors including:
    • Open-ended questions: Response categories are not provided, left to respondent to provide.
    • Closed-ended questions: Possible response categories are provided.
    • Agree-disagree: Tendency to agree when only two choices are offered.
    • Question order: The context or order of questions can affect subsequent responses as respondents try to remain consistent.
    • Response set: Giving the same response to a series of questions.
    • Filter questions: Questions used to determine if other questions are relevant.
  • Interviewer: The use of an interviewer may produce error.
    • Mismatch of interviewer-interviewee demographic characteristics.
    • Unconscious judgmental actions to responses.
Source: Fundamentals of Social Work Research, Chapter 8; Rafael J. Engel and Russell K. Schutt
  1. Like testimonials, surveys cannot provide strong, reliable evidence of cause and effect relationships.
  2. Like testimonials, for various reasons survey respondents may not provide accurate, honest answers.
    • they might want to present themselves in a favorable (or fabricated) manner
    • they might have faulty memories of the subject matter
    • they might find that none of the answer options fit them, or that different responses fit at different times
    • they might have an agenda to either support or compromise the survey goals
  3. Like testimonials, respondents may provide accurate answers but have confused correlation with causation or forgotten key information.
  4. Survey questions and answers may be interpreted differently by different respondents and researchers.
  5. Open-ended survey questions,  where respondents can answer a question any way they like, must be interpreted and coded by the researcher.  This can lead to a variety of problems like biased interpretation and too many response types to analyze.
  6. Closed-ended survey questions, which provide the respondents with pre-selected choices, may not allow an accurate answer to be entered.
  7. Even slight variations in the wording or order of questions and answers can produce different results.
  8. It can be difficult to survey a representative sample of the population, so survey results may be biased:
    • the surveyed population may contain answers from individuals who should have been excluded
    • the surveyed population may not contain answers from individuals who should have been included
    • people with certain strong views may choose not to participate in a specific survey
    • people with certain strong views may be more likely to participate in a specific survey
  9. Survey results can only represent a picture of those who chose to respond to the survey.
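
To see how much disadvantages 8 and 9 can matter, here is a rough simulation sketched in Python.  Every number in it is an assumption I invented for illustration (30% of people truly favorable toward a product, fans answering the survey 60% of the time and everyone else 10% of the time).  It is not data from any real survey:

```python
import random


def simulate_self_selection(n_people=10_000, seed=1):
    """Compare the true share of favorable opinions in a population
    with the share seen among those who choose to answer a survey."""
    rng = random.Random(seed)
    true_favorable = 0
    respondents = 0
    respondent_favorable = 0

    for _ in range(n_people):
        # Assumed: 30% of the population truly likes the product.
        likes_product = rng.random() < 0.30
        # Assumed: people who like it are far more willing to respond.
        chance_of_responding = 0.60 if likes_product else 0.10

        true_favorable += likes_product
        if rng.random() < chance_of_responding:
            respondents += 1
            respondent_favorable += likes_product

    return true_favorable / n_people, respondent_favorable / respondents


population_share, survey_share = simulate_self_selection()
print(f"favorable in the whole population: {population_share:.0%}")
print(f"favorable among survey respondents: {survey_share:.0%}")
```

Under these assumed response rates the survey reports a much higher favorable share than actually exists in the population, even though every respondent answered honestly.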

Cross-Sectional Survey - These studies collect information at a single point in time.  A number of individuals are contacted and asked the questions of interest.  The individuals are not followed and asked the same set of questions later to see what might occur over time.  This example will collect data on the energy benefits of OmyGod Oxygen Water (OmGOW):

  1. The researcher will develop a set of questions to ask the subjects - in this survey, there will be eight questions:
       (1) have you ever tried OmGOW?  If the answer is yes, continue with:
       (2) how long have you been drinking OmGOW, and how often do you drink it? 
       (3) what is your gender?
       (4) what is your age?
       (5) on a scale of 1-10, how active are you (1=not active)?
       (6) on a scale of 1-10 how would you rate your overall health (1=poor)?
       (7) does OmGOW taste better than tap water or worse?
       (8) after you drink OmGOW do you believe your energy and vitality levels increased or decreased?
  2. The researcher or assistants will call or contact in person as many individuals as is feasible within the constraints of available resources, ask people the eight questions and record the answers.
  3. After enough people have been contacted to provide usable information - perhaps several hundred people who have experienced OmGOW - the results will be analyzed to see what percent of the respondents thought their energy and vitality levels increased after drinking the oxygenated water.  Other results could be reported as well like the percent of users who thought OmGOW tasted better than tap water, whether females experienced the product differently than males, whether the length of time using the oxygenated water might have influenced their answers, etc.
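
The analysis in step 3 is mostly counting.  As a hedged illustration, the tallies might be computed along these lines in Python.  The five records are invented for the example, and the "no change" answer is my assumption (question 8 as worded only offers increased or decreased):

```python
# Hypothetical answers, invented for illustration only.
responses = [
    {"gender": "F", "taste": "better", "energy": "increased"},
    {"gender": "M", "taste": "worse", "energy": "no change"},
    {"gender": "F", "taste": "better", "energy": "increased"},
    {"gender": "M", "taste": "better", "energy": "decreased"},
    {"gender": "F", "taste": "worse", "energy": "no change"},
]


def percent(records, field, value):
    """Percentage of records whose answer in `field` equals `value`."""
    if not records:
        return 0.0
    hits = sum(1 for record in records if record[field] == value)
    return 100.0 * hits / len(records)


# Overall results for the two claims being surveyed.
overall_energy = percent(responses, "energy", "increased")
overall_taste = percent(responses, "taste", "better")

# The same energy question, broken down by gender.
by_gender = {
    gender: percent(
        [r for r in responses if r["gender"] == gender], "energy", "increased"
    )
    for gender in sorted({r["gender"] for r in responses})
}

print(f"energy increased: {overall_energy:.0f}% of respondents")
print(f"tastes better:    {overall_taste:.0f}% of respondents")
print(by_gender)
```

Note that these percentages only describe what respondents reported believing.  They say nothing about whether the water caused any change.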
Longitudinal Survey - These studies collect information at several points over time.  A number of individuals are contacted and asked the questions of interest.  The individuals are contacted again at one or more time intervals to see what might have occurred over time.  This example describes a survey on the energy benefits of drinking OmyGod Oxygen Water (OmGOW) over time:
  1. The researcher will develop a set of questions to ask the subjects - in this survey, seven questions will be asked at the beginning to collect basic data on the subjects:
       (1) have you ever tried OmGOW?  If the answer is yes, continue with:
       (2) how long have you been drinking OmGOW, and how often do you drink it? 
       (3) does OmGOW taste better or worse than tap water?
       (4) what is your gender?
       (5) what is your age?
       (6) on a scale of 1-10, how active are you?
             1 = not active 'couch potato' | 10 = regularly work out and run marathons
       (7) on a scale of 1-10 how would you rate your overall health?
             1 = poor, lots of problems | 10 = extremely good health
    Three questions will be asked during the course of the survey.
       (1) does OmGOW taste better than tap water or worse?
             (who knows, the extra oxygen might fry the subject's taste buds)
       (2) after you drink OmGOW do you believe your energy and vitality levels increased or decreased?
             -10 = less energy | 0 = no change in energy | 10 = experience higher energy
       (3) have there been any significant changes in your daily routine and activity levels?
             -10 = less activity | 0 = no change in activity | 10 = higher activity levels
  2. The researcher or assistant will contact as many people as it takes to find 20 individuals who drink OmGOW daily and who are willing to provide information about their experiences.  Answers to the seven initial questions plus the three ongoing questions will be recorded.
  3. The researcher or assistant will call each subject daily between noon and 1:00 pm and record the answers to the questions about taste, energy/vitality and routine changes.
  4. After a week or two of collecting data on OmGOW, the results will be analyzed to see if the subjects experienced any changes in how they perceived OmGOW or in their energy levels during the survey period.
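
One simple way to look for change over the survey period is to average the daily energy scores across subjects and compare the first day with the last.  The sketch below is in Python.  It uses three invented subjects and seven invented days of scores on the -10 to 10 scale, purely as an assumption for illustration:

```python
from statistics import mean

# Hypothetical daily energy scores (-10 to 10), one value per day.
energy_scores = {
    "subject_01": [2, 3, 1, 0, 0, 1, 0],
    "subject_02": [5, 4, 4, 2, 1, 1, 0],
    "subject_03": [0, 0, -1, 0, 1, 0, 0],
}


def daily_means(scores_by_subject):
    """Average score across all subjects for each day.

    zip() stops at the shortest list, so a subject with missing days
    silently shortens the series; real data would need a deliberate
    rule for handling missed calls.
    """
    per_day = zip(*scores_by_subject.values())
    return [mean(day) for day in per_day]


means = daily_means(energy_scores)
trend = means[-1] - means[0]  # negative means reported energy fell

for day_number, value in enumerate(means, start=1):
    print(f"day {day_number}: mean energy score {value:.2f}")
print(f"change from first to last day: {trend:.2f}")
```

Even if a trend shows up, the answers to the question about changes in daily routine need to be checked before reading anything into it, because a change in activity could explain a change in reported energy just as well as the water could.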
Surveys provide somewhat more control over the collected data than uncontrolled testimonials and have legitimate uses.  In my opinion, though, the inherent methodological limitations and opportunities to influence results make surveys less than ideal for gathering data to make important decisions.

Every time I see results of a survey presented as providing an accurate representation of reality I just remember my own experiences with surveys I have been asked to take.  If I choose to participate, there are almost always questions that I simply can't answer within the constraints of the question - or my answer might vary depending on circumstances not addressed in the survey - or, if I feel the survey topic, questions or interviewer are 'out of bounds', I may decide to answer randomly or provide answers that will confuse interpretation.  Usually, I choose not to participate, and my experiences and thoughts are not incorporated into the results.  If a significant fraction of potential survey respondents treat surveys the way I do, there is little hope for their accuracy.

Scientific Observations vs. Testimonials

In the context of this discussion, another method by which valid scientific evidence can be obtained needs to be mentioned - Scientific Observation.  Those who try to justify the use of uncontrolled testimonials to validate product claims might point out that science has always depended on observing the world and the universe to understand how the natural universe behaves - and that is just what their testimonials are, observations of what happened when someone tried their product.  What they apparently don't understand is that Observations are generally just the first step in the development of all scientific hypotheses and theories - but science never stops with just observations.

The curiosity of scientists who observe natural phenomena can be engaged, and they come up with tentative explanations (or hypotheses) to explain what causes them.  At that point the explanations are just educated guesses, but these guesses must lead to predictions that can be validated.

In both cases, however, the processes of science require the hypotheses based on observations to be tested and confirmed.  In many cases the predictions based on theories developed using careful observations are validated experimentally when science progresses to the point where evolving technology finally enables experiments to be conducted. 

The Theory of Evolution, for example, was formulated by Charles Darwin in the mid-1800s and was based largely on his observations.  Since that time a number of predictions about evolution and natural selection formulated on his and other observations have been experimentally validated.  Similarly, observations of geological features by several scientists in the early 1900s led to the development of the Theory of Plate Tectonics.

Another key characteristic of legitimate Scientific Observations is that, like well designed Scientific Experiments, they deal with natural events that are subject to validation by anyone else who wishes to make their own observations.

The 'observations' of pseudo-scientists and those collecting uncontrolled testimonials have two characteristics that quickly separate them from scientific observations. 

  1. Only those who believe in the alleged phenomena are actually able to make the observations.  Whenever a non-believer tries to make an observation something always 'goes wrong' to cause failure.  Examples include: structured water, ghosts, 'miracle' products & cures, UFOs, psychic auras, etc.
  2. These 'observations' never lead to the development of any testable theories or useful predictions - they are only used to market phony products, services or ideas.

The Bottom Line:  The process of Observational Science and the validity of theories based on the observations are completely different from information collected by and conclusions based on Uncontrolled Testimonials.  Testimonials used to promote pseudoscientific products, services and ideas certainly make claims, but the claims are either not testable or they have never been tested by legitimate scientific processes.

    Copyright © 2005, Randy Johnson. All rights reserved.

Updated April 2015