Database Design:  Reporting Results


Interpreting the Analysis

There are always three questions that you should answer in any report of your analysis of data:

·        What are the possible relationships?

·        What are the actual relationships?

·        What are the implications of this?

Sensitivity Measures

When I was an undergraduate at the University of Michigan, one of our statistics profs was famous for his saying, “Confidence levels don’t give you any confidence.  Significance tests don’t signify.  But what has importance is important.”

 

There are a number of terms and techniques used in statistics to estimate the importance of the findings, given the quality of the data.  All of these terms, taken together, are called “sensitivity measures.”  Sensitivity measures the amount of “error” in a mathematical analysis.  “Error,” in this sense, is a technical term meaning the amount of deviation of the observed values from the “true” (or “population”) values, due to

·        Constant error:  Effects which distort the measures in one direction

·        Random error:  Effects which obscure possible effects (i.e., which distort the measures in all directions).
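The two kinds of error can be illustrated with a small simulation (a sketch with made-up numbers; the "true value" and error sizes are hypothetical):

```python
# Illustrative sketch: constant error shifts every measurement in one
# direction, while random error scatters measurements in all directions
# around the true value. All numbers here are hypothetical.
import random

random.seed(1)
TRUE_VALUE = 50.0

# A "biased scale": every reading is pushed 3 units high.
constant_error = [TRUE_VALUE + 3.0 for _ in range(1000)]

# A "noisy scale": readings scatter symmetrically around the true value.
random_error = [TRUE_VALUE + random.gauss(0, 3.0) for _ in range(1000)]

print(round(sum(constant_error) / 1000, 1))  # mean is shifted away from 50
print(round(sum(random_error) / 1000, 1))    # mean stays close to 50
```

Note that averaging many observations washes out random error but does nothing to constant error, which is why constant error mainly threatens validity while random error mainly threatens reliability.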

 

There are four terms which, taken together, determine the sensitivity of an analysis:

·        Reliability: The degree to which a measure generates similar responses over time and across situations.  It is mostly affected by random error.  Reliability can be tested by:

o       Test-Retest:  Same subjects retake the test after some lapse of time and results are compared for consistency.

o       Alternate forms:  A group of subjects take alternate forms of the same test and the results are compared for consistency.

o       Subsample:  A sample of the subject pool is called back for a retest, another sample is given an alternate form, and the results of all three conditions (first form, alternate form, retest of first form) are compared for consistency.

Each of these techniques carries its own validity problems.  Retest is subject to maturation and learning problems: by the time subjects sit for the retest, they have had additional life experiences and some initial experience with the test itself.  Alternate forms cannot answer which form is more accurate.  Subsample shares all of these weaknesses.
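As an illustration of how "consistency" is usually checked, test-retest reliability is often summarized as the correlation between the two administrations of a test.  A minimal sketch (the subject scores are hypothetical):

```python
# Sketch: test-retest reliability as the Pearson correlation between
# the first and second administrations of the same test.
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical scores for five subjects tested twice.
first = [82, 75, 90, 68, 88]
retest = [80, 78, 92, 70, 85]
print(round(pearson_r(first, retest), 3))  # high value = consistent measure
```

A correlation near 1.0 suggests the measure generates similar responses over time; a low value signals heavy random error.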

 

·        Validity:  The degree to which a measure accurately reflects what it is supposed to measure—to what extent are observed differences “true”?  Validity is mostly affected by constant error.  You have already considered the issue of validity in the discussion of Campbell & Stanley’s Quasi-Experimental Designs.  To summarize, there are two issues of validity:

o       Internal validity:  How effective are the measures within the confines of the study? 

o       External validity:  Can the results obtained be generalized to the broader population?  This involves questions of the representativeness of the subject group, the possibility that the selection or the measurement process might have affected the outcome (in addition to the experimental treatment), and the possibility that multiple conditions present in the study might be necessary for the experimental effect to occur.

 

·        Confidence limits:  Estimate of the error that can be expected in a measure, just due to the nature of measurement.  “Confidence” is a term with a very precise meaning in quantitative analysis, so use it in technical writing only when you intend the technical meaning.  It can be used to estimate the sample size needed for a study.  The general formula is of the form:

o       CL = x̄ ± (1.96σ / √(n – 1))

The confidence limit for correlation can be simplified to:

o       r² = 1 / (n – 1)
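A minimal sketch of the confidence-limit formula above, using the sample standard deviation as the estimate of σ (the sample values are hypothetical):

```python
# Sketch of the confidence-limit formula from the notes:
# CL = mean ± 1.96 * s / sqrt(n - 1),
# with the sample standard deviation s standing in for sigma.
import math

def confidence_limits(scores):
    """Return the lower and upper .05-level confidence limits."""
    n = len(scores)
    mean = sum(scores) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    half_width = 1.96 * s / math.sqrt(n - 1)
    return (mean - half_width, mean + half_width)

# Hypothetical sample of ten measurements.
sample = [12, 15, 11, 14, 13, 16, 12, 14, 15, 13]
low, high = confidence_limits(sample)
print(round(low, 2), round(high, 2))  # 12.52 14.48
```

The interval narrows as n grows, which is how the formula can be run backwards to estimate the sample size needed for a desired precision.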

·        Significance:  Test to determine whether observed results could be explained by chance deviation from expected (average) values.  “Significance,” too, has a technical meaning and should only be used in technical writing with that intention.

o       Significance testing is based on the assumption of a “normal curve” (that is the infamous “bell-shaped” curve, the profile through a pile of sand that has dripped through your fingers—the distribution of objects whose interaction with each other is independent and random). 

o       The normal curve can be completely described by two terms:  the mean (μ) and the standard deviation (σ).  [I am using Greek letters here because I am referring to “ideal” parameters of a population, not those actually observed in a sample].  The mean measures the central point (the median and the mode will be at the same point in this ideal distribution).  The standard deviation measures the dispersion of cases about the mean.

o       Any observation can be located on an ideal normal curve through its “z-score.”  Z is a number which converts the value obtained from observation into a location on the normal curve.  It takes the form

§         z = (x – μ) / σ

Or, the distance of the observation from the mean, divided by the dispersion (standard deviation) about the mean.  Z-scores, then, are commonly used to gauge levels of significance.

o       Significance levels represent a trade-off between two potential types of error—the possibility that one could deny that there is a real difference when there really is one (a Type II error) and the possibility that one could accept that there is a real difference when there really isn’t one (a Type I error).  The relationship between the two is inverse—as one goes up, the other goes down.  In the ideal, the two lines cross at the .01 level of significance; this represents the balance between affirming a false result and denying a true result.  It is the level commonly used in bench science.  In medical science, where the consequences of affirming a false result could be fatal, researchers tend to take a stricter stand and use a .005 level.  In the social sciences, where experimental control is weaker and the data are often less precisely measured, we commonly use the .05 level. 

o       A “.05 level of significance” means that 5% of cases belonging to a normal distribution will fall outside the points with z-scores of ±1.96.  Any smaller deviation from the mean can be expected to occur simply by chance 95% of the time.  And this explains what that “1.96” was doing in the formulas above.  It was setting the bounds for the .05 level of significance.

o       To use the z-score (i.e., 1.96 for the .05 level), simply multiply the sample standard deviation by 1.96.  Any observation that falls more than that distance from the mean is not likely to belong to the sample distribution.
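The z-score formula and the 1.96 cutoff can be sketched together (a minimal sketch; the scale with μ = 100 and σ = 15 is hypothetical):

```python
# Sketch of the z-score and the .05 significance cutoff from the notes.

def z_score(x, mu, sigma):
    """Distance of observation x from the mean mu, in units of sigma."""
    return (x - mu) / sigma

def significant_at_05(x, mean, sd):
    """True if x lies more than 1.96 standard deviations from the mean."""
    return abs(x - mean) > 1.96 * sd

# Hypothetical scale with mean 100 and standard deviation 15.
print(z_score(130, 100, 15))            # 2.0 -- beyond the 1.96 cutoff
print(significant_at_05(120, 100, 15))  # False: z is only 1.33
print(significant_at_05(135, 100, 15))  # True: z is 2.33
```

An observation with |z| > 1.96 would occur by chance less than 5% of the time, so it is treated as significantly different from the sample mean at the .05 level.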



 


 

 

© 1996 A.J.Filipovitch
Revised 11 March 2005