What is a “fact”? What is “truth”? Is there even a difference? For that matter, are empirical “facts” the only reasonable way to answer a question?
· What you see is not necessarily what you get.
· Sometimes ambiguity is built into the design itself. For example, consider the simple line drawing of a cube that you practiced in grade school. Sometimes it faces one way, sometimes the other. Or consider the Ishihara color plates (the dot charts your eye doctor uses to test for color blindness): to people who are color blind they can appear to be random dots, while people with normal color vision clearly see numbers. You might also notice that the perception of ambiguous figures can alternate between one reading and the other; sometimes, with effort, you might even be able to hold both figures simultaneously in perception.
· Sometimes ambiguity can be introduced through indirection (this is what the magician’s art is all about). Psychology has made an art (science?) of using ambiguous stimuli—drawings, like the TAT (Thematic Apperception Test), or free-form shapes like inkblots (the Rorschach charts)—to allow the individual to project meaning into the ambiguity. Advertising, too, uses ambiguity in the setting and posing of models to suggest possibilities which are (perhaps) better left unstated. What one person sees “so clearly,” another may see entirely differently. Is one “right” and the other “wrong”?
Viewpoint can also “complete” information that is not really given. Two line drawings of squares, overlapping at one corner with the overlapped parts left out, will actually lead the eye to fill in the missing ink. Looking at one façade of a house, we “fill in” what the other side would look like—which can lead to jarring surprises. For example, many buildings…
· What you get is not necessarily what you see, either.
· There are any number of “parlor games” the solution to which comes from “breaking the rules.” For example, given 9 dots in a square, connect them with only 4 straight lines. Or, given a chain with a ring on each end, create a closed loop (without opening one of the rings). Or, given 6 common nails, balance them on the head of a seventh nail which is standing up from a piece of wood.
· How do you learn to think “outside the lines”? Or, maybe more to the point, how did you come to learn not to?
· There is a famous short story by the Japanese author, Ryunosuke Akutagawa, called “In a Grove” (from Rashomon and Other Stories). It is the story of a rape and murder, told through the testimony of 7 witnesses (including a confession from the accused and the testimony of the murdered man presented through a medium). None of the 7 agree in every detail with any other, and every major detail is contradicted by at least one witness. Yet, between all these conflicting statements, it is possible to develop a coherent story of what actually happened.
· In trying to tease out conflicting and ambiguous information, it is often useful to ask yourself:
· What is known true?
· What is known false?
· What is unclear?
· What needs to be known?
The English author, C.P. Snow, wrote a famous essay called “The Two Cultures.” He argues that there are two traditions of scholarly inquiry, the Empirical and the Humanistic. The Empirical Tradition is based on the creation of abstract models, the attempt to capture the essence of observed objects or events, the effort to create a “map” of some segment of the world such that the elements of the map represent elements of the “real world” (which is assumed to exist outside private experience). The Humanistic Tradition is based on description of the phenomenal world and the effort to capture the meaning of a phenomenon, usually by explaining the appearance of the phenomenon. For the humanistic tradition, description and explanation both are based on the interaction of the perceiver and the perceived.
· How does science develop? Some say it is a linear process, each step building on the ones that went before it; others argue that it is cyclical (or, perhaps, spiral)—not necessarily making “progress,” but coming back on itself and finding new ways to ask old questions.
· How does science approach the phenomenal world? Some argue that it is a search for “truth,” while others say “data” (predictive usefulness) are enough (“For purposes of theory, it doesn’t matter whether or not people are rational, only that, in the aggregate, they behave as if they were.”)
· How are scientific decisions made? Popper (2002) argues that it is only through “falsifiability,” while others argue for direct proof (we will return to this in discussing the difference between induction and deduction).
· What is the relation between units and the whole? Some argue that there are “levels” of analysis, and that what is true at the individual level may not be true at the group level (thus, the difference between psychology and sociology, between microeconomics and macroeconomics). Others argue that it makes more sense to look at the difference between aggregate effects (the sum of individual effects) and general effects (the action of the group as a whole, separate from the actions of individuals within the group).
· How are questions posed? Some pose their questions as dichotomies—“Anything is either A or not-A.” Others see a range of possibilities between “Being and Nothingness.” Some pose them as “trichotomies”—“Being…Becoming…Nothing.” A few are exploring more precise ways to measure the distance between Being (1) and Nothingness (0) using “multi-valent” sets—“0 … 0.25 … 0.5 … 0.75 … 1.” And a few are working on ways to deal with overlapping states between Being and Nothingness using “fuzzy sets”—“Absent…Probably absent…Could be present…Probably present…Clearly present.”
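The contrast between dichotomous, multi-valent, and fuzzy framings can be sketched in a few lines of code. The category (“tall”), the cutoffs, and the ramp below are all invented for illustration; nothing here comes from any particular theory of fuzzy sets.

```python
# Three ways to pose the question "is this person tall?"
# All cutoffs and the ramp are hypothetical, for illustration only.

def crisp(height_cm):
    """Dichotomy: either A or not-A."""
    return 1 if height_cm >= 180 else 0

def multivalent(height_cm):
    """Graded steps between Nothingness (0) and Being (1)."""
    steps = [(190, 1.0), (183, 0.75), (176, 0.5), (169, 0.25)]
    for cutoff, grade in steps:
        if height_cm >= cutoff:
            return grade
    return 0.0

def fuzzy(height_cm):
    """Continuous membership: ramps from 0 at 165 cm to 1 at 190 cm."""
    return min(1.0, max(0.0, (height_cm - 165) / 25))
```

The same case can land in very different places depending on the framing: a 177.5 cm person is simply "not tall" under the dichotomy, but half a member of the category under the fuzzy framing.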
When the scientific (empirical) method is applied to solving everyday problems, it is often called the “rational decision-making model.” This is the method used in much of public affairs, including planning and management. Of course, if there are so many questions which are still open, maybe we should use something else. The problem with that position is: what are the alternatives? There are many ways to choose between competing positions (appeal to force or appeal to emotions come readily to mind), but some form of “rational” argument would seem to be the only way to arrive at a choice that will lend itself to public scrutiny, and the only way to build consent among people who might not have been initially persuaded. The rational decision-making model has several steps:
· Goals statement (ordering of needs)
o What needs are to be served
o Whose needs are to be served
o Importance of needs (as felt by the community as well as by the experts)
o Consequences of solution (intended & unintended).
o Note: It may not be possible to rationally (objectively & analytically) order needs. Some decisions are inherently irrational. Standards (professional judgment) may be used in those cases.
· Analysis of system structure
o Need variable (output, dependent variable)
o Control variable (input, independent variable, manipulated variable)
o Uncontrolled variables (external variables)
o Note: Direct impact on one system may have indirect impact on others.
· Selection of possible solutions
· Implementation of selected solution
· Evaluation of achievement of solution
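The steps above can be sketched as a simple scoring loop. All the names, weights, and ratings below are invented for illustration (and, as the note on goal-ordering warns, real needs may not be rationally orderable at all):

```python
# Illustrative sketch of the rational decision-making model.
# The goals, alternatives, and ratings are all hypothetical.

def rational_decision(goals, alternatives, evaluate):
    """Pick the alternative whose evaluation best serves the weighted goals."""
    # Goals statement: goals is a dict of {need: importance weight}.
    # Analysis & selection: score each candidate solution against each need.
    def score(alt):
        return sum(weight * evaluate(alt, need)
                   for need, weight in goals.items())
    # Implementation: here, simply return the chosen alternative.
    best = max(alternatives, key=score)
    # Evaluation: report the achieved score alongside the choice.
    return best, score(best)

# Hypothetical example: choosing a transit option against two community needs.
goals = {"access": 0.7, "cost": 0.3}
ratings = {
    ("bus", "access"): 0.6, ("bus", "cost"): 0.9,
    ("rail", "access"): 0.9, ("rail", "cost"): 0.4,
}
choice, achieved = rational_decision(goals, ["bus", "rail"],
                                     lambda alt, need: ratings[(alt, need)])
```

Note that everything contested in the model (whose needs count, how much each weighs) is smuggled into the `goals` dictionary; the loop itself is the easy part.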
Inductive arguments do not demonstrate the truth of their conclusions, but merely establish probability. Yet most scientific proof is based on inductive arguments, hence the question of whether science deals with “truth” or merely “predictive usefulness.”
Basically, induction is argument from analogy. The best discussion of induction is still J.S. Mill’s “Canons of Induction” (see Nagel, 1950). Mill set out five rules for induction:
· The Method of Agreement: If two or more instances of a phenomenon have only one circumstance in common, that circumstance is the cause (or effect) of the phenomenon.
· The Method of Difference: If an instance in which the phenomenon occurs and an instance in which it does not differ in only one circumstance, that circumstance is the cause (or an indispensable part of the cause).
· The Joint Method of Agreement and Difference: Applying the first two methods together.
· The Method of Residues: Subtract from a phenomenon whatever is already known to be the effect of certain antecedents; the residue is the effect of the remaining antecedents.
· The Method of Concomitant Variation: Whatever varies whenever another phenomenon varies in some particular manner is connected with it through some causal relation.
You will notice that induction is much less cut-and-dried than the syllogism. That is what makes science so interesting—there is always the possibility that the residual approach will pull something out of what seems to be nothing, or that concomitant variation will disclose that what seemed to be causation was really only co-relation.
Over time, additional rules of thumb have developed for arguing increased confidence for inductive conclusions:
In the more confined and rigorous world of scientific analysis, one works to develop “theory” rather than “decision-making models.” (You, as students of applied problem-solving, will have to be at home in both worlds).
· Concepts: “The function of scientific concepts is to mark the categories which tell us more about our subject matter than any other categorical sets. Whether a concept is useful depends on the use we want to put it to.” Abraham Kaplan
· Axioms: Propositions assumed to be true
o Axioms are causal assertions. They are untestable (i.e., they are “asserted”) because it is impossible to control all the relevant variables.
o An axiom implies a direct causal link among variables.
o Axioms assert a sequence over time by which two things change together (“covariance”)
o Laws are propositions which have been proven to be true.
o A successful inquiry of particulars establishes “fact”; an inquiry of general statements establishes a “law.”
o Theory must interrelate propositions, 2 or more at a time.
o Ideally, a theory is a minimal set of propositions (axioms) from which all other propositions may be derived by logic. The reason for this is that the measured covariance between variables that are linked through other variables will be weak, because it is masked by the intervening variables. So a theory should have as few variables as possible. This is called the “Principle of Parsimony.”
o A general theory is a small number of abstract variables, linked to measurable variables.
o An auxiliary theory links abstract theory to substantive problems.
o Short-run prediction does not need theory
o If any major parameter varies, theoretical explanation will be needed to make prediction.
o Only a small fraction of predictor variables are also explanatory (parsimony, remember).
· Complexity: There are a number of ways to make a theory more complex (and less abstract and more similar to the world it is mapping):
o Add more variables
o More complex forms of relationship (nonlinear, nonadditive joint effects)
o Dynamic processes (time paths, feedbacks, cycles)
o More realistic assumptions (unexplained variation, measurement error)
· There are also a few rules which are commonly used to frame a theory:
· Empirical verification: The theory must correspond with observed reality
· Operational definition: Define the terms used in the theory by the operations involved in manipulating or observing the phenomena to which they refer.
· Controlled observation: Outcomes must be observed under different values of input and when all other variables can be discounted as possible causes of any change in outcome (these comparisons are sometimes referred to as “counterfactual”).
· Statistical generalization: Theory is tested on a random sample from the set of conditions to which you wish to generalize.
· Empirical confirmation: Consistency with other verified statements increases the probability of truth.
· Finally, there is a logic to linking data to the propositions which they measure.
· The first principle is simplicity—the theory should have a small number of abstract variables linked in explicit ways.
· The second is validity—of which there are several forms.
o Construct validity is the accuracy of the operationalization—do the data in fact capture the phenomena to which they refer?
o Internal validity is the accuracy with which the causal linkages between the propositions are captured in the theory—does it hold together?
o External validity is the accuracy with which the subset of the world on which the theory was tested reflects the entire world to which the theory applies—does the theory capture the whole domain to which it applies?
o Reliability is the accuracy with which the results of the theory can be replicated time after time—how much variability is there in the fit with the whole domain across time?
· The hypothesis (H1) is expressed as a prediction (i.e., “if this occurs then that will happen.”)
· The hypothesis is restated in the negative (H0—called the “null hypothesis”).
· Attempt to disprove the null hypothesis, in order to affirm the original hypothesis.
· The reason for testing the null hypothesis goes back to the character of inductive reasoning (which is what an experiment is): Since you cannot test all of anything, you try to find the exception to what you think is the rule. If you provide the optimum circumstances for the exception to occur, and it still doesn’t occur, then it is likely that it will not occur under less favorable circumstances.
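The logic of testing the null hypothesis can be illustrated with a simple permutation test—one common way of giving the null its best chance to explain the data. All the scores below are made up:

```python
import random

# A minimal permutation test of a null hypothesis: "the treatment has no
# effect." We try to produce the observed difference by chance alone; if
# chance rarely manages it, we reject the null. Data are hypothetical.
random.seed(0)

treated = [12, 14, 15, 13, 16, 15, 14, 17]   # outcome scores, treated group
control = [10, 11, 9, 12, 10, 11, 10, 12]    # outcome scores, control group

def mean(xs):
    return sum(xs) / len(xs)

observed = mean(treated) - mean(control)      # the difference to explain

# Under H0 the group labels are arbitrary, so shuffle them many times and
# count how often a difference at least this large arises by chance.
pooled = treated + control
n, extreme, trials = len(treated), 0, 5000
for _ in range(trials):
    random.shuffle(pooled)
    if mean(pooled[:n]) - mean(pooled[n:]) >= observed:
        extreme += 1

p_value = extreme / trials   # a small p-value -> reject H0, affirm H1
```

Shuffling the labels is exactly the "optimum circumstances for the exception": if even unconstrained relabeling almost never reproduces the observed difference, the null hypothesis is a poor explanation of it.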
o An experiment is designed to provide a formal specification of comparisons. It follows a rigorous form:
· Given an observation of differences (variances)—symbolize it as “O2 – O1”
· Argue that the difference is produced (it is a result of something)—“X > (O2-O1)”
· Determine that the difference is real, not an artifact of the design—this is the issue of “internal validity,” which will be discussed below.
· Determine that X is the causative agent – This is the issue of “confounding effects,” also discussed below.
o The classical experiment is done in the following form:
· O1 X O2 [Experimental Condition] O2 – O1 = A
· O3 O4 [Control Condition] O4 – O3 = B
· A – B = X [Experimental effect is the difference between A and B]
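With made-up observation values, the arithmetic of the classical design looks like this:

```python
# The classical experiment's arithmetic, with hypothetical values.
# O1/O2 bracket the experimental group, O3/O4 the control group.
O1, O2 = 50, 65    # experimental group: before and after treatment X
O3, O4 = 50, 55    # control group: before and after (no treatment)

A = O2 - O1        # change in the experimental condition
B = O4 - O3        # change shared with the control (history, maturation, ...)
X_effect = A - B   # what remains is attributed to X
```

The control group's change B is the whole point of the design: everything that would have happened anyway is subtracted out, leaving only the experimental effect.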
o Any research design must deal with internal threats to the validity of the design. If one uses a classical experimental design, internal validity is assured. These threats to internal validity are:
· History (order effect)—This is the possibility that the observed effect could be due to the order of presentation. For example, measuring blood pressure may itself raise a subject’s blood pressure. In a true experiment, this effect would also occur in the control group, and so be subtracted from the measurement of the effect in the experimental group.
· Maturation (process within subject due to passage of time)—For example, measurements of children’s scholastic achievement at intervals of a year would be expected to show increases simply from the experience of living (adult performance measures may be more stable).
· Instrumentation (change in calibration or observer)—Human observers will differ slightly in how they record their observations; mechanical devices are subject to wear and fatigue over time.
· Testing (prior experience affects subsequent behavior)—This can be seen as a form of maturation, but it is due to experience with the measurement instrumentation itself. One of the reasons that “teaching to the test” works is that it gives the students experience with the form and the content of the questions they are likely to face. The longer the time between testing events, the less is the impact of this threat to validity.
· Selection (bias in original assignment)—If subjects are not randomly assigned to conditions, there might be characteristics of the subjects themselves which lead to differences between the experimental and control groups. For example, one of the difficulties of assessing the effectiveness of strategies for alleviating poverty is that there may be characteristics other than income which put people among the poor (such as educational achievement, or …).
· Mortality (differential loss of respondents)—Not only may there be differences in the characteristics of the people who begin in each group, there may also be differences in who remains in the group to the end. One of the difficulties in comparing the effectiveness of public and private K-12 education is that the private schools may not retain their more difficult students to the same extent that the public schools will.
· Regression (groups selected for extreme scores will shift toward the mean)—Very tall people tend to have slightly shorter children, and very short people tend to have slightly taller children.
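Regression toward the mean falls straight out of a simulation in which every score is a stable trait plus random noise; the distributions below are invented for illustration:

```python
import random

# Regression toward the mean, simulated: each score is a stable trait plus
# measurement luck. Select cases extreme on the first measurement and
# re-measure; the group mean drifts back toward the population mean of 100.
random.seed(1)

population = []
for _ in range(10_000):
    trait = random.gauss(100, 10)            # stable component
    first = trait + random.gauss(0, 10)      # first measurement (trait + luck)
    second = trait + random.gauss(0, 10)     # second measurement (fresh luck)
    population.append((first, second))

# Take the 500 cases that scored highest the first time around.
extreme = sorted(population, key=lambda p: p[0], reverse=True)[:500]
mean_first = sum(p[0] for p in extreme) / len(extreme)
mean_second = sum(p[1] for p in extreme) / len(extreme)
# mean_second falls between mean_first and the population mean of 100:
# the extreme group was partly selected for good luck, which does not repeat.
```

This is why a program "targeted at the worst cases" can look effective even when it does nothing: the worst cases would have drifted back toward the mean anyway.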
o If all of the variables in a setting are not controlled, the initial differences in those variables could interact with the experimental condition to affect the outcome. Control can be achieved in several ways.
· Randomization: This is not random sampling (which responds to threats to external validity), but is an expression of the probability that any individual event is independent of any other. In an experiment, this is handled by measuring the “error” (unexplained variance) in the control and the experimental group (they should be the same).
· Matching: If there are known causes of variance (other than the experimental effect) which cannot be controlled, then the experimental design should be done so that individuals with that characteristic are “matched” in both the control and experimental groups. For example, if one were assessing techniques for introducing technology into an office, one would probably want to see to it that people of similar experience levels are assigned to both the control and experimental groups.
· Standardization: In field experiments, one often cannot “assign” individual cases to experimental and control conditions. But one can carry out the analysis by establishing comparable groups, or even by comparing individuals to a “standard.” For example, if one were looking at housing costs one would presumably want to see to it that communities of similar sizes are included in both the control and experimental groups.
· Partialling: A reference population (a “standard”) is often not available. In such cases, linear regression (a statistical tool) can be used to create “internal standardization,” called a “partial correlation.” One can statistically subtract interaction effects among several variables. The more variables are involved in the analysis, the more observations will be needed to partition the variability.
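A hand-rolled sketch of partialling, assuming simple linear relationships and made-up data: regress each of two variables on the third, then correlate the residuals. In practice one would use a statistics package, but the mechanics fit in a few lines:

```python
# Partial correlation by "internal standardization": regress X and Y each on
# Z, then correlate what is left over. Data are invented: Z drives both X
# (roughly 2*Z) and Y (roughly 3*Z), so X and Y look strongly co-related.

def mean(xs):
    return sum(xs) / len(xs)

def residuals(y, z):
    """Residuals from a simple linear regression of y on z."""
    my, mz = mean(y), mean(z)
    beta = (sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
            / sum((zi - mz) ** 2 for zi in z))
    return [yi - (my + beta * (zi - mz)) for zi, yi in zip(z, y)]

def corr(a, b):
    """Pearson correlation coefficient."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a)
           * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den

Z = [1, 2, 3, 4, 5, 6, 7, 8]
X = [2.1, 3.9, 6.1, 7.9, 10.1, 11.9, 14.1, 15.9]
Y = [3.1, 6.1, 8.9, 11.9, 15.1, 18.1, 20.9, 23.9]

raw_r = corr(X, Y)                                  # near 1.0
partial_r = corr(residuals(X, Z), residuals(Y, Z))  # near 0: Z explained it
```

Once the shared influence of Z is statistically subtracted, the apparently strong X–Y relationship all but disappears: what looked like a direct link was carried by the third variable.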
· It is not always possible to achieve true experimental control. Donald Campbell has systematically explored ways to relax some of the requirements of a true experiment (he calls them “quasi-experiments”), and the threats to validity which might weaken one’s conclusions (see Cook & Campbell, 1979). The quasi-experimental designs are (“O” means “observation,” “X” means “experimental manipulation,” and the passage of time is represented by the movement from left to right):
· Time Series: O1 O2 O3 X O4 O5 O6
In this design, periodic measurements before and after the experimental condition create a kind of internal control group. It does not, however, control for validity threats from history.
· Equivalent Time Samples: XO1 O2 XO3 O4 O5 XO6….
This is an extension of the time-series design, with the experimental condition inserted at random intervals. This design controls for all the threats to internal validity, although it does not control for threats to external validity (interaction effects).
· Nonequivalent Control Groups: O1 X O2
O3 O4
This design is similar to the classic experiment, but individuals were not randomly assigned to the experimental and control groups. This is probably the most common form of experimentation in social science. It can be susceptible to regression effects, and does not control for interaction effects or threats to external validity.
· Counterbalanced Designs: X1O X2O X3O X4O
X2O X3O X4O X1O
X3O X4O X1O X2O
X4O X1O X2O X3O
This design also employs nonequivalent control groups, but randomizes the presentation of different interventions. This design can be susceptible to internal interaction effects and may be susceptible to threats to external validity.
· Separate sample pretest/posttest: R O (X)
R X O
This design is often used when one has access to large enough populations (such as cities) that one can create a random pool with a pre-measure and then a random pool with a post-measure. The design has several internal weaknesses (history, maturation, mortality, interaction), but it handles the threats to external validity.
· Separate Sample Pretest/Posttest Control: R O (X)
R X O
This design is similar to the previous one, except that the same people are not retested (and thus interaction effects and threats to external validity are avoided).
· Multiple Time Series: O1 O2 O3 X O4 O5 O6
O1 O2 O3 O4 O5 O6
Similar to several of the previous designs, this design collects data from an equivalent control group to increase the certainty of the interpretation. It avoids all of the threats to internal validity, although it is still subject to threats to external validity.
· Institutional Cycle Design: X O1
O2 X O3
This design combines elements of both “longitudinal” and “cross-sectional” approaches. Campbell & Stanley call it a “patched-up” design for field research, in which one starts out with an inadequate design and then adds additional features to achieve greater control.
· Correlational Analysis: Not all research uses an experimental design. Correlational analysis examines a number of instances and asserts, based on various initial assumptions (which we will examine in the section on Statistics: Correlation), that there is a co-relationship between variables. Because a correlational study does not manipulate which variable precedes the other, it cannot attribute causal direction to the relationship. Also, correlational studies usually show a much weaker effect than experimental ones, because of the effect of the many uncontrolled variables which are mixed into each individual instance.
· Observational Design: Field observation is an uncontrolled source of data about a phenomenon. It assumes a causal ordering (A precedes B, therefore A caused B). The data are usually presented descriptively, providing information on “what” and “who.” But it can provide information that will inform a more controlled design (moving back and forth between observation and experiment is sometimes called “Grounded Design”).
o The most common form of observational design is the case study:
o Method: Case studies use multiple sources of evidence.
o Analysis: There are three common ways to mine the data from a field study:
§ Pattern-matching
§ Explanation-building
§ Time-series analysis
1. Develop (1) an experimental design, (2) a quasi-experimental design, and (3) a correlational or observational design to test the hypotheses you developed in a previous unit (or another set of hypotheses that are about an issue that interests you). Each design should test a different one of your hypotheses.
2. The Funders’ Network for Smart Growth, in “Civic participation and smart growth” (Nov, 2000), writes that:
“Putnam argues that sprawl is not the primary culprit for what he views as the country’s lessening social capital, but has been a ‘significant contributor to civic disengagement.’ He cites three reasons: first, sprawl and the consequent need to drive to most places takes time that could be used for civic purposes; second, sprawl leads to increased social homogeneity in communities, by class and race, which appears to result in reduced civic participation; and third, sprawl leads to the physical fragmentation of communities and our daily lives, which undercuts involvement in local affairs.” (p.2)
Design three studies to test these three hypotheses (one should be experimental, one quasi-experimental, one correlational/observational).
3. The Funders’ Network for Smart Growth, in “Children, Youth and Families and Smart Growth” (August 2002), makes a number of assertions:
· “Families are drawn to cul-de-sacs primarily because the dead-end streets are thought to be safe havens for traffic.” (p.4)
· “The proliferation of cul-de-sacs makes the world beyond the subdivision entrance doubly dangerous. The lack of interconnected streets means every subdivision and shopping center empties onto the same overloaded arterial road, which inevitably is widened and reconfigured to accommodate a higher volume of speeding vehicles. As a result, even young adolescents, who need a degree of autonomy for their development, are all but barred from leaving the subdivision without a chauffeur-chaperone.” (p. 4)
· “…a large share of American families have moved to neighborhoods designed only for cars, where kids often are prohibited from walking to school or anywhere else for fear of the high-speed traffic beyond the cul-de-sac. That means children don’t often get the incidental exercise they once did, such as running to the corner store for Mom, hiking to the library or walking to school.” (p. 5)
· “In many ways, of course, middle class kids are the lucky ones, because their parents have the means to drive them to activities in our auto-dependent cities. Lower-income families, with one or fewer cars, are not able to haul children to sports, music, or other enrichment activities…(these) kids just miss out.” (p. 6)
· Eric Schlosser, in Fast Food Nation (Houghton Mifflin, 2001), argues, “As American cities and towns spend less money on children’s recreation, fast food restaurants have become gathering places for families with young children.”
Choose one or more of these hypotheses and design an experimental, quasi-experimental, and correlational/observational study to test it (them).
Anderson, Barry F. 1971. The Psychology Experiment, 2nd Ed.
Andranovich, Gregory D. & Gerry Riposa. 1993. Doing Urban Research.
Campbell, Donald T. & M. Jean Russo. 1999. Social Experimentation.
Cook, Thomas D. & Donald T. Campbell. 1979. Quasi-Experimentation.
Levitt, Steven D & Stephen J. Dubner. 2005. Freakonomics: A Rogue Economist Explores the Hidden Side of Everything. NY: William Morrow.
Nagel, Ernest. 1950. John Stuart Mill’s Philosophy of the Scientific Method. NY: Hafner.
Popper, Karl. 2002. The Logic of Scientific Discovery.
Smith, Linda Tuhiwai. 1999. Decolonizing Methodologies: Research and Indigenous Peoples. London: Zed Books, Ltd.
Snow, C.P. (1993) The Two Cultures.
Webb, Eugene J. et alii. 1966. Unobtrusive Measures.
Yin, Robert K. 1989. Case Study Research, Rev. Ed.
© 2006 A.J.Filipovitch
Revised 1 February 2010