Other Comparative Designs
Review of Terms:
Random Selection/Sampling vs. Random Assignment
• Random Selection = How do I get my sample? All people in your population have an equal chance of being selected
• Random Assignment = Once you have your sample, how do you assign them to a group?
Internal Validity vs. External Validity
• Internal Validity = control of the experiment
• External Validity = generalizability of your experiment
You want to have the correct balance between the two
Independent, Dependent, & Extraneous Variables
• Independent Variable = What you are manipulating
• Dependent Variable = depends on the independent
• Extraneous Variables = everything else that can affect the dependent variable and mess up your results
Between vs. Within Subjects Variables
• Between-subjects variable = looking for a difference between subjects; the groups do not get the same experience. You need to make sure the groups are as similar as possible so that the only difference between them is your independent variable
• Within-subjects variable = every individual gets every experience in the experiment (ex. pre vs. post). Within-subjects is often more ideal because the same people serve in every condition, so the groups are guaranteed to be comparable across levels of the independent variable
Factorial Design = measures the impact of more than one independent variable at once
Benefit: you can see if there is an interaction between the different independent variables
(Use no more than three independent variables; otherwise it gets too complicated and you can't tell clearly which variables are interacting with which)
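To make the interaction idea concrete, here is a minimal sketch in Python with made-up numbers. The two IVs (instruction type, echoing the traditional vs. online example later in these notes, and class size) and all of the scores are hypothetical, not from any study discussed in class. An interaction shows up when the effect of one IV is different at different levels of the other IV.

```python
# Minimal sketch (hypothetical data): a 2x2 factorial design with two IVs,
# instruction type (traditional vs. online) and class size (small vs. large),
# and one DV (test score).
import numpy as np

# Mean DV score for each cell of the design (made-up numbers)
means = {
    ("traditional", "small"): 85,
    ("traditional", "large"): 80,
    ("online",      "small"): 84,
    ("online",      "large"): 70,
}

# Main effect of instruction type: average across class sizes
trad = np.mean([means[("traditional", "small")], means[("traditional", "large")]])
online = np.mean([means[("online", "small")], means[("online", "large")]])
print("Main effect of instruction (traditional - online):", trad - online)

# Interaction: does the instruction effect differ by class size?
effect_small = means[("traditional", "small")] - means[("online", "small")]
effect_large = means[("traditional", "large")] - means[("online", "large")]
print("Instruction effect in small classes:", effect_small)
print("Instruction effect in large classes:", effect_large)
print("Interaction (difference of differences):", effect_small - effect_large)
```

Here the instruction effect is small in small classes but large in large classes, so the two IVs interact; a single-IV design could not have shown that.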
Experimental vs. Quasi-Experimental Designs
• "True" Experimental Designs = involve random assignment to conditions manipulated by experimenter
• Quasi-Experimental Designs = involve comparisons of groups in pre-selected conditions or groups - I design the study before I collect the data
• Causal Comparative Designs = are ex post facto quasi-experimental designs; They involve comparison of pre-selected conditions/groups after the fact
Time-Series Design
Measure the dependent variable repeatedly over time
Experimental Designs, cont.
• Single Subject Designs
o Like mini longitudinal experimental designs, on individuals or small groups of individuals
o Similar to pretest/posttest designs; examines the DV before and after the IV
o Used when it doesn't make sense to pool effects across individuals - ex. when working with children with special needs, the specific behaviors and needs of one child are not the same as another's. Tracking that one child over time may help develop strategies to help that specific child. You're not trying to generalize your findings; you're just trying to help that one individual
o Special populations
o Focus on success of specific interventions with specific individuals
Possible Single-Subject Designs
• A-B Design = baseline, followed by intervention
o A = Baseline
o B = Intervention
But what happens after the intervention is removed? Does behavior go back to baseline?
A-B-A Withdrawal Design = baseline, followed by intervention, concluded with baseline
• When the intervention is removed, does behavior go back to baseline?
Ethical issue: is it OK to intervene but then leave subjects back at baseline behavior, especially if we know that the intervention is needed?
A-B-A-B Design = goes one step further; instead of leaving subjects at baseline, present the intervention again (more ethical). (A sketch with made-up numbers follows this list.)
Multiple Baselines Designs = alternative to ABAB design; used when it’s not ethical to leave the subject at the baseline condition and when measures on multiple Dependent Variables (DVs) are taken.
• -Taking baselines for multiple behaviors at same time – whether it’s one behavior in multiple individuals, multiple behaviors in one individual, one type of behavior in one individual in multiple settings, etc.
• -Difficult to use, because must ensure that multiple DVs aren’t related to one another
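Here is the sketch referred to above: made-up behavior counts for an A-B-A-B withdrawal design, with seven measurements per phase (the "magic" number for a stable trend, discussed below). Comparing phase means shows whether the behavior returns toward baseline when the intervention is withdrawn and improves again when it is reintroduced. All numbers and the behavior being counted are hypothetical.

```python
# Minimal sketch (hypothetical data): frequency of a target behavior measured
# across the four phases of an A-B-A-B withdrawal design.
import numpy as np

phases = {
    "A1 (baseline)":     [8, 9, 7, 8, 9, 8, 7],   # 7 points per phase for a stable trend
    "B1 (intervention)": [5, 4, 4, 3, 3, 2, 3],
    "A2 (withdrawal)":   [6, 7, 7, 8, 7, 8, 7],
    "B2 (intervention)": [4, 3, 3, 2, 3, 2, 2],
}

# If the mean drops in B1, rises back toward baseline in A2, and drops again
# in B2, that pattern supports the intervention as the cause of the change.
for phase, counts in phases.items():
    print(f"{phase}: mean = {np.mean(counts):.1f}")
```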
Issues to Consider: Internal Validity
Is Treatment Standardized?
• -# of Variables Changing Across Conditions? - Is the behavior changing because of my intervention, or is there another explanation?
Condition Length?
• -Degree and Speed of Change? - You want to show a stable trend. Do you have enough data points to see a stable trend in the experiment results? - The "magic" number is 7 measurements to see a stable trend
Experimenter Effects?
• -Because this is a one-on-one experiment, the experimenter is likely to have an impact on the individual
Practical Significance of Results?
• -Practical significance is emphasized more than statistical significance because of the single-subject design - i.e., did it help the student improve?
Single Subject Designs and External Validity
It is very difficult to generalize the results of these designs beyond the sample studied – WHY?
• -Because it was only designed for one person for a specific purpose. It is not necessarily meant to be generalized.
Thus, it is VERY IMPORTANT for this type of research to be replicated before strong conclusions are drawn.
*We now have all the information we need for quiz 3 if we want to take it now.
2/14-20
I. Experimental Design
*Smooch has 2 toys: a teal stuffed dolphin, and a red rubber bone. If I throw each toy out into the yard, she’ll charge after the dolphin, but almost never the bone. I conduct this experiment over and over again, sometimes varying the order in which I throw the items, sometimes varying the distance. However, I always get the same results.
My neighbor, who is watching over the fence, says that Smooch just doesn’t like rubber toys. What is wrong with my experimental design, that I can’t draw this conclusion?
I design a new experiment, in which I have:
a teal rubber ball
a teal stuffed dolphin
a red rubber bone
a red stuffed mouse
I throw the toys out into the yard in varying orders, and varying distances. Smooch always goes after the ball and the dolphin. She rarely goes after the bone or the mouse.
This time, I conclude that Smooch doesn’t like red toys. In my second experiment,
State the independent variable(s):
State the dependent variable:
What did I have to change/add in my 2nd experiment that wasn’t present in my 1st experiment?
II. What makes experimental designs so great? CONTROL!!!
In experimental designs, researchers have control over:
-Independent Variables (what is manipulated; whatever you are comparing, such as traditional vs. online technology)
-Dependent Variables (what is being measured and how it is operationalized - how is it being measured?)
-Extraneous Variables (all of the things I can't control; things that have an impact on my dependent variable, such as the dog's color blindness)
III. Internal Validity (validity = is the experiment measuring what it is supposed to measure?) and Control
*History (some uncontrolled event, such as a fire alarm)
*Selection (how do you select your subject)
*Maturation (individuals change over time)
*Testing (does just giving a test somehow influence what the subjects think about the I.V.?)
*Instrumentation (something could be wrong with the test)
*Treatment Replications
*Attrition (mortality, or losing subjects during the experiment)
*Regression to the Mean
*Diffusion of Treatment (does information about the I.V. leak from one group to another?)
*Experimenter Effects (is the experimenter overly nice or overly mean to subjects? Or, if your tester is a cute girl and your subjects are 13-year-old boys, the subjects will do whatever the tester wants them to do)
*Subject Effects
***Self Report Data: Social Desirability Bias: subject wants to appear a certain way; Ex. anonymous survey about drug use
IV. External Validity (generalizability of the study) and Control
1. So more control ≈ higher levels of internal validity… the more I can control my experiment, the higher its internal validity
-Note, this is what allows us to make cause/effect conclusions about experimental data
2. But what happens to external validity? The more I control something, the farther away I am getting from the real world.
***Experimental designs are good because they can be controlled*****
IMPORTANT TERMS
a. Randomization
-Random Selection vs. Random Assignment
b. Within Subjects vs. Between Subjects Variables and Designs
c. Controlling for Confounds
-Holding variables constant
-Building variables into design
-Matching
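As an illustration of the matching idea just listed, here is a minimal sketch with hypothetical subjects and pretest scores: subjects are ordered on a potential confound, and each adjacent pair is split randomly between the two groups, which combines matching with random assignment. The subject labels and scores are made up.

```python
# Minimal sketch (hypothetical): controlling a confound by matching.
# Subjects are sorted on the potential confound (here, a pretest score),
# then each adjacent pair is split between the two groups so the groups
# start out as similar as possible on that variable.
import random

subjects = [("S1", 72), ("S2", 88), ("S3", 75), ("S4", 90),
            ("S5", 60), ("S6", 62), ("S7", 81), ("S8", 79)]

subjects.sort(key=lambda s: s[1])          # order by pretest score
experimental, control = [], []
for i in range(0, len(subjects), 2):       # take matched pairs
    pair = list(subjects[i:i + 2])
    random.shuffle(pair)                   # randomly assign within each pair
    experimental.append(pair[0])
    control.append(pair[1])

print("Experimental:", experimental)
print("Control:     ", control)
```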
V. Pretest/Posttest Designs
A. Single Group Posttest-Only Design
-Expose 1 group to the IV and then measure the DV (posttest) once
Example: I decide to test whether using puppets to “read” books with young children helps them learn different sound combinations (phonemic awareness, or PA).
-What’s the problem? No baseline!
Before IV        IV               After IV
(none)           X (Treatment)    O (Observation)
                 Use puppets      PA quiz
B. Single Group Pretest Posttest
-For a single group, give them a pretest (DV), then the IV, then a posttest (DV again)
Example, in the “puppet experiment,” I give students a pre-reading PA quiz, then read to them with the puppet, then give them a post-reading PA quiz
Before IV        IV               After IV
O (Observation)  X (Treatment)    O (Observation)
PA quiz          Puppet reading   PA quiz
C. Nonequivalent Groups Pretest Posttest: Quasi-Experimental (chosen groups)
-Involves 2 groups (experimental and control); both get pretest (DV), then only the exper. Gp. is exposed to the IV, and both gps. get the posttest (DV)
Group      Before IV        IV               After IV
           O (Observation)  X (Treatment)    O (Observation)
Exper.     PA quiz          Puppets          PA quiz
Control    PA quiz          No puppets       PA quiz
Between Subjects Designs = each group of subjects receives a different level of the IV
-Advantage: often more practical than within-subjects designs
-Disadvantage: are differences due to groups? Or to the IV?
-Use of Matching
VI. “TRUE” Experimental Designs (randomly assigned groups)
1. Randomized Groups Pretest Posttest Design
Individuals are randomly assigned to either the experimental or the control group; both groups receive the pretest (DV), then only the experimental group is exposed to the IV, and then both groups receive the posttest (DV)
Random assignment gives control over group differences, and pretest allows for a baseline measure in both groups
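A minimal sketch of how data from this design might be summarized, using made-up PA quiz scores for the puppet example: each group's gain (posttest minus pretest) is computed, and the mean gains are compared. The numbers and group sizes are hypothetical.

```python
# Minimal sketch (hypothetical data): randomized groups pretest-posttest design.
import numpy as np

# Made-up PA quiz scores (8 children per group, randomly assigned)
exp_pre   = np.array([10, 12,  9, 11, 13, 10, 12, 11])   # experimental group pretest
exp_post  = np.array([15, 16, 13, 15, 17, 14, 16, 15])   # experimental group posttest
ctrl_pre  = np.array([11, 10, 12,  9, 13, 11, 10, 12])   # control group pretest
ctrl_post = np.array([12, 11, 13, 10, 14, 12, 11, 13])   # control group posttest

# Gain scores: how much each child improved from pretest to posttest
exp_gain  = exp_post - exp_pre
ctrl_gain = ctrl_post - ctrl_pre

print(f"Mean gain with puppets:    {exp_gain.mean():.2f}")
print(f"Mean gain without puppets: {ctrl_gain.mean():.2f}")
print(f"Difference in gains:       {exp_gain.mean() - ctrl_gain.mean():.2f}")
```

In practice the gains would then be compared with a statistical test, like those listed in Section VII below.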
Experimental Designs with more than one IV
Factorial Designs = measure the impact of 2 or more IVs on the DV
-Advantages
Allow researchers to determine whether effects are consistent across subject characteristics
Allow researchers to assess interactions
-Disadvantages
Can get messy if there are too many IVs
NEVER HAVE MORE THAN THREE INDEPENDENT VARIABLES WHEN DOING A STUDY!
SCENARIO:
Dr. Koolio has just finished a research study investigating the long-term effects of direct instruction in phonemic awareness on children’s reading abilities. He started studying 30 children in a kindergarten classroom in a mid-sized city in southeastern Louisiana in 2002, and he has conducted follow-up studies with these children every year in September (they’re now in the third grade). The students in this class move together from grade to grade, but their teacher changes every year. Nonetheless, each teacher uses some form of direct instruction in PA in his/her reading units.
Every year, Dr. Koolio goes into the students’ classroom and personally administers the Phonological Awareness Test 2, which he developed just before he started this study. In 2004 (when the children were in the 2nd grade), he found that all of the children (N=26) were progressing very well in reading. However, in September of 2005 he saw sharp decreases in children’s average reading achievement scores (N=16). Dr. Koolio conducted a statistical test comparing children’s test scores from kindergarten and their scores from the third grade and found that scores had increased only slightly, but not significantly (p>.05). He concluded that direct instruction in PA does not have any long-term benefits for children’s reading abilities.
Is the Phonological Awareness Test a valid and reliable measure? (Go to the website to find out).
Consider each of the following possible threats to internal validity. Do any of them apply to Dr. Koolio’s study? If so, how?
History
Selection
Maturation
Testing
Other Extraneous Variables?
What type of design did Dr. Koolio use? Can you recommend any improvements to his design? (And if you can, explain why this is an improvement on the original design).
Do you have any concerns about the conclusions Dr. Koolio drew, or the statistical tests on which they were based?
VII. Analyses of Exp. Designs (t and F)
T-test -- used to assess the impact of two levels of one IV on a single DV
ANOVA – used to assess impact of one or more IVs on a single DV
MANOVA – used to assess impact of one or more IVs on multiple DVs
ANCOVA – used to assess impact of one or more IVs on a single DV after removing effects of a variable that might be correlated to DV (e.g., age, gender, aptitude, achievement, etc.)
MANCOVA – used to assess impact of one or more IVs on multiple DVs after removing effects of a variable that might be correlated to DV (e.g., age, gender, aptitude, achievement, etc.)
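As a minimal sketch of the first two analyses in this list, here is some Python using scipy with made-up scores: a t-test compares two levels of one IV on a single DV, and a one-way ANOVA compares three (or more) levels of one IV on a single DV. The group labels and numbers are hypothetical.

```python
# Minimal sketch (hypothetical data) of the two simplest analyses listed above.
from scipy import stats

group_a = [78, 85, 80, 90, 84, 79]   # e.g., traditional instruction
group_b = [82, 88, 91, 86, 93, 89]   # e.g., online instruction
group_c = [75, 70, 72, 78, 74, 71]   # a third condition, for the ANOVA

# Two levels of one IV, one DV -> t-test
t, p = stats.ttest_ind(group_a, group_b)
print(f"t-test: t = {t:.2f}, p = {p:.4f}")

# Three levels of one IV, one DV -> one-way ANOVA
f, p = stats.f_oneway(group_a, group_b, group_c)
print(f"ANOVA:  F = {f:.2f}, p = {p:.4f}")
```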
Wednesday, February 6, 2008
Feb. 6 - Meta-Analysis
Reviewed the Quiz #1.
Meta-Analysis and other Methods of Research Synthesis
• Levels of Research Synthesis
• Literature Reviews
o -- allows for in depth discussion of individual studies’ findings and their theoretical implications
o -- no weight given to statistics in the discussion
• Numerical Reviews (“Vote Counting”)
o Studies that are statistically significant “cast votes” for effects
o Statistical significance depends on many factors
• Meta-Synthesis (synthesizing qualitative research findings)
o Used in qualitative research (qualitative research deals with concepts, ideas, and general findings)
o It is NOT collection and aggregation of research results
o It is “the bringing together and breaking down of findings, examining them, discovering the essential features, and, in some way, combining them into a transformed whole” (Schreiber et al., 1997, p. 314)
• In qualitative research the goal of the study is to try to understand the phenomenon being studied
o One of the important issues is how do they actually go about doing the study.
• What are the selection criteria? How am I going to select studies, and what are my inclusion criteria? (Is the study published in a peer-reviewed journal? You want peer-reviewed work because it has been critiqued, or has passed a level of critique. There may also be restrictions/selection criteria on date, participants, or methods used.)
o Once you have chosen and determined the criteria, you start to code and categorize all of the relevant information.
o Teams or Individuals? In general, teams are better. Results are a lot more valid.
o Audit Trail: a list or description of exactly what you did.
o Triangulation: you want to approach a topic or finding from at least three different perspectives
II. Meta-Analysis
• Origins of Meta-Analysis – started in agricultural research
• Definition
o Statistical technique that enables the results from a number of studies to be combined to determine the average effect size of a given independent variable.
• Is the effect significant or not? What was found, is it significant, and how big was the effect? Remember, whether or not it's significant is given by the p value; 'N' indicates sample size.
o Supposedly more objective than narrative literature review
• Advantages
o Use stats to organize and extract info (organizing numbers)
o Eliminate selection bias (???) this is a bold statement;
o Makes use of all info in a study (???) only if you include it or pay attention to it
o Detects mediating characteristics
• Limitations
o No guide for implementation
o Time-limited
o No real rigid methodological rules
o Only as good as the studies it is based on
QUIZ QUESTION: META-ANALYSES ARE ONLY AS GOOD AS THE STUDIES I’M USING TO GET MY RESULTS.
III. Meta-Analysis Methodology
1. Research Focus – what’s the question that needs to be answered
2. Sampling – not recruiting participants. You’re going out and finding studies. The studies that your meta-analysis is based on make up the sample.
• Inclusion Criteria? see above re: peer reviewed, dated, etc.
3. Classifying and Coding – not classifying qualitative or abstract data, but coding specific findings from other studies.
4. Role of the “Audit Trail” – pertains just as much to meta-analysis as it does to meta-synthesis (qualitative and quantitative are equally important).
5. Data Analysis
• Significance vs. Effect size vs. Effect Magnitude
o Is the effect size small, medium or large?
o Of what magnitude is the effect? Small = weak or large = really important and dramatic.
• Comparison to Zero Effects (Cohen’s d is the effect size statistic; see the sketch at the end of these notes)
• Positive vs. Negative Effect Sizes – you usually only see positive effect sizes; a negative effect size means the effect ran in the direction opposite to what you wanted.
6. Interpretation and Beyond
• Raises Researchers’ Consciousness? – what are the important questions? What is important to look at?
• Highlights Gaps in Field Knowledge Base (lets researchers know what they need to go out and research)
• Motivates Future Research
• Implications for Practice?
Synthesis is qualitative; analysis is quantitative
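Here is the sketch referred to above: a minimal Python example, with made-up summary statistics, of Cohen's d for a single study and of a simple sample-size-weighted average effect size across several studies, which is the core calculation behind a meta-analysis. A real meta-analysis would typically weight each study by the precision (variance) of its effect size rather than raw N, so treat this only as an illustration.

```python
# Minimal sketch (hypothetical numbers): Cohen's d for one study, then a
# simple sample-size-weighted average effect size across several studies.
import numpy as np

def cohens_d(mean_treat, mean_ctrl, sd_pooled):
    """Standardized mean difference: (treatment mean - control mean) / pooled SD."""
    return (mean_treat - mean_ctrl) / sd_pooled

# One study's summary statistics (made up)
d_one = cohens_d(mean_treat=82.0, mean_ctrl=75.0, sd_pooled=10.0)
print(f"Cohen's d for one study: {d_one:.2f}")

# Effect sizes (d) and sample sizes (N) pulled from several studies (made up)
effect_sizes = np.array([0.70, 0.30, 0.55, -0.10, 0.45])
sample_sizes = np.array([40,   120,  60,    25,   80])

weighted_mean_d = np.average(effect_sizes, weights=sample_sizes)
print(f"Sample-size-weighted average effect size: {weighted_mean_d:.2f}")
```

By Cohen's conventions, d of about 0.2 counts as small, 0.5 as medium, and 0.8 as large, which connects back to the "small, medium, or large" effect-size question above.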