Wednesday, April 16, 2008

Final Research Paper

The "Questions You Know to Ask So Far" worksheet has been updated; those are the questions to address when updating my research paper.

When I rewrite my paper for the final analysis, I need to take the questions worksheet and the paper I already wrote and revise the paper. Don't start a new paper, just revise.

Circle and put a star by 2H: what am I still really confused about, and what has changed? This only needs to be a paragraph at the end of my research paper. What brought about the change? The reading of articles, the lecture, the textbook?

Next week in class, make sure the paper is proofread and has no mistakes! Bring a hard copy of the initial analysis and a hard copy of the final analysis along with the article.

The paper needs to be longer than five pages, or at least longer than the original.

ACTION RESEARCH: it's cyclical, designed to answer a specific problem in a specific context. I should have a specific question from having read my article, such as: is this going to work in the classroom? Does it answer a question for me?

Developing a project for next week: it's on the handout and will be presented next week.

Program Evaluation/Action Research-4.16.08

Program Evaluation: combines components of any or several of the designs discussed earlier this semester into one giant research project.

Assessment:
Program Evaluation (if a program is set out with certain goals, do you actually meet those goals?) and Outcomes Assessment (wants accountability for whether it actually did anything good; did I improve anything? This refers to the idea that not only did I implement the program, I improved something in some way).

In both of these: formative assessment happens as you go along (quizzes along the way).
Summative assessment happens at the end of something (the final research paper).

It's important to have both kinds of assessment because you need to know, before you get to the end, that there is understanding along the way, so adjustments can be made to ensure the final project will be successful.

Both Program and Outcome Accountability are important
-you could have a program that does everything it says it's going to do, but it doesn't produce any outcomes
**DARE is an example of this problem.

-or you could have a program that doesn't do what it says it's going to do, but there are changes in outcomes
**so then how do you know what the changes in outcomes are due to, the program or some extraneous variable?

Taking a Look at Program Evaluation
Program Elements
A. Structure
1. Definition of the program
-the program is a developmental educational....
-this program is an integral part....
-this program includes
2. Rationale for the program
-importance of the program
-reasons why students need what the program offers
3. Assumptions of the program
-identify and briefly describe the premise upon which a program rests
**ethical and legal practice
4. Resources
-human
-financial
-political
a. program components
-training
-student interventions
-support for system

DOMAINS

Domains identify conceptual aspects of focus or priorities for the program.
-The identification of domains:
1. provides a description of areas of importance
2. provides a structure to organize goals and competencies
3. all assessments should relate to the domains of the program. Ex.: you don't want someone to throw an extra measurement into the research that has nothing to do with the program being studied.

Competencies are knowledge, skills, and attitudes (descriptions) of student behaviors that are linked to stated goals for their education. They are based on the assumption that the behaviors can be transferred from a learning situation to a real-life situation.
Competencies are:
1. organized into goal content areas
2. arranged developmentally
3. measurable

Components of Program Evaluation
Process vs. Outcome Evaluation
-formative = evaluation of the process of learning
-summative= evaluation of final product (the end)

Stakeholders are anyone who has something to gain from the program, anywhere from designers and evaluators to teachers and students.


Observations & Interviews 4/2
Interview Process
• Who is our interviewee?
• Design the interview protocol
• Where will the interview take place?
• Memorize the questions.
• Wording of the question:
o No leading questions
o Open-ended questions
• Obtain consent before starting
**what quantitative research calls control, qualitative research calls rigor***
• Demonstrate Respect
o For individual, culture, community, etc.
o Develop rapport
o Stick to protocol as much as possible
• Record comments faithfully
o Use a recorder or videotape if possible

Interview Types
Structured or Semi-Structured
• “Verbal questionnaires”
• More Formal
• Specific protocol prepared in advance
• Designed to elicit specific information
Informal Interviews
• More informal, more like a conversation
• Still involve planning
• Move from non-threatening to threatening questions
• Designed to determine more general views of respondent

Types of Interview Questions
• Demographic Questions –who are these people; where are they from; what’s their occupation
• Knowledge Questions
• Experience or Behavior Questions
• Opinion/Value/Feeling Questions
• Sensory Questions

Interview Questioning Tips
• Ask questions in different ways – watch the person’s facial expression and it will let you know if there is comprehension
• Ask for clarification
• Vary control of flow of information
• Avoid leading questions
• Ask one question at a time
• Don’t interrupt

Observation Process
1. What is the research question? - Be very clear about what the question is, what the specific focus is, and what you are watching for
2. Where is the observation taking place?
3. Who, what, when, and how long? - What: be aware of what on-task behavior looks like and what off-task behavior looks like
4. What is the role of the observer?

What is the observer’s role?
• Participant Observation
o Complete Participant – acts just like everybody else in the class; may participate in quizzes, ask questions, etc. No one knows who the observer is.
o Participant-as-Observer – acts like a student, may engage in discussion, but doesn't actually hand anything in and isn't graded in any way. Still a participant, but not 100% a participant.

• Non-Participant Observation
o Complete Observer - may be sitting with a clipboard checking off things on a rubric
o Observer-as-Participant – may be a little more covert; doesn't let others be aware that observation is going on

• Minimize Bias – done in how a researcher collects data

Collecting Observation/Interview Data
Note Taking Methods
• Field Jottings
• Field Diary
• Field Log

Descriptive vs. Reflective Notes
• Descriptive notes describe the subjects, settings, events, activities, observer behaviors
• Reflective notes reflect on analysis, method, ethics, frame of mind, points of clarification

Data Coding Schemes





Sample Field Notes Form

Research Question:

Date/Time:

Setting:

Descriptive Notes:
Reflective Notes:






Sample Interview Protocol
Research Question:

Time of Interview:
Date:
Setting:
Interviewer:
Interviewee:
Position of Interviewee (if relevant):

(Briefly describe project to interviewee and obtain consent.)

Sample Interview Questions:
1. What has been your role in the incident?


Interviewee’s Response:
Interviewer’s Reflective Comments:


2. What has happened since the event that you have been involved in?

Interviewee’s Response:
Interviewer’s Reflective Comments:

3. What has been the impact on the community of this incident?

Interviewee’s Response:
Interviewer’s Reflective Comments:

4. To whom should we talk to find out more information about the incident?

Interviewee’s Response:
Interviewer’s Reflective Comments:

(Thank individual for participating in this interview. Assure him/her of confidentiality of responses and potential future interviews.)


For Quiz 7 last question:
As a class, come up with a focus topic for a one-question interview related to the use of technology in education. Our broad focus is the availability of technology in the workplace.
Next, write the question. Make sure it is clear, unbiased, uses respectful terminology, etc., and that it gets at your focus topic: "How does the availability of technology influence your productivity?" Prep with "How can you use what's available to do your job?" or "How do you use technology to influence your productivity?" Everyone finds one person; record who they are, what their career is, the time, and the date. Tell them the purpose of the study, ask them the question, record what they actually say, and then record my reflective notes as well. There is a response box for this on question #7 of the quiz.
For your answer to the last question on quiz 7, you should interview one person, and record both descriptive (transcript) and reflective notes for this interview.

Wednesday, March 5, 2008

March 5-12, 2008

March 12
Surveys and Questionnaires
I. Quantitative Methods
Functions of Quantitative Designs
Purpose of...
Experimental Research is to draw conclusions about cause and effect
Comparative Research is to examine relationships between variables
Descriptive (Survey) Research is to summarize characteristics of a particular sample or population

Types of Surveys
• Cross-Sectional –data is collected once from a variety of different groups
• Ex.: if everyone in the class was surveyed to learn the difference between first-year master's students' perceptions and doctoral students' perceptions. PhD students were at some point first-year master's students, so both master's and PhD people would be asked the same questions at the same time. Data is being collected at one point in time, just from different people/groups.

• Longitudinal –data is collected over several points in time from a single group
• Trend Studies–survey similar samples at different points in time
o Ex: I could sample the EDPS 6030 cohorts every year; sample Jane Doe first year of master’s and then sample Jane Doe again but when she’s a PhD student.
• Cohort Studies--survey a different sample from the same group at different points in time
o Ex: I could sample 4 students from this group of EDPS 6030 students, with different students in each sample
• Panel Studies--survey the same sample at different points in time
o Ex: I could sample the current entire group of EDPS 6030 students every year
• ****this is the best way to do a longitudinal study***
• They are hard studies to do, as you lose participants over time

Survey Research Methodology
Remember, whenever you do a study, the first thing to look at is WHAT IS THE QUESTION? Going into the method section, the reader must know what the question or hypothesis is. This is called the focus, or unit of analysis.

What is the focus (unit of analysis) of the survey?
• Focus should be interesting and important enough for individuals to want to respond
• Objectives should be clearly defined, and all questions should relate to the objectives
Who made up the sample?
• Target Populations vs. Actual Populations
• Demographic Information
• Sampling Response Rates
How is the survey administered?
• Direct – have a clipboard, go up to people somewhere, and ask questions face to face
• Mail – a lot of $ is spent putting together the survey and mailing it out; some come back, some don't (only a 30% return rate)
• Telephone – going by the wayside due to the no-call list
• In Person – same as direct
• Internet – 60-80% return rate; easier because you can use survey sites, such as SurveyMonkey
What is explained to the subjects in the cover letter?
• The cover letter usually should say:
o Who is doing it
o What it’s about
o How long it will take
o Why they are being asked
o When it should be returned
• Should give enough information to tell people why they should spend their time completing the survey.

Individual Survey Items
• Should be as unambiguous as possible (people should be able to understand without having to ask questions)
• Should be short and well-organized
• Should not ask more than one question at a time
• Should not contain biasing or overly technical terms
• Should not use leading questions
• Should avoid using double negatives
• Should not overuse contingency questions (if you have, choose this; if you haven't, choose this)
SO, SURVEY QUESTIONS SHOULD BE CLEAR, SHORT AND UNBIASED.
• Responses (use reverse scoring to get more truthful responses; a sketch follows below)
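A minimal sketch of reverse scoring, assuming a 5-point Likert scale; the item names and responses are made up for illustration:

# Reverse scoring on a 5-point Likert scale: 1 becomes 5, 2 becomes 4, etc.
responses = {"item1": 4, "item2": 2, "item3": 5}  # hypothetical raw responses
reverse_keyed = {"item2"}  # hypothetical item worded in the opposite direction
scored = {item: (6 - value if item in reverse_keyed else value)
          for item, value in responses.items()}
print(scored)  # {'item1': 4, 'item2': 4, 'item3': 5}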


Survey Item Types
Closed-ended (easy to analyze: check the box, choose a, b, or c.)
• Advantages
o Consistent and unambiguous
o Easier to score and interpret
o More likely to elicit responses
• Disadvantages
o Limits the amount of "data" received (you only get the data you asked for: e.g., "Have you smoked?" They can say yes, but it doesn't let you, the tester, know if it was in high school, or now, or yesterday, or two minutes ago).
o More difficult to construct good questions
o More questions are usually required
Open-ended
• Advantages
o Allows more freedom in individual responses
o Easier to construct
• Disadvantages
o More difficult to score and interpret
o May elicit more ambiguous responses
o Less consistency in individuals’ response rates and response contents

Analyses of Survey Data: Closed-Ended Questions
Descriptive Statistics
Percentages
Mean, Median, Mode
Frequency Distributions
Comparative Statistics (the P value should be less than .05)
Correlations (r, R)
Chi Square (χ²)
Inferential Statistics (the P value should be less than .05)
T-tests (t)
ANOVAs (F)
MANOVAs (F)
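As a rough sketch of what these analyses look like in practice (made-up survey scores; the two groups are hypothetical):

from statistics import mean, median, mode
from scipy import stats  # assumes SciPy is available

group_a = [3, 4, 4, 5, 2, 4, 3, 5]  # hypothetical responses from one group
group_b = [2, 3, 2, 4, 1, 3, 2, 3]  # hypothetical responses from another group

# Descriptive statistics
print(mean(group_a), median(group_a), mode(group_a))

# Inferential statistics: an independent-samples t-test comparing the groups
t, p = stats.ttest_ind(group_a, group_b)
print(t, p)  # if p < .05, the difference is statistically significant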

Analyses of Survey Data: Open-Ended Questions
Use of Qualitative Analysis Methods
• Content Analysis
• Triangulation (support for the idea from at least three different sources of data)
• Emergent Patterns
• Emergent Hypotheses (the quotes you see are there to back up the data)

Validity & Reliability
Pretesting: good surveys are pretested to see if potential respondents understand what I’m asking. It’s done to make sure we haven’t screwed something up.

Reliability Checks: need someone to check how reliably the data has been coded or entered into a program.

Validity Checks
• Internal Validity (check all of the threats to internal validity: make sure the questions won't bias the participants, that location doesn't bias the participants, and that where the survey is sent doesn't bias them)
• External Validity (am I asking questions that can be asked to anyone; is it applicable to those outside of the specific group it was written for).




March 5
Correlation Designs: used to measure relationships between two or more variables (denoted by a lowercase "r")

****you cannot draw conclusions about cause and effect with correlational designs!!!!****
*There are two kinds:
1. Explanation
2. Prediction

B. Robinson et al. (2007) – an example of researchers making conclusions about cause and effect from correlational information
C. Three necessary conditions for causal relationships:
1. Cause is related to effect
2. No plausible alternative explanation for the effect exists other than the cause
3. Cause comes before effect

D. Look at "Percentage of Nonintervention Articles" on the PowerPoint for today's notes to see the graph.
**Finished taking notes in Word as I became frustrated trying to format every line!

Here are our cohort's notes:
Correlational Designs
–Used to measure relationships between two or more variables (r)
• Explanation
• Prediction
–No conclusions about cause and effect may be drawn

More and more researchers are making causal conclusions inappropriately.
They are making causal conclusions from correlational data, which you can NOT do.

Analyzing Data
• Correlation Coefficients
– “r” can range from -1 to +1
– Negative correlation = as one variable decreases, the other increases
A negative sign doesn't mean the correlation is any less strong; it simply goes in the other direction.
ex. Typing skill and typing errors
– Positive correlation = as one variable increases, other also increases
ex. Standardized test scores and GPA
– Zero correlation = no relationship between the two variables

Plotting Correlational Data --Scatterplots
The more scattered the dots are on the graph, the closer the correlation coefficient gets to zero

Interpreting Correlations
• Correlation coefficient “r”
–Ranges from -1 to 1
–Indicates whether relationship is positive or negative
–Statistical significance (p-value) depends on size of relationship and size of sample

Magnitude of r – Interpretation
Look at the absolute value of r to determine the magnitude of r.
So what does r mean, anyway?
.00 to .40 – weak relationship
.41 to .60 – moderate relationship
.61 to .80 – strong relationship
.81 or above – very strong relationship
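A minimal sketch of computing and interpreting r, using made-up test-score and GPA numbers and the magnitude thresholds from the notes:

import numpy as np

test_scores = [1100, 1250, 980, 1320, 1190, 1050]  # hypothetical standardized test scores
gpas = [3.1, 3.6, 2.8, 3.8, 3.4, 3.0]              # hypothetical GPAs

r = np.corrcoef(test_scores, gpas)[0, 1]  # Pearson correlation coefficient

# Interpret the magnitude using the absolute value of r
magnitude = abs(r)
if magnitude <= .40:
    label = "weak"
elif magnitude <= .60:
    label = "moderate"
elif magnitude <= .80:
    label = "strong"
else:
    label = "very strong"
print(round(r, 2), label)  # positive r: as test scores go up, GPA goes up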

More on Correlational Designs
• Predicting from Multiple Variables
–Can compute several individual correlation coefficients to produce a correlation matrix (a table showing how the different variables correlate)
• Or can conduct a Multiple Regression Analysis - puts all the variables together for one coefficient (R)
–Yields coefficient R, which can be interpreted similarly to the simple correlation, r
–Also yields R² (coefficient of determination)
Coefficient of determination = effect size estimate for correlational studies = how much of the result we can explain by the effect of all the other variables
ex. Reading comprehension has many variables. R² = how much of the reading comprehension can we explain through these other factors?

Imagine trying to dissect the concept “Reading Comprehension.” It is made up of several related factors, such as:
•Fluency
•Intrinsic Motivation
•Verbal IQ
•Working Memory Capacity
•Background Knowledge
If we sum up the portion of those components that uniquely overlaps with reading comprehension, we can explain a big part of reading comprehension. That is essentially what R2 in a multiple regression does.
So R² tells you, given all of the factors we've entered into the equation, how much of Reading Comprehension can be explained by those factors.
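A minimal sketch of where R² comes from in a multiple regression, with made-up data for two of the factors above; numpy's least-squares routine stands in for whatever stats package a study would actually use:

import numpy as np

# Hypothetical data: two predictors and a reading comprehension score
fluency = np.array([50, 62, 45, 70, 58, 66])
verbal_iq = np.array([100, 112, 95, 120, 105, 110])
reading = np.array([60, 75, 55, 85, 68, 78])

# Design matrix with an intercept column, then fit the regression
X = np.column_stack([np.ones(len(reading)), fluency, verbal_iq])
coefs, *_ = np.linalg.lstsq(X, reading, rcond=None)

# R² = proportion of variance in the outcome explained by the predictors
predicted = X @ coefs
ss_res = np.sum((reading - predicted) ** 2)
ss_tot = np.sum((reading - reading.mean()) ** 2)
print(1 - ss_res / ss_tot)  # closer to 1 = more of reading comprehension explained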

Other Uses of Correlational Data
Structural Equation Modeling (aka. SEM, Path Analysis, Hierarchical, Stepwise) –maps out relationships among several variables
–Instead of lumping everything into a multiple regression, we can put them into a structural equation model; Allows researchers to see the “big picture”as well as relationships among individual variables
Factor Analysis – how do individual variables or items in a measure combine to create “mega-variables”?
–E.g. Several items on a questionnaire might relate to your interest in a topic. Instead of treating each item as an individual variable, we combine them as one “factor”
Path Modeling – shows how all of the factors correlate together AND how they work together to predict (in the class example) supportive computer use. It looks like boxes with arrows going every which way, showing correlational values between each combination.
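A minimal sketch of the factor analysis idea: several questionnaire items collapse into one "mega-variable." The responses are made up, and scikit-learn's FactorAnalysis is just one convenient implementation:

import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical responses: 6 people x 4 items that all tap interest in a topic
items = np.array([[5, 4, 5, 4],
                  [2, 2, 1, 2],
                  [4, 4, 4, 5],
                  [1, 2, 2, 1],
                  [3, 3, 4, 3],
                  [5, 5, 4, 5]])

fa = FactorAnalysis(n_components=1)      # extract one underlying factor
factor_scores = fa.fit_transform(items)  # one "factor" score per person
print(fa.components_)                    # loadings: how each item relates to the factor
print(factor_scores.ravel())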





Wednesday, February 20, 2008

Feb. 20 Notes combined w/ 2.14 & 2.27

Other Comparative Designs
Review of Terms:

Random Selection/Sampling vs. Random Assignment
• Random Selection = How do I get my sample? All people in your population have an equal chance of being selected
• Random Assignment = Once you have your sample, how do you assign them to a group?

Internal Validity vs. External Validity
• Internal Validity = control of the experiment
• External Validity = generalizability of your experiment
You want to have the correct balance between the two

Independent, Dependent, & Extraneous Variables
• Independent Variable = What you are manipulating
• Dependent Variable = depends on the independent
• Extraneous Variables = the things that mess everything up

Between vs. Within Subjects Variables
• Between subjects variable = looking for a difference between subjects - they don't get the same experience in both groups - but you need to make sure that both groups are as similar as possible to confirm that the only differences between groups is your independent variable
• Within subjects variable = Every individual gets both experiences in the experiment - ex. Pre vs. Post - Within is more ideal because you know your groups will be consistent for both independent variables

Factorial Design = measures the impact of more than one independent variable at once
Benefit: you can see if there is an interaction between the different independent variables
(No more than three independent variables, otherwise it gets too complicated and you can't tell what variables are interacting with which variables very clearly)

Experimental vs. Quasi-Experimental Designs
• "True" Experimental Designs = involve random assignment to conditions manipulated by experimenter
• Quasi-Experimental Designs = involve comparisons of groups in pre-selected conditions or groups - I design the study before I collect the data
• Causal Comparative Designs = are ex post facto quasi-experimental designs; They involve comparison of pre-selected conditions/groups after the fact

Time-Series Design
Measure dependent variable a lot over time

Experimental Designs, cont.
• Single Subject Designs
o -Like mini longitudinal experimental designs, on individuals or small groups of individuals
o -Similar to pretest/posttest designs; examines DV before and after IV
o -Used when it doesn’t make sense to pool effects across individuals - ex. When working with children with special needs, the specific behaviors and needs of one child are not the same as others' - But tracking that one child over time may help develop strategies to help that specific child - you're not trying to generalize your findings, you're just trying to help that one individual
o --Special populations
o --Focus on success of specific interventions with specific individuals

Possible Single-Subject Designs
• A-B Design = baseline, followed by intervention
o A = Baseline
o B = Intervention
But what happens after the intervention is removed? Does behavior go back to baseline?

A-B-A Withdrawal Design = baseline, followed by intervention, concluded with baseline
• When the intervention is removed, does behavior go back to baseline?
Ethical issue: is it OK to intervene but then leave subjects back at baseline behavior, especially if we know that the intervention is needed?

A-B-A-B Design = one step further; instead of leaving subjects back at baseline, present intervention again (more ethical)

Multiple Baselines Designs = alternative to ABAB design; used when it’s not ethical to leave the subject at the baseline condition and when measures on multiple Dependent Variables (DVs) are taken.
• -Taking baselines for multiple behaviors at same time – whether it’s one behavior in multiple individuals, multiple behaviors in one individual, one type of behavior in one individual in multiple settings, etc.
• -Difficult to use, because must ensure that multiple DVs aren’t related to one another
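A minimal sketch of summarizing A-B-A-B data (the behavior counts are made up; in practice these designs are usually judged by visually inspecting a graph of the phases):

# Hypothetical counts of a target behavior per session, in phase order
phases = {
    "A1 (baseline)": [8, 9, 7, 8, 9, 8, 7],
    "B1 (intervention)": [5, 4, 4, 3, 3, 2, 3],
    "A2 (withdrawal)": [7, 8, 8, 7, 9, 8, 8],
    "B2 (intervention)": [3, 2, 3, 2, 2, 1, 2],
}

for phase, counts in phases.items():
    print(phase, sum(counts) / len(counts))  # mean per phase
# If the behavior worsens when the intervention is withdrawn (A2) and improves
# again when it is reintroduced (B2), that supports an intervention effect.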

Issues to Consider: Internal Validity
Is Treatment Standardized?
• -Number of Variables Changing Across Conditions? - Is the behavior changing because of my intervention, or is there another explanation?
Condition Length?
• -Degree and Speed of Change? - You want to show a stable trend. Do you have enough data points to see a stable trend in the experiment results? - The "magic" number is 7 measurements to see a stable trend
Experimenter Effects?
• -Because this is a one-on-one experiment, the experimenter is likely to have an impact on the individual
Practical Significance of Results?
• -Practical significance is looked at more than statistical significance because of the single-subject design - i.e., did it help the student improve?

Single Subject Designs and External Validity
It is very difficult to generalize the results of these designs beyond the sample studied – WHY?
• -Because it was only designed for one person for a specific purpose. It is not necessarily meant to be generalized.
Thus, it is VERY IMPORTANT for this type of research to be replicated before strong conclusions are drawn.

*We now have all the information we need for quiz 3 if we want to take it now.



2/14-20
I. Experimental Design
*Smooch has 2 toys: a teal stuffed dolphin, and a red rubber bone. If I throw each toy out into the yard, she’ll charge after the dolphin, but almost never the bone. I conduct this experiment over and over again, sometimes varying the order in which I throw the items, sometimes varying the distance. However, I always get the same results.

My neighbor, who is watching over the fence, says that Smooch just doesn’t like rubber toys. What is wrong with my experimental design, that I can’t draw this conclusion?

I design a new experiment, in which I have:
a teal rubber ball
a teal stuffed dolphin
a red rubber bone
a red stuffed mouse

I throw the toys out into the yard in varying orders, and varying distances. Smooch always goes after the ball and the dolphin. She rarely goes after the bone or the mouse.

This time, I conclude that Smooch doesn’t like red toys. In my second experiment,
State the independent variable(s):
State the dependent variable:
What did I have to change/add in my 2nd experiment that wasn’t present in my 1st experiment?

II. What makes experimental designs so great? CONTROL!!!
In experimental designs, researchers have control over:
-Independent Variables (what is manipulated; whatever you are comparing, such as traditional vs. online technology)
-Dependent Variables - what is being measured; how is it operationalized? How is it being measured?
-Extraneous Variables - all of the things I can't control; things that have an impact on my dependent variable, such as dog color blindness

III. Internal Validity (validity is: is the experiment measuring what it is supposed to test) and Control
*History (some uncontrolled event, such as a fire alarm)

*Selection (how do you select your subject)

*Maturation (individuals change over time)

*Testing (does just giving a test somehow influence what the subjects think about the I.V.?)

*Instrumentation (something could be wrong with the test)

*Treatment Replications

*Attrition (mortality, or losing subjects during the experiment)

*Regression to the Mean

*Diffusion of Treatment (does the I.V. group share info with another group?)

*Experimenter Effects (is the experimenter overly nice or overly mean to subjects? Or if your tester is a cute girl and your subjects are 13-year-old boys, then the subjects do whatever the tester wants them to do)

*Subject Effects

***Self Report Data: Social Desirability Bias: subject wants to appear a certain way; Ex. anonymous survey about drug use

IV. External Validity (generalized study) and Control
1. So more control ≈ higher levels of internal validity… the more I can control my experiment
-Note, this is what allows us to make cause/effect conclusions about experimental data

2. But what happens to external validity? The more I control something, the farther away I am getting from the real world.

***Experimental designs are good because they can be controlled*****

IMPORTANT TERMS
a. Randomization
-Random Selection vs. Random Assignment

b. Within Subjects vs. Between Subjects Variables and Designs

c. Controlling for Confounds
-Holding variables constant
-Building variables into design
-Matching

V. Pretest/Posttest Designs
A. Single Group Posttest-Only Design
-Expose 1 group to the IV and then measure the DV (posttest) once

Example: I decide to test whether using puppets to "read" books with young children helps them learn different sound combinations.

-What’s the problem? No baseline!
Before IV: (nothing)
IV: X (Treatment) – use puppets
After IV: O (Observation) – PA quiz

B. Single Group Pretest Posttest
-For a single group, give them a pretest (DV), then the IV, then a posttest (DV again)

For example, in the “puppet experiment,” I give students a pre-reading PA quiz, then read to them with the puppet, then give them a post-reading PA quiz

Before IV: O (Observation) – PA quiz
IV: X (Treatment) – puppet reading
After IV: O (Observation) – PA quiz

C. Nonequivalent Groups Pretest Posttest: Quasi-Experimental (chosen groups)
-Involves 2 groups (experimental and control); both get the pretest (DV), then only the experimental group is exposed to the IV, and both groups get the posttest (DV)

Experimental group – Before IV: O (PA quiz); IV: X (puppets); After IV: O (PA quiz)
Control group – Before IV: O (PA quiz); IV: no puppets; After IV: O (PA quiz)
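A minimal sketch of analyzing the single-group pretest/posttest version of the puppet study; the PA quiz scores are made up, and a paired t-test is one common choice for a within-subjects comparison:

from scipy import stats  # assumes SciPy is available

# Hypothetical PA quiz scores for one group of children
pretest = [4, 5, 3, 6, 4, 5, 3, 4]
posttest = [6, 7, 5, 8, 5, 7, 4, 6]

# Paired (within-subjects) t-test: each child is measured before and after the IV
t, p = stats.ttest_rel(pretest, posttest)
print(t, p)  # p < .05 suggests a reliable pre-to-post change, but with no
             # control group we can't rule out threats like maturation or testing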


Between Subjects Designs = each group of subjects receives a different level of the IV
-Advantage: often more practical than within subjects designs
-Disadvantage: are differences due to groups? Or to the IV?
-Use of Matching

VI. “TRUE” Experimental Designs (randomly assigned groups)
1. Randomized Groups Pretest Posttest Design
Individuals are randomly assigned to either the experimental or control group; both groups receive the pretest (DV), then only the experimental group is exposed to the IV, and then both groups receive the posttest (DV)
Random assignment gives control over group differences, and the pretest allows for a baseline measure in both groups

Experimental Designs with more than one IV
Factorial Designs = measure the impact of 2 or more IVs on the DV

-Advantages
Allow researchers to determine whether effects are consistent across subject characteristics
Allow researchers to assess interactions

-Disadvantages
Can get messy if there are too many IVs
NEVER HAVE MORE THAN THREE INDEPENDENT VARIABLES WHEN DOING A STUDY!

SCENARIO:
Dr. Koolio has just finished a research study investigating the long-term effects of direct instruction in phonemic awareness on children’s reading abilities. He started studying 30 children in a kindergarten classroom in a mid-sized city in southeastern Louisiana in 2002, and he has conducted follow-up studies with these children every year in September (they’re now in the third grade). The students in this class move together from grade to grade, but their teacher changes every year. Nonetheless, each teacher uses some form of direct instruction in PA in his/her reading units.
Every year, Dr. Koolio goes into the students’ classroom and personally administers the Phonological Awareness Test 2, which he developed just before he started this study. In 2004 (when the children were in the 2nd grade), he found that all of the children (N=26) were progressing very well in reading. However, in September of 2005 he saw sharp decreases in children’s average reading achievement scores (N=16). Dr. Koolio conducted a statistical test comparing children’s test scores from kindergarten and their scores from the third grade and found that scores had increased only slightly, but not significantly (p>.05). He concluded that direct instruction in PA does not have any long-term benefits for children’s reading abilities.

Is the Phonological Awareness Test a valid and reliable measure? (Go to the website to find out).

Consider each of the following possible threats to internal validity. Do any of them apply to Dr. Koolio’s study? If so, how?
History
Selection
Maturation
Testing
Other Extraneous Variables?

What type of design did Dr. Koolio use? Can you recommend any improvements to his design? (And if you can, explain why this is an improvement on the original design).

Do you have any concerns about the conclusions Dr. Koolio drew, or the statistical tests on which they were based?

VII. Analyses of Exp. Designs (t and F)

T-test -- used to assess the impact of two levels of one IV on a single DV

ANOVA – used to assess impact of one or more IVs on a single DV

MANOVA – used to assess impact of one or more IVs on multiple DVs

ANCOVA – used to assess impact of one or more IVs on a single DV after removing effects of a variable that might be correlated to DV (e.g., age, gender, aptitude, achievement, etc.)

MANCOVA – used to assess impact of one or more IVs on multiple DVs after removing effects of a variable that might be correlated to DV (e.g., age, gender, aptitude, achievement, etc.)
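A minimal sketch of the simplest of these, a one-way ANOVA, with made-up DV scores for three levels of one IV:

from scipy import stats

# Hypothetical DV scores under three levels of a single IV
condition_1 = [12, 14, 11, 15, 13]
condition_2 = [16, 18, 17, 15, 19]
condition_3 = [13, 12, 14, 13, 12]

f, p = stats.f_oneway(condition_1, condition_2, condition_3)
print(f, p)  # a significant F (p < .05) says the group means differ somewhere;
             # follow-up tests are needed to say which groups differ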



Wednesday, February 6, 2008

Feb. 6 - Meta-Analysis

Reviewed Quiz #1.

Meta-Analysis and other Methods of Research Synthesis
• Levels of Research Synthesis

• Literature Reviews
o -- allows for in depth discussion of individual studies’ findings and their theoretical implications
o -- no weight given to statistics in the discussion
• Numerical Reviews (“Vote Counting”)
o Studies that are statistically significant “cast votes” for effects
o Statistical significance depends on many factors
• Meta-Synthesis (synthesizing qualitative research findings)
o Used in qualitative research (qualitative research is concepts and ideas and general results)
o It is NOT collection and aggregation of research results
o It is “the bringing together and breaking down of findings, examining them, discovering the essential features, and, in some way, combining them into a transformed whole” (Schreiber et al., 1997, p. 314)
In qualitative research, the goal of the study is to try to understand it
o One of the important issues is how do they actually go about doing the study.
• What are the selection criteria? How am I going to select studies on whatever the topic is? What are my inclusion criteria? (Is it published in a peer-reviewed journal? You want peer review so the studies have been critiqued, or passed a level of critique. There may be restrictions/selection criteria on date, participants, or methods used.)
o Once you have determined the criteria, you start to code and categorize all of the relevant information.
o Teams or Individuals? In general, teams are better. Results are a lot more valid.
o Audit Trail: a list or description of exactly what you did.
o Triangulation: you want to approach a topic or finding from at least three different perspectives

II. Meta-Analysis
• Origins of Meta-Analysis – started in agricultural research

Definition
o Statistical technique that enables the results from a number of studies to be combined to determine the average effect size of a given independent variable.
• Is the effect significant or not? What was found, is it significant and how big was the effect? Remember, if it’s significant or not, that is the P value. ‘N’ indicates sample size.
o Supposedly more objective than narrative literature review

• Advantages

o Use stats to organize and extract info (organizing numbers)
o Eliminate selection bias (???) this is a bold statement;
o Makes use of all info in a study (???) only if you include it or pay attention to it
o Detects mediating characteristics

• Limitations
o No guide for implementation
o Time-limited
o No real rigid methodological rules
o Only as good as the studies it is based on

QUIZ QUESTION: META ANALYSIS ARE ONLY AS GOOD AS THE STUDIES I’M USING TO GET MY RESULTS.

III. Meta-Analysis Methodology

1. Research Focus – what’s the question that needs to be answered

2. Sampling – not recruiting participants; you're going out and finding studies. The studies that your meta-analysis is based on make up the sample.
• Inclusion Criteria? see above re: peer reviewed, dated, etc.

3. Classifying and Coding – not classifying qualitative or abstract data, but specific findings from other studies.

4. Role of the “Audit Trail” – pertains just as much to meta-analysis as it does to meta-synthesis (qualitative and quantitative are equally important).

5. Data Analysis
• Significance vs. Effect size vs. Effect Magnitude
o Is the effect size small, medium or large?
o Of what magnitude is the effect? Small = weak; large = really important and dramatic.
• Comparison to Zero Effects (Cohen’s d is the effect size)
• Positive vs. Negative Effect Sizes – usually only see positive effect sizes. Negative is usually the opposite of what you are wanting.
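A minimal sketch of computing Cohen's d for one study from summary statistics; all of the numbers are made up, and the pooled standard deviation is one standard way to scale the difference:

from math import sqrt

# Hypothetical summary statistics from one study
mean_t, sd_t, n_t = 78.0, 10.0, 30  # treatment group
mean_c, sd_c, n_c = 71.0, 12.0, 30  # control group

# Pooled standard deviation, then Cohen's d
pooled_sd = sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
d = (mean_t - mean_c) / pooled_sd
print(round(d, 2))  # rough convention: .2 = small, .5 = medium, .8 = large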

6. Interpretation and Beyond
• Raises Researchers Consciousness? – what are the important questions? What is important to look at?
• Highlights Gaps in Field Knowledge Base (lets researchers know what they need to go out and research)
• Motivates Future Research
• Implications for Practice?
Synthesis is qualitative; analysis is quantitative

Wednesday, January 30, 2008

January 30 - Reliability & Validity

Important Characteristics of Measures

I. Validity vs. Reliability
1. Validity= appropriateness, correctness, meaningfulness, and usefulness of
inferences made about the instruments used in a study
2. Reliability= consistency of scores obtained, across individuals, administrators, and
sets of items

II. Relationship Between Reliability and Validity
Suppose I have a faulty measuring tape and I use it to
measure each student’s height. - My tool is invalid, but it’s still reliable.

On the other hand, if I have a correctly printed measuring tape... - My tool is both valid and reliable.
question 20 and 5
****If something is unreliable it is invalid, BUT it can be invalid and reliable*****
or valid and reliable

III. Types of Validity
a. Content Validity - do the contents of the measurement match the content it is supposed to cover (does the information to be learned match the information on the test)?
b. Criterion Validity - how well do two measures of something correlate with one another
1. –Predictive Validity
2. –Concurrent Validity (CRT measures what was learned that year; correlated or compared to grades for the year).
a. Convergent vs. Discriminant Validity (****test question: one way people demonstrate validity is to show a measure's correlation with some other measurement****) That's how we know the SAT is a decent predictor of college success: it has a decent correlation with college GPA.
c. Construct Validity - constructing a test or a measurement by getting experts in the field who say, yes, that really is part of the concept.
• Internal Validity - how well is your study designed, how well did you select your subjects, how well did you get into your subject (will discuss later)

Wednesday, January 23, 2008

January 23 - Sampling & Measurement

I. Sampling
A. Samples vs. Populations
*Sample = group of people participating in the study

*Population = group of people to whom I want to generalize the results
-target population - how well does the sample represent the population you're trying to address?
-accessible population (200 kids)

*Two types of sampling:
A. Probability Sampling (the text calls it simple random sampling) = pulling a random group of people in such a way that each individual has an equal chance of being selected for participation in the study (could also be called straight random sampling)

Probability Sampling Methods:

1. Stratified Random Sampling= select subsets of the population to participate in the study in the same proportion as they appear in the population
e.g., 400 teachers in Salt Lake area schools, 275 are female and 125 are male
I decide to sample 40% of Salt Lake area teachers. My sample contains:
40% * 400 teachers = 160 total teachers in sample
40% * 275 female teachers = 110 females in sample
40% * 125 male teachers = 50 males in sample
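A minimal sketch of that stratified sampling arithmetic:

# The 400 Salt Lake area teachers from the example, split by stratum
population = {"female": 275, "male": 125}
sampling_fraction = 0.40

sample = {stratum: int(count * sampling_fraction)
          for stratum, count in population.items()}
print(sample)                # {'female': 110, 'male': 50}
print(sum(sample.values()))  # 160 teachers, same proportions as the population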

2. Clustered random sample= select existing groups of participants instead of creating subgroups
e.g., Instead of randomly selecting individuals in correct proportions, I randomly select groups of individuals
So now I randomly select some schools in Salt Lake area district, and all teachers in those selected schools participate in my study
But, I must ensure that those groups selected are representative of my population as a whole.

3. Two-Stage Random Sampling (will probably never see this but should know what it is) = combines methods 1 (stratified) and 2 (clustered); in stage 1, existing groups are randomly selected; in stage 2, individuals from those groups are randomly selected

e.g., Instead of randomly selecting individuals in correct proportions, I randomly select groups of individuals, then randomly select individuals from those groups

Stage 1: I randomly select some schools in Salt Lake area district. (first pick my cluster, then stage 2 is picking my people from those clusters)

Stage 2: From each selected school, I randomly select a subset of teachers to participate in the study

B. Non-Probability (or non-random) Sampling (also not always used) = individuals are selected from the population in such a way that not everyone has an equal chance of being selected for participation in the study (such as choosing every 3rd M&M in a group).

1. Systematic Sampling (the best way to study)= every nth individual in a population is selected for participation in the study
e.g., I take an alphabetical list of all teachers in Salt Lake area schools, and select every 3rd individual from that list for participation in my study. Here, 3 is my sampling interval

**** Sampling interval (the "every nth" that is being sampled) = population size / desired sample size

e.g., sampling interval = 400 teachers / 160 teachers (or 40%) =2.5

sampling ratio = proportion of individuals in population selected for sample
e.g., sampling ratio = 160/400 = .4 or 40%
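A minimal sketch of systematic sampling with the numbers above, using a hypothetical alphabetized roster; note that a fractional interval has to be rounded, so the realized sample size won't exactly match the target:

import math

teachers = [f"teacher_{i}" for i in range(1, 401)]  # alphabetical list of 400 teachers
desired_sample_size = 160

interval = len(teachers) / desired_sample_size  # 400 / 160 = 2.5
step = math.ceil(interval)                      # rounded up to a whole interval: 3
sample = teachers[::step]                       # every 3rd teacher on the list
print(step, len(sample))                        # 3 134 (fewer than 160 after rounding)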

2. Convenience Sampling = select from a group of individuals who are conveniently available to be participants in your study

e.g., I go into schools at lunchtime and give surveys to those teachers who can be found in the teachers’ lounge

Potential Problem:
Sample is likely to be biased –are those teachers in the lounge at lunchtime likely to be different from those who aren’t?
This type of sampling should be avoided if possible.

3. Purposive Sampling (be alert to this kind of sampling; it's biased) = researchers use past knowledge or their own judgment to select a sample that they think is representative of the population

e.g., I decide to just give my survey to teachers who are also currently enrolled in EDPS 6030, because I *think* they are representative of the population of Salt Lake area teachers

Potential problem: Researchers may be biased about what they believe is representative of a population, or they may be just plain wrong.

**Beware of this type of sampling as the potential for bias is very strong.

***The first question you should ask about sampling: does it represent the target population?


II. Sampling in Qualitative Research
• Purposive Sampling - I will select those individuals who I think represent the interest
• Case Analysis (can be any of the following)

–Typical (teacher burnout: a typical example would be a teacher who is just tired)
–Extreme (teacher burnout: an extreme example is the teacher who burned out so badly that they went and got a job at Blockbuster or became a scuba instructor)
–Critical (portrays the critical lessons of burnout, such as not caring anymore, coming up with reasons to cancel class, using the same stuff every time, going through the motions until retirement)
• Maximum Variation (someone who is really low in burnout and makes all the other teachers sick because they won't retire; they just love it so much that they teach 45 years)
• Snowball Sampling (start off with a small group of participants and ask those people to ask other people to get more people, like a snowball rolling down a hill that gets bigger the longer it rolls).

Sampling and Validity
1. What size sample is appropriate?
Descriptive => 100 subjects

Correlational=> 50 subjects

Experimental => 30 subjects per group (often see fewer than this)*

Causal-Comparative => 30 subjects per group*

But if groups are tightly controlled, fewer (e.g., 15 per group) may be OK.

2. How generalizable is the sample?
external validity= the results should be generalizable beyond the conditions of the individual study
1. Population generalizability= extent to which the sample represents the population of interest
2. Ecological generalizability= degree to which the results can be extended to other settings or conditions

III. What is Measurement?
• Measurement (just the collection of data)
• Evaluation (making decisions on the basis of the collected data)
• Where does assessment fit in?

A. What kind of scale is the measurement based on?
Nominal - refers to the same thing as categorical variables, things you can name, things that are categories, such as gender (qualitative variable)
Ordinal - rank ordering of information (quantitative); no distance information, just an order, such as 1, 2, 3; we don't know the distance between 1 and 2, or who voted for whom, just the rank
Interval - you do get information about the distance between a rank of 1 and 2. On an interval scale you can measure distance. If a room is zero degrees Fahrenheit, does that mean there's a complete absence of heat? No: that's an interval scale, with no absolute zero!
Ratio - there is an absolute zero when measuring.

IV. Types of Educational Measures
• Cognitive vs. Non-Cognitive
Cognitive = interested in the learning process
Non-Cognitive=
• Commercial vs. Non-Commercial
- Commercial: probably the norm; has been tested and standardized, but may not be tailored to the specific interest of the research study
-Non-Commercial: materials can be developed specifically for the research study
• Direct vs. Indirect
Direct: just get the information directly from the participant
Indirect: getting information from somebody else, such as doing a document analysis to understand principles, where the documents may or may not come from the participant

V. Sample Cognitive Measures
Standardized Tests
–Achievement Tests
–Aptitude Tests

Behavioral Measures
–Naming Time
–Response Time (how quickly does one respond to a question)
–Reading Time (how quickly can one read a passage or a word)

WPM (words per minute)

Eye-tracking Measures (enable the researcher to determine exactly where the person is looking on a keyboard/computer, down to the exact keystroke); determine how you look at images vs. text
***AOI = area of interest

We'll do section VI later in the semester
VI. Non-Cognitive Measures
• Surveys & Questionnaires
• Observations
• Interviews

How is an individual’s score interpreted?
1. Norm-referenced instruments= an individual’s score is based on comparison with peers (e.g., percentile rank, age/grade equivalents, grading on curve, etc.)
2. Criterion-referenced instruments= an individual’s score is based on some predetermined standard (e.g., raw score)

Interpreting Data
Different Ways to Present Scores:
1. Raw Score= number of items answered correctly, number of times behavior is tallied, etc.
2. Derived Scores= scores changed into a more meaningful unit
a. age/grade equivalent scores = for a given score, tell what age/grade the score usually falls in
b. percentile rank = ranking of a score compared to all other individuals who took the test
c. standard scores (same as a z score) = indicate how far scores are from a reference point; usually best to use in research
e.g., the overall mean for a test is 560; a classroom mean is 140 and the student gets a 132. How do you compare the two? Using a z score (standard score) puts both groups on the same scale, so you can compare apples and oranges.
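A minimal sketch of that z-score comparison; the standard deviations are made up, since the notes don't give them:

def z_score(raw, mean, sd):
    # How many standard deviations the raw score sits from the mean
    return (raw - mean) / sd

# Hypothetical: overall test has mean 560, SD 100; classroom test has mean 140, SD 10
overall_z = z_score(620, 560, 100)   # a student scoring 620 on the overall test
classroom_z = z_score(132, 140, 10)  # the student scoring 132 on the classroom test
print(overall_z, classroom_z)  # 0.6 vs. -0.8: now the two scales are comparable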

Important Characteristics of Measures
Objectivity - all measures must be objective in order to get a good measure (e.g., using a rubric)
Usability - the measure needs to be usable; you need to know how to use it
Validity - refers to whether the measure actually measures what it is supposed to measure (such as: does a reading comprehension test really measure reading comprehension, or is the student just comprehending what they are looking for by looking for the question specifically when they read?)
Reliability - do I get consistent measurement over time? Without reliability you don't have anything!!!!!