Wednesday, January 30, 2008

January 30 - Reliability & Validity

Important Characteristics of Measures

I. Validity vs. Reliability
1. Validity = appropriateness, correctness, meaningfulness, and usefulness of the
inferences researchers make based on the data collected with the instruments used in a study
2. Reliability = consistency of the scores obtained, across individuals, administrations, and
sets of items

II. Relationship Between Reliability and Validity
Suppose I have a faulty measuring tape and I use it to measure each student’s height. - My tool is invalid, but it’s still reliable (it gives the same wrong answer every time).

On the other hand, if I have a correctly printed measuring tape... - My tool is both valid and reliable.
(See questions 20 and 5.)
****If a measure is unreliable, it is also invalid; BUT a measure can be reliable and invalid,
or reliable and valid.****

III. Types of Validity
a. Content Validity - do the contents of the measure (the information on the test) match the content it is supposed to cover (the information to be learned)?
b. Criterion Validity - how well do two measures of the same thing correlate with one another?
1. –Predictive Validity
2. –Concurrent Validity (e.g., a CRT measures what was learned that year; it is correlated with or compared to grades for the year)
a. Convergent vs. Discriminant Validity (****test question: one way people give evidence for a measure's validity is to show its correlation with some other measure****). This is how we know the SAT is a decent predictor of college success: it has a decent correlation with college GPA (a worked correlation sketch follows this list).
c. Construct Validity - whether a test or measurement really captures the concept (construct) of interest, often established by getting experts in the field to confirm that, yes, that really is part of the concept.
• Internal Validity - how well the study is designed: how well you selected your subjects, how well you got at your subject (will discuss later)
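
To make the correlation idea concrete, here is a minimal Python sketch of criterion-related evidence: correlate a predictor with a criterion measure. The SAT scores and GPAs below are invented purely for illustration, not real data.

# Minimal sketch: criterion-related validity as a correlation between a
# predictor and a criterion. All numbers below are hypothetical.
from statistics import mean, stdev

sat_scores  = [1100, 1250, 980, 1320, 1190, 1050]   # hypothetical predictor
college_gpa = [3.1, 3.6, 2.7, 3.8, 3.4, 2.9]        # hypothetical criterion

def pearson_r(x, y):
    # Pearson correlation coefficient between two equal-length lists
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

print(round(pearson_r(sat_scores, college_gpa), 2))  # close to 1.0 for these made-up, nearly linear data

A strong correlation like this is the kind of criterion-related evidence described above; a weak correlation would undercut the claim that the predictor is valid for that purpose.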

Wednesday, January 23, 2008

January 23 - Sampling & Measurement

I. Sampling
A. Samples vs. Populations
*Sample = group of people participating in the study

*Population = group of people to whom I want to generalize the results
-target population - how well does the sample represent the population you're trying to address?
-accessible population (200 kids)

*Two types of sampling:
A. Probability Sampling (the text calls it simple random sampling) = pulling a random group of people in such a way that each individual has an equal chance of being selected for participation in the study (could also be called straight random sampling)

Probability Sampling Methods:

1. Stratified Random Sampling= select subsets of the population to participate in the study in the same proportion as they appear in the population
e.g., 400 teachers in Salt Lake area schools, 275 are female and 125 are male
I decide to sample 40% of Salt Lake area teachers. My sample contains:
40% * 400 teachers = 160 total teachers in sample
40% * 275 female teachers = 110 females in sample
40% * 125 male teachers = 50 males in sample
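
A minimal Python sketch of the proportional allocation worked out above; the stratum sizes are the ones from the example.

# Minimal sketch: proportional allocation for stratified random sampling,
# using the stratum sizes from the example above.
population_strata = {"female": 275, "male": 125}   # 400 Salt Lake area teachers total
sampling_fraction = 0.40                            # sample 40% of each stratum

for stratum, size in population_strata.items():
    n = round(size * sampling_fraction)
    print(stratum, n)                               # female 110, male 50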

2. Cluster Random Sampling = select existing groups of participants instead of creating subgroups
e.g., Instead of randomly selecting individuals in correct proportions, I randomly select groups of individuals
So now I randomly select some schools in Salt Lake area district, and all teachers in those selected schools participate in my study
But, I must ensure that those groups selected are representative of my population as a whole.

3. Two-Stage Random Sampling (you will probably never see this, but should know what it is) = combines cluster random sampling (method 2) with individual random sampling; in stage 1, existing groups are randomly selected; in stage 2, individuals from those groups are randomly selected

e.g., Instead of randomly selecting individuals in correct proportions, I randomly select groups of individuals, then randomly select individuals from those groups

Stage 1: I randomly select some schools in Salt Lake area district. (first pick my cluster, then stage 2 is picking my people from those clusters)

Stage 2: From each selected school, I randomly select a subset of teachers to participate in the study
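
A minimal Python sketch of the two-stage procedure just described; the school names, the number of schools selected, and the number of teachers per school are all made up for illustration.

# Minimal sketch: two-stage random sampling.
# Stage 1 selects clusters (schools); stage 2 selects individuals within them.
# School names and teacher counts are hypothetical.
import random

schools = {f"School {i}": [f"teacher_{i}_{j}" for j in range(20)] for i in range(10)}

selected_schools = random.sample(list(schools), 3)      # stage 1: pick clusters at random
sample = []
for school in selected_schools:
    sample += random.sample(schools[school], 8)         # stage 2: pick teachers within each cluster

print(len(sample), "teachers from", selected_schools)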

B. Non-Probability (non-random) Sampling (also not always used) = individuals are selected from the population in such a way that not everyone has an equal chance of being selected for participation in the study (such as choosing every 3rd M&M in a group)

1. Systematic Sampling (the best way to study) = every nth individual in a population is selected for participation in the study
e.g., I take an alphabetical list of all teachers in Salt Lake area schools, and select every 3rd individual from that list for participation in my study. Here, 3 is my sampling interval

****Sampling interval (means "every nth" individual is sampled) = population size / desired sample size

e.g., sampling interval = 400 teachers / 160 teachers (or 40%) = 2.5

sampling ratio = proportion of individuals in population selected for sample
e.g., sampling ratio = 160/400 = .4 or 40%
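
A minimal Python sketch of systematic sampling using the interval and ratio formulas above; the list of 400 teacher names is a placeholder, and the fractional interval (2.5) is rounded up to "every 3rd" for the selection step.

# Minimal sketch: systematic sampling with a sampling interval and ratio.
# The teacher list is a hypothetical placeholder.
import math

teachers = ["teacher_%03d" % i for i in range(400)]    # alphabetical list of 400 teachers

desired_sample_size = 160
interval = len(teachers) / desired_sample_size         # 400 / 160 = 2.5
k = math.ceil(interval)                                # round up: select every 3rd teacher

sample = teachers[::k]                                 # every kth name from the list
sampling_ratio = len(sample) / len(teachers)
print(len(sample), round(sampling_ratio, 2))           # 134 teachers, ratio of about 0.34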

2. Convenience Sampling = select from a group of individuals who are conveniently available to be participants in your study

e.g., I go into schools at lunchtime and give surveys to those teachers who can be found in the teachers’ lounge

Potential Problem:
Sample is likely to be biased –are those teachers in the lounge at lunchtime likely to be different from those who aren’t?
This type of sampling should be avoided if possible.

3. Purposive Sampling (be alert to this kind of sampling; it can be biased) = researchers use past knowledge or their own judgment to select a sample that they think is representative of the population

e.g., I decide to just give my survey to teachers who are also currently enrolled in EDPS 63030, because I *think* they are representative of the population of Salt Lake area teachers

Potential problem: Researchers may be biased about what they believe is representative of a population, or they may be just plain wrong.

**Beware of this type of sampling as the potential for bias is very strong.

***First question you should ask about sampling: Does it represent the target population?


II. Sampling in Qualitative Research
• Purposive Sampling - I will select those individuals that I think represent the area of interest
• Case Analysis (can be any of the following)

–Typical (teacher burnout: a typical example would be a teacher who is just tired)
–Extreme (teacher burnout: an extreme example is the teacher who burned out so badly that they went and got a job at Blockbuster or became a scuba instructor)
–Critical (portrays the critical lessons of burnout, such as not caring anymore, coming up with reasons to cancel class, using the same material every time, going through the motions until retirement)
• Maximum Variation (e.g., someone who is really low in burnout, who makes all the other teachers sick because they won't retire; they just love it so much that they teach for 45 years)
• Snowball Sampling (start off with a small group of participants and ask those people to recruit other people, like a snowball rolling down a hill that gets bigger the longer it rolls)

Sampling and Validity
1. What size sample is appropriate?
Descriptive => 100 subjects

Correlational=> 50 subjects

Experimental => 30 subjects per group (often see smaller numbers than this)*

Causal-Comparative => 30 subjects per group*

But if groups are tightly controlled, fewer (e.g., 15 per group) may be OK.

2. How generalizable is the sample?
external validity= the results should be generalizable beyond the conditions of the individual study
1. Population generalizability= extent to which the sample represents the population of interest
2. Ecological generalizability= degree to which the results can be extended to other settings or conditions

III. What is Measurement?
• Measurement (just the collection of data)
• Evaluation (making decisions on the basis of the collected data)
• Where does assessment fit in?

A. What kind of scale is the measurement based on?
Nominal - refers to the same thing as categorical variables: things you can name, things that are categories, such as gender (a qualitative variable)
Ordinal - rank ordering of information (quantitative); no distance information, just an order, such as 1st, 2nd, 3rd; we don't know the distance between 1 and 2 (e.g., not how many voted for whom, just the rank each finished in)
Interval - you do get information about the distance between a rank of 1 and 2. On an interval scale you can measure distance, but there is no absolute zero; if a room is zero degrees Fahrenheit, that does not mean there is a complete absence of heat
Ratio - there is an absolute zero when measuring.

IV. Types of Educational Measures
• Cognitive vs. Non-Cognitive
Cognitive = interested in the learning process
Non-Cognitive = not focused on the learning process (see Section VI: surveys, observations, interviews)
• Commercial vs. Non-Commercial
- Commercial: probably the norm; has been tested and standardized, but may not be tailored to the specific interest of the research study
- Non-Commercial: materials can be developed specifically for the research study
• Direct vs. Indirect
Direct: get the information directly from the participant
Indirect: get the information from somewhere else, such as doing a document analysis to learn about principals, where the documents may or may not come from the participant

V. Sample Cognitive Measures
Standardized Tests
–Achievement Tests
–Aptitude Tests

Behavioral Measures
–Naming Time
–Response Time (how quickly does one respond to a question)
–Reading Time (how quickly can one read a passage or a word)

–Words per minute (wpm)

Eyetracking Measures (enable the researcher to determine exactly where a person is looking on the keyboard/screen, down to the level of an individual key) - can show how people look at images vs. text
***AOI = area of interest

We'll do section VI later in the semester
VI. Non-Cognitive Measures
• Surveys & Questionnaires
• Observations
• Interviews

How is an individual’s score interpreted?
1. Norm-referenced instruments= an individual’s score is based on comparison with peers (e.g., percentile rank, age/grade equivalents, grading on curve, etc.)
2. Criterion-referenced instruments= an individual’s score is based on some predetermined standard (e.g., raw score)

Interpreting Data
Different Ways to Present Scores:
1. Raw Score= number of items answered correctly, number of times behavior is tallied, etc.
2. Derived Scores= scores changed into a more meaningful unit
a. age/grade equivalent scores= for a given score, tells what age/grade that score usually falls in
b. percentile rank= ranking of score compared to all other individuals who took the test
c. standard scores (same as a z score)= indicates how far scores are from a reference point; usually best to use in research
e.g., the overall mean for a test is 560; a classroom mean is 140 and the student gets a 132. How do you compare the two? Using a z score (standard score) puts both groups on the same scale, so you can compare apples and oranges.
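
A minimal Python sketch of the z-score idea above. The example gives only means (overall 560, classroom 140, student 132), so the standard deviations used here are hypothetical.

# Minimal sketch: standard (z) scores put scores from different scales on one scale.
#   z = (score - mean) / standard deviation
# The standard deviations below are assumed for illustration; the example gives only means.

def z_score(score, mean, sd):
    return (score - mean) / sd

student_z = z_score(132, 140, 10)            # the student's 132 against the class mean of 140 (assumed SD = 10)
equivalent_overall = 560 + student_z * 100   # the same z expressed on the overall test scale (assumed SD = 100)

print(round(student_z, 2), equivalent_overall)   # -0.8 and 480.0: apples and oranges, now comparable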

Important Characteristics of Measures
Objectivity - measures must be objective in order to get a good measurement (e.g., by using a rubric)
Usability - the measure needs to be usable; you need to know how to use it
Validity - does the measure actually measure what it is supposed to measure? (e.g., does a reading comprehension test really measure reading comprehension, or is the student just hunting for the answer to each specific question as they read?)
Reliability - do I get consistent measurement over time? Without reliability you don't have anything!!!!!

Wednesday, January 16, 2008

Jan 16 '08

Research Questions and Variables and Hypotheses (notes by Dr. Cook)

I. What is a researchable question?
II. Characteristics of researchable questions
III. Research Variables
A. Quantitative vs. Categorical
B. Independent vs. Dependent
IV. Hypotheses
A. Quantitative vs. Qualitative
V. Identifying Research Articles

Part II
A. Research Problem: a problem to be solved, area of concern, general question, etc.
e.g. We want to increase use of technology in K-3 classrooms in Utah.
B. Research Question: a clarification of the research problem, which is the focus of the research and drives the methodology chosen
e.g. Does integration of technology into teaching in grades K-3 lead to higher standardized achievement test scores than traditional teaching methods alone?
*you can ask a research question any way, it just has to drive the methodology

C. Researchable Research Questions
• Where do they come from?
–Experimenter interests
–Application issues (how can this be used in schools, or wherever?)
–Replication issues (could be replicating something as-is, or testing the same question in a different situation)
• Do they focus on product or process? Or neither?
Ex: a process that we could study is behavioral change - how does behavior change over the course of our study
Ex: product is the end result
Ex of neither: a survey (not focused on product or process)
• Are they researchable? Unresearchable?
1. Researchable Questions – contain empirical (i.e., measurable) referents
a. Empirical Referent – something that can be observed and/or quantified (measured) in some way
e.g., The Pepsi Challenge – which soda do people prefer, Coca-Cola or Pepsi? (not a formal research study, but the preference can be measured)
2. Un-Researchable Questions–contain no empirical referents, involve value judgments
e.g., Should prayer be allowed in schools? (as opposed to the researchable "How many people want prayer in schools?")

D. Essential Characteristics of Good Research Questions:

1. They are feasible. (Depends on the person: money, time; each project has its own parameters. Is it a doable project?)
2. They are clear. (do you understand all the terms in that question; do you understand how the terms are defined)
a. Conceptual or Constitutive definition = all terms in the question must be well-defined and understood (e.g., being able to define what "test scores" means)
b. Operational definition = specify how the dependent variable will be measured
3. They are significant. (Important; what counts as significant can depend on the context.) Ex: wearing a red shirt vs. a blue shirt isn't important.
4. They are ethical.
a. Protect participants from harm.
b. Ensure confidentiality.
c. Should subjects be deceived?

E. Variables: Quantitative vs. Categorical
1. Quantitative Variables (numerical in some way)
a. Continuous (can be further divided up; there is space between 4 and 5)
b. Discontinuous (Discrete) (either 2 or 3, nothing in between; very precise)
2. Categorical Variables
Can look for relationships among:
1. Two Quantitative Variables (relationship between height and weight)
2. Two Categorical Variables (relationship between religion and political affiliation)
3. A Quantitative and Categorical Variable (relationship between age and occupation)

F. Variables: Independent vs. Dependent
1. Independent Variable = the variable that affects the dependent variable; the effect of interest
a. Manipulated (what the researcher is manipulating; the variable whose effect is of interest)
b. Selected (such as female vs. male: you can't assign people to these groups, because you can't change whether someone is female or male; they just are)
2. Dependent Variable= (what we're measuring) dependent on the independent variable, or what is being measured
*****this question is on Quiz 1 for sure and probably 2 & 3
3. Extraneous variables (or confounds) = uncontrolled factors affecting the dependent variable (the stuff that messes up the study; anything that can't be controlled)

G. Quantitative Research Hypotheses
• They should be stated in declarative form. (making a statement!)
• They should be based on facts/research/theory. (has a reason/justification behind the prediction)
• They should be testable. (should be able to be tested)
• They should be clear and concise. (should know exactly what it is predicting)
• If possible, they should be directional (taking a stand! predicting in a particular direction, e.g., "female scores will be higher than male scores" vs. the bad example "Are female scores better than males'?")

H. Qualitative Research Questions
• They are written about a central phenomenon instead of a prediction.
• They should be:
–Not too general...not too specific
–Amenable to change as data collection progresses
--Unbiased by the researcher’s assumptions or hoped findings

I. Group Assignment (as a group we made each of these statements a researchable question)
• Teacher effectiveness and student motivation
• Learning differences among American and Pacific Islander children
• Anxiety and test-taking (Do highly anxious students produce lower test scores?)
• Single parents and time spent reading with children
• Physical disabilities and assistive technology

Thoughts: it is a lot harder to produce a reasonable, well-defined and clear research question than it seems. There are too many variables to consider and terms to define.

Article Review Project:
*find an article that is a primary source (original source)

J. Identifying Research Articles
1. What type of source is it?
–Primary Source–original research article
–Secondary Source–reviews, summarizes, or discusses research conducted by others
–Tertiary Source–summary of a basic topic, rather than summaries of individual studies
2. Is it peer reviewed?
–Refereed journals (scientific journals)
• Editors vs. Reviewers
• Blind Reviews (unbiased)
• Level of journal in field
–Non-refereed journals (summary journals)

K. Why peer review?
• Importance of verification before dissemination (the check and balance)
–Once the media disseminates information, it is hard to undo the damage
• A scientist arguing that autism is a result of the MMR vaccine never published those results in a scientific journal
• Claim of first human baby clone was based only on the company’s statement
–The greater the significance of the finding, the more important it is to ensure that the finding is valid

L. Is peer review an insurance policy?
• Not exactly –some fraudulent (or incorrect) claims may still make it through to publication
– A Korean scientist fabricated data supporting the landmark 2004 claim that he had created the world's first stem cells from a cloned human embryo.
• Peer review is another source of information for:
– Funding Allocation
– Quality of Research / Publication in Scientific Journals
– Quality of Research Institutions (both on department and university levels)
– Policy Decisions

***Peer review is a check for sound methodology, helps to make an article/paper better.

For the research article: look up "limited English" and "literacy" to find an article, then email the professor about it.