Parametric inferential statistics

Pearson correlation

Pearson correlation (sometimes referred to as Pearson’s r, bivariate correlation, or Pearson product-moment correlation coefficient) is used to determine the strength and direction of a relationship between two continuous variables. This test assumes both variables are approximately normally distributed.

Note that this is a parametric test; if the assumptions of this test are not met, consider running a Spearman’s rank-order correlation instead (i.e., the non-parametric alternative to a Pearson correlation). For a refresher on how to check normality, read the “Explore” procedure on the Descriptive Statistics SPSS LibGuide page. Ideally, the p-value from the test of normality (e.g., Shapiro-Wilk) should be > .05, your histogram should approximate a normal distribution (i.e., a standard “bell-shaped curve”), and the points on your Q-Q plot should be fairly close to the line.

How to run a Pearson Correlation

Click on Analyze. Select Correlate. Select Bivariate.
Place two or more variables in the “Variables” box.
In the Correlation Coefficients section, ensure “Pearson” is checked.
Click OK to run the test (results will appear in the output window).

Interpreting the Output

Running the above steps will generate the following output: a Correlations table that indicates the Pearson correlation (r) between the variables, the significance value (p), and the number of observations (n).

Pearson’s r can range from -1 (perfect negative) to +1 (perfect positive), and indicates the strength and direction of the relationship between the two variables; p indicates statistical significance, with < .05 generally considered statistically significant (i.e., indicating a significant correlation between the two variables). Here, we see a non-significant weak positive correlation between the two continuous variables.
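
To see what the r value in the Correlations table represents, here is a minimal Python sketch (not SPSS syntax) that computes Pearson’s r by hand. The variable names and data are hypothetical and chosen only for illustration. It also computes the t statistic on n - 2 degrees of freedom that underlies the significance test; converting that to a p-value requires a t distribution and is left out here.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, for illustration only
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)
n = len(hours)
# t statistic for testing r against zero, with n - 2 degrees of freedom
t = r * sqrt((n - 2) / (1 - r ** 2))
print(f"r = {r:.3f}, t({n - 2}) = {t:.3f}")  # prints: r = 0.775, t(3) = 2.121
```

The sign of r gives the direction of the relationship and its absolute value gives the strength, which is the same reading you apply to the SPSS table.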

Pearson correlations are best visualized using a scatterplot, which can be easily created by clicking Graphs, Chart Builder. Select Scatter / Dot, drag the desired graph type from the options to the blue text, drag and drop your variables to the x and y axis, and click OK. This plot allows you to visualize the linear relationship between your selected variables.

One sample t-test

The one sample t-test is used to determine whether the mean of a single continuous variable differs from a specified constant. This test assumes that the observations are independent and that the data are normally distributed.

Note that this is a parametric test; if the assumptions of this test are not met, consider running a Wilcoxon signed-rank test instead (i.e., the non-parametric alternative to the one sample t-test). For a refresher on how to check normality, read the “Explore” procedure on the Descriptive Statistics SPSS LibGuide page. Ideally, the p-value from the test of normality (e.g., Shapiro-Wilk) should be > .05, your histogram should approximate a normal distribution (i.e., a standard “bell-shaped curve”), and the points on your Q-Q plot should be fairly close to the line.

How to run a one sample t-test

Click on Analyze. Select Compare Means. Select One Sample t-test.
Place one or more variables in “Test Variable(s)”, and indicate your desired constant comparison in “Test Value”.
Ensure the “Estimate effect sizes” box is checked.
Click OK to run the test (results will appear in the output window).

Interpreting the Output

Running the above steps will generate the following output: a One-Sample Statistics table that indicates some basic information about the variable you selected, a One-Sample Test table that indicates the result of the test (p < .05 is generally considered statistically significant, which would indicate that the variable mean differs from the test value), and a One-Sample Effect Sizes table that indicates both Cohen’s d and Hedges’ correction values.
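
As a rough illustration of the numbers behind the One-Sample Test and One-Sample Effect Sizes tables, the Python sketch below (not SPSS syntax) computes the t statistic, its degrees of freedom, and Cohen’s d using the sample standard deviation as the standardizer. The data and the test value of 50 are hypothetical; the Hedges’ correction and the p-value are not computed.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(values, test_value):
    """Return (t, df, Cohen's d) for a one sample t-test against test_value."""
    n = len(values)
    m = mean(values)
    sd = stdev(values)            # sample SD (n - 1 in the denominator)
    se = sd / sqrt(n)             # standard error of the mean
    t = (m - test_value) / se
    d = (m - test_value) / sd     # Cohen's d: mean difference in SD units
    return t, n - 1, d

# Hypothetical scores, for illustration only
scores = [52, 55, 49, 58, 61, 53, 57, 55]

t, df, d = one_sample_t(scores, 50)
print(f"t({df}) = {t:.3f}, d = {d:.3f}")
```

A larger absolute t (relative to its degrees of freedom) corresponds to a smaller p-value in the SPSS output.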

Independent samples t-test

The independent samples t-test (also referred to as a between-subjects t-test, Student’s t-test, unpaired t-test, two-sample t-test…) is used to determine whether two groups’ means on the same continuous variable differ. This test assumes that the observations are independent and are randomly sampled from normal distributions that have the same population variance.

Note that this is a parametric test; if the assumptions of this test are not met, consider running a Mann-Whitney U test instead (i.e., the non-parametric alternative to the independent samples t-test). For a refresher on how to check normality, read the “Explore” procedure on the Descriptive Statistics SPSS LibGuide page. Ideally, the p-value from the test of normality (e.g., Shapiro-Wilk) should be > .05, your histogram should approximate a normal distribution (i.e., a standard “bell-shaped curve”), and the points on your Q-Q plot should be fairly close to the line.

How to run an independent samples t-test

Click on Analyze. Select Compare Means. Select Independent Samples T-test.
Place one or more variables in “Test Variable(s)”, and one variable in “Grouping Variable”. Be sure to click “Define Groups” to ensure the groups you wish to compare are set up properly (e.g., 0 and 1, or 1 and 2, depending on how your groups are coded).
Click OK to run the test (results will appear in the output window).

Interpreting the Output

Running the above steps will generate the following output: a Group Statistics table that indicates some basic information about the groups you selected, an Independent Samples Test table that indicates the result of the test (p < .05 is generally considered statistically significant, which would indicate that the two group means differ), and an Independent Samples Effect Sizes table that indicates Cohen’s d, Hedges’ correction, and Glass’s delta values.

Note that the Independent Samples Test table includes a “Sig.” column for Levene’s test for equality of variances; if p < .05, the equality of variances assumption has been violated and you should use the “equal variances not assumed” row when interpreting the p-values found in the “Significance” column.
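
To illustrate why the output has two rows, the Python sketch below (not SPSS syntax) computes the t statistic both ways: with a pooled variance (the equal variances assumed case) and with the Welch approach, which is the usual calculation behind an “equal variances not assumed” row. The data are hypothetical, and p-values are not computed.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Student's t with a pooled variance estimate (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def welch_t(a, b):
    """Welch's t with Satterthwaite degrees of freedom (equal variances not assumed)."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical groups, for illustration only
group_a = [5, 7, 6, 8, 9]
group_b = [3, 4, 6, 5, 2, 4]

t_pooled, df_pooled = pooled_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)
print(f"pooled: t({df_pooled}) = {t_pooled:.3f}")
print(f"Welch:  t({df_welch:.2f}) = {t_welch:.3f}")
```

The two results are close here because the group variances are similar; they diverge more as the variances and group sizes become more unequal.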

Paired samples t-test

The paired samples t-test (also referred to as a dependent t-test, repeated samples t-test, matched pairs t-test…) is used to determine whether the means of two continuous variables (e.g., before and after treatment) from the same group of participants differ. This test assumes that the participants are independent from one another, while the two variables (e.g., measurements) are from the same participant. This test also assumes that the distribution of differences is normally distributed and there are no extreme outliers in the differences.

Note that this is a parametric test; if the assumptions of this test are not met, consider running a Wilcoxon signed-rank test instead (i.e., the non-parametric alternative to the paired samples t-test). For a refresher on how to check normality, read the “Explore” procedure on the Descriptive Statistics SPSS LibGuide page. For this test, normality is checked on the difference scores between your two related measurements. Ideally, the p-value from the test of normality (e.g., Shapiro-Wilk) should be > .05, your histogram should approximate a normal distribution (i.e., a standard “bell-shaped curve”), and the points on your Q-Q plot should be fairly close to the line.

How to run a paired samples t-test

Click on Analyze. Select Compare Means. Select Paired Samples T-test.
Place one or more variables in the “Variable 1” slot and one or more variables in the “Variable 2” slot in the Paired Variables window.
Click OK to run the test (results will appear in the output window).

Interpreting the Output

Running the above steps will generate the following output: a Paired Samples Statistics table that indicates some basic information about the variables you selected, a Paired Samples Correlations table that indicates the correlation between the variables you selected, a Paired Samples Test table that indicates the result of the test (p < .05 is generally considered statistically significant, which would indicate that the variable means of the groups differ), and a Paired Samples Effect Sizes table that indicates both Cohen’s d and Hedges’ correction values.
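
The paired samples t-test is, in effect, a one sample t-test on the difference scores. The Python sketch below (not SPSS syntax) shows this with hypothetical before and after measurements. The differences here are computed as after minus before; SPSS subtracts in the order the pair is entered, so the sign of t may be flipped depending on which variable you place first.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df, difference scores) for a paired samples t-test."""
    diffs = [b - a for a, b in zip(before, after)]  # after minus before
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1, diffs

# Hypothetical measurements from the same five participants
before = [10, 12, 9, 14, 11]
after = [12, 15, 10, 15, 14]

t, df, diffs = paired_t(before, after)
print(f"differences = {diffs}")
print(f"t({df}) = {t:.3f}")
```

The difference scores returned here are also the values you would check for normality and outliers.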

One-way ANOVA

The one-way Analysis of Variance (ANOVA) is used to determine whether the means from three or more groups differ. This test assumes that each group is an independent random sample from normally distributed populations, and that the populations have equal variances.

Note that this is a parametric test; if the assumptions of this test are not met, consider running a Kruskal-Wallis H test instead (i.e., the non-parametric alternative to the one-way ANOVA). For a refresher on how to check normality, read the “Explore” procedure on the Descriptive Statistics SPSS LibGuide page. You will need to place your dependent variable in the “Dependent List” box, and your independent variable in the “Factor List” box. Ideally, the p-values from the tests of normality (e.g., Shapiro-Wilk) should be > .05, your histograms should approximate a normal distribution (i.e., a standard “bell-shaped curve”), and the points on your Q-Q plots should be fairly close to the line.

How to run a one-way ANOVA

Click on Analyze. Select Compare Means. Select One-Way ANOVA.
Place your continuous dependent variable in the “Dependent List” slot, and place your categorical independent variable in the “Factor” slot.
If you wish to conduct pairwise multiple comparisons between means, click on Post Hoc and select the desired test (Bonferroni and Tukey’s are common). Click Continue to save your choices.
Click on the Options Button. Check off Descriptives, Homogeneity of variance test, and Means plot. Click Continue to save your choices.
Click OK to run the test (results will appear in the output window).

Interpreting the Output

Running the above steps will generate a large output with multiple sections. In the Oneway section: a Tests of Homogeneity of Variances table, an ANOVA table, an ANOVA Effect Sizes table, a Multiple Comparisons table, and Means Plots. (Note that there are other tables as well, but these are the critical tables.)

The Tests of Homogeneity of Variances table shows you whether homogeneity has been violated (p < .05); if this is the case, a non-parametric test might be more appropriate.

The ANOVA table shows you whether there is a main effect of your independent variable on the dependent variable (i.e., whether at least one group mean differs from the others). If your ANOVA is significant (p < .05), you need to run follow-up t-tests or multiple comparisons to determine which groups are significantly different.
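
The F value in the ANOVA table is the ratio of between-group to within-group variability. As a minimal sketch under hypothetical data (Python, not SPSS syntax), the calculation looks like this; the p-value lookup is omitted.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups: how far each observation sits from its own group mean
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical data: three groups of three observations each
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

f, df_b, df_w = one_way_anova(groups)
print(f"F({df_b}, {df_w}) = {f:.2f}")  # prints: F(2, 6) = 12.00
```

A large F only says that the group means are not all equal; it does not say which groups differ, which is why the follow-up comparisons are needed.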

The Multiple Comparisons table indicates which groups of the independent variable are different from each other (p < .05).
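
If you selected Bonferroni, the p-values in this table have already been adjusted for the number of comparisons. As a hedged sketch of the idea (Python, not SPSS syntax), a Bonferroni adjustment multiplies each raw p-value by the number of comparisons and caps the result at 1. The comparison labels and raw p-values below are hypothetical, and the assumption that SPSS reports adjusted p-values in this way should be checked against the footnotes of your own output.

```python
def bonferroni(p_values):
    """Bonferroni-adjust a list of raw p-values (multiply by count, cap at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values for three pairwise comparisons
raw = {"A vs B": 0.010, "A vs C": 0.030, "B vs C": 0.400}

adjusted = dict(zip(raw, bonferroni(list(raw.values()))))
for label, p in adjusted.items():
    flag = "significant" if p < 0.05 else "not significant"
    print(f"{label}: adjusted p = {p:.3f} ({flag})")
```

With three comparisons, only a raw p-value below roughly .0167 stays under .05 after adjustment.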

The Means Plots section displays a line graph or bar graph (based on your specification). Note: it’s always a great idea to generate a graph so you can visually try to interpret your data! Here, it looks like there might be a significant difference, but pay attention to the scale of the y-axis! If we adjust the scale, it’s easy to see why there is no significant difference indicated in the ANOVA or the multiple comparisons.

To generate the Tests of Normality table, use the Explore procedure (see the “Descriptive statistics” page for details) to generate the Shapiro-Wilk statistic. For a one-way ANOVA, you need to check normality for each group (i.e., for each level of the independent variable). If p < .05 for any of the groups, the assumption of normality has been violated. In this example, all of the groups pass the normality assumption (only one group shown).

One-way repeated measures ANOVA

The one-way repeated measures Analysis of Variance (ANOVA) is used to determine whether the means of three or more continuous variables OR measurements at three or more timepoints on the same continuous variable differ for one group. There are several assumptions of this test; two important ones you should consider are normality and sphericity.

Note that this is a parametric test; if the assumptions of this test are not met, consider running a Friedman test instead (i.e., the non-parametric alternative to the one-way repeated measures ANOVA). For a refresher on how to check normality, read the “Explore” procedure on the Descriptive Statistics SPSS LibGuide page. You will need to place each level of your repeated measures variable (i.e., each column of measurements) in the “Dependent List” box. Ideally, the p-values from the tests of normality (e.g., Shapiro-Wilk) should be > .05, your histograms should approximate a normal distribution (i.e., a standard “bell-shaped curve”), and the points on your Q-Q plots should be fairly close to the line.

How to run a one-way repeated measures ANOVA

Click on Analyze. Select General Linear Model. Select Repeated Measures.
In the “Repeated Measures Define Factor(s)” pop-up, input a name for your repeated measures variable in the “Within-subject Factor Name” box. Include how many levels of this factor (3+) you have in the “Number of Levels:” box. Click Add. Click Define.

In the “Repeated Measures” pop-up, move the different levels of your repeated measures variable to the “Within-Subject Variables ():” box. NOTE: you need to have one column for each level of the factor (e.g., pre, during, and post measurements would each need to be in a separate column).

If you want a plot, click Plots and move the within-subject factor from “Factors:” to “Horizontal Axis:”, click Add. Select whether you would like a line chart or a bar chart. Click Continue to save your choices.
If you require post-hoc tests or estimated marginal means, make those selections in the “Post Hoc” tab or the “EM Means” tab. For example, to get the pairwise comparisons between the different groups, in the EM Means tab move the repeated measures variable to the “Display Means for:” section and check the “Compare main effects” box.
Click OK to run the test (results will appear in the output window).

Interpreting the Output

Running the above steps will generate the following output: Mauchly’s Test of Sphericity table, Tests of Within-Subjects Effects table, Pairwise Comparisons table, and Profile Plots. (Note that there are other tables as well, but these are the critical tables.)

The Sphericity table indicates which line(s) to read on the Within-Subject Effects table: If sphericity has been violated (Sphericity table p < .05), reading the “Greenhouse-Geisser” line is common; if sphericity has not been violated (Sphericity table p > .05), read the “Sphericity assumed” line in the Within-Subjects Effects table.

The Tests of Within-Subjects Effects table shows you whether you have a main effect of your repeated measures variable (p < .05). If your ANOVA is significant (p < .05), you need to run follow-up t-tests or pairwise comparisons to determine which levels are significantly different. The Pairwise Comparisons table indicates which levels of the repeated measures variable are different from each other (p < .05).
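
As a rough illustration of the F value on the “Sphericity assumed” line, the Python sketch below (not SPSS syntax) partitions the variability into conditions, participants, and error. The data are hypothetical, with one row per participant and one column per level of the repeated measures variable; no sphericity correction or p-value is computed.

```python
def repeated_measures_anova(data):
    """Return (F, df_conditions, df_error) for rows=participants, columns=levels."""
    n = len(data)        # number of participants
    k = len(data[0])     # number of levels of the repeated measures variable
    grand = sum(sum(row) for row in data) / (n * k)

    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Error is what remains after removing condition and participant variability
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (n - 1) * (k - 1)
    f = (ss_cond / df_cond) / (ss_error / df_error)
    return f, df_cond, df_error

# Hypothetical data: four participants measured at three timepoints
data = [
    [4, 5, 6],
    [5, 7, 6],
    [6, 6, 9],
    [5, 6, 7],
]

f, df_c, df_e = repeated_measures_anova(data)
print(f"F({df_c}, {df_e}) = {f:.2f}")  # prints: F(2, 6) = 6.00
```

Removing the participant variability from the error term is what distinguishes this from a one-way ANOVA on the same numbers.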

The Profile Plots section displays a line graph or bar graph (based on your specification). Note: it’s always a great idea to generate a graph so you can visually try to interpret your data! Here, it looks like there is a significant difference, which is indicated in the ANOVA.

To generate the Tests of Normality table, use the Explore procedure (see the “Descriptive statistics” page for details) to generate the Shapiro-Wilk statistic. For a repeated measures ANOVA, you need to check normality for each level of the repeated measures variable (i.e., for each column of measurements). If p < .05 for any of the levels, the assumption of normality has been violated. In this example, one of the levels (shown) passes the normality assumption, but the other two do not.

Factorial ANOVA

A factorial Analysis of Variance (factorial ANOVA) is used to determine whether the means differ across the levels of two or more variables / factors with two or more levels each (i.e., main effects), and whether any factors interact (i.e., interactions). If there are two factors, this is sometimes called a “two-way ANOVA”; if there are three factors, this is sometimes called a “three-way ANOVA”.
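
To make the difference between a main effect and an interaction concrete, the Python sketch below works from a hypothetical 2 x 2 table of cell means (the factor names “dose” and “treatment” are invented for illustration, and equal cell sizes are assumed). It is purely descriptive: it shows what each effect compares, not whether the effect is statistically significant.

```python
# Hypothetical cell means for a 2 x 2 design: (dose, treatment) -> mean
cell_means = {
    ("low", "placebo"): 10,
    ("low", "drug"): 12,
    ("high", "placebo"): 11,
    ("high", "drug"): 17,
}
dose_levels = ["low", "high"]
treatment_levels = ["placebo", "drug"]

def marginal_mean(factor_index, level):
    """Average the cell means over the other factor (assumes equal cell sizes)."""
    values = [m for cell, m in cell_means.items() if cell[factor_index] == level]
    return sum(values) / len(values)

# Main effects compare marginal means across the levels of one factor
dose_marginals = {d: marginal_mean(0, d) for d in dose_levels}
treatment_marginals = {t: marginal_mean(1, t) for t in treatment_levels}

# An interaction asks whether the effect of one factor changes across the other
effect_at_low = cell_means[("low", "drug")] - cell_means[("low", "placebo")]
effect_at_high = cell_means[("high", "drug")] - cell_means[("high", "placebo")]
interaction_contrast = effect_at_high - effect_at_low

print("dose marginal means:", dose_marginals)
print("treatment marginal means:", treatment_marginals)
print("difference of differences:", interaction_contrast)
```

Here the treatment effect is larger at the high dose than at the low dose, so the difference of differences is non-zero; the ANOVA is what tests whether such a pattern is larger than expected by chance.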

There are three separate kinds of factorial ANOVA: fully between factorial ANOVA (where both or all factors have levels that are entirely between-group), fully within factorial ANOVA (where both or all factors have levels that are entirely within-group), and mixed factorial ANOVA (where one or more factors are entirely between-group AND one or more factors are entirely within-group).

Note that all three kinds of factorial ANOVA are parametric tests; if the assumptions of the test are not met, there is no non-parametric equivalent. You could consider transforming your dependent variable in some way, then re-checking all assumptions and running the parametric test again, though a transformation does not guarantee normality. For a refresher on how to check normality, read the “Explore” procedure on the Descriptive Statistics SPSS LibGuide page. You will need to place your within / repeated variable(s) in the “Dependent List” box, and your between / independent variable(s) in the “Factor List” box. Ideally, the p-values from the tests of normality (e.g., Shapiro-Wilk) should be > .05, your histograms should approximate a normal distribution (i.e., a standard “bell-shaped curve”), and the points on your Q-Q plots should be fairly close to the line. Additionally, your boxplots should indicate no outliers in any group.

Fully between factorial ANOVA

This is an extension of a one-way ANOVA, including two or more factors with two or more levels each that are fully “between” (i.e., each participant / sample is in only one condition). This test assumes that each participant / sample is in only one condition (independence of observations), that there should be no outliers, that your dependent variable is approximately normally distributed in each condition (normality), and that each condition has approximately equal variances (homogeneity).

How to run a fully between factorial ANOVA

Click on Analyze. Select General Linear Model. Select Univariate.
Place your continuous dependent variable in the “Dependent Variable” slot, and place your categorical independent variables in the “Fixed Factor(s)” slot.

If you want a plot, click Plots and move the independent variables from “Factors:” to “Horizontal Axis:”, and/or “Separate Lines”, and/or “Separate Plots”, click Add. (NOTE: SPSS will only allow you to plot up to 3 variables at once). Select whether you would like a line chart (best for time series data) or a bar chart (generally the best choice for ANOVA). Click Continue to save your choices.
If you require post-hoc tests or estimated marginal means, make those selections in the “Post Hoc” tab (only between variables with >2 levels) or the “EM Means” tab. For example, to get the pairwise comparisons between the different groups, in the EM Means tab move the main effect (single independent variable) or interaction (independent variables with a “ * ” between them) variables to the “Display Means for:” section. Check the “Compare main effects” box and the “Compare simple main effects” box. Select “Bonferroni” from the drop-down menu. Click Continue to save your choices.
In Options, you can select multiple other helpful pieces of information, such as descriptive statistics, estimates of effect size, homogeneity tests, and heteroskedasticity tests. Click Continue to save your choices.
Click OK to run the test (results will appear in the output window).

Interpreting the Output

Running the above steps will generate a large output with multiple sections. In the Univariate Analysis of Variance section: a Descriptive Statistics table that indicates some basic information about the variables you selected, a Test of Homogeneity of Variances table that indicates Levene’s statistic (if p < .05, the homogeneity assumption has been violated), and a Tests of Between-Subjects Effects table (the ANOVA table; if p < .05 for the independent variables or interaction, this indicates that at least one of the means is different than the others. This does not tell you which means are different from each other, so you need to follow up with pairwise comparisons, multiple comparisons, or t-tests!).

Importantly, when p > .05 in the Tests of Between-Subjects Effects table, we cannot say that there is “no difference” between the level(s); we can only say that we failed to find a difference. If p < .05, we could conclude that there is a difference somewhere between the levels, though we would need to follow this up with additional testing to determine which levels are significantly different from each other. If you find a significant main effect for a factor that only has two levels, you can simply look at the descriptives to know which is higher / lower (as the main effect already tells you that these are different). If your factorial ANOVA reveals p < .05 for a main effect for a factor that has more than two levels, or an interaction, you need to run pairwise post hoc tests to determine the location of the difference(s). To do so, look at the Estimated Marginal Means section showing a Pairwise Comparisons table, and / or a Post Hoc Tests section showing a Multiple Comparisons table. For both of these, if p < .05, this indicates a difference in the means between conditions, which could help explain a significant main effect or interaction in the ANOVA.

Looking at a graph can also help visualize where the differences may be, so be sure to look at the Profile Plots section if you requested a graph.

Fully within factorial ANOVA

This is an extension of a repeated measures ANOVA, including two or more factors with two or more levels each that are fully “within” (i.e., each participant / sample is in each condition). This test assumes that there are no outliers, that your dependent variable is approximately normally distributed in each condition (normality), and that the variances of the differences between all pairs of conditions are approximately equal (sphericity).
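
Sphericity concerns the variances of the difference scores between every pair of conditions. As a minimal sketch of what is being compared (Python, not SPSS syntax, and not Mauchly’s test itself), the code below computes those variances for a single hypothetical within-subject factor with three levels named pre, during, and post.

```python
from itertools import combinations
from statistics import variance

def difference_variances(conditions):
    """Variance of the difference scores for every pair of conditions."""
    result = {}
    for (name_a, a), (name_b, b) in combinations(conditions.items(), 2):
        diffs = [x - y for x, y in zip(a, b)]
        result[(name_a, name_b)] = variance(diffs)
    return result

# Hypothetical data: the same four participants measured three times
conditions = {
    "pre": [4, 5, 6, 5],
    "during": [5, 7, 6, 6],
    "post": [6, 6, 9, 7],
}

variances = difference_variances(conditions)
for pair, v in variances.items():
    print(f"{pair[0]} - {pair[1]}: variance of differences = {v:.3f}")
```

In this made-up example the during - post differences vary more than the other two pairs; Mauchly’s test is what formally evaluates whether such inequality is large enough to count as a violation.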

How to run a fully within factorial ANOVA

Click on Analyze. Select General Linear Model. Select Repeated Measures.
For each of your within-subject variables: enter the name of the variable (as you would like it displayed in the output file) under “Within-Subject Factor Name”, enter the number of levels next to “Number of Levels”, click on “Add”. Once you have finished specifying all variables, click on “Define”.

In the new box, move all of your “within” levels (this is a full cross of your factors: e.g., in this 3 x 2 example there are 6 columns of data) from the left side to the “Within-Subjects Variables” box using the blue arrow. Pay careful attention that the order of the levels matches the ordering in the Within-Subjects Variables box!

If you want a plot, click Plots and move one variable from “Factors:” to “Horizontal Axis:”, and move the other one to the “Separate Lines”, click Add. (NOTE: SPSS will only allow you to plot up to 3 variables at once). Select whether you would like a line chart (best for time series data) or a bar chart (generally the best choice for ANOVA). Click Continue to save your choices.
To get the pairwise comparisons between the different levels, in the EM Means tab move the main effect (including all desired within variables) and interaction (variables with a “ * ” between them) to the “Display Means for:” section. Check the “Compare main effects” box and the “Compare simple main effects” box. Select “Bonferroni” from the drop-down menu. Click Continue to save your choices.
In Options, you can select multiple other helpful pieces of information, such as “Descriptive statistics”, and “Estimates of effect size”. Click Continue to save your choices.
Click OK to run the test (results will appear in the output window).
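
The column ordering mentioned in the steps above is easy to get wrong, so it can help to list the full cross of your factors before filling in the dialog. The Python sketch below enumerates the cells of a hypothetical 3 x 2 design (the factor and level names are invented). It assumes the slots are ordered with the last-defined factor changing fastest, i.e., (1,1), (1,2), (2,1), and so on; check this against the index pairs shown in the Within-Subjects Variables box in your own version of SPSS.

```python
from itertools import product

def within_cells(factors):
    """List (index tuple, column label) for every cell of a fully crossed design.

    `factors` maps each factor name to its ordered list of level names.
    The last factor varies fastest, matching the assumed slot order.
    """
    names = list(factors)
    index_ranges = [range(1, len(factors[name]) + 1) for name in names]
    cells = []
    for combo in product(*index_ranges):
        labels = [factors[name][i - 1] for name, i in zip(names, combo)]
        cells.append((combo, "_".join(labels)))
    return cells

# Hypothetical 3 x 2 within-subjects design
factors = {
    "time": ["pre", "during", "post"],
    "task": ["easy", "hard"],
}

cells = within_cells(factors)
for index, label in cells:
    print(index, label)
```

Each printed label corresponds to one column of data, so a 3 x 2 design yields the 6 columns described above.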

Interpreting the Output

Running the above steps will generate a large output with multiple sections. In the General Linear Model section: a Descriptive Statistics table that indicates some basic information about the variables you selected, a Test of Sphericity table that indicates Mauchly’s statistic (if p < .05, the sphericity assumption has been violated), and a Tests of Within-Subjects Effects table (the ANOVA table; if p < .05 for the independent variables or interaction, this indicates that at least one of the means is different than the others. This does NOT tell you which means are different from each other, so you need to follow up with pairwise comparisons, multiple comparisons, or t-tests!).

Note. If the Sphericity assumption was violated (p < .05) according to Mauchly’s test, you should read the “Greenhouse-Geisser” lines in the “Tests of Within-Subjects Effects” table when interpreting your ANOVA.

Importantly, when p > .05 in the Tests of Within-Subjects Effects table, we cannot say that there is “no difference” between the level(s); we can only say that we failed to find a difference. If p < .05, we could conclude that there is a difference somewhere between the levels, though we would need to follow this up with additional testing to determine which levels are significantly different from each other. If you find a significant main effect for a factor that only has two levels, you can simply look at the descriptives to know which is higher / lower (as the main effect already tells you that these are different). If your factorial ANOVA reveals p < .05 for a main effect for a factor that has more than two levels, or an interaction, you need to run pairwise post hoc tests to determine the location of the difference(s). To do so, look at the Estimated Marginal Means section showing a Pairwise Comparisons table. If p < .05, this indicates a difference in the means between conditions, which could help explain a significant main effect or interaction in the ANOVA.

Looking at a graph can also help visualize where the differences may be, so be sure to look at the Profile Plots section if you requested a graph.

Mixed factorial ANOVA

A mixed factorial ANOVA (also sometimes called a split-plot design) is a combination of a one-way ANOVA and a repeated measures ANOVA, where at least one of the factors is fully “between” (i.e., each participant / sample is in only one condition for this factor) and at least one of the factors is fully “within” (i.e., each participant / sample is in each condition for this factor). This test assumes that there are no outliers, that your dependent variable is approximately normally distributed in each condition (normality), that each condition has approximately equal variances (homogeneity), and that the variances of the differences between all pairs of within-subject conditions are approximately equal (sphericity).

How to run a mixed factorial ANOVA

Click on Analyze. Select General Linear Model. Select Repeated Measures.
For each of your within-subject variables: enter the name of the variable (as you would like it displayed in the output file) under “Within-Subject Factor Name”, enter the number of levels next to “Number of Levels”, click on “Add”. Once you have finished specifying all variables, click on “Define”.

In the new box, move all of your “within” levels from the left side to the “Within-Subjects Variables” box using the blue arrow. If you have more than one within factor, pay careful attention that the order of the levels matches the ordering in the Within-Subjects Variables box!
Place your between-subject variable(s) in the “Between-Subjects Factor(s)” box using the blue arrow.

If you want a plot, click Plots and move your within-subject variable from “Factors:” to “Horizontal Axis” and move your between-subject variable from “Factors:” to “Separate Lines”, click Add. (NOTE: SPSS will only allow you to plot up to 3 variables at once). Select whether you would like a line chart (best for time series data) or a bar chart (generally the best choice for ANOVA). Click Continue to save your choices.
If you require post-hoc tests or estimated marginal means, make those selections in the “Post Hoc” tab (only between variables with >2 levels) or the “EM Means” tab. For example, to get the pairwise comparisons between the different groups, in the EM Means tab move the main effect (single independent variable) or interaction (independent variables with a “ * ” between them) variables to the “Display Means for:” section. Then, check the “Compare main effects” box and the “Compare simple main effects” box; select Bonferroni from the drop-down menu.
In Options, you can select multiple other helpful pieces of information, such as “Descriptive statistics”, “Estimates of effect size”, and “Homogeneity tests”. Click Continue to save your choices.
Click OK to run the test (results will appear in the output window).

Interpreting the Output

Running the above steps will generate a large output with multiple sections, including the following relevant tables in the General Linear Model section: a Descriptive Statistics table that indicates some basic information about the variables you selected, a Mauchly’s Test of Sphericity table that indicates Mauchly’s statistic (if p < .05, the sphericity assumption has been violated), a Tests of Within-Subjects Effects table (the ANOVA table; if p < .05 for the independent variable(s) or interaction(s), this indicates that at least one of the means is different than the others. This does not tell you which means are different from each other, so you need to follow up!), a Test of Homogeneity of Variances table that indicates Levene’s statistic (if p < .05, the homogeneity assumption has been violated), and a Tests of Between-Subjects Effects table (the ANOVA table; if p < .05 for the independent variable(s) or interaction(s), this indicates that at least one of the means is different than the others. This does not tell you which means are different from each other, so you need to follow up!).

Note: If the Sphericity assumption was violated (p < .05) according to Mauchly’s test, you should read the “Greenhouse-Geisser” lines in the “Tests of Within-Subjects Effects” table when interpreting your ANOVA.

Importantly, when p > .05 in the Tests of Within-Subjects Effects table or the Tests of Between-Subjects Effects table, we cannot say that there is “no difference” between the level(s); we can only say that we failed to find a difference. If p < .05, we could conclude that there is a difference somewhere between the levels, though we would need to follow this up with additional testing to determine which levels are significantly different from each other. If you find a significant main effect for a factor that only has two levels, you can simply look at the descriptives to know which is higher / lower (as the main effect already tells you that these are different). If your factorial ANOVA reveals p < .05 for a main effect for a factor that has more than two levels, or an interaction, you need to run pairwise post hoc tests to determine the location of the difference(s). To do so, look at the Estimated Marginal Means section showing a Pairwise Comparisons table (within factors with > 2 levels OR interactions), and / or a Post Hoc Tests section showing a Multiple Comparisons table (between factors with > 2 levels). For both of these, if p < .05, this indicates a difference in the means between conditions, which could help explain a significant main effect or interaction in the ANOVA.

Looking at a graph can also help visualize where the differences may be, so be sure to look at the Profile Plots section if you requested a graph.