The Descriptives procedure is used to find the mean, measures of dispersion (range, standard deviation, variance, minimum, and maximum), and measures of distribution shape (kurtosis and skewness). The median and mode are not offered by this procedure; they are available through the Frequencies procedure. The Descriptives procedure is best suited to describing continuous variables.
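To make these quantities concrete, the following is a minimal Python sketch of the same summary statistics, computed outside SPSS with the standard library. The data and the `describe` helper are hypothetical and exist only for illustration. The skewness and kurtosis lines use the simple moment-based formulas, so SPSS (which applies sample-size corrections) will report slightly different values for the same data.

```python
import statistics

# Hypothetical sample of a continuous variable (e.g., test scores).
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    # Population moments, used for the simple skewness and kurtosis formulas.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # sample standard deviation
        "skewness": m3 / m2 ** 1.5,               # 0 for a symmetric distribution
        "excess_kurtosis": m4 / m2 ** 2 - 3,      # 0 for a normal distribution
    }

summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A positive skewness value indicates a longer right tail; a negative excess kurtosis indicates a flatter distribution than the normal curve.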
The Frequencies procedure is used to generate statistics (similar to the Descriptives procedure) and graph summaries. Graph options include bar charts (best for categorical variables), pie charts, and histograms (best for continuous variables).
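As a rough illustration of what a frequency table contains, the sketch below counts the categories of a hypothetical categorical variable and prints a text-only stand-in for a bar chart. The data and the `frequency_table` helper are assumptions made for this example; they are not part of SPSS.

```python
from collections import Counter

# Hypothetical categorical variable (e.g., preferred study location).
responses = ["library", "home", "library", "cafe", "home", "library"]

def frequency_table(values):
    """Return (category, count, percent) rows, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(cat, n, 100 * n / total) for cat, n in counts.most_common()]

table = frequency_table(responses)
for category, count, percent in table:
    # A row of '#' characters stands in for a simple bar chart.
    print(f"{category:<8} {count:>2} {percent:5.1f}%  {'#' * count}")
```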
The Explore procedure is used to examine whether a variable is normally distributed, using both statistical tests (Kolmogorov-Smirnov and Shapiro-Wilk) and plots (Q-Q Plot, Stem and Leaf Plot, and Box Plot). You can also run certain descriptive statistics (similar to the Descriptives procedure).
Running the above steps will generate the following output: Descriptives, Tests of Normality (Kolmogorov-Smirnov and Shapiro-Wilk), and several plots. For the normality tests, p > .05 means the test found no significant departure from normality, so normality can be assumed; p < .05 indicates that the assumption of normality has been violated.
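To show what the Kolmogorov-Smirnov test measures, the sketch below computes its D statistic: the largest gap between the cumulative distribution of the data and that of a normal curve with the same mean and standard deviation. The data and the `ks_statistic` helper are hypothetical. This sketch deliberately stops at the statistic and does not compute a p-value, because the p-value needs a correction when the mean and standard deviation are estimated from the same data.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample of a continuous variable.
sample = [4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3, 6.4, 7.0, 7.8]

def ks_statistic(values):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    d = 0.0
    for i, x in enumerate(data, start=1):
        cdf = fitted.cdf(x)
        # Compare against the empirical CDF just after and just before x.
        d = max(d, abs(i / n - cdf), abs(cdf - (i - 1) / n))
    return d

d_stat = ks_statistic(sample)
print(f"K-S D statistic: {d_stat:.3f}")
```

Smaller values of D mean the data track the normal curve more closely.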
The histogram is a visual depiction of the distribution of your selected variable(s). If the histogram approximately resembles a normal distribution (also sometimes called an inverted “U” or a “bell-shaped curve”), then your data are approximately normally distributed. Similarly, the Q-Q plot compares your observed values with the values you would expect if the data were normally distributed. If the points fall approximately along the diagonal reference line, then your data are approximately normally distributed. Note that normality can be formally assessed in the “Explore” procedure using the Kolmogorov-Smirnov and Shapiro-Wilk statistics (mentioned above).
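The idea behind a Q-Q plot can be sketched in a few lines of Python: sort the observed values and pair each one with the value a normal distribution would place at the same position. The data and the `qq_points` helper are hypothetical, and the plotting-position formula `(i - 0.5) / n` is one common convention chosen here as an assumption; SPSS may use a different one.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample of a continuous variable.
sample = [4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3, 6.4, 7.0, 7.8]

def qq_points(values):
    """Pair each sorted observation with its expected normal value."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    # Plotting position (i - 0.5) / n is an assumed convention.
    expected = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(data, expected))

points = qq_points(sample)
for observed, expected in points:
    # Points near the diagonal have observed values close to expected values.
    print(f"observed={observed:.2f}  expected={expected:.2f}  gap={observed - expected:+.2f}")
```

Plotting the observed values against the expected values gives the Q-Q plot; small gaps throughout correspond to points lying close to the diagonal line.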
The boxplot helps you determine whether your selected variable(s) include outliers. Data points that fall beyond the “whiskers” of the plot might represent extreme values and should be assessed further to determine whether they are outliers. A circle with a number indicates an outlier, and a star with a number indicates an extreme outlier; in both cases the number is the row of the respective data. Note that outliers are not formally assessed with statistics in the “Explore” procedure. There are no outliers present in this example.
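The distinction between an outlier and an extreme outlier can be sketched with the common boxplot convention: a point more than 1.5 interquartile ranges (IQR) outside the box is flagged as an outlier, and a point more than 3 IQR outside the box as extreme. The data and the `classify_outliers` helper are hypothetical, and the quartile method used here is an assumption, so the cut-offs may differ slightly from those SPSS draws.

```python
import statistics

# Hypothetical sample; 25 and 60 are planted unusual values.
values = [10, 12, 12, 13, 14, 15, 15, 16, 17, 25, 60]

def classify_outliers(data):
    """Label points beyond 1.5 and 3 interquartile ranges from the box."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    result = {"outlier": [], "extreme": []}
    for x in data:
        distance = max(q1 - x, x - q3)  # how far x lies outside the box
        if distance > 3 * iqr:
            result["extreme"].append(x)
        elif distance > 1.5 * iqr:
            result["outlier"].append(x)
    return result

flags = classify_outliers(values)
print(flags)
```

Flagged values are candidates for closer inspection (for example, checking for data-entry errors), not automatic deletions.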
The Crosstabs procedure is used to create a crosstabulation (also called a contingency table), which shows the relationship between two or more categorical variables. This procedure is often used to run the Chi-Square test and to calculate correlations (see “Inferential Statistics” section for details).
Running the above steps will generate the following output: a crosstab table for the variables you selected, showing how many cases of each combination of categories were present in your data.
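A crosstab is simply a count of every combination of categories. The sketch below builds one from hypothetical paired observations (student level and whether the student attended a workshop); the data and the `crosstab` helper are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical paired observations of two categorical variables.
records = [
    ("undergraduate", "yes"), ("undergraduate", "no"), ("graduate", "yes"),
    ("undergraduate", "yes"), ("graduate", "no"), ("graduate", "yes"),
    ("undergraduate", "yes"),
]

def crosstab(pairs):
    """Count how often each (row, column) combination occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return rows, cols, [[counts[(r, c)] for c in cols] for r in rows]

rows, cols, table = crosstab(records)
print(" " * 14 + "".join(f"{c:>5}" for c in cols))
for label, counts_row in zip(rows, table):
    print(f"{label:<14}" + "".join(f"{n:>5}" for n in counts_row))
```

Each cell of the printed table is the number of cases with that row category and that column category, which is the same information the SPSS crosstab table reports.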
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.