Analyze Data: Stata

Contributors: Lindsay Plater and Riley Oremush

Summarize

The “summarize” command (PDF) reports the number of observations, mean, standard deviation, minimum, and maximum for each variable you specify. It is best suited to describing continuous variables.

How to run summarize

  1. Type the command summarize followed by the names of the variable(s) you’re interested in examining.
  2. For additional information about the variables themselves (e.g., storage type and labels), use the describe command.
  3. If using the Command window in Stata, run the command by pressing Enter. If using a do-file, highlight the command and click the “play” icon.
  4. The results will appear in the Results window.

The Stata interface with the summarize command in a do-file and the output (n, mean, sd, min, max) in the results window.
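
As a minimal sketch of the steps above, the following uses the auto dataset that ships with Stata; the dataset and the variables price and mpg are assumptions chosen for illustration, not part of this guide. The detail option shown here is an addition to the steps above: it extends the summarize output with percentiles, variance, skewness, and kurtosis.

```stata
* Load Stata's bundled example dataset (assumed here for illustration only)
sysuse auto, clear

* Observations, mean, standard deviation, minimum, and maximum
summarize price mpg

* The detail option adds percentiles, variance, skewness, and kurtosis
summarize price, detail

* describe lists storage type, display format, and labels for each variable
describe price mpg
```

Replace price and mpg with the names of your own continuous variables.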

Tabulate

There are many ways to calculate frequencies (i.e., counts) in Stata. One option is the “tabulate” command (PDF). This command can be used to generate contingency tables / crosstabs. These kinds of statistics are best used for categorical variables.

How to run tabulate

  1. Type: tabulate [VARIABLE 1] [VARIABLE 2] to generate a table of frequencies.
    • Press Enter (or, in a do-file, highlight the command and click the “play” icon) to run the command.
    • The results will appear in the Results window.

The Stata interface with the tabulate command in a do-file and the output (a contingency table with counts) in the results window.
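
A short sketch of a frequency table, again assuming Stata's bundled auto dataset and its categorical variables foreign and rep78 (both chosen only for illustration). Supplying one variable gives a one-way frequency table; supplying two gives a contingency table with the first variable in the rows and the second in the columns.

```stata
* Load Stata's bundled example dataset (assumed here for illustration only)
sysuse auto, clear

* One-way table: counts and percentages for a single categorical variable
tabulate foreign

* Two-way table of counts: foreign in the rows, rep78 in the columns
tabulate foreign rep78
```

By default, observations with a missing value on either variable are left out of the two-way table.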

  2. Type: tabulate [VARIABLE 1] [VARIABLE 2], row to add row percentages to the table.
    • Press Enter (or, in a do-file, highlight the command and click the “play” icon) to run the command.
    • The results will appear in the Results window.

The Stata interface with the tabulate command in a do-file and the output (a contingency table with proportions) in the results window.
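
The percentage options can be sketched as follows, using the same assumed example dataset (auto) and variables (foreign, rep78). The column and nofreq options are additions for illustration: column reports percentages within each column instead of each row, and nofreq suppresses the counts so that only percentages are shown.

```stata
* Load Stata's bundled example dataset (assumed here for illustration only)
sysuse auto, clear

* Counts plus row percentages (each row sums to 100)
tabulate foreign rep78, row

* Row percentages only, with the counts suppressed
tabulate foreign rep78, row nofreq

* Counts plus column percentages (each column sums to 100)
tabulate foreign rep78, column
```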
 

Normality

Normality can be assessed visually, such as with a histogram or Q-Q plot, or statistically with a Shapiro-Wilk test. To run the Shapiro-Wilk test, use the “swilk” command (PDF).

How to check normality

  1. Type: swilk [VARIABLE 1]
    • Press Enter (or, in a do-file, highlight the command and click the “play” icon) to run the test.
    • The results will appear in the Results window.

The Stata interface with the swilk command in a do-file and the output (the Shapiro-Wilk statistic for normality) in the results window.
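
A minimal sketch of the Shapiro-Wilk test, assuming Stata's bundled auto dataset and its variable mpg (chosen only for illustration). The null hypothesis is that the variable is normally distributed, so a small p-value suggests a departure from normality. The display line is an optional addition that prints the stored test statistic and p-value.

```stata
* Load Stata's bundled example dataset (assumed here for illustration only)
sysuse auto, clear

* Shapiro-Wilk test of normality for one variable
swilk mpg

* swilk stores its results in r(); print the W statistic and the p-value
display "W = " r(W) ",  p = " r(p)
```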

  2. To generate a histogram:
    • Type: hist [VARIABLE 1]
    • Press Enter (or, in a do-file, highlight the command and click the “play” icon) to run the command.
    • The histogram will open in a separate Graph window.

The Stata interface with the hist command in a do-file and the output (a histogram) in the results window.
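
The visual checks can be sketched as follows, assuming the same example dataset (auto) and variable (mpg). The normal option, the qnorm command, and the graph names mpg_hist and mpg_qq are additions for illustration: normal overlays a normal density curve on the histogram, and qnorm draws the Q-Q plot mentioned at the start of this section.

```stata
* Load Stata's bundled example dataset (assumed here for illustration only)
sysuse auto, clear

* Histogram with a normal density curve overlaid for comparison
* (hist is an accepted abbreviation of histogram)
histogram mpg, normal name(mpg_hist, replace)

* Q-Q plot: points close to the reference line suggest approximate normality
qnorm mpg, name(mpg_qq, replace)
```

Naming each graph keeps both in memory, so the second plot does not overwrite the first.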
 

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.