
Analyze Data: Stata

Contributors: Lindsay Plater and Riley Oremush

The Stata interface

The Stata interface provides a spreadsheet-like data editor window for creating and editing data files. It automatically opens when you start a session. Within the Stata interface window, there are five windows: Command, Results, History, Properties, and Variables. The Command window (bottom middle) is where you type commands; each command and its output are then shown in the Results window (centre screen) above. Each command is also added to a list in the History window (left side) to keep track of your work. The Variables window (top right) lists the variables recognized from your dataset. Finally, the Properties window (bottom right) displays the properties of the dataset and its variables, such as the variable name and the data type.

Overview of the Stata interface.

While Stata can be command-driven by typing code in the Command window, it can also be used in a point-and-click manner using the tabs and buttons at the top left of the screen. There are several useful buttons to be aware of, including: Log, New Do-file Editor, and Data Editor.

The “Log” icon (the blue book) begins a Log-file to track and save the output from the Results window. This ensures replicability of the statistics being done during a Stata session.
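The same logging can also be started from the Command window or a do-file. The following is a minimal sketch, assuming a hypothetical log name (my_session.smcl) and using the auto example dataset that ships with Stata:

```stata
* Start a log file; "my_session.smcl" is a hypothetical file name.
* The replace option overwrites an existing log with the same name.
log using "my_session.smcl", replace

* Output from commands run while the log is open is saved to the file.
sysuse auto, clear
summarize price mpg

* Stop logging and close the file.
log close
```

The log is saved in Stata's working directory unless a full path is given.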

A Stata log, showing an unnamed file ready to save output.

The “New Do-file Editor” icon (the paper and pencil) opens the Do-file Editor window, where you can type and save your commands in a more replicable format. A do-file gives you a place to enter commands (like the Command window) together with a saved record of your work (like a log file). Note that the output of commands run from the do-file is still displayed in the main Stata Results window.

The Stata interface showing the results window and a do-file editor. The right-most button (labelled run) of the do-file editor is circled, indicating that it will run all commands in the do-file, with the code's output appearing in the results window.
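As an illustration, a short do-file might look like the sketch below. The file name (example.do) is hypothetical, and the commands use the auto example dataset that ships with Stata rather than any dataset from this guide:

```stata
* example.do: a hypothetical do-file
* Lines beginning with an asterisk are comments and are not executed.

* Load an example dataset, replacing any data currently in memory.
sysuse auto, clear

* List the variables in the dataset and their storage types.
describe

* Show basic descriptive statistics for two variables.
summarize price mpg
```

Once saved, a do-file like this can be run with the run button in the Do-file Editor, or by typing do "example.do" in the Command window.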

The “Data Editor” icons (the spreadsheets) display the data, and have two versions: the Data Editor (Edit) icon (spreadsheet with a pencil) opens an editable spreadsheet of your data, while the Data Editor (Browse) icon (spreadsheet with a magnifying glass) opens a non-editable spreadsheet of your data. The Variables tab (right side) provides a description of what is in the selected column. Columns often represent individual variables, and can include numbers or characters (text). Rows often represent individual cases / observations / samples.

The Stata data editor in browse mode, showing a dataset with region, country, popgrowth, and other variables.
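The two Data Editor views can also be opened by typing commands. A brief sketch, again assuming the auto example dataset:

```stata
* Load an example dataset so there is something to display.
sysuse auto, clear

* Open the Data Editor in browse mode (read-only).
browse

* Open the Data Editor in edit mode (cells can be changed).
edit

* Print a summary of each variable (name, storage type, label)
* in the Results window.
describe
```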

Data entry

You can enter data directly in the Data Editor (Edit) window. You can enter data in any order.

  1. In the data editor, select a cell.
  2. Enter the data value. The value is displayed in the cell editor at the top of the data editor window.
  3. Press Enter, or select another cell, to save the value you just typed.
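For small datasets, values can also be typed in as code with the input command instead of through the Data Editor. This is a sketch of that alternative, not the point-and-click steps above; the variable names and values are made up for illustration:

```stata
* Remove any data currently in memory.
clear

* Declare three hypothetical variables: a string of up to 10
* characters, followed by two numeric variables.
input str10 name age height_cm
"Alex" 34 172
"Sam"  29 165
"Jo"   41 180
end

* Display the observations that were just entered.
list
```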

Importing data

In addition to files saved in Stata format, you can open spreadsheets (Excel), databases (Access, dBASE), tab‐delimited files, and other types of ASCII text files without converting them to an intermediate format.

Opening a Stata file (*.dta)

  1. Click on File. Select Open. Select Data.
  2. To view all files, in the “Files of Type” drop-down menu, select the “All Files (*.*)” option.
  3. In the “Open File” dialog box, select the file you want to open.
  4. Click Open.
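The equivalent command is use. In this sketch, mydata.dta is a hypothetical file assumed to be in Stata's working directory:

```stata
* Open a Stata dataset; "mydata.dta" is a hypothetical file name.
* The clear option discards any data currently in memory.
use "mydata.dta", clear

* Confirm what was loaded.
describe
```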

Importing an Excel or CSV file (*.xls or *.csv)

  1. Click on File. Select Import. Select the type of file you want to import (for example, an Excel spreadsheet or delimited text data).
  2. In the “Import excel” box, select “Browse” to choose the Excel file to import.
  3. Check how the data will look once imported in the “Preview” section. If the variable names appear in the first row of data, select the “Import first row as variable names” option.
  4. Click Open.
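These menu steps correspond to the import excel and import delimited commands. The file names and sheet name below are hypothetical placeholders:

```stata
* Import an Excel file. "mydata.xlsx" and "Sheet1" are hypothetical.
* firstrow treats the first row as variable names.
import excel using "mydata.xlsx", sheet("Sheet1") firstrow clear

* Import a CSV file. "mydata.csv" is hypothetical.
* varnames(1) reads variable names from the first row.
import delimited using "mydata.csv", varnames(1) clear
```

Each command replaces the data in memory (because of clear), so in practice only one of the two would be run for a given file.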

Compute variables

To compute variables in Stata, use the “generate” (or “gen”) command followed by a new variable name, an equals sign, and an expression. This computes values based on numeric transformations of other variables. For example, the “gen” command can be used to:

  • add, subtract, divide, multiply, or square the values in one or more columns
  • convert measurements (e.g., weight from pounds to kilograms)
  • compute a value only for observations that meet a specified condition

The Stata interface with the gen command in a do-file and the output (here, a log-transformed variable) in the results window.
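The uses listed above can be sketched with the auto example dataset that ships with Stata. The new variable names are chosen for illustration only:

```stata
* Load an example dataset with price, weight (in pounds), and foreign.
sysuse auto, clear

* Arithmetic: square the values of an existing variable.
generate price_sq = price^2

* Unit conversion: weight from pounds to kilograms.
generate weight_kg = weight * 0.45359237

* Transformation: natural log of price.
generate ln_price = ln(price)

* Conditional: copy price only for foreign cars; all other
* observations are set to missing.
generate price_foreign = price if foreign == 1
```

Note that generate will not overwrite an existing variable; to change the values of a variable that already exists, Stata provides the separate replace command.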


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.