http://uit.unt.edu/research

Data Science and Analytics

MODULE 3

IV. SAS data handling - Data Step

Normally, SAS users do not pay attention to what type of files SAS uses in a SAS session. This section distinguishes the several types of files that SAS can handle, so that you can use each type to its advantage. The following flow chart illustrates how SAS data sets are processed:

Data entered in the SAS DATA step after the DATALINES statement (CARDS is an older alias for the same statement) need to be converted into a SAS data file before SAS can use them. The DATA step takes care of this conversion.
 

Example 1:

Data;
input x y z @@;
datalines;
1 2 3 4 5 6 7 8 9
;
run;

This sample SAS program creates a temporary data set, using the DATALINES statement to read the in-line data. The DATA statement begins the step and defines a temporary data set; because no name is given, SAS assigns a default name (DATA1, DATA2, and so on). The INPUT statement names the variables to read, and the trailing @@ lets SAS read several observations from a single data line. The DATALINES statement tells SAS that the data lines follow. The RUN statement ends the DATA step and submits it for processing.
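As a minimal sketch of the same program with an explicit data set name (WORK.XYZ is a hypothetical name chosen for illustration), the following reads the nine in-line values and prints them. Because of the trailing @@, the three variables are filled repeatedly from the one data line, which yields three observations.

```sas
/* XYZ is a hypothetical data set name used for illustration only */
data work.xyz;
  /* the trailing @@ holds the input line so x y z are read from it repeatedly */
  input x y z @@;
  datalines;
1 2 3 4 5 6 7 8 9
;
run;

/* Print the result: three observations of x, y and z */
proc print data=work.xyz;
run;
```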

External Files

The most common type of data is sometimes referred to as an External File, Raw File, or even a Text File. These files share the same characteristics: they are made up of numbers and/or characters, and they can be processed by other programming languages as well as by SAS. There are two ways to incorporate this kind of file into a SAS program. The first, and the most commonly used, is to put the data after a DATALINES (or CARDS) statement, as in the previous example. The second is to refer to the location of the data file in the SAS program. The second method is more efficient because it keeps your SAS program at a manageable size, especially when your data set has over a thousand observations. The following SAS program shows how to use the second method.

Example 2:

FILENAME DATAIN 'A:\COUNTRY.DAT';
DATA COUNTRY;
  INFILE DATAIN;
  INPUT DEC 1 ID 2-4 NAME $CHAR26. SSCODE 31-33
        CONTIN $ 34-35 DODEV 36 POPULATE 37-43
        AREA 44-49 GNP 50-56 MILEXPED 57-64 .1
        PEDEXPED 65-71 .1;
PROC PRINT DATA=COUNTRY;
RUN;

The FILENAME statement tells SAS to use DATAIN as a file reference (fileref) for the external file 'A:\COUNTRY.DAT'. The INFILE statement tells the DATA step to read its input from the file that DATAIN refers to, that is, 'COUNTRY.DAT' on drive A:.
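Before running the DATA step, it can help to confirm that the fileref actually points at an existing file. The following is a small sketch using the path from Example 2; the FEXIST function and the LIST option of the FILENAME statement only report information and do not read the data.

```sas
/* Associate the fileref DATAIN with the external file from Example 2 */
filename datain 'A:\COUNTRY.DAT';

/* FEXIST returns 1 if the file behind the fileref exists and 0 otherwise */
%put Fileref DATAIN found: %sysfunc(fexist(datain));

/* Write the attributes of the fileref to the Log window */
filename datain list;
```

If the Log shows 0, check the drive letter and file name before submitting Example 2.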

In SAS 9, supported data file formats include dBASE files, Lotus 1-2-3 and Microsoft Excel spreadsheets, and Microsoft Access tables. The Import Wizard guides you through creating a data set from files in these formats. To import a file, click File on the menu bar and select Import Data, as in the following:

The Import Wizard will guide you through the importation process. Once the import succeeds, you should see the output window pop up and contain the complete data (N = 141 observations). You can download the 'country.dat' data file here if you would like to import it for practice.

After reading in the data, you can check the data under Libraries in the Explorer window.  Double-click the Libraries icon and open the Library window.  The data file just created is in the WORK library.
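The same check can be made from the editor window. A minimal sketch, assuming the data set was created in the WORK library:

```sas
/* List the members of the temporary WORK library */
proc datasets library=work;
quit;
```

The directory listing produced by this step should include the data set you just created.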

Exercise A: Importing a PASW/SPSS data file (ExampleData1.sav).

First, download and save the data to your flash drive using the link above.

Second, go to File, Import Data..., as shown in the above screen shots, to open the Import Wizard. Under the Standard Data Source heading, select SPSS file (*.sav) from the data source drop-down menu and click the Next button. Browse to find the ExampleData1 file on your flash drive, select it, and click the Next button. Under Library, select WORK if it is not already selected. Under Member, simply type EXAMPLE1 and then click the Next button. Select your flash drive, type ExampleData1 in place of the *, and notice the .sas file extension. Click the Save button. Then click the Finish button.

Now there are a few ways to ensure the data has been successfully imported. First, the Log window should display Notes indicating the successful importation of the data (e.g., the number of observations and variables). Second, in the Explorer window, you can double-click the Libraries icon, then double-click the Work icon and see the newly created Example1 file. Third, you can run a print procedure to view the data using the following simple syntax in the editor window:

PROC PRINT DATA=EXAMPLE1;
RUN;
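The wizard steps in Exercise A correspond roughly to a PROC IMPORT step. The following is a sketch, not necessarily the exact code the wizard generates. It assumes the flash drive is mounted as E:\ (a hypothetical drive letter) and that your installation supports DBMS=SAV, which typically requires SAS/ACCESS Interface to PC Files.

```sas
/* Hypothetical path: replace E:\ with the drive letter of your flash drive */
proc import datafile="E:\ExampleData1.sav"
            out=work.example1   /* library WORK, member EXAMPLE1 */
            dbms=sav            /* SPSS save file */
            replace;            /* overwrite EXAMPLE1 if it already exists */
run;
```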

Exercise B: Importing an Excel data file (ExampleData2.xls).

First, download and save the data to your flash drive using the link above.

Second, make sure you click on the editor window to activate it; otherwise you will not be able to get to the Import Wizard. Go to File, Import Data..., as shown in the above screen shots, to open the Import Wizard. Make sure Microsoft Excel Workbook is selected and click the Next button. Browse to find the file on your flash drive, select it, click Open, then OK. Then click on the Options button to review the options. Given the nature of this particular example (shown below), the default options will not need to be changed.

Click OK and then Next. As the destination, choose the library WORK and type EXAMPLE2 as the member name, then click Next. Click Browse to find your flash drive and type ExampleData2 in place of the * to name the file and choose where to save it. This file holds the SAS import statements that the wizard generates, so make sure the .sas extension is present, and then click Save. Now click Finish.

Again, there are a few ways to ensure the data has been successfully imported. First, the Log window should display a Note indicating the successful importation of the data. Second, in the Explorer window, you should be able to see the Example2 file inside Libraries and then Work. Third, you can run a print procedure to view the data using the following simple syntax in the editor window:

PROC PRINT DATA=EXAMPLE2;
RUN;
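For reference, an Excel import can also be written as a PROC IMPORT step. This is only a sketch under stated assumptions: E:\ is a hypothetical drive letter, DBMS=XLS is one of several engines that can read .xls files (the wizard may choose a different one), and the first row of the sheet is assumed to hold the variable names.

```sas
/* Hypothetical path: replace E:\ with the drive letter of your flash drive */
proc import datafile="E:\ExampleData2.xls"
            out=work.example2   /* library WORK, member EXAMPLE2 */
            dbms=xls            /* assumed engine for the .xls format */
            replace;
  getnames=yes;                 /* take variable names from the first row */
run;
```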

You will be able to see the distinction between Example 1 and Example 2 because Example 2 contains missing values (e.g., blank cells).
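Missing values can also be counted rather than spotted by eye. The sketch below builds a tiny hypothetical data set (WORK.MISSDEMO, with made-up values) so that it runs on its own; to check the imported file, replace WORK.MISSDEMO with WORK.EXAMPLE2 in the PROC MEANS step.

```sas
/* Hypothetical demo data: a period stands for a missing numeric value */
data work.missdemo;
  input a b;
  datalines;
1 2
. 4
5 .
;
run;

/* N counts non-missing values and NMISS counts missing values */
proc means data=work.missdemo n nmiss;
run;
```

For the demo data, each variable has two non-missing values and one missing value.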

 

V. SAS Files created during importation

SAS uses a special data format during data processing. This format is called a SAS data file or system file. If the data file you tell SAS to use is not a SAS File, SAS converts it to a SAS File before it starts processing the data set. SAS Files have special characteristics that make them more convenient and efficient for SAS to use. There are two types of SAS files: SAS Data Sets (*.sas7bdat) and SAS Catalogs (*.sas7bcat). The most commonly used is the SAS Data Set. In a SAS Data Set, variable names, variable labels, and variable formats are recorded together with the variable values.

A SAS File name is somewhat different from other types of data file names. A complete SAS file name consists of two parts separated by a period, for example PROJECT1.FITNESS. The first part is called the first-level name or libref; it identifies the directory or library where the file is saved. The second part, the second-level name, identifies the specific file in that directory or library. Anyone can create a SAS Data Set from a regular file. The following Example 3 shows how to do this.
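To see the information stored alongside the values, PROC CONTENTS can be run on any SAS Data Set. The sketch below uses a one-row hypothetical data set (WORK.META) with a label and a format attached to its only variable.

```sas
/* Hypothetical one-row data set with a label and a format on X */
data work.meta;
  x = 1234.5;
  label x = 'Example value';
  format x comma10.1;
run;

/* Lists the variable name, type, length, format and label */
proc contents data=work.meta;
run;
```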
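A short sketch of how the two-level name works in practice: when only a one-level name is given, SAS stores the data set in the temporary WORK library by default, so FITNESS and WORK.FITNESS refer to the same file. The variables and values here are hypothetical.

```sas
/* A one-level name is stored in the WORK library by default */
data fitness;
  input id weight;
  datalines;
1 150
2 172
;
run;

/* The same data set referenced by its full two-level name */
proc print data=work.fitness;
run;
```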

Example 3:

TITLE 'SAS SAMPLE - COUNTRY DATA';
DATA WORK.COUNTRY;
  ARRAY MSG GNP MILEXPED PEDEXPED;
  infile 'A:\country.dat';
  input dec 1 id 2-4 name $char26. sscode 31-33
        contin $ 34-35 dodev 36 populate 37-43
        area 44-49 gnp 50-56 milexped 57-64 .1
        pedexped 65-71 .1;
  label name = "COUNTRY NAME"
        CONTIN = 'CONTINENT'
        DODEV = 'DEGREE OF DEVELOPMENT'
        GNP = 'GNP IN MILLIONS OF DOLLARS'
        MILEXPED= 'MILITARY EXPENDITURE IN MILLIONS OF $'
        PEDEXPED= 'PUB. EDUCATION EXPENDITURE IN MIL. $';
  DO OVER MSG;
    IF MSG= 9999999 OR MSG = 999999.9 OR
       MSG=99999.9 THEN MSG=.;
  END;
RUN;
PROC PRINT DATA=WORK.COUNTRY;
RUN;
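The DO OVER loop in Example 3 relies on an implicit array, an older syntax that SAS still accepts. As a sketch of the same recoding with an explicit array, the following uses a small hypothetical data set (WORK.DEMO, with made-up values) in place of country.dat so that it can be run without the external file.

```sas
/* Hypothetical in-line data standing in for country.dat */
data work.demo;
  input gnp milexped pedexped;
  /* explicit array over the three variables that use 9-filled missing codes */
  array msg{3} gnp milexped pedexped;
  do i = 1 to dim(msg);
    /* recode the 9-filled codes to the SAS missing value */
    if msg{i} in (9999999, 999999.9, 99999.9) then msg{i} = .;
  end;
  drop i;   /* the loop index is not needed in the output data set */
  datalines;
1200 35.5 40.2
9999999 999999.9 99999.9
;
run;

proc print data=work.demo;
run;
```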

As written, Example 3 stores the data set in WORK, the temporary library, so it is deleted when the SAS session ends. To keep it, use a LIBNAME statement to associate a libref with a directory such as A:\MYDATA, and use that libref as the first-level name in the DATA statement. After the job has been executed, you will have a SAS Data Set saved as A:\MYDATA\country.sas7bdat. Retrieving a SAS Data Set is easy because you do not have to tell SAS the variable names, formats, labels, or locations again.

In a SAS for Windows session, you can have as many DATA steps as you want, and you can use the LIBNAME statement as often as you need to point SAS for Windows to different SAS data directories. When a program uses many SAS data files, SAS for Windows lets you keep track of the files and their variables.
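A minimal sketch of saving the data set permanently, assuming WORK.COUNTRY already exists from Example 3 and that the directory A:\MYDATA has been created beforehand. The libref MYLIB is a hypothetical name; any valid libref of up to eight characters would do.

```sas
/* MYLIB is a hypothetical libref and A:\MYDATA must already exist */
libname mylib 'A:\MYDATA';

/* Copy the temporary data set into the permanent library */
data mylib.country;
  set work.country;
run;

/* In a later session, reassign the libref and use the data set directly */
proc print data=mylib.country;
run;
```

No INPUT or LABEL statements are needed in the later session, because the variable names, formats, and labels are stored in the data set itself.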

SAS for Windows has LIBNAME, DIRECTORY, and VARIABLES windows. The LIBNAME window shows which SAS data libraries are assigned in the session. The DIRECTORY window displays the SAS data files in a SAS data library or directory. The VARIABLES window lists the SAS variables in each SAS data file. To open the LIBNAME window, type LIB in the command box; a list of libraries or directories is shown in a new window (the LIB window). You can also go to the DIR and VAR windows directly by typing DIR and VAR, respectively, at the command prompt in any SAS for Windows Display Manager window. By default, these windows display the current directory, which is the WORK directory. To display a different directory, type its name at the top of the window. You can do the same with the VAR window.
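The same information is available from code. The sketch below lists the assigned librefs in the Log and then queries the DICTIONARY tables through PROC SQL; it assumes a data set named COUNTRY exists in WORK, as in Example 3.

```sas
/* Write every assigned libref and its location to the Log */
libname _all_ list;

proc sql;
  /* Data sets in the WORK library, comparable to the DIR window */
  select memname, nobs
    from dictionary.tables
    where libname = 'WORK';

  /* Variables of one data set, comparable to the VAR window */
  select name, type, length, label
    from dictionary.columns
    where libname = 'WORK' and memname = 'COUNTRY';
quit;
```

Library and member names are stored in uppercase in these tables, so the comparison values are written in uppercase.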

 

Contact Information

Jon Starkweather, PhD

Jonathan.Starkweather@unt.edu

940-565-4066

Richard Herrington, PhD

Richard.Herrington@unt.edu

940-565-2140


Last updated: 2018.11.15 by Jon Starkweather.
