SAS Data Sets

Creating a SAS Data Set

If you are handling a large data set, it is advisable to create a SAS data set and work with that. Under Unix systems, a SAS file is a specially structured Unix file. A SAS file is readable only by SAS under the operating system on which it was created, and it cannot be edited with a text editor such as emacs or vi. SAS data sets are referenced with a one- or two-level name. The two-level name is of the form libref.member-name, where "libref" refers to the directory in which the data set is to be stored or read, and "member-name" is the name of the SAS data file to be created or read. The one-level name is of the form member-name (the libref is omitted). In this case, SAS stores the file in the temporary WORK data library, which is deleted when the SAS session ends.
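
The difference between the two naming forms can be sketched as follows. The libref mylib, the data set name scores, and the two data lines are hypothetical, and 'pathname' stands for a directory of your choice.

	LIBNAME mylib 'pathname';

	/* One-level name: stored as WORK.scores and lost at the end of the session */
	DATA scores;
		INPUT id score;
		DATALINES;
	1 85
	2 90
	;
	RUN;

	/* Two-level name: stored permanently in the directory behind mylib */
	DATA mylib.scores;
		SET scores;
	RUN;

Only the second DATA step leaves a file behind in 'pathname' once the job has finished.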

Suppose we want to create a SAS data set from the program file (clas1.sas) we created earlier, keeping the entire DATA step and leaving out the PROC steps. Note the first two lines of the following program: the LIBNAME statement references the directory in which the SAS file is to be created, and the two-level name links the member-name (e.g., anxiety) to the libref (e.g., try1).

	LIBNAME try1 'pathname';
	DATA try1.anxiety;
		INFILE 'clas.dat';
		INPUT id 1-2 sex $ 3 exp 4 school 5
			(c1-c10) (1.) (m1-m10) (1.) (mathscor compscor) (2.);

		/* Recode 99 as a missing value */
		IF mathscor=99 THEN mathscor=.;
		IF compscor=99 THEN compscor=.;

		/* Reverse the scoring of selected items (x becomes 6-x) */
		c3=6-c3; c5=6-c5; c6=6-c6; c10=6-c10;
		m3=6-m3; m7=6-m7; m8=6-m8; m9=6-m9;

		/* Scale totals */
		compopi = SUM(OF c1-c10);
		mathatti = m1+m2+m3+m4+m5+m6+m7+m8+m9+m10;

		LABEL id='STUDENT IDENTIFICATION' sex='STUDENT GENDER'
			exp='YRS OF COMP EXPERIENCE' school='SCHOOL REPRESENTING'
			mathscor='SCORE IN MATHEMATICS'
			compscor='SCORE IN COMPUTER SCIENCE'
			compopi='TOTAL FOR COMP SURVEY'
			mathatti='TOTAL FOR MATH ATTI SCALE';
	RUN;
	ENDSAS;

Replace "pathname" with the appropriate directory. When the job is executed (in any execution mode), a SAS data set named anxiety.sas7bdat is stored in that directory. On certain Unix platforms the filename extension (e.g., sas7bdat) may differ.
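
One way to confirm that the data set was written is to list its contents. A minimal sketch, assuming the libref try1 and the placeholder 'pathname' used in the program above:

	LIBNAME try1 'pathname';

	/* Describe the stored data set without printing the data */
	PROC CONTENTS DATA=try1.anxiety;
	RUN;

PROC CONTENTS reports the number of observations together with the name, type, and label of each variable.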

Accessing a SAS Data Set

To read an existing SAS data set, use a two-level name of the form libref.member-name.

The following example illustrates how to access the SAS data set created above (anxiety.sas7bdat) and run some SAS procedures on it. (Note that try2 is given as the libref and anxiety as the member-name.)


	LIBNAME try2 '/usr1/jdoe';

	/* Copy the permanent data set into a temporary working copy */
	DATA anxiety2;
		SET try2.anxiety;
	RUN;

	PROC TTEST DATA=anxiety2;
		CLASS sex;
		VAR compopi;
		TITLE 'T-TEST';
	RUN;

	PROC CORR DATA=anxiety2;
		VAR compscor mathscor compopi mathatti;
	RUN;

	PROC REG DATA=anxiety2;
		MODEL compopi=mathscor mathatti compscor;
	RUN;

	ENDSAS;

The above job reads the SAS data set anxiety.sas7bdat from the specified directory and runs the requested SAS procedures. Refer to SAS Language and SAS Companion for the UNIX Environment for further information on SAS data files.

Reading Compressed Files

If your data are ordinary ASCII-text files, you can compress them so that they take up considerably less space (a 50% reduction or more is possible) and have SAS uncompress them automatically when you execute a command file. Note that SAS cannot read compressed SAS data sets or compressed SAS transport files.

To compress a data file, use the Unix compress command, which employs the Lempel-Ziv compression method. At the Unix prompt, type:

compress filename

Replace "filename" with the name of the data file you wish to compress. This creates a new file with the extension ".Z". For example, if you compress a file called "test.dat," a compressed file called "test.dat.Z" would be created, replacing the original file (test.dat).

To read this file into SAS without having to uncompress it beforehand, you should add the following SAS command to the command file:

FILENAME alias PIPE 'zcat filename';

Replace "alias" with the file handle (a nickname for the data file) you wish to use, which can be up to eight characters long. Replace "filename" with the name of the compressed file, including the ".Z" extension. You must also include the path if the file is somewhere other than the default directory (i.e., the directory from which you launch SAS).

For example, suppose you have a data file called "test.dat" containing 10 variables, and you want to compress it and then use it in SAS. First compress the file at the Unix prompt:

compress test.dat

Next, create a SAS command file with the following lines:


	/* FILENAME is a global statement, so it is placed before the DATA step */
	FILENAME test PIPE 'zcat test.dat.Z';

	DATA test1;
		INFILE test;
		INPUT v1-v10;
	RUN;

Finally, execute your SAS command file normally. SAS generates a data set called test1.
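
If the compressed file is not in the directory from which you launch SAS, the path goes inside the quoted command. The following is only a sketch: it assumes the file sits in the directory /usr1/jdoe (used here purely for illustration) and adds a PROC PRINT step to check the first few observations.

	FILENAME test PIPE 'zcat /usr1/jdoe/test.dat.Z';

	DATA test1;
		INFILE test;
		INPUT v1-v10;
	RUN;

	/* Print the first five observations to check that the data were read */
	PROC PRINT DATA=test1 (OBS=5);
	RUN;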

SAS Transport Libraries

SAS also handles transport format data sets. A transport format file is created when you want to move your SAS data set to another operating system. Likewise, if you are bringing SAS data sets from another operating system onto the Unix platform, you have to use a transport file. Note that the various Unix platforms differ in the way they create SAS data sets, so a SAS data set created under one Unix system may not be readable under another. For example, a SAS data set created under SunOS (e.g., Steel) is not readable by SAS under IBM AIX (Research SP node aries05). In such cases, create a transport file to move the data.

Suppose you want to create a SAS transport format file from the SAS data file anxiety.sas7bdat. Define one libref with a LIBNAME statement to read the SAS data set, and another to write the SAS transport file. The XPORT keyword (the transport engine) on the second LIBNAME statement indicates that we want to create a transport format file. A transport format file always has a fixed record length of 80 and a block size of 8000. The SELECT statement may be omitted if there is only one SAS file stored in the directory or if you want to convert all the members of a single SAS data library into a SAS transport format library.


	LIBNAME test1 '/usr1/jdoe';
	LIBNAME test2 XPORT '/usr1/jdoe/trans.xpt';

	PROC COPY IN=test1 OUT=test2;
		SELECT anxiety;
	RUN;

Once the job is executed, a file called trans.xpt will be created and stored in the directory specified.

To read a transport format file (e.g., trans.xpt) stored on the disk and create a SAS data file, as in the example given above, define two libnames (one for reading, and one for writing) as in:


	LIBNAME test1 '/usr1/jdoe';
	LIBNAME test2 XPORT '/usr1/jdoe/trans.xpt';

	PROC COPY IN=test2 OUT=test1;
	RUN;

When there is more than one file stored in a transport library, a SELECT statement may be used to access the file of your choice. If you want to read or write SAS transport files that involve a format library, use the CIMPORT and CPORT procedures instead of the COPY procedure. See SAS Language Reference V. 8 and SAS Procedures Guide V. 8 for information.
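
As a rough sketch of the CPORT/CIMPORT route, the job below assumes the same directory as the examples above and a hypothetical transport file name, trans.cpt. The two steps would normally run in separate jobs, the first on the source system and the second on the target system.

	LIBNAME test1 '/usr1/jdoe';

	/* Source system: write the members of test1 (data sets and catalogs) to one transport file */
	PROC CPORT LIBRARY=test1 FILE='/usr1/jdoe/trans.cpt';
	RUN;

	/* Target system: restore the members from the transport file into the library */
	PROC CIMPORT LIBRARY=test1 INFILE='/usr1/jdoe/trans.cpt';
	RUN;

A file written by PROC CPORT must be read with PROC CIMPORT; it cannot be read through the XPORT engine, so the two methods should not be mixed.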

