Wednesday, October 18, 2006

Modifying Data Sets and Variables Efficiently with PROC DATASETS

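Data set and variable attributes can be modified using a data step or PROC DATASETS. A data step reads and rewrites every observation, whereas PROC DATASETS changes only the descriptor (header) portion of the data set, so it is much faster. Here are a few examples using each method, with the time each took to execute.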

For these examples, a data set (sasuser.grads05) with 239,862 observations was created.

Modifying a Data Set
1. Using a data step:
Data sasuser.grads05 (label='2004-05 Graduates');
Set sasuser.grads05;
Run;

Real time: 1.45 seconds
CPU time: 0.18 seconds

2. Using PROC DATASETS:
Proc datasets lib=sasuser nolist;
Modify grads05 (label='2004-05 Graduates');
Run;
Quit;

Real time: 0.01 seconds
CPU time: 0.01 seconds

Modifying Variables
1. Using a data step:
Data sasuser.grads05;
Set sasuser.grads05(rename=(sex=gender));
Format dtupdate mmddyy10.;
run;

Real time: 2.91 seconds
CPU time: 0.19 seconds

2. Using PROC DATASETS:
Proc datasets lib=sasuser nolist;
Modify grads05;
Rename sex=gender;
Format dtupdate mmddyy10.;
Run;
Quit;

Real time: 0.02 seconds
CPU time: 0.02 seconds
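
To confirm that the changes took effect, the descriptor portion of the data set can be inspected with PROC CONTENTS. This is a minimal sketch, assuming the sasuser.grads05 data set from the examples above; it was not part of the original timing comparison.

```sas
/* List the data set's descriptor information: label, variable names, formats */
Proc contents data=sasuser.grads05;
Run;
```

The output should show the data set label, the variable gender in place of sex, and the MMDDYY10. format on dtupdate.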


1 Comment:

At 2:11 PM, Anonymous said...

You need to compare apples to apples: when modifying or updating in the data step, you need to compare PROC DATASETS with the MODIFY and UPDATE versions of the data step. Howard Schreier has a nice paper on this, and there is a short introduction to it in the SAS Global Forum paper by Paul Choate and Toby Dunn.

 
