Wednesday, October 18, 2006

Merging Data Sets Faster

When merging two SAS data sets, many of us use one of these two methods to keep only the observations where both input data sets contribute:

data test1;
merge step1(in=a)
step2(in=b);
by district studentid;
if a=1 and b=1;
run;

or

data test1;
merge step1(in=a)
step2(in=b);
by district studentid;
if a and b;
run;

This method executes faster (and saves a few keystrokes):
data test1;
merge step1(in=a)
step2(in=b);
by district studentid;
if a*b;
run;

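This shortened IF statement works because the IN= variables a and b are always 1 (true) or 0 (false). If either one is 0, the product a*b is 0, which SAS treats as false, so the observation is dropped. The product is 1 (true) only when both data sets contribute.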
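
To see the product in action, here is a minimal sketch using two tiny made-up data sets. The data values and the variables score1, score2, and both are assumptions for illustration only; step1 and step2 stand in for your own data sets, already sorted by district and studentid.

/* Hypothetical sample data, sorted by district studentid */
data step1;
input district studentid score1;
datalines;
1 101 80
1 102 75
2 201 90
;
run;

data step2;
input district studentid score2;
datalines;
1 101 88
2 201 93
2 202 70
;
run;

data test1;
merge step1(in=a)
step2(in=b);
by district studentid;
both = a*b; /* 1 only when a=1 and b=1, otherwise 0 */
if both; /* same effect as: if a*b; */
run;

Only the two students present in both data sets (district 1, studentid 101 and district 2, studentid 201) should end up in test1.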

Comments:

  1. Anonymous, 7:20 AM

    This is great. Thanks, Linda.

  2. Anonymous, 3:38 PM

    Just tested with a million-row pseudo data set: no practical difference.

  3. Anonymous, 6:26 AM

    Hey Anonymous @ 3:38 PM, you can't just walk past a bowl of cornflakes, can you?
