Contrasting Programming Techniques For Summarizing Voluminous SAS .

Transcription

Contrasting programming techniques for summarizing voluminous SAS outputusing the SAS Output Delivery System (ODS) (PROC FREQ as an example)Stuart Long, WestatLawrence Park, WestatABSTRACTMETHOD 1: MACRO VARIABLE ARRAYSSAS ODS provides programmers with the ability to extractselected information from a procedure and store it in datasets.Such datasets can then be combined to summarize the resultsfrom numerous procedures. The SAS Macro facility can be usedto execute and extract information from repetitively called SASprocedures.Have you ever wished you could run code like this?“Macro Variable Arrays” can simplify the extraction of informationfrom SAS procedures when identical or similar code is executedrepeatedly on a multitude of datasets and/or variables. Thispaper describes Macro Variable Arrays and contrasts these withother programming methods to obtain summary datasets fromODS output.Although PROC FREQ is used in this paper, these techniquescan serve as a blueprint for extracting and summarizing outputfrom other SAS procedures. This is an advanced tutorial intendedfor SAS programmers knowledgeable with the SAS Macro facility.INTRODUCTIONSAS can generate an enormous volume of output, such as whenprocedures are run repeatedly in analyzing a large number ofvariables. Often, only a selected, and possibly small, portion ofthe output is of immediate interest. The SAS Output DeliverySystem (ODS) provides a convenient mechanism for extractingstatistics for customized summarization. ODS can be used inconjunction with macro processing to automate the generationand presentation of summary data from SAS procedures. Theinformation presented in this paper serves as a blueprint forsummarizing statistics produced from SAS procedures executediteratively.We use PROC FREQ as the example for this paper. The datasetused consists of forty medical conditions, identified by the range“medcond1” – “medcond40”, and 23 symptoms, identified by therange “dizziness” -- “absentminded”. All variables aredichotomous, where 0 No and 1 Yes. This assignment of 0 and1 as the dichotomous values is necessary for proper execution ofthe macros discussed in this paper. Our task is to produce asummary dataset of frequencies, percentages, chi squareprobabilities and odds ratios with confidence limits for 2x2 tablescrossing each outcome with each symptom. Once this summarydataset has been produced, the user can create a summary tableusing whatever reporting method within the SAS System that theyprefer, or the data can be exported for other processing.Consider the following code:PROC FREQ DATA mydata;TABLES (dizziness--absentminded)*(medcond1-medcond40)/ CHISQ RELRISK;RUN;The above PROC FREQ will produce 920 2x2 crosstabs andassociated statistics, and would generate 920 pages of output inthe SAS listing. By using SAS ODS iteratively within a macro, wereduce the output to a 20 page summary report. In this paper wepresent 3 different methods to do this using: 1) arrays of macrovariables, 2) call execute, and 3) by variable processing in PROCFREQ. Each of these is described in detail in the sections thatfollow, and the appendices provide the full code for each of themethods.ARRAY symptom {*} dizziness--absentminded;ARRAY disease {*} medcond1-medcond40;DO i 1 TO DIM(symptom);DO j 1 TO DIM(disease);PROC FREQ DATA mydata;TABLES symptom(i)*disease(j)/ CHISQ RELRISK;RUN;END;END;Since arrays must be defined and executed within a data step, wesee that the above code is syntactically incorrect. However, bydefining two sets of Macro variables, we can achieve the abovelogic within the syntax requirements of the SAS system.First, let’s dissect the structure of a data step array.DATA myset;SET mydata;ARRAY symptom {*} dizziness--absentminded;total symptoms 0;DO i 1 TO DIM(symptom);IF symptom(i) 1 THEN total symptoms 1;END;RUN; The name of the array is “symptom”.The index is “i”.The array element is identified by “symptom(i)”.For i 1, symptom(i) dizziness.In the example below we use a Macro Variable Array, which has anearly isomorphic relationship to the structure of a data steparray.%LET symptom1 dizziness;%LET symptom2 nervous;%LET symptom3 absentminded;%MACRO arrays;%DO i 1 %TO 3;PROC FREQ DATA mydata;TABLES &&symptom&i*medcond1;RUN;%END; /*%DO Loop*/%MEND arrays; The root name of the array is “symptom”.The index is “i”.The array element is identified by “&&symptom&i”.For i 1, &&symptom&i dizziness.All items in the Macro Variable Array can be mapped to the datastep array, except the “&&” syntax controls the final resolution ofthe Macro Variables in the array. The advantage of the MacroVariable Array is that the programmer can create iterative code,which executes outside of the data step on a list of variables in amanner similar to iterative code that is executed on a list ofvariables within a data step.

CREATING MACRO VARIABLE ARRAYSBelow are three methods to create Macro Variable Arrays within aprogram.Method 1: %LET is used to assign values to macro variables. Byincrementing the index, or suffix, of the names of macro variablesto which elements of the array are assigned, you can create ausable Macro Variable Array. Below we create an array of 23symptoms, where the index is incremented by one, for eachsuccessive element assigned to the %LETrow 1 dizziness;row 3 speechProb ;row 5 depression ;row 7 tasteChange ;row 9 appetiteLoss;row 11 fastHeart;row 13 badBalance ;row 15 unConscious ;row 17 blurredSight;row 19 concentrate ;row 21 nightVision ;row 23 LET%LETrow 2 nervous ;row 4 nausea;row 6 lowEnergy;row 8 sweating ;row 10 headache ;row 12 numbHands;row 14 irritable;row 16 trembling;row 18 insomnia ;row 20 armsWeak ;row 22 twitching;The Macro Variable Array with elements row 1 to row 23contains the values dizziness - - absentminded.Method 2: If your values are already defined with elements thatcontain incremental suffixes, you can use a macro to makeassignments:%MACRO assign( root, n);%DO i 1 %TO & n;%GLOBAL col &i;%LET col &i & root&i;;%END;%MEND assign;%assign(medcond,40)col 1 medcond1 ;col 3 medcond3 ;col 5 medcond5 ;col 7 medcond7 ;col 9 medcond9 ;col 11 medcond11;col 13 medcond13;col 15 medcond15;col 17 medcond17;col 19 medcond19;col 21 medcond21;col 23 medcond23;col 25 medcond25;col 27 medcond27;col 29 medcond29;col 31 medcond31;col 33 medcond33;col 35 medcond35;col 37 medcond37;col 39 %LET%LET%LET%LET%LET%LET%LET%LET%LET%LETIn the above data step, CALL SYMPUT is used to makeassignments of values to the macro variables "row 1 – row 23"and "col 1 – col 40". In the statementCALL SYMPUT("row " LEFT(i),VNAME(symptoms(i)));the first argument in the CALL SYMPUT is the name of the macrovariable, and the second argument contains the variable that willbe assigned to the macro variable. Where i 1:"row " LEFT(i) becomes "row 1"VNAME(symptoms(i)) becomes "dizziness"This statement resolves to:CALL SYMPUT("row 1","dizziness");This is similar to making the following assignment outside of adata step using %LET:%LET row 1 dizziness;The above code generates 40 %LET statements which assign the40 elements in medcond1 - medcond40 to 40 macro variables inthe range of col 1 to col 40. It is identical to executing thefollowing LET%LET%LET%LET%LET%LET%LET%LET%LETDATA NULL ;SET mydata;ARRAY symptoms {*} dizziness--absentminded;ARRAY diseases {*} medcond1-medcond40;CALL SYMPUT("dim row",DIM(symptoms));CALL SYMPUT("dim col",DIM(diseases));DO i 1 TO DIM(symptoms);CALL SYMPUT("row " LEFT(i),VNAME(symptoms(i)));END;DO i 1 TO DIM(diseases);CALL SYMPUT("col " LEFT(i),VNAME(diseases(i)));END;STOP;RUN;col 2 medcond2 ;col 4 medcond4 ;col 6 medcond6 ;col 8 medcond8 ;col 10 medcond10;col 12 medcond12;col 14 medcond14;col 16 medcond16;col 18 medcond18;col 20 medcond20;col 22 medcond22;col 24 medcond24;col 26 medcond26;col 28 medcond28;col 30 medcond30;col 32 medcond32;col 34 medcond34;col 36 medcond36;col 38 medcond38;col 40 medcond40;Method 3: Although it requires less execution time to assignvalues to a macro array outside of a data step by using %LETstatements, it may be preferable to make the assignments withina data step with the aid of data step arrays.Once the code for the data step in example 3 is executed, MacroVariable Arrays are generated which are equivalent to the arrayscreated in examples 1 and 2. In addition, we have created macrovariables containing the dimensions of these Macro VariableArrays.USING MACRO VARIABLE ARRAYSEXECUTE PROCEDURES ITERATIVELYTOThe two Macro Variable Arrays created in the previous data stepcan be used outside of a data step as follows:%MACRO runfreq(dimrow,dimcol,dset);%DO i 1 %TO &dimrow;%DO j 1 %TO &dimcol;PROC FREQ DATA &dset;TABLE &&row &i * &&col &j/ CHISQ RELRISK;RUN;%END; /*%DO Loop j*/%END; /*%DO Loop i*/%MEND runfreq;%runfreq(&dim row,&dim col,mydata)We know that &dim row 23 and &dim col 40. Thus theabove macro, “runfreq” will generate 23 x 40, or 920 PROC FREQ2x2 tables. The first three times through the %DO Loop willgenerate the following SAS code:

PROC FREQ DATA mydata;TABLE dizziness*medcond1 / CHISQ RELRISK;RUN;PROC FREQ DATA mydata;TABLE dizziness*medcond02 / CHISQ RELRISK;RUN;PROC FREQ DATA mydata;TABLE dizziness*medcond03 / CHISQ RELRISK;RUN;SAS PROCEDURE DATA OBJECTS AND ODSTRACEThe statistics and other information generated by SAS proceduresare held internally in several data objects. The components ofthese objects and their names are specific to each procedure.The ODS TRACE statement gives the programmer a method toidentify the names of the results objects that are generated by agiven procedure. The following code shows how to use the ODSTRACE statement to identify the names of the data objectsgenerated by PROC FREQ.ODS TRACE ON;PROC FREQ DATA mydata;TABLES dizziness*medcond1 / CHISQ RELRISK;RUN;ODS TRACE OFF;FIGURE 1: SAS Log Listing for TRACE Statement 9ODS TRACE ON; 10 PROC FREQ DATA mydata; 11TABLES dizziness*medcond1 /CHISQ ELRISK; 12 RUN; Output Added: ------------ Name:CrossTabFreqs Label:Cross-Tabular Freq Table Data Name: Path:Freq.Dizziness by medcond1.CrossTabFreqs ------------ Output Added: ------------ Name:ChiSq Label:Chi-Square Tests Template: Base.Freq.ChiSq Path:Freq.Dizziness by medcond1.ChiSq ------------ Output Added: ------------ Name:FishersExact Label:Fisher's Exact Test Template: Base.Freq.ChisqExactFactoid Path: Freq.Dizziness by medcond1.FishersExact ------------ Output Added: ------------ Name:RelativeRisks Label:Relative Risk Estimates Template: Base.Freq.RelativeRisks Path: Freq.Dizziness by medcond1.RelativeRisks NOTE: There were 2529 observations read from the data set mydata. NOTE: PROCEDURE FREQ used: real time0.44 seconds cpu time0.01 seconds 13ODS TRACE OFF; Four output objects are generated by this PROC FREQ, based onthe options that the programmer has requested. Figure 1 showshow the SAS Log identifies each of these as they are added tothe SAS listing.Figure 2 shows the default SAS listing from the PROC FREQ call.Referring back to Figure 1, it is easy to match the components ofthe listing to their names generated by the ODS TRACE request.As noted in the introduction, our task is to summarize columnfrequencies and percents, the Mantel-Haenszel chi squareprobability, and the odds ratios with confidence limits. Thesestatistics are the 8 highlighted items in Figure 2. The problem isto extract only these components from the much larger set ofinformation printedFIGURE 2: PROC FREQ Output The Frequency Procedure Table of Dizziness by medcond1 Dizziness(Dizziness episodes) medcond1(Diagnosed with Arthritis) Frequency Percent Row Pct Col Pct 0) No 1) Yes Total 0) No 1698 79 1777 67.14 3.12 70.26 95.55 4.45 71.11 56.03 1) Yes 690 62 752 27.28 2.45 29.74 91.76 8.24 28.89 43.97 Total23881412529 94.425.58100.00 Statistics for Table of Dizziness by medcond1 StatisticDFValueProb Chi-Square1 14.4856 0.0001 Likelihood Ratio Chi-Square 1 13.5495 0.0002 Continuity Adj. Chi-Square1 13.7729 0.0002 Mantel-Haenszel Chi-Square1 14.4799 0.0001 Phi Coefficient0.0757 Contingency Coefficient0.0755 Cramer's V0.0757 Fisher's Exact Test Cell (1,1) Frequency (F)1698 Left-sided Pr F0.9999 Right-sided Pr F1.587E-04 Table Probability (P)8.009E-05 Two-sided Pr P2.813E-04 Estimates of the Relative Risk (Row1/Row2) Type of StudyValue 95% Confidence Limits Case-Control (Odds Ratio) 1.931.362.72 Cohort (Col1 Risk)1.041.011.06 Cohort (Col2 Risk)0.530.390.74 Sample Size 2529

GENERATING ODS OUTPUT DATASETSThe SAS ODS OUTPUT statement allows the programmer tosave the results from procedure objects to datasets. We use thisto extract the desired results from PROC FREQ into threedatasets. In the code below, PROC FREQ is executed within twonested %DO loops inside a macro, and the macro “onerec” iscalled to further process the output datasets.%MACRO runfreq( dimrow ,dimcol ,dset ,out ,);%DO i 1 %TO &dimrow;%DO j 1 %TO &dimcol;ODS OUTPUT CrossTabFreqs ctabChiSq chiRelativeRisks rrisk;PROC FREQ DATA &dset;TABLES &&row &i * &&col &j/ CHISQ RELRISK;RUN;ODS OUTPUT CLOSE;%onerec(row &&row &i, col &&col &j)%END; /* %DO Loop j */%END; /* %DO Loop i */%MEND runfreq;FIGURE 3: PROC PRINT of ODS Datasets Dataset ctab RC dmFoo ierwl zdePPP M zcTqeee i ioT Aurrr s nnY Beccc s OedP Lneee i bs0E Ecnnn n ss1yttt g 1 0) No 0) No 11 1 1698 67.14 95.55 71.10 . 2 0) No 1) Yes 11 1793.12 4.44 56.02 . 3 0) No. 10 1 1777 70.26. 4 1) Yes 0) No 11 1 690 27.28 91.75 28.89 . 5 1) Yes 1) Yes 11 1622.45 8.24 43.97 . 6 1) Yes. 10 1 752 29.73. 7. 0) No 01 1 2388 94.42. 8. 1) Yes 01 1 1415.57. 9. 00 1 2529 100.00.0 Dataset chi Obs StatisticDF Value Prob 1 Chi-Square1 14.48 0.0001 2 Likelihood Ratio Chi-Square 1 13.54 0.0002 3 Continuity Adj. Chi-Square1 13.77 0.0002 4 Mantel-Haenszel Chi-Square1 14.47 0.0001 5 Phi Coefficient0.07 6 Contingency Coefficient0.07 7 Cramer's V0.07 Dataset rrisk ObsStudyTypeValue Lower Upper CLCL 1 Case-Control (Odds Ratio) 1.93 1.36 2.72 2 Cohort (Col1 Risk)1.04 1.01 1.06 3 Cohort (Col2 Risk)0.53 0.39 0.74 The datasets generated by the ODS OUTPUT statements areprinted in Figure 3. The statistics we want are high-lighted. In themacro “onerec” (below), these statistics are extracted from thethree datasets and combined into a one-record summary dataset.This summary dataset is then appended to a cumulative summarydataset, which will contain observations for all 920 2x2 CrossTabsexecuted by the original macro call. The programmer can thengenerate a summary report using any facility within SAS or exportthe data to any external reporting software desired. The followingcode for the macro “onerec” is continuous from the %MACROstatement to the %MEND statement:%MACRO onerec(row , col );First, the statistics to keep in the summary dataset are specified.DATA onerec (KEEP row var col varn col1 n col2p col1 p col2Prob OddsRatioLowerCl UpperCl);LENGTH row var col var 32;row var VNAME(&row);col var VNAME(&col);Second, the information from the dataset created from theCrossTabFreqs table ctab are extracted into the one-recordsummary dataset ( onerec). The SET statement reads thectab dataset and identifies two records, based on the conditionspecified in the WHERE clause, from which frequencies andpercents will be kept for the one record output by the DATA Step,.The Frequency and ColPercent values for the two rows of ctabare stored in different elements of the n col and p col arrays:ARRAY n col (0:1);ARRAY p col (0:1);DO UNTIL ( eof ctab );SET ctab (WHERE (&row 1 AND&col IN (0 1) ) )END eof ctab;n col [ &col ] Frequency;p col [ &col ] ColPercent;END;Third, the record in chi containing the probability for the MantelHaenszel Chi-Square statistic is output to the onerec dataset:DO UNTIL ( eof chi );SET chi (WHERE (SUBSTR(statistic,1,6) "Mantel") )END eof chi;END;Finally, the Case-Control Odds Ratio and confidence limits areextracted from the Relative Risks dataset, “ rrisk”:DO UNTIL ( eof rrisk );SET rrisk end eof rrisk;IF (substr(studytype, 1 , 4 ) "Case")THEN OddsRatio Value;END; RUN;Once all the statistics from the three output datasets have beenread into the onerec dataset, they are appended to acumulative summary dataset (specified by the macro variable&out):PROC APPEND BASE &out DATA onerecFORCE; RUN;%MEND onerec;

METHOD 2: CALL EXECUTELet’s reexamine the code presented at the start of this paper:ARRAY symptom {*} dizziness--absentminded;ARRAY disease {*} medcond1-medcond40;DO i 1 TO DIM(symntom);DO j 1 TO DIM(disease);PROC FREQ DATA mydata;TABLES symptom(i)*disease(j)/ CHISQ RELRISK;RUN;END;END;As noted, this code is syntactically incorrect. However, SASprocedure calls can be generated within Data Steps by using theCALL EXECUTE routine. Using this, we can run the PROCFREQ iteratively within nested Data Step DO Loops with the codebelow. The code generated by the repeated use of CALLEXECUTE is stored in a temporary buffer and is executed afterthe completion of the data step.DATA NULL ;SET mydata;ARRAY symptom {*} dizziness--absentminded;ARRAY disease {*} medcond1-medcond02;DO i 1 TO DIM(symptom);row VNAME(symptom(i));DO j 1 TO DIM(disease);col VNAME(disease(j));CALL EXECUTE ('PROC FREQ DATA mydata;' 'TABLES ' row '*' col ' / CHISQ RELRISK;' ' RUN;');END;END;STOP;RUN;To make use of the ODS OUTPUT datasets described in theprevious section, we rewrite the above code to have CALLEXECUTE invoke the macro “runfreq” instead of a direct call toPROC FREQ. The following code will generate 920 macro calls,which are executed after the completion of the Data Step.DATA NULL ;LENGTH rowvar colvar 32;SET mydata;ARRAY col (*) &colvars;ARRAY row (*) &rowvars;DO i 1 TO DIM ( col );colvar VNAME ( col[i] );DO j 1 TO DIM ( row );rowvar VNAME ( row[j] );CALL EXECUTE ('%runfreq ( DATA mydata' " , row " rowvar " , col " colvar " , base summary)");END;END;STOP;RUN;The first time through the nested DO Loops, the following macrocall will be generated at the end of the Data Step:%runfreq (,,,data mydatarow dizzinesscol medcond1base summary)The Macro Variable Array version of this program uses twonested %DO Loops to govern the iterative processing of the 920PROC FREQs. In this version, the iteration is controlled by DataStep DO Loops, which generate the macro call using the CALLEXECUTE statement. The row and column variables in theTABLES statement of the PROC FREQ are standard macrovariables that are passed as parameters of the macro call; theseparameters replace the Marco Variable Arrays used in theprevious method. Below is the revised version of the macrorunfreq for use with the CALL EXECUTE statement.%MACRO runfreq (data ,row rowvar ,col colvar ,base );ODS OUTPUT CrossTabFreqs ctabChiSq chiRelativeRisks rrisk;PROC FREQ DATA &data;TABLES &row * &col / CHISQ RELRISK;RUN;ODS OUTPUT CLOSE;%onerec(row &row, col &col);%MEND runfreq;The “onerec” macro used here is identical to the one written forthe Macro Variable Array version in the previous section.NOTE: In SAS Version 8.2, using CALL EXECUTE to generatelarge amounts of code ( 1000 lines) can result in truncation andrandom failure to execute code. For details and a patch, 5243.html.METHOD 3: A SINGLE PROC FREQThe third method uses an entirely different strategy for solving thetask. Here, we use a data step to collect the counts of the cells ineach 2x2 table into a single 3-dimensional matrix and process thisthrough a single call to PROC FREQ.For any given cross-tabulation, each observation on a datasetcontributes a count of 1 to a single cell of the table. We use thisfact and build a single frequency table for all 23 x 40 cross-tabs ina 3-dimensional array of dimension 23 x 40 x 4. The first 2dimensions index on the 23 row and 40 column variables, and thethird dimension indexes the counts of the 4 cells in the 2x2 crosstabs for the row and column variables. The values in this arrayare assigned through two nested DO loops. Inside these loops,one of the 4 cell counters is incremented, depending on thevalues of the row and column variables. The following codeperforms this task.RETAIN count 0 ;ARRAY col (*) &colvars ;ARRAY row (*) &rowvars ;ARRAY count ( &dimcol , &dimrow , 0:3 );DO i 1 TO DIM ( col ) ;DO j 1 TO DIM ( row ) ;count [i , j , 2* col[i] row[j]] 1;END ;END ;Therefore, if row variable “NAUSEA” (symptom 4) has the value 0and column variable “MEDCOND5” has the value 1, then thecode within the DO loops resolves to:count [ 5 , 4 , (2*1 0) ] 1;orcount [ 5 , 4 , 2 ] 1;which increments the appropriate counter for the two variablesand their combined values.

Examine this from the perspective of a single 2x2 table.DISCUSSION OF THE THREE METHODSnausea(Nausea episodes)medcond5(Diagnosed with Angina)Frequency Percent Row Pct Col Pct 0) No 1) Yes 0) No increment this cell count by 1 1) Yes We have presented 3 solutions to summarize numerousfrequency cross-tabs. All three make use of macros and ODSoutput datasets to extract selected pieces of the output toefficiently process large volumes of information.When the final record of the dataset is read, each of the 3680 cellcounts is output to a new dataset. This dataset is then processedin the macro “odsmodule”, using a single PROC FREQ with aWEIGHT statement to generate the desired statistics.%MACRO buildset(dset);DATA q ( KEEP col row freq colval rowval) ;LENGTH col row 32 ;SET &dset END eof ;RETAIN count 0 ;ARRAY col (*) &colvars ;ARRAY row (*) &rowvars ;ARRAY count ( &dimcol , &dimrow , 0:3 );DO i 1 TO DIM ( col ) ;DO j 1 TO DIM ( row ) ;count [i , j , 2 * col[i] row[j]] 1;END ;END ;IF eof THEN DO ;DO i 1 TO DIM ( col ) ;col VNAME ( col [i] ) ;DO j 1 TO DIM ( row ) ;row VNAME ( row[j] ) ;DO k 0 TO 3 ;freq count [ i , j , k ] ;colval INT ( k / 2 ) ;rowval MOD ( k , 2 ) ;OUTPUT ;END ;END ;END ;END ;RUN ;%odsmodule%MEND buildset;%MACRO odsmodule;ODS OUTPUT CcrossTtabFreqs ctabChiSq chiRrelativeRrisks rrisk;PROC FREQ DATA q ;BY col row NOTSORTED;TABLE rowval * colval / CHISQ RELRISK ;WEIGHT freq ;RUN ;ODS OUTPUT CLOSE ;RUN;%summaryset *see Appendix 3;%MEND odsmodule;The first 2 solutions are very similar; both use looping to iterativelyrun PROC FREQ, which generates the desired statistics. TheMacro Variable Array version uses straightforward macro codingin which nested %DO loops control processing over arrays ofmacro variables. In contrast, the CALL EXECUTE version uses adata step to build a sequence of SAS statements to run; here,DO loops direct processing of variables in Data Step arrays. ThePROC FREQ calls are executed after the completion of the DataStep. An advantage of the Macro Variable Array version is thatthe variables analyzed may be a mixture of numeric andcharacter. The code in our CALL EXECUTE version would haveto be modified somewhat to accommodate mixed variable types.Both of these methods provide a blueprint for designing other,possibly more complex, summarizations from analyses in SAS.They are easily tailored to process output from procedures otherthan PROC FREQ and can be adapted to combine and1summarize results from multiple procedures .The third method we presented, using a single PROC FREQ,cannot be generalized to other procedures, but provides aninteresting contrast to the other 2 solutions. It has the potentialadvantage in run time; instead of executing 920 PROC FREQs,only one PROC FREQ is run. Thus, if execution time is asignificant concern, this method would be preferred for processingfrequency tables.Code for all three methods is presented in appendices one tothree. These will be of interest to a programmer. An end userdoes not necessarily care about the inner workings of the macros,but may be concerned only with obtaining the end product, thesummary dataset. Appendix four shows how the programmercan provide the user with a simple interface to run these macroswith a minimum of information. To use the macro, the user needsto provide 1) the name of the dataset to be analyzed, 2) the nameof the summary dataset to be produced, and 3) the names of therow and column variables. Appendix five provides code toproduce the summary dataset shown in Figure 5. If the user onlywants the summary dataset, then s/he can use the output dataset(Figure 4) as needed.CONCLUSIONODS provides the SAS programmer with an efficient method forcreating summary datasets from voluminous output. DifferentSAS programming techniques can be used to take advantage ofthe ODS utility, depending on the coding preferences of theprogrammer. The task presented in this paper requires a fairamount of knowledge about the SAS Macro facility on the part ofthe programmer. However, the end user does not have to beconcerned with these programming details and can be provided asimple way to use these macros. Once the final summary datasethas been created, the end user can generate a report with themethod provided by the programmer, or with any other SASreporting facility, or external method if additional reporting needsare required.DISCLAIMER: The contents of this paper are the work of theauthor and do not necessarily represent the opinions,recommendations, or practices of Westat.ACKNOWLEDGEMENTSThe authors would like to thank Ian Whitlock for his assistanceand review of the methods and coding.

REFERENCES1. Long, S., Darden, R. (2003). A MACRO Using SAS ODS toSummarize Client Information from Multiple Procedures. Proceedingsof the 28th Annual SAS Users Group International Conference.CONTACT INFORMATIONStuart Long (long3@niehs.nih.gov)Westat1009 Slater Road, Suite 110Durham, NC 27703SAS and all other SAS Institute Inc. product or service names areregistered trademarks or trademarks of SAS Institute Inc. in theUSA and other countries. indicates USA registration.APPENDIX 1:The Macro Variable Array method.%MACRO runfreq(dset);%IF %SYSFUNC(EXIST(&out)) %THEN %DO;PROC DATASETS;DELETE &out;QUIT; RUN;%END;%DO j 1 %TO &dimcol;%DO i 1 %TO &dimrow;PROC DATASETS; DELETE ctab chi rrisk;QUIT; RUN;ODS LISTING CLOSE;ODS OUTPUT CrossTabFreqs ctabChiSq chiRelativeRisks rrisk;PROC FREQ DATA &dset ORDER INTERNAL;WHERE &&row &i .z AND &&col &j .z;TABLES &&row &i * &&col &j/ CHISQ RELRISK; RUN;ODS OUTPUT CLOSE;ODS LISTING;%onerec(row &&row &i ,col &&col &j ,rlab %NRBQUOTE(&&rlab &i) ,clab %NRBQUOTE(&&clab &j));%END; /* i %LOOP*/%END; /* j %LOOP*/%MEND runfreq;%MACRO onerec( row ,col ,rlab ,clab );DATA onerec (KEEP row var col var n col1n col2 p col1 p col2col label row labelcol1 val col2 val ProbOddsRatio LowerCl UpperCl);LENGTH row var col var 32col label row label 50;rowval &row;colval &col;row var "&row";col var "&col";col label "&clab";row label "&rlab";col1 val 0;col2 val 1;ARRAY n col (0:1);ARRAY p col (0:1);DO UNTIL ( eof ctab );SET ctab (WHERE (&row 1 AND &col IN (0 1)))END eof ctab;n col [ &col ] Frequency;p col [ &col ] ColPercent;END;IF (EXIST(“ chi”)) THEN DO;DO UNTIL ( eof chi );SET chi (WHERE (SUBSTR(statistic,1,6) "Mantel") )END eof chi;END;END; ELSE Prob .;OddsRatio .;LowerCL .;Uppercl .;IF (EXIST(“ rrisk”)) THEN DO;DO UNTIL ( eof rrisk );SET rrisk END eof rrisk;IF (substr(studytype,1,4) "Case") THEN DO;OddsRatio value;lcl LowerCL;ucl UpperCL;END;END;LowerCL lcl;UpperCL ucl;END;FORMAT col1 val col2 val ny.;LABELcol var "Name of Column Variable"col label "Label of Column Variable"col1 val "1st Value of Col Variable"col2 val "2nd Value of Col Variable"row var "Name of Row Variable"row label "Label of Row Variable"n col1 "Frequency of Column 1 cell"n col2 "Frequency of Column 2 cell"p col1 "Percent of Column 1 cell"p col2 "Percent of Column 2 cell"Prob "Mantel-Haenszel Chi-Square"OddsRatio "Odds Ratio"LowerCL "95% Lower Confidence Limit"UpperCL "95% Upper Confidence Limit";RUN;PROC APPEND BASE &out DATA onerec FORCE;RUN;%MEND onerec;%MACRO freqinfo ( data &syslast, out , rowvars , colvars , report );%LET data &data; *force evaluation of DATA;%LET report %UPCASE(&report) ;DATA NULL ;SET &data;ARRAY row {*} &rowvars;ARRAY col {*} &colvars;CALL SYMPUT("dimrow",DIM( row));CALL SYMPUT("dimcol",DIM( col));DO i 1 TO DIM( row);CALL SYMPUT("row " LEFT(i),VNAME( row(i)));CALL SYMPUT("rlab " LEFT(i),VLABEL( row(i)));END;DO j 1 TO DIM( col);CALL SYMPUT("col " LEFT(j),VNAME( col(j)));CALL SYMPUT("clab " LEFT(j),VLABEL( col(j)));END;STOP;RUN;%runfreq(%STR(&data));%IF &report YES %THEN%print summary table ( out &out ) ;%MEND freqinfo;

APPENDIX 2:The CALL EXECUTE method.%MACRO runfreq (data ,row rowvar ,col colvar ,clab clabel ,rlab rlabel ,base );PROC

other programming methods to obtain summary datasets from ODS output. Although PROC FREQ is used in this paper, these techniques can serve as a blueprint for extracting and summarizing output from other SAS procedures. This is an advanced tutorial intended for SAS programmers knowledgeable with the SAS Macro facility. INTRODUCTION