Wednesday, February 27, 2013

A SAS Server Page Approach to inserting SAS Data Tables into RTF Documents

There are a number of ways to insert tables of SAS data into Word documents. ODS can be used to create the table and then it can be simply cut and pasted into the document. Alternatively the SAS Add-In for Microsoft Office can be used. And I am sure there are lots of other ways.

What I would like to illustrate with this blog posting is another way to do this using PROC STREAM and SAS Server Pages. The Use Case started with a comment that wouldn't it be nice if, for example, someone writing up a summary of clinical trial study could:
  • compose the document with all the appropriate commentary
  • enter some form of markup or commands for where the tables of data are to be inserted
  • hand that off to a SAS programmer or analyst to insert all the needed tables.
They key point was to make it easy to insert lots of tables and to refresh them easily once/if the data changed.

So that got me thinking that it would be nice if the markup could simply be a call to a SAS macro that generates the table. Something like the RTF file shown in Figure 1.

Figure 1. Sample RTF SAS Server Page.
The idea being that the RTF file is used as input to PROC STREAM (i.e., it is a SAS Server Page) creating the output RTF file shown in Figure 2 (shown in part).

Figure 2. Output document with embedded tables of SAS data.
The following sample program illustrates using PROC STREAM to do this.
filename inssp "&root\SSPs\Insert SAS Tables.rtf"
         lrecl = 32755;
filename out "&root\SSPs\With Embedded SAS Tables.rtf"
         lrecl = 32755;
filename newline temp lrecl = 32755;
options insert = (sasautos=("&root\macros"));
data _null_;
 /* make sure line feeds in the input
    document are preserved */
 infile inssp;
 input;
 file newline;
 if _n_ = 1 then put _infile_;
 else put '&streamDelim newline;' _infile_ ;
run;
proc stream outfile = out quoting = both;
BEGIN &streamDelim; %include newline;
;;;;
run;

The generateRTFTable Macro

OK. Now is the time for my disclaimer. This macro is very much a work-in-progress/proof-of-concept. For now, my goal was to just verify that it was possible to write a macro that could generate RTF text to embed a table of SAS data.

Frankly I am not sure that the time needed to do the research on what options might be needed and how to parameterize the RTF text is worthwhile. Perhaps readers of this blog entry can provide their thoughts.

Considerations when creating RTF files as SAS Server Pages

In chapter 4 of my upcoming e-book, SAS® Server Pages: Generating Dynamic Content, I describe some special considerations to keep in mind when using Microsoft Word to create RTF SAS Server Pages. And since I expected to discover more issues, I created an article Creating RTF SAS Server Page at www.sasCommunity.org to act as a repository of such additional issues.

For the purposes of this posting, the relevant concern (described in the book) is the Revision Save ID (RSID) feature of Microsoft Word. It can cause problems when text that is part of a SAS macro language element is added or deleted, or when the properties of that text are changed. The changes are tagged with a Save ID, which is a random number that changes each time the document is saved. These Revision Save IDs are primarily used when merging or comparing two documents with a common history but no revision marks (Track Changes is not turned on). The inserted RSID markup text causes PROC STREAM to not recognize the text as a macro variable reference, macro function call, or macro invocation. One way to avoid such errors is to disable the RSID feature. In Word 2003, deselect Store random number to improve accuracy under Tools ► Options ► Security Tab. In Word 2010, deselect Store random numbers to improve Combine accuracy under File ► Options ► Trust Center ► Trust Center Settings.) This is a global change and will apply to any future Word documents you edit (unless you change the setting back).

Next Steps

For anyone interested in trying this out, this zip file can be downloaded from www.sasCommunity.org. It contains the input RTF file, along with the above program and the generateRTFTable macro. And while not needed to try this yourself, I also included the output RTF file shown above. Just extract the zip file to a folder and add a %let statement to the above code and assign that location to the macro variable root.

And please consider this an open-call for anyone interested in collaborating on an effort to parameterize and generalize the generateRTFTable macro.

Sunday, February 24, 2013

Rich Text Data Entry and HTML5

On occasion I've had requests on projects where the user wanted a way to enter some text and be able to mark the text as bold or italic. I've been able to convince the users that entering some markup is the way to go. They typically go along after some convincing.

The contenteditable attribute, available in HTML5 makes such editing easier. You can try it out yourself below:
  1. The text between the lines below is tagged with that attribute
  2. Feel free to edit it:
    1. Add line feeds (i.e., hit the Enter key)
    2. Highlight text and use CTRL B, or I or U
  3. The click the button to see the HTML text with your changes.
This text can be edited by the user. Feel free to edit it. You can even hit the enter key for a line feed. You might want to do this at the beginning of every sentence. You can also highlight some text and use CTRL-B, CTRL-I, CTRL-U to make the text bold or italic or underlined.


So how did I do that? All I had to do is wrap the text in a DIV tag with the contenteditable attribute enabled, i.e.,

<div contenteditable="true" id="cmnts">
The text I want the user to be able to edit.
</div>

The DIV tag has an ID attribute so I can get the HMTL text using the innerHTML attribute.

Now your next question, is how can I upload the value to the server since the text is not in a form field (e.g., an INPUT or TEXTAREA tag)? All I need to do is to create a hidden form field in my form, e.g.,

<input type="hidden" name="userText" id="userText">

and then in my form tag I add an onSubmit attribute to assign the HTML text to the form field, e.g.,

onSubmit="document.getElementById('userText').value =
          document.getElementByID('cmnts').innerHTML;"

On the SAS Server side (e.g., the SAS/IntrNet® Application Dispatcher or the Stored Process Server), like for any other form field, a macro variable (userText) contains the HTML text that can be saved or used however the applications sees fit.

This is very much a Work In Progress and as I continue my research and experimentation with HTML5, hopefully I will find an easy to implement facility for a more complete Rich Text Editor. But for now, the contenteditable attribute is a pretty good alternative IMO.

Monday, February 18, 2013

Using DOSUBL to write Macros as functions

In my last post, DOSUB and DOSUBL - Data Driven Development, I mentioned one of the many uses of these very powerful new functions. I'd like to discuss another Use Case: using them to facilitate creating your own macros that work like functions.

Macros that need to return a single value, as opposed to generating code are often written as functions so they can be used, for example, as follows:
  • %let total = %getMax(data= . . ., var =  . . . );
or
  • retain  denom %getMax(data= . . ., var = . . . );
However if the macro needs to run some SAS code, it is not possible to call the macro this way since the generated code would be returned by the macro instead of the desired value. There are a number of work-arounds that many SAS programmers have used in this case, such as:
  • store the value into a macro variable that the calling program must know about;
  • save the value into a data set that the calling program must use to get the value;
  • use the %sysfunc macro along with the data access functions.
The DOSUB and DOSUBL functions provide another way to do this. They can be used via the %sysfunc macro to run the code. Consider the following real-life Use Case, which I use the Parameter File Maintenance (PFM) subsystem discussed in my upcoming e-book, SAS® Server Pages: Generating Dynamic ContentPFM is a web-based data entry facility that I've implemented on a number of projects to support managing the parameter files that drive the application. The getSurrogateKey macro is used as part of the update process to generate an analog to an auto-number surrogate key (something that many databases support with different names/terminology, but that SAS data files do not).

%macro getSurrogateKey
      (data = /* name of the dataset being updated */
      ,key =  /* name of surrogate key variable */
      );
 %local newkey;
 %if &&&key = . or &&&key = %then
 %do; /* generate a new surrogate key value */
   %let rc = %sysfunc(dosubl(
         'proc sql noprint;
           select cats(sum(max(&key),1))
           into:newkey from &data;quit;'
             ));
   %put NOTE: &key value of &newkey generated.;
   &newkey
 %end; /* generate a new surrogate key value */
 %else &&&key;
%mend getSurrogateKey;

The macro is called unconditionally by the update process and if the macro variable, from the Data Entry page (a SAS Server Page), for the key is missing, a new row is being added and this requires a new key value. The DOSUBL function is used, via %SYSFUNC) to run the PROC SQL code to get the next auto-number key value. Since DOSUBL is used, the macro execution does not return the SQL code to the current SAS session's input stack. Executing the macro returns:
  • the new key value, &newkey, for a new row.
  • the current key value, &&&key, for an existing row.
One final note about this example: it takes advantage of functionality introduced in the M1 realease of 9.3. In the M0 release, macro variables created in the code called by DOSUB/DOSUBL are not available. Returning macro variables and their values was added with the M1 release.

Monday, February 11, 2013

DOSUB and DOSUBL - Data Driven Development

I have always been a fan of data driven applications where data (including paramater files) drives or defines the code to be executed. The DOSUB and DOSUBL functions, experimental in SAS 9.3, are a great addition to the toolset available in SAS to build data driven applications.  In my (upcoming) ebook (SAS® Server Pages: Generating Dynamic Content), there are lots of examples that use these functions so SAS code can be executed from a SAS Server Page. Both functions have a single character argument:
  1. The argument to DOSUB is a fileref that point to the code to be executed.
  2. The argument to DOSUBL is the line (or lines) of code to be executed.
What I would like to describe here is how they can be used for data driven development. In Chapter 4 of my book there is an example of a mail-merge application. This is a very simple example of data driven development: for each observation in a SAS data set run some code to create a letter or a report. The logic is fairly straightforward:
  1. Determine how many observations there are in the input data set.
  2. Use a macro to loop from 1 to the number of observations and do the following in each iteration
    1. Read the ith observation
    2. Load the values of the needed variables into macro variables
    3. Invoke PROC STREAM to process a SAS Server Page that references the macro variables in the text of the letter
In other words, we use macro to run some code for each observation in a SAS data set The input SAS Server Page and one sample generated letter are shown below in Figures 1 and 2.
Figure 1. Input SAS Server Page
Figure 2. Generated Letter for John
What DOSUB and DOSUBL allow us to do is is to invert this process. For each observation in an input SAS data set, we run some code. So the DATA step become the driver instead of a Macro Language DO loop. The following program demonstrates this approach using the DOSUBL function.

proc format;
 /* map the value of sex to daughter/son */
 value $gender 'F' = 'daughter'
               'M' = 'son'
;
run;
data _null_;
 set sashelp.class;
 /* associate formats with sex and age */
 format sex $gender. age words.;
 /* create macro vars from the data step vars */
 /* vvalue uses the formatted value */
 call symputx('name',vvalue(name));
 call symputx('height',vvalue(height));
 call symputx('weight',vvalue(weight));
 call symputx('sex',vvalue(sex));
 call symputx('age',vvalue(age));
 /* define the code to run for each observation */
 code = 'filename letter "&root\letters\&name..html" '
      ||'lrecl=32767; '
      ||'proc stream outfile=letter quoting=both; '
      ||'begin '
      ||'&streamdelim; %include srvrpgs(class.html); '
      ||';;;;'
      ;
 /* run the code */
 rc = dosubl(code);
run;

Each execution of the  DATA step invokes PROC STREAM to generated the desired letter.

CALL EXECUTE vs DOSUB and DOSUBL

Like CALL EXECUTE, the DOSUB and DOSUBL functions allow you to generate code to be executed. However, unlike CALL EXECUTE, both DOSUB and DOSUBL execute the code the immediately while code passed to CALL EXECUTE is executed after the DATA step completes. In this example, if CALL EXECUTE had been used, all the generated letters would have used the values of the macro variables from the last observation in our input data set since each execution of the DATA step overwrites the macro variables. Since code passed to the DOSUB/DOSUBL functions is executed immediately, the values of the macro variables in our letter resolve the values from the current observation.

A Best Practice for the DOSUBL Argument

The length of the code stream passed to the DOSUBL routine can not exceed 32,767 characters. For the example included above, the code was included inline to simplify reviewing the example. As a Best Practice, code to be executed should not be included inline. Instead the code can be packaged as a macro and the macro call would be the argument to the DOSUBL routine. Alternatively it could be stored in an external file that is pointed to by a fileref and that fileref would be the argument to the DOSUB routine.

This Best Practice reinforces the paradigm shift when using DOSUB/DOSUBL routines. Instead of using the Macro Language to loop and execute mutiple DATA and/or PROC steps, we can now have the DATA step do the looping and execute a macro in each loop/iteration.