Insight Statistical Consulting Ltd
John Kirkpatrick MSc BSc (Hons) CStat CSci

How to get special characters, such as ∞ and μ, in your output

You need to include non-standard characters in your output. For me at least, this usually means Greek letters or mathematical symbols.

Strictly speaking, this solution works only for the RTF and HTML destinations. However, by using appropriate style definitions, you can make the RTF destination look like the LISTING destination, so it will "work" there as well. I believe that a variant should also work for the PDF destination, though I've not tried it there.

The solution code I've posted will work for one destination at a time. However, by using the RAW escapechar function, you can make it work for both RTF and HTML destinations simultaneously, albeit at the expense of clarity in the output. Personally, I prefer to run the code once for each destination.

Both RTF and HTML have mechanisms for inserting non-standard characters into output. RTF uses the \u and \uc control words; HTML has a set of character entities.

For example, to write the infinity character (∞) to an RTF document, one could write

\u8734

whereas for an HTML document, one could write

&infin;

So the only problem is how to write the required text to the output in a clean and easily understandable way.

My preferred solution is to hide the complexities of the process in a macro. Programmers can then write something like

TITLE "This is an infinity symbol: %sym(INFINITY)";

One option would be to store the macro's keywords, RTF control words and HTML special characters in a dataset. This would make it easy to add items over time; the macro would then look up the relevant code word in the dataset. However, this would make it harder to use the macro anywhere in a SAS program. (For example, you'd need to use the SYSEXEC function to access the macro from within a DATA step.) That's why I've chosen to implement the logic as an unsightly series of IF-THEN-ELSE statements.

Producing special characters in HTML output is straightforward: the required token is simply written to the output, making sure to quote the ampersand and semi-colon characters, for obvious reasons. The process is simple, but the range of characters available in HTML4 is limited. You can find the list of available special characters on several sites on the web. Here is one that I found.

The situation with RTF is a little more complex, for two reasons. First, for strict adherence to the standard, as well as using the \u control word to define the special character, RTF writers should also use the \uc control word to define alternate text for RTF readers that cannot render the requested Unicode character. The \uc control word takes an argument equal to the length in bytes (characters) of the alternate text. Second, the RTF destination escapes backslash characters by default, on the assumption that you're much more likely to want the backslash to appear in your output than to mark the start of a control word. This means that you'll need to unescape the backslash by using the PROTECTSPECIALCHARS=OFF style option in any part of the output in which you use this technique.
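To make the pairing of \uc and \u concrete, here is a minimal sketch of how such a token can be assembled in the macro language. It only writes the tokens to the log so that they can be inspected. The macro variable name alt mirrors the macro's parameter of the same name, and the plus-minus code (177, that is U+00B1) is taken from the Unicode charts as an illustration rather than from the macro source.

```sas
/* Sketch only: write the RTF tokens to the log to see how they are built. */
%let alt = infinity;                 /* fallback text for readers without Unicode */
%put \uc%length(&alt)\u8734 &alt;    /* log shows: \uc8\u8734 infinity */

%let alt = %str(+/-);                /* illustrative fallback for plus-minus */
%put \uc%length(&alt)\u177 &alt;     /* log shows: \uc3\u177 +/- */
```

The number after \uc is simply the length of the fallback text, which is why %length(&alt) is used to compute it.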

Whilst the process to write special characters to the RTF destination is more complicated, the pay-off is considerable: any character that has a Unicode code can be written in this way. Full lists of Unicode character codes can be found on the Unicode website.
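The Unicode charts list code points in hexadecimal, whereas the \u control word takes a decimal argument. The following is a small sketch of one way to do the conversion in SAS, using the INPUTN function with the HEX informat. Note also that, as I understand the RTF specification, \u takes a signed 16-bit value, so code points above 32767 would need to be written as negative numbers (the value minus 65536).

```sas
/* Convert hexadecimal Unicode code points to the decimal form that \u expects. */
%put infinity (U+221E) = %sysfunc(inputn(221E, hex4.));   /* 8734 */
%put mu (U+03BC)       = %sysfunc(inputn(03BC, hex4.));   /* 956  */
```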


Enough of the preamble. The following macro call

%sym(_ALL_, dest=HTML)

produces this output, whereas

%sym(_ALL_, dest=RTF)

produces this RTF file.

The source code of the macro, edited for brevity, is

%macro sym(name, dest=RTF, alt=%str());
%local symlist i x;

%if %upcase(&dest) eq RTF %then %do;
%* Default the alternate text when the caller has not supplied one;
%if %length(&alt) eq 0 and &name ne _ALL_ %then %do;
%if %upcase(&name) eq PLUSMINUS %then %let alt=%str(+/-);
%else %if %upcase(&name) eq _LE_ %then %let alt=%str(LE);
%else %if %upcase(&name) eq _GE_ %then %let alt=%str(GE);
%else %if %upcase(&name) eq _NE_ %then %let alt=%str(NE);
%else %let alt=&name;
%end;
%let symlist=alpha beta gamma delta epsilon zeta eta theta iota kappa lambda mu nu xi omicron
pi rho sigma tau upsilon phi chi psi omega ALPHA BETA GAMMA DELTA EPSILON ZETA ETA
THETA IOTA KAPPA LAMBDA MU NU XI OMICRON PI RHO SIGMA TAU UPSILON PHI CHI PSI OMEGA
degree plusminus _le_ _ge_ _ne_ infinity;
%if &name eq alpha %then \uc%length(&alt)\u945 &alt;
%else %if &name eq beta %then \uc%length(&alt)\u946 &alt;
%else %if &name eq gamma %then \uc%length(&alt)\u947 &alt;

...

%else %if %upcase(&name) eq INFINITY %then \uc%length(&alt)\u8734 &alt;
%else %if &name ne _ALL_ %then %do;
%put **** ERROR: &name is not supported in the &dest destination.;
%if &sysver ge 9 %then %return;
%end;
%end;
%else %if %upcase(&dest) eq HTML %then %do;
%let symlist=ALPHA BETA GAMMA DELTA EPSILON ZETA ETA THETA IOTA KAPPA LAMBDA MU NU XI OMICRON
PI RHO SIGMA TAU UPSILON PHI CHI PSI OMEGA alpha beta gamma delta epsilon zeta
eta iota kappa lambda mu nu xi omicron pi rho sigma tau upsilon phi chi psi
omega infinity;
%if &name eq ALPHA %then %nrstr(&Alpha;);
%else %if &name eq BETA %then %nrstr(&Beta;);

...

%else %if &name eq infinity %then %nrstr(&infin;);
%else %if &name ne _ALL_ %then %do;
%put **** ERROR: &name is not supported in the &dest destination.;
%if &sysver ge 9 %then %return;
%end;
%end;
%else %do;
%put **** ERROR: destination &dest is not recognised.;
%if &sysver ge 9 %then %return;
%end;

%if &name eq _ALL_ %then %do;
DATA _Temp;
LENGTH X $ 20 NAME $ 20;
%let i=1;
%let x=%scan(&symlist, &i, %str( ));
%do %while(%length(&x) gt 0 and &i le 100);
X = "%sym(&x, dest=&dest)";
Name = "&x";
OUTPUT;
%let i=%eval(&i + 1);
%let x=%scan(&symlist, &i, %str( ));
%end;
RUN;

ODS LISTING CLOSE;
PROC REPORT DATA=_Temp NOFS;
COLUMN Name X X=X1;
DEFINE Name/DISPLAY "Symbol";
DEFINE X/DISPLAY "Code";
DEFINE X1/DISPLAY STYLE=[PROTECTSPECIALCHARS=OFF] "Output";
RUN;
ODS LISTING;
PROC DATASETS NOLIST LIB=WORK;
DELETE _Temp;
RUN;
QUIT;
%end;
%mend;
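
As a usage sketch, the macro can be called for individual symbols in the same way that the _ALL_ branch calls it: inside a quoted string in a DATA step, with the resulting column displayed using PROTECTSPECIALCHARS=OFF. The dataset name work.demo, the variable name descr and the file name sym_demo.rtf are made up for this illustration, and the sketch assumes that %sym has already been compiled.

```sas
/* Sketch only: assumes the %sym macro has already been compiled. */
data work.demo;                                 /* hypothetical dataset */
   length descr $ 60;
   descr = "Significance level (%sym(alpha, dest=RTF))";  output;
   descr = "Upper limit (%sym(infinity, dest=RTF))";      output;
run;

ods listing close;
ods rtf file="sym_demo.rtf";                    /* hypothetical file name */
proc report data=work.demo nofs;
   column descr;
   /* Unescape backslashes in this column only, so \u and \uc are honoured */
   define descr / display "Quantity"
          style(column)=[protectspecialchars=off];
run;
ods rtf close;
ods listing;
```

Restricting the style override to the one column that carries the control words leaves backslashes escaped as normal everywhere else in the report.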

You may download all the files in this example using the links below. The zip archive contains all files related to this FAQ.

SAS code RTF output HTML output Log file Zip archive