Decimal Precision Alignment

Introduction

A precept of reporting (and perhaps a minor justification for the very existence of clinical trial SAS programmers) is that numeric results should be concise, representative, and readable. Decimal alignment addresses the readability aspect specifically.

Any format applied to an integer value can serve as the starting point for a cascading set of numeric formats that align decimals. If the integer format has width x, the remaining formats follow from it:

- Integer (0 decimal places): x.0 [e.g., y=put(z, 5.0)]
- 1 decimal place: (x+2).1 [e.g., y=put(z, 7.1)]
- 2 decimal places: (x+3).2 [e.g., y=put(z, 8.2)]
- 3 decimal places: (x+4).3 [e.g., y=put(z, 9.3)]

The example above assumes that 5 places to the left of the decimal point are enough to present any integer value of z found in the data. It then assigns matched-width formats so that results with different numbers of decimal places line up with that initial, hypothetical integer.
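As a quick check of the cascade, the sketch below formats one hypothetical value with the four formats from the list and writes the results to the log. The value of z and the variable names y0 to y3 are illustrative only.

```sas
data _null_;
  z = 123.4567;                 /* hypothetical value */
  length y0 y1 y2 y3 $9;
  y0 = put(z, 5.0);             /* x.0     : integer             */
  y1 = put(z, 7.1);             /* (x+2).1 : 1 decimal place     */
  y2 = put(z, 8.2);             /* (x+3).2 : 2 decimal places    */
  y3 = put(z, 9.3);             /* (x+4).3 : 3 decimal places    */
  /* $char9. keeps the leading blanks so the alignment shows in the log */
  put y0 $char9. / y1 $char9. / y2 $char9. / y3 $char9.;
run;
/* Log output:
  123
  123.5
  123.46
  123.457
*/
```

In every row the last integer digit lands in column 5, so the decimal points stack in column 6.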

A reasonable rule of thumb (and there are many) for numeric precision in clinical trial tables is the following:

- Minimum and maximum should be displayed with the greatest number of decimals as collected on the CRF or presented in the database (qualifier: non-rounded converted values are exempt from this rule)
- Mean, median, and any quartiles should be displayed with one decimal place more than the minimum and maximum
- Standard deviation and standard error should be displayed with two decimal places more than the minimum and maximum
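The rule can be sketched for a single parameter as follows. This is an illustration under assumptions, not the sample code presented later: the statistics are hypothetical values, the integer width of 8 is assumed, and PUTN is used only because it accepts a format string built at run time.

```sas
data _null_;
  max_dec = 1;   /* hypothetical: values collected to 1 decimal place */
  int_w   = 8;   /* assumed number of positions for the integer part  */
  length fmt_minmax fmt_mean fmt_sd $8 c_min c_mean c_sd $12;

  /* total width = integer positions + decimal point + decimals */
  fmt_minmax = cats(int_w + (max_dec > 0) + max_dec, '.', max_dec);  /* 10.1 */
  fmt_mean   = cats(int_w + 1 + max_dec + 1, '.', max_dec + 1);      /* 11.2 */
  fmt_sd     = cats(int_w + 1 + max_dec + 2, '.', max_dec + 2);      /* 12.3 */

  /* hypothetical statistics, formatted with the derived formats */
  c_min  = putn(150.2,    fmt_minmax);
  c_mean = putn(165.4333, fmt_mean);
  c_sd   = putn(8.12345,  fmt_sd);

  put c_min $char12. / c_mean $char12. / c_sd $char12.;
run;
```

Because each extra decimal place also adds one to the total width, the integer part keeps the same number of positions and the decimal points stay in the same column.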

Input Dataset

In the input dataset, the parameter HEIGHT (paramcd) has at most 1 decimal place, while BMI, WEIGHT, and TEMP have at most 3.
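Decimal counts of this kind can be derived from the data rather than read off by eye. The sketch below uses a small made-up dataset (adlb_demo and its values are hypothetical, not the dataset described here) and counts the characters after the decimal point in the BEST12. rendering of each value.

```sas
/* Hypothetical records; only PARAMCD and AVAL matter here */
data adlb_demo;
  input paramcd $ aval;
  datalines;
HEIGHT 170.5
HEIGHT 165
WEIGHT 70.125
WEIGHT 68.4
;
run;

proc sql;
  /* scan(..., 2, '.') returns the text after the decimal point; */
  /* lengthn returns 0 when there is none                        */
  select paramcd,
         max(lengthn(scan(put(aval, best12.), 2, '.'))) as max_dec
  from adlb_demo
  group by paramcd;
quit;
/* Result: HEIGHT has max_dec = 1, WEIGHT has max_dec = 3 */
```

The count depends on how BEST12. renders each value, so values carrying floating-point noise or values in scientific notation may need rounding first.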

Algorithm

1. Evaluate the maximum number of decimal places needed for each parameter and pass this information into a temporary dataset variable.
2. Select an integer format of “8.0” as the starting point for alignment with all other results of varying precision. Express the eventual format in terms of 2 variables: the total length and the number of places past the decimal.
3. Create a temporary record-count variable.
4. Apply a generic record-count label to each parameter name.
5. Pass the total number of parameters into a macro variable.
6. Apply generic record-count labels to each aspect of the format length (total vs. past decimal).
7. Pass both macro variables into a few ‘put’ statements that are set up to be dynamic to the particular statistic being presented.
8. Sort the dataset based on the key variables.
9. Derive the descriptive statistics.
10. Apply the decimal points based on the created macro variables.
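To make the two format variables concrete, the sketch below applies the width arithmetic used in the sample code (8 integer positions, plus 1 for the decimal point when there are decimals, plus the decimals) to each possible decimal count. The variable fmt is added here for illustration only.

```sas
data _null_;
  length fmt $8;
  do max_dec = 0 to 3;
    pt1 = 8 + (max_dec > 0) + max_dec;   /* total length                    */
    pt2 = max_dec;                       /* number of places past decimal   */
    fmt = cats(pt1, '.', pt2);           /* resulting w.d format            */
    put max_dec= pt1= pt2= fmt=;
  end;
run;
/* Log output:
max_dec=0 pt1=8 pt2=0 fmt=8.0
max_dec=1 pt1=10 pt2=1 fmt=10.1
max_dec=2 pt1=11 pt2=2 fmt=11.2
max_dec=3 pt1=12 pt2=3 fmt=12.3
*/
```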

Sample Code:

/* Step 1: maximum number of decimal places observed per parameter */
proc sql noprint;
  create table dec as
  select paramcd,
         max(lengthn(scan(put(aval,best.),2,"."))) as max_dec
  from adlb
  group by paramcd;
quit;

/* Steps 2-6: one record per parameter in DEC, so COUNT numbers the      */
/* parameters. Read DEC (not ADLB), because MAX_DEC only exists there.   */
data _null_;
  set dec end=eof;
  count+1;
  if max_dec>3 then max_dec=3;            /* cap at 3 decimal places          */
  pt1=8+(max_dec>0)+max_dec;              /* total length, starting from 8.0  */
  pt2=max_dec;                            /* number of places past decimal    */
  call symputx(cats('t_',count),paramcd); /* t_1, t_2, ... parameter names    */
  call symputx(cats('f_',count),pt1);     /* f_1, f_2, ... total lengths      */
  call symputx(cats('l_',count),pt2);     /* l_1, l_2, ... decimal places     */
  if eof then call symputx('ct',count);   /* total number of parameters       */
run;
/* CT is only available after the DATA step has finished running */
%put ct=&ct;

/* Step 8: sort on the key variables */
proc sort data=adlb out=adlb_s;
  by paramcd avisit;
run;

/* Step 9: descriptive statistics */
proc univariate data=adlb_s noprint;
  by paramcd avisit;
  var aval;
  output out=lb_stat n=n mean=mean median=median std=std min=min max=max;
run;

/* Step 10: %do and %if are only valid inside a macro definition */
%macro align_stats;
data final;
  length col1 col2 $200;
  set lb_stat;
  do stat=1 to 4;
    %do i=1 %to &ct;
      if strip(paramcd)="&&t_&i" then do;
        if stat=1 then do;
          col1="N";
          /* width chosen so that N ends in the same column as the integer */
          /* part of the other statistics                                  */
          col2=put(n,%eval(&&f_&i-&&l_&i-1).);
        end;
        else if stat=2 then do;
          col1="Mean (SD)";
          if std ne . then
            col2=put(mean,%eval(&&f_&i+1).%eval(&&l_&i+1))||" ("||
                 strip(put(std,%eval(&&f_&i).%eval(&&l_&i+2)))||")";
          else
            col2=put(mean,%eval(&&f_&i+1).%eval(&&l_&i+1))||" (NE)";
        end;
        else if stat=3 then do;
          col1="Median";
          col2=put(median,%eval(&&f_&i+1).%eval(&&l_&i+1));
        end;
        else if stat=4 then do;
          col1="Min, Max";
          %if %eval(&&l_&i=0) %then %do;
            col2=put(min,%eval(&&f_&i-1).%eval(&&l_&i))||", "||
                 strip(put(max,%eval(&&f_&i-1).%eval(&&l_&i)));
          %end;
          %else %do;
            col2=put(min,%eval(&&f_&i).%eval(&&l_&i))||", "||
                 strip(put(max,%eval(&&f_&i).%eval(&&l_&i)));
          %end;
        end;
      end;
    %end;
    output;   /* one row per statistic */
  end;
run;
%mend align_stats;

%align_stats

Output Dataset
