Dondon's SAS Teaching Blog

2016年8月8日星期一

資料處理-尋找指定欄位中的每筆資料的舊字串，取代成新字串

%macro sql_replace (table_name,col_name,old_String,new_String);

proc sql;

update &table_name

set &col_name = &new_String

where &col_name = &old_String ;

quit;

%mend sql_replace;

data a1;

input x1$ x2$ @@;

cards;

aaa www

bbb www

ccc www

;

%sql_replace(a1,x1,"aaa","bbb");

2014年11月27日星期四

SAS的函數介紹 IFN,LAG,DIF functions IFN()、LAG()、DIF()

健保資料庫-抓某一疾病一年內發生幾次

提供以下程式碼讓大家參考：

當我們在處理健保資料庫時，也許會遇到要抓某一疾病在一段時間內發生幾次，

此時苦惱語法要怎麼寫嗎?
以下Macro可以協助解決這個問題，

我這邊是設定大於30且小於365天，這邊可以自行修改，

我之所以會設定大於30天是因為怕疾病錯誤認定。

**************************Beginning***********************************;

%macro Dz_times(data_in,dz_date_name,occur1,occur2,data_out);

data b1;set &data_in;

by id;

dz_date2=input(&dz_date_name,yymmdd8.);

one_yr_&occur1.times=ifn( first.ID , (.) , dif&occur2.(dz_date2) );

if 30<one_yr_&occur1.times<=365 then day365_&occur1.times=1;

run;

data day365_&occur1.times; set b1;if day365_&occur1.times=1;run;

data &data_out; set day365_&occur1.times; by id; if first.id ;run;

*有 ? 人在一年內發生occur1 次疾病;

%mend;

%Dz_times(data_in=DM_op_all,dz_date_name=dz_date,occur1=4,occur2=3,data_out=b2);

*變項解釋;

*data_in：放入的資料之名稱;

*dz_date_name：疾病發生日期的變數之名稱;

*occur1：一年內發生多少次疾病;

*occur2：請填入occur1-1的數字 (可以再想如何改寫語法);

*data_out：輸出的資料之名稱;

******************************End**********************************;

2014年4月14日星期一

SAS函數的介紹 INDEX functions

[INDEX]:找出字串中，某一個字或字串首次出現的位子

Function: INDEX(var,"value")

我們可以藉由他找出字串中是否有我們感興趣的字或是字串。

舉例來說：

found_A_first = index(X, "A") *found_A_first會回應變相X首次出現的位子;

當我們想要找字串中是否有我們感興趣的字或是字串時，

也可以利用index，

舉例來說：

if index(X, "A") > 0 then found_A = 1; else found_A = 0;

*若A有出現在X中，那index(X,"A")的回應值會大於0，因此語法可以這樣下;

若需要找出有A or B字元或是A&B字元可以利用

if found_A+found_B>= 1 then found_A_or_B = 1; else found_A_or_B = 0;

if found_A+found_B=2 then found_A_and_B = 1; else found_A_and_B = 0;

data a;

input X $ ;

cards;

ABC

BCD

BCDE

;

run;

data b;

set a;

found_A_first = index(X, "A");

found_B_first = index(X, "B");

found_AB_first = index(X, "AB");

found_BC_first = index(X, "BC");

if index(X, "A") > 0 then found_A = 1; else found_A = 0;

if index(X, "B") > 0 then found_B = 1; else found_B = 0;

if index(X, "AB") > 0 then found_AB = 1; else found_AB = 0;

if found_A+found_B>= 1 then found_A_or_B = 1; else found_A_or_B = 0;

if found_A+found_B=2 then found_A_and_B = 1; else found_A_and_B = 0;

run;

proc print data = b;

run;

2013年11月14日星期四

多層次模型 (Multilevel Model)

多層次模型 (Multilevel Model)

簡單介紹與使用SAS處理Multilevel Model，

本篇使用到三個比較常用的簡單模型，分別為The Variance components model、The Random intercept model、The Random slope model。

SAS Codes：

LIBNAME lib EXCEL "D:\SasData\jsp.XLSX" ;

data jsp; set lib."jsp2$"N; run;

ods graphics on;

/*The_variance_components_model*/

Proc mixed data=jsp method= ml noclprint covtest;

Class Gender Social_class School_ID;

Model math_yr_3= /s ddfm=bw OUTP=p1 OUTPM=pm1 RESIDUAL;

Random intercept /type=un sub=School_ID;

Run;

/*畫圖*/

goptions reset=global;

axis1 label =( f= "arial/bo" h=1.2 "8-year math score")

order = (0 to 45 by 5);

axis2 label =(a=90 f= "arial/bo" h=1.2 "11-year math score")

order = (26 to 37 by 5);

title "The variance components model";

symbol i=j r=50;

proc gplot data = p1;

where school_id<51;

plot pred*math_yr_1=school_id/ haxis=axis1 vaxis=axis2;

run;

quit;

/*Random_intercept_model*/

Proc mixed data=jsp method= ml noclprint covtest;

Class Gender Social_class School_ID;

Model math_yr_3=math_yr_1 /s ddfm=bw OUTP=p2 OUTPM=pm2 RESIDUAL;

Random intercept /type=un

sub=School_ID;

Run;

/*畫圖*/

goptions reset=global;

axis1 label =( f= "arial/bo" h=1.2 "8-year math score")

order = (0 to 45 by 5);

axis2 label =(a=90 f= "arial/bo" h=1.2 "11-year math score")

order = (0 to 45 by 5);

title "The Random Intercept Model";

symbol i=j r=50;

proc gplot data = p2;

where school_id<51;

plot pred*math_yr_1=school_id/ haxis=axis1 vaxis=axis2;

run;

quit;

/*Random_Slope_model*/

Proc mixed data=jsp method= ml noclprint covtest;

Class Gender Social_class School_ID;

Model math_yr_3=math_yr_1 /s ddfm=bw OUTP=p3 OUTPM=pm3 RESIDUAL;

Random intercept math_yr_1 /type=un sub=School_ID;

Run;

proc print data=pm3;

run;

/*畫圖*/

goptions reset=global;

axis1 label =( f= "arial/bo" h=1.2 "8-year math score")

order = (0 to 45 by 5);

axis2 label =(a=90 f= "arial/bo" h=1.2 "11-year math score")

order = (0 to 45 by 5);

title "The Random Coefficient Model";

symbol i=j r=50;

proc gplot data=p3;

where school_id<51 ;

plot pred*math_yr_1=school_id/ haxis=axis1 vaxis=axis2 href=0;

run;

quit;

資料來源：
http://www.bristol.ac.uk/cmm/

資料下載：https://www.dropbox.com/s/hl0h1uqkin44gtf/jsp.xlsx

2016年8月8日 星期一

資料處理-尋找指定欄位中的每筆資料的舊字串，取代成新字串

2014年11月27日 星期四

SAS的函數介紹 IFN,LAG,DIF functions IFN()、LAG()、DIF()

健保資料庫-抓某一疾病一年內發生幾次

2014年4月14日 星期一

SAS函數的介紹 INDEX functions

2013年11月14日 星期四

多層次模型 (Multilevel Model)

2016年8月8日星期一

2014年11月27日星期四

2014年4月14日星期一

2013年11月14日星期四