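How do I implement the Heckman two-step procedure in Stata or SAS (code or step-by-step instructions)?
Here is some code excerpted from my summer training course handouts, for reference.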
(arlionn/software)
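*-4.3.2 Estimation methods for the Heckman model
*-Maximum likelihood estimation (MLE)
*   based on the bivariate normal distribution
*-Two-step estimation
*-Step 1: Probit (selection equation) --> Prob(Z=1)
*   Pr(y_j observed | z_j) = normal(zg)
*   Inverse Mills ratio: IMR = normalden(zg)/normal(zg)
*   The inverse Mills ratio corrects the bias caused by self-selection
*-Step 2: reg y x IMR   // add the inverse Mills ratio to the regression equation
*-Note: normalden() is the density function of the standard normal distribution,
*       normal() is its cumulative distribution function.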
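For reference, the two-step logic in the notes above can be written out in the usual textbook form. This is a sketch in notation introduced here (the latent index $s_j^{*}$ and the symbols $\beta$, $\gamma$, $\rho$, $\sigma$, $\lambda$ are not taken from the handout), assuming jointly normal errors:

```latex
\begin{aligned}
\text{Selection:}\quad & s_j^{*} = z_j\gamma + u_j, \qquad y_j \text{ observed iff } s_j^{*} > 0 \\
\text{Outcome:}\quad   & y_j = x_j\beta + \varepsilon_j \\
\text{Errors:}\quad    & (u_j, \varepsilon_j) \sim N\!\left(0,\;
      \begin{pmatrix} 1 & \rho\sigma \\ \rho\sigma & \sigma^{2} \end{pmatrix}\right) \\
\text{Step 1:}\quad    & \Pr(y_j \text{ observed} \mid z_j) = \Phi(z_j\gamma), \qquad
      \lambda(z_j\gamma) = \frac{\phi(z_j\gamma)}{\Phi(z_j\gamma)} \\
\text{Step 2:}\quad    & E[\,y_j \mid x_j, z_j, y_j \text{ observed}\,]
      = x_j\beta + \rho\sigma\,\lambda(z_j\gamma)
\end{aligned}
```

Under these assumptions the coefficient on the IMR in the second-step regression estimates $\rho\sigma$, so a coefficient significantly different from zero points to selection bias in plain OLS.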
Here is a worked example:
*------
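*-4.3.3 Heckman application example: women's wages
*-Heckman Model: An Example
*-Reference: Winkelmann, R., S. Boes, 2006,
*   Analysis of microdata[M], Springer.
*   p.232 is essentially sufficient; the book explains this very clearly.
    shellout "$RWinkelmann_2006.pdf"   // opens a PDF from the course materials; skip if you do not have it
* see Adkins, L., R. Hill, 2008,
*   Using stata for principles of econometrics, Wiley.
*   p.397 walks through the example in detail
    shellout "$RAdkins_2008.pdf"       // opens a PDF from the course materials; skip if you do not have it
*-Data overview
capture ssc install bcuse, replace //install the -bcuse- command, which downloads the data automatically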
bcuse mroz, clear
gen exper2 = exper*exper
generate kids = (kidslt6 + kidsge6 > 0)
label define kids 1 "with kids" 0 "no kids"
label value kids kids
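des2   // not a built-in command; -describe- is an alternative if it is not installed
*-wage has missing values: 325 women do not work, so wage=.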
tabulate inlf
count if wage==.
*-Which women are more likely not to work?
tabstat wage educ exper kidslt6 kidsge6 ///
huseduc huswage faminc city, ///
by(inlf) format(%3.1f) // Mean
  inlf |  wage  educ  exper  kidslt6  kidsge6  huseduc  huswage   faminc  city
-------+----------------------------------------------------------------------
     0 |     .  11.8    7.5      0.4      1.4     12.3      7.8  21698.1   0.6
     1 |   4.2  12.7   13.0      0.1      1.4     12.6      7.2  24130.4   0.6
-------+----------------------------------------------------------------------
 Total |   4.2  12.3   10.6      0.2      1.4     12.5      7.5  23080.6   0.6
------------------------------------------------------------------------------
*-OLS vs. Heckman selection model
*-OLS
regress lwage educ exper exper2 if (hours>0)
est store OLS
*-Heckman two-step
global x "educ exper exper2"
global z "age educ kidslt6 mtr huswage"
heckman lwage $x, select(inlf=$z) twostep
est store Heck2s
*-Heckman maximum likelihood
heckman lwage $x, select(inlf=$z)
est store HeckMLE
*-Compare the results (-esttab- comes from the user-written estout package)
local m "OLS Heck2s HeckMLE"
esttab `m', mtitle(`m') nogap compress s(rho N r2)
Comparison of results:
---------------------------------------------------
                   (1)          (2)          (3)
                   OLS       Heck2s      HeckMLE
---------------------------------------------------
main
educ             0.107***    0.0888***    0.0837***
                 (7.60)       (5.55)       (5.20)
exper           0.0416**     0.0381**     0.0342*
                 (3.15)       (2.92)       (2.57)
exper2       -0.000811*   -0.000704    -0.000623
                (-2.06)      (-1.81)      (-1.60)
_cons           -0.522**    -0.0742       0.0711
                (-2.63)      (-0.29)       (0.26)
---------------------------------------------------
inlf
age                         -0.0354***   -0.0313***
                             (-5.14)      (-4.38)
educ                          0.108***    0.0927***
                              (4.40)       (3.85)
kidslt6                      -0.802***    -0.705***
                             (-7.14)      (-5.91)
mtr                          -5.628***    -6.179***
                             (-5.82)      (-6.69)
huswage                      -0.118***    -0.113***
                             (-6.30)      (-6.06)
_cons                         5.276***     5.598***
                              (5.54)       (6.02)
---------------------------------------------------
/
mills                        -0.323**
                             (-2.74)
athrho                                    -0.643**
                                          (-3.28)
lnsigma                                   -0.334***
                                          (-6.08)
---------------------------------------------------
rho                          -0.464       -0.567
N                  428          753          753
r2               0.157
---------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
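As a reading aid for the HeckMLE column: Stata's maximum likelihood estimator reports the ancillary parameters as athrho (the inverse hyperbolic tangent of $\rho$) and lnsigma (the log of $\sigma$). A short back-of-the-envelope conversion using the rounded table values, so the last digits are approximate:

```latex
\begin{aligned}
\rho    &= \tanh(\text{athrho}) = \tanh(-0.643) \approx -0.567 \\
\sigma  &= \exp(\text{lnsigma}) = \exp(-0.334) \approx 0.716 \\
\lambda &= \rho\,\sigma \approx (-0.567)(0.716) \approx -0.406
\end{aligned}
```

The first line reproduces the rho row of the table. The implied ML value of $\rho\sigma$ (about $-0.41$) has the same sign and a similar size to the two-step mills coefficient ($-0.323$).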
*-Dissecting the Heckman two-step method (computed by hand, to make the principle clear)
global x "educ exper exper2"
global z "age educ kidslt6 mtr huswage"
probit inlf $z
predict w, xb
*-Inverse Mills Ratio (IMR)
generate IMR = normalden(w)/normal(w)
* Heckit two-step
regress lwage $x IMR
Results:
      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(4, 423)     =   21.97
       Model |  38.4091443     4  9.60228608           Prob > F      =  0.0000
    Residual |  184.918307   423  .437159118           R-squared     =  0.1720
-------------+------------------------------           Adj R-squared =  0.1642
       Total |  223.327451   427  .523015108           Root MSE      =  .66118
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0887514   .0155663     5.70   0.000     .0581544    .1193483
       exper |   .0380563   .0131323     2.90   0.004     .0122436    .0638689
      exper2 |  -.0007045    .000392    -1.80   0.073     -.001475    .0000661
         IMR |  -.3234024   .1161891    -2.78   0.006    -.5517824   -.0950224
       _cons |  -.0741731   .2544157    -0.29   0.771    -.5742497    .4259034
------------------------------------------------------------------------------
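The hand-computed coefficients match the Heck2s column, but the t statistics differ slightly (for example 5.70 vs. 5.55 for educ). This is because plain OLS treats the IMR as known rather than estimated, so its standard errors are not adjusted. A minimal sketch for checking the hand-built IMR against the one that heckman itself produces, assuming the mroz data and the variable IMR from the steps above are still in memory (IMR_chk is a hypothetical variable name introduced here):

```stata
* Sketch: compare the manual IMR with the one saved by -heckman-
* Assumes mroz is loaded and IMR was generated as in the manual steps above
* IMR_chk is a hypothetical variable name, not part of the original handout
heckman lwage educ exper exper2, ///
    select(inlf = age educ kidslt6 mtr huswage) twostep mills(IMR_chk)

* The two variables should agree up to rounding
summarize IMR IMR_chk
correlate IMR IMR_chk
```

For reported results it is safer to use the standard errors from heckman, twostep rather than those from the manual second-step regression.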
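Open Stata and type help heckman; the help file is very detailed. The general form is: heckman depvar indepvars, select(varlist) twostep
Stata's heckman command covers both estimators: add the twostep option for the Heckman two-step estimator, or omit it for maximum likelihood.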