How do I run the Heckman two-step procedure in Stata or SAS? Code or step-by-step instructions would be appreciated.


Here is some code excerpted from my summer training course handouts, for reference.

(arlionn/software)

*-4.3.2 Estimating the Heckman model

*-Maximum likelihood estimation (MLE)

*-based on the bivariate normal distribution

*-Two-step estimation

*-Step 1: probit (selection equation) --> Pr(y observed)

* Pr(y_i observed | z_i) = \Phi(z_i \gamma)

* Inverse Mills ratio: IMR = \frac{\phi(z_i \hat{\gamma})}{\Phi(z_i \hat{\gamma})}

* The inverse Mills ratio corrects the bias caused by self-selection

*-Step 2: reg y x IMR //add the inverse Mills ratio to the outcome regression

*-Note:

* \phi(\cdot) is the standard normal density function,

* \Phi(\cdot) is the standard normal cumulative distribution function.
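The two steps in the notes follow from the conditional mean of the outcome in the selected sample. A compact statement of the model is sketched below; the symbols s_i, u_i, \varepsilon_i, \rho and \sigma are notation introduced here for illustration and do not appear in the handout.

```latex
\begin{aligned}
s_i &= \mathbf{1}\{\, z_i\gamma + u_i > 0 \,\} && \text{(selection equation)}\\
y_i &= x_i\beta + \varepsilon_i, \quad \text{observed only if } s_i = 1 && \text{(outcome equation)}\\
(u_i, \varepsilon_i) &\sim N\!\left(0,\ \begin{pmatrix} 1 & \rho\sigma \\ \rho\sigma & \sigma^2 \end{pmatrix}\right)\\
E[\,y_i \mid x_i, z_i, s_i = 1\,] &= x_i\beta + \rho\sigma\,\lambda(z_i\gamma),
\qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)}
\end{aligned}
```

Under these assumptions, the coefficient on the IMR in Step 2 estimates \rho\sigma, so a coefficient near zero suggests little selection bias.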

Here is a worked example:

*------

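*-4.3.3 Heckman application: women's wages

*-Heckman Model: An Example

*-Reference: Winkelmann, R., S. Boes, 2006,

* Analysis of microdata[M], Springer.

* p.232 is essentially sufficient; the book's exposition is very clear.

shellout "${R}/Winkelmann_2006.pdf" //opens a local PDF; $R is assumed to be a user-defined folder global

* see Adkins, L., R. Hill, 2008,

* Using stata for principles of econometrics, Wiley.

* p.397: detailed explanation of the example

shellout "${R}/Adkins_2008.pdf" //same assumption about $R

*-Data overview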

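capture ssc install bcuse, replace //install -bcuse-, which downloads the data automatically
bcuse mroz, clear
gen exper2 = exper*exper
generate kids = (kidslt6 + kidsge6 > 0)
label define kids 1 "with kids" 0 "no kids"
label value kids kids
describe //the handout calls -des2- here, which is not a built-in command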

*-wage has missing values: 325 women do not work, so wage = .
tabulate inlf
count if wage==.

*-Which women are more likely not to work?
tabstat wage educ exper kidslt6 kidsge6 ///
huseduc huswage faminc city, ///
by(inlf) format(%3.1f) // Mean

  inlf |  wage  educ  exper  kidslt6  kidsge6  huseduc  huswage   faminc  city
-------+----------------------------------------------------------------------
     0 |     .  11.8    7.5      0.4      1.4     12.3      7.8  21698.1   0.6
     1 |   4.2  12.7   13.0      0.1      1.4     12.6      7.2  24130.4   0.6
-------+----------------------------------------------------------------------
 Total |   4.2  12.3   10.6      0.2      1.4     12.5      7.5  23080.6   0.6
------------------------------------------------------------------------------

*-OLS vs. Heckman selection model

*-OLS
regress lwage educ exper exper2 if (hours>0)
est store OLS
*-Heckman two-step
global x "educ exper exper2"
global z "age educ kidslt6 mtr huswage"
heckman lwage $x, select(inlf=$z) twostep
est store Heck2s
*-Heckman maximum likelihood
heckman lwage $x, select(inlf=$z)
est store HeckMLE

*-Compare results (-esttab- is part of the estout package: ssc install estout)
local m "OLS Heck2s HeckMLE"
esttab `m', mtitle(`m') nogap compress s(rho N r2)

Comparison of results:

---------------------------------------------------
                      (1)          (2)          (3)
                      OLS       Heck2s      HeckMLE
---------------------------------------------------
main
educ             0.107***    0.0888***    0.0837***
                   (7.60)       (5.55)       (5.20)
exper            0.0416**     0.0381**      0.0342*
                   (3.15)       (2.92)       (2.57)
exper2         -0.000811*    -0.000704    -0.000623
                  (-2.06)      (-1.81)      (-1.60)
_cons            -0.522**      -0.0742       0.0711
                  (-2.63)      (-0.29)       (0.26)
---------------------------------------------------
inlf
age                         -0.0354***   -0.0313***
                               (-5.14)      (-4.38)
educ                          0.108***    0.0927***
                                (4.40)       (3.85)
kidslt6                      -0.802***    -0.705***
                               (-7.14)      (-5.91)
mtr                          -5.628***    -6.179***
                               (-5.82)      (-6.69)
huswage                      -0.118***    -0.113***
                               (-6.30)      (-6.06)
_cons                         5.276***     5.598***
                                (5.54)       (6.02)
---------------------------------------------------
/
mills                         -0.323**
                               (-2.74)
athrho                                     -0.643**
                                            (-3.28)
lnsigma                                   -0.334***
                                            (-6.08)
---------------------------------------------------
rho                             -0.464       -0.567
N                     428          753          753
r2                  0.157
---------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
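The ancillary rows for the ML column can be read back into the more familiar quantities. As a rough sketch, assuming the usual parameterization in which athrho is the inverse hyperbolic tangent of rho and lnsigma is the log of sigma, the rounded values typed in from the HeckMLE column give:

```stata
* Values copied by hand from the HeckMLE column above (rounded to 3 digits)
scalar athrho  = -0.643
scalar lnsigma = -0.334

* Back-transform: rho = tanh(athrho), sigma = exp(lnsigma)
scalar rho    = tanh(athrho)
scalar sigma  = exp(lnsigma)
scalar lambda = rho*sigma        // ML counterpart of the two-step "mills" coefficient

display "rho    = " %6.3f rho    // matches the -0.567 reported in the rho row
display "sigma  = " %6.3f sigma
display "lambda = " %6.3f lambda
```

The implied lambda can then be set beside the two-step mills coefficient (-0.323) to see how far the two estimators differ on this sample.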

*-Heckman two-step, dissected (computed by hand to make the principle easier to follow)

global x "educ exper exper2"
global z "age educ kidslt6 mtr huswage"
probit inlf $z
predict w, xb
*-Inverse Mills Ratio (IMR)
generate IMR = normalden(w)/normal(w)
* Heckit two-step
regress lwage $x IMR

Results:

      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(4, 423)     =   21.97
       Model |  38.4091443     4  9.60228608           Prob > F      =  0.0000
    Residual |  184.918307   423  .437159118           R-squared     =  0.1720
-------------+------------------------------           Adj R-squared =  0.1642
       Total |  223.327451   427  .523015108           Root MSE      =  .66118

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0887514   .0155663     5.70   0.000     .0581544    .1193483
       exper |   .0380563   .0131323     2.90   0.004     .0122436    .0638689
      exper2 |  -.0007045    .000392    -1.80   0.073     -.001475    .0000661
         IMR |  -.3234024   .1161891    -2.78   0.006    -.5517824   -.0950224
       _cons |  -.0741731   .2544157    -0.29   0.771    -.5742497    .4259034
------------------------------------------------------------------------------
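The manual regression reproduces the Heck2s point estimates (IMR coefficient -0.323, educ 0.0888), but its t statistics differ slightly (for the IMR, -2.78 here against -2.74 from heckman). This is because plain OLS treats the IMR as known data rather than as an estimate from the probit. The sketch below puts the two side by side; it assumes internet access for bcuse and that e(lambda) and e(selambda) are available as stored results after heckman with the twostep option.

```stata
* Sketch: compare the hand-computed second step with -heckman, twostep-
capture ssc install bcuse        // capture suppresses any error from the install step
bcuse mroz, clear
generate exper2 = exper^2
global x "educ exper exper2"
global z "age educ kidslt6 mtr huswage"

* Built-in two-step estimator; keep lambda and its corrected standard error
quietly heckman lwage $x, select(inlf = $z) twostep
scalar lam_heck    = e(lambda)
scalar se_lam_heck = e(selambda)

* Manual version: probit index -> IMR -> OLS on the women with observed wages
quietly probit inlf $z
predict double w, xb
generate double IMR = normalden(w)/normal(w)
quietly regress lwage $x IMR

display "lambda  heckman: " %9.6f lam_heck    "   manual: " %9.6f _b[IMR]
display "s.e.    heckman: " %9.6f se_lam_heck "   manual: " %9.6f _se[IMR]
```

For reported results, the standard errors from heckman are the ones to use; the manual version is mainly useful for seeing where the correction term comes from.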



Open Stata and type help heckman; the help file explains the command in detail.

The general form is: heckman depvar indepvars, select(selection-equation variables) twostep


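Stata's heckman command covers both estimators: add the twostep option for Heckman two-step, or omit it for Heckman maximum likelihood (the default).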
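As a minimal sketch of that syntax, the lines below use womenwk, one of Stata's example datasets (fetched with webuse, so internet access is assumed); the variable choice follows the dataset and is only illustrative.

```stata
* Sketch on Stata's example dataset womenwk
webuse womenwk, clear

* Two-step: outcome equation is wage on educ and age;
* the selection equation uses married, children, educ and age
heckman wage educ age, select(married children educ age) twostep

* Same model by maximum likelihood (the default when twostep is omitted)
heckman wage educ age, select(married children educ age)
```

When select() is given without its own dependent variable, as here, an observation counts as selected when the outcome variable (wage) is not missing. Writing select(inlf = ...) instead names the selection indicator explicitly.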

