IPUMS Complete Count Data
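SlowMover's note: The IPUMS-USA project I work on released the 100% complete count data for the 1930 and 1920 U.S. censuses (preliminary versions) this week, and I took part in the automated coding of some of the variables. With this release, we have published complete count data for the three censuses from 1920 to 1940 (preliminary versions) and for the 1850 and 1880 censuses (final versions).
Interested researchers are welcome to download and use the data and to send us comments and suggestions. Thank you!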
1930 Preliminary Complete Count Data: The result of a recent collaboration between the Minnesota Population Center and Ancestry.com, the 1930 complete count database is now available through IPUMS-USA. A few notes about this complete count database:
- Households with more than 60 people in the original data were broken up for processing purposes. Every person in such a large household is treated as being in their own household. The original large households can be identified using the variable SPLIT and reconstructed using the variable SPLITHID; the original person count is in the variable SPLITNUM.
- Coding of variables derived from string variables, including occupation and industry, is still in progress.
- We have allocated missing observations and edited some inconsistencies for the following variables: SPEAKENG, YRIMMIG, CITIZEN, AGEMARR, AGE, BPL, MBPL, FBPL, LIT, SCHOOL, OWNERSHP, FARM, EMPSTAT, OCC1950, IND1950, MTONGUE, MARST, RACE, SEX, RELATE, CLASSWKR. The flag variables that mark allocated observations for these variables can be included in your extract by checking the "Select data quality flags" box on the extract summary page.
- Most inconsistent information was not edited for this release, so some observations fall outside the universe for some variables.
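The large-household note above describes how split households can be put back together. Below is a minimal Python sketch of that idea, not an official IPUMS tool. It assumes each person record is a dict, that SPLIT equals 1 for persons from a broken-up household (check the variable's codes in your extract), and that SERIAL identifies ordinary households. The function names and the sample records are hypothetical.

```python
from collections import defaultdict


def rebuild_households(persons):
    """Group person records into households, rejoining split large households.

    Assumption: SPLIT == 1 marks a person whose original household was
    broken up; verify the actual coding against the IPUMS documentation.
    """
    households = defaultdict(list)
    for p in persons:
        if p["SPLIT"] == 1:
            # Persons from the same original household share a SPLITHID.
            key = ("split", p["SPLITHID"])
        else:
            key = ("serial", p["SERIAL"])
        households[key].append(p)
    return dict(households)


def check_counts(households):
    """Return keys of rebuilt households whose size disagrees with SPLITNUM."""
    mismatched = []
    for key, members in households.items():
        if key[0] == "split" and any(m["SPLITNUM"] != len(members) for m in members):
            mismatched.append(key)
    return mismatched


# Made-up records for illustration only.
persons = [
    {"SERIAL": 1, "SPLIT": 0, "SPLITHID": 0, "SPLITNUM": 0},
    {"SERIAL": 1, "SPLIT": 0, "SPLITHID": 0, "SPLITNUM": 0},
    {"SERIAL": 2, "SPLIT": 1, "SPLITHID": 900, "SPLITNUM": 3},
    {"SERIAL": 3, "SPLIT": 1, "SPLITHID": 900, "SPLITNUM": 3},
    {"SERIAL": 4, "SPLIT": 1, "SPLITHID": 900, "SPLITNUM": 3},
]
households = rebuild_households(persons)
```

Comparing each rebuilt group's size with SPLITNUM, as `check_counts` does, is a cheap way to notice when an extract contains only part of an original household, for example after subsetting by geography.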
1920 Preliminary Complete Count Data: The result of a recent collaboration between the Minnesota Population Center and Ancestry.com, the 1920 complete count database is now available through IPUMS-USA. A few notes about this complete count database:
- Households with more than 60 people in the original data were broken up for processing purposes. Every person in such a large household is treated as being in their own household. The original large households can be identified using the variable SPLIT and reconstructed using the variable SPLITHID; the original person count is in the variable SPLITNUM.
- Coding of variables derived from string variables, including occupation and industry, is still in progress.
- We have allocated missing observations and edited some inconsistencies for the following variables: SPEAKENG, YRIMMIG, CITIZEN, AGE, BPL, MBPL, FBPL, LIT, SCHOOL, OWNERSHP, MORTGAGE, FARM, CLASSWKR, OCC1950, IND1950, MARST, RACE, SEX, RELATE, MTONGUE. The flag variables that mark allocated observations for these variables can be included in your extract by checking the "Select data quality flags" box on the extract summary page.
- Most inconsistent information was not edited for this release, so some observations fall outside the universe for some variables.
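If you include the data quality flags mentioned in the notes above, one common use is to keep only records whose values were reported rather than allocated. The sketch below is illustrative only. It assumes each flag is named with a "Q" prefix on the variable name (for example QAGE for AGE) and that a flag value of 0 means "not allocated"; confirm both against the flag documentation for your extract. The function name and the sample records are hypothetical.

```python
def drop_allocated(records, variables, flag_prefix="Q"):
    """Keep records where none of the listed variables was allocated.

    Assumptions: the flag for variable V is named flag_prefix + V, and a
    flag value of 0 means the value was not allocated. A missing flag
    raises KeyError, which signals that the extract lacks that flag.
    """
    kept = []
    for rec in records:
        if all(rec[flag_prefix + v] == 0 for v in variables):
            kept.append(rec)
    return kept


# Made-up records for illustration only.
records = [
    {"AGE": 34, "QAGE": 0, "SEX": 1, "QSEX": 0},
    {"AGE": 50, "QAGE": 4, "SEX": 2, "QSEX": 0},
]
reported = drop_allocated(records, ["AGE", "SEX"])
```

Raising an error on a missing flag, instead of silently treating it as "not allocated", makes it obvious when the flags box was not selected for the extract.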