This test data set was obtained on July 27, 2000 using the public use Data Extraction System of the U.S. Bureau of the Census.
A data frame with 1080 observations, sampled from the year 1995, on the following 13 variables.
AFNLWGT: Final weight (2 implied decimal places)
AGI: Adjusted gross income
EMCONTRB: Employer contribution for health insurance
FEDTAX: Federal income tax liability
PTOTVAL: Total person income
STATETAX: State income tax liability
TAXINC: Taxable income amount
POTHVAL: Total other persons income
INTVAL: Amount of interest income
PEARNVAL: Total person earnings
FICA: Social security retirement payroll deduction
WSALVAL: Total wage and salary amount
ERNVAL: Business or farm net earnings
Public use file from the CASC project. More information on this test data can be found in the paper listed below.
Brand, R., Domingo-Ferrer, J. and Mateo-Sanz, J.M. Reference data sets to test and compare SDC methods for protection of numerical microdata. Unpublished. https://research.cbs.nl/casc/CASCrefmicrodata.pdf
data(CASCrefmicrodata)
str(CASCrefmicrodata)
#> 'data.frame': 1080 obs. of 13 variables:
#> $ AFNLWGT : int 270914 250802 299391 167656 176962 193328 178808 260530 187347 253471 ...
#> $ AGI : int 45554 57610 56606 38993 40462 30406 8730 25938 95500 72700 ...
#> $ EMCONTRB: int 4173 2639 3315 1619 4604 3433 824 4145 5575 3894 ...
#> $ FEDTAX : int 4621 6045 4765 3932 4349 2463 372 629 12830 8756 ...
#> $ PTOTVAL : int 45527 42008 56485 23580 21751 32167 8730 25001 95500 44850 ...
#> $ STATETAX: int 1428 1902 1903 1177 1219 830 186 693 3406 2496 ...
#> $ TAXINC : int 30809 39234 31767 26216 28994 16420 2480 4194 65129 48915 ...
#> $ POTHVAL : int 27 1008 485 700 751 167 1030 1 6500 2850 ...
#> $ INTVAL : int 27 808 485 700 1 50 22 1 5000 2500 ...
#> $ PEARNVAL: int 45500 41000 56000 22880 21000 32000 7700 25000 89000 42000 ...
#> $ FICA : int 3480 3136 4284 1750 1606 2448 589 1912 5047 3213 ...
#> $ WSALVAL : int 45500 41000 56000 22880 21000 32000 7700 25000 89000 42000 ...
#> $ ERNVAL : int 45500 41000 56000 22880 21000 32000 7000 25000 89000 42000 ...
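The final weight is described as having 2 implied decimal places. A minimal sketch of how that might be handled, assuming the phrase carries its usual meaning (the stored integer is the weight multiplied by 100). The vector holds the first ten AFNLWGT values copied from the str() output above, and the names afnlwgt_raw and afnlwgt are illustrative only:

```r
# First ten AFNLWGT values, copied from the str() output above.
afnlwgt_raw <- c(270914, 250802, 299391, 167656, 176962,
                 193328, 178808, 260530, 187347, 253471)

# Assumption: "2 implied decimal places" means the stored integer
# is the weight times 10^2, so dividing by 100 recovers the weight.
afnlwgt <- afnlwgt_raw / 100

afnlwgt[1:3]
#> [1] 2709.14 2508.02 2993.91
```

With the data set loaded, the same rescaling would be CASCrefmicrodata$AFNLWGT / 100.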