Adjusting for Unequal Selection Probability in Multilevel Models: A Comparison of Software Packages
by Kim Chantala, C. M. Suchindran, and Dan Blanchette
2005
Overview
- Compare capabilities of multilevel modeling software packages for analyzing data collected with a complex sampling plan
- Describe characteristics of survey data that can influence estimates
- Construct sampling weights for estimating multilevel models
- Contrast results from estimating a two-level model with different software packages
Comparison of Software Packages: General Information
Capabilities compared: SEM analysis, MLM analysis, adjustment for clustering, adjustment for stratification.
Packages compared: MPLUS 3.1, LISREL 8.7, GLLAMM (Stata 8), MLWIN 1.1, HLM 6.0, MIXED (SAS 8.2), NLMIXED (SAS 8.2).
[Table: per-package check marks not recoverable.]
Comparison of Software Packages: Implementation of Sampling Weights

| Package | Method for Scaling MLM Sampling Weights | Responsibility for Scaling MLM Sampling Weights |
|---|---|---|
| MPLUS 3.1 | Asparouhov (2004) | User |
| LISREL 8.7 | Pfeffermann (1998) | User |
| GLLAMM (Stata 8) | Pfeffermann (1998) | User |
| MLWIN 1.1 | Pfeffermann (1998) | User or MLWIN default |
| HLM 6.0 | Normalize | HLM default |
| MIXED (SAS 8.2) | Unknown | User |
| NLMIXED (SAS 8.2) | Grilli and Pratesi (2004) | User |

[Column "Allow MLM Sampling Weights": per-package check marks not recoverable.]
Comparison of Software Packages: MLM Analyses with Sampling Weights
Outcome types compared: multinomial categorical, ordered categorical, normal, binary, Poisson.
Packages compared: MPLUS 3.1, LISREL 8.7, GLLAMM (Stata 8), MLWIN 1.1, HLM 6.0, MIXED (SAS 8.2), NLMIXED (SAS 8.2).
[Table: per-package check marks not recoverable.]
Survey Data Characteristics: Design of Add Health
- 80 high schools were selected with probability proportional to size from a list of 26,666 schools sorted by enrollment size, region of country, school type, location, and percent white.
- 52 of the high schools did not include a 7th or 8th grade. For each of these, a feeder school was selected with probability proportional to the percentage of the high school's entering class that came from that feeder school.
- 52 feeder schools + 80 high schools = 132 schools.
- 18,924 students were selected from the 132 schools for the Wave I In-Home Interview:
  - Core sample
  - Disabled sample
  - All students from 16 schools
  - Ethnic samples: High SES Black, Cuban, Puerto Rican, Chinese
  - Genetic samples: twins, full siblings, half siblings, unrelated adolescents in the same household
Constructing Multilevel Weights
Weight components needed to construct sampling weights for a two-level analysis using the Add Health data:
- Level 1. Unit interviewed: adolescent i enrolled in school j. Weight component*: fsu_wt_{i|j}. Meaning: number of adolescents enrolled in school j with the same characteristics as adolescent i.
- Level 2. Unit interviewed: school j. Weight component*: psu_wt_j. Meaning: number of schools in the U.S. with the same characteristics as school j.

* Stata programs for constructing sampling weights for estimating two-level models can be downloaded from our website (http://www.cpc.unc.edu/restools/data_analysis/ml_sampling_weights) after August 1, 2005. These programs implement methods from Pfeffermann (1998) and Asparouhov (2004).
Some MLM Software Packages Require Special Weights* Constructed for Each Level
Level 1 (adolescent) weight: $\mathrm{fsu\_m2wt}_{i|j} = \mathrm{fsu\_wt}_{i|j} \Big/ \left( \sum_{i=1}^{n_j} \mathrm{fsu\_wt}_{i|j} \, / \, n_j \right)$
Level 2 (school) weight: psu_m2wt_j
[Figure: two histograms of frequency against weight midpoint, one for the Level 2 (school) weights psu_m2wt_j and one for the Level 1 (adolescent) weights fsu_m2wt_{i|j}.]
*Method of weight construction from Pfeffermann (1998)
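The level-1 scaling can be illustrated with a small sketch. It assumes the scaling usually described as Pfeffermann et al.'s (1998) scaling method 2: each adolescent weight is divided by the mean adolescent weight in the same school, so the scaled weights in school j sum to the school's sample size n_j. The function name, record layout, and toy weights below are hypothetical and are not taken from the authors' Stata programs.

```python
from collections import defaultdict


def scale_level1_method2(records):
    """Scale level-1 weights so they sum to the sample size within each school.

    records: list of (school_id, fsu_wt) pairs, one per sampled adolescent.
    Returns the scaled weights in the same order as the input.
    """
    totals = defaultdict(float)  # sum of fsu_wt within each school
    counts = defaultdict(int)    # n_j: number of sampled adolescents per school
    for school, weight in records:
        totals[school] += weight
        counts[school] += 1
    # fsu_wt divided by the mean fsu_wt in the same school
    return [weight * counts[school] / totals[school] for school, weight in records]


# Toy data (hypothetical): two schools with 2 and 3 sampled adolescents.
sample = [("A", 2.0), ("A", 6.0), ("B", 5.0), ("B", 5.0), ("B", 10.0)]
scaled = scale_level1_method2(sample)
for (school, weight), new_weight in zip(sample, scaled):
    print(school, weight, new_weight)
```

After scaling, the weights in school A sum to 2 and those in school B sum to 3, which is the defining property of this scaling.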
Other MLM Software Packages Require One Weight* That Combines the Weights from Each Level in a Particular Way
Combined weight: $\mathrm{mpml\_wta}_{i,j} = \left( \mathrm{fsu\_wt}_{i|j} \times \mathrm{psu\_wt}_{j} \right) \Big/ \left( \sum_{i=1}^{n_j} \mathrm{fsu\_wt}_{i|j} \, / \, n_j \right)$
[Figure: histogram of frequency against midpoint of the combined weight mpml_wta_{i,j}.]
*Method of weight construction from Asparouhov (2004)
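A minimal sketch of a single combined weight follows. It assumes the formula on this slide reads as the product of the adolescent and school weights divided by the mean adolescent weight within the school, which is how the slide's formula appears to read once its fragments are reassembled. The function name, record layout, and toy numbers are hypothetical.

```python
from collections import defaultdict


def combined_weight_method_a(records):
    """Build one combined weight per adolescent from the two weight components.

    records: list of (school_id, fsu_wt, psu_wt) tuples.
    Returns fsu_wt * psu_wt / (mean fsu_wt within the school) for each record.
    """
    totals = defaultdict(float)  # sum of fsu_wt within each school
    counts = defaultdict(int)    # n_j: number of sampled adolescents per school
    for school, fsu_wt, _ in records:
        totals[school] += fsu_wt
        counts[school] += 1
    return [
        fsu_wt * psu_wt * counts[school] / totals[school]
        for school, fsu_wt, psu_wt in records
    ]


# Toy data (hypothetical): school A represents 10 schools, school B represents 4.
sample = [
    ("A", 2.0, 10.0), ("A", 6.0, 10.0),
    ("B", 5.0, 4.0), ("B", 5.0, 4.0), ("B", 10.0, 4.0),
]
combined = combined_weight_method_a(sample)
print(combined)
```

Within each school the combined weights sum to n_j times the school weight, so the school-level weight still governs how much each school contributes.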
Illustrative Example
Research question: How is the effect of hours watching TV on BMI of students in a school influenced by the availability of a school recreation center?
- Data come from the National Longitudinal Study of Adolescent Health (Add Health).
- Results are contrasted across MPLUS, MIXED, LISREL, MLWIN, and GLLAMM.
- Weights for MPLUS and MIXED are constructed with the Asparouhov (2004) method; weights for LISREL, MLWIN, and GLLAMM are constructed with the Pfeffermann (1998) method.
Data in Example

| Level | Variable | Meaning |
|---|---|---|
| School | RC_S | School has on-site recreation facility (0 = No, 1 = Yes) |
| Individual | BMIPCT | Percentile BMI for age and sex of adolescent |
| Individual | HR_WATCH | Hours watched TV or played video or computer games during the past week |
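Two-level Model
Student-level model (Within or Level 1):
$$\mathrm{BMIPCT}_{ij} = \{\beta_{0j} + \beta_{1j}(\mathrm{HR\_WATCH}_{ij})\} + e_{ij}$$
where $E(e_{ij}) = 0$ and $\mathrm{Var}(e_{ij}) = \sigma^2$.
School-level model (Between or Level 2):
$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\mathrm{RC\_S})_j + u_{0j}$$
$$\beta_{1j} = \gamma_{10} + \gamma_{11}(\mathrm{RC\_S})_j + u_{1j}$$
where $E(u_{0j}) = E(u_{1j}) = 0$, $\mathrm{Var}(u_{0j}) = \sigma^2_{u0}$, $\mathrm{Var}(u_{1j}) = \sigma^2_{u1}$, and $\mathrm{Cov}(u_{0j}, u_{1j}) = \sigma_{u0,u1}$.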
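Substituting the school-level equations into the student-level equation gives a single combined equation, which makes the role of each fixed effect easier to see. The Greek letters were lost from the slide text, so the symbols below ($\beta$ for the school-specific coefficients, $\gamma$ for the fixed effects, $u$ for the school-level random effects) are assumed, conventional choices.

```latex
\begin{aligned}
\mathrm{BMIPCT}_{ij}
  &= \gamma_{00} + \gamma_{01}\,(\mathrm{RC\_S})_j
   + \gamma_{10}\,\mathrm{HR\_WATCH}_{ij}
   + \gamma_{11}\,(\mathrm{RC\_S})_j\,\mathrm{HR\_WATCH}_{ij} \\
  &\quad + u_{0j} + u_{1j}\,\mathrm{HR\_WATCH}_{ij} + e_{ij}
\end{aligned}
```

In this form the research question is addressed by the cross-level interaction $\gamma_{11}$: it is the difference in the slope of BMIPCT on HR_WATCH between schools with a recreation center (RC_S = 1) and schools without one (RC_S = 0).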
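Effect of Sampling Weights on Estimates

| Parameter | Range of Estimates Using Weights | Range of Estimates Ignoring Weights | Ratio |
|---|---|---|---|
| Fixed effects | | | |
| γ00 | 2.72 | 0.05 | 54.5 |
| γ01 | 3.08 | 0.08 | 38.5 |
| γ10 | 0.019 | 0.001 | 19.0 |
| γ11 | 0.055 | 0.003 | 18.3 |
| Random effects | | | |
| σ²(u0) | 12.41 | 0.53 | 23.4 |
| σ²(u1) | 0.008 | 0.0005 | 16.0 |
| σ(u0,u1) | 0.234 | 0.025 | 9.36 |
| σ² | 25.03 | 0.62 | 40.3 |

Range is the difference between the largest and smallest estimate across packages; Ratio is the range using weights divided by the range ignoring weights. When sampling weights were omitted from the analyses, all software packages gave nearly the same results.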
Analysis Results from Different Packages
Entries are Estimate (S.E.). MPLUS and MIXED used the MPML Method A weight; LISREL, MLWIN, and GLLAMM used PWIGLS Method 2 weights.

| Parameter | MPLUS 3.1 | MIXED 8.2 | LISREL 8.7 | MLWIN 1.1 | GLLAMM |
|---|---|---|---|---|---|
| Fixed effects | | | | | |
| γ00 | 60.19 (0.65) | 59.09 (0.79) | 57.83 (0.72) | 58.52 (0.58) | 57.47 (0.77) |
| γ01 | -4.49 (0.87) | -2.74 (1.10) | -1.678 (1.06) | -1.41 (0.95) | -1.51 (1.18) |
| γ10 | 0.033 (0.016) | 0.038 (0.020) | 0.045 (0.018) | 0.052 (0.013) | 0.049 (0.021) |
| γ11 | 0.12 (0.021) | 0.11 (0.027) | 0.099 (0.025) | 0.065 (0.022) | 0.101 (0.029) |
| Random effects | | | | | |
| σ²(u0) | 16.27 (4.04) | 24.84 (5.04) | 14.13 (3.18) | 12.43 (3.05) | 17.11 (4.74) |
| σ²(u1) | 0.002 (0.002) | 0.009 (0.003) | 0.002 (0.001) | 0.001 (0.001) | 0.007 (0.003) |
| σ(u0,u1) | -0.065 (0.067) | -0.241 (0.097) | -0.047 (0.047) | -0.007 (0.040) | -0.12 (0.08) |
| σ² | 794.36 (10.12) | 774.08 (8.19) | 792.95 (8.72) | 793.57 (8.38) | 799.11 (11.94) |
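The spread across packages can be checked directly from these estimates. The sketch below takes, for each parameter, the largest estimate minus the smallest across the five packages; the results match the "using weights" range reported on the earlier slide on the effect of sampling weights. The dictionary keys are hypothetical labels chosen for this example; the numbers are the point estimates from this slide.

```python
# Point estimates in the order MPLUS, MIXED, LISREL, MLWIN, GLLAMM.
estimates = {
    "gamma00": [60.19, 59.09, 57.83, 58.52, 57.47],
    "gamma01": [-4.49, -2.74, -1.678, -1.41, -1.51],
    "gamma10": [0.033, 0.038, 0.045, 0.052, 0.049],
    "gamma11": [0.12, 0.11, 0.099, 0.065, 0.101],
    "var_u0": [16.27, 24.84, 14.13, 12.43, 17.11],
    "var_u1": [0.002, 0.009, 0.002, 0.001, 0.007],
    "cov_u0_u1": [-0.065, -0.241, -0.047, -0.007, -0.12],
    "var_e": [794.36, 774.08, 792.95, 793.57, 799.11],
}

# Range = largest estimate minus smallest estimate across packages.
ranges = {name: max(values) - min(values) for name, values in estimates.items()}

for name, value in ranges.items():
    print(f"{name}: {value:.3f}")
```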
[Figure: Parameter Estimate Profile for Analysis Using Sampling Weights, with one profile per package: MPLUS, LISREL, MLWIN, GLLAMM, MIXED.]
Predictions from Analysis Using Sampling Weights
[Figure: Percentile BMI (BMIPCT) of students for an average school (y-axis, 54 to 64) plotted against hours per week watching TV, etc. (HR_WATCH; x-axis, 0 to 50), with one pair of lines per package (MPLUS, MIXED, LISREL, MLWIN, GLLAMM). Solid lines (RC_S = 1): schools with recreation centers. Dashed lines (RC_S = 0): schools without recreation centers.]
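Predicted lines of this kind can be sketched from the fixed-effect estimates alone. The example below assumes that "average school" means the school-level random effects are set to zero, so the prediction is γ00 + γ01·RC_S + (γ10 + γ11·RC_S)·HR_WATCH. The function and variable names are hypothetical, and the coefficients are the MPLUS estimates from the results slide.

```python
def predicted_bmipct(gamma00, gamma01, gamma10, gamma11, rc_s, hr_watch):
    """Predicted BMI percentile for an average school (random effects set to zero)."""
    intercept = gamma00 + gamma01 * rc_s  # school-specific intercept
    slope = gamma10 + gamma11 * rc_s      # school-specific slope on HR_WATCH
    return intercept + slope * hr_watch


# MPLUS fixed-effect estimates from the results slide.
mplus = dict(gamma00=60.19, gamma01=-4.49, gamma10=0.033, gamma11=0.12)

for rc_s in (0, 1):
    for hours in (0, 20, 40):
        value = predicted_bmipct(rc_s=rc_s, hr_watch=hours, **mplus)
        print(f"RC_S={rc_s} HR_WATCH={hours}: {value:.2f}")
```

Swapping in another package's four fixed-effect estimates gives that package's pair of lines.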
Conclusion
- The use of sampling weights to adjust for non-response and for the design characteristics of complex survey data has recently been incorporated into software for estimating multilevel models. This gives analysts a simple method for obtaining unbiased estimates from complex survey data.
- When sampling weights are used, results from these packages can vary. When weights are ignored, the packages produce nearly the same results.
- Simulation studies are needed to determine why these packages produce different results when sampling weights are used.
- Models with non-normal outcomes need to be examined.
References
Asparouhov, T. (2004). Weighting for Unequal Probability of Selection in Multilevel Modeling. Mplus Web Notes No. 8. Available from http://www.statmodel.com/
Grilli, L., and Pratesi, M. (2004). Weighted Estimation in Multilevel Ordinal and Binary Models in the Presence of Informative Sampling Designs. Survey Methodology, 30 (June 2004), 93-103.
Pfeffermann, D., Skinner, C. J., Holmes, D. J., Goldstein, H., and Rasbash, J. (1998). Weighting for Unequal Selection Probabilities in Multilevel Models. JRSS, Series B, 60, 23-40.