MSU - REU - News and Pictures

Highly Stratified Model in Biostatistics (Mentor: Dr. Haimeng Zhang)

The Cox proportional hazards regression model is used in many areas of biostatistics and epidemiology to quantify the effect of exposure on survival for a cohort of individuals followed over time. In this model, a common but unspecified baseline hazard function $\lambda_0(t)$ is assumed to apply to all cohort members. The relation between exposure and failure is of primary interest and is modeled by the real parameter $\theta_0$, which specifies the relative risk in the exponential form $e^{\theta_0 Z}$ for an individual with covariate $Z$. More explicitly, we assume that, given a time-independent covariate $z$, the conditional hazard function $\lambda(t\vert z)$ satisfies $\lambda(t\vert z) = \lambda_0(t) e^{\theta_0 z}$. When information is available on the entire cohort, the maximum partial likelihood estimator (MPLE) is often used. In practice, however, collecting complete data on a large cohort is difficult and expensive; sampling schemes not only offer substantial savings but can become the only practical alternative. One of the simplest and most popular sampling schemes, the nested case-control sampling (NCCS) design, chooses a fixed number of controls to compare with the failure at each failure time. It has been shown that the MPLE is efficient, in the sense that it achieves the asymptotic variance lower bound, when information is available on every individual in the cohort. For sampling designs, however, the situation is quite different: it is not always clear whether the estimators typically employed use the sampled data in the most efficient manner. For NCCS in particular, it has been shown that the MPLE is not efficient in its use of the available information in the time-fixed covariate case. In contrast to such cases, we explore a highly stratified model under which the MPLE based on the NCCS design approaches efficiency.
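To make the full-cohort and NCCS estimators described above concrete, the following is a minimal simulation sketch, not the project's actual method. It assumes a unit baseline hazard $\lambda_0(t) = 1$, a binary exposure covariate, and no censoring. The function names (`simulate_cohort`, `risk_sets`, `mple`) and the Newton-Raphson solver are hypothetical choices made only for illustration. The NCCS estimate uses the usual partial likelihood restricted to each sampled risk set (the case plus its sampled controls).

```python
import numpy as np


def simulate_cohort(n, theta0, rng):
    """Draw failure times with hazard lambda0(t) * exp(theta0 * z).

    Assumptions for illustration: lambda0(t) = 1, binary exposure z, no censoring.
    """
    z = rng.binomial(1, 0.5, size=n).astype(float)
    t = rng.exponential(scale=np.exp(-theta0 * z))  # rate = exp(theta0 * z)
    return t, z


def risk_sets(t, z, n_controls=None, rng=None):
    """Pair each case's covariate with the covariates of its (sampled) risk set.

    n_controls=None keeps the full risk set; otherwise n_controls controls are
    drawn without replacement from those still at risk at the failure time.
    """
    z_sorted = z[np.argsort(t)]
    sets = []
    # The last failure has nobody left to compare against, so it is skipped.
    for i, z_case in enumerate(z_sorted[:-1]):
        others = z_sorted[i + 1:]
        if n_controls is not None and n_controls < len(others):
            others = rng.choice(others, size=n_controls, replace=False)
        sets.append((z_case, np.concatenate(([z_case], others))))
    return sets


def mple(sets, theta=0.0, iters=25, tol=1e-10):
    """Maximize the log partial likelihood in theta by Newton-Raphson."""
    for _ in range(iters):
        score, info = 0.0, 0.0
        for z_case, z_set in sets:
            w = np.exp(theta * z_set)
            mean = np.sum(w * z_set) / np.sum(w)
            var = np.sum(w * z_set ** 2) / np.sum(w) - mean ** 2
            score += z_case - mean   # case covariate minus weighted risk-set mean
            info += var              # observed information contribution
        step = score / info
        theta += step
        if abs(step) < tol:
            break
    return theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t, z = simulate_cohort(1000, theta0=0.7, rng=rng)
    print("full-cohort MPLE:", mple(risk_sets(t, z)))
    print("NCCS MPLE (2 controls per case):", mple(risk_sets(t, z, n_controls=2, rng=rng)))
```

Repeating the simulation over many seeds and comparing the empirical variances of the two estimates gives a rough sense of how much information the sampled design gives up under these assumptions.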
More explicitly, in highly stratified situations, or in instances where the covariate values are increasingly less dependent upon the past and there is no censoring, the MPLE uses the available information efficiently in the limit as the number of cohort members tends to infinity. The study of this "efficient" model is valuable for two reasons: it limits the scope of the search for estimators which can


REU WebMaster: Xingzhou Yang © REU Team, 2011-2012