Wednesday, 27th August 2014 • 16:00-17:30

We analysed HF data collected from the administrative databank of an Italian regional district (Lombardia), concentrating our study on the days elapsed from one admission to the next for each patient in our dataset. The aim of this project is to identify groups of patients, conjecturing that the variables in our study, the time segments between two consecutive hospitalisations, follow different Weibull distributions within each hidden cluster. Therefore, the comprehensive distribution for each variable is modeled by a Weibull mixture. From this assumption we developed a survival analysis in order to estimate, through a proportional hazards model, the corresponding hazard function for the proposed model and to obtain the desired clusters jointly. [A minimal R sketch of such a Weibull mixture fit appears after abstract C45.5 below.] We find that the selected dataset, a good representative of the complete population, can be categorized into three clusters, corresponding to “healthy”, “sick” and “terminally ill” patients. Furthermore, we attempt a reconstruction of the patient-specific hazard function, adding a frailty parameter to the considered model.

C45.5 Smooth non-parametric estimation of the cumulative incidence functions for arbitrarily censored data
A Nguyen Duc1, M Wolbers1,2
1 Oxford University Clinical Research Unit, Ho Chi Minh City, Viet Nam, 2 Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom

The cumulative incidence function (CIF) describes the absolute risk of a specific event type over time and is a fundamental quantity to appropriately describe and analyze competing risks data. The most popular CIF estimator is the nonparametric Aalen-Johansen estimator, which produces a step function. However, a smooth function might be a more realistic approximation to the truth for many applications, and nonparametric approaches have nonstandard asymptotic properties under interval censoring even in the survival setting. In contrast, parametric models rely on restrictive distributional assumptions.

We introduce a novel flexible competing risks model which produces smooth CIF estimates for data with arbitrary censoring and truncation while relaxing the parametric assumptions. Our model is based on a mixture factorization of the joint distribution of the time (T) and type (D) of an event, and the conditional distributions T|D are modeled using “smooth non-parametric densities” (SNPD), i.e. truncated (sieve) Hermite series expansions with an adaptive choice of the degree of truncation. Of note, SNPD have previously been successfully applied to econometric and survival models.

An algorithm for fitting our models will be outlined and simulations presented which show that in many scenarios, our CIF estimator has lower integrated mean squared error than both nonparametric and parametric estimators. We will also present the application of our method to an interval-censored dataset of the time to fungal clearance (favorable event) or death (competing unfavorable event) for patients with cryptococcal meningitis. Finally, we will discuss extensions of our approach to regression modeling.
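[Editor's sketch.] The mixture factorization in C45.5 implies CIF_k(t) = P(D = k) · P(T ≤ t | D = k). The R sketch below illustrates this identity with simple Weibull conditional distributions standing in for the authors' SNPD (Hermite-series) densities; all parameter values are illustrative, not estimates from any data.

```r
## Mixture factorization behind a smooth CIF:
## P(T <= t, D = k) = pi_k * F_k(t), with F_k the conditional cdf of T | D = k.
## Weibull conditionals stand in for the authors' SNPD densities.

pi_k  <- c(0.6, 0.4)   # P(D = k): event-type probabilities (illustrative)
shape <- c(1.5, 0.8)   # hypothetical Weibull shapes for T | D = k
scale <- c(10, 25)     # hypothetical Weibull scales

cif <- function(t, k) pi_k[k] * pweibull(t, shape[k], scale[k])

t_grid <- seq(0, 50, length.out = 200)
plot(t_grid, cif(t_grid, 1), type = "l", ylim = c(0, 1),
     xlab = "Time", ylab = "Cumulative incidence")
lines(t_grid, cif(t_grid, 2), lty = 2)
legend("topleft", c("Event type 1", "Event type 2"), lty = 1:2)
```

Note that the two CIFs plateau at pi_k rather than at 1, as they must in a competing risks setting.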
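[Editor's sketch, referenced in the Lombardia abstract above.] A minimal R sketch of fitting a K-component Weibull mixture to inter-admission gap times by EM. It assumes fully observed, uncensored gap times and omits the proportional hazards and frailty components of the authors' model; the function name em_weibull_mix and all starting values are illustrative.

```r
## EM for a K-component Weibull mixture on inter-admission gap times x.
em_weibull_mix <- function(x, K = 3, n_iter = 50) {
  w     <- rep(1 / K, K)                          # mixing proportions
  shape <- rep(1, K)
  scale <- quantile(x, probs = (1:K) / (K + 1))   # spread-out starting scales
  for (it in seq_len(n_iter)) {
    ## E-step: posterior cluster-membership probabilities
    dens <- sapply(1:K, function(k) w[k] * dweibull(x, shape[k], scale[k]))
    r <- dens / rowSums(dens)
    ## M-step: weighted Weibull maximum likelihood per component
    w <- colMeans(r)
    for (k in 1:K) {
      nll <- function(p) -sum(r[, k] * dweibull(x, exp(p[1]), exp(p[2]), log = TRUE))
      opt <- optim(log(c(shape[k], scale[k])), nll)
      shape[k] <- exp(opt$par[1])
      scale[k] <- exp(opt$par[2])
    }
  }
  list(weights = w, shape = shape, scale = scale,
       cluster = max.col(r))                      # hard assignment per patient
}
```

For a vector gap_days of inter-admission times, fit <- em_weibull_mix(gap_days) would return component parameters and a hard three-cluster assignment analogous to the “healthy”/“sick”/“terminally ill” grouping described above.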
C46 Multiple imputation

C46.1 A multi-stage multiple imputation in a large-scale cohort study
K Furukawa1, I Takahashi1
1 Radiation Effects Research Foundation, Hiroshima, Japan

Multiple imputation (MI) has been recognized as a flexible and general approach to analysis involving missing data. Practically, however, it is unclear when and how data should be imputed in a project with data missing on variable(s) to be used in several analyses targeted at different subsets of the study subjects. For MI to be valid, imputing for each individual analysis is ideal, but if the analysis is targeted at a small subset of the subjects, we may lose information that is potentially available on the rest of the subjects to increase the precision of imputation.

This study proposes an alternative approach that imputes data by estimating the imputation model in multiple stages, improving the efficiency of the main analysis while maintaining consistency. Suppose that our main interest is evaluating the effect of a covariate X on the outcome Y, where X is available for the entire cohort (S1) but subject to missingness, and Y is measured only on a small subset (S2) of the cohort. Also suppose that Z is an important predictor of X and available on S1. We consider imputing missing values of X from a distribution f[X|Y,Z] ∝ f1[X|Z] f2[Y|X,Z], where f1 is estimated with S1 and f2 with S2. [A minimal sketch of this factorization appears after abstract C46.2 below.]

We apply this approach to an analysis of cardiovascular disease incidence among a clinical subset of the Life Span Study cohort of more than 100,000 Japanese atomic-bomb survivors, for whom radiation dosimetry and basic demographic factors are mostly available but lifestyle factors such as smoking habits are substantially missing.

C46.2 Sequential imputation for large epidemiological data sets
NS Erler1, J van Rosmalen1, ETM Leermakers1, VWV Jaddoe1, OH Franco1, EMEH Lesaffre1,2
1 Erasmus Medical Center, Rotterdam, The Netherlands, 2 L-Biostat, KU Leuven, Leuven, Belgium

Missing data are a challenge in cohort studies. An established procedure for dealing with them is multiple imputation, for instance with MICE (Van Buuren, 2012). After creating multiple imputed data sets, each can be analysed using standard software, and the derived estimates are then pooled to obtain overall results. Multiple imputation for classical models is available in standard software; however, for more complex models the pooling procedure is not always implemented and is difficult to compute manually.

For complex models, Bayesian methods may offer a solution. By specifying parametric distributions for the variables with missing values (e.g. a sequence of univariate conditional distributions, as suggested by Ibrahim et al. (2002)), they can be imputed within the same MCMC procedure used to estimate the model of interest, rendering pooling unnecessary. Furthermore, the Bayesian approach is theoretically justified and allows for a wide range of estimation models. However, Bayesian methods are often computationally intensive and may have convergence issues.

In our study, we evaluate how practical this sequential Bayesian imputation is in the context of epidemiologic questions that require analyses of large data sets with a high rate of missing values. We compare this procedure with multiple imputation with regard to 1) ease of implementation (using R and JAGS), 2) computational time, 3) robustness to modeling choices, and 4) the resulting estimates. To illustrate the method, we analyse the effect of sugar-sweetened beverage consumption on BMI trajectories in young children with a linear mixed model in data obtained from the Generation R Study at Erasmus MC, Rotterdam.
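[Editor's sketch.] The core idea of C46.2 is that a covariate with missing values can be given its own distribution inside the MCMC run, so its missing entries are sampled alongside the analysis-model parameters and no pooling is needed. Below is a minimal sketch using the rjags interface; a simple linear model stands in for the authors' linear mixed model, the toy data and all priors are illustrative, and this is not the authors' actual code.

```r
## Sequential Bayesian imputation sketch: NAs in the covariate x are
## sampled by JAGS within the same MCMC run that fits the analysis model.
library(rjags)

model_string <- "
model {
  for (i in 1:N) {
    y[i] ~ dnorm(beta0 + beta1 * x[i], tau.y)  # analysis model
    x[i] ~ dnorm(mu.x, tau.x)                  # imputation model for x (NAs sampled)
  }
  beta0 ~ dnorm(0, 1.0E-4)
  beta1 ~ dnorm(0, 1.0E-4)
  mu.x  ~ dnorm(0, 1.0E-4)
  tau.y ~ dgamma(0.01, 0.01)
  tau.x ~ dgamma(0.01, 0.01)
}"

## Toy data with 30% of the covariate missing
set.seed(1)
N <- 200
x <- rnorm(N)
y <- 1 + 0.5 * x + rnorm(N)
x[sample(N, 60)] <- NA

jm <- jags.model(textConnection(model_string),
                 data = list(y = y, x = x, N = N))
update(jm, 1000)                                # burn-in
fit <- coda.samples(jm, c("beta0", "beta1"), n.iter = 5000)
summary(fit)
```

Because the missing x[i] are ordinary stochastic nodes, the posterior for beta1 already integrates over the imputation uncertainty, which is exactly what makes Rubin-style pooling unnecessary here.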
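[Editor's sketch, referenced in abstract C46.1 above.] A minimal R illustration of the two-stage factorization f[X|Y,Z] ∝ f1[X|Z] f2[Y|X,Z], for a binary covariate X (coded 0/1) and a continuous outcome Y: f1 is fitted on the full cohort S1 and f2 on the small subset S2 with Y observed. The column names, the function impute_x, and the single imputation draw are illustrative; in practice the draw would be repeated to create M imputed data sets.

```r
## Two-stage imputation sketch: sample missing X from
## P(X = 1 | Y, Z) proportional to P(X = 1 | Z) * f(Y | X = 1, Z).
impute_x <- function(s1, s2) {
  f1  <- glm(x ~ z, family = binomial, data = s1)        # f1[X|Z], full cohort S1
  f2  <- lm(y ~ x + z, data = s2)                        # f2[Y|X,Z], subset S2
  sig <- summary(f2)$sigma

  mis <- s2[is.na(s2$x), ]                               # records needing imputation
  p1  <- predict(f1, newdata = mis, type = "response")   # P(X = 1 | Z)
  lik <- function(xval)                                  # f2[Y | X = xval, Z]
    dnorm(mis$y, predict(f2, newdata = transform(mis, x = xval)), sig)

  post <- p1 * lik(1) / (p1 * lik(1) + (1 - p1) * lik(0))  # P(X = 1 | Y, Z)
  s2$x[is.na(s2$x)] <- rbinom(nrow(mis), 1, post)          # one imputation draw
  s2
}
```

The efficiency gain described in the abstract comes from f1 being estimated on the whole cohort S1 rather than only on the analysis subset S2.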
