C38 Patient-centered outcomes

C38.1 The design of diagnostic studies - another case for STRATOS?
W Vach1
1 Clinical Epidemiology, IMBI, University of Freiburg, Freiburg, Germany

Recently, a new initiative called STRATOS appeared. STRATOS stands for "STRengthening Analytical Thinking for Observational Studies", and it aims to provide guiding documents at different levels for topics related to the design and analysis of observational studies. In this talk I discuss some first ideas on how STRATOS may contribute to the field of designing diagnostic studies. Methodological standards for diagnostic studies have been changing rapidly in recent years. Accuracy studies have been the cornerstone of diagnostic research for many years, but today they are often regarded as insufficient because they do not aim to measure patient benefit directly. RCTs with patient-centered outcomes are often recommended as alternatives, but there is little experience so far in designing, planning and analysing such studies. In fact, such RCTs would evaluate a combination of diagnostic tests and subsequent treatment and management processes, i.e. complex interventions, so their analysis may still include many elements of observational studies. Accuracy studies will probably continue to play an important role in the future, but there is a need for better design, reporting and analysis that take the benefit aspect into account. In such a state of change, it may be hard to develop guiding documents. Nevertheless, there are some key issues independent of the study type, which I discuss in my talk: clear definitions of the clinical target situation and the target population, discrepancies between target and study population due to recruitment, the need for sufficient but not artificial standardization, and the choice between an external reference test and patient-centered outcomes.

C38.2 Methodological issues in developing scores and cut-offs of rheumatoid arthritis activity
O Collignon1
1 CRP Santé, Strassen, Luxembourg

Rheumatoid arthritis (RA) is a systemic disease which occurs in about 1% of the world population and triggers joint inflammation that may worsen patients' quality of life. In order to define a treatment strategy and to evaluate response to therapy, disease activity may be measured via several scores based on several bio-clinical variables, such as the Disease Activity Score involving 28 joint counts (DAS28), the Simplified Disease Activity Index (SDAI) and the Clinical Disease Activity Index (CDAI). Furthermore, cut-offs for these scores have been designed to help physicians classify patients into disease activity categories. However, some methodological issues were neglected when the scores were built, potentially leading to inaccurate classification of patients and thus to an inappropriate choice of therapy. Problems such as inter-physician variability in evaluating disease activity, the choice of the clinical parameters, and the methods used to validate the scores are also highlighted. A strategy to develop a relevant surrogate for disease activity and its cut-offs, using penalized logistic regression and bootstrap internal validation, is then proposed. As long as the issues reviewed in this presentation are not addressed, results of studies based on such disease activity scores should be considered with caution.

C38.3 Determining optimal fractional factorial designs of discrete choice experiments using d-efficiency: application in addiction services
T Vanniyasingam1, C Cunningham1, G Foster1, A Niccols1, L Thabane1
1 McMaster University, Hamilton, Canada

In a discrete choice experiment (DCE), individuals are asked to choose the most preferred alternative among a set of alternatives. We conducted a DCE where participants were asked a series of questions and had to choose one of three options presented in each scenario.
A scenario comprised three levels, one from each of three different attributes. We explored individual preferences over 16 four-level attributes in a survey designed to elicit the stated preferences of addiction service providers and administrators for professional development aimed at enhancing addiction services. Typically, DCE surveys are generated based on a fractional experimental design. A survey design issue arises when determining which of the various combinations of attributes and attribute levels should be presented within the options in a scenario, and how many attributes will minimize participants' response burden while still ensuring an optimal design. The objective of this talk is to present our results on how the optimality of the design, measured using d-efficiency, is affected by: the number of attributes, number of levels, number of scenarios, number of overlapping attributes between scenarios, and number per scenario. We will use simulations to create the fractional designs and to evaluate the d-efficiency under the various conditions listed above.

C38.4 Developing robust scoring methods for use in child assessment tools
P Gichuru1, G Lancaster1, A Titman1
1 Lancaster University, Lancaster, United Kingdom

Earlier and more sensitive diagnosis of disability reduces its detrimental effect on children. We therefore seek to develop robust scoring methods for child assessment tools which will ensure more timely intervention when delayed development is detected, to reduce stress on the child and its family. Most current development scores depend on age; hence a key objective is to develop methods that correct or account for age. Generally, regardless of implementation medium, two main scoring approaches are used: item-by-item scoring creates score norms for each item, and total scoring uses all the responses of the child to give one score across the entire domain.
Using data from 1,446 normal children from the recent Malawi Developmental Assessment Tool (MDAT) study, we review classical total scoring methods, including simple scoring, Log Age Ratio methods and Item Response Models under different assumptions, to derive normative scores in this child development context using binary responses only. While evaluating the pros and cons of each method, we also suggest extensions to current total scoring methods, including smoothing methods and the use of more flexible models within various scoring algorithms. Preliminary results show that weighting simple scores is important, as a lack of response to all items does not necessarily imply a lack of ability. Further, smoothing of score values is beneficial when variability in certain age groups is high. The more complex methods produce more reliable and generalizable normative scores. A sensitivity analysis showed that simple methods perform well in ideal situations.
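
As an illustration of what an age-corrected total score with smoothing might look like, the following minimal Python sketch turns binary item responses into a simple total score and standardizes it against kernel-smoothed age norms. It is not the method used by the authors: the Gaussian kernel, the bandwidth, the z-score form, the function names and the toy reference data are all assumptions made for this example.

```python
import math


def simple_score(responses):
    """Simple total score: number of items passed (1 = pass, 0 = fail)."""
    return sum(responses)


def smoothed_norm(ages, scores, target_age, bandwidth=3.0):
    """Kernel-smoothed mean and SD of reference scores around target_age.

    Assumption for this sketch: a Gaussian kernel in age (in months), so
    that children of similar age contribute most to the norm.
    """
    weights = [math.exp(-0.5 * ((a - target_age) / bandwidth) ** 2) for a in ages]
    total = sum(weights)
    mu = sum(w * s for w, s in zip(weights, scores)) / total
    var = sum(w * (s - mu) ** 2 for w, s in zip(weights, scores)) / total
    return mu, math.sqrt(var)


def age_adjusted_z(score, ages, scores, target_age, bandwidth=3.0):
    """Standardize a total score against the smoothed norm for the child's age."""
    mu, sd = smoothed_norm(ages, scores, target_age, bandwidth)
    if sd == 0:
        raise ValueError("no score variability near this age")
    return (score - mu) / sd


# Hypothetical reference sample: ages in months and simple total scores.
ref_ages = [6, 6, 12, 12, 18, 18, 24, 24]
ref_scores = [2, 3, 5, 6, 8, 9, 11, 12]

# Hypothetical child aged 12 months who passed 3 of 6 items.
child_responses = [1, 1, 1, 0, 0, 0]
z = age_adjusted_z(simple_score(child_responses), ref_ages, ref_scores, target_age=12)
print(f"age-adjusted z-score: {z:.2f}")
```

A negative z-score here means the child scored below the smoothed average for children of similar age in the reference sample. A larger bandwidth borrows more information from neighbouring ages, which is one way to stabilize norms when variability within an age group is high.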

