Methods for Risk Predictions

£399.00 to £599.00 exc VAT


Statistical methods for risk prediction and prognostic models

Patients and their care providers often want to know the risk of developing an (adverse) health outcome over time. Estimates of future risk (‘prognosis’) allow patients and their families to put a clinical diagnosis into context, and help care providers to make clinical decisions and devise treatment strategies. For this reason there is growing interest in risk prediction and prognostic models. This 3-day online course provides a thorough foundation in the statistical methods most commonly needed to develop and validate prognostic and prediction models in clinical research. A mixture of recorded lectures, computer practical exercises in Stata and R, and live question and answer sessions is used to ensure participants appreciate the underlying statistical concepts and can apply the methods learned to real datasets containing either binary or time-to-event outcomes.

Target Audience
The course is aimed at individuals who want to learn how to develop and validate risk prediction and prognostic models, specifically for binary or time-to-event clinical outcomes. We recommend that participants have a background in statistics. An understanding of key statistical principles and measures (such as effect estimates, confidence intervals and p-values) and the ability to apply and interpret regression models are essential. We also recommend that participants are familiar with Stata or R, although the practical exercises will not require individuals to write their own code.

Course Content
The course is intended to be completed over 3 days, and focuses on model development (day 1), internal validation (day 2), and external validation and novel topics (day 3). Our focus is on multivariable models for individualised prediction of future outcomes (prognosis), although many of the concepts described also apply to models for predicting existing disease (diagnosis).

Participants are encouraged to watch a few pre-course videos before the week of the course. These introductory videos include an overview of the rationale and phases of prediction model research, as well as basic model specification, focusing on the principles of logistic regression for binary outcomes, and Cox regression or flexible parametric survival models for time-to-event outcomes. This pre-course content is particularly important for participants less familiar with statistical modelling, and will help ensure a good foundation for the subsequent three days.

Day 1 covers key topics for model development including identifying candidate predictors, handling of missing data, modelling continuous predictors using fractional polynomials or restricted cubic splines for non-linear functions, and variable selection procedures.

Day 2 focuses on how models are optimised for the data in which they were derived, and thus often do not generalise to other datasets. Internal validation strategies are outlined to identify and adjust for overfitting. In particular, bootstrapping is covered to estimate the optimism and shrink the model coefficients accordingly; related approaches such as Lasso are also discussed. Statistical measures of model performance are introduced for discrimination (such as the C-statistic and D-statistic) and calibration (calibration-in-the-large, calibration plots, calibration slope and curves). Further sessions cover sample size considerations for model development and validation, and modern methods for shrinkage and penalisation.
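As a rough illustration of two of the performance measures named above, the following minimal Python sketch computes a C-statistic and a calibration intercept and slope for a binary outcome. It is not taken from the course materials (the practicals use Stata and R); the function names and the toy data are hypothetical, and the calibration model is fitted with a simple Newton-Raphson loop rather than a statistics library.

```python
import math

def c_statistic(y, p):
    """Concordance: share of (event, non-event) pairs in which the event
    has the higher predicted risk; tied predictions count as 0.5."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def calibration_intercept_slope(y, p, iters=25):
    """Fit logit(P(y = 1)) = a + b * logit(p) by Newton-Raphson.
    b is the calibration slope; b < 1 suggests predictions are too extreme."""
    lp = [logit(pi) for pi in p]  # linear predictor of the existing model
    a, b = 0.0, 1.0               # start from perfect calibration
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, lp):
            mu = expit(a + b * xi)
            w = mu * (1.0 - mu)
            g0 += yi - mu           # score for the intercept
            g1 += (yi - mu) * xi    # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical toy data: observed outcomes and predicted risks.
y_toy = [0, 0, 1, 0, 1, 1, 0, 1]
p_toy = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]

print("C-statistic:", c_statistic(y_toy, p_toy))
print("Calibration intercept, slope:", calibration_intercept_slope(y_toy, p_toy))
```

Note that the intercept returned here is estimated jointly with the slope; calibration-in-the-large is usually assessed separately, with the slope fixed at 1.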

Day 3 focuses on the need for model performance to be evaluated in new data to assess its generalisability, namely external validation. A framework for different types of external validation studies is provided, and the potential importance of model updating strategies (such as re-calibration techniques) is considered. Novel topics are then covered, including: the development and validation of models using large datasets (e.g. from e-health records) or multiple studies; the use of meta-analysis methods for summarising the performance of models across multiple studies or clusters; and the use of net benefit and decision curve analysis to understand the potential role of a model for clinical decision making. Practical guidance is then given about different ways in which prediction and prognostic models can be presented, and the final session discusses the importance of the TRIPOD reporting guideline when publishing prediction model research.
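To give a flavour of the net benefit calculation behind decision curve analysis, here is a minimal Python sketch using the standard definition: true positives per person minus false positives per person weighted by the odds of the risk threshold. It is an illustration only, not course material; the function names and toy data are hypothetical, and treating individuals whose predicted risk is at or above the threshold is an assumed convention.

```python
def net_benefit(y, p, threshold):
    """Net benefit of treating everyone with predicted risk >= threshold."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 0)
    # False positives are weighted by the odds of the risk threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(y, threshold):
    """Net benefit of the reference strategy of treating everyone."""
    prevalence = sum(y) / len(y)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical toy data: observed outcomes and predicted risks.
y_nb = [0, 0, 1, 0, 1, 1, 0, 1]
p_nb = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]

# A decision curve plots these quantities over a range of thresholds.
for t in (0.2, 0.35, 0.5):
    print(t, net_benefit(y_nb, p_nb, t), net_benefit_treat_all(y_nb, t))
```

A model is potentially useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the net benefit of treating no one).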

Stata and R practical exercises are included on all three days, and participants will be able to choose whether to focus on logistic regression examples (for binary outcomes) or Cox / flexible parametric survival examples (for time-to-event outcomes), to tailor these exercises to their own purpose.

Dr Kym Snell (course lead), Prof Richard Riley, Dr Joie Ensor, Lucinda Archer (Keele University)
Dr Laura Bonnett (University of Liverpool)
Prof Gary Collins (University of Oxford)

The course will be run online over three days using a combination of recorded lecture videos, computer practical exercises in Stata and R for participants to work through, and daily live question and answer sessions.

All course material (lecture videos and computer practicals) will be made available at the start of the week to provide plenty of time and flexibility for participants to work through the material. Two Q&A sessions will be scheduled for each of the three course days at 9:00-10:00 BST (8:00 UTC) and 16:00-17:00 BST (15:00 UTC) to allow discussion between faculty and participants. The different times should allow participants in different time zones to join at least one of the daily sessions. Questions can be submitted in advance at any time using the chat feature or asked during the live sessions. Faculty will also be available to answer outstanding questions using the chat.

For participants wanting to do the course between 9:00 and 17:00 BST, a proposed timetable is available that includes rest breaks. This can be adapted for participants in different time zones.

Lecture videos will remain available for one week after the course finishes.

Student (proof required): £399
Academic (public sector): £499
Industry (commercial): £599

A 10% discount is available for group bookings of 5 or more participants; please contact the events team for group bookings.

For queries related to registration, please contact the events team.

For queries related to the course content, please contact Dr Kym Snell.