# Utility Functions for Parametric Multi-State Models

Provides functions for working with multi-state modelling, such as efficient simulation routines for estimating transition probabilities and length of stay. It is designed as an extension to multi-state modelling capabilities provided with the 'flexsurv' package (see Jackson (2016) ).

`multistateutils` provides a number of useful functions for analyzing parametric multi-state models. In this way it does not aim to supplement the modelling strategies found in `mstate`, `msm`, or `flexsurv`, but rather provide tools for subsequent analysis. It is designed to be used with semi-Markov multi-state models of healthcare data, but can be used for any system that can be discretized into the states being entered at observed time-points, with parametric families providing appropriate fits for these transition rates.

It currently provides several features:

• Estimate transition probabilities
• Estimate length of stay
• Draw predicted pathways through the state space with dynamic prediction

Examples of these features are provided below in Examples.

## Installation

You can install the latest release version from CRAN with:

Or the development version can also be installed directly from GitHub.

## Examples

This section provides a brief overview of the features in `multistateutils`, please see the Examples vignette for a thorough description.

In these examples we'll use the `ebmt3` data set provided in the `mstate` package. It describes recovery after bone marrow transplant. We'll model it with an illness-death model, with transplant as the starting state, platelet recovery pr as the illness state, and relapse-free survival rfs as the death state.

`multistateutils` is designed for applied modelling with an overall aim of prediction, particularly in health-related areas. For this reason it is designed to work with parametric transition models built with `flexsurv`. The example below fits a Weibull model to each transition using the `age` and `dissub` covariates.

## Estimating transition probabilities

Transition probabilities are defined as the probability of being in a state j at a time t, given being in state h at time s, as shown below where X(t) gives the state an individual is in at t. This is all conditional on the individual parameterised by their covariates and history, which for this semi-Markov model only influences transition probabilities through state arrival times.

Ph, j(s, t)=Pr(X(t)=j | X(s)=h)

We'll estimate the transition probabilities of an individual with the covariates `age=20-40` and `dissub=AML` at 1 year after transplant.

This functionality is already provided in `flexsurv::pmatrix.simfs`, but it is limited to assessing only one individual at a time, with s = 0

`predict_transitions` on the other hand can estimate transition probabilities for:

• Multiple individuals
• Multiple values of t
• Multiple values of s
• Any combination of the above, all from the same simulation
• A mixture of distributions for the transition models

The ability to generate probabilities by varying s and t within a single simulation makes it very efficient for generating dynamic predictions.

## Length of stay

Similarly, the length of stay functionality provided by `flexsurv::totlos.simfs` has also been extended to allow for estimates at multiple time-points, states, and individuals to be calculated simultaneously. As shown below, the function parameters are very similar and the estimates are very close to those produced by `totlos.simf`.

Again, the advantage of this implementation is that estimates can be produced from multiple conditions without needing to rerun the simulation.

## State flow diagram

Another feature in `multistateutils` is a visualization of a predicted pathway through the state transition model, calculated using dynamic prediction and provided in the function `plot_predicted_pathway`. It estimates state occupancy probabilities at discrete time-points and displays the flow between them in the manner of a Sankey diagram as shown below.

The output of this function is an interactive HTML widget that can be manipulated to layout the diagram to better suit your needs. Unfortunately widgets can't be embedded in GitHub READMEs so the image below is just a screenshot, but have a look at the version in the vignette for an working example.

Ph, j(s, t)=Pr(X(t)=j | X(s)=h)

## Discrete event simulation

The underlying simulation engine can be easily adapted to run cohort-wide simulation, where the output statistics of interest are global rather individual level measures. This is useful for estimating healthcare related values, such as the average time spent receiving treatment for a given disease population.

These discrete event simulations are run with the `cohort_simulation` function as below. They simply require a list of models as specified above, a cohort population, and the transition matrix. It outputs a long table where each row is an observed state entry for a given individual.

By default the simulation runs until every individual reaches a sink state, but a longitudinal specification can be provided, such that the individuals enter the simulation at set incident times and the simulation terminates after a predefined period. This functionality is useful when estimating statistics over a finite time-period, such as the expected number of deaths in a 5-year range, but it requires a model of the incidence process (i.e. how often a new patient is diagnosed). See the Examples vignette for further details.

## `msprep2`

A large part of the work involved with multi-state modelling is basic munging of the data into a format suitable for modelling. The `msprep` function from the `mstate` package helps a lot here by converting a wide data frame where each row corresponds to a user with state entry times given in column, into a long format where each row represents a possible state transition.

It does, however, have some slight limitations:

• Having to form a wide table in the first place with an entry time and status indicator for each state can be a bit time consuming and isn't necessarily a natural way of storing such data
• This wide format only allows an individual to enter a state once
• For censored state transitions it can be awkward having to replicate the censoring time for each non-visited state

For these reasons I've created a modified version of `msprep`, imaginatively called `msprep2`, that performs the same role but accepts the input data as a tidy table of state entry times rather than a wide table at the individual level.

In this format, each row in the input data frame corresponds to a known state entry time, and so non-visited states are simply omitted. The input data frame only needs 3 columns: an individual id, a state id, and the time of entry. Covariates and censoring information are provided by separate data frames that link back on the id, providing a cleaner interface and one that works well with data that is stored in relational databases.

The example below shows the data preparation for 2 individuals:

1. Enters state 2 at t = 23 then no more transitions until last follow-up at t = 744
2. Enters state 2 at t = 35 then enters the absorptive state 3 at t = 360

By specifying state entry as a long table there is now no limit to how many times a state can be entered by an individual. Let's demonstrate this by extending the illness-death model to allow a patient to recover (i.e. transition from illness back to healthy).

I'll create a dummy dataset with one individual moving from healthy->illness->death at times 6 and 11, while patient 2 goes from healthy->illness, then is cured and goes back to healthy, before moving back to illness and finally dying.

As seen below, `msprep2` has no problem converting this into a list of possible transitions. Note that we don't need to pass in anything to `censors` because we have complete follow-up on both patients.

## Upcoming features

There is currently a web-app (not publicly accessible but the source code is on Github) that provides a graphical interface for the entire multi-state modelling process and simulation process for a cohort simulation. I'd like to tidy this up and get it functioning with this new version of `multistateutils` and also provide an interface for individual level simulations, such as estimating transition probabilities.

# multistateutils 1.2.2

• Fix for changes in data.table 1.12.1

# multistateutils 1.2.1

• Cleaned up build process
• Added GPL-3 licence and contributing information

# multistateutils 1.2.0

• Added `cohort_simulation` function to run discrete event simulation over populations
• Added `msprep2` that does the same role as `mstate::msprep` but takes a tidy data of state entry times as input
• Added individual age limits for `cohort_simulation`, `predict_transitions` and `length_of_stay`. See vignette for futher details

# multistateutils 1.1.0

• First CRAN release

# Reference manual

install.packages("multistateutils")

1.2.2 by Stuart Lacy, 2 months ago

https://github.com/stulacy/multistateutils

Report a bug at https://github.com/stulacy/multistateutils/issues

Browse source code at https://github.com/cran/multistateutils

Authors: Stuart Lacy

Documentation:   PDF Manual