sequence_stats {MSCA}    R Documentation

Compute sequence statistics

Description

Computes descriptive statistics for sequences, including sequence frequency for any sequence length, and conditional probability and relative risk for sequences of length 2 (pairwise transitions).

Usage

sequence_stats(
  seq_data,
  min_seq_freq = 0.01,
  min_conditional_prob = 0,
  min_relative_risk = 0,
  forward = TRUE
)

Arguments

seq_data

A list of data frames containing sequences. Must be the output of get_cluster_sequences.

min_seq_freq

Numeric threshold (default = 0.01). Filters out sequences with relative frequency below this value.

min_conditional_prob

Numeric threshold (default = 0). Filters out sequences with conditional probability below this value. Applies only to pairwise sequences (k = 2).

min_relative_risk

Numeric threshold (default = 0). Filters out sequences with relative risk below this value. Applies only to pairwise sequences (k = 2).

forward

Logical (default = TRUE). If TRUE, only sequences in which the median age at onset of from is lower than the median age at onset of to are kept.
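The effect of forward can be pictured with a small base R sketch. The data frame and its column names (median_age_from, median_age_to) are hypothetical and chosen only for illustration. They are not the package's internal representation, and the package may handle ties or missing ages differently.

```r
# Hypothetical candidate pairs with the median age at onset of each condition.
candidate_pairs <- data.frame(
  from = c("A", "B", "A"),
  to = c("B", "A", "C"),
  median_age_from = c(52, 61, 52),
  median_age_to = c(61, 52, 49),
  stringsAsFactors = FALSE
)

forward <- TRUE

# With forward = TRUE, keep only pairs where `from` has the lower median
# age at onset; otherwise keep every pair.
kept <- if (forward) {
  is_forward <- candidate_pairs$median_age_from < candidate_pairs$median_age_to
  candidate_pairs[is_forward, , drop = FALSE]
} else {
  candidate_pairs
}

print(kept)
```

In this toy input only the pair A to B survives, because it is the only row in which the median age for from is strictly lower than for to.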

Details

For k = 2 (pairwise sequences), the function computes seq_freq, the conditional probability, and the relative risk.

For k > 2, only seq_freq is computed.
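As a rough illustration of the three pairwise quantities, the base R sketch below computes them for a single toy pair. It assumes common textbook definitions: sequence frequency as the share of all patients with the sequence, conditional probability as P(to | from), and relative risk as P(to | from) divided by P(to | not from). The package's exact definitions are not stated here and may differ. All object and column names are hypothetical.

```r
# Toy cohort of 10 patients: whether each has condition `from` and condition `to`.
# For simplicity, every patient with both is assumed to have `from` first.
cohort <- data.frame(
  has_from = c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE),
  has_to   = c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
)

n_total <- nrow(cohort)
n_from <- sum(cohort$has_from)
n_seq <- sum(cohort$has_from & cohort$has_to)
n_to_without_from <- sum(!cohort$has_from & cohort$has_to)

# Assumed definitions (not confirmed by the package documentation).
seq_freq <- n_seq / n_total                    # 3 / 10
conditional_prob <- n_seq / n_from             # 3 / 5
risk_unexposed <- n_to_without_from / (n_total - n_from)  # 1 / 5
relative_risk <- conditional_prob / risk_unexposed

toy_stats <- data.frame(
  from = "A", to = "B",
  seq_freq = seq_freq,
  conditional_prob = conditional_prob,
  relative_risk = relative_risk
)

# Apply thresholds in the spirit of min_seq_freq, min_conditional_prob,
# and min_relative_risk: rows below any threshold are dropped.
keep <- toy_stats$seq_freq >= 0.01 &
  toy_stats$conditional_prob >= 0 &
  toy_stats$relative_risk >= 0
toy_stats <- toy_stats[keep, , drop = FALSE]

print(toy_stats)
```

Under these assumptions a relative risk above 1 suggests that to is more likely among patients who have from than among those who do not, which is why a threshold such as min_relative_risk = 1 is a natural choice when only positive associations are of interest.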

Value

A list of data frames, each containing the sequence statistics for one cluster.

See Also

get_cluster_sequences


[Package MSCA version 1.2.1 Index]