calculate_bigram_probabilities {qtkit}    R Documentation
Calculate Probabilities for Bigrams
Description
Helper function that calculates joint and marginal probabilities for bigrams in the input data using dplyr. It processes the data to create bigrams and computes their probabilities along with individual token probabilities.
Usage
calculate_bigram_probabilities(data, doc_index, token_index, type)
Arguments
data: A data frame containing the corpus
doc_index: Column name for document index
token_index: Column name for token position
type: Column name for the actual tokens/terms
Value
A data frame containing:
x: First token in bigram
y: Second token in bigram
p_xy: Joint probability of the bigram
p_x: Marginal probability of first token
p_y: Marginal probability of second token
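Examples
The following is a minimal sketch of how a result with the columns listed above could be computed with dplyr. It does not call calculate_bigram_probabilities and is not taken from the qtkit source. The toy corpus and its column names (doc_id, token_id, word) are hypothetical. Two choices are assumptions: bigrams are formed only from adjacent tokens within the same document, and the marginals are estimated over bigram positions (how often a token fills the first or second slot).

```r
library(dplyr)

# Hypothetical toy corpus: one row per token, two documents.
corpus <- data.frame(
  doc_id   = c(1, 1, 1, 2, 2),
  token_id = c(1, 2, 3, 1, 2),
  word     = c("the", "cat", "sat", "the", "cat"),
  stringsAsFactors = FALSE
)

# Pair each token with its successor inside the same document
# (assumption: bigrams never cross document boundaries).
bigrams <- corpus %>%
  arrange(doc_id, token_id) %>%
  group_by(doc_id) %>%
  mutate(x = word, y = lead(word)) %>%
  ungroup() %>%
  filter(!is.na(y)) %>%
  select(x, y)

n_bigrams <- nrow(bigrams)

# Joint probability: relative frequency of each (x, y) pair.
joint <- bigrams %>%
  count(x, y, name = "n_xy") %>%
  mutate(p_xy = n_xy / n_bigrams)

# Marginals (assumption: estimated from the first and second bigram slots).
marg_x <- bigrams %>%
  count(x, name = "n_x") %>%
  mutate(p_x = n_x / n_bigrams)

marg_y <- bigrams %>%
  count(y, name = "n_y") %>%
  mutate(p_y = n_y / n_bigrams)

# Assemble one row per distinct bigram with the five documented columns.
result <- joint %>%
  left_join(marg_x, by = "x") %>%
  left_join(marg_y, by = "y") %>%
  select(x, y, p_xy, p_x, p_y)

print(result)
```

In this toy corpus there are three bigrams in total, so the joint probabilities sum to 1. A table of this shape is the usual input for association measures such as pointwise mutual information, which compares p_xy with the product p_x * p_y.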
[Package qtkit version 1.1.1 Index]