process_embed {keyclust} | R Documentation
A tool designed to reduce redundant terms in a fitted embedding model
Description
Takes a fitted embedding model as input and combines the embeddings of terms that share the same case-folded form, stem, or lemma.
Usage
process_embed(
  x,
  words = NULL,
  punct = TRUE,
  tolower = TRUE,
  lemmatize = TRUE,
  stem = FALSE
)
Arguments
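x | A fitted word embedding model in data frame format.
words | The name of the column in x that corresponds to the word dimension of the fitted word embeddings. Defaults to NULL.
punct | Logical. If TRUE (the default), removes punctuation.
tolower | Logical. If TRUE (the default), combines terms that differ only by case.
lemmatize | Logical. If TRUE (the default), combines terms that share a common lemma. Uses the lexicon package by default.
stem | Logical. If TRUE, combines terms that share a common stem. Defaults to FALSE. Note: stemming should not be used together with lemmatization. Because lemmatize defaults to TRUE, set lemmatize = FALSE when stem = TRUE.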
Value
A data frame with the same columns as the input, but with redundant terms combined.
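Examples
The following is a minimal base R sketch of what "combining terms that differ by case" can look like for an embedding stored as a data frame. It does not call keyclust. The column name term, the toy values, and the choice to average the rows of merged terms are all assumptions made for illustration, since this page does not state how process_embed combines embeddings internally.

```r
# Toy embedding in data frame format: one row per term, one column per dimension.
# The column name "term" and all values are hypothetical.
emb <- data.frame(
  term = c("Tax", "tax", "budget"),
  dim1 = c(0.2, 0.4, 1.0),
  dim2 = c(0.6, 0.8, 0.0),
  stringsAsFactors = FALSE
)

# Fold case so that "Tax" and "tax" share a single key.
emb$term <- tolower(emb$term)

# Assumption: rows that now share a term are averaged column-wise.
# process_embed may use a different rule for combining embeddings.
combined <- aggregate(. ~ term, data = emb, FUN = mean)
combined

# With keyclust installed, the analogous call would presumably look like:
# process_embed(emb, words = "term", punct = FALSE, tolower = TRUE,
#               lemmatize = FALSE, stem = FALSE)
```

The result has one row per distinct lower-cased term and the same columns as the input, which matches the shape described under Value.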
[Package keyclust version 1.2.5 Index]