vec_preprocess {LBDiscover} | R Documentation |
Vectorized preprocessing of text
Description
This function preprocesses text data using vectorized operations for better performance.
Usage
vec_preprocess(
text_data,
text_column = "abstract",
remove_stopwords = TRUE,
custom_stopwords = NULL,
min_word_length = 3,
max_word_length = 50,
chunk_size = 100
)
Arguments
text_data |
A data frame containing the text data to preprocess. |
text_column |
Name of the column in text_data that contains the text to process. Defaults to "abstract". |
remove_stopwords |
Logical. If TRUE (the default), stopwords are removed. |
custom_stopwords |
Character vector of additional stopwords to remove. Defaults to NULL. |
min_word_length |
Minimum length of words to keep. Defaults to 3. |
max_word_length |
Maximum length of words to keep. Defaults to 50. |
chunk_size |
Number of documents to process in each chunk. Defaults to 100. |
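To make the interaction of these arguments concrete, here is a minimal base R sketch of chunked, vectorized filtering. It is an illustration under stated assumptions, not the LBDiscover implementation: the function name sketch_preprocess, the tiny stopword list, the output column cleaned_text, and the choice to treat both length bounds as inclusive are all hypothetical.

```r
# Hypothetical illustration only; this is NOT how LBDiscover implements
# vec_preprocess(). Names, stopword list, and output column are assumptions.
sketch_preprocess <- function(text_data, text_column = "abstract",
                              stopwords = c("the", "and", "of", "in", "a"),
                              min_word_length = 3, max_word_length = 50,
                              chunk_size = 100) {
  n <- nrow(text_data)
  cleaned <- character(n)
  # Split row indices into consecutive chunks of at most chunk_size documents.
  chunks <- split(seq_len(n), ceiling(seq_len(n) / chunk_size))
  for (idx in chunks) {
    # Lowercase and replace non-letter runs for the whole chunk in one call.
    txt <- tolower(as.character(text_data[[text_column]][idx]))
    txt <- gsub("[^a-z]+", " ", txt)
    tokens <- strsplit(trimws(txt), " +")
    cleaned[idx] <- vapply(tokens, function(words) {
      len <- nchar(words)
      # Assumed inclusive bounds on word length, plus stopword removal.
      keep <- len >= min_word_length & len <= max_word_length &
        !(words %in% stopwords)
      paste(words[keep], collapse = " ")
    }, character(1))
  }
  text_data$cleaned_text <- cleaned  # hypothetical output column name
  text_data
}

docs <- data.frame(
  abstract = c("The migraine and magnesium link", "Fish oil in Raynaud disease"),
  stringsAsFactors = FALSE
)
result <- sketch_preprocess(docs, chunk_size = 1)
print(result$cleaned_text)
```

In this sketch, chunk_size only bounds how many documents are handled per pass, so it affects memory use rather than the result. Whether vec_preprocess() behaves the same way is not stated on this page.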
Value
A data frame with processed text.
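Examples
The call below is a hedged usage sketch built from the signature shown under Usage. The toy data frame and the custom stopwords are invented for illustration, and the example assumes vec_preprocess() is exported from LBDiscover. Because this page does not name the columns of the returned data frame, the result is inspected with str() rather than accessed by column name.

```r
# Toy corpus (invented for illustration); the text column uses the
# default name expected by text_column.
docs <- data.frame(
  id = 1:2,
  abstract = c(
    "Magnesium deficiency has been linked to migraine in several studies.",
    "Dietary fish oil may reduce blood viscosity in Raynaud disease."
  ),
  stringsAsFactors = FALSE
)

# Run only when LBDiscover is installed, so the script is safe to source.
if (requireNamespace("LBDiscover", quietly = TRUE)) {
  processed <- LBDiscover::vec_preprocess(
    docs,
    text_column = "abstract",
    remove_stopwords = TRUE,
    custom_stopwords = c("studies", "several"),  # illustrative extras
    min_word_length = 3,
    max_word_length = 50,
    chunk_size = 100
  )
  str(processed)  # inspect the returned data frame's structure
}
```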
[Package LBDiscover version 0.1.0 Index]