chunk_texts {idiolect} | R Documentation |
Chunk a corpus
Description
This function splits each text in a corpus into chunks of a fixed number of tokens, which makes it possible to control sample sizes.
Usage
chunk_texts(corpus, size)
Arguments
corpus | A quanteda corpus object. |
size | The size of the chunks in number of tokens. |
Value
A quanteda corpus object where each text is a chunk of the size requested.
Examples
corpus <- quanteda::corpus(c("The cat sat on the mat", "The dog sat on the chair"))
quanteda::docvars(corpus, "author") <- c("A", "B")
chunk_texts(corpus, size = 2)
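To make the effect of fixed-size chunking concrete, the sketch below applies quanteda's own `tokens_chunk()` to the same two texts. This is an illustration only, not the implementation of `chunk_texts()`: it operates on a tokens object rather than returning a corpus, and this page does not say whether `chunk_texts()` treats leftover tokens or document names the same way. The variable names `corp`, `toks`, and `chunks` are introduced here for the example.

```r
# Illustrative sketch using quanteda directly; not the idiolect implementation.
corp <- quanteda::corpus(c("The cat sat on the mat", "The dog sat on the chair"))
quanteda::docvars(corp, "author") <- c("A", "B")

# Tokenise, then split every text into consecutive chunks of 2 tokens.
toks <- quanteda::tokens(corp)
chunks <- quanteda::tokens_chunk(toks, size = 2)

# Each 6-token text yields 3 chunks, so there are 6 chunks in total.
quanteda::ndoc(chunks)

# Every chunk holds exactly 2 tokens.
quanteda::ntoken(chunks)

# Document variables are carried over to the chunks of each text.
quanteda::docvars(chunks, "author")
```

Because both texts here contain exactly six tokens, every chunk is full. With texts whose length is not a multiple of `size`, `tokens_chunk()` leaves a shorter final chunk.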
[Package idiolect version 1.0.1 Index]