findRepresentativeDocs {sts} | R Documentation
Function for Identifying Documents that Load Heavily on a Topic
Description
Extracts the n documents with the highest prevalence for a given topic.
Usage
findRepresentativeDocs(object, corpus_text, topic, n = 3)
Arguments
object | Model output from sts.
corpus_text | Vector of text documents, usually contained in the output of prepDocuments.
topic | A single topic number.
n | Number of documents to extract.
Examples
# Examples with the Gadarian data
library("tm"); library("stm"); library("sts")
temp <- textProcessor(documents = gadarian$open.ended.response,
                      metadata = gadarian, verbose = FALSE)
out <- prepDocuments(temp$documents, temp$vocab, temp$meta, verbose = FALSE)
out$meta$noTreatment <- ifelse(out$meta$treatment == 1, -1, 1)
## low max iteration number just for testing
sts_estimate <- sts(~ treatment*pid_rep, ~ noTreatment, out, K = 3, maxIter = 2)
docs <- findRepresentativeDocs(sts_estimate, out$meta$open.ended.response, topic = 3, n = 4)
plotRepresentativeDocs(docs, text.cex = 0.7, width = 100)
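
The selection described above (the n documents with the highest prevalence for one topic) can be illustrated with base R alone. This is a minimal sketch, not the package's internal implementation: the matrix `theta`, the vector `texts`, and the helper `top_docs` are hypothetical names introduced here, and the values are made up for illustration.

```r
# Hypothetical document-by-topic prevalence matrix
# (rows = documents, columns = topics). Illustrative values only;
# this is NOT output from sts().
theta <- matrix(
  c(0.70, 0.20, 0.10,
    0.10, 0.30, 0.60,
    0.25, 0.25, 0.50,
    0.05, 0.15, 0.80),
  ncol = 3, byrow = TRUE
)
texts <- c("doc A", "doc B", "doc C", "doc D")

# Hypothetical helper: row indices of the n documents with the
# highest prevalence for a single topic, in decreasing order.
top_docs <- function(theta, topic, n = 3) {
  n <- min(n, nrow(theta))  # never request more documents than exist
  order(theta[, topic], decreasing = TRUE)[seq_len(n)]
}

idx <- top_docs(theta, topic = 3, n = 2)
texts[idx]
```

Ranking with `order()` rather than fully sorting the texts keeps the row indices available, so the same indices can be used to look up metadata for the selected documents.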
[Package sts version 1.4 Index]