nail_sort {NaileR} | R Documentation
Sort textual data
Description
Group textual data according to their similarity, in a setting where several assessors have each commented on the same set of stimuli.
Usage
nail_sort(
dataset,
name_size = 3,
stimulus_id = "individual",
introduction = NULL,
measure = NULL,
request = NULL,
model = "llama3.1",
nb.clusters = 4,
generate = FALSE,
max.attempts = 5
)
Arguments
dataset | a data frame where each row is a stimulus and each column is an assessor.
name_size | the maximum number of words in a group name created by the LLM.
stimulus_id | the nature of the stimulus (for example, "beard"). Customizing it is highly recommended.
introduction | the introduction to the LLM prompt.
measure | the type of measure used in the experiment (for example, 'the description was').
request | the request of the LLM prompt.
model | the model name ('llama3.1' by default).
nb.clusters | the maximum number of clusters the LLM can form per assessor.
generate | a logical value indicating whether to generate the LLM response. If FALSE, the function only returns the prompt.
max.attempts | the maximum number of attempts for a column, that is, per assessor.
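As an illustration of the layout expected for dataset, the sketch below builds a toy data frame with one row per stimulus and one column per assessor. The comments, the column names and the use of row names as stimulus labels are invented for this example and are not taken from the package data.

```r
# Toy data (hypothetical): 3 stimuli described by 2 assessors.
# Rows are stimuli, columns are assessors, cells are free-text comments.
toy <- data.frame(
  assessor_1 = c("thick and well groomed", "short stubble", "long and untidy"),
  assessor_2 = c("neat and dense", "barely there", "messy and long"),
  row.names = c("beard_A", "beard_B", "beard_C"),  # assumed stimulus labels
  stringsAsFactors = FALSE
)

dim(toy)             # number of stimuli, number of assessors
sapply(toy, class)   # every column should hold character data
```

A data frame of this shape could then be passed as the dataset argument, with stimulus_id set to a word that describes the rows.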
Details
This function uses a while loop to ensure that the LLM returns the right number of groups, making up to max.attempts attempts per assessor. Customizing the stimulus ID, prompt introduction and measure is therefore highly recommended: a clear prompt can help the LLM finish its task faster.
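The retry logic described here can be pictured with the minimal sketch below. It is not the package's implementation: fake_llm() and sort_with_retry() are hypothetical helpers written for this illustration, and the "LLM" is a deterministic stand-in that produces too many groups on its first two attempts.

```r
# Hypothetical stand-in for an LLM call (NOT part of NaileR).
# Returns one group label per stimulus.
fake_llm <- function(n_stimuli, attempt) {
  # The first two attempts deliberately use too many groups.
  n_groups <- if (attempt < 3) 6 else 3
  paste("group", rep_len(seq_len(n_groups), n_stimuli))
}

# Hypothetical retry wrapper: ask again until the answer respects
# the cluster limit or the attempt budget is exhausted.
sort_with_retry <- function(n_stimuli, nb.clusters = 4, max.attempts = 5) {
  attempt <- 0
  labels <- NULL
  valid <- FALSE
  while (!valid && attempt < max.attempts) {
    attempt <- attempt + 1
    labels <- fake_llm(n_stimuli, attempt)
    # Accept the answer only if it uses at most nb.clusters groups.
    valid <- length(unique(labels)) <= nb.clusters
  }
  list(labels = labels, attempts = attempt, valid = valid)
}

out <- sort_with_retry(n_stimuli = 8, nb.clusters = 4, max.attempts = 5)
out$attempts
out$valid

# With a smaller budget the loop stops without a valid answer.
out_short <- sort_with_retry(n_stimuli = 8, nb.clusters = 4, max.attempts = 2)
out_short$valid
```

Each extra pass through such a loop costs one more model call, which is why a prompt that is clear from the start tends to shorten the total processing time.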
Value
A list consisting of:
prompt_llm: a list of prompts (one per assessor);
res_llm: a list of results (one per assessor);
dta_sort: a data frame with the group names.
Examples
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.
library(NaileR)
data(beard_wide)
intro_beard <- "As a barber, you make
recommendations based on consumers' comments.
Examples of consumers' descriptions of beards
are as follows."
intro_beard <- gsub('\n', ' ', intro_beard) |>
  stringr::str_squish()
req_beard <- "Each group should contain beards with descriptions
that relate to a similar type of person - not
necessarily the same person, but sharing common traits.
Each group must have a short,
meaningful name that characterizes the person."
req_beard <- gsub('\n', ' ', req_beard) |>
  stringr::str_squish()
res <- nail_sort(beard_wide[, 1:5], name_size = 3,
                 stimulus_id = "beard", introduction = intro_beard,
                 measure = 'the description was',
                 request = req_beard,
                 nb.clusters = 6,
                 generate = TRUE)
cat(res$prompt_llm[[1]])
cat(res$res_llm[[1]])
res$dta_sort
## End(Not run)