scorer_detect {vitals}    R Documentation

Scoring with string detection

Description

The following functions use string pattern detection to score model outputs.

Usage

detect_includes(case_sensitive = FALSE)

detect_match(
  location = c("end", "begin", "end", "any"),
  case_sensitive = FALSE
)

detect_pattern(pattern, case_sensitive = FALSE, all = FALSE)

detect_exact(case_sensitive = FALSE)

detect_answer(format = c("line", "word", "letter"))

Arguments

case_sensitive

Logical, whether comparisons are case sensitive.

location

Where to look for the match: one of "begin", "end", "any", or "exact". Defaults to "end".

pattern

Regular expression pattern used to extract the answer.

all

Logical: for multiple captures, whether all must match.

format

What to extract after "ANSWER:": "letter", "word", or "line". Defaults to "line".
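
To make the location and case_sensitive arguments concrete, here is a minimal base R sketch of location-based matching. The helper sketch_match() is hypothetical and is not part of vitals; the set of locations follows the argument description above, and the whitespace trimming is an assumption, so the package's actual behavior may differ.

```r
# Hypothetical helper, not part of vitals: checks whether `target` occurs in
# `output` at the requested location.
sketch_match <- function(output, target,
                         location = c("end", "begin", "any", "exact"),
                         case_sensitive = FALSE) {
  location <- match.arg(location)

  # Assumption: surrounding whitespace is ignored before comparing.
  output <- trimws(output)
  target <- trimws(target)

  # Case-insensitive comparison is done by lowercasing both sides.
  if (!case_sensitive) {
    output <- tolower(output)
    target <- tolower(target)
  }

  switch(
    location,
    begin = startsWith(output, target),
    end   = endsWith(output, target),
    any   = grepl(target, output, fixed = TRUE),
    exact = output == target
  )
}

sketch_match("The answer is 4", "4")                      # TRUE (location = "end")
sketch_match("4 is the answer", "4", location = "begin")  # TRUE
sketch_match("Paris", "paris", location = "exact",
             case_sensitive = TRUE)                       # FALSE
```

With the default location = "end", an output that states the answer last is scored as a match, which suits prompts that ask the model to finish with its answer.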

Value

A function that scores model output based on string matching. Pass the returned value to $eval(scorer). See the documentation for the scorer argument in Task for more information on the return type.

See Also

model_graded_qa() and model_graded_fact() for model-based scoring.

Examples

if (!identical(Sys.getenv("ANTHROPIC_API_KEY"), "")) {
  # set the log directory to a temporary directory
  withr::local_envvar(VITALS_LOG_DIR = withr::local_tempdir())

  library(vitals)
  library(ellmer)
  library(tibble)

  simple_addition <- tibble(
    input = c("What's 2+2?", "What's 2+3?"),
    target = c("4", "5")
  )

  # create a new Task
  tsk <- Task$new(
    dataset = simple_addition,
    solver = generate(solver_chat = chat_anthropic(model = "claude-3-7-sonnet-latest")),
    scorer = detect_includes()
  )

  # evaluate the task (runs solver and scorer)
  tsk$eval()
}
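
The format argument of detect_answer() chooses how much text after "ANSWER:" is extracted. The following base R sketch illustrates the idea without an API key. The helper sketch_extract_answer() and its regular expressions are assumptions made for illustration, not the package's implementation.

```r
# Hypothetical helper, not part of vitals: pulls the text that follows
# "ANSWER:" as a whole line, a single word, or a single letter.
sketch_extract_answer <- function(output,
                                  format = c("line", "word", "letter")) {
  format <- match.arg(format)

  # Assumed patterns; the package's actual regular expressions may differ.
  pattern <- switch(
    format,
    line   = "ANSWER:[ \t]*([^\n]+)",
    word   = "ANSWER:[ \t]*([[:alnum:]]+)",
    letter = "ANSWER:[ \t]*([[:alpha:]])"
  )

  # regexec() returns the full match followed by the capture group.
  hit <- regmatches(output, regexec(pattern, output))[[1]]
  if (length(hit) < 2) NA_character_ else trimws(hit[2])
}

reply <- "Let me think.\nANSWER: B because it is larger\nDone."
sketch_extract_answer(reply, "line")    # "B because it is larger"
sketch_extract_answer(reply, "word")    # "B"
sketch_extract_answer(reply, "letter")  # "B"
```

In this sketch an output without an "ANSWER:" marker yields NA, so a caller would need to decide how to score a missing answer.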


[Package vitals version 0.1.0 Index]