aaa3_tinycodet_strings {tinycodet}    R Documentation

Overview of the 'tinycodet' Extension of 'stringi'

Description

Virtually every programming language, even one focused primarily on mathematics, has to deal with strings at some point. R's atomic types essentially boil down to some form of either numbers or characters. R's numerical functions are generally very fast, but its native string functions are somewhat slow, lack a unified naming scheme, and are less comprehensive than its numerical functions.

The primary R package that addresses this is 'stringi'. At the time of writing, 'stringi' is the fastest and most comprehensive string manipulation package available. Many string-related packages depend fully on 'stringi'; the 'stringr' package, for example, is essentially a thin wrapper around it.
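As a rough illustration of the naming-scheme point, the sketch below places a few base R string functions next to their 'stringi' counterparts. It assumes 'stringi' is installed; the choice of functions and the example vector are illustrative only, and the sketch makes no claim about speed.

```r
library(stringi)  # assumption: 'stringi' is installed

x <- c("3rd 1st 2nd", "5th 4th 6th")

# Base R: related tasks use unrelated names and argument orders
base_detect <- grepl("\\d", x)   # pattern first, string second
base_sub    <- substr(x, 1, 3)   # string first
base_upper  <- toupper(x)

# 'stringi': a common stri_ prefix, with the string as first argument
stri_detect <- stri_detect_regex(x, "\\d")
stri_part   <- stri_sub(x, from = 1, to = 3)
stri_upper  <- stri_trans_toupper(x)
```

For this simple ASCII input both families should return the same results; the difference shown here is only in how the functions are named and called.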

Because string manipulation is so important in programming, 'tinycodet' adds a small amount of new functionality to 'stringi'. The Examples section below illustrates some of it.

References

Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1–59, doi:10.18637/jss.v103.i02

See Also

tinycodet_help(), s_regex()

Examples


# attach the package so its functions and operators are available:
library(tinycodet)

# character vector:
x <- c("3rd 1st 2nd", "5th 4th 6th")
print(x)

# detect whether each string contains a digit:
x %s{}% "\\d"

# find the second-to-last digit in each string:
loc <- stri_locate_ith(x, i = -2, regex = "\\d")
stringi::stri_sub(x, from = loc)

# cut x into matrix of individual words:
mat <- strcut_brk(x, "word")

# sort rows of matrix using the fast %row~% operator:
rank <- stringi::stri_rank(as.vector(mat)) |> matrix(ncol = ncol(mat))
sorted <- mat %row~% rank
sorted[is.na(sorted)] <- ""

# join the elements of each row into one string, giving a character vector:
stri_c_mat(sorted, margin = 1, sep = " ")


[Package tinycodet version 0.3.0 Index]