similarity_matrix {keyclust} | R Documentation |
Algorithm designed to create a cosine similarity matrix from a fitted word embedding model
Description
This function takes a fitted word embedding model and computes the cosine similarity between every pair of words.
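To make the computation concrete, here is a minimal base R sketch of pairwise cosine similarity on a small made-up embedding matrix. It is an illustration only and is not the package's implementation; the helper name `cosine_sim_sketch` and the toy data are hypothetical.

```r
# Hypothetical helper, not part of keyclust: pairwise cosine similarity
# between the rows of a numeric matrix.
cosine_sim_sketch <- function(emb) {
  # Scale each row to unit Euclidean length
  unit <- emb / sqrt(rowSums(emb^2))
  # Dot products of unit-length rows are cosine similarities
  tcrossprod(unit)
}

# Toy embedding: 3 words, 2 dimensions (made-up values)
emb <- matrix(c(1, 0,
                0, 1,
                1, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("alpha", "beta", "gamma"), NULL))

sim <- cosine_sim_sketch(emb)
```

The result is a symmetric 3 x 3 matrix with ones on the diagonal. Orthogonal rows such as "alpha" and "beta" score 0, and "alpha" against "gamma" scores 1 / sqrt(2).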
Usage
similarity_matrix(x, words = NULL, max_terms = 25000)
Arguments
x | A word embedding matrix |
words | A vector of words, or the name of the column that holds the words for the fitted word embeddings |
max_terms | The maximum number of embedding terms to include in the output similarity matrix. Assumes that the embedding input is ordered by word frequency, from most to least frequent. |
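The cap matters because the output grows quadratically: at the default of 25000 terms, an N x N result has 625 million entries, roughly 5 GB if stored as dense doubles. The sketch below shows one plausible reading of the cap, keeping only the first `max_terms` rows of a frequency-ordered embedding. This is an assumption about the behavior, not code taken from the package, and the toy data are hypothetical.

```r
# Hypothetical toy embedding; rows are assumed to be ordered from the
# most frequent word to the least frequent word.
emb <- matrix(seq_len(10), nrow = 5,
              dimnames = list(c("the", "of", "model", "vector", "cosine"), NULL))

max_terms <- 3  # illustrative value; the documented default is 25000

# Keep at most max_terms rows from the top of the matrix
kept <- emb[seq_len(min(max_terms, nrow(emb))), , drop = FALSE]

rownames(kept)
```

If the input is not ordered by frequency, truncating this way would drop an arbitrary set of words, which is why the ordering assumption is stated in the argument description.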
Value
An N x N matrix of cosine similarity scores between words from a fitted word embedding model.
Examples
# Compute a cosine similarity matrix from the sample embeddings,
# taking the words from the column named "words"
simmat <- similarity_matrix(wordemb_FasttextEng_sample, words = "words")
[Package keyclust version 1.2.5 Index]