generate_complex_data {topolow} | R Documentation
Generate Complex High-Dimensional Data for Testing
Description
Generates synthetic high-dimensional data with clusters and trends for testing dimensionality reduction methods. The generated data has the following properties:
Multiple clusters along a trend line
Variable density regions
Controllable noise levels
Optional visualization
The function places cluster centers along a trend line, adds points around those centers with the specified spread, and incorporates random noise to create high- and low-density regions. The resulting data is useful for testing dimensionality reduction and visualization methods.
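The steps described above can be sketched in base R as follows. This is a minimal illustration of the general idea, not the topolow implementation: the helper name sketch_clustered_data, the straight diagonal trend line, the noise_frac parameter, and the uniform background noise are all assumptions made for the example.

```r
# Hypothetical helper (not part of topolow): illustrates the general idea only.
sketch_clustered_data <- function(n_points = 500, n_dim = 10,
                                  n_clusters = 4, cluster_spread = 1,
                                  noise_frac = 0.1) {
  # Assumption: cluster centers are evenly spaced on a straight trend line
  # running along the diagonal of the space.
  positions <- seq(0, 10, length.out = n_clusters)
  centers <- matrix(rep(positions, times = n_dim),
                    nrow = n_clusters, ncol = n_dim)

  # Assumption: a fixed fraction of the points is background noise.
  n_noise <- floor(n_points * noise_frac)
  n_clustered <- n_points - n_noise

  # Assign each clustered point to a center and add Gaussian scatter.
  assignment <- sample(seq_len(n_clusters), n_clustered, replace = TRUE)
  clustered <- centers[assignment, , drop = FALSE] +
    matrix(rnorm(n_clustered * n_dim, sd = cluster_spread),
           nrow = n_clustered, ncol = n_dim)

  # Uniform noise around the trend line produces low-density regions.
  noise <- matrix(runif(n_noise * n_dim, min = -2, max = 12),
                  nrow = n_noise, ncol = n_dim)

  out <- as.data.frame(rbind(clustered, noise))
  names(out) <- paste0("Dim", seq_len(n_dim))
  out
}

set.seed(1)
sketch <- sketch_clustered_data(n_points = 200, n_dim = 5, n_clusters = 3)
dim(sketch)
```

Increasing cluster_spread in this sketch widens the Gaussian scatter around each center, which blurs the boundaries between neighbouring clusters on the trend line.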
Usage
generate_complex_data(
  n_points = 500,
  n_dim = 10,
  n_clusters = 4,
  cluster_spread = 1,
  fig_name = NA
)
Arguments
n_points | Integer number of points to generate
n_dim | Integer number of dimensions
n_clusters | Integer number of clusters
cluster_spread | Numeric controlling cluster variance
fig_name | Character path to save visualization (optional)
Value
A data.frame with n_points rows and n_dim columns. Column names are "Dim1" through "DimN", where N is n_dim.
Examples
# Generate basic dataset
data <- generate_complex_data(n_points = 500, n_dim = 10,
                              n_clusters = 4, cluster_spread = 1)
# The function returns a data frame, which can be inspected
head(data)
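As a further illustration of the intended use, the generated data can be passed to a dimensionality reduction method. The sketch below uses stats::prcomp() from base R; the choice of PCA, the variable names dat, pca, and coords, and the random stand-in data used when topolow is not installed are assumptions for the example, not part of the package documentation.

```r
# Use the real generator if topolow is installed; otherwise fall back to
# random stand-in data of the documented shape so the sketch still runs.
if (requireNamespace("topolow", quietly = TRUE)) {
  dat <- topolow::generate_complex_data(n_points = 500, n_dim = 10,
                                        n_clusters = 4, cluster_spread = 1)
} else {
  dat <- as.data.frame(matrix(rnorm(500 * 10), nrow = 500, ncol = 10))
  names(dat) <- paste0("Dim", 1:10)
}

# Inspect the shape and column names of the result.
dim(dat)
names(dat)

# Project onto the first two principal components.
pca <- prcomp(dat, center = TRUE, scale. = FALSE)
coords <- pca$x[, 1:2]
head(coords)
```

Plotting coords (for example with plot(coords)) gives a quick visual check of whether the clusters along the trend line remain separated after projection.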