Dataset

class niaarmts.dataset.Dataset

Bases: object

calculate_problem_dimension()

Calculates the dimension of the problem from the feature types present in the dataset.

  • Adds 4 for each numerical attribute (lower and upper bound, threshold and permutation).

  • Adds 3 for each categorical attribute (category, threshold and permutation).

  • Adds 1 if an interval (datetime) attribute is present.

  • Adds 2 if time series data (timestamp) is present.

  • Adds 1 for cut point value.

Returns:

The calculated dimension of the problem.
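The rules above amount to simple arithmetic, sketched below. This is a minimal illustration and not the library's implementation: the function `problem_dimension` and its parameters are hypothetical names, and it assumes the datetime and timestamp contributions are each added once, as the list describes.

```python
def problem_dimension(num_numerical, num_categorical, has_datetime, has_timestamp):
    """Hypothetical helper mirroring the documented rules (not NiaARMTS code)."""
    # 4 per numerical attribute: lower bound, upper bound, threshold, permutation
    dimension = 4 * num_numerical
    # 3 per categorical attribute: category, threshold, permutation
    dimension += 3 * num_categorical
    # 1 if an interval (datetime) attribute is present
    if has_datetime:
        dimension += 1
    # 2 if time series data (timestamp) is present
    if has_timestamp:
        dimension += 2
    # 1 for the cut point value, always added
    dimension += 1
    return dimension


# 3 numerical and 2 categorical features, with a timestamp column:
# 3*4 + 2*3 + 2 + 1 = 21
print(problem_dimension(3, 2, has_datetime=False, has_timestamp=True))
```

Under these assumptions, a dataset with no features at all still has dimension 1, because the cut point is always counted.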

get_all_transactions()

Get all transactions (rows) from the dataset.

Returns:

A DataFrame containing all transactions (rows).

get_categorical_features()

Get a list of categorical features.

Returns:

A list of categorical feature names.

get_datetime_features()

Get a list of datetime features.

Returns:

A list of datetime feature names.

get_feature_stats(feature_name: str)

Get detailed statistics for a given feature.

Parameters:

feature_name – The name of the feature to analyze.

Returns:

A dictionary with statistics about the feature.

get_feature_summary()

Get a summary of features, categorized by type.

Returns:

A dictionary with feature summaries.

get_numerical_features()

Get a list of numerical features.

Returns:

A list of numerical feature names.

load_data_from_csv(file_path: str, timestamp_col: str = None)

Load the dataset from a CSV file.

Parameters:

  • file_path – Path to the CSV file.

  • timestamp_col – Optional name of the column containing timestamps.