Dataset
- class niaarmts.dataset.Dataset
Bases: object
- calculate_problem_dimension()
Calculates the dimension of the problem based on the type of features.
Adds 4 for each numerical attribute (lower and upper bound, threshold and permutation).
Adds 3 for each categorical attribute (category, threshold and permutation).
Adds 1 if an interval (datetime) attribute is present.
Adds 2 if time series data (timestamp) is present.
Adds 1 for the cut point value.
- Returns:
The calculated dimension of the problem.
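The arithmetic above can be sketched as a standalone function. This is an illustration of the counting rules only, not the library's implementation; the function name `problem_dimension` and its parameters are hypothetical and take feature counts directly instead of reading them from a loaded dataset.

```python
def problem_dimension(num_numerical: int,
                      num_categorical: int,
                      has_datetime: bool = False,
                      has_timestamp: bool = False) -> int:
    """Hypothetical helper mirroring the documented counting rules."""
    # 4 per numerical attribute: lower bound, upper bound, threshold, permutation
    dimension = 4 * num_numerical
    # 3 per categorical attribute: category, threshold, permutation
    dimension += 3 * num_categorical
    # 1 if an interval (datetime) attribute is present
    if has_datetime:
        dimension += 1
    # 2 if time series data (timestamp) is present
    if has_timestamp:
        dimension += 2
    # 1 for the cut point value, always added
    dimension += 1
    return dimension


# Three numerical and two categorical features plus a timestamp column:
# 3*4 + 2*3 + 2 + 1 = 21
print(problem_dimension(3, 2, has_timestamp=True))
```

Under these rules an empty feature set still yields a dimension of 1, because the cut point is counted unconditionally.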
- get_all_transactions()
Get all transactions (rows) from the dataset.
- Returns:
A DataFrame containing all transactions (rows).
- get_categorical_features()
Get a list of categorical features.
- Returns:
A list of categorical feature names.
- get_datetime_features()
Get a list of datetime features.
- Returns:
A list of datetime feature names.
- get_feature_stats(feature_name: str)
Get detailed statistics for a given feature.
- Parameters:
feature_name – The name of the feature to analyze.
- Returns:
A dictionary with statistics about the feature.
- get_feature_summary()
Get a summary of features, categorized by type.
- Returns:
A dictionary with feature summaries.
- get_numerical_features()
Get a list of numerical features.
- Returns:
A list of numerical feature names.
- load_data_from_csv(file_path: str, timestamp_col: str = None)
Load the dataset from a CSV file.
- Parameters:
file_path – Path to the CSV file.
timestamp_col – Optional name of the column containing timestamps.
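To illustrate how an optional timestamp column might be handled by a loader with this signature, here is a minimal standard-library sketch. It is not the library's code: `load_rows_from_csv` is a hypothetical stand-in that returns a list of dictionaries instead of a DataFrame, and it assumes timestamps are written in ISO 8601 form.

```python
import csv
from datetime import datetime
from typing import Optional


def load_rows_from_csv(file_path: str,
                       timestamp_col: Optional[str] = None) -> list:
    """Hypothetical stand-in: read a CSV file into a list of row dicts."""
    with open(file_path, newline="") as handle:
        rows = list(csv.DictReader(handle))

    # Only parse timestamps when the caller names a column (assumption:
    # values are ISO 8601 strings such as "2024-01-01 10:00:00").
    if timestamp_col is not None:
        for row in rows:
            row[timestamp_col] = datetime.fromisoformat(row[timestamp_col])

    return rows
```

When `timestamp_col` is omitted, every value stays a plain string; when it is given, only that column is converted to `datetime` objects. A name that does not appear in the CSV header raises `KeyError` in this sketch.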