API Reference
Detailed documentation for the AuditTrialRecorder class and its methods.
AuditTrialRecorder(df, name=None)
Initialize the recorder with a pandas DataFrame.
df (pd.DataFrame)
The initial dataframe to track.
name (str, optional)
Name of the experiment/audit trail. Defaults to a timestamped name.
Example
import pandas as pd
from ml_audit import AuditTrialRecorder
df = pd.DataFrame({"age": [25, None, 40]})  # any DataFrame you want to track
auditor = AuditTrialRecorder(df, name="my_experiment")
.impute(column, strategy='mean', fill_value=None, method=None)
Fill missing values in one or more columns using statistical strategies or specific methods.
column (str | list)
Column name(s) to impute.
strategy (str)
'mean', 'median', 'mode', or 'constant'.
method (str)
'ffill' or 'bfill'. Overrides strategy if provided.
fill_value (any)
Value to use when strategy='constant'.
Example
auditor.impute(["age", "salary"], strategy='median')
auditor.impute("stock", method='ffill')
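For orientation, the sketch below shows what median imputation and forward fill conventionally do, written in plain pandas. It does not show the recorder's internals; the sample data is made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data only; not taken from the library.
df = pd.DataFrame({
    "age": [20.0, np.nan, 40.0, 60.0],
    "stock": [5.0, np.nan, np.nan, 8.0],
})

# Median strategy: replace each NaN with the median of the observed values.
age_median = df["age"].median()
df["age"] = df["age"].fillna(age_median)

# Forward fill: carry the last observed value forward over the gaps.
df["stock"] = df["stock"].ffill()
```

Note that a forward fill leaves a leading missing value untouched, because there is no earlier observation to carry forward.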
.scale(column, method='standard')
Scale numerical features to a specific range or distribution.
column (str | list)
Column name(s) to scale.
method (str)
'standard' (default), 'minmax', 'robust', or 'maxabs'.
Example
auditor.scale(["height", "weight"], method='standard')
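The four method names match common scaler definitions. The sketch below writes those conventional formulas out in plain pandas, assuming the library follows the usual scikit-learn-style definitions (not confirmed by this reference).

```python
import pandas as pd

# Illustrative data only.
height = pd.Series([150.0, 160.0, 170.0, 180.0, 190.0])

# 'standard': zero mean, unit variance (population std, ddof=0).
standard = (height - height.mean()) / height.std(ddof=0)

# 'minmax': rescale linearly onto [0, 1].
minmax = (height - height.min()) / (height.max() - height.min())

# 'robust': centre on the median, divide by the interquartile range.
iqr = height.quantile(0.75) - height.quantile(0.25)
robust = (height - height.median()) / iqr

# 'maxabs': divide by the largest absolute value, giving values in [-1, 1].
maxabs = height / height.abs().max()
```

The robust variant is the usual choice when a column has outliers, since the median and interquartile range are less sensitive to extreme values than the mean and standard deviation.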
.encode(column, method='onehot', target_col=None)
Encode categorical features.
column (str | list)
Column name(s) to encode.
method (str)
'onehot', 'label', or 'target'.
target_col (str)
Required only for 'target' encoding.
Example
auditor.encode("color", method='onehot')
auditor.encode("zip", method='target', target_col='price')
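As a rough guide to what the three encodings produce, here is a plain pandas sketch. It assumes the conventional meaning of each method name; the recorder's actual output column names and handling of unseen categories are not documented here.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "zip": ["A", "A", "B", "B"],
    "price": [100.0, 200.0, 300.0, 500.0],
})

# One-hot: one indicator column per category.
onehot = pd.get_dummies(df["color"], prefix="color")

# Label: one integer code per category (categories sorted alphabetically here).
label = df["color"].astype("category").cat.codes

# Target: replace each category with the mean of the target within it.
target = df.groupby("zip")["price"].transform("mean")
```

Target encoding computed on the full dataset leaks the target into the feature, so in practice the category means are usually fitted on training data only.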
.transform(column, func='log')
Apply mathematical transformations to columns.
column (str | list)
Column name(s) to transform.
func (str)
'log' (computed as log1p, i.e. log(1 + x)), 'sqrt', 'cbrt', or 'square'.
Example
auditor.transform("income", func='log')
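The NumPy equivalents of these transformations are sketched below for reference. This is not the library's code; it only illustrates why log1p is a convenient choice for columns that contain zeros.

```python
import numpy as np
import pandas as pd

# Illustrative data only; note the zero in the first row.
income = pd.Series([0.0, 9.0, 99.0])

log_income = np.log1p(income)    # log(1 + x), defined at x = 0
sqrt_income = np.sqrt(income)    # requires non-negative input
cbrt_income = np.cbrt(income)    # also defined for negative input
square_income = income ** 2
```

A plain logarithm would return negative infinity for the zero entry, whereas log1p maps it to 0.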
.bin_numeric(column, bins=5, strategy='quantile', labels=None)
Discretize continuous variables into bins.
column (str | list)
Column name(s) to bin.
bins (int)
Number of bins to create.
strategy (str)
'quantile' (equal-frequency bins) or 'uniform' (equal-width bins).
Example
auditor.bin_numeric("age", bins=4, strategy='quantile')
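The difference between the two strategies is easiest to see on skewed data. The sketch below uses pandas' own `qcut` and `cut` as stand-ins, on the assumption that the strategies carry their usual meaning; the recorder's implementation may differ.

```python
import pandas as pd

# Illustrative, right-skewed data.
age = pd.Series([18, 20, 22, 25, 30, 40, 60, 90])

# Equal frequency: edges fall at quantiles, so each bin holds about the same number of rows.
quantile_bins = pd.qcut(age, q=4, labels=False)

# Equal width: edges are evenly spaced between min and max, so counts can be very uneven.
uniform_bins = pd.cut(age, bins=4, labels=False)

quantile_counts = quantile_bins.value_counts().sort_index().tolist()
uniform_counts = uniform_bins.value_counts().sort_index().tolist()
```

Here the quantile strategy places two rows in every bin, while the uniform strategy puts most rows in the lowest bin.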
.extract_date_features(column, features=['year', 'month', 'day'])
Extract features from datetime columns.
column (str | list)
The datetime column(s).
features (list)
List of features to extract: 'year', 'month', 'day', 'weekday', 'hour'.
Example
auditor.extract_date_features("joined_at", features=['year', 'month'])
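In plain pandas, the same kind of extraction goes through the `.dt` accessor, as sketched below. The output column names used here (`joined_at_year` and so on) are hypothetical; this reference does not say how the recorder names the new columns.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"joined_at": pd.to_datetime(["2023-01-15", "2024-07-04"])})

# Hypothetical naming scheme: <column>_<feature>.
df["joined_at_year"] = df["joined_at"].dt.year
df["joined_at_month"] = df["joined_at"].dt.month
df["joined_at_weekday"] = df["joined_at"].dt.weekday  # Monday=0 ... Sunday=6
```

The column must already have a datetime dtype; string columns need `pd.to_datetime` first.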
.balance_classes(target, strategy='oversample', random_state=42)
Balance class distribution in the target variable.
target (str)
The target class column.
strategy (str)
'oversample', 'undersample', or 'smote' (requires imblearn).
Example
auditor.balance_classes("churn", strategy='smote')
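To make the 'oversample' strategy concrete, the sketch below implements simple random oversampling in plain pandas: minority rows are duplicated at random until every class matches the largest one. This is one common definition and an assumption about what the library does; SMOTE, by contrast, synthesizes new points and is not shown.

```python
import pandas as pd

# Illustrative, imbalanced data: four rows of class 0, two of class 1.
df = pd.DataFrame({"churn": [0, 0, 0, 0, 1, 1], "x": range(6)})

majority_size = df["churn"].value_counts().max()

parts = []
for label, group in df.groupby("churn"):
    extra = majority_size - len(group)
    if extra > 0:
        # Draw the missing rows with replacement from the same class.
        resampled = group.sample(n=extra, replace=True, random_state=42)
        group = pd.concat([group, resampled])
    parts.append(group)

balanced = pd.concat(parts, ignore_index=True)
```

Fixing `random_state` makes the duplicated rows reproducible from run to run.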
.filter_rows / .drop_columns
Basic DataFrame manipulations: filter rows by a comparison, or drop columns by name.
# Filter: column, operator, value
auditor.filter_rows("age", ">=", 18)
# Drop: columns (list)
auditor.drop_columns(["id", "tmp"])
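A string operator such as ">=" can be mapped onto a comparison with the standard `operator` module. The sketch below shows that idea in plain pandas; the helper name and the set of supported operators are assumptions, not the library's documented behaviour.

```python
import operator

import pandas as pd

# Assumed operator set, for illustration only.
OPS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "==": operator.eq, "!=": operator.ne,
}

def filter_rows_sketch(df, column, op, value):
    """Keep the rows where `df[column] <op> value` holds."""
    mask = OPS[op](df[column], value)
    return df[mask]

df = pd.DataFrame({"age": [15, 18, 42], "id": [1, 2, 3], "tmp": [0, 0, 0]})
adults = filter_rows_sketch(df, "age", ">=", 18)
trimmed = adults.drop(columns=["id", "tmp"])
```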
.track_pandas(method_name, *args, **kwargs)
Track an arbitrary pandas method, identified by name. Positional and keyword arguments are passed through to that method.
auditor.track_pandas("dropna", subset=['col1'])
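Dispatching on a method name is typically done with `getattr`. The sketch below shows one way such tracking could work; the function name and log entry fields are hypothetical and are not the library's actual log schema.

```python
import pandas as pd

def track_pandas_sketch(df, log, method_name, *args, **kwargs):
    """Call df.<method_name>(*args, **kwargs) and append a record to `log`."""
    result = getattr(df, method_name)(*args, **kwargs)
    log.append({
        "method": method_name,
        "args": list(args),
        "kwargs": kwargs,
        "rows_before": len(df),
        "rows_after": len(result),  # assumes the method returns a DataFrame
    })
    return result

log = []
df = pd.DataFrame({"col1": [1.0, None, 3.0], "col2": ["a", "b", "c"]})
cleaned = track_pandas_sketch(df, log, "dropna", subset=["col1"])
```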
.export_audit_trail(filename=None, visualize=True)
Save the audit log to a JSON file and optionally generate an HTML visualization.
filename (str)
Output filename. Defaults to [name]_audit.json.
visualize (bool)
If True (default), generates a corresponding .html file in visualizations/.
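The default filename rule can be pictured with a small standard-library sketch. The JSON layout below (a name plus a list of steps) is purely hypothetical; the real file format is not described in this reference.

```python
import json
import tempfile
from pathlib import Path

def export_log_sketch(steps, name, filename=None, directory="."):
    """Write `steps` as JSON, falling back to <name>_audit.json as the filename."""
    path = Path(directory) / (filename or f"{name}_audit.json")
    path.write_text(json.dumps({"name": name, "steps": steps}, indent=2))
    return path

# Hypothetical log content, for illustration only.
steps = [{"method": "dropna", "kwargs": {"subset": ["col1"]}}]

with tempfile.TemporaryDirectory() as tmp:
    out = export_log_sketch(steps, "my_experiment", directory=tmp)
    out_name = out.name
    loaded = json.loads(out.read_text())
```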