dran.storage package

Submodules

dran.storage.db_introspection module

dran.storage.db_introspection.get_table_names(database_path)[source]

Return a sorted list of user-defined table names from the given SQLite database file.

Parameters:

database_path (Path) – Path to the SQLite .db file

Returns:

List of table names

Return type:

List[str]
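As an illustration of what such a lookup can involve, here is a minimal sketch that reads table names from SQLite's `sqlite_master` catalog. The helper name `list_user_tables` is hypothetical and this is not necessarily how `get_table_names` is implemented; it assumes "user-defined" means every table whose name does not start with SQLite's reserved `sqlite_` prefix.

```python
import sqlite3
from pathlib import Path
from typing import List


def list_user_tables(database_path: Path) -> List[str]:
    """Hypothetical helper: sorted names of user-defined tables."""
    conn = sqlite3.connect(str(database_path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    # Names starting with "sqlite_" are reserved for internal tables
    # such as sqlite_sequence, so they are filtered out here.
    return sorted(name for (name,) in rows if not name.startswith("sqlite_"))
```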

dran.storage.db_introspection.get_table_from_db(dbPath, tableName)[source]
Parameters:
  • dbPath (str)

  • tableName (str)

Return type:

DataFrame

dran.storage.db_introspection.prep_data(dataframe, source_name)[source]

Preprocess the data in the DataFrame for analysis.

Parameters:
  • dataframe (pd.DataFrame) – The input DataFrame containing observational data.

  • source_name (str) – The name of the source being processed.

Returns:

The processed DataFrame.

Return type:

pd.DataFrame

dran.storage.db_introspection.parse_time(timeCol)[source]
Parameters:

timeCol (str)

Return type:

str

dran.storage.db_introspection.parse_observation_dates(df, form='m')[source]

Parse the observation date column into a datetime format.

Parameters:

df (pd.DataFrame) – The input DataFrame.

Returns:

The DataFrame with parsed dates.

Return type:

pd.DataFrame

dran.storage.db_introspection.convert_to_numeric(dataframe, exclude_keywords)[source]

Convert columns to numeric, excluding those containing specific keywords.

Parameters:
  • dataframe (pd.DataFrame) – The input DataFrame.

  • exclude_keywords (List[str]) – Keywords to exclude from numeric conversion.

Returns:

The DataFrame with numeric columns.

Return type:

pd.DataFrame

dran.storage.db_introspection.ensure_positive_errors(df)[source]

Ensure all error columns have positive values.

Parameters:

df (pd.DataFrame) – The input DataFrame.

Returns:

The DataFrame with positive error values.

Return type:

pd.DataFrame

dran.storage.db_introspection.make_positive(value)[source]

Ensure the input value is positive and convert it to a float. If the value is invalid or negative, return NaN.

Parameters:

value (Any) – The input value to process.

Returns:

The positive float value or NaN if the value is invalid or negative.

Return type:

float
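The documented behaviour can be sketched as follows. The name `to_positive_float` is hypothetical, and the handling of zero (passed through unchanged) is an assumption, since the description only mentions invalid and negative values.

```python
import math
from typing import Any


def to_positive_float(value: Any) -> float:
    """Hypothetical sketch: float(value) if valid and non-negative, else NaN."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        # Not convertible to float (None, arbitrary text, ...): invalid.
        return math.nan
    if math.isnan(number) or number < 0:
        return math.nan
    # Assumption: zero is neither invalid nor negative, so it is kept.
    return number
```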

dran.storage.db_introspection.get_data_from_db(processed_db_path, DB_PATH, freq, table_name, src, log='')[source]

dran.storage.db_introspection.get_2ghz_data(table_name, cnx)[source]
Parameters:

table_name (str)

dran.storage.db_introspection.record_exists(conn, table, key_field, key_value)[source]

Fast existence check using an indexed lookup (UNIQUE field).

Returns True if a record exists, else False.

Return type:

bool
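A minimal sketch of an existence check of this kind, using only the standard `sqlite3` module. The helper name `row_exists` is hypothetical; the real function may build its query differently.

```python
import sqlite3


def row_exists(conn: sqlite3.Connection, table: str, key_field: str, key_value) -> bool:
    """Hypothetical sketch: True if any row has key_field == key_value."""
    # Table and column names cannot be bound as parameters, so they are
    # interpolated here; only trusted identifiers should be passed in.
    sql = f'SELECT 1 FROM "{table}" WHERE "{key_field}" = ? LIMIT 1'
    # When key_field carries a UNIQUE constraint, SQLite can answer this
    # from the constraint's automatic index instead of scanning the table.
    return conn.execute(sql, (key_value,)).fetchone() is not None
```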

dran.storage.db_introspection.ensure_processed_files_table(conn)[source]

Ensure a small registry table exists for processed files.

This enables fast de-duplication across path changes and symlinks.

Parameters:

conn (Connection)

Return type:

None

dran.storage.db_introspection.processed_file_exists_by_path(conn, filepath)[source]
Return type:

bool

dran.storage.db_introspection.processed_file_hashes_by_size(conn, file_size)[source]
Return type:

List[str]

dran.storage.db_introspection.insert_processed_file(conn, *, file_hash, file_size, file_mtime, filepath, filename)[source]
Return type:

None
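To show how a processed-file registry of this shape might fit together, here is a hedged sketch. The table name `processed_files`, its column types, the index names, and all four helper names are assumptions made for illustration; only the keyword names (`file_hash`, `file_size`, `file_mtime`, `filepath`, `filename`) come from the signatures above.

```python
import sqlite3
from typing import List


def ensure_registry(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) registry table and lookup indexes."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS processed_files (
            file_hash  TEXT NOT NULL UNIQUE,
            file_size  INTEGER NOT NULL,
            file_mtime REAL,
            filepath   TEXT,
            filename   TEXT
        )
        """
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_pf_size ON processed_files (file_size)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_pf_path ON processed_files (filepath)")
    conn.commit()


def register_file(conn, *, file_hash, file_size, file_mtime, filepath, filename) -> None:
    """Record one processed file; a repeated hash is silently ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO processed_files "
        "(file_hash, file_size, file_mtime, filepath, filename) "
        "VALUES (?, ?, ?, ?, ?)",
        (file_hash, file_size, file_mtime, filepath, filename),
    )
    conn.commit()


def path_is_registered(conn, filepath: str) -> bool:
    """True if this exact path has been recorded."""
    row = conn.execute(
        "SELECT 1 FROM processed_files WHERE filepath = ? LIMIT 1", (filepath,)
    ).fetchone()
    return row is not None


def hashes_with_size(conn, file_size: int) -> List[str]:
    """Hashes of all recorded files that have the given size."""
    rows = conn.execute(
        "SELECT file_hash FROM processed_files WHERE file_size = ?", (file_size,)
    )
    return [row[0] for row in rows]
```

One plausible reason for a by-size lookup is that a caller can skip hashing a file entirely when no recorded file has the same size, and compare hashes only when sizes collide. Keying on the content hash rather than the path is what lets a moved or symlinked file be recognised as already processed.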

dran.storage.sqlite_connection module

dran.storage.sqlite_connection.get_connection(db_path, log=None)[source]

Open a SQLite connection with pragmatic defaults for local workloads.

Settings applied:
  • WAL mode for better concurrent reads/writes

  • synchronous NORMAL for balanced durability and speed

  • busy_timeout to reduce “database is locked” failures

Return type:

Connection
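The three settings listed above map onto standard SQLite PRAGMA statements. The sketch below shows one way to apply them with the standard `sqlite3` module; the helper name `open_connection` and the 5000 ms timeout are assumptions, not values taken from dran.

```python
import sqlite3


def open_connection(db_path, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Hypothetical sketch: open a connection and apply the three settings."""
    conn = sqlite3.connect(str(db_path))
    # Write-ahead logging: readers do not block the writer and vice versa.
    conn.execute("PRAGMA journal_mode=WAL")
    # NORMAL syncs less often than FULL, trading some durability for speed.
    conn.execute("PRAGMA synchronous=NORMAL")
    # Wait up to this many milliseconds for a lock before raising an error.
    conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)}")
    return conn
```

WAL mode applies to file-backed databases; an in-memory database keeps its own journal mode.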

dran.storage.sqlite_repository module

dran.storage.sqlite_repository.insert_dict(conn, table, item)[source]

Insert a dict into table and return the inserted row id.

Return type:

int
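A dict-to-row insert can be sketched with a parameterised statement built from the dict's keys. The name `insert_mapping` is hypothetical, and the sketch assumes plain scalar values; it leaves out any array-to-BLOB conversion the real function may perform.

```python
import sqlite3


def insert_mapping(conn: sqlite3.Connection, table: str, item: dict) -> int:
    """Hypothetical sketch: insert item as one row and return its row id."""
    columns = list(item)
    column_sql = ", ".join(f'"{name}"' for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    # Values are bound as parameters; identifiers are interpolated and
    # must therefore come from trusted input.
    cursor = conn.execute(
        f'INSERT INTO "{table}" ({column_sql}) VALUES ({placeholders})',
        [item[name] for name in columns],
    )
    conn.commit()
    return cursor.lastrowid
```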

dran.storage.sqlite_repository.fetch_row(conn, table, row_id)[source]

Fetch a row and reconstruct arrays from BLOBs where possible.

Return type:

dict[str, Any]

dran.storage.sqlite_repository.get_existing_keys(conn, table, key)[source]

Load all existing values of a key into a set for fast membership checks.

Return type:

set[Any]
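The idea of loading keys once and testing membership in memory can be sketched as below. The helper name `load_key_set` is hypothetical.

```python
import sqlite3


def load_key_set(conn: sqlite3.Connection, table: str, key: str) -> set:
    """Hypothetical sketch: all values of one column, as a set."""
    # Identifiers are interpolated, so they must come from trusted input.
    rows = conn.execute(f'SELECT "{key}" FROM "{table}"')
    return {row[0] for row in rows}
```

Compared with issuing one existence query per candidate, this costs a single query up front, after which each `in` check is a set lookup. The trade-off is that the set reflects the table only as of the moment it was loaded.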

dran.storage.sqlite_repository.save_record(conn, table, item, *, create_table_fn=None)[source]

Insert one record and return its row id.

create_table_fn is optional and lets callers ensure schema before insert.

Return type:

int

dran.storage.sqlite_schema module

dran.storage.sqlite_schema.infer_sqlite_type(value)[source]

Infer an SQLite column type from a sample value.

Uses:
  • BLOB for non-scalar NumPy arrays

  • REAL for int/float scalars

  • TEXT for everything else

Parameters:

value (Any)

Return type:

str

dran.storage.sqlite_schema.ensure_table_from_dict(conn, table, sample, unique_field='FILENAME')[source]

Create a table if it does not exist.

Column names are taken from sample keys. Each column type is inferred from sample values.

unique_field is used as a UNIQUE constraint if it exists in sample.

Return type:

None
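The behaviour described above (columns from keys, types from values, an optional UNIQUE column) can be sketched with the standard library alone. Both helper names are hypothetical, and to stay free of NumPy the sketch uses `bytes` values as a stand-in for the array-to-BLOB case; the real type inference lives in `infer_sqlite_type`.

```python
import sqlite3


def simple_sqlite_type(value) -> str:
    """Hypothetical, simplified type inference for the sketch."""
    if isinstance(value, (bytes, bytearray)):
        return "BLOB"  # stand-in for non-scalar NumPy arrays
    if isinstance(value, (int, float)):
        return "REAL"
    return "TEXT"


def create_table_from_sample(conn, table, sample, unique_field="FILENAME") -> None:
    """Hypothetical sketch: create table from a sample dict if it is missing."""
    column_defs = []
    for name, value in sample.items():
        definition = f'"{name}" {simple_sqlite_type(value)}'
        if name == unique_field:
            # Only applied when unique_field is actually a key of sample.
            definition += " UNIQUE"
        column_defs.append(definition)
    columns_sql = ", ".join(column_defs)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns_sql})')
    conn.commit()
```

Because the statement uses `CREATE TABLE IF NOT EXISTS`, calling it again with a different sample leaves an existing table's columns unchanged.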

dran.storage.sqlite_types module

dran.storage.sqlite_types.array_to_blob(arr)[source]

Encode a NumPy array as bytes for SQLite storage.

Uses np.save into an in-memory buffer, preserving dtype and shape.

Parameters:

arr (ndarray)

Return type:

bytes

dran.storage.sqlite_types.blob_to_array(blob)[source]

Decode stored bytes back into a NumPy array.

Parameters:

blob (bytes)

Return type:

ndarray

dran.storage.sqlite_types.normalize_for_schema(value)[source]

Normalize values for schema inference.

SQLite column typing is coarse. This converts NumPy scalars and 0-D arrays into Python scalars so the type checks behave as expected.

Parameters:

value (Any)

Return type:

Any

dran.storage.sqlite_types.normalize_for_storage(value)[source]

Prepare a value for SQLite insertion.

Rules:
  • 0-D NumPy arrays -> Python scalar

  • NumPy scalars -> Python scalar

  • N-D NumPy arrays (shape != ()) -> BLOB

  • Everything else unchanged

Parameters:

value (Any)

Return type:

Any

Module contents