dran.storage package

Submodules

dran.storage.db_introspection module

dran.storage.db_introspection.get_table_names(database_path)[source]

Return a sorted list of user-defined table names from the given SQLite database file.

Parameters: database_path (Path) – Path to the SQLite .db file
Returns: List of table names
Return type: List[str]
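
As an illustration of how such a lookup can work, the sketch below reads table names from SQLite's built-in `sqlite_master` catalogue and skips the internal `sqlite_` tables. The function name `list_user_tables` and the demo tables are hypothetical; this is not necessarily how `get_table_names` is implemented.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import List


def list_user_tables(database_path: Path) -> List[str]:
    """Hypothetical sketch: sorted names of user-defined tables in a SQLite file."""
    conn = sqlite3.connect(str(database_path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    # Names starting with "sqlite_" are reserved for SQLite's internal tables.
    return sorted(name for (name,) in rows if not name.startswith("sqlite_"))


# Usage: build a throwaway database with two tables and list them.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.db"
    conn = sqlite3.connect(str(demo_path))
    conn.execute("CREATE TABLE vela (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE hydra_a (id INTEGER PRIMARY KEY)")
    conn.commit()
    conn.close()
    demo_tables = list_user_tables(demo_path)
```

Here `demo_tables` holds the two table names in alphabetical order.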

dran.storage.db_introspection.get_table_from_db(dbPath, tableName)[source]

Parameters:

dbPath (str)
tableName (str)

Return type:

DataFrame

dran.storage.db_introspection.prep_data(dataframe, source_name)[source]

Preprocess the data in the DataFrame for analysis.

Parameters:

dataframe (pd.DataFrame) – The input DataFrame containing observational data.
source_name (str) – The name of the source being processed.

Returns:

The processed DataFrame.

Return type:

pd.DataFrame

dran.storage.db_introspection.parse_time(timeCol)[source]

Parameters: timeCol (str)
Return type: str

dran.storage.db_introspection.parse_observation_dates(df, form='m')[source]

Parse the observation date column into a datetime format.

Parameters: df (pd.DataFrame) – The input DataFrame.
Returns: The DataFrame with parsed dates.
Return type: pd.DataFrame

dran.storage.db_introspection.convert_to_numeric(dataframe, exclude_keywords)[source]

Convert columns to numeric, excluding those containing specific keywords.

Parameters:

dataframe (pd.DataFrame) – The input DataFrame.
exclude_keywords (List[str]) – Keywords to exclude from numeric conversion.

Returns:

The DataFrame with numeric columns.

Return type:

pd.DataFrame

dran.storage.db_introspection.ensure_positive_errors(df)[source]

Ensure all error columns have positive values.

Parameters:

df (pd.DataFrame) – The input DataFrame.

Returns:

The DataFrame with positive error values.

Return type:

pd.DataFrame

dran.storage.db_introspection.make_positive(value)[source]

Ensure the input value is positive and convert it to a float. If the value is invalid or negative, return NaN.

Parameters: value (Any) – The input value to process.
Returns: The positive float value or NaN if the value is invalid or negative.
Return type: float
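
The behaviour described above can be sketched in a few lines. The name `to_positive_float` is hypothetical, and the treatment of zero (kept as-is here) is an assumption, since the description only mentions invalid and negative values.

```python
import math
from typing import Any


def to_positive_float(value: Any) -> float:
    """Hypothetical sketch: float(value) if it is valid and non-negative, else NaN."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        # Covers None, unparsable strings, and other non-numeric objects.
        return math.nan
    if math.isnan(number) or number < 0:
        return math.nan
    # Assumption: zero is kept rather than mapped to NaN.
    return number


parsed = to_positive_float("3.5")
negative = to_positive_float(-1)
invalid = to_positive_float("abc")
```

With these inputs, `parsed` is the float 3.5, while `negative` and `invalid` are both NaN.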

dran.storage.db_introspection.get_data_from_db(processed_db_path, DB_PATH, freq, table_name, src, log='')[source]

Parameters:

DB_PATH (Path)
freq (int)
table_name (str)
src (str)

dran.storage.db_introspection.get_2ghz_data(table_name, cnx)[source]

Parameters: table_name (str)

dran.storage.db_introspection.record_exists(conn, table, key_field, key_value)[source]

Fast existence check using an indexed lookup (UNIQUE field).

Returns True if a record exists, else False.

Parameters:

conn (Connection)
table (str)
key_field (str)
key_value (Any)

Return type:

bool
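
A minimal sketch of this kind of existence check, assuming the key column carries a UNIQUE constraint (SQLite then maintains an index on it automatically). The name `exists_by_key` and the `scans` table are hypothetical.

```python
import sqlite3
from typing import Any


def exists_by_key(
    conn: sqlite3.Connection, table: str, key_field: str, key_value: Any
) -> bool:
    """Hypothetical sketch: True if a row with key_field == key_value exists."""
    # Identifiers cannot be bound as parameters, so they are quoted into the
    # statement; only pass trusted table and column names. The value is bound.
    sql = f'SELECT 1 FROM "{table}" WHERE "{key_field}" = ? LIMIT 1'
    return conn.execute(sql, (key_value,)).fetchone() is not None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (FILENAME TEXT UNIQUE, flux REAL)")
conn.execute("INSERT INTO scans VALUES (?, ?)", ("scan_001.fits", 1.2))
found = exists_by_key(conn, "scans", "FILENAME", "scan_001.fits")
missing = exists_by_key(conn, "scans", "FILENAME", "scan_999.fits")
conn.close()
```

`SELECT 1 ... LIMIT 1` lets SQLite stop at the first match instead of materialising the row.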

dran.storage.db_introspection.ensure_processed_files_table(conn)[source]

Ensure a small registry table exists for processed files.

This enables fast de-duplication across path changes and symlinks.

Parameters: conn (Connection)
Return type: None

dran.storage.db_introspection.processed_file_exists_by_path(conn, filepath)[source]

Parameters:

conn (Connection)
filepath (str)

Return type:

bool

dran.storage.db_introspection.processed_file_hashes_by_size(conn, file_size)[source]

Parameters:

conn (Connection)
file_size (int)

Return type:

List[str]

dran.storage.db_introspection.insert_processed_file(conn, *, file_hash, file_size, file_mtime, filepath, filename)[source]

Parameters:

conn (Connection)
file_hash (str)
file_size (int)
file_mtime (float)
filepath (str)
filename (str)

Return type:

None
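
The four registry functions above could fit together roughly as follows. This is a sketch under stated assumptions: the table name `processed_files`, its column layout, the size index, and all function names here are hypothetical and only mirror the documented keyword arguments.

```python
import sqlite3
from typing import List


def ensure_registry(conn: sqlite3.Connection) -> None:
    """Create the (assumed) registry table and a size index if missing."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS processed_files (
            file_hash  TEXT PRIMARY KEY,
            file_size  INTEGER NOT NULL,
            file_mtime REAL NOT NULL,
            filepath   TEXT NOT NULL,
            filename   TEXT NOT NULL
        )
        """
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_processed_files_size "
        "ON processed_files (file_size)"
    )
    conn.commit()


def register_file(conn, *, file_hash, file_size, file_mtime, filepath, filename):
    """Record one processed file; a repeated hash is silently ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO processed_files VALUES (?, ?, ?, ?, ?)",
        (file_hash, file_size, file_mtime, filepath, filename),
    )
    conn.commit()


def path_registered(conn, filepath: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM processed_files WHERE filepath = ? LIMIT 1", (filepath,)
    ).fetchone()
    return row is not None


def hashes_for_size(conn, file_size: int) -> List[str]:
    rows = conn.execute(
        "SELECT file_hash FROM processed_files WHERE file_size = ?", (file_size,)
    ).fetchall()
    return [file_hash for (file_hash,) in rows]


conn = sqlite3.connect(":memory:")
ensure_registry(conn)
register_file(
    conn,
    file_hash="abc123",
    file_size=2048,
    file_mtime=1700000000.0,
    filepath="/data/scan_001.fits",
    filename="scan_001.fits",
)
same_path = path_registered(conn, "/data/scan_001.fits")
moved_path = path_registered(conn, "/archive/scan_001.fits")
candidates = hashes_for_size(conn, 2048)
no_candidates = hashes_for_size(conn, 1)
conn.close()
```

One plausible use of the size lookup: a moved or symlinked file fails the path check, but it only needs to be hashed when `hashes_for_size` returns candidates of the same size.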

dran.storage.sqlite_connection module

dran.storage.sqlite_connection.get_connection(db_path, log=None)[source]

Open a SQLite connection with pragmatic defaults for local workloads.

Settings applied:

- WAL mode for better concurrent reads/writes
- synchronous NORMAL for balanced durability and speed
- busy_timeout to reduce “database is locked” failures

Parameters:

db_path (Path)
log (Logger | None)

Return type:

Connection
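
The three settings listed above correspond to standard SQLite PRAGMA statements. The sketch below applies them with the standard-library `sqlite3` module; the function name `open_connection` and the 5000 ms timeout are assumptions, not values taken from `get_connection`.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_connection(db_path: Path, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Hypothetical sketch: open a connection with WAL, NORMAL sync, busy timeout."""
    conn = sqlite3.connect(str(db_path))
    # WAL lets readers proceed while a writer is active.
    conn.execute("PRAGMA journal_mode = WAL")
    # NORMAL syncs less often than FULL, trading a little durability for speed.
    conn.execute("PRAGMA synchronous = NORMAL")
    # Wait up to busy_timeout_ms for a lock instead of failing immediately.
    conn.execute(f"PRAGMA busy_timeout = {int(busy_timeout_ms)}")
    return conn


# Usage: WAL requires a file-backed database, so use a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    conn = open_connection(Path(tmp) / "demo.db")
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
    busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
    conn.close()
```

Reading the pragmas back is a cheap way to confirm the settings took effect; SQLite reports `synchronous = NORMAL` as the integer 1.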

dran.storage.sqlite_repository module

dran.storage.sqlite_repository.insert_dict(conn, table, item)[source]

Insert a dict into table and return inserted row id.

Parameters:

conn (Connection)
table (str)
item (Mapping[str, Any])

Return type:

int

dran.storage.sqlite_repository.fetch_row(conn, table, row_id)[source]

Fetch a row and reconstruct arrays from BLOBs where possible.

Parameters:

conn (Connection)
table (str)
row_id (int)

Return type:

dict[str, Any]

dran.storage.sqlite_repository.get_existing_keys(conn, table, key)[source]

Load all existing values of a key into a set for fast membership checks.

Parameters:

conn (Connection)
table (str)
key (str)

Return type:

set[Any]
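
A short sketch of the pattern described above: load one column into a Python set once, then filter incoming items in memory rather than issuing a query per item. The name `load_keys` and the sample data are hypothetical.

```python
import sqlite3
from typing import Any, Set


def load_keys(conn: sqlite3.Connection, table: str, key: str) -> Set[Any]:
    """Hypothetical sketch: all values of one column as a set."""
    # Identifiers are quoted into the SQL; only pass trusted names.
    rows = conn.execute(f'SELECT "{key}" FROM "{table}"')
    return {value for (value,) in rows}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (FILENAME TEXT UNIQUE)")
conn.executemany("INSERT INTO scans VALUES (?)", [("a.fits",), ("b.fits",)])
existing = load_keys(conn, "scans", "FILENAME")
conn.close()

# Set membership is O(1) on average, so filtering a large batch stays cheap.
incoming = ["a.fits", "b.fits", "c.fits"]
new_files = [name for name in incoming if name not in existing]
```

This trades memory for speed, so it suits batches where most lookups would otherwise hit the database.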

dran.storage.sqlite_repository.save_record(conn, table, item, *, create_table_fn=None)[source]

Insert one record. Returns row id.

create_table_fn is optional and lets callers ensure schema before insert.

Parameters:

conn (Connection)
table (str)
item (Mapping[str, Any])
create_table_fn (callable | None)

Return type:

int
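
To illustrate how an optional schema callback can precede an insert, here is a minimal sketch. The names `insert_mapping`, `save`, and `make_table` are hypothetical, and the callback signature `(conn, table, item)` is an assumption; the documented `create_table_fn` may be called differently.

```python
import sqlite3
from typing import Any, Callable, Mapping, Optional


def insert_mapping(conn: sqlite3.Connection, table: str, item: Mapping[str, Any]) -> int:
    """Insert one mapping as a row and return its row id."""
    columns = ", ".join(f'"{name}"' for name in item)
    placeholders = ", ".join("?" for _ in item)
    cur = conn.execute(
        f'INSERT INTO "{table}" ({columns}) VALUES ({placeholders})',
        tuple(item.values()),
    )
    conn.commit()
    return cur.lastrowid


def save(
    conn: sqlite3.Connection,
    table: str,
    item: Mapping[str, Any],
    *,
    create_table_fn: Optional[Callable[..., None]] = None,
) -> int:
    # Assumed callback signature: create_table_fn(conn, table, item).
    if create_table_fn is not None:
        create_table_fn(conn, table, item)
    return insert_mapping(conn, table, item)


def make_table(conn, table, item):
    """Example callback: create a fixed schema if it does not exist yet."""
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" '
        "(id INTEGER PRIMARY KEY, FILENAME TEXT UNIQUE, flux REAL)"
    )


conn = sqlite3.connect(":memory:")
first_id = save(conn, "scans", {"FILENAME": "a.fits", "flux": 1.0}, create_table_fn=make_table)
second_id = save(conn, "scans", {"FILENAME": "b.fits", "flux": 2.0})
conn.close()
```

The second call omits the callback because the table already exists after the first call.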

dran.storage.sqlite_schema module

dran.storage.sqlite_schema.infer_sqlite_type(value)[source]

Infer an SQLite column type from a sample value.

Uses:

- BLOB for non-scalar NumPy arrays
- REAL for int/float scalars
- TEXT for everything else

Parameters: value (Any)
Return type: str
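
The three rules above can be sketched as follows. The name `guess_sqlite_type` is hypothetical, and unwrapping NumPy scalars before the check is an assumption about how such inputs are handled. Note that in this sketch Python `bool` counts as an int and therefore maps to REAL.

```python
from typing import Any

import numpy as np


def guess_sqlite_type(value: Any) -> str:
    """Hypothetical sketch of the BLOB / REAL / TEXT rules."""
    if isinstance(value, np.ndarray):
        if value.ndim > 0:
            return "BLOB"
        value = value.item()  # 0-D array -> Python scalar
    elif isinstance(value, np.generic):
        value = value.item()  # NumPy scalar -> Python scalar
    if isinstance(value, (int, float)):
        return "REAL"
    return "TEXT"


array_type = guess_sqlite_type(np.zeros(3))
int_type = guess_sqlite_type(np.int64(7))
float_type = guess_sqlite_type(1.5)
text_type = guess_sqlite_type("3C 273")
```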

dran.storage.sqlite_schema.ensure_table_from_dict(conn, table, sample, unique_field='FILENAME')[source]

Create a table if it does not exist.

Column names are taken from sample keys. Each column type is inferred from sample values.

unique_field is used as a UNIQUE constraint if it exists in sample.

Parameters:

conn (Connection)
table (str)
sample (Mapping[str, Any])
unique_field (str)

Return type:

None
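
A simplified sketch of deriving a `CREATE TABLE` statement from a sample mapping. The name `create_table_from_sample` is hypothetical, and the type inference is deliberately reduced to REAL versus TEXT to keep the example free of NumPy.

```python
import sqlite3
from typing import Any, Mapping


def create_table_from_sample(
    conn: sqlite3.Connection,
    table: str,
    sample: Mapping[str, Any],
    unique_field: str = "FILENAME",
) -> None:
    """Hypothetical sketch: one column per sample key, type from the sample value."""
    column_defs = []
    for name, value in sample.items():
        # Simplified inference for this sketch: numbers -> REAL, the rest -> TEXT.
        col_type = "REAL" if isinstance(value, (int, float)) else "TEXT"
        # The UNIQUE constraint is only added if unique_field is a sample key.
        suffix = " UNIQUE" if name == unique_field else ""
        column_defs.append(f'"{name}" {col_type}{suffix}')
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({", ".join(column_defs)})')
    conn.commit()


conn = sqlite3.connect(":memory:")
create_table_from_sample(conn, "scans", {"FILENAME": "a.fits", "flux": 1.2})
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(scans)")]

conn.execute("INSERT INTO scans VALUES (?, ?)", ("a.fits", 1.2))
try:
    conn.execute("INSERT INTO scans VALUES (?, ?)", ("a.fits", 9.9))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
conn.close()
```

The second insert reuses the same FILENAME, so the UNIQUE constraint raises `sqlite3.IntegrityError`.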

dran.storage.sqlite_types module

dran.storage.sqlite_types.array_to_blob(arr)[source]

Encode a NumPy array as bytes for SQLite storage.

Uses np.save into an in-memory buffer, preserving dtype and shape.

Parameters: arr (ndarray)
Return type: bytes

dran.storage.sqlite_types.blob_to_array(blob)[source]

Decode stored bytes back into a NumPy array.

Parameters: blob (bytes)
Return type: ndarray
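
The encode/decode pair can be sketched with `np.save` and `np.load` over an in-memory buffer, as the description of `array_to_blob` suggests. The names `encode_array` and `decode_array` are hypothetical, and passing `allow_pickle=False` is an assumption made here for safety, not something the source states.

```python
import io

import numpy as np


def encode_array(arr: np.ndarray) -> bytes:
    """Hypothetical sketch: serialise an array in .npy format to bytes."""
    buffer = io.BytesIO()
    # The .npy header records dtype and shape alongside the raw data.
    np.save(buffer, arr, allow_pickle=False)
    return buffer.getvalue()


def decode_array(blob: bytes) -> np.ndarray:
    """Hypothetical sketch: rebuild the array from .npy bytes."""
    return np.load(io.BytesIO(blob), allow_pickle=False)


original = np.arange(6, dtype=np.float32).reshape(2, 3)
blob = encode_array(original)
restored = decode_array(blob)
```

After the round trip, `restored` has the same dtype, shape, and values as `original`.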

dran.storage.sqlite_types.normalize_for_schema(value)[source]

Normalize values for schema inference.

SQLite column typing is coarse. This converts NumPy scalars and 0-D arrays into Python scalars so the type checks behave as expected.

Parameters: value (Any)
Return type: Any

dran.storage.sqlite_types.normalize_for_storage(value)[source]

Prepare a value for SQLite insertion.

Rules:

- 0-D NumPy arrays -> Python scalar
- NumPy scalars -> Python scalar
- N-D NumPy arrays (shape != ()) -> BLOB
- Everything else unchanged

Parameters: value (Any)
Return type: Any
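
The four rules above can be sketched as a single dispatch function. The name `to_storable` is hypothetical, and encoding N-D arrays with `np.save` is an assumption based on the description of `array_to_blob`.

```python
import io
from typing import Any

import numpy as np


def to_storable(value: Any) -> Any:
    """Hypothetical sketch of the storage normalisation rules."""
    if isinstance(value, np.ndarray):
        if value.shape == ():
            return value.item()  # 0-D array -> Python scalar
        buffer = io.BytesIO()
        np.save(buffer, value, allow_pickle=False)
        return buffer.getvalue()  # N-D array -> bytes for a BLOB column
    if isinstance(value, np.generic):
        return value.item()  # NumPy scalar -> Python scalar
    return value  # everything else unchanged


zero_d = to_storable(np.array(3.0))
np_scalar = to_storable(np.int64(4))
n_d = to_storable(np.arange(3))
plain = to_storable("3C 273")
```

Converting to plain Python types first matters because `sqlite3` does not bind NumPy scalar types such as `np.int64` out of the box.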
