pycarol.functions.misc

pycarol.functions.misc.delele_all_golden_data(carol, dm_name)[source]

Delete golden files from a datamodel in all storages.

Parameters
  • carol (pycarol.Carol) – Carol instance

  • dm_name (str) – Data Model name

Returns

list of tasks created

Return type

list
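A minimal usage sketch, assuming `Carol()` can be constructed without arguments in your environment. The wrapper `delete_golden_for` and its `delete_fn` hook are hypothetical names introduced for this example; only the call `delele_all_golden_data(carol, dm_name)` is taken from the signature above.

```python
def delete_golden_for(dm_name, carol=None, delete_fn=None):
    """Delete the golden files of one data model and return the created tasks.

    `delete_golden_for` and `delete_fn` are hypothetical helpers for this
    example; only `delele_all_golden_data(carol, dm_name)` is documented.
    """
    if delete_fn is None:
        # Imported lazily so the helper can be exercised without pycarol installed.
        from pycarol.functions.misc import delele_all_golden_data as delete_fn
    if carol is None:
        from pycarol import Carol
        carol = Carol()  # assumption: credentials are available in the environment

    tasks = delete_fn(carol, dm_name)  # documented to return a list of tasks
    print(f"{len(tasks)} task(s) created for data model {dm_name!r}")
    return tasks
```

Calling `delete_golden_for("customer")` would then delete the golden files of a data model named `customer` (a placeholder name) and return the task list.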

pycarol.functions.misc.delete_staging_data(carol, staging_name, connector_name)[source]

Delete a staging.

Parameters
  • carol (pycarol.Carol) – Login instance

  • staging_name (str) – Staging name

  • connector_name (str) – Connector name

pycarol.functions.misc.par_delete_golden(carol, dm_list, n_jobs=5)[source]

Deletes golden files from a list of datamodels in parallel.

Parameters
  • carol (pycarol.Carol) – Carol instance

  • dm_list (list) – List of datamodels

  • n_jobs (int, optional) – Number of parallel jobs. Defaults to 5.

Returns

list of tasks created

Return type

list
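To make the role of `n_jobs` concrete, the fan-out pattern can be sketched with the standard library. This is an illustration only and not pycarol's implementation: `delete_in_parallel` and `delete_fn` are hypothetical names, and the sketch assumes each per-data-model call returns a list of tasks.

```python
from concurrent.futures import ThreadPoolExecutor


def delete_in_parallel(carol, dm_list, delete_fn, n_jobs=5):
    """Run delete_fn(carol, dm_name) for every data model, at most n_jobs at a time.

    Illustration only, not pycarol's implementation. `delete_fn` is assumed to
    return a list of tasks for one data model.
    """
    tasks = []
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # map() yields results in the same order as dm_list.
        for created in pool.map(lambda dm_name: delete_fn(carol, dm_name), dm_list):
            tasks.extend(created)  # flatten the per-model task lists into one list
    return tasks
```

Threads suit this kind of work because each call mostly waits on the network; a larger `n_jobs` issues more deletion requests concurrently.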

pycarol.functions.misc.par_delete_staging(carol, staging_list, connector_name, n_jobs=5)[source]

Deletes staging files from a list of stagings in parallel.

Parameters
  • carol (pycarol.Carol) – Login instance

  • staging_list (list) – List of staging names

  • connector_name (str) – Connector name

  • n_jobs (int, optional) – Number of parallel jobs. Defaults to 5.

Returns

list of tasks created

Return type

list
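A sketch of how parallel deletion might be combined with `track_tasks` (documented in this module) to wait for the result. It assumes that the tasks returned by `par_delete_staging` can be passed unchanged as `task_list`, which the documentation does not state explicitly. The wrapper name and the `delete_fn` / `track_fn` hooks are hypothetical.

```python
def delete_stagings_and_wait(carol, staging_list, connector_name,
                             delete_fn=None, track_fn=None, n_jobs=5):
    """Delete several stagings, then block until the created tasks finish.

    Hypothetical helper: `delete_fn` and `track_fn` default to pycarol's
    `par_delete_staging` and `track_tasks` and exist only to ease testing.
    """
    if delete_fn is None or track_fn is None:
        # Imported lazily so the helper can be exercised without pycarol installed.
        from pycarol.functions import misc
        delete_fn = delete_fn or misc.par_delete_staging
        track_fn = track_fn or misc.track_tasks

    tasks = delete_fn(carol, staging_list, connector_name, n_jobs=n_jobs)
    # Assumption: the returned tasks are accepted as-is by track_tasks.
    status, any_failed = track_fn(carol=carol, task_list=tasks)
    if any_failed:
        raise RuntimeError(f"some deletion tasks failed: {status}")
    return status
```

Raising on failure is a design choice of this sketch; a caller that prefers to inspect the status dictionary could return both values instead.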

pycarol.functions.misc.track_tasks(carol, task_list, retry_count=3, logger=None, callback=None, polling_delay=5)[source]

Track a list of tasks from Carol, waiting for errors or completion.

Parameters
  • carol (pycarol.Carol) – pycarol.Carol instance

  • task_list (list) – List of tasks in Carol

  • retry_count (int, optional) – Number of times to restart a failed task. Defaults to 3.

  • logger (logging.Logger, optional) – Logger used to log information. Defaults to None.

  • callback (callable, optional) – Function called every time task statuses are fetched from Carol. A dictionary with the task statuses is passed to the function. Defaults to None.

  • polling_delay (int, optional) – Time in seconds between polls of task status from Carol. Defaults to 5.

Usage:

from pycarol import Carol
from pycarol.functions import track_tasks

carol = Carol()

# Called on every poll with a dictionary of task statuses.
def callback(task_list):
    print(task_list)

track_tasks(carol=carol, task_list=['task_id_1', 'task_id_2'], callback=callback)

Returns

dict with the status of each task, and a boolean indicating whether any task failed more than retry_count times.

Return type

[dict, bool]
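As a sketch of a more useful `callback`, the factory below records every status dictionary it receives and logs a one-line summary. The layout of the status dictionary is not specified here, so the callback treats it as opaque. `make_status_recorder` and the logger name `carol.tasks` are hypothetical.

```python
import logging


def make_status_recorder(logger=None):
    """Return (callback, history); callback appends each status dict to history.

    The status dictionary is treated as opaque: it is only copied and logged.
    """
    logger = logger or logging.getLogger("carol.tasks")  # hypothetical logger name
    history = []

    def callback(task_status):
        # Shallow copy, so later changes to the caller's dict are not recorded.
        history.append(dict(task_status))
        logger.info("poll %d: %d task(s) reported", len(history), len(task_status))

    return callback, history
```

With a real `Carol` instance this could be wired in as `callback, history = make_status_recorder()` followed by `status, any_failed = track_tasks(carol=carol, task_list=['task_id_1', 'task_id_2'], callback=callback)`; the two-element unpacking follows the documented `[dict, bool]` return type.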