pycarol.functions.misc
-
pycarol.functions.misc.check_mapping(login, staging_name, connector_name=None, connector_id=None)
Check whether a staging is mapped to a datamodel and return the mapping.
Args:
    login (pycarol.Carol): Carol instance
    staging_name (str): staging name
    connector_name (str, optional): connector name. Defaults to None.
    connector_id (str, optional): connector id. Defaults to None.
Usage:
    from pycarol import Carol
    from pycarol.functions import check_mapping
    carol = Carol()
    check_mapping(carol, staging_name='staging_name', connector_name='connector_name')
Returns:
list: list of mappings or None
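Because the documented return value is either a list or None, callers usually need to handle both cases. The following is a minimal sketch of that handling; `summarize_mappings` is a hypothetical helper, not part of pycarol, and it assumes nothing about what each mapping entry contains.

```python
from typing import Optional


def summarize_mappings(mappings: Optional[list]) -> str:
    """Turn the result of check_mapping into a short message.

    Hypothetical helper: only the documented contract is assumed,
    namely a list of mappings or None.
    """
    if not mappings:  # covers both None and an empty list
        return "staging is not mapped"
    return f"staging has {len(mappings)} mapping(s)"


# With pycarol installed, the argument would come from:
#   mappings = check_mapping(carol, staging_name='staging_name',
#                            connector_name='connector_name')
print(summarize_mappings(None))
print(summarize_mappings([{"placeholder": "mapping"}]))  # made-up entry
```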
-
pycarol.functions.misc.delele_all_golden_data(carol, dm_name)
Delete golden files from a datamodel in all storages.
- Args:
- carol (pycarol.Carol): Carol instance
- dm_name (str): Data Model name
- Returns:
- list: list of tasks created
-
pycarol.functions.misc.delete_staging_data(carol, staging_name, connector_name)
Delete a staging.
- Args:
- carol (pycarol.Carol): Login instance
- staging_name (str): Staging name
- connector_name (str): Connector name
-
pycarol.functions.misc.par_delete_golden(carol, dm_list, n_jobs=5)
Delete golden files from a list of datamodels in parallel.
- Args:
- carol (pycarol.Carol): Carol instance
- dm_list (list): List of datamodels
- n_jobs (int, optional): Number of parallel jobs. Defaults to 5.
- Returns:
- list: list of tasks created
-
pycarol.functions.misc.par_delete_staging(carol, staging_list, connector_name, n_jobs=5)
Delete staging files from a list of stagings in parallel.
- Args:
- carol (pycarol.Carol): Login instance
- staging_list (list): List of stagings
- connector_name (str): Connector name
- n_jobs (int, optional): Number of parallel jobs. Defaults to 5.
- Returns:
- list: list of tasks created
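To illustrate what an `n_jobs` limit means in practice, here is a generic sketch of fanning a per-item call out over a bounded pool of workers. It is not pycarol's implementation: `fake_delete` and `run_in_parallel` are hypothetical stand-ins, and the shape of the returned task records is made up.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_delete(name: str) -> dict:
    """Hypothetical stand-in for one delete call; returns a fake task record."""
    return {"task_for": name}


def run_in_parallel(names: list, n_jobs: int = 5) -> list:
    """Apply fake_delete to every name using at most n_jobs worker threads."""
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(fake_delete, names))


tasks = run_in_parallel(["staging1", "staging2", "staging3"], n_jobs=2)
print(tasks)
```

With `n_jobs=2`, at most two of the three calls run at the same time, while the result list still follows the input order.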
-
pycarol.functions.misc.pause_dm_mappings(carol, dm_list, connector_name=None, connector_id=None, do_not_pause_staging_list=None)
Pause mappings from a connector based on a list of datamodels.
Args:
    carol (pycarol.Carol): Carol instance
    dm_list (list): list of datamodels to pause
    connector_name (str): connector name
    connector_id (str): connector id
    do_not_pause_staging_list (list, optional): List of stagings that should not be paused. Defaults to None.
Usage:
    from pycarol import Carol
    from pycarol.functions import pause_dm_mappings
    carol = Carol()
    pause_dm_mappings(carol, connector_name='connector_name', dm_list=['dm1','dm2'])
-
pycarol.functions.misc.pause_etls(carol, etl_list, connector_name=None, connector_id=None, logger=None)
Pause ETLs from a connector based on a list of ETLs.
Args:
    carol (pycarol.Carol): Carol instance
    etl_list (list): ETLs to pause
    connector_name (str, optional): Connector name. Defaults to None.
    connector_id (str, optional): Connector ID. Defaults to None.
    logger (logging.logger, optional): Logger. Defaults to None.
Usage:
    from pycarol import Carol
    from pycarol.functions import pause_etls
    carol = Carol()
    etl_list = ['staging1', 'staging2', 'staging3']
    pause_etls(carol, connector_name='rui', etl_list=etl_list)
Returns:
list: list of paused ETLs
-
pycarol.functions.misc.resume_process(carol, staging_name, connector_name=None, connector_id=None, logger=None, delay=1)
Resume a staging process (mappings or ETLs).
Args:
    carol (pycarol.Carol): Carol instance
    staging_name (str): staging name
    connector_name (str): connector name
    connector_id (str): connector id
    logger (logging.logger, optional): logger. Defaults to None.
    delay (int, optional): time to wait after resuming processing. Defaults to 1.
Usage:
    from pycarol import Carol
    from pycarol.functions import resume_process
    carol = Carol()
    resume_process(carol, staging_name='staging_name', connector_name='connector_name')
Returns:
list: list of mappings paused.
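A common pattern is to pause processing, do some maintenance, and resume afterwards even if the maintenance fails. The sketch below shows one way to arrange that with `try`/`finally`. `run_with_paused_etls` is a hypothetical helper; the `pause` and `resume` arguments are meant to be `pause_etls` and `resume_process`, passed in so the flow can be tried without a Carol connection. It also assumes that the entries of `etl_list` are staging names, as the `pause_etls` usage example suggests.

```python
def run_with_paused_etls(carol, etl_list, connector_name, work, pause, resume):
    """Pause ETLs, run work(), then resume each staging even if work() raises.

    Hypothetical helper. `pause` and `resume` are expected to be pycarol's
    pause_etls and resume_process, called with the keyword arguments shown
    in their usage examples.
    """
    pause(carol, etl_list=etl_list, connector_name=connector_name)
    try:
        return work()
    finally:
        # Assumption: each entry of etl_list is a staging name.
        for staging_name in etl_list:
            resume(carol, staging_name=staging_name, connector_name=connector_name)


# Stand-ins that only record how they were called.
calls = []


def fake_pause(carol, etl_list, connector_name):
    calls.append(("pause", tuple(etl_list)))


def fake_resume(carol, staging_name, connector_name):
    calls.append(("resume", staging_name))


result = run_with_paused_etls(
    None, ["staging1", "staging2"], "connector_name",
    lambda: "done", fake_pause, fake_resume,
)
print(result, calls)
```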
-
pycarol.functions.misc.track_tasks(carol, task_list, retry_count=3, logger=None, callback=None, polling_delay=5)
Track a list of tasks from Carol, waiting for errors or completion.
Args:
    carol (pycarol.Carol): pycarol.Carol instance
    task_list (list): List of tasks in Carol
    retry_count (int, optional): Number of times to restart a failed task. Defaults to 3.
    logger (logging.logger, optional): logger to log information. Defaults to None.
    callback (callable, optional): Function called every time task statuses are fetched from Carol. A dictionary with the task statuses is passed to it. Defaults to None.
    polling_delay (int, optional): Interval in seconds between task status polls. Defaults to 5.
Usage:
    from pycarol import Carol
    from pycarol.functions import track_tasks
    carol = Carol()
    def callback(task_list):
        print(task_list)
    track_tasks(carol=carol, task_list=['task_id_1', 'task_id_2'], callback=callback)
Returns:
[dict, bool]: dict with the status of each task, and a boolean indicating whether any task failed more than retry_count times.
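Since the callback receives a dictionary on every poll, it can be used to keep a history of snapshots rather than just printing them. The sketch below is one possible callback; `make_recording_callback` is a hypothetical helper, and the dictionary passed in the simulated call is made up because the exact keys and values of the status dictionary are not described here.

```python
import logging

logger = logging.getLogger("task_tracking")


def make_recording_callback(history: list):
    """Build a callback that stores every status dictionary it receives."""
    def callback(task_status: dict) -> None:
        # Shallow copy, so later changes to the dict do not alter the history.
        history.append(dict(task_status))
        logger.info("received status for %d task(s)", len(task_status))
    return callback


history = []
callback = make_recording_callback(history)

# With pycarol installed this would be wired in as:
#   task_status, failed = track_tasks(carol=carol,
#                                     task_list=['task_id_1', 'task_id_2'],
#                                     callback=callback)
# Here a single poll is simulated with a made-up dictionary.
callback({"task_id_1": "placeholder status"})
print(len(history))
```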