pycarol.functions.misc

pycarol.functions.misc.check_mapping(login, staging_name, connector_name=None, connector_id=None)

Check if a staging is mapped to a datamodel and return the mapping.

Args:

login (pycarol.Carol): Carol instance.
staging_name (str): Staging name.
connector_name (str, optional): Connector name. Defaults to None.
connector_id (str, optional): Connector ID. Defaults to None.

Usage:

from pycarol import Carol
from pycarol.functions import check_mapping
carol = Carol()
check_mapping(carol, staging_name='staging_name', connector_name='connector_name')

Returns:

list: list of mappings or None
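
Because the result is either a list or None, callers usually have to guard against None before iterating. The following is a minimal sketch of that pattern, not part of pycarol: `mapped_stagings`, its `check` parameter, and `fake_check` are hypothetical names introduced here, and `check` stands in for `check_mapping` so the helper can be tried without a Carol connection.

```python
def mapped_stagings(carol, staging_names, connector_name, check=None):
    """Return the staging names that have at least one mapping.

    `check` is a hypothetical injection point; by default it is
    pycarol's check_mapping.
    """
    if check is None:
        # Lazy import: only needed when talking to a real Carol tenant.
        from pycarol.functions.misc import check_mapping as check

    mapped = []
    for name in staging_names:
        # check_mapping is documented to return a list of mappings or None.
        mappings = check(carol, staging_name=name, connector_name=connector_name)
        if mappings:  # skips both None and an empty list
            mapped.append(name)
    return mapped


# Stand-in for check_mapping (invented data, for demonstration only).
def fake_check(carol, staging_name, connector_name=None, connector_id=None):
    return [{"id": "m1"}] if staging_name == "orders" else None


result = mapped_stagings(None, ["orders", "customers"], "erp", check=fake_check)
print(result)  # ['orders']
```

With a real tenant, the same helper would be called as `mapped_stagings(carol, [...], 'connector_name')`, leaving `check` unset.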

pycarol.functions.misc.delele_all_golden_data(carol, dm_name)

Delete golden files from a datamodel in all storages.

Args:

carol (pycarol.Carol): Carol instance.
dm_name (str): Data Model name.

Returns:

list: list of tasks created

pycarol.functions.misc.delete_staging_data(carol, staging_name, connector_name)

Delete a staging.

Args:

carol (pycarol.Carol): Carol instance.
staging_name (str): Staging name.
connector_name (str): Connector name.

pycarol.functions.misc.par_delete_golden(carol, dm_list, n_jobs=5)

Deletes golden files from a list of datamodels in parallel.

Args:

carol (pycarol.Carol): Carol instance.
dm_list (list): List of datamodels.
n_jobs (int, optional): Number of parallel jobs. Defaults to 5.

Returns:

list: list of tasks created
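
Since the function returns the tasks it created, a natural follow-up is to wait for them with `track_tasks`. The sketch below shows one way to chain the two calls. It assumes that the returned tasks can be passed directly as `task_list`, which the documentation does not state explicitly. `delete_golden_and_wait`, the `delete`/`track` parameters, and the `fake_*` stand-ins are hypothetical names added for illustration.

```python
def delete_golden_and_wait(carol, dm_list, n_jobs=5, delete=None, track=None):
    """Delete golden data for several datamodels and wait for the tasks.

    `delete` and `track` are hypothetical injection points that default
    to pycarol's par_delete_golden and track_tasks.
    """
    if delete is None:
        from pycarol.functions.misc import par_delete_golden as delete
    if track is None:
        from pycarol.functions.misc import track_tasks as track

    tasks = delete(carol, dm_list, n_jobs=n_jobs)
    # track_tasks is documented to return a status dict and a failure flag.
    status, any_failed = track(carol=carol, task_list=tasks)
    if any_failed:
        raise RuntimeError(f"Some deletion tasks kept failing: {status}")
    return status


# Stand-ins so the sketch runs without a Carol tenant. The task ids and the
# shape of the status dict are invented; the docs only say "dict".
def fake_delete(carol, dm_list, n_jobs=5):
    return [f"task-{dm}" for dm in dm_list]


def fake_track(carol, task_list, **kwargs):
    return {"COMPLETED": list(task_list)}, False


status = delete_golden_and_wait(None, ["customer", "product"],
                                delete=fake_delete, track=fake_track)
print(status)
```

Raising on failure is a design choice of this sketch; returning the flag to the caller would work equally well.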

pycarol.functions.misc.par_delete_staging(carol, staging_list, connector_name, n_jobs=5)

Deletes staging files from a list of stagings in parallel.

Args:

carol (pycarol.Carol): Carol instance.
staging_list (list): List of stagings.
connector_name (str): Connector name.
n_jobs (int, optional): Number of parallel jobs. Defaults to 5.

Returns:

list: list of tasks created

pycarol.functions.misc.pause_dm_mappings(carol, dm_list, connector_name=None, connector_id=None, do_not_pause_staging_list=None)

Pause mappings from a connector based on a list of datamodels.

Args:

carol (pycarol.Carol): Carol instance.
dm_list (list): List of datamodels to pause.
connector_name (str, optional): Connector name. Defaults to None.
connector_id (str, optional): Connector ID. Defaults to None.
do_not_pause_staging_list (list, optional): List of stagings that should not be paused. Defaults to None.

Usage:

from pycarol import Carol
from pycarol.functions import pause_dm_mappings
carol = Carol()
pause_dm_mappings(carol, connector_name='connector_name', dm_list=['dm1','dm2'])

pycarol.functions.misc.pause_etls(carol, etl_list, connector_name=None, connector_id=None, logger=None)

Pause ETLs from a connector based on a list of ETLs.

Args:

carol (pycarol.Carol): Carol instance.
etl_list (list): ETLs to pause.
connector_name (str, optional): Connector name. Defaults to None.
connector_id (str, optional): Connector ID. Defaults to None.
logger (logging.Logger, optional): Logger. Defaults to None.

Usage:

from pycarol import Carol
from pycarol.functions import pause_etls
carol = Carol()
etl_list = ['staging1', 'staging2', 'staging3']
pause_etls(carol, connector_name='rui', etl_list=etl_list)

Returns:

list: list of paused ETLs

pycarol.functions.misc.resume_process(carol, staging_name, connector_name=None, connector_id=None, logger=None, delay=1)

Resume the staging process (mappings or ETLs).

Args:

carol (pycarol.Carol): Carol instance.
staging_name (str): Staging name.
connector_name (str, optional): Connector name. Defaults to None.
connector_id (str, optional): Connector ID. Defaults to None.
logger (logging.Logger, optional): Logger. Defaults to None.
delay (int, optional): Time to wait after resuming processing. Defaults to 1.

Usage:

from pycarol import Carol
from pycarol.functions import resume_process
carol = Carol()
resume_process(carol, staging_name='staging_name', connector_name='connector_name')

Returns:

list: list of mappings paused.
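
`pause_etls` and `resume_process` are often used as a pair around maintenance work. One possible way to make sure the resume step always runs, even when the work in between raises, is a context manager. This is only a sketch: `etls_paused`, its `pause`/`resume` parameters, and the `fake_*` stand-ins are hypothetical, and it assumes every entry of `etl_list` is a staging name (as in the `pause_etls` usage example) that `resume_process` accepts.

```python
from contextlib import contextmanager


@contextmanager
def etls_paused(carol, etl_list, connector_name, pause=None, resume=None):
    """Pause the given ETLs for the duration of a `with` block.

    `pause` and `resume` are hypothetical injection points that default
    to pycarol's pause_etls and resume_process.
    """
    if pause is None:
        from pycarol.functions.misc import pause_etls as pause
    if resume is None:
        from pycarol.functions.misc import resume_process as resume

    pause(carol, etl_list=etl_list, connector_name=connector_name)
    try:
        yield
    finally:
        # Assumption: each etl_list entry is a staging name.
        for staging_name in etl_list:
            resume(carol, staging_name=staging_name,
                   connector_name=connector_name)


# Stand-ins that record calls, so the sketch runs without a Carol tenant.
calls = []


def fake_pause(carol, etl_list, connector_name=None, **kwargs):
    calls.append(("pause", tuple(etl_list)))
    return list(etl_list)


def fake_resume(carol, staging_name, connector_name=None, **kwargs):
    calls.append(("resume", staging_name))
    return []


with etls_paused(None, ["staging1", "staging2"], "connector_name",
                 pause=fake_pause, resume=fake_resume):
    calls.append(("work",))  # maintenance work would go here
```

With a real tenant, `pause` and `resume` would be left unset so the pycarol functions are used.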

pycarol.functions.misc.track_tasks(carol, task_list, retry_count=3, logger=None, callback=None, polling_delay=5)

Track a list of tasks from Carol, waiting for errors or completion.

Args:

carol (pycarol.Carol): Carol instance.
task_list (list): List of tasks in Carol.
retry_count (int, optional): Number of times to restart a failed task. Defaults to 3.
logger (logging.Logger, optional): Logger used to log information. Defaults to None.
callback (callable, optional): Function called every time the task statuses are fetched from Carol. A dictionary with the task statuses is passed to it. Defaults to None.
polling_delay (int, optional): Time in seconds between polls of the task status from Carol. Defaults to 5.

Usage:

from pycarol import Carol
from pycarol.functions import track_tasks
carol = Carol()
def callback(task_list):
    print(task_list)
track_tasks(carol=carol, task_list=['task_id_1', 'task_id_2'], callback=callback)

Returns:

[dict, bool]: dict with the status of each task, and a boolean indicating whether any task failed more than retry_count times.
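
Because the callback runs on every poll, a callback that simply prints will repeat the same output until something changes. The sketch below records the status dictionary only when it differs from the previous poll. `make_change_recorder` is a hypothetical helper, and the status dictionaries in the demonstration are invented; the helper itself makes no assumption about the dictionary's keys or values.

```python
import copy


def make_change_recorder(sink):
    """Return a callback that appends the status dict to `sink`
    only when it differs from the one seen on the previous call."""
    last = None

    def callback(task_status):
        nonlocal last
        # Copy so later mutations by the caller do not alter the history.
        snapshot = copy.deepcopy(dict(task_status))
        if snapshot != last:
            sink.append(snapshot)
            last = snapshot

    return callback


# Demonstration with invented status dicts (three polls, two distinct states).
history = []
on_poll = make_change_recorder(history)
on_poll({"RUNNING": ["task_id_1"]})
on_poll({"RUNNING": ["task_id_1"]})
on_poll({"COMPLETED": ["task_id_1"]})
print(len(history))  # 2
```

The returned function can be passed as `callback=on_poll` to `track_tasks`, and `history` inspected once tracking finishes.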