Source Definition
Overview
- The available entities/tables are fetched automatically from the selected connection to your data source.
- Choose the source object and configure which fields are included.
- Apply filters and paging to limit the dataset processed per run and to segment large datasets.
Parameters
- Fields: select columns/attributes used by duplicate rules and output.
- Filters: add conditions to narrow candidates (date ranges, status, partitions).
- Paging: define batch size to control memory and runtime; large sources benefit from paged processing.
- Primary key: specify the field that uniquely identifies each record so the job can reference records unambiguously.
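How these parameters interact can be sketched in a few lines of Python. This is only an illustration under assumptions, not the product's implementation: the function name `iter_pages`, its arguments, and the sample records are hypothetical. It applies a filter, keeps the selected fields plus the primary key, and yields the result in fixed-size batches.

```python
from itertools import islice
from typing import Any, Callable, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def iter_pages(
    records: Iterable[Record],
    fields: List[str],
    predicate: Callable[[Record], bool],
    batch_size: int,
    primary_key: str = "id",  # hypothetical default; use your source's key
) -> Iterator[List[Record]]:
    """Yield filtered, projected records in pages of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    # Always keep the primary key so every output row stays identifiable.
    keep = [primary_key] + [f for f in fields if f != primary_key]

    # Filter first, then project, lazily, so only one page is held in memory.
    filtered = (
        {name: rec.get(name) for name in keep}
        for rec in records
        if predicate(rec)
    )

    while True:
        page = list(islice(filtered, batch_size))
        if not page:
            return
        yield page


# Hypothetical sample data standing in for a source entity.
rows = [
    {"id": 1, "email": "a@example.com", "status": "active"},
    {"id": 2, "email": "b@example.com", "status": "inactive"},
    {"id": 3, "email": "c@example.com", "status": "active"},
    {"id": 4, "email": "d@example.com", "status": "active"},
]

pages = list(iter_pages(rows, ["email"], lambda r: r["status"] == "active", 2))
```

With a batch size of 2, the three active records arrive as one full page followed by one partial page; a smaller batch size lowers peak memory at the cost of more round trips.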
Notes
- The Source Definition is read directly from the connection configuration; if expected entities are missing, update the credentials or permissions on the connection.
- For very large sources, consider creating focused jobs with tight filters to optimize throughput.
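One way to plan focused jobs is to split a date range into consecutive windows and use each window as the filter for a separate job. The sketch below assumes the source has a date field to filter on; the helper name `date_windows` and the half-open window convention (start inclusive, end exclusive) are illustrative choices, not part of the product.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def date_windows(start: date, end: date, days: int) -> Iterator[Tuple[date, date]]:
    """Yield half-open [lower, upper) windows covering start..end."""
    if days <= 0:
        raise ValueError("days must be positive")

    step = timedelta(days=days)
    cursor = start
    while cursor < end:
        # Clamp the last window so it never runs past the overall end date.
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper


# Each tuple could serve as the date filter for one focused job.
windows = list(date_windows(date(2024, 1, 1), date(2024, 1, 10), 4))
```

Half-open windows avoid processing a boundary date twice, since each window's end is the next window's start.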