Testing a Job

Before enabling production runs, use the Test tab to validate your job configuration against a small, controlled dataset. Testing does not trigger any actions — no records are merged and no webhooks are fired.

What Testing Does

A test run executes the full Find Duplicates and Process Duplicates pipeline but stops before the Action Duplicates stage. You can inspect:

  • Which record pairs were identified as duplicates
  • The probability score for each pair
  • The fields that contributed to the match
  • The generated merge output JSON for each pair

This lets you confirm that your matching rules and merge strategies produce the expected results before any real data is affected.
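The relationship between field ranks, probability scores, and contributing fields can be pictured with a toy model. The field names, ranks, and scoring formula below are purely illustrative, not the product's actual matching algorithm:

```python
# Toy illustration of rank-weighted duplicate scoring (NOT the product's
# actual algorithm): each exactly-matching field contributes its rank, and
# the probability is the matched weight divided by the total weight.

FIELD_RANKS = {"email": 5, "last_name": 3, "phone": 3, "city": 1}  # hypothetical ranks

def match_probability(a: dict, b: dict) -> tuple[float, list[str]]:
    """Return a probability-like score and the fields that contributed."""
    matched = [f for f in FIELD_RANKS if a.get(f) and a.get(f) == b.get(f)]
    score = sum(FIELD_RANKS[f] for f in matched) / sum(FIELD_RANKS.values())
    return score, matched

rec_a = {"email": "ann@example.com", "last_name": "Lee", "city": "Oslo"}
rec_b = {"email": "ann@example.com", "last_name": "Lee", "city": "Bergen"}
prob, fields = match_probability(rec_a, rec_b)
print(round(prob, 2), fields)  # -> 0.67 ['email', 'last_name']
```

This is also why rank tuning works in both directions: raising a distinctive field's rank pushes true pairs above the threshold, while lowering a common field's rank stops it from inflating scores for unrelated records.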

How to Run a Test

  1. Open a job and navigate to the Test tab
  2. Optionally add or tighten Source Definition filters to limit the test to a small, representative subset (e.g., 100–500 records). Be careful that the filter does not split potential duplicates apart: if only one record of a pair makes it into the subset, that duplicate cannot be found
  3. Click Run Test
  4. Review the results — inspect each duplicate candidate and its merge output

What to Look For

False positives — pairs flagged as duplicates that clearly are not the same record. If these appear:

  • Increase the Strictness setting
  • Reduce the rank of fields that are causing over-matching (e.g., a common city or status field with too high a rank)

False negatives — known duplicates that were not detected. If these appear:

  • Decrease the Strictness setting
  • Add or increase the rank of fields that are more distinctive for this dataset
  • Consider adding a fuzzy or phonetic algorithm for fields with common spelling variants
  • Add cleaning rules; the Regex Cleaner in particular can help normalize values before matching
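As a rough illustration of why fuzzy matching and cleaning catch pairs that exact comparison misses (the `difflib` ratio, the regex, and the 0.8 threshold are generic examples, not the product's built-in algorithms):

```python
# Illustrative only: a similarity ratio plus a regex cleaning step can catch
# variants that exact equality misses. The 0.8 threshold is an assumption.
import re
from difflib import SequenceMatcher

def clean(value: str) -> str:
    """Lowercase, strip punctuation, and trim before comparing."""
    return re.sub(r"[^a-z0-9 ]", "", value.lower()).strip()

def fuzzy_equal(a: str, b: str, threshold: float = 0.8) -> bool:
    return SequenceMatcher(None, clean(a), clean(b)).ratio() >= threshold

print("Jonathan" == "Jonathon")            # exact comparison: False
print(fuzzy_equal("Jonathan", "Jonathon"))  # fuzzy comparison: True
```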

Merge output issues — the generated JSON does not reflect the intended merge. If this occurs:

  • Review the base/subordinate selection conditions in Process Duplicates
  • Check field-level merge strategies for the affected fields
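The idea behind field-level merge strategies can be sketched as follows. The strategy names, rules, and fields are hypothetical and do not reflect the product's configuration syntax:

```python
# Sketch of field-level merge strategies applied to a base/subordinate pair.
# Each field gets its own rule for resolving conflicting values.
STRATEGIES = {
    "email": lambda base, sub: base or sub,              # prefer the base record
    "phone": lambda base, sub: sub or base,              # prefer the subordinate
    "notes": lambda base, sub: max(base, sub, key=len),  # keep the longer value
}

def merge(base: dict, sub: dict) -> dict:
    return {f: rule(base.get(f, ""), sub.get(f, "")) for f, rule in STRATEGIES.items()}

base = {"email": "a@x.com", "phone": "", "notes": "VIP"}
sub = {"email": "old@x.com", "phone": "555-0100", "notes": "VIP customer since 2019"}
print(merge(base, sub))
# -> {'email': 'a@x.com', 'phone': '555-0100', 'notes': 'VIP customer since 2019'}
```

When the test's merge output JSON looks wrong, it is usually one of these two levers: the wrong record was chosen as base, or a field's rule resolves the conflict the wrong way.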

Notes

  • Test runs don’t count against your plan’s processing limits the way production runs do, but they are subject to their own tight size limits. Use tight filters to keep test datasets small.
  • Test results are stored in Job Executions and are clearly labelled as test runs.