Deduplicating Microsoft Dynamics 365: A Practical Guide

January 10, 2026

Microsoft Dynamics 365 ships with a native duplicate detection feature. For organisations whose CRM data has lived in Dynamics for years, fed by multiple integrations, manual entry, and successive migrations, it is usually not enough.

What Dynamics Native Duplicate Detection Does

Dynamics 365’s built-in detection works at the point of data entry. When a user creates or updates a record, the system checks configured duplicate detection rules and warns if a potential duplicate exists. The user can dismiss the warning and save the record anyway.

The save-time warning is a form of prevention, not remediation. It helps keep new duplicates from being created (as long as users don’t dismiss it), but it does nothing about duplicates that already exist.

The native rules are also limited to exact or simple field-based matching. There is no fuzzy matching, no phonetic matching, and no compound rule weighting. If a duplicate crept in through an API import or was created with a slightly different spelling, the native rules won’t find it.
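
To make the gap concrete, here is a minimal sketch of the difference between an exact comparison and a fuzzy one. It uses Python's standard-library difflib purely as an illustrative stand-in; it is not DeDuplica's matching algorithm, and the helper names and the 0.85 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def normalise(value: str) -> str:
    """Lower-case and collapse whitespace so trivial differences don't count."""
    return " ".join(value.lower().split())


def exact_match(a: str, b: str) -> bool:
    """Roughly what an exact-match rule sees: the values are equal or they are not."""
    return normalise(a) == normalise(b)


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0 and 1 (difflib is an illustrative choice)."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two values as likely duplicates when they are similar enough.

    The 0.85 threshold is an assumption for this sketch, not a recommendation.
    """
    return similarity(a, b) >= threshold


# A one-letter spelling difference defeats the exact rule but not the fuzzy one.
print(exact_match("Jon Smith", "John Smith"))
print(fuzzy_match("Jon Smith", "John Smith"))
```

In practice a threshold like this has to be tuned per field: a value that works for personal names may be too loose for company names or email addresses.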

Why Duplicates Accumulate in Dynamics

The sources of duplicates in Dynamics are familiar:

System migrations — organisations moving from a previous CRM (Salesforce, Sugar, a legacy system) import historical data that was never deduplicated at the source
Marketing automation sync — tools like HubSpot or Marketo push contacts into Dynamics and don’t always match against existing records correctly
Manual entry — in any organisation with more than a handful of users, someone will create a lead without first searching whether it exists
External data enrichment — enrichment services that append records sometimes create new contacts instead of matching against the existing one

After five or more years of operation, a Dynamics environment with 500,000 contacts will typically have 8–20% duplication. That is 40,000 to 100,000 records that need to be identified and resolved.

Connection Architecture

DeDuplica connects to Dynamics 365 via the Dataverse Web API. This means:

No direct database access required — the connection uses the same API any Dynamics plugin or integration uses
All authentication is OAuth-based (client credentials flow with an Azure app registration)
Data is processed via DeDuplica’s agent, which can run inside your network perimeter if required
The TDS (Tabular Data Stream) endpoint must be enabled on your Dataverse environment for certain query patterns

The Dynamics connection documentation covers the full setup, including how to register an application in Microsoft Entra ID (formerly Azure Active Directory) and configure the required API permissions.
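
For readers who want to see what the client credentials flow looks like on the wire, the sketch below builds the token request against the Microsoft identity platform's v2.0 token endpoint. It only constructs the request and does not send it. The function name and the sample tenant, client, and organisation values are placeholders for illustration, and this is not a description of how DeDuplica's agent is implemented.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str, org_url: str) -> Request:
    """Build (but do not send) an OAuth 2.0 client credentials token request.

    org_url is the environment URL, e.g. https://contoso.crm.dynamics.com
    (a placeholder). The ".default" scope requests the permissions already
    granted to the app registration.
    """
    token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": f"{org_url.rstrip('/')}/.default",
    }
    return Request(
        token_url,
        data=urlencode(form).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values only; real ones come from your app registration.
request = build_token_request(
    tenant_id="contoso-tenant-id",
    client_id="app-client-id",
    client_secret="app-client-secret",
    org_url="https://contoso.crm.dynamics.com/",
)
print(request.full_url)
```

Sending the request (for example with urllib.request.urlopen) returns a JSON document whose access_token value is then passed as a Bearer token in the Authorization header of each Web API call. The client secret should come from a secret store, not from source code.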

Handling Dataverse Limitations

The Dataverse API enforces a 5,000-row retrieval limit per request by default, and aggregate queries across large tables can time out. DeDuplica works around this through page-based processing — the job retrieves data in configurable pages, processes each page, and accumulates results. This makes it possible to process Dynamics tables with hundreds of thousands or millions of records without hitting API limits.
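
The general shape of page-based retrieval can be sketched as follows. The Dataverse Web API returns an @odata.nextLink value with each page while more rows remain, and the client follows it until it disappears. Here fetch_page is a hypothetical callable standing in for an authenticated GET (typically sent with a Prefer: odata.maxpagesize header), and the in-memory pages exist only so the example runs on its own. This illustrates the pattern, not DeDuplica's internal code.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_rows(first_url: str, fetch_page: Callable[[str], Page]) -> Iterator[dict]:
    """Yield rows one page at a time, following @odata.nextLink until it is absent.

    fetch_page is a hypothetical helper: given a URL it returns the parsed
    JSON body of one Web API response.
    """
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        # Each response carries its rows in the "value" array.
        yield from page.get("value", [])
        # The last page has no next link, which ends the loop.
        url = page.get("@odata.nextLink")


# Stand-in for the network: two fake pages keyed by made-up URLs.
FAKE_PAGES: Dict[str, Page] = {
    "page1": {"value": [{"contactid": "a"}, {"contactid": "b"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"contactid": "c"}]},
}

rows = list(iter_rows("page1", FAKE_PAGES.__getitem__))
print(len(rows))
```

Because the function is a generator, only one page is held in memory at a time, which is what lets a job work through very large tables.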

One important constraint: Dataverse does not support arbitrary SQL. The Dynamics connector uses FetchXML and OData queries, which have different performance profiles from direct SQL. For very large tables, the job configuration allows you to filter which records are in scope — for example, processing only accounts modified in the past 90 days.
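
As an illustration of scoping by modification date, the sketch below builds an OData query string that restricts a table to rows modified within a given number of days. The helper name and its parameters are assumptions for the example; accounts, name, and modifiedon are the standard entity set and column names for the account table. How DeDuplica expresses this filter in its own job configuration may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional
from urllib.parse import quote, urlencode


def modified_since_query(entity_set: str, columns: List[str], days: int,
                         now: Optional[datetime] = None) -> str:
    """Return a relative Web API query for rows modified in the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%SZ")
    params = {
        "$select": ",".join(columns),
        "$filter": f"modifiedon ge {cutoff}",
    }
    # Keep "$", "," and ":" readable; spaces are percent-encoded as %20.
    return f"{entity_set}?{urlencode(params, quote_via=quote, safe='$,:')}"


print(modified_since_query("accounts", ["name", "modifiedon"], days=90))
```

The same helper with days=7 would produce the window for a weekly incremental run. FetchXML offers an equivalent relative-date condition through its last-x-days operator.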

A Realistic Deduplication Schedule for Dynamics

For an organisation with a mature Dynamics implementation, a sensible approach is:

Initial bulk cleanup — run a comprehensive Find Duplicates job with a broad date range to identify all historical duplicates. Review and process the high-confidence matches.
Weekly incremental runs — configure a scheduled job that processes only records created or modified in the past 7 days. This keeps new duplicates from accumulating.
Pre-migration/pre-upgrade runs — before any significant platform change, run a full deduplication pass to ensure the source data is clean.

DeDuplica’s job scheduling supports all three patterns from a single interface.

What Gets Merged

When DeDuplica processes a Dynamics duplicate group, you control which record becomes the base (the one that survives) and how conflicting field values are resolved. Common strategies include:

Prefer the most recently modified record’s field values
Prefer the longer string (keeps more data)
Keep the base record’s values unchanged and use the subordinate record’s data only to fill in empty fields
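
The three strategies above can be sketched as small field-level functions. This is an illustrative model, not DeDuplica's merge engine: the function names are invented for the example, records are plain dictionaries, and the fallback to the other record when the preferred value is empty is an assumption.

```python
from datetime import datetime
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Strategy = Callable[[Record, Record, str], Any]


def is_empty(value: Any) -> bool:
    return value is None or value == ""


def prefer_most_recent(base: Record, other: Record, field: str) -> Any:
    """Take the value from the record modified last (falls back if it is empty)."""
    newer, older = (base, other) if base["modifiedon"] >= other["modifiedon"] else (other, base)
    value = newer.get(field)
    return older.get(field) if is_empty(value) else value


def prefer_longer(base: Record, other: Record, field: str) -> Any:
    """Keep whichever value is the longer string; ties go to the base record."""
    a, b = str(base.get(field) or ""), str(other.get(field) or "")
    return a if len(a) >= len(b) else b


def fill_empty_only(base: Record, other: Record, field: str) -> Any:
    """Leave the base value alone unless it is empty."""
    value = base.get(field)
    return other.get(field) if is_empty(value) else value


def merge(base: Record, other: Record, strategy: Strategy, fields: List[str]) -> Record:
    """Return a copy of the base record with each listed field resolved by the strategy."""
    merged = dict(base)
    for field in fields:
        merged[field] = strategy(base, other, field)
    return merged


base = {"fullname": "J. Smith", "telephone1": "", "modifiedon": datetime(2024, 3, 1)}
other = {"fullname": "John Smith", "telephone1": "555-0100", "modifiedon": datetime(2025, 6, 1)}
print(merge(base, other, fill_empty_only, ["fullname", "telephone1"]))
```

With these sample records, fill_empty_only keeps the base name and borrows only the missing phone number, while the other two strategies would also replace the name with the longer, more recent "John Smith".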

Related records — activities, notes, child records — can be re-parented to the base record rather than deleted. This ensures that historical context from a duplicate account or contact is not lost during the merge.

DeDuplica supports Dynamics 365, Microsoft SQL Server, PostgreSQL, MySQL, MariaDB, Oracle, and local files.
