Overview#
Duplicate companies cause confusion and inefficiency for sales teams, so most Cargo users run recurring workflows to keep the CRM clean.
This template will walk you through the process of identifying and merging duplicate companies in the CRM when they share the same domain and/or LinkedIn page.
All integrations mentioned in this template require an associated connector to be set up in your Cargo workspace. Some integrations are eligible for use with Cargo credits. See the documentation for instructions on setting up connectors and using Cargo credits.
Company deduplication using Cargo#
Step 1 - Set variables#
Set up the input variables for the workflow
Inputs used in the workflow are set up in the variables node at the beginning of the workflow. This node is used to define the parameters that will be passed through the rest of the workflow.
Because the goal is to deduplicate companies in the CRM, this workflow is typically powered by the company data model exposed by your CRM connector in Cargo.
To power this workflow, the following variables are needed:
- domain: The domain of the company you are targeting
- linkedinPage: The LinkedIn page URL that is shared among duplicate companies
- crmCompanyID: The CRM ID of the company
- linkedinId: The LinkedIn company ID
- tiers: An array containing the convention used to prioritize companies in the CRM by importance (left to right, most important to least important)
- companyTierColumnName: The column name for company tier in your CRM, where the values listed in tiers are stored
- companyTypeColumnName: The column name for company type in your CRM (if absent, insert a dummy value)
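As an illustration, the variables above might look like the following once filled in. This is only a sketch: every value is a hypothetical placeholder (the domain, IDs, and column names are not taken from a real workspace), and only the tier array reuses the example given later in this template.

```python
# Hypothetical example values for the workflow variables.
# Replace each placeholder with the values from your own CRM.
workflow_variables = {
    "domain": "example.com",
    "linkedinPage": "https://www.linkedin.com/company/example",
    "crmCompanyID": "1234567890",             # placeholder CRM record ID
    "linkedinId": "987654",                   # placeholder LinkedIn company ID
    "tiers": ["tier_1", "tier_2", "tier_3"],  # leftmost = most important
    "companyTierColumnName": "company_tier",  # assumed column name
    "companyTypeColumnName": "company_type",  # assumed column name
}
```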
Step 2 - Filter and Retrieve duplicates#
Filter generic domains and retrieve duplicate companies
The workflow first filters out generic domains (like bit.ly, linkedin.com, etc.) to avoid false positives. Then, it searches for duplicates based on available identifiers - either domain, LinkedIn page, or both.
The search is routed through different paths depending on which identifiers are available:
- If both domain and LinkedIn page are available, it searches using both
- If only domain is available, it searches using domain
- If only LinkedIn page is available, it searches using LinkedIn page
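The filtering and routing described above can be sketched in Python as follows. This is a minimal illustration, not Cargo's actual implementation: the function name, the route labels, and the contents of `GENERIC_DOMAINS` beyond the two domains named above are assumptions, as is the choice to treat a generic domain as if no domain were provided.

```python
from typing import Optional

# Illustrative list only; the template names bit.ly and linkedin.com as examples.
GENERIC_DOMAINS = {"bit.ly", "linkedin.com"}


def choose_search_route(domain: Optional[str], linkedin_page: Optional[str]) -> Optional[str]:
    """Return which identifiers to search on, or None if none are usable."""
    usable_domain = None
    if domain:
        cleaned = domain.strip().lower()
        # Assumption: a generic domain is ignored rather than searched on.
        if cleaned and cleaned not in GENERIC_DOMAINS:
            usable_domain = cleaned

    has_linkedin = bool(linkedin_page and linkedin_page.strip())

    if usable_domain and has_linkedin:
        return "domain_and_linkedin"
    if usable_domain:
        return "domain"
    if has_linkedin:
        return "linkedin"
    return None
```

With this sketch, a record whose only identifier is a generic domain yields no route at all, which is one way to avoid the false positives mentioned above.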
The workflow proceeds only when more than one company is found and all of the companies found are of type "Prospect".
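The condition above amounts to a simple guard. The sketch below assumes each company is represented as a dictionary keyed by CRM column name; the function name and the `type_column` parameter are hypothetical and stand in for the companyTypeColumnName variable.

```python
def should_proceed(companies, type_column, prospect_value="Prospect"):
    """True when there is more than one match and every match is a Prospect."""
    return len(companies) > 1 and all(
        company.get(type_column) == prospect_value for company in companies
    )
```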
Step 3 - Identify Primary Record#
Determine which company record should be the primary one
The workflow determines the primary record by tier first, then by completeness:
- It first checks the company tiers using the provided tier array (e.g., ["tier_1", "tier_2", "tier_3"])
- If companies have different tiers, it keeps the one with the highest tier (leftmost in the array being highest priority)
- If companies have the same tier or no tier, it selects the record with the most properties filled
The workflow then identifies and stores the HubSpot ID of this primary record for the merging process. This ensures that the highest-priority record is preserved, with the most complete record winning when tiers do not decide.
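One way to express the selection rules above is as a single sort key: tier position first, number of filled properties second. The sketch below is an assumption about how this could be written, not the workflow's actual code. The helper names are hypothetical, companies are assumed to be dictionaries keyed by CRM column name, and a company with no recognized tier is ranked after every tiered company, which the template does not specify.

```python
def tier_rank(company, tiers, tier_column):
    """Position of the company's tier in the array; untiered companies rank last."""
    tier = company.get(tier_column)
    return tiers.index(tier) if tier in tiers else len(tiers)


def filled_count(company):
    """Number of properties that hold a non-empty value."""
    return sum(1 for value in company.values() if value not in (None, "", []))


def pick_primary(companies, tiers, tier_column):
    """Lowest tier rank wins; ties are broken by the most filled properties."""
    return min(
        companies,
        key=lambda c: (tier_rank(c, tiers, tier_column), -filled_count(c)),
    )
```

For example, given two tier_1 records and one tier_2 record, the tier_2 record is never chosen, and the tier_1 record with more filled properties becomes the primary.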
Step 4 - Execute the Merge#
Merge all duplicate records into the primary record
Once the primary record is identified, the workflow:
- Creates a list of all HubSpot IDs except the primary record's ID
- Uses HubSpot's merge action to combine all duplicate records into the primary record
- Preserves all relevant data from the duplicate records in the process
This final step ensures that all company information is consolidated into a single, comprehensive record while maintaining the highest-quality data based on your company's prioritization rules.
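The list-building and merge steps can be sketched as follows. The `merge_fn` callable is a hypothetical stand-in for the merge action provided by the connector; its signature, and the assumption that duplicates are merged into the primary one at a time, are illustrative and not taken from Cargo or HubSpot documentation.

```python
def ids_to_merge(companies, primary_id, id_field="id"):
    """All record IDs except the primary record's ID, in their original order."""
    return [c[id_field] for c in companies if c[id_field] != primary_id]


def merge_all(companies, primary_id, merge_fn, id_field="id"):
    """Merge every duplicate into the primary record, one duplicate at a time."""
    for duplicate_id in ids_to_merge(companies, primary_id, id_field):
        # merge_fn is a placeholder for the connector's merge action.
        merge_fn(primary_id, duplicate_id)
```

Passing the merge action in as a function keeps the sketch independent of any particular CRM client and makes it easy to dry-run the workflow by logging the pairs instead of merging them.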