Formatting Excel to Populate DQS Parent-Child Relationships

I wanted to import data into DQS that I already know is cleansed and mapped, but it took me a couple tries to get the Excel file in the right format. I’m not sure this type of scenario was in mind when they built the product UI.

The pre-existing mapping data exists in Excel spreadsheets, and we want to import that knowledge to DQS. I tried several formats for importing the records from Excel. During this process, I got lots of errors. For example:

Format Excel to Populate DQS

When I viewed the log, I saw many errors of the same format:

Format Excel to Populate DQS 2

I copied all the results to the clipboard and examined the data. I think this is how they translate:

Value “X” cannot be linked to itself.
-This term already exists in DQS.
-Usually, there is a capital to lower casing difference in the name when I see this error.
-For example, “UNITED STATES” to “United States”.
-I assume DQS does a string compare with non-sql code and determines the strings are different terms, but when it tries to insert into SQL, the value already exists because SQL isn’t case sensitive.

Value “X” cannot be linked because it is not a leading value.
-That value “X” has already been assigned to another parent value.
-Sometimes this is related to duplicates in the data.
-When I checked DQS after the import, these terms were usually linked correctly.

Values ‘X’ and ‘ Y’ cannot be linked since ‘ Y’ is already a leading value.
-There was a space before X or Y which confused DQS so it loaded Y as a leading value.
-You must go in to the DQS UI and re-assign this one manually.
-When I’ve seen this one, it’s paired with a ‘ Y’ cannot be linked because it is not a leading value.

Regardless, I found a format that worked most consistently for my data. At the top of the Excel file, in the first column, I created the full list of parent values in alphabetical order. At the bottom of that list, I added the child values (also in alphabetical order) so it looked like this:

Name Child
United Kingdom
United States
United Kingdom UK
United States US
United States USA

When I imported the data, it looked like I had hoped:

Format Excel to Populate DQS 3

I hope this helps save someone time!

Phone: 312-602-4000
222 W. Adams
Chicago, IL 60606
Show Buttons
Share On Facebook
Share On Twitter
Share on LinkedIn
Hide Buttons