Wednesday, March 28, 2012

Resolve conflict

Hi,

l've a raw data which contains list of device name and # of wires used, i.e.
DEVICE NAME # of Wires
-
A 10
B 11
C 17
D 5
A 0
E 0

So l would like to import these data into my table (with device name as primary key) which each device only appear once. If the device name appear twices, then l should ignore device with # of wires = 0. If device name appear twice and # of wires <> 0, then l should log it.

How should l do that in SSIS ?

1. Import the raw data

2. Put a MULTICAST into the pipeline.

3. Take Output1 from the MULTICAST and feed it into an AGGREGATE. Count the number of records per device name and sum the number of wires per device.

4. Us a conditional split to get all the devices where there is more than 1 record and the number of wires is greater than 0

5. Join that back to Output2 from the MULTICAST by using a MERGE JOIN, joining on device name.

6. Push the output from the MERGE JOIN into a flat file for later examination.

-Jamie

|||

I think there is an easier way to do this as long as there is no additional columns needed.

Source --> CS1 --> Agg --> CS2 --> Dest1

--> Dest 2

CS1 removes all items with # wires == 0 (e.g. expression on the output is # wires == 0 and then just don't attach this output and just attach the default output to Agg)

Agg then aggregates by name, does max (or min) on # wires and count

CS2 splits on count != 1, one output goes to the table (count == 1), and the other (count != 1) goes to either a script task that logs the info or a flat file that outputs the info.

This avoids MergeJoin and, therefore, a sort too.

HTH,

Matt

No comments:

Post a Comment