Friday, February 24, 2012

almost duplicate records

Hi,
Is there a way to 'merge' almost duplicate records. I have records which
are the same but the other duplicate record can differ from the
other. Normally i just delete the duplicate records, but in this case i
can't.
For example (in a 1-N relation):
id | field1 | field2 | field3
1 | 11.5 | 52.0 | null
1 | null | null | yes
should be:
id | field1 | field2 | field3
1 | 11.5 | 52.0 | yes
Is this possible?Maybe like this:
SELECT id, SUM(f1), SUM(f2), MAX(f3)
FROM YourTable
GROUP BY id ;
David Portas
SQL Server MVP
--|||David Portas wrote:
> Maybe like this:
> SELECT id, SUM(f1), SUM(f2), MAX(f3)
> FROM YourTable
> GROUP BY id ;
>
Hi David,
I tried that, but that will only work with numeric values. I have
numeric as well as varchars.
Any idea how i can solve this?|||SUM will only work with numerics. MIN and MAX will work with VARCHARs
too.
The answer all depends on what you want to achieve. In you example data
what result would you have required if Field3 on the first row had been
"No" instead of NULL?. You would have to devise an expression that
makes some sense based on your knowledge of the data.
There may be other alternatives. For example you could pick the row for
each ID that has the fewest NULL attributes. Example:
ALTER TABLE YourTable ADD i INTEGER IDENTITY
GO
SELECT P.id, P.f1, P.f2, P.f3
FROM YourTable AS P,
(SELECT MIN(i) AS i
FROM YourTable AS T2
WHERE
CASE WHEN T2.f1 IS NULL THEN 1 ELSE 0 END+
CASE WHEN T2.f2 IS NULL THEN 1 ELSE 0 END+
CASE WHEN T2.f3 IS NULL THEN 1 ELSE 0 END =
(SELECT MIN(
CASE WHEN T3.f1 IS NULL THEN 1 ELSE 0 END+
CASE WHEN T3.f2 IS NULL THEN 1 ELSE 0 END+
CASE WHEN T3.f3 IS NULL THEN 1 ELSE 0 END)
FROM YourTable AS T3
WHERE T2.id = T3.id)
GROUP BY id) AS Q
WHERE P.i = Q.i ;
David Portas
SQL Server MVP
--

No comments:

Post a Comment