Hey everyone,
this message comes in two forms, the short version and the long detailed
version-- that way hopefully I can get all the help possible as fast as
possible :)
Short version:
I have a table Names(names_id, name1, name2) that I normalized into 2
tables: Name(name_id, name) and ConnectNames(oldName_id, name_id,
number). The number column just indicates if the name was from the name1
column or the name2 column. Given that I've now normalized this, if I
need to get what was once 1 row (for example: 123, Bob, John), it will
appear as 2 rows if I need to get it by the original nameId as follows:
select oldName_id, name, number
from connectNames cn
inner join name n on n.name_id=cn.oldName_id
where oldName_id=123
would now return:
123 Bob 1
123 John 2
What I need however is for this to be displayed inline like it used to
so that it can be returned in one row (because this gets joined to other
tables). The issue: I'm dealing with millions of rows in this table, and
millions of rows in the other tables that ultimately got joined with the
old denormalized Names table. I've tried using a pivot table approach on
the normalized data to get it in 1 row, but it is very slow when
returning large rowcounts. I've tweaked the indexes, but you can only
get so much performance.
How can I do this better?
Thanks a bunch in advance,
DS
Long detailed version:
I started with a table Names(<pk>names_id, name1, name2). Problem is I
needed to search by name, so I normalized this into a names table and a
cross-reference table:
Name(<pk>name_id, name) and ConnectNames(oldName_id, name_id, number).
I've included the code for this at the bottom of the message for how I
went about this.
Hurra for normalization, now is easy to search for a name:
select oldName_id from connectNames cn inner join name n on n.id=cn.name
where name=@.nameToSearch;
This however presented another problem: I need to be able to display
both name1 and name2 in a single row-- I need to pivot what I just
created (thats actually why I sneaked in the number column into the
cross-ref table to make it easy to pivot). A solution I grabbed from
MSDN was to create a view that I could then join onto twice:
create view connectNamesView
select oldName_id,
MIN(CASE number WHEN 1 THEN name_id END) AS name_id1,
MIN(CASE number WHEN 2 THEN name_id END) AS name_id2,
from connectNames
group by oldName_id
select n1.name, n2.name from connectNamesView cnv
inner join name n1 on n1.name_id=cnv.name_id1
inner join name n2 on n2.name_id=cnv.name_id2
The real issue with this though is that as you may imagine this is
pretty resource intensive, especially when you have several million
records in these tables, and when you join oldName_id to another table
with several million records like this:
select id, product, name1, name2
from Owners o -- note: owners has millions of rows too
inner join connectNamesView cnv on cnv.name_id=o.name_id
inner join name n1 on n1.name_id=cnv.name_id1
inner join name n2 on n2.name_id=cnv.name_id2
where id between 10000 and 20000
Running this takes ages when its joined to another table. Is there a
better way to improve performance or to denormalize the results JUST for
display (display them in 2 columns)? (by the way, is denormalize the
correct term for doing what I need to here).
Thanks in advance for the help and reading this long-winded post :)
-DS
To normalise the table Id did the following:
create table tmpName (id int, name varchar(20), number tinyint);
insert into tmpName (id, name, number) select name_id, name1, 1
insert into tmpName (id, name, number) select name_id, name2, 2
create table Name (id int not null identity(1,1), name varchar(20))
insert into Name (name) select distinct name from tmpName
-- at this point the Names table has been normalized; now to create the
one to many relationship:
create table ConnectNames(oldName_id int, name_id int, number tinyint)
insert into ConnectNames(oldName_id, name_id, number)
select t.id, n.id, t.number
from name n
inner join tmpName t on t.name = n.name
drop table tmpNameProviding the relationship is always 1 to 1 (exactly one name for each
old_name), then this might work (untested):
select name as Name1
,Name 2
=(select name
from Names OldNames
where (OdlNames.oldName_id = Names.name_id))
from Names
For a better solution post DDL, sample data, preferably with expected result
s.
ML
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment