Showing posts with label calculate. Show all posts
Showing posts with label calculate. Show all posts

Friday, March 9, 2012

PL/SQL effeciency question

Right now I have a few PL/SQL procedures that calculate count(*) and sum(*) values for different reports. So far it's a daily operation, with around 80000 records per day (seems to be increasing by 10000 records every week though)

it's not a complex procedure, just a lot of number crunching for each columns. Right now the basic logic is to use a separate SQL statement for each column(with different conditions, of course), and repeat it until all the reports are filled in.

So far it's taking about 20 secs for going thru every 10000 records. Add the time it takes to load the records in (about 25-30 secs per 80000 records), and it becomes about 3 minutes of operation each day.

I have been thinking about how I can improve the performance of the procedure. So far I am thinking about whether the performance would improve if I can change the logic by declaring a cursor for the records, and just go thru it one time, then put the value into different variables.

Any suggestion is appreciated

MarkOriginally posted by mchih
Right now I have a few PL/SQL procedures that calculate count(*) and sum(*) values for different reports. So far it's a daily operation, with around 80000 records per day (seems to be increasing by 10000 records every week though)

it's not a complex procedure, just a lot of number crunching for each columns. Right now the basic logic is to use a separate SQL statement for each column(with different conditions, of course), and repeat it until all the reports are filled in.

So far it's taking about 20 secs for going thru every 10000 records. Add the time it takes to load the records in (about 25-30 secs per 80000 records), and it becomes about 3 minutes of operation each day.

I have been thinking about how I can improve the performance of the procedure. So far I am thinking about whether the performance would improve if I can change the logic by declaring a cursor for the records, and just go thru it one time, then put the value into different variables.

Any suggestion is appreciated

Mark
Generally it is preferable to avoid procedural logic if you want best performance. Maybe you could combine all (or many of) your counts and sums into a single query using DECODE (or CASE) to filter the records:

SELECT SUM( DECODE( col1, 'x', 1, 0 )) as COUNT_WHERE_COL1_IS X
, COUNT(*) TOTAL_COUNT,
, SUM( DECODE( col3, 123, col4, 0 )) as SUM_COL4_WHERE_COL3_IS_123
...
FROM ...|||it might be difficult to put many of the query together, since each of them has a different where clause.

eg:

table temp(
card_type
card_amount
message_type
message_response
.
.
.
)

a sample query would be
select count(*), sum(card_amount) from temp
where card_type = x
and message_type = y
and message_response = z;

now that i think about it, it might be possible to use a GROUP BY to get all the values with similar WHERE clause, but I don't know how to store the values individually so i can access them later (eg. put it into different table)

TIA

Mark

Saturday, February 25, 2012

Pivot Task Error - Duplicate pivot key

I am using the pivot task to to a pivot of YTD-Values and after that I use derived columns to calculate month values and do a unpivot then.

All worked fine, but now I get this error message:

[ytd_pivot [123]] Error: Duplicate pivot key value "6".

The settings in the advanced editor seem to be correct (no duplicate pivot key value) and I am extracting the data from the source sorted by month.

Could it be a problem that I use all pivot columns (month 1 to 12) in the derived colum transformation and they aren′t available at this moment while data extracting is still going on?

any hints?

Cheers
Markus

The pivot transform takes values like:

cust# Product Qty

-- -

1 Ham 4

1 Chips 2

1 Flan 1

2 Chips 3

2 Beer 19

and produces rows like:

cust# HamQty ChipsQty FlanQty BeerQty

-- - - - -

1 4 2 1 null

2 null 3 null 19

so what to do with input data like this?

cust# Product Qty

-- -

1 Ham 4

1 Chips 2

1 Chips 5

Which value should go into the ChipsQty column 2 or 5?

Most application would want 7, and so we suggest that the pivot be preceded by an aggregate transform to ensure that there is only 1 row for each distinct value of the pivot key. If not, you will see the error you report.

hope this helps