Tuesday, December 23, 2014

Avoid distinct in Query if its not needed actaully

This query came to me  as its taking more time return result and cost is also high.

Ex:
SELECT id_
  FROM act_ru_task
WHERE id_ IN (SELECT tasks.id_
                 FROM (SELECT a.*, rownum rnum
                         FROM (SELECT DISTINCT res.*
                                 FROM act_ru_task res
                                INNER JOIN act_ru_identitylink i
                                   ON i.task_id_ = res.id_
                                WHERE res.priority_ = :b3
                                  AND res.assignee_ IS NULL
                                  AND ((res.due_date_ <= :b2 AND
                                      res.due_date_ IS NOT NULLOR
                                      res.due_date_ IS NULL)
                                  AND res.assignee_ IS NULL
                                  AND i.type_ = 'candidate'
                                  AND i.group_id_ = :b1
                                ORDER BY res.id_ ASC) a
                        WHERE rownum <= :b4) tasks
                WHERE rnum >= :b5)

   FOR UPDATE

After taking plan 

I found there is no duplicate values in the table and distinct is doing full table scan which is not required here.

Before putiing distinct in the table please make sure its needed for your requirement.

Monday, November 24, 2014

Oracle Tuning a Query -table with 25 million Record

Am Having the transaction table which will grow daily and now  row count is around 25 million.

This is my query taking more time  and cost. Its due to order by in the computation field(TOTAL_WEIGHT)

SELECT   PPS_ID,TOTAL_WEIGHT from
                      (SELECT PPS_ID,TOTAL_WEIGHT
                        FROM (SELECT pps_id,round(((( 60 * name_pct_match / 100) + prs_weight + year_weight + dt_weight)/
                              90) * 100) total_weight
                FROM (SELECT pps_id,
                       round(func_compare_name( 'FRANCIS JOHN ROCH',upper(name_en),' ',60)) name_pct_match,
                       decode(prs_nationality_id, 99, 15, 0) prs_weight,
                       10 mother_weight,
                       100 total_attrib_weight,
                case when to_number(to_char(birth_date, 'yyyy')) = 1986 then 5 else 0 end year_weight,
                case when to_char(to_date('12-JAN-86','DD-MON-RRRR'), 'dd') = to_char(birth_date, 'dd')
                  and to_char(to_date('12-JAN-86','DD-MON-RRRR'), 'mm') = to_char(birth_date, 'mm') then 10
                     when to_date('12-JAN-86','DD-MON-RRRR') between birth_date-6 and birth_date+6 then 8
                     when to_date('12-JAN-86','DD-MON-RRRR') between birth_date-28 and birth_date+28 then 5
                     when to_date('12-JAN-86','DD-MON-RRRR') between birth_date-90 and birth_date+90 then 3
                else 0
                end dt_weight
                  FROM individual_profile
                 WHERE birth_date = '12-JAN-86' 
                 AND IS_ACTIVE = 1
                   AND gender_id = 1 
                   AND round(func_compare_name('FRANCIS JOHN ROCH',upper(name_en),' ',60)) > 20
                   ))
                  WHERE TOTAL_WEIGHT >= 100 
                  order by total_weight desc)
                  where rownum <= 10
Till oracle 11g there is no way to solve order by issue. In oracle 12c oracle  introduced TOP LIMIT concept to reduce this burden. In my case i cant use this since my DB is oracle 11g. so i did following steps to make this query performing
Step -1: Initially i thought to made my fetch fast. So i partitioned the table by birth_date and subpartitioned by gender_id. So now my fettch became fast.
Step -2: partitioned the table bith_date year wise as range partition  and gender wise as list partition.
Step-3:  Now i made the fetch faster. my query is used by many users at a time. So i want to think about CPU utilization also. After taking explain plan i found  bytes is high and cost somewhat reduced
Step-4: Now i tried different indexes to reduce the cost and bytes. but nothing helped as order by  is used.
step-5 : Finally i decided to make my query to run in parallel.
step-6: Oracle 11g has option of parallel(auto) which tell the optimizer to run in parallel based on user
step -7: SELECT  /*+ parallel(auto) */  PPS_ID,TOTAL_WEIGHT from 
step-8: added parallel hint which brought down cost and bytes. now query is performing well
Note: This parallel wont take indexes used in the table.It  will ran in parallel way.
         


Oracle Text Concept-- Like operator Replaced with Oracle Text which improves better performance

In this process i  want to replace like operator in where clause which is going for full table scan in some package. Oracle suggested replace the like operator with oracle text concepts. There are two ways to implement this 

·          Catsearch
·          Context
Catsearch:
           Catsearch is the best way to implement because  here the index will be auto updated  and we will get more matches in record. In our case it failed because its giving more record than which we are getting in like operator. Because its searching by fragment by fragment. But in our case it should start from start of the letter.
Context:
           This is done by contains clause. Here the index is not auto updated and we can  make it auto-commit by giving sync commit while creating the index. Created datastore and wordlist to perform index more better. And set attributes on the column which need to be indexed and prefix_index which would improve the performance. And set wildcard terms to 5000 which would avoid conversion error.
Example:
begin
 ctx_ddl.create_preference('spm_full_ename_ds', 'multi_column_datastore');
 ctx_ddl.set_attribute('spm_full_ename_ds', 'columns', '''xxstart'',spm_full_ename');
 ctx_ddl.set_attribute('spm_full_ename_ds', 'delimiter', 'NEWLINE');
 ctx_ddl.create_preference('spm_full_ename_wl', 'BASIC_WORDLIST');
ctx_ddl.set_attribute('spm_full_ename_wl','PREFIX_INDEX','TRUE');
ctx_ddl.set_attribute('spm_full_ename_wl','PREFIX_MIN_LENGTH',1);
ctx_ddl.set_attribute('spm_full_ename_wl','PREFIX_max_LENGTH',15);
 CTX_DDL.SET_ATTRIBUTE ('spm_full_ename_wl', 'WILDCARD_MAXTERMS', 50000);
end;
Now while creating index  added nationality in the filter, because in some cases nationality is not considered as index and going for full table so made  nationality in the filter by and now it as composite index.
Index:
Create index idx_cont_spm_ename on list_master(spm_full_ename) indextype is ctxsys.context  filter by nat_code_curr_nat
parameters ('wordlist spm_full_ename_wl  datastore spm_full_ename_ds SYNC(ON COMMIT)') local;
If our system is RAC node oracle suggested to  alter the index to parallel to improve the performance
alter index idx_cont_spm_ename parallel 1;
Now after creating the index we will face response time is more for the query for that oracle suggested  some hints which would return the records in faster manner and optimize the cost of the query.
/*+ domain_index_no_sort */  and /*+ first_rows(1) */


First_rows() hint:
we can also optimize for response time using the related FIRST_ROWS hint. Like FIRST_ROWS(n), when queries are optimized for response time, Oracle Text returns the first rows in the shortest time possible.
 domain_index_no_sort:
Example:

SELECT /*+ first_rows(1)  domain_index_no_sort  */ id sub_sys_individual_id,
             spm_full_ename nm_e,
             spm_full_aname nm_a,
            -- func_get_moidata_gender(spm_person_no) sex,
             nat_code_curr_nat prs_nat,
             spm_person_no person_no,
             null person_tp,
             'NA' prog_where_not_allowed,
             spm_person_no udb_no
        FROM list_master
       WHERE (ls_code IN (2, 3))
    and contains(spm_full_ename,'xxstart AYOUB GHOLAMREZA AHMADI%') >0
        AND (((date_from <= trunc(SYSDATE) OR date_from IS NULL) AND
             (date_to >= trunc(SYSDATE) OR date_to IS NULL) AND
             (duration_flag = 2)) OR (duration_flag = 1));


ctx_ddl.create_preference and Set_attribute:
To create a datastore, lexer, filter, classifier, wordlist, or storage preference,we can  use the CTX_DDL.CREATE_PREFERENCE procedure.
Multi column Datastore: Data is stored in a text table in more than one column. Columns are concatenated to create a virtual document, one for each row.
we can also set attributes with the CTX_DDL.SET_ATTRIBUTE procedure.
Newline attribute:
Specify TRUE for Oracle Text to add no newline character after each detail row.
Prefix_index attribute:
In our case we are  using right-truncated predicate. For that oracle suggested to use the prefix_idex attribute by setting min and max values.
WILDCARD_MAXTERMS:
As wre are wild card character in our search % we are setting this  wildcard maxteerms as 5000. In wordlist we are setting this.
When we execute a query using wildcard(%), the term you used in a query is used to generate another list of terms (say match_list) from token_list.

Match_list consists of all the terms which have matched the wildcarded term - whith 0 or more characters to the query term in place of wildcard.
SYNC(ON COMMIT):
As we know context  search is not auto update we are updating the index on commit. So this sync( on commit) is used.
Local:
Since our table is partitioned  by creating the index as local. As we are using first_rows and domain_index_no_sort hint we cam improve the query performance and the response time. If a table is range or hash partition only we can create this index as local.v