Thursday, July 25, 2013

Create Metric Extension to Monitor Archivelog Gap on Standby Database

Create Metric Extension to Monitor Archivelog Gap on Standby

I want to monitor when a standby database falls behind a certain amount of archivelog files. I want to be able to do this no matter what platform or database version. I want to use a standard monitor across the database ecosystem. In this case I used metric extension to create my own metric for monitoring archivelog gap on standby database.

1.  Login to OEM

2. Click Enterprise>Monitoring>Metric Extension

3. Click create













4. Fill in the information and set collection schedule as needed click next.
Target Type- Database Instance
Name- <any_name>
Display Name- <any_name>
Adapter- SQL
Description- <any_description>



5. Insert the SQL query below into the SQL Query box and click next. What this query is doing is selecting on the gv$archived_log table on the primary where arch dest is standby and appiled is yes. If a standby has not applied an archivelog then on the primary applied will be no. There are cases where mrp process has stop sending confirmation to the primary that archivelog have been applied in that case it is still an issue and this alert will give you notification to take action.
select sum(local.sequence#-target.sequence#) Total_gap
from
(select thread#,max(sequence#) sequence# from gv$archived_log where
dest_id=(select dest_id from v$archive_dest where TARGET='STANDBY'
)
and applied='YES' group by thread#) target,
(select thread#,max(sequence#) sequence# from gv$log group by thread#) local
where target.thread#=local.thread#;
 6. Click add button to setup column info.
7. Fill in the information and click OK then next.
Name- column name
Display Name- how to display the column
Column Type- Data Column
Value Type- Number
Comparison Operator- Greater than or equal to what metric threshold needed
Set the advance values as needed

 8. Set credentials as needed
 9. Click add button to add target to test the metric.
 10. Select a database to test and click select.
 11. Click the run test button.
 12. Testing progress will start
 13. Test results will show up in test result section once complete click next.
 14. Review the metric extensions setting and click finish.
 15. You will now see the new metric extension but the status is editable. Select the new metric and click >action>save as deployable draft.
 16. Now the status of the metric extension is deployable and you will be able to deploy the metric extension to database instance targets. Click action>deploy to targets.
 17. Click add button
 18. Select the target or targets to deploy the metric. In this case we have the primary database targets in a group called Data Guard Primary Databases.
19. Click the submit button to submit metric process will start.

Example of alert

21 comments:

  1. Nice document...but this has to be done on Prod or Standby ?

    ReplyDelete
  2. You deploy the metric extension to the primary.

    ReplyDelete
  3. Also, I want to know if it is 3 node RAC, if i want thread wise difference in Logs, should i create 3 different metrics ie for each thread

    ReplyDelete
  4. Thanks Ruiz for the Doc.. It was very helpful..
    Quick Question - What about the email notifications? where can i setup them in order to get email when it hits threshold or critical

    ReplyDelete
    Replies
    1. Ravi first to get email from OEM you need to setup the notification method from within OEM. Go to setup>notification>notification method and fill in the info you need. Then go to setup>incident rules and configure the rule you need for the metric you want to be notified about.

      Delete
    2. Javier,Could you please explain the steps to get alerts ...

      Delete
    3. Ravi first to get email from OEM you need to setup the notification method from within OEM. Go to setup>notification>notification method and fill in the info you need. Then go to setup>incident rules and configure the rule you need for the metric you want to be notified about.

      Delete
  5. This query fails in a data guard broker environment where you have multiple standby's. "ora-01427: single-row subquery returns more than on row"
    do you have a query that works in one of these scenarios? Thanks

    ReplyDelete
    Replies
    1. Try the query below let me know if it works and I will add an update to the post.

      select sum(local.sequence#-target.sequence#) Total_gap
      from
      (select thread#,max(sequence#) sequence# from gv$archived_log
      where standby_dest='YES'
      and applied='YES'
      and first_time between sysdate-7 and sysdate
      group by thread#) target,
      (select thread#,max(sequence#) sequence# from gv$log group by thread#) local
      where target.thread#=local.thread#;

      Delete
  6. we have multiple standby I need to check the Gap but above query just giving only either 1 or null return.

    ReplyDelete
  7. Hola Javier,
    Sorry for bothering but I've been looking all over the internet without luck.
    I have some Non2jee apps that creates logs. I want to parse those logs using oem agent and ask to oem to issue an alert if an error appears in the log.
    Do you know by any chance how to do this?
    Thanks

    ReplyDelete
    Replies
    1. Santiago under host target there is already a metric that you can use to look at specific files and based on key words that can trigger alert. If that does not work check into OEM metric extentions.

      Delete
  8. Hi Javier,
    I have similar type of requirement to monitor below query.Can you please help me.

    On Active Data Guard Machine-
    SQL> SELECT name, value FROM V$DATAGUARD_STATS WHERE name like 'apply lag';
    NAME VALUE
    -------------------------------- -----------------------apply lag +00 00:00:01

    ReplyDelete
  9. Did you follow the steps in this blog it does exactly what you are looking for.

    ReplyDelete
  10. I ended up creating metric extension similar to yours with additional values as key columns. The difference is the gap is calculated for individual thread level and it takes into account if the primary has more than one standby.



    with target_dg as
    (
    select t1.dest_id, t1.destination, t1.db_unique_name, t2.thread#, max(t2.sequence#) sequence#
    from v$archive_dest t1, gv$archived_log t2
    where t1.target = 'STANDBY'
    and t2.applied = 'YES'
    and t1.dest_id = t2.dest_id
    group by t1.dest_id, t1.destination, t1.db_unique_name, t2.thread#
    ), local_dg as
    (select thread#,max(sequence#) sequence#
    from gv$log
    group by thread#)
    select sys_context('USERENV', 'DB_UNIQUE_NAME' ) as source_db_name, t1.dest_id, t1.db_unique_name target_db_name, t1.thread#, t2.sequence# - t1.sequence# gap, t2.sequence# primary_sequence#, t1.sequence# standby_sequence#
    from target_dg t1, local_dg t2
    where t2.thread# = t1.thread#;

    thanks,
    -Jey

    ReplyDelete
    Replies
    1. Hi jay,

      Please can you share the metric out put. if the two standbu also down how we get out put

      Delete
    2. hey J can you share the ouput we have two standby's too and would like to know how you set it up

      Delete
    3. hey J can you share the ouput we have two standby's too and would like to know how you set it up

      Delete
  11. Ooohh!! Very Nice document. I've and Implement It, it just Wouaouh!!!

    ReplyDelete