Thursday, February 5, 2015

Some evaluation results on Oracle AWR (Automatic Workload Repository)

Some evaluation results on Oracle AWR.
1. Operations
1.1 How to Install/Enable
Set STATISTICS_LEVEL to TYPICAL, does not need bounce.
If we want to save every second ASH data to disk, set “_ash_disk_filter_ratio”=1. By default, it save 1 second snapshot out of 10 seconds.
If we want to set AWR snapshot interval to 15 minutes, by default it’s 60 minutes: exec DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(interval=>15);
If we want to set AWR data retention time to 90 days, by default it’s 7 days: exec DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(retention=>90*1440);
1.2 How to Disable
In 11g, set CONTROL_MANAGEMENT_PACK_ACCESS = NONE as Oracle document said.
But even with that, as my test in 11.2.0.2, Oracle will still create AWR snapshots. You can download the file dbmsnoawr.plb from Oracle DOC [ID 436386.1], and run follow
@dbmsnoawr.plb
exec dbms_awr.disable_awr();
In 10g, set STATISTICS_LEVEL=BASIC. In case of STATISTICS_LEVEL is TYPICAL, set “_ASH_ENABLE”=FALSE to disable ASH, and run package in below document to disable AWR: Package for disabling AWR without a Diagnostic Pack license in Oracle [ID 436386.1]
Normally, the snapshot data can be purged using the package:
select min(snaP_id),max(snap_id) from DBA_HIST_SNAPSHOT ;
BEGIN
  dbms_workload_repository.drop_snapshot_range(low_snap_id => &lowid, high_snap_id=> &highid);
END;
/
In some cases, you want to recreate the AWR repository:
SQL> connect / as sysdba
SQL> alter system set statistics_level=basic;
SQL> @?/rdbms/admin/catnoawr.sql
SQL> @?/rdbms/admin/catawrtb.sql
SQL> alter system set statistics_level=typical;
1.3 How to create AWR report
1) AWR report
Run script $ORACLE_HOME/rdbms/admin/awrrpt.sql
Major different of AWR report and Statspack report:
AWR report can be html format, easy to read.
AWR report has some more info:
Foreground Wait Class
Wait Event Histogram Detail
SQL ordered by User I/O Wait Time
SQL ordered by Physical Reads
2) AWR difference report
Run script @$ORACLE_HOME/rdbms/admin/awrddrpt.sql
3) ADDM report
Run script @$ORACLE_HOME/rdbms/admin/addmrpt.sql
OR we can just get an existing ADDM report which is generated automatically:
select task_id,task_name,created from DBA_ADVISOR_TASKS order by task_id;
spool addm_473.txt
set long 1000000 pagesize 0 longchunksize 1000
column get_clob format a80
select dbms_advisor.get_task_report('ADDM:318357951_1_473', 'TEXT', 'TYPICAL') from sys.dual;
spool off;
4) AWR SQL report
Connect as sysdba, run script @$ORACLE_HOME/rdbms/admin/awrsqrpt.sql
5) AWR info report
Run script @$ORACLE_HOME/rdbms/admin/awrinfo.sql, AWR Information like current Usage and Data Distribution
1.4 How to create snapshot and baseline manually
1) Create AWR snapshot
Oracle will create AWR snapshot based on interval setting, every hour by default. It also can be created manually using:
exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();
2) Create AWR baseline
There are 3 types of baselines.
A. Fixed Baselines
B. Moving Window Baseline
C. Baseline Templates (repeatable)
Using package to create baseline:
exec DBMS_WORKLOAD_REPOSITORY.CREATE_BASELINE (start_snap_id => 471,
end_snap_id => 473, baseline_name => 'peak baseline');
Show baselines info example:
select * from DBA_HIST_BASELINE;
select * from table(DBMS_WORKLOAD_REPOSITORY.SELECT_BASELINE_DETAILS (1));
select * from table(DBMS_WORKLOAD_REPOSITORY.SELECT_BASELINE_METRIC ('peak baseline'));
1.5 Useful SQL to query AWR data
1) Top CPU consuming Session in last 5 minutes
SELECT session_id,
       COUNT(*)
FROM   v$active_session_history
WHERE  session_state = 'ON CPU'
       AND sample_time > sysdate - ( 5 / ( 24 * 60 ) )
GROUP  BY session_id
ORDER  BY COUNT(*) DESC;
2) Top Waiting Session in last 5 minutes
SELECT session_id,
       COUNT(*)
FROM   v$active_session_history
WHERE  session_state = 'WAITING'
       AND sample_time > sysdate - ( 5 / ( 24 * 60 ) )
GROUP  BY session_id
ORDER  BY COUNT(*) DESC;
3) Top Waiting Event in last 5 minutes
SELECT event,
       COUNT(*)
FROM   v$active_session_history
WHERE  session_state = 'WAITING'
       AND sample_time > sysdate - ( 5 / ( 24 * 60 ) )
GROUP  BY event
ORDER  BY COUNT(*) DESC;
4) Top Active Machine in last 5 minutes
SELECT machine,
       COUNT(*)
FROM   v$active_session_history
WHERE sample_time > sysdate - ( 5 / ( 24 * 60 ) )
GROUP  BY machine
ORDER  BY COUNT(*) DESC;
5) Top SESSION by CPU usage, wait time and IO time in last 5 minutes
select
ash.session_id,
ash.session_serial#,
ash.user_id,
ash.program,
sum(decode(ash.session_state,'ON CPU',1,0)) "CPU",
sum(decode(ash.session_state,'WAITING',1,0)) -
sum(decode(ash.session_state,'WAITING',
decode(en.wait_class,'User I/O',1, 0 ), 0)) "WAITING" ,
sum(decode(ash.session_state,'WAITING',
decode(en.wait_class,'User I/O',1, 0 ), 0)) "IO" ,
sum(decode(session_state,'ON CPU',1,1)) "TOTAL"
from v$active_session_history ash,
v$event_name en
where en.event# = ash.event# AND SAMPLE_TIME >  SYSDATE - (5/(24*60))
group by session_id,user_id,session_serial#,program
order by sum(decode(session_state,'ON CPU',1,0));
6) Top SQL by CPU usage, wait time and IO time in last 5 minutes
SELECT ash.sql_id,
       SUM(DECODE(ash.session_state, 'ON CPU', 1, 0))        "CPU",
       SUM(DECODE(ash.session_state, 'WAITING', 1, 0))
    - SUM( DECODE(ash.session_state, 'WAITING',
DECODE(en.wait_class, 'User I/O', 1, 0), 0)) "WAIT",
       SUM(DECODE(ash.session_state, 'WAITING', 
 DECODE(en.wait_class, 'User I/O', 1, 0),  0))        "IO",
       SUM(DECODE(ash.session_state, 'ON CPU', 1,   1))       "TOTAL"
FROM   v$active_session_history ash,
       v$event_name en
WHERE  sql_id IS NOT NULL AND SAMPLE_TIME >  SYSDATE - (5/(24*60))
       AND en.event# = ash.event#
GROUP  BY sql_id
ORDER  BY SUM(DECODE(session_state, 'ON CPU', 1, 0)) DESC;
1.6 Functionality
1) ASH help to trace back the root cause of high sessions:
Check v$active_session_history, it includes “ora active” info every second (it’s retention time depends on how large “_ash_size”)
Check DBA_HIST_ACTIVE_SESS_HISTORY, it includes “ora active” info every 10 seconds (it’s retention time depends on snapshot retention setting)
2) Statistical Baselines
AWR baseline snapshot will not be purged.
AWR baseline help to quickly identify system health
2. Overhead
CPU: No observable overhead.
Disk: A few hundred MB per day on busy databases
Memory: Normally a few hundred MB in shared pool. It’s calculated by formula Max( Min ( No. of CPU * 2 M, 5% of SHARED_POOL_SIZE, 2% of SGA_TARGET) , 1M). It can be set by hidden parameter “_ash_size”
Get the live value by this SQL:
SQL> select bytes/power(1024,2) MB from v$sgastat where name in(‘ASH buffers’) and pool=’shared pool’ ;
In a word, no much overhead.
3. Conclusion
Advantages:
AWR is similar to statspack, no much add value except it’s more easy to use/maintain.
ADDM maybe help in some bad-tuned DB cases, but not in well-tuned DB(IMHO)
ASH : 1) More frequently active sessions data, every second in memory, every 10 seconds in disk.
Other solution: can use home-grown script to check v$session and save it.
2) Its data is easier to use, just query views/tables, and low overhead
Other solution: load the data into table in home-grown script.
Shortcomings:
AWR has no outstanding feature that can help us solve issues, comparing to statspack.
ASH :
1) Persistent records don’t have every second data, instead it’s every 10 seconds.
Solution: set “_ash_disk_filter_ratio” to 1, the cost is using 10 times disk space in SYSAUX.
2) In scenario of DB hang/crash/bounced, ASH info will be lost as long as snapshot interval (1 hour by default).
Remedy: set snapshot interval to 15 minutes, and using “oradebug dump ashdump” if possible, or home-grown script
3) Unknown bugs.
Remedy: proactively apply known patches.
Most importantly, AWR/ASH(in Oracle Diagnostics Pack) are not free.
4. FAQ
Q: What are the steps to install AWR?ASH
A: no extra installation required
Q: What are the steps to use AWR/ASH?
A: Using DBMS_WORKLOAD_REPOSITORY package and check DBA_HIST_* views
Q: What are the common views which contains the information about ASH and what does this data means and some ora scripts to leverage this data
A: V$ACTIVE_SESSION_HISTORY and DBA_HIST_* views. For means,
Q: How to change AWR/ASH data retention period?
The retention period can be set to 1 day to 100 years, default is 7 days. Example set it to 90 days: exec exec DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(retention=>90*1440);
Q: How frequently ASH data flush to disk?
By default, it will flushed to disk every hour(can be changed as follows) or ash buffer is 2/3 full(controlled by hidden parameter _ash_eflush_trigger)
The interval can be set to 10 minutes to 1 year. Example set it to 10 minutes: exec DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(interval=>10);
Q: How to check the retention and flush interval value?
SQL> select * from DBA_HIST_WR_CONTROL;
DBID SNAP_INTERVAL RETENTION TOPNSQL
———- ——————– ——————– ———-
1186199246 +00000 00:10:00.0 +00090 00:00:00.0 DEFAULT
Q: How many ASH data in memory(V$ACTIVE_SESSION_HISTORY) will be flushed to disk (DBA_HIST_ACTIVE_SESS_HISTORY)?
By default, it’s 1/10, controlled by hidden parameter “_ash_disk_filter_ratio” (default is 10)
Check the records will be/already flushed:
SQL> select distinct sample_id, SAMPLE_TIME,IS_AWR_SAMPLE from v$active_session_history where SAMPLE_TIME>sysdate-3/24/60/10;
Change the flush percentage to 1, meaning flush every records :
SQL> alter system set “_ash_disk_filter_ratio”=1;
Q: How frequently ASH will sample from database to memory(V$ACTIVE_SESSION_HISTORY)?
By default, it’s 1 second, controlled by hidden value _ash_sampling_interval (1000).
Q: How to dump ASH info to file when db is hang?
We can dump the ASH info in memory to trace file. Inside the trace file, it has instructions about how to load it into a table.
SQL> oradebug setmypid
SQL> oradebug dump ashdump 5 — This will dump last 5 minute content
SQL> oradebug tracefile_name
Q: Compared with statspack and freecon, what value will it bring to us?
AWR is similar to statspack, no much add value except it’s more easy to use/maintain.
ASH is similar to freecon, advantages 1) More frequently “ora active” data, every second in memory, every 10 seconds in disk. 2) Its data is easier to use, just query views/tables.
Q: Any impacted tools?
If we use statspack at the same time as AWR, no impact
If we replace statspack with AWR, we need to change the “ora sp*” to use AWR data instead.
For ASH, we need to add some ora script or chart tools to use it data more easily.
Q: Any management overhead?

I would say, it’s minimum (beside bugs), less than Statspack (need extra script to create snapshot and drop old ones). For AWR/ASH, we just need to set the parameters like memory, interval and retention time, etc. It’s pretty much self-managed.

No comments:

Post a Comment