Tuesday, September 24, 2019

How to do a quick health check of AWS RDS database

Just because the database is on AWS RDS, it doesn't mean that it won't run slow or get stuck. So when your users complain about the slowness of your RDS database, do the following quick health check:
1- From AWS console, in RDS section, go to your database and then go to Logs and Events tab. From Logs, in case of Oracle check alert log, in case of SQL Server check Error log, for PostgreSQL check postgres log and error log for MySQL database. Check for any errors or warnings and proceed accordingly as per that database engine.


2- If you dont see any errors or warnings or if you want to check in addition, then first check which database instance type you are using. For example for one of my test Oracle databases, it is db.r4.4xlarge.


Go to https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.DBInstanceClass.html and check specifications of this instance type.

For instance, for db.r4.4xlarge, it is :


Instance Class vCPU, ECU, Memory (GiB), VPC Only, EBS Optimized, Max. Bandwidth (Mbps), Network Performance
db.r4.4xlarge 16  53  122   Yes   Yes    3,500     Up to 10 Gbps


So this db.r4.4xlarge has a max bandwidth (throughput) of 437.5 MB/s  (3500 Mbps/8 = 437.5 MB/s). The throughput limit is separate for read and write, which means you’ll get 437.5 MB/s for read and 437.5 MB/s for write.


3- Now go to Monitoring tab of this RDS in your console and check Read Throughput and Write Throughput to see if your instance is touching above threshold. For instance in this case 437.5. If yes, then you know that IO is the issue and you may need to either tune the SQLs responsible or increase instance size.


4- Similarly, from the same monitoring tab check for CPU usage, free memory available and Free storage space to make sure no other threshold is being reached.
5- Also check for Disk Queue Depth. The Disk Queue Depth is the number of IO requests waiting to be serviced. This time spent waiting in the queue is a component of latency and service time. Ideally disk queue depth of 15 or less should be good, but in case you notice latency greater than 10 or milliseconds accompanied by high disk queue depth than that could cause performance issues.


6- Last but not least, reviewing and tuning your SQLs is the biggest optimization gain you can achieve whether your database is in RDS or not.


Hope it helps.

No comments: