In e-commerce, an efficient and stable platform is critical to staying competitive. The database is the core of that platform, and its performance directly affects user experience and how smoothly the business runs. This article explores how to use Guance Cloud to monitor and optimize the performance of Amazon Web Services RDS (MySQL) and ElastiCache (Redis) in e-commerce scenarios.
Background
E-commerce platforms usually face huge user traffic, large-scale data processing requirements, and high-concurrency transaction operations. These challenges require databases to:
- Respond quickly to user queries and ensure fast page loading speeds.
- Efficiently process transactions and support high-concurrency scenarios such as flash sales and rush purchases.
- Operate stably and ensure data consistency and integrity.
For this reason, monitoring and optimizing database performance has become a key task in e-commerce operations. As a partner of Amazon Web Services, Guance Cloud provides solutions that let customers build observability quickly, helping them improve code quality and the stability and reliability of their business systems. Below we introduce how to quickly build performance monitoring for MySQL and Redis with Guance Cloud.
Metric Collection
CloudWatch is a monitoring service provided by Amazon Web Services. Developers and system administrators can use it to collect, monitor, and analyze log files and metric data for Amazon Web Services infrastructure and applications. Guance Cloud obtains basic data by integrating with CloudWatch, then applies its own data cleaning, tagging, and custom visualization capabilities so that developers and system administrators can build customized, unified observability dashboards and construct cloud service observability scenarios more flexibly. In addition, Guance Cloud can collect database performance metrics directly through DataKit, which complements the metrics available from CloudWatch.
In the unified observability scenario for RDS database-related services, we first need to bring the metrics of the various cloud services into Guance Cloud, in preparation for the custom dashboards built later. There are two components for data collection. The first is DataKit, the Guance Cloud data collection suite: with DataKit deployed in the user environment, users can send RDS MySQL or Redis metrics to Guance Cloud. The second is the Guance Cloud DataFlux Func data development platform (referred to as Func), which obtains cloud service metrics by calling the Amazon Web Services API from scripts. Below we introduce both approaches.
Collection via Guance Cloud DataKit
Configuration steps:
- First, make sure you have registered a Guance Cloud account, and allocated a workspace for the cloud resources you need to monitor
- Install DataKit on a virtual machine
- Go to the conf.d/db directory under your DataKit installation path, copy mysql.conf.sample and rename it to mysql.conf
- Go to the conf.d/db directory under your DataKit installation path, copy redis.conf.sample and rename it to redis.conf
- Configure the connection parameters in the configuration file, then restart DataKit
- Log in to Guance Cloud to view the MySQL and Redis dashboards
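The two copy steps in the list above can be scripted. The following is a minimal sketch, not an official Guance Cloud tool: the helper name enable_collectors is hypothetical, and the installation path passed to it is whatever your environment uses.

```python
import shutil
from pathlib import Path


def enable_collectors(datakit_home, names=("mysql", "redis")):
    """Copy conf.d/db/<name>.conf.sample to <name>.conf without overwriting.

    `datakit_home` is the DataKit installation path on your host.
    Returns the list of config files that were newly created.
    """
    db_dir = Path(datakit_home) / "conf.d" / "db"
    created = []
    for name in names:
        sample = db_dir / f"{name}.conf.sample"
        target = db_dir / f"{name}.conf"
        if not sample.exists():
            raise FileNotFoundError(f"sample config not found: {sample}")
        if not target.exists():  # keep a config that was already edited
            shutil.copyfile(sample, target)
            created.append(target)
    return created
```

Calling enable_collectors with your DataKit installation path creates the two files if they are missing. You still need to edit the connection parameters in each file and restart DataKit afterwards.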
Compared with obtaining cloud service metrics through the Amazon Web Services API via scripts, collecting data directly through DataKit lets Guance Cloud display more metrics in its standard views and provides additional information. For example, direct collection from RDS can also obtain MySQL slow query logs, general logs, and error logs, which can be analyzed together with metrics in the Guance Cloud UI. Direct collection from Redis provides analysis of fields such as redis_bigkey and redis_hotkey. Combined with the collected Redis metrics and slow query logs, this helps engineers analyze performance and locate problems.
Collection via Guance Cloud DataFlux Func
Configuration steps:
- First, make sure you have registered a Guance Cloud account, and allocated a workspace for the cloud resources you need to monitor
- Visit func.guance.com to get the DataFlux Func environment installation command, and complete the Func installation
- Log in to the Func page, complete the initial configuration, then enter the official script library
- Search for and install the collection script corresponding to MySQL: "Guance Cloud Integration (RDS Collection)" (ID: guance_rds)
- Search for and install the collection script corresponding to Redis: "Guance Cloud Integration (MemoryDB Collection)" (ID: guance_memorydb)
- After clicking [Install], configure Amazon Web Services access permission parameters for each script, including Amazon AK and Amazon account name
- Click [Deploy Startup Script], the system will automatically create a Startup script set, automatically configure the corresponding startup script, and start automatic metric collection
- Log in to the Guance Cloud platform, navigate to the corresponding workspace, and enter "Metrics" - "Metric Management" to check whether the corresponding monitoring data is available
- View the RDS MySQL dashboard
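To give a sense of what "accessing the Amazon Web Services API through scripts" involves, here is a hedged sketch of a CloudWatch query for one RDS metric. It is not the code of the guance_rds script. The function names and the instance identifier in the usage note are hypothetical, and fetch_rds_metric assumes boto3 is installed and AWS credentials are configured.

```python
from datetime import datetime, timedelta, timezone


def rds_metric_request(instance_id, metric_name="CPUUtilization",
                       minutes=60, period=300, now=None):
    """Build keyword arguments for the CloudWatch GetMetricStatistics call."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,  # seconds covered by each datapoint
        "Statistics": ["Average"],
    }


def fetch_rds_metric(instance_id, region, **kwargs):
    """Query CloudWatch and return datapoints ordered by time."""
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("cloudwatch", region_name=region)
    request = rds_metric_request(instance_id, **kwargs)
    response = client.get_metric_statistics(**request)
    return sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
```

For example, fetch_rds_metric("shop-db", "us-east-1") would return the average CPU utilization of a hypothetical instance named shop-db over the last hour in five-minute buckets.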
Log Collection
Metrics are quantitative data that measure the current resource status and performance of a service. They suit real-time status checks and other numerical use cases, such as alert rules based on thresholds. Logs, on the other hand, provide detailed event and operation records for troubleshooting, security auditing, performance analysis, and similar purposes. The two are usually combined to monitor and manage resources and applications in a cloud environment comprehensively. Therefore, in addition to collecting cloud service metrics, we also need to collect log data to broaden the observability of cloud services.
For database services, a typical analysis scenario is performance optimization and troubleshooting. Slow query logs help identify query statements that take a long time to execute in the database, so you can find performance bottlenecks and potential optimization opportunities. By analyzing slow query logs, users can identify indexes that need optimization, caching problems, complex queries and other issues, to improve database performance and troubleshoot database failures.
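As an illustration of slow query log analysis, the sketch below parses the statistics line that MySQL writes for each slow log entry and ranks the entries by execution time. It is a simplified example, not how Guance Cloud processes logs. The sample log text, table names, and function name are made up for the demonstration.

```python
import re

# Matches the per-entry statistics line of the MySQL slow query log.
HEADER = re.compile(
    r"# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: (?P<lock_time>[\d.]+)\s+"
    r"Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)

SAMPLE = """\
# Time: 2024-05-01T10:00:00.000000Z
# User@Host: shop[shop] @  [10.0.0.5]  Id:    41
# Query_time: 1.200000  Lock_time: 0.000050 Rows_sent: 1  Rows_examined: 1200
SET timestamp=1714557600;
SELECT id FROM products WHERE sku = 'A1';
# Time: 2024-05-01T10:00:05.000000Z
# User@Host: shop[shop] @  [10.0.0.5]  Id:    42
# Query_time: 3.500000  Lock_time: 0.000100 Rows_sent: 10  Rows_examined: 500000
SET timestamp=1714557605;
SELECT * FROM orders WHERE user_id = 7;
"""


def parse_slow_log(text):
    """Return one dict per slow-log entry, slowest first."""
    entries, current = [], None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {
                "query_time": float(match["query_time"]),
                "lock_time": float(match["lock_time"]),
                "rows_sent": int(match["rows_sent"]),
                "rows_examined": int(match["rows_examined"]),
                "sql": "",
            }
            entries.append(current)
        elif line.startswith("# Time:"):
            current = None  # a new entry begins; wait for its statistics line
        elif current is not None and not line.startswith(("#", "SET timestamp")):
            current["sql"] = (current["sql"] + " " + line.strip()).strip()
    return sorted(entries, key=lambda e: e["query_time"], reverse=True)
```

An entry where rows_examined is far larger than rows_sent, like the second one in the sample, is a common hint that the query scans many rows it does not return and may benefit from an index.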
As shown in the figure below, Guance Cloud can implement storage, query, analysis and visualization services by collecting RDS logs from Amazon CloudWatch Logs.
- Export RDS logs to Amazon CloudWatch Logs.
- Configure the Lambda function and add a trigger. Guance Cloud Lambda function address: https://github.com/GuanceCloud/terraform-guance-forwarder/tree/main/examples/xxx
- Add environment variables.
- Transmit data to the DataKit machine and open port 9529.
- Log in to Guance Cloud to view logs.
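When CloudWatch Logs triggers a Lambda function through a subscription, the log records arrive base64-encoded and gzip-compressed under the awslogs.data key of the event. The sketch below shows only that decoding step. It is not the code of the Guance Cloud forwarder, and the function names are hypothetical.

```python
import base64
import gzip
import json


def decode_cloudwatch_logs_event(event):
    """Unpack the base64-encoded, gzip-compressed payload of a subscription event."""
    payload = gzip.decompress(base64.b64decode(event["awslogs"]["data"]))
    data = json.loads(payload)
    return [
        {
            "log_group": data["logGroup"],
            "log_stream": data["logStream"],
            "timestamp_ms": item["timestamp"],
            "message": item["message"],
        }
        for item in data.get("logEvents", [])
    ]


def handler(event, context):
    """Lambda entry point: decode the records and report how many were found."""
    records = decode_cloudwatch_logs_event(event)
    # Sending `records` on to DataKit (port 9529) is the forwarder's job and
    # is intentionally not reproduced in this sketch.
    return {"decoded": len(records)}
```

Keeping the decoding in its own function makes it easy to test locally with a hand-built event before deploying the trigger.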
Build Performance Dashboard
Once basic data such as cloud service metrics and logs is in place, users can log in to their Guance Cloud workspace and build a custom dashboard with the scenario dashboard builder. The dashboard supports real-time monitoring, fault diagnosis, performance optimization, trend analysis, and collaboration in a visual way, helping users understand and manage the operating status of the system. Users can combine and filter the displayed metrics and logs according to their own needs, using the flexibility of Guance Cloud's data integration to build a dashboard that fits their habits and requirements.
You can create a new Guance Cloud dashboard from scratch. To reduce development workload, you can also start from the basic templates provided by Guance Cloud and modify them. Standard templates can be selected in "Scenarios" - "Dashboards". For example, to monitor MySQL and Redis, enter the corresponding keywords to create the matching monitoring dashboard with one click.
RDS Template
Guance Cloud provides a standard template for RDS monitoring views, including metrics such as query response time, number of concurrent connections, transaction throughput, slow queries, and buffer pool usage. As shown in the RDS collection metric figure above, this template provides more metric analysis than the RDS console metrics, and can be combined with RDS logs for further visual analysis.
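As an example of how a metric such as buffer pool usage can be derived, the sketch below computes usage and hit ratio from InnoDB status counters as returned by SHOW GLOBAL STATUS. This is a common way to calculate these values. It is an illustration only and not necessarily the formula the template uses. The function name is hypothetical.

```python
def buffer_pool_stats(status):
    """Derive buffer pool usage and hit ratio from SHOW GLOBAL STATUS values."""
    total = int(status["Innodb_buffer_pool_pages_total"])
    free = int(status["Innodb_buffer_pool_pages_free"])
    requests = int(status["Innodb_buffer_pool_read_requests"])  # logical reads
    disk_reads = int(status["Innodb_buffer_pool_reads"])        # reads that hit disk
    usage = (total - free) / total if total else 0.0
    hit_ratio = 1 - disk_reads / requests if requests else 1.0
    return {"usage": usage, "hit_ratio": hit_ratio}
```

A hit ratio that drops during a flash sale suggests the working set no longer fits in the buffer pool and queries are falling back to disk reads.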
Redis Template
Guance Cloud provides a standard template for Redis monitoring views, including metrics such as hit rate, memory usage, read/write operation rate, redis_bigkey (large keys), and redis_hotkey (hot keys). This template offers more metric analysis than the ElastiCache console and can be combined with Redis logs for further visual analysis.
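The hit rate mentioned above is typically derived from the keyspace_hits and keyspace_misses counters in the Redis INFO output. The sketch below shows the calculation. The function names are hypothetical, and this is not necessarily how the template computes its value.

```python
def redis_hit_rate(info):
    """Cumulative hit rate since the server started, or None if no lookups yet."""
    hits = int(info["keyspace_hits"])
    misses = int(info["keyspace_misses"])
    lookups = hits + misses
    return hits / lookups if lookups else None


def hit_rate_between(previous, current):
    """Hit rate over the interval between two INFO samples.

    The counters are cumulative, so the difference between two samples
    reflects only the lookups made in between.
    """
    hits = int(current["keyspace_hits"]) - int(previous["keyspace_hits"])
    misses = int(current["keyspace_misses"]) - int(previous["keyspace_misses"])
    lookups = hits + misses
    return hits / lookups if lookups else None
```

The interval version is usually more useful on a dashboard, because a long-running instance can show a healthy cumulative rate while the most recent traffic mostly misses the cache.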
Custom Dashboard
A key benefit of Guance Cloud is that users can aggregate and correlate data from different sources in a single dashboard according to their own monitoring requirements. Using data tags or time tags, you can filter metrics and log content to the same failure time and display them together, which makes problem analysis more efficient. To do this, open the corresponding dashboard and add a log viewer to the current time-series dashboard by adding a chart. Then edit the log source and select the MySQL slow log or the Redis service log, so that logs and metrics are displayed in the same dashboard.
This data can be linked through page view variables or time tags. When a user selects a time tag or view variable, the metrics and logs update together to show the data that matches the selection. Users can use this function to quickly locate monitoring information around the time of a failure, spend less time searching and waiting for monitoring data, and handle failures more efficiently.
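The idea behind time-based linking can be sketched in a few lines: given a point in time, select the metric points and log lines that fall inside the same window. This is a conceptual illustration only and does not reflect how Guance Cloud implements the feature. The function name and record layout are assumptions.

```python
from datetime import datetime, timedelta


def correlate(metrics, logs, around, window=timedelta(minutes=5)):
    """Return metric points and log lines within +/- window of a point in time.

    Each item in `metrics` and `logs` is a dict with a datetime under "time".
    """
    start, end = around - window, around + window
    return {
        "metrics": [m for m in metrics if start <= m["time"] <= end],
        "logs": [entry for entry in logs if start <= entry["time"] <= end],
    }
```

For example, passing the timestamp of a latency spike as around returns the slow log entries recorded within five minutes of it, alongside the metric points for the same period.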
Data Monitoring and Alerts
In addition to using basic cloud service metrics and logs for real-time status observability, users can also build alert monitors that automatically track system status based on these data, enabling functions such as automatic failure identification and alert notifications. To configure an alert monitor, go to "Monitoring" - "Monitors" - "Create New Monitor" in your workspace, and select different monitor types based on how you want to judge metrics.
Inside the monitor, after configuring the metric source to be monitored, detection conditions, notification content, and notification objects, click save to activate the monitor. If an abnormality occurs in the monitored object later, users can view the details of the corresponding alert through the "Events" menu in the workspace.
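A detection condition of the threshold kind boils down to comparing an aggregate of recent values with configured limits. The sketch below illustrates that logic. The level names, the thresholds in the usage note, and the function name are hypothetical and not taken from the Guance Cloud monitor implementation.

```python
from statistics import mean


def evaluate(values, warning, critical):
    """Classify the average of recent metric values against two thresholds."""
    if not values:
        return "nodata"  # nothing collected in the detection window
    average = mean(values)
    if average >= critical:
        return "critical"
    if average >= warning:
        return "warning"
    return "ok"
```

For instance, evaluate([72, 88, 95], warning=70, critical=90) averages the last three CPU utilization samples to 85 and returns "warning". Averaging over a window rather than checking a single sample reduces alerts caused by short spikes.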
For standard cloud service products, Guance Cloud also provides pre-built standard monitoring templates, providing users with an out-of-the-box alert configuration experience.
Summary
By collecting cloud service metrics and logs, Guance Cloud provides users with unified views, correlation analysis, and custom visual charts. Centralizing metrics and log data from different Amazon Web Services offerings in a single dashboard removes the need to switch between separate monitoring consoles and makes monitoring more efficient. Displaying metrics and logs from different sources side by side on custom dashboards lets you analyze data at different levels together, discover potentially related issues, and speed up fault diagnosis and resolution. With custom dashboards, users can build views around their specific concerns and business requirements, making it easier for different users to use monitoring data to keep the system running stably.
This is a discussion topic separated from the original topic at https://juejin.cn/post/7369132466999672867