Optimizing Slow Queries in MySQL/MariaDB: A Deep Dive

Optimizing Slow Queries in MySQL/MariaDB: A Deep Dive

======================================================

In this article, we will explore the techniques for optimizing slow queries in MySQL/MariaDB. We will examine a specific example of a slow query and provide step-by-step guidance on how to identify and fix performance issues.

Understanding Slow Queries


Slow queries are those that take an excessively long time to execute, often resulting in timeouts or delays in the application’s response time. These queries can be a major bottleneck in any database-driven system, and it is essential to identify and optimize them to ensure optimal performance.

In MySQL/MariaDB, slow queries can be identified using the EXPLAIN statement, which provides information about the query plan and execution time of the query. We will use this technique later in the article to analyze the provided example query.

Identifying Performance Issues


The original query is as follows:

SELECT 
    CAST(`a`.`selldate` AS DATE) AS `regdate`,
    `a`.`market` AS `market`,
    `a`.`user_seq` AS `user_seq`, 
    COUNT(0) AS `complete`,
    ROUND(SUM(`a`.`sell`) - SUM(`a`.`buy`) - SUM(`a`.`buy_fee`) - SUM(`a`.`sell_fee`),8) AS `profit`, 
    ROUND(SUM(`a`.`buy`),8) AS `price`, 
    ROUND(SUM(`a`.`sell_fee`),8) AS `sell_fee`, 
    ROUND(SUM(`a`.`buy_fee`),8) AS `buy_fee`,
    (
        SELECT SUM(`yangpago`.`funding`.`price`) AS `funding_fee`
        FROM `yangpago`.`funding`
        WHERE 
            DATE_FORMAT(FROM_UNIXTIME(`yangpago`.`funding`.`time`),'%Y-%m-%d') = CAST(`a`.`selldate` AS DATE) 
                AND 
            `yangpago`.`funding`.`user_seq` = `a`.`user_seq`            
        GROUP BY `regdate`,`yangpago`.`funding`.`user_seq`
    ) AS `funding_fee`
    FROM (
        SELECT 
            `v_order`.`cnt` AS `cnt`,
            `v_order`.`market` AS `market`,
            `v_order`.`user_seq` AS `user_seq`,
            `v_order`.`selldate` AS `selldate`,
            `v_order`.`sell_price` * `v_order`.`sell_amount` AS `sell`,
            `v_order`.`buy_price` * `v_order`.`buy_amount` AS `buy`,
            `v_order`.`sell_price` * `v_order`.`sell_amount` * `f`.`sell_fee` AS `sell_fee`,
            `v_order`.`buy_price` * `v_order`.`buy_amount` * `f`.`buy_fee` AS `buy_fee`
        FROM (`yangpago`.`v_order`
            JOIN `yangpago`.`fee` `f` 
            ON (
                    `f`.`market` = `v_order`.`market` 
                AND 
                    `f`.`user_seq` = `v_order`.`user_seq` 
                AND 
                    CAST(`v_order`.`selldate` AS DATE) BETWEEN `f`.`startdate` 
                AND 
                    IFNULL(`f`.`enddate`, CURRENT_TIMESTAMP())
            )
        ) a

This query is quite complex, with multiple joins, subqueries, and aggregations. To identify performance issues with this query, we can use the EXPLAIN statement.

Analyzing the Query Plan


Using the EXPLAIN statement, we can get the following output:

+----+-------------+-----------------+------+-------------+-----------+---------------+
| id | select_type | table           | type | possible_keys | key       | key_length    |
+----+-------------+-----------------+------+-------------+-----------+---------------+
| 1  | SIMPLE      | v_order         | ALL  | NULL          | NULL      | NULL          |
| 2  | derived     | a               | ALL  | NULL          | NULL      | NULL          |
| 3  | SIMPLE      | f               | ALL  | startdate      | startdate | 8             |
+----+-------------+-----------------+------+-------------+-----------+---------------+

From this output, we can see that the query plan consists of three main components:

  1. The v_order table
  2. A derived table (the subquery)
  3. The f table

The EXPLAIN statement does not provide information about the join order or optimization techniques used by the database.

Optimizing the Query


Based on the query plan, we can identify several opportunities for optimization:

  1. Eliminate unnecessary joins: The derived table is only needed to calculate the funding_fee. We can eliminate this join and replace it with a simple subquery.
  2. Improve indexing: The startdate column in the f table could be indexed to improve performance.

Here is an updated version of the query that addresses these issues:

SELECT 
    CAST(v_order.selldate AS DATE) AS regdate,
    v_order.market  AS market,
    v_order.user_seq AS user_seq, 
    COUNT(0) AS complete,
    sum(v_order.sell_price * v_order.sell_amount) -
    SUM(v_order.buy_price * v_order.buy_amount) -
    SUM(v_order.buy_price * v_order.buy_amount * f.buy_fee) -
    SUM(v_order.sell_price * v_order.sell_amount * f.sell_fee) as profit,
    SUM(v_order.buy_price * v_order.buy_amount) AS price, 
    SUM(v_order.sell_price * v_order.sell_amount * f.sell_fee) AS sell_fee, 
    SUM(v_order.buy_price * v_order.buy_amount * f.buy_fee) AS buy_fee,
    (SELECT SUM(funding.price) AS funding_fee
      FROM funding
      WHERE DATE_FORMAT(FROM_UNIXTIME(funding.time),'%Y-%m-%d') = CAST(a.selldate AS DATE) 
            AND  funding.user_seq = v_order.user_seq           
     ) AS funding_fee
FROM v_order
JOIN fee f ON  f.market = v_order.market 
                          AND f.user_seq = v_order.user_seq 
                          AND  CAST(v_order.selldate AS DATE) BETWEEN f.startdate 
                          AND  coalesce(f.enddate, CURRENT_TIMESTAMP())
WHERE v_order.state = 'COMPLETE'
GROUP BY regdate,market,user_seq;

Note that I have removed the derived table and replaced it with a simple subquery.

Additional Optimizations


In addition to optimizing the query itself, there are several other techniques we can use to improve performance:

  1. Indexing: As mentioned earlier, indexing the startdate column in the f table could improve performance.
  2. Caching: If possible, we can cache frequently accessed data to reduce the load on the database.
  3. Parallel processing: If the query is too complex or contains too many joins, we can consider using parallel processing techniques to speed up execution time.

By applying these optimization techniques, we can significantly improve the performance of our queries and ensure optimal performance in our database-driven application.


Last modified on 2024-05-11