Decoding the Magento 2.3 Update: The Unexpected Surge in Database CPU Load and the Hidden Culprit Within Search Queries

Decoding the Magento 2.3 Update: The Unexpected Surge in Database CPU Load and the Hidden Culprit Within Search Queries

Decoding the Magento 2.3 Update: The Unexpected Surge in Database CPU Load and the Hidden Culprit Within Search Queries

The Magento 2.3 update, hailed for bringing new features into the mix, surprisingly unleashed a sizable surge in database CPU load, casting unexpected shadows on the website's overall performance. At the heart of this issue lies the innocuous-looking search query, SELECT DISTINCT COUNT(*) FROM search_query WHERE (store_id = 1) AND (num_results > 0), which due to its design and execution, is unduly stressing the database, especially those burdened with million-record search_query tables. This blog post delves into the complexities of this problem, exploring the root cause, the role of the DISTINCT operator, and potential strategies for mitigation, while raising questions about the persistence of the problem in subsequent Magento versions.

Unmasking the Culprit: The Impact of the Magento 2.3 Update on Database CPU Load

The eagerly anticipated upgrade to Magento 2.3 did more than just introduce novel functionalities โ€“ it unexpectedly lit a fuse, leading to an explosion in database CPU load. Unexpected and disastrous, this sudden surge in processor load was traced back to a single, seemingly harmless search query, SELECT DISTINCT COUNT(*) FROM search_query WHERE (store_id = 1) AND (num_results > 0). On the surface, this query seems innocuous enough, but once unleashed on databases brimming with millions of search_query records, it morphed into a monstrous stress inducer, pushing database performance to its limits. Live websites hosting 2.7 million search terms felt the brunt of this onslaught, with the query's execution promoting noticeable strains on the database and causing drastic slowdowns.

The Search Query and its Role: Diving into the Code

The Magento 2.3.0 update introduced the Popular Search Term Cache, and with it came the problematic search query, effectively turning the promising update into a ticking time bomb. The query's execution time was directly proportional to the size of the search_query table โ€“ the larger the table, the longer it took for the query to complete. The primary delay instigator within the query was the num_results > 0 check, acting as an anchor and dragging the query's execution time into the abyss.

An in-depth analysis of the issue confirmed that removing the num_results > 0 check resulted in a significant reduction in query duration. This revelation suggested that the culprit was, indeed, located deep within the code of the update. However, rather than reverting the PR, a consensus was reached that overriding the execute function might be a more efficient solution to ameliorate the performance.

The DISTINCT Operator: An Unnecessary Burden in the Query

Beyond the num_results > 0 check, another area of concern identified within the problematic query was the use of the DISTINCT operator. This operator, in this context, was redundant at best and performance-draining at worst. Our search_query table, given the unique constraint on query_text and store_id, renders the DISTINCT operator unnecessary, leading to nothing more than an extra, unwarranted burden on the system.

The removal of the DISTINCT operator from within the Magento\Search\Model\ResourceModel\Query\Collection resulted in marked performance improvements, confirming its detrimental impact on query execution. This discovery underscored the importance of thoroughly optimizing and debugging new code, particularly when dealing with large and potentially taxing databases, and further cemented the DISTINCT operator as a thorn in the side of Magento's performance.

Tackling the Performance Issue: Strategies for Improving Query Execution

The conundrum lies in the hands of the particular search query responsible for the heightened database CPU load. Digging deeper into the problem, it becomes clear that the central culprit in this situation is the num_results > 0 check. The query's completion time expands significantly with this check in place, creating an unnecessary burden on the system.

One may argue that removing the num_results > 0 part could improve the query duration, but this is only a band-aid solution. It doesn't address the core problem. A more comprehensive approach would include overriding the execute function to boost performance. This approach, however, may require advanced technical skills and a deep understanding of Magento's codebase.

Another promising solution lies in the realm of indexing. The search_query table, which can host millions of records and is a central factor in the slow query performance, lacks an index. By adding an index, it is plausible to see a significant reduction in CPU load and overall improved query performance.

Persisting Problems: The Performance Issue in Later Magento Versions

Regrettably, the performance issue witnessed in Magento 2.3 is not an isolated incident. It continues to plague later versions like Magento 2.4.1-p1 and even the latest 2.4-develop branch. This persistence of the issue indicates a fundamental flaw in the platform's design, specifically related to the caching of popular search terms.

The DISTINCT operator, while ostensibly helping to avoid duplicate entries, is exacerbating the slow query performance. This is particularly true for websites with a high cardinality of search terms. Despite having a unique constraint on query_text and store_id, the DISTINCT operator is unnecessarily employed, causing an unwarranted CPU load.

Seeking the Solution: Potential Mitigations and Long-term Strategies for Magento Updates

As we examine potential mitigations, it is crucial to focus on solutions that not only address the current issue but also preemptively mitigate the risk of similar problems in the future. One such strategy is the asynchronous insertion of search terms in batches. This method could help reduce the CPU load while maintaining the search term tracking feature.

Moreover, we could consider selectively inserting only a fraction of the search terms or disabling search term tracking altogether. However, this poses a new challenge โ€“ striking a delicate balance between performance optimization and maintaining the functionality of the popular search term feature.

In the quest for a more robust solution, removing the DISTINCT operator from queries within Magento\Search\Model\ResourceModel\Query\Collection has proven to significantly improve performance. This strategy coupled with the addition of an index on the search_query table, presents a promising way forward.

In conclusion, while Magento's frequent updates often bring new features and improvements, they can sometimes introduce new problems, like the one discussed here. It is crucial for website owners and developers to stay updated with these changes, understand their implications, and be prepared with mitigation strategies. With a holistic understanding of the issues at hand, we can not only solve the current problem but also work towards a future where such performance issues are a rarity.

The unexpected surge in database CPU load following the Magento 2.3 update underscores the critical importance of conducting comprehensive debugging and optimization when introducing new code, especially with large databases. Accordingly, the following steps can be taken to mitigate the effects:

  • Override the 'execute' function to improve the performance and reduce query duration.
  • Consider the addition of an index to the search_query table to cut CPU load and boost system performance.
  • Remove the redundant DISTINCT operator from queries within the Magento\Search\Model\ResourceModel\Query\Collection to further improve performance.

This situation serves as a potent reminder for website owners and developers to stay abreast of updates, understand their implications, and be equipped with effective mitigation strategies. Ultimately, with a deep understanding of the issues at hand and proactive problem-solving, we can aspire to a future where such performance issues are notably less frequent.