Unlocking the Bottleneck: How Trimming Redundancies in Magento's Search Queries Can Drastically Improve Your Site's Performance

Unlocking the Bottleneck: How Trimming Redundancies in Magento’s Search Queries Can Drastically Improve Your Site’s Performance

"Unlocking the Bottleneck: How Trimming Redundancies in Magento's Search Queries Can Drastically Improve Your Site's Performance"

The popular platform Magento 2 has been facing persistent performance issues, primarily arising from a substantial CPU load increase following the 2.3.0 update. At the heart of this problem is a frequently used search query, whose execution time significantly strains the database, particularly in situations where millions of search terms are involved. This article will delve into how eliminating redundancies, specifically the DISTINCT operator, from Magento's search queries could drastically improve your site's performance, thereby enhancing user experience and overall functionality.

Unmasking the Performance Issue: A Deep Dive into Magento 2's Search Term Cache Dilemma

Upon a closer examination of the Magento 2.3.0 update, it becomes evident that a significant performance issue has arisen due to the introduction of the Popular Search Term Cache. This feature, although well-intended, led to increased strain on the database, particularly on live sites with a high number of search terms reaching up to 2.7 million. The increased load on the database resulted in slow performance that, despite disabling search suggestions and incorporating Elasticsearch, continued to pose a challenge to the system.

Key to this performance issue is the execution of a time-consuming query: SELECT DISTINCT COUNT(*) FROM search_query AS main_table WHERE (main_table.store_id = 1) AND (num_results > 0). This query, in the context of a long search_query table, took an inordinate amount of time to complete, resulting in high CPU usage and slow response times, affecting both the user experience and the admin panel's functionality.

The Culprit Unveiled: Dissecting the Problematic Query and Its Impact

Further investigations into the problematic query revealed that the num_results > 0 condition was particularly time-consuming. This unnecessary condition added extra time to the query execution, lowering the overall performance of the system. Furthermore, it was found that the search bar continued to hit the MySQL search_query table despite the use of Elasticsearch, leading to a significant increase in the table size and, consequently, a drop in performance.

The impact of this issue is noticeable in various versions of Magento, including 2.4.4-p1 and 2.4.4-p2. The functionality of the admin panel was impaired due to high CPU usage, and keyword searches yielded slow response times. In some instances, the query SELECT DISTINCT COUNT(*) FROM search_query AS main_table WHERE (main_table.store_id = 1) AND (num_results > 0) took an unacceptable 10 seconds to execute, inevitably affecting the user experience.

Pinning Down The Causes: Role of The DISTINCT Operator In Performance Degradation

The DISTINCT operator within the problematic query was identified as a significant contributor to the performance degradation. This operator, which was incorporated into the queries within \Magento\Search\Model\ResourceModel\Query\Collection, served to add unnecessary overhead without affecting the query results. Given the unique constraint on query_text and store_id in the search_query table, the DISTINCT operator was effectively redundant.

The cardinality of the search_query table also emerged as a concern. With this table reaching millions of entries, the performance of the system was bound to degrade. The DISTINCT operator’s removal was seen to improve performance significantly, reducing the execution time of the problematic query from around 10 seconds to 2-3 seconds. The large number of rows in the search_query table, as well as the number of unique terms, was found to impact search performance negatively, underscoring the need for effective solutions to this bottleneck in the system.

The Patch vs. The Performance Issue: An Evaluation of the c90edaa Commit

The Magento community has made several attempts to patch this performance issue. However, the c90edaa commit, initially perceived as a fix, does not appear to solve the underlying problem. This update added more columns to the select query, a step which potentially increased the computation time rather than decreasing it. Despite the introduction of this patch, the issue persisted in Magento versions including 2.4.4-p2, 2.4.5-p4, and even in the 2.4.6 Commerce version.

High CPU usage caused by the query SELECT DISTINCT COUNT(*) FROM search_query AS main_table WHERE (main_table.store_id = 1) AND (num_results > 0) continued to put a significant strain on the overall system performance. This had the adverse effect of affecting the admin panel's functionality, resulting in slow response times, particularly during keyword searches.

Looking Ahead: Ensuring Efficient Search Performance in Future Magento Updates

While the c90edaa commit did not provide the solution anticipated by the Magento community, the identification of the redundant DISTINCT operator offers a promising lead towards resolving the performance issue.

Since the DISTINCT operator does not affect the query results because of the unique constraint on 'query_text' and 'store_id' in the search_query table, removing it could significantly improve query performance. Further, the DISTINCT operator's redundancy adds unnecessary overhead, which could be eliminated to speed up query execution time, thereby improving overall performance.

Possible future solutions for Magento updates could involve modifying the 'isTopSearchResult' function in \Magento\Search\Model\ResourceModel\Query\Collection to remove the DISTINCT operator from the query. Other potential alternative solutions could include only inserting a fraction of the search terms, disabling search term tracking completely, or asynchronously inserting search terms in batches to avoid performance degradation.

Improving the performance and scalability of Magento's search function is crucial as it directly impacts user experience and overall site functionality. Magento developers must continue to investigate and implement innovative solutions that ensure the platform's smooth operation, thereby maintaining its status as a leading eCommerce solution for businesses worldwide.

By unlocking this bottleneck and trimming redundancies in Magento's search queries, we can drastically improve the platform's performance, making it more robust, efficient, and responsive. This not only enhances the end user's experience but also increases the platform's reliability and trustworthiness among the broader eCommerce community.
Thus, we've identified the key contributors to Magento's search performance issue – namely, the redundant DISTINCT operator and the bloated search_query table. By removing the DISTINCT operator, which adds unnecessary overhead, and optimizing the number of entries in the search_query table, we can significantly enhance the system's performance.

  • A modification of the 'isTopSearchResult' function in \Magento\Search\Model\ResourceModel\Query\Collection to remove the DISTINCT operator could be a potential solution.
  • Other alternative solutions include selectively inserting search terms, disabling search term tracking entirely, or asynchronously inserting search terms in batches.

In conclusion, reforming Magento's search function is crucial for enhancing user experience and overall site functionality. The economy of our queries is directly proportional to the success of our eCommerce enterprises. This is not just about optimizing a single aspect of Magento; it is about maintaining Magento's status as a leading solution for businesses worldwide. By implementing these changes, we will not only improve user experience but also gain the trust of the broader eCommerce community, thereby unlocking the bottleneck in Magento's search queries.