Updating to High Performance Order Storage (HPOS) for High Scale WooCommerce Shops

Universal Yums had over 10 million order and subscription records when we transitioned to High Performance Order Storage. Due to our scale, we took a slightly custom migration path for the data, but everything else was very straightforward.

Code Compatibility

First we started with a code audit. The majority of WooCommerce extensions we use had already been marked as HPOS compatible, so this was primarily an audit of our 100k+ lines of custom code and functionality. The HPOS Recipe Book explained all the compatibility items to look for.

To work around some of the performance bottlenecks pre-HPOS, we had implemented clever solutions and overridden default queries to be more efficient. For example, querying for customer orders ran really slow in non-HPOS at our scale because it required a meta data query. Instead, we used the post_author column in the posts table to store the customer id and had query overrides to make use of that.

As part of upgrading to HPOS, we were able to remove a lot of these clunky workarounds since the new data structure is more efficient. For example, the new orders table has a customer_id column that can be queried against directly.

Finding and fixing all the compatibility issues took some time, but having ~500 unit tests for our codebase was a huge help. By looking at where tests broke with HPOS enabled, we were able to identify ~80% of the functionality that needed attention.

We had one third-party plugin (Shareasale) that never updated to support HPOS. Any extension that does not have HPOS support by now is likely abandoned by the developers, so a replacement plugin or alternate solution should be found. For Shareasale, we simply wrote our own integration.

Data Migration

The standard data migration path is to enable the HPOS setting in the settings. WooCommerce will migrate all data in the background to the new structure, but continue to save data to both the new data structure and the old data structure. Once all the data has been fully migrated, the site can switch fully to HPOS and delete the legacy data from the posts table and postmeta table.

At our scale, the idea of duplicate writes made me a bit nervous since it would be difficult to fully test in a staging environment. For instance, what if we got a surge of 500 orders in an hour? Or, worse, what if the migration took longer data than expected and we hit our renewal day when 10,000 orders per hour are generated? Would the site be able to perform the same when doing duplicate writes?

There was also a technical snag. When HPOS is enabled, it shows the progress of the migration on the settings page. Once this migration started, this page would time out and no longer load for us in staging. I believe this is because there is a query that attempts to diff the amount of orders in the posts table against the number of migrated orders in the orders table, and with 10+ million records it takes too long to complete.

For all these reasons, we decided to migrate our data with a script so that we could have more control over the process and monitor progress.

Migrating orders into the HPOS structure went surprisingly fast. We were able to migrate ~3 million orders an hour. Deleting order data from the previous structure (posts and postmeta table) ran a bit faster, ~4 million per hour.

To fully migrate, we first ran a script to migrate all our completed orders and cancelled subscriptions. Then we deleted the legacy data associated with those records from the posts table and postmeta table. We figured customers could get by for a day without accessing this.

Then, starting at 3am (which is our lowest traffic time), we put the site into maintenance mode and completed the migration for our remaining active subscriptions and orders. When this was complete, we deleted the remainder of the legacy data, switched the site into HPOS, and brought it our of maintenance mode.

Once the site was live, orders continued to flow in as normal. We had a few minor issues to fix due to the new URL structure for orders and subscriptions in the admin dashboard, but it was otherwise surprisingly smooth.

When renewals happened later in the month, we did have some unexpected issues with deadlocks in the address table. However, I suspect this is an issue that only very high scale sites will have when inserting thousands of records per hour. For this specific problem, the WooCommerce team has been highly responsive and will have a fix out in the upcoming 9.0 release.

Benefits of HPOS

The Universal Yums website had been highly optimized for performance with legacy data storage, so we didn’t see a real noticeable performance boost by switching to HPOS. However, for large sites that haven’t had a dedicated technical team tuning performance this switch is likely to be very helpful.

One benefit of HPOS we have experienced is we can now easily export a lean version of the database for local development that doesn’t include 100GB of customer data. When orders and posts were mixed in the posts table and postmeta table, it was very time consuming to extract and delete this data to build a manageable local database. Now we can export the latest data on the fly by exporting specific tables, or by creating fresh database copy from recent staging data.

Most importantly, we’re now confident our WooCommerce site will continue to scale as our data grows. Although we had solved all our performance bottlenecks up to 10 million order mark, we were concerned how long queries might take (especially to the postmeta table) as we continued to scale. With the new table structure, we no longer have those long-term concerns about performance.

Leave a Reply