Nov 1, 2024

Foundry V12 Support and the Game Launch Page

The Forge is now officially recommending Foundry VTT v12 to all users on the recommended update channel! 🥳

For help on updating to V12 and updating Foundry VTT in general, see our help guide here.

Getting to this point was a long process, and we're quite happy with the improvements we've made in order to make that possible. Before Foundry's first v12 stable release in May, we were already hard at work updating our integrations to smooth out several friction points between the Forge and V12. Following the V12 stable release, we fixed most of these. This includes updates for our tweaks to the /setup and /join pages, a refactor of the Foundry log viewer, and fixes for premium module installations, user manager, and shared asset libraries.

The majority of that work was complete within about one month, near the end of June. After that we continued to make small improvements and kept an eye out for further issues with V12. One major issue remained - Foundry VTT was taking much longer to launch V12 worlds on The Forge! We initially thought this was a bug and raised it with the Foundry team during the summer. After some discussion, we learned that it was an intentional change on Foundry's part which had that side effect, and that the behavior would be staying. That put the ball back into our court in August to figure out a solution for V12 users on the Forge.

In the end, we solved it by implementing our new game launch page, which is a lot more user-friendly than our previous launch experience and even displays the progress of loading the world as Foundry VTT launches. But this was more complicated than it sounds, and developing a solution for the longer world launch times turned out to be the greatest challenge of our V12 integration. We describe our journey to get here below.

The Issue

Warning: this article is going to get very technical, so get ready for a lot of jargon headed your way. Let's start off with an explanation of what changed within Foundry VTT, and why the longer launch time required a unique solution for our architecture.

As you may have noticed, Foundry VTT performs a data model migration when the core version changes. However, there have been historical problems with data model migrations not being performed in some edge cases. These tend to crop up when new core versions are released, making users frustrated or more hesitant to upgrade. To ensure this isn't a problem anymore, Foundry now performs the data model migration every time a world is launched to make sure everything in that world is definitely using the latest data format.

While that change had very little impact on users running Foundry VTT on their local instances, it makes a world of difference when running Foundry VTT on a distributed infrastructure using networked storage devices. The massive number of reads from the world's database, followed by writing the data back, was causing hundreds of thousands of read and write operations during the world launch. Each operation had to be sent over the network to where the data actually resides. Networked storage can support huge transfer speeds, but it takes quite a performance hit when the workload is a very large number of extremely small reads and writes, especially when they are done sequentially. For example, we measured that a fresh pf2e world took about 200,000 disk operations to launch on v12. On a local SSD, each of those is likely to take tens of microseconds. On a very fast networked storage device with 0.2 ms latency, that is still roughly a 10 times increase in latency per operation, so a game that would load locally in 10 seconds would now spend 100 seconds waiting on the network alone! Foundry, or any other Node.js application using a NAS-mounted device for storage, would face similar limitations. The I/O latency is always going to be the limiting factor.
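
To make the arithmetic above concrete, here is a rough TypeScript sketch. It assumes that every operation runs strictly one after another and that a local SSD operation takes 20 µs (a stand-in for "tens of microseconds"). The function and variable names are illustrative only.

```typescript
// Time spent purely waiting on storage latency when operations run
// sequentially (no overlap between them).
function sequentialIoWaitSeconds(operations: number, latencyMicros: number): number {
  return (operations * latencyMicros) / 1000000;
}

// Disk operations measured for a fresh pf2e world launching on v12.
const launchOperations = 200000;

// Assumption: 20 µs per operation on a local SSD.
const localWaitSeconds = sequentialIoWaitSeconds(launchOperations, 20);

// 0.2 ms = 200 µs per operation on very fast networked storage.
const networkWaitSeconds = sequentialIoWaitSeconds(launchOperations, 200);

console.log(localWaitSeconds, networkWaitSeconds, networkWaitSeconds / localWaitSeconds);
```

Under these assumed figures, latency alone accounts for about 4 seconds locally and about 40 seconds on networked storage, which is the same tenfold ratio described above.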

The Challenge

When The Forge started, Foundry VTT was a simpler application, and launching a world rarely took more than a couple of seconds. Because it was so fast to start up, initial plans for a game launch page (something to show between clicking "Launch Game" and Foundry being ready) were deemed unnecessary and shelved. Over time, as worlds and game systems became larger, and Foundry added more checks and safeguards to improve its robustness, that time to launch has lengthened. Still, it wasn't too bad unless you were migrating between core versions, and the world migration page we added in the V10 era helps a lot in that case.

However, with the V12 change to apply data model migrations on every launch, a world which previously launched in five to ten seconds could now take over 2 minutes! This became a serious usability problem, since the entire launch process (from opening the game URL to Foundry responding to the request) was handled by a single blocking HTTP request. If clicking a link takes 10 seconds to respond, that is acceptable for most users, though not a great experience. If it takes 2 minutes, it becomes a terrible user experience, and the request is also likely to time out.

The Solution

Because of this, a game launch page suddenly became a critical and necessary feature. At the root of it, we needed our own page which sits in between a game launch request and Foundry VTT itself. When a user clicks "launch" (or accesses the game URL when it is not online yet), they see the launch page. Once Foundry VTT is ready, it redirects them. Sounds easy enough, right?

Adding a State to the Machine

Unfortunately, this kind of refactor is far from trivial. A large part of what the Forge does is manage the Foundry VTT process - often for thousands of users at once. To accomplish that we have layers of systems which spread that load, and internal system calls to allow our servers to communicate with each other. All of this is tracked by essentially a state machine for each game, stored in the database.

What we needed now was to add a new "starting" state to that machine, check every spot in the code where we use the game state to see whether a check for "online" now needs to include a check for "starting", and of course refactor our Foundry VTT process management to set the game to starting, and then to online once the appropriate steps were completed. In this implementation, we also had to take care not to have a situation where the process gets stuck in the "starting" stage, or ends in an unexpected state that breaks some internal conventions and triggers unforeseeable errors.
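
A minimal sketch of such a state machine, in TypeScript, might look like the following. Only the "starting" and "online" state names come from the description above. The "offline" state, the event names, the timeout, and the helper functions are hypothetical and are not The Forge's actual code.

```typescript
// "starting" and "online" are named in the article; "offline" is an assumption.
type GameState = "offline" | "starting" | "online";

// Hypothetical events that move a game between states.
type GameEvent = "launch-requested" | "foundry-ready" | "launch-failed" | "idle";

// Allowed transitions. Any event not listed for a state leaves it unchanged.
const transitions: Record<GameState, Partial<Record<GameEvent, GameState>>> = {
  offline: { "launch-requested": "starting" },
  starting: { "foundry-ready": "online", "launch-failed": "offline" },
  online: { idle: "offline" },
};

function nextState(state: GameState, event: GameEvent): GameState {
  return transitions[state][event] ?? state;
}

// Call sites that used to check only for "online" may need this broader check.
function hasRunningProcess(state: GameState): boolean {
  return state === "online" || state === "starting";
}

// Guard against a game that never leaves "starting".
function isStuckStarting(
  state: GameState,
  startedAtMs: number,
  nowMs: number,
  timeoutMs: number,
): boolean {
  return state === "starting" && nowMs - startedAtMs > timeoutMs;
}
```

Keeping the transitions in one table makes it easier to see which states are reachable, and the explicit timeout check is one way to recover a game that would otherwise sit in "starting" forever.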

Another thing to note is that we not only needed to show the launch page on specific requests, but also needed to hold the other HTTP requests and proxy them directly to Foundry VTT only once the game is ready. This is because some parts of Foundry, like the websocket connection or some images and JavaScript/CSS files, cannot work if their requests redirect to an HTML page.

Progress Bars? Yes, Please!

Further complicating this, we also wanted to change how we launch Foundry VTT as well, at least for V11 and newer worlds. As some users may have noted, Foundry VTT started showing a world launch progress bar in its own setup screen as of V11. This progress bar uses websocket data sent from the Foundry VTT server, and any authenticated client can listen to that.

We figured we could make use of that same data on our launch page, and of course, showing users a real progress bar is preferable to a loading spinner! But again, this ended up requiring significant refactoring of how we manage and interact with the Foundry process.

Previously, we told Foundry VTT which world to launch at the same time the process was started (using the --world command line argument), but we quickly learned that Foundry doesn't send any progress messages if you launch a world this way. It doesn't even start listening for network requests until the world launch is finished! That made the old approach a complete non-starter for the launch page, at least for V11 and newer.

We refactored the launch process to not specify the world (launching Foundry VTT to the setup page), and then our server connects and sends a command as though you clicked the launch button on a world. Then, Foundry VTT starts sending the progress messages, so our launch page can listen to them and show our own progress bar!
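
The choice between the two launch approaches can be sketched as a small version check. This TypeScript example is illustrative only. It assumes that worlds older than V11 keep the original --world launch, which the text above implies but does not state outright, and the type and function names are hypothetical.

```typescript
// "world-argument": start Foundry with the --world command line argument.
// "setup-then-command": start on the setup page, then send a launch command
// so that progress messages become available.
type LaunchMode = "world-argument" | "setup-then-command";

// Read the leading number of a version string such as "12.331" or "0.8.9".
function majorVersion(version: string): number {
  const major = parseInt(version.split(".")[0] ?? "", 10);
  return isNaN(major) ? 0 : major;
}

// Assumption: only V11 and newer use the setup-page launch.
function chooseLaunchMode(version: string): LaunchMode {
  return majorVersion(version) >= 11 ? "setup-then-command" : "world-argument";
}

console.log(chooseLaunchMode("12.331")); // "setup-then-command"
console.log(chooseLaunchMode("0.8.9")); // "world-argument"
```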

This was very delicate to achieve. Launching Foundry without the --world option means that it starts on the setup page, which could be locked with an admin authentication key. The launch could also fail and leave a game stuck on the setup page, which a user with the Game Manager enabled should not have access to.

We also need to properly recognize when the user is navigating to the game's page and redirect them to the launch page instead. Resource requests, as mentioned above, should continue to use the blocking HTTP request until the game is "online". Websocket connections, however, should be allowed to pass through even before the game is online, since the connection has to succeed for us to receive those progress messages from Foundry.
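
These routing rules can be summarized as a small decision function. The TypeScript sketch below is a simplification under stated assumptions: the request kinds, decision names, and the "offline" state are hypothetical, and holding websocket connections while no process exists is a guess rather than documented behavior.

```typescript
type GameState = "offline" | "starting" | "online";
type RequestKind = "navigation" | "resource" | "websocket";
type RouteDecision = "proxy-to-foundry" | "show-launch-page" | "hold-until-online";

function routeRequest(state: GameState, kind: RequestKind): RouteDecision {
  // Once the game is online, everything goes straight to Foundry VTT.
  if (state === "online") return "proxy-to-foundry";

  // A user opening the game URL sees the launch page instead of waiting.
  if (kind === "navigation") return "show-launch-page";

  // Websockets pass through while starting so progress messages can arrive.
  if (kind === "websocket" && state === "starting") return "proxy-to-foundry";

  // Images, scripts, and stylesheets cannot be redirected to an HTML page,
  // so they keep the blocking behavior until the game is online.
  return "hold-until-online";
}

console.log(routeRequest("starting", "navigation")); // "show-launch-page"
console.log(routeRequest("starting", "websocket")); // "proxy-to-foundry"
console.log(routeRequest("starting", "resource")); // "hold-until-online"
```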

But Wait, What About the Classics?

Excellent - so now we have a refactored game state machine, Foundry launch process, and a launch page! Ready to ship... right?

Well, no. If only it were so easy! Now that we had almost completely changed how game launches work, we had to do the really hard work - regression testing. As of this writing, we allow new users to select Foundry versions as old as 0.8.9, and we have a policy of not kicking users off of even older versions until they decide to update themselves.

This means that we had to make sure the new launch process and the game launch page work with Foundry all the way from 0.6.6 up to 12.331! As most users are no doubt aware, even the bravest and most indefatigable Foundry community developers usually do not actively maintain support for more than one or two major Foundry VTT versions at a time, as the behavior of Foundry VTT has evolved significantly over its life.

In addition to that, there are a number of contexts for us to launch Foundry VTT in, each of which presents its own little wrinkles and needed to be tested. You could have Game Manager enabled or disabled; with Game Manager disabled, we'll still remember the world you had active when your server idled and restore that when you launch next time; with Game Manager enabled you could launch multiple games if you have multiple licenses attached to your account. Even switching Game Manager off or on needs to idle all of the games for the other mode so that your license doesn't get stuck attached to a game you can't access!

And, of course, we still had to support Foundry premium package installations - because of how Foundry VTT premium packages are secured, we need to launch Foundry and instruct it to install the package each time, and then shut down Foundry again afterward. If the installation handler doesn't account for the new behavior of the Foundry launch process, we could end up breaking premium package installs/updates - an extremely negative experience for our users.

The Finish Line

Writing the changes to deal with all of those scenarios, and then running through our test suite for each of our internal release candidates turned into a very long process. Many times we would fix something for one scenario, but introduce problems for another. We had to expand our QA infrastructure to allow us to properly test game launches across remote servers (like what happens in production for our users). We thought of even more test cases along the way - like public games, or ones with admin keys set - which resulted in a few more iterations.

And every time we made changes, we had to rerun our entire QA suite from scratch. This meant testing from 0.6.6 to v12, with and without Game Manager, with and without the administrator authentication password, testing world migrations, upgrades and downgrades, premium installs, and more!
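
To give a sense of how quickly those combinations multiply, here is a small TypeScript sketch that enumerates a test matrix. It covers only three of the dimensions mentioned above, uses just the version numbers named in this article, and its names are hypothetical. It is not the actual QA suite.

```typescript
interface LaunchScenario {
  version: string;
  gameManager: boolean;
  adminKey: boolean;
}

// Build every combination of version, Game Manager setting, and admin key.
function buildScenarios(versions: string[]): LaunchScenario[] {
  const scenarios: LaunchScenario[] = [];
  for (const version of versions) {
    for (const gameManager of [false, true]) {
      for (const adminKey of [false, true]) {
        scenarios.push({ version, gameManager, adminKey });
      }
    }
  }
  return scenarios;
}

const launchScenarios = buildScenarios(["0.6.6", "0.8.9", "12.331"]);
console.log(launchScenarios.length); // 12
```

Even this reduced matrix yields 12 launches per run, before adding world migrations, upgrades and downgrades, or premium installs.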

We thought we were ready during the week of October 14th, and started deploying our shiny new game launch page. Unfortunately, right at the end of that deployment we noticed a few critical bugs which had slipped through our earlier testing and had to roll back the release. (A few users got a little sneak peek that day!) We iterated some more, added more use cases to our QA suite, fixed those issues, reran the full QA a few more times, and then finally released it for good the following week.

In the end, we made it! The game launch page is now live for all Forge users, and has even received a couple of patches since the initial release to fix a few non-critical bugs. We're very happy with how it is performing and the much-improved user experience it provides, and based on the user feedback we've received, our users are happy too! The game launch experience was the last piece of the puzzle we were working on to feel comfortable marking V12 as the recommended version to run on The Forge. Now that it is done, we are recommending that users upgrade to Foundry v12.

Happy gaming!