Improved Chrome Extension UX, again

A few months ago, Philipp Hancke from &yet posted "Improved Screensharing UX on Talky," which specifically addresses the UX for first-time users. The problem being addressed is threefold:

In Chrome, desktop capture for screen sharing is enabled via the desktopCapture Chrome extension API, so when first-time users try to start a screen share, they must first install the extension, then restart the screen share request;
The app signals the extension to request the media stream by having a content script relay messages to the background script;
When a Chrome extension is installed, content scripts are not injected into already-open tabs, only into tabs loaded after the installation.

Before I elaborate, I'll clarify that the extension messaging and screen capture are encapsulated in getScreenMedia and its example extension, so you don't have to deal with this whole process in your app directly.

The strategy used before was (assuming a new user):

Attempt the request for screen media.
Set a timeout to fire if the extension never responds to the request with a pending response.
In the timeout, kick off inline installation for the extension.
When (if) inline installation succeeds, save the "intent to share screen" in local storage, then reload the page (this allows Chrome to properly inject the content script for messaging the background script).
When the app loads, check for intent to share screen in local storage (with a TTL, to ensure it's not stale), and if the intent exists, kick off the request for screen media again. This time, the content script will properly message the extension background script.
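The "intent to share screen" bookkeeping from the last two steps could look roughly like the sketch below. It is only an illustration, not the code Talky or getScreenMedia actually uses: the key name, the 30 second TTL, and the helper names are all assumptions of mine. Storage and the clock are passed in so the logic can be exercised outside a browser.

```typescript
// Minimal storage shape; window.localStorage satisfies it in a browser.
interface IntentStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const INTENT_KEY = "screenShareIntent"; // hypothetical key name
const INTENT_TTL_MS = 30000; // hypothetical TTL: 30 seconds

// Called right before the reload that follows a successful inline install.
function saveShareIntent(store: IntentStore, now: number): void {
  store.setItem(INTENT_KEY, String(now));
}

// Called on app load. Returns true only if a fresh intent was stored,
// and always clears it so a later reload does not re-trigger sharing.
function consumeShareIntent(store: IntentStore, now: number): boolean {
  const raw = store.getItem(INTENT_KEY);
  store.removeItem(INTENT_KEY);
  if (raw === null) return false;
  const age = now - Number(raw); // NaN for a corrupt value, which fails both checks
  return age >= 0 && age <= INTENT_TTL_MS;
}

// In-memory stand-in for localStorage, handy outside a browser.
class MemoryIntentStore implements IntentStore {
  private items: { [key: string]: string } = {};
  getItem(key: string): string | null {
    const value = this.items[key];
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.items[key] = value;
  }
  removeItem(key: string): void {
    delete this.items[key];
  }
}
```

In the app you would pass window.localStorage and Date.now(): call saveShareIntent just before reloading, and kick off the screen media request on load only when consumeShareIntent returns true.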

This is pretty good: we handle the reload and re-kick the request for screen media automatically. The worst we could do is fail and require a manual reload and restart (or not offer inline installation at all).

The best thing we could do would be to avoid the reload altogether. Fortunately, there's a solution. Instead of creating a communication channel with chrome.runtime.connect and messaging over it, we can use external messaging. That is, instead of posting a message to the window, which gets picked up by the content script and passed to the background script (and vice versa), we can call chrome.runtime.sendMessage(extensionId, message, callback) from the page and, in the background script, listen on chrome.runtime.onMessageExternal. This works where the other solution doesn't, because background scripts are loaded immediately upon extension installation, whereas content scripts are injected only on page load.
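To make the external messaging round trip concrete, here is a minimal sketch of both ends. The message shapes (ScreenRequest, ScreenReply) and function names are hypothetical and are not getScreenMedia's actual wire format. The Chrome APIs are passed in as small interfaces so the sketch runs outside Chrome: in the page you would hand in chrome.runtime, and in the background script `choose` would wrap chrome.desktopCapture.chooseDesktopMedia.

```typescript
// Hypothetical message shapes for this sketch.
interface ScreenRequest { type: "getScreen"; sources: string[] }
interface ScreenReply { sourceId?: string; error?: string }

type Reply = (response?: ScreenReply) => void;

// The one call the page needs; chrome.runtime has a compatible sendMessage.
interface PageRuntime {
  sendMessage(extensionId: string, message: ScreenRequest, callback: Reply): void;
}

// Page side: ask the extension's background script for a capture source id.
function requestScreenSource(
  runtime: PageRuntime,
  extensionId: string,
  done: (error: string | null, sourceId?: string) => void
): void {
  const request: ScreenRequest = { type: "getScreen", sources: ["screen", "window"] };
  runtime.sendMessage(extensionId, request, (reply) => {
    // Chrome calls back with no arguments when the extension cannot be reached.
    if (!reply) { done("no response from extension"); return; }
    if (reply.error || !reply.sourceId) { done(reply.error || "no source selected"); return; }
    done(null, reply.sourceId);
  });
}

// Background side: build a listener to register with
// chrome.runtime.onMessageExternal.addListener(...).
function makeExternalListener(
  choose: (sources: string[], onPicked: (streamId: string) => void) => void
) {
  return (message: ScreenRequest, _sender: unknown, sendResponse: Reply): boolean => {
    if (!message || message.type !== "getScreen") return false;
    choose(message.sources, (streamId) => {
      // An empty stream id means the user cancelled the picker.
      sendResponse(streamId ? { sourceId: streamId } : { error: "cancelled" });
    });
    return true; // keeps sendResponse usable after the listener returns
  };
}
```

One thing to watch: for a web page to reach the extension this way, the extension's manifest has to list the page's URL pattern under externally_connectable.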

This simplifies the strategy to:

Kick off inline installation for the extension.¹
When (if) inline installation succeeds, message the background script requesting screen media.
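Put together, the two steps above boil down to something like the following sketch. The ShareDeps interface and its member names are hypothetical. I've injected the pieces so the flow can be read (and run) without Chrome: `install` would be a wrapper around the inline installation call (e.g. chrome.webstore.install), and `requestScreen` would be the external messaging request to the background script.

```typescript
type ShareDone = (error: string | null, sourceId?: string) => void;

// Everything the flow needs, injected so the sketch runs outside Chrome.
interface ShareDeps {
  // True when the extension is already known to be installed.
  isInstalled(): boolean;
  // Stand-in for the inline installation call.
  install(onSuccess: () => void, onFailure: (reason: string) => void): void;
  // Stand-in for the external-messaging request to the background script.
  requestScreen(done: ShareDone): void;
}

function shareScreen(deps: ShareDeps, done: ShareDone): void {
  // Returning users skip installation entirely.
  if (deps.isInstalled()) {
    deps.requestScreen(done);
    return;
  }
  deps.install(
    // No reload needed: the background script is running as soon as
    // installation finishes, so we can message it right away.
    () => deps.requestScreen(done),
    (reason) => done("install failed: " + reason)
  );
}
```

Compared with the old flow there is no timeout, no stored intent, and no reload, which is where most of the complexity (and the jarring UX) came from.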

Not only is this a much simpler method to code against, as an app developer, but it also dramatically improves the user experience.

This new strategy for screen media requests for first-time users was cooked up by Fippo and me while I was visiting &yet for a WebRTC deep dive with my team from PureCloud. It hasn't been finalized quite yet, but you can check out the PR. If you're looking to level up your WebRTC game, get on the repos, or go see them in Richland, too!

¹We can use a sessionStorage trick to determine the existence of the extension, so we can know immediately whether it's been installed. To do this, we include a content script that does nothing but set the id of the extension in sessionStorage. In practice, getScreenMedia can also try to fall back to the old messaging method, but that still relies on the timeout-style check for whether the extension exists. This way, we fall through new strategy -> old strategy -> extension not installed.
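Here is one way to read that footnote as code. It is a sketch of my interpretation, not what getScreenMedia actually does. The key name, the Strategy labels, and the `probeLegacy` callback (standing in for the old timeout-based request over content script messaging) are all assumptions. In a real content script the id would come from chrome.runtime.id and the store would be window.sessionStorage.

```typescript
// Minimal storage shape; window.sessionStorage satisfies it in a browser.
interface FlagStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const EXTENSION_ID_KEY = "screenShareExtensionId"; // hypothetical key name

// Content script side: runs on page load and only records the extension id.
function publishExtensionId(store: FlagStore, extensionId: string): void {
  store.setItem(EXTENSION_ID_KEY, extensionId);
}

// Page side: null means no content script has announced the extension.
function readExtensionId(store: FlagStore): string | null {
  return store.getItem(EXTENSION_ID_KEY);
}

type Strategy = "external-messaging" | "content-script-messaging" | "not-installed";

// Fall through: new strategy -> old strategy -> extension not installed.
function pickStrategy(
  store: FlagStore,
  // Stand-in for the old request with its timeout; reports whether the
  // extension answered over content-script messaging.
  probeLegacy: (onResult: (responded: boolean) => void) => void,
  done: (strategy: Strategy) => void
): void {
  if (readExtensionId(store) !== null) {
    done("external-messaging"); // known immediately, no timeout involved
    return;
  }
  probeLegacy((responded) => {
    done(responded ? "content-script-messaging" : "not-installed");
  });
}
```

The payoff is in the first branch: when the id is present, the app knows right away that the extension is installed and never has to wait on a timeout.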