This is a bit of a back-to-basics topic. It’s a common task for a web page to open another page (a popup) that does some action and returns its result back to the caller page. This could be a same-origin or a cross-origin popup. The typical use cases include location picker, contact picker, payment form, social sign-in/SSO, and so on.
The Android SDK has a simple and clean API for this: startActivityForResult. For instance, see Place Picker. The Web, however, has no such clear API. Back in the IE days, there was a non-standard showModalDialog API that was poorly designed and implemented, and it has since been deprecated and removed from most browsers. So, if you were to try to support such a use case on the web today, you’d have to cover the following issues:
Opening the popup with the window.open(url, target) API.
Messaging (postMessage) with window.opener. The popup must be opened correctly to ensure that window.opener is available (no rel=noopener).
window.opener. Obviously, you’d trust the page you open itself, but it’s possible that the popup site could be compromised, which would expose the caller site to attacks as well.
target=_top. In this case, window.opener would not be available and there would be no way to send messages to it. Typically, this is solved with redirects, again using custom protocols that pass data via redirect URLs.
window.opener.
history.state and related APIs.

These are just some of the important issues a Web developer has to contend with. Different browsers introduce additional issues. This is hard to do, the solutions are often brittle, and the resulting UX is poor.
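To make the messaging and security items above more concrete, here is a minimal sketch of the check a caller page typically has to perform on every incoming message today. The message shape, the 'popup-result' type and the popup.example URLs are assumptions made up for this illustration; they are not part of any standard.

```javascript
// Sketch only: RESULT_TYPE and the message shape are assumptions.
const RESULT_TYPE = 'popup-result';

// Returns the message data if the event is a result sent by the popup
// we opened from the origin we expect, or null otherwise.
function extractPopupResult(event, popupWindow, popupOrigin) {
  // Ignore messages from any window other than the popup we opened.
  if (event.source !== popupWindow) return null;
  // Ignore messages from an unexpected origin, e.g. if the popup navigated.
  if (event.origin !== popupOrigin) return null;
  const data = event.data;
  if (!data || data.type !== RESULT_TYPE) return null;
  return data;
}

// Browser wiring, shown as comments because it needs a real window:
// const popup = window.open('https://popup.example/pick', '_blank');
// window.addEventListener('message', (event) => {
//   const result = extractPopupResult(event, popup, 'https://popup.example');
//   if (result) {
//     popup.close();
//     // Process result.payload.
//   }
// });
```

Even this small piece needs the source check, the origin check and an agreed message format, and it does nothing for the redirect and unloading cases.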
Surely, the Web platform could do better.
Popups will be mentioned a lot in this article, so I’d like to immediately address the question: why not just use iframes in place of popups? With iframes, the parent page has more control over the popup’s positioning and sizing, and iframes are free of most problems described above. Most notably, there’s no redirect fallback or unloading, which are probably the hardest problems to solve. In fact, iframes are already used in similar use cases on the web. The use of popups has gone down quite a bit: partially because of the issues listed above, and partially because iframes provide a better UX and simpler implementation.
However, for many use cases, especially security-sensitive cases (such as SSO and payments), use of iframes is limited due to:
Some past proposals considered creating a special subclass of fullscreen mode for iframes to provide guarantees against click/input-jacking. However, given iframe cookie partitioning, this would not be sufficient. On top of that, guaranteeing a non-clickjackable mode for an iframe requires participation of both the client and the server, and the server defines its restrictions in the form of X-Frame-Options and CSP. The server stack would need strict guarantees from the client to lift iframing restrictions, which would be a major backward compatibility problem.
Given these restrictions, we will mainly focus on popups as the solution space.
The Web Platform could continue down the path of dedicated APIs for key use cases that currently rely on popups. Just as was done with the Payment Request UI, the Web Platform can define APIs for Social Sign-in, Calendar sharing, and so on. This would be a big improvement for such use cases.
However, the design and implementation of such APIs takes time. The migration requires polyfills. And ultimately the Web Platform can’t support all such use cases. Thus I welcome new Web APIs, but the generic solution for “get a result” popups is still very much needed regardless.
Before we start looking at different approaches to address “get a result” popups, let’s take a look at how Android APIs arrange such use cases.
Android provides the startActivityForResult API. It comprises three components: startActivityForResult, onActivityResult, and setResult. For instance, the Place Picker API can be used like this:
    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        if (resultCode != RESULT_OK) {
            return;
        }
        if (requestCode == PICK_PLACE_REQUEST) {
            // Process result.
        }
    }

    private void pickPlace() {
        startActivityForResult(
                new PlacePicker.IntentBuilder().build(this),
                PICK_PLACE_REQUEST);
    }
The Place Picker activity itself has to call the setResult method:

    private void clickHandler() {
        Intent intent = new Intent();
        intent.putExtra("place", selectedPlaceData);
        setResult(RESULT_OK, intent);
    }
The fact that this API explicitly exists is a good sign that Android takes these use cases seriously. But there are some important properties to consider:
onActivityResult handler will be called when the caller activity is restarted.

There are several approaches that the Web Platform could consider to make coding of such “get a result” popups easier and less brittle. There are two major directions here that would support each other: contextual popups and an API for the result exchange.
This is a specialized version of the window.open API (e.g. window.openInContext, or maybe using a new window.open option) for contextual popups: the, so to say, “picture-in-picture” mode for popups. Such popups would open in the context of the caller page. This API is constructed to avoid most of the pitfalls listed in the introduction section. In particular, there’s no need for the redirect fallback and the caller (context) page would never be unloaded. It would still rely on custom window messaging to communicate results back to the caller.
To illustrate this option, consider the current state of popups on the Web. The UX is invariably poor in desktop and mobile browsers: a separate tab or window is opened. If the user switches to another tab, it’s pretty hard to find the popup again.
One model for this could be Payment Request UI. Both on desktop and on mobile this UX ensures:
It’d be great if the popup could be opened as a top-level browsing context, but still be structurally and visually presented in the context of the caller page.
One important nuance: such popups cannot open nested popups — there’s no good way to make this a good UX. But this would also give browsers better performance parameters by slowing the proliferation of tabs.
An additional big promise: contextual popups could completely eliminate the redirect fallback, which could in turn also simplify many APIs. As you will see elsewhere in this document, the redirect fallback (either due to single-window browsers, or page unloading) is the major complication for the “get a result” pattern.
Final word on this topic: Web Views. Web Views today serve a huge portion of the mobile traffic (by some estimates approaching 50%), and they significantly complicate this pattern. A Web View is by default a single-window browser. Supporting a multi-window mode is a very complicated task involving memory tradeoffs. Ideally, contextual popups would be implemented in modern Web Views out of the box. They don’t have to be enabled by default, but the implementation itself is important to reduce solution space fragmentation between browsers and Web Views.
When a popup can actually be opened, window messaging can be used to return the result. The caller and the popup have to agree on a custom messaging protocol to do so. It’s a bit of an overkill for something as simple as returning a single result, and it’s full of security gotchas. But it works.
However, messaging is not always available. Some cases where messaging is not available:
window.opener field is not available and messaging is not possible. Notice that the redirect mode is sometimes a UX choice, but sometimes forced by the environment: a single-window browser, a popup blocker, etc.

In the redirect mode, the only way to return data back to the caller is to redirect back to it with the result in the URL, e.g. in the URL fragment. When the result is sensitive, it has to be encrypted to avoid undesired leaks. Supporting the redirect mode correctly is an arduous task and any solution would be very brittle.
A couple of approaches to consider here: provide a safe way to return data in the redirect mode, or adopt a new API that supports both popups and redirects similar to Android Activity API.
We’d like to focus on the Activity API for Web, but first, a few words about redirect mode.
We could provide a special API aimed specifically at the redirect mode to return data safely and securely. For instance, we could extend the Web History API. The popup would push data into the history stack before redirecting back to the caller:
    history.pushResult(resultData, "https://caller.com");
    window.location.replace("https://caller.com/continue");
The caller will be able to read the data from the history stack as soon as it starts up:
    var result = history.popResult("https://popup.com");
    if (result) {
      // Result is already available.
      ...
    }
Notice that both push and pop are restricted to specific origins of the caller and the popup to ensure that data is exchanged securely.
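The origin restriction can be modeled with a small in-memory sketch. This is not an existing browser API: the ResultStack class and its explicit fromOrigin and readerOrigin arguments are hypothetical stand-ins for what the browser would know about the calling page.

```javascript
// Hypothetical model of the proposed pushResult/popResult semantics.
class ResultStack {
  constructor() {
    this.entries = [];
  }

  // Popup side. `fromOrigin` is the popup's own origin (a browser would
  // supply it); `forOrigin` is the only origin allowed to read the data.
  pushResult(fromOrigin, resultData, forOrigin) {
    this.entries.push({ fromOrigin, forOrigin, resultData });
  }

  // Caller side. `readerOrigin` is the caller's own origin;
  // `expectedOrigin` is the popup origin it accepts data from.
  popResult(readerOrigin, expectedOrigin) {
    const index = this.entries.findIndex(
      (e) => e.forOrigin === readerOrigin && e.fromOrigin === expectedOrigin);
    if (index === -1) return null;
    // The result is consumed once read.
    return this.entries.splice(index, 1)[0].resultData;
  }
}
```

In this model a result is only handed over when both sides name each other, so a page from a third origin that ends up in the same history stack cannot read it.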
Adopting an Android-like Activities API on the Web is a more radical solution. But it could solve both the redirect and popup modes very naturally. It could also play well with the contextual popups idea for an improved UX.
Following the Android API, we could introduce a similar API on the Web consisting of window.openForResult(), window.onResult(), and window.setResult(). The Place Picker caller from above would look like this on the Web:
    // Anticipate that the result might arrive at some point, even
    // if openForResult has not been called in the instance of this
    // page.
    window.onResult(
        'pickPlace',
        'https://maps.google.com',
        (response) => {
          if (response.ok) {
            // Process result.
          }
        });

    // Call openForResult.
    button.onclick = () => {
      window.openForResult(
          'pickPlace',
          'https://maps.google.com/pickplace',
          target);
    };
The popup page would return the result like this:
window.setResult(ok, payload);
Let’s look at this API in more detail.
    window.openForResult(
        requestId: string,
        url: string,
        target: string,
        opt_options: string): void
This method is similar to window.open, but there are key differences as well. The arguments are:
requestId: the string ID that will later be available in the window.onResult callback.
url: the popup URL.
target: the window target, which makes it either a popup or a redirect.
opt_options: additional options, similar to window.open.

Unlike window.open, neither the popup window reference nor window.opener is needed in the window.openForResult API. While these could still be provided, removing them would reduce the surface for XSS exploits.
    window.onResult(
        requestId: string,
        resultOrigin: string,
        callback: function(Response)): void

This method registers a callback for the requestId that will later be passed to window.openForResult.
It’s very tempting to do away with this method and simply make window.openForResult return a promise. However, it’s not so simple because of redirects and page unloading. If the caller page has been redirected or unloaded, the result would be kept by the browser in some ephemeral storage (such as the history stack), and as soon as the callback is registered using window.onResult, it would be immediately called with the result. This feature would also come in handy for the redirect polyfill in one of the sections below. On the other hand, contextual popups, if they ever became a reality, could help simplify this API.
Another nuance: the resultOrigin parameter is generally not necessary, since the origin is clear from the url in the window.openForResult call. However, it’s a good additional protection and would also be helpful for the polyfill.
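The "deliver now or as soon as a callback is registered" behavior can be sketched as a small registry. The ResultRegistry class and its handleResponse method are hypothetical names for this illustration, and the pending results are kept in memory here, whereas a browser or polyfill would have to keep them somewhere that survives a redirect or reload.

```javascript
// Sketch of onResult semantics; class and method names are assumptions.
class ResultRegistry {
  constructor() {
    this.callbacks = new Map(); // requestId -> {resultOrigin, callback}
    this.pending = new Map();   // requestId -> response awaiting a callback
  }

  // Mirrors window.onResult(requestId, resultOrigin, callback).
  onResult(requestId, resultOrigin, callback) {
    this.callbacks.set(requestId, { resultOrigin, callback });
    const response = this.pending.get(requestId);
    if (response) {
      // The result arrived before the callback was registered.
      this.pending.delete(requestId);
      this.deliver(requestId, response);
    }
  }

  // Called by the platform (or a polyfill) when a result arrives.
  handleResponse(requestId, response) {
    if (this.callbacks.has(requestId)) {
      this.deliver(requestId, response);
    } else {
      this.pending.set(requestId, response);
    }
  }

  deliver(requestId, response) {
    const { resultOrigin, callback } = this.callbacks.get(requestId);
    // Drop results that come from an origin the caller did not ask for.
    if (response.resultOrigin !== resultOrigin) return;
    callback({ ok: response.ok, payload: response.payload });
  }
}
```

A promise returned from openForResult could not cover the first branch of onResult, because the page instance that made the call may no longer exist when the result arrives.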
    window.setResult(ok: boolean, payload: Object): void
This method will be called by the popup once the result is available, and it will close the popup to return back to the caller.
The arguments are:
ok: the completion signal. true indicates success, false indicates cancellation or failure.
payload: the data payload of the action taken in the popup. In case of an error, the reason for the failure.

Polyfilling this API is challenging, but possible.
The API call is:
window.openForResult(requestId, url, target, opt_options)
The polyfill will execute the following steps:
popup = window.open(url, target, opt_options). Add the #ACTIVITY={requestId, returnUrl} fragment to the url proactively, in case the browser environment silently opens the window as _top (see step 7).
If window.open failed or an invalid popup object is returned, go to step 7.
popup.closed. We need it in case the popup is closed by the user to produce the “canceled” signal.
Pass requestId and response{ok, payload} to the result processing code.
If the window.open call failed, redirect the current page to the url with an added fragment #ACTIVITY={requestId, returnUrl}. The caller page execution is aborted, but we expect to return once the popup page has completed via redirect. The end.

If window.open failed and the polyfill fell back to redirect (step 7), then when the popup page redirects back to the caller:
Read the #ACTIVITY={…} fragment parameter, which will contain the structure {requestId, ok, payload}.
Check that document.referrer and url are from the same origin. If not the same origin, fail.
Pass requestId and response{resultOrigin, ok, payload} to the result processing code. The end.

After all of these operations, the #ACTIVITY parameter is erased from the fragment.
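The fragment handling in these steps can be sketched with plain URL manipulation. The text does not specify how the structure is serialized, so this sketch assumes URI-encoded JSON; the function names, the popupUrl argument and the example.* URLs are likewise assumptions for illustration.

```javascript
const PARAM = 'ACTIVITY';

// Adds "#ACTIVITY=<encoded JSON>" to a URL, replacing any existing fragment
// (a simplification; the encoding itself is an assumption).
function addActivityFragment(url, data) {
  const u = new URL(url);
  u.hash = PARAM + '=' + encodeURIComponent(JSON.stringify(data));
  return u.toString();
}

// Returns the decoded structure, or null if there is no valid parameter.
function parseActivityFragment(url) {
  const hash = new URL(url).hash.replace(/^#/, '');
  const prefix = PARAM + '=';
  if (!hash.startsWith(prefix)) return null;
  try {
    return JSON.parse(decodeURIComponent(hash.slice(prefix.length)));
  } catch (e) {
    return null;
  }
}

function sameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

// Caller side, after the popup page has redirected back.
// `referrer` stands in for document.referrer, `popupUrl` for the url
// that was passed to openForResult.
function readRedirectResult(currentUrl, referrer, popupUrl) {
  const data = parseActivityFragment(currentUrl);
  if (!data) return null;
  if (!referrer || !sameOrigin(referrer, popupUrl)) return null;
  return {
    requestId: data.requestId,
    response: {
      resultOrigin: new URL(referrer).origin,
      ok: data.ok,
      payload: data.payload,
    },
  };
}
```

In a browser, the caller would pass location.href and document.referrer, and afterwards erase the parameter, for example with history.replaceState.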
The API call is:
window.onResult(requestId, resultOrigin, function(response) {})
The polyfill execution depends on whether the response arrives before or after window.onResult() is called.
If the response arrives before window.onResult is called, the polyfill would simply store the response in memory or in the history stack for the requestId. Once window.onResult is called and the corresponding response is available, do the following steps:
Check that the resultOrigin argument is the same as response.resultOrigin.
Call callback(response{ok, payload}).

The API call is:
window.setResult(ok, payload);
The polyfill in the popup page considers two modes: popup or redirect. If window.opener is available, assume the popup mode and execute these steps:
Send {ok, payload} via window.opener.postMessage.

If window.opener is not available or messaging failed, assume redirect mode and execute the following steps:
Read the fragment #ACTIVITY={requestId, returnUrl}.
Check that returnUrl and document.referrer have the same origin. If not, fail.
If window.opener is not available, redirect to the returnUrl with the added fragment #ACTIVITY={requestId, ok, payload}. The end.
If window.opener is available (i.e. messaging failed), open the same URL with target _blank, i.e. window.open(‘returnUrl#ACTIVITY={…}’, ‘_blank’). The end.

Redirect really complicates the mechanics of data exchange. Passing sensitive data as the payload in the redirect URL might be problematic in some use cases. If that’s considered to be a problem, the popup and the caller page must agree not to send data in plain text and instead use some form of encryption. For instance:
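    window.onResult(
        'sensitive-request',
        'https://popup.com',
        (response) => {
          if (response.ok) {
            fetch('https://popup.com/decrypt.json', {
              method: 'POST',
              // Send cookies so that the service can check the session.
              credentials: 'include',
              headers: {'Content-Type': 'application/json'},
              body: JSON.stringify({encrypted: response.payload.encrypted}),
            }).then((decryptResponse) => decryptResponse.json())
              .then((payload) => {
                // Process the payload as the actual popup response.
              });
          }
        });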
This way the sensitive data is never passed in plain text in the redirect URLs, and the decrypt service can check the CORS origin of the fetch. The https://popup.com/decrypt.json endpoint would decrypt and return the payload, relying on the CORS Origin header to ensure origin-to-origin security. Critically, this request must also rely on some internal session identifier to prevent session fixation attacks (SFA):
    app.post('/decrypt.json', (req, res) => {
      // "decrypted" is a structure:
      // {forOrigin: string, sessionId: string, data: Object}
      const decrypted = decrypt(req.body['encrypted']);

      // The CORS origin must match the intended origin:
      if (decrypted.forOrigin != req.headers.origin) {
        res.sendStatus(403);
        return;
      }

      // The CORS cookie should correspond to the intended sessionId:
      if (decrypted.sessionId != getSessionId(req.cookies)) {
        res.sendStatus(403);
        return;
      }

      // All good: send back the data.
      res.send(decrypted.data);
    });
While not exactly the same API, the web-activities project on GitHub implements an API with a very similar shape.
Currently, window.open could, in theory, be forwarded to a matching native activity on Android. However, there’s no way to return the result back to the caller page. Let’s imagine how this could be made possible.
If the browser implements the proposed API, it would be straightforward to extend it to native activities using the same API. How, though, would the openForResult API allow native invocation? Android already allows intent filters to intercept URLs, but additionally, we could extend openForResult to support alternative URLs for intent URLs and custom protocols. For instance:
    window.openForResult(
        requestId,
        [
          // First try an intent URL.
          'intent://get-place/#Intent;scheme=a;package=com.a;end',
          // Then, try a custom protocol.
          'web+location://get-place',
          // Finally, fallback to web URL.
          'https://maps.google.com/placepicker',
        ]);
Note that in this case window.onResult would also accept an array of origins for the resultOrigin argument.
If the origin is important for sensitive communication, the caller page may require a “strict origin mode”, in which case the browser/Android platform can require origin verification similar to Android’s Digital Asset Links protocol. E.g.:
window.openForResult(requestId, url, {origin: 'strict'});
Native support is not polyfillable at this time. If multiple URLs are specified, the polyfill would simply use the last (fallback) URL in the array.
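The two array-related rules described here are simple enough to sketch. The function names pickPolyfillUrl and isAcceptedOrigin are hypothetical and only illustrate how a polyfill might apply them.

```javascript
// With several URLs, the polyfill uses the last (fallback) one.
function pickPolyfillUrl(urlOrUrls) {
  if (!Array.isArray(urlOrUrls)) return urlOrUrls;
  if (urlOrUrls.length === 0) throw new Error('No URLs provided');
  return urlOrUrls[urlOrUrls.length - 1];
}

// resultOrigin may be a single origin or an array of accepted origins.
function isAcceptedOrigin(resultOrigin, actualOrigin) {
  const accepted = Array.isArray(resultOrigin) ? resultOrigin : [resultOrigin];
  return accepted.includes(actualOrigin);
}
```

This relies on the convention from the example above that the plain web URL is listed last.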
Implementing a high-quality popup-for-result pattern on the Web today is too complicated. This is exacerbated by the fact that many such cases have critical security and privacy needs. The Web Platform could improve this pattern with contextual popups, and/or by following the existing startActivityForResult protocol from the Android SDK. Either or both could significantly improve development experience, stability, safety, cross-browser support, and UX.