As web content is making its way into all parts of daily life and mobile device performance continues to improve, developments allow content to be accessed across various platforms and devices.
Nevertheless, web content can be a victim of certain restraints if not optimized appropriately. One issue that is often encountered with content created using the HTML 5 markup is the initial sluggish loading speed when the user accesses the content. A white screen appears while the web page is loading, and the complexity of the HTML 5 processes often leads to lengthy loading times, resulting in poor user experience.
This article examines the possible ways to improve the initial loading speed of app function modules which load web pages for certain content. The proposed optimizations can be applied to make the initial loading experience as smooth as that of a native app.
Initial page loading — why the delay?
The markup language HTML is widely used for structuring and presenting digital content across multiple platforms. Although the fifth version, HTML 5, brings better multimedia and multi-platform support for web pages, it can be plagued with slow initial loading times.
This is mainly due to the complicated steps involved in loading an HTML 5 web page:
Initializing webview → Requesting page → Downloading data → Parsing HTML → Requesting JS/CSS resources → DOM Render → Parsing JS execution → JS requesting data (sometimes not required) → Parsing Render → Downloading rendered picture
Generally, a white screen is displayed until the DOM Render step. Users can browse the full web page only after the rendered pictures have been downloaded; until then, only parts of the page are displayed.
To solve this issue, front-end, client-side and other optimizations can be made to reduce the duration of such processes and thus achieve zero-latency for initial loading.
Optimizations to the internal front-end part of the page can be implemented to improve initial page loading speeds. These optimizations include:
· Reducing the number of requests
This is achieved by merging resources to cut the number of HTTP requests, applying minification and gzip compression, serving images as WebP, and lazy loading.
· Accelerating the speed of requests
This is realized by pre-parsing the DNS, reducing the number of domains, parallel loading, and CDN distribution.
· Caching
This includes HTTP protocol cache requests, offline cache manifests, and an offline data cache in localStorage.
· Rendering
This is done through JS/CSS optimization, loading order, server rendering, and pipelining.
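As a rough illustration of the first item, the sketch below merges two resources into one payload and gzips it. The file names and contents are made up for the example; a real build would use its own bundler and minifier.

```python
import gzip


def bundle(files: dict[str, str]) -> bytes:
    """Concatenate several text resources into one payload (one request instead of many)."""
    return "\n".join(files[name] for name in sorted(files)).encode("utf-8")


# Hypothetical resource contents, used only for illustration.
resources = {
    "a.css": ".btn { color: #333; padding: 4px 8px; }\n" * 50,
    "b.css": ".card { margin: 0 auto; border: 1px solid #ddd; }\n" * 50,
}

merged = bundle(resources)
compressed = gzip.compress(merged)
print(len(resources), "requests -> 1 request")
print(len(merged), "bytes ->", len(compressed), "bytes after gzip")
```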
Among the above solutions, network requests have the greatest influence on initial loading speeds; therefore, the front-end’s request caching strategy should be the focus of optimization.
The cache can be divided into a static file cache (containing HTML and JS/CSS/image resources) and a JSON data cache. The protocol for the static file cache is defined by HTTP; because browsers implement this protocol, static files can be cached as well (more details can be found here). For static files, there are two types of caches:
· Those that ask if there are any updates
These caches ask the back-end whether there are updates, using protocol fields such as If-Modified-Since/ETag. If there are no updates, the server returns 304 and the browser uses the local cache.
· Those that use the local cache directly
This type of cache determines how long an update request is not required based on the Cache-Control/Expires field in the protocols. When such a request is not required, the local cache is used directly.
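The two cache types can be sketched as one decision function. This is a simplified model, not a full HTTP cache: `CachedEntry` and `decide` are hypothetical names, only max-age and ETag are considered, and the ETag is sent back in the If-None-Match request header as HTTP specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedEntry:
    stored_at: float          # time the response was stored, in seconds
    max_age: Optional[int]    # from Cache-Control: max-age, if present
    etag: Optional[str]       # from the ETag response header, if present


def decide(entry: Optional[CachedEntry], now: float) -> tuple[str, dict[str, str]]:
    """Return (action, extra request headers) for one resource request."""
    if entry is None:
        return "fetch", {}
    if entry.max_age is not None and now - entry.stored_at < entry.max_age:
        # Still fresh: use the local cache directly, no network request.
        return "use_local", {}
    if entry.etag is not None:
        # Stale: ask the server; a 304 reply means the local copy can be reused.
        return "revalidate", {"If-None-Match": entry.etag}
    return "fetch", {}


print(decide(CachedEntry(stored_at=0.0, max_age=3600, etag='"v1"'), now=10.0))
print(decide(CachedEntry(stored_at=0.0, max_age=60, etag='"v1"'), now=120.0))
```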
Based on the above, the optimized front-end cache strategy starts off with the HTML file asking the server for updates every time it is loaded, while the JS/CSS/Image resource file uses the local cache directly rather than requesting updates.
A potential issue with this approach is updating the JS/CSS files. The solution is to give each resource file a version number or hash value during the build process. When a resource file is updated, its version number or hash value changes, which changes the URL of the resource request. The corresponding web page is updated at the same time to reference the new URL, so the new resources are fetched.
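A minimal sketch of the hash-in-filename idea, assuming a SHA-256 content hash truncated to eight characters; the function name and naming scheme are illustrative, not a specific build tool's behavior.

```python
import hashlib


def versioned_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension: append the hash at the end
        return f"{filename}.{digest}"
    return f"{stem}.{digest}.{ext}"


old = versioned_name("app.js", b"console.log('v1');")
new = versioned_name("app.js", b"console.log('v2');")
print(old, new)  # different contents give different URLs
```

Because the URL changes whenever the content changes, the resource itself can be cached with a very long lifetime.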
These cache strategies allow resource files such as JS/CSS, as well as user data, to be fully cached, so local cache data can be used directly when the page is loaded, without any network request. However, this cannot be done for the HTML pages themselves. If the ‘Expires’ and ‘max-age’ periods are set too long, only the local cache is used for long periods and the page is not updated in time. If they are set too short, the page makes a network request for updates every time it is loaded and only then decides whether to use local resources.
Usually, the front-end adopts the strategy of requesting updates every time the page is loaded. This makes the white-screen time very long, especially when the network connection is poor. In other words, “cache” and “update” are conflicting objectives for HTML pages.
Now that web pages are embedded in most apps, the client-side has authority over page optimization. Because the client-side has more freedom over which cache strategies are implemented, all web page requests can be intercepted and the cache can be managed by the client-side itself.
To overcome the conflict of objectives between the cache and update for HTML, the following approach has shown encouraging results:
· When the client-side intercepts a request for an HTML file for the first time, cache the returned data.
· When the web page is loaded for the second time, do not send a second request; use the cached data directly instead.
This direct use of local resources removes the need to wait for network requests and so improves web page loading speeds. At the same time, updates can be fetched as frequently as possible to keep the cache current, solving the caching problem.
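One way to read this approach is sketched below. It assumes that the update runs in the background after the cached copy has been returned, so the page never waits for it; `HtmlInterceptor` and the injected `fetch` callable are hypothetical names, and a real client would hook into the platform's request interception API instead.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Optional


class HtmlInterceptor:
    """Serve HTML from a local cache and refresh the cache in the background."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                    # the real network call, injected
        self._cache: dict[str, str] = {}
        self._pool = ThreadPoolExecutor(max_workers=2)
        self.last_refresh: Optional[Future] = None

    def _refresh(self, url: str) -> None:
        self._cache[url] = self._fetch(url)

    def load(self, url: str) -> str:
        if url not in self._cache:
            # First visit: nothing is cached yet, so the request must go out.
            self._cache[url] = self._fetch(url)
            return self._cache[url]
        # Later visits: answer from the cache at once, update afterwards.
        body = self._cache[url]
        self.last_refresh = self._pool.submit(self._refresh, url)
        return body


# Simulated server that returns a newer page on every request.
versions = iter(["<html>v1</html>", "<html>v2</html>", "<html>v3</html>"])
interceptor = HtmlInterceptor(lambda url: next(versions))

first = interceptor.load("https://example.com/index.html")   # network
second = interceptor.load("https://example.com/index.html")  # cache, refresh queued
interceptor.last_refresh.result()                             # wait (demo only)
third = interceptor.load("https://example.com/index.html")   # refreshed copy
print(first, second, third)
```

The trade-off is visible in the output: the second load is instant but shows the previous version, and the update only appears on the following load.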
This seems to solve the cache problem completely, but in fact many issues remain. Improving the cache logic is one way of solving these issues, but a better solution is to use a web page package.
Improved cache logic
Webview can read and write its cache directly, but the client-side cannot control webview’s caching logic. This uncontrollable cache results in multiple issues, but each can be solved or worked around.
· Important HTML/JS/CSS caches are cleared after the caching of a few big images
This issue stems from uncontrollable clearing logic and limited cache space. The solution is to configure a preload list of resources for which updates are requested in advance, either when the app is opened or at a preset time. This list must contain all required web page modules and resources, and must take into account that one web page module may have multiple pages. The list may be long, and thus needs tools to generate and manage it.
· Data cannot be pre-loaded from disk to memory
This issue stems from uncontrollable disk I/O. It can be overcome by having the client-side take over the caching of all requests instead of using webview’s default caching logic. The client-side then implements its own caching mechanism, based on cache priority and cache preloading.
· Full downloads during background HTML/JS/CSS updates are time consuming with a poor Internet connection
This occurs because of the huge amount of data involved. The solution is to perform incremental updates for every HTML and resource file. However, this is somewhat inconvenient to implement and manage.
· HTML pages hijacked by operators or other third parties are cached for a long period of time
This issue can be mitigated by using an HTTPDNS + HTTPS anti-hijacking setup at the client-side.
Although workable, these solutions are cumbersome to implement because HTML and resource files are numerous and scattered, making it difficult to manage them.
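A client-managed cache with priorities could look roughly like the sketch below. The class, the numeric priorities (higher means more important), and the eviction rule are assumptions made for illustration: the lowest-priority entry is evicted first, and among equal priorities the least recently used one goes.

```python
from collections import OrderedDict
from typing import Optional


class PriorityCache:
    """Size-bounded cache that evicts low-priority entries (e.g. images) first."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.used = 0
        # key -> (priority, data); insertion order tracks recency, oldest first.
        self._items: "OrderedDict[str, tuple[int, bytes]]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as recently used
        return self._items[key][1]

    def put(self, key: str, data: bytes, priority: int) -> None:
        if key in self._items:
            self.used -= len(self._items.pop(key)[1])
        self._items[key] = (priority, data)
        self.used += len(data)
        while self.used > self.capacity and len(self._items) > 1:
            # min() scans oldest first, so ties go to the least recently used.
            victim = min(self._items, key=lambda k: self._items[k][0])
            self.used -= len(self._items.pop(victim)[1])


cache = PriorityCache(capacity_bytes=100)
cache.put("index.html", b"h" * 40, priority=10)
cache.put("app.js", b"j" * 40, priority=10)
cache.put("banner.png", b"i" * 60, priority=1)  # does not fit: the image is dropped
print(cache.get("index.html") is not None, cache.get("banner.png"))
```

With this rule, a large image can never push out the HTML/JS/CSS entries that the first screen depends on.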
Web page package
Because the usage scenario here is building app function modules with web pages, and the problems above arise from managing scattered files, an obvious solution is to package and deliver all relevant pages and resources together.
This solution can solve the above problems using certain relatively straightforward methods:
· The entire web page package can be pre-downloaded, and its configuration can be carried out based on service modules instead of files. The web page package contains all pages related to the service modules and all of them can be preloaded at the same time.
· The core files of the web page package and the caches for the dynamic image resource files of the page are separated, which makes cache management easier. Also, the entire web page package can be preloaded into the memory, reducing time required for disk IO.
· A web page package can easily make incremental updates based on version.
· A web page package is delivered as an encrypted and verified archive, so tampering or hijacking by operators and third parties can be detected and rejected.
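The packaging and verification idea can be sketched with the standard library. This sketch makes several assumptions: the package is a zip archive, integrity is checked with an HMAC-SHA256 over the archive using a shared key, and encryption is left out entirely. A production system would more likely use an asymmetric signature so that the client holds no signing secret.

```python
import hashlib
import hmac
import io
import zipfile


def build_package(files: dict[str, bytes], key: bytes) -> tuple[bytes, str]:
    """Zip all files of one service module and return (archive, signature)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(files):
            archive.writestr(name, files[name])
    data = buffer.getvalue()
    signature = hmac.new(key, data, hashlib.sha256).hexdigest()
    return data, signature


def open_package(data: bytes, signature: str, key: bytes) -> dict[str, bytes]:
    """Verify the signature, then unpack the whole archive into memory."""
    expected = hmac.new(key, data, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("package signature mismatch")
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# Hypothetical module contents and key, for illustration only.
module_files = {"index.html": b"<html>orders</html>", "app.js": b"render();"}
archive_bytes, sig = build_package(module_files, key=b"demo-key")
print(sorted(open_package(archive_bytes, sig, key=b"demo-key")))
```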
The specific solution is summarized below:
1. Use build tools at the back-end to package all pages and resources related to the same service module into one file, and encrypt/sign it at the same time.
2. At the client-side, download the web page package at a customized time, and then decompress/decrypt/verify it based on the configuration table.
3. After opening a service, transfer the page to the web page package entrance page based on the configuration table.
4. Intercept network requests and read returned data from the web page package for the existing files in the web page package. Otherwise, apply the HTTP protocol’s caching logic.
5. When updating a web page package, the server delivers only the diff data package between the current and the latest versions, and the client-side performs an incremental update by merging the diff into its local package.
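Steps 4 and 5 can be sketched as follows, with a package modeled as an in-memory mapping from path to content. The diff here is file-level (changed and removed files), which is an assumption; a real system might ship binary patches instead. All function names are hypothetical.

```python
from typing import Callable

Package = dict[str, bytes]  # path inside the package -> file content


def make_diff(old: Package, new: Package) -> dict:
    """Server side: describe how to turn `old` into `new`."""
    changed = {path: data for path, data in new.items() if old.get(path) != data}
    removed = sorted(path for path in old if path not in new)
    return {"changed": changed, "removed": removed}


def apply_diff(old: Package, diff: dict) -> Package:
    """Client side: merge the diff into the local package."""
    merged = {path: data for path, data in old.items() if path not in diff["removed"]}
    merged.update(diff["changed"])
    return merged


def intercept(path: str, package: Package, http_load: Callable[[str], bytes]) -> bytes:
    """Serve a request from the package if it is there, else fall back to HTTP."""
    if path in package:
        return package[path]
    return http_load(path)


v1 = {"index.html": b"<html>v1</html>", "app.js": b"old();", "legacy.css": b"a{}"}
v2 = {"index.html": b"<html>v2</html>", "app.js": b"old();", "new.css": b"b{}"}

diff = make_diff(v1, v2)
local = apply_diff(v1, diff)
print(sorted(diff["changed"]), diff["removed"], local == v2)
```

Only the changed and added files travel over the network; `app.js` is untouched and stays out of the diff.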
Web page packages are a workable solution for web page development, and other optimizations can be implemented to complement them, including:
· Using a public resource package
· Webview preloading
· Data preloading
· Fallback optimizations
· Client interface optimizations
· Server rendering optimizations
Using a public resource package
Every package uses the same JS framework and global CSS styles. Including these files in every web page package wastes resources. Instead, a public resource package can be created to provide these global files.
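A minimal sketch of the lookup order this implies, assuming both packages are already unpacked into mappings from path to content; the function and file names are illustrative.

```python
from typing import Optional


def resolve(path: str, module_package: dict[str, bytes],
            public_package: dict[str, bytes]) -> Optional[bytes]:
    """Look in the module's own package first, then in the shared one."""
    if path in module_package:
        return module_package[path]
    return public_package.get(path)  # None means: fall back to the network


public = {"framework.js": b"/* shared JS framework */", "global.css": b"body{}"}
orders = {"orders/index.html": b"<html>orders</html>"}

print(resolve("framework.js", orders, public))
print(resolve("missing.png", orders, public))
```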
Webview preloading
On both iOS and Android, initializing webview components takes a long time, which can be mitigated by preloading webview. There are two types of preloading:
· First-time preloading
The first initialization of webview is usually very slow. A webview can be pre-initialized and then released when the app is opened, so that the process is faster when the user loads the webview in the web page modules.
· Webview pool
Instead of creating a new webview each time a web page is opened, two or more webviews can be kept in a pool and reused. However, one issue with this approach is how to clear the previous page when a page jump occurs. Another is that a JS memory leak in one web page affects other pages, because the leaked memory cannot be released.
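The pool idea can be sketched independently of any platform. `FakeWebview` below is a stand-in for a real webview component, and `reset` represents whatever the platform needs to clear the previous page (for example, loading a blank page); both are assumptions for illustration.

```python
from collections import deque


class FakeWebview:
    """Stand-in for a platform webview; real ones are costly to create."""
    created = 0

    def __init__(self) -> None:
        FakeWebview.created += 1
        self.url = None

    def load(self, url: str) -> None:
        self.url = url

    def reset(self) -> None:
        self.url = None  # clear the previous page before reuse


class WebviewPool:
    def __init__(self, size: int) -> None:
        # Pay the initialization cost up front, e.g. when the app starts.
        self._idle = deque(FakeWebview() for _ in range(size))

    def acquire(self) -> FakeWebview:
        # Reuse an idle webview if there is one, otherwise create a new one.
        return self._idle.popleft() if self._idle else FakeWebview()

    def release(self, view: FakeWebview) -> None:
        view.reset()
        self._idle.append(view)


pool = WebviewPool(size=2)
view = pool.acquire()
view.load("https://example.com/orders")
pool.release(view)
again = pool.acquire()
print(FakeWebview.created)  # no webview was created after the pool was built
```

The sketch does not address the memory leak issue mentioned above; a pool may need to discard and recreate webviews periodically for that reason.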
Data preloading
Under ideal circumstances, when the web page package is opened for the first time, all HTML/JS/CSS use local caches and network requests are not required. However, user data on the page still needs a real-time request, which is where optimization can be applied. Making this data request in parallel with webview initialization saves a lot of loading time.
When implementing this solution, the URLs to be preloaded for a web page package are listed in the configuration table. During webview initialization, the client-side sends these requests and caches the results once preloading is complete.
Next, when initialization is complete, webview begins requesting the preloaded URLs. The client-side intercepts each request and passes it to the preloading manager. If preloading is complete, the client-side returns the content. If it is not, the client-side waits for preloading to finish.
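The flow above can be sketched with a thread pool and futures. `PreloadManager` and the injected `fetch` callable are hypothetical; the sketch also assumes that a preloaded result is handed out only once, so that later requests for the same URL fetch fresh data.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Iterable


class PreloadManager:
    def __init__(self, fetch: Callable[[str], str], max_workers: int = 4) -> None:
        self._fetch = fetch
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending: dict[str, Future] = {}

    def preload(self, urls: Iterable[str]) -> None:
        """Called while webview initializes: start all requests in parallel."""
        for url in urls:
            if url not in self._pending:
                self._pending[url] = self._pool.submit(self._fetch, url)

    def intercept(self, url: str) -> str:
        """Called when the page requests `url` after initialization."""
        future = self._pending.pop(url, None)
        if future is None:
            return self._fetch(url)  # not preloaded: ordinary request
        return future.result()       # returns at once if done, otherwise waits


calls = []


def fake_fetch(url: str) -> str:
    calls.append(url)
    return f"data for {url}"


manager = PreloadManager(fake_fetch)
manager.preload(["/api/user", "/api/orders"])  # URLs from the configuration table
user = manager.intercept("/api/user")
print(user)
```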
Fallback optimizations
A potential issue with web page package modules is that a user may try to access a web page package module that has not yet been downloaded, or that the configuration table detects a new version while the old one is still being used in the app.
There are several solutions that can be adopted to overcome this:
· If there is no local web page package or the existing local web page package is not the latest, the client-side can block synchronously while it downloads the latest web page package. However, this may hurt the user experience because web page packages are relatively large.
· If there is an existing older version, the user could be allowed to use it directly once. This delays the update, so the user may not be using the latest version.
· An online version of the web page package can be created, with each of the files in the web page package having a corresponding access address on the server. When there is no existing web page package, the corresponding online address can be accessed directly in the same way an online page is normally opened. This solution creates a better user experience compared to that of downloading the entire web page package, and ensures that the user accesses the latest version. This can also serve as a backup solution. In unforeseen circumstances where a web page package malfunctions, the user can access the online version directly, and thus the functions of the page will not be affected.
The advantage of these fallback optimizations is that they can be mixed and matched depending on needs.
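One possible mix of these options is sketched below: use the online version when nothing is installed, use the local package when it is current, and otherwise either use the stale package once while updating or go online. Integer version numbers, the `allow_stale` flag, and the returned labels are assumptions for illustration.

```python
from typing import Optional


def choose_source(local_version: Optional[int], latest_version: int,
                  allow_stale: bool = True) -> str:
    """Decide where to load a web page module from."""
    if local_version is None:
        return "online"                # no package yet: open the online page
    if local_version >= latest_version:
        return "local"                 # package is current
    if allow_stale:
        return "local_then_update"     # use the old package once, update in background
    return "online"                    # must be current: use the online version


print(choose_source(None, 3))
print(choose_source(3, 3))
print(choose_source(2, 3))
print(choose_source(2, 3, allow_stale=False))
```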
Client interface optimization
If the page relies on WebKit’s built-in ajax and localStorage for its network and storage interfaces, there are many restrictions and optimization becomes more difficult.
Instead, the client-side can provide these interfaces to JS, so that more specific optimizations such as DNS pre-parsing, direct IP connections, long-lived connections, and parallel requests can be carried out at the client-side. For storage, the client-side interfaces allow targeted optimizations such as concurrent reads and writes and per-user isolation.
Server rendering optimization
Unlike early web pages, most web pages now rely on JS logic to determine what to render, which can be time-consuming. For example, waiting for JS to request JSON data, splice it into HTML, generate the DOM, and render it to the page is a lengthy process.
Solutions for this issue range from manually reducing the JS rendering logic to the more thorough and fundamental approach of server rendering, in which all content is determined by the HTML returned from the server, without waiting for JS logic to run.
However, this raises certain challenges, such as a changed development model, increased traffic, and increased server cost. Certain pages in QQ Mobile have adopted a specific server rendering method called “dynamic direct output” (for more details, see this article).
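In its simplest form, server rendering means the server turns the data into markup before responding, so the page needs no JS round trip to show its content. The sketch below is a generic illustration with a hypothetical `render_list` function; it is not the “dynamic direct output” method mentioned above.

```python
import html


def render_list(items: list[str]) -> str:
    """Build the final HTML on the server; values are escaped before insertion."""
    rows = "".join(f"<li>{html.escape(item)}</li>" for item in items)
    return f"<ul>{rows}</ul>"


print(render_list(["Order 1001", "Tea & coffee"]))
```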
From front-end optimization, client-side caching, and web page packages to more detailed optimizations, there are numerous ways to improve the initial loading speed of web pages. If these solutions are appropriately adopted and properly executed, the experience of opening a web page can be almost comparable to that of a native app.
To sum up, the general approach to optimizing a web page is “cache, preload, compute in parallel”:
· Cache all network requests
· Load as much content as possible before the user opens the page
· Perform as many processes as possible in parallel rather than serially
(Original article by Chen Zhenzhuo陈振焯)