DPS921/Web Worker API

From CDOT Wiki

Project Summary

Team Members

  1. Khang Nguyen
  2. Anton Biriukov
  3. Akshatkumar Patel

Brief Description

JavaScript is a powerful programming language of the web but it is single-threaded by nature. The Web Workers API, introduced in HTML5 and supported by modern browsers, remedies that by enabling web applications to execute computationally expensive JavaScript code in a background thread separated from the main interface thread. Besides that, a background thread can be shared between other threads in the same web application and, therefore, facilitates lightweight communication between multiple windows of that application. Overall, the project's objectives are to explore the Web Worker API and how to utilize it in improving the overall performance and user experience of web applications.

Introduction

Parallel computing has become a widespread and rapidly developing area of programming, driven by the fact that a number of resources grouped together can solve problems faster. We are all well aware of the multi-core, multithreaded CPUs and GPUs that provide the hardware capability for parallel computing. There are a number of parallel computing libraries, such as OpenMP, MPI, and TBB, that we have looked into in our course. However, we have not discussed any front-end web development APIs that allow for multithreading.

Modern web applications feature a wide range of computationally expensive tasks, such as data fetching, advertising, and image processing. Until recently, parallel computation has mostly been applied on the back end of web applications and not so much on the front end. However, to keep up with the increased back-end parallelism, the front end also needs to utilize this technique. As stated previously, in our research project we investigate how the Web Workers API can help us achieve this goal.

Web Workers API

Overview

The Web Workers API, as specified in the HTML Living Standard, is an application programming interface for running scripts in the background, independently of any user interface scripts. The first iteration of the Web Workers API standard was presented in 2010 with HTML5. Even though web workers are a browser feature rather than a part of the JavaScript language, they are accessed through JavaScript. Web Workers are used by millions of websites for a multitude of tasks, ranging from image processing to artificial intelligence and even bitcoin mining. There are a number of interfaces defined in the Web Workers API, but we will mainly focus on two kinds of workers: Shared and Dedicated. The global scopes of both inherit from WorkerGlobalScope, which represents the generic scope of any worker and plays a role similar to that of Window in normal web content.

Basic Syntax

Web Workers are created by a constructor function Worker():

const myWorker = new Worker('worker.js');

To pass information between workers and the main thread, the postMessage() method and the onmessage event handler are used. The following is an example of how to send a simple string to the web worker:

myWorker.postMessage('Text message');

Then inside our worker we can use onmessage to handle the incoming message and send some response back to the main thread:

onmessage = function(e) {
  console.log('Message received from main script');
  var workerResult = 'Result: ' + e.data; // the payload is in e.data, not in the event itself
  console.log('Posting message back to main script');
  postMessage(workerResult);
}

Finally, back in the main thread we receive the message with the following:

myWorker.onmessage = function(e) {
  result.textContent = e.data;
  console.log('Message received from worker');
}

Limitations

A worker thread has its own context, so only a selected set of features is available inside it. In particular, there are three main points to keep in mind:

  • You can't directly manipulate the DOM from inside a worker.
  • You cannot use some default methods and properties of the window object, since the window object is not available inside a worker thread.
  • The context inside the worker thread is accessed via DedicatedWorkerGlobalScope or SharedWorkerGlobalScope, depending on the type of worker.
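One way to make these limits visible is to probe the global object at run time. The following is only an illustrative sketch: describeScope is a hypothetical helper name, and the two plain objects merely stand in for a Window and a worker scope. It relies on the fact that document and window exist only on the main thread, while importScripts() is defined on worker scopes.

```javascript
// Hypothetical helper: reports which features a given global scope exposes.
function describeScope(scope) {
  return {
    // The DOM (document) is only reachable from the main thread.
    hasDom: typeof scope.document !== 'undefined',
    // The window object does not exist inside a worker.
    hasWindow: typeof scope.window !== 'undefined',
    // importScripts() is defined on WorkerGlobalScope, not on Window.
    isWorker: typeof scope.importScripts === 'function',
  };
}

// Stand-ins for the two kinds of global scope (assumptions for illustration).
const fakeWindow = { document: {}, window: {} };
const fakeWorkerScope = { importScripts() {} };

console.log(describeScope(fakeWindow));
console.log(describeScope(fakeWorkerScope));
```

In a real page or worker, the helper would be called as describeScope(self), since self refers to the global scope in both environments.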

Browser Support

According to https://caniuse.com/webworkers Web Workers API is supported on most of the modern web browsers on both desktop and mobile:

DedicatedWorkerSupport.PNG

The Shared Worker has a lower support rate, but it is still available in plenty of browsers:

SharedWorker.PNG

Dedicated Worker

Dedicated workers are what is usually meant when web workers are mentioned in general, hence the basic syntax shown above is used in this section. A dedicated worker is only accessible by the script that spawned it, and its global scope is represented by the DedicatedWorkerGlobalScope. Dedicated workers are compatible with both desktop and mobile browsers. They give us the capability to offload computation from the main thread to a thread running on another logical processor. We have leveraged dedicated workers in the following demo, which we host on GitHub Pages:

https://vitokhangnguyen.github.io/WebWorkerDemo/index.html

In the demo, we take a 6K image and apply a Sobel filter. This filter emphasizes the edges within an image and is often used in image processing and computer vision. When we ran the serial code, we noticed that the UI lagged when scrolling and the cursor remained in the pointer state throughout the entire filtering process, preventing the user from interacting with the user interface. We took an extensive look into this lag using the performance tool in Firefox. We discovered that the DOM click event handler ran for the entire duration of the function execution (8 seconds on my PC) and the FPS dropped almost to 0, as shown in the image below.


Firefox Performance.PNG


To counter this, we used Dedicated Workers to perform the CPU-intensive calculations on another thread, which leaves the user free to interact with the website.

dedicated.js

function performParallelSobel() {
    // Reset canvas
    const tempContext = parallelCanvas.getContext("2d");
    tempContext.drawImage(image, 0, 0);

    // Check if web workers are compatible (All browsers that follow HTML5 standards are compatible)
    if (window.Worker) {

        // Record starting time
        let start = window.performance.now();
        let end;
        // slider.value is a string, so convert it to a number once
        const numOfWorkers = Number(slider.value);
        let finished = 0;

        // Height of the picture chunk for every worker
        const blockSize = parallelCanvas.height / numOfWorkers;

        // Function called when a worker has finished
        let onWorkEnded = function (e) {
            // Data is retrieved using a memory clone operation
            const sobelData = e.data.result;
            const index = e.data.index;

            // Copy sobel data to the canvas
            let sobelImageData = Sobel.toImageData(sobelData, parallelCanvas.width, blockSize);
            tempContext.putImageData(sobelImageData, 0, blockSize * index);

            finished++;

            if (finished == numOfWorkers) {
                // Calculate Time difference
                end = window.performance.now();
                const difference = `${end-start} ms`;
                parallelResults.textContent = difference;
                // Pad to 6 hex digits so that small random values still form a valid color
                const color = '#' + Math.floor(Math.random() * 16777215).toString(16).padStart(6, '0');
                // Update chart
                updateChart(numOfWorkers, end - start, color, `Parallel (${numOfWorkers})`);
            }
        };

        // Launch n numbers of workers
        for (let i = 0; i < numOfWorkers; i++) {
            // Create a web worker object by passing in the file consisting of code to execute in parallel
            const worker = new Worker('./scripts/dedicatedWorker.js');
            
            // Once the worker has completed, execute onWorkEnded function
            worker.onmessage = onWorkEnded;

            // Break image into chunks using the blocksize
            const canvasData = tempContext.getImageData(0, blockSize * i, parallelCanvas.width, blockSize);

            // Start Working - (launch the thread)
            worker.postMessage({
                data: canvasData,
                // Thread ID
                index: i,
            });
        }
    }
}

The code that the worker runs is the following (dedicatedWorker.js):

importScripts('./sobel.js');

// Retrieve message (data) from the script that created the worker
self.onmessage = function (event) {
    // Set worker ID
    const index = event.data.index;

    // Get data and call the Sobel filter
    const sobelData = Sobel(event.data.data);

    // Post the data back on completion
    self.postMessage({ result: sobelData, index: index});
};
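The worker above delegates the actual filtering to the Sobel() function from sobel.js, which is not reproduced on this page. As a rough idea of what such a filter computes, here is a minimal sketch of the Sobel operator on a grayscale image stored row by row in a flat array. The function name sobelMagnitude and its signature are our own assumptions for illustration and do not reflect the library's API.

```javascript
// Minimal Sobel sketch (illustration only, not the sobel.js library).
// gray: flat array of width * height intensity values, stored row by row.
function sobelMagnitude(gray, width, height) {
  const kx = [-1, 0, 1, -2, 0, 2, -1, 0, 1]; // horizontal gradient kernel
  const ky = [-1, -2, -1, 0, 0, 0, 1, 2, 1]; // vertical gradient kernel
  const out = new Float32Array(width * height); // border pixels stay 0

  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      let gx = 0;
      let gy = 0;
      // Convolve the 3x3 neighbourhood with both kernels.
      for (let j = -1; j <= 1; j++) {
        for (let i = -1; i <= 1; i++) {
          const pixel = gray[(y + j) * width + (x + i)];
          const k = (j + 1) * 3 + (i + 1);
          gx += kx[k] * pixel;
          gy += ky[k] * pixel;
        }
      }
      // Gradient magnitude: large values mark edges.
      out[y * width + x] = Math.hypot(gx, gy);
    }
  }
  return out;
}

// A 3x3 image with a vertical edge between the 2nd and 3rd columns.
const edge = [0, 0, 1, 0, 0, 1, 0, 0, 1];
console.log(sobelMagnitude(edge, 3, 3)[4]); // 4
```

Every output pixel depends only on its 3x3 neighbourhood in the input, which is what makes the filter easy to split across several workers.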

Running the Sobel filter on a single separate thread lets us keep interacting with the UI, but the execution time increases slightly, because instantiating the worker takes time and resources. Luckily, we can drastically speed up the filtering by breaking the image into horizontal chunks and processing them in parallel. We do this by dividing the image height by the number of logical processors reported by the browser, which is available through the Navigator API (window.navigator.hardwareConcurrency). Once the image is chunked, we create that many worker objects and post each one its chunk of data together with its ID (index).

When a worker finishes, it sends its result back with postMessage(), which triggers the onmessage event handler that we assigned on the worker object in dedicated.js. In the file that we passed to the Worker constructor (dedicatedWorker.js), we refer to the worker's global scope as self and assign its onmessage event handler. This handler receives the data posted from dedicated.js and runs the Sobel filter. Once the filtering is done, the result is posted back through self.postMessage().
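Dividing the height by the number of workers leaves a fractional block size whenever the height is not an exact multiple of the worker count. The sketch below shows one possible way to compute whole-row bands instead. Both splitRows and pickWorkerCount are hypothetical helper names that are not part of the demo, and the fallback of 4 workers is an arbitrary assumption.

```javascript
// Hypothetical helper: split `height` rows into `workerCount` contiguous bands.
function splitRows(height, workerCount) {
  const base = Math.floor(height / workerCount);
  const extra = height % workerCount; // the first `extra` bands get one more row
  const bands = [];
  let top = 0;
  for (let index = 0; index < workerCount; index++) {
    const rows = base + (index < extra ? 1 : 0);
    bands.push({ index, top, rows });
    top += rows;
  }
  return bands;
}

// Hypothetical helper: use the reported core count, or an assumed default of 4.
function pickWorkerCount(nav) {
  return (nav && nav.hardwareConcurrency) || 4;
}

console.log(splitRows(10, 3));
// [ { index: 0, top: 0, rows: 4 },
//   { index: 1, top: 4, rows: 3 },
//   { index: 2, top: 7, rows: 3 } ]
```

Each band could then be read with getImageData(0, band.top, width, band.rows) and written back at the same offset. Note that a band filtered on its own has no neighbouring rows at its top and bottom edges, so the seams may need extra care.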

Sobel Filtering Execution Time Per Processor.PNG

Here are the results we got from running the Sobel filter on 10 processors. The more web workers we utilize, the faster the filter is applied to the image.

Shared Worker

A Shared Worker is a kind of Web Worker that allows multiple scripts (i.e., tabs, windows, iframes or other workers) to access and share the worker's resources simultaneously. The Shared Worker implements a different interface from the Dedicated Worker, and the process on which it runs also has a different global scope, the SharedWorkerGlobalScope.

The SharedWorker Interface - The Main Processes

The SharedWorker interface is used in the process that would spawn one or many shared worker processes.

Creating a Shared Worker

The SharedWorker interface derives from the AbstractWorker interface.

To create an instance of a Shared Worker, we use the SharedWorker constructor. The constructor receives, as its first parameter, the URL of a JavaScript source file that serves as the starting point of the process. It also takes an options object as an optional second parameter.

let worker = new SharedWorker('url-to-worker.js', options);

At construction time, the browser checks whether a process associated with that JavaScript source file has already been created. If there is none, the browser starts up a new process, loads the JavaScript source file as its starting point, and returns an instance of that process with a port object. If an existing process associated with the source file is found, the browser simply returns that process instance with a new port object.

Send Message to the Shared Worker

As a Shared Worker is created, the returned worker object has a read-only property of type MessagePort, namely port. Different main processes instantiating a new Shared Worker with the same JavaScript source file would have different port objects representing different connections.

The port object is used to communicate with the Shared Worker and to control the connection through the three methods demonstrated below. The following code snippet creates a shared worker, initiates communication with it, and sends three separate messages: a string, a complex object, and another string, in that order. Finally, it terminates the communication with the worker.

// Instantiate a Shared Worker
let worker = new SharedWorker('sharedWorker.js');

// Initiate the communication
worker.port.start();

// Send 3 messages to the shared worker
worker.port.postMessage("Hello worker from the main process!");
worker.port.postMessage({ foo: 3, bar: "dps921", foobar: [1, 2, 4] });
worker.port.postMessage("Goodbye!");

// Terminate the communication
worker.port.close();

Notice that, unlike the Dedicated Worker, where communication goes through the worker object itself, the Shared Worker communicates through a port. This is because, as we will see in the next section, a Shared Worker can communicate with multiple processes at once.

Receive Message from the Shared Worker

Besides sending, the port object can also be used to receive messages sent from the worker via the onmessage event handler. The handler can be set directly, as in the following snippet.

// Instantiate a Shared Worker
let worker = new SharedWorker('sharedWorker.js');

// Initiate the communication and listen to any communication from the worker
worker.port.onmessage = function(event) {
	let data = event.data;
	console.log(data);
};

The snippet above instantiates a shared worker and starts communication by registering a function to be executed when a message is received. The content of the message can be accessed through the data attribute of the event object, which is passed in as the first parameter of the function.

Notice that the call to the port's start() method is omitted. This is because start() is called implicitly when the onmessage handler is set. The call to close() is also gone, because closing the port would end the connection and the handler would receive no further messages.

The SharedWorkerGlobalScope - The Worker

The SharedWorkerGlobalScope is the global scope of the JavaScript source file that is loaded by the SharedWorker() constructor. Unlike the regular browser's global scope, which is accessible via the window object, the SharedWorkerGlobalScope is represented by the self object.

An important feature of the Shared Worker is that all processes connected to the worker share the same global scope as there is only a single worker. This enables data-sharing between multiple processes which will be demonstrated later in this section.

Detecting a New Connection

When a new connection is established between a process and a shared worker, an event handler in the SharedWorkerGlobalScope, namely onconnect, is triggered. By registering a function to this handler, we can execute custom code when this happens. Let's say we have the following snippet in the sharedWorker.js source file:

// self.onconnect = ...
onconnect = function(event) {
	let port = event.ports[0];
	port.start();
	port.postMessage("Hello main process from worker!");
	port.postMessage("Goodbye!");
	port.close();
}

The sharedWorker.js file above, when loaded into a worker, initiates the communication, sends 2 messages, and ends the communication for any process that connects to it. If that process has the onmessage event handler set up, it can access each message through the event.data attribute.

Notice in the code that the port of the connecting process is available as event.ports[0]. In the onconnect event handler, this ports array always has exactly 1 element. Besides that, as in the main process, the explicit start() and close() method calls on the port object are necessary when we do not set an onmessage event handler for it.

Communicating with Connected Processes

As the connected ports are accessible in the sharedWorker.js, specifically in the onconnect event handler, we can set them up so that each of them can listen for messages sent from their corresponding processes.

The following sharedWorker.js source file uses an array to store the ports of all connected processes. Each port is set up to receive messages from its own process and then broadcast them to every connected process, including the sender.

let allConnectedPorts = [];
onconnect = function(event) {
	let port = event.ports[0];
	port.onmessage = function(e) {
		allConnectedPorts.forEach(p => p.postMessage(e.data));
	};
	allConnectedPorts.push(port);
}
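If the sender should not receive its own message back, the relay can skip the originating port. The following is a minimal sketch of that variation, not the code used in our demo: the helper name register and the Set-based bookkeeping are assumptions made for illustration. The onconnect wiring is guarded so that the file can also be loaded outside a shared worker.

```javascript
// Every connected MessagePort is tracked here.
const ports = new Set();

// Hypothetical helper: wire up one newly connected port.
function register(port) {
  ports.add(port);
  // Setting onmessage also starts the port implicitly.
  port.onmessage = function (e) {
    // Relay to every port except the one the message came from.
    for (const other of ports) {
      if (other !== port) {
        other.postMessage(e.data);
      }
    }
  };
}

// Only hook into onconnect when actually running inside a shared worker.
if (typeof SharedWorkerGlobalScope !== 'undefined') {
  self.onconnect = function (event) {
    register(event.ports[0]);
  };
}
```

For a two-player setup this means a move posted by one window reaches only the opponent's window.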

An Interactive Demo - Battleships Game

With the Shared Worker and the few techniques above, we can set up an application that works fully on the client side (i.e., in the browser) and is capable of sharing data between windows.

We have leveraged this knowledge to build a demo of a 2-player Battleships game that consists of 2 browser windows, 1 for each player. Rather than having a server coordinating the 2 players, the players coordinate themselves by sending messages to and receiving messages from each other through a shared worker acting as an arbitrator.

The demo is accessible at the following URL: https://vitokhangnguyen.github.io/WebWorkerDemo/shared.html

Additional Web Worker's Features

Load External Scripts

In the regular browser scope, scripts can use data or functions from each other when they are added to the same HTML file with <script> tags. For example, with the following markup, script1 and script2 share the same global scope:

<script src="/script1.js"></script>
<script src="/script2.js"></script>

However, this does not work with web workers, because worker processes do not load HTML files. To achieve the same effect, the WorkerGlobalScope offers a function named importScripts().

In the worker's source file, we can do the following to access the global scope of 3 source files: script1.js, script2.js and foobar.js.

importScripts("script1.js", "script2.js", "foobar.js")

Notice that the function accepts any number of arguments, and the effect is the same as importing the files one by one.

Transferable Objects

It is important to realize that in Web Worker communication there is no truly shared data. Data is passed between processes by message passing, and the browser uses an algorithm called structured cloning to do so. In short, a JavaScript object is recursively copied into a serialized form on the sending side and rebuilt on the receiving side. This is similar in spirit to a JSON round trip, but the algorithm also handles types such as Date, Map, Set and ArrayBuffer, as well as cyclic references. Functions cannot be cloned.
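A quick way to see what structured cloning does to a value is the global structuredClone() function, which runs the same algorithm that postMessage() applies to its message. The snippet below is a small sketch that assumes a runtime providing structuredClone() (recent browsers and Node.js 17 or later); the object and its property names are made up for the example.

```javascript
// An object with a Date, a Set, and a reference back to itself.
const original = { when: new Date(0), tags: new Set(['a', 'b']) };
original.loop = original;

// The receiving side of postMessage() would see a copy like this one.
const copy = structuredClone(original);

console.log(copy !== original);         // true: a distinct object
console.log(copy.when instanceof Date); // true: still a Date, not a string
console.log(copy.tags.has('b'));        // true: the Set survives
console.log(copy.loop === copy);        // true: the cycle is preserved

// Functions are not cloneable, so this attempt throws.
let threw = false;
try {
  structuredClone({ fn() {} });
} catch (err) {
  threw = true;
}
console.log(threw); // true
```

Changing copy afterwards has no effect on original, which is why two workers never observe each other's modifications to a posted object.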

The method has a weakness: if the copied object is very large, copying and transmitting it through a message can take a long time. To remedy this issue, most modern browsers support transferable objects. A transferable object stays in memory and is not copied; instead, ownership of it is handed over to the destination process. The old reference to that object in the source process is no longer usable after the transfer. This gives a huge performance boost when transferring large objects.

To utilize the feature, the following syntax of the Worker.postMessage() function is employed:

let worker = new Worker('worker.js');
worker.postMessage(data, [arrayBuffer1, arrayBuffer2]);

Note that the 1st argument of the function call is the data you want to send, and the 2nd argument is an array of the items within that data that should be transferred rather than copied. The array may only contain transferable objects, such as ArrayBuffer, MessagePort or ImageBitmap.
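The effect of a transfer on the sending side can be observed without a worker. In the sketch below, structuredClone() with its transfer option is used purely as a stand-in for worker.postMessage(buffer, [buffer]), on the assumption that the runtime provides it (recent browsers and Node.js 17 or later); the variable names are our own.

```javascript
// A 1 KiB buffer owned by the sending side.
const buffer = new ArrayBuffer(1024);
const sizeBefore = buffer.byteLength;

// Transfer instead of copy: ownership of the memory moves to `moved`.
const moved = structuredClone(buffer, { transfer: [buffer] });

console.log(sizeBefore);        // 1024
console.log(moved.byteLength);  // 1024: the receiver owns the data now
console.log(buffer.byteLength); // 0: the original buffer is detached
```

Because the original buffer is detached, the sender has to wait for the worker to transfer the data back before it can read or write it again.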

Error Handling

Web Worker API provides the capability of handling errors in the worker processes from the main process. When an Error is raised in a worker, an ErrorEvent will be fired to the main process. We can listen to this event by assigning a function to the event handler onerror of the worker object.

The following code sets up the main process to print to the console a message when an Error is raised in the worker:

let worker = new Worker('worker.js');
worker.onerror = function(e) {
    console.log(`An error occurred in the ${e.filename} worker at line ${e.lineno}`);
    console.log(`Error: ${e.message}`);
}

Note that the ErrorEvent has 3 special properties: filename (the name of the worker's source file where the error occurred), lineno (the line number in the worker's source file where the error occurred) and message (a description of the error).
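When several workers are in use, the same handler can be attached to each of them through a small helper. The following is a sketch under our own naming: formatWorkerError and watchWorker are hypothetical helpers, not part of the Web Worker API, and the log parameter exists only so that the output can be redirected.

```javascript
// Hypothetical helper: turn the 3 ErrorEvent properties into one line.
function formatWorkerError(e) {
  return `${e.filename}:${e.lineno} ${e.message}`;
}

// Hypothetical helper: attach an error handler to a worker-like object.
function watchWorker(worker, log = console.error) {
  worker.onerror = function (e) {
    log(formatWorkerError(e));
  };
  return worker;
}

// In a browser this would be used as:
//   const worker = watchWorker(new Worker('worker.js'));
```

Any uncaught exception thrown in worker.js would then be reported on the main thread as a single line containing the file name, line number and message.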

Other Workers

Service Worker

As a type of Web Worker, the Service Worker is also a JavaScript source file that runs in the background of the browser, on a thread separate from the main thread. It acts as a proxy server between the application in the browser and the Internet. It is equipped with utilities that intercept network requests and act on them. Those actions include relaying a request to its intended destination, dropping the request, caching the response, or responding to the request with cached data. In addition, it can treat requests differently depending on the individual request or the current network status.

Thanks to those utilities, the Service Worker enables your applications to work offline. It allows the developer to implement strategies for this, such as "network-first-then-cache", "cache-first-then-network" or "cache-only". The Service Worker is an essential component of any Progressive Web App (PWA).
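As an illustration of one such strategy, a "cache-first-then-network" handler could be sketched as follows. This is a simplified sketch under stated assumptions rather than a production implementation: cacheFirst is a hypothetical helper, the cache name 'demo-v1' is arbitrary, and the cache and fetchFn parameters are passed in explicitly so that the logic does not depend on the service worker globals.

```javascript
// Hypothetical helper: answer from the cache, fall back to the network.
// `cache` needs match() and put(); `fetchFn` performs the network request.
async function cacheFirst(request, cache, fetchFn) {
  const cached = await cache.match(request);
  if (cached !== undefined) {
    return cached; // cache hit: no network traffic at all
  }
  const response = await fetchFn(request);
  // Store a clone, because a response body can only be consumed once.
  await cache.put(request, response.clone());
  return response;
}

// Wire the helper to the fetch event only inside a real service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', (event) => {
    event.respondWith(
      caches.open('demo-v1').then((cache) =>
        cacheFirst(event.request, cache, (req) => fetch(req))
      )
    );
  });
}
```

A "network-first-then-cache" strategy would swap the order: try fetchFn first and fall back to cache.match() only when the network request fails.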

Chrome Worker

This type of Web Worker allows the developer to write privileged code by giving it access to low-level functions. Specifically, the worker gives us access to js-ctypes, which allows an application to extend or call native code written in C or C++.

This feature is not standardized and may be removed, or may already have been removed, from many browsers.

References

  1. HTML Living Standard, Web Workers: https://html.spec.whatwg.org/multipage/workers.html. Last updated: 2 Dec. 2020.
  2. MDN Web Docs, Web Worker API: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API. Last updated: 26 Aug. 2020.
  3. Can I use..., Support tables for HTML5, CSS3, etc, Web Workers: https://caniuse.com/webworkers. Last updated: 30 Nov. 2020.
  4. Can I use..., Support tables for HTML5, CSS3, etc, Shared Workers: https://caniuse.com/sharedworkers. Last updated: 30 Nov. 2020.
  5. HTML5Rocks, The Problem: JavaScript Concurrency: https://www.html5rocks.com/en/tutorials/workers/basics/. Last updated: 26 Jul. 2020.
  6. MDN Web Docs, Service Worker API: https://developer.mozilla.org/en-US/docs/Web/API/Service_Worker_API. Last updated: 16 Nov. 2020.
  7. MDN Web Docs, ChromeWorker: https://developer.mozilla.org/en-US/docs/Mozilla/Gecko/Chrome/API/ChromeWorker. Last updated: 18 Feb. 2020.