MSEdgeExplainers

HTML Modules Design Document

Contact emails

daniec@microsoft.com, sasebree@microsoft.com, travil@microsoft.com, pcupp@microsoft.com

Introduction

We are proposing an extension of the ES6 Script Modules system to include HTML Modules. For a high-level description of HTML modules and the motivation for them see the associated explainer doc.

This document describes how we plan to implement HTML modules in Blink and V8.

V8 changes

Currently the v8::internal::Module class is V8's representation of a Source Text Module Record. It holds information about a Script Module such as its RequestedModules, its status ("uninstantiated", "instantiating", etc.), and its imports/exports. We will introduce two subclasses, ScriptModule and HTMLModule. ScriptModule will contain the functionality specific to Script Modules that currently resides in Module, and HTMLModule will contain the new HTML module code. Common functionality will remain in Module, which can then be considered roughly equivalent to the spec's Cyclic Module Record base type.

Module will contain the following new field to distinguish whether it is a ScriptModule or HTMLModule:

  DECL_INT_ACCESSORS(type)
  enum Type {
    kScript,
    kHTML
  };
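As a rough illustration of this split, the sketch below models the hierarchy with plain C++ classes. It deliberately does not use V8's heap-object machinery (DECL_INT_ACCESSORS and friends); the constructor shapes, the kInstantiated status value, and the IsHTMLModule helper are assumptions made for the example, not part of the proposal.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for v8::internal::Module (not a real V8 heap object).
class Module {
 public:
  enum Type { kScript, kHTML };
  enum Status { kUninstantiated, kInstantiating, kInstantiated };

  virtual ~Module() = default;

  // Functionality shared by both module types stays on the base class.
  Type type() const { return type_; }
  Status status() const { return status_; }
  const std::vector<std::string>& requested_modules() const {
    return requested_modules_;
  }

 protected:
  Module(Type type, std::vector<std::string> requested_modules)
      : type_(type), requested_modules_(std::move(requested_modules)) {}

 private:
  Type type_;
  Status status_ = kUninstantiated;
  std::vector<std::string> requested_modules_;
};

// Would hold the Script Module specific code that lives in Module today.
class ScriptModule : public Module {
 public:
  explicit ScriptModule(std::vector<std::string> requested_modules)
      : Module(kScript, std::move(requested_modules)) {}
};

// Would hold the new HTML module code.
class HTMLModule : public Module {
 public:
  explicit HTMLModule(std::vector<std::string> requested_modules)
      : Module(kHTML, std::move(requested_modules)) {}
};

// Hypothetical helper: callers branch on the type field.
inline bool IsHTMLModule(const Module& module) {
  return module.type() == Module::kHTML;
}
```

Storing the type as a field on the base class lets shared code distinguish the two kinds of module without knowing about either subclass's extra fields.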

HTMLModule will contain the following new fields in addition to those inherited from Module:

  // The Document for this HTML module.
  DECL_ACCESSORS(document, Object)

  // Used for HTML Modules.  Each HTMLScriptElement in the module's document has
  // a corresponding script entry in this array.
  DECL_ACCESSORS(script_entries, FixedArray)

script_entries is an array of HTMLModuleScriptEntry, another class introduced to support HTML modules. Each script element in an HTML module corresponds to a single HTMLModuleScriptEntry. An HTMLModuleScriptEntry has three fields:

  DECL_BOOLEAN_ACCESSORS(is_inline)
  DECL_ACCESSORS(module_record, Object)
  DECL_ACCESSORS(source_name, Object)

Each inline <script> element will have an associated HTMLModuleScriptEntry with is_inline == true and module_record set to the ScriptModule for that <script> element. Each external <script> element will have an associated HTMLModuleScriptEntry with is_inline == false and source_name set to the value of the element's src attribute; the module_record for such an entry is filled in later, during module graph instantiation. Note that Blink provides the data for these fields when the HTMLModuleScriptEntry objects are constructed; V8 has no knowledge of the <script> elements or of the structure of the HTML module's Document.
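The two kinds of entry and the deferred module_record assignment can be sketched as follows. This is a minimal standalone model: the struct layout, the use of std::shared_ptr, and the MakeInlineEntry, MakeExternalEntry, and ResolveExternalEntry helpers are all hypothetical and stand in for V8 heap objects and accessors.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Stand-in for the ScriptModule record; the real type is a V8 heap object.
struct ScriptModule {
  std::string source_text;
};

// Simplified model of HTMLModuleScriptEntry with the three fields above.
struct HTMLModuleScriptEntry {
  bool is_inline = false;
  std::shared_ptr<ScriptModule> module_record;  // Null until known.
  std::string source_name;                      // Empty for inline scripts.
};

// Hypothetical factory for an inline <script>: the record already exists.
HTMLModuleScriptEntry MakeInlineEntry(std::shared_ptr<ScriptModule> record) {
  assert(record != nullptr);
  HTMLModuleScriptEntry entry;
  entry.is_inline = true;
  entry.module_record = std::move(record);
  return entry;
}

// Hypothetical factory for an external <script>: only the src value is known.
HTMLModuleScriptEntry MakeExternalEntry(std::string src) {
  HTMLModuleScriptEntry entry;
  entry.is_inline = false;
  entry.source_name = std::move(src);
  return entry;
}

// Models the later step in which module graph instantiation fills in the
// record for an external entry.
void ResolveExternalEntry(HTMLModuleScriptEntry& entry,
                          std::shared_ptr<ScriptModule> record) {
  assert(!entry.is_inline && entry.module_record == nullptr);
  entry.module_record = std::move(record);
}
```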

The following is an overview of the algorithms for HTMLModule instantiation, evaluation, and import/export resolution. These will generally be implemented in HTMLModule overrides of Module functions, with their counterpart implementations remaining unchanged in ScriptModule (Module::PrepareInstantiate, Module::ResolveExport, etc). These descriptions are adapted from this document of proposed HTML module spec changes.

We will introduce the new classes blink::HTMLModuleScript and blink::JSModuleScript, both deriving from the existing blink::ModuleScript.

Class Diagram

The JavaScript-specific bits of ModuleScript will be pushed down to the derived JSModuleScript, but ModuleScript will otherwise remain the same, containing data and functionality common to both module types. We anticipate that as other module types are introduced, further corresponding derivations of ModuleScript will be added.

Most existing code in Modulator, ModuleMap, and ModuleTreeLinker will continue to work with ModuleScript, with updates to ensure that they are properly generalized to both HTML and JS modules.

We will also rename the ScriptModule class to ModuleRecord. ModuleRecord better reflects the usage of the class as Blink's handle to the ModuleRecord of the given module, and the old ScriptModule name was confusing when we also have a class named ModuleScript.

ModuleScriptLoader

ModuleScriptLoader will remain responsible for managing the fetching and construction of a given module, now generalized to support both JS modules and HTML modules. It will use the Content-Type header of the HTTP response to determine whether a given module is created as HTML or JavaScript, as follows:

if (mime_type == "text/html") {
  // Create HTMLModuleScript
} else if (MIMETypeRegistry::IsSupportedJavaScriptMIMEType(mime_type)) {
  // Create JSModuleScript
} else {
  // Refuse to create a module, and emit an error that 'text/html' or a JavaScript MIME-type is required.
}

Note: per spec discussion here, a new MIME type may be introduced for HTML modules instead of reusing text/html.

ModuleScriptLoader will continue to delegate to ModuleScriptFetcher to perform the actual fetch, and ModuleScriptFetcher will continue to deliver a ModuleScriptCreationParams back to ModuleScriptLoader upon fetch completion. A MIME type will be added to ModuleScriptCreationParams so that it can be used by ModuleScriptLoader to determine which type of module to create.
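To make the hand-off concrete, here is a minimal standalone sketch of creation parameters that carry a MIME type and of the loader-side decision based on it. The struct fields, the ModuleKind enum, ChooseModuleKind, and the short JavaScript MIME type list are assumptions for illustration; the real code would call MIMETypeRegistry::IsSupportedJavaScriptMIMEType. The sketch also assumes the MIME type has already been normalized (lowercased, parameters stripped).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

enum class ModuleKind { kJavaScript, kHTML };

// Simplified stand-in for blink::ModuleScriptCreationParams; only fields
// relevant to this example are modelled.
struct ModuleScriptCreationParams {
  std::string response_url;
  std::string source_text;
  std::string mime_type;  // New: taken from the response's Content-Type.
};

// Stand-in for MIMETypeRegistry::IsSupportedJavaScriptMIMEType. The list is
// a deliberately short, illustrative subset.
bool IsJavaScriptMimeType(std::string_view mime_type) {
  static constexpr std::array<std::string_view, 3> kTypes = {
      "text/javascript", "application/javascript", "application/ecmascript"};
  return std::find(kTypes.begin(), kTypes.end(), mime_type) != kTypes.end();
}

// Returns the kind of module to create, or std::nullopt if the MIME type is
// neither 'text/html' nor a JavaScript MIME type. In the latter case the
// loader would refuse to create a module and report an error.
std::optional<ModuleKind> ChooseModuleKind(
    const ModuleScriptCreationParams& params) {
  if (params.mime_type == "text/html") return ModuleKind::kHTML;
  if (IsJavaScriptMimeType(params.mime_type)) return ModuleKind::kJavaScript;
  return std::nullopt;
}
```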

DocumentModuleScriptFetcher will be updated so that it doesn't block a response with a text/html MIME type. Other ModuleScriptFetcher derived classes will remain unchanged as we still want to block text/html for Workers, ServiceWorkers, etc.
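The resulting policy can be summarized in a few lines. This is only a sketch: in Blink the fetchers are separate ModuleScriptFetcher subclasses rather than an enum, and FetcherKind and ShouldBlockHtmlResponse are hypothetical names introduced here.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical enumeration of fetcher flavours.
enum class FetcherKind { kDocument, kWorker, kServiceWorker };

// Returns true if a text/html response should be rejected by this fetcher
// before a module is created from it. Other MIME types are out of scope for
// this check and are never blocked by it.
bool ShouldBlockHtmlResponse(FetcherKind fetcher, std::string_view mime_type) {
  if (mime_type != "text/html") return false;
  // Only document module fetches may yield HTML modules.
  return fetcher != FetcherKind::kDocument;
}
```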

HTML Module Parsing

When a ModuleScriptLoader instance determines that the result of a fetch should be processed as an HTML module, it will instantiate a new HTMLDocument with a new DocumentClass flag marking it as an HTML module document. A DocumentParser instance will be connected to it and fed the contents of the file fetched by the ModuleScriptLoader. The parser will follow the standard HTML5 parsing rules, with a few differences. First, if a script without type="module" is encountered, the parser will terminate with an error that prevents the HTML Module Record from being created (note: we may instead decide to coerce these to type="module"; see discussion here). Second, when <script> elements with type="module" are encountered, they are logged in an HTMLModuleScriptEntry list that records, for each <script>, whether it is inline, the ScriptModule record for an inline script, and the src attribute value for an external script (the is_inline, module_record, and source_name fields described above).

Specifically, parsing an HTML module document will follow these changes to the normal rules:

Once parsing completes, the resulting HTMLDocument and the HTMLModuleScriptEntry list will be used to instantiate a new HTMLModule.
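The script bookkeeping performed during parsing can be sketched as below. This is a toy model under stated assumptions: it operates on an already-extracted list of script elements rather than a real DocumentParser, it implements the "terminate with an error" option rather than coercion, and ScriptElement, ScriptEntry, and CollectScriptEntries are hypothetical names.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Toy model of a <script> element as seen by the parser.
struct ScriptElement {
  std::string type;  // Value of the type attribute.
  std::string src;   // Value of the src attribute; empty if absent.
  std::string text;  // Inline source text; unused when src is set.
};

// Toy model of one entry in the HTMLModuleScriptEntry list.
struct ScriptEntry {
  bool is_inline;
  std::string source_name;  // src value for external scripts.
  std::string inline_text;  // Stand-in for an inline script's module record.
};

// Builds the entry list in document order. Returns std::nullopt if any
// script lacks type="module", which models the parser terminating with an
// error so that no HTML Module Record is created.
std::optional<std::vector<ScriptEntry>> CollectScriptEntries(
    const std::vector<ScriptElement>& scripts) {
  std::vector<ScriptEntry> entries;
  for (const ScriptElement& script : scripts) {
    if (script.type != "module") return std::nullopt;
    if (script.src.empty()) {
      entries.push_back(ScriptEntry{true, "", script.text});
    } else {
      entries.push_back(ScriptEntry{false, script.src, ""});
    }
  }
  return entries;
}
```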

ModuleTreeLinker

Changes to ModuleTreeLinker will be minor and will mostly involve generalizing the algorithms so that they can handle the different structure of HTML modules, e.g. ensuring that ModuleTreeLinker::FetchDescendants treats an HTML module's inline child scripts correctly. For these there is nothing to fetch (since they have no URL and already have a Module Record created for them), but we must still recurse into them so that their descendants are fetched.
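The recursion into inline scripts can be illustrated with a small synchronous model. The real ModuleTreeLinker::FetchDescendants is asynchronous and works through the module map; here ModuleNode and CollectDescendantFetches are hypothetical, and the function only gathers the URLs that would be requested from one module rather than performing any fetch.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <vector>

// Toy module node. Inline scripts of an HTML module have an empty url.
struct ModuleNode {
  std::string url;
  // Already-resolved URLs of imports and of external scripts.
  std::vector<std::string> requested_urls;
  // Inline child scripts; populated for HTML modules only.
  std::vector<std::shared_ptr<ModuleNode>> inline_children;
};

// Gathers the URLs that must be fetched for the descendants of `node`.
void CollectDescendantFetches(const ModuleNode& node,
                              std::set<std::string>& to_fetch) {
  for (const std::string& url : node.requested_urls) to_fetch.insert(url);
  for (const auto& child : node.inline_children) {
    assert(child != nullptr && child->url.empty());
    // Nothing to fetch for the inline child itself, but recurse so that its
    // own imports are discovered.
    CollectDescendantFetches(*child, to_fetch);
  }
}
```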

import.meta.document support

To support import.meta.document we will make the following changes: