daniec@microsoft.com, sasebree@microsoft.com, travil@microsoft.com, pcupp@microsoft.com
We are proposing an extension of the ES6 Script Modules system to include HTML Modules. For a high-level description of HTML modules and the motivation for them, see the associated explainer doc.
This document describes how we plan to implement HTML modules in Blink and V8.
Currently the v8::internal::Module class is V8's representation of a Source Text Module Record, which holds information about a Script Module such as its RequestedModules, its status ("uninstantiated", "instantiating", etc.), and its imports/exports. We will introduce two subclasses, ScriptModule and HTMLModule. ScriptModule will contain the functionality specific to Script Modules that currently resides in Module, and HTMLModule will contain the new HTML module code. Common functionality will remain in Module, which can now be considered roughly equivalent to the spec's Cyclic Module Record base type.
Module will contain the following new field to distinguish whether it is a ScriptModule or HTMLModule:
DECL_INT_ACCESSORS(type)
enum Type {
  kScript,
  kHTML
};
HTMLModule will contain the following new fields in addition to those inherited from Module:
// If this is an HTMLModule, this is the Document for the module.
// For a ScriptModule, this is unused.
DECL_ACCESSORS(document, Object)
// Used for HTML Modules. Each HTMLScriptElement in the module's document has
// a corresponding script entry in this array.
DECL_ACCESSORS(script_entries, FixedArray)
script_entries is an array of HTMLModuleScriptEntry, another class introduced to support HTML modules. Each script element in an HTML module corresponds to a single HTMLModuleScriptEntry. An HTMLModuleScriptEntry has 3 fields:
DECL_BOOLEAN_ACCESSORS(is_inline)
DECL_ACCESSORS(module_record, Object)
DECL_ACCESSORS(source_name, Object)
Inline <script> elements will have an associated HTMLModuleScriptEntry with is_inline == true and module_record set to the ScriptModule associated with that <script> element.
External script elements will have an associated HTMLModuleScriptEntry with is_inline == false and source_name set to the value of the src attribute of the <script> element. The module_record for this HTMLModuleScriptEntry will be filled in later during module graph instantiation. Note that the data for these fields is provided by Blink when the HTMLModuleScriptEntries are constructed; V8 doesn't actually have any knowledge of the <script> elements or the structure of the HTML module's Document.
The following is an overview of the algorithms for HTMLModule instantiation, evaluation, and import/export resolution. These will generally be implemented in HTMLModule overrides of Module functions (Module::PrepareInstantiate, Module::ResolveExport, etc.), with their counterpart implementations remaining unchanged in ScriptModule. These descriptions are adapted from this document of proposed HTML module spec changes.
The module's HTMLDocument document.
3 vectors, where the ith entry in each corresponds to the ith <script> element in document.
is_inline: is_inline[i] == true if the ith <script> in document is inline, false if it is external.
source_urls: source_urls[i] is the ith <script>'s source URL if that script is external, null otherwise.
v8_modules: v8_modules[i] is the ith <script>'s module record if that script is inline, null otherwise.
The contents of the above vectors will be used to initialize the HTMLModuleScriptEntries for the new HTMLModule.
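As a minimal sketch of how the three parallel vectors could be zipped into per-script entries, the following uses plain standard-library types rather than V8 heap objects. ModuleRecordStub, ScriptEntry, and BuildScriptEntries are hypothetical names invented for illustration; only the three field names come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a V8 module record handle.
struct ModuleRecordStub {
  int id;
};

// Simplified mirror of the three HTMLModuleScriptEntry fields.
struct ScriptEntry {
  bool is_inline;
  std::optional<ModuleRecordStub> module_record;  // set for inline scripts
  std::optional<std::string> source_name;         // set for external scripts
};

// Zips the three parallel vectors into one entry per <script> element.
std::vector<ScriptEntry> BuildScriptEntries(
    const std::vector<bool>& is_inline,
    const std::vector<std::optional<std::string>>& source_urls,
    const std::vector<std::optional<ModuleRecordStub>>& v8_modules) {
  // The ith entry of every vector describes the ith <script> element.
  assert(is_inline.size() == source_urls.size());
  assert(is_inline.size() == v8_modules.size());

  std::vector<ScriptEntry> entries;
  entries.reserve(is_inline.size());
  for (std::size_t i = 0; i < is_inline.size(); ++i) {
    if (is_inline[i]) {
      // Inline script: the record already exists and there is no URL.
      entries.push_back({true, v8_modules[i], std::nullopt});
    } else {
      // External script: only the URL is known here. The record is filled
      // in later, during module graph instantiation.
      entries.push_back({false, std::nullopt, source_urls[i]});
    }
  }
  return entries;
}
```

Using std::optional here only models the "null otherwise" wording; the real fields are V8 Object accessors.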
Module Instantiation: HTMLModules don't have a lexical environment to set up, so there will be no need for an equivalent to ModuleDeclarationEnvironmentSetup in PrepareInstantiate/FinishInstantiate. Note that the ‘export * from all inline script elements’ functionality is performed in our redefined ResolveExport below.
Export Resolution: HTMLModule will implement a modified version of ResolveExport(exportName, resolveSet), overriding Module::ResolveExport. This function’s purpose is to “resolve an imported binding to the actual defining module and local binding name”. For HTMLModules, instead of looking for local exports etc. we’ll iterate through each inline script and export their contents as for an ‘export *’. We redefine as follows:
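As a rough illustration of the 'export *' behaviour described above (this is not the proposed spec text), the sketch below resolves a name by scanning every inline script. It assumes that a name exported by two different inline scripts is reported as ambiguous, mirroring how star exports behave for script modules, and it leaves out resolveSet cycle detection and any special handling of default exports. InlineScript, Resolution, and ResolveHtmlModuleExport are hypothetical names.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an inline script's module record: just the
// set of names it exports.
struct InlineScript {
  std::set<std::string> exports;
};

struct Resolution {
  const InlineScript* module = nullptr;  // the defining module
  std::string binding_name;              // the local binding name
  bool ambiguous = false;                // true if several scripts match
};

// Resolves export_name as if the HTML module contained an `export * from`
// for each inline script. Returns nullopt when no script exports the name.
std::optional<Resolution> ResolveHtmlModuleExport(
    const std::vector<InlineScript>& inline_scripts,
    const std::string& export_name) {
  assert(!export_name.empty());
  std::optional<Resolution> found;
  for (const InlineScript& script : inline_scripts) {
    if (script.exports.count(export_name) == 0) continue;
    if (found) {
      // Assumption: a second, different defining script makes the name
      // ambiguous, as with conflicting star exports in script modules.
      Resolution conflict;
      conflict.ambiguous = true;
      return conflict;
    }
    found = Resolution{&script, export_name, false};
  }
  return found;
}
```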
Module Execution: HTMLModule will implement a modified version of InnerModuleEvaluation(module, stack, index). For ScriptModules, the idea of this method is to recurse for each child module, then execute the JS of the current module. HTMLModules don't have any code of their own to execute -- instead, they just recurse into their RequestedModules (i.e., the <script> elements in the module's HTMLDocument). In spec language, the algorithm has the following changes from the ScriptModule version:
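The difference between the two module kinds during evaluation can be sketched with a simplified depth-first walk. This is only an illustration: it replaces the spec algorithm's stack and index bookkeeping with a single evaluated flag, and ModuleNode and EvaluateModule are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified module graph node.
struct ModuleNode {
  enum Kind { kScript, kHTML };
  Kind kind;
  std::string name;
  std::vector<ModuleNode*> requested_modules;
  bool evaluated = false;
};

// Depth-first evaluation. Appends to `executed` the name of every module
// whose own code runs, in execution order.
void EvaluateModule(ModuleNode* module, std::vector<std::string>* executed) {
  assert(module != nullptr && executed != nullptr);
  if (module->evaluated) return;  // already handled; also breaks cycles
  module->evaluated = true;

  // Both kinds recurse into their requested modules first. For an HTML
  // module these are the <script> elements of its document.
  for (ModuleNode* child : module->requested_modules) {
    EvaluateModule(child, executed);
  }

  // Only script modules have code of their own to execute.
  if (module->kind == ModuleNode::kScript) {
    executed->push_back(module->name);
  }
}
```

Evaluating an HTML module with two inline scripts that share a dependency would run the dependency once, then each inline script in document order, and nothing for the HTML module itself.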
We will introduce the new classes blink::HTMLModuleScript and blink::JSModuleScript, both deriving from the existing blink::ModuleScript.

The JavaScript-specific bits of ModuleScript will be pushed down to the derived JSModuleScript, but ModuleScript will otherwise remain the same, containing data and functionality common to both module types. We anticipate that as other module types are introduced, further corresponding derivations of ModuleScript will be added.
Most existing code in Modulator, ModuleMap, and ModuleTreeLinker will continue to work with ModuleScript, with updates to ensure that they are properly generalized to both HTML and JS modules.
We will also rename the ScriptModule class to ModuleRecord. ModuleRecord better reflects the usage of the class as Blink's handle to the ModuleRecord of the given module, and the old ScriptModule name was confusing when we also have a class named ModuleScript.
ModuleScriptLoader will still have the responsibility for managing the fetching and construction of a given module, now generalized to support both JS modules and HTML modules. It will now use the HTTP response header's Content-Type to determine whether a given module is created as HTML or JavaScript, as follows:
if (mime_type == "text/html") {
  // Create HTMLModuleScript
} else if (MIMETypeRegistry::IsSupportedJavaScriptMIMEType(mime_type)) {
  // Create JSModuleScript
} else {
  // Refuse to create a module, and emit an error that 'text/html' or a JavaScript MIME-type is required.
}
Note: per spec discussion here a new MIME type may be introduced for HTML modules instead of reusing text/html.
ModuleScriptLoader will continue to delegate to ModuleScriptFetcher to perform the actual fetch, and ModuleScriptFetcher will continue to deliver a ModuleScriptCreationParams back to ModuleScriptLoader upon fetch completion. A MIME type will be added to ModuleScriptCreationParams so that it can be used by ModuleScriptLoader to determine which type of module to create.
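The MIME-type dispatch and the class split can be sketched together in standalone C++. Everything here is a simplified assumption for illustration: CreationParams stands in for ModuleScriptCreationParams, IsJavaScriptMimeType is a hypothetical substitute for MIMETypeRegistry::IsSupportedJavaScriptMIMEType that recognizes only two types, and a null return stands in for the error path.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Hypothetical, simplified stand-in for ModuleScriptCreationParams,
// extended with the MIME type of the response.
struct CreationParams {
  std::string mime_type;
  std::string source_text;
};

struct ModuleScript {
  virtual ~ModuleScript() = default;
  virtual bool IsHTML() const = 0;
};

struct JSModuleScript : ModuleScript {
  bool IsHTML() const override { return false; }
};

struct HTMLModuleScript : ModuleScript {
  explicit HTMLModuleScript(const CreationParams& params) {
    // Invariant: only text/html responses become HTML modules.
    assert(params.mime_type == "text/html");
  }
  bool IsHTML() const override { return true; }
};

// Hypothetical substitute for the real JavaScript MIME type check; it
// knows only two of the supported types.
bool IsJavaScriptMimeType(const std::string& mime_type) {
  static const std::set<std::string> kTypes = {"text/javascript",
                                               "application/javascript"};
  return kTypes.count(mime_type) > 0;
}

// Picks the module type from the MIME type. Returns null when the type is
// neither text/html nor a JavaScript MIME type.
std::unique_ptr<ModuleScript> CreateModuleScript(const CreationParams& params) {
  if (params.mime_type == "text/html") {
    return std::make_unique<HTMLModuleScript>(params);
  }
  if (IsJavaScriptMimeType(params.mime_type)) {
    return std::make_unique<JSModuleScript>();
  }
  return nullptr;
}
```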
DocumentModuleScriptFetcher will be updated so that it doesn't block a response with a text/html MIME type. Other ModuleScriptFetcher derived classes will remain unchanged as we still want to block text/html for Workers, ServiceWorkers, etc.
When a ModuleScriptLoader instance determines that the result of a fetch should be processed as an HTML module, it will instantiate a new HTMLDocument with a new DocumentClass flag marking it as an HTML module document. A DocumentParser instance will be connected to it and fed the contents of the file fetched by the ModuleScriptLoader. The parser will follow the standard HTML5 parsing rules, with a few differences. Firstly, if a script without type="module" is encountered the parser will terminate with an error that will prevent the HTML Module Record from being created (note: or we may decide instead to coerce these to type="module"; see discussion here). Secondly, when <script> elements (of type="module") are encountered, they are logged in an HTMLModuleScriptEntry list that records the following for each <script>:
Specifically, parsing an HTML module document will follow these changes to the normal rules:
When an inline module <script> element is encountered, instead of creating a PendingScript for it and calling FetchDescendants, we will log it as an entry in the HTMLModuleScriptEntry list noted above.
When an external module <script> element is encountered, instead of calling FetchModuleScriptTree, we will log its URL in an entry in the HTMLModuleScriptEntry list noted above.
When a non-module <script> element is encountered, an error will be logged. ModuleScriptLoader will check for this error and skip creation of the module record if it is encountered (this result is analogous to a Script Module that has a parse error).
Once parsing completes, the resulting HTMLDocument and the HTMLModuleScriptEntry list will be used to instantiate a new HTMLModule.
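The parsing rules for script elements can be sketched as a single pass over the scripts found in the document. This is a simplified model, assuming the strict variant in which a non-module script is an error rather than being coerced to type="module". ScriptElement, LoggedEntry, and LogModuleScripts are hypothetical names, and an empty optional stands in for the logged error.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical summary of a <script> element seen by the parser.
struct ScriptElement {
  std::string type;  // value of the type attribute
  std::string src;   // value of the src attribute; empty for inline scripts
};

// Hypothetical, simplified entry in the logged list.
struct LoggedEntry {
  bool is_inline;
  std::string source_name;  // empty for inline scripts
};

// Logs one entry per module script instead of fetching or executing it.
// Returns nullopt if any script is not a module script, in which case no
// module record should be created.
std::optional<std::vector<LoggedEntry>> LogModuleScripts(
    const std::vector<ScriptElement>& scripts) {
  std::vector<LoggedEntry> entries;
  for (const ScriptElement& script : scripts) {
    if (script.type != "module") {
      return std::nullopt;  // analogous to a parse error in a script module
    }
    if (script.src.empty()) {
      entries.push_back({true, ""});  // inline: log the script itself
    } else {
      entries.push_back({false, script.src});  // external: log its URL
    }
  }
  // Every script element in the document has exactly one entry.
  assert(entries.size() == scripts.size());
  return entries;
}
```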
Changes to ModuleTreeLinker will be minor and will mostly involve generalizing the algorithms so that they can handle the different structure of HTML modules, e.g. ensuring that ModuleTreeLinker::FetchDescendants treats an HTML module's inline child scripts correctly. For these there is nothing to fetch (since they have no URL and already have a Module Record created for them), but we still must recurse into them so that their descendants are fetched.
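The inline-script case can be sketched as a traversal that collects the URLs needing a fetch. This is a simplified model and not the ModuleTreeLinker code: ModuleInfo and CollectUrlsToFetch are hypothetical names, fetching is reduced to recording a URL, and the descendants of fetched external modules are not followed.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of a module for descendant fetching.
struct ModuleInfo {
  std::vector<std::string> external_requests;      // URLs that need a fetch
  std::vector<const ModuleInfo*> inline_children;  // already have records
};

// Appends to `urls` every URL that must be fetched for `module`, skipping
// URLs already recorded in `seen`.
void CollectUrlsToFetch(const ModuleInfo& module,
                        std::set<std::string>* seen,
                        std::vector<std::string>* urls) {
  assert(seen != nullptr && urls != nullptr);
  for (const std::string& url : module.external_requests) {
    if (seen->insert(url).second) urls->push_back(url);
  }
  // Inline children have no URL and nothing to fetch themselves, but their
  // own requests still have to be visited.
  for (const ModuleInfo* child : module.inline_children) {
    CollectUrlsToFetch(*child, seen, urls);
  }
}
```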
To support import.meta.document we will make the following changes:
ModuleScript will gain a Member<Document> document_for_inline_module_ field. For inline script elements in an HTML module, this will be populated with the HTML Module's Document at construction time of the ModuleScript.
A Persistent<Document> document_for_inline_module_ field will be added to the ModuleImportMeta class, populated in ModulatorImplBase::HostGetImportMetaProperties from the document_for_inline_module_ of the ModuleScript from which the ModuleImportMeta is being created.
The document property will be set on the v8::Local<v8::Object> meta, using the ModuleImportMeta's document_for_inline_module_.