| # Node.js API Documentation Tooling |
| |
| The Node.js API documentation is generated by in-house tooling that resides |
| within the [tools/doc](https://github.com/nodejs/node/tree/main/tools/doc) |
| directory. |
| |
| The build process (using `make doc` or `make doc-only`) uses this tooling to |
| parse the markdown files in [`doc/api/`][] and generate the following: |
| |
| 1. Human-readable HTML in `out/doc/api/*.html` |
| 2. A JSON representation in `out/doc/api/*.json` |
| |
| These artifacts are published to nodejs.org for multiple versions of |
| Node.js. For example, the latest version of the human-readable HTML |
| is published to [nodejs.org/en/docs](https://nodejs.org/en/docs/), |
| and the latest version of the JSON documentation is published to |
| [nodejs.org/api/all.json](https://nodejs.org/api/all.json). |
| |
| The artifacts are built as part of release builds by running the |
| [doc-upload](https://github.com/nodejs/node/blob/1a83ad6a693f851199608ae957ac5d4f76871485/Makefile#L1218-L1224) |
| Makefile target during the release-sources part of the iojs+release job. |
| This target runs the `doc` target to build the docs and then uses |
| `scp` to copy them onto the staging/www server into a directory of the form |
| `/home/staging/nodejs/<type>/<full_version>/docs`, where `<type>` is e.g. |
| `release`, `nightly`, etc. The promotion step (either automatic for |
| nightlies or manual for releases) then moves the docs to |
| `/home/dist/nodejs/docs/<full_version>`, where they are served by nodejs.org. |
| |
| **The key things to know about the tooling include:** |
| |
| 1. The entry-point is `tools/doc/generate.mjs`. |
| 2. The tooling supports the CLI arguments listed in the table below. |
| 3. The tooling processes one file at a time. |
| 4. The tooling uses a set of dependencies as described in the dependencies |
| section. |
| 5. The tooling parses the input files and does several transformations to the |
| AST (Abstract Syntax Tree). |
| 6. The tooling generates a JSON output that contains the metadata and content of |
| the Markdown file. |
| 7. The tooling generates an HTML output that contains a human-readable, |
| ready-to-view version of the file. |
| |
| This document explains the existing tooling processes, to make the tooling |
| easier to maintain and evolve. It is not meant to be a guide on how to write |
| documentation for Node.js. |
| |
| #### Vocabulary & Good to Know's |
| |
| * AST stands for "Abstract Syntax Tree", a tree-shaped data structure that |
| represents the structure of a document or program. In our case, the AST is a |
| "graph" representation of the contents of the Markdown file. |
| * MDN means [Mozilla Developer Network](https://developer.mozilla.org/en-US/) |
| and it is a website that contains documentation for web technologies. We use |
| it as a reference for the structure of the documentation. |
| * The |
| [Stability Index](https://nodejs.org/dist/latest/docs/api/documentation.html#stability-index) |
| is used to communicate the stability of a given Node.js module. The stability |
| levels include: |
|   * Stability 0: Deprecated. (This module is deprecated) |
|   * Stability 1: Experimental. (This module is experimental) |
|   * Stability 2: Stable. (This module is stable) |
|   * Stability 3: Legacy. (This module is legacy) |
| * Within Remark, YAML snippets (`<!-- something -->`) are treated as HTML |
| nodes, because YAML is not valid Markdown content (it is not part of the |
| Markdown spec). |
| * "New Tooling" refers to the (written-from-scratch) API build tooling |
| introduced in `nodejs/nodejs.dev` that might replace the current one from |
| `nodejs/node`. |
| |
| ## CLI Arguments |
| |
| The tooling requires a `filename` argument and supports extra arguments (some |
| also required) as shown below: |
| |
| | Argument | Description | Required | Example | |
| | --------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | -------- | ---------------------------------- | |
| | `--node-version=` | The version of Node.js that is being documented. It defaults to `process.version`, which is supplied by Node.js itself. | No | `v19.0.0` | |
| | `--output-directory=` | The directory where the output files will be generated. | Yes | `./out/api/` | |
| | `--apilinks=` | This file is used as an index to specify the source file for each module | No | `./out/doc/api/apilinks.json` | |
| | `--versions-file=` | This file is used to specify an index of all previous versions of Node.js. It is used for the Version Navigation on the API docs page. | No | `./out/previous-doc-versions.json` | |
| |
| **Note:** The files passed to `--apilinks` and `--versions-file` are generated |
| by the Node.js build process (Makefile). Each file contains a JSON object. |
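
To make the `--name=value` argument shape above concrete, here is a minimal, dependency-free sketch of how such an argument list could be split into a filename and named options. `parseArgs` and the `FLAGS` map are hypothetical names written for this example; they are not the argument handling used by `generate.mjs`.

```javascript
// Hypothetical helper for illustration; not the parser used by generate.mjs.
const FLAGS = new Map([
  ['--node-version', 'nodeVersion'],
  ['--output-directory', 'outputDirectory'],
  ['--apilinks', 'apilinks'],
  ['--versions-file', 'versionsFile'],
]);

function parseArgs(argv) {
  // `--node-version` defaults to the version of the running Node.js binary.
  const parsed = { filename: null, nodeVersion: process.version };
  for (const arg of argv) {
    const eq = arg.indexOf('=');
    const flag = eq === -1 ? null : arg.slice(0, eq);
    if (flag !== null && FLAGS.has(flag)) {
      parsed[FLAGS.get(flag)] = arg.slice(eq + 1);
    } else if (!arg.startsWith('--')) {
      // Anything that is not a flag is treated as the input filename.
      parsed.filename = arg;
    }
  }
  if (!parsed.filename) throw new Error('A filename argument is required');
  if (!parsed.outputDirectory) throw new Error('--output-directory= is required');
  return parsed;
}

console.log(parseArgs(['doc/api/fs.md', '--output-directory=out/doc/api/']));
```

The sketch only enforces the two inputs the table marks as required (the filename and `--output-directory=`); unknown flags are ignored rather than rejected, which is a choice made for brevity here.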
| |
| ### Basic Usage |
| |
| ```bash |
| # cd tools/doc |
| npm run node-doc-generator ${filename} |
| ``` |
| |
| **OR** |
| |
| ```bash |
| # nodejs/node root directory |
| make doc |
| ``` |
| |
| ## Dependencies and how the Tooling works internally |
| |
| The API tooling uses an AST-based library called |
| [unified](https://github.com/unifiedjs/unified) to process the input file as |
| a graph whose nodes are easy to modify and update. |
| |
| In addition to `unified`, we also use |
| [Remark](https://github.com/remarkjs/remark) for manipulating the Markdown part, |
| and [Rehype](https://github.com/rehypejs/rehype) to help convert from Markdown |
| to HTML. |
| |
| ### What are the steps of the internal tooling? |
| |
| The tooling uses the pipeline-like `unified` engine to chain each step of the |
| process. (The description below is a simplified version.) |
| |
| * It starts by reading the Frontmatter section of the Markdown file with |
| [remark-frontmatter](https://www.npmjs.com/package/remark-frontmatter). |
| * It then parses the Markdown using `remark-parse` and adds support for |
| [GitHub Flavored Markdown](https://github.github.com/gfm/). |
| * It parses some of the Markdown nodes and transforms them into HTML. |
| * It generates the JSON output of the file. |
| * It applies its final node transformations and generates stringified HTML. |
| * Finally, it stores the output in a JSON file, adds extra styling to the HTML, |
| and stores the HTML file. |
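
The pipe-like flow of the steps above can be illustrated with a small, dependency-free toy. This is only a sketch of the idea of chaining parse, transform, and stringify steps; it does not use the `unified` API, and `parse`, `toHtmlNodes`, `stringify`, and `pipe` are hypothetical stand-ins for the real plugins.

```javascript
// Toy stand-ins for the real parse/transform/stringify plugins.
function parse(markdown) {
  const children = markdown
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const match = /^(#{1,6})\s+(.*)$/.exec(line);
      return match
        ? { type: 'heading', depth: match[1].length, value: match[2] }
        : { type: 'paragraph', value: line };
    });
  return { type: 'root', children };
}

// Transform step: replace Markdown nodes with HTML nodes (no escaping here).
function toHtmlNodes(tree) {
  return {
    ...tree,
    children: tree.children.map((node) => (node.type === 'heading'
      ? { type: 'html', value: `<h${node.depth}>${node.value}</h${node.depth}>` }
      : { type: 'html', value: `<p>${node.value}</p>` })),
  };
}

// Stringify step: join the HTML nodes into the final output.
function stringify(tree) {
  return tree.children.map((node) => node.value).join('\n');
}

// Chain the steps so the output of one becomes the input of the next.
function pipe(...steps) {
  return (input) => steps.reduce((value, step) => step(value), input);
}

const render = pipe(parse, toHtmlNodes, stringify);
console.log(render('# File system\n\nStability: 2 - Stable'));
```

Each step receives the tree produced by the previous one, which is the property that makes it easy to add or reorder transformations.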
| |
| ### What is each file responsible for? |
| |
| The files listed below are the ones referenced and actually used during the |
| build process of the API docs as seen on <https://nodejs.org/api>. The |
| remaining files in the directory might be used by other steps of the Node.js |
| Makefile, or might be deprecated remnants of old processes that need to be |
| revisited or removed. |
| |
| * **`html.mjs`**: Responsible for transforming nodes by decorating them with |
| visual artifacts for the HTML pages. |
|   * For example, transforming man page or JS doc references into links that |
|     point to the respective external documentation. |
| * **`json.mjs`**: Responsible for generating the JSON output of the file. |
|   * It is mostly responsible for going through the whole Markdown file and |
|     generating a JSON object that represents the metadata of a specific module. |
|   * For example, for the FS module, it generates an object with all its |
|     methods, events, and classes, using several regular expressions (regex) |
|     to extract the information needed. |
| * **`generate.mjs`**: Main entry-point of doc generation for a specific file. It |
| does end-to-end processing of a documentation file. |
| * **`allhtml.mjs`**: A script executed after all files are generated to create a |
| single "all" page containing all the HTML documentation. |
| * **`alljson.mjs`**: A script executed after all files are generated to create a |
| single "all" page containing all the JSON entries. |
| * **`markdown.mjs`**: Contains a utility that rewrites Markdown links to work |
| with the <https://nodejs.org/api/> website. |
| * **`common.mjs`**: Contains a few utility functions that are used by the other |
| files. |
| * **`type-parser.mjs`**: Used to replace "type references" (e.g. "String" or |
| "Buffer") with links to the correct internal or external documentation pages |
| (i.e. MDN or other Node.js documentation pages). |
| |
| **Note:** Other files not mentioned here might be used during the process but |
| are not relevant to the generation of the API docs themselves. You will notice |
| that a lot of the logic within the build process is **specific** to the current |
| <https://nodejs.org/api/> infrastructure, such as adding JavaScript snippets |
| and styles, transforming certain Markdown elements into HTML, and adding |
| certain HTML classes. |
| |
| **Note:** Regarding the previous **Note**, we are currently working on API |
| tooling that is generic and independent of the current nodejs.org |
| infrastructure. |
| [The new tooling that is functional is available at the nodejs.dev repository](https://github.com/nodejs/nodejs.dev/blob/main/scripts/syncApiDocs.js) |
| and uses plain regex (no AST) and [MDX](https://mdxjs.com/). |
| |
| ## The Build Process |
| |
| The build process that happens in `generate.mjs` follows the steps below: |
| |
| * Links within the Markdown are replaced directly within the source Markdown |
| (AST) (`markdown.replaceLinks`) |
|   * This happens within `markdown.mjs`, which adds suffixes to or otherwise |
|     modifies link references within the Markdown |
|   * This is necessary for the `https://nodejs.org` infrastructure as all pages |
|     are suffixed with `.html` |
| * Text (and some YAML) Nodes are transformed/modified through |
| `html.preprocessText` |
| * JSON output is generated through `json.jsonAPI` |
| * The title of the page is inferred through `html.firstHeader` |
| * Nodes are transformed into HTML Elements through `html.preprocessElements` |
| * The HTML Table of Contents (ToC) is generated through `html.buildToc` |
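
As an illustration of the link-replacement step at the top of this list, the sketch below rewrites reference-style link definitions that point at sibling `.md` files so that they point at `.html` pages instead. `replaceMdLinks` and its regular expression are assumptions made for this example and are not copied from `markdown.mjs`.

```javascript
// Hypothetical sketch; the real logic lives in markdown.mjs (replaceLinks).
// Matches definitions such as "[label]: fs.md#anchor" at the start of a line.
const REFERENCE_DEFINITION = /^(\[[^\]]+\]:[ \t]*)([\w-]+)\.md(#\S*)?$/gm;

function replaceMdLinks(markdown) {
  return markdown.replace(
    REFERENCE_DEFINITION,
    // `hash` is undefined when the link has no anchor, so default it to ''.
    (_, label, page, hash = '') => `${label}${page}.html${hash}`,
  );
}

console.log(replaceMdLinks('[`fs.readFile()`]: fs.md#fsreadfilepath-options-callback'));
```

Because the pattern only accepts a bare page name before `.md`, absolute URLs are left untouched.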
| |
| ### `html.mjs` |
| |
| This file is responsible for AST node transformations that either update |
| Markdown nodes to decorate them with more data or transform them into HTML nodes |
| that serve a specific visual purpose, for example, to generate the "Added in" |
| label, the Source Links, the Stability Index, or the History table. |
| |
| **Note:** Methods not listed below are either not relevant or are utility methods |
| for string/array/object manipulation (i.e. they are used by the other methods |
| mentioned below). |
| |
| #### `preprocessText` |
| |
| **New Tooling:** Most of the features within this method are available within |
| the new tooling. |
| |
| This method does two things: |
| |
| * Replaces the Source Link YAML entry `<!-- source_link= -->` with a "Source |
| Link" HTML anchor element. |
| * Replaces type references within the Markdown text (e.g. "String", "Buffer") |
| with HTML anchor elements that link to the correct documentation page. |
|   * The original node then gets mutated from text to HTML. |
|   * It also updates references to Linux "MAN" pages to Web versions of them. |
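
The type-reference replacement described above can be sketched as follows. The `{Type}` notation, the `TYPE_LINKS` table, its URLs, and the `linkTypes` function are illustrative assumptions for this example; the actual mapping lives in `type-parser.mjs`.

```javascript
// Hypothetical lookup table; the real mapping lives in type-parser.mjs.
const TYPE_LINKS = new Map([
  ['string', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures#string_type'],
  ['Buffer', 'buffer.html#class-buffer'],
]);

// Replaces "{Type}" references with anchors; unknown types are left as-is.
function linkTypes(text) {
  return text.replace(/\{([^{}]+)\}/g, (whole, name) => {
    const href = TYPE_LINKS.get(name);
    return href
      ? `<a href="${href}" class="type">&lt;${name}&gt;</a>`
      : whole;
  });
}

console.log(linkTypes('Returns: {Buffer}'));
```

After a replacement like this the node no longer holds plain text, which is why the original node is mutated from a text node to an HTML node.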
| |
| #### `firstHeader` |
| |
| **New Tooling:** All features within this method are available within the new |
| Tooling. |
| |
| This method attempts to extract the first heading of the page (recursively) to |
| define the "title" of the page. |
| |
| **Note:** As all API Markdown files start with a heading, this could possibly be |
| simplified. |
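
A recursive search of this kind can be sketched in a few lines. `findFirstHeading` is a hypothetical name, and the node shape (`type`, `children`, `value`) is a simplified assumption rather than the exact AST used by the tooling.

```javascript
// Hypothetical sketch of a depth-first search for the first heading node.
function findFirstHeading(node) {
  if (node.type === 'heading') return node;
  for (const child of node.children ?? []) {
    const found = findFirstHeading(child);
    if (found) return found;
  }
  return null; // No heading anywhere in this subtree.
}

const tree = {
  type: 'root',
  children: [
    { type: 'html', value: '<!-- introduced_in=v0.10.0 -->' },
    { type: 'heading', depth: 1, value: 'File system' },
    { type: 'heading', depth: 2, value: 'Promise example' },
  ],
};

console.log(findFirstHeading(tree).value);
```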
| |
| #### `preprocessElements` |
| |
| **New Tooling:** All features within this method are available within the new |
| tooling. |
| |
| This method is responsible for multiple transformations of the AST nodes, |
| mostly transforming source nodes into their respective HTML elements. The |
| transformations include: |
| |
| * Updating Markdown `code` blocks by adding language highlighting |
|   * It also adds the "CJS"/"MJS" switch to nodes that are followed by their |
|     CJS/ESM equivalents. |
| * Increasing the heading level of each heading |
| * Parsing YAML blocks and transforming them into HTML elements (see more at the |
| `parseYAML` method) |
| * Updating blockquotes that are prefixed by the word "Stability" into a |
| Stability Index HTML element |
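
Two of the transformations listed above, increasing heading levels and turning "Stability" blockquotes into HTML, can be sketched on a simplified tree. `preprocessNode`, the flat `value` field on blockquotes, the cap at heading level 6, and the CSS class names are assumptions for illustration only.

```javascript
// Hypothetical sketch operating on simplified nodes, not the real AST.
function preprocessNode(node) {
  if (node.type === 'heading') {
    // Increase the heading level by one (capped at 6 in this sketch).
    return { ...node, depth: Math.min(node.depth + 1, 6) };
  }
  if (node.type === 'blockquote') {
    const match = /^Stability: (\d)/.exec(node.value);
    if (match) {
      // Replace the blockquote with an HTML node carrying the stability level.
      return {
        type: 'html',
        value: `<div class="api_stability api_stability_${match[1]}">${node.value}</div>`,
      };
    }
  }
  if (node.children) {
    return { ...node, children: node.children.map((child) => preprocessNode(child)) };
  }
  return node;
}

const page = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, value: 'File system' },
    { type: 'blockquote', value: 'Stability: 2 - Stable' },
  ],
};

console.log(preprocessNode(page).children);
```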
| |
| #### `parseYAML` |
| |
| **New Tooling:** Most of the features within this method are available within |
| the new tooling. |
| |
| This method is responsible for parsing the `<!-- YAML snippets -->` and |
| transforming them into HTML elements. |
| |
| It follows a "schema" that consists of the following keys: |
| |
| | YAML Key | Description | Example | Example Result | Available on new tooling | |
| | ------------- | ------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------- | --------------------------- | ------------------------ | |
| | `added` | It's used to reference when a certain "module", "class" or "method" was added on Node.js | `added: v0.1.90` | `Added in: v0.1.90` | Yes | |
| | `deprecated` | It's used to reference when a certain "module", "class" or "method" was deprecated on Node.js | `deprecated: v0.1.90` | `Deprecated since: v0.1.90` | Yes | |
| | `removed` | It's used to reference when a certain "module", "class" or "method" was removed on Node.js | `removed: v0.1.90` | `Removed in: v0.1.90` | No | |
| | `changes` | It's used to describe all the changes (historical ones) that happened within a certain "module", "class" or "method" in Node.js | `[{ version: v0.1.90, pr-url: '', description: '' }]` | -- | Yes | |
| | `napiVersion` | It's used to describe in which version of the N-API this "module", "class" or "method" is available within Node.js | `napiVersion: 1` | `N-API version: 1` | Yes | |
| |
| **Note:** The `changes` field gets prepended with the `added`, `deprecated` and |
| `removed` fields if they exist. The table only gets generated if a `changes` |
| field exists. In the new tooling only "added" is prepended for now. |
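
To show how the keys in the table map to their rendered labels, the sketch below turns an already-parsed metadata object into label strings. `metaToLabels` and `LABELS` are hypothetical names; YAML parsing and HTML generation are deliberately left out.

```javascript
// Label text taken from the "Example Result" column of the table above.
const LABELS = {
  added: 'Added in',
  deprecated: 'Deprecated since',
  removed: 'Removed in',
  napiVersion: 'N-API version',
};

// Hypothetical helper: expects the YAML block to be parsed into an object.
function metaToLabels(meta) {
  return Object.entries(LABELS)
    .filter(([key]) => meta[key] !== undefined)
    // A key may hold one value or a list of values; normalize to a list.
    .map(([key, label]) => `${label}: ${[].concat(meta[key]).join(', ')}`);
}

console.log(metaToLabels({ added: 'v0.1.90', napiVersion: 1 }));
```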
| |
| #### `buildToc` |
| |
| **New Tooling:** This feature is natively available within the new tooling |
| through MDX. |
| |
| This method generates the Table of Contents based on all the Headings of the |
| Markdown file. |
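
A Table of Contents built from headings can be sketched as a nested Markdown list. `sketchToc` and `slugify` are hypothetical names, and the slug rules here are an assumption that will not necessarily match the anchors produced by the real tooling.

```javascript
// Hypothetical slug rule: lowercase, non-word runs become single hyphens.
function slugify(text) {
  return text.toLowerCase().replace(/[^\w]+/g, '-').replace(/^-|-$/g, '');
}

// Builds an indented Markdown list with one entry per heading.
function sketchToc(headings) {
  return headings
    .map(({ depth, text }) => `${'  '.repeat(depth - 1)}* [${text}](#${slugify(text)})`)
    .join('\n');
}

console.log(sketchToc([
  { depth: 1, text: 'File system' },
  { depth: 2, text: 'fs.readFile(path)' },
]));
```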
| |
| #### `altDocs` |
| |
| **New Tooling:** All features within this method are available within the new |
| tooling. |
| |
| This method generates a version picker for the current page to be shown in older |
| versions of the API docs. |
| |
| ### `json.mjs` |
| |
| This file is responsible for generating a JSON object that (supposedly) is used |
| for IDE IntelliSense or for indexing all the "methods", "classes", "modules", |
| "events", "constants" and "globals" available within a certain Markdown file. |
| |
| It attempts a best-effort extraction of the data by using several regular |
| expression (regex) patterns. |
| |
| **Note:** JSON output generation is currently not supported by the new tooling, |
| but it is in the pipeline for development. |
| |
| #### `jsonAPI` |
| |
| This method traverses all the AST nodes by iterating through each one of them |
| and infers the kind of information each node contains through regex. Then it |
| mutates the data and appends it to the final JSON object. |
| |
| For more in-depth information, we recommend referring to the `json.mjs` file, as |
| it contains a lot of comments. |
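
The regex-based inference described above can be illustrated with a small classifier for heading text. The patterns, the `classifyHeading` name, and the returned object shape are assumptions for this example; the patterns in `json.mjs` are more numerous and more detailed.

```javascript
// Hypothetical patterns; json.mjs uses its own, more extensive set.
const PATTERNS = [
  { type: 'class', regex: /^Class: `?([\w.]+)`?$/ },
  { type: 'event', regex: /^Event: `?'?([^'`]+)'?`?$/ },
  { type: 'method', regex: /^`?([\w.]+)\(([^)]*)\)`?$/ },
];

// Infers what a heading describes from its text alone (best effort).
function classifyHeading(text) {
  for (const { type, regex } of PATTERNS) {
    const match = regex.exec(text);
    if (match) return { type, name: match[1] };
  }
  return { type: 'misc', name: text };
}

console.log(classifyHeading('Class: `Buffer`'));
console.log(classifyHeading("Event: `'close'`"));
console.log(classifyHeading('`fs.readFile(path[, options], callback)`'));
```

Headings that match none of the patterns fall back to a generic `misc` entry in this sketch, which mirrors the best-effort nature of the extraction.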
| |
| [`doc/api/`]: https://github.com/nodejs/node/tree/main/doc/api |