| Test Metadata |
| ============= |
| |
| Directory Layout |
| ---------------- |
| |
| Metadata files must be stored under the ``metadata`` directory passed |
| to the test runner. The directory layout follows that of |
| web-platform-tests with each test source path having a corresponding |
| metadata file. Because the metadata path is based on the source file |
| path, files that generate multiple URLs e.g. tests with multiple |
| variants, or multi-global tests generated from an ``any.js`` input |
| file, share the same metadata file for all their corresponding |
| tests. The metadata path under the ``metadata`` directory is the same |
| as the source path under the ``tests`` directory, with an additional |
| ``.ini`` suffix. |
| |
| For example a test with URL:: |
| |
| /spec/section/file.html?query=param |
| |
| generated from a source file with path:: |
| |
| <tests root>/spec/section/file.html |
| |
| would have a metadata file :: |
| |
| <metadata root>/spec/section/file.html.ini |
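| |
| The mapping above can be sketched in a few lines of Python. This is an |
| illustration only, not wptrunner's own code; the function name and the |
| ``/wpt/tests`` and ``/wpt/meta`` roots are hypothetical. |
| |
```python
from pathlib import PurePosixPath


def metadata_path(tests_root, metadata_root, source_path):
    """Map a test source path to the path of its metadata file."""
    # Path of the source file relative to the tests root,
    # e.g. spec/section/file.html
    rel = PurePosixPath(source_path).relative_to(tests_root)
    # Same relative path under the metadata root, plus an ".ini" suffix.
    return PurePosixPath(metadata_root) / (str(rel) + ".ini")


print(metadata_path("/wpt/tests", "/wpt/meta",
                    "/wpt/tests/spec/section/file.html"))
# /wpt/meta/spec/section/file.html.ini
```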
| |
| As an optimisation, files which produce only default results |
| (i.e. ``PASS`` or ``OK``), and which don't have any other associated |
| metadata, don't require a corresponding metadata file. |
| |
| Directory Metadata |
| ~~~~~~~~~~~~~~~~~~ |
| |
| In addition to per-test metadata, default metadata can be applied to |
| all the tests in a given source location, using a ``__dir__.ini`` |
| metadata file. For example, to apply metadata to all tests under |
| ``<tests root>/spec/``, add the metadata in |
| ``<metadata root>/spec/__dir__.ini``. |
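| |
| As a rough sketch, the ``__dir__.ini`` files that could contribute |
| defaults to a single test can be listed as below. It assumes that these |
| files live under the metadata root like other metadata files and that |
| every enclosing directory, including the root itself, may provide one; |
| the function name and the ``/wpt/meta`` root are hypothetical. |
| |
```python
from pathlib import PurePosixPath


def dir_metadata_paths(metadata_root, test_rel_path):
    """List candidate __dir__.ini files for a test, outermost first."""
    root = PurePosixPath(metadata_root)
    # For "spec/section/file.html" the enclosing directories are
    # "spec/section", "spec" and "." (the root); reverse them so the
    # least specific directory comes first.
    parents = reversed(list(PurePosixPath(test_rel_path).parents))
    return [str(root / parent / "__dir__.ini") for parent in parents]


for path in dir_metadata_paths("/wpt/meta", "spec/section/file.html"):
    print(path)
# /wpt/meta/__dir__.ini
# /wpt/meta/spec/__dir__.ini
# /wpt/meta/spec/section/__dir__.ini
```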
| |
| Metadata Format |
| --------------- |
| |
| The format of the metadata files is based on the ini format. Files are |
| divided into sections, each (apart from the root section) having a |
| heading enclosed in square brackets. Within each section are key-value |
| pairs. There are several notable differences from standard .ini files, |
| however: |
| |
| * Sections may be hierarchically nested, with significant whitespace |
| indicating nesting depth. |
| |
| * Only ``:`` is valid as a key/value separator. |
| |
| A simple example of a metadata file is:: |
| |
| root_key: root_value |
| |
| [section] |
| section_key: section_value |
| |
| [subsection] |
| subsection_key: subsection_value |
| |
| [another_section] |
| another_key: [list, value] |
| |
| Conditional Values |
| ~~~~~~~~~~~~~~~~~~ |
| |
| In order to support values that depend on some external data, the |
| right hand side of a key/value pair can take a set of conditionals |
| rather than a plain value. These are placed on new lines following |
| the key, with significant indentation. Each conditional starts with |
| ``if``, followed by a condition terminated with a colon and then the |
| value to use when that condition holds, for example:: |
| |
| key: |
| if cond1: value1 |
| if cond2: value2 |
| value3 |
| |
| In this example, the value associated with ``key`` is determined by |
| first evaluating ``cond1`` against external data. If that is true, |
| ``key`` is assigned the value ``value1``, otherwise ``cond2`` is |
| evaluated in the same way. If both ``cond1`` and ``cond2`` are false, |
| the unconditional ``value3`` is used. |
| |
| Conditions themselves use a Python-like expression syntax. Operands |
| can either be variables, corresponding to data passed in, numbers |
| (integer or floating point; exponential notation is not supported) or |
| quote-delimited strings. Equality is tested using ``==`` and |
| inequality by ``!=``. The operators ``and``, ``or`` and ``not`` are |
| used in the expected way. Parentheses can also be used for |
| grouping. For example:: |
| |
| key: |
| if (a == 2 or a == 3) and b == "abc": value1 |
| if a == 1 or b != "abc": value2 |
| value3 |
| |
| Here ``a`` and ``b`` are variables, the value of which will be |
| supplied when the metadata is used. |
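| |
| The first-match evaluation order described above can be modelled in |
| Python as follows. This is a sketch of the semantics only, not the real |
| parser: each condition is written by hand as a Python function, and the |
| ``resolve`` helper is a hypothetical name. |
| |
```python
def resolve(conditionals, default, variables):
    """Return the value of the first conditional whose condition holds."""
    for condition, value in conditionals:
        if condition(variables):
            return value
    # No condition matched, so fall back to the unconditional value.
    return default


# The second example above, with each condition as a Python function.
key = [
    (lambda v: (v["a"] == 2 or v["a"] == 3) and v["b"] == "abc", "value1"),
    (lambda v: v["a"] == 1 or v["b"] != "abc", "value2"),
]

print(resolve(key, "value3", {"a": 3, "b": "abc"}))  # value1
print(resolve(key, "value3", {"a": 1, "b": "abc"}))  # value2
print(resolve(key, "value3", {"a": 4, "b": "abc"}))  # value3
```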
| |
| Web-Platform-Tests Metadata |
| --------------------------- |
| |
| When used for expectation data, metadata files have the following format: |
| |
| * A section per test URL provided by the corresponding source file, |
| with the section heading being the part of the test URL following |
| the last ``/`` in the path (this allows multiple tests in a single |
| metadata file with the same path part of the URL, but different |
| query parts). This may be omitted if there's no non-default |
| metadata for the test. |
| |
| * A subsection per subtest, with the heading being the title of the |
| subtest. This may be omitted if there's no non-default metadata for |
| the subtest. |
| |
| * The following known keys: |
| |
| :expected: |
| The expectation value or values of each (sub)test. In |
| the case this value is a list, the first value represents the |
| typical expected test outcome, and subsequent values indicate |
| known intermittent outcomes e.g. ``expected: [PASS, ERROR]`` |
| would indicate a test that usually passes but has a known-flaky |
| ``ERROR`` outcome. |
| |
| :disabled: |
| Any value apart from the special value ``@False`` |
| indicates that the (sub)test is disabled and should either not be |
| run (for tests) or that its results should be ignored (subtests). |
| |
| :restart-after: |
| Any value apart from the special value ``@False`` |
| indicates that the runner should restart the browser after running |
| this test (e.g. to clear out unwanted state). |
| |
| :fuzzy: |
| Used for reftests. This is interpreted as a list containing |
| entries like the content value of ``<meta name=fuzzy>``. Each entry |
| consists of an optional reference identifier followed by a colon, |
| then a range indicating the maximum permitted pixel difference per |
| channel, then a semicolon, then a range indicating the maximum |
| permitted total number of differing pixels. The reference identifier |
| is either a single relative URL, resolved against the base test URL, |
| in which case the fuzziness applies to any comparison with that URL, |
| or takes the form lhs URL, comparison, rhs URL, in which case the |
| fuzziness only applies to comparisons involving that specific |
| pair of URLs. Some illustrative examples are given below. |
| |
| :implementation-status: |
| One of the values ``implementing``, |
| ``not-implementing`` or ``backlog``. This is used in conjunction |
| with the ``--skip-implementation-status`` command line argument to |
| ``wptrunner`` to ignore certain features where running the test is |
| low value. |
| |
| :tags: |
| A list of labels associated with a given test that can be |
| used in conjunction with the ``--tag`` command line argument to |
| ``wptrunner`` for test selection. |
| |
| In addition there are extra keys which are currently tied to |
| specific implementations. For example, Gecko-based browsers support |
| ``min-asserts``, ``max-asserts``, ``prefs``, ``lsan-disabled``, |
| ``lsan-allowed``, ``lsan-max-stack-depth``, ``leak-allowed``, and |
| ``leak-threshold`` properties. |
| |
| * Variables taken from the ``RunInfo`` data which describe the |
| configuration of the test run. Common properties include: |
| |
| :product: A string giving the name of the browser under test |
| :browser_channel: A string giving the release channel of the browser under test |
| :debug: A Boolean indicating whether the build is a debug build |
| :os: A string indicating the operating system |
| :version: A string indicating the particular version of that operating system |
| :processor: A string indicating the processor architecture |
| |
| This information is typically provided by :py:mod:`mozinfo`, but |
| different environments may add additional information, and not all |
| the properties above are guaranteed to be present in all |
| environments. The definitive list of available properties for a |
| specific run may be determined by looking at the ``run_info`` key |
| in the ``wptreport.json`` output for the run. |
| |
| * Top level keys are taken as defaults for the whole file. So, for |
| example, a top level key with ``expected: FAIL`` would indicate |
| that all tests and subtests in the file are expected to fail, |
| unless they have an ``expected`` key of their own. |
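| |
| The handling of list-valued ``expected`` keys and of top-level defaults |
| can be sketched as below. The sketch assumes the metadata has already |
| been parsed into plain dictionaries, which is a simplification; the |
| helper names are hypothetical. |
| |
```python
def split_expected(value):
    """Split an ``expected`` value into (typical status, known intermittents)."""
    if isinstance(value, list):
        return value[0], value[1:]
    return value, []


def lookup(key, *scopes):
    """Return ``key`` from the first scope that sets it.

    ``scopes`` are dicts ordered from most to least specific,
    e.g. a subtest section followed by the top level of the file.
    """
    for scope in scopes:
        if key in scope:
            return scope[key]
    return None


top_level = {"expected": "FAIL"}               # default for the whole file
subtest_a = {}                                 # no key of its own
subtest_b = {"expected": ["PASS", "TIMEOUT"]}  # overrides the default

print(split_expected(lookup("expected", subtest_a, top_level)))
# ('FAIL', [])
print(split_expected(lookup("expected", subtest_b, top_level)))
# ('PASS', ['TIMEOUT'])
```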
| |
| A simple example metadata file might look like:: |
| |
| [test.html?variant=basic] |
| type: testharness |
| |
| [Test something unsupported] |
| expected: FAIL |
| |
| [Test with intermittent statuses] |
| expected: [PASS, TIMEOUT] |
| |
| [test.html?variant=broken] |
| expected: ERROR |
| |
| [test.html?variant=unstable] |
| disabled: http://test.bugs.example.org/bugs/12345 |
| |
| A more complex metadata file with conditional properties might be:: |
| |
| [canvas_test.html] |
| expected: |
| if os == "mac": FAIL |
| if os == "windows" and version == "XP": FAIL |
| PASS |
| |
| Note that ``PASS`` in the above works, but is unnecessary since it's |
| the default expected result. |
| |
| A metadata file with fuzzy reftest values might be:: |
| |
| [reftest.html] |
| fuzzy: [10;200, ref1.html:20;200-300, subtest1.html==ref2.html:10-15;20] |
| |
| In this case the default fuzziness for any comparison would be to |
| require a maximum difference per channel of less than or equal to 10 |
| and less than or equal to 200 total pixels different. For any |
| comparison involving ref1.html on the right hand side, the limits |
| would instead be a difference per channel not more than 20 and a total |
| difference count of not less than 200 and not more than 300. For the |
| specific comparison ``subtest1.html == ref2.html`` (both resolved against |
| the test URL) these limits would instead be 10 to 15 and 0 to 20, |
| respectively. |
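| |
| To make the entry syntax concrete, the following Python sketch parses |
| the three entries from the example above. It is not wptrunner's parser: |
| it only handles the ``==`` comparison and the bare numeric ranges shown |
| here, and the function names are hypothetical. |
| |
```python
def parse_range(text):
    """Parse "N" as (0, N) and "M-N" as (M, N)."""
    if "-" in text:
        low, high = text.split("-")
        return int(low), int(high)
    return 0, int(text)


def parse_fuzzy(entry):
    """Parse one fuzzy entry into (reference, channel range, pixel range).

    ``reference`` is None for the default, a URL string for a single
    reference, or an (lhs, rhs) tuple for a specific comparison.
    """
    reference = None
    # The optional reference identifier ends at the last colon.
    if ":" in entry:
        ref_text, entry = entry.rsplit(":", 1)
        if "==" in ref_text:
            lhs, rhs = ref_text.split("==")
            reference = (lhs.strip(), rhs.strip())
        else:
            reference = ref_text.strip()
    channel, pixels = entry.split(";")
    return reference, parse_range(channel.strip()), parse_range(pixels.strip())


print(parse_fuzzy("10;200"))
# (None, (0, 10), (0, 200))
print(parse_fuzzy("ref1.html:20;200-300"))
# ('ref1.html', (0, 20), (200, 300))
print(parse_fuzzy("subtest1.html==ref2.html:10-15;20"))
# (('subtest1.html', 'ref2.html'), (10, 15), (0, 20))
```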
| |
| Generating Expectation Files |
| ---------------------------- |
| |
| wpt provides the ``wpt update-expectations`` command to generate |
| expectation files from the results of a set of test runs. The basic |
| syntax for this is:: |
| |
| ./wpt update-expectations [options] [logfile]... |
| |
| Each ``logfile`` is a wptreport log file from a previous run. These |
| can be generated from wptrunner using the ``--log-wptreport`` option |
| e.g. ``--log-wptreport=wptreport.json``. |
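| |
| The ``run_info`` data recorded in such a log file can be inspected with |
| a few lines of Python, which may help when deciding which variables to |
| use in conditions. The sketch assumes that ``run_info`` is a top-level |
| key of the JSON document; the function name is hypothetical. |
| |
```python
import json


def load_run_info(report_path):
    """Return the ``run_info`` dictionary stored in a wptreport file."""
    with open(report_path, encoding="utf-8") as f:
        report = json.load(f)
    # Fall back to an empty dict if the report has no run_info key.
    return report.get("run_info", {})


# Example usage, assuming a report written with
# --log-wptreport=wptreport.json exists in the current directory:
#
#     for name, value in sorted(load_run_info("wptreport.json").items()):
#         print(name, "=", value)
```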
| |
| ``update-expectations`` takes several options: |
| |
| --full Overwrite all the expectation data for any tests that have a |
| result in the passed log files, not just data for the same run |
| configuration. |
| |
| --disable-intermittent When updating test results, disable tests that |
| have inconsistent results across many |
| runs. The option can be followed by a message |
| giving the reason why the test is disabled. If no |
| message is provided, ``unstable`` is the default text. |
| |
| --update-intermittent When this option is used, the ``expected`` key |
| stores expected intermittent statuses in |
| addition to the primary expected status. If |
| there is more than one status, it appears as a |
| list. The default behaviour of this option is to |
| retain any existing intermittent statuses in the |
| list unless ``--remove-intermittent`` is |
| specified. |
| |
| --remove-intermittent This option is used in conjunction with |
| ``--update-intermittent``. When the |
| ``expected`` statuses are updated, any obsolete |
| intermittent statuses that did not occur in the |
| specified log files are removed from the list. |
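| |
| The interaction between ``--update-intermittent`` and |
| ``--remove-intermittent`` can be modelled roughly as follows. This is |
| not the tool's implementation: the function is hypothetical, and |
| choosing the most frequently observed status as the primary one is an |
| assumption made for the sake of the example. |
| |
```python
from collections import Counter


def update_expected(existing, observed, remove_intermittent=False):
    """Merge observed statuses into an ``expected`` value.

    ``existing`` is the current ``expected`` list (primary status first);
    ``observed`` is every status seen for the (sub)test in the log files
    and must not be empty.
    """
    counts = Counter(observed)
    # Assumption: the most frequently observed status becomes primary.
    primary = counts.most_common(1)[0][0]
    intermittent = [s for s in counts if s != primary]
    if not remove_intermittent:
        # Keep previously known statuses even if they did not occur.
        intermittent += [s for s in existing
                         if s != primary and s not in intermittent]
    statuses = [primary] + sorted(intermittent)
    # A single status is stored as a plain value rather than a list.
    return statuses if len(statuses) > 1 else primary


old = ["PASS", "TIMEOUT"]
seen = ["PASS", "PASS", "FAIL"]
print(update_expected(old, seen))                            # ['PASS', 'FAIL', 'TIMEOUT']
print(update_expected(old, seen, remove_intermittent=True))  # ['PASS', 'FAIL']
```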
| |
| Property Configuration |
| ~~~~~~~~~~~~~~~~~~~~~~ |
| |
| In cases where the expectation depends on the run configuration, ``wpt |
| update-expectations`` is able to generate conditional values. Because |
| the relevant variables depend on the range of configurations that need |
| to be covered, it's necessary to specify the list of configuration |
| variables that should be used. This is done using a JSON file that |
| can be specified with the ``--properties-file`` command line |
| argument to ``wpt update-expectations``. When this isn't supplied the |
| defaults from ``<metadata root>/update_properties.json`` are used, if |
| present. |
| |
| Properties File Format |
| ++++++++++++++++++++++ |
| |
| The file is JSON formatted with two top-level keys: |
| |
| :``properties``: |
| A list of property names to consider for conditionals, |
| e.g. ``["product", "os"]``. |
| |
| :``dependents``: |
| An optional dictionary containing properties that |
| should only be used as "tie-breakers" when differentiating based on a |
| specific top-level property has failed. This is useful when the |
| dependent property is always more specific than the top-level |
| property, but less understandable when used directly. For example the |
| ``version`` property covering different OS versions is typically |
| unique amongst different operating systems, but using it when the |
| ``os`` property would do instead is likely to produce metadata that's |
| too specific to the current configuration and more difficult to |
| read. But where there are multiple versions of the same operating |
| system with different results, it can be necessary. So specifying |
| ``{"os": ["version"]}`` as a dependent property means that the |
| ``version`` property will only be used if the condition already |
| contains the ``os`` property and further conditions are required to |
| separate the observed results. |
| |
| An example ``update_properties.json`` file might look like:: |
| |
| { |
| "properties": ["product", "os"], |
| "dependents": {"product": ["browser_channel"], "os": ["version"]} |
| } |
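| |
| The effect of ``dependents`` can be illustrated with a small Python |
| sketch that lists which properties may be added to a condition, given |
| the properties it already uses. This only models the rule described |
| above and is not the algorithm used by ``wpt update-expectations``; the |
| function name is hypothetical. |
| |
```python
def candidate_properties(properties, dependents, used):
    """Properties that may be added to a condition already using ``used``."""
    # Top-level properties are always available until they are used.
    candidates = [p for p in properties if p not in used]
    for prop in used:
        # A dependent only becomes available once its parent is used.
        candidates += [d for d in dependents.get(prop, []) if d not in used]
    return candidates


config = {
    "properties": ["product", "os"],
    "dependents": {"product": ["browser_channel"], "os": ["version"]},
}

print(candidate_properties(config["properties"], config["dependents"], []))
# ['product', 'os']
print(candidate_properties(config["properties"], config["dependents"], ["os"]))
# ['product', 'version']
```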
| |
| Examples |
| ~~~~~~~~ |
| |
| Update all the expectations from a set of cross-platform test runs:: |
| |
| wpt update-expectations --full osx.log linux.log windows.log |
| |
| Add expectation data for some new tests that are expected to be |
| platform-independent:: |
| |
| wpt update-expectations tests.log |
| |
| Why a Custom Format? |
| -------------------- |
| |
| Introduction |
| ~~~~~~~~~~~~ |
| |
| Given the use of the metadata files in CI systems, it was desirable to |
| have something with the following properties: |
| |
| * Human readable |
| |
| * Human editable |
| |
| * Machine readable / writable |
| |
| * Capable of storing key-value pairs |
| |
| * Suitable for storing in a version control system (i.e. text-based) |
| |
| The need for different results per platform means either having |
| multiple expectation files for each platform, or having a way to |
| express conditional values within a certain file. The former would be |
| rather cumbersome for humans updating the expectation files, so the |
| latter approach has been adopted, leading to the requirement: |
| |
| * Capable of storing result values that are conditional on the platform. |
| |
| There are few extant formats that clearly meet these requirements. In |
| particular although conditional properties could be expressed in many |
| existing formats, the representation would likely be cumbersome and |
| error-prone for hand authoring. Therefore it was decided that a custom |
| format offered the best tradeoffs given the requirements. |