The XML domain in FuzzTest is designed to generate well-formed XML (Extensible Markup Language) strings. These strings can be used as inputs for fuzz testing software that parses or processes XML data. By generating a wide variety of valid and complex XML structures, this domain helps uncover potential vulnerabilities, bugs, or unexpected behaviors in XML-handling code.
The XML domain is built using several components of the FuzzTest library, centered around the XmlElement class and the XmlElementDomain function.
XmlElement Structure

The core of the XML generation is the XmlElement class, which represents a single XML element. It holds the element's tag name, attributes, and content. The content is either a simple text string or a vector of child XmlElement objects, which allows recursive, nested structures.
    // Located in: fuzztest/internal/domains/xml_domain.h
    #include <map>
    #include <string>
    #include <utility>  // std::move
    #include <variant>  // Required for std::variant
    #include <vector>

    #include "absl/strings/str_cat.h"
    #include "absl/strings/str_format.h"  // absl::Format
    #include "absl/strings/str_join.h"

    namespace fuzztest::internal {

    // Forward declaration
    class XmlElement;

    // Define content type: can be a string (text) or a vector of child XmlElements
    using XmlContentType = std::variant<std::string, std::vector<XmlElement>>;

    // Represents an XML element.
    class XmlElement {
     public:
      std::string tag_name;
      std::map<std::string, std::string> attributes;
      XmlContentType content;

      XmlElement() = default;
      XmlElement(std::string tag, std::map<std::string, std::string> attrs,
                 XmlContentType cont)
          : tag_name(std::move(tag)),
            attributes(std::move(attrs)),
            content(std::move(cont)) {}

      // Function to serialize the XmlElement to a string.
      // Values are concatenated verbatim; no escaping is applied.
      std::string ToString() const {
        std::string attrs_str;
        for (const auto& attr : attributes) {
          attrs_str += absl::StrCat(" ", attr.first, "=\"", attr.second, "\"");
        }
        std::string content_str;
        if (std::holds_alternative<std::string>(content)) {
          content_str = std::get<std::string>(content);
        } else if (std::holds_alternative<std::vector<XmlElement>>(content)) {
          for (const auto& child : std::get<std::vector<XmlElement>>(content)) {
            content_str += child.ToString();
          }
        }
        return absl::StrCat("<", tag_name, attrs_str, ">", content_str, "</",
                            tag_name, ">");
      }

      // Debug representation used when a failing input is printed.
      template <typename Sink>
      friend void AbslStringify(Sink& sink, const XmlElement& element) {
        std::string attrs_str_repr;
        for (const auto& attr : element.attributes) {
          if (!attrs_str_repr.empty()) attrs_str_repr += ", ";
          attrs_str_repr += absl::StrCat(attr.first, "=", attr.second);
        }
        std::string content_repr;
        if (std::holds_alternative<std::string>(element.content)) {
          content_repr =
              absl::StrCat("\"", std::get<std::string>(element.content), "\"");
        } else if (std::holds_alternative<std::vector<XmlElement>>(
                       element.content)) {
          content_repr = absl::StrCat(
              "[",
              absl::StrJoin(std::get<std::vector<XmlElement>>(element.content),
                            ", ",
                            [](std::string* out, const XmlElement& e) {
                              out->append(e.ToString());
                            }),
              "]");
        }
        absl::Format(&sink,
                     "XmlElement{tag_name=%s, attributes={%s}, content=%s}",
                     element.tag_name, attrs_str_repr, content_repr);
      }
    };

    }  // namespace fuzztest::internal
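One point worth noting about the listing above: ToString() concatenates attribute values and text content verbatim, so any `<` or `&` that reaches it ends up unescaped in the output, and the result is then not well-formed XML. A minimal sketch of an escaping step is shown below. EscapeXml is a hypothetical helper, not part of the header above; it maps the five reserved characters to XML's predefined entity references and could be applied to attribute values and text before they are concatenated.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (an assumption, not part of xml_domain.h): replaces the
// five characters that XML reserves with their predefined entity references.
std::string EscapeXml(std::string_view raw) {
  std::string out;
  out.reserve(raw.size());
  for (char c : raw) {
    switch (c) {
      case '&':  out += "&amp;";  break;
      case '<':  out += "&lt;";   break;
      case '>':  out += "&gt;";   break;
      case '"':  out += "&quot;"; break;
      case '\'': out += "&apos;"; break;
      default:   out += c;        break;
    }
  }
  return out;
}
```

With such a helper, a text value like `a<b & c` would be written as `a&lt;b &amp; c`. An alternative design is to restrict the character sets of the generating domains so that reserved characters never occur, at the cost of never exercising entity handling in the code under test.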
Several helper domains are defined to generate the constituent parts of an XML element:
XmlTagName(): Generates valid XML tag names (alphanumeric, starting with a letter).
XmlAttributeName(): Generates valid XML attribute names (same rules as tag names).
XmlAttributeValue(): Generates string attribute values from printable ASCII characters, excluding the double quote, which the serialization uses as the attribute delimiter.
XmlAttributes(): Creates a map of attribute name-value pairs.
XmlTextContent(): Produces simple text strings that can serve as the content of an XML element.

XmlElementDomain Implementation

The main domain, XmlElementDomain(), orchestrates the generation of XmlElement objects and their subsequent serialization into XML strings.
fuzztest::DomainBuilder<XmlElement>: Defines the recursive structure of XmlElement, allowing an XmlElement to contain other XmlElement objects as children.
The element content (XmlContentType) is generated using fuzztest::OneOf. This domain chooses between:
  A simple text string, produced by XmlTextContent().Map(...).
  A vector of child XmlElement objects, produced by builder.RecursiveContainerOf<std::vector>(...).Map(...). RecursiveContainerOf is what creates nested XML structures; its max_depth and max_elements parameters control complexity.
builder.Set(...): Specifies how to construct an XmlElement object by providing the domains for its tag_name, attributes, and content members.
builder.Build().Map([](XmlElement&& element) { return element.ToString(); }): Takes the generated XmlElement object and calls its ToString() method to produce the final XML string.

    // Located in: fuzztest/internal/domains/xml_domain.h

    // Domain for XML tag names (alphanumeric, starting with a letter)
    inline auto XmlTagName() {
      return StringOf(AlphaNumericChar())
          .WithMinSize(1)
          .Filter([](const std::string& s) {
            // Cast avoids undefined behavior for negative char values.
            return !s.empty() && std::isalpha(static_cast<unsigned char>(s[0]));
          });
    }

    // Domain for XML attribute names (alphanumeric, starting with a letter)
    inline auto XmlAttributeName() {
      return StringOf(AlphaNumericChar())
          .WithMinSize(1)
          .Filter([](const std::string& s) {
            return !s.empty() && std::isalpha(static_cast<unsigned char>(s[0]));
          });
    }

    // Domain for XML attribute values (printable ASCII characters, excluding
    // the double quote and the backslash)
    inline auto XmlAttributeValue() {
      return StringOf(CharacterSet(
          " abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
          "!#$%&'()*+,-./:;<=>?@[]^_`{|}~"));
    }

    // Domain for XML attributes (a map of name-value pairs)
    inline auto XmlAttributes() {
      return Map(
          // std::map has no constructor taking a vector, so build it from the
          // iterator range. Duplicate names collapse to the first occurrence.
          [](const std::vector<std::pair<std::string, std::string>>& pairs) {
            return std::map<std::string, std::string>(pairs.begin(),
                                                      pairs.end());
          },
          VectorOf(PairOf(XmlAttributeName(), XmlAttributeValue()))
              .WithMaxSize(5));
    }

    // Domain for XML content (text)
    inline auto XmlTextContent() { return StringOf(PrintableAsciiChar()); }

    // Main XML Element Domain
    inline auto XmlElementDomain() {
      DomainBuilder<XmlElement> builder;

      // Define the domain for the content of an XML element.
      // It can be either a simple text string or a vector of child XmlElement
      // objects. max_depth bounds the recursive part to prevent infinite
      // recursion; max_elements bounds the container size.
      auto xml_content_variant_domain = OneOf(
          XmlTextContent().Map(
              [](std::string s) { return XmlContentType(std::move(s)); }),
          builder
              .RecursiveContainerOf<std::vector>(/*max_depth=*/3,
                                                 /*max_elements=*/5)
              .Map([](std::vector<XmlElement> v) {
                return XmlContentType(std::move(v));
              }));

      builder.Set(XmlTagName(),               // domain for tag_name
                  XmlAttributes(),            // domain for attributes
                  xml_content_variant_domain  // domain for content
      );

      return builder.Build().Map([](XmlElement&& element) {
        // Use the ToString() method of XmlElement for serialization.
        return element.ToString();
      });
    }
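To make the role of the max_depth and max_elements bounds concrete, here is a minimal sketch of depth-bounded recursive generation that is independent of FuzzTest. Node, RandomNode, and Depth are hypothetical names introduced for illustration, and treating max_depth as the total number of nesting levels is an assumption about the parameter's meaning rather than a statement about the library.

```cpp
#include <algorithm>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for XmlElement, reduced to a tag and its children.
struct Node {
  std::string tag;
  std::vector<Node> children;
};

// Builds a random tree with at most max_depth levels in which every child
// list holds at most max_elements entries.
Node RandomNode(std::mt19937& rng, int max_depth, int max_elements) {
  Node node;
  node.tag = std::string(1, static_cast<char>('a' + rng() % 26));
  if (max_depth <= 1) return node;  // Depth budget used up: emit a leaf.
  std::uniform_int_distribution<int> count(0, max_elements);
  const int n = count(rng);
  for (int i = 0; i < n; ++i) {
    // Each recursive call gets one level less, so recursion must terminate.
    node.children.push_back(RandomNode(rng, max_depth - 1, max_elements));
  }
  return node;
}

// Returns the number of levels in the tree rooted at node (a leaf counts 1).
int Depth(const Node& node) {
  int deepest = 0;
  for (const Node& child : node.children) {
    deepest = std::max(deepest, Depth(child));
  }
  return 1 + deepest;
}
```

Without the depth budget, a generator that may always choose "vector of children" has no guaranteed stopping point; with it, the worst-case size of a generated tree is bounded by max_elements raised to the power of the remaining depth, summed over the levels.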
To use the XML domain in your fuzz test:
Include Header: Add the necessary header file.
#include "fuzztest/internal/domains/xml_domain.h"
(Note: The exact path might vary based on your project's include structure.)
Use in FUZZ_TEST: Specify fuzztest::internal::XmlElementDomain() in the .WithDomains() clause of your FUZZ_TEST macro.
Here's an example demonstrating how to use the XmlElementDomain to test a hypothetical XML processing function:
    // Located in: examples/xml_fuzz_test.cc
    // NOTE: This example uses cc_test in its BUILD file instead of cc_fuzz_test
    // due to difficulties in locating the correct Bazel macro for cc_fuzz_test
    // in the testing environment. When integrating FuzzTest into your own
    // project, ensure your Bazel setup is correct to use cc_fuzz_test for full
    // fuzzing capabilities (corpus generation, coverage guidance, etc.).
    // The FUZZ_TEST macro itself is still used here and should work with a
    // FuzzTest-compatible main function (like fuzztest_gtest_main).
    #include <iostream>
    #include <regex>
    #include <stack>
    #include <string>

    #include "fuzztest/fuzztest.h"
    #include "fuzztest/internal/domains/xml_domain.h"  // Adjust path as necessary
    #include "gtest/gtest.h"  // TEST and EXPECT_* macros

    // A simple placeholder function to "consume" XML strings.
    void ProcessXml(const std::string& xml_string) {
      // std::cout << "Generated XML: " << xml_string << std::endl;
      // Group 1: optional leading '/' (closing tag), group 2: tag name,
      // group 3: the rest of the tag (attributes, optional trailing '/').
      static const std::regex tag_regex("<(/?)([a-zA-Z0-9_:]+)([^>]*)>");
      std::smatch match;
      std::string::const_iterator search_start(xml_string.cbegin());
      std::stack<std::string> open_tags;

      while (std::regex_search(search_start, xml_string.cend(), match,
                               tag_regex)) {
        const bool is_closing = match[1].length() > 0;
        const std::string tag_name = match[2].str();
        const std::string rest = match[3].str();

        if (!is_closing && !rest.empty() && rest.back() == '/') {
          // Self-closing tag: nothing to track.
        } else if (is_closing) {
          // Closing tag: must match the most recently opened tag.
          if (open_tags.empty() || open_tags.top() != tag_name) {
            return;
          }
          open_tags.pop();
        } else {
          // Opening tag
          open_tags.push(tag_name);
        }
        search_start = match.suffix().first;
      }
      // Optional: Assert that all tags were closed if strict well-formedness
      // is always expected.
      // EXPECT_TRUE(open_tags.empty());
    }

    // Property function: FUZZ_TEST requires a function with this exact name.
    void ProcessXmlConsumesValidXml(const std::string& xml_string) {
      ProcessXml(xml_string);
    }

    // Fuzz test for the ProcessXml function using the XML domain
    FUZZ_TEST(XmlFuzzTest, ProcessXmlConsumesValidXml)
        .WithDomains(fuzztest::internal::XmlElementDomain());

    // Basic test with a fixed string
    TEST(XmlFuzzTest, ProcessXmlSimpleValidCase) {
      ProcessXml("<root><item>Test</item></root>");
    }
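ProcessXml returns nothing, so the fuzz test above can only detect crashes. If a property should also be asserted, a checker that returns a verdict is easier to work with. The following is a minimal, regex-free sketch of such a checker. TagsAreBalanced is a hypothetical function, and it assumes that `<` and `>` appear only as tag delimiters, which holds for generated strings only if text and attribute values are escaped or drawn from a restricted character set.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical checker: returns true when every opening tag in `xml` is closed
// by a matching closing tag in the correct order. Assumes a simplified grammar
// with no comments, CDATA sections, or processing instructions.
bool TagsAreBalanced(std::string_view xml) {
  std::vector<std::string> open_tags;
  std::size_t pos = 0;
  while ((pos = xml.find('<', pos)) != std::string_view::npos) {
    const std::size_t end = xml.find('>', pos);
    if (end == std::string_view::npos) return false;  // Unterminated tag.
    std::string_view body = xml.substr(pos + 1, end - pos - 1);
    pos = end + 1;
    if (body.empty()) return false;  // "<>" is not a tag.
    if (body.front() == '/') {
      // Closing tag: must match the most recently opened tag.
      body.remove_prefix(1);
      if (open_tags.empty() || open_tags.back() != body) return false;
      open_tags.pop_back();
    } else if (body.back() != '/') {
      // Opening tag: the name runs up to the first space, if any.
      open_tags.emplace_back(body.substr(0, body.find(' ')));
    }
    // Self-closing tags (ending in '/') need no bookkeeping.
  }
  return open_tags.empty();
}
```

A property function could then call `EXPECT_TRUE(TagsAreBalanced(xml_string))`, turning any unbalanced output of the generator or of a transformation under test into a reported failure.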
The provided example (examples/xml_fuzz_test.cc) is configured in its examples/BUILD file to run using a standard cc_test Bazel rule, linking with //fuzztest:fuzztest_gtest_main. This was a workaround due to difficulties encountered in the development environment with loading the cc_fuzz_test Bazel macro.
For full-fledged fuzzing (including corpus generation, advanced coverage guidance, etc.), you should use the cc_fuzz_test rule provided by FuzzTest. This typically involves loading it from a .bzl file (e.g., load("@com_google_fuzztest//fuzztest:build_defs.bzl", "cc_fuzz_test")) and using cc_fuzz_test instead of cc_test in your BUILD file. Ensure your project's Bazel WORKSPACE and FuzzTest integration are set up correctly to make cc_fuzz_test available.
    # Example BUILD file structure (conceptual) for cc_fuzz_test:
    # load("@com_google_fuzztest//fuzztest:build_defs.bzl", "cc_fuzz_test")  # Or the correct path for your setup
    # cc_fuzz_test(
    #     name = "xml_fuzz_test",
    #     srcs = ["xml_fuzz_test.cc"],
    #     deps = [
    #         "//fuzztest:fuzztest",
    #         "//fuzztest/internal/domains:xml_domain_impl",  # Or your path to xml_domain.h
    #         # Other necessary dependencies
    #     ],
    # )
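Because the XML domain produces nested, structured input rather than arbitrary bytes, fuzz tests built on it can exercise the parts of XML-processing code that lie beyond the initial syntax checks, which helps developers make that code more robust and secure.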