| # Data Model |
| |
| The SuperSize data model is a sorted flat list of symbols. Using a flat list is |
| simple, and allows arbitrary queries to be made on symbols. |
| |
| [//tools/binary_size/libsupersize/models.py] contains the definition of all data |
| classes. |
| |
| [//tools/binary_size/libsupersize/models.py]: /tools/binary_size/libsupersize/models.py |
| |
| [TOC] |
| |
| ## Python API Reference |
| |
| ### SizeInfo |
| |
| Represents the data within a `.size` file. Contains: |
| |
| * `build_config`: JSON metadata applicable to all symbols. |
| * `containers`: List of Container instances used by symbols in this SizeInfo. |
| * `raw_symbols`: List of Symbols. |
| |
| ### Symbol |
| |
| Each symbol contains the following fields: |
| |
| * `container`: A (shared) Container instance. |
| * `section_name`: E.g. ".text", ".rodata", ".data.rel.local" |
| * `section`: The single character abbreviation of `section_name`. |
| E.g. "t", "r", "d". |
| * `size`: The number of bytes this symbol takes up, including padding that |
| comes before |address|. |
| * `padding`: The number of bytes of padding before |address|. |
| * `address` (optional): The start address of the symbol. |
| * `source_path` (optional): Path to the source file that caused this symbol to |
| exist (e.g. `base64.cc`, `SomeClass.java`). |
| * `object_path` (optional): |
| * For native and pak: Path to associated object file. E.g.: `base/base64.o` |
| * For dex: Package path. E.g.: `$APK/org/chromium/chrome/SomeClass.class` |
| * `aliases`: List of symbols that represent the same bytes. The |aliases| of |
| each symbol in this list points to the same list instance. |
| * `num_aliases`: The number of symbols with the same address (including self). |
| * `pss`: `size` / `num_aliases`. |
| * `padding_pss`: `padding` / `num_aliases`. |
| * `full_name`: Name for this symbol. |
| * Symbols are not required to have unique names, or names as all (empty |
| string is valid). |
| * `template_name`: Derived from `full_name`. Name with parameter list removed, |
| but template parameters present. |
| * `name`: Derived from `full_name`. Names with templates and parameter list |
| removed. |
| * `component`: The team that owns this feature (optional, maybe be empty). |
| * `flags`: Bitmask of flags. See `FLAG_*` constants in `models.py`. |
| * `disassembly` (optional): The disassembly code for the symbol. |
| |
| ### Diffs |
| |
| Diffs are represented in Python using `DeltaSizeInfo`, which contains a list of |
| `DeltaSymbol` instances. `DeltaSymbols` maintain the full fidelity of symbols in |
| the diff by storing a pointer to the before / after symbol that they represent. |
| See [diffs.md](diffs.md) for more details. |
| |
| ## Concepts |
| |
| ### Symbol Aliases |
| |
| Aliases occur when multiple symbols refer to the same bytes (have the same |
| `address`, `size`, and `padding`). |
| |
| Examples of where aliases are used: |
| |
| * Functions with identical code are de-deuped via identical code folding. |
| * Functions that appear in multiple translation units (e.g. functions with |
| inline linkage). These have the same name, but different paths. |
| * Represented as one alias per path, but are collapsed into a single symbol |
| with a path of `$COMMON_PREFIX/{shared}/$SYMBOL_COUNT` when the number of |
| aliases is large. |
| * E.g.: `base/{shared}/3` |
| * String literals that are de-duped by identical code folding. |
| * Pak entries with identical payloads. |
| |
| ### Path Normalization |
| |
| * Prefixes are removed: `out/Release/`, `gen/`, `obj/` |
| * This causes generated files to overlay non-generated source tree, which is |
| useful for attribution since the two generally mirror one another. |
| * Generated symbols have the `FLAG_GENERATED` bit set. |
| |
| ### Overhead and Star Symbols |
| |
| **Overhead symbols** are symbols with a name that starts with "Overhead:". They |
| track bytes that are generally unactionable. They are recorded as padding-only |
| symbols (e.g.: `size=10`, `padding=10`, `size_without_padding=0`) because |
| "padding" better associates with "overhead" vs. size. |
| |
| * `Overhead: ELF file`: `elf_file_size - sum(elf_sections)`. |
| * Captures bytes taken up by ELF headers and section alignment. |
| * `Overhead: APK file`: `apk_file_size - sum(compressed_file_sizes)` |
| * Captures bytes taken up by `.zip` metadata and zipalign padding. |
| * `Overhead: ${NAME}.pak`: `pak_file_size - sum(pak_entries)` |
| * `Overhead: aggregate padding of diff'ed symbols`: Appears in symbol diffs to |
| represent the per-section cumulative delta in padding. |
| |
| **Star symbols** are symbols with a name that starts with "\*\*". They represent |
| sections of binary that are unattributed. |
| |
| Examples: |
| |
| * `** Merge Globals` - Taken from linker map file. A section of data |
| containing unnamed constants. |
| * `** Symbol gap`: A gap between symbols that is larger than what could be |
| plausibly be due to alignment. |
| * `** ELF Section: .rel.dyn`: A native code ELF section that is not broken down |
| into smaller symbols. |