impl/memory/README.md - external/github.com/luci/gae - Git at Google

 Datastore implementation internals
 ==================================

 This document contains internal implementation details for this in-memory
 version of datastore. It's mostly helpful to understand what's going on in this
 implementation, but it also can reveal some insight into how the real appengine
 datastore works (though note that the specific encodings are different).

 Additionally note that this implementation cheats by moving some of the Key
 bytes into the table (collection) names (like the namespace, property name for
 the builtin indexes, etc.). The real implementation contains these bytes in the
 table row keys, I think.


 Internal datastore key/value collection schema
 ----------------------------------------------

 The datastore implementation here uses several different tables ('collections')
 to manage state for the data. The schema for these tables is enumerated below
 to make the code a bit easier to reason about.

 All datastore user objects (Keys, Properties, PropertyMaps, etc.) are serialized
 using `github.com/luci/gae/service/datastore/serialize`, which in turn uses the
 primitives available in `github.com/luci/luci-go/common/cmpbin`. The encodings
 are important to understanding why the schemas below sort correctly when
 compared only using `bytes.Compare` (aka `memcmp`). This doc will assume that
 you're familiar with those encodings, but will point out where we diverge from
 the stock encodings.

 All encoded Property values used in memory store Keys (i.e. index rows) are
 serialized using the settings `serialize.WithoutContext`, and
 `datastore.ShouldIndex`.

 ### Primary table

 The primary table maps datastore keys to entities.

 - Name: `"ents:" + namespace`
 - Key: serialized datastore.Property containing the entity's datastore.Key
 - Value: serialized datastore.PropertyMap

 This table also encodes values for the following special keys:

 - Every entity root (e.g. a Key with nil Parent()) with key K has:
   - `Key("__entity_group__", 1, K)` -> `{"__version__": PTInt}`
     A child entity with the kind `__entity_group__` and an id of `1`. The value
     has a single property `__version__`, which contains the version number of
     this entity group. This is used to detect transaction conflicts.
   - `Key("__entity_group_ids__", 1, K)` -> `{"__version__": PTInt}`
     A child entity with the kind `__entity_group__` and an id of `1`. The value
     has a single property `__version__`, which contains the last automatically
     allocated entity ID for entities within this entity group.
 - A root entity with the key `Key("__entity_group_ids__",1)` which contains the
   same `__version__` property, and indicates the last automatically allocated
   entity ID for root entities.

 ### Compound Index table

 The next table keeps track of all the user-added 'compound' index descriptions
 (not the content for the indexes). There is a row in this table for each
 compound index that the user adds by calling `ds.Raw().Testable().AddIndexes`.

 - Name: `"idx"`
 - Key: normalized, serialized `datastore.IndexDefinition` with the SortBy slice
   in reverse order (i.e. `datastore.IndexDefinition.PrepForIdxTable()`).
 - Value: empty

 The Key format here requires some special attention. Say you started with
 a compound IndexDefinition of:

     IndexDefinition{
       Kind: "Foo",
       Ancestor: true,
       SortBy: []IndexColumn{
         {Property: "Something", Direction: DESCENDING},
         {Property: "Else", Direction: ASCENDING},
         {Property: "Cool", Direction: ASCENDING},
       }
     }

 After prepping it for the table, it would be equivalent to:

     IndexDefinition{
       Kind: "Foo",
       Ancestor: true,
       SortBy: []IndexColumn{
         {Property: "__key__", Direction: ASCENDING},
         {Property: "Cool", Direction: ASCENDING},
         {Property: "Else", Direction: ASCENDING},
         {Property: "Something", Direction: DESCENDING},
       }
     }

 The reason for doing this will be covered in the `Query Planning` section, but
 it boils down to allowing the query planner to use this encoded table to
 intelligently scan for potentially useful compound indexes.

 ### Index Tables

 Every index (both builtin and compound) has one index table per namespace, which
 contains as rows every entry in the index, one per row.

 - Name: `"idx:" + namespace + IndexDefinition.PrepForIdxTable()`
 - Key: concatenated datastore.Property values, one per SortBy column in the
   IndexDefinition (the non-PrepForIdxTable version). If the SortBy column is
   DESCENDING, the serialized Property is inverted (e.g. XOR 0xFF).
 - Value: empty

 If the IndexDefinition has `Ancestor: true`, then the very first column of the
 Key contains the partial Key for the entity. So if an entity has the datastore
 key `/Some,1/Thing,2/Else,3`, it would have the values `/Some,1`,
 `/Some,1/Thing,2`, and `/Some,1/Thing,2/Else,3` as value in the ancestor column
 of indexes that it matches.

 #### Builtin (automatic) indexes

 The following indexes are automatically created for some entity with a key
 `/Kind,*`, for every property (with `ShouldIndex` values) named "Foo":

     IndexDefinition{ Kind: "Kind", Ancestor: false, SortBy: []IndexColumn{
       {Property: "__key__", Direction: ASCENDING},
     }}
     IndexDefinition{ Kind: "Kind", Ancestor: false, SortBy: []IndexColumn{
       {Property: "Foo", Direction: ASCENDING},
       {Property: "__key__", Direction: ASCENDING},
     }}
     IndexDefinition{ Kind: "Kind", Ancestor: false, SortBy: []IndexColumn{
       {Property: "Foo", Direction: DESCENDING},
       {Property: "__key__", Direction: ASCENDING},
     }}

 Index updates
 -------------

 (Note that this is a LARGE departure from how the production appengine datastore
 does this. This model only works because the implementation is not distributed,
 and not journaled. The real datastore does index updates in parallel and is
 generally pretty fancy compared to this).

 Index updates are pretty straightforward. On a mutation to the primary entity
 table, we take the old entity value (remember that entity values are
 PropertyMaps), the new property value, create index entries for both, merge
 them, and apply the deltas to the affected index tables (i.e. entries that
 exist in the old entities, but not the new ones, are deleted. Entries that exist
 in the new entities, but not the old ones, are added).

 Index generation works (given an slice of indexes []Idxs) by:

 * serializing all ShouldIndex Properties in the PropertyMap to get a
   `map[name][]serializedProperty`.
 * for each index idx
   * if idx's columns contain properties that are not in the map, skip idx
   * make a `[][]serializedProperty`, where each serializedProperty slice
     corresponds with the IndexColumn of idx.SortBy
     * duplicated values for multi-valued properties are skipped.
   * generate a `[]byte` row which is the contatenation of one value from each
     `[]serializedProperty`, permuting through all combinations. If the SortBy
     column is DESCENDING, make sure to invert (XOR 0xFF) the serializedProperty
     value!.
   * add that generated []byte row to the index's corresponding table.

 Note that we choose to serialize all permutations of the saved entity. This is
 so that we can use repeated-column indexes to fill queries which use a subset of
 the columns. E.g. if we have the index `duck,duck,duck,goose`, we can
 theoretically use it to fill a query for `duck=1,duck=2,goose>"canadian"`, by
 pasting 1 or 2 as the value for the 3rd `duck` column. This simplifies index
 selection at the expense of larger indexes. However, it means that if you have
 the entity:

     duck = 1, 2, 3, 4
     goose = "færøske"

 It generates the following index entries:

     duck=1,duck=1,duck=1,goose="færøske"
     duck=1,duck=1,duck=2,goose="færøske"
     duck=1,duck=1,duck=3,goose="færøske"
     duck=1,duck=1,duck=4,goose="færøske"
     duck=1,duck=2,duck=1,goose="færøske"
     duck=1,duck=2,duck=2,goose="færøske"
     duck=1,duck=2,duck=3,goose="færøske"
     duck=1,duck=2,duck=4,goose="færøske"
     duck=1,duck=3,duck=1,goose="færøske"
     ... a lot ...
     duck=4,duck=4,duck=4,goose="færøske"

 This is a very large number of index rows (i.e. an 'exploding index')!

 An alternate design would be to only generate unique permutations of elements
 where the index has repeated columns of a single property. This only makes sense
 because it's illegal to have an equality and an inequality on the same property,
 under the current constraints of appengine (though not completely ridiculous in
 general, if inequality constraints meant the same thing as equality constraints.
 However it would lead to a multi-dimensional query, which can be quite slow and
 is very difficult to scale without application knowledge). If we do this, it
 also means that we need to SORT the equality filter values when generating the
 prefix (so that the least-valued equality constraint is first). If we did this,
 then the generated index rows for the above entity would be:

     duck=1,duck=2,duck=3,goose="færøske"
     duck=1,duck=2,duck=4,goose="færøske"
     duck=1,duck=3,duck=4,goose="færøske"
     duck=2,duck=3,duck=4,goose="færøske"

 Which be a LOT more compact. It may be worth implementing this restriction
 later, simply for the memory savings when indexing multi-valued properties.

 If this technique is used, there's also room to unambiguously index entities
 with repeated equivalent values. E.g. if duck=1,1,2,3,4 , then you could see
 a row in the index like:

     duck=1,duck=1,duck=2,goose="færøske"

 Which would allow you to query for "an entity which has duck values equal to 1,
 1 and 2". Currently such a query is not possible to execute (it would be
 equivalent to "an entity which has duck values equal to 1 and 2").

 Query planning
 --------------

 Now that we have all of our data tabulated, let's plan some queries. The
 high-level algorithm works like this:

 * Generate a suffix format from the user's query which looks like:
   * orders (including the inequality as the first order, if any)
   * projected fields which aren't explicitly referenced in the orders (we
     assume ASCENDING order for them), in the order that they were projected.
   * `__key__` (implied ascending, unless the query's last sort order is for
     `__key__`, in which case it's whatever order the user specified)
 * Reverse the order of this suffix format, and serialize it into an
   IndexDefinition, along with the query's Kind and Ancestor values. This
   does what PrepForIdxTable did when we added the Index in the first place.
 * Use this serialized reversed index to find compound indexes which might
   match by looking up rows in the "idx" table which begin with this serialized
   reversed index.
 * Generate every builtin index for the inequality + equality filter
   properties, and see if they match too.

 An index is a potential match if its suffix *exactly* matches the suffix format,
 and it contains *only* sort orders which appear in the query (e.g. the index
 contains a column which doesn't appear as an equality or inequlity filter).

 The index search continues until:

 * We find at least one matching index; AND
 * The combination of all matching indexes accounts for every equality filter
   at least once.

 If we fail to find sufficient indexes to fulfill the query, we generate an index
 description that *could* be sufficient by concatenating all missing equality
 filters, in ascending order, followed by concatenating the suffix format that we
 generated for this query. We then suggest this new index to the user for them to
 add by returing an error containing the generated IndexDefinition. Note that
 this index is not REQUIRED for the user to add; they could choose to add bits
 and pieces of it, extend existing indexes in order to cover the missing columns,
 invert the direction of some of the equality columns, etc.

 Recall that equality filters are expressed as
 `map[propName][]serializedProperty`. We'll refer to this mapping as the
 'constraint' mapping below.

 To actually come up with the final index selection, we sort all the matching
 indexes from greatest number of columns to least. We add the 0th index (the one
 with the greatest number of columns) unconditionally. We then keep adding indexes
 which contain one or more of the remaining constraints, until we have no
 more constraints to satisfy.

 Adding an index entails determining which columns in that index correspond to
 equality columns, and which ones correspond to inequality/order/projection
 columns. Recall that the inequality/order/projection columns are all the same
 for all of the potential indices (i.e. they all share the same *suffix format*).
 We can use this to just iterate over the index's SortBy columns which we'll use
 for equality filters. For each equality column, we remove a corresponding value
 from the constraints map. In the event that we _run out_ of constraints for a
 given column, we simply _pick an arbitrary value_ from the original equality
 filter mapping and use that. This is valid to do because, after all, they're
 equality filters.

 Note that for compound indexes, the ancestor key counts as an equality filter,
 and if the compound index has `Ancestor: true`, then we implicitly put the
 ancestor as if it were the first SortBy column. For satisfying Ancestor queries
 with built-in indexes, see the next section.

 Once we've got our list of constraints for this index, we concatenate them all
 together to get the *prefix* for this index. When iterating over this index, we
 only ever want to see index rows whose prefix exactly matches this. Unlike the
 suffix formt, the prefix is per-index (remember that ALL indexes in the
 query must have the same suffix format).

 Finally, we set the 'start' and 'end' values for all chosen indexes to either
 the Start and End cursors, or the Greater-Than and Less-Than values for the
 inequality. The Cursors contain values for every suffix column, and the
 inequality only contains a value for the first suffix column. If both cursors
 and an inequality are specified, we take the smaller set of both (the
 combination which will return the fewest rows).

 That's about it for index selection! See Query Execution for how we actually use
 the selected indexes to run a query.

 ### Ancestor queries and Built-in indexes

 You may have noticed that the built-in indexes can be used for Ancestor queries
 with equality filters, but they don't start with the magic Ancestor column!

 There's a trick that you can do if the suffix format for the query is just
 `__key__` though (e.g. the query only contains equality filters, and/or an
 inequality filter on `__key__`). You can serialize the datastore key that you're
 planning to use for the Ancestor query, then chop off the termintating null byte
 from the encoding, and then use this as additional prefix bytes for this index.
 So if the builtin for the "Val" property has the column format of:

     {Property: "Val"}, {Property: "__key__"}

 And your query holds Val as an equality filter, you can serialize the
 ancestor key (say `/Kind,1`), and add those bytes to the prefix. So if you had
 an index row:

     PTInt ++ 100 ++ PTKey ++ "Kind" ++ 1 ++ CONTINUE ++ "Child" ++ 2 ++ STOP

 (where CONTINUE is the byte 0x01, and STOP is 0x00), you can form a prefix for
 the query `Query("Kind").Ancestor(Key(Kind, 1)).Filter("Val =", 100)` as:

     PTInt ++ 100 ++ PTKey ++ "Kind" ++ 1

 Omitting the STOP which normally terminates the Key encoding. Using this prefix
 will only return index rows which are `/Kind,1` or its children.

 "That's cool! Why not use this trick for compound indexes?", I hear you ask :)
 Remember that this trick only works if the prefix before the `__key__` is
 *entirely* composed of equality filters. Also recall that if you ONLY use
 equality filters and Ancestor (and possibly an inequality on `__key__`), then
 you can always satisfy the query from the built-in indexes! While you
 technically could do it with a compound index, there's not really a point to
 doing so. To remain faithful to the production datastore implementation, we
 don't implement this trick for anything other than the built-in indexes.

 ### Cursor format

 Cursors work by containing values for each of the columns in the suffix, in the
 order and Direction specified by the suffix. In fact, cursors are just encoded
 versions of the []IndexColumn used for the 'suffix format', followed by the
 raw bytes of the suffix for that particular row (incremented by 1 bit).

 This means that technically you can port cursors between any queries which share
 precisely the same suffix format, regardless of other query options, even if the
 index planner ends up choosing different indexes to use from the first query to
 the second. No state is maintained in the service implementation for cursors.

 I suspect that this is a more liberal version of cursors than how the production
 appengine implements them, but I haven't verified one way or the other.

 Query execution
 ---------------

 Last but not least, we need to actually execute the query. After figuring out
 which indexes to use with what prefixes and start/end values, we essentially
 have a list of index subsets, all sorted the same way. To pull the values out,
 we start by iterating the first index in the list, grabbing its suffix value,
 and trying to iterate from that suffix in the second, third, fourth, etc index.

 If any index iterates past that suffix, we start back at the 0th index with that
 suffix, and continue to try to find a matching row. Doing this will end up
 skipping large portions of all of the indexes in the list. This is the algorithm
 known as "zigzag merge join", and you can find talks on it from some of the
 appengine folks. It has very good algorithmic running time and tends to scale
 with the number of full matches, rather than the size of all of the indexes
 involved.

 A hit occurs when all of the iterators have precisely the same suffix. This hit
 suffix is then decoded using the suffix format information. The very last column
 of the suffix will always be the datastore key. The suffix is then used to call
 back to the user, according to the query type:

 * keys-only queries just directly return the Key
 * projection queries return the projected fields from the decoded suffix.
   Remember how we added all the projections after the orders? This is why. The
   projected values are pulled directly from the index, instead of going to the
   main entity table.
 * normal queries pull the decoded Key from the "ents" table, and return that
   entity to the user.
	Datastore implementation internals
	==================================

	This document contains internal implementation details for this in-memory
	version of datastore. It's mostly helpful to understand what's going on in this
	implementation, but it also can reveal some insight into how the real appengine
	datastore works (though note that the specific encodings are different).

	Additionally note that this implementation cheats by moving some of the Key
	bytes into the table (collection) names (like the namespace, property name for
	the builtin indexes, etc.). The real implementation contains these bytes in the
	table row keys, I think.


	Internal datastore key/value collection schema
	----------------------------------------------

	The datastore implementation here uses several different tables ('collections')
	to manage state for the data. The schema for these tables is enumerated below
	to make the code a bit easier to reason about.

	All datastore user objects (Keys, Properties, PropertyMaps, etc.) are serialized
	using `github.com/luci/gae/service/datastore/serialize`, which in turn uses the
	primitives available in `github.com/luci/luci-go/common/cmpbin`. The encodings
	are important to understanding why the schemas below sort correctly when
	compared only using `bytes.Compare` (aka `memcmp`). This doc will assume that
	you're familiar with those encodings, but will point out where we diverge from
	the stock encodings.

	All encoded Property values used in memory store Keys (i.e. index rows) are
	serialized using the settings `serialize.WithoutContext`, and
	`datastore.ShouldIndex`.

	### Primary table

	The primary table maps datastore keys to entities.

	- Name: `"ents:" + namespace`
	- Key: serialized datastore.Property containing the entity's datastore.Key
	- Value: serialized datastore.PropertyMap

	This table also encodes values for the following special keys:

	- Every entity root (e.g. a Key with nil Parent()) with key K has:
	- `Key("__entity_group__", 1, K)` -> `{"__version__": PTInt}`
	A child entity with the kind `__entity_group__` and an id of `1`. The value
	has a single property `__version__`, which contains the version number of
	this entity group. This is used to detect transaction conflicts.
	- `Key("__entity_group_ids__", 1, K)` -> `{"__version__": PTInt}`
	A child entity with the kind `__entity_group__` and an id of `1`. The value
	has a single property `__version__`, which contains the last automatically
	allocated entity ID for entities within this entity group.
	- A root entity with the key `Key("__entity_group_ids__",1)` which contains the
	same `__version__` property, and indicates the last automatically allocated
	entity ID for root entities.

	### Compound Index table

	The next table keeps track of all the user-added 'compound' index descriptions
	(not the content for the indexes). There is a row in this table for each
	compound index that the user adds by calling `ds.Raw().Testable().AddIndexes`.

	- Name: `"idx"`
	- Key: normalized, serialized `datastore.IndexDefinition` with the SortBy slice
	in reverse order (i.e. `datastore.IndexDefinition.PrepForIdxTable()`).
	- Value: empty

	The Key format here requires some special attention. Say you started with
	a compound IndexDefinition of:

	IndexDefinition{
	Kind: "Foo",
	Ancestor: true,
	SortBy: []IndexColumn{
	{Property: "Something", Direction: DESCENDING},
	{Property: "Else", Direction: ASCENDING},
	{Property: "Cool", Direction: ASCENDING},
	}
	}

	After prepping it for the table, it would be equivalent to:

	IndexDefinition{
	Kind: "Foo",
	Ancestor: true,
	SortBy: []IndexColumn{
	{Property: "__key__", Direction: ASCENDING},
	{Property: "Cool", Direction: ASCENDING},
	{Property: "Else", Direction: ASCENDING},
	{Property: "Something", Direction: DESCENDING},
	}
	}

	The reason for doing this will be covered in the `Query Planning` section, but
	it boils down to allowing the query planner to use this encoded table to
	intelligently scan for potentially useful compound indexes.

	### Index Tables

	Every index (both builtin and compound) has one index table per namespace, which
	contains as rows every entry in the index, one per row.

	- Name: `"idx:" + namespace + IndexDefinition.PrepForIdxTable()`
	- Key: concatenated datastore.Property values, one per SortBy column in the
	IndexDefinition (the non-PrepForIdxTable version). If the SortBy column is
	DESCENDING, the serialized Property is inverted (e.g. XOR 0xFF).
	- Value: empty

	If the IndexDefinition has `Ancestor: true`, then the very first column of the
	Key contains the partial Key for the entity. So if an entity has the datastore
	key `/Some,1/Thing,2/Else,3`, it would have the values `/Some,1`,
	`/Some,1/Thing,2`, and `/Some,1/Thing,2/Else,3` as value in the ancestor column
	of indexes that it matches.

	#### Builtin (automatic) indexes

	The following indexes are automatically created for some entity with a key
	`/Kind,*`, for every property (with `ShouldIndex` values) named "Foo":

	IndexDefinition{ Kind: "Kind", Ancestor: false, SortBy: []IndexColumn{
	{Property: "__key__", Direction: ASCENDING},
	}}
	IndexDefinition{ Kind: "Kind", Ancestor: false, SortBy: []IndexColumn{
	{Property: "Foo", Direction: ASCENDING},
	{Property: "__key__", Direction: ASCENDING},
	}}
	IndexDefinition{ Kind: "Kind", Ancestor: false, SortBy: []IndexColumn{
	{Property: "Foo", Direction: DESCENDING},
	{Property: "__key__", Direction: ASCENDING},
	}}

	Index updates
	-------------

	(Note that this is a LARGE departure from how the production appengine datastore
	does this. This model only works because the implementation is not distributed,
	and not journaled. The real datastore does index updates in parallel and is
	generally pretty fancy compared to this).

	Index updates are pretty straightforward. On a mutation to the primary entity
	table, we take the old entity value (remember that entity values are
	PropertyMaps), the new property value, create index entries for both, merge
	them, and apply the deltas to the affected index tables (i.e. entries that
	exist in the old entities, but not the new ones, are deleted. Entries that exist
	in the new entities, but not the old ones, are added).

	Index generation works (given an slice of indexes []Idxs) by:

	* serializing all ShouldIndex Properties in the PropertyMap to get a
	`map[name][]serializedProperty`.
	* for each index idx
	* if idx's columns contain properties that are not in the map, skip idx
	* make a `[][]serializedProperty`, where each serializedProperty slice
	corresponds with the IndexColumn of idx.SortBy
	* duplicated values for multi-valued properties are skipped.
	* generate a `[]byte` row which is the contatenation of one value from each
	`[]serializedProperty`, permuting through all combinations. If the SortBy
	column is DESCENDING, make sure to invert (XOR 0xFF) the serializedProperty
	value!.
	* add that generated []byte row to the index's corresponding table.

	Note that we choose to serialize all permutations of the saved entity. This is
	so that we can use repeated-column indexes to fill queries which use a subset of
	the columns. E.g. if we have the index `duck,duck,duck,goose`, we can
	theoretically use it to fill a query for `duck=1,duck=2,goose>"canadian"`, by
	pasting 1 or 2 as the value for the 3rd `duck` column. This simplifies index
	selection at the expense of larger indexes. However, it means that if you have
	the entity:

	duck = 1, 2, 3, 4
	goose = "færøske"

	It generates the following index entries:

	duck=1,duck=1,duck=1,goose="færøske"
	duck=1,duck=1,duck=2,goose="færøske"
	duck=1,duck=1,duck=3,goose="færøske"
	duck=1,duck=1,duck=4,goose="færøske"
	duck=1,duck=2,duck=1,goose="færøske"
	duck=1,duck=2,duck=2,goose="færøske"
	duck=1,duck=2,duck=3,goose="færøske"
	duck=1,duck=2,duck=4,goose="færøske"
	duck=1,duck=3,duck=1,goose="færøske"
	... a lot ...
	duck=4,duck=4,duck=4,goose="færøske"

	This is a very large number of index rows (i.e. an 'exploding index')!

	An alternate design would be to only generate unique permutations of elements
	where the index has repeated columns of a single property. This only makes sense
	because it's illegal to have an equality and an inequality on the same property,
	under the current constraints of appengine (though not completely ridiculous in
	general, if inequality constraints meant the same thing as equality constraints.
	However it would lead to a multi-dimensional query, which can be quite slow and
	is very difficult to scale without application knowledge). If we do this, it
	also means that we need to SORT the equality filter values when generating the
	prefix (so that the least-valued equality constraint is first). If we did this,
	then the generated index rows for the above entity would be:

	duck=1,duck=2,duck=3,goose="færøske"
	duck=1,duck=2,duck=4,goose="færøske"
	duck=1,duck=3,duck=4,goose="færøske"
	duck=2,duck=3,duck=4,goose="færøske"

	Which be a LOT more compact. It may be worth implementing this restriction
	later, simply for the memory savings when indexing multi-valued properties.

	If this technique is used, there's also room to unambiguously index entities
	with repeated equivalent values. E.g. if duck=1,1,2,3,4 , then you could see
	a row in the index like:

	duck=1,duck=1,duck=2,goose="færøske"

	Which would allow you to query for "an entity which has duck values equal to 1,
	1 and 2". Currently such a query is not possible to execute (it would be
	equivalent to "an entity which has duck values equal to 1 and 2").

	Query planning
	--------------

	Now that we have all of our data tabulated, let's plan some queries. The
	high-level algorithm works like this:

	* Generate a suffix format from the user's query which looks like:
	* orders (including the inequality as the first order, if any)
	* projected fields which aren't explicitly referenced in the orders (we
	assume ASCENDING order for them), in the order that they were projected.
	* `__key__` (implied ascending, unless the query's last sort order is for
	`__key__`, in which case it's whatever order the user specified)
	* Reverse the order of this suffix format, and serialize it into an
	IndexDefinition, along with the query's Kind and Ancestor values. This
	does what PrepForIdxTable did when we added the Index in the first place.
	* Use this serialized reversed index to find compound indexes which might
	match by looking up rows in the "idx" table which begin with this serialized
	reversed index.
	* Generate every builtin index for the inequality + equality filter
	properties, and see if they match too.

	An index is a potential match if its suffix exactly matches the suffix format,
	and it contains only sort orders which appear in the query (e.g. the index
	contains a column which doesn't appear as an equality or inequlity filter).

	The index search continues until:

	* We find at least one matching index; AND
	* The combination of all matching indexes accounts for every equality filter
	at least once.

	If we fail to find sufficient indexes to fulfill the query, we generate an index
	description that could be sufficient by concatenating all missing equality
	filters, in ascending order, followed by concatenating the suffix format that we
	generated for this query. We then suggest this new index to the user for them to
	add by returing an error containing the generated IndexDefinition. Note that
	this index is not REQUIRED for the user to add; they could choose to add bits
	and pieces of it, extend existing indexes in order to cover the missing columns,
	invert the direction of some of the equality columns, etc.

	Recall that equality filters are expressed as
	`map[propName][]serializedProperty`. We'll refer to this mapping as the
	'constraint' mapping below.

	To actually come up with the final index selection, we sort all the matching
	indexes from greatest number of columns to least. We add the 0th index (the one
	with the greatest number of columns) unconditionally. We then keep adding indexes
	which contain one or more of the remaining constraints, until we have no
	more constraints to satisfy.

	Adding an index entails determining which columns in that index correspond to
	equality columns, and which ones correspond to inequality/order/projection
	columns. Recall that the inequality/order/projection columns are all the same
	for all of the potential indices (i.e. they all share the same suffix format).
	We can use this to just iterate over the index's SortBy columns which we'll use
	for equality filters. For each equality column, we remove a corresponding value
	from the constraints map. In the event that we _run out_ of constraints for a
	given column, we simply _pick an arbitrary value_ from the original equality
	filter mapping and use that. This is valid to do because, after all, they're
	equality filters.

	Note that for compound indexes, the ancestor key counts as an equality filter,
	and if the compound index has `Ancestor: true`, then we implicitly put the
	ancestor as if it were the first SortBy column. For satisfying Ancestor queries
	with built-in indexes, see the next section.

	Once we've got our list of constraints for this index, we concatenate them all
	together to get the prefix for this index. When iterating over this index, we
	only ever want to see index rows whose prefix exactly matches this. Unlike the
	suffix formt, the prefix is per-index (remember that ALL indexes in the
	query must have the same suffix format).

	Finally, we set the 'start' and 'end' values for all chosen indexes to either
	the Start and End cursors, or the Greater-Than and Less-Than values for the
	inequality. The Cursors contain values for every suffix column, and the
	inequality only contains a value for the first suffix column. If both cursors
	and an inequality are specified, we take the smaller set of both (the
	combination which will return the fewest rows).

	That's about it for index selection! See Query Execution for how we actually use
	the selected indexes to run a query.

	### Ancestor queries and Built-in indexes

	You may have noticed that the built-in indexes can be used for Ancestor queries
	with equality filters, but they don't start with the magic Ancestor column!

	There's a trick that you can do if the suffix format for the query is just
	`__key__` though (e.g. the query only contains equality filters, and/or an
	inequality filter on `__key__`). You can serialize the datastore key that you're
	planning to use for the Ancestor query, then chop off the termintating null byte
	from the encoding, and then use this as additional prefix bytes for this index.
	So if the builtin for the "Val" property has the column format of:

	{Property: "Val"}, {Property: "__key__"}

	And your query holds Val as an equality filter, you can serialize the
	ancestor key (say `/Kind,1`), and add those bytes to the prefix. So if you had
	an index row:

	PTInt ++ 100 ++ PTKey ++ "Kind" ++ 1 ++ CONTINUE ++ "Child" ++ 2 ++ STOP

	(where CONTINUE is the byte 0x01, and STOP is 0x00), you can form a prefix for
	the query `Query("Kind").Ancestor(Key(Kind, 1)).Filter("Val =", 100)` as:

	PTInt ++ 100 ++ PTKey ++ "Kind" ++ 1

	Omitting the STOP which normally terminates the Key encoding. Using this prefix
	will only return index rows which are `/Kind,1` or its children.

	"That's cool! Why not use this trick for compound indexes?", I hear you ask :)
	Remember that this trick only works if the prefix before the `__key__` is
	entirely composed of equality filters. Also recall that if you ONLY use
	equality filters and Ancestor (and possibly an inequality on `__key__`), then
	you can always satisfy the query from the built-in indexes! While you
	technically could do it with a compound index, there's not really a point to
	doing so. To remain faithful to the production datastore implementation, we
	don't implement this trick for anything other than the built-in indexes.

	### Cursor format

	Cursors work by containing values for each of the columns in the suffix, in the
	order and Direction specified by the suffix. In fact, cursors are just encoded
	versions of the []IndexColumn used for the 'suffix format', followed by the
	raw bytes of the suffix for that particular row (incremented by 1 bit).

	This means that technically you can port cursors between any queries which share
	precisely the same suffix format, regardless of other query options, even if the
	index planner ends up choosing different indexes to use from the first query to
	the second. No state is maintained in the service implementation for cursors.

	I suspect that this is a more liberal version of cursors than how the production
	appengine implements them, but I haven't verified one way or the other.

	Query execution
	---------------

	Last but not least, we need to actually execute the query. After figuring out
	which indexes to use with what prefixes and start/end values, we essentially
	have a list of index subsets, all sorted the same way. To pull the values out,
	we start by iterating the first index in the list, grabbing its suffix value,
	and trying to iterate from that suffix in the second, third, fourth, etc index.

	If any index iterates past that suffix, we start back at the 0th index with that
	suffix, and continue to try to find a matching row. Doing this will end up
	skipping large portions of all of the indexes in the list. This is the algorithm
	known as "zigzag merge join", and you can find talks on it from some of the
	appengine folks. It has very good algorithmic running time and tends to scale
	with the number of full matches, rather than the size of all of the indexes
	involved.

	A hit occurs when all of the iterators have precisely the same suffix. This hit
	suffix is then decoded using the suffix format information. The very last column
	of the suffix will always be the datastore key. The suffix is then used to call
	back to the user, according to the query type:

	* keys-only queries just directly return the Key
	* projection queries return the projected fields from the decoded suffix.
	Remember how we added all the projections after the orders? This is why. The
	projected values are pulled directly from the index, instead of going to the
	main entity table.
	* normal queries pull the decoded Key from the "ents" table, and return that
	entity to the user.