@gburd gburd commented Oct 24, 2025

No description provided.

@gburd gburd force-pushed the cf-5556 branch 7 times, most recently from e777a6e to 650f621 Compare November 1, 2025 17:22
@gburd gburd force-pushed the cf-5556 branch 6 times, most recently from 16e0007 to 331cd76 Compare November 7, 2025 20:56
@gburd gburd force-pushed the cf-5556 branch 4 times, most recently from 9558f42 to 05c4e60 Compare November 16, 2025 18:53
@gburd gburd force-pushed the cf-5556 branch 4 times, most recently from ae8af13 to 9f584af Compare November 19, 2025 18:18
gburd added 5 commits December 1, 2025 11:22
This commit refactors the interaction between heap_tuple_update(),
heap_update(), and simple_heap_update() to improve code organization
and flexibility. The changes are functionally equivalent to the
previous implementation and have no performance impact.

The primary motivation is to prepare for upcoming modifications to
how and where modified attributes are identified during the update
path, particularly for catalog updates.

As part of this reorganization, the handling of replica identity key
attributes has been adjusted. Instead of fetching a second copy of
the bitmap during an update operation, the caller is now required to
provide it. This change applies to both heap_update() and
heap_delete().

No user-visible changes.
Refactor executor update logic to determine which indexed columns have
actually changed during an UPDATE operation rather than leaving this up
to HeapDetermineColumnsInfo in heap_update. This enables the comparison
to happen without taking a lock on the page and opens the door to reuse
in other code paths.

Because heap_update now requires the caller to provide the modified
indexed columns, simple_heap_update has become a bit more complex.  It
is frequently called from CatalogTupleUpdate, which updates heap tuples
either via their form or using heap_modify_tuple.  In both cases the
caller knows the modified set of attributes, but that information is
lost before it reaches simple_heap_update.  As a result, the "simple"
path has to retain the old HeapDetermineColumnsInfo logic (for now).
For that to work it was necessary to split up the (overly large)
heap_update call itself, moving a bit of what existed in heap_update up
into simple_heap_update and heap_tuple_update.  Ideally this will be
cleaned up once all CatalogTupleUpdate paths record modified attributes
correctly; when that happens the "simple" path can be simplified again.

ExecCheckIndexedAttrsForChanges replaces HeapDetermineColumnsInfo and
tts_attr_equal replaces heap_attr_equal, changing the test for equality
when calling into heap_tuple_update (but not simple_heap_update).  In
the past we used datumIsEqual(), essentially a binary comparison using
memcmp(); now the comparison code in tts_attr_equal uses a type-specific
equality function when available and falls back to datumIsEqual() when
not.  This change in equality testing has some intended implications and
opens the door for more HOT updates (foreshadowing).  For instance, an
index with collation information allows more HOT updates when the
index is specified to be case insensitive.
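
The equality rule described above can be illustrated with a small
standalone sketch.  This is not the patch's tts_attr_equal code: the
names attr_eq_fn, binary_equal, ci_text_equal, and attr_equal are
hypothetical, values are modeled as byte strings rather than Datums, and
the case-insensitive comparison is ASCII-only.  It only shows the
"type-specific function when available, binary comparison otherwise"
shape.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a type-specific equality operator. */
typedef bool (*attr_eq_fn) (const char *a, size_t alen,
                            const char *b, size_t blen);

/* Binary equality: same length and same bytes (models datumIsEqual()). */
bool
binary_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}

/* Models equality under a case-insensitive collation (ASCII only). */
bool
ci_text_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    if (alen != blen)
        return false;
    for (size_t i = 0; i < alen; i++)
    {
        if (tolower((unsigned char) a[i]) != tolower((unsigned char) b[i]))
            return false;
    }
    return true;
}

/* Use the type-specific function when one exists, else fall back. */
bool
attr_equal(attr_eq_fn type_eq, const char *a, size_t alen,
           const char *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    if (type_eq != NULL)
        return type_eq(a, alen, b, blen);
    return binary_equal(a, alen, b, blen);
}
```

With ci_text_equal supplied, "Apple" and "APPLE" compare equal, so the
attribute would not count as modified; with no type-specific function
(NULL) the binary fallback reports them as different.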

This change forced some logic changes in execReplication on the update
paths, since it is now required to know the set of attributes that are
both changed and referenced by indexes.  Luckily, this is available
within calls to slot_modify_data(), where LogicalRepTupleData is
processed and has a set of updated attributes.  In this case, rather
than using ExecCheckIndexedAttrsForChanges, we can preserve what
slot_modify_data() identifies as the modified set and then intersect
that with the set of indexed attributes on the relation to get the
correct set of modified indexed attributes required by heap_update().
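
The intersection step can be sketched as follows.  This is an
illustration only: PostgreSQL represents attribute sets as Bitmapsets
(bms_intersect() is the real intersection routine), whereas this
standalone sketch assumes at most 64 attributes and uses a plain 64-bit
mask.  The names attr_set, attr_set_add, and modified_indexed_attrs are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit (n - 1) set means attribute number n is a member of the set. */
typedef uint64_t attr_set;

/* Return the set s with attribute number attnum (1-based) added. */
attr_set
attr_set_add(attr_set s, int attnum)
{
    assert(attnum >= 1 && attnum <= 64);
    return s | ((attr_set) 1 << (attnum - 1));
}

/*
 * Attributes that were both modified by the incoming change and are
 * referenced by at least one index on the relation.
 */
attr_set
modified_indexed_attrs(attr_set modified, attr_set indexed)
{
    return modified & indexed;
}
```

If the replicated change touches attributes 2 and 5 and the relation's
indexes reference attributes 1 and 2, only attribute 2 ends up in the
result.  An empty result means no indexed attribute changed.
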
In execIndexing on updates we'd like to pass a hint to the indexing code
when the indexed attributes are unchanged.  This commit replaces the now
redundant code in index_unchanged_by_update with the same information
found earlier in the update path.
Currently, PostgreSQL conservatively prevents HOT (Heap-Only Tuple)
updates whenever any indexed column changes, even if the indexed
portion of that column remains identical. This is overly restrictive
for expression indexes (where f(column) might not change even when
column changes) and partial indexes (where both old and new tuples
might fall outside the predicate).  Finally, index AMs play no role
in deciding when they need a new index entry on update; the rules
regarding that are based on binary equality and the heap's model for
MVCC and the related HOT optimization.  Here we open that door a bit so
as to enable more nuanced control over the process.  This enables
index AMs that require binary equality (as is the case for nbtree)
to do that without disallowing type-specific equality checking for
other indexes.

This patch introduces several improvements to enable HOT updates in
these cases:

Add amcomparedatums() callback to IndexAmRoutine. This allows index
access methods like GIN to provide custom logic for comparing datums by
extracting and comparing index keys rather than comparing the raw
datums. GIN indexes now implement gincomparedatums() which extracts keys
from both datums and compares the resulting key sets.  Also, as
mentioned earlier, nbtree implements this API and uses datumIsEqual()
for equality so that the manner in which it deduplicates TIDs on page
split doesn't have to change.  This is not a required API; when it is
not implemented, the executor compares TupleTableSlot datums for
equality using type-specific operators and takes collation into account,
so that an update from "Apple" to "APPLE" on a case insensitive index
can now be HOT.
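
The idea of comparing extracted key sets rather than raw datums can be
sketched in isolation.  This is not gincomparedatums(): the functions
cmp_int, extract_keys, and key_sets_equal are hypothetical, values are
modeled as small integer arrays, and "key extraction" is assumed to mean
sorting and removing duplicates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_VALS 32             /* arbitrary limit for this sketch */

/* qsort comparator for ints. */
int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Hypothetical key extraction: copy, sort, drop duplicates. */
size_t
extract_keys(const int *vals, size_t n, int *keys)
{
    size_t nkeys = 0;

    if (n == 0)
        return 0;
    memcpy(keys, vals, n * sizeof(int));
    qsort(keys, n, sizeof(int), cmp_int);
    for (size_t i = 0; i < n; i++)
    {
        if (nkeys == 0 || keys[nkeys - 1] != keys[i])
            keys[nkeys++] = keys[i];
    }
    return nkeys;
}

/* Two values are "equal" for the index if their key sets match. */
bool
key_sets_equal(const int *a, size_t na, const int *b, size_t nb)
{
    int    ka[MAX_VALS];
    int    kb[MAX_VALS];
    size_t nka;
    size_t nkb;

    assert(na <= MAX_VALS && nb <= MAX_VALS);
    nka = extract_keys(a, na, ka);
    nkb = extract_keys(b, nb, kb);
    return nka == nkb &&
        (nka == 0 || memcmp(ka, kb, nka * sizeof(int)) == 0);
}
```

Here {3, 1, 2, 3} and {1, 2, 3} differ as raw values but yield the same
key set, which is the kind of update an extraction-based index could
treat as leaving its entries unchanged.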

ExecWhichIndexesRequireUpdates() is rewritten to find the set of
modified indexed attributes that trigger new index tuples on update.
For partial indexes, this checks whether the old and new tuples both
satisfy or both fail the predicate. For expression indexes, this uses
type-specific equality operators to compare computed values. For
extraction-based indexes (GIN/RUM) that implement amcomparedatums(), it
uses that callback.

Importantly, table access methods can still signal using TU_Update if
all, none, or only summarizing indexes should be updated.  While the
executor layer now owns determining what has changed due to an update
and is interested in only updating the minimum number of indexes
possible, the table AM can override that while performing
table_tuple_update(), which is what heap does.  While this signal is
very specific to how the heap implements MVCC and its HOT optimization,
we'll leave replacing that for another day.

This optimization trades off some new overhead for the potential for
more updates to use the HOT optimized path and avoid index and heap
bloat.  This should significantly improve update performance for tables
with expression indexes, partial indexes, and GIN/GiST indexes on
complex data types like JSONB and tsvector, while maintaining correct
index semantics.  Minimal additional overhead due to type-specific
equality checking should be washed out by the benefits of updating
indexes fewer times.

One notable trade-off is that there are more calls to FormIndexDatum()
as a result.  Caching these might reduce some of that overhead, but not
all.  This led to a change in how often expressions in the spec update
test output notice messages, but it does not impact correctness.