Against a clean seeded DB, reduces `GET /post/1/` from 63 queries to
26 by removing redundancies and slow lazy-loaded queries during
top comment pagination.
Also applies eager loading to /viewmore/ with the expected reduction
from 5*(N comments) queries to ~12/request.
For testing locally, use a newly seeded DB to ensure
Comment.descendant_count is populated.
Ref: #485
Prior to December 2021, the comments schema stored only
parent_comment_id and did not also store top_comment_id. Comment
pagination is now based on top_comment_id. However, upstream never
migrated their old comments to populate top_comment_id (tc_id), and
thus retained two copies of the pagination logic, each using different
limits to try to emulate similar behavior.
TheMotte has no posts created prior to December 2021 (so these
branches never activated), and all of its comments have tc_id set.
The dual-limit pagination approach was already removed (there is
only one limit for paginating comments). This completes the removal of
this logic, since these are purely dead codepaths which have previously
confused contributors.
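To illustrate why storing top_comment_id makes a single pagination
path sufficient, here is a rough sketch in plain Python. The `Comment`
dataclass and `paginate_threads` are hypothetical, and the sketch
assumes a top-level comment stores its own id as top_comment_id; the
real query logic may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Comment:
    # Hypothetical stand-in for the real model.
    id: int
    parent_comment_id: Optional[int]
    top_comment_id: int  # assumed: equals own id for top-level comments


def paginate_threads(comments: List[Comment], limit: int, offset: int = 0) -> List[List[Comment]]:
    """Return one page of threads, each thread grouped by top_comment_id."""
    # Group every comment under its thread root in a single pass,
    # without walking parent_comment_id chains.
    by_top: Dict[int, List[Comment]] = defaultdict(list)
    for c in comments:
        by_top[c.top_comment_id].append(c)

    # A single limit applies to the top-level comments only.
    top_ids = sorted(c.id for c in comments if c.parent_comment_id is None)
    return [by_top[t] for t in top_ids[offset:offset + limit]]


comments = [
    Comment(1, None, 1), Comment(2, 1, 1), Comment(3, 2, 1),
    Comment(4, None, 4), Comment(5, 4, 4),
    Comment(6, None, 6),
]
page = paginate_threads(comments, limit=2)
page_ids = [[c.id for c in thread] for thread in page]
```

With a parent-only schema, finding the thread of comment 3 would
require following 3 -> 2 -> 1; with top_comment_id it is one lookup.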
This is likely not an issue for production (since each request will
get its own SQLAlchemy session), but `scoped_session` results in the
tests reusing the same Session across tests. The tests rely on
the default session expiry behavior.
Following #485, we began investigating post/comment rendering
bottlenecks. The most immediate issue is that the eager comment
loading (merged in 23a8fb9663) did not seem fully operative: query
logs showed comments and associated FKs were being lazy loaded again
(query count linear in the number of rendered comments). In fact,
CPU load seemed even worse than with the previous lazy loading.
Bisecting revealed the first bad commit, fb77cbcc2b,
which fixed post view counters by committing the SQLAlchemy session
instead of flushing, following upstream's fix. However, committing
a session has the unfortunate side effect of dumping cached session
objects, such as the previously loaded comment objects and their
relationships, causing fallback to the old lazy behavior.
We fix this here by explicitly telling SQLAlchemy to not expire
the session on commit.
Hopefully this will simultaneously resolve the elevated DB CPU load
observed in production and speed up page rendering again.
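A minimal sketch of the expiry behavior described above, assuming
SQLAlchemy 1.4 or newer and an in-memory SQLite database. The `Comment`
model and `expired_after_commit` helper are hypothetical stand-ins,
not TheMotte's actual code.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Comment(Base):
    # Hypothetical model used only for this illustration.
    __tablename__ = "comments"
    id = Column(Integer, primary_key=True)
    body = Column(String)


def expired_after_commit(expire_on_commit: bool) -> bool:
    """Commit one object and report whether the session expired it."""
    engine = create_engine("sqlite://")  # fresh in-memory database
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine, expire_on_commit=expire_on_commit)

    session = Session()
    comment = Comment(body="hello")
    session.add(comment)
    session.commit()

    # An expired object is reloaded with a new SELECT on next access,
    # which is what undoes previously eager-loaded state.
    is_expired = inspect(comment).expired
    session.close()
    return is_expired


default_behavior = expired_after_commit(True)
with_fix = expired_after_commit(False)
```

With the default (`expire_on_commit=True`), the committed object is
marked expired; with `expire_on_commit=False`, it keeps its loaded
state across the commit.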
Ported in from upstream with adjustments for TheMotte, most notably
universal default to 'new' and fixes to 'hot'. Lumped into this PR
because eager comment loading uses it.
Ported in logic from upstream to use SQLAlchemy eager loading instead
of repeated queries when building a submission_listing. Adjusted
loaded relationships to include only those used on TheMotte.
Using test data from seed_db, before and after:
GET / (before)
|----------|--------|--------|--------|--------|--------|------------|
| Database | SELECT | INSERT | UPDATE | DELETE | Totals | Duplicates |
|----------|--------|--------|--------|--------|--------|------------|
| default  |   83   |   0    |   0    |   0    |   83   |     72     |
|----------|--------|--------|--------|--------|--------|------------|
Total queries: 83 in 0.031s
GET / (after)
|----------|--------|--------|--------|--------|--------|------------|
| Database | SELECT | INSERT | UPDATE | DELETE | Totals | Duplicates |
|----------|--------|--------|--------|--------|--------|------------|
| default  |   14   |   0    |   0    |   0    |   14   |     0      |
|----------|--------|--------|--------|--------|--------|------------|
Total queries: 14 in 0.00718s
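The drop in duplicate queries comes from loading related rows in
batches rather than once per listed item. The sketch below shows that
idea with the standard-library `sqlite3` module instead of SQLAlchemy;
the schema, `CountingConnection`, and both listing functions are
hypothetical and only illustrate the query-count difference.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts the statements sent through execute()."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params)


def make_db() -> CountingConnection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO posts VALUES (1, 1, 'first'), (2, 2, 'second'), (3, 1, 'third');
    """)
    return CountingConnection(conn)


def listing_lazy(db: CountingConnection):
    """One query for the posts, then one more per post for its author."""
    posts = db.execute("SELECT id, author_id, title FROM posts ORDER BY id").fetchall()
    rows = []
    for _id, author_id, title in posts:
        name = db.execute("SELECT name FROM users WHERE id = ?", (author_id,)).fetchone()[0]
        rows.append((title, name))
    return rows


def listing_eager(db: CountingConnection):
    """One query for the posts, one batched query for all their authors."""
    posts = db.execute("SELECT id, author_id, title FROM posts ORDER BY id").fetchall()
    author_ids = sorted({author_id for _id, author_id, _title in posts})
    placeholders = ",".join("?" * len(author_ids))
    authors = dict(
        db.execute(f"SELECT id, name FROM users WHERE id IN ({placeholders})", author_ids).fetchall()
    )
    return [(title, authors[author_id]) for _id, author_id, title in posts]


lazy_db, eager_db = make_db(), make_db()
lazy_rows, eager_rows = listing_lazy(lazy_db), listing_eager(eager_db)
```

Both functions return the same rows; the lazy version issues one query
per post on top of the initial one, while the batched version stays at
two queries regardless of how many posts are listed.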
Generally standardizes the get_* helpers:
- Adds type hinting.
- Deduplicates block property addition.
- Respects `graceful` in more contexts.
- More resilient to invalid user input, with less boilerplate necessary
  at call-sites.
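As a rough illustration of the shape such a helper might take, here is
a sketch with type hints and a `graceful` flag. `get_post`, the `POSTS`
dict, and the use of `LookupError` are assumptions made for this
example; the real helpers query the database and may signal failure
differently.

```python
from typing import Dict, Optional, Union

# Hypothetical in-memory stand-in for the posts table.
POSTS: Dict[int, str] = {1: "First post", 2: "Second post"}


def get_post(pid: Union[int, str, None], graceful: bool = False) -> Optional[str]:
    """Look up a post by id, tolerating malformed ids.

    With graceful=True a missing or invalid id yields None;
    otherwise a LookupError is raised.
    """
    try:
        key: Optional[int] = int(pid)  # accepts "1" as well as 1
    except (TypeError, ValueError):
        key = None  # invalid input is treated like a missing post

    post = POSTS.get(key) if key is not None else None
    if post is None and not graceful:
        raise LookupError(f"no post with id {pid!r}")
    return post
```

Because the helper validates its own input, a call-site can pass a raw
request parameter, e.g. `get_post("abc", graceful=True)`, without
wrapping the call in its own try/except.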