Following #485, we began investigating post/comment rendering
bottlenecks. The most immediate issue is the eager comment loading
(merged in 23a8fb9663) did not seem fully operative: query logs
showed comments and associated FKs were being lazy loaded again
(linear query quantity in number of rendered comments). In fact,
CPU load seemed even worse than previous lazy loading.
Bisect revealed first bad commit: fb77cbcc2b
which fixed post view counters by committing the SQLAlchemy session
instead of flushing, following upstream's fix. However, committing
a session has the unfortunate side effect of dumping cached session
objects, such as the previously loaded comment objects and their
relationships, causing fallback to the old lazy behavior.
We fix this here by explicitly telling SQLAlchemy to not expire
the session on commit.
Hopefully this will simultaneously resolve the elevated DB CPU load
observed in production and speed up page rendering again.
Ported in from upstream with adjustments for TheMotte, most notably
universal default to 'new' and fixes to 'hot'. Lumped into this PR
because eager comment loading uses it.
* titles: use rdrama's title finding code
this fixes a potential DoS in some really weird pages (seems to be a bug with BS4)
we're not parsing arbitrary HTML
in addition we make some nice checks
* unescape title to fix bug from upstream
* fix nameerror
* Do not proxy requests, since no proxy available.
On the upstream, the `proxies` dict was intended to use a local SOCKS
proxy running on port 18080 with the express purpose of masking the
server IP address. TheMotte isn't running behind a reverse proxy, so
this purpose is moot. Additionally, we don't have a proxy running in
Docker nor do we appear to have one on prod, which breaks autotitle
and thumbnailing regardless--not sure it matters for TheMotte's
use case, but both codepaths have been inoperative because of it.
* use gevent to timeout the function to prevent a
second theoretical DoS by sending data rly slowly
ref: 816389cf28
Co-authored-by: TLSM <duolsm@outlook.com>
Removes the following awards / fields on User:
- flairlock
- progressivestack
- bird
- longpost (pizzashill)
- marseyawarded
- rehab
- deflector
- mute
- unmutable
- eye (All-Seeing Eye)
- alt (Alt-Seeing Eye)
Primarily motivated by starting to remove some un-Mottelike cruft
from core commenting/posting routes. Cleared out other inapplicable
awards while in the process.
Borrows code from the upstream which has been working in production
reliably for ~months. Also, most of it was literally copy-pasted,
and the casted ID values aren't used later in the route functions.
Due to use of Submission.{choices, options, bet_options} in realbody,
generating submission_listings resulted in extremely high volume of
SELECT queries.
In local testing with 6 posts, one of which had a poll with 2 options,
the removal of these calls reduced quantity of queries on the homepage
from 84 to 22.
Given that it was previously decided to remove the polls feature after
a regression while adding comment filtering, the remaining dead code
paths for polls were also removed.
In four contexts, Comment.replies(.) was not updated to reflect the
interface changes with comment filtering. This directly caused #170
and #172 (which was a stack trace from the former).
- Updating notifications for DMs (routes/users.py L690)
- Updating notifications for modmail (routes/users.py L729)
- morecomments for logged out users (routes/posts.py L421)
- JSON for API access (classes/comment.py L347)
All four contexts seem to behave correctly after the change. However,
strictly speaking the JSON generation will not include a user's own
filtered or removed comments, though this is hard to remedy without
passing the user object `v` to json_core. Propagating that through the
codebase seems a worse option than leaving it as is.
This change disables multimedia embedding:
- In comments and comments replies.
- In new submissions.
- In comment & submission preview
And it's all toggle-able via an envvar, except for the JS bits,
but I linked those to the github issue, so should be easy to find
in the future.
The way it works is:
- removes markdown image/video syntax,
eg. `` into ``
- changes link text into anchors, eg.
`https://example.org/someimage.jpg` into
`[https://example.org/someimage.jpg](https://example.org/someimage.jpg)`
- removes html img/video/audio tags, eg.
`<img href="https://example.org/someimage.jpg" />` into ``
- when embedding gifs via the giphy modal in "new submission", it will
insert only an anchor to the gif
- when attaching an image, it will upload the image, then add only an
anchor to the post/comment body
I tested this manually, but not sure if I got all the test cases. What I
checked was:
- create comment w/ image/video/audio media using markdown -> success
- create comment reply w/ image/video/audio media using markdown ->
success
- create comment w/ link to img/imgur/youtube/audio -> success
- create comment w/ attachment -> success
- create comment reply w/ attachment -> success
- create comment w/ img/video tag -> success
- create comment reply w/ image/video tag -> success
- create post submission w/ image/video/media using markdown -> success
- create post submission w/ link to img/imgur/youtube/audio -> success
- create post submission w/ attachment -> success
- create post submission w/ giphy gif -> success
Also, updated the formatting page.
Co-authored-by: Ben Rog-Wilhelm <zorba-github@pavlovian.net>