
* prepare codebase to create scheduled tasks
there is some prep work involved with this. the scheduler would be happy
if this work was done. simply, we extract out the `created_utc`
interface from *everything* that uses it such that we don't have to
repeat ourselves a bunch. all fun stuff.
next commit is the meat of it.
* cron: basic backend work for scheduler
* avoid import loop
* attempt 2 at fixing import loops
* parenthesize because operator precedence
* delete file that came back for some reason.
* does NOPing the oauth apps work?
* import late and undo clients.py change
* stringify column names.
* reorder imports.
* remove task reference
* fix missing mapper object
* make coupled to repeatabletask i guess
* sanitize: fix sanitize imports
* import shadowing crap
* re-shadow shadowed variable
* fix regexes
* use the correct not operator
* readd missing commit
* scheduler: SQLA only allows concrete relations
* implement submission scheduler
* fix import loop with db_session
* get rid of import loop in submission.py and comment.py
* remove import loops by deferring import until function call
* i give up.
* awful.
* ...
* fix another app import loop
* fix missing import in route handler
* fix import error in wrappers.py
* fix wrapper error
* call update wrapper in the admin_level_required case
* :marseyshrug:
* fix issue with wrapper
* some cleanup and some fixes
* some more cleanup
let's avoid polluting scopes where we can.
* ...
* add SCHEDULED_POSTS permission.
* move const.py into config like the other files.
* style fixes.
* lock table for concurrency improvements
* don't attempt to commit on errors
* Refactor code, create `TaskRunContext`, create python callable task type.
* use import contextlib
* testing stuff i guess.
* handle repeatable tasks properly.
* Attempt another fix at fighting the mapper
* do it right ig
* SQLA1.4 doesn't support nested polymorphism ig
* fix erroneous class import
* fix mapper errors
* import app in wrappers.py
* fix import failures and stuff like that.
* embed and import fixes
* minor formatting changes.
* Add running state enum and don't attempt to check for currently running tasks.
* isort
* documentation, style, and commit after each task.
* Add completion time and more docs, rename, etc
* document `CRON_SLEEP_SECONDS` better.
* add note about making LiteralString
* filter out tasks that have been run in the future
* reference RepeatableTask's `__tablename__` directly
* use a master/slave configuration for tasks
the master periodically checks to see if the slave is alive, healthy,
and not taking too many resources, and if applicable kills its
child and restarts it.
only one relation is supported at the moment.
* don't duplicate process unnecessarily
* note impl detail, add comments
* fix imports.
* getting imports to stop being stupid.
* environment notes.
* syntax derp
* *sigh*
* stupid environment stuff
* add UI for submitting a scheduled post
* stupid things i need to fix the user class
* ...
* fix template
* add formkey
* pass v
* add hour and minute field
* bleh
* remove concrete
* the sqlalchemy docs are wrong
* fix me being dumb and not understanding error messages
* missing author attribute for display
* author_name property
* it's a property
* with_polymorphic i think fixes this
* dsfavgnhmjk
* *sigh*
* okay try this again
* try getting rid of the comment section
* include -> extends
* put the div outside of the thing.
* fix user page listings :/
* mhm
* i hate this why isn't this working
* this should fix it
* Fix posts being set as disabled by default
* form UI improvements
* label
* <textarea>s should have their closing tag
* UI fixes.
* and fix erroneous spinner thing.
* don't abort(415) when browsers send 0 length files for some reason
* UI improvements
* line break.
* CSS :S
* better explainer
* don't show moderation buttons for scheduled posts
* ...
* meh
* add edit form
* include forms on default page.
* fix hour minute selection.
* improve ui i guess and add api
* Show previous postings on scheduled task page
* create task id
* sqla
* posts -> submissions
* fix OTM relationship
* edit URL
* use common formkey control
* Idk why this isn't working
* Revert "Idk why this isn't working"
This reverts commit 3b93f741df.
* does removing viewonly fix it?
* don't import routes on db migrations
* apparently this has to be a string
* UI improvements redux
* margins and stuff
* add cron to supervisord
* remove stupid duplication
* typo fix
* postgres syntax error
* better lock and error handling
* add relationship between task and runs
* fix some ui stuff
* fix incorrect timestamp comparison
* ...
* Fix logic errors blocking scheduled posts
Two bugs here:
- RepeatableTask.run_time_last <= now: run_time_last is NULL by
default. NULL is not greater than, less than, or equal to any
value. We use NULL to signify a never-run task; check for that
condition when building the task list.
- `6 <= weekday <= 0`: there is no integer that is both gte 6 and
lte 0. This was always false.
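
Both bugs above are easy to reproduce in isolation. The snippet below is only a sketch: `is_due` is a hypothetical helper (not a function from this codebase), and Python's `None` stands in for SQL NULL.

```python
from datetime import datetime, timezone
from typing import Optional


def is_due(run_time_last: Optional[datetime], now: datetime) -> bool:
    """Hypothetical helper: a never-run task (None, i.e. NULL) counts as due."""
    # In SQL, `NULL <= now` evaluates to NULL and the row is filtered out,
    # so the never-run case has to be checked for explicitly.
    return run_time_last is None or run_time_last <= now


now = datetime(2023, 1, 1, tzinfo=timezone.utc)
never_run = is_due(None, now)

# The reversed chained comparison matches no weekday at all...
broken_matches = [wd for wd in range(7) if 6 <= wd <= 0]
# ...while the intended bounds check matches every date.weekday() value.
valid_matches = [wd for wd in range(7) if 0 <= wd <= 6]
```

A chained comparison `a <= x <= b` means `a <= x and x <= b`, so with the bounds swapped it can never be true.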
* passthrough worker process STDOUT and STDERR
* Add scheduler to admin panel
* scheduler
* fix listing and admin home
* date formatting fixes
* fix ages
* task user interface
* fix some more import crap i have to deal with
* fix typing
* avoid import loop
* UI fixes
* fix incorrect type
* task type
* Scheduled task UI improvements (add runs and stuff)
* make the width a lil bit smaller
* task runs.
* fix submit page
* add alembic migration
* log on startup
* Fix showing edit button
* Fix logic for `can_edit` (accidentally did `author_id` instead of `id`)
* Broad review pass
Review:
- Call `invalidate_cache` with `is_html=` explicitly for clarity,
rather than a bare boolean in the call args.
- Remove `marseys_const*` and associated stateful const system:
the implementation was good if we needed them, but TheMotte
doesn't use emoji, and a greenfield emoji system would likely
not keep those darned lists floating in thread-local scope.
Also they were only needed for goldens and random emoji, which
are fairly non-central features.
- Get `os.environ` fully out of the templates by using the new
constants we already have in files.helpers.config.environment.
- Given files.routes.posts cleanup, get rid of shop discount dict.
It's already a mapping of badge IDs to discounts for badges that
likely won't continue to exist (if they even do at present).
- RepeatableTaskRun.exception: use `@property.setter` instead of
overriding `__setattr__`.
Fix:
- Welcome message literal contained an indented Markdown code block.
- Condition to show "View source" button changed to show source to
logged out. This may well be a desirable change, but it's not
clearly intended here.
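
For the `RepeatableTaskRun.exception` review note above, the difference between the two approaches can be sketched with a plain Python class. `RunRecord` is a hypothetical stand-in with no SQLAlchemy involved; it only illustrates why a property setter is the narrower tool.

```python
from typing import Optional


class RunRecord:
    """Hypothetical plain-Python stand-in for a task-run row."""

    def __init__(self) -> None:
        self._exception: Optional[Exception] = None
        self.traceback_str: Optional[str] = None

    @property
    def exception(self) -> Optional[Exception]:
        return self._exception

    @exception.setter
    def exception(self, value: Optional[Exception]) -> None:
        # Only assignments to `exception` run this code. Every other
        # attribute is set normally, unlike with a blanket __setattr__
        # override that intercepts all assignments on the object.
        self._exception = value
        self.traceback_str = str(value) if value else None


record = RunRecord()
record.exception = ValueError("boom")
```

Assigning `None` back to `record.exception` clears `traceback_str` as well, so the two fields cannot drift apart.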
* Fix couple of routing issues
* fix 400 with post body editing
* Add error handler for HTTP 415
* fix router giving wrong arg name to handler
* Use supervisord to monitor memory rather than DIY
Also means we're using pip for getting supervisord now, so we don't rely
on the Debian image base for any packages.
* fix task run elapsed time display
* formatting and removing redundant code
* Fix missing ModAction import
* dates and times fixes
* Having to modify imports here anyway, might as
well change it.
* correct documentation.
* don't use urlunparse
* validators: import sanitize instead of from syntax
* cron: prevent races on task running
RepeatableTask.run_state_enum acts as the mutex on repeatable tasks.
Previously, the list of tasks to run was acquired before individually
locking each task. However, between those points there was a period
where the table was unlocked and the tasks were still in state WAITING.
This could potentially have led to two 'cron' processes each running the
same task simultaneously. Instead, we check for runnability both when
building the preliminary list and when mutexing the task via run state
in the database.
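
One way to picture the "re-check while taking the mutex" step is a conditional UPDATE that only succeeds if the task is still WAITING. This is a sketch under assumptions, not the project's implementation: it uses the standard-library `sqlite3` module instead of SQLAlchemy on Postgres, the table is cut down to two columns, and `try_claim` is a hypothetical name.

```python
import sqlite3

WAITING, RUNNING = 1, 2  # same integer values as ScheduledTaskState


def try_claim(conn: sqlite3.Connection, task_id: int) -> bool:
    """Move a task from WAITING to RUNNING; return True if this caller won."""
    cur = conn.execute(
        "UPDATE tasks_repeatable SET run_state = ? "
        "WHERE id = ? AND run_state = ?",
        (RUNNING, task_id, WAITING),
    )
    conn.commit()
    # rowcount is 1 only if the row was still WAITING when the UPDATE ran.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks_repeatable "
    "(id INTEGER PRIMARY KEY, run_state INTEGER NOT NULL)"
)
conn.execute("INSERT INTO tasks_repeatable (id, run_state) VALUES (1, ?)", (WAITING,))
conn.commit()

first_claim = try_claim(conn, 1)   # first worker flips WAITING -> RUNNING
second_claim = try_claim(conn, 1)  # second worker finds RUNNING and backs off
```

The preliminary list can still be built with an ordinary query; the claim step is what decides which process actually runs the task.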
Also:
- g.db and the cron db object are both instances of `Session`, not
`scoped_session` because they are obtained from
`scoped_session.__call__`, which acts as a `Session` factory.
Propagate this to the type hints.
- Sort order of task run submissions so /tasks/scheduled_posts/<id>
"Previous Task Runs" listings are useful.
* Notify followers on post publication
This was old behavior lost in the refactoring of the submit endpoint.
Also fix an AttributeError in `Follow.__repr__` which carried over
from all the repr copypasta.
* Fix image attachment
Any check for `file.content_length` relies on browsers sending
Content-Length headers with the request. It seems that few actually do.
The pre-refactor approach was to check for truthiness, which excludes
both None and the strange empty strings that we seem to get in absence
of a file upload. We return to doing so.
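
A rough illustration of the truthiness check described above. `FakeUpload` and `has_attachment` are hypothetical names used only for this sketch; `FakeUpload` is not a Flask or Werkzeug class and only mimics an upload whose `content_length` was never populated.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class FakeUpload:
    """Hypothetical stand-in for an uploaded-file object."""
    filename: str
    content_length: Optional[int] = None  # frequently absent, per the note above


def has_attachment(file: Union[FakeUpload, str, None]) -> bool:
    """Treat None and empty strings as 'no file'; ignore content_length."""
    return bool(file)


with_file = has_attachment(FakeUpload("cat.png"))
without_file = has_attachment(None)
empty_string = has_attachment("")
```

The point is only that the check never consults `content_length`, so a missing Content-Length header no longer hides a real upload.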
---------
Co-authored-by: TLSM <duolsm@outlook.com>
398 lines
11 KiB
Python
from __future__ import annotations

import contextlib
import dataclasses
from datetime import date, datetime, timedelta, timezone
from enum import IntEnum, IntFlag
from typing import TYPE_CHECKING, Final, Optional, Union

import flask
import flask_caching
import flask_mail
import redis
from sqlalchemy.orm import relationship, Session
from sqlalchemy.schema import Column, ForeignKey
from sqlalchemy.sql.sqltypes import (Boolean, DateTime, Integer, SmallInteger,
                                     Text, Time)

from files.classes.base import CreatedBase
from files.helpers.time import format_age, format_datetime

if TYPE_CHECKING:
    from files.classes.user import User

class ScheduledTaskType(IntEnum):
    PYTHON_CALLABLE = 1
    SCHEDULED_SUBMISSION = 2

    def __str__(self):
        if not self.name: return super().__str__()
        return self.name.replace('_', ' ').title()


class ScheduledTaskState(IntEnum):
    WAITING = 1
    '''
    A task waiting to be triggered
    '''
    RUNNING = 2
    '''
    A task that is currently running
    '''


class DayOfWeek(IntFlag):
    SUNDAY = 1 << 1
    MONDAY = 1 << 2
    TUESDAY = 1 << 3
    WEDNESDAY = 1 << 4
    THURSDAY = 1 << 5
    FRIDAY = 1 << 6
    SATURDAY = 1 << 7

    WEEKDAYS = MONDAY | TUESDAY | WEDNESDAY | THURSDAY | FRIDAY
    WEEKENDS = SATURDAY | SUNDAY

    NONE = 0 << 0
    ALL = WEEKDAYS | WEEKENDS

    # NOTE: stacking @classmethod on top of @property is deprecated since
    # Python 3.11 and removed in Python 3.13.
    @classmethod
    @property
    def all_days(cls) -> list["DayOfWeek"]:
        return [
            cls.SUNDAY, cls.MONDAY, cls.TUESDAY, cls.WEDNESDAY,
            cls.THURSDAY, cls.FRIDAY, cls.SATURDAY
        ]

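    @property
    def empty(self) -> bool:
        # `self not in self.ALL` is False for NONE, because a zero flag is
        # contained in every flag value. Test for the absence of any day
        # bit instead so that NONE is reported as empty.
        return not (self & DayOfWeek.ALL)
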
    def __contains__(self, other:Union[date, "DayOfWeek"]) -> bool:
        _days:dict[int, "DayOfWeek"] = {
            0: self.MONDAY,
            1: self.TUESDAY,
            2: self.WEDNESDAY,
            3: self.THURSDAY,
            4: self.FRIDAY,
            5: self.SATURDAY,
            6: self.SUNDAY
        }
        if not isinstance(other, date):
            return super().__contains__(other)
        weekday:int = other.weekday()
        if not 0 <= weekday <= 6:
            raise Exception(
                f"Unexpected weekday value (got {weekday}, expected 0-6)")
        return _days[weekday] in self


_UserConvertible = Union["User", str, int]

@dataclasses.dataclass(frozen=True, kw_only=True, slots=True)
class TaskRunContext:
    '''
    A full task run context, with references to all app globals embedded.
    This is the entirety of the application's global state at this point.

    This is explicit state. This is useful so scheduled tasks do not have
    to import from `files.__main__` and so they can use all of the features
    of the application without being in a request context.
    '''
    app:flask.app.Flask
    '''
    The application. Many of the app functions use the app context globals and
    do not have their state explicitly passed. This is a convenience get out of
    jail free card so that most features (excepting those that require a
    `request` context) can be used.
    '''
    cache:flask_caching.Cache
    '''
    A cache extension. This is useful for situations where a scheduled task
    might want to interact with the cache in some way (for example invalidating
    or adding something to the cache).
    '''
    db:Session
    '''
    A database session. Useful for when a task needs to modify something in the
    database (for example creating a submission).
    '''
    mail:flask_mail.Mail
    '''
    The mail extension. Needed for sending emails.
    '''
    redis:redis.Redis
    '''
    A direct reference to our redis connection. Normally most operations that
    involve the redis datastore use flask_caching's Cache object (accessed via
    the `cache` property), however this is provided as a convenience for more
    granular redis operations.
    '''
    task:RepeatableTask
    '''
    A reference to the task that is being run.
    '''
    task_run:RepeatableTaskRun
    '''
    A reference to this current run of the task.
    '''
    trigger_time:datetime
    '''
    The date and time (UTC) that this task was triggered
    '''

    @property
    def run_time(self) -> datetime:
        '''
        The date and time (UTC) that this task was actually run
        '''
        return self.task_run.created_datetime_py

    @contextlib.contextmanager
    def app_context(self, *, v:Optional[_UserConvertible]=None):
        '''
        Context manager that uses `self.app` to generate an app context and set
        up the application with expected globals. This assigns `g.db`, `g.v`,
        and `g.debug`.

        This is intended for use with legacy code that does not pass state
        explicitly and instead relies on the use of `g` for state passing. If
        at all possible, state should be passed explicitly to functions that
        require it.

        Usage is simple:
        ```py
        with ctx.app_context() as app_ctx:
            # code that requires g
        ```

        Any code that uses `g` can be run here. As this is intended for
        scenarios that may be outside of a request context, code that uses the
        request context may still raise `RuntimeError`s.

        An example:

        ```py
        from flask import g, request # import works ok

        def legacy_function():
            u:Optional[User] = g.db.get(User, 1784) # works ok! :)
            u.admin_level = \\
                request.values.get("admin_level", default=9001, type=int)
                # raises a RuntimeError :(
            g.db.commit()
        ```

        This is because there is no actual request being made. Creating a
        mock request context is doable but outside of the scope of this
        function as this is often not needed outside of route handlers (where
        this function is out of scope anyway).

        :param v: A `User`, an `int`, a `str`, or `None`. `g.v` will be set
        using the following rules:

        1. If `v` is an `int`, `files.helpers.get.get_account` is called and
        the result of that is stored in `g.v`.

        2. If `v` is a `str`, `files.helpers.get.get_user` is called and the
        result of that is stored in `g.v`.

        3. If `v` is a `User`, it is stored in `g.v`.

        It is expected that callers will provide a valid user ID or username.
        If an invalid one is provided, *no* exception will be raised and `g.v`
        will be set to `None`.

        This is mainly provided as an optional feature so that tasks can be
        somewhat "sudo"ed as a particular user. Note that `g.v` is always
        assigned (even if to `None`) in order to prevent code that depends on
        its existence from raising an exception.
        '''
        with self.app.app_context() as app_ctx:
            app_ctx.g.db = self.db

            from files.helpers.get import get_account, get_user

            if isinstance(v, str):
                v = get_user(v, graceful=True)
            elif isinstance(v, int):
                v = get_account(v, graceful=True, db=self.db)

            app_ctx.g.v = v
            app_ctx.g.debug = self.app.debug
            yield app_ctx

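    @contextlib.contextmanager
    def db_transaction(self):
        '''
        Context manager that commits `self.db` if the wrapped block finishes
        without raising, and rolls the session back otherwise. Note that the
        exception is swallowed after the rollback and is not re-raised.
        '''
        try:
            yield
            self.db.commit()
        except:
            self.db.rollback()

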
_TABLE_NAME: Final[str] = "tasks_repeatable"


class RepeatableTask(CreatedBase):
    __tablename__ = _TABLE_NAME

    id = Column(Integer, primary_key=True, nullable=False)
    author_id = Column(Integer, ForeignKey("users.id"), nullable=False)
    type_id = Column(SmallInteger, nullable=False)
    enabled = Column(Boolean, default=True, nullable=False)
    run_state = Column(SmallInteger, default=int(ScheduledTaskState.WAITING), nullable=False)
    run_time_last = Column(DateTime, default=None)

    frequency_day = Column(SmallInteger, nullable=False)
    time_of_day_utc = Column(Time, nullable=False)

    runs = relationship("RepeatableTaskRun", back_populates="task")

    @property
    def type(self) -> ScheduledTaskType:
        return ScheduledTaskType(self.type_id)

    @type.setter
    def type(self, value:ScheduledTaskType):
        self.type_id = value

    @property
    def frequency_day_flags(self) -> DayOfWeek:
        return DayOfWeek(self.frequency_day)

    @frequency_day_flags.setter
    def frequency_day_flags(self, value:DayOfWeek):
        self.frequency_day = int(value)

    @property
    def run_state_enum(self) -> ScheduledTaskState:
        return ScheduledTaskState(self.run_state)

    @run_state_enum.setter
    def run_state_enum(self, value:ScheduledTaskState):
        self.run_state = int(value)

    @property
    def run_time_last_or_created_utc(self) -> datetime:
        return self.run_time_last or self.created_datetime_py

    @property
    def run_time_last_str(self) -> str:
        if not self.run_time_last: return 'Never'
        return (f'{format_datetime(self.run_time_last)} '
            f'({format_age(self.run_time_last)})')

    @property
    def trigger_time(self) -> datetime | None:
        return self.next_trigger(self.run_time_last_or_created_utc)

    def can_run(self, now: datetime) -> bool:
        return not (
            self.trigger_time is None
            or now < self.trigger_time
            or self.run_state_enum != ScheduledTaskState.WAITING)

    def next_trigger(self, anchor: datetime) -> datetime | None:
        if not self.enabled: return None
        if self.frequency_day_flags.empty: return None

        day:timedelta = timedelta(1.0)
        target_date:datetime = anchor - day # incremented at start of for loop

        for i in range(8):
            target_date = target_date + day
            if i == 0 and target_date.time() > self.time_of_day_utc: continue
            if target_date in self.frequency_day_flags: break
        else:
            raise Exception("Could not find suitable timestamp to run next task")

        return datetime.combine(target_date, self.time_of_day_utc, tzinfo=timezone.utc) # type: ignore

    def run(self, db: Session, trigger_time: datetime) -> RepeatableTaskRun:
        run:RepeatableTaskRun = RepeatableTaskRun(task_id=self.id)
        try:
            from files.__main__ import app, cache, mail, r # i know
            ctx: TaskRunContext = TaskRunContext(
                app=app,
                cache=cache,
                db=db,
                mail=mail,
                redis=r,
                task=self,
                task_run=run,
                trigger_time=trigger_time,
            )
            self.run_task(ctx)
        except Exception as e:
            run.exception = e
        run.completed_utc = datetime.now(tz=timezone.utc)
        db.add(run)
        return run

    def run_task(self, ctx:TaskRunContext):
        raise NotImplementedError()

    def contains_day_str(self, day_str:str) -> bool:
        return (bool(day_str)
            and DayOfWeek[day_str.upper()] in self.frequency_day_flags)

    def __repr__(self) -> str:
        return f'<{self.__class__.__name__}(id={self.id}, created_utc={self.created_date}, author_id={self.author_id})>'

    __mapper_args__ = {
        "polymorphic_on": type_id,
    }


class RepeatableTaskRun(CreatedBase):
    __tablename__ = "tasks_repeatable_runs"

    id = Column(Integer, primary_key=True)
    task_id = Column(Integer, ForeignKey(RepeatableTask.id), nullable=False)
    manual = Column(Boolean, default=False, nullable=False)
    traceback_str = Column(Text, nullable=True)

    completed_utc = Column(DateTime)

    task = relationship(RepeatableTask, back_populates="runs")

    _exception: Optional[Exception] = None # not part of the db model

    @property
    def completed_datetime_py(self) -> datetime | None:
        if self.completed_utc is None:
            return None
        return datetime.combine(
            self.completed_utc.date(),
            self.completed_utc.time(),
            timezone.utc)

    @property
    def completed_datetime_str(self) -> str:
        return format_datetime(self.completed_utc)

    @property
    def status_text(self) -> str:
        if not self.completed_utc: return "Running"
        return "Failed" if self.traceback_str else "Completed"

    @property
    def time_elapsed(self) -> Optional[timedelta]:
        if self.completed_datetime_py is None: return None
        return self.completed_datetime_py - self.created_datetime_py

    @property
    def time_elapsed_str(self) -> str:
        elapsed:Optional[timedelta] = self.time_elapsed
        if not elapsed: return ''
        return str(elapsed)

    @property
    def exception(self) -> Optional[Exception]:
        return self._exception

    @exception.setter
    def exception(self, value: Optional[Exception]) -> None:
        self._exception = value
        self.traceback_str = str(value) if value else None