backend: add automatic dead instance detection #204

Merged

toast merged 3 commits from dead-instance into main

2022-10-16 13:45:29 +00:00

Author	SHA1	Message	Date
Chloe Kudryavtsev	d762143b89	backend: fixup missing deadTime and incorrect import	2022-10-16 09:32:01 -04:00
Chloe Kudryavtsev	21c1e5c06c	backend: simplify suspended and dead queries. This should also have better latency due to being a single query. Furthermore, it's no longer a linear scan, since host is indexed. Would be cool to simplify it further to a single query for blocks also... Why exactly are blocks not in the db?	2022-10-16 09:22:05 -04:00
Chloe Kudryavtsev	91a4f38871	backend: add automatic dead instance detection. It works by keeping a day-long cache of "when did we last successfully communicate with this instance?" Any instance whose last successful communication is older than a specified threshold (1 month) is treated as though it were suspended: all outgoing jobs are dropped on processing. The day-long cache is in place because the ordering is necessarily a linear scan. Once an instance comes back online, we detect that as soon as we receive an activity from it, which updates the "last communicated at" field. Potential future TODOs: * Improve the caching system; it's pretty inefficient as it is. CacheBox with a call override? * Think of ways to make it not a linear scan, since the instances table can get pretty big. It's around 4500 on toast cafe. ChangeLog: Added	2022-10-16 12:16:04 +00:00
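
The mechanism described in the first commit (91a4f38871) can be sketched roughly as below. This is a minimal in-memory illustration, not the code from this PR: `InstanceRecord`, `DeadInstanceTracker`, `shouldSkip`, and `recordInbound` are hypothetical names, the "1 month" threshold is approximated as 30 days, a `Map` stands in for the instances table, and evicting a host from the cache as soon as it sends an activity is an assumption about how the "comes back online" case could be handled.

```typescript
// Illustrative sketch only; all names here are hypothetical.
interface InstanceRecord {
  host: string;
  isSuspended: boolean;
  lastCommunicatedAt: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;
const DEAD_THRESHOLD_MS = 30 * DAY_MS; // "1 month", approximated as 30 days
const CACHE_TTL_MS = DAY_MS; // the day-long cache from the commit message

class DeadInstanceTracker {
  private instances: Map<string, InstanceRecord>;
  private cachedDeadHosts = new Set<string>();
  private cachedAt = -Infinity; // forces a refresh on first use

  constructor(instances: Map<string, InstanceRecord>) {
    this.instances = instances;
  }

  // Linear scan over every known instance; this is the expensive step
  // that the day-long cache is meant to amortize.
  private refresh(now: number): void {
    const dead = new Set<string>();
    for (const inst of this.instances.values()) {
      if (now - inst.lastCommunicatedAt.getTime() > DEAD_THRESHOLD_MS) {
        dead.add(inst.host);
      }
    }
    this.cachedDeadHosts = dead;
    this.cachedAt = now;
  }

  // True if an outgoing job for this host should be dropped on processing,
  // either because the host is suspended or because it is considered dead.
  shouldSkip(host: string, now: number = Date.now()): boolean {
    if (now - this.cachedAt > CACHE_TTL_MS) {
      this.refresh(now);
    }
    const suspended = this.instances.get(host)?.isSuspended ?? false;
    return suspended || this.cachedDeadHosts.has(host);
  }

  // Called when an activity arrives from the host: update the
  // "last communicated at" field and (assumption) drop it from the cache.
  recordInbound(host: string, now: number = Date.now()): void {
    const inst = this.instances.get(host);
    if (inst) {
      inst.lastCommunicatedAt = new Date(now);
    }
    this.cachedDeadHosts.delete(host);
  }
}
```

In this sketch a delivery worker would call `shouldSkip(host)` before sending and discard the job when it returns `true`, while the inbox handler would call `recordInbound(host)` for every activity it receives. The second commit (21c1e5c06c) replaces the scan with a single query on the indexed host column, which this in-memory version does not attempt to model.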