Make Vendure serverless proof #1799
It occurs to me that the event bus and the job queue essentially accomplish very similar tasks. Would it be an option to merge the two together?
This is, right now, the main thing missing for me in Vendure. We are running on GCR, which is an amazing service, but Vendure is not a terribly good fit for it because it relies on a long-running server. The main things that need to be addressed, IMO are:
Here is a transcript from a Slack discussion between you, me and @michaelbromley on Feb 25 2022 (thanks to @ivasilov for digging it up).
Or the Slack link for convenience: https://vendure-ecommerce.slack.com/archives/CKYMF0ZTJ/p1645746002899449
Thanks @skid & everyone for the feedback.
So the core issue here is that event subscribers might get killed mid-processing because the response has already been sent. As I mention there and in the Slack conversation above, I am hesitant to completely alter the observer pattern on the EventBus - and it is also not really possible with RxJS as it stands. But maybe there is some clever way to have a global setting like "await all subscribers before sending response": we'd do some kind of ref counting on all events published in that context and figure out some way of waiting for the subscriber functions to return. Again, I'm not at all sure whether this is technically possible, but it is worth some research.

Or the other option is, as has been suggested, to scrap the EventBus altogether and unify all async tasks into the job queue. One major issue with that is the overhead: the default job queue will add a row for every system event (of which there are many), and on top of that will have to poll in order to handle new jobs. That introduces not only network latency to the events system, but also polling latency, plus increased storage and DB load. The great thing about the RxJS observable implementation is that it is very lightweight and fast, with no external systems involved.

I'm definitely interested in solving this and having a totally solid design for serverless architectures. I'm just not yet sure what the best option is. If anyone wants to put together a POC of some new approach, I'm very happy to look at suggestions!
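The "await all subscribers before sending response" idea above could be prototyped roughly like this. This is a hypothetical sketch, not Vendure's actual EventBus API: each `publish()` tracks the promises its handlers return, and a `drain()` call (e.g. from a response interceptor) waits until all of them have settled before the reply is flushed.

```typescript
// Hypothetical sketch of a ref-counted, drainable event bus.
// All names here are illustrative, not part of Vendure's API.
type Handler<T> = (event: T) => Promise<void> | void;

class AwaitableEventBus {
  private handlers = new Map<string, Handler<any>[]>();
  private pending: Promise<void>[] = [];

  on<T>(type: string, handler: Handler<T>): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  publish<T>(type: string, event: T): void {
    for (const handler of this.handlers.get(type) ?? []) {
      // Track each handler's promise instead of fire-and-forget.
      this.pending.push(Promise.resolve().then(() => handler(event)));
    }
  }

  /** Called before the response is sent: wait until all handlers settle. */
  async drain(): Promise<void> {
    while (this.pending.length > 0) {
      const batch = this.pending.splice(0);
      await Promise.allSettled(batch);
      // Handlers may publish follow-up events; loop until quiescent.
    }
  }
}
```

The loop in `drain()` is the "ref counting" part: handlers that publish further events keep the bus busy until no new work appears, which is exactly the guarantee a serverless runtime needs before the instance is frozen.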
I think there is a lot of work that can be done here with regard to things like deferring work until first needed. I believe @hendrik-advantitge & team have done a little research into this area for the same reason, as they are running on GCR too.
Here's a few thoughts:
So not necessarily related to a requirement to run this on serverless arch, but for a while now I've been grappling with a problem that I think could be solved here and also in turn solve for the original issue. That problem is the need for a centralised event bus - right now for me there are two types of events:
Right now, by running things in a farmed environment, the only way to cater for the second type is by using the job queue, but that creates job entries for each event and is therefore undesirable (it's equally undesirable when using search in unbuffered mode). As well as wanting some events to be consumed by multiple instances of Vendure, I also want to be able to subscribe to those events and sink them into other places from independent services.

I like the simplicity of RxJS, I do - but at the same time I often wonder if we could have different strategies for the event bus, so that I could plug in Rabbit, Kafka or the like. In so doing we could solve the original issue, arguably with some additional deployment complexity, but you'd gain a lot as well. Where this gets complicated is:
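The pluggable-strategy idea could look something like the following. The interface and class names are illustrative assumptions, not Vendure's actual API: the default stays as the fast in-process implementation, while a RabbitMQ or Kafka strategy could implement the same interface without subscriber code changing.

```typescript
// Hypothetical pluggable event bus strategy interface (illustrative names,
// not Vendure's API). A broker-backed strategy would implement the same
// contract, with publish() resolving once the broker acks the message.
type EventHandler = (payload: unknown) => Promise<void> | void;

interface EventBusStrategy {
  publish(type: string, payload: unknown): Promise<void>;
  subscribe(type: string, handler: EventHandler): void;
}

class InMemoryEventBusStrategy implements EventBusStrategy {
  private subs = new Map<string, EventHandler[]>();

  subscribe(type: string, handler: EventHandler): void {
    const list = this.subs.get(type) ?? [];
    list.push(handler);
    this.subs.set(type, list);
  }

  async publish(type: string, payload: unknown): Promise<void> {
    // Awaiting handlers here mirrors the delivery guarantee a remote
    // broker strategy would provide via acknowledgements.
    for (const handler of this.subs.get(type) ?? []) {
      await handler(payload);
    }
  }
}
```

Keeping the in-memory strategy as the default would preserve the lightweight, no-external-systems behaviour for single-instance deployments, while farmed or serverless setups could swap in a broker-backed strategy via configuration.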
I had the same thoughts, but as stated above: it'll introduce a lot of jobs into the job queue, and personally I'll not have a way to "intercept" events and modify data "in-place" inside the transaction.
Instead of a common subscriber, I would like to have a middleware pattern for the EventBus. The RxJS pattern could be kept for backward compatibility.
The middleware pattern has proven its flexibility and can be used in many ways in Vendure, including integration with 3rd-party services and handling events in both blocking and non-blocking ways.
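To make the middleware proposal concrete, here is a minimal sketch of what an EventBus middleware chain could look like, in the koa/express style. This is a hypothetical design, not an existing Vendure API: each middleware can inspect or mutate the event "in place" and either call `next()` or block the chain; the existing RxJS observers could be attached as the final middleware for backward compatibility.

```typescript
// Sketch of a composable middleware chain for an event bus (hypothetical
// API). Mirrors the koa-compose pattern: middlewares run in order, each
// deciding whether to pass control onward via next().
type Next = () => Promise<void>;
type EventMiddleware<T> = (event: T, next: Next) => Promise<void>;

function composeMiddleware<T>(middlewares: Array<EventMiddleware<T>>) {
  return async function run(event: T): Promise<void> {
    let lastIndex = -1;
    async function dispatch(i: number): Promise<void> {
      if (i <= lastIndex) throw new Error('next() called multiple times');
      lastIndex = i;
      const mw = middlewares[i];
      if (mw) {
        await mw(event, () => dispatch(i + 1));
      }
    }
    await dispatch(0);
  };
}
```

Because each middleware is awaited, a subscriber can run inside the publishing transaction (blocking) or fire off async work and call `next()` immediately (non-blocking), which is the flexibility described above.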
Hi all, I saw the documentation here: https://www.vendure.io/docs/developer-guide/deployment/#serverless--multi-instance-deployments Thanks!
@Logic-Bits both the server and worker can be run with multiple instances in parallel. There's an updated docs page with more information about this here: https://www.vendure.io/docs/deployment/horizontal-scaling/
@ashitikov I just re-read your middleware-based proposal and I find it very interesting. Would you be interested in developing a proof-of-concept implementation of this idea? I'm currently working to wrap up the breaking changes for v2.0, which is why I'm revisiting this. With a proper compatibility composer it looks like this feature might be possible without breaking changes.
Ahh, thank you very much. The docs are clear now! |
Yes, I am interested, but honestly I don't have free time for that right now. Possibly in Q2, I think.
@ashitikov OK, in that case I will defer this to a later release, after you get time to work on it. If there are no breaking changes we can get it into a 2.x minor release.
Hi @michaelbromley, just a follow-up question. If we scale the server with multiple instances, how would we ensure that the DB changes won't be handled / applied multiple times?
@Logic-Bits with multiple instances of the server, any given request will still only arrive at a single instance (typically as chosen by the load balancer). So there should be no danger of the same request triggering multiple updates. |
@michaelbromley I mean on startup, when DB changes are applied, either through synchronize: true or a migration.
@Logic-Bits Ah, I understand. There's no built-in handling of this scenario, but it's discussed here: Perhaps it would be a good feature to build into Vendure's migration wrapper!
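The usual approach to this problem is a shared lock, so that only one instance applies migrations while the others wait and then skip. The following is an in-process simulation of that idea, purely to illustrate the flow. The helper names are hypothetical, and a real multi-instance deployment would replace the `MigrationLock` class with a truly shared lock such as a PostgreSQL advisory lock or a lock row in the database.

```typescript
// In-process simulation of startup migration locking (hypothetical helper,
// not a built-in Vendure API). In production, acquire() would take a shared
// lock, e.g. a Postgres advisory lock, instead of an in-memory flag.
class MigrationLock {
  private locked = false;
  private waiters: Array<() => void> = [];

  /** Returns true for the one caller that should run migrations. */
  async acquire(): Promise<boolean> {
    if (!this.locked) {
      this.locked = true;
      return true;
    }
    // Other instances wait until the holder finishes, then skip.
    await new Promise<void>(resolve => this.waiters.push(resolve));
    return false;
  }

  release(): void {
    this.locked = false;
    this.waiters.splice(0).forEach(wake => wake());
  }
}

async function startInstance(
  lock: MigrationLock,
  runMigrations: () => Promise<void>,
): Promise<'migrated' | 'skipped'> {
  if (await lock.acquire()) {
    try {
      await runMigrations();
    } finally {
      lock.release();
    }
    return 'migrated';
  }
  return 'skipped';
}
```

With this pattern, instances that lose the race still block until migrations have finished, so none of them starts serving traffic against an un-migrated schema.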
THX! Will check it out over there. |
@ashitikov Any updates on this? I opened the original issue, but we are not running fully serverless anymore: we still run on Google Cloud, but with background CPU usage allowed. @michaelbromley besides the technical considerations, this seems to be a design/architecture matter: do we want Vendure to be able to run on serverless environments? I personally don't really need it 🙃 I also don't see any serious applications using serverless in real life. Small functions, yes; applications that require fast response times, no.
Hello, sorry, I am quite inactive right now on Vendure. Hopefully one day I will be back and help with further development :) |
Is your feature request related to a problem? Please describe.
Most (if not all?) serverless environments don't allow processing outside the request context. Vendure's event handling has some async processes that occur outside the request context, for example the apply-collection-filters job.
Event handling occurs after a response has been sent, because event.publish() is fire and forget (as it should be).
Describe the solution you'd like
It would be nice if we could somehow be sure that no background tasks are still running once a response has been returned.
Describe alternatives you've considered
One option I see is to be able to implement our own remote `eventBus` strategy. We could do `await eventBus.publish()` and be sure that the event is published. Event handling could then be done similarly to how worker jobs are handled: in their own async request context, if that's what the configured event bus strategy implements.

A workaround for Google Cloud Run is to enable background processes, which basically disables the serverless behaviour and makes it a normal long-running container (resulting in a ~3x price increase).
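The proposed `await eventBus.publish()` semantics can be sketched as follows. This is a hypothetical illustration, not Vendure's API: `publish()` resolves once the event has been accepted by the transport (simulated here with an in-memory queue; a remote strategy would await a broker ack), while handlers run later, out-of-band, the way worker jobs do.

```typescript
// Hypothetical sketch: publish() resolves on durable hand-off, and handler
// execution happens separately (e.g. on a worker), like a job queue entry.
interface QueuedEvent {
  type: string;
  payload: unknown;
}

class QueueBackedBus {
  private queue: QueuedEvent[] = [];

  async publish(type: string, payload: unknown): Promise<void> {
    // In a remote strategy this would be an acknowledged network call;
    // once it resolves, the request can safely return its response.
    this.queue.push({ type, payload });
  }

  /** Drained out-of-band, e.g. by a worker process in its own context. */
  async drainOne(handler: (e: QueuedEvent) => Promise<void>): Promise<void> {
    const event = this.queue.shift();
    if (event) {
      await handler(event);
    }
  }
}
```

The key property for serverless runtimes is that nothing of consequence happens after `publish()` resolves inside the request: the event is already safely handed off, so freezing or killing the instance loses no work.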
Additional context
This problem hasn't occurred for us yet, but if an event handler does resource-intensive tasks that take longer than a few seconds, serverless environments would just kill the process.
This problem can occur with Google Cloud Run, Google Cloud Functions, AWS Lambda and probably similar products from other providers.
Is this something that needs to be implemented in Vendure, or is this a "won't support" feature? There is no shame in not supporting function/lambda like envs, but semi-serverless/autoscaling envs like Google Cloud Run are great platforms to save costs.