Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Callbacks for async/await #40

Open
mwaldrich opened this issue Jul 23, 2019 · 19 comments
Open

Callbacks for async/await #40

mwaldrich opened this issue Jul 23, 2019 · 19 comments

Comments

@mwaldrich
Copy link
Contributor

Would it be possible to implement new callbacks to instrument async and await?

For async, perhaps an argument could be added to the function call and function literal callbacks to indicate if it is an async function.

For await, I would suggest:

  • adding an isAwaited argument to the function call callbacks, indicating that the caller is awaiting this function's result.
  • a new await callback. This would allow the analysis to instrument when the await is scheduled.

These three changes together will allow analysis to distinguish between:

  • a regular function being invoked sequentially;
  • an async function being invoked sequentially (the value is not being awaited);
  • an async function being awaited.

Thanks!

@Haiyang-Sun
Copy link
Owner

@mwaldrich Hi, thanks for the suggestion here.

For await, I don't think it is a good idea to bind it with a function call. await can be used on any value or promise, and await on an asynchronous function is actually await the internal promise returned by the async function (an async function always return a promise object, even though the function itself may return some non-promise value). So what I propose is to add two callback pairs like below:

For await
awaitPre (iid, val) => where val could be any non-promise value or a promise object.
awaitPost(iid, ret) => where ret will be the actually value resolved by the awaited promise or value.

For async, we can provide a special asyncFunctionExit event, which provide an extra argument of the promise object returned by the asynchronous function call:
asyncFunctionExit(iid, ret, promise, exceptionWrapper)

With these two callbacks, you should be able to know if an async function has been awaited by tracking if any await is waiting for the same promiseObject returned by the async function.

I had some experimental implementation about the callbacks as above. However, it was based on my own hack on an old version of Graal.js. We will look into how to properly integrate it in Graal.js in future.

Feel free to discuss here what you think about these potential new callbacks.

@mwaldrich
Copy link
Contributor Author

Sounds great! This should be enough to perform general instrumentation on async/await in JS.

How much work will it be to clean up the hacky implementation and get it merged into Graal?

@mwaldrich
Copy link
Contributor Author

Any update on these callbacks?

@alexjordan
Copy link
Collaborator

Async support has been merged upstream thanks to @eleinadani, await is still in progress.

@mwaldrich
Copy link
Contributor Author

Thanks for the implementation! I've started playing around with these callbacks, but they don't seem to be called in the way I would expect.

I actually have two concerns:

  1. if I understand the purpose of the currently-implemented callbacks correctly, they are not being called at the right times
  2. I'm not sure the currently-implemented callbacks actually give us enough information.

Example (timing how long each await is)

Imagine you want to write a simple analysis to time how long each await is. You want to record a timestamp when a value is awaited, and another when the await is resolved.

Here's a simple analysis and sample program.

asyncAnalysis.js (analysis)

//DO NOT INSTRUMENT
(function (sandbox) {
    function MyAnalysis() {

        // this.promises = new Map();

        this.functionEnter = function (iid, f, base, args) {
            console.log("functionEnter@ " + J$.iidToLocation(iid), f.name, args.length);
        };
        this.functionExit = function(iid, returnVal, wrappedExceptionVal) {
            console.log("functionExit@ " + J$.iidToLocation(iid));
        };
        this.asyncFunctionEnter = function(iid) {
            console.log('async enter', J$.iidToLocation(iid));
        }
        this.asyncFunctionExit = function(iid, returnVal, wrappedException) {
            console.log('async exit', J$.iidToLocation(iid), returnVal, wrappedException);
        }

        this.awaitPre = function(iid, valAwaited) {
            console.log('awaitPre', J$.iidToLocation(iid), valAwaited);
        }
        this.awaitPost = function(iid, result, exceptionWrapper) {
            console.log('awaitPost', J$.iidToLocation(iid), result, exceptionWrapper);
        }

        this.binary = function(iid, op, left, right) {
            console.log('binary', op, left, right);
        }
        this.write = function(iid, name, val) {
            console.log('write', J$.iidToLocation(iid), name, val);
        }
        this._return = function(iid, val) {
            console.log('return', J$.iidToLocation(iid), val);
        }
        this.declare = function(iid, name, type) {
            console.log('declare', J$.iidToLocation(iid), name, type);
        };
    }

    sandbox.analysis = new MyAnalysis();
})(J$);

async.js (sample program)

async function countTo6(x) {
    if (x > 5) {
	      return x;
    } else {
        return await countTo6(x + 1);
    }
}

async function main() {
	  return await countTo6(1);
}

main().then(v => console.log("done! answer = " + v));

Analysis output

functionEnter@ (async.js:1:1:16:4)  0
functionExit@ (async.js:1:1:16:4)
declare (async.js:1:2:16:2) exports var
declare (async.js:1:2:16:2) require var
declare (async.js:1:2:16:2) module var
declare (async.js:1:2:16:2) __filename var
declare (async.js:1:2:16:2) __dirname var
declare (async.js:1:2:16:2) countTo6 var
declare (async.js:1:2:16:2) main var
functionEnter@ (async.js:1:2:16:2)  5
write (async.js:2:1:8:2) countTo6 async function countTo6(x) {
    if (x > 5) {
	      return x;
    } else {
        return await countTo6(x + 1);
    }
}
write (async.js:10:1:12:2) main async function main() {
	  return await countTo6(1);
}
async enter (async.js:11:4:11:29)
functionEnter@ (async.js:10:1:12:2) main 0
declare (async.js:2:1:8:2) x var
binary > 1 5
binary + 1 1
declare (async.js:2:1:8:2) x var
binary > 2 5
binary + 2 1
declare (async.js:2:1:8:2) x var
binary > 3 5
binary + 3 1
declare (async.js:2:1:8:2) x var
binary > 4 5
binary + 4 1
declare (async.js:2:1:8:2) x var
binary > 5 5
binary + 5 1
declare (async.js:2:1:8:2) x var
binary > 6 5
return (async.js:4:8:4:17) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:6:16:6:37) Promise { <pending> }
awaitPre (async.js:11:11:11:28) Promise { <pending> }
functionExit@ (async.js:10:1:12:2)
async exit (async.js:11:4:11:29) Promise { <pending> } {}
functionExit@ (async.js:1:2:16:2)
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
awaitPre (async.js:6:16:6:37) Promise { 6 }
awaitPost (async.js:6:16:6:37) 6 {}
return (async.js:6:9:6:38) 6
functionEnter@ (async.js:10:1:12:2) main 0
awaitPre (async.js:11:11:11:28) Promise { 6 }
awaitPost (async.js:11:11:11:28) 6 {}
return (async.js:11:4:11:29) 6
functionExit@ (async.js:10:1:12:2)
declare (async.js:14:13:14:52) v var
functionEnter@ (async.js:14:13:14:52)  1
binary + done! answer =  6
done! answer = 6
return (async.js:14:18:14:52) undefined
functionExit@ (async.js:14:13:14:52)

Issues

There are a few weird things happening in the analysis:

  1. functionEnter events are not being generated for recursive async calls
  2. asyncFunctionExit events are not being generated for recursive async calls
  3. functionExit events are not being generated after returning a real values from recursive async calls

In addition, I don't think it is possible to model the call stack from the analysis' perspective without also having an asyncFunctionEnter callback, which would be used when an async function is "resumed".

@Haiyang-Sun
Copy link
Owner

I will look into the issue, thanks for the feedback!

@mwaldrich
Copy link
Contributor Author

Also, a few more things I noticed:

  • there are too many awaitPre events being generated. ideally, there would be a one-to-one correspondence between awaitPre and awaitPost events, but there are exactly twice as many awaitPres compared to awaitPosts. this is because an additional awaitPre is generated when the await is resolved
  • I don't believe it's possible to tie together the awaitPres and awaitPosts unless:
    1. awaitPost also provides the original promise that awaitPre provided
    2. NodeProf ties a unique ID to each promise, and this ID is given in both awaitPre and awaitPost

@mwaldrich
Copy link
Contributor Author

I have tried using the new asyncawait-fix branch, and it seems to solve the issue of too many awaitPre events being generated.

However, the awaitPost callback still does not provide enough information to perform some basic analyses, such as timing awaits.

To do this, the awaitPost callback would need to provide information so it could be tied to its corresponding awaitPre callback. There are at least two ways I can think to do this:

  1. providing the original promise in the awaitPost callback (e.g. function awaitPost(iid, originalValAwaited, result, exceptionVal))
  2. providing a unique identifier for the corresponding promise in both awaitPre and awaitPost (e.g. function awaitPre(iid, valAwaited, awaitID) and function awaitPost(iid, result, awaitID, exceptionVal))

@eleinadani
Copy link
Collaborator

In Graal.js we can already instrument all Promise creations, so I assume that solution #1 would be easy -- @Haiyang-Sun what do you think?

@Haiyang-Sun
Copy link
Owner

@mwaldrich It makes sense to add the promise object also in the awaitPost statement. I think you can maintain an id for a promise object in your analysis using the builtin Map because it allows you to use an object as a key.

@eleinadani I think it's not related to the builtins. We can provide the promise object from the inputs. I will patch nodeprof to do it.

@Haiyang-Sun
Copy link
Owner

Haiyang-Sun commented Nov 29, 2019

@mwaldrich actually adding the promise object in awaitPost might not be necessary. Since now awaitPre and awaitPost are always in pair, you can maintain an "AwaitStack".

@mwaldrich
Copy link
Contributor Author

@Haiyang-Sun Even though awaitPre/awaitPost appear in pairs, I don't think they can be tracked in a stack.

I think this is because async/await programs, just like ES5 programs with native promises, have dynamic control flow that doesn't fit into a simple call-stack model. The program can be waiting on multiple independent awaits, and the processing order depends on the event loop.

Here's an example of a program whose awaitPre/awaitPosts do not form a stack:

example.js

  async function printHello() {
      console.log("hello!");
  }

  async function async1() {
      await printHello();
  }

  async function async2() {
      await printHello();
  }

  async1();
  async2();

analysis.js

// DO NOT INSTRUMENT
(function (sandbox) {
    function MyAnalysis() {
            this.awaitPre = function (iid, valAwaited) {
                console.log(`awaitPre iid=${iid} loc=${J$.iidToLocation(iid)} valAwaited=${valAwaited}`);
            };
            this.awaitPost = function (iid, result, exceptionVal) {
                console.log(`awaitPost iid=${iid} loc=${J$.iidToLocation(iid)} result=${result}`);
            };
    }

    sandbox.analysis = new MyAnalysis();
})(J$);

Output

hello!
awaitPre iid=1 loc=(async4.js:6:5:6:23) valAwaited=[object Promise]
hello!
awaitPre iid=2 loc=(async4.js:10:5:10:23) valAwaited=[object Promise]
awaitPost iid=1 loc=(async4.js:6:5:6:23) result=undefined
awaitPost iid=2 loc=(async4.js:10:5:10:23) result=undefined

If you tried to push all promises seen in awaitPre onto a stack, and popped them in awaitPost to infer to the promise, you would infer the wrong promise in this case. This is because any time an async function is called without an await, a new callstack is created, which would have its own separate await stack. This cannot be modeled with a single stack.

Therefore, I think providing the promise in awaitPost is still necessary.

@Haiyang-Sun
Copy link
Owner

@mwaldrich Umm, I know that providing a promise would be helpful. But it is not the key to the problem here. Assume the following example:

async function printHello() {
      console.log("hello!");
  }
let p = printHello();
  async function async1() {
      await p;
  }

  async function async2() {
      await p;
  }

  async1();
  async2();

You will still have the same promise object in the two awaitPost events.
I think the key is to use a queue instead of a stack, assuming that all the awaitPre events will be consumed one by one and maps to the next coming awaitPost.

@mwaldrich
Copy link
Contributor Author

@Haiyang-Sun I see what you're saying, but I think neither queues nor stacks will work.

Functions like those in the fs library may take lots of time to evaluate, and they may not be finished after 1 event loop completion. Therefore, they may be skipped in the next event loop run, while other awaits are processed. This skipping would result in awaits not being processed in a queue.

Here's an example of code that (usually) doesn't process its awaits as a queue:

asyncIO.js

This program awaits 2 different asynchronous operations:

  1. stating the current directory
  2. a trivial identity function

Since stating the current directory will usually take longer than evaluating an identity function, the awaits are not processed as a queue.

const util = require('util');
const fs = require('fs');
const statAsync = util.promisify(fs.stat);

async function id(x) {
    return await x;
}

async function longIOOperation() {
    return await statAsync(".");
}

async function main() {
    longIOOperation().then(_ => console.log("done with long i/o operation"));
    id(1).then(_ => console.log("done evaluating id function"));
}

main();
console.log("first event loop over");

analysis.js

// DO NOT INSTRUMENT
(function (sandbox) {
    function MyAnalysis() {
            this.awaitPre = function (iid, valAwaited) {
                console.log(`awaitPre iid=${iid} loc=${J$.iidToLocation(iid)} valAwaited=${valAwaited}`);
            };
            this.awaitPost = function (iid, result, exceptionVal) {
                console.log(`awaitPost iid=${iid} loc=${J$.iidToLocation(iid)} result=${result}`);
            };
    }

    sandbox.analysis = new MyAnalysis();
})(J$);

Analysis output

awaitPre iid=2 loc=(asyncIO.js:10:12:10:32) valAwaited=[object Promise]
first event loop over
awaitPre iid=4 loc=(asyncIO.js:6:12:6:19) valAwaited=1
awaitPost iid=4 loc=(asyncIO.js:6:12:6:19) result=1
done evaluating id function
awaitPost iid=2 loc=(asyncIO.js:10:12:10:32) result=[object Object]
done with long i/o operation

If I assumed all awaits were processed in order, my analysis would incorrectly infer the promise for the first awaitPost.

@Haiyang-Sun
Copy link
Owner

Yeah, you are right. So we might actually need a unique id for the await operation.
@eleinadani , if I remember correctly, there is an internal promise object that can be used for this purpose (https://github.com/graalvm/graaljs/blob/master/graal-js/src/com.oracle.truffle.js/src/com/oracle/truffle/js/nodes/control/AwaitNode.java#L169). For example, even for a statement like await 1; there is still some internal promise object created.
I will see if I can find a way to expose it as the identifier.

@eleinadani
Copy link
Collaborator

Yes, there should be an internal promise and you should be able to instrument that via builtin root nodes instrumentation. Regarding unique ids, I think you can just use System.identityCode() for object instances (e.g. promises), while for code locations you could either use the identity of the source section (should be unique) or implement your own mapping. On a separate note (still realted to async/await) --> your Graaljs bugfix will land probably today

@mwaldrich
Copy link
Contributor Author

Is there any update on the implementation of the modified awaitPre and awaitPost callbacks?

@Haiyang-Sun
Copy link
Owner

@mwaldrich Sorry for the late reply. I had a look, it is not straightforward to update this feature. I can work on it later this week.

@Haiyang-Sun
Copy link
Owner

@mwaldrich I have a workaround fix (#70).
In this fix, the promise object will be provided both in awaitPre and in awaitPost.
In awaitPost, the exception parameter is replaced with a isRejected flag, and the third argument will be the value resolved or rejected of the awaited promise.

Another special case is to await on a value instead of a promise. You won't get any promise object in the callbacks.

There is still a bug where we can miss the promise object in the awaitPost callback in some corner case, which I will look into.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants