Add Blockwise Op #757

Conversation
```python
def transform(var: "TensorVariable", client_node: Optional[Apply]) -> Variable:
    """Walk a graph and expand single gradient "block"s into their block-wise equivalents."""
```
Hi @brandonwillard,
Can you explain what the `transform` function is and how it is used in computing the `L_op`?
Just like its `Elemwise` counterpart, `transform` is supposed to use a "template" gradient graph for each input to construct broadcasted gradient graphs in which all the relevant `Op`s are `Elemwise`/`Blockwise` `Op`s applied to the original inputs.
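As a rough analogy (numpy, not this PR's API), numpy's generalized ufuncs express the same `Elemwise`-vs-`Blockwise` distinction: a gufunc signature names the core dimensions, and everything to their left is mapped over block-wise.

```python
import numpy as np

# Rough numpy analogy, not this PR's API: Elemwise maps a scalar
# function over every element, while Blockwise maps a "core" function
# (here a 2D dot) over the leading "batch" dimensions of its inputs.
batched_dot = np.vectorize(np.dot, signature="(m,n),(n,p)->(m,p)")

x = np.random.rand(4, 2, 3)  # four (2, 3) blocks
y = np.random.rand(4, 3, 5)  # four (3, 5) blocks

out = batched_dot(x, y)
assert out.shape == (4, 2, 5)
assert np.allclose(out, np.matmul(x, y))  # matmul broadcasts the same way
```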
Let's take a look at what's happening in `Blockwise.L_op` in the first `test_Blockwise_grad` test.
First, the graph for which we want the L-op/gradient:
```python
aesara.dprint(outputs)
# Blockwise{op=<tests.tensor.test_blockwise.DotBW object at 0x7f5e8236fd90>, signature=((('m', 'n'), ('n', 'p')), (('m', 'p'),))} [id A] <TensorType(float64, (None, None, None))>
#  |input 0 [id B] <TensorType(float64, (None, None, None))>
#  |input 1 [id C] <TensorType(float64, (None, None, None))>
```
It's a `Blockwise` dot product node with two 3D inputs named `input 0` and `input 1`.
A "template" graph of the gradient is produced for each input and stored in core_inp_grads
. Each element of core_inp_grads
corresponds to the generic form of a single-block's gradient wrt. each input.
```python
aesara.dprint(core_inp_grads, print_type=True)
# dot [id A]
#  |<TensorType(float64, (None, None))> [id B]
#  |InplaceDimShuffle{1,0} [id C]
#  | |<TensorType(float64, (None, None))> [id D]
# dot [id E]
#  |InplaceDimShuffle{1,0} [id F]
#  | |<TensorType(float64, (None, None))> [id G]
#  |<TensorType(float64, (None, None))> [id B]
```
We can see that the gradient of a `dot` in a single block is just another `dot`, and that the original inputs aren't present; instead, some stand-in variables are used, and they're 2D (i.e. `TensorType`s with `(None, None)` static shapes).
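To make the templates concrete: for a single block `z = x @ y` with output gradient `g` (presumably the shared stand-in `[id B]` above), the two graphs compute `dx = g @ y.T` and `dy = x.T @ g`. A quick numpy spot check, assuming a `sum()` loss:

```python
import numpy as np

# Single-block check of the templates above: for z = x @ y with output
# gradient g (the shared stand-in [id B]), dx = g @ y.T and dy = x.T @ g.
rng = np.random.default_rng(0)
x = rng.random((2, 3))
y = rng.random((3, 5))
g = np.ones((2, 5))  # output gradient of loss = (x @ y).sum()

dx = g @ y.T   # first template: dot(g, DimShuffle{1,0}(y))
dy = x.T @ g   # second template: dot(DimShuffle{1,0}(x), g)

# finite-difference spot check of dx[0, 0]
eps = 1e-6
x_eps = x.copy()
x_eps[0, 0] += eps
fd = ((x_eps @ y).sum() - (x @ y).sum()) / eps
assert np.isclose(dx[0, 0], fd, atol=1e-4)
```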
In other words, we've used the core dimensions specified by the `Blockwise` and its `Op` to remove the broadcasted dimensions (i.e. the ones that determine each block) and produce the generic form of a single "block"'s L-op from an existing `Op.grad`/`Op.L_op` implementation.
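For illustration, here's how the signature splits an input's dimensions into broadcasted (block) and core ones; the helper is hypothetical, not part of this PR:

```python
# Hypothetical helper, not this PR's code: the Blockwise signature
# ((('m', 'n'), ('n', 'p')), (('m', 'p'),)) assigns two core dimensions
# to each input; any leading dimensions beyond those are the broadcasted
# "block" dimensions that the core gradient templates leave out.
signature = ((("m", "n"), ("n", "p")), (("m", "p"),))
input_sigs, output_sigs = signature

def n_block_dims(ndim, core_sig):
    """Number of leading block (batch) dimensions for one input."""
    return ndim - len(core_sig)

# the test's 3D inputs each have 3 - 2 = 1 leading block dimension,
# which is why the templates operate on 2D (None, None) stand-ins
assert [n_block_dims(3, s) for s in input_sigs] == [1, 1]
```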
Now, we can't simply replace those stand-in inputs with `input 0` and/or `input 1`, because the `dot`s in the gradient graphs don't work block-wise and, as a result, can't accept the original inputs. Also, the `InplaceDimShuffle` applied to one of the inputs in each graph wouldn't work on an input with an extra third dimension.
The idea is that we need to convert the templates' `dot`s into `Blockwise(dot)`s and do something about the `InplaceDimShuffle`s. My guess is that the first input's gradient graph would end up looking like the following after applying `transform`:
```python
# Blockwise{op=<tests.tensor.test_blockwise.DotBW object at 0x7f5e8236fd90>, signature=((('m', 'n'), ('n', 'p')), (('m', 'p'),))} [id A]
#  |input 0 [id B]
#  |InplaceDimShuffle{1,0,2} [id C]
#  | |input 1 [id D]
```
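A toy, runnable sketch of that rewrite, with a made-up graph representation (nested tuples standing in for `Apply` nodes) and made-up names, just to show the shape of the walk:

```python
# Toy sketch only; the names and graph representation are mine, not the
# PR's API. Templates are nested tuples, leaves are stand-in names, and
# the walk lifts each core op into a block-wise counterpart while
# substituting the original (batched) inputs for the 2D stand-ins.
def transform_sketch(template, replacements):
    if isinstance(template, str):
        return replacements[template]            # swap in the original input
    op, *args = template
    args = tuple(transform_sketch(a, replacements) for a in args)
    if op == "dot":
        return ("Blockwise(dot)", *args)         # lift the core op
    if op == "dimshuffle(1,0)":
        # remap the core transpose over one leading block dimension
        return ("dimshuffle(0,2,1)", *args)
    raise NotImplementedError(f"no signature for {op}")

template = ("dot", "g", ("dimshuffle(1,0)", "y"))  # first template above
print(transform_sketch(template, {"g": "output grad", "y": "input 1"}))
# ('Blockwise(dot)', 'output grad', ('dimshuffle(0,2,1)', 'input 1'))
```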
The `DimShuffle`d dimensions will probably require a little bit of calculation involving `Blockwise.signature` (i.e. to transpose the correct, core dimensions), but most other `Op`s should be amenable to `Blockwise`, at least after we formalize and attach the relevant signature information to our `Op`s. `DimShuffle` is perhaps a special case in which we don't want to create a `Blockwise` `Op`, mostly because there's no point in literally applying a `DimShuffle` block-wise when a new, equivalent `DimShuffle` can be produced that accomplishes the same thing more succinctly.

Any `Op`s that can't be converted to a `Blockwise` form (e.g. because they don't provide signature information in some way or another) should result in a no-gradient error.
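One plausible way to do that signature-based order calculation, assuming the block dimensions lead and stay in place (the helper name is mine, not the PR's):

```python
# Sketch of the order remapping, assuming the block (batch) dimensions
# lead: keep them fixed and shift the core DimShuffle order right by the
# number of block dimensions.
def batched_dimshuffle_order(core_order, n_block_dims):
    return tuple(range(n_block_dims)) + tuple(d + n_block_dims for d in core_order)

# the templates' core transpose (1, 0) on 2D blocks, applied to the 3D
# inputs with one leading block dimension
assert batched_dimshuffle_order((1, 0), 1) == (0, 2, 1)
```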
Closing in favor of #1215.
This PR implements #695.
It's currently just an outline.