Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-3779] Implement BITAND, BITOR, BITXOR scalar functions #3942

Merged
merged 1 commit into from
Sep 10, 2024

Conversation

normanj-bitquill
Copy link
Contributor

  • All three functions accept either a pair of Integer or Binary values as arguments
  • Binary arguments must be of the same length
  • Returns NULL if any argument is NULL
  • Returns a value of the same type as the first argument if both arguments are not null

Copy link
Contributor

@mihaibudiu mihaibudiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand why the result for TINYINT will be correct, since today Calcite incorrectly does not truncate the results for short arithmetic.

return b0 & b1;
}

/** Bitwise function <code>BITAND</code> applied to a Long and int value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bitwise function BITAND applied to a int and Long value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some more tests with different integer types for the arguments (integer and bigint).

Needed to rework the return type detection. It will now use the largest integer type if both arguments are integer types. There may be ways of cleaning the code up further for the return type inference.

@normanj-bitquill
Copy link
Contributor Author

I don't really understand why the result for TINYINT will be correct, since today Calcite incorrectly does not truncate the results for short arithmetic.

@mihaibudiu I have created a custom SqlReturnTypeInference class. It will look at the arguments and choose the largest integer type (if both are integer types), otherwise use the type of the first argument. If both arguments are NULL, then just assume integer nullable return type.

Perhaps you mean when the result of the BITAND is converted from a long back to a TINYINT. If this is the case, do you have any examples where it fails to truncate the value correctly? No truncation should be needed, since the result cannot exceed the largest argument type.

@mihaibudiu
Copy link
Contributor

I think this pattern is common in all arithmetic operations for integer types. Hopefully you could have reused some code.
Your runtime only supported int and long, so I suspected the results would be wrong for short.

}
}
if (integerReturnType != null && allArgsInteger) {
return typeFactory.createTypeWithNullability(integerReturnType, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are all these types nullable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since NULL is a possible result.

BITAND(1, NULL) = NULL
BITAND(CAST(x'01' AS BINARY(1)), NULL) = NULL

If nullability is set to false here, then NULL results are coerced back to integer types with the value 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the nullability of the result should depend on the nullability of the arguments, if they are not null, the result can be not null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean. I have reworked the return type detection. It is now only nullable if at least one argument type is nullable.

return typeFactory.createTypeWithNullability(opType, true);
}
}
return typeFactory.createTypeWithNullability(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When can this happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For line 642 above, it would get there when at least one argument is a non integer type and not null. This is probably not quite correct. I only want to exclude the NULL type. A nullable value is fine as long as it is an integer or binary type.

I'll see about correcting this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reworked the logic here.

If all arguments are of the NULL type, then the return type is INTEGER nullable.

If all arguments are integer types, then the return type is the largest type found. It is nullable if at least one argument is nullable.

As a fallback, the first argument that is not of the NULL type is the return type. It is nullable if at least one argument is nullable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback could be coded more explicitly as a check for the NULL type and then an assertion failure.
I like to be as explicit as possible, it's more future-proof.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, if any argument has the NULL type the result type can also have the NULL type, since there is only one value with this type, NULL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW: there is a "leastRestrictive" function which does much of this, why doesn't it satisfy the requirements? After you know that the inputs are integer or NULL, that function should do the right thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you in theory, but I don't know how to do this in practice. When I make the change so that the NULL type is returned if any argument is of the NULL type, then the tests fail. They fail with this error occurring when trying to convert the result.

java.lang.NoSuchMethodException: java.lang.Void.valueOf(void)

It looks like it is trying to the result to a Void type in Java. It never went to an implementation of the BITAND() function. I also tried use a return type of java.lang.Long for the relevant implementation of the BITAND() function. This gives the a similar error but for the Long type error:

java.lang.NoSuchMethodException: java.lang.Long.voidValue()

Another way to look at this is that BITAND(1, NULL) should return a NULL value of the type of the first argument. I am treating BITAND(1, NULL) as the same as BITAND(NULL, 1).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping that the nullpolicy would take care of that, but I guess I am wrong (the STRONG policy ensures that a function with null arguments is never called). But I guess you still need a return type, it can't be NULL. So probably the right solution is to return a compile-time error when all arguments have NULL type, and otherwise the type of the first non-NULL typed argument will do fine if any argument has type NULL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Now it gives an error when all arguments are NULL.

* @return true if type2 is of a larger integer type than type1
*/
private static boolean isIntegerTypeLarger(RelDataType type1, RelDataType type2) {
if (SqlTypeName.TINYINT == type1.getSqlTypeName()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use the precision to make this simpler, and perhaps safer too.
In some type systems maybe some of these types could be the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that helps a lot. I have cleaned up the code to make use of precision.

@@ -207,6 +207,7 @@ ArgumentMustBeNumericLiteralInRange=Argument to function ''{0}'' must be a numer
ValidationError=Validation Error: {0}
IllegalLocaleFormat=Locale ''{0}'' in an illegal format
ArgumentMustNotBeNull=Argument to function ''{0}'' must not be NULL
AtLeastOneArgumentMustNotBeNull=At least one argument to function ''{0}'' must not be NULL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are other functions which give an error in this case, but this error message is frankly better. maybe we should update the other ones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what I can look for? I could update them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have seen this: "Illegal use of ''NULL''"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's leave this for another PR, this has been strenuous enough

@mihaibudiu
Copy link
Contributor

From my pov we can squash the commits.

@mihaibudiu mihaibudiu added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Sep 9, 2024
@mihaibudiu
Copy link
Contributor

Sorry, but it looks like other merges have introduced conflicts, please fix them so we can merge this one too

* All three functions accept either a pair of Integer or Binary values as arguments
* Binary arguments must be of the same length
* Returns NULL if any argument is NULL
* If all arguments are NULL, throw an error
* Returns a value of the same type as the first argument if both arguments are not null
@normanj-bitquill
Copy link
Contributor Author

Commits have been squashed.

Copy link

@mihaibudiu mihaibudiu merged commit 890f9ad into apache:main Sep 10, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LGTM-will-merge-soon Overall PR looks OK. Only minor things left.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants