-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add map type support #13906
Add map type support #13906
Conversation
e1e08ce
to
e211795
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #13906 +/- ##
============================================
+ Coverage 61.75% 63.86% +2.11%
- Complexity 207 1537 +1330
============================================
Files 2436 2618 +182
Lines 133233 144230 +10997
Branches 20636 22061 +1425
============================================
+ Hits 82274 92115 +9841
- Misses 44911 45323 +412
- Partials 6048 6792 +744
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
8c43e8d
to
1290317
Compare
ca8599d
to
ecb6208
Compare
int size = positionOffsetInVariableBufferAndGetLength(rowId, colId); | ||
ByteBuffer buffer = _variableSizeData.slice(); | ||
buffer.limit(size); | ||
return MapUtils.deserializeMap(buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How expensive will be deserializeMap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be expensive for general map conversion and it's ok to be slow for select the whole map column.
The major purpose of introducing MAP type is to support more indexing for filtering and aggregation.
@Override | ||
public Object convert(Object value, PinotDataType sourceType) { | ||
switch (sourceType) { | ||
case OBJECT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we adding Object Type as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Source type may not be recognized as MAP, it could be OBJECT as well.
@Override | ||
public BlockValSet getBlockValueSet(String[] paths) { | ||
// TODO: only support one level of path for now, e.g. `map.key` | ||
assert paths.length == 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to assert or throw exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This happens at query runtime, so it will only fail the query execution
* Evaluates myMap['foo'] | ||
*/ | ||
public class ItemTransformFunction extends BaseTransformFunction { | ||
public static final String FUNCTION_NAME = "item"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why Name item?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (numRows == 0) { | ||
return values; | ||
} | ||
if (storedType == DataType.UNKNOWN) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Unknown type allowed during write? Should this be an exception?
if (_fieldSpec.isSingleValueField()) { | ||
_maxNumValuesPerMVEntry = -1; | ||
} else { | ||
_maxNumValuesPerMVEntry = columnMetadata.getMaxNumberOfMultiValues(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MultiValue MAP Column is not supported. What does this mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MAP is considered as a single value column.
} | ||
|
||
@Override | ||
public ColumnMetadata getKeyMetadata(String key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of these methods return any metadata value. Is that expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they are not implemented fully, so ignore.
} | ||
|
||
@Override | ||
public boolean isSingleValue() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
* @param context Reader context | ||
* @return MAP type single-value at the given document id | ||
*/ | ||
default Map<String, Object> getMap(int docId, T context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What indexes are supported on MAP type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not yet implemented
|
||
// Besides the value bytes, we store: size, length for each key, length for each value | ||
long bufferSize = (1 + 2 * (long) size) * Integer.BYTES; | ||
byte[][] keyBytesArray = new byte[size][]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for very large map will these become performance bottlenecks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes and expected.
d13b0f5
to
0bd5fa8
Compare
0bd5fa8
to
b59e4d8
Compare
MapItem
functionintMap['key']
to extract the map value from a given key.Sample schema:
Sample query 1: Selection only
Sample response:
Sample query 2: Selection order-by
Sample response:
Sample query 3: Aggregation only
Sample response:
Sample query 4: Aggregation group-by
Sample response:
Sample query 5: Filter
Sample response:
Sample query 6: Filter on non-existing key
Sample response:
Sample query 7: Select non-existing key
Sample response: