Component and dataset upgrades #524

jgalan · 2024-06-12T14:44:18Z

Added new methods Range and ApplyRange to TRestDataSet. They define a subset range of entries from the dataset. ApplyRange also updates the internal data frame. Range only returns an RDF::Node restricted to the specified range and leaves the internal data frame unmodified.
Fixed an issue in ExtractParametricNodes that appears when the dataset is very large, of the order of 1500M entries. The new fSplitEntries=600,000 divides the node extraction into several steps. See also the ROOT forum entry: https://root-forum.cern.ch/t/problem-with-large-number-of-entries-inside-rdataframe/59632
TRestComponentDataSet::fDFRange data member added. It controls the range of dataset entries used to generate the component.
In a component we can re-scale the distribution using weights. Until now each weight had to be a column from the data frame; it is now also possible to write a constant.
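
The splitting of the node extraction into several steps, as described for fSplitEntries above, can be pictured with a minimal sketch. It is not the code from this PR: `SplitRanges` is a hypothetical helper, and the assumption is that the entries are processed in consecutive half-open ranges of at most `splitEntries` entries each.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not part of REST): split [0, totalEntries) into
// consecutive half-open ranges [begin, end) of at most splitEntries entries.
std::vector<std::pair<std::uint64_t, std::uint64_t>> SplitRanges(std::uint64_t totalEntries,
                                                                 std::uint64_t splitEntries) {
    assert(splitEntries > 0);  // a zero step would never advance

    std::vector<std::pair<std::uint64_t, std::uint64_t>> ranges;
    for (std::uint64_t begin = 0; begin < totalEntries; begin += splitEntries) {
        // The last range is shortened so that it does not run past the dataset.
        const std::uint64_t end = std::min(begin + splitEntries, totalEntries);
        ranges.emplace_back(begin, end);
    }
    return ranges;
}
```

Each pair could then be used to restrict the data frame to one step at a time, for instance through the RDataFrame `Range(begin, end)` transformation, so that no single step has to cover the full dataset.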

In the following weights definition, the distribution will be built with 2*NGamma, where NGamma is the rate contribution from each event.

<parameter name="weights" value="{NGamma,2}"/>
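
How a weights list mixing a column name and a constant turns into a per-event weight can be sketched as follows. This is only an illustration under stated assumptions, not the parsing done in TRestComponentDataSet: `EventWeight` is a hypothetical function, the event is represented as a plain map from column name to value, entries that parse fully as numbers are treated as constants, and the list is assumed to contain no whitespace.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of REST): the weight of one event is the
// product of all entries in a list such as "{NGamma,2}". Numeric entries are
// constants; any other entry is looked up as a column value of the event.
double EventWeight(const std::string& weights, const std::map<std::string, double>& event) {
    assert(weights.size() >= 2 && weights.front() == '{' && weights.back() == '}');

    // Drop the surrounding braces and walk over the comma-separated entries.
    std::stringstream entries(weights.substr(1, weights.size() - 2));
    std::string token;
    double weight = 1.0;
    while (std::getline(entries, token, ',')) {
        char* end = nullptr;
        const double constant = std::strtod(token.c_str(), &end);
        const bool isNumber = !token.empty() && end == token.c_str() + token.size();
        // event.at() throws std::out_of_range if the column does not exist.
        weight *= isNumber ? constant : event.at(token);
    }
    return weight;
}
```

With an event whose NGamma value is 1.5, `EventWeight("{NGamma,2}", event)` evaluates 1.5 * 2, which matches the 2*NGamma scaling described above.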


jgalan and others added 5 commits June 12, 2024 13:11

TRestDataSet::fDataSet renamed to fDataFrame. Added methods Range, ApplyRange and GetEntries

f67f748

TRestComponentDataSet::fSplitEntries added to split ExtractParameterizationNodes

72d1dee

TRestDataSet::RegenerateTree method added and used when invoking ApplyRange

f4c1d67

TRestComponentDataSet::fDFRange added to allow reducing the component dataset statistics

c978dba

[pre-commit.ci] auto fixes from pre-commit.com hooks. For more information, see https://pre-commit.ci

fe6aad7

jgalan self-assigned this Jun 12, 2024

jgalan requested review from mariajmz, AlvaroEzq, lobis and juanangp June 12, 2024 14:48

jgalan marked this pull request as ready for review June 13, 2024 16:16

jgalan requested a review from nkx111 as a code owner June 13, 2024 16:16

jgalan requested a review from a team June 13, 2024 16:16

mariajmz approved these changes Jun 13, 2024

AlvaroEzq approved these changes Jun 13, 2024

jgalan merged commit 9591b5a into master Jun 14, 2024
64 checks passed

jgalan deleted the jgalan_dataset_updates branch June 14, 2024 06:46
