Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-32724 Allow storage planes to be defined in bare-metal systems #19234

Merged
merged 1 commit into from
Dec 4, 2024

Conversation

jackdelv
Copy link
Contributor

@jackdelv jackdelv commented Oct 24, 2024

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Copy link

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32724

Jirabot Action Result:
Assigning user: jack.delvecchio@lexisnexisrisk.com
Workflow Transition To: Merge Pending
Updated PR

@jackdelv jackdelv requested a review from ghalliday October 24, 2024 20:27
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look reasonable. Need to separate out the different types of changes though.

//printYAML(storage);
// Add groups defined in environment.xml
Owned<IPropertyTree> env = getHPCCEnvironment();
Owned<IPropertyTreeIterator> storagePlanes = env->getElements("Software/Globals/StoragePlaneList/StoragePlane[@name='*']");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor:I would use "Storage" (or "StoragePlanes" rather than "StoragePlaneList"

Also, no need to have a qualifier on the name:

Owned<IPropertyTreeIterator> storagePlanes = env->getElements("Software/Globals/StoragePlanes/StoragePlane");

I am not sure about the benefits of accessing via the environment, or by copying the data to the component configuration file and accessing from there.
@jakesmith opinions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard approach is to generate into the component config file (in this case daliconf.xml).
We do this already for other components (dfuserver, esp.. ), see e.g. initfiles/componentfiles/configxml/dfuserver.xsl. At the moment it does this for RemoteStorage only in those places.

should be generated in the same way into those configs too, and Dali (which doesn't need RemoteStorage) should generate only.

One caveat to that is that the above code only creates planes for existing groups in Dali (that correspond to engine groups in BM) if "planes" doesn't exist, but it will do after generating it into the config..

I suggest changing the createStoragePlane calls, so that in this context it adds a flag to the planes being added, to indicate they are based on groups (e.g. @fromGroup=true), and then change the start of this process to:

    if (createPlanesFromGroups)
    {
        // delete all [@fromGroup=true] planes 
        
        // existing code that constructs groups and adds planes for groups


...
     }

NB: this routine can be recalled if/when environment.xml changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied storage/planes definitions into component config files which allowed a group to be created for my storage plane.

I changed createStoragePlane to add the flag fromGroup to the planes created from it. Then at the beginning of the process changed it to remove all storage planes previously created by createStoragePlane by removing any with fromGroup=true. I also had to add a check at the beginning of createStoragePlanes for a storage plane that is already defined (this would be coming from the component config file) otherwise it would create a duplicate planes.

@@ -20,6 +20,20 @@
storageWrites="0.0038"/>
</Hardware>
<Software>
<Globals noCommon="true"
maxConnections="-1">
<StoragePlane name="storagePlane"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't contain StoragePlaneList

Also, I suspect these should not be checked in as part of the PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed initfiles changes from PR.

@@ -459,12 +459,15 @@ AzureFile::AzureFile(const char *_azureFileName) : fullName(_azureFileName)

StringBuffer planeName(slash-filename, filename);
Owned<IPropertyTree> plane = getStoragePlane(planeName);
if (!plane)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be in a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to HPCC-32723 on my local repo.

const char * api = plane->queryProp("storageapi/@type");
if (!api)
throw makeStringExceptionV(99, "No storage api defined for plane %s", planeName.str());

constexpr size_t lenPrefix = strlen(azureBlobPrefix);
if ((strncmp(api, azureBlobPrefix, lenPrefix-1) != 0) || api[lenPrefix-1] != ':')
if ((strncmp(api, azureBlobPrefix, lenPrefix-1) != 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the extra test should have been

api[lenPrefix-1] == 0

so that it doesn't match azureblobv2

possibly better would be to define a string without the trailing colon and use strieq.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use strieq.

@jackdelv
Copy link
Contributor Author

@ghalliday I have removed changes unrelated to adding support for storage planes in bare-metal. The working code and the commits based on your comments can be found on branch HPCC-32723 in my forked repo.

@jackdelv jackdelv requested a review from ghalliday October 25, 2024 14:14
@ghalliday
Copy link
Member

@jackdelv please convert to normal PR
@jakesmith can you foresee any problems with this?

@ghalliday ghalliday requested a review from jakesmith November 4, 2024 09:37
@jackdelv jackdelv marked this pull request as ready for review November 4, 2024 12:08
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jackdelv - please see comments

@jackdelv jackdelv requested a review from jakesmith November 12, 2024 21:14
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jackdelv - looks good. Please address 1 trivial comment and squash.

{
// Remove old planes created from groups
while (storage->removeProp("planes[@fromGroup=\"1\"]"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: for consistency match with true vs 1, and can use single quotes for readability:
while (storage->removeProp("planes[@fromGroup='true']"));

@jackdelv
Copy link
Contributor Author

@jakesmith Squashed. I'm looking into why some of the smoketests are failing.

@ghalliday
Copy link
Member

@jackdelv tests are still failing. Did Jakes suggestion of changing from ='1' to ='true' break it?

Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests currently failing

@jackdelv
Copy link
Contributor Author

jackdelv commented Nov 21, 2024

@ghalliday It seems that may have been the issue. I will wait for the tests to run to be sure since I'm still confused why it was happening. When I run this code for connecting to my azure blob storage it seemed to be working on roxie. It was hitting an unimplemented error rather than the missing host group exception regardless of if '1' or 'true' is passed in.

UNIMPLEMENTED feature [AzureFile::copyTo] in function copyTo() at azurefile.cpp(246)

in := DATASET('~plane::jackdelvstorageplanetest::taxi-blob::taxi_tripdata.csv', layout, CSV(HEADING(1)));
out := CHOOSEN(in,1000);
OUTPUT(out);
OUTPUT(out,,'~plane::jackdelvstorageplanetest::taxi-blob::taxi_tripdata_copy.csv',CSV, PLANE('jackdelvstorageplanetest'), OVERWRITE);
in2 := DATASET('~plane::jackdelvstorageplanetest::taxi-blob::taxi_tripdata_copy.csv', layout, CSV);
OUTPUT(in2,,'~mytaxi::taxi_tripdata_copy.csv',CSV, PLANE('jackdelvstorageplanetest'), OVERWRITE);

I tested with testing/regress/payload.ecl and was getting the same host group error with 'true'. Changing to '1' seemed to fix the error.

EXCEPTION: No entry found for hostGroup: 'mythor' (in Index Read 2)

Although, when the test suite reported the error, it was looking for myroxie, but I could only ever reproduce with the mythor host group.

@jackdelv jackdelv requested a review from ghalliday November 21, 2024 14:48
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jackdelv please squash.

{
// Remove old planes created from groups
while (storage->removeProp("planes[@fromGroup='1']"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor style: For loops with empty bodies it is clearer to code as the following:

while (storage->removeProp("planes[@fromGroup='1']"))
{
}

because it makes it clear that the body is intentionally blank. No need to change for this PR since it matches other examples.

Although having said that there should be a function that avoids the need for the loop. @jakesmith Added HPCC-33048.

@@ -3498,6 +3498,11 @@ bool GroupInformation::checkIsSubset(const GroupInformation & other)

void GroupInformation::createStoragePlane(IPropertyTree * storage, unsigned copy) const
{
// Check that storage plane does not already have a definition
VStringBuffer xpath("planes[@name='%s']", name.str());
if (storage->hasProp(xpath))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a future PR it might be interesting to allow some attributes for the implicit storage planes to be set in the configuration - so merge the definitions. For example to be able to set expert options on thor storage planes, even if only for testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and have opened this JIRA.

- Copy any storage plane definitions from environment.xml Globals into component config files
- Change createStoragePlane to exit if the storage plane already exists (this is after any previous storage planes created with createStoragePlane are removed)
- Add fromGroup=true flag to storage planes created from groups

Signed-off-by: Jack Del Vecchio <jack.delvecchio@lexisnexisrisk.com>
@jackdelv
Copy link
Contributor Author

jackdelv commented Dec 2, 2024

@ghalliday Squashed.

@ghalliday ghalliday merged commit a8b1971 into hpcc-systems:master Dec 4, 2024
48 of 50 checks passed
Copy link

github-actions bot commented Dec 4, 2024

Jirabot Action Result:
Added fix version: 9.10.0
Workflow Transition: 'Resolve issue'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants