Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-30809 Add new ESP service WsSasha #18018

Closed
wants to merge 2 commits into from

Conversation

wangkx
Copy link
Member

@wangkx wangkx commented Nov 10, 2023

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Signed-off-by: wangkx <kevin.wang@lexisnexis.com>
@wangkx wangkx closed this Nov 10, 2023
@wangkx wangkx reopened this Nov 10, 2023
1. Create the sashaclilib which contains the code shared by
both Sasha CLI and WsSasha.
2. Inside the sashaclilib, create the runSashaCommand() which
is based on the existing code inside sasha.cpp's main() from
building ISashaCommand to reading the command response.
3. Move the following from sasha.cpp to sashacli.cpp, so that
they can be used by the runSashaCommand(): extractTimings,
getNum(), DumpWorkunitTimings(), getVersion().
4. Implement 5 WsSasha methods: GetVersion, ListWU, ArchiveWU,
RestoreWU, and XREF.

Signed-off-by: wangkx <kevin.wang@lexisnexis.com>
@wangkx
Copy link
Member Author

wangkx commented Nov 10, 2023

@jakesmith The PR now has 2 commits for easy to review. I may squash them when needed. Please check this PR whether it is on correct direction or not. Please also comment on the questions and comments in the ecm file and other files. Some of comments may be removed before merging.

@wangkx wangkx changed the title H30809 Add new ESP service WsSasha HPCC-30809 Add new ESP service WsSasha Nov 13, 2023
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - I have scanned over both commits.

1st commit looks good framework.
2nd commit, although not all of the functions, it highlights some issues.

I think for now, the scope of this should be further reduced, so we can get something past review and merged (bit size is best), then expand on it.

With that in mind, I'd keep the 1st commit, rollback the 2nd commit, and add an extremely limited service for now, e.g. that just handles SCA_GETVERSION for example.
Keep sasha.cpp more or less exactly as it was, but conditionally call runSashaCommand for the select command (e.g. SCA_GETVERSION).
In the short term, that may mean there's some duplication between the legacy sasha.cpp and saruncmd.cpp.

This PR and review can then focus on the new service and the impl. of runSashaCommand etc. - subsequent PR's can start moving more functionality to the service.


using namespace ws_sasha;

constexpr const char* FEATURE_URL = "SashaAccess";

ISashaCommand* CWSSashaEx::setSashaCommand(SashaCommandAction action, bool dfu, SocketEndpoint& serverep)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why setXXX , it is a createXX method ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not use a createXX method because the createSashaCommand() (defined in sacmd.hpp) is called in line 34. I may rename this one to createAndSetSashaCommand.


#include "dautils.hpp"

extern SASHACLI_API void runSashaCommand(SocketEndpoint ep, ISashaCommand *cmd, IFileIOStream *outfile, StringBuffer &outBuffer, bool viaESP);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SocketEndpoint ep

don't want a copy - should probably be const reference

else
msg.append("archived");
msg.append(", Continue (Y/N)");
if (!confirm(msg.str()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this likely need to change..
Anything that it is confirming to proceed, is running 2 commands (but it's wrapped at the moment into a single runSashaCommand).

Esp and the command line will likely handle asking to proceed in very different ways.
I think this logic should be split so runSashaCommand only ever runs 1 command, and in the cases of commands that want confirmation and then optionally run another, that logic is elsewhere, e.g. in the command line tool, or eclwatch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly move out of the runSashaCommand()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and issue each command separately, i.e. split the bit after the confirm into a separate call to runSashaCommand

PROGLOG("Sasha server[%s]: Version %s",host.str(),id.str());
return;
}
if (viaESP)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any of this common code should have/switch on viaESP.
Instead, it should behave the same, e.g. throw an exception, the caller should handle it as it wants to, e.g. in case of CLI - log it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is no longer really "cli" .. since it's used for sasha command handling in general / by anything that wants to run command on the Sasha service.
Perhaps rename saruncmd.cpp


//ESPmethod [auth_feature("SashaAccess:READ")] ListDT(ListDTRequest, ResultResponse);
// By reading the code, looks like: it lists ECL WUs or DFU WUs with date time (when?) in Archive server.
// Should it be merged into the ListWU as an option (IncludeDateTime)?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably, and related to prev. comment, if confirmed, then it would be fine to merge into single ecm request type with options, but I would not change the handling code that is moving from sasha.cpp into the service - or change as little as possible.

ESPmethod [auth_feature("SashaAccess:FULL")] RestoreWU(RestoreWURequest, ResultResponse); //restore ECL WUs or DFU WUs

//ESPmethod [auth_feature("SashaAccess:FULL")] BackupWU(BackupWURequest, ResultResponse); //backup one ECL WU
// Only backup an ECL WU (not DFU WU). How does it work? How to test? How about implementing in separate PR?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afaik, backing up/restoring dfu workunits is not supported (should be, but less of an issue because they are always tiny compared to ECL workunits).

How to test?
Not sure what you mean - you should be able to test backing up and restoring a ECL workunit manually, via the CLI, (and restore via Eclwatch)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should ask: what is the difference between Archive and Backup?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are the same, except Archive deletes the workunit from Dali as well, whereas Backup leave it in place.

ESPmethod [auth_feature("SashaAccess:FULL")] XREF(XREFRequest, ResultResponse); //generate XREF for clusters

//ESPmethod [auth_feature("SashaAccess:FULL")] Stop(StopRequest, ResultResponse);
// Stop what? How does it work? How to test? How about implementing in separate PR?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should perhaps be a Stop or Restart request.

How to test?

Not sure what you mean using existing CLI : sasha action=stop

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Stop here matches with the existing CLI : sasha action=stop. Looks like that a Stop action (SCA_STOP) terminates a Sasha server. I haven't seen a Restart function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't a restart, I was suggesting there perhaps should be (it is prob. more useful and a more common thing to want than to stop) - but don't add now, let's keep changes to a minimum.

// Stop what? How does it work? How to test? How about implementing in separate PR?

//Do we need the following methods? They are not in the Sasha Usage, but, in Sasha code. If yes, how about implementing in separate PRs?
//SCA_WORKUNIT_SERVICES_GET ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think needed by CLI or service, but still needed, i.e. components (like esp) that interact with Sasha directly, should still continue to allow this command to be created and sent to Sasha directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the SCA_WORKUNIT_SERVICES_GET (as well as coalesce_suspend and coalesce_resume) do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SCA_WORKUNIT_SERVICES_GET

I can't recall exactly, but I don't think this new service needs to support it for now, it will not be something wanted by the CLI or UI, but instead used by other esp services that talk directly to sasha (afaics only used by wsWorkunitList atm).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coalesce_suspend and coalesce_resume)

One of sasha's function is to coalesce the dali store (aggregate the current store + in flight deltas into a new store file).
It is resource intensive, and I think the idea was to pause it, so other resource intensive sasha services like XREF, could have more capacity.
In cloud, I don't think coalesce_suspend/coalesce_resume make much sense, since each sasha service is in it's own dedicated container.

//Do we need the following methods? They are not in the Sasha Usage, but, in Sasha code. If yes, how about implementing in separate PRs?
//SCA_WORKUNIT_SERVICES_GET ?
//coalesce_suspend ?
//coalesce_resume ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be added, but later.

@wangkx
Copy link
Member Author

wangkx commented Jan 24, 2024

This draft PR is not needed.

@wangkx wangkx closed this Jan 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants