-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Config error handling updates #74
Conversation
RRosio
commented
Feb 3, 2025
- Improved config file validation
- More robust error handling for configuration file creation/parsing errors
- Improved logging and propagation of errors
- Tests for config file errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meta comment: I think the file-manager class should not feel like an API handler, but like a normal python class. I mean, it should allow exceptions to propegate back instead of returning {"success": True, ...
JSON like results. Those are appropriate for returning to a REST client, but not for calling from other python code.
Instead, the handle_exception I would code as a context manager so that it appears only in the handler class.
with handle_exception(default_response=...):
self.file_manager.do_operation(...)
In fact, the default response is always "failed", so no need to keep writing that out.
Furthermore, there is an argument that more information about the exception should be returned to the client, for the future case that the client is some python process (or actually a useful message that might be shown to the user in the jlab UI).
|
||
class Config(BaseModel): | ||
sources: List[Source] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest logging level to be added here, maybe in a future PR.
jupyter_fsspec/file_manager.py
Outdated
config = self.config | ||
|
||
if config == {}: | ||
self.filesystems = new_filesystems |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block is not necessary. If config is empty, we'll simply not run the loop below.
jupyter_fsspec/handlers.py
Outdated
config = self.fs_manager.check_reload_config() | ||
|
||
if ( | ||
not config.get("operation_success") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This repeats the "did it work?" flow from the manager (lower exception), makes an exception with you then immediately catch and reraise with with original error message. I would say that exceptions are supposed to do this job without the repeated try
levels.
jupyter_fsspec/handlers.py
Outdated
err_mgs = "FileNotFound" | ||
else: | ||
err_mgs = config["error"] | ||
|
||
self.set_status(404) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a 404, but some 5xx code.
@@ -4,13 +4,50 @@ | |||
# TODO: Testing: different file types, received expected errors | |||
|
|||
|
|||
async def test_get_config(jp_fetch): | |||
async def test_get_config(setup_config_file_fs, jp_fetch): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised not to see tests of the manager, but only tests of the HTTP API.
jupyter_fsspec/tests/test_api.py
Outdated
body["description"] | ||
== "Retrieved available filesystems from configuration file." | ||
) | ||
assert body["content"] != [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert body["content"]
or with len()
is cleaner
jupyter_fsspec/tests/test_api.py
Outdated
async def test_no_config(no_config_permission, jp_fetch): | ||
with pytest.raises(HTTPClientError) as exc_info: | ||
await jp_fetch("jupyter_fsspec", "config", method="GET") | ||
assert exc_info.value.code == 404 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, not 404, as that would imply that the URL endpoint was wrong.
Thank you for your feedback @martindurant! I've implemented some of your suggestions, and I hope I interpreted them them as you intended. So far I have the following:
I think that my test fixtures would benefit from some consolidation, so that's something I'd like to revisit. I appreciate your feedback! If you have any more recommendations for me, I would be happy to make further updates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up going in a bit of a circle around what to do about the exceptions here, sorry! In the end, I leave it up to you which of the possible patterns to choose - but we should document where we expect to see error output (logs, terminal, client) and what information these ought to contain.
return config_content | ||
except Exception as e: | ||
if handle_errors: | ||
logger.error(f"Error loading configuration file: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did I understand that the log-and-continue mode done by this block only happens on initialisation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is the current behavior!
jupyter_fsspec/file_manager.py
Outdated
|
||
def _get_protocol_from_path(self, path): | ||
if not os.access(config_dir, os.W_OK): | ||
raise PermissionError(f"Config directory was not writable: {config_dir}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably no need to check and raise here, the writing would raise exactly the same thing, perhaps with more specific details
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, on the browser it's a "PermissionError: [Errno 13] Permission denied: '~/jupyter-fsspec.yaml'
that is received. I believe I just need to update the test to properly mock this behavior since currently removing the this check does cause the test itself to fail.
jupyter_fsspec/file_manager.py
Outdated
logger.info(f"Configuration file created at {config_path}") | ||
return | ||
|
||
def create_config_file(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whats the difference between "create" and "write" config file methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I should consolidate that into one!
"instance": fs, | ||
"name": fs_name, | ||
"protocol": fs_protocol, | ||
"path": fs._strip_protocol(fs_path), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It occurs to me that we maybe want to make an explicit test case with caching, "simplecache::s3://bucket/path" to see if that works.
jupyter_fsspec/file_manager.py
Outdated
@@ -214,5 +227,6 @@ def get_filesystem_by_protocol(self, fs_protocol): | |||
|
|||
def get_filesystem_protocol(self, key): | |||
filesystem_rep = self.filesystems.get(key) | |||
print(f"filesystem_rep: {filesystem_rep}") | |||
if not filesystem_rep: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just allow the KeyError to happen?
jupyter_fsspec/handlers.py
Outdated
def handle_exception(handler, status_code=500): | ||
try: | ||
yield | ||
except yaml.YAMLError as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to specialise for each error type? The underlying error message should have all the information needed, str(e)
, str(e.__dict__)
.
Instead, I would suggest that the function signature allows you to specify what exceptions to watch for handle_exception(handler, status_code=500, exceptions=(Exception, ))
and then you only need except Exceptions:
to express the things you expect might go wrong. Perhaps you could also allow for the "default message" to be passed in, which is what you were doing before.
handler.write({"status": "failed", "description": error_message, "content": []}) | ||
|
||
handler.finish() | ||
raise ConfigFileException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gather this is only used right now in handling the config. But it can be more flexible and perhaps used in other places in the handler classes.
Is the reason to reraise so that the default exit (which attempts to write results) never executes? I suppose this will result in a traceback in the console, in which the actual exception doesn't appear, except through "during the handling of exception another exception happened" and any log lines (which appear separately).
Perhaps a cleaner construct would be something like
try:
yield
except Exceptions:
handler.set_status(...)
...
else:
return
raise HandlerError("Something went wrong, see logs")
This keeps the console quiet
I am wrong, the original exception is still written out! I am not sure, then, how to silence it, but maybe this isn't so important after all.
self.finish() | ||
with handle_exception(self): | ||
self.fs_manager.check_reload_config() | ||
except ConfigFileException: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so this is how you silenced things.
I suppose this is equivalent to checking self._finished
(I see no official attribute/property for this) OR including the whole of the writing inside the context block.
Co-authored-by: Martin Durant <martindurant@users.noreply.github.com>
I have opened up #78 and #79 as follow-ups to this PR. I tried addressing the advice and suggestions above. So maybe this is at a state where it can be merged and iterated on? |
I will go ahead and merge this! Thank you again Martin for your thorough reviews! |