Config error handling updates #74

RRosio · 2025-02-03T08:24:09Z

Improved config file validation
More robust error handling for configuration file creation/parsing errors
- Improved logging and propagation of errors
Tests for config file errors

martindurant

Meta comment: I think the file-manager class should not feel like an API handler, but like a normal Python class. I mean, it should allow exceptions to propagate back instead of returning JSON-like results such as {"success": True, ...}. Those are appropriate for returning to a REST client, but not for calling from other Python code.

Instead, I would code handle_exception as a context manager, so that it appears only in the handler class:

with handle_exception(default_response=...):
    self.file_manager.do_operation(...)

In fact, the default response is always "failed", so no need to keep writing that out.

Furthermore, there is an argument that more information about the exception should be returned to the client, for the future case that the client is some python process (or actually a useful message that might be shown to the user in the jlab UI).
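To illustrate the "normal Python class" point, here is a minimal sketch. `ConfigManager` and its `load` method are hypothetical names, not the project's actual API. The manager does not wrap anything in a status dict, so a Python caller uses ordinary `try`/`except`.

```python
import os
import tempfile


class ConfigManager:
    """Hypothetical manager that behaves like a normal Python class."""

    def __init__(self, path):
        self.path = path

    def load(self):
        # No try/except and no {"success": ...} dict: errors simply propagate.
        with open(self.path) as f:
            return f.read()


# A Python caller handles failure with ordinary exception handling.
with tempfile.TemporaryDirectory() as d:
    missing = os.path.join(d, "jupyter-fsspec.yaml")  # never created
    try:
        ConfigManager(missing).load()
        outcome = "loaded"
    except FileNotFoundError as e:
        outcome = f"config missing: {e.filename}"
```

Only the HTTP handler layer would then need to translate such exceptions into a JSON response for a REST client.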

martindurant · 2025-02-03T15:44:19Z

jupyter_fsspec/file_manager.py

+
+class Config(BaseModel):
+    sources: List[Source]
+


I suggest adding a logging level here, maybe in a future PR.

martindurant · 2025-02-03T15:48:51Z

jupyter_fsspec/file_manager.py

+        config = self.config
+
+        if config == {}:
+            self.filesystems = new_filesystems


This block is not necessary. If config is empty, we'll simply not run the loop below.
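A small sketch of why the special case is redundant, assuming the filesystems are built by looping over the configured sources. `build_filesystems` and the dict layout are hypothetical.

```python
def build_filesystems(sources):
    """Hypothetical helper: map each source name to its path."""
    new_filesystems = {}
    for source in sources:  # zero iterations when sources is empty
        new_filesystems[source["name"]] = source["path"]
    return new_filesystems


empty = build_filesystems([])
populated = build_filesystems([{"name": "local", "path": "/tmp"}])
```

With an empty config the loop body never runs, so the result is already the empty mapping that the extra branch assigned by hand.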


martindurant · 2025-02-03T15:59:40Z

jupyter_fsspec/handlers.py

+            config = self.fs_manager.check_reload_config()
+
+            if (
+                not config.get("operation_success")


This repeats the "did it work?" flow from the manager (the lower-level exception), and makes an exception which you then immediately catch and re-raise with the original error message. I would say that exceptions are supposed to do this job without the repeated try levels.

martindurant · 2025-02-03T16:00:58Z

jupyter_fsspec/handlers.py

+                err_mgs = "FileNotFound"
+            else:
+                err_mgs = config["error"]
+
            self.set_status(404)


It's not a 404, but some 5xx code.

martindurant · 2025-02-03T16:03:01Z

jupyter_fsspec/tests/test_api.py

@@ -4,13 +4,50 @@
 # TODO: Testing: different file types, received expected errors


-async def test_get_config(jp_fetch):
+async def test_get_config(setup_config_file_fs, jp_fetch):


I am surprised not to see tests of the manager, but only tests of the HTTP API.

martindurant · 2025-02-03T16:03:27Z

jupyter_fsspec/tests/test_api.py

+        body["description"]
+        == "Retrieved available filesystems from configuration file."
+    )
+    assert body["content"] != []


assert body["content"]

or with len() is cleaner
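A quick illustration of the difference, using a made-up response body:

```python
body = {"content": [{"name": "mem", "protocol": "memory"}]}

assert body["content"]            # fails for [], None and "" alike
assert len(body["content"]) == 1  # additionally pins the expected count

# By contrast, a `!= []` comparison also passes for a missing value:
weak_check_passes = None != []
```

The truthiness form rejects every "nothing came back" case, while `len()` documents how many entries the test expects.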

martindurant · 2025-02-03T16:04:11Z

jupyter_fsspec/tests/test_api.py

+async def test_no_config(no_config_permission, jp_fetch):
+    with pytest.raises(HTTPClientError) as exc_info:
+        await jp_fetch("jupyter_fsspec", "config", method="GET")
+    assert exc_info.value.code == 404


Again, not 404, as that would imply that the URL endpoint was wrong.


RRosio · 2025-02-07T06:33:42Z

Thank you for your feedback @martindurant! I've implemented some of your suggestions, and I hope I interpreted them as you intended. So far I have the following:

- Set up the handler to process exceptions using a context manager that logs errors and provides detailed error messages in the response. Exceptions are now handled by the manager only during initialization.
- Updated the status codes in the server response and updated some asserts to use len(), as you suggested.
- Added tests for the manager class methods.

I think that my test fixtures would benefit from some consolidation, so that's something I'd like to revisit.

I appreciate your feedback! If you have any more recommendations for me, I would be happy to make further updates.

martindurant

I ended up going in a bit of a circle around what to do about the exceptions here, sorry! In the end, I leave it up to you which of the possible patterns to choose - but we should document where we expect to see error output (logs, terminal, client) and what information these ought to contain.

martindurant · 2025-02-10T14:31:46Z

jupyter_fsspec/file_manager.py

+            return config_content
+        except Exception as e:
+            if handle_errors:
+                logger.error(f"Error loading configuration file: {e}")


Did I understand that the log-and-continue mode done by this block only happens on initialisation?

Yes that is the current behavior!
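A minimal sketch of that log-and-continue switch. The `read` callable, the logger name, and the empty-dict fallback are assumptions made for illustration; only the `handle_errors` flag and the log message come from the diff above.

```python
import logging

logger = logging.getLogger("jupyter_fsspec_sketch")  # hypothetical logger name


def load_config(read, handle_errors=False):
    """Call read(); on failure either log and fall back, or re-raise."""
    try:
        return read()
    except Exception as e:
        if handle_errors:
            logger.error(f"Error loading configuration file: {e}")
            return {}  # assumed fallback: start with no filesystems
        raise


def broken_read():
    raise ValueError("bad YAML")


# At initialisation the error is logged and startup continues.
at_startup = load_config(broken_read, handle_errors=True)
```

Called without `handle_errors=True`, the same failure propagates to the caller, which is where a request handler can turn it into an error response.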


martindurant · 2025-02-10T14:34:50Z

jupyter_fsspec/file_manager.py


-    def _get_protocol_from_path(self, path):
+        if not os.access(config_dir, os.W_OK):
+            raise PermissionError(f"Config directory was not writable: {config_dir}")


Probably no need to check and raise here, the writing would raise exactly the same thing, perhaps with more specific details

Oh yes, in the browser what is received is "PermissionError: [Errno 13] Permission denied: '~/jupyter-fsspec.yaml'". I believe I just need to update the test to properly mock this behavior, since removing this check currently causes the test itself to fail.
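One way such a test could mock the failure without touching real file permissions is to patch `open` so that the write itself raises. This is only a sketch: `write_config_file` and `attempt_write` are hypothetical helpers, not the project's functions.

```python
import builtins
from unittest import mock


def write_config_file(path, text):
    """Hypothetical writer with no os.access() pre-check."""
    with open(path, "w") as f:
        f.write(text)


def attempt_write(path):
    """Return the error message a caller would see, or None on success."""
    denied = PermissionError(13, "Permission denied", path)
    # Every open() call inside this block raises the prepared error.
    with mock.patch.object(builtins, "open", side_effect=denied):
        try:
            write_config_file(path, "sources: []\n")
        except PermissionError as e:
            return str(e)
    return None


message = attempt_write("/fake/jupyter-fsspec.yaml")
```

In a pytest suite the `try`/`except` would typically become `with pytest.raises(PermissionError):`.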


martindurant · 2025-02-10T14:36:43Z

jupyter_fsspec/file_manager.py

+        logger.info(f"Configuration file created at {config_path}")
+        return
+
+    def create_config_file(self):


What's the difference between the "create" and "write" config file methods?

Right, I should consolidate that into one!

martindurant · 2025-02-10T14:47:12Z

jupyter_fsspec/file_manager.py

+                "instance": fs,
+                "name": fs_name,
+                "protocol": fs_protocol,
+                "path": fs._strip_protocol(fs_path),


It occurs to me that we maybe want to make an explicit test case with caching, "simplecache::s3://bucket/path" to see if that works.

martindurant · 2025-02-10T14:48:19Z

jupyter_fsspec/file_manager.py

@@ -214,5 +227,6 @@ def get_filesystem_by_protocol(self, fs_protocol):

    def get_filesystem_protocol(self, key):
        filesystem_rep = self.filesystems.get(key)
-        print(f"filesystem_rep: {filesystem_rep}")
+        if not filesystem_rep:


Just allow the KeyError to happen?
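For illustration, a sketch of letting the lookup fail naturally. The module-level `filesystems` dict stands in for the manager's attribute and is hypothetical.

```python
filesystems = {"local": {"protocol": "file"}}


def get_filesystem_protocol(key):
    # Indexing raises KeyError(key) for an unknown key; no explicit check needed.
    return filesystems[key]["protocol"]


known = get_filesystem_protocol("local")
```

The `KeyError` already names the missing key, so a caller or an exception-handling wrapper gets the relevant detail for free.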

martindurant · 2025-02-10T14:53:34Z

jupyter_fsspec/handlers.py

+def handle_exception(handler, status_code=500):
+    try:
+        yield
+    except yaml.YAMLError as e:


Do we really want to specialise for each error type? The underlying error message should have all the information needed: str(e), str(e.__dict__).

Instead, I would suggest that the function signature lets you specify which exceptions to watch for, e.g. handle_exception(handler, status_code=500, exceptions=(Exception,)), and then you only need a single except clause over those exceptions to express the things you expect might go wrong. Perhaps you could also allow a "default message" to be passed in, which is what you were doing before.
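A sketch of what such a signature might look like. `StubHandler` is a hypothetical stand-in for the real request handler, and `default_msg` is an assumed parameter name.

```python
from contextlib import contextmanager


class StubHandler:
    """Hypothetical stand-in that records what a real handler would send."""

    def __init__(self):
        self.status, self.body = 200, None

    def set_status(self, code):
        self.status = code

    def write(self, payload):
        self.body = payload


@contextmanager
def handle_exception(handler, status_code=500, exceptions=(Exception,),
                     default_msg="Error processing request"):
    try:
        yield
    except exceptions as e:  # only the listed types are converted
        detail = str(e) or default_msg
        handler.set_status(status_code)
        handler.write({"status": "failed", "description": detail, "content": []})


h = StubHandler()
with handle_exception(h, exceptions=(OSError,)):
    raise PermissionError(13, "Permission denied", "cfg.yaml")

# An exception type that was not listed propagates unchanged.
try:
    with handle_exception(StubHandler(), exceptions=(OSError,)):
        raise ValueError("not an OSError")
    propagated = False
except ValueError:
    propagated = True
```

The single `except` clause replaces one branch per error type, and the message comes from the exception itself unless it is empty.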

martindurant · 2025-02-10T15:08:59Z

jupyter_fsspec/handlers.py

+        handler.write({"status": "failed", "description": error_message, "content": []})
+
+        handler.finish()
+        raise ConfigFileException


I gather this is only used right now in handling the config. But it can be more flexible and perhaps used in other places in the handler classes.

Is the reason to reraise so that the default exit (which attempts to write results) never executes? I suppose this will result in a traceback in the console, in which the actual exception doesn't appear, except through "during the handling of exception another exception happened" and any log lines (which appear separately).

Perhaps a cleaner construct would be something like

try:
    yield
except Exceptions:
    handler.set_status(...)
    ...
else:
    return
raise HandlerError("Something went wrong, see logs")

~~This keeps the console quiet~~
I am wrong, the original exception is still written out! I am not sure, then, how to silence it, but maybe this isn't so important after all.

martindurant · 2025-02-10T15:23:00Z

jupyter_fsspec/handlers.py

-            self.finish()
+            with handle_exception(self):
+                self.fs_manager.check_reload_config()
+        except ConfigFileException:


OK, so this is how you silenced things.
I suppose this is equivalent to checking self._finished (I see no official attribute/property for this) OR including the whole of the writing inside the context block.
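The second alternative, putting the success write inside the context block, might look like the sketch below. `StubHandler`, `get_config`, and the response fields are hypothetical. Because the success write is skipped whenever the load raises, no re-raise is needed to prevent a double write.

```python
from contextlib import contextmanager


class StubHandler:
    """Hypothetical stand-in that records each write() call."""

    def __init__(self):
        self.status, self.writes = 200, []

    def set_status(self, code):
        self.status = code

    def write(self, payload):
        self.writes.append(payload)


@contextmanager
def handle_exception(handler, status_code=500):
    try:
        yield
    except Exception as e:
        handler.set_status(status_code)
        handler.write({"status": "failed", "description": str(e), "content": []})


def get_config(handler, load):
    with handle_exception(handler):
        content = load()
        # Skipped automatically when load() raises, so nothing is written twice.
        handler.write({"status": "success", "description": "", "content": content})


def failing_load():
    raise OSError("cannot read config")


ok, bad = StubHandler(), StubHandler()
get_config(ok, lambda: [{"name": "local"}])
get_config(bad, failing_load)
```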


RRosio · 2025-02-18T23:43:35Z

I have opened up #78 and #79 as follow-ups to this PR. I tried addressing the advice and suggestions above. So maybe this is at a state where it can be merged and iterated on?
cc @martindurant and @ericsnekbytes for additional comments.

RRosio · 2025-02-20T00:02:18Z

I will go ahead and merge this! Thank you again Martin for your thorough reviews!

RRosio added 3 commits January 31, 2025 23:45

add test checks

d6148c9

add robust error handling for config

2664f69

added more config tests, updated handler check

10fa713

RRosio added the bug Something isn't working label Feb 3, 2025

martindurant reviewed Feb 3, 2025


RRosio added 5 commits February 4, 2025 22:21

make initial changes from review, remove code unnecessary and update server status codes

83de5e1

updated handling exceptions

6d4908a

clean up exception handling

74b6de0

updated and added filesystem manager tests

b3e0116

back to info log level

a1002aa

martindurant reviewed Feb 10, 2025


RRosio mentioned this pull request Feb 12, 2025

[WIP] Jupyter-Fsspec Roadmap #75

Open

17 tasks

RRosio and others added 9 commits February 14, 2025 00:43

Apply suggestions from code review

029fad6

Co-authored-by: Martin Durant <martindurant@users.noreply.github.com>

create and write config file into one function

cab7a0f

remove unused code

28cf788

revised async fs handling

e1328cd

use pydantic model for source object access

90a399a

allow key error to happen

afb2b1d

update to general exception handling

77cc97f

handle permission errors

267d91a

remove permission check and update test

4c3cb71

This was referenced Feb 18, 2025

Caching Test #78

Open

Add Logging level to Pydantic Model #79

Open

RRosio marked this pull request as ready for review February 18, 2025 23:43

RRosio merged commit 0860388 into fsspec:main Feb 20, 2025
7 checks passed
