write_your_own_fetch_provider.md

Fetch Providers

FetchProviders are the components OPAL uses to fetch data from sources on demand.

Fetch providers are designed to be extensible: you can create additional fetch providers so that OPAL can fetch data from your own sources (e.g. a SaaS service, a new DB, or your own proprietary solution).

Writing your own fetch providers

  1. The basics

    All FetchProviders are classes that derive from BaseFetchProvider. FetchProviders are loaded into the fetcher-register from the Python modules that the configuration points to (OPAL_FETCH_PROVIDER_MODULES).

    • Providers use the FetcherEvent stored in self._event, and specifically the FetcherConfig configuration in self._event.config.
    • Providers implement a _fetch_ method to access and fetch data from the data source (indicated by self._event.url).
    • The FetchingEngine workers invoke a provider's .fetch() and .process(), which proxy to _fetch_ and _process_ respectively.
  2. Deriving from BaseFetchProvider and implementing Fetch & Process

    1. Create a new class deriving from BaseFetchProvider
    2. Override _fetch_ to implement the fetching itself
    3. Optionally override _process_ to mutate the data before returning it (for example converting a JSON string to an actual object)
    4. Manage a context
      • If you require a context (for cleanup or as a guard), simply override __aenter__ and __aexit__.
      • Fetcher workers wrap their calls to fetch and process in an async with block on the provider.
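The steps above can be sketched as a small runnable example. To keep it self-contained, FetcherConfig, FetcherEvent and BaseFetchProvider below are simplified stand-ins written for this sketch (an assumption, not OPAL's actual classes), and InMemoryFetchProvider, FAKE_SOURCE and run_worker are hypothetical names. In a real provider you would import the base classes from OPAL instead of defining them.

```python
import asyncio
import json
from dataclasses import dataclass, field


# Stand-ins (assumptions): simplified versions of the OPAL classes named above,
# defined here only so the sketch runs without OPAL installed.
@dataclass
class FetcherConfig:
    pass


@dataclass
class FetcherEvent:
    url: str
    config: FetcherConfig = field(default_factory=FetcherConfig)


class BaseFetchProvider:
    def __init__(self, event):
        self._event = event

    async def fetch(self):
        # Public entry point called by workers; proxies to _fetch_
        return await self._fetch_()

    async def process(self, data):
        # Public entry point called by workers; proxies to _process_
        return await self._process_(data)

    async def _fetch_(self):
        raise NotImplementedError

    async def _process_(self, data):
        return data  # default: return the fetched data unchanged

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return None


# The part you would write: a hypothetical provider for an in-memory "source"
FAKE_SOURCE = {"memory://users": '{"alice": ["admin"]}'}


class InMemoryFetchProvider(BaseFetchProvider):
    async def __aenter__(self):
        self._open = True  # acquire a connection, lock, etc.
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self._open = False  # cleanup always runs, even if fetching failed

    async def _fetch_(self):
        # Look up the raw payload for the URL carried by the event
        return FAKE_SOURCE[self._event.url]

    async def _process_(self, data):
        # Convert the JSON string into an actual object
        return json.loads(data)


async def run_worker(provider):
    # Mirrors the described worker behaviour: async with around fetch and process
    async with provider as p:
        raw = await p.fetch()
        return await p.process(raw)


provider = InMemoryFetchProvider(FetcherEvent(url="memory://users"))
result = asyncio.run(run_worker(provider))
print(result)  # {'alice': ['admin']}
```

Keeping the raw fetch in _fetch_ and the decoding in _process_ means each step can be overridden or reused on its own.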
  3. FetcherConfig

    • Each FetchProvider might require specific values passed to it as part of its configuration. In that case, implement a Pydantic model (deriving from FetcherConfig) to go alongside your new provider.
    • e.g. for the HTTP request fetcher:
      class HttpFetcherConfig(FetcherConfig):
          headers: dict = None
          is_json: bool = True
          process_data: bool = True
    • Supply these configuration values when triggering your provider from a DataUpdate via the OPAL server's DataUpdatePublisher.
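As a rough illustration of a provider-specific config, the sketch below defines a Pydantic model with one required field and two optional ones, then validates plain data against it. FetcherConfig here is an empty stand-in (an assumption) so the snippet runs without OPAL, and S3FetcherConfig with its bucket, region and extra_headers fields is entirely hypothetical.

```python
from typing import Dict, Optional

from pydantic import BaseModel


class FetcherConfig(BaseModel):
    """Stand-in (assumption) for OPAL's FetcherConfig base model."""


class S3FetcherConfig(FetcherConfig):
    """Hypothetical config for a hypothetical S3-style provider."""

    bucket: str  # required: validation fails if it is missing
    region: str = "us-east-1"  # optional, with a default
    extra_headers: Optional[Dict[str, str]] = None


# Config values usually arrive as plain data (e.g. parsed JSON) and are
# validated when the model is constructed.
raw = {"bucket": "policy-data"}
config = S3FetcherConfig(**raw)
print(config.bucket, config.region)  # policy-data us-east-1
```

Inside the provider, these values would then be read from self._event.config.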
  4. Saving and registering your provider

    The fetcher-register loads the providers and makes them available to fetcher workers. It imports the Python packages named in the configuration (OPAL_FETCH_PROVIDER_MODULES) and searches them for classes deriving from BaseFetchProvider.

    • You can add your providers by supplying module files:

      • By adding Python files to the default fetcher folder, 'opal/common/fetcher/providers'
      • By creating a package, i.e. a new folder with an __init__.py
        • read about configuring Python packages here
        • you can set the __all__ variable in your __init__.py to point to the modules you'd like
        • or expose them all using emport.dynamic_all (as shown here)
    • Loading from PyPI

      • coming soon
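The discovery idea described in this step (import each configured module, then collect the classes that derive from BaseFetchProvider) can be sketched with the standard library alone. This is an illustration of the technique, not OPAL's actual register code: BaseFetchProvider is a stand-in, find_providers and load_providers are hypothetical helpers, and the in-memory my_providers module stands in for a real file on disk.

```python
import importlib
import inspect
import sys
import types


class BaseFetchProvider:
    """Stand-in (assumption) for OPAL's base class."""


def find_providers(module, base=BaseFetchProvider):
    """Return {class name: class} for every strict subclass of base in module."""
    return {
        name: obj
        for name, obj in inspect.getmembers(module, inspect.isclass)
        if issubclass(obj, base) and obj is not base
    }


def load_providers(module_names):
    """Import each named module and merge the providers found in it."""
    registry = {}
    for name in module_names:
        registry.update(find_providers(importlib.import_module(name)))
    return registry


# Build a throwaway module in memory to stand in for a file such as my_providers.py
my_providers = types.ModuleType("my_providers")
my_providers.BaseFetchProvider = BaseFetchProvider  # as if the module imported it
my_providers.RedisFetchProvider = type("RedisFetchProvider", (BaseFetchProvider,), {})
my_providers.Helper = type("Helper", (), {})  # unrelated class, should be ignored
sys.modules["my_providers"] = my_providers  # make it importable by name

# The list of names stands in for whatever the configuration supplies
registry = load_providers(["my_providers"])
print(sorted(registry))  # ['RedisFetchProvider']
```

Excluding the base class itself matters because provider modules normally import it, so it shows up among the module's members.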

Module / Class structure