How to use an iterator/generator to paginate HTTP request responses #109
One issue with this implementation is that it's not obviously stack safe (i.e., might lead to
I think this might be a bug in pfun actually. The issue is that
The norm in pure FP programs is to only run effects at the very edge of your program, which this approach would make difficult to do. As such, I think the first approach would be more natural to most functional programmers. In general, I think your pain here stems from trying to use imperative tools (a while loop) in a functional style. I would use something like this as a starting point (not tested):

```python
from typing import Callable, List, TypeVar

from pfun import http, success, Effect, operator as op

A = TypeVar('A')
HTTPEffect = Effect[http.HasHTTP, http.ClientError, A]


class Page:
    ...


def parse(response: http.Response) -> Page:
    ...


def has_next_page(page: Page) -> bool:
    ...


def get_next_page(p: Page) -> HTTPEffect[Page]:
    ...


def get_while_has_next_page(current_page: Page) -> HTTPEffect[List[Page]]:
    if has_next_page(current_page):
        # Fetch the next page, recurse on it, then prepend the current page.
        return (get_next_page(current_page)
                .and_then(get_while_has_next_page)
                .map(op.add([current_page])))
    # Last page: nothing more to fetch.
    return success([current_page])


def paginate(url: str) -> HTTPEffect[List[Page]]:
    first_page = http.get(url).map(parse)
    return first_page.and_then(get_while_has_next_page)
```

There's probably a generalization of this pattern:

```python
# Reuses the imports and A from the block above.
R = TypeVar('R')
E = TypeVar('E')


def gather_until(p: Callable[[A], bool],
                 next_: Callable[[A], Effect[R, E, A]],
                 first: Effect[R, E, A]) -> Effect[R, E, List[A]]:
    def iterate(x: A) -> Effect[R, E, List[A]]:
        # Note: iteration continues for as long as p(x) is True.
        if p(x):
            return next_(x).and_then(iterate).map(lambda rs: [x] + rs)
        return success([x])
    return first.and_then(iterate)
```

(Discovered a bug in the mypy

Let me know if you come up with something! Also, a general pagination library for
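To make it easier to see what `gather_until` computes, here is a minimal plain-Python sketch of the same logic with the effect machinery stripped out. It is not pfun code: `gather_while_sync` is a hypothetical name, and `next_` is assumed to be an ordinary function returning a value rather than an `Effect`. Because it uses a loop instead of recursion, this version does not grow the call stack with the number of pages.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")


def gather_while_sync(p: Callable[[A], bool],
                      next_: Callable[[A], A],
                      first: A) -> List[A]:
    """Collect first, next_(first), ... for as long as p holds for the latest item."""
    results = [first]
    current = first
    while p(current):
        # p said there is more to fetch, so compute the next item and keep it.
        current = next_(current)
        results.append(current)
    return results


# Usage: integers stand in for pages; "has next page" means n < 3.
pages = gather_while_sync(lambda n: n < 3, lambda n: n + 1, 0)
```

As in the recursive version, the item for which `p` first returns `False` is still included in the result, since it plays the role of the last page.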
How can I go about iterating through "pages" of HTTP request responses?
For example, I want to query an HTTP API that pages results, and I want to create an iterator (perhaps a generator function) to perform the pagination (i.e., make as many HTTP requests as required to get all of the "pages") so that the caller of the function doesn't have to "manually" perform pagination.
In other words, a user should be able to paginate through multiple HTTP requests, similar to the following:
Ignoring details for the moment, I imagine a function similar to the following for fetching a page of results (in a `Response`, but ignoring parsing for the moment):

In addition, the `paginate` function that the user would call might look something like the following (again, ignoring some details):

Unfortunately, attempting to iterate through multiple `Response`s with an implementation similar to the above results in the following error:

I realize this has to do with using `aiohttp` under the covers and making use of an event loop, but my limited understanding of `asyncio` and FP concepts makes it hard for me to determine how to refactor things to allow such iteration that keeps the client iteration logic relatively clean and simple.

I've also thought that perhaps the signature for `paginate` should "invert" the position of `Iterator` and `Effect`, like so:

Then, the caller might end up doing something like so, instead:
One of the details not shown is that in order to obtain the "next" page of results, I need to supply a "token" of sorts, and the "token" is provided in the response (as a header) from the "previous" page of results.
To be more concrete, and to allow experimentation, in particular, I want to page through results of queries made to this public URL: https://cmr.uat.earthdata.nasa.gov/search/collections.umm_json
If you open a browser to that URL, you will get a JSON response, which by default returns the first page of 10 items. In order to obtain the next page of 10 items, the `CMR-Search-After` header in the response must be supplied as a header in the next request, which will then return a response with a new value for the header, to supply in the following request, and so on, until the last page is fetched, in which case the header will not be in the response (thus indicating the end of the pages of results).

For reference on this HTTP API, see Earthdata CMR Search API. In particular, see the Search After section.
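The header hand-off described above can be sketched as a plain generator, independent of any HTTP library. This is only an illustration under stated assumptions: `fetch` is a hypothetical callable that takes request headers and returns a `(body, response_headers)` pair, `iter_pages` and `make_fake_fetch` are made-up names, and header lookup is by exact key (real HTTP headers are case-insensitive). The fake fetch function only simulates the token behaviour so the loop can be exercised offline.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Headers = Dict[str, str]
# Hypothetical shape: request headers in, (body, response headers) out.
Fetch = Callable[[Headers], Tuple[object, Headers]]

SEARCH_AFTER = "CMR-Search-After"


def iter_pages(fetch: Fetch) -> Iterator[object]:
    """Yield page bodies, passing each response's token to the next request."""
    headers: Headers = {}
    while True:
        body, response_headers = fetch(headers)
        yield body
        token = response_headers.get(SEARCH_AFTER)
        if token is None:
            # No token in the response means this was the last page.
            return
        headers = {SEARCH_AFTER: token}


def make_fake_fetch(pages: List[object]) -> Fetch:
    """Simulated API: the token is simply the index of the next page."""
    def fetch(headers: Headers) -> Tuple[object, Headers]:
        index = int(headers.get(SEARCH_AFTER, "0"))
        response_headers: Headers = {}
        if index + 1 < len(pages):
            response_headers[SEARCH_AFTER] = str(index + 1)
        return pages[index], response_headers
    return fetch


# Usage: the caller just iterates and never sees the token.
collected = list(iter_pages(make_fake_fetch([["a"], ["b"], ["c"]])))
```

Keeping the token inside the generator is what lets the calling code stay a simple `for` loop; the actual request function is passed in rather than hard-coded, so it can be swapped for a real client.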
How can I implement iteration to abstract away some of these details from the user, and allow the user to iterate through pages in a relatively clean and simple manner?