Prototype using new LensKit pipeline abstraction #81

mdekstrand · 2024-08-02T21:37:34Z

This is a prototype of how the new pipeline I built for LensKit (lenskit/lkpy#462) would fit into the POPROX recommenders. It directly vendors a copy of the LensKit code, with dataset and trainability removed, since the LensKit version is not yet in a release.

The LensKit pipeline is heavily inspired by the POPROX one. My design process was basically: “how do I take the core idea from the POPROX pipeline (which I really like), add the capabilities I need for LensKit, and build it on a DAG instead of linear, iterative state updates?”

The pipeline docs are rendered here: https://lkpy.lenskit.org/en/latest/pipeline.html
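To make the "DAG instead of linear state updates" idea concrete, here is a minimal toy sketch. It is not the LensKit or POPROX implementation: `ToyPipeline`, its `run` method, and the helper functions `lengths`, `score`, and `top1` are all hypothetical names invented for illustration, and only the general shape of `create_input` / `add_component` is borrowed from the diff in this PR. Each component declares which upstream nodes feed its parameters, and the runner resolves only the nodes the requested output depends on.

```python
from typing import Any, Callable


class ToyPipeline:
    """Hypothetical toy DAG pipeline; NOT the LensKit or POPROX API."""

    def __init__(self) -> None:
        # node name -> (function, {parameter name: upstream node name})
        self._components: dict[str, tuple[Callable[..., Any], dict[str, str]]] = {}
        self._inputs: set[str] = set()

    def create_input(self, name: str) -> str:
        self._inputs.add(name)
        return name

    def add_component(self, name: str, func: Callable[..., Any], **wiring: str) -> str:
        self._components[name] = (func, wiring)
        return name

    def run(self, target: str, **inputs: Any) -> Any:
        # Results are cached so a node shared by several consumers runs once.
        cache: dict[str, Any] = dict(inputs)

        def resolve(node: str) -> Any:
            if node in cache:
                return cache[node]
            if node in self._inputs:
                raise ValueError(f"missing input: {node}")
            func, wiring = self._components[node]
            kwargs = {param: resolve(upstream) for param, upstream in wiring.items()}
            cache[node] = func(**kwargs)
            return cache[node]

        return resolve(target)


# Hypothetical stand-in components (plain functions over strings).
def lengths(articles: list[str]) -> list[int]:
    return [len(a) for a in articles]


def score(cand: list[int], clicks: list[int]) -> list[float]:
    target = sum(clicks) / len(clicks)
    return [-abs(c - target) for c in cand]


def top1(articles: list[str], scores: list[float]) -> str:
    return max(zip(scores, articles))[1]


pipe = ToyPipeline()
candidates = pipe.create_input("candidates")
clicked = pipe.create_input("clicked")
cand_len = pipe.add_component("candidate-lengths", lengths, articles=candidates)
click_len = pipe.add_component("clicked-lengths", lengths, articles=clicked)
scores = pipe.add_component("scorer", score, cand=cand_len, clicks=click_len)
# "candidates" feeds two different nodes, which a linear chain cannot express directly.
ranker = pipe.add_component("ranker", top1, articles=candidates, scores=scores)

result = pipe.run("ranker", candidates=["aa", "bbbb", "cccccc"], clicked=["xxxx", "yyyy"])
print(result)
```

In this sketch the wiring is explicit in the `add_component` calls, so the data flow can be read off the pipeline definition rather than inferred from the order in which stages mutate a shared state object.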


karlhigley · 2024-08-06T15:45:31Z

src/poprox_recommender/default.py

+    candidates = pipeline.create_input("candidate", ArticleSet)
+    clicked = pipeline.create_input("clicked", ArticleSet)
+    profile = pipeline.create_input("profile", InterestProfile)
+    e_cand = pipeline.add_component("embed-candidates", article_embedder, article_set=candidates)


The verb-style naming here feels a little weird to me, because a component is conceptually a thing rather than an action. If you really want the verb style, I don't think add_component is the right method name to go with it (but I do like add_component, because I think it aligns more naturally with the mental model that, e.g., a diversifier is a component of a larger pipeline/system).

Fair! I was trying to do something consistent (and make a consistent recommendation in the docs), and wasn't 100% on this but favored “pick one, be consistent, and document the rec”.

We can use nouns / noun phrases, and it would be slightly longer in some cases but I think would read reasonably well:

e_cand = pipeline.add_component("candidate-embedder", ...)

Or if we want to name it after the thing returned:

e_cand = pipeline.add_component("candidate-embeddings", ...)

Working through that example, we have 3 possible naming conventions:

1. The name of the component itself (embedder).

2. The name of the component's return data (embeddings).

3. The action performed by the component (embed).

Based on your rationale, it sounds like you would lean towards (1).

I've pushed an update that names each component after the component itself (noun-style naming).

Would it make more sense to have something like add_process or add_stage and use the verb-based naming? Not sure it would, just pondering while I update the docs.

I like the noun-style naming better because I find it easier to think about composing objects than actions, but if we went the verb-style route, I was considering add_step.

mdekstrand · 2024-08-07T14:21:11Z

Cross-added the pipeline updates to LensKit in lenskit/lkpy#466.

mdekstrand · 2024-08-08T21:02:14Z

Now merged and tested with @karlhigley's new pipeline components change.

karlhigley · 2024-08-09T17:41:57Z

src/poprox_recommender/default.py

@@ -82,23 +85,31 @@ def personalized_pipeline(num_slots: int, algo_params: dict[str, Any] | None = N
        logger.info("Recommendations will be ranked with plain top-k.")
        ranker = topk_ranker

-    pipeline = RecommendationPipeline(name=diversify)
+    # TODO put pipeline name back in


Worth resolving this now or want to wait for a follow-up PR?

#85 adds it :)

karlhigley · 2024-08-09T17:44:19Z

tests/integration/test_smoke.py

@@ -25,7 +25,7 @@ def test_direct_basic_request():
        req.num_recs,
    )
    # do we get recommendations?
-    assert len(outputs.recs) > 0
+    assert len(outputs.default.articles) > 0


Not a blocker: I wonder if it might be worth adding some syntactic sugar to avoid having to sprinkle default in so many places.

Quite possibly.
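One possible shape for that sugar, sketched under assumptions: the container below forwards unknown attribute lookups to its `default` output, so `outputs.articles` works alongside `outputs.default.articles`. `PipelineOutputs` and `ArticleList` are hypothetical stand-ins invented for this example; the real type of `outputs` in the test above is not shown in this thread and may differ.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ArticleList:
    """Hypothetical stand-in for a recommendation list."""

    articles: list[str] = field(default_factory=list)


class PipelineOutputs:
    """Hypothetical container of named pipeline outputs (illustration only)."""

    def __init__(self, **named: Any) -> None:
        if "default" not in named:
            raise ValueError("a 'default' output is required")
        self._named = dict(named)

    def __getattr__(self, attr: str) -> Any:
        # __getattr__ runs only when normal attribute lookup fails.
        # Reading through __dict__ avoids infinite recursion if _named is unset.
        named = self.__dict__.get("_named", {})
        if attr in named:
            return named[attr]  # explicit access, e.g. outputs.default
        if "default" in named:
            return getattr(named["default"], attr)  # sugar, e.g. outputs.articles
        raise AttributeError(attr)


outputs = PipelineOutputs(
    default=ArticleList(["a1", "a2"]),
    ranked=ArticleList(["a2"]),
)

# The explicit form and the shorthand resolve to the same list.
assert outputs.default.articles == outputs.articles
```

The trade-off in this sketch is ambiguity: an output name that collides with an attribute of the default output shadows it, so explicit `outputs.default.…` access would still be needed in those cases.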


mdekstrand added 6 commits August 2, 2024 17:10

vendor the LensKit pipeline abstraction

2ae1e7b

fix outputs to get tests working

662f6c3

Add note on transitive fallbacks

58f181a

Add (failing) test for transitive fallbacks

da550d5

Add transitive fallbacks to runner

e1ca81b

use lenskit pipeline in offline eval

21c831e

mdekstrand marked this pull request as ready for review August 5, 2024 18:39

mdekstrand requested a review from karlhigley August 5, 2024 18:39

rerun pre-commit

b6e579f

karlhigley reviewed Aug 6, 2024


mdekstrand added 4 commits August 6, 2024 12:41

rename ranker stages

7b617a6

Add PipelineState support to the LensKit pipeline

0d1d57a

use PipelineState to run recommendations

a7f8f67

pass state around the evaluator

39cacd4

mdekstrand requested a review from karlhigley August 6, 2024 17:30

update tests for new PipelineState outputs

cf2999e

Merge branch 'main' into mdekstrand/lenskit-pipeline

8dda009

karlhigley approved these changes Aug 9, 2024


mdekstrand merged commit ac5911f into main Aug 9, 2024
4 checks passed

mdekstrand deleted the mdekstrand/lenskit-pipeline branch August 9, 2024 18:14

mdekstrand mentioned this pull request Aug 9, 2024

Make all component constructor parameters readily serializable #83

Merged
