Saved segments spike #4648

Draft: wants to merge 8 commits into base: master

@apata (Contributor) commented Oct 2, 2024

This PR outlines schema and API structure for saved segments. Not intended to be merged as is.

@macobo (Contributor) left a comment

Some initial thoughts. 🚀

@@ -491,6 +491,7 @@ defmodule Plausible.Stats.Filters.QueryParserTest do
"metrics" => ["visitors"],
"date_range" => "all",
"filters" => [
["is", "segment", [200]],
@macobo (Contributor):

Nit (raising it since it's not clear whether this is intended as the full code): this belongs in a separate test.

site: site,
user: user
} do
name = "foo"
@macobo (Contributor):

Nit: I don't think DRY-ing the variables helps here. It just ends up as a wordier test where you need to scroll back and forth.

"segment" => %{
"description" => nil,
"name" => ^name,
"segment_data" => ^segment_data
@macobo (Contributor):

Nit: This is confusing to follow, since it isn't clear what the response actually looks like. I suggest the following structure:

segment = from(s in Plausible.Segment, where: s.site_id == ^site_id) |> Repo.one()

assert json_response(conn, 200) == %{
               "role" => "owner",
               "segment" => %{ ... }
}

I'd also use fully hard-coded values.

@apata (Contributor Author):

Thanks! I struggled with the timestamps in this test file. Any tips on how to omit them from the comparison while remaining brief?

@macobo (Contributor):

Good question. Elixir's assert doesn't have any clever shorthand like any(datetime). I'll dig into the existing controller tests and see how they have solved it. 🤔 Will provide an answer tomorrow!
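One way to keep such a comparison brief without matcher support is to drop the volatile keys before comparing with ==. The following is a minimal sketch under assumptions: ResponseHelpers and drop_timestamps/1 are hypothetical names that do not exist in the PR, and the payload only mimics the "segment" object discussed above.

```elixir
defmodule ResponseHelpers do
  # Hypothetical helper (not part of the PR): removes volatile timestamp
  # keys so the rest of the payload can be compared with ==.
  @timestamp_keys ["inserted_at", "updated_at"]

  def drop_timestamps(%{} = payload), do: Map.drop(payload, @timestamp_keys)
end

# Example payload shaped like the "segment" object in the response (assumed shape).
segment = %{
  "id" => 1,
  "name" => "Blog entry",
  "description" => nil,
  "inserted_at" => "2024-10-02T12:00:00",
  "updated_at" => "2024-10-02T12:00:00"
}

trimmed = ResponseHelpers.drop_timestamps(segment)
```

In a controller test this would be applied to the decoded "segment" map before the equality assertion, so the rest of the expected value can stay fully hard-coded.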

@macobo (Contributor):

Dug a bit.

We don't have a use case yet for exposing updated_at or inserted_at at the API level. Similar models such as lib/plausible/auth/invitation.ex and lib/plausible/site.ex don't expose them either. I'd stop exposing them.

I'd rewrite the test as follows:

  describe "POST /internal-api/:domain/segments" do
    setup [:create_user, :create_new_site, :log_in]

    test "creates segment successfully", %{conn: conn, site: site} do
      conn =
        post(conn, "/internal-api/#{site.domain}/segments", %{
          "segment_data" => %{"filters" => [["is", "visit:entry_page", ["/blog"]]]},
          "name" => "Blog entry"
        })

      segment = Plausible.Repo.one(Plausible.Segment)

      assert json_response(conn, 200) == %{
               "role" => "owner",
               "segment" => %{
                 "id" => segment.id,
                 "name" => "Blog entry",
                 "segment_data" => %{"filters" => [["is", "visit:entry_page", ["/blog"]]]},
                 "description" => nil
               }
             }
    end
  end

) do
get_available_segments = fn ->
case Keyword.get(opts, :conn) do
@macobo (Contributor):

Q: Why is this in the legacy breakdown code and nowhere else?

Feedback: Passing a mysterious conn option to this module, with assigns that must have a specific structure, couples it far too closely to the controller and mixes responsibilities.

Suggestion: Store user_id on the Query object and use that here.

Problem: I don't think filtering by user_id is at all correct here and we should only filter on site_id. This is the sharing links problem I've mentioned before.

@apata (Contributor Author):

Well spotted! I was just about to ask for help on this in team chat. Rather than storing the user_id, what if I store the loader function in the Query object?


@filter_tree_operators [:not, :and, :or]

def parse_filters(filters) when is_list(filters) do
@macobo (Contributor):

This doesn't apply the JSON schema, which is a problem.

def change do
create table(:segments) do
add :name, :string, null: false
add :segment_data, :map, null: false
@macobo (Contributor):

Why not add :segment_filters, :array, null: false or something equivalent? What do we gain by nesting the data under filters and creating an abstraction?

"maxItems": 3,
"items": [
{
"$ref": "#/definitions/filter_operation_for_segments"
@macobo (Contributor):

Nit: Let's inline the operation given there's only one and we're not repeating the ref anywhere.

Suggested change
- "$ref": "#/definitions/filter_operation_for_segments"
+ "const": "is"

timestamps()
end

create index(:segments, [:segment_data], using: :gin)
@macobo (Contributor):

What will be using this index?

create index(:segments, [:site_id])

create table(:segment_collaborators, primary_key: false) do
add :role, :string, null: false
@macobo (Contributor):

String columns like this are suspect, but it seems this is ORM-generated, right?


defp validate_segment_data(changeset) do
case get_field(changeset, :segment_data) do
%{"filters" => filters} when is_list(filters) ->
@macobo (Contributor):

We should validate here that the data is valid and doesn't contain any nested segment references, right?

@apata (Contributor Author):

Agreed! It should be validated, but at what level? I didn't want to make this module depend on query / filters parsing. How about doing that in the API controller?

@macobo (Contributor):

Doing it in the controller seems valid and best (and if it's already validated then sorry for missing it!).
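As an illustration of the nested-reference check discussed in this thread, here is a minimal sketch of a function a controller could call before saving. It rests on assumptions: SegmentFilterCheck is a hypothetical module, leaf filters are taken to have the [operation, dimension, clauses] shape seen in ["is", "segment", [200]], and the tree operators are taken to be ["not", filter], ["and", [filters]] and ["or", [filters]]. It does not use the project's actual parser or JSON schema.

```elixir
defmodule SegmentFilterCheck do
  # Hypothetical module for illustration only; not part of the PR.

  # Returns true when any filter, at any nesting depth, targets the
  # "segment" dimension.
  def references_segment?(filters) when is_list(filters) do
    Enum.any?(filters, &filter_references_segment?/1)
  end

  # ["not", filter] wraps a single child filter (assumed shape).
  defp filter_references_segment?(["not", child]),
    do: filter_references_segment?(child)

  # ["and", [...]] and ["or", [...]] wrap a list of child filters (assumed shape).
  defp filter_references_segment?([op, children])
       when op in ["and", "or"] and is_list(children),
       do: references_segment?(children)

  # Leaf filter on the "segment" dimension, e.g. ["is", "segment", [200]].
  defp filter_references_segment?([_op, "segment", _clauses]), do: true

  # Any other leaf filter.
  defp filter_references_segment?(_other), do: false
end
```

A controller action could reject the request when references_segment?/1 returns true for the submitted "filters" list, which keeps Plausible.Segment itself free of any dependency on query parsing.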

@@ -0,0 +1,41 @@
defmodule Plausible.Segment do
@macobo (Contributor):

Suggestion: move this under lib/plausible/segment.ex - the current placement makes the module hard to find.

} do
name = "foo"
description = "bar"
segment_data = %{"filters" => []}
@macobo (Contributor):

Let's use realistic test data to simulate how a user might use the API. Since the data doesn't matter for most tests, I suggest defining @segment_data %{"filters" => [["is", "visit:entry_page", ["/blog"]]]} at the top level and re-using it in most tests.

role = "owner"

%{id: segment_id} =
insert(:segment,
@macobo (Contributor):

Rather than inserting it here, would it make sense to create these through the other APIs to simulate user behavior?

2 participants