Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why does put_assoc insist on deleting the old association or killing its foreign key? #4579

Open
jdmarshall opened this issue Feb 17, 2025 · 22 comments
Labels

Comments

@jdmarshall
Copy link
Contributor

Elixir version

Elixir 1.18.2 (compiled with Erlang/OTP 27)

Database and Version

sqlite

Ecto Versions

3.12.5

Database Adapter and Versions (postgrex, myxql, etc)

ecto-sqlite3 0.18.1

Current behavior

TableB has many records, each of them belongs_to a single record in TableA.
Records in Table B are never modified. They are copied on write and replaced by a new record (audit history)
Table A has_one record from TableB which represents the latest version

When I try to insert a new row into TableB and `put_assoc(:latest, newRow), I am told that I cannot modify this without wiping out either the parent id in the old row or delete it entirely.

** (ArgumentError) invalid :on_replace option for :latest. The only valid options are: :raise, :mark_as_invalid, :delete, :delete_if_exists, :nilify

None of these options make any sense. And I would be very surprised if I were the first to run into this issue.

Expected behavior

I shouldn't have to delete rows or nullify one FK relationship just because another one no longer applies.

:nothing should be a valid argument.

@josevalim
Copy link
Member

Table A has_one record from TableB which represents the latest version

If TableA has one record in TableB which represents latest version, it means TableB belongs to TableA. Therefore, if you do put_assoc(:latest, new_row), and you do "nothing", you would have two entries in TableB that point to TableA as latest. Databases such as PostgreSQL do not guarantee ordering in query, which means your put_assoc could have no effect whatsoever, and the association will only work if you also use order by or filters when querying. If you meet all of these conditions, :nothing makes sense, so a PR is welcome, but it should be made it clear in the docs that you need to use it with care.

@jdmarshall
Copy link
Contributor Author

you would have two entries in TableB that point to TableA as latest.

From the migrations it looks more like I would have 2 entries in TableB that point to TableA. The latest field is in TableA. What I want is dozens of records in TableB pointing to TableA. belongs_to and has_one are only allowing one and creating orphaned records. I could fake it with a logical field I create by searching a partial index but I'd just as soon keep the FK relation since I know it at insert/update time.

I took a peek at the code to see what my odds were of filing a PR. It looks like on_repo_change/5 with on_replace: :nilify returns {:ok, nil}

and the other variants essentially return {:ok, struct}

Is that all that I need (aside from some tests) to make :nothing work? Or is there magic deeper down that I would have to content with?

@josevalim
Copy link
Member

Sorry, regardless of table A or table B, you would have duplicate records for a has_one, which is not ok unless you have also have filters and order by. Is this the case? If this is not the case, Can you provide an example that reproduces this?

@jdmarshall
Copy link
Contributor Author

So it looks like has_one works opposite from what I would have expected, and I need belongs_to, even though that inverts the relationship linguistically from what it is architecturally.

@josevalim
Copy link
Member

So did you have the associations declared in the wrong order? Please let us know if you have a reproducer.

@jdmarshall
Copy link
Contributor Author

jdmarshall commented Feb 24, 2025

Yeah I'm going to have to generate a repro because my _id fields aren't getting populated at all now.

What does 'declared in the wrong order' mean? Why would there be an order?

@jdmarshall
Copy link
Contributor Author

jdmarshall commented Feb 25, 2025

So I believe my problem is that the put_assoc documentation, and much of the conversation around it, pretty much implies that it will take care of the parent-child relationship but what actually happens is that it only takes care of one half of it.

This code feels an awful lot like I'm hitting Ecto with the biggest rock I could find:

  def update_plan(%Plan{} = plan, attrs) do
    parent = %Plan{}
    |> Plan.changeset(attrs)

    child = parent.changes.current
    |> put_assoc(:plan, plan)

    get_plan!(plan.id)
    |> change()
    |> put_assoc(:current, child)
    |> Repo.update()
  end

While I expected something more like:

  def update_plan(%Plan{} = plan, attrs) do
    plan
    |> Plan.changeset(attrs)
    |> Repo.update()
  end

Or

  def update_plan(%Plan{} = plan, attrs) do
    changeset = %Plan{}
    |> Plan.changeset(attrs)

    child = parent.changes.current
    |> put_assoc(:plan, plan)

    get_plan!(plan.id)
    |> change()
    |> put_assoc(:current, child)
    |> Repo.update()

To suffice.

Also still feel like the 'order' question is some form of tacit knowledge I didn't pick up from the docs. What needs to be ordered, and why?

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Feb 25, 2025

def update_plan(%Plan{} = plan, attrs) do
    plan
    |> Plan.changeset(attrs)
    |> Repo.update()
  end

You should be able to do something like this. What does your Plan.changeset function look like?

Also still feel like the 'order' question is some form of tacit knowledge I didn't pick up from the docs. What needs to be ordered, and why?

Say you have tables parent and child. The child table must contain a foreign key to the parent table. If the database allows this to be a 1 to many relationship then you need a consistent way to choose "one" in Ecto.. A common way is to order by some column in conjunction with limit 1 to pick the "first" or "latest" or something like that. Without this kind of deterministic condition picking the "one" will be random.

@jdmarshall
Copy link
Contributor Author

jdmarshall commented Feb 25, 2025

# Parent
  def changeset(plan, attrs) do
    plan
    |> cast(attrs, [])
    |> cast_assoc(:current)
  end

# Child
  def changeset(plan_edit, attrs) do
    plan_edit
    |> cast(attrs, [:title, :target_date, :started_at, :finished_at])
    |> cast_assoc(:description)
    |> validate_required([:title])
end

If I try to fiddle with plan_id in the child changeset, it just complains about how the parent is not loaded.

In retrospect I should have stuck with the insert case, which has the exact same problem and doesn't get confused by implementation details of how doing the child updates (copy on write).

When I use code like you suggest in create_plan:

  def create_plan(attrs \\ %{}) do
    %Plan{}
    |> Plan.changeset(attrs)
    |> Repo.insert()
end

I get a successful insert but the plan_id in the child is not set. Which seems like would be substantially the point of Ecto, especially given that nowhere is there a discussion of manually setting the backlink.

[debug] QUERY OK source="plan_edit" db=0.0ms idle=59.1ms
SELECT p0."id", p0."title", p0."description_id", p0."started_at", p0."finished_at", p0."target_date", p0."plan_id", p0."inserted_at", p0."updated_at" FROM "plan_edit" AS p0 []
↳ Project.Plans.list_plans/0, at: lib/project/plans.ex:23
[
  %Project.Plans.PlanEdit{
    __meta__: #Ecto.Schema.Metadata<:loaded, "plan_edit">,
    id: 1,
    title: "Plan A",
    description_id: 1,
    description: #Ecto.Association.NotLoaded<association :description is not loaded>,
    started_at: nil,
    finished_at: nil,
    target_date: nil,
    plan_id: nil,
    plan: #Ecto.Association.NotLoaded<association :plan is not loaded>,
    inserted_at: ~U[2025-02-25 08:12:43Z],
    updated_at: ~U[2025-02-25 08:12:43Z]
  }
]

@josevalim
Copy link
Member

Can you please provide a complete snippet of what you are trying to do? You can provide either a small app or change the existing sample repo. But it is hard for us to provide feedback if we don't have access to the code that is running. We need to see the schemas and changesets and how the operations are invoked at least. Thank you.

@jdmarshall
Copy link
Contributor Author

Yeah I need to work up a minimal repro case. I'm around the axle at this point.

@jdmarshall
Copy link
Contributor Author

jdmarshall commented Feb 26, 2025

Alright, this is a new liveview project with just a Thing liveview with a nested Comment field.

https://github.com/jdmarshall/ecto_issue

mix phx.gen.live wanted to create the latest field as:

    field :latest_id, :id

Which doesn't quite line up with the examples in https://hexdocs.pm/ecto/associations.html, which look more like:

    belongs_to :latest, Issue.Comment

which also lets preload work. But the thing_id field does not get populated on create or update. What am I doing wrong?

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Feb 26, 2025

It would help if you made the app minimal and also highlighted the failing test. Please take a step back and imagine someone was asking you for help with their situation which you know nothing about.. You have provided an application with several modules and a lot of functions not related to the issue. Also there are a lot of tests and it is not clear which, if any of them highlight the issue.

defmodule Issue.ThingsTest do
  use Issue.DataCase

  alias Issue.Things

  describe "things" do
    alias Issue.Things.Thing

    import Issue.ThingsFixtures

    @invalid_attrs %{title: nil}

    test "list_things/0 returns all things" do
      thing = thing_fixture()
      assert Things.list_things() == [thing]
    end

    test "get_thing!/1 returns the thing with given id" do
      thing = thing_fixture()
      assert Things.get_thing!(thing.id) == thing
    end

    test "create_thing/1 with valid data creates a thing" do
      valid_attrs = %{title: "some title"}

      assert {:ok, %Thing{} = thing} = Things.create_thing(valid_attrs)
      assert thing.title == "some title"
    end

    test "create_thing/1 with invalid data returns error changeset" do
      assert {:error, %Ecto.Changeset{}} = Things.create_thing(@invalid_attrs)
    end

    test "update_thing/2 with valid data updates the thing" do
      thing = thing_fixture()
      update_attrs = %{title: "some updated title"}

      assert {:ok, %Thing{} = thing} = Things.update_thing(thing, update_attrs)
      assert thing.title == "some updated title"
    end

    test "update_thing/2 with invalid data returns error changeset" do
      thing = thing_fixture()
      assert {:error, %Ecto.Changeset{}} = Things.update_thing(thing, @invalid_attrs)
      assert thing == Things.get_thing!(thing.id)
    end

    test "delete_thing/1 deletes the thing" do
      thing = thing_fixture()
      assert {:ok, %Thing{}} = Things.delete_thing(thing)
      assert_raise Ecto.NoResultsError, fn -> Things.get_thing!(thing.id) end
    end

    test "change_thing/1 returns a thing changeset" do
      thing = thing_fixture()
      assert %Ecto.Changeset{} = Things.change_thing(thing)
    end
  end
end

@jdmarshall
Copy link
Contributor Author

I'm trying to figure out a problem in a new ecosystem that I'm learning, that favors project generators over rolling your own. The likelihood that I can generate a minimalist elixir project is quite slim.

But I'll see what I can do.

@greg-rychlewski
Copy link
Member

Which test should we be looking at then that highlights your problem?

@jdmarshall
Copy link
Contributor Author

jdmarshall commented Feb 26, 2025

Alright, I fixed the tests and added one that covers the problem:

I expected code like in this commit: jdmarshall/ecto_issue@02c6a34

To pass this test:

jdmarshall/ecto_issue@00cf9f7

I should have some time later tonight to get this down to a mix.exs file and a couple of source files.

@jdmarshall
Copy link
Contributor Author

jdmarshall commented Feb 26, 2025

I couldn't find any instructions on how to set up a bare elixir project for running ecto tests. After a couple hours of trying to sort it out, I just went with reviewing the ecto tests, to see if I could add a new one.

ecto has bidirectional association tests,

test "handles valid nested assocs on insert" do

%Ecto.Repo.BelongsToTest.MySchema{
  __meta__: #Ecto.Schema.Metadata<:loaded, "my_schema">,
  id: 1,
  x: nil,
  y: nil,
  assoc_id: 1,
  assoc: %Ecto.Repo.BelongsToTest.MyAssoc{
    __meta__: #Ecto.Schema.Metadata<:loaded, "my_assoc">,
    id: 1,
    x: "xyz",
    sub_assoc_id: 1,
    sub_assoc: %Ecto.Repo.BelongsToTest.SubAssoc{
      __meta__: #Ecto.Schema.Metadata<:loaded, "sub_assoc">,
      id: 1,
      y: "xyz",
      my_assoc: #Ecto.Association.NotLoaded<association :my_assoc is not loaded>
    },
    my_schema: #Ecto.Association.NotLoaded<association :my_schema is not loaded>,
    inserted_at: ~N[2025-02-26 08:21:19],
    updated_at: ~N[2025-02-26 08:21:19]
  },
  nilify_assoc_id: nil,
  nilify_assoc: #Ecto.Association.NotLoaded<association :nilify_assoc is not loaded>
}

None of these tests ever round-trip the insert.

cobblers-children@f4898cd

That looks to be left up to the integration tests and I cannot get them to run by either method available in the instructions.

@greg-rychlewski
Copy link
Member

I couldn't find any instructions on how to set up a bare elixir project for running ecto tests. After a couple hours of trying to sort it out, I just went with reviewing the ecto tests, to see if I could add a new one.

Did you try the one Jose suggested to you and ran into an issue? https://github.com/elixir-ecto/ecto/tree/master/examples/friends

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Feb 26, 2025

The issue is that both Thing and Comment are setup to belong_to each other. If you change Thing to has_one then you will see the foreign key populated:

To help see why this is the case, I would take a step back and think about how you would set this up in SQL without any Ecto. You would make it so that only one side of the relationship has the foreign key. By the looks of your test, you want it to be Comment that is holding the foreign key of Thing.

Now translating this SQL into Ecto schemas, the side that has the foreign key will hold the belongs_to side of the association. While the one having its foreign key referenced will hold the has_one side of the association. That's the way Ecto detects whether it needs to place its id in the foreign key of the child. If the schema holds the has_one side of the association it knows its id is being used as a foreign key in the child.

So basically you want to do this

schema "things" do
    field :title, :string
    has_one :latest, Issue.Comment

    timestamps(type: :utc_datetime)
  end

@jdmarshall
Copy link
Contributor Author

jdmarshall commented Feb 26, 2025

You would make it so that only one side of the relationship has the foreign key. By the looks of your test, you want it to be Comment that is holding the foreign key of Thing.

But I am trying to establish two relationships. One to many, one to one, to avoid a very broad number of N+1 queries with n on the order of 50, which would go quadratic when I try to do a list page.

@josevalim
Copy link
Member

What are the two different relationships you want to establish? Keep in mind you don't need to establish relationships in order to query them nor to avoid N+1. Or perhaps, can you tell us which queries you want to run?

@jdmarshall
Copy link
Contributor Author

jdmarshall commented Feb 26, 2025

In a wiki, or a ticketing system:

Artifact <- [Edits]
Artifact -> Latest snapshot

List view:

[Artifact{latest: {…}}]

Or

[Artifact[latest: {…}] where latest.assignee == user_id and latest.status == :in_progress

I have two apps I’m working on and I’m likely to need to repeat this pattern at least nine times across the two of them. Need to settle this design issue before I start on the second (which is really the first but got split into two, long story). Without the back link all the queries grow wordier and now I need more indexes to manage aggregate and summary views, instead of just FK indexes and a join on FK==id. I’m not going to have a lot of update queries but I will end up with a bunch of search queries and some reporting queries.

I’m also likely to have more child associations on the snapshot to normalize other relationships and also to dedupe text blobs that change less often than the edit and rollback history.

I’m already doing a little tap dance in update_<table_name> to copy-on-write the latest snapshot with each new form submit, and my latest workaround to get the many to one reference populated is to insert_ without the children and then immediately call update_, and I sure would like to not have to look at that for the next couple of years.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants