Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Events for an aggregate are read without any specific order #409

Open
totemcaf opened this issue Jul 12, 2023 · 8 comments
Open

Events for an aggregate are read without any specific order #409

totemcaf opened this issue Jul 12, 2023 · 8 comments

Comments

@totemcaf
Copy link

totemcaf commented Jul 12, 2023

Describe the bug
When events are applied to an aggregate to materialize it, the events should be applied in version order to maintain the event history.

The events.AggregateStore use an eh.EventStore to read events and then applies them in the same order that were received from the event store.

The query to read the documents from the event collection provides a filter and no sort option, so the documents are returned in the "natural order" witch as explain by documentation: "This ordering is an internal implementation feature, and you should not rely on any particular ordering of the documents." (according to https://www.mongodb.com/docs/manual/reference/method/cursor.sort/#return-in-natural-order)

func (s *EventStore) Load(ctx context.Context, id uuid.UUID) ([]eh.Event, error) {
    cursor, err := s.events.Find(ctx, bson.M{"aggregate_id": id})
    ...
}

and

func (s *EventStore) LoadFrom(ctx context.Context, id uuid.UUID, version int) ([]eh.Event, error) {
    cursor, err := s.events.Find(ctx, bson.M{"aggregate_id": id, "version": bson.M{"$gte": version}})
    ...
}

To Reproduce

I couldn't reproduce it yet. I think that the current implementation of MongoDB store adds new documents in insertion order, and to insert in a different order you should delete some old documents.

Steps to reproduce the behavior (perhaps):

  1. Insert a number of events for several aggregates
  2. Remove the events of one the aggregate
  3. Insert new events for existent aggregates
  4. Read events for existent aggregates (one aggregate at a time)

It is possible that the events read for some the aggregates were not in the desired order (the version order).

Expected behavior
The events applied to an aggregate by the events.AggregateStore should be in order of the version number.

Possible solutions

There are two possible solutions to this problem, depending on which component should be responsible of the order:

  1. The eh.EventStore should guarantee the order of events
  2. The events.AggregateStore should guarantee the order of applications of events

In the first case, we can simple add a Sort option to the query to ensure the documents are read in the required order.
This uses computing power of the database and should be implemented in all the event store implementations.

In the second case, the events.AggregateStore should order the events before applying them. Because the version number of events is an integer between a and b (a the min version read, and b the max), and b - a should be the exact number of events, we can quickly move the events into an array using the version - a as index.
This uses computing power of the application, but the algorithm is smart enough.

I want to hear some comments and I can provide a PR to solve this.

@totemcaf
Copy link
Author

@maxekman which is your experience with this?

@maxekman
Copy link
Member

Great find @totemcaf! I have not experienced any problems with this, but as you state the natural sort order should not be trusted.

I wounded about the performance implications for the two solutions. Have you had any thought about it?

(I’m currently not using this project myself in production. Perhaps @qzio or @klowdo can weigh in here?)

@qzio
Copy link
Contributor

qzio commented Jul 23, 2023

We have not seen this in production (yet).

About the two solutions - maybe it can be configurable?

In our environment I would prefer to use the computing power of the the application instead of the database.

But if you have very large documents - and a constrained app server - this might not be the preferred solution.

@totemcaf
Copy link
Author

@qzio eh.Event is an interfaz, so you only are "moving" interfaz pointers, so the size of documents should not affect (except for very low level times while loading memory pages, or for sorting them in the database).

The number of events read is the critical variable.

@qzio
Copy link
Contributor

qzio commented Jul 31, 2023

@qzio eh.Event is an interfaz, so you only are "moving" interfaz pointers, so the size of documents should not affect (except for very low level times while loading memory pages, or for sorting them in the database).

The number of events read is the critical variable.

We use github.com/looplab/eventhorizon/eventstore/mongodb where each aggregate's events is in a single document.

Maybe we aren't affected at all, since we read the events based on the order in the array in the document anyway...

@jefflinse
Copy link
Contributor

+1 for making this configurable, and for adding an option to perform the sorting on the database side. I ran into this out-of-order situation recently (which led me here), and I'd like to be able to try out both the app-based and DB-based approaches in our environments before choosing which strategy will serve us best.

@totemcaf Thank you for putting together that PR! Happy to collab on the configuration aspects if you want as this would be a very valuable fix for us.

@totemcaf
Copy link
Author

@jefflinse Thanks for your comments. PR #411 includes the possibility to configure the new behaviour keeping backward compatibility.

I'm eager to see that PR approved and merged.

@totemcaf
Copy link
Author

@qzio eh.Event is an interfaz, so you only are "moving" interfaz pointers, so the size of documents should not affect (except for very low level times while loading memory pages, or for sorting them in the database).
The number of events read is the critical variable.

We use github.com/looplab/eventhorizon/eventstore/mongodb where each aggregate's events is in a single document.

Maybe we aren't affected at all, since we read the events based on the order in the array in the document anyway...

@qzio, your right, mongodb (aka v1) should not be affected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

4 participants