Active-active KV #64

77 changes: 77 additions & 0 deletions examples/use-cases/active-active/README.md
@@ -0,0 +1,77 @@
# Active Active

An active-active stream is typically used for
multi-region deployments where failover is desired.

There are two possible setups with active-active.

1. A primary region, A, serves all client connections and traffic
for a stream. If region A fails, clients transparently fail over
to an *active replica* of the stream in region B.

From the client's perspective, there is continuity between these
regions.

If region A recovers, clients may prefer to switch back to it,
particularly to reduce latency.

2. Multiple regions each serve their own set of clients, naturally
partitioned by geography. If a given region becomes unavailable,
its clients can fail over to a healthy region.

## Current limitations

- Two streams with the same name cannot exist within the same
account, even if placed on different clusters. By convention, each
stream should have a region suffix, e.g. `events-west` and
`events-east`, to indicate where it is placed.
- Although two streams can bi-directionally source from one
another, the subjects cannot be homogenized, since streams
under the same account cannot have overlapping subjects. In
addition, bi-directional sourcing with homogenized subjects
would lead to a loop.
- There is no concept of consumer mirrors, which means that on a
failover, consumers need to be recreated with the last known sequence.
This is feasible, but any message redelivery state in the
consumer will be lost. On failover, clients need to set the start
sequence to the earliest non-acked message and handle
later messages that may have already been processed.
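
The naming convention above can be sketched in Go as a small helper. This is only an illustration: the `Region` type and its methods are hypothetical names, and the `events` base name is taken from the example above. It shows how a per-region stream name and subject prefix keep subjects from overlapping within one account, and how a client might strip the prefix again when handling messages.

```go
package main

import (
	"fmt"
	"strings"
)

// Region is a hypothetical helper type; it is not part of any NATS API.
type Region string

// StreamName returns the suffixed stream name, e.g. "events-west".
func (r Region) StreamName(base string) string {
	return base + "-" + string(r)
}

// Subject returns the region-specific subject for a logical subject,
// e.g. "events.west.orders.created" for "orders.created".
func (r Region) Subject(base, logical string) string {
	return strings.Join([]string{base, string(r), logical}, ".")
}

// Logical splits a region-specific subject into its region and logical
// subject so messages sourced from any region can be handled uniformly.
func Logical(subject string) (region Region, logical string, ok bool) {
	parts := strings.SplitN(subject, ".", 3)
	if len(parts) != 3 {
		return "", "", false
	}
	return Region(parts[1]), parts[2], true
}

func main() {
	west := Region("west")
	fmt.Println(west.StreamName("events"))
	fmt.Println(west.Subject("events", "orders.created"))

	region, logical, _ := Logical("events.east.orders.created")
	fmt.Println(region, logical)
}
```

Keeping the region token in a fixed position makes the mapping reversible, which is what lets the application treat `events.west.>` and `events.east.>` as the same logical subject space without homogenizing them in the streams.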

## Possible improvements

- Formalize active-active streams by having clients that connect
to the cluster a stream exists on implicitly direct all
writes there.

- In the case where the stream/cluster/region becomes unavailable,
clients connect to another cluster and continue appending to and
consuming from the local stream.

## Assumptions

- Two or more regions
- One stream per region, each sourcing from the others.
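
As a rough sketch of the second assumption, the following Go program derives, for each region, a stream config whose `sources` list names every other region's stream with a `filter_subject` limited to that region's own subjects. The struct types and `ConfigFor` are hypothetical and model only the `name`, `subjects`, and `sources` fields; the region names `west`, `central`, and `east` and the `events` base name follow the JSON files in this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Source and StreamConfig are hypothetical types that model only the
// fields needed for this sketch, not the full stream configuration.
type Source struct {
	Name          string `json:"name"`
	FilterSubject string `json:"filter_subject"`
}

type StreamConfig struct {
	Name     string   `json:"name"`
	Subjects []string `json:"subjects"`
	Sources  []Source `json:"sources"`
}

// ConfigFor builds the config for one region's stream. The stream binds
// only its own region's subjects and sources every other region's
// stream, filtered to the subjects owned by that other region.
func ConfigFor(base, region string, regions []string) StreamConfig {
	cfg := StreamConfig{
		Name:     base + "-" + region,
		Subjects: []string{base + "." + region + ".>"},
		Sources:  []Source{},
	}
	for _, other := range regions {
		if other == region {
			continue // a stream does not source from itself
		}
		cfg.Sources = append(cfg.Sources, Source{
			Name:          base + "-" + other,
			FilterSubject: base + "." + other + ".>",
		})
	}
	return cfg
}

func main() {
	regions := []string{"west", "central", "east"}
	for _, r := range regions {
		b, err := json.MarshalIndent(ConfigFor("events", r, regions), "", "  ")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(b))
	}
}
```

Go's `encoding/json` escapes `>` as `\u003e` by default, so the printed subjects appear as `events.west.\u003e`; this is equivalent JSON for `events.west.>`.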

## Client responsibility

- Be aware of all cluster (region) endpoints
- Connect to the preferred cluster
- Each stream will have its own subject prefix corresponding to the cluster
- A client should set the `Nats-Msg-Id` header when publishing, for
deduplication, and always check acks for both sync and async publishing.
- For each consumer a client creates, it must maintain the current
stream ack floor sequence number
- An `AckAck` may be desirable if idempotent handling of messages
is problematic on the client-side
- When a failover is triggered, either automatically by some
failure detection mechanism or manually, the client must
reconnect to a healthy cluster, swap the stream name and subject
prefix (for publishing), and bootstrap the consumers.
- Since the new local stream may be lagging in replication, it is
possible that a consumer's ack floor is greater than the stream
sequence number. This implies the client observed a message in the
other cluster that was not yet replicated to this cluster. It also
implies that publishing to this stream in this state will result
in streams with potentially different ordering once the original
cluster becomes healthy. This may be mitigated by deduplication.
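
The failover steps above can be sketched without a NATS connection. This is a minimal illustration, not the example's actual client code: `Target`, `Failover`, and `ResumeSeq` are hypothetical names, and the sketch assumes, as the list above does, that the ack floor the client tracked can be compared with sequence numbers in the new local stream.

```go
package main

import "fmt"

// Target describes where a client connects, publishes, and consumes.
type Target struct {
	Cluster       string // cluster (region) to connect to
	Stream        string // e.g. "events-east"
	SubjectPrefix string // e.g. "events.east."
}

// Failover returns the first healthy cluster in preference order along
// with the stream name and subject prefix to swap to.
func Failover(base string, preferred []string, healthy map[string]bool) (Target, bool) {
	for _, c := range preferred {
		if healthy[c] {
			return Target{
				Cluster:       c,
				Stream:        base + "-" + c,
				SubjectPrefix: base + "." + c + ".",
			}, true
		}
	}
	return Target{}, false
}

// ResumeSeq returns the sequence a re-created consumer should start at.
// Normally this is the earliest non-acked message (ack floor + 1). If
// the ack floor is ahead of the stream, replication is lagging; one
// possible policy, used here, is to start at the next sequence the
// stream will store and rely on idempotent handling of redeliveries.
func ResumeSeq(ackFloor, streamLastSeq uint64) (start uint64, lagging bool) {
	if ackFloor > streamLastSeq {
		return streamLastSeq + 1, true
	}
	return ackFloor + 1, false
}

func main() {
	healthy := map[string]bool{"west": false, "east": true}
	tgt, ok := Failover("events", []string{"west", "east"}, healthy)
	if !ok {
		panic("no healthy cluster")
	}
	fmt.Println(tgt.Cluster, tgt.Stream, tgt.SubjectPrefix)

	start, lagging := ResumeSeq(25, 20)
	fmt.Println(start, lagging)
}
```

Listing the preferred cluster first also covers switching back: once the preferred region is reported healthy again, the same call returns it.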
28 changes: 28 additions & 0 deletions examples/use-cases/active-active/cli/Dockerfile
@@ -0,0 +1,28 @@
FROM golang:1.19-alpine3.17 AS build

RUN apk add --no-cache git bash curl jq

RUN go install github.com/nats-io/natscli/nats@latest
RUN go install github.com/nats-io/nats-server/v2@latest

WORKDIR /opt/app

COPY go.mod go.sum ./
RUN go mod download && go mod verify

COPY main.go ./
RUN go build -v -o /app .

FROM alpine

RUN apk add --no-cache bash curl jq

COPY --from=build /go/bin/nats-server /usr/local/bin/
COPY --from=build /go/bin/nats /usr/local/bin/
COPY --from=build /app /app

COPY . ./

ENTRYPOINT ["bash"]

CMD ["main.sh"]
35 changes: 35 additions & 0 deletions examples/use-cases/active-active/cli/central-edit.json
@@ -0,0 +1,35 @@
{
"name": "events-central",
"subjects": [
"events.central.\u003e"
],
"placement": {
"cluster": "central"
},
"retention": "limits",
"max_consumers": -1,
"max_msgs_per_subject": -1,
"max_msgs": -1,
"max_bytes": -1,
"max_age": 0,
"max_msg_size": -1,
"storage": "file",
"discard": "old",
"num_replicas": 1,
"duplicate_window": 120000000000,
"sealed": false,
"deny_delete": false,
"deny_purge": false,
"allow_rollup_hdrs": true,
"allow_direct": true,
"sources": [
{
"name": "events-west",
"filter_subject": "events.west.\u003e"
},
{
"name": "events-east",
"filter_subject": "events.east.\u003e"
}
]
}
26 changes: 26 additions & 0 deletions examples/use-cases/active-active/cli/central.json
@@ -0,0 +1,26 @@
{
"name": "events-central",
"subjects": [
"events.central.\u003e"
],
"placement": {
"cluster": "central"
},
"retention": "limits",
"max_consumers": -1,
"max_msgs_per_subject": -1,
"max_msgs": -1,
"max_bytes": -1,
"max_age": 0,
"max_msg_size": -1,
"storage": "file",
"discard": "old",
"num_replicas": 1,
"duplicate_window": 120000000000,
"sealed": false,
"deny_delete": false,
"deny_purge": false,
"allow_rollup_hdrs": true,
"allow_direct": true,
"sources": []
}
35 changes: 35 additions & 0 deletions examples/use-cases/active-active/cli/east-edit.json
@@ -0,0 +1,35 @@
{
"name": "events-east",
"subjects": [
"events.east.\u003e"
],
"placement": {
"cluster": "east"
},
"retention": "limits",
"max_consumers": -1,
"max_msgs_per_subject": -1,
"max_msgs": -1,
"max_bytes": -1,
"max_age": 0,
"max_msg_size": -1,
"storage": "file",
"discard": "old",
"num_replicas": 1,
"duplicate_window": 120000000000,
"sealed": false,
"deny_delete": false,
"deny_purge": false,
"allow_rollup_hdrs": true,
"allow_direct": true,
"sources": [
{
"name": "events-west",
"filter_subject": "events.west.\u003e"
},
{
"name": "events-central",
"filter_subject": "events.central.\u003e"
}
]
}
26 changes: 26 additions & 0 deletions examples/use-cases/active-active/cli/east.json
@@ -0,0 +1,26 @@
{
"name": "events-east",
"subjects": [
"events.east.\u003e"
],
"placement": {
"cluster": "east"
},
"retention": "limits",
"max_consumers": -1,
"max_msgs_per_subject": -1,
"max_msgs": -1,
"max_bytes": -1,
"max_age": 0,
"max_msg_size": -1,
"storage": "file",
"discard": "old",
"num_replicas": 1,
"duplicate_window": 120000000000,
"sealed": false,
"deny_delete": false,
"deny_purge": false,
"allow_rollup_hdrs": true,
"allow_direct": true,
"sources": []
}
16 changes: 16 additions & 0 deletions examples/use-cases/active-active/cli/go.mod
@@ -0,0 +1,16 @@
module github.com/bruth/nats-by-example/examples/use-cases/active-active-kv/cli

go 1.19

require (
github.com/nats-io/jsm.go v0.0.35
github.com/nats-io/nats.go v1.24.0
)

require (
github.com/golang/protobuf v1.5.3 // indirect
github.com/nats-io/nkeys v0.3.0 // indirect
github.com/nats-io/nuid v1.0.1 // indirect
golang.org/x/crypto v0.5.0 // indirect
google.golang.org/protobuf v1.30.0 // indirect
)
31 changes: 31 additions & 0 deletions examples/use-cases/active-active/cli/go.sum
@@ -0,0 +1,31 @@
github.com/golang/protobuf v1.5.0/go.mod h1:FsONVRAS9T7sI+LIUmWTfcYkHO4aIWwzhcaSAoJOfIk=
github.com/golang/protobuf v1.5.3 h1:KhyjKVUg7Usr/dYsdSqoFveMYd5ko72D+zANwlG1mmg=
github.com/golang/protobuf v1.5.3/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY=
github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/klauspost/compress v1.15.12 h1:YClS/PImqYbn+UILDnqxQCZ3RehC9N318SU3kElDUEM=
github.com/minio/highwayhash v1.0.2 h1:Aak5U0nElisjDCfPSG79Tgzkn2gl66NxOMspRrKnA/g=
github.com/nats-io/jsm.go v0.0.35 h1:l03xuGttRA9b81Q0P/WEGm3e5DYof743ZEI4nQR3PUs=
github.com/nats-io/jsm.go v0.0.35/go.mod h1:AkNKZTxbvdFBOJCdlKuLHsRlOP+AI4hV9REQKmq3sWw=
github.com/nats-io/jwt/v2 v2.3.0 h1:z2mA1a7tIf5ShggOFlR1oBPgd6hGqcDYsISxZByUzdI=
github.com/nats-io/nats-server/v2 v2.9.6 h1:RTtK+rv/4CcliOuqGsy58g7MuWkBaWmF5TUNwuUo9Uw=
github.com/nats-io/nats.go v1.24.0 h1:CRiD8L5GOQu/DcfkmgBcTTIQORMwizF+rPk6T0RaHVQ=
github.com/nats-io/nats.go v1.24.0/go.mod h1:dVQF+BK3SzUZpwyzHedXsvH3EO38aVKuOPkkHlv5hXA=
github.com/nats-io/nkeys v0.3.0 h1:cgM5tL53EvYRU+2YLXIK0G2mJtK12Ft9oeooSZMA2G8=
github.com/nats-io/nkeys v0.3.0/go.mod h1:gvUNGjVcM2IPr5rCsRsC6Wb3Hr2CQAm08dsxtV6A5y4=
github.com/nats-io/nuid v1.0.1 h1:5iA8DT8V7q8WK2EScv2padNa/rTESc1KdnPw4TC2paw=
github.com/nats-io/nuid v1.0.1/go.mod h1:19wcPz3Ph3q0Jbyiqsd0kePYG7A95tJPxeL+1OSON2c=
golang.org/x/crypto v0.0.0-20210314154223-e6e6c4f2bb5b/go.mod h1:T9bdIzuCu7OtxOm1hfPfRQxPLYneinmdGuTeoZ9dtd4=
golang.org/x/crypto v0.5.0 h1:U/0M97KRkSFvyD/3FSmdP5W5swImpNgle/EHFhOsQPE=
golang.org/x/crypto v0.5.0/go.mod h1:NK/OQwhpMQP3MwtdjgLlYHnH9ebylxKWv3e0fK+mkQU=
golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.4.0 h1:Zr2JFtRQNX3BCZ8YtxRE9hNJYC8J6I1MVbMg6owUp18=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/time v0.1.0 h1:xYY+Bajn2a7VBmTM5GikTmnK8ZuX8YgnQCqZpbBNtmA=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp09yW+WbY/TyQbw=
google.golang.org/protobuf v1.26.0/go.mod h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQnmE0givc=
google.golang.org/protobuf v1.30.0 h1:kPPoIgf3TsEvrm0PFe15JQ+570QVxYzEvvHqChK+cng=
google.golang.org/protobuf v1.30.0/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I=