Reuben Bond ReubenBond Microsoft Redmond, Washington https://twitter.com/ReubenBond dotnet/orleans

OrleansContrib/OrleansFabricSilo 60

Orleans running on Service Fabric

OrleansContrib/Orleans.Consensus 37

Raft implemented on Orleans

ReubenBond/cocos2d-x 6

Port of cocos2d-iphone in C++

Netonic/Netonic 2

All the code

centur/DDDMelb2016.VirtualActors 1

Repository with Virtual actors workshop for DDD Melbourne 2016

Noplog/Noplog 1

.NET Replicated Log

ReubenBond/bootstrap-datepicker 1

A datepicker for @twitter bootstrap; originally by Stefan Petre of eyecon.ro, improvements by @eternicode

pull request comment dotnet/orleans

[Backport] Add C# 9.0 `record` support to serializer (#7108)

Does one still have to use Immutable<T> to tell the runtime not to deep-copy the object, or is that no longer necessary for record types (since records are immutable by design)?

Records aren't necessarily immutable. They can have mutable fields or fields of mutable types (dictionary, list, etc).

Note that you can annotate a type with [Immutable] instead of wrapping it with Immutable<T> if you prefer.
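For illustration, a minimal sketch of the two approaches; the attribute and wrapper come from Orleans.Concurrency, while the grain and record names here are invented for the example:

    using System.Threading.Tasks;
    using Orleans;
    using Orleans.Concurrency;

    // Option 1: annotate the type itself, so Orleans skips deep-copying it
    // whenever it is passed in a grain call.
    [Immutable]
    public record PriceQuote(string Symbol, decimal Price);

    // Option 2: wrap an individual argument in Immutable<T> at the interface instead.
    public interface IQuoteGrain : IGrainWithStringKey
    {
        Task Publish(Immutable<PriceQuote> quote);
    }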

ReubenBond

comment created time in 21 hours

pull request comment dotnet/runtime

Increase default HostOptions.ShutdownTimeout value to 30 seconds

How do we decide if 30s is really a better value than 5s? Do we have any data about how long apps take to shut down?

I laid out the principles behind picking a value in the original post. 5s is the outlier - a value of 30s-2min is more typical for a shutdown timeout. No firm data, but I work with internal and external teams, and it's common for them to need to increase this value to make things shut down smoothly on Kubernetes.

If an app is hanging on shutdown then it's better that it annoys developers during dev time rather than causing odd/ungraceful behavior in production. I agree that we should change that in templates for local dev (like we do for binding ports and the environment).

We should optimize towards:

  • Correct apps work well in production
  • Defects are apparent at dev/test time
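For reference, a minimal sketch of how an app can opt into the proposed value today using the standard hosting APIs (the generic-host setup here is just an example):

    using System;
    using Microsoft.Extensions.DependencyInjection;
    using Microsoft.Extensions.Hosting;

    var host = Host.CreateDefaultBuilder(args)
        .ConfigureServices(services =>
        {
            // Give hosted services more time to drain work before the host
            // gives up on graceful shutdown (the current default is 5 seconds).
            services.Configure<HostOptions>(options =>
                options.ShutdownTimeout = TimeSpan.FromSeconds(30));
        })
        .Build();

    await host.RunAsync();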
ReubenBond

comment created time in 4 days

issue comment dotnet/orleans

First-class on-premise Stream Provider

To clarify what that blog post says, etcd is used for storing metadata/specs/status, but not application state. Small, low velocity data. If it meets your needs, then by all means go for it.

turowicz

comment created time in 4 days

issue comment dotnet/orleans

[Epic]: Support Multi-clustering for Orleans

We discussed this with an internal partner team today and have some ideas for how to implement this well. I'm documenting some of the takeaways from the discussion here for future reference.

I suggest we call this feature "metaclusters" because it involves a cluster of clusters (a cluster is a set of collaborating processes; a metacluster is a set of sets of processes).

We want to support communication between multiple Orleans clusters, typically geographically separated.

This will involve some changes to Orleans' core and we have a few guiding principles:

  • The feature should be opt-in and pay-for-what-you-use
  • There should be no unavoidable single points of failure in the design
  • Clusters should be able to communicate with each other via a load balancer (TCP/HTTPS) and should not require a shared VPN / full IP-addressability to each server in each cluster.

Our current thinking in terms of design and work:

Metacluster Membership

  • We will add a new provider for mapping ClusterIds to communication endpoints (TCP/HTTPS gateway endpoints) as well as for indicating liveness; a rough interface sketch follows this list.
  • The default implementation of this will use a globally accessible table (eg, Azure Table Storage)
  • Administrators can manipulate this table to add/remove clusters from the metacluster (possibly using tooling / management APIs)
  • Enhancement: allow clusters to check each other for liveness and automatically update the table with status (Up/Down) depending on network connectivity. Clusters do not crash & restart when they detect that they are marked Down, instead, they check that they can communicate with the currently Up clusters before resetting their status in the table to Up again.
  • Conceptually, to remove the single-point-of-failure on a globally accessible database in the future, we can use a static list of cluster endpoints as a seed and then use consensus among them to determine which are alive or not. This is complex in itself, so it's best to defer this while keeping the option open.
  • Clusters are responsible for managing the liveness of their constituent servers themselves. I.e., there are no cross-cluster per-server liveness checks.
  • Clusters read each other's membership using API calls against the cluster gateway (rather than directly reading membership tables from storage). This can also be used to get cluster manifests, etc, for metacluster-aware placement.
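As a rough illustration of the first bullet, a hypothetical shape for such a provider; every name below is invented for this sketch and nothing in Orleans defines it today:

    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    // Hypothetical: one entry per cluster in a globally accessible table.
    public sealed record MetaclusterMemberEntry(
        string ClusterId,
        Uri[] GatewayEndpoints,    // TCP/HTTPS gateway endpoints for the cluster
        string Status,             // "Up" or "Down"
        DateTimeOffset LastUpdated);

    // Hypothetical provider for reading and updating metacluster membership.
    public interface IMetaclusterMembershipProvider
    {
        Task<IReadOnlyList<MetaclusterMemberEntry>> ReadAllAsync();
        Task UpdateStatusAsync(string clusterId, string status);
    }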

Routing/Networking

  • Servers will connect to each Up endpoint in the metacluster table and route all messages destined for any server in the metacluster table via that connection.
  • Therefore, each server will establish a connection to every server which is a part of its cluster as well as a connection to a single gateway in every other cluster. When the connection drops (eg because that gateway shuts down) they will retry the connection indefinitely (likely with some backoff).
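A loose sketch of the reconnect behavior in the last bullet; purely illustrative, with the delay values and the connect-and-pump helper assumed rather than taken from Orleans:

    using System;
    using System.Threading;
    using System.Threading.Tasks;

    internal static class GatewayConnectionLoop
    {
        // Hypothetical: keep one connection to a foreign cluster's gateway alive,
        // retrying indefinitely with capped exponential backoff when it drops.
        public static async Task MaintainAsync(
            Uri gateway,
            Func<Uri, CancellationToken, Task> connectAndPumpMessagesAsync, // assumed helper
            CancellationToken ct)
        {
            var delay = TimeSpan.FromSeconds(1);
            while (!ct.IsCancellationRequested)
            {
                try
                {
                    await connectAndPumpMessagesAsync(gateway, ct);
                    delay = TimeSpan.FromSeconds(1); // reset after a healthy session
                }
                catch (Exception)
                {
                    await Task.Delay(delay, ct);
                    delay = TimeSpan.FromSeconds(Math.Min(delay.TotalSeconds * 2, 30));
                }
            }
        }
    }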

Placement:

  • SiloAddress should have some way of identifying the cluster it belongs to. That might involve embedding a cluster id, or migrating from an IPEndPoint to a string, where the string can map to an IPEndPoint for local communication - either via parsing, via a mapping included in the membership table (eg, "ClusterX/SiloY" has endpoints: { IPv4: 10.0.0.2:11111, IPv6: [::2]:11111, FQDN: "silox.clustery.internal.contoso.com" }), or via an external mapping. A rough sketch of such a mapping entry follows this list.
  • Therefore, IGrainLocator implementations have some mechanism for returning foreign addresses. Typically, we imagine that only a subset of grain types will be allowed to exist across clusters (single instance per metacluster). That would use the existing mechanisms (customizable grain locator/directory per grain)
  • Orleans checks that the target silo is live before routing calls to grains. Therefore, the check needs to incorporate some knowledge of the metacluster.
  • It's the job of the developer to configure a placement provider which uses a globally accessible directory: that is not a part of the metacluster feature
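For illustration of the first bullet above, a hypothetical shape for the per-silo endpoint mapping; every name here is invented for the sketch:

    using System.Net;

    // Hypothetical: a membership-table entry mapping a logical silo name to the
    // concrete endpoints usable for local or cross-cluster communication.
    public sealed record SiloEndpointEntry(
        string ClusterId,        // e.g. "ClusterX"
        string SiloName,         // e.g. "SiloY"
        IPEndPoint? IPv4,
        IPEndPoint? IPv6,
        string? Fqdn);           // e.g. "silox.clustery.internal.contoso.com"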

Non-goals:

  • We want to avoid any notion that this is some kind of panacea for globe-spanning applications which automatically do the optimal thing in any given situation. For example, we are not planning to give Orleans a notion of inter-cluster link latency or bandwidth, Orleans won't divine which cluster is the best place for a given grain, and it has no intrinsic knowledge of data sovereignty, etc. The goal is to provide mechanisms for building globe-spanning applications rather than a packaged solution.
  • Grains are either globally accessible or locally accessible, depending on their configured grain locator: if IMyGrain uses a locally-scoped grain directory in each cluster, then sending a reference to it to a globally-scoped grain in some other cluster will result in a reference which is locally scoped to that foreign cluster, not the originating cluster. If there's a need for per-cluster grains, that can be accomplished by encoding something into the grain id and interpreting it at placement time.

cc @JohnMorman @juyuz

rafikiassumani-msft

comment created time in 5 days

Pull request review comment dotnet/orleans

Address #7256 pulling agent checkpoint out of bounds

 public interface IStreamQueueCheckpointerFactory
 {
     Task<IStreamQueueCheckpointer<string>> Create(string partition);
 }
-
+
 public interface IStreamQueueCheckpointer<TCheckpoint>
 {
     bool CheckpointExists { get; }
     Task<TCheckpoint> Load();
+    Task Reset() => throw new NotSupportedException();

I believe this requires .NET Core 3.x or later to work
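For context, the added Task Reset() line is a default interface method, a C# 8 feature whose runtime support first shipped in .NET Core 3.0; a minimal sketch with invented type names:

    using System;
    using System.Threading.Tasks;

    public interface ICheckpointer
    {
        Task Load();

        // Default interface method: existing implementers keep compiling,
        // but this requires C# 8+ and a runtime that supports it (.NET Core 3.0+).
        Task Reset() => throw new NotSupportedException();
    }

    // A class written before Reset() existed still satisfies the interface.
    public sealed class LegacyCheckpointer : ICheckpointer
    {
        public Task Load() => Task.CompletedTask;
    }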

oising

comment created time in 5 days

Pull request review event

push event benjaminpetit/orleans

Reuben Bond

commit sha ba2212ba28d818ad9ef4814639d64e2e5ec41ed2

Update src/Orleans.Core/Providers/IGrainStorageSerializer.cs

push time in 5 days

issue comment dotnet/orleans

First-class on-premise Stream Provider

How should I go about making sure it goes in the Contrib?

Coordinate with the community in the #development channel in Discord.

do you know how NATS compares to Etcd in terms of key-value storage?

It's not built for it - I was suggesting NATS for a stream provider, not KV storage. If the goal is to run on small devices, maybe these kinds of systems aren't ideal. They're mostly built for clusters of servers. Do you need something replicated/resilient? If not, maybe use SQLite for everything and store data locally.

turowicz

comment created time in 5 days

started ikarus23/MifareClassicTool

started time in 5 days

release OrleansContrib/Orleans.Redis

v3.2.1

released time in 6 days

created tag OrleansContrib/Orleans.Redis

tag v3.2.1

Redis support packages for Orleans

created time in 6 days

delete branch OrleansContrib/Orleans.Redis

delete branch : ReubenBond-patch-1

delete time in 6 days

push event OrleansContrib/Orleans.Redis

Reuben Bond

commit sha 46fa5537d3246c8fbe8f8c1b889f5e24970f05e7

Bump version to 3.2.1 (#26)

push time in 6 days

create branch OrleansContrib/Orleans.Redis

branch : ReubenBond-patch-1

created branch time in 6 days

push event OrleansContrib/Orleans.Redis

Artem Eliseev

commit sha 948e409f3dd61c1f3b9c0718ee35104ee43cc852

Replaced .Count() by .Length (#20) (#24)

* Replaced .Count() by .Length (#20)
* Update RedisGrainStorage.cs

Co-authored-by: Reuben Bond <203839+ReubenBond@users.noreply.github.com>

push time in 6 days

PR merged OrleansContrib/Orleans.Redis

Replaced .Count() by .Length (#20)

Fixes #20

As the new bugfix (PR #23) will lead to the preparation of a new package version, I think it is OK to prepare another PR with a slight perf improvement.

+1 -1

0 comment

1 changed file

f-i-x-7

pr closed time in 6 days

issue closed OrleansContrib/Orleans.Redis

RedisGrainStorage.ReadStateAsync() should use Array.Length instead of Linq.Enumerable.Count()

There is an opportunity for a micro perf improvement by replacing this line:

https://github.com/OrleansContrib/Orleans.Redis/blob/1de3c47e591c14d2ff22c12dccbfe3de037f377c/src/Orleans.Persistence.Redis/RedisGrainStorage.cs#L124

to this:

if (hashEntries.Length == 2)

hashEntries is an array, so this is technically possible, and it should benefit from eliminating the virtual call to ICollection<T>.Count that Linq.Enumerable.Count() makes under the hood (in theory, several inlines made by the JIT could open up guarded devirtualization, but I think it is better to reduce JIT work anyway and just access Length directly).

This is a very simple change, so if you are interested in it, I could submit a PR.
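A before/after view of the proposed change, with the condition inferred from the linked line and the rest of the method elided:

    // Before: LINQ extension method; resolves to a virtual ICollection<T>.Count call.
    if (hashEntries.Count() == 2) { /* ... */ }

    // After: direct field access on the array, no interface dispatch.
    if (hashEntries.Length == 2) { /* ... */ }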

closed time in 6 days

f-i-x-7

push event f-i-x-7/Orleans.Redis

Artem Eliseev

commit sha 1795ae199fad58b12b2faf0ce5c50dbcb57458a5

Feature/fix redis grain storage errors after redis server reboot (#23)

* Fix issue #19:
  1) ReadState: perform grain state format migration only when specific kind of RedisServerExceptions occurs (WRONGTYPE).
  2) WriteState: handle NOSCRIPT errors related with cleared Redis server script cache (e.g. when Redis is rebooted).
  3) WriteState: fix grain state format migration logic - migrate only when specific kind of RedisServerExceptions occurs (WRONGTYPE), unify migration logic with ReadState migration.
* Tests are added.
* LoadWriteScriptAsync() - do not use LINQ.

Artem Eliseev

commit sha 9108471768c37b31d2f11d05631019fdc3148579

RedisGrainStorage: added ConfigureAwait(false) to some awaited calls so that after merge of these changes and PR #23 all awaits in this class will use ConfigureAwait(false). (#25)

Reuben Bond

commit sha 2ddb8078b713affa29610d5696053256e931deda

Merge branch 'master' into feature/remove-linq-usage-in-one-method

push time in 6 days

push event f-i-x-7/Orleans.Redis

Reuben Bond

commit sha 78bcef0f184d17ea5e2969cdf134bd90ffb389a2

Update RedisGrainStorage.cs

push time in 6 days

push event OrleansContrib/Orleans.Redis

Artem Eliseev

commit sha 9108471768c37b31d2f11d05631019fdc3148579

RedisGrainStorage: added ConfigureAwait(false) to some awaited calls so that after merge of these changes and PR #23 all awaits in this class will use ConfigureAwait(false). (#25)

push time in 6 days

PR merged OrleansContrib/Orleans.Redis

RedisGrainStorage: added ConfigureAwait(false) to some awaited calls …

…so that after merge of these changes and PR #23 all awaits in this class will use ConfigureAwait(false).

As the new bugfix (PR #23) will lead to the preparation of a new package version, I think it is OK to prepare this PR with a slight improvement.

+2 -2

0 comment

1 changed file

f-i-x-7

pr closed time in 6 days

pull request comment OrleansContrib/Orleans.Redis

Feature/fix redis grain storage errors after redis server reboot

Many thanks, @f-i-x-7, looks good to me

f-i-x-7

comment created time in 6 days

push event OrleansContrib/Orleans.Redis

Artem Eliseev

commit sha 1795ae199fad58b12b2faf0ce5c50dbcb57458a5

Feature/fix redis grain storage errors after redis server reboot (#23)

* Fix issue #19:
  1) ReadState: perform grain state format migration only when specific kind of RedisServerExceptions occurs (WRONGTYPE).
  2) WriteState: handle NOSCRIPT errors related with cleared Redis server script cache (e.g. when Redis is rebooted).
  3) WriteState: fix grain state format migration logic - migrate only when specific kind of RedisServerExceptions occurs (WRONGTYPE), unify migration logic with ReadState migration.
* Tests are added.
* LoadWriteScriptAsync() - do not use LINQ.

push time in 6 days

PR merged OrleansContrib/Orleans.Redis

Feature/fix redis grain storage errors after redis server reboot

Fixes #19.

  1. ReadState: perform grain state format migration only when a specific kind of RedisServerException occurs (WRONGTYPE).
  2. WriteState: handle NOSCRIPT errors related to a cleared Redis server script cache (e.g. when Redis is rebooted).
  3. WriteState: fix the grain state format migration logic - migrate only when a specific kind of RedisServerException occurs (WRONGTYPE), unifying the migration logic with ReadState migration.

Tested on Windows with the latest Redis docker image (version 6.2.6). Test methodology:

  1. Automated - existing tests passed.
  2. Automated - added new tests that clear the Redis script cache before a grain state write (passed).
  3. Semi-manual using existing test - passed:
  • placed breakpoint in Json_BackwardsCompatible_ETag_Writes test before last call to await grain.Set()
  • started debugging of the test
  • hit breakpoint
  • restarted Redis (with docker stop & docker start commands)
  • checked current value of grain state in Redis (HGETALL "GrainReference=00000000000000000000000000bc628f03ffffffaf52472e|json"), compared it with expected value from test
  • checked that Redis script cache is empty
  • proceeded with test code execution (no errors occurred)
  • checked current value of grain state in Redis again, compared it with expected value from test and with previously observed value

I am also planning to perform a fully-manual test by creating simple silo & client apps and restarting the Redis server in between grain state writes.

+102 -40

1 comment

2 changed files

f-i-x-7

pr closed time in 6 days

issue closed OrleansContrib/Orleans.Redis

All keys deleted after Redis server rebooted unexpectedly

Redis server rebooted unexpectedly and all keys were deleted.

The following exception was thrown on every write operation after the reboot:

    Exc level 0: StackExchange.Redis.RedisServerException: NOSCRIPT No matching script. Please use EVAL.
       at Orleans.Persistence.RedisGrainStorage.WriteToRedisAsync(IDatabaseAsync db, Object state, String etag, String key, String newEtag)
       at Orleans.Persistence.RedisGrainStorage.WriteStateAsync(String grainType, GrainReference grainReference, IGrainState grainState)
       at Orleans.Core.StateStorageBridge`1.WriteStateAsync()

It seems that the logic attempts to migrate the grain state every time a RedisServerException is thrown. Is this a proper way to determine that the state should be migrated?

When RedisServerException is thrown, delete and write commands are executed in the same transaction, and in this case the write operation inside the transaction will fail. This exception is not caught in the code.

Redis transactions don't work the same way as in a normal relational database. Transactions will not roll back changes if one or more commands fail in the transaction.

So the reboot caused the Redis server to clear all cached SHAs for prepared scripts. The server then started to throw RedisServerException on every write operation. This was handled as an old grain state and a migration was started. Delete and write commands were executed in the same transaction. Only the delete succeeded, and all keys were deleted.
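To illustrate the failure mode (this is not the fix merged in PR #23), a minimal sketch of distinguishing a NOSCRIPT reply and falling back to a full EVAL with StackExchange.Redis; the helper name, script, and arguments are placeholders:

    using System.Threading.Tasks;
    using StackExchange.Redis;

    internal static class RedisScriptHelper
    {
        // Hypothetical: try EVALSHA first; if the server lost its script cache
        // (e.g. after a reboot) it replies NOSCRIPT, so re-send the full script
        // with EVAL instead of misreading the error as an old state format.
        public static async Task<RedisResult> EvalWithNoScriptFallbackAsync(
            IDatabase db, string script, byte[] scriptHash, RedisKey[] keys, RedisValue[] values)
        {
            try
            {
                return await db.ScriptEvaluateAsync(scriptHash, keys, values);
            }
            catch (RedisServerException ex) when (ex.Message.StartsWith("NOSCRIPT"))
            {
                return await db.ScriptEvaluateAsync(script, keys, values);
            }
        }
    }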

closed time in 6 days

jarkkojasberg

Pull request review event

PR opened dotnet/runtime

Increase default HostOptions.ShutdownTimeout value to 30 seconds

Fixes #63709

See the above for context.

+1 -1

0 comment

1 changed file

pr created time in 6 days

create branch ReubenBond/runtime

branch : fix/63709/increase-shutdowntimeout

created branch time in 6 days
