When it comes to node discovery, TOKEN_AWARE as a NodeDiscoveryType and TOKEN_AWARE as a ConnectionPoolType are interrelated, which is somewhat confusing.
NodeDiscoveryType is determined as follows (and usually not by setDiscoveryType()):
- If you've provided Seeds via setSeeds and ConnectionPoolType is TOKEN_AWARE then NodeDiscoveryType is RING_DESCRIBE.
- If you've provided Seeds via setSeeds and ConnectionPoolType is anything other than TOKEN_AWARE then the NodeDiscoveryType you configured via setDiscoveryType is used. This is the only case in which your configured NodeDiscoveryType takes effect.
- If you did not provide Seeds via setSeeds AND ConnectionPoolType is TOKEN_AWARE then NodeDiscoveryType is TOKEN_AWARE.
- If you did not provide Seeds via setSeeds AND ConnectionPoolType is anything other than TOKEN_AWARE then NodeDiscoveryType is DISCOVERY_SERVICE.
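The four rules above can be sketched as a small decision function. This is only an illustration of the rules as described here, not Astyanax's actual code: the class `DiscoveryRules`, the method `resolve`, and the nested enums are hypothetical stand-ins for the library's types.

```java
// Hypothetical stand-ins for illustration; these are NOT the Astyanax classes.
final class DiscoveryRules {
    enum PoolType { ROUND_ROBIN, TOKEN_AWARE, BAG }
    enum DiscoveryType { RING_DESCRIBE, DISCOVERY_SERVICE, TOKEN_AWARE, NONE }

    /**
     * Returns the discovery type that ends up in effect.
     *
     * @param seedsProvided whether seeds were supplied (setSeeds)
     * @param poolType      the configured connection pool type
     * @param configured    the value passed to setDiscoveryType
     */
    static DiscoveryType resolve(boolean seedsProvided, PoolType poolType, DiscoveryType configured) {
        boolean tokenAwarePool = poolType == PoolType.TOKEN_AWARE;
        if (seedsProvided) {
            // Seeds + TOKEN_AWARE pool forces RING_DESCRIBE; otherwise the
            // configured value is honored (the only case where it is).
            return tokenAwarePool ? DiscoveryType.RING_DESCRIBE : configured;
        }
        // No seeds: the configured value is ignored either way.
        return tokenAwarePool ? DiscoveryType.TOKEN_AWARE : DiscoveryType.DISCOVERY_SERVICE;
    }

    public static void main(String[] args) {
        System.out.println(resolve(true, PoolType.TOKEN_AWARE, DiscoveryType.NONE));
    }
}
```

Writing it this way makes the surprising part visible: `configured` is only read in one of the four branches.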
Node Discovery
Now that we've determined how NodeDiscoveryType is set, let's see how it impacts actually discovering nodes. Node discovery boils down to which implementation of HostSupplier (i.e. Supplier<List<Host>>) is used.
- If NodeDiscoveryType (from above) is DISCOVERY_SERVICE then you must supply a HostSupplier (via withHostSupplier).
- If NodeDiscoveryType (from above) is RING_DESCRIBE then RingDescribeHostSupplier is used.
- If NodeDiscoveryType (from above) is TOKEN_AWARE and a HostSupplier is set (via withHostSupplier) then FilteringHostSupplier is used with RingDescribeHostSupplier.
- If NodeDiscoveryType (from above) is TOKEN_AWARE and no HostSupplier is set then RingDescribeHostSupplier is used.
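As a rough sketch, the mapping from NodeDiscoveryType to host supplier described above looks like the following. The class `SupplierRules`, the method `choose`, and both enums are hypothetical names made up for this illustration; the real library wires up actual supplier objects rather than returning a label, and the exception for a missing HostSupplier is an assumption about how "must" is enforced.

```java
// Hypothetical model of the selection rules; not the Astyanax implementation.
final class SupplierRules {
    enum DiscoveryType { RING_DESCRIBE, DISCOVERY_SERVICE, TOKEN_AWARE }
    enum SupplierChoice { USER_HOST_SUPPLIER, RING_DESCRIBE, FILTERING_WITH_RING_DESCRIBE }

    /**
     * @param type            the effective discovery type
     * @param hostSupplierSet whether a HostSupplier was given (withHostSupplier)
     */
    static SupplierChoice choose(DiscoveryType type, boolean hostSupplierSet) {
        switch (type) {
            case DISCOVERY_SERVICE:
                // A user-provided HostSupplier is mandatory here (assumed to fail fast).
                if (!hostSupplierSet) {
                    throw new IllegalStateException("DISCOVERY_SERVICE requires a HostSupplier");
                }
                return SupplierChoice.USER_HOST_SUPPLIER;
            case RING_DESCRIBE:
                return SupplierChoice.RING_DESCRIBE;
            case TOKEN_AWARE:
                // With a HostSupplier, the ring describe result is combined with it
                // through a filtering supplier; without one, ring describe is used alone.
                return hostSupplierSet
                        ? SupplierChoice.FILTERING_WITH_RING_DESCRIBE
                        : SupplierChoice.RING_DESCRIBE;
            default:
                throw new IllegalArgumentException("Unhandled discovery type: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(choose(DiscoveryType.TOKEN_AWARE, false));
    }
}
```

Three of the four paths involve RingDescribeHostSupplier, which is why the local datacenter setting matters in most configurations.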
RingDescribe and using the local DC
Based on the configuration you've supplied, you'll most likely end up with RingDescribeHostSupplier. RingDescribeHostSupplier allows connections to all nodes in the ring unless you've specified a datacenter. So, when setting up your AstyanaxContext using ConnectionPoolConfigurationImpl, you might want to call setLocalDatacenter with the desired DC. That ensures that hosts from the other DCs are not in the connection pool and that your requests stay local.
.withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
.setPort(9160)
.setMaxConnsPerHost(40)
.setLocalDatacenter("phx")
.setSeeds("cdb03.vip.phx.host.com:9160,cdb04.vip.phx.host.com:9160")
)
ConnectionPoolType
You also might want to set ConnectionPoolType to TOKEN_AWARE. When that value is left unset, it will default to ROUND_ROBIN (using the nodes from the node discovery work described above). TOKEN_AWARE ConnectionPoolType will "keep track of which hosts have which tokens and attempt to direct traffic intelligently".
I'd do something like this for Astyanax configuration, unless you are providing a HostSupplier.
.withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
.setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
.setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE)
)
Pool Optimizations
Another consideration is optimizing pool usage with Astyanax "latency awareness" on ConnectionPoolConfigurationImpl, but YMMV on the settings. For example:
.setLatencyScoreStrategy(new SmaLatencyScoreStrategyImpl(10000,10000,100,0.50))
// The constructor takes:
// UpdateInterval: 10000 : Will re-sort hosts per token partition every 10 seconds
// ResetInterval: 10000 : Will clear the latency every 10 seconds
// WindowSize: 100 : Uses last 100 latency samples
// BadnessThreshold: 0.50 : Will sort hosts if a host is more than 100%
See Astyanax Configuration
TL;DR
In summary, set NodeDiscoveryType to RING_DESCRIBE (if you aren't using a HostSupplier) and ConnectionPoolType to TOKEN_AWARE. Additionally, use setLocalDatacenter to keep requests local to the DC, and consider the latency awareness settings.