Question

I am using the Astyanax client to read data from a Cassandra database. I have a single cluster with four nodes and a replication factor of 2. I am trying to understand the difference between the

setMaxConns and setMaxConnsPerHost 

methods in Astyanax client? I cannot find proper documentation on this.

I have multithreaded code that spawns multiple threads, creates the connection to the Cassandra database only once (it is a singleton), and then reuses it for all other requests.

Now I am trying to understand how these two methods affect read performance, and how their values should be set.

And if I set these two methods as

setMaxConns(-1) and setMaxConnsPerHost(20) 

then what does it mean? Any explanation will be of great help.

Updated code:

Below is the code I am using to make the connection:

private CassandraAstyanaxConnection() {

    context = new AstyanaxContext.Builder()
    .forCluster(ModelConstants.CLUSTER)
    .forKeyspace(ModelConstants.KEYSPACE)
    .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()      
        .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
    )
    .withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
        .setPort(9160)
        .setMaxConnsPerHost(20)
        .setMaxConns(-1)
        .setSeeds("host1:9160,host2:9160,host3:9160,host4:9160")
    )
    .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()      
        .setCqlVersion("3.0.0")
        .setTargetCassandraVersion("1.2"))
    .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
    .buildKeyspace(ThriftFamilyFactory.getInstance());

    context.start();
    keyspace = context.getEntity();

    emp_cf = ColumnFamily.newColumnFamily(
        ModelConstants.COLUMN_FAMILY, 
        StringSerializer.get(), 
        StringSerializer.get());
}

When I debug this code, it never hits the BagOfConnectionsConnectionPoolImpl class. I put a lot of breakpoints in that class to see how it uses the connections and the other default parameters, but I don't know why they are never reached.


Solution

The behavior of these configuration properties depends on the connection pool implementation in use.
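Which implementation is used is selected through the Astyanax configuration. As far as I know, the default pool type is TOKEN_AWARE rather than BAG, which would explain why breakpoints in BagOfConnectionsConnectionPoolImpl are never reached. A minimal sketch of selecting the bag pool explicitly follows; the cluster name, keyspace name, seed list and the limit values are placeholders, and the two configuration calls from the question are merged into one AstyanaxConfigurationImpl on the assumption that a second withAstyanaxConfiguration call replaces the first.

```java
import com.netflix.astyanax.AstyanaxContext;
import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolType;
import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
import com.netflix.astyanax.thrift.ThriftFamilyFactory;

class BagPoolContextSketch {
    static AstyanaxContext<Keyspace> build() {
        AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
            .forCluster("MyCluster")      // placeholder name
            .forKeyspace("MyKeyspace")    // placeholder name
            // One configuration object carrying all Astyanax-level settings
            .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
                .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
                .setConnectionPoolType(ConnectionPoolType.BAG) // select BagOfConnectionsConnectionPoolImpl
                .setCqlVersion("3.0.0")
                .setTargetCassandraVersion("1.2"))
            .withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
                .setPort(9160)
                .setMaxConnsPerHost(20)
                .setMaxConns(80)          // positive value; 4 hosts x 20 is only an example
                .setSeeds("host1:9160,host2:9160,host3:9160,host4:9160"))
            .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
            .buildKeyspace(ThriftFamilyFactory.getInstance());
        context.start();
        return context;
    }
}
```

With TOKEN_AWARE or ROUND_ROBIN left in place, only setMaxConnsPerHost matters, as described in the last section.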

BagOfConnectionsConnectionPoolImpl

BagOfConnectionsConnectionPoolImpl is the only implementation at the moment that honors both these properties. It behaves as follows:

A connection is borrowed from the pool on every Cassandra operation (query or mutation) and returned to the pool when the operation completes.

maxConnsPerHost - maximum number of connections to a single Cassandra host.

maxConns - maximum number of connections in the whole pool.

Both these numbers must be positive, so setMaxConns(-1) just won't work.

On an attempt to borrow a connection, the pool checks the number of active connections against maxConns. If the limit has been reached, it waits until a connection is released. If no connection becomes available within the configured timeout, the pool throws PoolTimeoutException.

If the maxConns limit has not been reached, the pool looks for a Cassandra host it is aware of (specified as a seed or found during discovery) whose number of active connections is below maxConnsPerHost, and connects to it. If all hosts have reached their connection limit, the pool throws NoAvailableHostsException.

For example, take a client that connects to a cluster of 4 nodes:

setMaxConns(100); setMaxConnsPerHost(10): The effective maximum number of connections is 40 (10 connections per node; no further connection attempts will be made). Once all 40 are in use, a further borrow attempt throws NoAvailableHostsException.

setMaxConns(20); setMaxConnsPerHost(10): The effective maximum number of connections is 20. The connections are distributed across the hosts uniformly, but not necessarily equally. Once all 20 are in use, a further borrow attempt that is not served within the timeout throws PoolTimeoutException.

Things get more complicated if nodes join or leave the cluster, but the general idea is the same.
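To make the two limits concrete, here is a small self-contained model of the borrowing rules described in this section. It is not Astyanax source code: the class BagPoolSketch, its nested exception classes and the least-loaded host selection are assumptions made for illustration only (the real pool's host selection may differ).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model of the rules above; NOT the Astyanax implementation.
class BagPoolSketch {
    static class PoolTimeoutException extends Exception {
        PoolTimeoutException(String msg) { super(msg); }
    }
    static class NoAvailableHostsException extends Exception {
        NoAvailableHostsException(String msg) { super(msg); }
    }

    private final int maxConns;
    private final int maxConnsPerHost;
    private final Map<String, Integer> activePerHost = new LinkedHashMap<>();
    private int activeTotal = 0;

    BagPoolSketch(int maxConns, int maxConnsPerHost, String... hosts) {
        if (maxConns <= 0 || maxConnsPerHost <= 0) {
            throw new IllegalArgumentException("both limits must be positive");
        }
        this.maxConns = maxConns;
        this.maxConnsPerHost = maxConnsPerHost;
        for (String h : hosts) activePerHost.put(h, 0);
    }

    // Borrow a connection and return the host it was opened against.
    synchronized String borrow(long timeoutMillis)
            throws PoolTimeoutException, NoAvailableHostsException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        // Rule 1: the global limit blocks the caller until a release or the timeout.
        while (activeTotal >= maxConns) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) throw new PoolTimeoutException("maxConns reached");
            wait(remaining);
        }
        // Rule 2: pick a host below its per-host limit (least loaded: an assumption).
        String chosen = null;
        for (Map.Entry<String, Integer> e : activePerHost.entrySet()) {
            boolean belowLimit = e.getValue() < maxConnsPerHost;
            if (belowLimit && (chosen == null || e.getValue() < activePerHost.get(chosen))) {
                chosen = e.getKey();
            }
        }
        if (chosen == null) throw new NoAvailableHostsException("all hosts at maxConnsPerHost");
        activePerHost.put(chosen, activePerHost.get(chosen) + 1);
        activeTotal++;
        return chosen;
    }

    // Return a connection to the pool and wake up any waiting borrowers.
    synchronized void release(String host) {
        activePerHost.put(host, activePerHost.get(host) - 1);
        activeTotal--;
        notifyAll();
    }

    synchronized int activeTotal() { return activeTotal; }

    public static void main(String[] args) throws Exception {
        // maxConns=20 is the binding limit here, so the 21st borrow times out.
        BagPoolSketch pool = new BagPoolSketch(20, 10, "host1", "host2", "host3", "host4");
        for (int i = 0; i < 20; i++) pool.borrow(0);
        try {
            pool.borrow(50);
        } catch (PoolTimeoutException e) {
            System.out.println("timed out: " + e.getMessage());
        }
    }
}
```

Constructing the model with (100, 10) and four hosts instead reproduces the first example: 40 borrows succeed and the 41st fails with the no-available-hosts error, because the per-host limit binds before the global one.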

TokenAwareConnectionPoolImpl & RoundRobinConnectionPoolImpl

Both TokenAwareConnectionPoolImpl & RoundRobinConnectionPoolImpl ignore the maxConns configuration property. They just select a host (depending on row token or randomly) and attempt to connect to it.

If the number of active connections to that host has reached maxConnsPerHost, the pool waits until a connection is released. If no connection becomes available within the configured timeout, another connection attempt, potentially to another host, is made as part of failover.
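The per-host-only behavior can be sketched in the same spirit. The model below is again an assumption-laden illustration, not library code: PerHostLimitSketch, the simple rotation used to pick hosts, and the maxAttempts parameter standing in for the failover policy are all hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model: one permit counter per host, no global maxConns limit.
class PerHostLimitSketch {
    private final String[] hosts;
    private final Semaphore[] permits;
    private final AtomicInteger next = new AtomicInteger();

    PerHostLimitSketch(int maxConnsPerHost, String... hosts) {
        this.hosts = hosts;
        this.permits = new Semaphore[hosts.length];
        for (int i = 0; i < hosts.length; i++) {
            permits[i] = new Semaphore(maxConnsPerHost);
        }
    }

    // Returns the index of the host a connection was obtained from,
    // or -1 if every attempt timed out.
    int borrow(long timeoutMillisPerHost, int maxAttempts) throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Select a host (simple rotation here; a token-aware pool would use the row key).
            int i = Math.floorMod(next.getAndIncrement(), hosts.length);
            // Wait for a free slot on that host only; on timeout, fail over to the next pick.
            if (permits[i].tryAcquire(timeoutMillisPerHost, TimeUnit.MILLISECONDS)) {
                return i;
            }
        }
        return -1;
    }

    void release(int hostIndex) { permits[hostIndex].release(); }

    String hostName(int hostIndex) { return hosts[hostIndex]; }

    public static void main(String[] args) throws InterruptedException {
        PerHostLimitSketch pool = new PerHostLimitSketch(1, "host1", "host2");
        int first = pool.borrow(10, 2);
        int second = pool.borrow(10, 2);
        int third = pool.borrow(10, 2); // both hosts are busy, so both attempts time out
        System.out.println(pool.hostName(first) + ", " + pool.hostName(second) + ", " + third);
    }
}
```

In this model the total number of connections is bounded only by maxConnsPerHost times the number of hosts, which is why a maxConns setting has no effect on it.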

License: CC-BY-SA with attribution
Not affiliated with StackOverflow