This article applies to GemFire versions 7.x and above.
The purpose of this article is to provide a quick overview of best practices for networking with GemFire in three important categories: latency, throughput, and fault tolerance.
Latency is the most common performance bottleneck for network-dependent systems like GemFire. It can be reduced by following these best practices:
GemFire systems are often called upon to handle extremely high transaction volumes and, as a consequence, move large amounts of traffic through the network. One of the primary design goals in architecting a GemFire solution is therefore to maximize network throughput. The following recommendations assume that TCP and IPv4 are in use:
Increasing TCP’s Initial Congestion Window allows TCP to transfer more data in the first round trip and significantly accelerates window growth, which is an especially important optimization for bursty and short-lived connections. On Linux the initial congestion window is a per-route setting, controlled by the initcwnd option of ip route rather than by a sysctl parameter. Older kernels default to a value below 10; the recommended value is 10. To apply it at boot, add lines like the following to /etc/rc.local:
defrt=`ip route | grep "^default" | head -1`
ip route change $defrt initcwnd 10
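Before putting a route change into a boot script, it can help to see exactly which command would run. The following is a minimal sketch of a dry-run helper; the function name initcwnd_command and the 192.0.2.1 address in the usage note are illustrative assumptions, not part of GemFire or iproute2. It only prints the command and never changes the routing table.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) the command that sets initcwnd
# on a default route.
# $1: a default-route line as printed by `ip route`
# $2: desired initial congestion window in segments (defaults to 10)
initcwnd_command() {
    route=$1
    cwnd=${2:-10}
    # Remove any initcwnd option already present, then trailing spaces,
    # so the option appears exactly once in the generated command.
    route=$(printf '%s\n' "$route" |
        sed -E -e 's/ +initcwnd +[0-9]+//' -e 's/ +$//')
    printf 'ip route change %s initcwnd %s\n' "$route" "$cwnd"
}
```

For example, initcwnd_command "default via 192.0.2.1 dev eth0" 10 prints the ip route change command for review. On a live host the first argument could come from ip route | grep "^default" | head -1, as in the lines above.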
Disabling TCP Slow-Start After Idle improves the performance of long-lived TCP connections that transfer data in bursts. By default, TCP resets the congestion window of a connection that has been idle and restarts slow start, sending only a few segments at first and growing the window gradually. This causes unnecessary slowness at the start of every request that follows an idle period. Set net.ipv4.tcp_slow_start_after_idle to 0 to disable this behavior.
Enabling Window Scaling (RFC 1323) increases the maximum receive window size and allows high-latency connections to achieve better throughput. Set net.ipv4.tcp_window_scaling to 1 to enable it.
Enabling TCP Low Latency tells the operating system to sacrifice throughput for lower latency. For latency-sensitive workloads like GemFire, this is an acceptable tradeoff that can improve performance. Set net.ipv4.tcp_low_latency to 1 to enable it.
Enabling TCP Fast Open (TFO) allows application data to be sent in the initial SYN packet in certain situations. TFO is a newer optimization that requires support on both clients and servers and may not be available on all operating systems. By default the feature is not enabled at runtime unless it has been configured in the sysctl.conf file. Set net.ipv4.tcp_fastopen to 1 to enable it for outgoing (client) connections. On Linux the value is a bitmask, so 2 enables it for listening (server) sockets and 3 enables both.
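To confirm that a host actually carries the sysctl values listed above, a small read-only check can be useful. The sketch below is an assumption-laden example, not an official tool: the function name check_tcp_settings is hypothetical, and the optional directory argument exists only so the check can be tried against a copy of the files. It reads the values from /proc/sys and compares them with the values given in this article.

```shell
#!/bin/sh
# Hypothetical helper: compare the TCP sysctls discussed above with the
# values recommended in this article. Read-only; changes nothing.
# $1: sysctl root directory (defaults to /proc/sys)
check_tcp_settings() {
    root=${1:-/proc/sys}
    rc=0
    for pair in \
        tcp_slow_start_after_idle=0 \
        tcp_window_scaling=1 \
        tcp_low_latency=1 \
        tcp_fastopen=1
    do
        name=${pair%%=*}   # text before the first "="
        want=${pair#*=}    # text after the first "="
        file=$root/net/ipv4/$name
        if [ -r "$file" ]; then
            have=$(cat "$file")
        else
            have=missing
        fi
        if [ "$have" = "$want" ]; then
            echo "OK    net.ipv4.$name = $have"
        else
            echo "CHECK net.ipv4.$name = $have (recommended: $want)"
            rc=1
        fi
    done
    # Non-zero when at least one value differs from the recommendation.
    return $rc
}
```

Calling check_tcp_settings with no argument inspects the running kernel. A value can be changed at runtime with sysctl -w, for example sysctl -w net.ipv4.tcp_slow_start_after_idle=0, and kept across reboots by adding a line such as net.ipv4.tcp_slow_start_after_idle = 0 to the sysctl.conf file.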
In addition, increasing the size of the transmit queue can help TCP throughput. To do so, add the following command to /etc/rc.local:
/sbin/ifconfig eth0 txqueuelen 10000
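On systems where ifconfig is not installed, the same setting can be applied with ip link. The sketch below is a hedged example: the function name txqueuelen_command and the optional sysfs argument are assumptions made for illustration. It reads the current queue length from sysfs and prints the ip link command only when the queue is smaller than requested, so the output can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical helper: print the command needed to raise an interface's
# transmit queue length, or nothing if it is already large enough.
# $1: interface name
# $2: desired queue length (defaults to 10000, the value used above)
# $3: sysfs root (defaults to /sys)
txqueuelen_command() {
    ifname=$1
    want=${2:-10000}
    sysfs=${3:-/sys}
    file=$sysfs/class/net/$ifname/tx_queue_len
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    have=$(cat "$file")
    if [ "$have" -lt "$want" ]; then
        echo "ip link set dev $ifname txqueuelen $want"
    fi
}
```

For example, txqueuelen_command eth0 prints ip link set dev eth0 txqueuelen 10000 on a host whose eth0 queue is still shorter than 10000, and prints nothing once the change has been applied.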
GemFire systems depend on network services, and network failures can have a significant impact on GemFire operations and performance. As a result, network fault tolerance is an important design goal for GemFire solutions. Use Mode 6 Network Interface Card (NIC) bonding: NIC bonding combines multiple network connections in parallel to increase throughput and to provide redundancy should one of the links fail.
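As a rough illustration of what a mode 6 bond involves on Linux, the sketch below prints the iproute2 commands that would create one. The function name bond_setup_commands, the bond and interface names, and the miimon value of 100 are assumptions chosen for the example. Distribution-specific network configuration files are the usual way to make a bond persistent and are not covered here.

```shell
#!/bin/sh
# Hypothetical helper: print (but do not run) iproute2 commands that create
# a mode 6 bond from the given interfaces.
# $1: bond name; remaining arguments: member interfaces
bond_setup_commands() {
    bond=$1
    shift
    # Mode 6 is named balance-alb (adaptive load balancing).
    # miimon 100 asks the driver to check link state every 100 ms (assumed value).
    echo "ip link add $bond type bond mode balance-alb miimon 100"
    for nic in "$@"; do
        # An interface is taken down before it is attached to the bond.
        echo "ip link set dev $nic down"
        echo "ip link set dev $nic master $bond"
    done
    echo "ip link set dev $bond up"
}
```

For example, bond_setup_commands bond0 eth0 eth1 prints six commands for review. After they have been run as root, the file /proc/net/bonding/bond0 reports the bonding mode and the state of each member link.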