Some time ago, we were encountering a JDBC connection reset quite frequently in a standalone Java job that was being kicked off every 5 mins from cron.
Environment details:
JDBC Driver: Oracle 11.2.0.2.0 JDBC
JVM: java version "1.7.0_25" OpenJDK Runtime Environment (rhel-2.3.10.4.el5-x86_64) OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
OS: Red Hat Enterprise Linux Server release 6.2 (Santiago)
Database Server: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
The exception stack trace is shown below:
Exception in thread "main" java.sql.SQLRecoverableException: IO Error: Connection reset
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:428)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:536)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:228)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:521)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
...
The first observation from the logs was that the connect call hung for somewhere between 60 and 90 seconds before the Connection reset.
Some investigation soon revealed something unexpected. Apparently, the 11g JDBC driver initializes java.security.SecureRandom to generate random numbers, possibly for use in the client-server handshake during initial session setup. On Linux, the call to generate a seed for SecureRandom can block if /dev/random does not have sufficient entropy available. After a certain interval, the server resets the TCP connection because it sees no activity from the client, and that is what the exception reports.
The easy workaround is to set a JVM system property, i.e. to add -Djava.security.egd=file:///dev/urandom to the Java command line. This works because reads from /dev/urandom do not block even in the absence of entropy and simply continue to return (pseudo-)random bytes of lower quality.
There is also an alternative and roughly equivalent setting in the java.security file, described in point 1 below.
The practical upshot is that one of the above settings is recommended even if no Connection reset exceptions are encountered, because some unnecessary blocking of the connect may still occur without lasting long enough to trigger a reset from the server. In most cases this cuts down the database connect time. In theory, there could be security risks arising from the use of /dev/urandom, but this may not be the weakest link in the app security chain.
Though the problem can be worked around relatively easily, it is probably worthwhile to get some more background, as it is not entirely obvious that a connection reset from the database server can be directly related to a random seed generation call from Java.
man urandom provides a lot more detail on /dev/random and /dev/urandom, but an easy way to see the difference in behavior is to run the following a few times on Linux:
hexdump /dev/random
and
hexdump /dev/urandom
The first will typically produce output slowly while the second just zips through. The number of bits of entropy available at any point can be seen using cat /proc/sys/kernel/random/entropy_avail
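The same entropy counter can also be read from inside a Java job, for example to log it just before connecting. The following is a minimal sketch, assuming a Linux system that exposes /proc/sys/kernel/random/entropy_avail; the class and method names (EntropyCheck, availableEntropyBits) are made up for illustration, and -1 is returned wherever the file is missing or unreadable.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the JDK or the Oracle driver.
final class EntropyCheck {
    // Linux-specific counter mentioned in the text above.
    private static final Path ENTROPY_AVAIL =
            Paths.get("/proc/sys/kernel/random/entropy_avail");

    /** Returns the kernel's entropy estimate in bits, or -1 if it cannot be read. */
    static int availableEntropyBits() {
        if (!Files.isReadable(ENTROPY_AVAIL)) {
            return -1; // not Linux, or /proc is not accessible
        }
        try {
            byte[] raw = Files.readAllBytes(ENTROPY_AVAIL);
            String text = new String(raw, StandardCharsets.US_ASCII).trim();
            return Integer.parseInt(text);
        } catch (IOException | NumberFormatException e) {
            return -1; // treat any read or parse problem as "unknown"
        }
    }

    public static void main(String[] args) {
        int bits = availableEntropyBits();
        if (bits >= 0) {
            System.out.println("entropy_avail = " + bits + " bits");
        } else {
            System.out.println("entropy_avail is not readable on this system");
        }
    }
}
```

Logging this value next to the connect time is one way to check whether slow connects coincide with a low entropy estimate.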
Note: Mac OS X uses a different mechanism. Its man page (man urandom) mentions the use of the Yarrow pseudo-random number generator (PRNG), with the result that /dev/random and /dev/urandom are equivalent and reads do not block. Things are similar for Windows.
Still, there are further twists in the tale that are a bit tedious to get into, but it may be worth mentioning some of the key ones.
1. The config file $JAVA_HOME/jre/lib/security/java.security ships with a setting for securerandom.source that does not behave as expected:
securerandom.source=file:/dev/urandom
#
# The entropy gathering device is described as a URL and can also
# be specified with the system property "java.security.egd". For example,
# -Djava.security.egd=file:/dev/urandom
# Specifying this system property will override the securerandom.source
# setting.
Although this setting should result in /dev/urandom being used for seed generation, this does not actually happen. One of these values has the desired effect instead:
file:///dev/urandom
file:/dev/./urandom
file:/dev/../dev/urandom
This seems to be a Java quirk/bug, and the same holds for -Djava.security.egd.
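When debugging this, it helps to confirm which values the running JVM actually sees. The sketch below prints both settings using Security.getProperty and System.getProperty; the class name EgdSettings is made up for illustration. Note that it only reports the configured strings and does not prove which device the seed generator ends up opening.

```java
import java.security.Security;

// Hypothetical diagnostic helper; prints the two settings discussed above.
final class EgdSettings {
    /** Describes the configured entropy source settings as a single line. */
    static String describe() {
        // Value from the java.security config file (may be null if unset).
        String fromFile = Security.getProperty("securerandom.source");
        // Value from -Djava.security.egd=... on the command line (null if absent).
        String fromFlag = System.getProperty("java.security.egd");
        return "securerandom.source=" + fromFile
                + ", java.security.egd=" + fromFlag;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with and once without -Djava.security.egd=file:///dev/urandom shows whether the flag is reaching the JVM that cron starts.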
2. SecureRandom has a nextBytes method for getting the actual random bytes. This is not the call that blocks. Instead, not surprisingly, it is getSeed or generateSeed that can potentially block when /dev/random is in effect.
3. Virtualized environments tend to suffer from a lack of entropy:
http://www.phoronix.com/scan.php?page=news_item&px=MTI5NzY
http://www.forbes.com/2009/07/30/cloud-computing-security-technology-cio-network-cloud-computing.html
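Returning to point 2, the difference between nextBytes and generateSeed can be measured directly. This is a rough sketch rather than a benchmark, and the class and method names (SeedTiming, millisToGenerateSeed, millisForNextBytes) are made up for illustration. One caveat: depending on the SecureRandom algorithm in use, a freshly created instance may seed itself on its first nextBytes call, so the numbers are only indicative.

```java
import java.security.SecureRandom;

// Hypothetical timing helper for comparing the two SecureRandom calls.
final class SeedTiming {
    /** Times generateSeed, the call that may block when /dev/random is in effect. */
    static long millisToGenerateSeed(SecureRandom random, int numBytes) {
        long start = System.nanoTime();
        byte[] seed = random.generateSeed(numBytes);
        long elapsedNanos = System.nanoTime() - start;
        if (seed.length != numBytes) {
            throw new IllegalStateException("unexpected seed length: " + seed.length);
        }
        return elapsedNanos / 1_000_000L;
    }

    /** Times nextBytes, which fills the buffer with random bytes. */
    static long millisForNextBytes(SecureRandom random, int numBytes) {
        byte[] buffer = new byte[numBytes];
        long start = System.nanoTime();
        random.nextBytes(buffer);
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        System.out.println("nextBytes(16):    " + millisForNextBytes(random, 16) + " ms");
        System.out.println("generateSeed(16): " + millisToGenerateSeed(random, 16) + " ms");
    }
}
```

Comparing a run with the default settings against a run with -Djava.security.egd=file:///dev/urandom on the affected host should show whether seed generation is where the time goes.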