[Bug 224079] java/openjdk8: Elasticsearch won't start after OpenJDK upgrade

bugzilla-noreply at freebsd.org
Sat Dec 16 19:32:04 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224079

John W. O'Brien <john at saltant.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #188887|                            |maintainer-approval?
              Flags|                            |

--- Comment #4 from John W. O'Brien <john at saltant.com> ---
Created attachment 188887
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=188887&action=edit
java/openjdk8: Preserve OS-supplied IPv6 interface scope IDs

The problem is in the way u152 started handling IPv6 scope IDs in the
java.net.NetworkInterface class. This patch corrects that defect and allows
elasticsearch (ES) to start. Read on for details of my investigation and
analysis.

My attention was drawn to "[::1%2]" in the ktrace output. This looked wrong to
me.

I adapted the soconnect.d DTrace script from Gregg and Mauro [0] to get a look
at the bind(2) calls (sobind.d).

On a machine with u144 where ES starts:

PID    PROCESS          FAM ADDR                                    SCOPE   PORT
44187  java             28  fe80::1                                 3       9300
44187  java             28  ::1                                     0       9300
44187  java             28  127.0.0.1                               0       9300
44187  java             28  fe80::1                                 3       9200
44187  java             28  ::1                                     0       9200
44187  java             28  127.0.0.1                               0       9200


On a machine with u152 where ES fails:

PID    PROCESS          FAM ADDR                                    SCOPE   PORT
65851  java             28  fe80::1                                 3       9300
65851  java             28  ::1                                     3       9300
65851  java             28  ::1                                     3       9301
65851  java             28  ::1                                     3       9302
65851  java             28  ::1                                     3       9303
65851  java             28  ::1                                     3       9304
[repeat through PORT 9400]

Next, I wrote a short C program to exercise getifaddrs(3) and ran it on a few
different systems. I also wrote a C program to try calling bind(2) on an
arbitrary IPv6 address, scope ID, and port. This is the simplest, most direct
way I could think of to demonstrate the problem and confirm my understanding of
the applicable APIs.
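
The source of the getifaddrs(3) program is not part of this comment. As a
rough sketch of what such a tool could look like, the fragment below walks the
list returned by getifaddrs(3) and prints the address family, address, and
sin6_scope_id for each entry on one interface. The helper names (addr_scope,
addr_text, list_iface) and the column widths are assumptions made for
illustration, not taken from the original program.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <ifaddrs.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scope ID carried by an address, or -1 when the family has no such field. */
static long addr_scope(const struct sockaddr *sa)
{
    if (sa == NULL || sa->sa_family != AF_INET6)
        return -1;
    return (long)((const struct sockaddr_in6 *)sa)->sin6_scope_id;
}

/* Render an IPv4 or IPv6 address as text; other families give "". */
static const char *addr_text(const struct sockaddr *sa, char *buf, socklen_t len)
{
    assert(buf != NULL && len >= INET6_ADDRSTRLEN);
    buf[0] = '\0';
    if (sa == NULL)
        return buf;
    if (sa->sa_family == AF_INET)
        inet_ntop(AF_INET, &((const struct sockaddr_in *)sa)->sin_addr, buf, len);
    else if (sa->sa_family == AF_INET6)
        inet_ntop(AF_INET6, &((const struct sockaddr_in6 *)sa)->sin6_addr, buf, len);
    return buf;
}

/* Print one row per address on ifname; returns rows printed, or -1 on error. */
static int list_iface(const char *ifname)
{
    struct ifaddrs *head, *ifa;
    char buf[INET6_ADDRSTRLEN];
    int rows = 0;

    if (getifaddrs(&head) != 0) {
        perror("getifaddrs");
        return -1;
    }
    printf("%-5s %-11s %2s %-46s %5s %5s\n",
           "iface", "flags", "af", "addr", "scope", "ifidx");
    for (ifa = head; ifa != NULL; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL || strcmp(ifa->ifa_name, ifname) != 0)
            continue;
        printf("%-5s 0x%08x  %2d %-46s %5ld %5u\n",
               ifa->ifa_name, ifa->ifa_flags, (int)ifa->ifa_addr->sa_family,
               addr_text(ifa->ifa_addr, buf, sizeof(buf)),
               addr_scope(ifa->ifa_addr), if_nametoindex(ifa->ifa_name));
        rows++;
    }
    freeifaddrs(head);
    return rows;
}
```

A main() that passes argv[1] to list_iface() would complete the tool. The
point of interest is that the scope column comes straight from the sockaddr the
OS returned, with no adjustment by the caller.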

FreeBSD 10.4-RELEASE-p3:

$ ./gifa lo0
iface flags       af addr                                           scope ifidx
lo0   0x00008049  18                                                   -1     3
lo0   0x00008049  28 ::1                                                0     3
lo0   0x00008049  28 fe80::1                                            3     3
lo0   0x00008049   2 127.0.0.1                                         -1     3
$ ./trybind ::1 0 9300 && echo OK
OK
$ ./trybind ::1 3 9300 && echo OK
Could not bind: Can't assign requested address
$ ./trybind ::1 999 9300 && echo OK
Could not bind: Can't assign requested address


Red Hat Enterprise Linux 6.9:

$ ./gifa lo
iface flags       af addr                                           scope ifidx
lo    0x00010049  17                                                   -1     1
lo    0x00010049   2 127.0.0.1                                         -1     1
lo    0x00010049  10 ::1                                                0     1
$ ./trybind ::1 0 9300 && echo OK
OK
$ ./trybind ::1 1 9300 && echo OK
OK
$ ./trybind ::1 999 9300 && echo OK
OK


macOS Sierra 10.12.6:

$ ./gifa lo0
iface flags       af addr                                           scope ifidx
lo0   0x00008049  18                                                   -1     1
lo0   0x00008049   2 127.0.0.1                                         -1     1
lo0   0x00008049  30 ::1                                                0     1
lo0   0x00008049  30 fe80::1                                            1     1
$ ./trybind ::1 0 9300 && echo OK
OK
$ ./trybind ::1 1 9300 && echo OK
OK
$ ./trybind ::1 999 9300 && echo OK
OK


From these results I infer that RHEL and macOS ignore sin6_scope_id unless it's
needed to disambiguate an address known to be scoped, while FreeBSD always
considers the scope ID part of the address and treats scope 0 as the unscoped
scope.
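
For readers who want to repeat the bind(2) probe, a minimal sketch follows. It
is not the original trybind source, which is not included here. The function
names make_sin6 and try_bind are hypothetical, and the choice of a TCP socket
is an assumption.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build a sockaddr_in6 from a text address, scope ID, and port; 0 on success. */
static int make_sin6(const char *addr, uint32_t scope, uint16_t port,
                     struct sockaddr_in6 *out)
{
    assert(addr != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    out->sin6_family = AF_INET6;
    out->sin6_port = htons(port);
    out->sin6_scope_id = scope;   /* passed through exactly as given */
    return inet_pton(AF_INET6, addr, &out->sin6_addr) == 1 ? 0 : -1;
}

/* Attempt the bind; returns 0 on success, otherwise the errno of the failure. */
static int try_bind(const char *addr, uint32_t scope, uint16_t port)
{
    struct sockaddr_in6 sa;
    int fd, err = 0;

    if (make_sin6(addr, scope, port, &sa) != 0)
        return EINVAL;            /* address text did not parse */
    fd = socket(AF_INET6, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) != 0) {
        err = errno;
        fprintf(stderr, "Could not bind: %s\n", strerror(err));
    }
    close(fd);
    return err;
}
```

With this sketch, try_bind("::1", 3, 9300) corresponds to the
"./trybind ::1 3 9300" invocation above. The FreeBSD message "Can't assign
requested address" is the text for EADDRNOTAVAIL.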

For reference, the POSIX spec [1] states:

"The sin6_scope_id field is a 32-bit integer that identifies a set of
interfaces as appropriate for the scope of the address carried in the sin6_addr
field. For a link scope sin6_addr, the application shall ensure that
sin6_scope_id is a link index. For a site scope sin6_addr, the application
shall ensure that sin6_scope_id is a site index. The mapping of sin6_scope_id
to an interface or set of interfaces is implementation-defined."

Is the loopback address scoped? According to RFC 4007 [2], "::1, is treated as
having link-local scope".

The OpenJDK patch that introduced the breakage is a changeset [3] that modifies
the java.net.NetworkInterface class to unconditionally jam the interface index
into sin6_scope_id (see diff lines 1.926, 1.1274, and 1.1662). This is
unnecessary for any address that is scoped, because the OS will have already
populated sin6_scope_id with the correct link index. This is also incorrect for
any address that is not scoped, because it constitutes a JDK-defined mapping of
scope ID to an interface or set of interfaces, whereas the OS is entitled to
define that mapping.
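
To make the difference concrete, the sketch below contrasts the two policies
for choosing the scope ID of an address obtained from the OS. It illustrates
the argument only and is not code from the changeset or from the attached
patch. Both function names are hypothetical.

```c
#include <netinet/in.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Policy described for u152: the interface index always replaces the scope ID. */
static uint32_t scope_overwritten(const struct sockaddr_in6 *sa6, uint32_t ifindex)
{
    assert(sa6 != NULL);
    (void)sa6;                    /* the OS-supplied value is discarded */
    return ifindex;
}

/* Policy argued for here: keep whatever scope ID the OS reported. */
static uint32_t scope_preserved(const struct sockaddr_in6 *sa6, uint32_t ifindex)
{
    assert(sa6 != NULL);
    (void)ifindex;                /* the interface index plays no part */
    return sa6->sin6_scope_id;
}
```

Using the FreeBSD lo0 values from the getifaddrs(3) output above (interface
index 3), both policies give 3 for fe80::1, whose OS-supplied scope is already
3. For ::1, whose OS-supplied scope is 0, overwriting gives 3 and preserving
gives 0, and only the latter is accepted by bind(2) on FreeBSD.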

I have found no other place where OpenJDK depends upon finding the interface
index in the sin6_scope_id field.

[0] https://github.com/brendangregg/DTrace-book-scripts/blob/master/Chap6/soconnect.d
[1] http://pubs.opengroup.org/onlinepubs/000095399/basedefs/netinet/in.h.html
[2] https://tools.ietf.org/html/rfc4007#section-4
[3] http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/rev/3dc438e0c8e1
