ipv6 connection hash function wanted ...
Garrett Cooper
youshi10 at u.washington.edu
Thu Nov 16 20:01:10 UTC 2006
JINMEI Tatuya / ???? wrote:
>>>>>> On Tue, 14 Nov 2006 20:20:47 +0100,
>>>>>> Max Laier <max at love2party.net> said:
>>>>>>
>
>
>>>> Any ideas? Any papers that deal with this problem?
>>>>
>>> Assuming you don't want to use one of the standard cryptographic
>>> ones (which I can imagine being a bit slow for something done
>>> per-packet), then one option might be to use a simpler hash that
>>> is keyed. Choose the key at boot/module load time and make it hard
>>> to produce collisions unless you know the key.
>>>
>
>
>> That's exactly what I am looking for ... now I need someone[tm] - with
>> better Math-Knowledge than mine - to write such a thing down in a simple
>> formula :-) i.e. take those bits from there and there and XOR them with
>> your canary yada-yada-yada ...
>>
>
> If you want something whose behavior is mathematically guaranteed, I'd
> recommend universal hashing as already suggested in this thread.
>
> One example implementation is given below, assuming the hash key is
> source and destination IPv6 addresses and transport layer ports(*).
> As shown in the implementation, it's pretty simple and should run
> reasonably fast. Also, as long as the random values are reasonably
> unpredictable, the expected probability of producing the same hash
> value for arbitrarily-chosen two different keys is 1/65537. In
> general, for any prime number p as the value of LARGE_PRIME, this
> probability is 1/p (so if 65537 is too large, we can use a smaller
> number, e.g., 1009, depending on the desired balance between collision
> risk and memory footprint for hash buckets).
>
> (*)Technically, using in6_addr to represent an IPv6 address may not be
> enough; we may want to take into account zone indices (sin6_scope_id)
> as part of hash keys.
>
> JINMEI, Tatuya
> Communication Platform Lab.
> Corporate R&D Center, Toshiba Corp.
> jinmei at isl.rdc.toshiba.co.jp
>
> #define HASHKEYLEN 38 /* sizeof(in6_addr) * 2 + sizeof(in_port_t) * 2 */
> #define LARGE_PRIME 65537
>
> /*
> * This should be filled with unpredictable random values between 0
> * and LARGE_PRIME-1 at startup time.
> */
> u_int32_t random_parameters[HASHKEYLEN];
>
> u_int32_t
> hash(struct in6_addr *saddr, struct in6_addr *daddr,
> in_port_t sport, in_port_t dport)
> {
> int i, j = 0;
> u_int32_t accumulated = 0;
> u_char *cp;
>
> for (i = 0; i < sizeof(*saddr); i++)
> accumulated += saddr->s6_addr[i] * random_parameters[j++];
> for (i = 0; i < sizeof(*saddr); i++)
> accumulated += daddr->s6_addr[i] * random_parameters[j++];
> cp = (u_char *)&sport;
> for (i = 0; i < sizeof(sport); i++)
> accumulated += cp[i] * random_parameters[j++];
> cp = (u_char *)&dport;
> for (i = 0; i < sizeof(dport); i++)
> accumulated += cp[i] * random_parameters[j++];
>
> return (accumulated % LARGE_PRIME);
> }
Jinmei,
Wouldn't you get some overflow if (pardon my set notation) "INT_MAX <
{saddr,daddr}->s6_addr[i] * random_parameters[j++]", which would
implicitly introduce collision into your algorithm. Or is the overall
set size sufficiently large not to worry about this particular issue?
-Garrett
More information about the freebsd-hackers
mailing list