 |
|
| Science Forum Index » Cryptography Forum » Hash of a Map... |
|
Page 1 of 1 |
|
| Author |
Message |
| Maaartin... |
Posted: Fri Oct 23, 2009 8:35 am |
|
|
|
Guest
|
I have a Map<String, String> which can be viewed as an unordered set
of String pairs (name, value) where name is unique (which doesn't
matter here). I need a function returning hash of the Map which of
course mustn't depend on the order in which the pairs get processed. I
see two obvious ways how to do it:
1. Order the pairs first according to name, than compute hash(hash
(name0) at (no spam) hash(value0) at (no spam) hash(name1) at (no spam) hash(value1) at (no spam) ...), where " at (no spam) "
denotes concatenation and name0 < name1 < ...
2. Omit the sorting and compute hash((hash(name0) at (no spam) hash(value0)) +
hash(name1) at (no spam) hash(value1) + ...) where "+" denotes any associative
and commutative operation (and the name sequence is unsorted).
I think, the first construction is as secure as the given hash
function, right?
The second one seems to me to be less secure (generalized bithday
attack?), right? But it's better for me as it allows a sort of
incremental computation for different submaps of the given Map. I hope
it can be made secure by something like
hash((expand(hash(name0) at (no spam) hash(value0))) + hash(expand(name1) at (no spam) hash
(value1)) + ...)
where "expand" is an injective function lengthening it's input.
Obviously it must be non-linear w.r.t. "+", my question is how it must
look like in order for it to work. |
|
|
| Back to top |
|
|
|
| Ilmari Karonen... |
Posted: Sun Oct 25, 2009 1:30 am |
|
|
|
Guest
|
On 2009-10-23, Maaartin <grajcar1 at (no spam) seznam.cz> wrote:
[quote]I have a Map<String, String> which can be viewed as an unordered set
of String pairs (name, value) where name is unique (which doesn't
matter here). I need a function returning hash of the Map which of
course mustn't depend on the order in which the pairs get processed. I
see two obvious ways how to do it:
1. Order the pairs first according to name, than compute hash(hash
(name0) at (no spam) hash(value0) at (no spam) hash(name1) at (no spam) hash(value1) at (no spam) ...), where " at (no spam) "
denotes concatenation and name0 < name1 < ...
2. Omit the sorting and compute hash((hash(name0) at (no spam) hash(value0)) +
hash(name1) at (no spam) hash(value1) + ...) where "+" denotes any associative
and commutative operation (and the name sequence is unsorted).
I think, the first construction is as secure as the given hash
function, right?
[/quote]
I would assume so, too. By the way, note that you don't need to sort
the names in lexicographic order; any consistent ordering will do. In
particular, you might prefer to hash the names first and then sort the
hashes: this could be slightly more efficient if you expect to see
lots of keys with long identical prefixes.
[quote]The second one seems to me to be less secure (generalized bithday
attack?), right? But it's better for me as it allows a sort of
incremental computation for different submaps of the given Map. I hope
it can be made secure by something like
hash((expand(hash(name0) at (no spam) hash(value0))) + hash(expand(name1) at (no spam) hash
(value1)) + ...)
where "expand" is an injective function lengthening it's input.
Obviously it must be non-linear w.r.t. "+", my question is how it must
look like in order for it to work.
[/quote]
I presume you meant to write something like
H( X( N_1, V_1 ) + X( N_2, V_2 ) + ... )
where N_i and V_i are the names and values, + is XOR, H is a normal
hash function and X is some function that in effect acts like a very
long hash. I'm hardly an expert on generalized birthday attacks -- in
fact, I hadn't heard of them before you mentioned them in your post --
but from what I gather from some quick Googling, the length of the
output of X in bits should be about 1/4 times the square of the length
of the eventual output of H.
That in itself may constitute a performance problem. If you want your
final hash to have, say, 256 bits of strength, then whatever your
function X does internally, it must produce 16384 bits = 2 kilobytes
of output. This may well be substantially longer than the average
input.
That said, if that's not prohibitive for you, I see no reason why such
a construction couldn't work. (Of course, it might have some glaring
security flaw that I've completely missed; again, I'm hardly an expert
here.) As for how the function X might be implemented, the obvious
and probably the most secure solution would be to simply use a hash
function designed to support arbitrarily long output, like RadioGatun
or Keccak.
Alternatively, if you have an existing fixed-length hash function H,
and would like to use it to construct X, something like the following
construction might do the job:
X(N,V) = H(K + 1) || H(K + 2) || H(K + 3) || ...
where || denotes concatenation and K is an intermediate value computed
e.g. as K = H( H(N) || H(V) ).
Of course, other methods of key expansion could be used as well, such
as using K to key a stream cipher (or a block cipher in a suitable
mode). Indeed, that's effectively what the construction above does,
using the hash function H to build an ad hoc stream cipher in a manner
vaguely resembling the Salsa20 construction.
Again, don't take my word that it's secure. But I suspect it may be.
--
Ilmari Karonen
To reply by e-mail, please replace ".invalid" with ".net" in address. |
|
|
| Back to top |
|
|
|
| Maaartin... |
Posted: Sun Nov 01, 2009 8:06 am |
|
|
|
Guest
|
On Oct 25, 8:30 am, Ilmari Karonen <usen... at (no spam) vyznev.invalid> wrote:
Many thanks for you reply.
[quote]On 2009-10-23, Maaartin <grajc... at (no spam) seznam.cz> wrote:
I have a Map<String, String> which can be viewed as an unordered set
of String pairs (name, value) where name is unique (which doesn't
matter here). I need a function returning hash of the Map which of
course mustn't depend on the order in which the pairs get processed. I
see two obvious ways how to do it:
1. Order the pairs first according to name, than compute hash(hash
(name0) at (no spam) hash(value0) at (no spam) hash(name1) at (no spam) hash(value1) at (no spam) ...), where " at (no spam) "
denotes concatenation and name0 < name1 < ...
2. Omit the sorting and compute hash((hash(name0) at (no spam) hash(value0)) +
hash(name1) at (no spam) hash(value1) + ...) where "+" denotes any associative
and commutative operation (and the name sequence is unsorted).
I think, the first construction is as secure as the given hash
function, right?
I would assume so, too. By the way, note that you don't need to sort
the names in lexicographic order; any consistent ordering will do. In
particular, you might prefer to hash the names first and then sort the
hashes: this could be slightly more efficient if you expect to see
lots of keys with long identical prefixes.
[/quote]
Ok, the number of entries has no limit but there'll be normaly only
tens or hundreds of them.
[quote]The second one seems to me to be less secure (generalized bithday
attack?), right? But it's better for me as it allows a sort of
incremental computation for different submaps of the given Map. I hope
it can be made secure by something like
hash((expand(hash(name0) at (no spam) hash(value0))) + hash(expand(name1) at (no spam) hash
(value1)) + ...)
where "expand" is an injective function lengthening it's input.
Obviously it must be non-linear w.r.t. "+", my question is how it must
look like in order for it to work.
I presume you meant to write something like
H( X( N_1, V_1 ) + X( N_2, V_2 ) + ... )
where N_i and V_i are the names and values, + is XOR, H is a normal
hash function and X is some function that in effect acts like a very
long hash.
[/quote]
This is a possible view of what I meant, but I was hoping for the
following:
(A) There could be a faster way for computing X than a long hash
function.
(B) There could be a way how to make it secure with a more practical
length of X than 2kB.
(C) There could be an associative and commutative operation precluding
the generalized birthday attack.
But it looks like all my hopes were futile, what I describe below in
more detail.
[quote] I'm hardly an expert on generalized birthday attacks -- in
fact, I hadn't heard of them before you mentioned them in your post --
but from what I gather from some quick Googling, the length of the
output of X in bits should be about 1/4 times the square of the length
of the eventual output of H.
That in itself may constitute a performance problem. If you want your
final hash to have, say, 256 bits of strength, then whatever your
function X does internally, it must produce 16384 bits = 2 kilobytes
of output. This may well be substantially longer than the average
input.
[/quote]
Yes, this make the incremental construction completelly unusable for
me.
[quote]That said, if that's not prohibitive for you, I see no reason why such
a construction couldn't work. (Of course, it might have some glaring
security flaw that I've completely missed; again, I'm hardly an expert
here.) As for how the function X might be implemented, the obvious
and probably the most secure solution would be to simply use a hash
function designed to support arbitrarily long output, like RadioGatun
or Keccak.
Alternatively, if you have an existing fixed-length hash function H,
and would like to use it to construct X, something like the following
construction might do the job:
X(N,V) = H(K + 1) || H(K + 2) || H(K + 3) || ...
where || denotes concatenation and K is an intermediate value computed
e.g. as K = H( H(N) || H(V) ).
Of course, other methods of key expansion could be used as well, such
as using K to key a stream cipher (or a block cipher in a suitable
mode). Indeed, that's effectively what the construction above does,
using the hash function H to build an ad hoc stream cipher in a manner
vaguely resembling the Salsa20 construction.
[/quote]
(A again) Any of these methods makes it very slow, I could imagine
there's a faster and still secure method. In fact, there're provably
secure algorithms like Poly1305 saving a lot of time by using a non-
cryptographic function at it's core. But I'm sure I can find no fast
function suitable for expanding a short hash (let say 512 bits) while
providing security against all possible attacks (after the combining
operation and the final hash).
(B again) It looks like what I wanted was the very same thing which
got broken in the original paper (XHASH and AdHASH), unless other
operation gets used.
(C again) Instead of speaking about associativity and commutativity I
should have said, I need an abelian group. All abelian groups can be
written as a direct sum of n cyclic groups of order pow(p, k) with p
prime. With n>=2 the components can be attacked independently (just
like for XHASH; I'm probably quite imprecise here but I think the idea
is right), so let's assume n=1, i.e., a cyclic group of prime-power
order. With p=2 we get to AdHASH, so lets assume p>2. But I'm quite
sure the attack can be applied to it as well, simply by using the
least significant digits instead of the least significant bits. So
let's fix k=1, the question is if the attack can be applied when the
combining operation is addition in Z/p. I'm afraid it can, although
probably not by starting with the least significat bits in the binary
represantion.
So I'm afraid, it can't work, as any such group is isomorphic to a
group where the attack can be applied. Except in case the isomorphism
is hard to compute.
[quote]Again, don't take my word that it's secure. But I suspect it may be.
[/quote]
Thank you again for your answer. I'm not going to use it, because of
the computational overhead, but still I find it very interesting. |
|
|
| Back to top |
|
|
|
|
|
All times are GMT - 5 Hours
The time now is Sun Dec 06, 2009 2:22 pm
|
|