Main Page | Report this Page
 
   
Science Forum Index  »  Cryptography Forum  »  Does this scheme look sound, questionable, or downright wort
Page 1 of 1    
Author Message
Jeff
Posted: Sat Jan 03, 2004 6:07 pm
Guest
Suppose a scheme is devised to munge a poorly-distributed sequence of
integers (representing web application user IDs) beginning with 1 and
increasing to some value much, much less than 2^32.

Each time a user is created, the next available, unused integer is
used as that user's user ID. HOWEVER, the application is designed to
NEVER expose the raw user ID to users. Instead, when each user is
created, the md5 hash of the integer's string representation (with
some secret, but unchanging string of indeterminate length
concatenated to it to make determining the original value through
brute force more difficult). Furthermore, THAT raw md5 value is never
exposed directly to users, either (the main purpose of md5 at THIS
stage is to generate a nicely-distributed collection of 128-bit
numeric values to use instead of the tiny sequence of small integers.)

Instead, each time a user session begins, a random 128 bit integer is
created for that user and XOR'ed against the 128-bit md5 value prior
to making it visible to the user. Likewise, values submitted by the
user as input parameters representing the hashed id of some other user
are desalinated against the same 128 bit salt via XOR as well.

So far, so good... I think.

Now for the vulnerabilities:

Suppose a malicious user knows EXACTLY how the scheme works. He
doesn't know the raw integer representing his user ID, nor does he
know the raw user ID of any other user. Furthermore, he doesn't know
what value was used to salt the raw user IDs to generate the md5 hash
stored alongside the raw id in the database.

Based upon the knowledge he DOES have, how hard would it be for him to
determine his raw user ID, or the raw user ID of another user, by
repeatedly logging on (to generate a new session salt) and comparing
the observed values of his session-salted user ID and those of some
finite number of other specific users?

OK, getting a little trickier. Let's assume he knows that the raw
underlying user IDs increase by 1 each time a new user is created.
Suppose further that he were to create two or more new users late at
night when he knew he had a good chance of creating users with
sequential user IDs. Assuming his assumption were right, and he knew
for a fact that the salted hash associated with user "B" has a raw
value exactly 1 greater than the raw value associated with user "A",
how hard would it be for him to discover the raw id of his user, or
some other user? In other words, now he doesn't just know how the
scheme works... he also knows that two hashed values, though unknown,
represent integers differing by exactly 1, and he also knows which one
is the larger of the two. He logs on as user "A" and discerns the
session-salted value associated with "B" and his own user, then logs
on as user "B" to discern the session-salted value associated with
both himself and "A". If necessary, he creates THREE, FOUR, or some
other small, finite number of users believed to have sequential
underlying user IDs and pulls the same stunt... logging in as each one
or more times to determine the session-salted hashed values for the
other users.

Ultimately, does this look like a decent scheme for keeping raw
underlying user IDs secret, and making HASHED user IDs hard to find
for ANY user, much less a specific one (due to there being only a few
thousand valid ones among a haystack with 2^128 values), without
introducing TOO much overhead into transactions that are going to
happen a LOT (so overhead *is* a big deal, but tolerable as long as
the only computation that needs to be repeated with every single
request is a Java BigInteger.xor()?

The biggest vulnerability I can see is that if someone manages to
discern his raw md5 hash, he'll be able to easily discover the current
session's salt. From that point, he could easily determine the
nonvarying raw md5 hash of any other user. Armed with that knowledge,
he could try to discover the unvarying value used to salt the raw
integer ids prior to taking their md5 hash through brute force by
systematically guessing at plausible salts and generating the md5
hashes of integers from 0 to, say, 10000 (knowing there's a good
chance the underlying values ARE less than 1000) salted with that
guessed salt value and looking for three sequential hash values that
match the three known raw hashes. From that point, it would just be a
matter of generating the md5 hashes of all the integers from 0 to some
reasonable maximum salted with that now-known salt value to generate
an index against which any user's real id could be determined.

One possible improvement to the scheme might be to randomly generate a
new salt for EACH new user to use when salting the new user ID to
obtain the unchanging md5 hash associated with each user. Obviously
I'd have to carefully check for collisions (since two different
numbers salted with two different random strings of random length
COULD produce identical md5 hashes). Varying the salt would probably
screw up the distribution of md5 hash values, too... but then again,
treating the salt like a one time pad would effectively remove
discovery of that salt as a possible exploit since it would differ for
every single id.

Another possible improvement might be to do away with sequential raw
user IDs altogether and just generate md5 hashes from random strings
of random length from the very beginning. I'd have to make sure I
didn't accidentally re-use an existing hash value, but that would
effectively eliminate sequential users as a potential exploit.
Ultimately the randomness would be no better than whatever Sun
happened to design into Java, but it would probably be way better than
sequential underlying integers in any case.

Hmmm. If I did both improvements, are there any obvious
encryption-specific exploits left that I've overlooked? I suspect if I
did everything I mentioned above, a potential attacker would just
forget about trying to crack the user ids and work on hacking his way
into the database server or web server instead as an easier
alternative... Comments?
 
Page 1 of 1       All times are GMT - 5 Hours
The time now is Sun Nov 23, 2008 3:52 am