MD5 Alive? [Archive] - SpeedGuide.net Broadband Community

View Full Version : MD5 Alive?


JuiceMan
08-19-07, 07:07 PM
I'm hoping I can find somebody who might know about the guts of MD5 ,
really on a platform or implementation independent way.

Background: I've got an existing web app (Perl/CGI) chugging along
doing its thing. It's been happy for some time now. I'm relying on MD5
to give me a unique value for a password, data (m/d/y), some other
flags, prefereneces, or status as to what the user is doing all told
about 70-80 characters. I'm not relying solely on MD5 for password
authentication. I have other alogrithmns being used for that. This has
nothing to do with on-line commerce. If the user submits a form either
one or all of the following could be wrong: username, not todays date,
my code may not be able to understand where they are. I find something
wrong I reset user has to relog in and starts from the beginning. What
has become a nice little artifact of this is that users have to relog
in everyday.

So I was young back then I choose to use authentication over the use
of cookies. I'm not going to update this to change, but as I ponder my
next design for the current client I do have opportunity to try
something new. Though I'd rather not. So MD5:

>From what I've heard and gotten as feedback I think I'm isolated from
the vulnerabilties of MD5? (Q1. True or False). In general, its a low
risk that anybody is go looking into the source and see the hidden
fields. It's an even less of a risk that somebody would even think I'm
doing MD5 for this. I was young back then :) Also in the grand scheme
things not that major user log in is not impacted password is
maintained under another encryption function of the database itself.
Worse case user will avoid logging in each day or confuse my script as
to what there doing. There's nothing for them to see that they can't
see now.

So now as I ponder my new design I would be interested in answers to
the following questions, and it sounds like I just stumbled onto a
site that might be able to answer them:

What's even better (I think?) is that I've got a twist on MD5. I'm
running the digest 25 times on itself, breaking the digest into 2
parts and inserting a known string (like a salt) in between the parts
and rerunning it another 25 times all 3 reassembled pieces(original
first, my salt, and original second). Am I interfering with the nature
MD5 by doing this and losing anything it's giving me, by doing this do
I run the risk of not getting a unique value from MD5? (Q2 Yes or No).

I've visited one of those cracker sites. I've ran MD5 a 5-6 times on
itself and they were able to tell me what my previous digest was and
eventually back to the original Ok fine. I ran it 1250 times on itself
and they weren't so successful.

I've gotten a detail implmentation of MD5 which matches what the
cracker site tells me, and the results both match for "abc" (just for
fun). I also have a detailed write up as to MD5's working. I might
just fall in love with this. Now instead being in some library
somewhere I have a sense of control of it and I know my host won't
take it away from me. As I read it there are 4 functions that are used
each "round" summarized below:
1. AND(x,y) OR AND(Notx, y)
2. AND(x,z) OR AND(Notz,y)
3. EXOR (X,Y,Z)
4. EXOR(Y, (AND (X, NotZ)
Sorry, notation's not the best but hopefully you can decipher. So what
if I changed the order of the functions like did 2,1,3,4. Can I louse
up MD5 so that it won't be effective in producing unique digest (Q3.
Yes or No)? What about if changed or scrambled what x, y, z like kept
the calculations the same like made my z into x, y into z, and x
became y? (Q4. Rearrange the letters hurts MD5?).

I'm guessing I could really louse up MD5 -- if I really got in there
and started playing around with it. As I do gaze into the future I do
like authentication over cookies although the latter will probably win/
has won out. I don't need a 128 characters of output. If I could
shorten that maybe do some different operations within the MD5
algorihtmn and not compromise it I might be able to make good use of
it.

Sorry, this inevitably brings up a debate between the use of cookies
and authentication to maintain state in HTML documents. I do admit
that cookies are better and they will win :<, but please don't blame a
guy for trying or thinking. In my brief of your site sounds like this
is what you folks do anyway.

Ertugrul Soeylemez
08-21-07, 07:15 PM
JuiceMan <jaysgeneral@snet.net> (07-08-19 17:07:22):

> From what I've heard and gotten as feedback I think I'm isolated from
> the vulnerabilties of MD5? (Q1. True or False).

MD5 is weak regarding collision resistance. Whether this is a threat to
your application depends on your specific usage scenario. If I
understood your scenario properly, then you should be safe. Still
switch to a stronger hash function for future applications.


> What's even better (I think?) is that I've got a twist on MD5. I'm
> running the digest 25 times on itself, breaking the digest into 2
> parts and inserting a known string (like a salt) in between the parts
> and rerunning it another 25 times all 3 reassembled pieces(original
> first, my salt, and original second). Am I interfering with the nature
> MD5 by doing this and losing anything it's giving me, by doing this do
> I run the risk of not getting a unique value from MD5? (Q2 Yes or No).

Probably yes. Say, you're running MD5 five times, and let F = G = MD5.
You're calculating the hash value with the following formula:

H = (G . G . G . G . F) P

where P is the plaintext and `.' means function composition. Although F
and G are the same function, there is a slight difference. While F's
input space is infinite (the set of all finite messages), G's input
space is finite and in fact the same as its output space. So if MD5 is
not a bijection, then you're losing entropy by applying it multiple
times without changing the intermediate hash values. On the other hand,
you add complexity by adding computation time, but this may be a small
benefit.

Put short: Probably you shouldn't do this.


> 1. AND(x,y) OR AND(Notx, y)
> 2. AND(x,z) OR AND(Notz,y)
> 3. EXOR (X,Y,Z)
> 4. EXOR(Y, (AND (X, NotZ)
> Sorry, notation's not the best but hopefully you can decipher. So what
> if I changed the order of the functions like did 2,1,3,4. Can I louse
> up MD5 so that it won't be effective in producing unique digest (Q3.
> Yes or No)?

MD5 does not, and in fact cannot, guarantee uniqueness. If, like stated
above, MD5 with the same input space as output space is a bijection,
then it does guarantee uniqueness for inputs of exactly 128 bits, but we
don't know that even for the original MD5. So reasoning about how a
modified version behaves is going to be even more difficult without
proper research.


> What about if changed or scrambled what x, y, z like kept the
> calculations the same like made my z into x, y into z, and x became y?
> (Q4. Rearrange the letters hurts MD5?).

Consider that there are reasons for the way MD5 works, you shouldn't
change anything at all. Your function might be more secure, but history
has shown that in general such customizations rather lead to weaker
primitives. If you find MD5 too weak, use another hash function, but I
urge you to refrain from custom adjustments without background knowledge
and a lot of research.


> Sorry, this inevitably brings up a debate between the use of cookies
> and authentication to maintain state in HTML documents. I do admit
> that cookies are better and they will win :<, but please don't blame a
> guy for trying or thinking. In my brief of your site sounds like this
> is what you folks do anyway.

There are many other methods of maintaining state in query-response
protocols. For non-public sites I like the idea of using SSL
certificate-based authentication. Unfortunately the average user won't
be able to comprehend this, and will eventually look for alternatives,
so this method really isn't suitable for public sites.


Regards,
Ertugrul Söylemez.


--
Security is the one concept, which makes things in your life stay as
they are. Otto is a man, who is afraid of changes in his life; so
naturally he does not employ security.

Sebastian G.
08-22-07, 06:53 AM
Ertugrul Soeylemez wrote:


> MD5 does not, and in fact cannot, guarantee uniqueness. If, like stated
> above, MD5 with the same input space as output space is a bijection,
> then it does guarantee uniqueness for inputs of exactly 128 bits, but we
> don't know that even for the original MD5.


Actually we should assume quite the contrary: If MD5 is a pseudorandom
function, than it's a pseudorandom mapping and therefore about 1/e of all
outputs will not occur, and about the same number will be collisions.

JuiceMan
08-22-07, 09:36 PM
On Aug 22, 7:53 am, "Sebastian G." <se...@seppig.de> wrote:
> Ertugrul Soeylemez wrote:
> >MD5does not, and in fact cannot, guarantee uniqueness. If, like stated
> > above,MD5with the same input space as output space is a bijection,
> > then it does guarantee uniqueness for inputs of exactly 128 bits, but we
> > don't know that even for the originalMD5.
>
> Actually we should assume quite the contrary: IfMD5is a pseudorandom
> function, than it's a pseudorandom mapping and therefore about 1/e of all
> outputs will not occur, and about the same number will be collisions.

Thanks for the feedback. I figured monkeying around with the internals
of MD5 would not help me.

However, some alarming points were brought up and I wouldn't mind
clarification on them.

It sounds like I got a big NO to my 25 times, split, 25 more times. As
in this is NOT making MD5 any more secure and in fact might be making
it less secure.

Running the Digest on itself say upwards of 1000 times is -- in the
circles that I've been in kind of an accepted thing to do.

For example if I take "abc" and run MD5 on it, take the answer run MD5
on that, and do so a 1000 more times what I get is no more secure then
if I ran it once?

If I'm reading correctly it even sounds that what I get utlimately at
the end of 1000 times might be the same as if it had been done with
"xyz" 1000 times?

Is this true?

Ertugrul Soeylemez
08-25-07, 09:01 PM
JuiceMan <jaysgeneral@snet.net> (07-08-22 19:36:55):

> > > MD5 does not, and in fact cannot, guarantee uniqueness. If, like
> > > stated above, MD5 with the same input space as output space is a
> > > bijection, then it does guarantee uniqueness for inputs of exactly
> > > 128 bits, but we don't know that even for the original MD5.
> >
> > Actually we should assume quite the contrary: If MD5 is a
> > pseudorandom function, than it's a pseudorandom mapping and
> > therefore about 1/e of all outputs will not occur, and about the
> > same number will be collisions.

But we don't know that. Some attacks were published, which show
evidence, that MD5 might not be a pseudo-random function.


> Thanks for the feedback. I figured monkeying around with the internals
> of MD5 would not help me.

It wouldn't, unless you're a researcher. But you may use one of the
established methods, if you really need more strength. However, mostly
just using another hash function is probably the best method.


> It sounds like I got a big NO to my 25 times, split, 25 more times. As
> in this is NOT making MD5 any more secure and in fact might be making
> it less secure.

The idea isn't extremely bad, but you shouldn't try to invent your own
strengthening function, without putting a lot of research effort on it.
Basically you're doing something quite similar to HMAC, and probably
that is part of what you're looking for.

PBKDF2 may be the answer. You can use it together with HMAC for exactly
the purpose of strengthening.


> Running the Digest on itself say upwards of 1000 times is -- in the
> circles that I've been in kind of an accepted thing to do.

Yes, but not the way you're doing it. Look up PBKDF2.


> For example if I take "abc" and run MD5 on it, take the answer run MD5
> on that, and do so a 1000 more times what I get is no more secure then
> if I ran it once?
>
> If I'm reading correctly it even sounds that what I get utlimately at
> the end of 1000 times might be the same as if it had been done with
> "xyz" 1000 times?

Yes, this is true. But it's not related to the fact that you're
applying MD5 multiple times. However, doing that directly as you do it
_might_ increase the probablility for this to happen.


Regards,
Ertugrul Söylemez.


--
Security is the one concept, which makes things in your life stay as
they are. Otto is a man, who is afraid of changes in his life; so
naturally he does not employ security.

Unruh
08-28-07, 12:35 AM
JuiceMan <jaysgeneral@snet.net> writes:

>On Aug 22, 7:53 am, "Sebastian G." <se...@seppig.de> wrote:
>> Ertugrul Soeylemez wrote:
>> >MD5does not, and in fact cannot, guarantee uniqueness. If, like stated
>> > above,MD5with the same input space as output space is a bijection,
>> > then it does guarantee uniqueness for inputs of exactly 128 bits, but we
>> > don't know that even for the originalMD5.
>>
>> Actually we should assume quite the contrary: IfMD5is a pseudorandom
>> function, than it's a pseudorandom mapping and therefore about 1/e of all
>> outputs will not occur, and about the same number will be collisions.

>Thanks for the feedback. I figured monkeying around with the internals
>of MD5 would not help me.

Help you with what? MD5 is a cryptographic hash. Its primary purpose is to
make finding the preimage from the output very very hard. Its purpose is
NOT to prevent output collisions. What are you suing it for that you would
worry about the collisions.



>However, some alarming points were brought up and I wouldn't mind
>clarification on them.

>It sounds like I got a big NO to my 25 times, split, 25 more times. As
>in this is NOT making MD5 any more secure and in fact might be making
>it less secure.

>Running the Digest on itself say upwards of 1000 times is -- in the
>circles that I've been in kind of an accepted thing to do.

Really bad idea. On each level, you loose about as he says, 1/e outputs.
Ie, aftr 1000 times,. you will have only something like 1/e^1000 unique
outputs. (I am sure it is more than that, but lets stick with the random
mapping assumption).





>For example if I take "abc" and run MD5 on it, take the answer run MD5
>on that, and do so a 1000 more times what I get is no more secure then
>if I ran it once?


Probably much less secure. What are you trying to do?



>If I'm reading correctly it even sounds that what I get utlimately at
>the end of 1000 times might be the same as if it had been done with
>"xyz" 1000 times?
No idea what this means.



>Is this true?

Sebastian G.
08-28-07, 10:48 AM
Unruh wrote:


>> Running the Digest on itself say upwards of 1000 times is -- in the
>> circles that I've been in kind of an accepted thing to do.
>
> Really bad idea. On each level, you loose about as he says, 1/e outputs.


This would mean you loose about 0.66 bits per iteration, so after 194
iterations you would have created a unique collision...

> Ie, aftr 1000 times,. you will have only something like 1/e^1000 unique
> outputs.


You assume that the iterations are independent. However, for MD5 you're very
likely not not stumble upon an additional collision after the first
iteration. This is a property inherited by the design.

> (I am sure it is more than that, but lets stick with the random
> mapping assumption).


Be careful about your assumptions.

Unruh
09-04-07, 06:12 PM
"Sebastian G." <seppi@seppig.de> writes:

>Unruh wrote:


>>> Running the Digest on itself say upwards of 1000 times is -- in the
>>> circles that I've been in kind of an accepted thing to do.
>>
>> Really bad idea. On each level, you loose about as he says, 1/e outputs.


>This would mean you loose about 0.66 bits per iteration, so after 194
>iterations you would have created a unique collision...

>> Ie, aftr 1000 times,. you will have only something like 1/e^1000 unique
>> outputs.


>You assume that the iterations are independent. However, for MD5 you're very
>likely not not stumble upon an additional collision after the first
>iteration. This is a property inherited by the design.

>> (I am sure it is more than that, but lets stick with the random
>> mapping assumption).


>Be careful about your assumptions.

AGreed, but what makes you think that there are far MD5(MD5(random)) has
fewer collisions than MD5(random)?

Ie, what makes you think that the output of MD5 does not act like a random
input to MD5?

Ertugrul Soeylemez
09-04-07, 10:32 PM
Unruh <unruh-spam@physics.ubc.ca> (07-09-04 23:12:55):

> > > (I am sure it is more than that, but lets stick with the random
> > > mapping assumption).
>
> > Be careful about your assumptions.
>
> AGreed, but what makes you think that there are far MD5(MD5(random))
> has fewer collisions than MD5(random)?

You must have misunderstood something.


> Ie, what makes you think that the output of MD5 does not act like a
> random input to MD5?

Actually it's exactly the fact that MD5 output _does_ look like random
input to MD5. Read again carefully.


Regards,
Ertugrul Söylemez.


--
Security is the one concept, which makes things in your life stay as
they are. Otto is a man, who is afraid of changes in his life; so
naturally he does not employ security.

Sebastian G.
09-04-07, 11:06 PM
Unruh wrote:


> AGreed, but what makes you think that there are far MD5(MD5(random)) has
> fewer collisions than MD5(random)?


Because MD5 has been designed to resist this kind of attack. As I described,
it would be trivial to create a collision if it behaved otherwise.

> Ie, what makes you think that the output of MD5 does not act like a random
> input to MD5?

A design decision.

Sebastian G.
09-04-07, 11:09 PM
Ertugrul Soeylemez wrote:


>> Ie, what makes you think that the output of MD5 does not act like a
>> random input to MD5?
>
> Actually it's exactly the fact that MD5 output _does_ look like random
> input to MD5. Read again carefully.

And maybe you should read again as well. If it really behaved like this,
then 192 iterations would be sufficient to create a collision in MD5 with
probability 1/e. So about 8 bit bit of security. That's clearly not a
cryptographic hash.

Ertugrul Soeylemez
09-05-07, 12:25 AM
"Sebastian G." <seppi@seppig.de> (07-09-05 06:09:16):

> > > Ie, what makes you think that the output of MD5 does not act like
> > > a random input to MD5?
> >
> > Actually it's exactly the fact that MD5 output _does_ look like
> > random input to MD5. Read again carefully.
>
> And maybe you should read again as well. If it really behaved like
> this, then 192 iterations would be sufficient to create a collision in
> MD5 with probability 1/e. So about 8 bit bit of security. That's
> clearly not a cryptographic hash.

Uhm, just a typo. Add the "not".


Regards,
Ertugrul Söylemez.


--
Security is the one concept, which makes things in your life stay as
they are. Otto is a man, who is afraid of changes in his life; so
naturally he does not employ security.

Unruh
09-06-07, 12:30 PM
"Sebastian G." <seppi@seppig.de> writes:

>Unruh wrote:


>> AGreed, but what makes you think that there are far MD5(MD5(random)) has
>> fewer collisions than MD5(random)?


>Because MD5 has been designed to resist this kind of attack. As I described,
>it would be trivial to create a collision if it behaved otherwise.

No idea what you are talking about. It IS trivial to create collisions.
Take 10^64 inputs and the probablility is high that there will be a
collision (Birthday calculation). That helps one not at all in any
security application.



>> Ie, what makes you think that the output of MD5 does not act like a random
>> input to MD5?

>A design decision.

Exactly which design decision are you talking about?

Sebastian G.
09-07-07, 11:45 AM
Unruh wrote:

> "Sebastian G." <seppi@seppig.de> writes:
>
>> Unruh wrote:
>
>
>>> AGreed, but what makes you think that there are far MD5(MD5(random)) has
>>> fewer collisions than MD5(random)?
>
>
>> Because MD5 has been designed to resist this kind of attack. As I described,
>> it would be trivial to create a collision if it behaved otherwise.
>
> No idea what you are talking about. It IS trivial to create collisions.
> Take 10^64 inputs and the probablility is high that there will be a
> collision (Birthday calculation). That helps one not at all in any
> security application.


1. 2^64 inputs are already sufficient, and still 2^64 are not so trivial at all.
2. According to the iterated attack you described, 2^8 (!) inputs would be
sufficient.

>>> Ie, what makes you think that the output of MD5 does not act like a random
>>> input to MD5?
>
>> A design decision.
>
> Exactly which design decision are you talking about?

Collision resistance on iteration

ciphex
09-20-07, 10:25 AM
On 7 Sep, 17:45, "Sebastian G." <se...@seppig.de> wrote:
> Unruh wrote:
> > "Sebastian G." <se...@seppig.de> writes:
>
> >> Unruh wrote:
>
> >>> AGreed, but what makes you think that there are far MD5(MD5(random)) has
> >>> fewer collisions than MD5(random)?
>
> >> Because MD5 has been designed to resist this kind of attack. As I described,
> >> it would be trivial to create a collision if it behaved otherwise.
>
> > No idea what you are talking about. It IS trivial to create collisions.
> > Take 10^64 inputs and the probablility is high that there will be a
> > collision (Birthday calculation). That helps one not at all in any
> > security application.
>
> 1. 2^64 inputs are already sufficient, and still 2^64 are not so trivial at all.
> 2. According to the iterated attack you described, 2^8 (!) inputs would be
> sufficient.
>
> >>> Ie, what makes you think that the output of MD5 does not act like a random
> >>> input to MD5?
>
> >> A design decision.
>
> > Exactly which design decision are you talking about?
>
> Collision resistance on iteration

The only reason i can think of for 'fiddling' with MD5 in order to
protect
authentication code in an online application is to defeat reverse
lookup
databases. In this case it would be better to mangle the input before
hashing in some way that doesn't decrease the randomness of the
initial input text - for example, a simple substitution/transposition
scheme might work. The cracker would then have a problem, there is
less chance that the resulting hash would appear in the database
(assuming the resulting input is long enough).

On the whole though, if you don't understand the implications of this
kind of step then I'd avoid it and, as many have said, use a stronger
hashing algorithm.