 |
|
| Science Forum Index » Cryptography Forum » Decrypting ODS file, strange symptom... |
|
Page 1 of 1 |
|
| Author |
Message |
| Vaughn T... |
Posted: Tue Oct 27, 2009 2:16 pm |
|
|
|
Guest
|
Hello all:
I'm writing a DLL to decrypt a password-protected ODS file. I first
consulted Steve Elliot's explanation of Open Office encrypt, found a
version of the PBKDF2 utility (rtl_digest_pbkdf2) and a version of
Blowfish (the C implementation by Paul Kocher 1997), ported the code
to Visual C++ 2005, and got these two routines working with some test
vectors I found on the web.
I continued my testing with two versions of the same spreadsheet, one
password-protected and one unprotected. As a test I extracted the
compressed-and-encrypted "content.xml" section from the ODS file and
then decrypted it. This seemed to match the binary data in the
"content.xml" section of the unprotected spreadsheet file. So I
proceeded going through the not-at-all-trivial work of integrating the
pbkdf2 and blowfish-decrypt calls in the open-source "unzip60"
library. (This library already works great for extracting and
decompressing the "content.xml" section of the unprotected ODS file.)
The new decrypt-and-unzip library seemed to be "almost" working, but
for some reason the "inflate" routines encountered an error shortly
after it started processing the Huffman tables. On close
investigation, I discovered to my dismay, that the decrypted-but-still-
compressed data did NOT actually match the data in the unprotected
file. There was a 2-byte mismatch at the offset 43 hex, and the data
completely diverged at offset 2B5 hex.
What I'm getting at is this- is there a way two blowfish keys can be
just SLIGHTLY off? It seems highly improbable that it would get ALL
BUT TWO of the first 693 bytes decrypted correctly, and then suddenly
go off kilter. In my experience with encryption, especially of the
ciper-feedback kind, ANY error produces COMPLETELY DIFFERENT results.
I already verified that my unzip routine opens the file in binary mode
(I'm doing this on a PC running Vista) so there's no issue with
conversion of "line feed" characters. Another thing I was wondering -
could I maybe have like one byte off in one of my S-boxes- perhaps I
got an erroneous version of the blowfish source code?
Oh, and by the way, I just remembered something interesting. The
Blowfish test vectors I used were by Eric Young, and I got them from
Bruce Schneier's website. Under "chaining mode test data", near the
bottom of the document, it listed the results of the test for CFB64
mode as
E73214A2822139CAF26ECF6D2EB9E76E3DA3DE04D1517200519D57A6C3
My result, however, was
E73214A2822139CAF26ECF6D2EB9E76E3DA3DE04D1517200A6579D51C3
Note that bytes 24-27 were reversed in order for my result. Looking at
this, I thought, there's no way I can be _close_ to correct, the
document must be erroneous. Since my ported Blowfish-CFB routine
_seemed_ to decrypt my "content.xml" data correctly, I figured I was
correct in that assumption. Or could I have been mistaken, and this is
some the cause of my problem?
How could this happen? (Cue "Twilight Zone" theme.)
Thanks for any help/advice/suggestions you can give me!
Vaughn |
|
|
| Back to top |
|
|
|
| Gordon Burditt... |
Posted: Tue Oct 27, 2009 7:41 pm |
|
|
|
Guest
|
[quote]What I'm getting at is this- is there a way two blowfish keys can be
just SLIGHTLY off? It seems highly improbable that it would get ALL
BUT TWO of the first 693 bytes decrypted correctly, and then suddenly
go off kilter. In my experience with encryption, especially of the
ciper-feedback kind, ANY error produces COMPLETELY DIFFERENT results.
I already verified that my unzip routine opens the file in binary mode
(I'm doing this on a PC running Vista) so there's no issue with
conversion of "line feed" characters. Another thing I was wondering -
could I maybe have like one byte off in one of my S-boxes- perhaps I
got an erroneous version of the blowfish source code?
[/quote]
I seem to recall, a long time ago, I was attempting to implement
DES on an Altair 8800 (really old 8080 system running CP/M). Some
of it matched test vectors. Some didn't. There was one vector
that was "close". The problem? *ONE BIT* wrong in an S-box (which
had been typed in manually out of an article in, I think, Byte
Magazine). It took a *long* time to find that. My guess is that
the offending bit was only used in the last round, so errors didn't
cascade much for the vector that was "close".
I think it would be interesting to track some of the test vectors
and see which S-boxes actually get used, and whether there are any
S-box entries that aren't tested by the test vectors. |
|
|
| Back to top |
|
|
|
| Vaughn T... |
Posted: Wed Oct 28, 2009 7:33 am |
|
|
|
Guest
|
On Oct 27, 6:41 pm, gordonb.k8... at (no spam) burditt.org (Gordon Burditt) wrote:
[quote]What I'm getting at is this- is there a way two blowfish keys can be
just SLIGHTLY off? It seems highly improbable that it would get ALL
BUT TWO of the first 693 bytes decrypted correctly, and then suddenly
go off kilter. In my experience with encryption, especially of the
ciper-feedback kind, ANY error produces COMPLETELY DIFFERENT results.
I already verified that my unzip routine opens the file in binary mode
(I'm doing this on a PC running Vista) so there's no issue with
conversion of "line feed" characters. Another thing I was wondering -
could I maybe have like one byte off in one of my S-boxes- perhaps I
got an erroneous version of the blowfish source code?
I seem to recall, a long time ago, I was attempting to implement
DES on an Altair 8800 (really old 8080 system running CP/M). Some
of it matched test vectors. Some didn't. There was one vector
that was "close". The problem? *ONE BIT* wrong in an S-box (which
had been typed in manually out of an article in, I think, Byte
Magazine). It took a *long* time to find that. My guess is that
the offending bit was only used in the last round, so errors didn't
cascade much for the vector that was "close".
I think it would be interesting to track some of the test vectors
and see which S-boxes actually get used, and whether there are any
S-box entries that aren't tested by the test vectors.
[/quote]
Gordon,
Thanks for your reply. I had a hunch it might be something like that.
Though it's not as easy to make that kind of mistake any more, when
you download the source. I found a more recent version of Paul
Kocher's source code (this time from Bruce Schneier's web page) and
did a comparison with WinMerge - no differences in the P-array or the
S-boxes. I substituted in this new module anyway, with no effect. I
also fixed the pointers in my "wrapbf" routine (which implements the
cipher feedback) which were formerly signed but are now unsigned. No
change yet. I'm still quite puzzled.
Vaughn |
|
|
| Back to top |
|
|
|
| CatId... |
Posted: Wed Oct 28, 2009 10:45 am |
|
|
|
Guest
|
Vaughn T,
Yes this is a common problem when implementing crypto libraries.
Small bugs in the implementation are hard to catch without a lot of
testing and also re-reading the code on a different day than you wrote
it and asking yourself at each step "does this make sense?" The test
vectors for a published algorithm cannot possibly cover every scenario
and test every problem.
Have fun!
http://catid.org
--
--------------------------------- --- -- -
Posted with NewsLeecher v3.95 Beta 3
Web at (no spam) http://www.newsleecher.com/?usenet
------------------- ----- ---- -- - |
|
|
| Back to top |
|
|
|
| Gordon Burditt... |
Posted: Wed Oct 28, 2009 6:46 pm |
|
|
|
Guest
|
[quote]I seem to recall, a long time ago, I was attempting to implement
DES on an Altair 8800 (really old 8080 system running CP/M). Some
of it matched test vectors. Some didn't. There was one vector
that was "close". The problem? *ONE BIT* wrong in an S-box (which
had been typed in manually out of an article in, I think, Byte
Magazine). It took a *long* time to find that. My guess is that
the offending bit was only used in the last round, so errors didn't
cascade much for the vector that was "close".
I think it would be interesting to track some of the test vectors
and see which S-boxes actually get used, and whether there are any
S-box entries that aren't tested by the test vectors.
Gordon,
Thanks for your reply. I had a hunch it might be something like that.
Though it's not as easy to make that kind of mistake any more, when
you download the source. I found a more recent version of Paul
Kocher's source code (this time from Bruce Schneier's web page) and
did a comparison with WinMerge - no differences in the P-array or the
S-boxes. I substituted in this new module anyway, with no effect. I
also fixed the pointers in my "wrapbf" routine (which implements the
cipher feedback) which were formerly signed but are now unsigned. No
change yet. I'm still quite puzzled.
[/quote]
I was translating the code from something like Pascal to assembly
language to speed it up. The data tables probably could have been
translated automatically, but this system didn't really have good
tools for that, unlike UNIX. Amazingly, I didn't run into much
problem with the translated code. It managed to almost keep up
with encrypting a 1200 bits-per-second data stream (on a 2MHz 8080).
It is also possible that with your code there's a previously
undetected problem with the code, which may have worked perfectly
on the platform/compiler/OS it was tested on. Is it in C or C++?
It's possible that "undefined behavior" has been doing the expected
thing most of the time, but now manages not to. (Or it might just
be a compiler bug).
- Endian dependencies. If it was flat out endian-dependent, this would
probably be noticed immediately, but if it's only in one spot, it
might appear this way.
- Shifts. Shifts like x << 0 aren't defined to work correctly, and it's
just possible that it's doing the expected thing except sometimes
on the first time through the loop.
- Signedness. Sign extension might screw things up. This might involve a
64-bit number that was 32 bits on previous platforms. And it might be
something obscure like signedness of an S-box table rather than the data.
Does the problem become different between turning off all optimization vs.
turning it way up? |
|
|
| Back to top |
|
|
|
| Noob... |
Posted: Thu Oct 29, 2009 4:21 am |
|
|
|
Guest
|
[ Cross-posting to comp.lang.c ]
Gordon Burditt wrote:
[quote]Shifts like x << 0 aren't defined to work correctly, and it's
just possible that it's doing the expected thing except sometimes
on the first time through the loop.
[/quote]
As far as I know, x << 0 is well-defined (as x). |
|
|
| Back to top |
|
|
|
| Richard Heathfield... |
Posted: Thu Oct 29, 2009 4:51 am |
|
|
|
Guest
|
In <hcbq9m$l4r$1 at (no spam) aioe.org>, Noob wrote:
[quote][ Cross-posting to comp.lang.c ]
Gordon Burditt wrote:
Shifts like x << 0 aren't defined to work correctly, and it's
just possible that it's doing the expected thing except sometimes
on the first time through the loop.
As far as I know, x << 0 is well-defined (as x).
[/quote]
Correct. "The integral promotions are performed on each of the
operands. The type of the result is that of the promoted left
operand. If the value of the right operand is negative or is greater
than or equal to the width in bits of the promoted left operand, the
behavior is undefined."
[If Gordon Burditt (or indeed anyone, but I'm thinking particularly of
Gordon Burditt) replies to this, he may quote my article if and only
if he attributes its authorship correctly and retains the
attributions shown above.]
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh at (no spam)
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within |
|
|
| Back to top |
|
|
|
| Vaughn T... |
Posted: Thu Oct 29, 2009 6:52 am |
|
|
|
Guest
|
On Oct 29, 3:51 am, Richard Heathfield <r... at (no spam) see.sig.invalid> wrote:
[quote]In <hcbq9m$l4... at (no spam) aioe.org>, Noob wrote:
[ Cross-posting to comp.lang.c ]
Gordon Burditt wrote:
Shifts like x << 0 aren't defined to work correctly, and it's
just possible that it's doing the expected thing except sometimes
on the first time through the loop.
As far as I know, x << 0 is well-defined (as x).
[/quote]
Hadn't thought about the shift issue. I know bitfields in C (yes, this
is old-fashioned C language) are undefined in their directionality,
but I would expect that shifts would work the same on Unix/Linux (the
usual target for open source files like this one) and Visual Studio.
Especially since I'm only using unsigned integers. If I was compiling
for some kind of micro, yes I'd expect it to be weird.
Though another thought occurred to me this morning. I know that
Blowfish keys are supposed to repeat (as the comment in the source
code says, ab=abab,) From the source it's obvious that when X-ORing
the key bytes into the P-array, when you run out of bytes, you start
over. My question is this- my key is 16 bytes, which will XOR with 4
long values. But 18 (the size of the long array) modulo 4 is 2. Is is
possible I shouldn't be XORing those last two longs, since the key
won't go in there evenly?
Thanks again to everyone for your suggestions.
Vaughn
[quote]Correct. "The integral promotions are performed on each of the
operands. The type of the result is that of the promoted left
operand. If the value of the right operand is negative or is greater
than or equal to the width in bits of the promoted left operand, the
behavior is undefined."
[/quote]
The previous paragraph was written by Richard Heathfield. :-)
[quote][If Gordon Burditt (or indeed anyone, but I'm thinking particularly of
Gordon Burditt) replies to this, he may quote my article if and only
if he attributes its authorship correctly and retains the
attributions shown above.]
--
Richard Heathfield <http://www.cpax.org.uk
Email: -http://www. +rjh at (no spam)
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within[/quote] |
|
|
| Back to top |
|
|
|
| Vaughn T... |
Posted: Thu Oct 29, 2009 2:12 pm |
|
|
|
Guest
|
[quote]Though another thought occurred to me this morning. I know that
Blowfish keys are supposed to repeat (as the comment in the source
code says, ab=abab,) From the source it's obvious that when X-ORing
the key bytes into the P-array, when you run out of bytes, you start
over. My question is this- my key is 16 bytes, which will XOR with 4
long values. But 18 (the size of the long array) modulo 4 is 2. Is is
possible I shouldn't be XORing those last two longs, since the key
won't go in there evenly?
[/quote]
That theory about the P-array was easy to prove wrong.
I monkeyed with the parameters around a bit more.
A one-bit change to the key produces major changes in the output, so
the key is unlikely to be the problem, unless by some very strange
coincidence I have a key that's "almost" right. But a one-bit change
to the initial vector didn't necessarily produce a major change.
Currently I'm thinking there's a problem with my cipher-feedback.
Excuse me if this is a obvious question, but I'll ask it anyway: When
reviewing the Wikipedia article on block cipher modes, they mentioned
the shift-register mode of CFB, which can be used to encrypt/decrypt
data an arbitrary number of bits at a time. Am I correct in assuming
that data encrypted by CFB N-bits-at-a-time mode must be decrypted by
CFB in N-bits-at-a-time mode? I hacked out a quick one-byte-at-a-time
version of my Blowfish test routine, and unless I did it totally
wrong, it appears that the modes are incompatible.
Vaughn |
|
|
| Back to top |
|
|
|
| rossum... |
Posted: Thu Oct 29, 2009 3:38 pm |
|
|
|
Guest
|
On Thu, 29 Oct 2009 09:52:27 -0700 (PDT), Vaughn T
<vtreude at (no spam) ulsinc.com> wrote:
[quote]Hadn't thought about the shift issue. I know bitfields in C (yes, this
is old-fashioned C language) are undefined in their directionality,
but I would expect that shifts would work the same on Unix/Linux (the
usual target for open source files like this one) and Visual Studio.
Especially since I'm only using unsigned integers. If I was compiling
for some kind of micro, yes I'd expect it to be weird.
Left shifts are usually OK, but there can be a problem with right[/quote]
shifts as to whether or not the shift is sign extended, Java >> vs
Java >>>. This might affect things if the left hand bit of the
operand is set.
rossum |
|
|
| Back to top |
|
|
|
| Flash Gordon... |
Posted: Thu Oct 29, 2009 7:12 pm |
|
|
|
Guest
|
rossum wrote:
[quote]On Thu, 29 Oct 2009 09:52:27 -0700 (PDT), Vaughn T
vtreude at (no spam) ulsinc.com> wrote:
Hadn't thought about the shift issue. I know bitfields in C (yes, this
is old-fashioned C language) are undefined in their directionality,
but I would expect that shifts would work the same on Unix/Linux (the
usual target for open source files like this one) and Visual Studio.
Especially since I'm only using unsigned integers. If I was compiling
for some kind of micro, yes I'd expect it to be weird.
Left shifts are usually OK, but there can be a problem with right
shifts as to whether or not the shift is sign extended, Java >> vs
Java >>>. This might affect things if the left hand bit of the
operand is set.
[/quote]
In C, unsigned integers don't *have* a sign bit, so it is guaranteed
that it won't be sign extended. For signed integer types (which the OP
is NOT using) it is implementation defined.
--
Flash Gordon |
|
|
| Back to top |
|
|
|
|
|
All times are GMT - 5 Hours
The time now is Sat Nov 28, 2009 5:25 pm
|
|