If you're serializing the JSON, the result is UTF-8. You're signing UTF-8 bytes... don't mess with them, and the signature will stay the same. Also, for JWT, the UTF-8 is then converted to its base64url representation and attached to the signature. You're signing the UTF-8 bytes, not JSON. It doesn't matter how it's serialized; if it isn't UTF-8, you're doing it wrong. The order of properties doesn't matter, because the signature is over the bytes.
Which means that you have to be careful about how you remove the signature in order to verify the original. The "main" Node implementation does this by parsing the JSON, removing the field, and then reserializing, which forces any alternate implementation to match the Node serialization exactly in order to be compatible.
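A minimal Python sketch of that parse, strip, reserialize pattern, to show where the coupling comes from. The key, the `signature` field name, and the compact serialization in `canonical_bytes` are all assumptions made for illustration; this is not the actual Node implementation's format.

```python
import hashlib
import hmac
import json

KEY = b"demo-key"  # hypothetical shared secret, for illustration only


def canonical_bytes(obj: dict) -> bytes:
    # One arbitrary serialization choice. Signer and verifier must both
    # reproduce it byte for byte, which is the compatibility burden.
    return json.dumps(obj, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def sign_in_band(obj: dict) -> dict:
    # Compute the tag over the serialized object, then add it as a field.
    tag = hmac.new(KEY, canonical_bytes(obj), hashlib.sha256).hexdigest()
    return {**obj, "signature": tag}


def verify_in_band(signed: dict) -> bool:
    tag = signed.get("signature")
    if not isinstance(tag, str):
        return False
    # Remove the field, reserialize, and recompute the tag.
    body = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(KEY, canonical_bytes(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("ascii"), tag.encode("utf-8"))
```

A verifier that reserializes with different whitespace (for example plain `json.dumps(body)`, which emits `{"a": 1}` rather than `{"a":1}`) produces different bytes and therefore a different tag, even though the JSON is equivalent.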
As far as I can tell, it's because they are hardcore JS devs, for whom JSON is seen as almost a part of the language, and little to no effort was put into things like future-proofing. A lot of the failings have been fixed as the community has grown, but this one is core to how the whole thing works, so changing it at this point is fairly non-trivial (it would basically involve completely breaking any kind of backwards compat, and potentially even some forwards compat).
probably they did that, then Roy Fielding came in, unplugged their router and wouldn't let them back on the internet til they stopped polluting their resources with metadata
At least when I implemented it, it was the full message, as a complete, valid JSON object. The signature would get added as another field, and then the object was reserialized.
This is why JWT is better... JWT signs "header.body", with each part being the base64url encoding of its JSON, joined with a period. The content of the body and header is immaterial.
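For reference, a rough Python sketch of that "header.body" construction using HMAC-SHA256 (what JWT calls HS256). The function names are made up for this example, and it deliberately skips things a real JWT library does, such as checking the `alg` header and the claims.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def b64url(data: bytes) -> str:
    # base64url without padding, as used in JWT compact serialization.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def jws_sign(payload: bytes, key: bytes) -> str:
    header = json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":"))
    signing_input = f"{b64url(header.encode('utf-8'))}.{b64url(payload)}"
    tag = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(tag)}"


def jws_verify(token: str, key: bytes) -> Optional[bytes]:
    # Returns the exact payload bytes that were signed, or None.
    try:
        header, body, tag = token.split(".")
        signing_input = f"{header}.{body}".encode("ascii")
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(tag)):
            return None
        return b64url_decode(body)
    except ValueError:  # wrong part count, non-ASCII input, bad base64
        return None
```

The verifier never parses or reserializes the payload before checking the tag; it recomputes the HMAC over the received `header.body` text, so whitespace and key order inside the JSON never come into play.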
It's not "better", it solves a materially different problem. The article already acknowledges that if you can afford to, you should just stick a tag on the outside. (That's what JWT does, but the actual thing recommended, just HMAC, is better still, for reasons mentioned elsewhere in the comments.)
This way you have full control over the raw bytes you want to sign (by forcing them into Base64 where other systems can't get their dirty paws on them).
I guess the problem here is if intermediate systems want to do stuff based on the payload (but without validating it), they won't like this.
But if the problem is just intermediate systems barfing on non-json, this might work!
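Roughly what I have in mind, as a Python sketch. The `payload` and `tag` field names and the hex HMAC-SHA256 tag are just assumptions for the example.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def wrap(payload: bytes, key: bytes) -> str:
    # Sign the raw payload bytes, then hide them inside a Base64 string so
    # the outer JSON can be reformatted without touching them.
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    envelope = {"payload": base64.b64encode(payload).decode("ascii"), "tag": tag}
    return json.dumps(envelope)


def unwrap(envelope: str, key: bytes) -> Optional[bytes]:
    # Returns the original payload bytes, or None if the tag does not match.
    try:
        outer = json.loads(envelope)
        payload = base64.b64decode(outer["payload"])
        tag = str(outer["tag"]).encode("utf-8")
    except (ValueError, KeyError, TypeError):
        return None
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    return payload if hmac.compare_digest(expected, tag) else None
```

An intermediary can reorder keys or pretty-print the envelope and verification still succeeds, but it cannot read the payload without decoding the Base64 first.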
p.s. enjoyable blog post - as they always are! ;-)
Yep! That works, but it's essentially the first option ("How to sign a JSON object") but with JSON as the outer serialization format instead of a comma in the middle.
You also correctly identified why that is different from the other schemes: they don't change the structure of the outer object.
You could also serialize the JSON with a placeholder string (all spaces or zeroes or something), calculate the HMAC, and substitute the real tag for the placeholder. You could then do that in reverse on the receiving end. The deserializer could easily note the offset of the HMAC, which could then be verified against the original bytes.
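A sketch of the placeholder idea in Python. To keep it short it assumes the tag is always the last member of the object, so the offset is fixed relative to the end of the message instead of being noted by the parser; that convention, the `tag` field name, and the all-zero placeholder are my assumptions.

```python
import hashlib
import hmac
import json

TAG_LEN = 64                  # length of a hex-encoded HMAC-SHA256 tag
PLACEHOLDER = b"0" * TAG_LEN  # stands in for the tag while signing
SUFFIX = b'"}'                # bytes that follow the tag in this layout


def sign_with_placeholder(obj: dict, key: bytes) -> bytes:
    # Serialize once with the placeholder as the final member.
    body = {k: v for k, v in obj.items() if k != "tag"}
    body["tag"] = PLACEHOLDER.decode("ascii")
    raw = json.dumps(body).encode("utf-8")
    start = len(raw) - len(SUFFIX) - TAG_LEN
    # HMAC the bytes that still contain the placeholder, then splice in the tag.
    tag = hmac.new(key, raw, hashlib.sha256).hexdigest().encode("ascii")
    return raw[:start] + tag + raw[start + TAG_LEN:]


def verify_with_placeholder(raw: bytes, key: bytes) -> bool:
    start = len(raw) - len(SUFFIX) - TAG_LEN
    end = start + TAG_LEN
    if start < 0 or raw[end:] != SUFFIX:
        return False
    # Put the placeholder back into the received bytes; no reserialization.
    restored = raw[:start] + PLACEHOLDER + raw[end:]
    expected = hmac.new(key, restored, hashlib.sha256).hexdigest().encode("ascii")
    return hmac.compare_digest(expected, raw[start:end])
```

The result is still a normal JSON object, but verification only works on the exact bytes that were sent: anything that reparses and rewrites the message in transit breaks the tag.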
The original spec said UTF-8/16/32, but I'm unaware of any reference implementation that used anything other than UTF-8... though who knows with hand-rolled crap, and Windows in the mix.
That's true if you have the luxury of a traditional tag on the outside, but falls apart when you have systems where signatures are in-band, like SAML, which is where canonicalization shows up. (Both for UTF-8 itself, which isn't canonical, but especially for your serialization format e.g. XML/JSON, which is usually far hairier.)
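One reading of "UTF-8 itself isn't canonical" is Unicode normalization: the same visible text can be written as different code point sequences, which encode to different bytes. A small Python illustration of that reading, with a made-up key and string:

```python
import hashlib
import hmac
import unicodedata

KEY = b"demo-key"  # hypothetical key, for illustration only


def tag(text: str) -> str:
    # HMAC over the UTF-8 encoding of the string as given.
    return hmac.new(KEY, text.encode("utf-8"), hashlib.sha256).hexdigest()


composed = "caf\u00e9"                                # precomposed e-acute (NFC)
decomposed = unicodedata.normalize("NFD", composed)   # "e" + combining accent

print(composed.encode("utf-8"))    # b'caf\xc3\xa9'
print(decomposed.encode("utf-8"))  # b'cafe\xcc\x81'
print(tag(composed) == tag(decomposed))  # False
```

Both strings render identically, so any system that normalizes text between signer and verifier silently invalidates the tag.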
If you treat the original UTF-8 as bytes, then you still have that as bytes... this is part of why JWT uses base64 around the JSON string's bytes. In any case, you don't create a signature on a string per se, you create it on bytes, even if those bytes are derived from a string. Also, you don't validate a signature by re-creating the bytes in question... that's a flawed approach, and not the approach for example JWT takes.
> In any case, you don't create a signature on a string per se, you create it on bytes, even if those bytes are derived from a string.
Everyone is on the same page that at some point bytes go into a hash function. That's not the problem.
You mention JWT, but JWT only does external signing, which doesn't trigger the problematic case several people are describing to you. Perhaps an example would be more useful. If you start with a JSON object like:
{"a": 1}
how do you build a JSON object like:
{"a": 1, "tag": "deadbeefdeadbeefdeadbeef"}
with a signing and verification algorithm that works?
> Also, you don't validate a signature by re-creating the bytes in question... that's a flawed approach, and not the approach for example JWT takes.
Can you describe an HMAC validation process that doesn't involve recreating the bytes that went into the HMAC tag?
That is literally the first thing suggested in the post. You can say the other thing is a "flawed approach" but that's the design problem being solved, so your answer is simply not responsive to the question.
Don't worry, multiple people (including me) have told the community in question that it's a flawed approach. Doesn't seem to have done much to fix it though... :shrug: