Intro

Posted in «x.509» by hase

Let's say you want an IOU from me.
And as this is no longer the 90s, only a digitally signed document will do, right?

So I would have to
1) create a file with the content "I owe you 5 bucks",
2) run this file through a (cryptographically secure) hash function. This hash function will come out with a number, the hash value,
3) encrypt that hash value using the private half of my key pair,
4) attach this encrypted hash value to the document somehow.

Actually, the attaching part is quite optional: detached signatures living in their own file exist and are quite useful.
But after doing the 4 steps above, I have created a digitally signed document.
The encrypted hash value is the document's signature - well, basically, that is.
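The four steps above can be sketched in a few lines of Python. This is a toy under loud assumptions: the key pair is built from two well-known Mersenne primes (completely insecure, chosen only so that the modulus is larger than a SHA-256 value), the names `sign`, `N`, `E` and `D` are made up for this sketch, and real signature schemes add padding around the hash before the private-key operation.

```python
import hashlib

# Toy key pair from two known Mersenne primes. Insecure, illustration only.
P, Q = 2**521 - 1, 2**607 - 1
N = P * Q                            # modulus, part of both key halves
E = 65537                            # public exponent  -> PubKey  = (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent -> PrivKey = (D, N)

def sign(document: bytes) -> int:
    # Step 2: run the document through a hash function
    hash_value = int.from_bytes(hashlib.sha256(document).digest(), "big")
    # Step 3: "encrypt" the hash value with the private half of the key pair
    return pow(hash_value, D, N)

iou = b"I owe you 5 bucks"           # Step 1: the content
signature = sign(iou)                # Step 4 would attach this number to the file

print(hex(signature)[:34] + "...")
```

The only thing the private half touches is the fixed-size hash value, never the document itself.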

When we decode an actual signature, it looks way more complex, convoluted and involved. In the signature block of a signed document there is certainly more than just one encrypted number!
This is true, but bear with me: the encrypted hash value is the beating heart of the signature.
Everything else is just technical stuff which tells us something about that signature.
But the encrypted hash is the core, the meat, the actual signature.

The technical information is actually necessary.
Remember step 2)? Running the doc through a hash function?
Which hash function?
Well, look in the signature block: the chosen hash function is named there.
And this information is necessary when we want to verify the signature.
Other necessary technical information is for example the type of cryptographic algorithm used in the encryption step and the length of the key that was used.
Also a reference to who the signer is could prove helpful later on, right?
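To make the idea of "technical information" concrete, here is a minimal sketch of what such a signature block might carry. The field names and values are hypothetical and do not follow any real signature format; the point is only that the verifier reads the hash function's name from the block instead of guessing it.

```python
import hashlib

# Hypothetical signature block; the field names are made up for this sketch.
signature_block = {
    "hash_algorithm": "sha256",      # which hash function was used in step 2
    "crypto_algorithm": "RSA",       # which algorithm encrypted the hash
    "key_length_bits": 2048,         # length of the key that was used
    "signer": "the author",          # reference to who signed
    "encrypted_hash": "...",         # placeholder for the actual signature value
}

document = b"I owe you 5 bucks"

# The verifier picks the hash function by the name found in the block.
hasher = hashlib.new(signature_block["hash_algorithm"])
hasher.update(document)
print(hasher.name, hasher.hexdigest())
```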

But before we look into these things, let's see what the signature gives us, what the properties of the signed document are.

Function of the hash

A hash function is a bit of math that will ingest any data (no matter the size, type etc.). Anything.
And the hash function will output a number that is guaranteed to fit within a certain number of bits. Written out with leading zeroes where necessary, it has exactly that number of bits.
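A quick way to see the fixed output size, sketched with Python's standard `hashlib` module and SHA-256 as an example hash (the sample inputs are arbitrary):

```python
import hashlib

inputs = [b"", b"I owe you 5 bucks", b"x" * 10_000_000]

for data in inputs:
    digest = hashlib.sha256(data).digest()
    # The input size varies wildly, the output size never does.
    print(f"{len(data):>10} bytes in -> {len(digest) * 8} bits out")
```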

The cryptographically secure hash functions are designed to
- minimize hash collisions to practical non-existence
- create wildly different hash values for very similar input

The latter means that two very similar texts - say a comma swapped for a semicolon in a text file - still have very different hash values.
The former means that two different texts will (practically) always have different hash values; but I have to be precise here or Internet commenters will call me stupid:
Technically, every hash function must suffer from hash collisions, meaning that it is, mathematically speaking, always possible to find two texts that produce the same hash value.
But "mathematically" means "within infinite time and resources", as mathematicians are never afraid of the infinite or the infinitesimal.
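The "wildly different hash values for very similar input" property is easy to try out. The sketch below hashes two made-up sentences that differ only in a comma versus a semicolon and counts how many of the 256 output bits differ; for a good hash function, roughly half of them should.

```python
import hashlib

text_a = b"Dear reader, I owe you 5 bucks"
text_b = b"Dear reader; I owe you 5 bucks"   # comma swapped for a semicolon

hash_a = int.from_bytes(hashlib.sha256(text_a).digest(), "big")
hash_b = int.from_bytes(hashlib.sha256(text_b).digest(), "big")

# XOR leaves a 1 in every bit position where the two hash values disagree.
differing_bits = bin(hash_a ^ hash_b).count("1")
print(f"{differing_bits} of 256 bits differ")
```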

Imagine a hash function that produces an 8-bit hash value. 256 different values are possible.
So feed 300 different texts into that function and you are guaranteed to find two texts with the same hash value, i.e. a hash collision: there are simply more texts than possible values.
This example makes clear why every hash function must have collisions: even a 256-bit or 512-bit number is limited, and it is always possible (with infinite resources) to generate more than 2²⁵⁶ or 2⁵¹² different texts. So collisions are guaranteed.
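The 8-bit thought experiment can be run directly. As an assumption for this sketch, the 8-bit "hash function" `tiny_hash` is simply the first byte of a SHA-256 digest, and the 300 texts are generated from a counter. Since only 256 values exist, at least 44 of the 300 texts must collide with an earlier one.

```python
import hashlib

def tiny_hash(text: str) -> int:
    """Toy 8-bit hash: the first byte of SHA-256, so 256 possible values."""
    return hashlib.sha256(text.encode()).digest()[0]

seen = {}          # hash value -> first text that produced it
collisions = []    # (earlier text, later text, shared hash value)

for i in range(300):
    text = f"text number {i}"
    value = tiny_hash(text)
    if value in seen:
        collisions.append((seen[value], text, value))
    else:
        seen[value] = text

print(len(collisions), "collisions, for example:", collisions[0])
```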

But they are rare, because a) even "just" 2²⁵⁶ is a mind-bogglingly large number, and b) the hash functions are specifically designed to make collisions extremely improbable.
So for practical purposes, we simply accept them as collision-free.

Until we don't: MD5 was considered cryptographically secure and was the standard hash used in many digital signature schemes.
Until some clever math person showed a way of computing hash collisions - thus proving the "collision-free" assumption wrong.
The kinda-sorta successor to MD5 was the "Secure Hash Algorithm" SHA, nowadays called SHA1 - which also proved vulnerable to collision attacks (though these remain far more costly than against MD5).

So today, at the time of this writing, we use the SHA2 and SHA3 algorithms (favouring SHA2 heavily) as standard hashes for signatures.
Both come in several variants (some internal parameters set differently); the three commonly used ones give 256, 384 and 512 bits output length respectively.
Note: no, the 256-bit output of a SHA2 is not just the 512-bit value truncated to 256, they are truly different.
The "callsigns" SHA3-256, SHA3-384 and SHA3-512 are self-explanatory. For its older cousin, the names SHA256, SHA384 and SHA512 are a bit confusing sometimes, but nobody calls them by their systematic SHA2-256... abbreviations.
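Python's standard `hashlib` module knows all six variants (under its own spelling of the names), so the output lengths and the "not just truncated" note can be checked in a short sketch:

```python
import hashlib

data = b"I owe you 5 bucks"

# hashlib spells the SHA2 names without a dash and the SHA3 names with "_".
for name in ["sha256", "sha384", "sha512", "sha3_256", "sha3_384", "sha3_512"]:
    h = hashlib.new(name, data)
    print(f"{name:<9} {h.digest_size * 8} bits")

# SHA256 is not simply the first 256 bits of SHA512.
truncated_sha512 = hashlib.sha512(data).digest()[:32]
print(hashlib.sha256(data).digest() == truncated_sha512)   # False
```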

The hash function being (practically) collision-free makes the hash value a great proxy for the text: the value is specific to this very text, no other text will lead to the same value.
And the hash value - unlike the text - has a fixed length. This is always better to work with than "arbitrary length, might be Giga/Tera/Exabytes...".

In short

In short: the hash value ties the signature to the signed text.

Function of the encryption step

In step 3) I encrypted the hash value using the private half of my keypair.
In short we call this the PrivateKey or PrivKey, but remembering that this one is just half of a key pair is helpful.

This operation ties the signature to me: only I have access to my PrivKey.
Therefore only I can encrypt something (like a hash value) using my PrivKey.
This is what makes the signature my signature.

It is important to remember this part: whoever has your PrivKey makes your signature.
Not "forges your signature" or "makes something like your signature".
Because this is cryptography we are discussing here - and cryptography is math - it is your signature. With mathematical precision.

In customer projects involving Key Pair Cryptography, I always stress this point, and I always look into disbelieving eyes.
Because there is no equivalent in the analog world, where forgeries are always somehow distinguishable from the real thing.
Not here: your PrivKey - your signature.

Because it is so immensely important to limit access to a PrivKey and protect it from being copied, clever devices that help with this protection against copying have been designed. But I will discuss Smartcards and their big cousins, the Hardware Security Modules, in another article diving into that particular rabbit hole.

"But what about the IT Admin Motto?" you may ask.
(The Admin Motto is - as we probably all know - "No Backup? No pity!".)
Yes, backups are important. Necessary.
But not for your Signature-PrivKey. Losing that to some malfunction (say: a broken Smartcard) is like losing your favourite fountain pen.
It may temporarily set you back, but get a new one and you are back in signing business.

Losing an encryption PrivKey is another matter: that locks you out of all the data encrypted with that key pair. Problem.
So there a backup makes sense.
This is, by the way, one of the main reasons behind the "different KeyPairs for different uses" rule and why we have "key usage" and "extended key usage" attributes in X.509 certificates: to indicate the intended use of the KeyPair and the resulting backup strategy.
But I should discuss that in another article again :-)

So again: using my PrivKey makes the signature my signature.
And a document may have multiple signatures, of course. A contract is a good example.
Or a petition to the European Union to get their act together regarding regulation of crypto - but wait! Semken! Stick with the tech here, the political stuff goes in the blog.

Verifying the signature

As we have seen above: the process of hash->encrypt->attach created the signature (for the IOU in our example).
But now you would want to check whether the thing was really signed, and by whom; trust is good - checking is better ("Vertrauen ist gut, Kontrolle ist besser", German proverb).

To check the signature, all you need to do is
1) calculate the hash again from the text,
2) decrypt the encrypted hash,
3) compare the two.

If the comparison checks out, the signature belongs to the text (hash values match => this is exactly the text that was signed) and it was made by me, because my PubKey decrypted the hash.
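The three verification steps can be sketched as follows. The same loud assumptions apply as for any toy: the key pair is built from two known Mersenne primes and is completely insecure, the helper names `hash_of`, `sign` and `verify` are made up for this sketch, and real schemes wrap the hash in padding before the key operation.

```python
import hashlib

# Toy key pair from two known Mersenne primes. Insecure, illustration only.
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537                  # PubKey  = (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))    # PrivKey = (D, N), known to the signer only

def hash_of(document: bytes) -> int:
    return int.from_bytes(hashlib.sha256(document).digest(), "big")

def sign(document: bytes) -> int:
    return pow(hash_of(document), D, N)      # signer's side, uses the PrivKey

def verify(document: bytes, signature: int) -> bool:
    recomputed = hash_of(document)           # 1) calculate the hash again
    decrypted = pow(signature, E, N)         # 2) decrypt with the PubKey
    return recomputed == decrypted           # 3) compare the two

iou = b"I owe you 5 bucks"
signature = sign(iou)

print(verify(iou, signature))                    # True
print(verify(b"I owe you 50 bucks", signature))  # False
```

Note that `verify` never touches `D`: the public half of the key pair is enough to check the signature.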

Oh, did I mention that already?
The basic way that Key Pair Cryptography works is: what one key encrypts, the other one decrypts.
And btw: the official term is Public Key Cryptography, not Key Pair Cryptography.
But the PubKey is never involved alone; any transaction involves both halves of the key pair.
At different times and probably in different places - but both are involved.
Here, the PrivKey half of my KeyPair is used to create the signature, the PubKey half is used to verify it.
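"What one key encrypts, the other one decrypts" can be tried with textbook RSA and deliberately tiny numbers. The primes 61 and 53 and the message value 65 are arbitrary choices for this sketch; nothing this small (or this unpadded) is secure.

```python
# Textbook RSA with tiny primes, only to show the symmetry of the two key halves.
p, q = 61, 53
n = p * q                            # 3233, part of both key halves
e = 17                               # public exponent  -> PubKey  = (e, n)
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent -> PrivKey = (d, n)

message = 65                         # any number smaller than n

# Direction 1 (encryption): PubKey encrypts, PrivKey decrypts.
ciphertext = pow(message, e, n)
print(pow(ciphertext, d, n) == message)

# Direction 2 (signature): PrivKey encrypts, PubKey decrypts.
signature = pow(message, d, n)
print(pow(signature, e, n) == message)
```

Both directions print `True`; the signature case is simply the second direction applied to a hash value.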

So all you need to check my signature is some information about the signature: what hash function was used, what crypto algorithm with which key length, and whose PubKey to use for the decryption. That is the technical info in the signature block.

This leaves but one question:
How do you know that my PubKey is actually my PubKey (and not one that an impostor has created)?

That is where Certificates come in.
And that is a story for another article.