Bug #35576 Provide alternative output encodings for hash functions
Submitted: 26 Mar 2008 15:52 Modified: 27 Mar 2008 9:31
Reporter: Marcus Bointon
Status: Verified
Category:MySQL Server: Charsets Severity:S4 (Feature request)
Version: OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: encoding, hash, md5, sha1, string

[26 Mar 2008 15:52] Marcus Bointon
Description:
The MD5() and SHA1() functions generate 128-bit and 160-bit integers respectively, encoded as base-16 strings, giving 32 chars for MD5 and 40 chars for SHA1. I'd like to see an additional option to allow the encoding base to be chosen. In particular, I'd like to see base-62 ([0-9a-zA-Z]+, DNS-safe) and base-64 (two flavours in common use: [0-9a-zA-Z\+\/]+ (MIME) and [0-9a-zA-Z_-]+ (URL-safe)). This would allow these hash representations to be stored with far fewer chars, reducing storage and index requirements, speeding up searching etc. I have databases with millions of hashes in them, and every character saved helps.
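To make the request concrete, here is a minimal client-side sketch in Python (not MySQL) of the encodings described above. The function name md5_encodings is hypothetical, and stripping the '=' padding is an assumption about how a shortened form would be stored.

```python
import base64
import hashlib


def md5_encodings(data: bytes) -> dict:
    """Return the MD5 digest of `data` in several text encodings."""
    digest = hashlib.md5(data).digest()  # 16 raw bytes (128 bits)
    return {
        # Base-16, which is what MD5() returns today.
        "hex": digest.hex(),
        # Standard (MIME) base-64 alphabet, with '=' padding stripped (assumption).
        "base64_mime": base64.b64encode(digest).decode("ascii").rstrip("="),
        # URL-safe alphabet: '-' and '_' replace '+' and '/'.
        "base64_url": base64.urlsafe_b64encode(digest).decode("ascii").rstrip("="),
    }


encodings = md5_encodings(b"hello")
lengths = {name: len(text) for name, text in encodings.items()}
print(lengths)
```

For a 128-bit digest the base-16 form is 32 chars and both base-64 forms are 22 chars once padding is removed.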

This should perhaps be considered alongside http://bugs.mysql.com/bug.php?id=35188. Maybe it would be better to provide MD5_62() and MD5_64A() functions to accomplish this so they could allow multiple params.

How to repeat:
This is a feature request.
[26 Mar 2008 16:34] Susanne Ebrecht
Many thanks for writing a feature request.

For what purpose do you want to use this crypt algorithm?
For login (password)?
For text fields?
For encrypting tables/rows/databases?
[26 Mar 2008 16:57] Marcus Bointon
It could apply to any of those - all of them would benefit from reduced storage requirements.

I use hashes to identify many things that need to be available on public URLs, where knowing one URL must not let you work out another. For example, I might have a URL that allows a user to unsubscribe from a mailing list without requiring additional authentication, something like:

http://www.example.com/unsubscribe/9c16d8a4f4a4760d0b8070591d8c374e

With my proposed feature, it might look more like this:

http://www.example.com/unsubscribe/9dAH75Ggd0

You can see that it makes URLs shorter, reduces storage space and improves search speed.

I have numerous other uses similar to this - and I have over 40 million hashes stored in my current database so I could probably save 60% storage space with this improvement.
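As a rough check on the size of the saving, the following Python sketch computes how many characters a hash needs in each base. It assumes one byte per stored character and ignores row and index overhead; the names encoded_length, saving_percent and HASH_COUNT are illustrative only.

```python
import math

HASH_COUNT = 40_000_000  # "over 40 million hashes", taken as a round figure


def encoded_length(bits: int, base: int) -> int:
    """Characters needed to represent any `bits`-bit value in `base`."""
    return math.ceil(bits / math.log2(base))


def saving_percent(bits: int, base: int) -> float:
    """Percentage of characters saved relative to base-16."""
    hex_len = encoded_length(bits, 16)
    return 100.0 * (hex_len - encoded_length(bits, base)) / hex_len


for name, bits in (("MD5", 128), ("SHA1", 160)):
    for base in (62, 64):
        chars = encoded_length(bits, base)
        saved_bytes = (encoded_length(bits, 16) - chars) * HASH_COUNT
        print(f"{name} base-{base}: {chars} chars, "
              f"{saving_percent(bits, base):.1f}% fewer, "
              f"{saved_bytes / 1e6:.0f} MB saved at one byte per char")
```

Under these assumptions the reduction works out to about 31% for MD5 and 32.5% for SHA1 in either base-62 or base-64. Storing the raw digest in a binary column would halve the size compared with base-16.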

An alternative implementation could be to allow native 128-bit and 160-bit integer types (maybe bigger too), and to provide encoding functions that represent those large numbers in arbitrary bases.
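The integer-based alternative can be sketched in Python, where integers are already arbitrary-precision. This is only an illustration of the idea, not a proposed MySQL interface: encode_int, decode_int and the digit order of ALPHABET_62 are assumptions.

```python
import hashlib

# Hypothetical base-62 alphabet matching [0-9a-zA-Z]; the digit order is an assumption.
ALPHABET_62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode_int(value: int, alphabet: str = ALPHABET_62) -> str:
    """Encode a non-negative integer using the given alphabet as digits."""
    if value < 0:
        raise ValueError("value must be non-negative")
    base = len(alphabet)
    if value == 0:
        return alphabet[0]
    digits = []
    while value:
        value, remainder = divmod(value, base)
        digits.append(alphabet[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(digits))


def decode_int(text: str, alphabet: str = ALPHABET_62) -> int:
    """Inverse of encode_int for the same alphabet."""
    base = len(alphabet)
    value = 0
    for char in text:
        value = value * base + alphabet.index(char)
    return value


# Treat the 128-bit MD5 digest as one big-endian integer, then re-encode it.
md5_as_int = int.from_bytes(hashlib.md5(b"hello").digest(), "big")
encoded = encode_int(md5_as_int)
print(encoded, len(encoded))
```

The output length varies with the value, so a fixed-width column would need the result left-padded with the zero digit (22 chars covers any 128-bit value in base-62).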
[27 Mar 2008 9:31] Susanne Ebrecht
Many thanks for writing a feature request. We will discuss this issue.