Saturday, 25 October 2014

OpenPGP vanity keys with a patched GnuPG

I'm vain, but also lazy to remember long strings of arbitrary hexadecimal digits. Generating vanity keys for GnuPG would address both needs (wants): a vanity key fingerprint would be meaningful, and therefore more memorable.

What is a memorable key fingerprint? This is not one:
pub 1024D/0DA1756824EEB426 2001-01-15 [expired: 2006-01-14] uid Bernd Jendrissek (or owner of this keypair) <Bernd.Jendrissek@mailbox.co.za> uid Bernd Jendrissek <Bernd.Jendrissek@mailbox.co.za>
That is one of my old keys, and it happens that I did memorize at least the short-form keyid - but what a nerdy thing to do. And not something I can expect other people to remember - to them it's just an arbitrary hex string, with little personal relevance.

The obvious solution is to generate a key whose fingerprint carries some meaning, that's easier to remember. A word maybe, spelled in l33tspeak, or some recognizeable number like pi or e (a shibboleth for fellow nerds).

Something like this:
pub 4096R/FAC02779DEC0DED1 2014-01-21 [expires: 2024-01-19] uid Bernd Jendrissek

That's my now-current key that should have propagated to a keyserver near you.

By what black magic did I do this? It took me a while to get to a solution that works well enough to find worthwhile vanity keys. At first I just ran gpg in a shell loop, generating keys one by one. But that's ridiculously slow - it generates a whole new key from scratch each time, which exhausts the entropy pool (arguably a bug in GnuPG - /dev/urandom is really good enough) and does a whole bunch of CPU-intensive computing for every key.

Unfortunately, it takes minutes for standard GnuPG to generate just one key (although most of that is probably in waiting for more entropy). I'd be dead 500 times over before I had a good sampling of word-like key fingerprints. No, I would have to get my hands dirty by patching GnuPG and the libraries it uses. Here's one for libgcrypt:
diff --git a/cipher/rsa.c b/cipher/rsa.c index ccc9f96..319c001 100644 --- a/cipher/rsa.c +++ b/cipher/rsa.c @@ -177,6 +177,7 @@ check_exponent (void *arg, gcry_mpi_t a) * TRANSIENT_KEY: If true, generate the primes using the standard RNG. * Returns: 2 structures filled with all needed values */ +int delta __attribute__((visibility("default"))); static gpg_err_code_t generate_std (RSA_secret_key *sk, unsigned int nbits, unsigned long use_e, int transient_key) @@ -324,6 +328,26 @@ generate_std (RSA_secret_key *sk, unsigned int nbits, unsigned long use_e, return GPG_ERR_SELFTEST_FAILED; } + if (getenv("LIBGCRYPT_MINE")) { + int spread = atoi(getenv("LIBGCRYPT_MINE")); + for (delta = -spread; delta <= 0; delta++) { + switch (fork()) { + case -1: + log_debug("failed to fork\n"); + exit(1); + break; + case 0: + /* Child. */ + return 0; + default: + /* Parent. */ + wait(NULL); + break; + } + } + exit(0); + } + return 0; } diff --git a/src/libgcrypt.vers b/src/libgcrypt.vers index 5a617cc..fd18fc2 100644 --- a/src/libgcrypt.vers +++ b/src/libgcrypt.vers @@ -104,6 +104,7 @@ GCRYPT_1.2 { gcry_mpi_set_ui; gcry_mpi_snew; gcry_mpi_sub; gcry_mpi_sub_ui; gcry_mpi_subm; gcry_mpi_swap; gcry_mpi_test_bit; gcry_mpi_lshift; +delta; local: *;
This bit of code forks the process after it has generated a precious prime pair and exposes a different fudge value (delta) which gets used later in GnuPG proper. Speaking of which:
diff --git a/g10/keygen.c b/g10/keygen.c index 8c3e9f6..64cc2d6 100644 --- a/g10/keygen.c +++ b/g10/keygen.c @@ -120,6 +120,7 @@ struct opaque_data_usage_and_pk { PKT_public_key *pk; }; +extern int delta; static int prefs_initialized = 0; static byte sym_prefs[MAX_PREFS]; @@ -1435,7 +1463,7 @@ gen_rsa (int algo, unsigned nbits, KBNODE pub_root, KBNODE sec_root, DEK *dek, if (!nbits) nbits = DEFAULT_STD_KEYSIZE; - if (nbits < 1024) + if (!getenv("GNUPG_MINE") && nbits < 1024) /* XXX I'm not sure if this is important. */ { nbits = 1024; log_info (_("keysize invalid; using %u bits\n"), nbits ); @@ -1463,7 +1491,7 @@ gen_rsa (int algo, unsigned nbits, KBNODE pub_root, KBNODE sec_root, DEK *dek, sk = xmalloc_clear( sizeof *sk ); pk = xmalloc_clear( sizeof *pk ); - sk->timestamp = pk->timestamp = timestamp; + sk->timestamp = pk->timestamp = timestamp + delta; sk->version = pk->version = 4; if (expireval) { @@ -3350,6 +3378,65 @@ start_tree(KBNODE *tree) delete_kbnode(*tree); } +static char const *check_vanity(KBNODE pub_root, char keyid[static 17]) +{ + PKT_public_key *pk; + gcry_md_hd_t md; + + gcry_md_hd_t do_fingerprint_md(PKT_public_key *pk); + + if (pub_root) { + pk = find_kbnode (pub_root, PKT_PUBLIC_KEY)->pkt->pkt.public_key; + md = do_fingerprint_md(pk); + } + if (!pub_root || md) { + const byte *dp; + char const *vanity_patterns[] = { + /* Simple repetitions and sequences. */ + "(.)\\1.*\\1\\1.*\\1\\1", + "(.)\\1\\1.*\\1.*(.)\\2\\2|(.)\\1\\1.*(.).*\\2\\2\\2", + "(.)\\1\\1\\1$", + "123456|234567|345678|456789|56789A|6789AB|789ABC|89ABCD|9ABCDE|ABCDEF", + /* Math/physics constants. */ + "314159|271828|60221|1380[67]|8314[45]|299792|6626[01]|66738|9109[34]|40490FDB|27315", + /* Biographical details. */ + "^(..)*1976|^(..)*760518|^(..)*1805|^(..)*5231080", + /* 13375p34k. */ + "^([05ABCDEF]*[0AE]){2,}[05ABCDEF]*$", + "CA5CADE|0DE55A|ACCE55|A55E55|BA0BAB|CA5CADE|DECADE|DEC0DE|D0D0E5|FACADE|0B5E55|5EABED|5EAF00D|5EED", + /* Former keys. */ + "24EEB426|FCDE4B8C|804177F8|AAE76E8E|D7CBA633|D5B8044C|1726CB60", + /* Famous words. */ + "DEADBEEF|C0FFEE", + /* ASCII. */ + "^(2[0DEF]|3[0-9F]|[46].|[57][0-A])*$", + "^(..)*[46]2[46]5[57]2[46]e[46]4|[46]2[57]0[46]a|[46]a[46]5[46]e[46]4[57]2[46]9[57]3", + NULL + }; + int i; + int check_regexp(const char *expr,const char *string, int sanitize); + + if (pub_root) { + dp = gcry_md_read(md, 0); + + sprintf(keyid, "%02X%02X%02X%02X%02X%02X%02X%02X", + dp[12], dp[13], dp[14], dp[15], dp[16], dp[17], dp[18], dp[19]); + } else { + keyid[0] = 0; + } + + for (i = 0; vanity_patterns[i]; i++) { + if (check_regexp(vanity_patterns[i], keyid, 0)) { + return vanity_patterns[i]; + } + } + if (!vanity_patterns[i]) { + return NULL; + } + } + + return NULL; +} static void do_generate_keypair (struct para_data_s *para, @@ -3363,6 +3450,7 @@ do_generate_keypair (struct para_data_s *para, int rc; int did_sub = 0; u32 timestamp; + char keyid[17]; if( outctrl->dryrun ) { @@ -3453,13 +3541,15 @@ do_generate_keypair (struct para_data_s *para, linked list. The first packet is a dummy packet which we flag as deleted. The very first packet must always be a KEY packet. */ - start_tree (&pub_root); - start_tree (&sec_root); - timestamp = get_parameter_u32 (para, pKEYCREATIONDATE); if (!timestamp) timestamp = make_timestamp (); + check_vanity(NULL, keyid); + + start_tree (&pub_root); + start_tree (&sec_root); + /* Note that, depending on the backend (i.e. the used scdaemon version), the card key generation may update TIMESTAMP for each key. Thus we need to pass TIMESTAMP to all signing function to @@ -3492,6 +3582,17 @@ do_generate_keypair (struct para_data_s *para, } } + /* Check key for vanity value. */ + if (!rc && getenv("GNUPG_MINE")) { + char const *matching_pattern = check_vanity(pub_root, keyid); + if (matching_pattern) { + log_debug("Found key %s with %s, delta = %d\n", + keyid, matching_pattern, delta); + } else { + exit(0); + } + } + if(!rc && (revkey=get_parameter_revkey(para,pREVOKER))) { rc = write_direct_sig (pub_root, pub_root, pri_sk, revkey, timestamp); @@ -3637,7 +3751,7 @@ do_generate_keypair (struct para_data_s *para, keydb_release (pub_hd); keydb_release (sec_hd); - if (!rc) + if (!rc && !getenv("GNUPG_MINE")) { int no_enc_rsa; PKT_public_key *pk; diff --git a/g10/trustdb.c b/g10/trustdb.c index dedb18c..f8c2dda 100644 --- a/g10/trustdb.c +++ b/g10/trustdb.c @@ -1852,8 +1852,8 @@ sanitize_regexp(const char *old) /* Used by validate_one_keyblock to confirm a regexp within a trust signature. Returns 1 for match, and 0 for no match or regex error. */ -static int -check_regexp(const char *expr,const char *string) +int +check_regexp(const char *expr,const char *string, int sanitize) { #ifdef DISABLE_REGEX /* When DISABLE_REGEX is defined, assume all regexps do not @@ -1863,19 +1863,59 @@ check_regexp(const char *expr,const char *string) int ret; char *regexp; - regexp=sanitize_regexp(expr); + regexp=sanitize?sanitize_regexp(expr):xstrdup(expr); #ifdef __riscos__ ret=riscos_check_regexp(expr, string, DBG_TRUST); #else { + static struct { + char *pattern; + regex_t pat; + } pat_cache[100]; + static int initialized = 0; regex_t pat; + int i, i_empty; + + if (!initialized) { + for (i = 0; i < 100; i++) { + pat_cache[i].pattern = 0; + } + initialized = 1; + } + + for (i = 0, i_empty = -1; i < 100; i++) { + if (i_empty < 0 && !pat_cache[i].pattern) { + i_empty = i; + } + if (strcmp(pat_cache[i].pattern, regexp) == 0) { + break; + } + } - ret=regcomp(&pat,regexp,REG_ICASE|REG_NOSUB|REG_EXTENDED); + if (i < 100) { + memcpy(&pat, &pat_cache[i].pat, sizeof (pat)); + ret = 0; + } else { + ret=regcomp(&pat,regexp,REG_ICASE|REG_NOSUB|REG_EXTENDED); + } if(ret==0) { + if (i_empty < 0) { + for (; i < 100; i++) { + if (!pat_cache[i].pattern) { + i_empty = i; + } + } + } + if (i_empty >= 0) { + pat_cache[i_empty].pattern = xstrdup(regexp); + memcpy(&pat_cache[i_empty].pat, &pat, sizeof (pat)); + } ret=regexec(&pat,string,0,NULL,0); - regfree(&pat); + if (i_empty < 0) { + regfree(&pat); + } ret=(ret==0); } } @@ -1970,7 +2010,7 @@ validate_one_keyblock (KBNODE kb, struct key_item *klist, || opt.trust_model != TM_PGP || (uidnode && check_regexp(kr->trust_regexp, - uidnode->pkt->pkt.user_id->name)))) + uidnode->pkt->pkt.user_id->name, 1)))) { /* Are we part of a trust sig chain? We always favor the latest trust sig, rather than the greater or
This bit uses the fudge value set earlier in each variant process. (I'm lucky here in that GnuPG's keyring seems to be concurrency-safe - although the loop does wait for each child process to finish, appending its key to the keyring, before continuing to fork more child processes.) The fudge value is a delta that modifies the apparent timestamp of the key, adjusting it by up to $LIBGCRYPT_MINE seconds into the past. The timestamp affects the key fingerprint, so this is an easy way to diversify the single pair of primes that gets reused usefully up to thousands of times. (Beyond that, super-linear scaling in GnuPG's keyring access imposes the law of diminishing returns, and it's better to just generate a new pair of primes and write the resulting keys into a fresh keyring.)

The result is a stream of fast, identical keys (with just a differing timestamp), which I then filter with a few regular expressions (compiling them just once, in the parent process, and caching the compiled form for re-use in the child processes). to get a slightly slower stream of candidate keys whose fingerprint is, in some way, memorable and/or meaningful.

I chose one (FAC02779DEC0DED1) fairly arbitrarily - there were several others that took me a long while to reject. I don't need them anymore, so if anyone wants them, I'll sell them for 5mBTC per unique key. You really shouldn't take up this offer though, since these keys would come to you pre-compromised: I have access to the private keys. You'd have to just take my word for it that I'll delete each keyring that I can sell. It's really just for fun, a novelty that you could use for unimportant things. A few cool ones from a total of more than a million:
8F0230444444444C A8111111111C2A72 AB4F37D43CA5CADE 5F2833DEAD52BEEF BB3418DC0FFEEBAD AAA3B57C1CADA555 E30E7A290B314159 9CBC000011112CE1 1251D7C715EAF00D

Saturday, 18 October 2014

Indeterminate hash table traversal

On and off for a few weeks I've been trying to figure out why the gnetlist testsuite has been failing. I could've sworn that I regenerated the golden output two years ago and painstakingly checked that the differences were only in the order of output with the help of a special-purpose script.

I hadn't run the testsuite in a while, so it's likely that this was the first time I've run it since I got a new hard disk and started using a newer OS. (I'm using debian jessie now.) When I looked at the differences between the golden outputs and the currently produced outputs, it seemed clear that the differences were only in the order of output, and not due to the output being incorrect. At least, not more incorrect than the golden output.

Think things through, don't draw hasty conclusions



I pretty quickly focused on the possibility that GLib's hash table implementation had changed between Ubuntu 10.10's version and jessie's. But at first I was convinced that there was no material change in the hash table implementation: all the hash table stuff was still doing the same thing, albeit in a more cache-friendly way. I even copied GLib's ghash.c from version 2.26.1 (the one that Ubuntu 10.10 used), renamed a few functions and used those instead of the jessie-native GLib's functions, but still the output stayed in the different order.

Still, I was convinced that there must have been a change in hash table traversal order. What else could be reordering the output? I've barely worked on gEDA since getting my new hard disk, and in any case I also ran the testsuite of the very commit that last touched the golden outputs: still reordered output.

The simplicity (and reliability) of blunt tools



My next step was to use a blunt tool: duplicating more exactly the Ubuntu 10.10 environment. I downloaded the relevant debs and unpacked them in a staging directory, and built gEDA against those versions of GLib and a few others. (That required editing the pkg-config package metadata files to point at locations within the staging directory, then setting $PKG_CONFIG_PATH and also using some LDFLAGS=... and $LD_LIBRARY_PATH trickery.)

Finally! Linking to the Ubuntu 10.10 version of GLib reproduced the golden output. That meant that the change in output was a result of the change in computing environment, not some subtle bug I've introduced with my patches. Now I can confidently make some more order-changing patches (to decouple program output order from hash table implementation). Going from a known-good to a presumed-good state is a lot less scary than going from unknown-if-good to presumed-no-worse. I'll probably still check the regenerated golden output with my order-aware diff-checking script, just to be sure that I don't add bugs.

Mystery solved



With the gEDA source code validated (it now passes the testsuite, given the right libraries), I still want to satisfy my curiosity: why did the output order change, when it seemed like the GLib hash table implementation hadn't changed? Because, well, it had, and I had only overlooked what had changed: the hashing function itself, g_str_hash, that transforms a string to a hash value.

What I failed to realize during my first diff-hunt was that GLib 2.26.1's g_str_hash lived in a different file than the rest of the hash table implementation, so I simply didn't see the old version (and never noticed it was missing) while looking at changes to glib/ghash.c.

Don't depend on the order of hash table traversal!