An introduction to hSub

What is an hSub?

hSub is an acronym for Hashed Subject. Its function is to provide a means for individuals to identify their own messages within a shared mailbox using a specially coded Subject header.

What's a shared mailbox?

As its name suggests, a shared mailbox is one used by more than one person. All of the messages in the mailbox are encrypted and can only be decrypted by their owner. In the context of this page, the shared mailbox in use is the Usenet Newsgroup alt.anonymous.messages.

Using an hSub to identify the message owner

Strictly speaking, it's impossible to identify a message owner using an hSub. The hSub employs a one-way hash function which, as its name suggests, can only be encoded, not decoded. This is an important concept to grasp in understanding an hSub. The only way to verify whether a message belongs to an individual is to encode another hSub using the same criteria as were used to generate the one being checked. If the criteria match, the resulting hSub will be identical. This page refers to such a match as a collision.
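The recompute-and-compare idea can be illustrated with a minimal Python 3 sketch. It assumes SHA-256 as the one-way hash, as in the reference implementation further down this page. The function name `fingerprint` and the sample values are hypothetical and exist only for illustration.

```python
from hashlib import sha256

def fingerprint(random_number: bytes, passphrase: bytes) -> str:
    """Hash the public random number together with the private passphrase."""
    return sha256(random_number + passphrase).hexdigest()

# Hypothetical sample values, for illustration only.
public_part = bytes.fromhex("0011223344556677")
published = fingerprint(public_part, b"my passphrase")

# The owner recomputes with the same inputs and gets an identical result.
owner_matches = fingerprint(public_part, b"my passphrase") == published
# Anyone using a different passphrase gets a different result.
stranger_matches = fingerprint(public_part, b"other passphrase") == published
```

Nothing in the published value is decoded at any point; the check works purely by hashing again and comparing.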

Getting technical

An hSub is built from two parts, a random number and a passphrase. The random number is public but the passphrase is private. The random number is always generated by the server or service and is openly included in the published hSub. Each user of the shared mailbox then tries to generate an hSub using that same random number and their own passphrase. A collision informs the user that the message being tested is for them.
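As a rough Python 3 sketch, a mailbox user might scan Subject headers like this. It assumes the layout used by the reference implementation on this page (the first 16 hex digits are the public random number, followed by a SHA-256 digest of random number + passphrase) and untruncated hSubs. The names `make_hsub` and `find_my_subjects`, and the sample mailbox, are hypothetical.

```python
from hashlib import sha256
from os import urandom

def make_hsub(passphrase: bytes, iv=None) -> str:
    """Build a full-length hSub: hex of (iv + SHA-256(iv + passphrase))."""
    if iv is None:
        iv = urandom(8)  # 64-bit public random number
    return (iv + sha256(iv + passphrase).digest()).hex()

def find_my_subjects(subjects, passphrase: bytes):
    """Return the subjects that match an hSub generated from our passphrase."""
    mine = []
    for subject in subjects:
        try:
            iv = bytes.fromhex(subject[:16])  # public part, 16 hex digits
        except ValueError:
            continue  # not hex, so not an hSub
        if len(iv) == 8 and make_hsub(passphrase, iv) == subject:
            mine.append(subject)
    return mine

# Hypothetical mailbox: one message for us, one for someone else, one plain.
mailbox = [make_hsub(b"alice secret"), make_hsub(b"bob secret"), "Hello world"]
alice_hits = find_my_subjects(mailbox, b"alice secret")
```

Every user runs the same scan over the same public Subjects, yet each only ever matches the messages generated with their own passphrase.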

The Gory Details

If you just want to use hSubs, you can ignore this section. If you want a technical understanding, please read on.

As mentioned in the above section, an hSub is generated using a random number and a passphrase. The actual process is:

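[Figure: hSub Generation Process. A 64-bit random number (the IV) is prepended to the passphrase and hashed with SHA-256; the IV and the resulting 256-bit digest are then joined and hex encoded.]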

The resulting hSub is always 80 hex digits (320 bits) but is frequently truncated, for example to 48 digits, to make it indistinguishable from the older eSub format. When validating an hSub, the length of the original should define how many bits must collide.
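Truncation and length-aware validation might look like the following Python 3 sketch. It assumes the layout from the reference implementation below (a 64-bit IV followed by a SHA-256 digest, hex encoded) and the same 48 to 80 digit bounds. The names `full_hsub` and `matches` and the fixed IV are hypothetical, chosen only for illustration.

```python
from hashlib import sha256

def full_hsub(passphrase: bytes, iv: bytes) -> str:
    """Return the untruncated hSub: 16 hex digits of IV + 64 of digest."""
    return (iv + sha256(iv + passphrase).digest()).hex()

def matches(passphrase: bytes, candidate: str) -> bool:
    """Compare only as many hex digits as the candidate actually carries."""
    if not 48 <= len(candidate) <= 80:
        return False
    try:
        iv = bytes.fromhex(candidate[:16])
    except ValueError:
        return False  # not hex, so not an hSub
    return full_hsub(passphrase, iv)[:len(candidate)] == candidate

iv = bytes.fromhex("0123456789abcdef")  # fixed IV, for illustration only
full = full_hsub(b"Pass phrase", iv)    # 80 hex digits
short = full[:48]                       # truncated to the eSub-compatible length
```

Both `full` and `short` validate against the same passphrase, because the checker truncates its own result to the length of the value it was handed.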

Reference Implementation

The following Python 2 code provides a reference for generating and testing hSubs.
# vim: tabstop=4 expandtab shiftwidth=4 autoindent
# Copyright (C) 2010 Steve Crook 
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2, or (at your option) any later
# version.
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
# or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
# for more details.

from hashlib import sha256
from os import urandom


# Default hSub length in hex digits.  48 is an assumed value, chosen to
# match the truncated, eSub-compatible length described above.
HSUBLEN = 48

def hash(text, iv = None, hsublen = HSUBLEN):
    """Create an hSub (Hashed Subject). This is constructed as:
    | 64bit iv | 256bit SHA2 'iv + text' |"""
    # Generate a 64bit random IV if none is provided.
    if iv is None: iv = cryptorandom()
    # Concatenate our IV with a SHA256 hash of IV + text.
    hsub = iv + sha256(iv + text).digest()
    return hsub.encode('hex')[:hsublen]

def check(text, hsub):
    """Create an hSub using a known iv, (stripped from a passed hSub).  If
    the supplied and generated hSubs collide, the message is probably for
    us."""
    # We are prepared to check variable length hsubs within boundaries.
    # The low bound is the current Type-I esub length.  The high bound
    # is the full hSub: the 64 bit IV plus the 256 bits of SHA2-256.
    hsublen = len(hsub)
    # 48 digits = 192 bit hsub, the smallest we allow.
    # 80 digits = 320 bit hsub, the full length of SHA256 + 64 bit IV
    if hsublen < 48 or hsublen > 80: return False
    iv = hexiv(hsub)
    if not iv: return False
    # Return True if our generated hSub collides with the supplied
    # sample.
    return hash(text, iv, hsublen) == hsub

def cryptorandom(bytes = 8):
    """Return a string of random bytes. By default we return the default
    IV length (64bits)."""
    return urandom(bytes)

def hexiv(hsub, digits = 16):
    """Return the decoded IV from an hsub.  By default the IV is the first
    64bits of the hsub.  As it's hex encoded, this equates to 16 digits."""
    # We don't want to process IVs of inadequate length.
    if len(hsub) < digits: return False
    try:
        iv = hsub[:digits].decode('hex')
    except TypeError:
        # Not all Subjects are hSub'd so just bail out if it's non-hex.
        return False
    return iv

def main():
    """Only used for testing purposes.  We generate an hSub and then check it
    using the same input text."""
    passphrase = "Pass phrase"
    hsub = hash(passphrase)
    iv = hexiv(hsub)
    print "Passphrase: " + passphrase
    print "IV:   %s" % iv.encode('hex')
    print "hsub: " + hsub
    print "hsub length: %d hex digits" % len(hsub)
    print "Should return True:  %s" % check(passphrase, hsub)
    print "Should return False: %s" % check('false', hsub)

# Call main function.
if (__name__ == "__main__"):
    main()