Core Lightning implementation of BOLT #11 invoices - part 4

LIVE #20, January 04, 2024

In this live we try to understand why the description foo bar of an invoice is represented in a BOLT #11 string by vehk7grzv9eq.

Transcript with corrections and improvements

Today we are going to look at how the information of an invoice is transformed from a sequence of 8-bit bytes into a sequence of 5-bit groups.

Specifically, we'll try to understand why the description foo bar of an invoice is represented in a BOLT #11 string by vehk7grzv9eq.

Why groups of 5 bits?

Because BOLT #11 invoices use the bech32 encoding specified by BIP 173, whose character set contains 32 characters, and 32 is equal to 2 to the power of 5. Each bech32 character therefore carries exactly 5 bits.
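To make the link between the 32 characters and the 5 bits concrete, here is a minimal sketch. The CHARSET string is the bech32 character set from BIP 173; the helper name char_to_bits is only an illustration and does not come from the CLN code.

```python
# The bech32 character set from BIP 173: the position of a character
# in this string is the 5-bit value it stands for.
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def char_to_bits(ch: str) -> str:
    """Return the 5 bits carried by one bech32 character (hypothetical helper)."""
    return format(CHARSET.index(ch), "05b")

print(len(CHARSET), 2 ** 5)   # 32 32
print(char_to_bits("q"))      # 00000
print(char_to_bits("p"))      # 00001
print(char_to_bits("l"))      # 11111
```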

A BOLT #11 invoice breaks down into 4 parts

A BOLT #11 invoice breaks down into 4 parts, as shown below

   HRP      separator     data       checksum
    |           |          |             |
lnbcrt100n      1      pj47g4...      r8rsyy

where:

  • HRP stands for Human Readable Part. Here, lnbcrt100n means it is an invoice on the regtest chain (lnbcrt) for an amount of 10000msat (100n, that is 100 nano-bitcoin),

  • 1 is the separator (the last 1 in the string),

  • then we have the data part and finally

  • the checksum (6 characters) as defined by BIP 173.
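The amount in the HRP can be checked with a small sketch. It assumes the multipliers defined by BOLT #11 (m, u, n, p) and, to stay short, only handles HRPs that carry both an amount and a multiplier. The function name parse_hrp is hypothetical.

```python
import re

# BOLT #11 multipliers, written as the divisor applied to one bitcoin.
DIVISORS = {"m": 10**3, "u": 10**6, "n": 10**9, "p": 10**12}
MSAT_PER_BTC = 10**11  # 1 BTC = 10^8 sat = 10^11 msat

def parse_hrp(hrp: str):
    """Split an HRP such as 'lnbcrt100n' into (currency prefix, amount in msat).

    Simplification for this sketch: the amount and the multiplier must both be present.
    """
    match = re.fullmatch(r"ln([a-z]+?)(\d+)([munp])", hrp)
    if match is None:
        raise ValueError("HRP not handled by this sketch: " + hrp)
    currency, amount, multiplier = match.groups()
    return currency, int(amount) * MSAT_PER_BTC // DIVISORS[multiplier]

print(parse_hrp("lnbcrt100n"))  # ('bcrt', 10000)
```

This matches the 10000 passed to the invoice command, which is expressed in msat.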

Let's run some code!

Generate an invoice on regtest with the description "foo bar"

Let's start two Lightning nodes running on the Bitcoin regtest chain by sourcing the script lightning/contrib/startup_regtest.sh provided by the CLN repository and running the command start_ln:

◉ tony@tony:~/clnlive/lightning:[git»(HEAD detached at v23.11)]
$ source contrib/startup_regtest.sh
...
◉ tony@tony:~/clnlive/lightning:[git»(HEAD detached at v23.11)]
$ start_ln
...

We can check that l1-cli is just an alias for lightning-cli with the base directory being /tmp/l1-regtest:

◉ tony@tony:~/clnlive/lightning:[git»(HEAD detached at v23.11)]
$ alias l1-cli
alias l1-cli='/home/tony/clnlive/lightning/cli/lightning-cli --lightning-dir=/tmp/l1-regtest'

Now we can generate a BOLT #11 invoice like this:

◉ tony@tony:~/clnlive/lightning:[git»(HEAD detached at v23.11)]
$ l1-cli invoice 10000 label "foo bar"
{
   "payment_hash": "53afc5894cc6a9aed4a547b776ae2d4a541be02189815f8364c98a37497c0856",
   "expires_at": 1704985787,
   "bolt11": "lnbcrt100n1pjedj3msp5cga03umqm7mrjnjag6cf3vvsay3fj4an6qurmxn9zuts47wjh0qqpp52whutz2vc656a499g7mhdt3dff2phcpp3xq4lqmyex9rwjtupptqdqvvehk7grzv9eqxqyjw5qcqp2fp4pctzqnye37nx89nzl5msq2htxgaa85a6f5qkfws6dfgj48dwv9rtq9qx3qysgqxyxw4gfzgd6zgml6me24m7nr0prgvq8e726ycrj2kjjvuseep7zxrdwqz2utu5x2ejt2rrfhhfgwf7zy35qe2tu42w79y64trhjfkxcqp74tlj",
   "payment_secret": "c23af8f360dfb6394e5d46b098b190e9229957b3d0383d9a6517170af9d2bbc0",
   "created_index": 1,
   "warning_capacity": "Insufficient incoming channel capacity to pay invoice"
}

Break down the BOLT #11 invoice into 4 parts

We can break down the previous invoice

lnbcrt100n1pjedj3msp5cga03umqm7mrjnjag6cf3vvsay3fj4an6qurmxn9zuts47wjh0qqpp52whutz2vc656a499g7mhdt3dff2phcpp3xq4lqmyex9rwjtupptqdqvvehk7grzv9eqxqyjw5qcqp2fp4pctzqnye37nx89nzl5msq2htxgaa85a6f5qkfws6dfgj48dwv9rtq9qx3qysgqxyxw4gfzgd6zgml6me24m7nr0prgvq8e726ycrj2kjjvuseep7zxrdwqz2utu5x2ejt2rrfhhfgwf7zy35qe2tu42w79y64trhjfkxcqp74tlj

into the following 4 parts:

  • hrp: lnbcrt100n,

  • separator: 1,

  • data: pjedj3msp5cga03umqm7mrjnjag6cf3vvsay3fj4an6qurmxn9zuts47wjh0qqpp52whutz2vc656a499g7mhdt3dff2phcpp3xq4lqmyex9rwjtupptqdqvvehk7grzv9eqxqyjw5qcqp2fp4pctzqnye37nx89nzl5msq2htxgaa85a6f5qkfws6dfgj48dwv9rtq9qx3qysgqxyxw4gfzgd6zgml6me24m7nr0prgvq8e726ycrj2kjjvuseep7zxrdwqz2utu5x2ejt2rrfhhfgwf7zy35qe2tu42w79y64trhjfkxcq,

  • checksum: p74tlj.
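The same split can be sketched in a few lines. This is only an illustration of the 4-part layout, not the CLN parser, and the function name split_invoice is hypothetical. It relies on the fact that 1 is not in the bech32 character set, so the last 1 of the string is always the separator.

```python
def split_invoice(invoice: str):
    """Split a BOLT #11 string into (hrp, separator, data, checksum)."""
    # '1' cannot appear in the data part or in the checksum,
    # so splitting on the LAST '1' isolates the HRP.
    hrp, rest = invoice.rsplit("1", 1)
    # The last 6 characters are the bech32 checksum.
    return hrp, "1", rest[:-6], rest[-6:]

invoice = "lnbcrt100n1pjedj3msp5cga03umqm7mrjnjag6cf3vvsay3fj4an6qurmxn9zuts47wjh0qqpp52whutz2vc656a499g7mhdt3dff2phcpp3xq4lqmyex9rwjtupptqdqvvehk7grzv9eqxqyjw5qcqp2fp4pctzqnye37nx89nzl5msq2htxgaa85a6f5qkfws6dfgj48dwv9rtq9qx3qysgqxyxw4gfzgd6zgml6me24m7nr0prgvq8e726ycrj2kjjvuseep7zxrdwqz2utu5x2ejt2rrfhhfgwf7zy35qe2tu42w79y64trhjfkxcqp74tlj"

hrp, separator, data, checksum = split_invoice(invoice)
print(hrp)        # lnbcrt100n
print(data[:7])   # pjedj3m
print(checksum)   # p74tlj
```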

timestamp

The first 7 characters of the data part of the BOLT #11 invoice are the timestamp (35 bits, that is 7 characters of 5 bits each): pjedj3m.
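As a quick check, the 7 characters can be read as one big-endian integer, 5 bits per character. The helper name bech32_to_int is an assumption made for this sketch.

```python
from datetime import datetime, timezone

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def bech32_to_int(chars: str) -> int:
    """Read bech32 characters as one big-endian integer (5 bits per character)."""
    value = 0
    for ch in chars:
        value = value * 32 + CHARSET.index(ch)
    return value

timestamp = bech32_to_int("pjedj3m")
print(timestamp)  # 1704380987
print(datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat())
```

The expires_at value returned by the invoice command (1704985787) is exactly 604800 seconds, that is 7 days, after this timestamp.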

s (secret) and p (payment_hash) tagged field

After removing the timestamp from the data part of the BOLT #11 invoice, we are left with this string:

sp5cga03umqm7mrjnjag6cf3vvsay3fj4an6qurmxn9zuts47wjh0qqpp52whutz2vc656a499g7mhdt3dff2phcpp3xq4lqmyex9rwjtupptqdqvvehk7grzv9eqxqyjw5qcqp2fp4pctzqnye37nx89nzl5msq2htxgaa85a6f5qkfws6dfgj48dwv9rtq9qx3qysgqxyxw4gfzgd6zgml6me24m7nr0prgvq8e726ycrj2kjjvuseep7zxrdwqz2utu5x2ejt2rrfhhfgwf7zy35qe2tu42w79y64trhjfkxcq

That string starts with s, so it corresponds to the tagged field for the secret.

The next two characters, corresponding to the length (big-endian) of the data for the secret, are p5.

Why?

Because 52 in decimal corresponds to p5 (big-endian) in bech32: p stands for 1, 5 stands for 20, and 1 x 32 + 20 = 52, as we can see below:

/img/2024-01-04-live-0020-latex-01.png
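The same conversion can be sketched in code. The helper names decode_length and encode_length are hypothetical; the only facts used are the bech32 character set and the 10-bit big-endian data_length of BOLT #11.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def decode_length(two_chars: str) -> int:
    """Read a 10-bit big-endian data_length from two bech32 characters."""
    high, low = (CHARSET.index(ch) for ch in two_chars)
    return high * 32 + low

def encode_length(length: int) -> str:
    """Write a data_length (0..1023) as two bech32 characters."""
    if not 0 <= length <= 1023:
        raise ValueError("data_length must fit in 10 bits")
    return CHARSET[length >> 5] + CHARSET[length & 31]

print(decode_length("p5"))  # 52
print(encode_length(52))    # p5
```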

But why is the data length of the secret 52?

Because the secret is a 256-bit value. To be represented with the bech32 character set, those 256 bits must be padded with four 0s, as described in the BOLT #11 spec, to reach a length of 260 bits, a number divisible by 5.

Specifically, 260 divided by 5 is equal to 52.

/img/2024-01-04-live-0020-latex-02.png
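The padding arithmetic can be written as a tiny sketch. The function name padded_layout is hypothetical.

```python
def padded_layout(n_bits: int):
    """Return (number of 0 padding bits, number of 5-bit characters) for n_bits of data."""
    padding = -n_bits % 5            # bits missing to reach a multiple of 5
    return padding, (n_bits + padding) // 5

print(padded_layout(256))  # (4, 52)  a 256-bit value such as the secret
print(padded_layout(56))   # (4, 12)  7 bytes such as "foo bar"
```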

Finally, the data part of the s tagged field is:

cga03umqm7mrjnjag6cf3vvsay3fj4an6qurmxn9zuts47wjh0qq

In bits we have the following:

[5   bits] s
[10  bits] p5
[260 bits] cga03umqm7mrjnjag6cf3vvsay3fj4an6qurmxn9zuts47wjh0qq
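We can check that these 52 characters are the payment_secret returned by the invoice command. The sketch below regroups the 32 bytes into 5-bit groups and pads the last group with 0s. The function name bytes_to_bech32 is an assumption of this example, not a CLN function.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def bytes_to_bech32(data: bytes) -> str:
    """Regroup 8-bit bytes into 5-bit groups (0-padded) and map them to bech32 characters."""
    bits = "".join(format(byte, "08b") for byte in data)
    bits += "0" * (-len(bits) % 5)   # pad to a multiple of 5 bits
    return "".join(CHARSET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))

secret = bytes.fromhex("c23af8f360dfb6394e5d46b098b190e9229957b3d0383d9a6517170af9d2bbc0")
encoded = bytes_to_bech32(secret)
print(len(encoded))  # 52
print(encoded)       # cga03umqm7mrjnjag6cf3vvsay3fj4an6qurmxn9zuts47wjh0qq
```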

After removing the secret tagged field from the data part of the BOLT #11 invoice, we are left with this string:

pp52whutz2vc656a499g7mhdt3dff2phcpp3xq4lqmyex9rwjtupptqdqvvehk7grzv9eqxqyjw5qcqp2fp4pctzqnye37nx89nzl5msq2htxgaa85a6f5qkfws6dfgj48dwv9rtq9qx3qysgqxyxw4gfzgd6zgml6me24m7nr0prgvq8e726ycrj2kjjvuseep7zxrdwqz2utu5x2ejt2rrfhhfgwf7zy35qe2tu42w79y64trhjfkxcq

That string starts with p, so it corresponds to the payment_hash tagged field. We can proceed as we did for the s tagged field: the payment hash is also 256 bits long, so its data length is p5 (52) as well.

In bits we have the following:

[5   bits] p
[10  bits] p5
[260 bits] 2whutz2vc656a499g7mhdt3dff2phcpp3xq4lqmyex9rwjtupptq
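Going the other way, from the 52 bech32 characters back to bytes, should give the payment_hash printed by the invoice command. This is a minimal sketch; the function name bech32_to_bytes is hypothetical and it simply drops the trailing padding bits.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def bech32_to_bytes(chars: str) -> bytes:
    """Regroup 5-bit characters into 8-bit bytes, dropping the trailing padding bits."""
    bits = "".join(format(CHARSET.index(ch), "05b") for ch in chars)
    usable = len(bits) - len(bits) % 8   # 260 bits -> 256 bits
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

payment_hash = bech32_to_bytes("2whutz2vc656a499g7mhdt3dff2phcpp3xq4lqmyex9rwjtupptq")
print(payment_hash.hex())
# 53afc5894cc6a9aed4a547b776ae2d4a541be02189815f8364c98a37497c0856
```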

d (description) tagged field

After removing the payment_hash tagged field from the data part of the BOLT #11 invoice, we are left with this string:

dqvvehk7grzv9eqxqyjw5qcqp2fp4pctzqnye37nx89nzl5msq2htxgaa85a6f5qkfws6dfgj48dwv9rtq9qx3qysgqxyxw4gfzgd6zgml6me24m7nr0prgvq8e726ycrj2kjjvuseep7zxrdwqz2utu5x2ejt2rrfhhfgwf7zy35qe2tu42w79y64trhjfkxcq

That string starts with d, so it corresponds to the tagged field of the description.

The next two characters, corresponding to the length (big-endian) of the data of the description, are qv.

As qv (big-endian) in bech32 corresponds to 12 (q stands for 0 and v for 12), the data part of the description is made of the next 12 characters:

vehk7grzv9eq

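Below we see why the description string foo bar is represented by vehk7grzv9eq: its 7 ASCII bytes (56 bits) are padded with four 0s to 60 bits, then split into 12 groups of 5 bits: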

/img/2024-01-04-live-0020-latex-03.png
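The same transformation can be reproduced step by step with a short sketch. The function name text_to_groups is an assumption of this example; the logic is only the UTF-8 encoding of the text, the 0 padding required by BOLT #11 and the bech32 character set.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def text_to_groups(text: str):
    """Return the list of 5-bit groups (as bit strings) for the UTF-8 bytes of text."""
    bits = "".join(format(byte, "08b") for byte in text.encode("utf-8"))
    bits += "0" * (-len(bits) % 5)   # pad with 0s to a multiple of 5 bits
    return [bits[i:i + 5] for i in range(0, len(bits), 5)]

groups = text_to_groups("foo bar")
for group in groups:
    # 5-bit group, its decimal value, and the matching bech32 character
    print(group, int(group, 2), CHARSET[int(group, 2)])

description = "".join(CHARSET[int(group, 2)] for group in groups)
print(description)  # vehk7grzv9eq
```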

Terminal session

We ran the following commands in this order:

$ source contrib/startup_regtest.sh
$ start_ln
$ alias l1-cli
$ l1-cli invoice 10000 label "foo bar"

And below you can read the terminal session (command lines and outputs):

◉ tony@tony:~/clnlive/lightning:[git»(HEAD detached at v23.11)]
$ source contrib/startup_regtest.sh
...
◉ tony@tony:~/clnlive/lightning:[git»(HEAD detached at v23.11)]
$ start_ln
...
◉ tony@tony:~/clnlive/lightning:[git»(HEAD detached at v23.11)]
$ alias l1-cli
alias l1-cli='/home/tony/clnlive/lightning/cli/lightning-cli --lightning-dir=/tmp/l1-regtest'
◉ tony@tony:~/clnlive/lightning:[git»(HEAD detached at v23.11)]
$ l1-cli invoice 10000 label "foo bar"
{
   "payment_hash": "53afc5894cc6a9aed4a547b776ae2d4a541be02189815f8364c98a37497c0856",
   "expires_at": 1704985787,
   "bolt11": "lnbcrt100n1pjedj3msp5cga03umqm7mrjnjag6cf3vvsay3fj4an6qurmxn9zuts47wjh0qqpp52whutz2vc656a499g7mhdt3dff2phcpp3xq4lqmyex9rwjtupptqdqvvehk7grzv9eqxqyjw5qcqp2fp4pctzqnye37nx89nzl5msq2htxgaa85a6f5qkfws6dfgj48dwv9rtq9qx3qysgqxyxw4gfzgd6zgml6me24m7nr0prgvq8e726ycrj2kjjvuseep7zxrdwqz2utu5x2ejt2rrfhhfgwf7zy35qe2tu42w79y64trhjfkxcqp74tlj",
   "payment_secret": "c23af8f360dfb6394e5d46b098b190e9229957b3d0383d9a6517170af9d2bbc0",
   "created_index": 1,
   "warning_capacity": "Insufficient incoming channel capacity to pay invoice"
}

BOLT #11 - Data Part

This section is taken from bolts:11-payment-encoding.md.

The data part of a Lightning invoice consists of multiple sections:

  1. timestamp: seconds-since-1970 (35 bits, big-endian)

  2. zero or more tagged parts

  3. signature: Bitcoin-style signature of above (520 bits)

Tagged Fields

Each Tagged Field is of the form:

  1. type (5 bits)

  2. data_length (10 bits, big-endian)

  3. data (data_length x 5 bits)

Note that the maximum length of a Tagged Field's data is constricted by the maximum value of data_length. This is 1023 x 5 bits, or 639 bytes.
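The type / data_length / data layout can be walked with a small loop. The sketch below is not the CLN parser; the function name read_tagged_fields is hypothetical, and the input is limited to the first three tagged fields (s, p and d) of the invoice generated earlier.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def read_tagged_fields(chars: str):
    """Yield (type, data_length, data) for each tagged field found in chars."""
    pos = 0
    while pos < len(chars):
        tag = chars[pos]                                   # type: 1 character (5 bits)
        length = (CHARSET.index(chars[pos + 1]) * 32
                  + CHARSET.index(chars[pos + 2]))         # data_length: 10 bits, big-endian
        data = chars[pos + 3:pos + 3 + length]             # data: data_length characters
        yield tag, length, data
        pos += 3 + length

fields = (
    "sp5cga03umqm7mrjnjag6cf3vvsay3fj4an6qurmxn9zuts47wjh0qq"
    "pp52whutz2vc656a499g7mhdt3dff2phcpp3xq4lqmyex9rwjtupptq"
    "dqvvehk7grzv9eq"
)
parsed = list(read_tagged_fields(fields))
for tag, length, data in parsed:
    print(tag, length, data)

# data_length holds at most 1023 groups of 5 bits, that is 639 whole bytes.
print(1023 * 5 // 8)  # 639
```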

Currently defined tagged fields are:

  • p (1): data_length 52. 256-bit SHA256 payment_hash. Preimage of this provides proof of payment.

  • s (16): data_length 52. This 256-bit secret prevents forwarding nodes from probing the payment recipient.

  • d (13): data_length variable. Short description of purpose of payment (UTF-8), e.g. '1 cup of coffee' or 'ナンセンス 1杯'

  • m (27): ...

  • n (19): ...

  • h (23): data_length 52. 256-bit description of purpose of payment (SHA256). This is used to commit to an associated description that is over 639 bytes, but the transport mechanism for the description in that case is transport specific and not defined here.

  • x (6): ...

  • c (24): data_length variable. min_final_cltv_expiry_delta to use for the last HTLC in the route. Default is 18 if not specified.

  • f (9): ...

  • r (3): ...

  • 9 (5): ...

Requirements

A writer:

  • MUST include exactly one p field.

  • MUST include exactly one s field.

  • MUST set payment_hash to the SHA2 256-bit hash of the payment_preimage that will be given in return for payment.

  • MUST include either exactly one d or exactly one h field.

    • if d is included:

      • MUST set d to a valid UTF-8 string.

      • SHOULD use a complete description of the purpose of the payment.

    • if h is included: ...

  • MUST include one c field (min_final_cltv_expiry_delta).

    • ...

  • if there is NOT a public channel associated with its public key:

    • MUST include at least one r field.

      • ...

  • MUST pad field data to a multiple of 5 bits, using 0s.

  • ...

Bech32 as defined in BIP 0173

This section is taken from https://en.bitcoin.it/wiki/BIP_0173.

A Bech32 string is at most 90 characters long and consists of:

  • The human-readable part, which is intended to convey the type of data, or anything else that is relevant to the reader. This part MUST contain 1 to 83 US-ASCII characters, with each character having a value in the range [33-126]. HRP validity may be further restricted by specific applications.

  • The separator, which is always "1". In case "1" is allowed inside the human-readable part, the last one in the string is the separator.

  • The data part, which is at least 6 characters long and only consists of alphanumeric characters excluding "1", "b", "i", and "o".

|     | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|-----+---+---+---+---+---+---+---+---|
| +0  | q | p | z | r | y | 9 | x | 8 |
| +8  | g | f | 2 | t | v | d | w | 0 |
| +16 | s | 3 | j | n | 5 | 4 | k | h |
| +24 | c | e | 6 | m | u | a | 7 | l |

The last six characters of the data part form a checksum and contain no information.

UTF-8

This section is taken from https://en.wikipedia.org/wiki/UTF-8.

  • Since the restriction of the Unicode code-space to 21-bit values in 2003, UTF-8 is defined to encode code points in one to four bytes, depending on the number of significant bits in the numerical value of the code point.

  • The x characters are replaced by the bits of the code point.

Code point ↔ UTF-8 conversion

| First code point | Last code point |   Byte 1 |   Byte 2 |   Byte 3 |   Byte 4 |
|------------------+-----------------+----------+----------+----------+----------|
| U+0000           | U+007F          | 0xxxxxxx |          |          |          |
| U+0080           | U+07FF          | 110xxxxx | 10xxxxxx |          |          |
| U+0800           | U+FFFF          | 1110xxxx | 10xxxxxx | 10xxxxxx |          |
| U+10000          | U+10FFFF        | 11110xxx | 10xxxxxx | 10xxxxxx | 10xxxxxx |

The first 128 code points (US-ASCII) need one byte.
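A quick sketch shows the byte counts for the two kinds of characters found in the BOLT #11 description examples. The function name utf8_bytes is hypothetical; it only wraps Python's built-in str.encode.

```python
def utf8_bytes(text: str) -> bytes:
    """Return the UTF-8 encoding of text."""
    return text.encode("utf-8")

# ASCII characters take one byte each: 7 characters, 7 bytes.
print(len("foo bar"), len(utf8_bytes("foo bar")), utf8_bytes("foo bar").hex())
# 7 7 666f6f20626172

# U+30CA lies in the U+0800..U+FFFF range, so it takes three bytes.
print(len("ナ"), len(utf8_bytes("ナ")), utf8_bytes("ナ").hex())
# 1 3 e3838a
```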

ASCII

This section is taken from https://man7.org/linux/man-pages/man7/ascii.7.html.

The following table contains the 128 ASCII characters:

C program '\X' escapes are noted.

Oct   Dec   Hex   Char                        Oct   Dec   Hex   Char
────────────────────────────────────────────────────────────────────────
000   0     00    NUL '\0' (null character)   100   64    40    @
001   1     01    SOH (start of heading)      101   65    41    A
002   2     02    STX (start of text)         102   66    42    B
003   3     03    ETX (end of text)           103   67    43    C
004   4     04    EOT (end of transmission)   104   68    44    D
005   5     05    ENQ (enquiry)               105   69    45    E
006   6     06    ACK (acknowledge)           106   70    46    F
007   7     07    BEL '\a' (bell)             107   71    47    G
010   8     08    BS  '\b' (backspace)        110   72    48    H
011   9     09    HT  '\t' (horizontal tab)   111   73    49    I
012   10    0A    LF  '\n' (new line)         112   74    4A    J
013   11    0B    VT  '\v' (vertical tab)     113   75    4B    K
014   12    0C    FF  '\f' (form feed)        114   76    4C    L
015   13    0D    CR  '\r' (carriage ret)     115   77    4D    M
016   14    0E    SO  (shift out)             116   78    4E    N
017   15    0F    SI  (shift in)              117   79    4F    O
020   16    10    DLE (data link escape)      120   80    50    P
021   17    11    DC1 (device control 1)      121   81    51    Q
022   18    12    DC2 (device control 2)      122   82    52    R
023   19    13    DC3 (device control 3)      123   83    53    S
024   20    14    DC4 (device control 4)      124   84    54    T
025   21    15    NAK (negative ack.)         125   85    55    U
026   22    16    SYN (synchronous idle)      126   86    56    V
027   23    17    ETB (end of trans. blk)     127   87    57    W
030   24    18    CAN (cancel)                130   88    58    X
031   25    19    EM  (end of medium)         131   89    59    Y
032   26    1A    SUB (substitute)            132   90    5A    Z
033   27    1B    ESC (escape)                133   91    5B    [
034   28    1C    FS  (file separator)        134   92    5C    \  '\\'
035   29    1D    GS  (group separator)       135   93    5D    ]
036   30    1E    RS  (record separator)      136   94    5E    ^
037   31    1F    US  (unit separator)        137   95    5F    _
040   32    20    SPACE                       140   96    60    `
041   33    21    !                           141   97    61    a
042   34    22    "                           142   98    62    b
043   35    23    #                           143   99    63    c
044   36    24    $                           144   100   64    d
045   37    25    %                           145   101   65    e
046   38    26    &                           146   102   66    f
047   39    27    '                           147   103   67    g
050   40    28    (                           150   104   68    h
051   41    29    )                           151   105   69    i
052   42    2A    *                           152   106   6A    j
053   43    2B    +                           153   107   6B    k
054   44    2C    ,                           154   108   6C    l
055   45    2D    -                           155   109   6D    m
056   46    2E    .                           156   110   6E    n
057   47    2F    /                           157   111   6F    o
060   48    30    0                           160   112   70    p
061   49    31    1                           161   113   71    q
062   50    32    2                           162   114   72    r
063   51    33    3                           163   115   73    s
064   52    34    4                           164   116   74    t
065   53    35    5                           165   117   75    u
066   54    36    6                           166   118   76    v
067   55    37    7                           167   119   77    w
070   56    38    8                           170   120   78    x
071   57    39    9                           171   121   79    y
072   58    3A    :                           172   122   7A    z
073   59    3B    ;                           173   123   7B    {
074   60    3C    <                           174   124   7C    |
075   61    3D    =                           175   125   7D    }
076   62    3E    >                           176   126   7E    ~
077   63    3F    ?                           177   127   7F    DEL

For convenience, below are more compact tables in hex and decimal.


   2 3 4 5 6 7       30 40 50 60 70 80 90 100 110 120
 -------------      ---------------------------------
0:   0 @ P ` p     0:    (  2  <  F  P  Z  d   n   x
1: ! 1 A Q a q     1:    )  3  =  G  Q  [  e   o   y
2: " 2 B R b r     2:    *  4  >  H  R  \  f   p   z
3: # 3 C S c s     3: !  +  5  ?  I  S  ]  g   q   {
4: $ 4 D T d t     4: "  ,  6  @  J  T  ^  h   r   |
5: % 5 E U e u     5: #  -  7  A  K  U  _  i   s   }
6: & 6 F V f v     6: $  .  8  B  L  V  `  j   t   ~
7: ' 7 G W g w     7: %  /  9  C  M  W  a  k   u  DEL
8: ( 8 H X h x     8: &  0  :  D  N  X  b  l   v
9: ) 9 I Y i y     9: '  1  ;  E  O  Y  c  m   w
A: * : J Z j z
B: + ; K [ k {
C: , < L \ l |
D: - = M ] m }
E: . > N ^ n ~
F: / ? O _ o DEL
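Instead of reading the tables by hand, the rows for the characters of foo bar can be printed with a short sketch. The function name ascii_row is hypothetical.

```python
def ascii_row(ch: str):
    """Return (octal, decimal, hex, binary) for one ASCII character."""
    code = ord(ch)
    return format(code, "03o"), code, format(code, "02X"), format(code, "08b")

for ch in "foo bar":
    print(repr(ch), *ascii_row(ch))
# first row: 'f' 146 102 66 01100110
```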

Resources