Date: 13 August, 2020
Transcript by: Michael Folkson
Transcript completed by: Stephan Livera
Edited by: Michael Folkson
ANYPREVOUT BIP (BIP 118): https://github.com/ajtowns/bips/blob/bip-anyprevout/bip-0118.mediawiki
Stephan Livera (SL): Christian welcome back to the show.
Christian Decker (CD): Hey Stephan, thanks for having me.
SL: I wanted to chat with you about a bunch of stuff that you’ve been doing. We’ve got a couple of things that I was really interested to chat with you about: ANYPREVOUT, MPP, Lightning attacks, the latest with Lightning Network. Let’s start with ANYPREVOUT. I see that yourself and AJ Towns just recently did an update and I think AJ Towns did an email to the mailing list saying “Here’s the update to ANYPREVOUT.” Do you want to give us a bit of background? What motivated this recent update?
CD: When I wrote up the NOINPUT BIP it was just a bare bones proposal that did not take Taproot into consideration at all, simply because we didn’t know as much about Taproot as we do now. What I did for NOINPUT (BIP118) was to have a minimal working solution that we could use to implement eltoo on top and a number of other proposals. But we didn’t integrate it with Taproot simply because that wasn’t at a stage where we could use it as a solid foundation yet. Since then that has changed. AJ went ahead and did the dirty work of actually integrating the two proposals with each other. That’s where ANYPREVOUT and ANYPREVOUTANYSCRIPT, the two variants, came out. Now it’s very nicely integrated with the Taproot system. Once Taproot goes live we can deploy ANYPREVOUT directly without a lot of adaptation that has to happen. That’s definitely a good change. ANYPREVOUT supersedes the NOINPUT proposal which was a bit of a misnomer. Using ANYPREVOUT we get the effects that we want to have for eltoo and some other protocols and have them nicely integrated with Taproot. We can propose them once Taproot is merged.
Christian Decker on Eltoo at Chaincode Labs: https://diyhpl.us/wiki/transcripts/chaincode-labs/2019-09-18-christian-decker-eltoo/
SL: Let’s talk a little bit about the background. For the listeners who aren’t familiar, what is eltoo? Why do we want that as opposed to the current model for the Lightning Network?
CD: Eltoo is a proposal that we came up with about two years ago. It is an alternative update mechanism for Lightning. In Lightning we use what’s called an update mechanism to go from one state to the next one and make sure that the old state is not enforceable. If we take an example, you and I have a channel open with 10 dollars on your side. The initial state reflects this. 10 dollars goes to Stephan and zero goes to Christian. If we do any sort of transfer, some payment that we are forwarding over this channel or a direct payment that we want to have between the two of us, then we need to update this state. Let’s say you send me 1 dollar. The new state becomes 9 dollars to Stephan and 1 dollar to Christian. But we also need to make sure that the old state cannot be enforced anymore. You couldn’t go back and say “Hey I own 10 out of 10 dollars on this contract.” I need to have the option of saying “Wait that’s outdated. Please use this version instead.” What eltoo does is exactly that. We create a transaction that reflects our current state. We have a mechanism to activate that state and we have a mechanism to override that state if it turns out to be an old one instead of the latest one. For this to be efficient what we do is we say “The newest state can be attached to any of the old states.” Traditionally this would be done by taking the signature and, if there are n old states, creating n variants with n signatures, one binding to each of the old states. With the ANYPREVOUT or NOINPUT proposal we have the possibility of having one transaction that can be bound to any of the previous states without having to re-sign. That’s the entire trick. We make one transaction applicable to multiple old states by leaving out the exact location from where we are spending. We leave out the UTXO reference that we’re spending when signing. We can modify that later on without invalidating the signature.
SL: Let me replay my understanding there. This is the current model of Lightning. You and I set up a channel together. What we’re doing is we’re putting a multisignature output onto the blockchain and that is a 2-of-2. Then what we’re doing is we’re passing back and forward the new states to reflect the new output. So let’s say it was initially 10 dollars to me and zero to you and then 9 dollars to me and 1 dollar to you. In the current model if somebody tries to cheat the other party. Let’s say I’m a scammer and I try to cheat you. I publish a Bitcoin transaction to the blockchain that is the pre-signed commitment transaction that closes the channel. The idea is your Lightning node is going to be watching the chain and say “Oh look, Stephan’s trying to cheat me. Let me do my penalty close transaction.” In the current model that would put all the 10 dollars onto your side.
CD: Exactly. For any of my wrong actions you have a custom tailored reaction to that that punishes me and penalizes me by crediting you with all the funds. That’s the exact issue that we’re facing is that these reactions have to be custom tailored to each and every possible misbehavior that I could do. Your set of retaliatory transactions grows every time that we perform a state change. We might have had 1 million states since the beginning and for each of these 1 million states you have to have a tailored reaction that you can replay if I end up publishing transaction 993 for example. This is one of the core innovations that eltoo brings to the table. You no longer have this custom tailored transaction to each of the previous states. Instead you can tailor it on the fly to match whatever I just did. You do not have to keep an ever-growing set of retaliation transactions in your database backed up somewhere or at the ready.
SL: In terms of benefits, it softens the penalty model. Instead of one party cheating the other and then losing everything, now if somebody publishes a wrong transaction or an old state then the other party just publishes the most up to date one that they have. The other benefit here is a scaling one that it might be easier for someone to host watchtowers because it’s less hard drive usage.
CD: Exactly. It is definitely the case that it becomes less data intensive in the sense that a watchtower or even yourself do not have to manage an ever growing set of transactions. Instead all you need to do is to have the latest transaction in your back pocket. Then you can react to whatever happens onchain. That’s true for you as well as for watchtowers. Watchtowers therefore become really cheap because they just have to manage these 200 bytes of information. When you hand them a new transaction, a new reaction, they throw out the old one and keep the new one. The other effect that you mentioned is that we now override the old state instead of using the old state but then penalizing. That has a really nice effect. What we do in the end is enforce a state that we agreed upon instead of enforcing “This went horribly wrong and now I have to grab all of the money.” It changes the semantics of what we do a bit: we can only update the old state, not force an issue on the remote end and steal money from them. That’s really important when it comes to backups with Lightning. As it is today backups are almost impossible to do because whenever you restore you cannot be sure that it’s really the latest state and when you publish it that it’s not going to be seen as a cheating attempt. Whereas with eltoo you can take any old state, publish it and the worst that can happen is that somebody else comes along and says “This is not the latest state, there’s a newer one. Here it is.” You might not get your desired state. Let’s say you want to take all 10 out of 10 dollars from the channel but you will still get the 9 out of 10 that you own in the latest state because all I can do is override your “10 go to Stephan” with my “9 go to Stephan and 1 goes to Christian”.
We’ve reduced the penalty for misbehavior in the network from being devastating and losing all of the funds to a more reasonable level where we can say “At least I agreed to the state and it’s going to be a newer state that I agreed upon.” I often compare it to the difference between Lightning penalty being the death by beheading whereas eltoo is death by a thousand paper cuts. The cost of misbehaving is much reduced, allowing us to have working backups and have a lot of nice properties that we can probably talk about later such as true multiparty channels with any number of participants. That’s all due to the fact that we no longer insist on penalizing the misbehaving party, we now instead correct the effects that the misbehaving party wanted to trigger.
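To make the storage difference described above concrete, here is a minimal Python sketch. It is a toy model only, not how any Lightning implementation or watchtower stores data; the class names `PenaltyStore` and `EltooStore` and the string placeholders for transactions are hypothetical.

```python
class PenaltyStore:
    """Toy LN-penalty style store: one tailored reaction per old state."""

    def __init__(self):
        self.reactions = {}

    def on_update(self, state_number, reaction_tx):
        # Every state change adds another reaction that must be kept.
        self.reactions[state_number] = reaction_tx

    def size(self):
        return len(self.reactions)


class EltooStore:
    """Toy eltoo style store: only the latest update transaction is kept."""

    def __init__(self):
        self.latest = None

    def on_update(self, state_number, update_tx):
        # A newer state replaces the stored one; older ones are ignored.
        if self.latest is None or state_number > self.latest[0]:
            self.latest = (state_number, update_tx)

    def size(self):
        return 0 if self.latest is None else 1


penalty, eltoo = PenaltyStore(), EltooStore()
for n in range(1, 1001):
    penalty.on_update(n, f"reaction-to-state-{n}")
    eltoo.on_update(n, f"update-tx-{n}")

print(penalty.size(), eltoo.size())  # 1000 1
```

After a thousand updates the penalty-style store holds a thousand entries while the eltoo-style store holds one, which is the property that makes watchtowers cheap in the description above.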
SL: Fantastic. From your eltoo paper it introduces the idea of state numbers and onchain enforceable variant of sequence numbers. As I understand there’s a ratchet effect that once you move up to that new state that’s now the new one. It means that at least one of our nodes has the ability to enforce the correct latest state. Could you just explain that state numbers idea?
CD: The state numbers idea is actually connecting back to the very first iteration of Bitcoin like we had with the nSequence proposal that Satoshi himself added. nSequence meant that you could have multiple versions of transactions and miners were supposed to pick the one with the highest sequence number and confirm that. Basically replacing any previous transaction that had a lower sequence number. That had a couple of issues, namely that there is no way to force miners to do this. You can always bribe a miner to use a version of a transaction that suits you better or they might be actively trying to defraud you. There is no really good way of enforcing nSequence numbers. On the other hand, what we do with the state numbers is that we do not give the miners the freedom to choose which transaction to confirm. What we do is we say “We have transaction 100 and this transaction 100 can be attached to any previous transaction that could be confirmed or unconfirmed, that has a state number lower than 100.” In eltoo we say “This latest state represented by this transaction with a state number of 100 can be attached to any of the previous transactions and override their effect by ratcheting forward the state.” Let’s say you have a published state 90. That means that anything with state number 91, 92, 93 and so on can be attached to your transaction. Your transaction might confirm but the effects that you want are in the settlement part of the transaction. If I can come in and attach a newer version of that state, a new update transaction to your published transaction, then I can basically detach the settlement part of the transaction from this ratcheting forward. I have just disabled your attempt at settlement by ratcheting forward and initiating the settlement for state 100.
Then you could come along and say “Sorry I forgot about state 100. Here’s state 110.” Even while closing the channel we can still continue making updates to the eltoo channel using these state numbers. The state numbers are really nothing else than making an explicit way of saying “This number 100 overrides whatever came before it.” Whereas with LN penalty the only association you have between the individual transactions and so on is by following the “is spent by” relationship. You have a set of transactions that can be published together. But there is no sense of transitive overriding of effects.
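The ratchet described above can be sketched as a toy model in Python. This is an illustration under simplifying assumptions, not the eltoo implementation: `UpdateTx`, `can_attach` and `resolve` are hypothetical names, and the "higher state number wins" rule is a plain integer comparison here, whereas in the actual proposal it would have to be enforced by the transaction scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateTx:
    state_number: int
    balances: tuple  # (Stephan, Christian), in dollars


def can_attach(newer: UpdateTx, older: UpdateTx) -> bool:
    # An update may attach to any earlier update with a lower state number.
    return newer.state_number > older.state_number


def resolve(published):
    """Replay published updates in order and return the one that wins."""
    tip = None
    for tx in published:
        if tip is None or can_attach(tx, tip):
            tip = tx  # ratchet forward; the old settlement is detached
    return tip


s90 = UpdateTx(90, (10, 0))
s100 = UpdateTx(100, (9, 1))
s110 = UpdateTx(110, (8, 2))

# State 90 is published first; state 100 overrides it, then 110 overrides 100.
winner = resolve([s90, s100, s110])
print(winner.state_number, winner.balances)  # 110 (8, 2)

# An older state can never override a newer one.
print(resolve([s100, s90]).state_number)  # 100
```

The overriding is transitive: whichever update has the highest state number among those published ends up as the one whose settlement applies.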
SL: A naive question that a listener might be thinking, Christian what if I tried to set my state number higher than yours, what’s stopping me from doing that?
CD: You can certainly try. But since we are still talking about 2-of-2 multisig outputs I would have to countersign that. I might sign it but then I will make sure that if we later come to a new agreement on what the latest state should be that that state number must be higher than whatever I signed before. This later state can then override your spuriously numbered state. In fact that’s something that we propose in the paper to hide the number of updates that were performed on a channel. Not to go incrementing one by one but have different sized increment steps so that when we settle onchain we don’t tell the rest of the world “Hey, by the way we just had 93 updates.”
Matt Corallo on RBF pinning: https://gnusha.org/url/https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2020-April/017757.html
SL: From watching the Bitcoin dev mailing list, I saw some discussion around this idea of whether the Lightning node should also be looking into what’s going on in the mempool of Bitcoin versus only looking for transactions that actually get confirmed into the chain. Can you comment on how you’re thinking about the security model? As I understand, you’re thinking more that we’re just looking at what’s happening on the chain and the mempool watching is a nice to have.
CD: With all of these protocols we can usually replay them only onchain and we don’t need to look at the mempool. That’s true for eltoo as it is true for Lightning penalty. Recently we had a lengthy discussion about an issue that is dubbed RBF pinning attack which makes this a bit harder. The attack is a bit involved but it basically boils down to the attacker placing a placeholder transaction in the mempool of the peers making sure that that transaction does not confirm. But being in the mempool that transaction can result in rejections for future transactions. That comes into play when we are talking about HTLCs which span multiple channels. We can have effects where the downstream channel is still locked because the attacker placed a placeholder transaction in the mempool. We are frantically trying to react to this HTLC being timed out but our transaction is not making it into the mempool because it is being rejected by this poison transaction there. If that happens on a single channel that’s ok because eventually we will be able to resolve that and a HTLC is not a huge amount usually. Where this becomes a problem is if we were forwarding that payment and we have a matching upstream HTLC that now also needs to timeout or have a success. That depends on the downstream HTLC which we don’t get to see. So it might happen that the upstream timeout gets timed out. Our upstream node told us “Here’s 1 dollar. I promised to give it to you if you can show me this hash preimage in a reasonable amount of time.” You turned around and forwarded that promise and said “Hey, your attacker, here’s 1 dollar. You can have it if you give me the secret in time.” The downstream attacker doesn’t tell you in time so you will be ok with the upstream one timing out. But it turns out the downstream one can succeed. So you’re out of pocket in the end of the forwarded amount. 
That is a really difficult problem to solve without looking at the mempool because the mempool is the only indication that this attack is going on and therefore that we should be more aggressive in reacting to this attack being performed. Most Lightning nodes do not actually look at the mempool currently. There’s two proposals that we’re trying to do. One is to make the mempool logic a bit less unpredictable, namely that we can still make progress without reaction even though there is this poison transaction. That is something that we’re trying to get the Bitcoin Core developers interested in. On the other side we are looking into mechanisms to look at the mempool, see what is happening and then start alerting nodes that “Hey you might be under attack. Please take precautions and react accordingly.”
SL: I also wanted to chat about SIGHASH flags because ANYPREVOUT and ANYPREVOUTANYSCRIPT are some new SIGHASH flags. Could you take us through some of the basics on what is a SIGHASH flag?
CD: A sighash flag is sometimes confused with an opcode. It is just a modifier of existing opcodes, namely the OP_CHECKSIG, OP_CHECKSIGVERIFY, OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY variants, that basically tells the CHECKSIG operation which part of the transaction should be signed and which part should not be signed. In particular what we do with SIGHASH_ANYPREVOUT is when computing the signature and verifying the signature, we do not include the previous outputs in the signature itself. These can be modified if desired without invalidating the signature. It is like a kid having a bad grade at school coming home and needing a signature from the parents. What he does is he covers up part of the permission slip and the parents still sign it. Only then he uncovers the covered part. This changing what was signed does not invalidate the signature itself. That’s sort of a nefarious example but it can be really useful. If you’ve ever given out a blank check for example where you could fill in the amount at a later point in time or fill out the recipient at a later point in time, that’s a very useful tool. For eltoo we take the transaction that reacts to something our counterparty has done and adapt it in such a way that it can cleanly attach to what the counterparty has done. There are already some existing SIGHASH flags. The default one is SIGHASH_ALL which covers the entirety of the transaction without the input scripts. There’s SIGHASH_SINGLE, which has been used in a couple of places. It signs the input and the matching output but there can be other inputs and outputs as well that are not covered by the signature. You can amend a transaction and add later on new funds to that transaction and new recipients to that transaction. We use that for example to attach fees to transactions in eltoo. Fees in eltoo are not intrinsic to the update mechanism itself.
They are attached like a sidecar which removes the need for us to negotiate fees between the end points. Something that has in the beginning of Lightning caused a lot of channels to die, simply disagreement on fees. There’s also SIGHASH_NONE which basically signs nothing. It signs the overall structure of the transaction but it doesn’t restrict which inputs can be used, it doesn’t restrict which outputs can be used. If you get one of these transactions, you can basically rewrite it at will sending yourself all the money that would have been transferred by it.
SL: I guess for most users, without knowing when they’re doing standard single signature spending on their phone wallet or whatever, they’re probably using SIGHASH_ALL. That’s what their wallet is using in the background for them. If the listener wants to see how this might work, they could pull up a block explorer and on a transaction you can see the different inputs and outputs. What we’re talking about here is what we are committing to when we sign. What is that signature committing to?
CD: It takes the transaction, it passes through a hash function and then the hash is signed. The effect that we have is that if anything in the transaction itself is modified, which was also part of the hash itself, then the signature is no longer valid. It means that I both authorize this transaction and I authorize it only in this form that I’m currently signing. There can be no modification afterwards or else the signature would have to change in order to remain valid. If with the SIGHASH flags we remove something from the commitment to the transaction then we give outsiders or ourselves the ability to modify without having to re-sign.
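The idea of hashing the transaction and leaving parts out of the commitment can be sketched in a few lines of Python. This is a toy model only: it uses a JSON dictionary rather than Bitcoin’s real transaction serialization or signature message, and `toy_sighash` and its `skip` parameter are hypothetical names introduced for illustration.

```python
import hashlib
import json


def toy_sighash(tx, skip=()):
    """Hash the transaction fields, leaving out any field named in `skip`."""
    committed = {k: v for k, v in tx.items() if k not in skip}
    encoded = json.dumps(committed, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


tx = {"prevout": "aaaa:0", "outputs": [["Stephan", 9], ["Christian", 1]]}
rebound = dict(tx, prevout="bbbb:0")  # same payment, different prevout

# Committing to everything: any change alters the hash, so a signature
# over the old hash would no longer be valid.
all_changed = toy_sighash(tx) != toy_sighash(rebound)

# Leaving the prevout out of the commitment: the hash, and hence a
# signature over it, survives the prevout being swapped.
masked_same = toy_sighash(tx, skip=("prevout",)) == toy_sighash(rebound, skip=("prevout",))

print(all_changed, masked_same)  # True True
```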
SL: For the typical user just doing single signature, their wallet is going to use SIGHASH_ALL but where they are doing some sort of collaborative transaction or there’s some kind of special construction that’s where we’re using some of these other sighash flags. With eltoo and ANYPREVOUT the idea is that these ANYPREVOUT sighash flags will allow us to rebind to the prior update correct?
CD: Exactly yes.
SL: Could we talk about ANYPREVOUT and ANYPREVOUTANYSCRIPT? What’s the difference there?
CD: What we do with SIGHASH_ANYPREVOUT is no longer explicitly saying “Hey by the way I’m spending those funds over there.” Instead what we say is “The output script and the input script have to match. Other than that we can mix these transactions however we want.” Previously we had an explicit binding saying “Hey my transaction 100 now connects to transaction 99” and then the scripts also had to match. By the scripts I mean the output script would specify that the spender has to sign with public key X and the input script would contain a signature by public key X. Instead of binding by both the explicit reference and the scripts we now bind solely by the scripts. That means that as long as the output says “The spender has to sign with public key X” and the input of the other transaction that is being bound to it has a valid signature for public key X in it, then we can attach these two. The difference between ANYPREVOUT and ANYPREVOUTANYSCRIPT is whether we include the output script in the hash of the spending transaction or not. For ANYPREVOUT we still commit to what script we are spending. We take a copy of the script saying that the spending transaction needs to be signed by public key X. We move that into the spending transaction and then include it into the signature computation so that if the output script is modified we cannot bind to it. Whereas the ANYPREVOUTANYSCRIPT says “We don’t take a copy of the output script into the input of the spending transaction but instead we have a blank script. We can bind it to any output whose output script matches our input script.” It is a bit more freedom but it is also something that we need for eltoo to work because the output script of the transaction we’re binding to includes the state number and that obviously changes from state to state. We still want to have the freedom of taking a later state and attaching it to any of the previous states. For eltoo we’d have to use ANYPREVOUTANYSCRIPT.
There are a couple of use cases where ANYPREVOUT is suitable on its own. For example if we have any sort of transaction malleability and we still want to take a transaction that connects to a potentially malleable transaction, then we can use SIGHASH_ANYPREVOUT. If the transaction gets malleated in the public network before it is being confirmed we can still connect to it using the connection between the output script and the input script and the commitment of the output script in the spending transaction.
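The difference between the two flags, as described in the conversation, can be illustrated with a toy digest in Python. This is a simplification and not the BIP 118 signature message: it only models the one distinction discussed here, whether the spent output script is committed to. The function name `toy_digest` and the script strings are hypothetical.

```python
import hashlib


def toy_digest(spend_outputs, spent_script, mode):
    """Toy digest: neither mode commits to the prevout reference;
    only ANYPREVOUT commits to the script of the output being spent."""
    parts = [repr(spend_outputs)]
    if mode == "ANYPREVOUT":
        parts.append(spent_script)
    elif mode != "ANYPREVOUTANYSCRIPT":
        raise ValueError(f"unknown mode: {mode}")
    return hashlib.sha256("|".join(parts).encode()).hexdigest()


outputs = [("Stephan", 9), ("Christian", 1)]
# In eltoo the output script contains the state number, so it differs per state.
script_state_98 = "state=98 2-of-2(A,B)"
script_state_99 = "state=99 2-of-2(A,B)"

apo_rebinds = (
    toy_digest(outputs, script_state_98, "ANYPREVOUT")
    == toy_digest(outputs, script_state_99, "ANYPREVOUT")
)
apoas_rebinds = (
    toy_digest(outputs, script_state_98, "ANYPREVOUTANYSCRIPT")
    == toy_digest(outputs, script_state_99, "ANYPREVOUTANYSCRIPT")
)

print(apo_rebinds, apoas_rebinds)  # False True
```

In this toy model a signature made with the ANYPREVOUT-style digest cannot move between outputs whose scripts differ, while the ANYPREVOUTANYSCRIPT-style digest can, which matches the reason given above for eltoo needing the latter.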
SL: You were mentioning malleation there. Could you outline what is malleation?
CD: Malleation is the bane of all offchain protocols. Malleation is something that we’ve known about for well over seven years now. If you remember the MtGox hack was for some time attributed to malleability. They said “Our transactions were malleated. We didn’t recognize them anymore so we paid out multiple times.” What happens is I create a transaction and this transaction includes some information that is covered by the signature and can therefore not be changed, but it also could include some information that cannot possibly be covered by the signature. For example, the signature itself. In the input script of a transaction we need to have the signatures but we cannot include the signatures in the signature itself. Otherwise we’d have this circular argument. So while signing the input scripts are set to blank and not committed to. That means that if we then publish this transaction there are places in the transaction that can be modified without invalidating the signature. Some parts of this include push operations, for example normalizations of signatures themselves. We can add prefixes to stuff. We can add dummy operations to the input script. Change how the transaction looks just slightly but not invalidating the signature itself. The transaction now looks different and is getting confirmed in this different form but we might have a dependent transaction where we’re referring to the old form by its hash, by its unchanged form. This follow up transaction that was referencing the unmodified transaction can no longer be used to spend those funds because the miner will just see this new transaction, go look for the old output that it is spending and this output doesn’t exist because it looks slightly different now because the hash changed. It will say “I don’t know where you’re getting that money from. Go away.” It throws away that transaction and it will not get confirmed.
Whereas with SIGHASH_ANYPREVOUT we can counter this. The transaction can be modified in the wider network and be confirmed in this modified state, and then the sender of the follow up transaction can just say “Ok I see that there has been a modification to the transaction that I’m trying to spend from. Let me adjust my existing transaction by changing the reference inside of the input to this new alias by which everybody else knows the old transaction.” Now we can publish this transaction. We did not have to re-sign the transaction. We did not have to modify the signature. All we had to do was take the reference and update it to point to the real confirmed transaction. That makes offchain protocols a lot easier because while having a single signer re-sign a transaction might be easy to do, if we’re talking about multisig transactions where multiple of us have to sign off on any change, that might not be so easy to implement. ANYPREVOUT gives us this freedom of reacting to stuff that happens onchain or in the network without having to go around and convince everybody “Hey please sign this updated version of this transaction because somebody did something in the network.”
SL: If I understood correctly the way eltoo has been constructed, it is defending against that risk. You’re trying to use this new functionality of being able to rebind dynamically. For listeners who are concerned that maybe there’s a risk this is all opt in? It is only if you want to use Lightning in the eltoo model. You and I have this special type of SIGHASH flag where we have a special kind of output that we are doing the updates on our channel. If somebody doesn’t want to they can just not use Lightning.
CD: Absolutely. It is fully opt in. It is a SIGHASH flag. We do have a couple of SIGHASH flags already but no wallet that I’m aware of implements anything but SIGHASH_ALL. So if you don’t want to use Lightning or you don’t want to use any of the offchain protocols that are based on SIGHASH_ANYPREVOUT simply don’t use a wallet that can sign with them. These are very specific escape hatches from the existing functionality that we need to implement more advanced technologies on top of the Bitcoin network. But it is by no means something that suddenly everybody should start using just because it’s a new thing that is out there. If we’re careful not to even implement SIGHASH_ANYPREVOUT in everyday consumer wallets then this will have no effect whatsoever on the users that do not want to use these technologies. It is something that has a very specific use case. It’s very useful for those use cases but by no means everybody needs to use it. We’re trying to add as many security features as possible. For example if you sign with a SIGHASH flag that is not SIGHASH_ALL you as the signing party are the only one that is deciding whether to use the sighash flag or not. Whereas with the ANYPREVOUT changes that were introduced, AJ has done a lot of work on this, he introduces a new public key format that explicitly says “Hey I’m available for SIGHASH_ANYPREVOUT.” Even the one that is being spent from now has the ability to opt into ANYPREVOUT being used or not. Both have to match, the public key that is being signed for has to have opted in for ANYPREVOUT and the signing party has to opt in as well. Otherwise we will fall back to existing semantics.
SL: As I understand this BIP 118, there is a reliance on Taproot being activated first before ANYPREVOUT?
CD: Obviously we would have liked to have ANYPREVOUT as soon as possible but one of the eternal truths of software development is that reviewer time is scarce. We decided to not push too hard on ANYPREVOUT being included in Taproot itself to keep Taproot very minimal, clean and easy to review. Then try to do an ANYPREVOUT soft fork at a future point in time at which we will hopefully have enough confidence in our ability to perform soft forks that we can actually roll out ANYPREVOUT in a reasonable amount of time. For now it’s more important for us to get Taproot through. Taproot is an incredible enabling technology for a number of changes, not just for Lightning or eltoo but for a whole slew of things that are based on Taproot. Any delay in Taproot would definitely not be in our interest. We do see the possibility of rolling out ANYPREVOUT without too many stumbling stones at a second stage once we have seen Taproot be activated correctly.
SL: Also in the BIP118 document by AJ there’s a discussion around signature replay. What is signature replay? How is that being stopped?
CD: Signature replay is one of the big concerns around the activation of ANYPREVOUT. It consists of if I have one transaction that can be rebound to a large number of transactions this doesn’t force me to use that transaction only in a specific context but I could use it in a different context. For example, if we were to construct an offchain protocol that was broken and couldn’t work we could end up in a situation where you have two outputs of the same value that opted in for ANYPREVOUT and you have one transaction that spends one of them. Since both opted into ANYPREVOUT and both have the identical script and the identical value, I could replay that transaction on both outputs at once. So instead of the intended effect of me giving you let’s say 5 dollars in one output you can claim 5 dollars twice by replaying this multiple times. This is true for offchain protocols that are not well developed and are broken because well designed offchain protocols will only ever have one transaction that you can bind to. You cannot have multiple outputs that all can be spent by the same ANYPREVOUT transaction. But it might still happen that somebody goes onto a blockchain explorer and looks up the address and then sends some money that happens to be the exact same value to that output. What we’re trying to do is find good ways to prevent exactly the scenario of somebody accidentally sending funds and creating an output that could potentially be claimed by SIGHASH_ANYPREVOUT by making these scripts unaddressable. We create a new script format for which there is no bech32 encoding for the script. Suddenly you cannot go on to a blockchain explorer and manually interfere with an existing offchain protocol. There are a number of steps that we are trying to do to reduce this accidental replayability. That being said in eltoo for example, the ability to rebind a transaction to any of the previous matching ones is exactly what we were trying to achieve. 
It is a very useful tool but in the wrong hands it can be dangerous. So don’t go play with SIGHASH_ANYPREVOUT if you don’t know what you are doing.
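The replay scenario described above can be illustrated with a small Python sketch. It is a toy model, not the BIP 118 validation rules: the assumption, taken from the description in the conversation, is that the signature is tied to the script and the value of the output but not to its location. `Output` and `apo_signature_matches` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    outpoint: str  # where the output lives (txid:index)
    script: str
    value: int


def apo_signature_matches(signed_script, signed_value, output):
    """Toy check: the signature is bound to script and value, not to the outpoint."""
    return output.script == signed_script and output.value == signed_value


utxos = [
    Output("aaaa:0", "pay-to-X-apo", 5),  # the intended output
    Output("bbbb:1", "pay-to-X-apo", 5),  # accidental payment, same script and value
    Output("cccc:0", "pay-to-X-apo", 7),  # same script, different value
]

# One signature intended for aaaa:0 also fits every output with the
# same script and value, so it can be replayed on bbbb:1.
replayable = [o.outpoint for o in utxos if apo_signature_matches("pay-to-X-apo", 5, o)]
print(replayable)  # ['aaaa:0', 'bbbb:1']
```

This is why the measures mentioned above aim at making it hard to accidentally create a second matching output, for example by not giving these scripts an address format.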
SL: What would the pathway be to activate ANYPREVOUT? What stage would it be in terms of people being able to review it or test it?
CD: I had a branch for SIGHASH_NOINPUT which was used by Richard Myers to implement a prototype of eltoo in Python that is working. I’m not exactly sure if ANYPREVOUT has any code that can be used just yet. I would have to definitely check with AJ or implement it myself. It shouldn’t be too hard to implement given that SIGHASH_NOINPUT consisted of two IF statements and a total of four lines changed. I don’t foresee any huge technical challenges. It is mostly just the discussion on making it safe, making it usable and making it efficient that will take a bit longer. We have that time as well because we are waiting for Taproot to be activated in the meantime.
SL: Is there anything else you wanted to mention about ANYPREVOUT or shall we now start talking about MPP?
CD: ANYPREVOUT is absolutely cool. We’re finding so many use cases. It’s really nice. I would so love to see it.
MPP blog post: https://medium.com/blockstream/all-paths-lead-to-your-destination-bc8f1a76c53d
SL: We’ll see what all the Bitcoin people out there are thinking. The benefits of having eltoo in Lightning would be pretty cool. It enables multi-party channels which for listeners who haven’t listened to our first episode, I think it is Episode 59 off the top of my head, have a listen to that. There’s a lot of possibilities there in terms of multi-party channels. That helps in terms of being able to get around that idea that there won’t necessarily be enough UTXOs for every person on earth. That’s why multi-party channels might be a handy thing to have. So let’s have a look into MPP then, multi-part payments. Listeners also check out my earlier episode with Rusty on this one. Christian, you had a great blog post talking about MPP and how it’s been implemented in c-lightning. Do you want to tell us the latest with that?
CD: We implemented multi-part payments as part of our recent 0.9.0 release which we published about 10 days ago. Multi-part payments is one of those features that has been long awaited because it enables us to be much more reliable and adapt ourselves better to the network conditions that we encounter. It boils down to instead of me sending a payment and doing it all in one chunk, we split the payment into multiple partial payments and send them on different paths from us to the destination, thus making better use of the network liquidity and allowing us to create bigger payments since we are no longer constrained by the capacity of individual channels. Instead we can bundle multiple channels’ capacities and use the aggregate of all of these channels together. It also allows us to make the payments much more reliable. We send out parts, get back information and only retry the parts that failed on our first attempt.
SL: So there’s a lot of benefits.
CD: There’s also a couple of benefits when it comes to privacy but we’ll probably talk about those a bit later.
SL: What are some of the main constraints if your node wants to construct an MPP multi-part payment onion package?
CD: There’s two parts of MPP. One is the recipient part which is “I know I should receive 10 dollars and I received only two. I’ll just keep on waiting until I get the rest, holding on to the initial two that I already have for sure.” The recipient grabs money and waits for it to be all there before claiming the full amount. On the sender’s side what we do is we split the payment into multiple parts. For each of these partial payments we compute a route from us to the destination. For each of these we go ahead and compute the routing onion. Each individual part has its own routing onion, has its own path, has its own fate so to speak. Then we send out the partial payment with its onion on its merry way until we either get to the destination, at which point the destination will collect the promise of incoming funds and if it has all of the funds that were promised available it will release the payment preimage thus locking in atomically all of the partial payments. Or we get back an error saying “This channel down the road doesn’t have enough capacity. Please try again.” At which point we then update our view of the network. We compute a new route for this payment and we try again. If we cannot find a new route for the amount that we have, we split it in half and now we have two parts that we try to send independently from each other. The sender side is pretty much in control of how big do we make these parts? How do we schedule them? How do we route them? How do we detect whether we have a fatal error that we cannot recover from? When do we detect that this part is ok but this part is about to be retried? The sender part is where all of the logic is. The recipient just waits for incoming pieces and then at some point decides “Ok I have enough. I’ll claim all of them.” The sender side required us to reengineer quite a lot of our payment flow. 
That also enabled us to build a couple of other improvements like the key send support for example, which we so far only had a Python plugin for but now have a C plugin for it as well.
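The recipient side described above can be sketched in a few lines. This is a minimal illustration and not c-lightning's code: `PendingInvoice` and `receive_part` are hypothetical names, and a real node would also give up on held parts after some timeout, which is left out here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PendingInvoice:
    total_msat: int   # amount the recipient expects in total
    preimage: bytes   # secret released once everything has arrived
    parts: List[int] = field(default_factory=list)

    def receive_part(self, amount_msat: int) -> Optional[bytes]:
        """Hold an incoming part; release the preimage only when the
        held parts add up to the full expected amount."""
        self.parts.append(amount_msat)
        if sum(self.parts) >= self.total_msat:
            return self.preimage
        return None


invoice = PendingInvoice(total_msat=10_000, preimage=b"secret")
first = invoice.receive_part(2_000)   # not enough yet, keep waiting
second = invoice.receive_part(8_000)  # full amount present, claim all parts
```

Releasing a single preimage for all held parts is what makes the partial payments settle together.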
SL: You were talking through the two different processes. You’ve got the pre-split and then you mentioned the adaptive splitting. Once you’ve tried it one time and it failed, now you can take that knowledge and try a slightly different split or slightly different route. It will then create the new payment and try to send that?
CD: Exactly. The adaptive splitting is exactly the part that we mentioned before. We try once and then depending on what comes back we decide do we retry? Do we split? What do we do now? Is this something that we can still retry and have a chance of completing? Or do we give up?
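A rough sketch of the adaptive splitting loop follows. It makes several simplifying assumptions: `try_send` is a hypothetical stand-in for "compute a route and attempt this part", parts are tried one after another rather than in parallel, and `min_part` is an assumed floor below which a part is no longer split.

```python
from typing import Callable, List


def send_adaptive(amount: int,
                  try_send: Callable[[int], bool],
                  min_part: int = 100) -> List[int]:
    """Return the sizes of the parts that were delivered.

    A failed part is split in half and both halves are retried.
    Parts smaller than min_part are not split any further.
    """
    pending = [amount]
    delivered: List[int] = []
    while pending:
        part = pending.pop()
        if try_send(part):
            delivered.append(part)
        elif part >= min_part:
            half = part // 2
            pending.extend([half, part - half])
        else:
            raise RuntimeError(f"could not deliver a part of size {part}")
    return delivered


# Toy network: any single path can carry at most 3000 units.
parts = send_adaptive(8000, lambda amt: amt <= 3000)
```

With this toy network the 8000 unit payment ends up as four parts that each fit the 3000 unit limit.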
SL: Let’s say you have installed c-lightning and you’re trying to do a payment. In the background what’s going on is your node has its own graph of the network and it’s trying to figure out “Here’s where the channels are that I know about. Here’s what I know of the capacity.” Does it then have better information and therefore each successive try is a little bit better. How does that work?
CD: Exactly. So initially what we have in the network is we see channels as total capacities. If the two of us opened a 10 dollar channel then somebody else would see it as 10 dollars. They would potentially try to send 8 dollars through this channel. Depending on the ownership of those 10 dollars this might be possible or not. For example, if we each own five there’s no way for us to send 8 dollars through this channel. We will report an error back to the sender and the sender will then know 8 was more than the capacity. I will remember this upper limit on the capacity. It might be lower but we know that we cannot send 8 dollars through that channel. As we learn more about the network our information will be more and more precise and we will be able to make better predictions as to which channels are usable and which channels aren’t for a given payment of a given size. There is no point in us retrying this 8 dollar payment through our well balanced channel again, because that cannot happen. But if we split in two and now have two 4 dollar parts then one of them might go through our channels. Knowing that we have 5 and 5 it will actually go through. Now the sender is left with a much easier task of finding a second 4 dollar path from himself to the destination rather than having this one big chunk of 8 all at once.
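The bookkeeping described here, remembering that a failed amount is an upper limit on what a channel can carry, might look roughly like this. `ChannelHints` and its methods are hypothetical names for the illustration, not c-lightning's data structures.

```python
from typing import Dict


class ChannelHints:
    """Remember, per channel, the smallest amount that is known to have failed."""

    def __init__(self) -> None:
        self.failed_at: Dict[str, int] = {}

    def record_failure(self, channel: str, amount: int) -> None:
        # The amount did not fit, so the usable capacity is below it.
        previous = self.failed_at.get(channel, amount)
        self.failed_at[channel] = min(previous, amount)

    def may_fit(self, channel: str, amount: int, total_capacity: int) -> bool:
        # Without any hint we only know the publicly announced total capacity.
        bound = self.failed_at.get(channel)
        if bound is None:
            return amount <= total_capacity
        return amount < bound


hints = ChannelHints()
before = hints.may_fit("our-channel", 8, total_capacity=10)  # looks possible
hints.record_failure("our-channel", 8)                       # 8 was too much
after = hints.may_fit("our-channel", 8, total_capacity=10)   # do not retry 8
half = hints.may_fit("our-channel", 4, total_capacity=10)    # 4 is worth a try
```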
SL: From the blog post the way the fees work is there’s a base fee and there’s typically a percentage fee. If you split your MPP up into a hundred different pieces you’re going to end up paying a massive amount of base fees across all of those hundred pieces. Your c-lightning node has to make a decision on how many pieces to split into.
CD: Exactly. We need to have a lower value after which we say “From now on it’s unlikely that we’re going to find any path because the payment is so small in size that it will be dominated by the base fee itself.” This is something that we’ve encountered quite early on already when the first games started popping up on the Lightning Network. For example Satoshi’s Place, if you wanted to color in one pixel on Satoshi’s Place you’d end up paying one millisatoshi but the base fee to get there would already be one satoshi. You’d be paying a 100,000 percent fee for your one millisatoshi transfer which is absolutely ludicrous. So we added an exception for really tiny transfers that we call signaling transfers because their intent is not really to pay somebody it’s more to signal activity. In those cases we allow you to have a rather large fee upfront. That is not applicable to MPP payments because if we were to give them a fee budget of 5 satoshis each then these would accumulate across all of the different parts and we’d end up with a huge fee. So we decided to give up if a payment is below 100 satoshis in size. Not give up but not split any further because at that size the base fee would dominate the overall cost of the transfers. What we did there was take the network graph and compute all end to end paths that are possible, all shortest paths, and compute what the base fee for these paths would be. If I’m not mistaken we have a single digit percent of payments that may still go through even though they are below 100 satoshis in size. We felt that aborting at something that is smaller than 100 satoshis is safe. We will still retry different routes but we will not split any further because that would double the cost in base fees at each splitting.
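The fee arithmetic in this answer can be checked with a short calculation. The per-hop formula (a base fee plus a proportional part in millionths of the amount) is the usual Lightning fee scheme. The proportional rate of 1 ppm, the 3 hop path and the helper names below are assumptions made for the example.

```python
def hop_fee_msat(amount_msat: int, base_fee_msat: int,
                 proportional_millionths: int) -> int:
    """Fee one forwarding node charges for forwarding amount_msat."""
    return base_fee_msat + amount_msat * proportional_millionths // 1_000_000


def fee_percent(amount_msat: int, fee_msat: int) -> float:
    """Fee expressed as a percentage of the amount transferred."""
    return 100 * fee_msat / amount_msat


def total_base_fee_msat(parts: int, hops: int, base_fee_msat: int) -> int:
    """Base fees paid when every part crosses the same number of hops."""
    return parts * hops * base_fee_msat


# One pixel: a 1 millisatoshi payment over a hop with a 1 satoshi base fee.
pixel_fee = hop_fee_msat(1, base_fee_msat=1000, proportional_millionths=1)
pixel_percent = fee_percent(1, pixel_fee)

# Splitting one part into two doubles the base fees along equally long paths.
one_part = total_base_fee_msat(parts=1, hops=3, base_fee_msat=1000)
two_parts = total_base_fee_msat(parts=2, hops=3, base_fee_msat=1000)
```

Here `pixel_percent` comes out at 100,000 percent, matching the figure quoted above, and `two_parts` is exactly twice `one_part`.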
SL: In practice most people are opening channels much, much larger than that. A hundred satoshis is really trivial to be able to move that through. At current prices we’re talking like 6 or 7 cents or something.
CD: Speaking of channels and the expected size of a payment that we can send through, that brings me back to our other payment modifier, the pre-split modifier which instead of having this adaptive mechanism where we try and then learn something and retry with this new information incorporated. We decided to do something a bit more clever and say “Wait why do we even try these really large payments in the first place when we know perfectly well that most of them will not succeed at first?” I took my Lightning node and tried to send payments of various sizes to different endpoints by probing them. Unlike my previous probes where I was interested in seeing if I could reach those nodes, I was more interested if I can reach them how much capacity could I get on this path? What is the biggest payment that would still succeed getting it to the destination? What we did is we measured the capacity of channels along the shortest path from me to I think 2000 destinations. Then we plotted it and it was pretty clear that amounts below 10,000 satoshis, approximately 1 dollar, have a really good chance of succeeding on the first try. We measured the capacities in the network and found that payments with 10,000 satoshis in size can succeed relatively well. We have an 83 percent success rate for payments of exactly 10,000 satoshis, smaller amounts will have higher success rates. Instead of trying these large chunks at first and then slowly moving towards these sizes anyway, we decided to split right at the beginning of the payment into roughly 1 dollar sized chunks and then send them on their way. These already have a way better chance of succeeding on the first try than this one huge chunk would have initially.
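A pre-split step along these lines could be sketched as follows. The 10,000 satoshi target comes from the measurements described above. Splitting into near-equal parts is an assumption made for this sketch, and the actual c-lightning splitter may choose part sizes differently.

```python
from typing import List


def presplit(amount_sat: int, target_sat: int = 10_000) -> List[int]:
    """Split an amount into near-equal parts, none larger than target_sat."""
    if amount_sat <= target_sat:
        return [amount_sat]
    count = -(-amount_sat // target_sat)     # ceiling division
    base, extra = divmod(amount_sat, count)  # spread the remainder evenly
    return [base + 1] * extra + [base] * (count - extra)


small = presplit(5_000)   # already below the target, sent as one part
large = presplit(25_000)  # three parts of roughly 8,333 satoshis each
```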
SL: To clarify, that percentage you were mentioning, that is on the first try, correct? It will then retry multiple times and the actual payment success rate is even higher than that for 10,000 sats?
CD: Absolutely, yeah.
SL: I think this is an interesting idea because it makes it easier for the retail HODLer or retail Lightning enthusiast to be able to set up his node and be a meaningful user of the network that they’re not so reliant on routing through the well known massive nodes, the ACINQ node, the Bitrefill node or the Zap node. It is easier for an individual because now you can split those payments across multiple channels?
CD: Absolutely. What we do with the pre-split and adaptive splitter, we make better use of the network resources that are available by spreading a single payment over a larger number of routes. We give each of the nodes on those routes a tiny sliver of fees instead of going through the usual suspects and giving them all of the fees. We make revenue from routing payments more predictable. We learn more about the network topology. While doing MPP payments we effectively probe the network and find places that are broken and will cause them to close channels that are effectively of no use anyway. Something that we’ve seen with the probing that we did for the Lightning Network conference was that if we end up somewhere where the channel is non-functional, we will effectively close that channel and prune the network of these relics that are of no use. We also speed up the end-to-end time by doing all of this in parallel instead of sequentially where each payment attempt would be attempted one by one. We massively parallelized that to learn about the network and make better use of what we learned by speeding up the payment as well.
SL: I also wanted to touch on the privacy elements. I guess there’s probably two different ways you could think of, multiple angles I can think of. One angle might be if somebody was trying to surveil the network and they wanted to try to understand the channel balances and ascertain or infer from the movement in the balances who is paying who, MPP changes that game a little bit. It makes it harder for them. But then maybe on the downside you might say because we haven’t moved to the Schnorr payment points PTLC idea then it’s still the same payment preimage. It is asking the same question to use the phrasing Rusty used. In that sense it might theoretically be easier for a hypothetical surveillance company to set up spy Lightning nodes and see that they’re asking the same question. What are your thoughts there?
CD: There is definitely some truth in the statement that by distributing a payment over more routes and therefore involving more forwarding nodes, we are telling a larger part of the network about a payment that we are performing. That’s probably worse than our current system where even if we were using a big hub that hub would see one payment and the rest of the network would be none the wiser. On the plus side however the one big hub thing would give away the exact value you’re transferring to the big hub. Whereas if we pre-split to 1 dollar amounts and then do adaptive splitting, each of the additional nodes that are now involved in this payment learns a tiny bit about the payment being performed, namely that there is a payment, but since we use this homogeneous split of everything splits to 1 dollar, they don’t really know much more than that. They will learn that somebody is paying someone but they will not learn about the amount, they will not learn about the source and destination. And we are making traffic analysis a lot harder for ISP level attackers by really increasing the chattiness of the network itself. We make it much harder for observers to associate or to collate individual observations into one payment. It is definitely not the perfect solution to tell a wider part of the network about the payment being done, but it is an incremental step towards the ultimate goal of making every payment indistinguishable from each other which we are getting with Schnorr and the point timelocked contracts. Once we have the point timelocked contracts we truly have a system where we are sending back and forth payments that are not collatable by payment hash as you correctly pointed out. Not even by amount, because all of the payments have roughly the same amounts. It is the combination of multiple of these partial payments that gives you the actual transferred amount. 
I think it’s not a clear loss or a clear win for privacy that we’re now telling a larger part of the network. But I do think that the pre-splitter and the adaptive splitting when combined with PTLC will be an absolute win no matter where you look at it.
SL: I think that’s a very fair way to summarize. In terms of getting PTLCs, point timelocked contracts, the requirement for that would be the Schnorr Taproot soft fork? Or is there anything else that’s also required?
CD: Taproot and Schnorr is the only one that is required for PTLCs. I’m expecting the Lightning Network specification to be really quick at adopting it, pushing it out to the network and actually making use of that new found freedom that we have with PTLCs and Schnorr.
SL: I suppose the other component to think about and consider from a privacy perspective is the onchain footprint aspect of Lightning. Maybe some listeners might not be familiar but when you’re doing Lightning you still have to do the open and close of a channel. You did some recent work at the recent Lightning conference showing an ability to understand which ones of these were probably Lightning channel opens. That is another thing where Taproot might help particularly in the case of a collaborative close. Once we have Taproot, let’s say you and I open a channel together and it’s the happy path, the collaborative close, that channel close is indistinguishable from a normal Taproot key path spend?
CD: Exactly. Our opens will always look exactly like somebody paying to a single sig. The single sig under the covers happens to be a 2-of-2 multisig disguised as a single sig through the signature aggregation proposals that we have. The close transactions, if they are collaborative closes, they will also look like single sig spends to the destinations that are owned by the endpoint. It might be worth pointing out that non-collaborative closes will leak some information about the usage of eltoo or Lightning penalty simply because we enter this disputed phase where we reveal all of the internals of our agreement, namely how we intend to overwrite or penalize the misbehaving party. Then we can still read out some of the information from a channel. That’s where I mentioned before that you might not want to increment state numbers one by one for example. This is also the reason why in LN penalty, we hide the commitment number in the locktime field but encrypt it. That information might still eventually end up on the blockchain where it could be analyzed. But we’d gossip about most of this information anyway because we need to have a local view of the network in order to route payments.
SL: It is a question of what path you really need to be private I guess. One other part where I wanted to confirm my understanding is with the Taproot proposal, my understanding is you’ll have a special kind of Taproot output. The cool thing about the Schnorr signatures aspect is that people can do more cryptography and manipulation on that. That’s this idea of tweaking. My understanding is you either have the key path spend which is the indistinguishable spend. That’s the collaborative close example. But then in the non-collaborative close, that would be a script path spend. As part of Taproot there’s a Merkle tree. You have to expose which of the scripts that you want to spend. You’re showing the script I want to spend and the signatures in relation to it. Is that right?
CD: That’s right. The Taproot idea comes out of this discussion for Merklized Abstract Syntax Trees. It adds a couple of new features to it as well. A Merklized Abstract Syntax Tree is a mechanism of having multiple scripts that are added to a Merkle tree and summed up until we get to the root. The root would be what we put into our output script. When we spend that output we would say “That Merkle tree corresponds to this script and here is the input that matches this script proving that I have permission to spend these coins.” Taproot goes one step further and says “That Merkle tree root is not really useful. We could make that a public key and mix in the Merkle root through this tweaking mechanism.” That would allow us to say “Either we sign using the root key into which we tweak the Merklized Abstract Syntax Tree.” That’s the key path spend. Or we can say “I cannot sign with this pubkey alone but I can show the script that corresponds to this commitment.” Then for that I do have all of the information I need to sign off. In the normal case for a channel close we use the root key to sign off on the close transaction. In the disputed case we say “Here’s the script that we agreed upon before. Now let’s run through it and resolve this dispute that we have by settling onchain and having the blockchain as a mediator for our dispute.”
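The hashing side of this construction can be sketched with the tagged hashes defined in BIP 340 and BIP 341. The sketch only computes the Merkle root and the tweak value. The elliptic curve step that adds the tweak to the internal key is left out, and the two scripts are placeholder bytes rather than real channel scripts.

```python
import hashlib
from typing import List


def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP 340 style tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()


def tapleaf_hash(script: bytes, leaf_version: int = 0xC0) -> bytes:
    """Hash of one script leaf (length prefix valid for scripts under 253 bytes)."""
    assert len(script) < 253
    return tagged_hash("TapLeaf", bytes([leaf_version, len(script)]) + script)


def tapbranch_hash(left: bytes, right: bytes) -> bytes:
    """Combine two child hashes; children are sorted so order does not matter."""
    return tagged_hash("TapBranch", min(left, right) + max(left, right))


def root_from_path(leaf: bytes, path: List[bytes]) -> bytes:
    """Recompute the Merkle root from a leaf hash and its sibling hashes."""
    node = leaf
    for sibling in path:
        node = tapbranch_hash(node, sibling)
    return node


def taptweak(internal_key_x: bytes, merkle_root: bytes) -> bytes:
    """Tweak value that commits the script tree into the output key."""
    return tagged_hash("TapTweak", internal_key_x + merkle_root)


# Placeholder inputs: the x coordinate of the secp256k1 generator stands in
# for the internal key, and the scripts are arbitrary example bytes.
internal_key = bytes.fromhex(
    "79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798")
leaf_a = tapleaf_hash(b"example script A")
leaf_b = tapleaf_hash(b"example script B")
root = tapbranch_hash(leaf_a, leaf_b)
tweak = taptweak(internal_key, root)
```

A script path spend reveals one script together with its sibling hashes, so a verifier can rebuild the root with `root_from_path` and check it against the tweak. A key path spend reveals none of this.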
SL: I also wanted to talk about some of the Lightning attacks that are coming out in articles. From my understanding from chatting with yourself and some of the other Lightning protocol developers, it seems to me like there’s a bunch of these that have been known for a while but some of them are now coming out as papers. An interesting recent one is called “Flood and Loot: A Systemic Attack on the Lightning Network”. As I understand this, it requires establishing channels and then trying to send through a lot of HTLC payments. They go non-responsive and then they force the victim to try to go to chain. The problem is if they’ve done it with many people and many channels all at once they wouldn’t be able to get confirmed. That’s where the victim would lose some money. Could you help me there? Did I explain that correctly?
CD: That’s correct. The idea is to have an attacker send a massive amount of payments through the victims to a second node that he owns. What you end up doing there is you add a lot of HTLCs to the channel of your victim and then you hold on to these payments on the recipient side of the channel. Something that we’ve known for quite some time. We know that holding onto HTLCs is kind of dangerous. This attacker will hold onto HTLCs so long that the timeout approaches for the HTLC. The HTLC has two possible outcomes, either it is successful and the preimage is shown to the endpoint that added the HTLC, or we have a timeout. Then the funds revert back to the endpoint that added the HTLC. This works because there is no race between the success transaction and the timeout transaction. If there is no success for let’s say 10 hours then we will trigger the timeout because we can be confident that the success will not come after the timeout. This Flood and Loot attack does exactly that by holding onto the HTLC, it forces us to have a race between the timeout and the success transaction. The problem is that our close transaction, having all of these HTLCs attached, is so huge that it will not confirm for quite some time. So they can force the close to take so long that the timeout has expired. We are suddenly in a race between the successful transaction and the timeout transaction. That’s the attack. To bloat somebody else’s channel such that the confirmation of the close transaction that is following is so long that we can actually get into a situation where we are no longer sure whether the timeout or the success transaction will succeed in the end.
SL: I guess there’s a lot of moving parts here. You could say “Let’s modify the CSV window and make that longer” or “Let’s change the number of HTLCs and restrict that for each channel”. Could you talk to us about some of those different moving parts here?
CD: It is really hard to say “One number is better than the other”. But one way of reducing the impact of this attack is to limit the number of HTLCs that we add to our own transaction. That will directly impact the size of our commitment transaction and therefore our chances of getting confirmed in a reasonable amount of time. Avoid having this race condition between success and timeout. The reason why I’m saying that there is no clear solution is that reducing the number of HTLCs that we add to our channels reduces the utility of the network as a whole because once we have 10 HTLCs added and we only allow 10 to be added at once then that means that we cannot forward the 11th payment for example. If our attacker knows that we have this limit, they could effectively run a DOS attack against us by opening 10 HTLCs. Exhausting our budget for HTLCs and therefore making our channel unusable until they release some of the HTLCs. That’s an attack that we are aware of and so far hasn’t been caught by academia but I’m waiting for it. All of these parameters are a trade-off between various goals that we want to have. We don’t currently have a clean solution that has only upsides. The same goes for CSVs. If we increase the CSV timeouts then this attack might be harder to enforce because we can spread confirmation of transactions out a bit further. On the downside having large CSVs means that if we have a non-collaborative close for a channel then the funds will return only once the CSV timeout expires. That means that the funds are sure to come back to us but might not be available for a couple of days before we can reuse them.
SL: It is an opportunity cost because you want to be able to use that money now or whatever. There are trade-offs and there’s no perfect answer to them. Let’s say somebody tries to jam your channels. How do HTLCs release? What’s the function there?
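The trade-off around limiting HTLCs can be made concrete with a small model of a channel that only accepts a fixed number of pending HTLCs. The `Channel` class and its methods are hypothetical names, and the limit of 10 is the example figure used in the answer above.

```python
from typing import Set


class Channel:
    """Toy channel that tracks pending HTLCs against a fixed slot limit."""

    def __init__(self, max_htlcs: int) -> None:
        self.max_htlcs = max_htlcs
        self.pending: Set[int] = set()

    def add_htlc(self, htlc_id: int) -> bool:
        """Accept the HTLC only while a free slot is available."""
        if len(self.pending) >= self.max_htlcs:
            return False
        self.pending.add(htlc_id)
        return True

    def resolve(self, htlc_id: int) -> None:
        """Free a slot once the HTLC succeeds or times out."""
        self.pending.discard(htlc_id)


channel = Channel(max_htlcs=10)
# The attacker opens 10 HTLCs and holds on to them.
attacker_accepted = [channel.add_htlc(i) for i in range(10)]
# An honest 11th payment can no longer be forwarded.
honest_accepted = channel.add_htlc(10)
# Only once one HTLC is released does the channel become usable again.
channel.resolve(0)
honest_retry = channel.add_htlc(10)
```

A low limit keeps the commitment transaction small but makes it cheap for an attacker to occupy every slot, which is the tension described above.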
CD: Each HTLC has a timeout. The endpoint that has added the HTLC can use this timeout to recover funds that are in this HTLC after this timeout expires. Each HTLC that is added starts a new clock that counts down until we can recover our funds. If the success case happens before this timeout then we’re happy as well. If this timeout is about to expire and we need to resolve this HTLC onchain then we will have to force this channel onchain before this timeout expires, a couple of blocks before. Then force our counterparty to either reveal the preimage or grab back our funds through the timeout. We then end up with a channel closing slightly before the timeout and then an onchain settlement of that HTLC.
SL: So we could think of it like we set up our node, we set up the channels and over time HTLCs will route through. It is usually going to be a CSV or maybe a CLTV where over time those HTLCs will expire out because the timer has run out on them. Now you’ve got that capacity back again.
CD: In these cases they are CLTVs because we need absolute times for HTLCs. That’s simply because we need to make sure the HTLC that we forwarded settles before the upstream or the HTLC where we received from settles. We need to have this time to extract the information from the downstream HTLC, turn around and forward it to the upstream HTLC in order to settle the upstream HTLC correctly. That’s where the notion of CLTV delta comes in. That is a parameter that each node sets for himself and says “I am confident that if my downstream node settles in 10 blocks I have enough time to turn around and inform my upstream node about this downstream settlement so that my channel can stay active.”
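How the CLTV deltas stack up along a route can be shown with a short calculation. The function below is a sketch under assumptions: the block height, the final expiry margin and the deltas are made-up example numbers, and `htlc_expiries` is a hypothetical helper name.

```python
from typing import List


def htlc_expiries(current_height: int, final_cltv: int,
                  deltas: List[int]) -> List[int]:
    """Absolute expiry height of each HTLC along a route.

    deltas holds the CLTV delta of each forwarding node, ordered from the
    sender towards the recipient. The result is ordered the same way: the
    first entry is the HTLC offered by the sender, the last entry is the
    HTLC that reaches the recipient.
    """
    expiry = current_height + final_cltv
    expiries = [expiry]
    # Walk the route backwards: every forwarding node needs its own delta
    # between the HTLC it receives and the HTLC it forwards.
    for delta in reversed(deltas):
        expiry += delta
        expiries.append(expiry)
    return list(reversed(expiries))


# Sender -> A (delta 40) -> B (delta 10) -> recipient
route = htlc_expiries(current_height=640_000, final_cltv=10, deltas=[40, 10])
```

For this route the expiries are 640,060, then 640,020, then 640,010, so every upstream HTLC expires later than the downstream one it depends on.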
SL: I also wanted to touch on the commitment transaction size. Part of this attack in the Flood and Loot example depends on having a very large commitment transaction. If there’s a lot of pending HTLCs why does that make the transaction bigger? Is it that there’s a lot more outputs there?
CD: That’s exactly the case. The commitment transaction varies in size over time as we change our state. Initially when we have a single party funding the channel then the entirety of the funds will revert back to that party. The commitment transaction will have one output that sends all of the funds back to the funding party. As soon as the counterparty has ownership of some funds in the channel then we will add a second output, one going to endpoint A and one going to endpoint B. Those reflect the settled capacity that is owned by the respective party. Then we have a third place where we add new outputs, that’s exactly the HTLCs. Each HTLC doesn’t belong to either A or B but it’s somewhere in the middle. If we succeed, it belongs to B and if it doesn’t succeed, it reverts back to A. Each of the HTLCs has their own place in the commitment transaction in the form of an output reflecting the value of the HTLC and the output script, the resolution script of the HTLCs, which spells out “Before block height X I can be claimed by this and after block height X I can be reverted back to whoever added me”. Having a lot of HTLCs attached to a channel means that the commitment transaction is really large in size. That’s also why we have this seemingly random limit on the total number of HTLCs in the protocol of 483 maximum HTLCs attached to a single transaction because at that point with 483 HTLCs we’d end up with a commitment transaction that is 100 kilobytes in size, I think.
SL: That’s pretty big. A standard transaction might be like 300 bytes?
CD: It’s a massive cost as well to get that confirmed. It definitely is a really evil attack because not only are you stealing from somebody but you’re also forcing them to pay a considerable amount of money to get their channel to settle.
SL: The other point there is that because we count fees in terms of sats per byte and if you’ve done that fee negotiation between the two nodes upfront, let’s say you and I negotiated that early on and then one of us goes offline because it’s the flood and loot attack. You’d have this huge transaction but you wouldn’t have enough fees to close it.
CD: Exactly. We would stop adding HTLCs before we no longer have any funds to settle it but it would still be costly if we ever end up with a large commitment transaction where something like 50 percent of our funds go to fees because it’s this huge thing.
SL: Maybe we can step back and talk about Lightning generally, the growth of the Lightning Network and some of the different models that are out there. In terms of how people use a Lightning node today there’s the Phoenix wallet ACINQ style where it is non-custodial but there’s certain trade offs there and it’s all going through the ACINQ node. Then you’ve got Wallet of Satoshi style. They’re kind of like a Bitcoin Lightning bank and the users are customers of that bank. Then you’ve got some people who are going full mobile node Neutrino style and then maybe the more self-sovereign style where people might run node packages like myNode, Nodl or Raspiblitz and have a way to remote in with their Blue wallet or Zap or Zeus or Spark wallet. Do you have any thoughts on what models you think will be more popular over time?
CD: I definitely can see the first and last model quite nicely namely the sort of mobile wallet that has somebody on the operational side taking care of operating your node but you are still in full control of your funds. That would be the Phoenix ACINQ model where you care for your own node but the hard parts of maintaining connectivity and maintaining routing tables and so on would be taken care of by a professional operator. That’s also why together with ACINQ we came up with the trampoline routing mechanism and some other mechanisms to outsource routing to online nodes. Running a full Lightning node on a mobile phone, while way easier than a Bitcoin full node, is still going to use quite a considerable amount of resources in terms of battery and data to synchronize the view of the network to find paths from you to your destination. You would also need to monitor the blockchain in a reliable way so that if something happens, one of your channels goes down, you are there to react. Having somebody taking care of those parts, namely to preprocess the changes in the network view and providing access to the wider network through themselves is definitely something that I can see being really popular. On the other side, I can definitely see people that are more into operating a node themselves going towards a self sovereign node style at home where they have a home base that their whole family might share or they might administer it for a group of friends and each person would get a node that they can remote into and operate from there. There is the issue of synchronizing routing notes and so on to your actual devices that you’re running around with like a mobile phone or your desktop. It doesn’t really matter because you have this 24 hour node online that will take care of those details. 
The fully mobile nodes, I think they’re interesting to see and they definitely show up a lot of interesting challenges but it might be a bit too much for the average user to have to take care of all of the stuff themselves. To learn about what a channel is, to open a channel, to curate channels to make sure that they are well connected to the network. Those are all details that I would like to hide as much as possible from the end user because while important for your performance and your ability to pay they are also hard concepts that I for example would not want to try to explain to my parents.
SL: Of course. Obviously your focus is very deep technical protocol level but do you have any thoughts on what is needed in terms of making Lightning more accessible to that end user? Is it better ways to remote into your home node? Do you have any ideas around that or what you would like to see?
CD: I think at least from the protocol side of things we still have a lot we can do to make all of this more transparent to the user and enable non tech savvy people to take care of a node themselves. I don’t know what the big picture is at the end but I do know that we can certainly abstract away and hide some of the details in the protocol itself to make it more accessible and make it more usable to end users. As for the nice UI and user experience that we don’t have yet, I think that will crystallize itself out in the coming months. We will see some really good looking things from wallet developers. I’m not a very graphical person so I can’t tell you what that’s going to look like but I’m confident that there are people out there that have a really good idea on what this could look like. I’m looking forward to seeing it myself.
SL: There’s a whole bunch of different models because people who just want something easy to get started, something like Phoenix might be a good one for them. If you’re more technical then obviously you can go and do the full set up your own c-lightning and Spark or set up lnd and Zap or whatever you like. It is building out better options to make it easy for people even if we know not everyone is going to be capable to do the full self-sovereign style as we would like.
CD: Absolutely. It is one of my pet peeves that I have with the Bitcoin community, we have a tendency to jump right to the perfect solution and shame people that do not see this perfect solution right away. This shaming of newcomers into believing that there is this huge amount of literature they have to go through before even touching Bitcoin the first time. That can be a huge barrier to entry. I think what we need to have is a wide range of utilities that as the user grows in their own understanding of Bitcoin he can upgrade or downgrade accordingly to reflect his own understanding of the system itself. We shouldn’t always mandate that only the most secure solution is the only one that is to be used. I think that there are trade offs when it comes to user friendliness and privacy and security, and we have to accept that some people might not care so much about the perfect setup, they might be ok with a decent one.
Bastien Teinturier at Lightning Conference: https://diyhpl.us/wiki/transcripts/lightning-conference/2019/2019-10-20-bastien-teinturier-trampoline-routing/
Bastien Teinturier on the Lightning dev mailing list: https://gnusha.org/url/https://lists.linuxfoundation.org/pipermail/lightning-dev/2019-August/002100.html
SL: That’s a very good comment there. I wanted to talk about trampoline routing. You mentioned this earlier as well. I know the ACINQ guys are keen on this idea though I know that there has also been some discussion on GitHub from some other Lightning developers who said “I see a privacy issue there because there might not be enough people who run trampoline routers and therefore there’s a privacy concern there. All those mobile users will be doxing their privacy to these trampoline routers.” Do you have any thoughts on that or where are you placed on that idea?
CD: Just to reiterate, trampoline routing is a mechanism for a mobile wallet or a resource constrained wallet to contact somebody in the network that offers this trampoline service and forward a payment to that trampoline node. When the trampoline node unpacks the routing onion it will see “I’m not the destination and I should forward it to somebody.” But instead of telling me exactly whom I have to forward it to it tells me the final destination of the payment. Let’s say I’m a mobile phone and I have a very limited knowledge of my surroundings in the network but I know that you Stephan are a trampoline node. Then when I want to pay Rusty for example I can look in my vicinity to see if I have a trampoline node. I can build a payment to you with instructions to forward it to Rusty whom I don’t know how to reach. Then I send this payment. When you unpack your onion you just receive it like usual. You don’t know exactly who I am because I’m still onion routing to you. You unpack this onion and see “Somebody who has sent me this payment has left me 100 satoshis in extra fees. I’m supposed to send 1 dollar to Rusty. Now I have 100 satoshis as a budget to get this to Rusty.” I outsourced my route finding to you. What have you seen from this payment? You’ve obviously seen that Rusty is the destination and that he should receive 1 dollar worth of Bitcoin. But you still don’t know me. We could go one step further and say “Instead of having this one trampoline hop we can also chain multiple of them.” Instead of telling you to go to Rusty I would tell you to go to somebody else who also happens to be a trampoline and then he can forward it to Rusty. We can expand on this concept and make it an onion routed payment inside of individual onion routed hops. What does the node learn about the payment he is forwarding?
If we only do this one trampoline hop then you might guess that I’m somewhere in your vicinity, network wise, and you learned that Rusty is the destination. If we do multiple trampoline hops then you will learn that somebody has sent you a payment. Big surprise, that’s what you always knew. You can no longer say that I’m in your vicinity, I the original sender, because you might have gotten it from some other trampoline node. You also cannot know whether the next trampoline you’re supposed to send to is the destination or an intermediate trampoline as well. We can claw back some of the privacy primitives that we have in pure onion routing, that is source based routing, inside of the trampoline routing. But it does alleviate the issue of the sender having to know a good picture of the network topology in order to send a payment first. I think we can make a good case for this not being much worse but much more reliable than what we had before. We also have a couple of improvements that come alongside trampoline routing. Let’s go back to the initial example of me sending to you, you being the trampoline and then sending to Rusty. Once you get the instruction to send to the final destination, you can retry yourself instead of having to tell me “This didn’t work. Please try something else.” We can do in network retries which is really cool especially for mobile phones that might have a flaky connection or it might be slow. We can outsource retrying multiple attempts to the network itself without having to be in the active path ourselves.
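The fee budget and in-network retry ideas described above can be sketched roughly as follows. This is only an illustration, not how any Lightning implementation does it: `TrampolineRequest`, `Route`, `forward_via_trampoline` and the `try_route` callback are hypothetical names, and the 10,000 satoshi amount stands in for the "1 dollar" of the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TrampolineRequest:
    destination: str      # final recipient the sender could not route to
    amount_sat: int       # amount the recipient must receive
    fee_budget_sat: int   # extra satoshis the sender left for route finding


@dataclass
class Route:
    hops: List[str]
    fee_sat: int          # total routing fee the trampoline would pay


def forward_via_trampoline(
    request: TrampolineRequest,
    candidate_routes: List[Route],
    try_route: Callable[[Route, int], bool],
) -> Optional[int]:
    """Try routes cheapest first and retry locally on failure.

    Returns the part of the budget the trampoline keeps, or None if no
    route within the budget succeeded.
    """
    for route in sorted(candidate_routes, key=lambda r: r.fee_sat):
        if route.fee_sat > request.fee_budget_sat:
            break  # every remaining route costs more than the budget
        if try_route(route, request.amount_sat):
            return request.fee_budget_sat - route.fee_sat
        # Failure: retry with the next route without asking the sender.
    return None


# Hypothetical scenario: 100 sat budget, the cheapest route fails.
request = TrampolineRequest("Rusty", amount_sat=10_000, fee_budget_sat=100)
routes = [
    Route(["X", "Rusty"], fee_sat=30),
    Route(["Y", "Z", "Rusty"], fee_sat=60),
    Route(["W", "Rusty"], fee_sat=150),
]
attempts: List[int] = []


def try_route(route: Route, amount_sat: int) -> bool:
    attempts.append(route.fee_sat)
    return route.fee_sat == 60  # pretend only this route has liquidity


profit = forward_via_trampoline(request, routes, try_route)
```

In this made-up scenario the trampoline retries once on its own and keeps the unspent part of the budget, which is the difference CD refers to when he talks about the sender overpaying.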
SL: Fascinating. If I had to summarize some of your thinking there, it’s kind of like think through a little bit more clearly about exactly who are you doxing and what are you doxing to who? If you haven’t doxed any personal information about yourself to me then really what’s the privacy loss there? Maybe it would become the case that the hardcore Bitcoin Lightning people might run trampoline routing boxes in a similar way to some hardcore people run Electrum public servers to benefit people on the network.
CD: Absolutely. It is not just because of the kindness of your heart that you’re running trampoline nodes. One thing that I mentioned before is that you get a lot of fees in order for you to be able to find a route. The sender cannot estimate how much it’s going to cost to reach the destination. They are incentivized to overpay the trampoline node to find a route for them. This difference then goes to the trampoline node. Running a trampoline node can be really, really lucrative as well.
SL: Yeah, that’s fascinating. I didn’t think about that. That’s a good point. In some ways there’s more incentive to do it than running an Electrum public server because people don’t pay Electrum public servers right now. It is even better in that sense.
CD: Yeah and it’s not hard to implement. We can implement trampoline routing as a plugin right now.
SL: Another thing I was interested to touch on is privacy attacks on Lightning. With channel probing the idea is that people construct a false onion that they know cannot go through and then try to figure out based on that. They sort of play Price is Right. Try 800 sats, try 8 dollars and then figure it out based on knowing roughly how much is available in that channel. People say that’s violating the privacy principles of Lightning but how bad is that really? What’s the actual severity of it? Just losing some small amount of privacy in a small way that doesn’t really stop the network growing? Do you have any reflections on that?
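The "Price is Right" guessing game described here amounts to a binary search over the amount. A minimal sketch, assuming a hypothetical `send_probe` callback that returns True when a probe of that amount got past the channel and False when it failed for lack of balance (a real probe would use an HTLC with an unknown payment hash and interpret the returned error, and the balance could change between probes):

```python
from typing import Callable


def probe_balance(send_probe: Callable[[int], bool], capacity_sat: int) -> int:
    """Return the largest amount (in satoshis) that the channel forwards."""
    lo = 0                  # assumed to always pass
    hi = capacity_sat + 1   # can never pass: exceeds the channel capacity
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if send_probe(mid):
            lo = mid        # channel could forward mid, balance is >= mid
        else:
            hi = mid        # insufficient balance, balance is < mid
    return lo


# Hypothetical channel: 1,000,000 sat capacity, 123,456 sat available.
probes_sent = 0


def send_probe(amount_sat: int) -> bool:
    global probes_sent
    probes_sent += 1
    return amount_sat <= 123_456


estimate = probe_balance(send_probe, capacity_sat=1_000_000)
```

Because each probe halves the remaining interval, a channel of this size is pinned down in at most 20 probes.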
CD: I do because probing was one of my babies. I really like probing the network to be honest. I come from a background that is mostly measurements and probing the Bitcoin network. I was really happy when I found a way to probe the Lightning Network and see how well it works and if we can detect some failures inside of the network. You’re right that probing boils down to attempting a payment that we know will never succeed because we gave it a payment hash that doesn’t correspond to anything that the recipient knows. What we can do is compute a route to whichever node I’m trying to probe, I will construct an onion and then send out an HTLC that cannot possibly be claimed by the recipient. Depending on the error message that comes back, whether the destination says “I don’t know what you’re talking about” or some intermediate node saying “Insufficient capacity” we can determine how far we got with this probe and what kind of error happened at the point where it failed. We can learn something about the network and how it operates in the real world. That’s invaluable information. For example we measured how probable a stuck payment is, something that has been dreaded for a long time. It turns out that stuck payments are really rare. They happen in 0.18 percent of cases for payments. It’s also really useful to estimate the capacity that we have available for sending payments to a destination. That’s something that we’ve done for the pre-split analysis for example. We said “Anything below 10,000 satoshis has a reasonable chance of success. Anything above might be tricky.” Before even trying anything we split right at the start into smaller chunks. Those are all upsides for probes but I definitely do see that there is a downside for probing and that is that we leak some privacy. What privacy do we leak? It’s the channel capacities. Why are channel capacities dangerous to be known publicly? It could enable you to trace a payment through multiple hops.
Let’s say for example we have channels A, B and C that are part of a route. Along these three channels we detect a change in capacity of 13 satoshis. Now 13 satoshis is quite a specific number. The probability of that all belonging to the same payment is quite high. But for us to collate this information into reconstructing payments based solely on observing capacity changes we also need to make sure that our observations are relatively close together. Because if an intermediate payment comes through that might obscure our signal that allows us to collate the payment. That’s where I think that MPP payments can actually hugely increase privacy simply by providing enough noise to make this collating of multiple observations really hard because channel balances now change all the time. You cannot have a channel that is constant for hours and hours and then suddenly a payment goes through and you can measure it. Instead you have multiple payments going over a channel in different combinations and the balances of those changes cannot be collated into an individual payment anymore. That is combined with efforts like Rene Pickhardt’s just-in-time rebalancing where you obscure your current balance by rebalancing on the fly while you are holding onto an HTLC. That can pretend to be a larger channel than it actually is simply because we rebalance our channel on the fly. I think probing can be really useful when it comes to measuring the performance metrics for the Lightning Network. It could potentially be a privacy issue but at the timeframes that we’re talking today it’s really improbable to be able to trace a payment through multiple channels.
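The collation of balance changes described above can be illustrated with a small sketch. It assumes a hypothetical observer that has already turned repeated probes into per-channel lists of (timestamp, balance change) pairs, and it ignores routing fees, which in practice make the amounts differ slightly from hop to hop; `correlate_changes` is an illustrative name only.

```python
from typing import Dict, List, Tuple

# (unix timestamp in seconds, absolute balance change in satoshis)
Change = Tuple[float, int]


def correlate_changes(
    observed: Dict[str, List[Change]],
    route: List[str],
    window_s: float,
) -> List[int]:
    """Return amounts seen on every channel of `route` within `window_s`."""
    first, rest = route[0], route[1:]
    matches = []
    for t0, amount in observed[first]:
        seen_everywhere = all(
            any(a == amount and abs(t - t0) <= window_s for t, a in observed[ch])
            for ch in rest
        )
        if seen_everywhere:
            matches.append(amount)
    return matches


# Quiet network: the same 13 sat change shows up on A, B and C.
quiet = {
    "A": [(100.0, 13)],
    "B": [(100.4, 13)],
    "C": [(100.9, 13)],
}
traced = correlate_changes(quiet, ["A", "B", "C"], window_s=2.0)

# Busy network: another 250 sat payment crossed B between two probes,
# so the observer only sees the combined change of 263 sat there.
busy = {
    "A": [(100.0, 13)],
    "B": [(100.4, 263)],
    "C": [(100.9, 13)],
}
obscured = correlate_changes(busy, ["A", "B", "C"], window_s=2.0)
```

In the quiet case the 13 satoshi payment is traced across all three channels, while in the busy case the intermediate payment hides it, which is the kind of noise that many small MPP parts would add.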
SL: Especially with MPP and once you add all these different layers it seems quite low risk I guess. Christian, I’ve really enjoyed chatting with you. We’ve almost gone two hours at this point. I’ve definitely learned a lot and I’m sure SLP listeners will appreciate being able to learn from you today. For any listeners who want to find you online, where can they find you?
CD: I’m @cdecker on GitHub and @snyke on Twitter.
SL: Fantastic. I’ve really enjoyed chatting with you. Thank you for joining me.
CD: Thank you so much. Pleasure as always and keep on doing the good work.