Location: Monash Cybersecurity Seminars
ROAST paper: https://eprint.iacr.org/2022/550.pdf
ROAST blog post: https://medium.com/blockstream/roast-robust-asynchronous-schnorr-threshold-signatures-ddda55a07d1b
ROAST in Python: https://github.com/robot-dreams/roast
Welcome everyone to Monash Cybersecurity Seminars. Today we will be having Tim Ruffing. Tim is an applied cryptographer in the research team of Blockstream, a Bitcoin and blockchain technology company. He obtained his PhD from Saarland University researching cryptographic aspects of cryptocurrencies. His current research focus on advanced applications of Schnorr signatures, for example multi and threshold signatures. Tim also serves as co-maintainer of libsecp256k1, a cryptographic open source library used in Bitcoin Core. Today’s talk is about ROAST, robust asynchronous Schnorr threshold signatures. The stage is all yours Tim, you may start sharing your screen.
As you mentioned this talk is about ROAST, robust asynchronous Schnorr threshold signatures. This is joint work with a lot of nice co-authors, some of them are here and maybe helping to answer questions.
So the motivation for this work, the background, the context is that Schnorr signatures are available on Bitcoin. Bitcoin supports Schnorr signatures and in particular the corresponding Bitcoin Improvement Proposal (340) has been activated as part of the Taproot soft fork last November. The reasons to prefer Schnorr signatures over ECDSA which has been used in Bitcoin before and can still be used but it is kind of a legacy thing, Schnorr signatures have better provable security, they are more efficient but most importantly for us they enable easier constructions of advanced signing protocols. This is exactly what we will cover in this talk.
To give you more background on Schnorr signatures in Bitcoin, the vision here is to have a two layer thing: onchain the Bitcoin nodes will support just Schnorr signature verification, this is part of the consensus rules, the plain Schnorr signature verification algorithm. Then the idea here is that we can build many applications on top of this. For example users can perform threshold signatures in an offchain manner and they could also perform a multisignature protocol or blind signature protocol. As long as the output of these results offchain is a Schnorr signature that can be put on the blockchain and accepted by verifiers. This two layer approach is very nice because the onchain consensus layer is kept simple. The only thing it knows about are Schnorr public keys, Schnorr signatures and Schnorr signature verification. The more complex offchain protocols are hidden from verifiers. If you introduce new offchain protocols, you change them and so on, we never need to touch the onchain layer which is consensus and therefore hard to change. Another nice thing is the only thing that ends up onchain is just the Schnorr public key and the Schnorr signature. This is efficient and it is pretty compact no matter how many parties may be involved in these offchain computations here. What ends up onchain is just O(1) piece of data or two, the public key and the signature.
This talk is about threshold signatures. What are threshold signatures? Threshold signatures are t-of-n signatures. You have some n
nodes or n
signers and to create a signature you need a subset of t
of those n
signers. We have two basic properties here. First is unforgeability which means that if you have fewer than t
signers, maybe up to t-1
signers, even if they are malicious they shouldn’t be able to produce a signature. On the other hand we have the robustness property which means that t
honest signers can reliably produce a valid signature. This is a very informal definition. This talk is about robustness and we will see a more detailed definition later. For now this one suffices. If you have questions please feel free to interrupt during the talk or ask them later. Anything is fine.
Why do we want threshold signatures? There are a lot of nice applications. The most basic example of threshold signatures, often deployed now, is a 2-of-3 secure personal wallet where maybe you have some hardware wallets at home, 3 of them, and you split your signing key among the 3 devices and then you only need 2 to make a signature. You probably don’t have them at home, you can put them in different geographic locations and make sure that they are not all stolen or destroyed at the same time. When I say this is currently used in Bitcoin I mean in the way that is not as efficient as we are going to do now, threshold signatures that look like a single Schnorr signature, a trivial implementation of threshold signatures that is linear in size and available in Bitcoin right now. 2-of-3 would be the setup at home maybe but if you are a company or you are a Bitcoin exchange you probably want to be very careful. I read that the Bitfinex exchange uses a 3-of-6 setup as a cold wallet setup. Some of the wallets are in the control of different persons. At Blockstream we also have a pretty nice use case. This is our Liquid sidechain. Here we have a 11-of-15 setup, this is a sidechain, in a nutshell you can send Bitcoin from the Bitcoin chain to this sidechain and then transfer and transact with them on the sidechain, maybe more efficiently, more private and with other features. I won’t go into detail here but from the point of view of the Bitcoin system those coins are held by a federation. It is a federated sidechain and from the point of view of Bitcoin this is just an output that can be spent if 11 out of 15 agree. Here we could switch to more efficient threshold signatures and maybe even scale these numbers up. The numbers currently are a trade-off. You would want to have a very large federation to make sure the majority won’t be compromised but on the other hand you need to store the signatures and the public keys onchain. There is an efficiency and security trade-off. There are many more applications in use today.
To make our vision true that I mentioned on the previous slide, what we need is a threshold signature scheme where the threshold signatures look like ordinary Schnorr signatures. If all you have on the chain is the Schnorr verification algorithm and it accepts a public key, a signature and a message then we need to see how we can generate these things. The public key for the purpose of our talk will be an ordinary Schnorr public key, it looks like an ordinary Schnorr public key but it is obtained via an interactive distributed key generation algorithm run as a setup algorithm between the n
signers. The signature will be an ordinary Schnorr signature obtained via an interactive signing protocol involving at least t
honest signers. In this talk we pretty much focus only on the signing protocol. In a sense distributed key generation is necessary here but it is orthogonal to our work so I won’t really talk about it.
FROST paper: https://eprint.iacr.org/2020/852.pdf
If we want a nice Schnorr threshold signing protocol then the canonical choice nowadays is FROST. FROST has been suggested by Komlo and Goldberg in 2020 and it is a very nice 2 round threshold signature scheme. At its core it is very simple. Another feature of it is it really gets rid of all the complexity introduced by schemes in the past, it reduces the complexity to a very simple core. It has been proven unforgeable under concurrent signing sessions which will be helpful for us, assuming we have a suitable DKG protocol. The security actually has been discussed in multiple papers, at least three of them looked at the security of FROST and all proved unforgeability in some different flavors. We don’t talk about unforgeability in this talk so it is not really important for us.
Q - Should we always think from now it is about the signing protocol only? We include keygen or it is excluding it?
A - That is a very good question. We are always talking about the signing protocol, I think this is the last slide that even mentions the DKG. This is a talk only about signing.
Q - Starting from here everything is only about signing. If you say 2 round, it is really 2 round signing.
A - Right. We have the DKG in front and then if you want to sign a particular message then it is 2 rounds.
Another nice feature of FROST, it has no restrictions on t
with respect to n
. Earlier schemes often required that there exists a honest majority which would mean that t
is greater or equal to n/2
. In this case we don’t need this. FROST can work with any t
. You can select this arbitrarily which is very nice in practice.
It is unforgeable under concurrent signing sessions, we had this on the last slide. It has semi-interactive signing, this is the 2 round thing that I mentioned. There is a presignature round and there is a signature round. The nice thing about the presignature round is that it can be performed without knowing the message or any exact subset of t
signers that will later reply in the signature round. We will heavily make use of that property, you can do the presignature round with up to n
signers and only then select the exact t
signers that you are willing to continue with. The problem here is that FROST does not provide robustness, signing sessions can fail. This is the problem we are tackling in this work.
Our contribution is ROAST. What is ROAST? In a nutshell it does the following. Given a threshold signature scheme, for example FROST, it has certain properties. It is semi interactive, it provides identifiable aborts. I haven’t mentioned this. I will explain this in a minute. And it is unforgeable under concurrent signing sessions. Then ROAST is a wrapper that turns it into a threshold signing protocol that is robust and robust in asynchronous networks. There are some terms I haven’t explained on this slide. Identifiable aborts and asynchronous networks and so on. Let’s have a look at the setting we are considering here.
Our setup is always we have a set of n
signers and we have a coordinator which is a helper node in the network. The signers are connected only to this coordinator. They don’t need to talk to each other, they just talk to the coordinator. The coordinator is really just a helper node, it is not trusted for unforgeability, we just rely upon it for robustness, for creating a signature. In that sense it is not that much different from the network itself. We also rely on the network being available because if the network is down you can’t have robustness, you can’t produce a signature if you can’t talk to each other.
In this setting what does robustness mean? This is not a formal definition here but it is much more detailed than what I told you earlier. Consider a run of the signing protocol of ROAST with all n
signers. We assume that signers somehow agree on a message to sign. This is given to them from the outside and it is independent of our protocol. We have these n
signers and among those n
signers t
signers are honest and willing to sign. The remaining signers n-t
want to prevent the honest signers from signing. They are malicious or faulty. In this setting a signing protocol is robust if it is guaranteed to terminate with the coordinator outputting a valid signature. It is guaranteed to output a valid signature. Here I am requiring the coordinator to output the signature. This is just a simplification. If everybody is interested in the signature the coordinator can later just broadcast it. This is just to make the model a tiny bit simpler. The network we assume here is asynchronous. Network messages between honest parties arrive eventually but that is all we know. We don’t know any upper bound or there is no upper bound on the maximum message delay between honest parties. This also means we can’t do any timeouts or anything of this kind. This is very important I think in real networks such as the internet because this is just how things work. Synchronous networks are nice on paper and often useful when developing ideas but if you want to put ideas into practice this is what you want.
Enough about our goal, robustness. There is one more thing that I haven’t explained which is identifiable aborts. This is a property that FROST provides. What does it mean? In a sense it is a way to say “If the protocol fails then we can at least know who is responsible”. In this talk I will explain it slightly differently which is kind of equivalent but more helpful for this talk. Informally it means that if all received messages are valid then the protocol will terminate. This part of it is more like a correctness definition. Only honest signers will send valid messages. If everybody is honest then the protocol will work out, this is more or less correctness. Now if the coordinator receives a protocol message it can tell immediately whether the message is valid or not. This means if the coordinator gets an invalid message from a dishonest node or disruptive node the coordinator can tell immediately. What you would do in practice is kick him out but we will handle that differently, you’ll see it in a minute. Also we give a formal game based definition of this in the paper. Identifiable aborts are a standard property in multiparty computation but we couldn’t find a definition that fits exactly our setting so we give a formal game based definition for threshold signature schemes in the paper. The consequences of this second point here, the coordinator can immediately tell whether the message is valid or not, in theory it can ignore invalid messages. It doesn’t want to receive invalid messages, if someone sends an invalid message it can drop it and silently ignore it. That’s not what you would do in practice. In practice you would probably kick out the dishonest signer, this is what we do in the paper for protection against DoS attacks and so on. But for now in theory it is ok to just ignore the message. If we ignore invalid messages that basically means we can assume that all messages that we receive are valid. That is a pretty nice property to work with. In other words it basically means it suffices to consider crash faults. It is not exactly true but for the purpose of this talk it might be useful to think about it like this. Nodes can’t really send invalid messages, the only thing they can do is crash and never reply.
Setting the stage, what I want to show you here is first how FROST works and why it doesn’t provide robustness. Then we can see how we fix it. For the remainder of this talk we will introduce some conventions. If we have a slide that looks like this we always have a 2-of-4 example. t
is always 2
and n
is always 4
. The choice is pretty arbitrarily, it just fits nicely on the slides. When I say “we” I typically mean the coordinator. Most of the protocol logic is in the coordinator, that’s why it is called a coordinator. We always think from the point of view of the coordinator in this talk. We also have the convention that all messages we receive are valid, this is because of the identifiable aborts property I mentioned earlier. And whenever I say “session” I mean “FROST session”, I don’t mean “ROAST session”, that is sometimes confusing. I wanted to clarify here.
We can look at what a FROST session actually looks like. A nice thing about our work is that I don’t need to really go into the cryptographic details. You will never see a formula in this talk that explains how FROST works because ROAST will work independently, we will only talk about message flows, this is more of a distributed systems work in a sense than a cryptographic work even though we realize a cryptographic protocol at the end. So how does a FROST session work? At the beginning the signers send presignature shares, this is the presignature round. Maybe the signer in the top left will send a presignature share, what the coordinator will do is wait for t
presignature shares. This signer here sends a presignature share and now we have t
presignature shares, the coordinator has t
presignature shares. What it will do now is say “I’ve got presignature shares from those two signers. Let’s ask those two signers for signature shares.” This is the point in the protocol where the coordinator commits on the set of t
signers. It could maybe wait for more and then select another subset but at some point it needs to commit and say “Ok we are going to ask exactly those two signers for real signature shares”. We can’t change the subset, this is a property of FROST. Whenever we put two signers in a session we mark them in blue. The coordinator will now take these two presignature shares, will aggregate them into a full presignature and will send this full presignature back to the two signers. This is basically a request for a signature share. Now the signers are supposed to reply with their signature shares. Maybe this signer replies with a signature share, maybe now that signer here is slow and sends a presignature share because it was supposed to do this, at this point we’ll ignore it because we have already decided on the set of t
signers which are the blue nodes on the left. We can’t make use of that presignature share so we’ll ignore it. Now maybe we get the second signature share and this is all we need. The coordinator can combine the signature shares together with the presignature and output the full Schnorr signature. At that point the protocol is complete. This is what it is supposed to look like if everything works out.
I said that FROST is not robust. Let’s go a step back and see what can fail. It is literally a step back. Assume we have received a signature share from this node in the top left but we haven’t received a signature share from this signer here. We don’t really know what to do. It is an asynchronous network, we can never tell whether this signer is actually there or not, or malicious or crashed or just slow. It could be crashed, if it is crashed we will basically wait forever because we can’t really continue. In the case that it really never replies we can’t make progress here and we’ll wait forever. This means FROST alone is not robust.
There is a trivial solution in a sense. Instead of running the protocol with one subset of size t
just run a FROST session for every subset of size t
. In a 2-of-4 this corresponds to 6 choices of 2 signers if I did the math right. In such a case we can maybe run 6 sessions in parallel, it is still feasible. But in general that needs n
choose t
FROST sessions and it really won’t scale. We had the example of 4
choose 2
, that’s ok. Previously we had the example of a 11-of-15 setup and then we have 1365 sessions. If you have the bandwidth, it is not elegant but in some settings you may be able to do it. But if you want to scale this to larger sizes no luck. A 67-of-100 setup and the number of sessions would be too large. We need a better solution.
Thinking about a better solution, let’s go back to this situation where we couldn’t really continue and try to derive some observations here. One observation, this one FROST session here is still running but there is really no need to abort this session. We never know if the node here actually has crashed or not. It might still reply. We could try to kick it out and abort the session but it is not really helpful. We can just keep the session running and hope it replies. But on the other hand if we want to start new sessions then this pending signer here should be excluded from further sessions. It is not really reliable, it wouldn’t be clever to start a new session and again put this pending signer here in the new session. But we need to start further sessions to ensure progress. This is simply because we have observed in the current session we are not guaranteed to make progress. The only thing we can do is start more sessions and hope they progress. The last observation is we may be able to start the session immediately. In our example we saw that this south east signer here was slow sending the presignature share. In this position we already have one signer available for the next session, we just need a second signer. If we had a fresh presignature share from this signer then we would already have two signers available. We could start a new session immediately. These are some nice observations. Now let’s try to turn these observations into a plan.
The first observation was no need to abort the session. We make a plan to always keep sessions open. The second observation was that if there was a signer pending then we shouldn’t try to put it into new sessions. We maintain a set of responsive nodes where responsive is exactly defined as not pending. In the end it is equivalent, we divide the set of n
signers into pending nodes and responsive nodes. This is a full partition of the set. We maintain a set of responsive nodes that are not pending and we will make sure that we only put responsive nodes in new sessions. And we want to start new sessions when we want to make progress. The simplest thing we can do whenever there are t
responsive nodes available, we put them in a new FROST session. Let me remind you when I say “we” I am always talking from the point of view of the coordinator. The last observation was that this piggybacking makes sense. Whenever we let the signer send a signature share we always let it include a presignature share for potential future sessions. Maybe there will be further sessions required and then we already have a presignature share available.
Signers
Upon init - send presignature share
Upon receive presignature - send signature share for presignature
Coordinator
Upon init - mark all signers not responsive
Upon receive initial presignature share - mark sender responsive
Upon receive signature share for some session and fresh presignature share. If session has t
signature shares compute and output signature. Mark sender responsive. If we have t
responsive signers put them in a session, send presignature, mark them unresponsive.
We can now translate this into pseudocode. The nice thing about ROAST is that the pseudocode is so simple that it fits on a slide. From the point of view of the signers, when they are initialized the only things they do is send a presignature share. When they then receive the full presignature from the coordinator and are supposed to send the signature share, they do this, they take the message, run their signing algorithm, create the signature share for this presignature and send it back to the coordinator. As I said it will fit on a slide here, I am cheating here. Let’s look at the coordinator. What does the coordinator do? At initialization the only thing it does, it marks all signers as not responsive. Why not responsive? Because at the beginning they are pending. They are supposed to send the presignature shares. They are pending which is the definition of not responsive. Then when we receive an initial presignature share from any of the signers we mark it as responsive. Of course we store the presignature share. When we receive a signature share for some session, this is the second round message, together with the piggybacked fresh signature share… This signer was in some FROST session so now if the session has t
signature shares we compute the signature and we are done. And if not we go ahead and we mark the sender as responsive, it is not pending anymore, it has fulfilled its obligation and sent the signature share. Now we check if we have t
responsive signers. If we have t
responsive signers we put them in a FROST session and start a new session. Send the presignature back to them, wait for signature shares and mark them as unresponsive again. If at this stage we don’t have t
responsive signers we just wait for more replies. This is pretty much pseudo code and maybe hard to follow. Let’s work through this with a concrete example execution.
Again we have our 4 signers, we are looking at the same basic scenario that we looked at with FROST just now. We have our 4 signers and as the coordinator we need to keep this responsive set. We need to have a distinction whether a signer is pending or responsive. My indicator on the slide is this clock symbol here. If there is a clock this means the signer is pending and slow. It is supposed to send a message. At the beginning everybody is supposed to send a message because they need to send their presignature shares. Let’s say this node here sends a presignature share, it is not pending anymore. Then maybe this node here sends a presignature share. It is then also not pending anymore and now we already have 2 responsive signers and so as a coordinator we can put them into a session. I again mark them blue but because we are starting multiple sessions I also give the sessions a number. This is session number 1, those signers belong to session number 1. Now the coordinator will aggregate the presignature shares into a full presignature and send the presignature back to the signers in this particular session. This now means it is the signer’s turn again. They need to reply, this is why we mark them as pending again. Let’s go ahead. This signer replies with a signature share, this is great. It is not pending anymore and also you can remove it from the session. It has done everything it needs to in the session, the session is basically done from the point of view of the signer. As in our previous example there is a presignature share from this white signer here, we receive it, this means this signer is not pending anymore because it has sent a presignature share. We didn’t draw another arrow here but we piggyback presignature shares for future sessions on other replies. Also this signer here has already sent a presignature share for further sessions. We have 2 nodes that are not pending, that are responsive. You can see it from the symbols here. What we will do as the coordinator is go ahead, put them in a session, now it is session 2. This means that they are now pending again. We will also send them the full presignatures back. Finally this very slow signer here in the top right now responds with the presignature share. This means this signer is not pending anymore. Let’s say as the next step this signer replies with the signature share, this signer is done with session 2, it has replied with the second round message, so you can remove it from the session. And it is not pending anymore. We have 2 signers which are responsive, not pending. We can again start a session, session 3 and then they are pending again. The interesting thing about this position now is that this is a winning position. It is easy to see from here on the protocol will always terminate. Why is that? By case analysis on the next message that will arrive, if this signer here in the bottom left replies with a signature share then session 1 will terminate. This signer will only sign a pending in session 1. If this signer replies then session 1 is done. The same is true for this signer in session 2. If this signer replies then session 2 is done. If the other 2 signers reply then session 3 will be done. No matter who sends next we will make progress and we will finally terminate assuming that 2 of those signers will eventually reply. This is necessary because we have a 2-of-4. If more than 2 crash then we are out of luck. Assuming 2 signers will respond this session will terminate.
What makes this work is a key invariant that we try to uphold here, the invariant here is that every signer is pending in at most one session. You can easily check this is true. Just look at the number on the signer and this is the session in which it is pending. We will see how this is useful in proving robustness now.
In the paper we have a robustness theorem, here is an informal version. Robustness basically says the coordinator outputs a valid signature after initiating at most n-t+1
signing sessions of the underlying threshold signature protocol where the underlying threshold signature protocol is FROST in this case. Why is that true? The idea is that we have seen in the previous slide that every signer is pending in at most one session. This immediately implies that every signer can hold up at most one session. If every signer can hold up at most one session then n-t
faulty or crashed signers can hold up at most n-t
sessions. When I say faulty it is a synonym with crashed and even malicious. I explained that we only need to look at crash failures, it is basically the same. Again the idea is n-t
faulty signers can hold up at most n-t
sessions. That means if we start n-t+1
sessions then 1 session can’t be held up and will terminate.
(ROAST terminates after 2(n-t) + 3 = O(n-t)
)
In our paper we also look at the round complexity. ROAST terminates after a linear number of asynchronous rounds. If you don’t know what an asynchronous round is, in the asynchronous setting we can’t really talk about traditional rounds as you would in the… setting where time is divided in rounds. This translates to maximal message delays. Assume that suddenly we have a bound on the maximum message delay between 2 honest nodes, which in our setting is always between the signer and the coordinator. Say we have some delta here then we know that the protocol will terminate after delta times this number here. Only looking at the network and ignoring computation. There is some additional time you need, if you receive a message you need to do some computation before you reply. In our case in the realistic setting such as the internet, the big part of the running time here is the network and not the computation. This number of asynchronous rounds immediately gives us a bound on the running time if we have a maximum bound on the message delay. Of course if we don’t have a maximum bound on the message delay, we really can’t assume an asynchronous network, then I can’t say anything about the running time, it could be unbounded.
So much for theoretical performance. We also did an experimental evaluation. We implemented the protocol in Python, we wrote a prototype. For the elliptic curve operations that we need for FROST, we implemented those using the fastecdsa Python library. This is a Python library but it calls the C library GMP under the hood. It is slightly optimized but the entire prototype is not really optimized, it is in Python, there are a lot of things we could do here. If you want to try to improve performance then we also need to look at what the attacker does, we need to simulate the attacker somehow. In our experiment what we did is we had some fraction of faulty signers, the number f
here, between 0
and n-t
. If it is more than n-t
then we couldn’t terminate so this is the range in which f
could be. We run the experiment for several values of f
. What those faulty signers do is crash after the presignature round. They are not particularly clever but on the other hand what could they do instead of crashing? I mentioned this in the identifiable aborts slide, they could send invalid messages but the coordinator wouldn’t care much, it would just ignore the message and kick them out. It doesn’t really make a difference if they send invalid messages or just not reply at all. The only larger assumption here is that faults are not coordinated. Every attacker is on its own and they don’t talk to each other, there is no omnipotent attacker that coordinates when signers go offline and when they crash. We could have run this experiment too but we felt we already have a nice bound from the theoretical model so we wanted to see what happens in practice. In practice it could happen that all malicious signers collude but it is maybe not the standard. The standard we’d expect is maybe there are some malicious nodes and some are just offline or slow. Our network setup is we have a coordinator on a server in San Francisco and we have all signers on a single server in Munich. This sounds a little bit strange but it really simplifies the setup and it is not really a problem here. The signers don’t talk to each other, they only talk to the coordinator. This means that all signers have the same round trip time to the coordinator which is maybe slightly artificial but it still makes for a good experiment. The round trip we have here between the coordinators and the signers was 158 milliseconds which we just measured. I will show you the results but let me say you can find our code and our benchmark data on GitHub.
When you are looking at the running time this is what we got. We run this first of all with different setups, different numbers of t
and n
from very small 3-of-5 to very high 67-of-100 and this is the round trip time for reference. On the x axis we have this fraction f/(n-t)
of faulty signers. We have the worst case corruption scenario, acknowledging that the corruptions are not coordinated, this is the worst case scenario in terms of our setting and this is the best case scenario. Here we just plot the running time in seconds. The interesting part is that in practice if you have these uncoordinated failures then the running time is pretty much constant here in the fraction of faulty signers, if you look at those graphs. This is 3 seconds. Even in this setting, 67-of-100 setup which means the remaining 33 are faulty and are faulty and will crash at some point, crash after sending the first presignature share, even in this setting we terminate after 3 seconds. We believe that this still can be optimized. You can ask “Why does this happen?” The intuitive explanation here is that the network introduces some implicit randomness into the protocol. The coordinator kind of converges its decision to select sessions towards the signers that reply quickly and are very responsive. This ensures that we have this nice graph here. This was surprising to us, we didn’t really expect that it would go flat in the end here.
There is a final piece I want to talk about. This is getting rid of the coordinator. We always assume this coordinator, it is a useful abstraction to make, but it may be problematic. You need this coordinator node and it is relied upon for robustness. In the beginning I said it is a little bit like the network, we also rely on network components, but I also need to admit that it is a little bit different. If parts of the network fail then maybe some signers go offline and can’t see the other signers. But you could still make progress in this setting. For the coordinator it is a little bit different. There’s a single node, if that node goes down we are out of luck. What can we do here? We can just get rid of it by replication. The idea is we know that there are at most n-t
faulty signers. What we can do, this is very similar to the robustness argument of a single ROAST run, is take n-t+1
signers, let them play the role of the coordinator in n-t+1
parallel ROAST runs. Now we know that among those n-t+1
signers there will by the pigeon hole principle be at least one honest signer. This one honest signer will play a honest, reliable coordinator in its run. Then the corresponding run will be guaranteed to terminate. It is a very simple trick, of course it has this linear blowup, but it works. It does the job. You can get rid of the coordinator.
To sum up, ROAST is a wrapper protocol, what it does is given a threshold signature scheme that is semi interactive, that provides identifiable aborts and is unforgeable under concurrent signing sessions, our contribution ROAST turns it into a threshold signing protocol which is robust and robust in asynchronous settings. That’s all I have to say, I am happy to take your questions.
Q - Do you assume the coordinator is honest?
A - Kind of but really only partially. The answer is more no than yes. Let me explain. For unforgeability we assume nothing about the coordinator. The coordinator can be arbitrarily malicious and still it can’t break unforgeability. What we need the coordinator for is to make progress. It coordinates the session and also acts as a broadcast mechanism in a sense. We rely upon it for robustness and in that sense you could call it trusted, it is supposed to run the honest algorithm. But it is not much different from the network component. We also trust the router to take the message from one end and put it on the other side of the wire. Really the only difference in the reliability assumption as opposed to the network component is typically network failures are local. Maybe some nodes are offline, some are not and you could continue. If you have a single coordinator then if the coordinator fails then of course you can’t make progress because signers can’t talk to each other. Then again we can fix this by getting rid of the coordinator essentially by replicating the coordinator among many signers. If you ask me I wouldn’t say we trust it. It is just a component that has to fulfil its job.
Q - If the presigning is not message independent but we just want signers to sign the same message does that still work? I think you mentioned that you require message independence for the presigning stage.
A - I don’t require it. It is just a property that FROST has but it is not crucial for ROAST. The presignature round or the preprocessing round here is independent of two things in FROST. It is independent of the message, you can run it without knowing the message, and you can run it without knowing which exact subset of t
signers you will use in the signing round. This latter part is what we rely upon. We need this independence in the set of signers that will continue the session. We don’t really need the independence of the message though it helps. I haven’t mentioned this in the talk but you could make an optimization. You could start a ROAST run without even a session available relying on that property of FROST. Then all n
signers will send their presignature shares. As soon as a message arrives you can immediately put them into sessions. This is really just a tiny optimization that saves you one round trip in the beginning. It is not crucial. The crucial part is that the first round is independent of the set of signers that will reply in the second round.
Q - Is there any ECDSA threshold signature algorithm that has the required properties to make ROAST work?
A - Unfortunately not. When I say ROAST is a wrapper that turns a threshold signature scheme with those properties into a robust signing protocol, this is true. But so far the only protocol we’ve found that really matches these properties is FROST in the Schnorr setting. This is formally not exactly true because in a sense 1 round BLS signatures also fulfil this property but they are already robust. You don’t need ROAST, it would be useless to apply ROAST to this setting. We are actively looking for other applications. If you have a ECDSA protocol that makes this work I’d be very interested. Even any other protocol. The way that I sold this is threshold signing but you could also imagine a similar protocol for any other threshold. Maybe threshold decryption. If it has this specific property that in the first round you can run it with all n
participants but then at some point you need to decide on an exact subset of t
and only then you can continue. That is what we exploit in ROAST but also what we fix. In FROST you have to commit to these t
signers and ROAST gives you a way to handle this.
Q - Identifiable aborts, is it something that is only satisfied by FROST? Any other signature scheme that satisfies identifiable abort by itself?
A - Identifiable aborts is a pretty natural condition at least for discrete logarithm based signatures. I haven’t shown you what a Schnorr signature looks like, I tried to avoid this because it is really not important for the talk. A Schnorr signature has two parts. The presignature R
, sometimes called the nonce, and the signature s
. All these protocols, the threshold signature schemes or the multisignature schemes work by exploiting the linearity of Schnorr signatures. The equation is linear in R
so you can split the nonce into multiple parts additively and you can do the same for the signature part s
. Then you can check a partial signature. For every signature share you can get from a signer you can do partial verification and check if it is valid for the presignature share that the same party sent. If all signature shares verify partially then you can add them up and you know the signature will verify. For Schnorr signatures it is a very natural property. In the first round the only thing you are supposed to do is send a group element, you can’t really cheat there. You can send an invalid group element but it would recognize it immediately. As soon as you send the second round thing, the signature share, I can immediately verify that it matches the presignature share. At least for Schnorr it is very common. If you look at the original FROST paper they mention this but they don’t prove it formally. If you look at it it is not very hard to see.
Q - Was it implicit in the theorem of robustness?
A - In a sense it is used here (identifiable aborts). The coordinator can simply ignore invalid messages. It is not exactly what it does in practice because in practice if it really receives an invalid message then it has proved some signer is malicious and will kick it out forever. It is never the case that you receive a message and you can’t tell whether it is valid and you need to receive some further messages to see whether this node is honest or not. You know immediately.
Q - It is contributing to the fact that every message received is valid. That will contribute to the fact all the sessions created will eventually finish. All the messages are valid that you are receiving.
A - Whenever you end a session you know there is no hidden invalidity in the session that will only appear later. This is the core idea. Another aspect that I haven’t mentioned here that is also used in ROAST, identifiable aborts also means some non frame-ability property. The attacker can’t create a scenario where one honest signer thinks the other honest signer is malicious and will kick him out. I haven’t mentioned this on the slides but this is another aspect.
Q - On performance, you showed the performance of ROAST, how is it compared to FROST? I know it is probably not apples to apples but that constant time with respect to the fraction of failed nodes. Is it the same?
A - If you look at this graph here FROST really corresponds to this scenario here. In FROST we don’t have robustness so we can’t really talk about fraction of faulty signers. If there is a faulty signer you wouldn’t terminate at all. We can’t really talk about this. It is apples and oranges. If everybody is honest ROAST might start multiple sessions but it starts at least FROST session in the beginning and this session will terminate because everybody is honest. It basically boils down to needing only one session of FROST. This happens in this scenario where no one is malicious. This is the base case in a sense for this large setup where we just run a FROST session that works. Here we have the bad case. Of course again this is assuming faults are not coordinated. If faults are perfectly coordinated, the overhead of ROAST in the worst case is n-t+1
, it is multiplicative overhead because this is exactly the number of underlying FROST sessions you need to abort.
Q - On the storage size of the coordinator, have you evaluated statistically? The memory?
A - If you look at the paper the algorithm is pretty much optimized for this. We optimize it because we could. Not sure if it is useful in practice. It might depend on what you use for hardware. If you use a real server it is not a problem but maybe you could run it on a router even. We really try to make sure that it doesn’t need to store too many things. It is not even mentioned in the paper because it doesn’t fit our generic syntax of a threshold signature scheme but would be compatible with FROST, I think it is O(number of sessions) plus O(number of nodes) and the number of sessions is number of nodes so it is O(number of nodes). If you receive a presignature for example you can accumulate the full presignature because it is just addition in the EC group. You start with the point at infinity, you receive the first presignature share, you add it to the accumulator, you receive the second presignature, you add it to the accumulator. It is even more efficient than I said on the slide. On the slide I said “First wait for t
presignatures and then aggregate them”. But you can do this in constant memory by just keeping an accumulator. The same is true for the signature shares when building the final signature. That is why it is constant storage per session.
Q - It is really O(n) in that sense.
A - I think it is O(n), maybe now if I look at the paper there is some tiny O(n^2) part. In practice it is pretty much O(n)
Community-maintained archive to unlocking knowledge from technical bitcoin transcripts