Commit Graph

15 Commits

Author SHA1 Message Date
Rusty Russell a3c4908f4a lightningd: don't explicitly tell connectd to disconnect, have it do it on sending error/warning.
Connectd already does this when we *receive* an error or warning, but
now do it on send.  This causes some slight behavior change: we don't
disconnect when we close a channel, for example (our behaviour here
has been inconsistent across versions, depending on the code).

When connectd is told to disconnect, it now does so immediately, and
doesn't wait for subds to drain etc.  That simplifies the manual
disconnect case, which now cleans up as it would from any other
disconnection when connectd says it's disconnected.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-07-18 20:50:04 -05:00
Rusty Russell 719d1384d1 connectd: give connections a chance to drain when lightningd says to disconnect, or peer disconnects.
We want to avoid lost messages in the common cases.

This generalizes our drain code, by giving the subds each 5 seconds to
close themselves, but continue to allow them to send us traffic (if
peer is still connected) and continue to send them traffic.

We continue to send traffic *out* to the peer (if it's still
connected), until all subds are gone.  We still have a 5 second timer
to close the connection to peer.

On reconnects, we don't do this "drain period" on reconnects: we kill
immediately.

We fix up one test which was looking for the "disconnect" message
explicitly.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-07-18 20:50:04 -05:00
Rusty Russell 41b379ed89 lightningd: hand fds to connectd, not receive them from connectd.
Before this patch:
1. connectd says it's connected (peer_connected)
2. we tell connectd we want to talk about each channel (peer_make_active)
3. connectd gives us an fd for each channel, and we connect it to a subd (peer_active)
4. OR, connectd says it sent something about a channel we didn't tell it about, with an fd (peer_active)

Now:
1. connectd says it's connected (peer_connected)
2. we start all appropriate subds and tell connectd to what channels/fds (peer_connect_subd).
3. if connectd says it sent something about a channel we didn't tell it about, we either tell
   it to hang up (peer_final_msg), or connect a new opening daemon (peer_connect_subd).

This is the minimal-size patch, which is why we create socket pairs in
so many places to use the existing functions.  Many cleanups are
possible, since the new flow is so simple.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-07-18 20:50:04 -05:00
Rusty Russell 8678c5efb3 connectd: release peer soon as lightingd tells us.
Now we have separate peer draining logic, we can simply use it when
connectd tells us to release the peer, without waiting.  (We could
simply free the peer, but that's a bit rude, as messages can get
lost).

This removes various complex flags and logic we had before.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: `connectd`: various crashes and issues fixed by simplification and rewrite.
2022-07-18 20:50:04 -05:00
Rusty Russell 2424b7dea8 connectd: hold peer until we're interested.
Either because lightningd tells us it wants to talk, or because the peer
says something about a channel.

We also introduce a behavior change: we disconnect after a failed open.
We might want to modify this later, but we it's a side-effect of openingd
not holding onto idle connections.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-03-23 13:20:12 +10:30
Rusty Russell d4fee837c2 misc: clarifications from cdecker review.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-02-08 11:15:52 +10:30
Rusty Russell 1c71c9849b connectd: handle custom messages.
This is neater than what we had before, and slightly more general.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: JSON_RPC: `sendcustommsg` now works with any connected peer, even when shutting down a channel.
2022-02-08 11:15:52 +10:30
Rusty Russell 960e911986 connectd: do io logging properly for msgs we make.
We don't need to log msgs from subds, but we do our own, and we weren't.

1. Rename queue_peer_msg to inject_peer_msg for clarity, make it do logging
2. In the one place where we're relaying, call msg_queue() directly.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-02-08 11:15:52 +10:30
Rusty Russell 50eccb6a12 connectd: handle pings and pongs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: JSON_RPC: `ping` now works with connected peers, even without a channel.
2022-02-08 11:15:52 +10:30
Rusty Russell d29795a198 connectd: don't just close to peer, but use shutdown().
We would lose packets sometimes due to this previously, but it
doesn't happen over localhost so our tests didn't notice.  However,
now we have connectd being sole thing talking to peers, we can do
a more elegant shutdown, which should fix closing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: Always flush sockets to increase chance that final message get to peer (esp. error packets).
2022-01-20 15:24:06 +10:30
Rusty Russell 029d65cf2e connectd: serve gossip_store file for the peer.
We actually intercept the gossip_timestamp_filter, so the gossip_store
mechanism inside the per-peer daemon never kicks off for normal connections.

The gossipwith tool doesn't set OPT_GOSSIP_QUERIES, so it gets both, but
that only effects one place.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-01-20 15:24:06 +10:30
Rusty Russell e37a638c0c connectd: do nagle by packet type.
channeld can't do it any more: it's using local sockets.  Connectd
can do it, and simply does it by type.

Amazingly, on my machine the timing change *always* caused
test_channel_receivable() to fail, due to a latent race.

Includes feedback from @cdecker.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-01-20 15:24:06 +10:30
Rusty Russell 9c0bb444b7 per_peer_state: remove struct crypto_state
Now that connectd does the crypto, no need to hand around crypto_state.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-01-20 15:24:06 +10:30
Rusty Russell a2b3d335bb connectd: do decryption for peers.
We temporarily hack to sync_crypto_write/sync_crypto_read functions to
not do any crypto, and do it all in connectd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-01-20 15:24:06 +10:30
Rusty Russell e683649004 connectd: maintain connection with peer, shuffle data.
Instead of passing the incoming socket to lightningd for the
subdaemon, create a new one and simply shuffle data between them,
keeping connectd in the loop.

For the moment, we don't decrypt at all, just shuffle.  This means our
buffer code is kind of a hack, but that goes away once we start
actually decrypting and understanding message boundaries.

This implementation is naive: it closes the socket to the local daemon
as soon as the peer closes the socket to us.  This is fixed in a
successive patch series (along with many other similar issues).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-01-20 15:24:06 +10:30