Commit Graph

56 Commits

Author SHA1 Message Date
niftynei 9b8909e507 dual-fund: keep track of aborted requests, seamlessly restart daemon
Clean restart of daemon after a tx-abort is a nice way to work around
the 'persistent' disconnect that we t-bast noticed.

Changelog-Fixed: `dualopend`: Fix behavior for tx-aborts. No longer hangs, appropriately continues re-init of RBF requests without reconnction msg exchange.
2023-07-30 15:20:04 +09:30
Rusty Russell c074fe050f lightningd/log: clean up nomenclature.
`struct log` becomes `struct logger`, and the member which points to the
`struct log_book` becomes `->log_book` not `->lr`.

Also, we don't need to keep the log_book in struct plugin, since it has
access to ld's log_book.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2023-07-19 19:13:57 +09:30
Rusty Russell 375215a141 lightningd: more graceful shutdown.
Be more graceful in shutting down: this should fix the issue where
bookkeeper gets upset that its commands are rejected during shutdown,
and generally make things more graceful.

1. Stop any new RPC connections.
2. Stop any per-peer daemons (channeld, etc).
3. Shut down plugins.
4. Stop all existing RPC connections.
5. Stop global daemons.
6. Free up peer, chanen HTLC datastructures.
7. Close database.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Plugins: RPC operations are now still available during shutdown.
2022-09-12 14:00:41 +02:00
Rusty Russell 5c949e3116 subd: make channel/peer own the subd.
We get some memleak reports because ld owns the subd, but once
the peer/channel is freed, there's no reference for the brief time
until the subd exits.

This happens for both opening and closingd.  For openingd, the
peer owns it, for others (including dualopend) the channel owns it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-03-30 06:27:52 +10:30
Rusty Russell 00bb6f07d7 lightningd: simplify memleak code.
Instead of doing this weird chaining, just call them all at once and
use a reference counter.

To make it simpler, we return the subd_req so we can hang a destructor
off it which decrements after the request is complete.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-03-10 09:40:09 +10:30
Rusty Russell 0c24334738 lightningd: clean up subds before freeing HTLCs.
Otherwise we get weird effects, as htlcs are being freed:

```
2022-01-26T05:07:37.8774610Z lightningd-1: 2022-01-26T04:47:48.770Z DEBUG   030eeb52087b9dbb27b7aec79ca5249369f6ce7b20a5684ce38d9f4595a21c2fda-chan#8: Failing HTLC 18446744073709551615 due to peer death
2022-01-26T05:07:37.8775287Z lightningd-1: 2022-01-26T04:47:48.770Z **BROKEN** 030eeb52087b9dbb27b7aec79ca5249369f6ce7b20a5684ce38d9f4595a21c2fda-chan#8: Neither origin nor in?
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-02-08 11:15:52 +10:30
Rusty Russell 4a4f85dd3f subd: fix waitpid properly.
lightningd would race with the subd destructor to do the waitpid(),
resulting in UNUSUAL log messages, but also us missing if a plugin
was killed via a signal.

We can also get rid of the gratuitous waitpid() in test_subdaemons.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2022-01-25 06:26:52 +10:30
Rusty Russell 741f44725a patch lightningd-peer-fds.patch 2022-01-20 15:24:06 +10:30
Rusty Russell 90b669857e lightningd: handle channel cleanups more explicitly.
1. Freeing an unconfirmed channel already releases the subd, so don't
   do that explicitly.
2. Use channel->owner to transfer ownership where possible, using
   channel_set_owner() which handles all the cases.

This simplifies the code and makes it more readable, IMHO.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-12-30 09:50:40 +10:30
Rusty Russell f97a51cc0f lightningd: don't send other messages until we've received version.
This avoids subdaemons complaining about malformed messages from us,
or doing the completely wrong thing, if they are really the wrong
version.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2021-04-24 13:56:58 +09:30
niftynei 4c9a4250bf subd: remove "swap" methods
only needed for moving the subd->channel from an uncommitted_channel to
a channel; we removed uncommitted_channel from dual_open so it's no
longer necessary
2021-03-03 16:19:04 -06:00
niftynei de3599e98a subd: remove ctype (channel_type)
We only needed the type check for dual_open, since it was the only
subdaemon path that used two 'types' in the subd->channel field.
2021-03-03 16:19:04 -06:00
Rusty Russell d14e273b04 common: treat all "all-channels" errors as if they were warnings.
This is in line with the warnings draft, where all-zeroes in a
channel_id is no longer special (i.e. it will be ignored).

But gossipd would send these if it got upset with us, so it's best
practice to ignore them for now anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: we treat error messages from peer which refer to "all channels" as warnings, not errors.
2021-02-04 12:02:52 +10:30
niftynei bf49bcfa90 subd: keep track of 'channel's type
Back in the days before dual-funding, the `channel` struct on subd was
only every one type per daemon (either struct channel or struct
uncommitted_channel)

The RBF requirement on dualopend means that dualopend's channel,
however, can now be two different things -- either channel or
uncommitted_channel.

To track the difference/disambiguate, we now track the channel type on a
flag on the subd. It gets updated when we swap out the channel.
2021-01-10 13:44:04 +01:00
niftynei c8aa6d4a55 subd: swap out the channel + error callback
dual funding now swaps out the subdaemon's 'channel' struct in the
middle of daemon existence, so we update the channel and error callback
here.
2021-01-10 13:44:04 +01:00
Rusty Russell 4fa7b30836 lightningd: have optional node_id associated with subdaemons.
We'll use this for logging it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-11-18 04:50:22 +00:00
Rusty Russell dd79813a75 common: add peer_error flag to treat this error as "soft".
The spec says to close the channel if they send us an error, but we
need to be more lenient to preserve channels with other
implementations.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-07-26 03:53:03 +00:00
Rusty Russell 38d2899fbb common/per_per_state: generalize lightningd/peer_comm Part 1
Encapsulating the peer state was a win for lightningd; not surprisingly,
it's even more of a win for the other daemons, especially as we want
to add a little gossip information.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-06-04 01:29:39 +00:00
ZmnSCPxj 37440e9447 lightningd/subd.c: Return NULL from subd_shutdown.
And set pointers to shut down daemons as NULL in lightningd.
2019-05-31 15:01:58 +02:00
Rusty Russell eaac0d7105 lightningd: group crypto_state and fds into a convenient structure.
These are always handed to subdaemons as a set, so group them.  This makes
it easier to add an fd (in the next patch).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-13 05:16:18 +00:00
Rusty Russell 6323cc1898 plugins: allow --dev-debugger=<pluginname>.
This currently just invokes GDB, but we could generalize it (though
pdb doesn't allow attaching to a running process, other python
debuggers seem to).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-12-10 00:00:50 +00:00
Rusty Russell 3e2dea221b common/msg_queue: make it a tal object.
This way there's no need for a context pointer, and freeing a msg_queue
frees its contents, as expected.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-10-29 04:06:16 +00:00
Rusty Russell da9d92960d lightningd: accept hsmstatus_client_bad_request messages (and log!)
We currently just ignore them.  This is one reason the hsm (in some places)
explicitly calls log_broken so we get some idea.

This was the only subdaemon which had a NULL msgcb and msgname, so eliminate
those checks in subd.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-20 09:49:39 +02:00
Rusty Russell 76f116daf1 lightningd: minor cleanups
Code changes:
1. Expose daemon_poll() so lightningd can call it directly, which avoids us
   having store a global and document it.
2. Remove the (undocumented, unused, forgotten) --rpc-file="" option to disable
   JSON RPC.
3. Move the ickiness of finding the executable path into subd.c, so it doesn't
   distract from lightningd.c overview.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-09-03 05:01:40 +00:00
Rusty Russell 28c3706f87 hsmd: fix missing status messages.
I crashed the HSMD, and it gave no output at all.  That's because we
were only reading the status fd when we were waiting for a reply.

Fix this by using a separate request fd and status fd, which also means
that hsm_sync_read() is no longer required.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-07-17 12:32:00 +02:00
Rusty Russell 1e282ecb7a subd: record which ones connect to a peer.
This comes in useful for the next patch.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell ab9d9ef3b8 gossipd: drain fd instead of passing around gossip index.
(This was sitting in my gossip-enchancement patch queue, but it simplifies
this set too, so I moved it here).

In 94711969f we added an explicit gossip_index so when gossipd gets
peers back from other daemons, it knows what gossip it has sent (since
gossipd can send gossip after the other daemon is already complete).

This solution is insufficient for the more general case where gossipd
wants to send other messages reliably, so replace it with the other
solution: have gossipd drain the "gossip fd" which the daemon returns.

This turns out to be quite simple, and is probably how I should have
done it originally :(

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
ZmnSCPxj 74f3662a3b lightningd/subd.h: Add missing wire/wire.h.
If not included, a source file containing only
`#include<lightningd/subd.h>` will file compilation.
2018-03-26 01:09:59 +00:00
Rusty Russell 26b004e5af subd: handle status_peer_billboard messages from subdaemons.
We use a callback which updates the appropriate slot.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-02-23 18:02:00 +01:00
practicalswift 6bacab5e87 Fix typos 2018-02-20 13:05:51 +01:00
Rusty Russell 02d469b3d4 peer_failed: hand fds back to master when we fail.
master now hands it back to gossipd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-02-19 02:56:51 +00:00
Rusty Russell d2f691b288 subd: make functions more generic, don't assume 'struct channel'.
This means the caller needs to supply an explicit log to base the
subd log on, and also a callback for error handling.

The callback is kind of ugly, but it gets reworked towards the end
of this series.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-02-19 02:56:51 +00:00
Rusty Russell 409fef582d subd: keep pointer to channel, not peer.
This rolls through many other functions, making them take channel not peer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-02-14 11:31:58 +01:00
Rusty Russell 3a596d6dda subd: wrap all message callbacks in a transaction.
Including destructors.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-11-06 10:24:34 +01:00
Rusty Russell 3c6eec87e3 Add DEVELOPER flag, set by default.
This is a bit messier than I'd like, but we want to clearly remove all
dev code (not just have it uncalled), so we remove fields and functions
altogether rather than stub them out.  This means we put #ifdefs in callers
in some places, but at least it's explicit.

We still run tests, but only a subset, and we run with NO_VALGRIND under
Travis to avoid increasing test times too much.

See-also: #176
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-26 12:53:09 +02:00
Rusty Russell 4c9f7542b2 subd: Clarify description of subd_release_peer.
Suggested-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-22 16:24:10 +02:00
Rusty Russell 0b953b86fe subd: automatically detect if callback frees subd.
This involves a tricky callback internally, but far less error-prone.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell 5a256c724a subd: simplify and cleanup lifetime handling.
There are now only two kinds of subdaemons: global ones (hsmd, gossipd) and
per-peer ones.  We can handle many callbacks internally now.

We can have a handler to set a new peer owner, and automatically do
the cleanup of the old one if necessary, since we now know which ones
are per-peer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-10-20 18:31:32 +02:00
Rusty Russell b7bb0be944 subd: remove context arg, as we're always owned by lightningd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-28 15:37:43 +02:00
Rusty Russell ef28b6112c status: use common status codes for all the failures.
This change is really to allow us to have a --dev-fail-on-subdaemon-fail option
so we can handle failures from subdaemons generically.

It also neatens handling so we can have an explicit callback for "peer
did something wrong" (which matters if we want to close the channel in
that case).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-09-12 23:00:53 +02:00
Rusty Russell bbed5e3411 Rename subdaemons, move them into top level.
We leave the *build* results in lightningd/ for ease of in-place testing though.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-08-29 17:54:14 +02:00
Rusty Russell a37c165cb9 common: move some files out of lightningd/
Basically all files shared by different daemons.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-08-29 17:54:14 +02:00
Rusty Russell 99581bd709 dev_disconnect: support 'permfail' line to permanently fail peer.
The master daemon checks for this after a subdaemon dies.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-08-20 13:06:41 +09:30
Rusty Russell 6e59f85666 subd: expose raw API for getting a single fd to a subdaemon.
We're going to use this for the HSM.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-06-27 10:25:53 +09:30
Rusty Russell f2d4309add lightningd/subd: explicit failure reply support.
We had a terrible hack in gossip when a peer didn't exist.  Formalize
a pattern when code+200 is a failure (with no fds passed), and use it
here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-06-27 10:25:53 +09:30
Rusty Russell 06bc035f59 Minor fixes: feedback from Christian
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell 9b1d240c1f lightningd: --dev-disconnect support.
We use a file descriptor, so when we consume an entry, we move past it
(and everyone shares a file offset, so this works).

The file contains packet names prefixed by - (treat fd as closed when
we try to write this packet), + (write the packet then ensure the file
descriptor fails), or @ ("lose" the packet then ensure the file
descriptor fails).

The sync and async peer-write functions hook this in automatically.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>



Header from folded patch 'test-run-cryptomsg__fix_compilation.patch':

test/run-cryptomsg: fix compilation.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell d1fcc434c8 subd: use array of fd pointers, not fds, and use take().
This lets us specify that we want to keep some fds.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell fe1ff33419 lightningd/subd: don't take ownership of peer.
Use callback which fails the peer if subd dies: that will later allow
reconnect.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell d27a5d3212 lightningd/lightningd: shutdown subdaemons on exit.
Especially under valgrind, we should give them some time to exit.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-04-12 09:09:19 -07:00