Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Plugins: `keysend` now removes unknown even (technically illegal!) fields, to try to accept more payments.
We filter based on the environment variable `CLN_PLUGIN_LOG`,
defaulting to `info` as that is not as noisy as `debug` or `trace`, at
least libraries will not spam us too heavily.
Changelog-Added cln-plugin: The logs level from cln-plugins can be configured by the `CLN_PLUGIN_LOG` environment variable.
We had a bit of a chicken-and-egg problem, where we instantiated the
`state` to be managed by the `Plugin` during the very first step when
creating the `Builder`, but then the state might depend on the
configuration we only get later. This would force developers to add
placeholders in the form of `Option` into the state, when really
they'd never be none after configuring.
This defers the binding until after we get the configuration and
cleans up the semantics:
- `Builder`: declare options, hooks, etc
- `ConfiguredPlugin`: we have exchanged the handshake with
`lightningd`, now we can construct the `state` accordingly
- `Plugin`: Running instance of the plugin
Changelog-Changed: cln-plugin: Moved the state binding to the plugin until after the configuration step
We will now simply reject old-style ones as invalid. Turns out the
only trace we could find is a channel between two nodes unconnected to
the rest of the network.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: We now require all channel_update messages include htlc_maximum_msat (as per latest BOLTs)
Many changes to gossmap (including the pending ones!) don't actually
concern readers, as long as they obey certain rules:
1. Ignore unknown messages.
2. Treat all 16 upper bits of length as flags, ignore unknown ones.
So now we split the version byte into MAJOR and MINOR, and you can
ignore MINOR changes.
We don't expose the internal version (for creating the map)
programmatically: you should really hardcode what major version you
understand!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is actually what the autoclean plugin wants, especially since
you can't otherwise delete a payment which has failed then succeeded.
But insist on neither or both being specified, at least for now.
Changelog-Added: JSON-RPC: `delpay` takes optional `groupid` and `partid` parameters to specify exactly what payment to delete.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Previous commit was a hack which *always* batched where possible, this
is a more sophisticated opt-in varaint, with a timeout sanity check.
Final performance for cleaning up 1M pays/forwards/invoices:
```
$ time l1-cli autoclean-once succeededpays 1
{
"autoclean": {
"succeededpays": {
"cleaned": 1000000,
"uncleaned": 26895
}
}
}
real 6m9.828s
user 0m0.003s
sys 0m0.001s
$ time l2-cli autoclean-once succeededforwards 1
{
"autoclean": {
"succeededforwards": {
"cleaned": 1000000,
"uncleaned": 40
}
}
}
real 3m20.789s
user 0m0.004s
sys 0m0.001s
$ time l3-cli autoclean-once paidinvoices 1
{
"autoclean": {
"paidinvoices": {
"cleaned": 1000000,
"uncleaned": 0
}
}
}
real 6m47.941s
user 0m0.001s
sys 0m0.000s
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `batching` command to allow database transactions to cross multiple back-to-back JSON commands.
autoclean was using 98% of its time in memmove; we should simply keep
an offset, and memmove when it's empty. And also, only memmove the
used region, not the entire buffer!
Running on product of giantnodes.py:
$ time l1-cli autoclean-once failedpays 1
{
"autoclean": {
"failedpays": {
"cleaned": 26895,
"uncleaned": 1000000
}
}
}
Before:
real 20m46.579s
user 0m0.000s
sys 0m0.001s
After:
real 2m10.568s
user 0m0.001s
sys 0m0.000s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's more natural: we will eventually support dynamic config variables,
so this will be quite nice.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Plugins: `autoclean` can now delete old forwards, payments, and invoices automatically.
And take the opportunity to rename l0 and l1 in the tests to the
more natural l1 l2 (since we add l3).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This can happen, and in fact does below in our test_autoclean_once
test where we update the datastore, and return from the cmd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
"Who needs specs?" FFS...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: `keysend` will now attach the longest valid text field in the onion to the invoice (so you can have Sphinx.chat users spam you!)
It means we consume the channel completely to the best of our
knowledge, so let that through.
Changelog-Fixed: pay: Squeezed out the last `msat` from our local view of the network
In the case of the local channel we set the estimation to the exact
value spendable, which is important when we want to drain a channel,
because there we actually want to get the last msat.
Changelog-Added: JSON-RPC: `fundchannel`, `multifundchannel` and `fundchannel_start` now accept a `reserve` parameter to indicate the absolute reserve to impose on the peer.
This is what we do in lightningd, which makes memleak much more forgiving:
you can hang temporaries off cmd without getting reports of leaks (also
when send_outreq called).
We remove all the notleak() calls in plugins which worked around this!
And avoid multiple notleak labels, since both send_outreq() and
command_still_pending() can be called multiple times.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Add memleak_ignore_children() so callers can do exclusions themselves.
Having two exclusions was always such a hack!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This allows GDB to print values, but also allows us to use them in
'case' statements. This wasn't allowed before because they're not
constant terms.
This also made it clear there's a clash between two error codes,
so move one.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: JSON-RPC: Error code from bcli plugin changed from 400 to 500.
This is usually fine, but without this, commando (another branch!) has
a race:
1. A command has multiple parts.
2. We start sending them out.
3. We get a response, which completes the cmd.
4. We go to send the next one out, but it's been freed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
And fold start_json_request() and start_json_rpc() into the core
function since it's the only caller.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Build them from the command which caused them, and take plugin name
as basename with extension stripped.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This avoids having to escape | or &, though we still allow that for
the deprecation period.
To detect deprecated usage, we insist that alternatives are *always*
an array (which could be loosened later), but that also means that
restrictions must *always* be an array for now.
Before:
```
# invoice, description either A or B
lightning-cli commando-rune '["method=invoice","pnamedescription=A|pnamedescription=B"]'
# invoice, description literally 'A|B'
lightning-cli commando-rune '["method=invoice","pnamedescription=A\\|B"]'
```
After:
```
# invoice, description either A or B
lightning-cli commando-rune '[["method=invoice"],["pnamedescription=A", "pnamedescription=B"]]'
# invoice, description literally 'A|B'
lightning-cli commando-rune '[["method=invoice"],["pnamedescription=A|B"]]'
```
Changelog-Deprecated: JSON-RPC: `commando-rune` restrictions is always an array, each element an array of alternatives. Replaces a string with `|`-separators, so no escaping necessary except for `\\`.
JSON needs to escape \, since it can't be in front of anything unexpected,
so no \|. So we need to return \\ to \, and in theory handle \n etc.
Changelog-Fixed: JSON-RPC: `commando-rune` now handles \\ escapes properly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The completion of a successful payment is defined as the first
completion of a successful part, hence we just collect the minimum
completion time of successes.
Changelog-Added: JSON-RPC: `pay` and `listpays` now lists the completion time.
It was deprecated in v0.10.1, but only one channel on the network
doesn't set it now anyway, and we'll be ignoring that soon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. If the tool changes, you need to regenerate since the output may
change.
2. This didn't just filter that out, ignored all but the first
dependency, which made bisecting the bookkeeper plugin a nightmare:
it didn't regenerate the .po file, causing random crashes.
If we want this, try $(filter-out tools/fromschema.py) instead. But I
don't think we want that.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
we weren't making records for 'missed' accounts that had a zero balance
at snapshot time (if peer opens channel and is unused)
Fixes: #5502
Reported-By: https://github.com/niftynei/cln-logmaid
Track rebalances, and report income events for them.
Previously `listincome` would report:
- invoice event, debit, outgoing channel
- invoice_fee event, debit, outgoing channel
- invoice event, credit, inbound channel
Now reports:
- rebalance_fee, debit, outgoing channel
(same value as invoice_fee above)
Note: only applies on channel events; if a rebalance falls to chain
we'll use the older style of accounting.
Changelog-None
When traversing the list we call `command_finished` which modifies the
list we are traversing. This ensures we don't end up advancing in the
list iteration.
Reported-by: Rusty Russell <@rustyrussell>
Technically this is a use-after-free since `command_finished` frees
the `cmd` which is also the parent of `p`, so reset it early. All
paths lead to `command_finished` so setting it early is ok.
Reported-by: Rusty Russell <@rustyrussell>
Suspended payments now have the same groupid as the actual attempt,
this allows us to identify pay calls that were suspended due to us and
terminate only those.
Changelog-Fixed pay: Fixed a crash when `pay` was called multiple times while an attempt was already in progress.
This should prevent us from accidentally completing a payment twice,
when replaying the result of an actual attempt against pay call that
was suspended due to it still being pending.
We rely on sqlite3 being present to run unit tests for some bookkeeper
tests; here we effectively disable these tests if not available
Fixes report in #4928
Reported-By: @whitslack
The initial snapshots on an already-running lightningd are expected to
be unbalanced, but this shouldn't cause users to long for the green,
green grass of home.
This controls the Art of Noise.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
First, how we record "our_funds" and then apply pushes vs lease_fees
(for liquidity ad buys/sales) was exactly opposite.
For pushes we were reporting the total funded into the channel, with the
push representing how much we'd later moved to the peer.
For lease_fees we were rerporting the total in the channel, with the
push representing how much was already moved to the peer.
We fix this (from a view perspective) by re-adding lease fees to what's
reported in the channel funding totals. Since this is now new behavior
(for leased channel values), we added new fields so we can take the old
field names thru a deprecation cycle.
We also make it possible to differentiate btw a push and a lease_fee
(before they were all the same), by adding to new fields to `listpeers`:
`fee_paid_msat` and `fee_rcvd_msat`.
This allows us to avoid math in the bookkeeper, instead we just pick
the numbers out directly and record them.
Fixes#5472
Changelog-Added: JSON-RPC: `listpeers` now has a few new fields for `funding` (`remote_funds_msat`, `local_funds_msat`, `fee_paid_msat`, `fee_rcvd_msat`).
Changelog-Deprecated: JSON-RPC: `listpeers`.`funded` fields `local_msat` and `remote_msat` are now deprecated.
Keep the accounts as an 'append only' log, instead we move the marker
for the 'channel_open' forward when a 'channel_open' comes out.
We also neatly hide the 'channel_proposed' events in 'inspect' if
there's a 'channel_open' for that same event.
If you call inspect before the 'channel_open' is confirmed, you'll see
the tag as 'channel_proposed', afterwards it shows up as
'channel_open'. However the event log rolls forward -- listaccountevents
will show the correct history of the proposal then open confirming (plus
any routing that happened before the channel confirmed).
We need a record of the channel account before you start sending
payments through it. Normally we don't start allowing payments to be
sent until after the channel has locked in but zeroconf does away with
this assumption.
Instead we push out a "channel_proposed" event, which should only show
up for zeroconfs.
This was originally done as part of the bookkeeper introduction, but it deserves its
own patch (and that one didn't remove the bitcoin/chainparams.o from the individual
requirements lines).
Suggested-by: @niftynei
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
`lightningd` has an option --wallet that lets you supply a database dsn
string to connect to a sqlite3/postgres database that's hosted/stored
elsewhere.
This adds the `--bookkeeper-db` option which does the same, except for
the bookkeeping data for a node!
Note that the default is to go in the `lightning-dir` in a database
called `accounts.sqlite3`
First off, when we pull data out of JSON, unescape it so we don't end up
with extraneous escapes in our bookkeeping data. I promise, it's worth
it.
Then, when we print descriptions out to the csvs, we gotta wrap
everything in quotes... but also we have to change all the double-quotes
to singles so that adding the quotes doesn't do anything untoward.
We also just pass it thru json_escape to get rid of linebreaks etc.
Note that in the tests we do a byte comparison instead of converting the
CSV dumps to strings because python will escape the strings on
conversion...
So we print out invoice fees on the same line for those CSVs! This means
we have to do a little bit of gymnastics (but not too bad):
- we save the fee amount onto the income event now so we can use
it later
- we ignore every "invoice_fee" event for the koinly/cointracker
Note that since we're not skipping income events in the loops we also
move the newline character to the start of every `_entry` function so
skipped records dont incur empth lines.
Changelog-Added: bkpr: print out invoice fees on the same line for `koinly` and `cointracker` csv types
It's possible we'll get an "external" event for an account/channel and
then try to do close checks on it and it'll bomb out because we didn't
go look up the originating account to make sure that that had open info
Now, we make sure and do that when we hear about an originating account
If two events for the same (unlogged) account come in and get run
through the "lookup peer" code, we should anticipate that.
We do two things here:
- one, if it's a duplicate "event" to create the channel open
we check for that and just exit early
- two, we were using a copy of the account that was
fetched/pulled from before the RPC ran, so instead we
just re-pull the most up to date account info for the
close checks.
This fixes the crash I was getting in re-running these things
If we expect further events for an onchain output (because we can steal
it away from the 'external'/rightful owner), we mark them.
This prevents us from marking a channel as 'onchain-resolved' before
all events that we're interested in have actually hit the chain.
Case that this matters:
Peer publishes a (cheating) unilateral close and a timeout htlc (which
we can steal).
We then steal the timeout htlc.
W/o the stealable flag, we'd have marked the channel as resolved when
the peer published the timeout htlc, which is incorrect as we're still
waiting for the resolution of that timeout htlc (b/c we *can* steal it).
We were failing to mark channels as resolved b/c we weren't using later
events to external (but originated from this account) events as signals
to run the channel resolution check.
This fixes that, and adds a test.
Adds schema definitions and manpages for bkpr- commands; also renames
the commands to all start with 'bkpr-', so they're easier to identify/
make runes about.
it's nice to know what node your channel was opened with. in theory we
could use listpeers to merge the data after the fact, except that
channels disappear after they've been closed for a bit. it's better to
just save the info.
we print it out in `listbalances`, as that's a great place account level
information
Anchor outputs are ignored by the clightning wallet, but we keep track
of them in the bookkeeper. This causes problems when we do the balance
checks on restart w/ the balance_snapshot -- it results in us printing
out a journal_entry to 'get rid of' the anchors that the clightning node
doesnt know about.
Instead, we mark some outputs as 'ignored' and exclude these from our
account balance sums when we're comparing to the clightning snapshot.
Our consolidate fees had a crash bug (and was pretty convoluted). This
makes it less convoluted and resolves the crash.
The only kinda meh thing is that we have to look up the most recent
timestamp data for the onchain fee entry separately, because of the way
SQL sums work.
For income events, break out the amount paid in routing fees vs the
total amount of the *invoice* that is paid.
Also printout these fees, when available, on listaccountevents
Add ability to filter results by timestamp.
Note that "start_time" is everything *after* that timestamp; and
"end_time" is everything *up to and including* that timestamp.
Prints out the `listincome` events as a CSV formatted file
Current csv_format options:
- koinly
- cointracker
- harmony (https://github.com/harmony-csv/harmony)
- quickbooks*
*Quickbooks expects values in 'USD', whereas we print values out
in <currency> (will be noted in the Description field). This won't work
how you'd expect -> you might need to do some conversion etc before
uploading it.
All amounts emitted as 'btc' w/ *11* decimals.
This is a rare case where we RBF the output of a penalty until it no
longer has an output value we can reclaim. We ignore the txid for these
events when closing a channel.
We issue events for external deposits (withdrawals) before the tx is
confirmed in a block. To avoid double counting these, we don't count
them as confirmed/included until after they're confirmed. We do this
by keeping the blockheight as zero until the withdraw for the input for
them comes through.
Note that since we don't have any way to note when RBF'd withdraws
aren't eligible for block inclusion anymore, we don't really have a good
heuristic to trim them. Which is fine, they *will* show up in account
events however.
onchain fees are weird at channel close because:
- you may be missing an trimmed htlc (which went to fees)
- the balance from close may have been rounded (msats cant land on
chain)
- the close might have been a past state and you've actually
ended up with more money onchain than you had in the channel. wut
This commit accounts for all of this appropriately, with some tests.
channel_close.debit should equal onchain_fee.credit (for that txid)
plus sum(chain_event.credit [wallet/channel_acct]).
In the penalty case, channel_close.debit becomes channel_close.debit +
penalty_adj.debit, i.e.
channel-close.debit + (penalty_adj.debit) =
onchain_fee.credit
+ sum(chain_event.credit [wallet/channel_acct])
Due to the way that onchain channel closes work, there is often a delay
between when the funding output is spent and the channel is considered
'closed'.
Once *every* downstream utxo of a channel has landed on chain, we
annotate the account with the resolving blockheight.
This gives us some insight into whether or not the chain fees etc of a
channel are going to update further and allows for a natural marker to
prune data (at a later date)
Pass in an account id, get out a utxo chain of the channel open and
close (and any other related htlc txs etc).
Note that this prints all wallet deposits that occurred in any of the
tx's that touched this channel.
This is fine and expected for any tx that's not the open; when
considerig the tx open event, the wallet deposit that's present is
typically the change. If there were other channels opened in the same tx
then the change won't match up exactly...
Also note that we might have ignored/missed fees for the channel
closed's spending txid, so we attempt to update those as well.
Backfilling for missed events is a beast.
We need the total output_value, and we can figure this out if we look at
the remote amount also.
We also need to account for the pushed/leased amount, as for leased
channels this really messes with onchain fee calculations.
We copy basically the events that lightningd emits for leased channels:
an open event with the 'lease fee' (pushed?) amount credited to the side
that made the lease/push; then an in-channel event to effectively push
the pushed/leased amount over to the peer it was paid to.
We run the journal entry info after this, so the journal snapshot will
take the pushed amount into account when figuring out what the further
missed in-channel payments were (if any)
Because we update the onchain_fee table every time a new event comes in,
it's possible (and in fact happens) that we get a wallet
withdraw/deposit event and then the channel open output event.
What we'd expect in this case is that the fees for the tx were credited
to the channel's account, not the wallet. But since we got the two
in/out events first, the fees were accumulated there first.
Our existing logic will add the channel's fees correctly, but we weren't
zeroing out the wallet's balance once it'd been determined that they
were 'ineligble' so to speak, for being included in the fees that round.
We now add chain events for starting channel balances, so print these
out with chain first then channel events.
Makes it less confusing for channel lease fee events.
Prints all the events for the requested account. If no account
requested, prints out all the events. Ordered by timestamp.
Changelog-Added: bookkeeper: new command `listaccountevents`
There's two situations where we're missing info.
One is we get a 'channel_closed' event (but there's no 'channel_open')
The other is a balance_snapshot arrives with information about accounts
that doesn't match what's already on disk. (For some of these cases, we
may be missing 'channel_open' events..)
In the easy case (no channel_open missing), we just figure out what the
When we print events out, we need to know the account name. This makes
our lookup a lot easier, since we just pull it out from the database
every time we query for these.
One really rough thing about how we did onchain fees is that the records update
every time a new event comes in.
The better way to do this is to create new entries for every adjustment,
so that reconciliation between printouts isn't a misery.
We add a timestamp and `update_count` to these records, so you can
roughly order them now (and have a good idea of the last time an event
that updated an onchain_fee occurred).
When the node starts up, it records missing/updated account balances
to the 'channel' events... which is kinda fucked for wallet + external
events now that i think about it but these are all treated the same
anyway so it's fine.
This is the magic piece that lets your bookkeeping data startup ok on an
already running/established node.
clightning doesn't give us any info about onchain fees (how could it?
it only knows about utxo object levels, and doesn't keep track of
how/when those are all related)
Instead, we keep running totals of the onchain fees for utxos. This
implements the master method for accounting for them, plus includes
tests to account for channel opens (across two accounts) as well as a
htlc-tx channel close.
Missing: we don't currently emit an event from cln for `withdraw`
initiated removal of funds, so the accounting for wallet -> external
funds is a bit janky. We don't account for the fees on these
transactions since we don't have the resulting 'external' event to
register them against!
Originally I (incorrectly?) assumed that since TX_COMMITMENT_SIGNED
always came before TX_SIGNATURES, we would always receive a response
from openchannel_update (w/ commitment_secured = true) before getting
notification of receipt of the peer's signatures.
But it's observable in the logs of hung tests that this in fact is a
wrong assumption -- the notification for the tx_sigs arrives at our
spender plugin before the callback from our openchannel_update RPC.
This mis-ordering causes a hang.
Luckily we're pretty much setup to handle this race already w/ states
etc, minus actually calling the method advance the plot in case we're
ready.
2022-07-26T05:37:59.4529095Z lightningd-1 2022-07-26T05:10:07.395Z DEBUG 035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-dualopend-chan#2: peer_in WIRE_COMMITMENT_SIGNED
2022-07-26T05:37:59.4530452Z lightningd-1 2022-07-26T05:10:07.396Z DEBUG 035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-hsmd: Got WIRE_HSMD_VALIDATE_COMMITMENT_TX
2022-07-26T05:37:59.4530719Z lightningd-1 2022-07-26T05:10:07.396Z DEBUG hsmd: Client: Received message 35 from client
2022-07-26T05:37:59.4531386Z lightningd-1 2022-07-26T05:10:07.396Z DEBUG 035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-dualopend-chan#2: billboard: channel open: commitment received, sending to lightningd to save
2022-07-26T05:37:59.4531856Z lightningd-1 2022-07-26T05:10:07.398Z DEBUG 035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-dualopend-chan#2: peer_in WIRE_TX_SIGNATURES
>>> 2022-07-26T05:37:59.4532553Z lightningd-1 2022-07-26T05:10:07.400Z DEBUG plugin-spenderp: mfc 60:`openchannel_peer_sigs` notice received for channel 9d145e763f08ee6f715ba7677f869cbb9580c7406f4d0b0ff3a0987efe501e13 <<<< THIS ONE WAS ASSUMED TO COME AFTER openchannel_update (next line)
2022-07-26T05:37:59.4533048Z lightningd-1 2022-07-26T05:10:07.400Z DEBUG plugin-spenderp: mfc 60, dest 0: openchannel_update 035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d returned.
2022-07-26T05:37:59.4554292Z lightningd-1 2022-07-26T05:10:07.400Z DEBUG plugin-spenderp: mfc 60: parallel `openchannel_update`.
2022-07-26T05:37:59.4555485Z lightningd-1 2022-07-26T05:10:07.400Z DEBUG plugin-spenderp: mfc 60: funding tx 50425e20dbf0ca6fe112a8811b8048edb5bfa8d2922079668c5f353b859b45cb
2022-07-26T05:37:59.4557934Z lightningd-1 2022-07-26T05:10:07.508Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-hsmd: Got WIRE_HSMD_CUPDATE_SIG_REQ
2022-07-26T05:37:59.4558244Z lightningd-1 2022-07-26T05:10:07.508Z DEBUG hsmd: Client: Received message 3 from client
2022-07-26T05:37:59.4558738Z lightningd-3 2022-07-26T05:11:03.234Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-gossipd: seeker: startup peer finished
2022-07-26T05:37:59.4559209Z lightningd-3 2022-07-26T05:11:03.234Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-gossipd: seeker: state = PROBING_SCIDS Seeking scids 1 - 105
(The last 2 log messages (from a different node) are >1min after the last
log line from lightning-1, because lightning-1 hung)
Hacked lightningd up to test this (such that notification always sent
before the RPC response, works as intended w/ patch)
We were cmd was getting free'd but holding on to reference of the
thing was causing problems.
==523280== Invalid read of size 8
==523280== at 0x1B3E14: del_notifier_property (tal.c:326)
==523280== by 0x1B3E14: tal_del_notifier_ (tal.c:569)
==523280== by 0x1123E7: handle_rpc_reply (libplugin.c:671)
==523280== by 0x1123E7: rpc_read_response_one (libplugin.c:866)
==523280== by 0x1123E7: rpc_conn_read_response (libplugin.c:886)
==523280== by 0x1A7B53: next_plan (io.c:59)
==523280== by 0x1A7B53: do_plan (io.c:407)
==523280== by 0x1A7B53: io_ready (io.c:417)
==523280== by 0x1A9BDB: io_loop (poll.c:453)
==523280== by 0x1141D0: plugin_main (libplugin.c:1708)
==523280== by 0x10D7E4: main (commando.c:937)
==523280== Address 0x52de928 is 8 bytes inside a block of size 40 free'd
==523280== at 0x483F0C3: free (vg_replace_malloc.c:872)
==523280== by 0x1B2CDD: del_tree (tal.c:419)
==523280== by 0x1B37BB: tal_free (tal.c:486)
==523280== by 0x1B37BB: tal_free (tal.c:474)
==523280== by 0x110CB2: command_complete (libplugin.c:255)
==523280== by 0x110CB2: command_done_err (libplugin.c:390)
==523280== by 0x10F511: handle_reply (commando.c:560)
==523280== by 0x10F511: handle_custommsg (commando.c:609)
==523280== by 0x113877: ld_command_handle (libplugin.c:1441)
==523280== by 0x113877: ld_read_json_one (libplugin.c:1491)
==523280== by 0x113877: ld_read_json (libplugin.c:1511)
==523280== by 0x1A7B53: next_plan (io.c:59)
==523280== by 0x1A7B53: do_plan (io.c:407)
==523280== by 0x1A7B53: io_ready (io.c:417)
==523280== by 0x1A9BDB: io_loop (poll.c:453)
==523280== by 0x1141D0: plugin_main (libplugin.c:1708)
==523280== by 0x10D7E4: main (commando.c:937)
==523280== Block was alloc'd at
==523280== at 0x483C855: malloc (vg_replace_malloc.c:381)
==523280== by 0x1B3BBD: allocate (tal.c:250)
==523280== by 0x1B3BBD: add_notifier_property (tal.c:303)
==523280== by 0x1B3BBD: tal_add_destructor2_ (tal.c:529)
==523280== by 0x110725: jsonrpc_request_start_ (libplugin.c:181)
==523280== by 0x10E0EA: send_more_cmd (commando.c:643)
==523280== by 0x11243C: handle_rpc_reply (libplugin.c:696)
==523280== by 0x11243C: rpc_read_response_one (libplugin.c:866)
==523280== by 0x11243C: rpc_conn_read_response (libplugin.c:886)
==523280== by 0x1A7B53: next_plan (io.c:59)
==523280== by 0x1A7B53: do_plan (io.c:407)
==523280== by 0x1A7B53: io_ready (io.c:417)
==523280== by 0x1A9BDB: io_loop (poll.c:453)
==523280== by 0x1141D0: plugin_main (libplugin.c:1708)
==523280== by 0x10D7E4: main (commando.c:937)
==523280==
{
<insert_a_suppression_name_here>
If rune contains invalid UTF-8, offers (which implements decode) would
produce JSON with invalid UTF-8, which causes lightningd to complain
and kill it, and then die because it's an important plugin.
So don't decode invalid UTF-8!
Reported-by: @jb55
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>