Commit Graph

165 Commits

Author SHA1 Message Date
Christian Decker 9cfd09dc4a gossip: HalfChans are public if we have an update and the Chan is
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-05-07 01:10:48 +00:00
practicalswift 5db73c6e27 Avoid static analyzer warnings about potentially uninitialized values 2018-05-01 17:14:33 +02:00
Rusty Russell f083a699e2 gossipd: separate init and activate.
This means gossipd is live and we can tell it things, but it won't
receive incoming connections.  The split also means that the main daemon
continues (eg. loading peers from db) while gossipd is loading from the store,
potentially speeding startup.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-30 12:01:36 +02:00
practicalswift abf510740d Force the use of the POSIX C locale for all commands and their subprocesses 2018-04-27 14:02:59 +02:00
ZmnSCPxj 69cdfba3c8 gossip: Use gossiped node_announcement to locate nodes.
So we can get via address hint, DNS seed, or node_announcement
gossip.
2018-04-26 11:45:38 +00:00
Rusty Russell 83e847575c gossipd: don't handle multiple connect requests, combine them in lightningd.
Christian points out that this is the pattern used elsewhere, for example.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell 3b29d2b75a gossipd: don't create a new chain of timers on every connect command.
When a connect fails, if it's an important peer, we set a timer.  If
we have a manual connect command, this means we do this again, leading
to another timer.

For a manual command, free any existing timer; the normal fail logic
will start another if necessary.

Reported-by: @ZmnSCPxj
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell c6483a57d0 gossipd: give more distinct errors.
At least say whether we failed to connect at all, or failed cryptographic
handshake, or failed reading/writing init messages.

The errno can be "Operation now in progress" if the other end closes the
socket on us: this happens when we handshake with the wrong key and it
hangs up on us.  Fixing this would require work on ccan/io though.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell a134ca9659 gossipd: use exponential backoff on reconnect for important peers.
We start at 1 second, back off to 5 minutes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell bc4809aa85 gossipd: make sure master only ever sees one active connection.
When we get a reconnection, kill the current remote peer, and wait for the
master to tell us it's dead.  Then we hand it the new peer.

Previously, we would end up with gossipd holding multiple peers, and
the logging was really hard to interpret; I'm not completely convinced
that we did the right thing when one terminated, either.

Note that this now means we can have peers with neither ->local nor ->remote
populated, so we check that more carefully.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell be1f33b265 gossipd: have master explicitly tell us when peer is disconnected.
Currently we intuit it from the fd being closed, but that may happen out
of order with when the master thinks it's dead.

So now if the gossip fd closes we just ignore it, and we'll get a
notification from the master when the peer is disconnected.

The notification is slightly ugly in that we have to disable it for
a channel when we manually hand the channel back to gossipd.

Note: as stands, this is racy with reconnects.  See the next patch.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell ab9d9ef3b8 gossipd: drain fd instead of passing around gossip index.
(This was sitting in my gossip-enchancement patch queue, but it simplifies
this set too, so I moved it here).

In 94711969f we added an explicit gossip_index so when gossipd gets
peers back from other daemons, it knows what gossip it has sent (since
gossipd can send gossip after the other daemon is already complete).

This solution is insufficient for the more general case where gossipd
wants to send other messages reliably, so replace it with the other
solution: have gossipd drain the "gossip fd" which the daemon returns.

This turns out to be quite simple, and is probably how I should have
done it originally :(

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell 72c459dd6c gossipd: keep reaching struct only when we're actively connecting, and don't retry
1. Lifetime of 'struct reaching' now only while we're actively doing connect.
2. Always free after a single attempt: if it's an important peer, retry
   on a timer.
3. Have a single response message to master, rather than relying on
   peer_connected on success and other msgs on failure.
4. If we are actively connecting and we get another command for the same
   id, just increment the counter

The result is much simpler in the master daemon, and much nicer for
reconnection: if they say to connect they get an immediate response,
rather than waiting for 10 retries.  Even if it's an important peer,
it fires off another reconnect attempt, unless it's actively
connecting now.

This removes exponential backoff: that's restored in next patch.  It
also doesn't handle multiple addresses for a single peer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell 20e3a18af5 gossipd: maintain a separate structure to track important peers.
Rather than using a flag in reaching/peer; we make it self-contained
as the next patch puts it straight into a timer callback.

Also remove unused 'succeeded' field from struct peer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell 8c2c1fe1c2 openingd: tell gossipd that the peer is important once funding tx in place.
And on channel_fail_permanent and closing (the two places we drop to
chain), we tell gossipd it's no longer important.

Fixes: #1316
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell c9fa9817f6 gossipd: explicitly track which peers are important.
These don't have a maximum number of reconnect attempts, and ensure
that we try to reconnect when the peer dies.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Rusty Russell b1498f07c5 gossipd: exponential backoff for reconnect (5 minute ceiling).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-26 05:47:57 +00:00
Christian Decker b84804009a gossip: Use the DNS seeds to look up nodes if we don't have an addr
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-04-25 12:34:55 +02:00
Rusty Russell 5551c161ca gossipd: finish startup before master prints that it's ready.
We're about to remove automatic retrying of connect, and that uncovered
that we actually print out our "Server started" message before we create
the listening socket.

Move the init higher (outside the db transaction) and make it a
request/response, the loop until it's done.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-23 20:18:15 +00:00
Christian Decker 64fbea1528 gossip_store: Save local_add_channel messages and replay them
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-04-22 12:50:34 +02:00
Christian Decker 7497f972f1 moveonly: Move handle_local_add_channel to routing.h
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-04-22 12:50:34 +02:00
Christian Decker ddbf016152 gossip: Pass rstate to handle_local_add_channel directly
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-04-22 12:50:34 +02:00
Rusty Russell b0c2e3cd5c gossipd: use a separate CSV file for the gossip_store types.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-11 15:58:18 +02:00
Rusty Russell 7d0a76c533 goossipd: make store load truncate on errors.
We don't need pread, we just need read, and we can loop internally.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-11 15:58:18 +02:00
ZmnSCPxj 86290b54d4 routing: Use 64-bit msatoshi for messages to and from routing.
Internally both payment and routing use 64-bit, but the interface
between them used 32-bit.
Since both components already support 64-bit we should use that.
2018-04-09 20:45:26 +02:00
Christian Decker a121b7dbc3 gossip: Make gossipd less noisy when receiving requests
This is very noisy when syncing with the blockchain

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-04-09 00:21:20 +00:00
Christian Decker 2de7f622cb gossip: Add an explicit debug message when handing back a peer
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-04-09 00:21:20 +00:00
practicalswift 693d6fddab Adjust loglevel for error message "Failed to get peername for incoming conn" 2018-04-03 14:05:27 +02:00
Rusty Russell 1a4a59d221 common/daemon: common routines for all daemons.
In particular, the main daemon and subdaemons share the backtrace code,
with hooks for logging.

The daemon hook inserts the io_poll override, which means we no longer
need io_debug.[ch].  Though most daemons don't need it, they still link
against ccan/io, so it's harmess (suggested by @ZmnSCPxj).

This was tested manually to make sure we get backtraces still.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-03 14:03:28 +02:00
Rusty Russell 20bbd92564 utils: add subdaemon_shutdown() to consolidate subdaemon cleanup.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-04-03 14:03:28 +02:00
Christian Decker 63f22d70b5 gossip: Store channel deletions so we don't re-add them on restart
If we only remember the actions that added channels then we'd restore them when
re-reading the gossip_store, so put a tombstone in there to remember to delete
it. These will be cleared upon re-writing the store since the announcements wont
be written anymore.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-03-30 16:35:00 +02:00
Christian Decker 9132a097b5 gossip: Free the channel when notified of its funding being spent
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-03-30 16:35:00 +02:00
Christian Decker 5571f2143e gossip: Added message to notify gossipd of outpoint spends
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-03-30 16:35:00 +02:00
Christian Decker 44e23b3773 gossip: Replay the entire store on init instead of when idle
This now works because we no longer call out to masterd or bitcoind to verify
the channels. It's also rather quick and silent so we can just process all
stored messages until we're done.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-03-25 23:56:59 +00:00
Christian Decker c4ea79cc5c Revert gossip: Track whether we read a message from store or peer
Messages from peers and messages from the gossip_store now have completely
different entrypoints, so we don't need to trace their origin around the message
handling code any longer.
2018-03-25 23:56:59 +00:00
Christian Decker 6e01f38d7d gossip: Use the custom gossip wire msg to wrap channel_announcements
This stores and reads the channel_announcements in the wrapping message which
allows us to store associated data with the raw channel_announcements.

The gossip_store applies channel_announcements directly but it also returns it,
and it gets discarded as a duplicate. In the next commit we'll have gossip_store
apply all changes, bypassing verification, so the duplication is only temporary.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-03-25 23:56:59 +00:00
Christian Decker 0a5ea76d77 gossip: Add message types to store gossip msgs and associate data
Since we may want to extend the on-disk format by adding custom information we
may as well just go the extra mile and reuse the serialization primitives we
already have.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-03-25 23:56:59 +00:00
Christian Decker a571bf9d3a gossip: Track whether we read a message from store or peer
When we read from the gossip_store we set store=false so that we don't duplicate
messages in the store.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-03-25 23:56:59 +00:00
Christian Decker 1a5a4f5853 gossip: Replay gossip messages from the gossip_store
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-03-25 23:56:59 +00:00
Christian Decker e750d3cda1 gossip: Move error return into peer handler
Ee will be replaying gossip messages from the gossip_store soon. This means that
not all messages originate from a peer, so we move the queuing of error messages
up into the peer message handler.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-03-25 23:56:59 +00:00
Christian Decker 49b0c375ce gossip: Added gossip store primitives
Signed-off-by: Christian Decker <decker.christian@gmail.com>
2018-03-25 23:56:59 +00:00
practicalswift 0bf1b01425 Fix typos 2018-03-25 15:53:01 +02:00
Rusty Russell e63b7bb539 take: allocate temporary variables off NULL.
If we're going to simply take() a pointer, don't allocate it off a random
object.  Using NULL makes our intent clear, particularly with allocating
packets we're going to take() onto a queue.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-03-16 00:16:10 +00:00
Rusty Russell 0a6e3d1e13 utils: remove tal_tmpctx altogether, use global.
In particular, we now only free tmpctx at the end of main().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-03-16 00:16:10 +00:00
Rusty Russell ccc9414356 status: remove trc context now we have tmpctx.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-03-16 00:16:10 +00:00
Rusty Russell 2d919d56cb gossipd: make struct queued_message private.
Callers don't need it, and when we add timestamps it just makes
for more places to change.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-03-14 02:19:37 +00:00
Rusty Russell 5e333b75b9 daemon_conn: simplify msg_queue_cleared_cb.
Now it just returns true if it queued something.  This allows it
to queue multiple packets, and lets it share code paths with other code
in future patches.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-03-14 02:19:37 +00:00
Rusty Russell 87effd90c2 gossipd: Revert 6afc7dcc09.
This bandaid was solved properly by 94711969f9
where other daemons say where they were up to.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-03-14 02:19:37 +00:00
Rusty Russell afe61cb841 gossipd: honor LOCAL_INITIAL_ROUTING_SYNC.
We currently spam the peer with all gossip whether they want it or not.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-03-14 02:19:37 +00:00
Rusty Russell 1f443df428 gossipd: use the broadcast structure to hold gossip messages.
We currently keep two copies; one in the broadcast structure to send
in order, and one in the routing information.  Since we already keep
the broadcast index in the routing information, use that.
Conveniently, a zero index is the same as the old NULL test.

Rename struct node's announcement_idx to node_announce_msgidx to
make it match the other users.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2018-03-14 02:19:37 +00:00