rgb-cln

Commit Graph

Author	SHA1	Message	Date
Michael Schmoock	ad249607d6	dual-fund: update extracted CSVs to latest bolt draft Changelog-None	2023-02-04 15:31:16 +10:30
Rusty Russell	153b7bf192	common/gossip_store: move subdaemon-only routines to connectd. connectd is the only one who uses these routines now. The rest can be linked into a plugin. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-01-30 15:15:41 -06:00
Rusty Russell	6a95d3a25e	common: expose node_id_hash functions. They're used in several places, and we're about to add more. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-01-21 08:05:31 -06:00
Rusty Russell	5dfcd15782	all: no longer need to call htable_clear to free htable contents. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-01-12 11:44:10 +10:30
Rusty Russell	f07e37018d	setup: make all htables use tal. This makes them easier to clean up. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-01-12 11:44:10 +10:30
Rusty Russell	81e57dce52	connectd: ensure htables are always tal objects. We want to change the htable allocator to use tal, which will need this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-01-12 11:44:10 +10:30
Rusty Russell	22eac96750	connectd: don't ask DNS seeds for addresses on every reconnect. We were stressing the servers if node cannot be found. Only do lookup on manual connect commands. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Protocol: lightningd: Only use DNS server address lookup on manual `connect` commands, not normal reconnection attempts.	2023-01-03 15:00:27 +10:30
Rusty Russell	15d0a8bec8	connectd: don't spam logs when we're under load. This happens a lot with my node with rc2, so drop it to debug. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-11-30 19:31:38 +01:00
Rusty Russell	5becfa6ee1	onion_message: don't use general secret, use per-message secret. We had a scheme where lightningd itself would put a per-node secret in the blinded path, then we'd tell the caller when it was used. Then it simply checks the alias to determine if the correct path was used. But this doesn't work when we start to offer multiple blinded paths. So go for a far simpler scheme, where the secret is generated (and stored) by the caller, and hand it back to them. We keep the split "with secret" or "without secret" API, since I'm sure callers who don't care about the secret won't check that it doesn't exist! And without that, someone can use a blinded path for a different message and get a response which may reveal the node. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-11-09 15:08:03 +01:00
Rusty Russell	8720bbedae	common/onion: split into decode and encode routines. Some places (e.g. the pay plugin) only need to construct onions, not decode them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-11-09 15:08:03 +01:00
Rusty Russell	159fc7d1a2	common/onion_message_parse: generic routine for parsing onion messages. Instead of open coding in connectd/onion_message, we move it to common with a nice API. This lets us process the BOLT test vectors. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-10-26 11:29:06 +10:30
Rusty Russell	5cf86a1a2e	common: update to latest onion message spec. Mainly, field name changes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-EXPERIMENTAL: Protocol: Support for forwarding blinded payments (as per latest draft)	2022-10-26 11:29:06 +10:30
Rusty Russell	53e40c4380	common/blindedpath: generalize routines. We're going to share them for onion messages as well as for blinded payments. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-10-26 11:29:06 +10:30
Rusty Russell	41ef85318d	onionmessages: remove obsolete onion message parsing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-09-29 16:10:57 +09:30
Rusty Russell	701dd3dcef	memleak: remove exclusions from memleak_start() Add memleak_ignore_children() so callers can do exclusions themselves. Having two exclusions was always such a hack! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-09-19 11:34:42 +09:30
Rusty Russell	3380f559f9	memleak: simplify API. Mainly renaming. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-09-19 11:34:42 +09:30
Rusty Russell	2da5244e83	jsonrpc: make error codes an enum. This allows GDB to print values, but also allows us to use them in 'case' statements. This wasn't allowed before because they're not constant terms. This also made it clear there's a clash between two error codes, so move one. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Changed: JSON-RPC: Error code from bcli plugin changed from 400 to 500.	2022-09-19 10:18:55 +09:30
Michael Schmoock	e0d6f3ceb1	connectd: DNS Bolt7 #911 no longer EXPERIMENTAL Changelog-Changed: Bolt7 #911 DNS announcement support is no longer EXPERIMENTAL	2022-09-13 06:42:20 +09:30
Rusty Russell	1b30ea4b82	doc: update BOLTs to bc86304b4b0af5fd5ce9d24f74e2ebbceb7e2730 This contains the zeroconf stuff, with funding_locked renamed to channel_ready. I change that everywhere, and try to fix up the comments. Also the `alias` field is called `short_channel_id`. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Changed: Protocol: `funding_locked` is now called `channel_ready` as per latest BOLTs.	2022-09-12 09:34:52 +09:30
Rusty Russell	22ff007d64	connectd: control connect backoff from lightningd. We used to tell connectd to remember our connect delay, and hand it back (increased if necessary). Instead, simply record when we last tried to connect. If it was less than 10 minutes ago, double delay (up to 5 minutes max), otherwise reset delay to 1 second. This covers all scenarios: whether we reconnect then immediately disconnect, or never successfully connect, it doesn't matter. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Fixes: #5453	2022-07-28 15:08:44 +09:30
Rusty Russell	9498e14530	connectd: two logging cleanups. Don't log_io final messages twice (multiplex_final_message already does this, so it's confusing to see us send e.g. WIRE_ERROR twice!). And report that the peer has failed to connect out before telling lightningd, otherwise we get a very confusing ordering, e.g.: ``` 2022-07-23T05:17:36.096Z DEBUG 027d0de66d08f956a8d606c0d1c34e59bda38c05a3b1cc738fdd6378716c644997-lightningd: Reconnecting in 4 seconds 2022-07-23T05:17:36.096Z DEBUG 027d0de66d08f956a8d606c0d1c34e59bda38c05a3b1cc738fdd6378716c644997-lightningd: Will try reconnect in 4 seconds 2022-07-23T05:17:36.096Z DEBUG 027d0de66d08f956a8d606c0d1c34e59bda38c05a3b1cc738fdd6378716c644997-connectd: Failed connected out: ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-25 15:16:58 -07:00
Rusty Russell	a3c4908f4a	lightningd: don't explicitly tell connectd to disconnect, have it do it on sending error/warning. Connectd already does this when we receive an error or warning, but now do it on send. This causes some slight behavior change: we don't disconnect when we close a channel, for example (our behaviour here has been inconsistent across versions, depending on the code). When connectd is told to disconnect, it now does so immediately, and doesn't wait for subds to drain etc. That simplifies the manual disconnect case, which now cleans up as it would from any other disconnection when connectd says it's disconnected. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	c415c80d48	connectd: spelling and typo fixes. From @niftynei. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	719d1384d1	connectd: give connections a chance to drain when lightningd says to disconnect, or peer disconnects. We want to avoid lost messages in the common cases. This generalizes our drain code, by giving the subds each 5 seconds to close themselves, but continue to allow them to send us traffic (if peer is still connected) and continue to send them traffic. We continue to send traffic out to the peer (if it's still connected), until all subds are gone. We still have a 5 second timer to close the connection to peer. On reconnects, we don't do this "drain period": we kill immediately. We fix up one test which was looking for the "disconnect" message explicitly. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	6a2817101d	connectd: don't move parent while we're being freed. A subtle case I hadn't come across before: if a child tal_resizes() its parent while the parent is being deleted, tal gets confused. The subd destructor does this using tal_arr_remove() on peer->subds, which is currently being freed: ``` ==61056== Invalid read of size 8 ==61056== at 0x185632: del_tree (tal.c:417) ==61056== by 0x18560D: del_tree (tal.c:412) ==61056== by 0x185957: tal_free (tal.c:486) ==61056== by 0x1183BC: peer_discard (connectd.c:1861) ==61056== by 0x11869E: recv_req (connectd.c:1942) ==61056== by 0x12774B: handle_read (daemon_conn.c:35) ==61056== by 0x173453: next_plan (io.c:59) ==61056== by 0x17405B: do_plan (io.c:407) ==61056== by 0x17409D: io_ready (io.c:417) ==61056== by 0x176390: io_loop (poll.c:453) ==61056== by 0x118A68: main (connectd.c:2082) ==61056== Address 0x4bd8850 is 16 bytes inside a block of size 48 free'd ==61056== at 0x483DFAF: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==61056== by 0x1860E6: tal_resize_ (tal.c:699) ==61056== by 0x1373DD: tal_arr_remove_ (utils.c:184) ==61056== by 0x11D508: destroy_subd (multiplex.c:930) ==61056== by 0x1850A4: notify (tal.c:240) ==61056== by 0x1855BB: del_tree (tal.c:402) ==61056== by 0x18560D: del_tree (tal.c:412) ==61056== by 0x18560D: del_tree (tal.c:412) ==61056== by 0x185957: tal_free (tal.c:486) ==61056== by 0x1183BC: peer_discard (connectd.c:1861) ==61056== by 0x11869E: recv_req (connectd.c:1942) ==61056== by 0x12774B: handle_read (daemon_conn.c:35) ``` So simply make the subds children of `peer` not the `peer->subds` array. The only effect is that drain_peer() can't simply free the subds array but must free the subds one at a time. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	d31420211a	connectd: add counters to each peer connection. This allows us to detect when lightningd hasn't seen our latest disconnect/reconnect; in particular, we would hit the following pattern: 1. lightningd says to connect a subd. 2. connectd disconnects and reconnects. 3. connectd reads message, connects subd. 4. lightningd reads disconnect and reconnect, sends msg to connect to subd again. 5. connectd asserts because subd is already connected. This way connectd can tell if lightningd is talking about the previous connection, and ignore it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	41b379ed89	lightningd: hand fds to connectd, not receive them from connectd. Before this patch: 1. connectd says it's connected (peer_connected) 2. we tell connectd we want to talk about each channel (peer_make_active) 3. connectd gives us an fd for each channel, and we connect it to a subd (peer_active) 4. OR, connectd says it sent something about a channel we didn't tell it about, with an fd (peer_active) Now: 1. connectd says it's connected (peer_connected) 2. we start all appropriate subds and tell connectd to what channels/fds (peer_connect_subd). 3. if connectd says it sent something about a channel we didn't tell it about, we either tell it to hang up (peer_final_msg), or connect a new opening daemon (peer_connect_subd). This is the minimal-size patch, which is why we create socket pairs in so many places to use the existing functions. Many cleanups are possible, since the new flow is so simple. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	ab0e5d30ee	connectd: don't io_halfclose() We don't io_halfclose() the other side, we io_sock_shutdown(), which can leave both sides unset: ``` lightningd-2: 2022-06-07T11:00:05.053Z BROKEN connectd: FATAL SIGNAL 6 (version 57e1af2) lightningd-2: 2022-06-07T11:00:05.053Z BROKEN connectd: backtrace: common/daemon.c:38 (send_backtrace) 0x563b9b603af7 lightningd-2: 2022-06-07T11:00:05.053Z BROKEN connectd: backtrace: common/daemon.c:46 (crashdump) 0x563b9b603b4b lightningd-2: 2022-06-07T11:00:05.053Z BROKEN connectd: backtrace: /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 ((null)) 0x7fe6e8d4f08f lightningd-2: 2022-06-07T11:00:05.053Z BROKEN connectd: backtrace: ../sysdeps/unix/sysv/linux/raise.c:51 (__GI_raise) 0x7fe6e8d4f00b lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:79 (__GI_abort) 0x7fe6e8d2e858 lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: /build/glibc-SzIz7B/glibc-2.31/assert/assert.c:92 (__assert_fail_base) 0x7fe6e8d2e728 lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: /build/glibc-SzIz7B/glibc-2.31/assert/assert.c:101 (__GI___assert_fail) 0x7fe6e8d3ffd5 lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: ccan/ccan/io/io.c:65 (next_plan) 0x563b9b64fd7e lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: ccan/ccan/io/io.c:407 (do_plan) 0x563b9b6508f0 lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: ccan/ccan/io/io.c:423 (io_ready) 0x563b9b650984 lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: ccan/ccan/io/poll.c:453 (io_loop) 0x563b9b652c25 lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: connectd/connectd.c:2037 (main) 0x563b9b5f5793 lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: ../csu/libc-start.c:308 (__libc_start_main) 0x7fe6e8d30082 lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: (null):0 ((null)) 0x563b9b5ebf6d lightningd-2: 2022-06-07T11:00:05.054Z BROKEN connectd: backtrace: (null):0 ((null)) 0xffffffffffffffff ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	40145e619b	connectd: remove the redundant "already connected" logic. It should now be reliable, so we don't need this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	9b6c97437e	connectd: remove reconnection logic. We don't have to put aside a peer which is reconnecting and wait for lightningd to remove the old peer, we can now simply free the old and add the new. Fixes: #5240 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	7b0c11efb4	connectd: don't let peer close take forever. Sending any pending messages to peer before hanging up is a courtesy: give it 5 seconds before simply closing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	8678c5efb3	connectd: release peer soon as lightningd tells us. Now we have separate peer draining logic, we can simply use it when connectd tells us to release the peer, without waiting. (We could simply free the peer, but that's a bit rude, as messages can get lost). This removes various complex flags and logic we had before. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Fixed: `connectd`: various crashes and issues fixed by simplification and rewrite.	2022-07-18 20:50:04 -05:00
Rusty Russell	e856accb7d	connectd: send cleanup messages however peer is freed. This lets us tal_free() it wherever we want, rather than always freeing via peer_discard. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	9dc3880360	connectd: put peer into "draining" mode when we want to close it. This removes it from the hashtable, and forces it to do nothing but send out any remaining packets, then close. It is, in effect, reduced to a stub, with no further interactions with the rest of the system (all subds are freed already). Also removes the need for an explicit "final_msg" too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	37ff013c2c	connectd: fix subd tal parents. This came out in a later patch: freeing the peer->subds doesn't actually free the subds, because they're reparented onto subd->conn, which is a child of peer itself. This breaks because when the peer is finally freed, destroy_subd is called, and expects to find itself in peer->subds (but we made that NULL when we manually freed it!). Fix this, and make it obvious that we tal_steal it. ``` ightning_connectd: FATAL SIGNAL 11 (version v0.11.0.1-25-gbf025aa-modded) 0x55de2a1b8b94 send_backtrace common/daemon.c:33 0x55de2a1b8c3e crashdump common/daemon.c:46 0x7fe2be2fc08f ??? /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 0x55de2a1af41e destroy_subd connectd/multiplex.c:1119 0x55de2a217686 notify ccan/ccan/tal/tal.c:240 0x55de2a217b9d del_tree ccan/ccan/tal/tal.c:402 0x55de2a217bef del_tree ccan/ccan/tal/tal.c:412 0x55de2a217bef del_tree ccan/ccan/tal/tal.c:412 0x55de2a217f39 tal_free ccan/ccan/tal/tal.c:486 0x55de2a1aa116 peer_discard connectd/connectd.c:1834 0x55de2a1aa38d recv_req connectd/connectd.c:1903 0x55de2a1b9121 handle_read common/daemon_conn.c:31 0x55de2a205a35 next_plan ccan/ccan/io/io.c:59 0x55de2a20663d do_plan ccan/ccan/io/io.c:407 0x55de2a20667f io_ready ccan/ccan/io/io.c:417 0x55de2a208972 io_loop ccan/ccan/io/poll.c:453 0x55de2a1aa736 main connectd/connectd.c:2042 0x7fe2be2dd082 __libc_start_main ../csu/libc-start.c:308 0x55de2a1a085d ??? ???:0 0xffffffffffffffff ??? ???:0 ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-18 20:50:04 -05:00
Rusty Russell	6fd8fa4d95	connectd: optimize requests for "recent" gossip. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-15 21:18:29 +09:30
Rusty Russell	92fe871467	connectd: optimize case where peer doesn't want gossip. LND and us send 0xFFFFFFFF to turn off gossip. LDK and Eclair don't seem to turn off gossip at all, but that's OK. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-15 21:18:29 +09:30
Rusty Russell	06e1e119aa	pytest: fix test_gossip_no_empty_announcements flake. This is a side-effect of fixing aging: sometimes, we age our rcvd_filter cache too fast, and thus re-xmit. This breaks our test, since it used dev-disconnect on the channel_announce, but that closes to l3, not l1! ``` > assert l1.rpc.listchannels()['channels'] == [] E AssertionError: assert [{'active': T...ags': 1, ...}] == [] E Left contains 2 more items, first extra item: {'active': True, 'amount_msat': 100000000msat, 'base_fee_millisatoshi': 1, 'channel_flags': 0, ...} ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Fixes: #5403	2022-07-12 21:41:19 +09:30
Rusty Russell	7dd8e27862	connectd: don't insist on ping replies when other traffic is flowing. Got complaints about us hanging up on some nodes because they don't respond to pings in a timely manner (e.g. ACINQ?), but that turned out to be something else. Nonetheless, we've had reports in the past of LND badly prioritizing gossip traffic, and thus important messages can get queued behind gossip dumps! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Changed: connectd: give busy peers more time to respond to pings.	2022-07-09 12:27:05 +09:30
Rusty Russell	32af92145b	update-mocks: handle missing deprecated_apis. This expands update-mocks to be able to handle (simple!) missing symbols which are not functions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-07-09 09:59:52 +09:30
Alex Myers	cbafc0fa33	gossip_store: add flag for spam gossip, update to v10 This will be used to decouple internal use of gossip from what is passed to gossip peers. Updates GOSSIP_STORE_VERSION to 10. Changelog-Changed: gossip_store updated to version 10.	2022-07-06 14:31:19 +09:30
Rusty Russell	9ab7c8aed3	connectd/test: fix memleak in test. ``` VALGRIND=1 valgrind -q --error-exitcode=7 --track-origins=yes --leak-check=full --show-reachable=yes --errors-for-leak-kinds=all connectd/test/run-netaddress > /dev/null ==2483395== 16 bytes in 1 blocks are still reachable in loss record 1 of 15 ==2483395== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==2483395== by 0x10D59A: autodata_register_ (autodata.c:20) ==2483395== by 0x10EB26: register_autotype_type_to_string (type_to_string.h:77) ==2483395== by 0x10EB6B: register_one_type_to_string0 (type_to_string.c:8) ==2483395== by 0x188C0C: __libc_csu_init (in /home/rusty/devel/cvs/lightning/connectd/test/run-netaddress) ==2483395== by 0x4A3A00F: (below main) (libc-start.c:264) ==2483395== ==2483395== 40 bytes in 1 blocks are still reachable in loss record 2 of 15 ==2483395== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ... ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-06-29 21:07:42 +09:30
Rusty Russell	fd90e5746b	connectd: don't keep around more than one old connection. This was fixed in `1c495ca5a8` ("connectd: fix accidental handling of old reconnections.") and then reverted by the rework in "connectd: avoid use-after-free upon multiple reconnections by a peer". The latter made the race much less likely, since we cleaned up the reconnecting struct once the connection was hung up by the remote node, but it's still theoretically possible. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-06-28 13:47:27 +09:30
Matt Whitlock	83c825945c	connectd: avoid use-after-free upon multiple reconnections by a peer `peer_reconnected` was freeing a `struct peer_reconnected` instance while a pointer to that instance was registered to be passed as an argument to the `retry_peer_connected` callback function. This caused a use-after-free crash when `retry_peer_connected` attempted to reparent the instance to the temporary context. Instead, never have `peer_reconnected` free a `struct peer_reconnected` instance, and only ever allow such an instance to be freed after the `retry_peer_connected` callback has finished with it. To ensure that the instance is freed even if the connection is closed before the callback can be invoked, parent the instance to the connection rather than to the daemon. Absent the need to free `struct peer_reconnected` instances outside of the `retry_peer_connected` callback, there is no use for the `reconnected` hashtable, so remove it as well. See: https://github.com/ElementsProject/lightning/issues/5282#issuecomment-1141454255 Fixes: #5282 Fixes: #5284 Changelog-Fixed: connectd no longer crashes when peers reconnect.	2022-06-28 13:47:27 +09:30
Rusty Russell	4ee55acc71	connectd: don't start connecting in parallel in peer_conn_closed. The crash below from @zerofeerouting left me confused. The invalid value in fmt_wireaddr_internal is a telltale sign of use-after-free. This backtrace shows us destroying the conn twice: what's happening? Well, tal carefully protects against destroying twice: it's not that unusual to free something in a destructor which has already been freed. So this indicates that there are two io_conn hanging off one struct connecting, which isn't supposed to happen! We deliberately call try_connect_one_addr() initially, then inside the io_conn destructor. But due to races in connectd vs lightningd connection state, we added a fix which allows a connect command to sit around while the peer is cleaning up (`6cc9f37cab`) and get fired off when it's done. But what if, in the chaos, we are already connecting again? Now we'll end up with two connections. Fortunately, we have a `conn` pointer inside struct connecting, which (with a bit of additional care) we can ensure is only non-NULL while we're actually trying to connect. This lets us check that before firing off a new connection attempt in peer_conn_closed. ``` lightning_connectd: FATAL SIGNAL 6 (version v0.11.2rc2-2-g8f7e939) 0x5614a4915ae8 send_backtrace common/daemon.c:33 0x5614a4915b72 crashdump common/daemon.c:46 0x7ffa14fcd72f ??? ???:0 0x7ffa14dc87bb ??? ???:0 0x7ffa14db3534 ??? ???:0 0x5614a491fc71 fmt_wireaddr_internal common/wireaddr.c:255 0x5614a491fc7a fmt_wireaddr_internal_ common/wireaddr.c:257 0x5614a491ea6b type_to_string_ common/type_to_string.c:32 0x5614a490beaa destroy_io_conn connectd/connectd.c:754 0x5614a494a2f1 destroy_conn ccan/ccan/io/poll.c:246 0x5614a494a313 destroy_conn_close_fd ccan/ccan/io/poll.c:252 0x5614a4953804 notify ccan/ccan/tal/tal.c:240 0x5614a49538d6 del_tree ccan/ccan/tal/tal.c:402 0x5614a4953928 del_tree ccan/ccan/tal/tal.c:412 0x5614a4953e07 tal_free ccan/ccan/tal/tal.c:486 0x5614a4908b7a try_connect_one_addr connectd/connectd.c:870 0x5614a490bef1 destroy_io_conn connectd/connectd.c:759 0x5614a494a2f1 destroy_conn ccan/ccan/io/poll.c:246 0x5614a494a313 destroy_conn_close_fd ccan/ccan/io/poll.c:252 0x5614a4953804 notify ccan/ccan/tal/tal.c:240 0x5614a49538d6 del_tree ccan/ccan/tal/tal.c:402 0x5614a4953e07 tal_free ccan/ccan/tal/tal.c:486 0x5614a4948f08 io_close ccan/ccan/io/io.c:450 0x5614a4948f59 do_plan ccan/ccan/io/io.c:401 0x5614a4948fe1 io_ready ccan/ccan/io/io.c:417 0x5614a494a8e6 io_loop ccan/ccan/io/poll.c:453 0x5614a490c12f main connectd/connectd.c:2164 0x7ffa14db509a ??? ???:0 0x5614a4904e99 ??? ???:0 0xffffffffffffffff ??? ???:0 ``` Fixes: #5339 Changelog-Fixed: connectd: occasional crash when we reconnect to a peer quickly. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-06-28 13:46:59 +09:30
Vincenzo Palazzo	7ff62b4a00	lightningd: remove `DEFAULT_PORT` global definition Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>	2022-06-28 06:09:01 +09:30
Rusty Russell	a1b8b40d13	connectd: fix debug message on bind fail. It doesn't get the right errno, and it says "create" not "bind". ``` 2022-05-20T03:04:46.498Z DEBUG connectd: Failed to create 2 socket: Success 2022-05-20T03:04:46.500Z DEBUG connectd: REPLY WIRE_CONNECTD_INIT_REPLY with 0 fds 2022-05-20T03:04:46.501Z DEBUG connectd: connectd_init_done 2022-05-20T03:04:46.503Z BROKEN connectd: Failed to bind socket for 127.0.0.1:37871: Address already in use ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-06-27 17:21:35 +09:30
Michael Schmoock	a2b75b66ba	connectd: use dev_allow_localhost for remote_addr testing Before this fix, there was the situation where a DEVELOPER=1 node would announce non-public addresses on mainnet if detected. Since there are some nodes on the internet that falsely report local addresses we move this 'testing feature' to 'dev-allow-localhost' nodes. Changelog-None	2022-06-17 20:30:16 +09:30
Michael Schmoock	033ac323d1	connectd: prefer IPv6 when available Changelog-Changed: connectd: prefer IPv6 connections when available.	2022-06-17 20:30:16 +09:30
Rusty Russell	0c9017fb76	connectd: shrink max filter size. 10,000 per peer was too much. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2022-06-17 14:14:02 +09:30

371 Commits