Commit Graph

591 Commits

Author SHA1 Message Date
Rusty Russell 0ca0db765a gossipd: fix crash if we truncate store.
Entries we've already loaded expect to exist in the store.  We could go
back and remove them all, but instead just truncate at the known-good
point.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-05-01 11:59:12 +02:00
Rusty Russell b248bb155a tools/bench-gossipd.sh: make it work (where possible) with DEVELOPER=0
Some tests require dev support, but the rest can run.  We simplify
the gossip_store output so it's the same in non-dev mode too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-24 13:46:39 -05:00
Rusty Russell 0fc42415c2 gossipd/routing: remove BFG implementation.
Now we can benchmark, and remove 500 bytes per node.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:35093-37907(36146+/-1.1e+03)
	vsz_kb:555168
	store_rewrite_sec:12.120000-13.750000(12.7+/-0.6)
	listnodes_sec:1.270000-1.370000(1.322+/-0.039)
	listchannels_sec:29.770000-31.600000(30.82+/-0.64)
	routing_sec:0.00
	peer_write_all_sec:63.630000-67.850000(65.432+/-1.7)

MCP notable changes from pre-Dijkstra (>1 stddev):
	-vsz_kb:577456
	+vsz_kb:555168
	-routing_sec:60.70
	+routing_sec:12.04

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell cfdb012b30 gossipd: re-add fuzz logic to routing.
Do it inside the can_reach() function, which is less optimal for BFG
which does 20 ops on the same channel, but fine for Dijkstra.

This does have a measurable cost, so we might want to use
non-cryptographic fuzz in future:

$ gossipd/test/run-bench-find_route 100000 100:

Before:
	100 (100 succeeded) routes in 100000 nodes in 97346 msec (973461784 nanoseconds per route)

After:
	100 (100 succeeded) routes in 100000 nodes in 113381 msec (1133813412 nanoseconds per route)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell e197956032 gossipd/routing: Iterate on Dijkstra when route is too long.
If a route is too long, we try to bias Dijkstra towards choosing a
shorter route by adding a per-hop cost.  We do a naive "shortest path"
pass, then using that cost as a ceiling on per-hop cost, we do a
binary search.

There are some subtleties: we use risk rather than total as our
counter field (we normally bias this by 1 anyway, so it's easy to make
that a variable), and we set riskfactor to a minimal value once we're
iterating.  It's good enough to get a solution, we don't need to do a
2-dimensional search on riskfactor and riskbias.

Of course, this is extremely slow if we hit it on our benchmark,
though it doesn't happen in a more realistic network:

$ gossipd/test/run-bench-find_route 100000 100:

Before:
	100 (79 succeeded) routes in 100000 nodes in 25341 msec (253412314 nanoseconds per route)

After:
	100 (100 succeeded) routes in 100000 nodes in 97346 msec (973461784 nanoseconds per route)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
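The iteration described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the gossipd code: the six-node fee matrix, `struct route`, `add_chan()`, `dijkstra()` and `find_bias()` are all hypothetical names, the Dijkstra is a plain O(n^2) scan rather than a heap, and the risk and riskfactor handling is left out. The ceiling used here is the fee of the fewest-hops route plus one, which is an assumption chosen so that the upper end of the search is guaranteed to produce a short enough route.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_NODES 6
#define NO_EDGE UINT64_MAX
#define UNREACHED UINT64_MAX

struct route {
	int hops;	/* -1 if dst is unreachable */
	uint64_t fees;	/* real fees along the chosen route */
};

/* Symmetric channel with the same fee in both directions. */
static void add_chan(uint64_t fee[NUM_NODES][NUM_NODES],
		     int a, int b, uint64_t f)
{
	fee[a][b] = fee[b][a] = f;
}

/* O(n^2) Dijkstra; each hop costs (use_fees ? fee : 0) + bias. */
static struct route dijkstra(uint64_t fee[NUM_NODES][NUM_NODES],
			     int src, int dst, bool use_fees, uint64_t bias)
{
	uint64_t cost[NUM_NODES], fees[NUM_NODES];
	int hops[NUM_NODES];
	bool done[NUM_NODES] = { false };
	struct route r = { -1, 0 };

	for (int i = 0; i < NUM_NODES; i++)
		cost[i] = UNREACHED;
	cost[src] = 0;
	fees[src] = 0;
	hops[src] = 0;

	for (;;) {
		int best = -1;

		/* Pick the cheapest node we have not finished yet. */
		for (int i = 0; i < NUM_NODES; i++)
			if (!done[i] && cost[i] != UNREACHED
			    && (best < 0 || cost[i] < cost[best]))
				best = i;
		if (best < 0)
			return r;
		if (best == dst)
			break;
		done[best] = true;

		for (int j = 0; j < NUM_NODES; j++) {
			uint64_t c;

			if (fee[best][j] == NO_EDGE || done[j])
				continue;
			c = cost[best] + (use_fees ? fee[best][j] : 0) + bias;
			if (c < cost[j]) {
				cost[j] = c;
				fees[j] = fees[best] + fee[best][j];
				hops[j] = hops[best] + 1;
			}
		}
	}
	r.hops = hops[dst];
	r.fees = fees[dst];
	return r;
}

/* Find the smallest per-hop bias giving a route of at most max_hops. */
static bool find_bias(uint64_t fee[NUM_NODES][NUM_NODES], int src, int dst,
		      int max_hops, uint64_t *bias)
{
	struct route r = dijkstra(fee, src, dst, true, 0);
	uint64_t lo = 0, hi;

	assert(max_hops >= 0);
	if (r.hops < 0)
		return false;
	if (r.hops <= max_hops) {
		*bias = 0;
		return true;
	}

	/* Naive "shortest path" pass: every hop costs 1, fees ignored. */
	r = dijkstra(fee, src, dst, false, 1);
	if (r.hops > max_hops)
		return false;

	/* lo is known to be too long, hi is known to be short enough. */
	hi = r.fees + 1;
	while (hi - lo > 1) {
		uint64_t mid = lo + (hi - lo) / 2;

		if (dijkstra(fee, src, dst, true, mid).hops <= max_hops)
			hi = mid;
		else
			lo = mid;
	}
	*bias = hi;
	return true;
}
```

The binary search relies on the hop count of the cheapest route never increasing as the per-hop bias grows, which holds for any additive per-hop cost.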
Rusty Russell f8ffae837d gossipd: speed Dijkstra a little.
Our uintmap can be a little slow with all the reallocation, so leave
NULL entries and walk to find the first one.  Since we don't clean
them up, keep a cache of where the min non-all-NULL value is in the
heap.

The benefit is clearer on really large tests, so here's 1M nodes:

Comparison using gossipd/test/run-bench-find_route 1000000 10:

Before:
	10 (10 succeeded) routes in 1000000 nodes in 91995 msec (9199532898 nanoseconds per route)

After:
	10 (10 succeeded) routes in 1000000 nodes in 20605 msec (2060539287 nanoseconds per route)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell 7caa37f0f1 gossipd: implement Dijkstra.
Use a uintmap as our minheap.

Note that Dijkstra can give overlength routes, so some checks are disabled.

Comparison using gossipd/test/run-bench-find_route 100000 10:

Before:
	10 (10 succeeded) routes in 100000 nodes in 120087 msec (12008708402 nanoseconds per route)
After:
	10 (10 succeeded) routes in 100000 nodes in 2269 msec (226925462 nanoseconds per route)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell 4d84a436f5 gossipd: temporarily disable fuzz in routing.
This allows precise comparison between Dijkstra and Bellman-Ford without
worrying about fuzz.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell 594af8049b gossipd: extract common functionality.
This will be needed by Dijkstra as well.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
Rusty Russell 6dfa46d65a gossipd/test: add test for handling overlong routes.
This is a weakness with Dijkstra, so write an explicit unit test that
we can find a short enough (but more expensive) route.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-18 06:33:09 +00:00
trueptolemy 77236caa91 gossipd: fix the check for node announcement in broadcast_state_check()
This should check whether node_id_1 was stored in pubkeys, rather than checking the scid.
2019-04-16 00:20:26 +00:00
trueptolemy 274f156b28 gossipd: rename empty_node_map() to new_node_map()
empty_node_map() sounds like a destructor. new_node_map() makes sense and is better.
2019-04-14 23:12:00 +00:00
trueptolemy ee036a2e36 Gossipd: change the pending_cannouncement list to htable 2019-04-14 05:39:31 +00:00
Rusty Russell 261921dee2 gossipd: adjust peers' broadcast_offset when compacting store.
When we compact the store, we need to adjust the broadcast index for
peers so they know where they're up to.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell fdb42c3170 gossipd: don't keep channel_updates in memory.
This requires some trickiness when we want to re-add unannounced channels
to the store after compaction, so we extract a common "copy_message" to
transfer from old store to new.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:36034-37853(37109.8+/-5.9e+02)
	vsz_kb:577456
	store_rewrite_sec:12.490000-13.250000(12.862+/-0.27)
	listnodes_sec:1.250000-1.480000(1.364+/-0.09)
	listchannels_sec:30.820000-31.480000(31.068+/-0.24)
	routing_sec:26.940000-27.990000(27.616+/-0.39)
	peer_write_all_sec:65.690000-68.600000(66.698+/-0.99)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:1202316
	+vsz_kb:577456

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell 0370ed2eca gossipd: use pread in the store.
The next patch causes us to access the store while loading (we read
channel_updates for local peers), which messes up loading due to the
lseek involved.

Using pread() is atomic with seek & read, and also a bit more
efficient.  Make the header contiguous too, while we're here.

We don't need pwrite: we always open with O_APPEND which means the
seek-to-end is implicit.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:36771-38289(37529.6+/-5.3e+02)
	vsz_kb:1202316
	store_rewrite_sec:12.460000-13.280000(12.784+/-0.29)
	listnodes_sec:1.240000-1.410000(1.34+/-0.058)
	listchannels_sec:29.850000-31.840000(30.908+/-0.69)
	routing_sec:27.800000-31.790000(28.822+/-1.5)
	peer_write_all_sec:66.200000-68.720000(67.44+/-0.84)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:39207-45089(41374.6+/-2.2e+03)
	+store_load_msec:36771-38289(37529.6+/-5.3e+02)
	-store_rewrite_sec:15.090000-16.790000(15.654+/-0.63)
	+store_rewrite_sec:12.460000-13.280000(12.784+/-0.29)
	-peer_write_all_sec:66.830000-76.850000(71.976+/-3.6)
	+peer_write_all_sec:66.200000-68.720000(67.44+/-0.84)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
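The point about pread() and O_APPEND in the commit above can be illustrated with a small sketch. The helper names `open_store()` and `read_at()` are hypothetical and are not the gossip_store functions; the only calls assumed are the POSIX `open`, `pread` and friends.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) a store for appending and reading.
 * With O_APPEND every write() lands at the end of the file, so no
 * explicit seek (and no pwrite) is needed for writes. */
static int open_store(const char *path)
{
	return open(path, O_RDWR | O_APPEND | O_CREAT, 0600);
}

/* Hypothetical helper: read exactly len bytes at offset off.
 * pread() takes the offset as an argument, so the fd's own file
 * position is neither used nor changed. */
static bool read_at(int fd, void *buf, size_t len, off_t off)
{
	char *p = buf;

	assert(off >= 0);
	while (len > 0) {
		ssize_t r = pread(fd, p, len, off);

		if (r <= 0)
			return false;	/* error or short file */
		p += r;
		off += r;
		len -= (size_t)r;
	}
	return true;
}
```

Because `read_at()` never moves the file position, a reader that is walking the store sequentially is not disturbed when another code path fetches a record from elsewhere in the file.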
Rusty Russell 2135c7a024 gossipd: allow reading from the store during load.
When we no longer keep channel_updates in memory, there's a path where
we access them on load: when we promote a local channel to an
announced channel.

This breaks at the moment, since gs->fd == -1; change it to a writable
flag instead.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell aeb72a05e3 gossipd: remove some fields from struct chan.
The txout_script field is unused; the local_disable only applies to
the handful of local channels, so move that into a hash table.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:39207-45089(41374.6+/-2.2e+03)
	vsz_kb:1202316
	store_rewrite_sec:15.090000-16.790000(15.654+/-0.63)
	listnodes_sec:1.290000-3.790000(1.938+/-0.93)
	listchannels_sec:30.190000-32.120000(31.31+/-0.69)
	routing_sec:28.220000-31.340000(29.314+/-1.2)
	peer_write_all_sec:66.830000-76.850000(71.976+/-3.6)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:35107-37944(36686+/-1e+03)
	+store_load_msec:39207-45089(41374.6+/-2.2e+03)
	-vsz_kb:1218036
	+vsz_kb:1202316
	-listchannels_sec:28.510000-30.270000(29.6+/-0.6)
	+listchannels_sec:30.190000-32.120000(31.31+/-0.69)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell 3280466e19 gossipd: don't keep channel_announcement messages in memory.
MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:35107-37944(36686+/-1e+03)
	vsz_kb:1218036
	store_rewrite_sec:14.060000-17.970000(15.966+/-1.6)
	listnodes_sec:1.270000-1.350000(1.314+/-0.034)
	listchannels_sec:28.510000-30.270000(29.6+/-0.6)
	routing_sec:30.230000-31.510000(30.83+/-0.44)
	peer_write_all_sec:67.390000-70.710000(68.568+/-1.2)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:1780516
	+vsz_kb:1218036

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell 2fd4a0121f gossipd: unify is_chan_public / is_chan_announced.
We used to have a `struct chan` while we're waiting for an update; now we
keep that internally.  So a `struct chan` without a channel_announcement
in the store is private, and any other is public.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell aafc489edb gossipd: remove info fields from struct node.
Reload them from disk if they do listnodes.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:35390-38659(37336.4+/-1.3e+03)
	vsz_kb:1780516
	store_rewrite_sec:13.800000-16.800000(15.02+/-0.98)
	listnodes_sec:1.280000-1.530000(1.382+/-0.096)
	listchannels_sec:28.700000-30.440000(29.34+/-0.68)
	routing_sec:30.120000-31.080000(30.526+/-0.35)
	peer_write_all_sec:65.910000-76.850000(69.462+/-4.1)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:1792996
	+vsz_kb:1780516
	-listnodes_sec:1.030000-1.120000(1.068+/-0.032)
	+listnodes_sec:1.280000-1.530000(1.382+/-0.096)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell 0608c36301 gossipd: don't keep node_announcement messages in memory.
MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:34779-38628(36903.4+/-1.4e+03)
	vsz_kb:1792996
	store_rewrite_sec:14.440000-15.040000(14.672+/-0.24)
	listnodes_sec:1.030000-1.120000(1.068+/-0.032)
	listchannels_sec:27.860000-32.850000(30.05+/-1.7)
	routing_sec:30.020000-31.700000(31.044+/-0.56)
	peer_write_all_sec:65.100000-70.600000(68.422+/-2)

-vsz_kb:1780516
+vsz_kb:1792996
-listnodes_sec:1.280000-1.530000(1.382+/-0.096)
+listnodes_sec:1.030000-1.120000(1.068+/-0.032)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:30640-33236(32202+/-8.7e+02)
	+store_load_msec:34779-38628(36903.4+/-1.4e+03)
	-vsz_kb:1812956
	+vsz_kb:1792996
	-listnodes_sec:0.590000-0.660000(0.62+/-0.033)
	+listnodes_sec:1.030000-1.120000(1.068+/-0.032)
	-peer_write_all_sec:60.380000-61.320000(60.836+/-0.37)
	+peer_write_all_sec:65.100000-70.600000(68.422+/-2)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell cb297b0a1b gossipd: free tmpctx children in gossip_store_load loop.
We're accumulating children, and we'll get more in the successive
patches.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell 3ef767fd52 gossipd: don't use cached node_announcement for redundancy checking
Re-parse the existing message, since we're going to get rid of those
fields.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell e02f5817fe gossipd: don't create struct chan for yet-to-be-updated channels.
We currently create a struct chan when we receive a `channel_announcement`,
but we can only broadcast once we have a `channel_update` (since that
provides the timestamp).

This means a `struct chan` can be in a weird state where it exists,
but is unusable (can't use without an update), and also means we need to
keep the channel_announcement message around until an update arrives, so
we can put it in the gossip_store.

Instead, keep track of these "unupdated" channels separately, and check
for them in all the places we search for a specific channel to update.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:30640-33236(32202+/-8.7e+02)
	vsz_kb:1812956
	store_rewrite_sec:13.410000-16.970000(14.438+/-1.3)
	listnodes_sec:0.590000-0.660000(0.62+/-0.033)
	listchannels_sec:28.140000-29.560000(28.816+/-0.56)
	routing_sec:29.530000-32.590000(30.352+/-1.1)
	peer_write_all_sec:60.380000-61.320000(60.836+/-0.37)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:1812904
	+vsz_kb:1812956
	-store_rewrite_sec:21.390000-27.070000(23.596+/-2.4)
	+store_rewrite_sec:13.410000-16.970000(14.438+/-1.3)
	-listnodes_sec:1.120000-1.230000(1.176+/-0.044)
	+listnodes_sec:0.590000-0.660000(0.62+/-0.033)
	-listchannels_sec:38.900000-50.580000(44.716+/-3.9)
	+listchannels_sec:28.140000-29.560000(28.816+/-0.56)
	-routing_sec:45.080000-48.160000(46.814+/-1.1)
	+routing_sec:29.530000-32.590000(30.352+/-1.1)
	-peer_write_all_sec:58.780000-87.150000(72.278+/-9.7)
	+peer_write_all_sec:60.380000-61.320000(60.836+/-0.37)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell d8aee68ba8 gossipd: handle duplicate nodes from unverified channel_announces properly.
If we have a channel_announcement, we catch any node_announcement for
either end while we validate the channel_announcement.  But if we have
multiple channel_announcements and the first one failed to verify, it
would remove this catch, meaning we'd discard following node_announcements
even though there was a pending channel_announcement.

The answer is to use a simple reference count, and as a further
optimization, only place the `pending_node_announce` if there's no
node already.

We also move the process_pending_node_announcement() calls lower down,
so *any* new channel creation checks it.  This is more robust, and
will prove useful for the next patch, where we can use the same
mechanism to handle node_announcements on channel_announcements which
are verified, but don't yet have a channel_update.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
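The reference counting mentioned in the commit above might look something like the sketch below. The struct layout and the `pending_ref()` and `pending_unref()` helpers are hypothetical and only illustrate the idea: the catch stays in place until the last pending channel_announcement for that node has been resolved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-node catch. */
struct pending_node_announce {
	unsigned int refcount;	/* channel_announcements still being checked */
	bool have_announce;	/* a node_announcement arrived while waiting */
};

/* A new channel_announcement naming this node starts verification. */
static void pending_ref(struct pending_node_announce *p)
{
	p->refcount++;
}

/* One verification finished (pass or fail).  Returns true when the
 * catch can be removed because nothing else is waiting on it. */
static bool pending_unref(struct pending_node_announce *p)
{
	assert(p->refcount > 0);
	return --p->refcount == 0;
}
```

With two channel_announcements in flight, a failure of the first one only drops the count to one, so a node_announcement caught in the meantime is still there when the second one verifies.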
Rusty Russell da884751e8 gossipd: make routing_add_channel_update discard old timestamps.
This is currently done higher up, in handle_channel_update(), but
that's one reason why handle_channel_update() has to do a channel
lookup.  Moving the check down means handle_channel_update() can do a
minimal "get node id for this channel" so it can check the signature.

This helps, because the chan lookup semantics are changing in the next
few patches.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell 6b9069ee28 broadcast: don't keep payload pointer.
If we need the payload, pull it from the gossip store.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:30189-52561(39416.4+/-8.8e+03)
	vsz_kb:1812904
	store_rewrite_sec:21.390000-27.070000(23.596+/-2.4)
	listnodes_sec:1.120000-1.230000(1.176+/-0.044)
	listchannels_sec:38.900000-50.580000(44.716+/-3.9)
	routing_sec:45.080000-48.160000(46.814+/-1.1)
	peer_write_all_sec:58.780000-87.150000(72.278+/-9.7)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:2288784
	+vsz_kb:1812904
	-store_rewrite_sec:38.060000-39.130000(38.426+/-0.39)
	+store_rewrite_sec:21.390000-27.070000(23.596+/-2.4)
	-listnodes_sec:0.750000-0.850000(0.794+/-0.042)
	+listnodes_sec:1.120000-1.230000(1.176+/-0.044)
	-listchannels_sec:30.740000-31.760000(31.096+/-0.35)
	+listchannels_sec:38.900000-50.580000(44.716+/-3.9)
	-routing_sec:29.600000-33.560000(30.472+/-1.5)
	+routing_sec:45.080000-48.160000(46.814+/-1.1)
	-peer_write_all_sec:49.220000-52.690000(50.892+/-1.3)
	+peer_write_all_sec:58.780000-87.150000(72.278+/-9.7)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell da845b660b gossipd: gossip_store_get() to load a single store entry.
This will allow us to load on demand, and not keep all messages in
memory.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell 1f08cfb3e3 gossipd: use file offset within store as broadcast index.
Instead of an arbitrary counter, we can use the file offset for our
partial ordering, removing a field.  It takes some care when we compact
the store, however, as this field changes.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:34271-35283(34789.6+/-3.3e+02)
	vsz_kb:2288784
	store_rewrite_sec:38.060000-39.130000(38.426+/-0.39)
	listnodes_sec:0.750000-0.850000(0.794+/-0.042)
	listchannels_sec:30.740000-31.760000(31.096+/-0.35)
	routing_sec:29.600000-33.560000(30.472+/-1.5)
	peer_write_all_sec:49.220000-52.690000(50.892+/-1.3)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:35685-38538(37090.4+/-9.1e+02)
	+store_load_msec:34271-35283(34789.6+/-3.3e+02)
	-vsz_kb:2288768
	+vsz_kb:2288784
	-peer_write_all_sec:51.140000-58.350000(55.69+/-2.4)
	+peer_write_all_sec:49.220000-52.690000(50.892+/-1.3)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell ec50ec6a71 gossipd: make gossip loading stats accurate.
They didn't count the header sizes when reporting bytes, which is
misleading.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell eb4564c3cd gossipd: embed broadcast information into each structure.
This is more compact, but also required once we replace the arbitrary
"index" with an actual offset into the gossip store.  That will let us
remove the in-memory variants entirely.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:35685-38538(37090.4+/-9.1e+02)
	vsz_kb:2288768
	store_rewrite_sec:35.530000-41.230000(37.904+/-2.3)
	listnodes_sec:0.720000-0.810000(0.762+/-0.041)
	listchannels_sec:30.750000-35.990000(32.704+/-2)
	routing_sec:29.570000-34.010000(31.374+/-1.8)
	peer_write_all_sec:51.140000-58.350000(55.69+/-2.4)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:2621808
	+vsz_kb:2288768

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell 62918fcb3b gossip_store: avoid gratuitous copy on load.
Doesn't make measurable difference, but an obvious optimization.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
Rusty Russell 617c23e735 gossipd: use u32 for timestamp.
We used an s64 so we could use -1 and save a check, but that's just
silly as we have adjacent non-u64 fields: wastes 7 bytes per node
and 16 per channel.

Interestingly, this seemed to make us a little slower for some reason.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:35569-38776(37169.8+/-1.2e+03)
	vsz_kb:2621808
	store_rewrite_sec:35.870000-40.290000(38.14+/-1.6)
	listnodes_sec:0.740000-0.800000(0.768+/-0.023)
	listchannels_sec:29.820000-32.730000(30.972+/-0.99)
	routing_sec:30.110000-30.590000(30.346+/-0.18)
	peer_write_all_sec:52.420000-59.160000(54.692+/-2.5)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:32825-36365(34615.6+/-1.1e+03)
	+store_load_msec:35569-38776(37169.8+/-1.2e+03)
	-vsz_kb:2637488
	+vsz_kb:2621808
	-store_rewrite_sec:35.150000-36.200000(35.59+/-0.4)
	+store_rewrite_sec:35.870000-40.290000(38.14+/-1.6)
	-listnodes_sec:0.590000-0.710000(0.682+/-0.046)
	+listnodes_sec:0.740000-0.800000(0.768+/-0.023)
	-peer_write_all_sec:49.020000-52.890000(50.376+/-1.5)
	+peer_write_all_sec:52.420000-59.160000(54.692+/-2.5)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-11 18:31:34 -07:00
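The space argument in the commit above comes down to alignment padding. The two structs below are hypothetical layouts, not the real gossipd structures; they only show how a 64-bit timestamp next to a 32-bit field forces the whole struct to a larger size than a 32-bit timestamp does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layouts, not the real gossipd structures. */
struct with_s64 {
	int64_t timestamp;	/* -1 would mean "no timestamp yet" */
	uint32_t flags;
};

struct with_u32 {
	uint32_t timestamp;	/* needs an explicit "is it set?" check */
	uint32_t flags;
};

static void print_sizes(void)
{
	/* The narrower field packs next to its 32-bit neighbour. */
	assert(sizeof(struct with_u32) < sizeof(struct with_s64));
	printf("s64 timestamp: %zu bytes\n", sizeof(struct with_s64));
	printf("u32 timestamp: %zu bytes\n", sizeof(struct with_u32));
}
```

On a typical 64-bit ABI the first struct is padded out to 16 bytes while the second is 8; the exact savings in the real structures depend on their neighbouring fields.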
Rusty Russell 0b484b111e gossipd: make more compact getchannels entries.
We can save significant space by combining both sides: so much that we
can reduce the WIRE_LEN_LIMIT to something sane again.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:34467-36764(35517.8+/-7.7e+02)
	vsz_kb:2637488
	store_rewrite_sec:35.310000-36.580000(35.816+/-0.44)
	listnodes_sec:1.140000-2.780000(1.596+/-0.6)
	listchannels_sec:55.390000-58.110000(56.998+/-0.99)
	routing_sec:30.330000-30.920000(30.642+/-0.19)
	peer_write_all_sec:50.640000-53.360000(51.822+/-0.91)

MCP notable changes from previous patch (>1 stddev):
	-store_rewrite_sec:34.720000-35.130000(34.94+/-0.14)
	+store_rewrite_sec:35.310000-36.580000(35.816+/-0.44)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-09 12:37:16 -07:00
Rusty Russell 91849dddc4 wire: use struct node_id for node ids.
Don't turn them to/from pubkeys implicitly.  This means nodeids in the store
don't get converted, but bitcoin keys still do.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:33934-35251(34531.4+/-5e+02)
	vsz_kb:2637488
	store_rewrite_sec:34.720000-35.130000(34.94+/-0.14)
	listnodes_sec:1.020000-1.290000(1.146+/-0.086)
	listchannels_sec:51.110000-58.240000(54.826+/-2.5)
	routing_sec:30.000000-33.320000(30.726+/-1.3)
	peer_write_all_sec:50.370000-52.970000(51.646+/-1.1)

MCP notable changes from previous patch (>1 stddev):
	-store_load_msec:46184-47474(46673.4+/-4.5e+02)
	+store_load_msec:33934-35251(34531.4+/-5e+02)
	-vsz_kb:2638880
	+vsz_kb:2637488
	-store_rewrite_sec:46.750000-48.280000(47.512+/-0.51)
	+store_rewrite_sec:34.720000-35.130000(34.94+/-0.14)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-09 12:37:16 -07:00
Rusty Russell a2fa699e0e Use node_id everywhere for nodes.
I tried to just do gossipd, but it was uncontainable, so this ended up being
a complete sweep.

We didn't get much space saving in gossipd, even though we should save
24 bytes per node.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-09 12:37:16 -07:00
Rusty Russell d4ab0592c5 fixup! gossipd: use simple inline array for nodes with few channels.
Suggested-by: @cdecker
Suggested-by: @niftynei
2019-04-09 12:37:16 -07:00
Rusty Russell b6494c1994 gossipd: use simple inline array for nodes with few channels.
Allocating a htable is overkill for most nodes; we can fit 11 pointers
in the same space (10, since we use 1 to indicate we're using an array).

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:45947-47016(46683.4+/-4e+02)
	vsz_kb:2639240
	store_rewrite_sec:46.950000-49.830000(48.048+/-0.95)
	listnodes_sec:1.090000-1.350000(1.196+/-0.095)
	listchannels_sec:48.960000-57.640000(53.358+/-2.8)
	routing_sec:29.990000-33.880000(31.088+/-1.4)
	peer_write_all_sec:49.360000-53.210000(51.338+/-1.4)

MCP notable changes from previous patch (>1 stddev):
	-vsz_kb:2641316
	+vsz_kb:2639240

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-09 12:37:16 -07:00
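A rough sketch of the hybrid described in the commit above follows. It is an illustration under stated assumptions, not the gossipd layout: `struct chan_set`, `struct spill` and the helper names are hypothetical, the flag is assumed to live in the last pointer slot, and a plain growable array stands in for the hash table to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct chan { int id; };		/* placeholder for this sketch */

#define INLINE_SLOTS 11			/* pointers that fit in the space */
#define INLINE_MAX (INLINE_SLOTS - 1)	/* usable; the last slot is the flag */

/* Stand-in for the hash table: a heap-allocated growable array. */
struct spill {
	struct chan **items;
	size_t count, cap;
};

struct chan_set {
	union {
		/* Inline mode: arr[INLINE_MAX] stays NULL. */
		struct chan *arr[INLINE_SLOTS];
		/* Spill mode: big is non-NULL and overlaps arr[INLINE_MAX]. */
		struct {
			struct chan *unused[INLINE_MAX];
			struct spill *big;
		} s;
	} u;
};

static bool uses_spill(const struct chan_set *set)
{
	return set->u.s.big != NULL;
}

static void chan_set_init(struct chan_set *set)
{
	for (size_t i = 0; i < INLINE_SLOTS; i++)
		set->u.arr[i] = NULL;
}

static bool spill_append(struct spill *big, struct chan *c)
{
	if (big->count == big->cap) {
		size_t cap = big->cap ? big->cap * 2 : 2 * INLINE_SLOTS;
		struct chan **items = realloc(big->items, cap * sizeof(*items));

		if (!items)
			return false;
		big->items = items;
		big->cap = cap;
	}
	big->items[big->count++] = c;
	return true;
}

static bool chan_set_add(struct chan_set *set, struct chan *c)
{
	assert(c != NULL);
	if (!uses_spill(set)) {
		struct spill *big;

		for (size_t i = 0; i < INLINE_MAX; i++) {
			if (!set->u.arr[i]) {
				set->u.arr[i] = c;
				return true;
			}
		}
		/* Inline array is full: move everything to the spill. */
		big = calloc(1, sizeof(*big));
		if (!big)
			return false;
		for (size_t i = 0; i < INLINE_MAX; i++) {
			if (!spill_append(big, set->u.arr[i])) {
				free(big->items);
				free(big);
				return false;
			}
		}
		set->u.s.big = big;
	}
	return spill_append(set->u.s.big, c);
}

static size_t chan_set_count(const struct chan_set *set)
{
	size_t n = 0;

	if (uses_spill(set))
		return set->u.s.big->count;
	for (size_t i = 0; i < INLINE_MAX; i++)
		n += set->u.arr[i] != NULL;
	return n;
}
```

Nodes with ten or fewer channels never allocate; the eleventh channel triggers a one-time move to the larger structure.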
Rusty Russell 417e1bab7d gossipd: use iterator helpers for iterating node channels.
Makes the next step easier.

MCP results from 5 runs, min-max(mean +/- stddev):
	store_load_msec:45791-46917(46330.4+/-3.6e+02)
	vsz_kb:2641316
	store_rewrite_sec:47.040000-48.720000(47.684+/-0.57)
	listnodes_sec:1.140000-1.340000(1.2+/-0.072)
	listchannels_sec:50.970000-54.250000(52.698+/-1.3)
	routing_sec:29.950000-31.010000(30.332+/-0.37)
	peer_write_all_sec:51.570000-52.970000(52.1+/-0.54)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-09 12:37:16 -07:00
Rusty Russell 891ee20a59 tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project
Outputs CSV.  We add some stats for load times in developer mode, so we can
easily read them out.

peer_read_all_sec doesn't work, since we seem to reject about half the
updates for having bad signatures.  It's also very slow...

routing fails, for unknown reasons, so that failure is ignored in routing_sec.

Results from 5 runs, min-max(mean +/- stddev):
	store_load_msec,vsz_kb,store_rewrite_sec,listnodes_sec,listchannels_sec,routing_sec,peer_write_all_sec
	39275-44779(40466.8+/-2.2e+03),2899248,41.010000-44.970000(41.972+/-1.5),2.280000-2.350000(2.304+/-0.025),49.770000-63.390000(59.178+/-5),33.310000-34.260000(33.62+/-0.35),42.100000-44.080000(43.082+/-0.67)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>



Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project-2.patch':

fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project

Suggested-by: @niftynei



Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project-1.patch':

fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project

MCP filename change.



Header from folded patch 'tools-bench-gossipd.sh__dont_print_csv_by_default.patch':

tools/bench-gossipd.sh: don't print CSV by default.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>



Header from folded patch 'fixup!_tools-bench-gossipd.sh__rough_benchmark_for_gossipd_and_the_million_channels_project.patch':

fixup! tools/bench-gossipd.sh: rough benchmark for gossipd and the million channels project

Make shellcheck happy.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell 2bd7df93c6 gossipd: preserve unannounced channels across store compaction.
Otherwise we'd forget them on restart, again.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell c424c42668 gossipd: store local channel updates across restart, even if unannounced.
Either private or simply not enough confirms.  They would have been added
on reconnect, but that's not ideal.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell 7c8f506a0f dev-compact-store-gossip: specific RPC so we can test gossip_store rewrite.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell 5b12007a4f gossipd: dev option to allow unknown channels.
This lets us benchmark without a valid blockchain.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>



Header from folded patch 'fixup!_gossipd__dev_option_to_allow_unknown_channels.patch':

fixup! gossipd: dev option to allow unknown channels.

Suggested-by: @cdecker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell f8f6533dba dev: --dev-gossip-time so gossipd doesn't prune old data.
This is useful for canned data, such as the million channels project.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Rusty Russell b2c93beaed gossipd: use htable instead of simple array for node's channels.
For giant nodes, it seems we spend a lot of time memmoving this array.
Normally we'd go for a linked list, but that's actually hard: each
channel has two nodes, so needs two embedded list pointers, and when
iterating there's no good way to figure out which embedded pointer
we'd be using.

So we (ab)use htable; we don't really need an index, but it's good for
cache-friendly iteration (our main operation).  We can actually change
to a hybrid later to avoid the extra allocation for small nodes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2019-04-08 04:41:43 +00:00
Christian Decker f3c234529e gossip: Cache txout query failures
If we asked `bitcoind` for a txout and it failed we were not storing that
information anywhere, meaning that when we see the channel announcement the
next time we'd be reaching out to `lightningd` and `bitcoind` again, just to
see it fail again. This adds an in-memory cache for these failures so we can
just ignore these the next time around.

Fixes #2503

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2019-04-01 23:54:19 +00:00
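The failure cache described in the commit above could be sketched as below. This is not the gossipd implementation: the fixed-size direct-mapped table, the hash constant, and the names `struct txout_failures`, `note_failure()` and `failed_before()` are assumptions made for illustration. A short_channel_id is treated as an opaque 64-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_BITS 10
#define CACHE_SLOTS ((size_t)1 << CACHE_BITS)

/* Hypothetical in-memory cache of channels whose txout lookup failed. */
struct txout_failures {
	uint64_t scid[CACHE_SLOTS];
	bool used[CACHE_SLOTS];
};

static size_t slot_for(uint64_t scid)
{
	/* Multiplicative hash; keep the top CACHE_BITS bits. */
	return (size_t)((scid * UINT64_C(0x9E3779B97F4A7C15))
			>> (64 - CACHE_BITS));
}

/* Record that the txout query for this channel failed. */
static void note_failure(struct txout_failures *f, uint64_t scid)
{
	size_t i = slot_for(scid);

	f->scid[i] = scid;	/* a collision evicts the older entry */
	f->used[i] = true;
}

/* True if we already know this channel's txout query fails. */
static bool failed_before(const struct txout_failures *f, uint64_t scid)
{
	size_t i = slot_for(scid);

	assert(i < CACHE_SLOTS);
	return f->used[i] && f->scid[i] == scid;
}
```

A caller would check `failed_before()` when a channel_announcement arrives and skip the round trip if it returns true. Since this is only a cache, losing an entry to a collision costs one repeated query and nothing more.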
Christian Decker 426b22fdcb gossip: Bump `gossip_getnodes_reply` result count to be u32 as well
Otherwise we'll just have the same issue once we reach 65k nodes.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
2019-03-27 12:48:52 +01:00
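The 65k limit mentioned in the commit above is the usual 16-bit wraparound. The two helpers below are hypothetical and only show what happens to a count once it no longer fits in a u16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the count as it would appear in a u16 field. */
static uint16_t count_as_u16(size_t n)
{
	return (uint16_t)n;	/* silently keeps only the low 16 bits */
}

/* Hypothetical: the same count in a u32 field. */
static uint32_t count_as_u32(size_t n)
{
	assert(n <= UINT32_MAX);
	return (uint32_t)n;
}
```

With 70000 entries the u16 version reports 4464 (70000 - 65536), while the u32 version reports the true count.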
Christian Decker 25e829c7d1 gossip: Make the `listchannels` reply result count a u32
Fixes #2504

Signed-off-by: Christian Decker <decker.christian@gmail.com>
Reported-by: Antoine Le Calvez <@alecalve>
2019-03-27 12:48:52 +01:00