Commit Graph

1909 Commits

Author SHA1 Message Date
Rusty Russell 1e36a19164 lightningd/peer_control: keep cryptostate.
Like the fd, it's only useful when the peer is not in a daemon, so we
free & NULL it when that happens.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell ccad93edb3 gossipd: add fail_peer.
We use this if it reconnects via another fd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell 5fb1f20898 gossip: make connection own peer.
We steal it when we're closing connection, but we normally want to forget
it if connection just dies.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell aef745e37d peer_control: simplify code flow in depth callback.
I modified it to use states, and messed it up.  Rewrite it to be
far simpler.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell 780b3870ad lightningd: peer state cleanup.
1. We explicitly assert what state we're coming from, to make transitions
   clearer.
2. Every transition has a state, even between owners while waiting for HSM.
3. Explictly step though getting the HSM signature on the funding tx
   before starting channeld, rather than doing it in parallel: makes
   states clearer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell 662dfef436 lightningd/gossip: Move INIT message handling to handshake daemon.
We need to do this on every connection, whether reconnecting or not,
so it makes sense for the handshake daemon to handle it and return
the feature fields.

Longer term I'm considering having the handshake daemon handle the
listening and connecting, and simply hand the fds back once the peers
are ready.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell 0c8b24cf97 daemon/dns: hand netaddr we connected to through to callback.
That way it doesn't have to extract it from fd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell f6495d3310 lightningd/peer_control: don't create peer struct until we've connected.
We currently create a peer struct, then complete handshake to find out
who it is.  This means we have a half-formed peer, and worse: if it's
a reconnect we get two peers the same.

Add an explicit 'struct connection' for the handshake phase, and
construct a 'struct peer' once that's done.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell 61a2ed97e1 lightningd/peer_control: start of reconnect logic.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell fe1ff33419 lightningd/subd: don't take ownership of peer.
Use callback which fails the peer if subd dies: that will later allow
reconnect.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell 34e6e56471 lightningd: introduce peer_state enum.
The actual state names are place holders for now, really.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell 93849b1e02 lightningd/peer_control: save funder side in struct peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Rusty Russell be9bb5f9cb lightningd: peer_fail helper to fail/reconnect peer.
This will eventually hook into restart logic.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-25 14:24:47 +09:30
Christian Decker 744d657860 doc: Updating README and related documentation
Like many I don't read any documentation besides the readme in the
repo, so I thought I could just pull some simple getting started info
into the readme to make it easy for people to get started :-)
2017-05-20 20:02:45 +09:30
Christian Decker 05e951d748 wire: Correct the short channel id serialization to use 3+3+2
Fixes the `short_channel_id` being serialized as 4 bytes block height,
3 bytes transaction index and 1 byte output number, to use 3+3+2 as
the spec says.

The reordering in the unit test structs is mainly to be able to still
use `eq_upto` for tests.
2017-05-20 20:01:34 +09:30
Christian Decker b8ba8f003c logging: Removed automatic subprocess logging for bitcoind
This was responsible for a huge number of loglines simply because we
log every subprocess start and termination. Moving the logging
upstream to where it is really needed gets rid of the polling, that
are successful. A highly unscientific test shows a reduction in
loglines produced by lightningd from 17000 to 10000 lines.
2017-05-20 19:59:16 +09:30
Christian Decker 80bf908922 script: Consolidate pubkey comparison 2017-05-20 19:59:16 +09:30
Christian Decker 3f4e43081c bitcoind: Respect testnet for bitcoin-cli 2017-05-20 19:59:16 +09:30
Christian Decker 80486a6669 pytest: Do not check for valgrind errors if disabled
Checking for valgrind errors if we disabled valgrind will kill the
teardown, so don't!
2017-05-20 19:59:16 +09:30
Christian Decker 6154020f67 fix: Header include order 2017-05-19 14:06:44 +02:00
Rusty Russell f4d92813a0 lightningd: handle bad failure message.
We used to core dump if unwrap_onionreply() returned NULL!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-19 13:30:32 +02:00
Rusty Russell 55510ea27a io_write_wire: always make a copy (or steal if take).
I caught the gossip daemon freeing a message, while it was queued to be
written.  Using tal_dup_arr() is the Right Thing, as it handles taken()
properly automatically.

------------------------------- Valgrind errors --------------------------------
Valgrind error file: /tmp/lightning-rvc7d5oi/test_forward/lightning-3/valgrind-errors
==11057== Invalid read of size 8
==11057==    at 0x1328F2: to_tal_hdr (tal.c:174)
==11057==    by 0x133894: tal_len (tal.c:659)
==11057==    by 0x11BBE7: do_write_wire (wire_io.c:103)
==11057==    by 0x127B95: do_plan (io.c:369)
==11057==    by 0x127C31: io_ready (io.c:390)
==11057==    by 0x129461: io_loop (poll.c:295)
==11057==    by 0x10CBB4: main (gossip.c:722)
==11057==  Address 0x55a99d8 is 24 bytes inside a block of size 200 free'd
==11057==    at 0x4C2ED5B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11057==    by 0x133000: del_tree (tal.c:416)
==11057==    by 0x132F77: del_tree (tal.c:405)
==11057==    by 0x13333E: tal_free (tal.c:504)
==11057==    by 0x1123F1: queue_broadcast (broadcast.c:38)
==11057==    by 0x111EB0: handle_node_announcement (routing.c:918)
==11057==    by 0x10B166: handle_gossip_msg (gossip.c:170)
==11057==    by 0x10B76B: owner_msg_in (gossip.c:335)
==11057==    by 0x12712E: next_plan (io.c:59)
==11057==    by 0x127BD0: do_plan (io.c:376)
==11057==    by 0x127C09: io_ready (io.c:386)
==11057==    by 0x129461: io_loop (poll.c:295)
==11057==  Block was alloc'd at
==11057==    at 0x4C2DB2F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11057==    by 0x132AE7: allocate (tal.c:245)
==11057==    by 0x1330A3: tal_alloc_ (tal.c:443)
==11057==    by 0x1332A6: tal_alloc_arr_ (tal.c:491)
==11057==    by 0x133FEC: tal_dup_ (tal.c:846)
==11057==    by 0x112347: new_queued_message (broadcast.c:20)
==11057==    by 0x11240B: queue_broadcast (broadcast.c:43)
==11057==    by 0x111EB0: handle_node_announcement (routing.c:918)
==11057==    by 0x10B166: handle_gossip_msg (gossip.c:170)
==11057==    by 0x10B76B: owner_msg_in (gossip.c:335)
==11057==    by 0x12712E: next_plan (io.c:59)
==11057==    by 0x127BD0: do_plan (io.c:376)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

wire_io: make a copy in io_write_wire (unless taken()).

I hit a corner case where gossipd freed a duplicate while it was being
sent out; this kind of thing doesn't happen if io_write_wire() makes
a copy by default.

We also do a memcheck() here; this gives us a caller in the backtrace
if there are uninitialized bytes, rather than waiting until the write
which happens later.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-19 13:30:32 +02:00
Rusty Russell 3d2f166364 test_lightningd.py: don't fail if valgrind turned off.
eg:
test_routing_gossip (__main__.LightningDTests) ... ERROR

======================================================================
ERROR: test_routing_gossip (__main__.LightningDTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/test_lightningd.py", line 150, in tearDown
    err_count += self.printValgrindErrors(node)
  File "tests/test_lightningd.py", line 137, in printValgrindErrors
    errors, fname = self.getValgrindErrors(node)
  File "tests/test_lightningd.py", line 132, in getValgrindErrors
    with open(error_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/lightning-l106st0a/test_routing_gossip/lightning-1/valgrind-errors'

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-19 13:30:32 +02:00
Rusty Russell 811e14ff66 Update .gitignore files.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-12 12:59:09 +02:00
Rusty Russell 6e0e1c7067 Update to latest BOLT (hyphens changed to underscores).
Now in sync with 8ee57b97738b1e9467a1342ca8373d40f0c4aca5.

Our tool doesn't need to convert them any more, but we actually had a
mis-typed field in the HSM which needed fixing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-12 12:59:09 +02:00
Rusty Russell e97046f797 BOLT update: temporary_channel_failure with update.
Aka d140405a6f0d95e3ccf650e3560383768cbf3e03.

This doesn't make it work, just compile.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-12 12:59:09 +02:00
Christian Decker f6d60b9076 pytest: Fail tests that produce valgrind errors
Check and print all valgrind errors and fail tests that produce
errors. We abort using an exception since `fail()` is not allowed in
the teardown.
2017-05-10 12:38:12 +09:30
Christian Decker 2d76b066c2 routing: Cleaning up old hostname and port handling
The single string-based hostname and port has been retired in favor of
having multiple `struct ipaddr`s from the `node_announcement`. This
breaks the hostnames and ports from IRC, but I didn't bother to
backport ipaddr for it since it is only used in the legacy daemon.
2017-05-10 12:37:44 +09:30
Christian Decker e972b208c7 routing: Passing back all addresses on getnodes 2017-05-10 12:37:44 +09:30
Christian Decker 26892e79bb routing: Reading multiple addresses from node_announcements 2017-05-10 12:37:44 +09:30
Christian Decker ed9668339d routing: Add command line option to specify external IP address
We don't currently have a good way to determine our external IP
address so let's at least give people an option to manually specify
it.
2017-05-10 12:37:44 +09:30
Christian Decker daf8866eb5 gossip: Implement the basic node_announcement
Rather a big commit, but I couldn't figure out how to split it
nicely. It introduces a new message from the channel to the master
signaling that the channel has been announced, so that the master can
take care of announcing the node itself. A provisorial announcement is
created and passed to the HSM, which signs it and passes it back to
the master. Finally the master injects it into gossipd which will take
care of broadcasting it.
2017-05-10 12:37:44 +09:30
Rusty Russell b99c5620ef struct secret: use everywhere.
We alternated between using a sha256 and using a privkey, but there are
numerous places where we have a random 32 bytes which are neither.

This fixes many of them (plus, struct privkey is now defined in terms of
struct secret).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-09 11:43:35 +09:30
Rusty Russell 42601c29d7 test_lightning.py: run valgrind on child daemons too
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-09 11:43:20 +09:30
Sankalp Ghatpande 8c73f37869 Added another faucet, removed redundent dependency installation instructions 2017-05-09 00:03:17 +02:00
Christian Decker 1e2f4dcf63 travis: Passing through some env variables
Mainly useful if we want to quickly turn on debug output on travis on
a failing test.
2017-05-07 12:06:56 +09:30
Christian Decker 6d06303789 pytest: Remove obsolete test and fix two flaky tests 2017-05-07 12:06:56 +09:30
Christian Decker 23da30a2a4 pytest: Better dealing with env variables to configure tests 2017-05-07 12:06:56 +09:30
Christian Decker db84525b67 travis: Docker pull separate to avoid output 2017-05-07 12:06:56 +09:30
Christian Decker 126839886a routing: Use a pointer to a shared_secret htlc_end 2017-05-06 10:16:07 +09:30
Christian Decker 870b83f67f sphinx: Incrementally wrap replies in new onion layers 2017-05-06 10:16:07 +09:30
Christian Decker 9820abda7c sphinx: Store shared secrets on the origin node
We could recompute them once we receive a reply and need to decrypt
it, but why go through the trouble when we can just store them?
2017-05-06 10:16:07 +09:30
Christian Decker 79f848145c sphinx: Passing shared secret of the HTLC up to the master
This is needed so we can create error messages and wrap them on the
way back.
2017-05-06 10:16:07 +09:30
Christian Decker 79582ea415 sphinx: Update the HMAC in onionreply to full length 2017-05-06 10:10:55 +09:30
Ben Gorlick a75838030a Update INSTALL.md to fix broken pip3 install
Fixed the Ubuntu install instructions -- previously it will break on being unable to find pip3. 
Tested this on Ubuntu 16.04 and 17+ server.
2017-05-05 17:57:00 +02:00
Rusty Russell 162dac1271 peer_control: don't complete fundchannel command until broadcast.
Under stress, the tests can mine blocks too soon, and the funding never
locks.  This gives more of a chance, at least.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-05 16:11:46 +09:30
Rusty Russell 6dcd7f9d6d tests: dump more information when we fail to find something in logs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-05 16:11:46 +09:30
Rusty Russell a12a670d85 opening: don't die if we get a gossip packet.
Using 'taskset -c 0' I managed to slow down pytest enough to trigger this
locally.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-05 16:11:45 +09:30
Rusty Russell cd90c8408c lightningd/gossip: clean up channel resolve failure handling.
We were getting an assert "!secp256k1_fe_is_zero(&ge->x)", because
an all-zero pubkey is invalid.  We allow marshal/unmarshal of NULL for
now, and clean up the error handling.

1. Use status_failed if master sends a bad message.
2. Similarly, kill the gossip daemon if it gives a bad reply.
3. Use an array for returned pubkeys: 0 or 2.
4. Use type_to_string(trc, struct short_channel_id, &scid) for tracing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-05 16:11:44 +09:30
Rusty Russell 2641ecaafa test_lightningd.py: dump logs on error.
I couldn't actually figure out how to just dump them on error, so I
dump all the time.  When running 3 lightningd + bitcoind, this separates
the logs nicely.

TODO: We should delete the directories on success!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2017-05-03 11:47:10 +09:30