As it stands, cloudflared prechecks do not take the `protocol` flag into consideration and instead fall back to the default protocol, which is QUIC. Prechecks should report the protocol cloudflared will actually use, not the default protocol.
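A minimal sketch of the intended behaviour, not the actual implementation: `effectiveProtocol` and `defaultProtocol` are hypothetical names, and treating an empty or `auto` flag value as "use the default" is an assumption made for illustration.

```go
package main

import "fmt"

// defaultProtocol mirrors the default named in the description (QUIC).
const defaultProtocol = "quic"

// effectiveProtocol returns the protocol a pre-check should report.
// Hypothetical helper: an empty or "auto" flag value falls back to the
// default, anything else is taken as the user's explicit choice.
func effectiveProtocol(flagValue string) string {
	switch flagValue {
	case "", "auto":
		return defaultProtocol
	default:
		return flagValue
	}
}

func main() {
	fmt.Println(effectiveProtocol("http2")) // explicit choice is reported
	fmt.Println(effectiveProtocol(""))      // unset flag reports the default
}
```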
Bumps quic-go to v0.59.1 (chungthuang fork rebased from upstream v0.45 onto
v0.59.1). Upstream removed the `logging` package and replaced its
callback-based ConnectionTracer with the structured `qlog`/`qlogwriter` event
API, which required migrating cloudflared's QUIC metrics collection.
Migrations:
- quic/tracing.go: connTracer no longer fills a logging.ConnectionTracer
callback struct. It implements qlogwriter.Trace + qlogwriter.Recorder and
dispatches qlog events (PacketSent, PacketReceived, MetricsUpdated, ...) to
the collector through RecordEvent. NewClientTracer now returns a function
compatible with quic.Config.Tracer.
- quic/metrics.go: collector methods take qlog types (qlog.Frame,
qlog.PacketType, qlog.MetricsUpdated, ...) and plain int64 in place of the
removed logging.ByteCount/Frame/RTTStats/TransportParameters.
- quic/conversion.go: PacketType, PacketDropReason and PacketLossReason are
strings upstream rather than numeric iotas, so the converters become
pass-through allowlists. CongestionState is also a string;
congestionStateToFloat maps it back to the numeric gauge values cloudflared
exports.
- quic.Connection/quic.Stream became *quic.Conn/*quic.Stream; updated
ConnWithCloser, SafeStreamCloser and the connection package accordingly.
Tests and generated mocks (mocks/mock_quic_connection.go) were adapted to
the new pointer-based API.
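To illustrate the conversion.go change, here is a rough sketch of a string-to-gauge mapping and a pass-through allowlist. It uses local stand-in types instead of the real qlog package; the state names, packet type names, and numeric gauge values are all assumptions, not the values cloudflared exports.

```go
package main

import "fmt"

// CongestionState stands in for the string-typed qlog congestion state.
type CongestionState string

// The state names below are assumptions for illustration only.
const (
	StateSlowStart           CongestionState = "slow_start"
	StateCongestionAvoidance CongestionState = "congestion_avoidance"
	StateRecovery            CongestionState = "recovery"
	StateApplicationLimited  CongestionState = "application_limited"
)

// congestionStateToFloat maps a string state back to a numeric gauge value.
// The numbers are placeholders; unknown states map to -1.
func congestionStateToFloat(s CongestionState) float64 {
	switch s {
	case StateSlowStart:
		return 0
	case StateCongestionAvoidance:
		return 1
	case StateRecovery:
		return 2
	case StateApplicationLimited:
		return 3
	default:
		return -1
	}
}

// allowedPacketTypes is a pass-through allowlist: known strings are used
// as metric labels unchanged, everything else collapses to "unknown".
var allowedPacketTypes = map[string]struct{}{
	"initial": {}, "handshake": {}, "0RTT": {}, "1RTT": {}, "retry": {},
}

func packetTypeLabel(t string) string {
	if _, ok := allowedPacketTypes[t]; ok {
		return t
	}
	return "unknown"
}

func main() {
	fmt.Println(congestionStateToFloat(StateRecovery))
	fmt.Println(packetTypeLabel("1RTT"), packetTypeLabel("made_up"))
}
```

Collapsing unlisted values to a single label keeps metric cardinality bounded even if upstream adds new strings.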
Closes TUN-10557
The QUIC library bump introduces a cyclic dependency between the connection and quic modules, so this coupling has to be broken.
Right now, the connection module depends on the quic module for datagram v2/v3, and passes a QUIC connection (currently an interface) to it.
That works today. Under the hood, however, the value passed is a wrapper around a UDP connection and a QUIC connection, so the wrapper type must be exposed to the quic module once the QUIC connection is no longer an interface but a struct.
Given the above, these changes introduce an interface, QUICConnection, covering the surface cloudflared uses today, and a struct, ConnWithCloser, that implements that interface within the quic module.
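The shape of this decoupling can be sketched roughly as follows. This is not the real code: the method set of `QUICConnection` is reduced to two illustrative methods, and `fakeQUICConn` and `fakeSocket` are hypothetical stand-ins for the concrete QUIC connection and the UDP socket.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// QUICConnection is a reduced, assumed version of the interface: only the
// surface callers need, so they never import the concrete type.
type QUICConnection interface {
	SendDatagram(payload []byte) error
	CloseWithError(code uint64, reason string) error
}

// fakeQUICConn stands in for the concrete (struct-typed) QUIC connection.
type fakeQUICConn struct{ closed bool }

func (c *fakeQUICConn) SendDatagram(payload []byte) error {
	if c.closed {
		return errors.New("connection closed")
	}
	return nil
}

func (c *fakeQUICConn) CloseWithError(code uint64, reason string) error {
	c.closed = true
	return nil
}

// fakeSocket stands in for the underlying UDP connection.
type fakeSocket struct{ closed bool }

func (s *fakeSocket) Close() error { s.closed = true; return nil }

// ConnWithCloser bundles the QUIC connection with its socket so that
// closing the connection also releases the socket.
type ConnWithCloser struct {
	*fakeQUICConn
	socket io.Closer
}

func (c *ConnWithCloser) CloseWithError(code uint64, reason string) error {
	return errors.Join(c.fakeQUICConn.CloseWithError(code, reason), c.socket.Close())
}

// Compile-time check that the wrapper satisfies the interface.
var _ QUICConnection = (*ConnWithCloser)(nil)

func main() {
	sock := &fakeSocket{}
	var conn QUICConnection = &ConnWithCloser{fakeQUICConn: &fakeQUICConn{}, socket: sock}
	fmt.Println(conn.SendDatagram([]byte("ping")))
	fmt.Println(conn.CloseWithError(0, "done"), sock.closed)
}
```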
Closes TUN-10563
Fixing some bugs with DNS targets. Most importantly, these changes also fix some wrong assumptions made when adding support for the `--edge` flag:
1. Removes `StaticEdgeDNSResolver` in favor of `resolveStaticEdge`. Since `--edge` does not imply resolving DNS, this fixes that assumption.
2. Adds EdgeAddrs, which allows us to skip DNS probes when set. This fixes the targets in the DNS rows.
3. Added a new `ResolvedTarget` struct, which joins addresses with the respective DNS results. This avoids the brittle logic we had before, where we assumed there were always two groups (one for each region) when running probes. So this not only makes the code more extensible in case we want to add more regions in the future but also adds support for multiple targets supplied via `--edge`.
4. Changes the existing nomenclature, going from calling things `region` to `target`. The term `region` works when resolving production regions (region1 and region2), but becomes misleading when we add the logic for `--edge`.
The end result of these changes is that we now see the correct addresses when you supply targets via `--edge`, while also making the code a bit clearer.
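As a rough illustration of points 2 and 3, the sketch below pairs each target with its DNS outcome and skips resolution when edge addresses are supplied. The field names, `buildTargets`, and `resolveFunc` are assumptions for illustration; only the `ResolvedTarget` name comes from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// ResolvedTarget joins a target with the result of its DNS step.
// Field names here are assumed, not taken from the real struct.
type ResolvedTarget struct {
	Name       string
	Addrs      []string
	DNSErr     error
	DNSSkipped bool
}

// resolveFunc is a hypothetical resolver signature.
type resolveFunc func(hostname string) ([]string, error)

// buildTargets returns one entry per target, however many there are.
// When edgeAddrs is set, DNS is skipped and each address is its own target.
func buildTargets(edgeAddrs, hostnames []string, resolve resolveFunc) []ResolvedTarget {
	if len(edgeAddrs) > 0 {
		targets := make([]ResolvedTarget, 0, len(edgeAddrs))
		for _, addr := range edgeAddrs {
			targets = append(targets, ResolvedTarget{Name: addr, Addrs: []string{addr}, DNSSkipped: true})
		}
		return targets
	}
	targets := make([]ResolvedTarget, 0, len(hostnames))
	for _, host := range hostnames {
		addrs, err := resolve(host)
		targets = append(targets, ResolvedTarget{Name: host, Addrs: addrs, DNSErr: err})
	}
	return targets
}

func main() {
	resolve := func(host string) ([]string, error) {
		if host == "region1.v2.argotunnel.com" {
			return []string{"198.51.100.1:7844"}, nil // documentation address
		}
		return nil, errors.New("no such host")
	}
	hosts := []string{"region1.v2.argotunnel.com", "region2.v2.argotunnel.com"}
	for _, t := range buildTargets(nil, hosts, resolve) {
		fmt.Printf("%+v\n", t)
	}
	for _, t := range buildTargets([]string{"192.0.2.10:7844"}, hosts, resolve) {
		fmt.Printf("%+v\n", t)
	}
}
```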
This PR allows us to use edge overrides with pre-checks. I forgot to account for the --edge flag when developing pre-check probes in TUN-10388. This should allow us to wire this flag in.
DNSResolver will still be injected via RunDialers, and we will take care of initialization in cmd.go. This allows us to keep pre-checks testable and inject mock DNSResolvers when needed.
I thought about overriding `edgediscovery.ResolveEdge` and `edgeDiscovery.StaticEdge` instead of `allregion.EdgeDiscovery` and `allregionResolveAddrs`, respectively. But that would imply changing our existing probe logic to support `Regions` instead of `[][]EdgeAddr`, which would mean more work. Additionally (and perhaps most importantly), using `Regions` would also require us to create new functions to extract a list of edge addresses for our probe tests. I don't think this would go well with the current implementation of `Regions`, as I believe its intent is to encapsulate the logic around managing addresses per region. Adding these functions would mean breaking this encapsulation.
Fixing missing TLS settings. While developing the pre-check probes, I forgot to add the certificate settings, which are essential for establishing a connection to origintunneld. I discovered this while testing cloudflared locally.
This introduces the main precheck logic. It follows the concurrency model, timeout handling, retry logic, and decision flow established in the [connectivity pre-checks SPEC](https://wiki.cfdata.org/spaces/TUN/pages/1374967685/Connectivity+Pre-checks+for+cloudflared). You can find an attached image of what the decision flow should look like.
[Image: pre-check decision flow diagram]
This PR implements all the dialers and resolvers needed to make pre-checks happen. So this task focuses on the following:
1. Implement the DNS probe: call DNSResolver.Resolve(region).
2. Implement the QUIC probe: call QUICDialer.DialQuic (handshake only, no stream opened) and record the result.
3. Implement the HTTP/2 probe: call TCPDialer.DialEdge (TCP + TLS handshake only, no frames sent) and record the result.
4. Implement the Management API probe: call ManagementDialer.DialContext to api.cloudflare.com:443 and record the result.
5. Export edgeDiscovery as EdgeDiscovery in edgediscovery/allregions/discovery.go so the pre-check can reuse the production DNS path.
This sets up the main components to implement the checker.
Switching `SuggestedProtocol` to a pointer, so we can pass nil whenever both QUIC and HTTP2 fail. We should not be suggesting anything when all our protocols have failed.
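A small sketch of why the pointer helps: nil can express "no suggestion", which a plain value cannot. The `Protocol` type, the `Report` field layout, and the `suggest` helper with its QUIC-first preference are assumptions for illustration.

```go
package main

import "fmt"

// Protocol is a stand-in for the real protocol type.
type Protocol string

const (
	QUIC  Protocol = "quic"
	HTTP2 Protocol = "http2"
)

// Report carries the suggestion as a pointer so that nil can mean
// "nothing to suggest".
type Report struct {
	SuggestedProtocol *Protocol
}

// suggest prefers QUIC, falls back to HTTP/2, and returns nil when both
// probes failed. Hypothetical helper.
func suggest(quicOK, http2OK bool) *Protocol {
	switch {
	case quicOK:
		p := QUIC
		return &p
	case http2OK:
		p := HTTP2
		return &p
	default:
		return nil
	}
}

func main() {
	for _, c := range [][2]bool{{true, true}, {false, true}, {false, false}} {
		r := Report{SuggestedProtocol: suggest(c[0], c[1])}
		if r.SuggestedProtocol == nil {
			fmt.Println("no protocol suggested")
			continue
		}
		fmt.Println("suggested:", *r.SuggestedProtocol)
	}
}
```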
The goals of this PR are twofold:
## **1. Introduce a new renderer to output to `stdout`**
Implement the table renderer that will be used to report the results to stdout. The renderer should output something similar to this:
```
─── CONNECTIVITY PRE-CHECKS ──────────────────────────────────────────────────
COMPONENT          TARGET                      STATUS   DETAILS
DNS Resolution     region1.v2.argotunnel.com   PASS     Resolved successfully
DNS Resolution     region2.v2.argotunnel.com   PASS     Resolved successfully
UDP Connectivity   Port 7844 (QUIC)            PASS     Handshake successful
TCP Connectivity   Port 7844 (HTTP/2)          PASS     TLS handshake successful
Cloudflare API     api.cloudflare.com:443      PASS     Reachable
SUMMARY: Environment is healthy. cloudflared will use 'quic' as primary protocol.
──────────────────────────────────────────────────────────────────────────────
```
## **2. Add a log-level renderer**
Add support for structured logging to print the table results as logs. Below is an example of what the logs should look like:
```
{"level":"info","run_id":"52828729-dfwd-45b3-w12f-727cbdb4cbd4", "component":"DNS Resolution","target":"region1.v2.argotunnel.com","status":"pass","details":"Resolved successfully","time":"2024-01-15T10:30:00Z","message":"precheck"}
{"level":"info","run_id":"52828729-dfwd-45b3-w12f-727cbdb4cbd4", "component":"DNS Resolution","target":"region2.v2.argotunnel.com","status":"pass","details":"Resolved successfully","time":"2024-01-15T10:30:00Z","message":"precheck"}
{"level":"info","run_id":"52828729-dfwd-45b3-w12f-727cbdb4cbd4", "component":"QUIC Connectivity","target":"Port 7844 (QUIC)","status":"pass","details":"Handshake successful","time":"2024-01-15T10:30:01Z","message":"precheck"}
{"level":"info","run_id":"52828729-dfwd-45b3-w12f-727cbdb4cbd4", "component":"HTTP/2 Connectivity","target":"Port 7844 (HTTP/2)","status":"pass","details":"TLS handshake successful","time":"2024-01-15T10:30:01Z","message":"precheck"}
{"level":"info","run_id":"52828729-dfwd-45b3-w12f-727cbdb4cbd4", "component":"Management API","target":"api.cloudflare.com:443","status":"pass","details":"Reachable","time":"2024-01-15T10:30:01Z","message":"precheck"}
{"level":"info","run_id":"52828729-dfwd-45b3-w12f-727cbdb4cbd4", "hard_fail":false,"suggested_protocol":"quic","time":"2024-01-15T10:30:01Z","message":"precheck complete"}
```
This lays the foundation for the new cloudflared pre-checks by creating a new prechecks package. It adds the following:
* types.go: Status, CheckResult, Report, Config (add IPVersion allregions.ConfigIPVersion field to Config)
* interfaces.go: DNSResolver, TCPDialer, QUICDialer, ManagementDialer