106 Commits

Author SHA1 Message Date
Andrew Ayer d8b1877e8d Improve filenames of unverified STHs
Include the tree size in plain decimal, since it's more user-friendly.

Don't include tree size in hash (redundant now that we're storing it
outside of hash) or version (implied by signature).
2017-01-06 12:51:10 -08:00
Andrew Ayer 1719aa5d8e Set log ID in STHs that we download
This will facilitate STH pollination.
2017-01-06 12:50:21 -08:00
Andrew Ayer 0eb6d199a4 Improve the name of a function 2017-01-06 12:24:09 -08:00
Andrew Ayer c8f0a0f9e8 Only write once file if run was 100% successful
Otherwise, if a single log was unreachable, we'd be forced to download
all of it on the next run.
2017-01-06 12:23:20 -08:00
Andrew Ayer 0d9b81ecc8 Tweak logic for storing tree position 2017-01-06 12:19:53 -08:00
Andrew Ayer 8ea4003994 Add some additional logging 2017-01-06 10:31:34 -08:00
Andrew Ayer 0c751f0294 Drop the MerkleTreeBuilder return value from VerifyConsistencyProof 2017-01-05 21:06:37 -08:00
Andrew Ayer 0af0262498 Overhaul log processing and auditing
1. Instead of storing a single STH per log, we now store one verified
STH and any number of unverified STHs.  When we process a log, we verify
each unverified STH using a consistency proof with the verified STH,
and only delete it if it successfully verifies.  We set the verified
STH to the largest STH which we've successfully verified.

This has two important benefits.  First, we never ever delete an STH
unless we can successfully verify it (previously, we would forget about
an STH under certain error conditions).  Second, it lays the groundwork
for STH pollination.  Upon reception of an STH, we can simply drop it in
the log's unverified_sths directory (assuming the signature is valid),
and Cert Spotter will audit it.

There is no more "evidence" directory; if a consistency proof fails,
the STHs will already be present elsewhere in the state directory.

2. We now persist a MerkleTreeBuilder between each run of Cert Spotter,
instead of rebuilding it every time from the consistency proof.  This is
not intrinsically better, but it makes the code simpler considering we
can now fetch numerous consistency proofs per run.

3. To accommodate the above changes, the state directory has a brand
new layout.  The state directory is now versioned, and Cert Spotter
will automatically migrate old state directories to the new layout.
This migration logic will be removed in a future Cert Spotter release.

As a bonus, the code is generally cleaner now :-)
2017-01-05 21:00:35 -08:00
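
The auditing rule described in item 1 of the commit above can be sketched roughly as follows. This is a minimal illustration, not Cert Spotter's actual code: the `STH` struct, the `auditSTHs` function, and the `consistent` callback are all hypothetical names, and the callback stands in for fetching and checking a real consistency proof.

```go
package main

import "fmt"

// STH is a hypothetical, stripped-down signed tree head.
type STH struct {
	TreeSize uint64
	RootHash string
}

// auditSTHs checks every unverified STH against the verified one.
// An STH leaves the unverified set only if it verifies; the returned
// verified STH is the largest one that verified successfully.
// The consistent callback is an assumption standing in for a real
// consistency-proof check.
func auditSTHs(verified STH, unverified []STH, consistent func(a, b STH) bool) (STH, []STH) {
	largest := verified
	var remaining []STH
	for _, sth := range unverified {
		if !consistent(verified, sth) {
			// Never forget an STH that we could not verify.
			remaining = append(remaining, sth)
			continue
		}
		if sth.TreeSize > largest.TreeSize {
			largest = sth
		}
	}
	return largest, remaining
}

func main() {
	// Stub check for the demo: pretend only the STH marked "bad" fails.
	check := func(a, b STH) bool { return b.RootHash != "bad" }
	verified := STH{TreeSize: 10, RootHash: "r10"}
	pending := []STH{
		{TreeSize: 20, RootHash: "r20"},
		{TreeSize: 15, RootHash: "bad"},
		{TreeSize: 30, RootHash: "r30"},
	}
	newVerified, stillPending := auditSTHs(verified, pending, check)
	fmt.Println(newVerified.TreeSize, len(stillPending))
}
```

Keeping failed STHs in the returned slice mirrors the stated goal of never deleting an STH unless it verifies.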
Andrew Ayer b63a024876 Replace MerkleTreeBuilder.Finish with non-mutating CalculateRoot 2016-11-25 17:43:07 -08:00
Andrew Ayer 9bf82346d8 Avoid use of json.Decoder
Per https://ahmetalpbalkan.com/blog/golang-json-decoder-pitfalls/
2016-11-15 15:59:39 -08:00
Andrew Ayer 31f2316aa2 Rework -all_time logic
If -all_time is specified, scan the entirety of all logs, even
existing logs.  This matches user expectation better.  Previously,
-all_time had no impact on existing logs.

The first time Cert Spotter is run, do not scan any logs, unless
-all_time is specified.  This avoids a several hour wait the first
time Cert Spotter is run.  If the user is interested in knowing
about existing certificates, they can use the certspotter.com API
or crt.sh.  This is the same as existing behavior.

When a new log is added, scan it in its entirety even if -all_time is
not specified, so users are alerted to interesting certificates in the
new log.  Hopefully new logs will be small and this won't take too long!
Previously, new logs were not scanned in their entirety unless -all_time
was specified.

Closes: #5
2016-11-15 15:59:38 -08:00
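
The three cases in the commit message above amount to a small decision table. The sketch below is one way to express it, assuming a hypothetical `startIndex` helper and hypothetical parameter names; the real implementation may be structured quite differently.

```go
package main

import "fmt"

// startIndex returns the entry index at which scanning of a log begins.
// All names here are hypothetical and only illustrate the rules in the
// commit message.
func startIndex(allTime, firstRun, knownLog bool, savedPos, treeSize uint64) uint64 {
	switch {
	case allTime:
		// -all_time: scan the entire log, even one already scanned.
		return 0
	case firstRun:
		// First run without -all_time: skip everything already logged.
		return treeSize
	case !knownLog:
		// A log added after the first run is scanned in its entirety.
		return 0
	default:
		// Existing log: resume where the previous run stopped.
		return savedPos
	}
}

func main() {
	fmt.Println(startIndex(true, false, true, 500, 1000))   // -all_time on an existing log
	fmt.Println(startIndex(false, true, false, 0, 1000))    // very first run
	fmt.Println(startIndex(false, false, false, 0, 1000))   // newly added log
	fmt.Println(startIndex(false, false, true, 500, 1000))  // existing log, normal run
}
```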
Jonathan Rudenberg acc6781f29 Run gofmt
Signed-off-by: Jonathan Rudenberg <jonathan@titanous.com>
2016-07-28 14:55:46 -04:00
Andrew Ayer 1dc7e1cda9 Refine command line flag descriptions 2016-07-27 14:14:09 -07:00
Andrew Ayer f75c47d9ca Always store files in ~/.certspotter, even if running as root 2016-07-26 16:57:26 -07:00
Andrew Ayer 19e05b901a Remove some dead code from the scanner 2016-06-22 10:32:42 -07:00
Andrew Ayer 2c8cb1f402 Return exit code from cmd.Main instead of exiting directly
This allows the calling code to do custom cleanup.
2016-06-03 07:21:08 -07:00
Andrew Ayer 2bed88e7c5 Rework watchlist
Watchlist is now read from ~/.certspotter/watchlist by default, or from
the file specified by -watchlist (- for stdin).

By default, only exact DNS names are matched.  To match both the domain
itself and all sub-domains, prefix with a dot (e.g. .example.com).

Comments are now allowed in watchlist files.
2016-05-12 11:30:59 -07:00
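
The matching rule in the watchlist commit above (exact names by default, a leading dot for the domain plus all sub-domains) could look something like this. The function name `matchesWatchItem` is hypothetical, case-insensitive comparison is an assumption, and the special "." entry that monitors all domains is deliberately left out of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesWatchItem reports whether dnsName is covered by a watchlist item.
// A plain item matches only that exact name; an item with a leading dot
// matches the domain itself and every sub-domain.
// Hypothetical helper, not Cert Spotter's own matching code.
func matchesWatchItem(item, dnsName string) bool {
	item = strings.ToLower(item)
	dnsName = strings.ToLower(dnsName)
	if strings.HasPrefix(item, ".") {
		domain := item[1:]
		// Comparing against the dotted suffix keeps "badexample.com"
		// from matching ".example.com".
		return dnsName == domain || strings.HasSuffix(dnsName, item)
	}
	return dnsName == item
}

func main() {
	fmt.Println(matchesWatchItem("example.com", "www.example.com"))
	fmt.Println(matchesWatchItem(".example.com", "www.example.com"))
	fmt.Println(matchesWatchItem(".example.com", "example.com"))
	fmt.Println(matchesWatchItem(".example.com", "badexample.com"))
}
```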
Andrew Ayer 7196ec5217 Use $CERTSPOTTER_STATE_DIR to specify state directory 2016-05-12 10:53:57 -07:00
Andrew Ayer f9432ae4b9 Reverse order of certspotter.MatchesWildcard arguments 2016-05-10 14:29:04 -07:00
Andrew Ayer 92fbdcb947 Support crazy wildcards (not just in the left-most label) 2016-05-10 10:37:10 -07:00
Andrew Ayer b79cb31413 Move package to software.sslmate.com/src/certspotter 2016-05-04 12:19:59 -07:00
Andrew Ayer 1e582e2e0c License under the MPL 2.0 2016-05-04 11:56:13 -07:00
Andrew Ayer 670cddafbc Rename project to certspotter 2016-05-04 11:49:07 -07:00
Andrew Ayer ea3db97486 Only replace DNS label with placeholder if it's utterly unparsable
e.g. contains control characters, Punycode conversion fails

There are quite simply too many certs with bogus DNS labels out in the wild,
and it just doesn't make sense to bother every .com domain holder because
GoDaddy signed a cert with a DNS name like "www.        just4funpartyrentals.com"
It is highly unlikely any validator will ever match that DNS name.
2016-05-04 11:43:02 -07:00
Andrew Ayer 60636ba2d7 Move Identifiers from CertInfo to EntryInfo
It's more logical, and it avoids some redundant parsing.
2016-05-03 11:58:59 -07:00
Andrew Ayer ca8f60740a Trim trailing dots from DNS names 2016-05-01 12:49:26 -07:00
Andrew Ayer 847b7129e8 Monitor for all DNS names that _might_ match a monitored domain
Wildcards, redacted labels, and unparseable labels.
2016-04-29 09:02:03 -07:00
Andrew Ayer 2c9df274e9 Gracefully handle all manner of poorly encoded identifiers
Also add preliminary support for IP address identifiers.
2016-04-28 22:00:32 -07:00
Andrew Ayer 65ed742477 Support wildcards
For example, if you're watching subdomain.example.com, a cert for
*.example.com will now match.
2016-04-26 14:49:39 -07:00
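
As a rough illustration of the wildcard example in the commit above, the sketch below treats a leading `*.` as standing for exactly one DNS label. That one-label rule is an assumption based on common TLS practice, and `wildcardCovers` is a hypothetical name rather than the project's own function; a later commit in this log also extends wildcard support beyond the left-most label, which this sketch does not cover.

```go
package main

import (
	"fmt"
	"strings"
)

// wildcardCovers reports whether a certificate DNS name (possibly with a
// left-most wildcard label) covers the watched name.
// Assumption: "*" stands for exactly one label.
func wildcardCovers(certName, watched string) bool {
	if !strings.HasPrefix(certName, "*.") {
		return certName == watched
	}
	// Drop the first label of the watched name and compare the rest
	// with the part of the certificate name after "*.".
	i := strings.Index(watched, ".")
	if i < 0 {
		return false
	}
	return watched[i+1:] == certName[2:]
}

func main() {
	fmt.Println(wildcardCovers("*.example.com", "subdomain.example.com"))
	fmt.Println(wildcardCovers("*.example.com", "a.b.example.com"))
	fmt.Println(wildcardCovers("*.example.com", "example.com"))
}
```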
Andrew Ayer 4132ed5e9f Add support for IDNs
IDNs can be specified in either Unicode or ASCII (as Punycode).
Certs can specify the DNS name either way, and we'll match it.
2016-04-26 14:38:09 -07:00
Andrew Ayer 2d2aa37202 Parse common names separately from DNS names 2016-04-22 20:58:33 -07:00
Andrew Ayer e091186d83 Save consistency proof along with evidence of misbehavior
Although the consistency proof is neither necessary nor sufficient
to prove misbehavior by a log, this will help with debugging if a
log returns a bogus consistency proof erroneously (which seems to
be happening with the Rocketeer log lately...).
2016-04-06 08:10:06 -07:00
Andrew Ayer 81bfa0bbd8 Add ctparsewatch
It watches for certificates which we can't fully parse
2016-03-23 20:19:39 -07:00
Andrew Ayer 616ac0cb83 Adjust gitignore 2016-03-23 20:04:55 -07:00
Andrew Ayer 3b59332bf1 Rename a function for clarity 2016-03-17 16:34:53 -07:00
Andrew Ayer a071e9490a Replace embedded X509 parser with my own lightweight parser 2016-03-16 16:59:37 -07:00
Andrew Ayer 5ccf9fdcd3 ctwatch: allow state dir to be set by $CTWATCH_STATE_DIR 2016-03-08 07:09:26 -08:00
Andrew Ayer 5803389588 Fix some pointer inconsistencies in code 2016-02-22 15:29:52 -08:00
Andrew Ayer 09c37cfdfd Clarify a flag 2016-02-22 15:14:17 -08:00
Andrew Ayer 8f3bd3b6ff Improve logging 2016-02-22 14:58:11 -08:00
Andrew Ayer b297ba9967 Use bits in the exit code to convey what happened 2016-02-22 14:45:50 -08:00
Andrew Ayer 40123f9ba8 Allow . to be specified on stdin as well 2016-02-22 14:18:56 -08:00
Andrew Ayer df6527b165 Change -all_time to only affect logs we haven't seen before
It's more useful this way - there's no sense in scanning logs we've
already scanned.

I need a better name for this switch, though.
2016-02-20 12:04:07 -08:00
Andrew Ayer ff44576c87 Save old and new STHs if consistency proof fails 2016-02-18 12:40:21 -08:00
Andrew Ayer 16bf546258 Embed Google CT library, with my own changes 2016-02-18 10:44:56 -08:00
Andrew Ayer 3c33dc8277 Remove sha1watch 2016-02-18 10:41:55 -08:00
Andrew Ayer e91d7bacbd Minor cleanup to improve encapsulation 2016-02-18 10:23:07 -08:00
Andrew Ayer b47d35a005 Rename some types/functions for clarity 2016-02-18 10:15:56 -08:00
Andrew Ayer 35eef25f4a Rename function for clarity 2016-02-18 10:09:33 -08:00
Andrew Ayer 9558efc955 Verify STH signatures 2016-02-17 16:03:49 -08:00
Andrew Ayer 4b304fd192 Audit Merkle tree when retrieving entries
Also add an -all_time command line option to retrieve all certificates,
not just the ones since the last scan.
2016-02-17 14:54:40 -08:00
Andrew Ayer b6dec7822d Overhaul to be more robust and simpler
All certificates are now parsed with a special, extremely
lax parser that extracts only the DNS names.  Only if the
DNS names match the domains we're interested in will we attempt
to parse the cert with the real X509 parser.  This ensures that
we won't miss a very badly encoded certificate that has been
issued for a monitored domain.

As of the time of commit, the lax parser is able to process every
logged certificate in the known logs.
2016-02-09 10:28:52 -08:00
Andrew Ayer a79cc26570 Include filename of saved cert in output/script invocation 2016-02-05 08:20:12 -08:00
Andrew Ayer cfaf126284 To monitor all domains, require "." to be specified
Now that we save all certs by default, we want to prevent people
from accidentally monitoring all domains, which could lead to MASSIVE
disk usage.

"." is used because it denotes the root zone in DNS.
2016-02-05 08:13:11 -08:00
Andrew Ayer 3f596730a0 New and simplified multi-log operation 2016-02-04 20:16:25 -08:00
Andrew Ayer a418a3686d Initial commit 2016-02-04 18:46:19 -08:00