If -all_time is specified, scan the entirety of all logs, even
existing logs. This matches user expectation better. Previously,
-all_time had no impact on existing logs.
The first time Cert Spotter is run, do not scan any logs, unless
-all_time is specified. This avoids a several hour wait the first
time Cert Spotter is run. If the user is interested in knowing
about existing certificates, they can use the certspotter.com API
or crt.sh. This is the same as existing behavior.
When a new log is added, scan it in its entirety even if -all_time is
not specified, so users are alerted to interesting certificates in the
new log. Hopefully new logs will be small and this won't take too long!
Previously, new logs were not scanned in their entirety unless -all_time
was specified.
Closes: #5
Watchlist is now read from ~/.certspotter/watchlist by default, or from
the file specified by -watchlist (- for stdin).
By default, only exact DNS names are matched. To match both the domain
itself and all sub-domains, prefix with a dot (e.g. .example.com).
Comments are now allowed in watchlist files.
e.g. contains control characters, Punycode conversion fails
There are quite simply too many certs with bogus DNS labels out in the wild,
and it just doesn't make sense to bother every .com domain holder because
GoDaddy signed a cert with a DNS name like "www. just4funpartyrentals.com"
It is highly unlikely any validator will ever match that DNS name.
Although the consistency proof is neither necessary nor sufficient
to prove misbehavior by a log, this will help with debugging if a
log returns a bogus consistency proof erroneously (which seems to
be happening with the Rocketeer log lately...).
All certificates are now parsed with a special, extremely
lax parser that extracts only the DNS names. Only if the
DNS names match the domains we're interested in will we attempt
to parse the cert with the real X509 parser. This ensures that
we won't miss a very badly encoded certificate that has been
issued for a monitored domain.
As of the time of commit, the lax parser is able to process every
logged certificate in the known logs.
Now that we save all certs by default, we want to prevent people
from accidentally monitoring all domains, which could lead to MASSIVE
disk usage.
"." is used because it denotes the root zone in DNS.