Previously, if we rounded down the tree size to avoid downloading a
partial tile, but the log position was already within the partial tile
(which can happen with a brand new log and -start_at_end), we'd generate
a download batch where end < begin, which caused all sorts of problems.
end < begin can arise if we've rounded down end to avoid downloading a
partial tile, but the log position is already within the partial tile
(which can happen with a brand new log and -start_at_end).
Instead use the time at which the STH was observed (which for
FilesystemState is assumed to be the mtime of the STH file). This is
easier to reason about: we don't have to worry about logs lying about
the time; we don't have to take into account the delay between STH fetch
and healthcheck; we won't raise spurious health checks about logs with
MMDs longer than the healthcheck interval.
This bug caused certspotter to always request 1000 entries even if
went beyond the size of the log. This would have prevented
certspotter from downloading entries near the end of the log, if the log was
strict with get-entries bounds.
In practice, none of the active CT logs are strict with get-entries bounds,
and even if a log were strict, certspotter would have been able to successfully
download the entries later once the log grew.
Specifically, certspotter no longer terminates unless it receives SIGTERM
or SIGINT or there is a serious error.
Although using cron made sense in the early days of Certificate
Transparency, certspotter now needs to run continuously to reliably keep
up with the high growth rate of contemporary CT logs, and to gracefully
handle the many transient errors that can arise when monitoring CT.
Closes: #63Closes: #37Closes: #32 (presumably by eliminating $DNS_NAMES and $IP_ADDRESSES)
Closes: #21 (with $WATCH_ITEM)
Closes: #25