Log errors are so frequent that they are drowning out fatal errors. This commit will reserve stderr for fatal errors by default. See #104 for background.
This means that operators will need to enable -verbose if they want to get details about why a health check failed. This seems better than making stderr noisy by default. The long-term solution is #106.
Previously, if we rounded down the tree size to avoid downloading a
partial tile, but the log position was already within the partial tile
(which can happen with a brand new log and -start_at_end), we'd generate
a download batch where end < begin, which caused all sorts of problems.
end < begin can arise if we've rounded down end to avoid downloading a
partial tile, but the log position is already within the partial tile
(which can happen with a brand new log and -start_at_end).
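One way to guard against this condition is to clamp end so that it never falls below begin. The following is a minimal sketch, not the actual certspotter code: the name batchRange and the tileSize value of 256 are assumptions made for illustration.

```go
package main

// tileSize is an assumed number of entries per tile (hypothetical value).
const tileSize = 256

// batchRange returns the half-open range [begin, end) of entries to download.
// position is the current log position; treeSize is the size from the latest STH.
func batchRange(position, treeSize uint64) (begin, end uint64) {
	begin = position
	// Round end down to a tile boundary to avoid downloading a partial tile.
	end = treeSize - treeSize%tileSize
	// If the position is already inside the partial tile, rounding down
	// would yield end < begin; clamp so the batch is empty instead.
	if end < begin {
		end = begin
	}
	return begin, end
}
```

With position 300 and tree size 400, rounding down gives 256, so the clamp produces an empty batch at 300 rather than an inverted range.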
Instead use the time at which the STH was observed (which for
FilesystemState is assumed to be the mtime of the STH file). This is
easier to reason about: we don't have to worry about logs lying about
the time; we don't have to take into account the delay between STH fetch
and healthcheck; we won't raise spurious health check failures for logs
with MMDs longer than the healthcheck interval.
Without fsync, there's a risk of zero-length files being persisted if
there's a power failure.
Don't bother fsyncing the parent directory because it's OK if the data rolls
back to the previous version; we only need to avoid data corruption.
Closes: #101
This bug caused certspotter to always request 1000 entries, even if that
went beyond the size of the log. This would have prevented certspotter
from downloading entries near the end of the log if the log was strict
about get-entries bounds.
In practice, none of the active CT logs are strict with get-entries bounds,
and even if a log were strict, certspotter would have been able to successfully
download the entries later once the log grew.
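For illustration, a bounds-respecting request range can be computed as in the sketch below. The function name getEntriesRange is hypothetical; the batch size of 1000 comes from the description above, and the inclusive start and end indices follow the get-entries parameters in RFC 6962.

```go
package main

// maxBatch is the number of entries requested per get-entries call.
const maxBatch = 1000

// getEntriesRange returns the inclusive [first, last] indices to request,
// clamped so that last never exceeds the final entry in the log.
// ok is false when there is nothing left to download.
func getEntriesRange(start, treeSize uint64) (first, last uint64, ok bool) {
	if start >= treeSize {
		return 0, 0, false
	}
	last = start + maxBatch - 1
	if last > treeSize-1 {
		last = treeSize - 1 // do not request past the end of the log
	}
	return start, last, true
}
```

For a log with 10 entries, a request starting at 0 covers indices 0 through 9 rather than 0 through 999.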