Age | Commit message (Collapse) | Author |
|
Its a bit unclean to create an item just to let the item decide that it
can't do anything and let it fail, so instead we let the item creator
decide in all cases if patching should be attempted.
Also pulls a small trick to get the hashes for the current file without
calculating them by looking at the 'old' Release file if we have it.
Git-Dch: Ignore
|
|
At the moment we only have hashes for the uncompressed pdiff files, but
via the new '$HASH-Download' field in the .diff/Index hashes can be
provided for the .gz compressed pdiff file, which apt will pick up now
and use to verify the download. Now, we "just" need a buy in from the
creators of repositories…
|
|
Git-Dch: Ignore
|
|
The rred parser is very accepting regarding 'invalid' files. Given that
we can't trust the input it might be a bit too relaxed. In any case,
checking for more errors can't hurt given that we support only a very
specific subset of ed commands.
|
|
rred is responsible for unpacking and reading the patch files in one go,
but we currently only have hashes for the uncompressed patch files, so
the handler read the entire patch file before dispatching it to the
worker which would read it again – both with an implicit uncompress.
Worse, while the workers operate in parallel the handler is the central
orchestration unit, so having it busy with work means the workers do
(potentially) nothing.
This means rred is working with 'untrusted' data, which is bad. Yet,
having the unpack in the handler meant that the untrusted uncompress was
done as root which isn't better either. Now, we have it at least
contained in a binary which we can harden a bit better. In the long run,
we want hashes for the compressed patch files through to be safe.
|
|
Having every item having its own code to verify the file(s) it handles
is an errorprune process and easy to break, especially if items move
through various stages (download, uncompress, patching, …). With a giant
rework we centralize (most of) the verification to have a better
enforcement rate and (hopefully) less chance for bugs, but it breaks the
ABI bigtime in exchange – and as we break it anyway, it is broken even
harder.
It shouldn't effect most frontends as they don't deal with the acquire
system at all or implement their own items, but some do and will need to
be patched (might be an opportunity to use apt on-board material).
The theory is simple: Items implement methods to decide if hashes need to
be checked (in this stage) and to return the expected hashes for this
item (in this stage). The verification itself is done in worker message
passing which has the benefit that a hashsum error is now a proper error
for the acquire system rather than a Done() which is later revised to a
Failed().
|
|
If we e.g. fail on hash verification for Packages.xz its highly unlikely
that it will be any better with Packages.gz, so we just waste download
bandwidth and time. It also causes us always to fallback to the
uncompressed Packages file for which the error will finally be reported,
which in turn confuses users as the file usually doesn't exist on the
mirrors, so a bug in apt is suspected for even trying it…
|
|
Conflicts:
apt-pkg/pkgcache.h
debian/changelog
methods/https.cc
methods/server.cc
test/integration/test-apt-download-progress
|
|
Git-Dch: ignore
|
|
Conflicts:
apt-pkg/deb/dpkgpm.cc
|
|
The underlying problem is that libapt-pkg does not correctly parse these
provides. Internally, it creates a version named "baz:i386" with
architecture amd64. Of course, such a package name is invalid and thus
this version is completely inaccessible. Thus, this bug should not cause
apt to accept a broken situation as valid. Nevertheless, it prevents
using architecture qualified depends.
Closes: 777071
|
|
Add a regression test that reproduced the hang of apt when a
partial file is present.
Git-Dch: ignore
|
|
The variable "Size" was misleading and caused bug #1445239. To
avoid similar issues in the future, rename it to make the meaning
more obvious.
git-dch: ignore
|
|
The apt http code parses Content-Length and Content-Range. For
both requests the variable "Size" is used and the semantic for
this Size is the total file size. However Content-Length is not
the entire file size for partital file requests. For servers that
send the Content-Range header first and then the Content-Length
header this can lead to globbing of Size so that its less than
the real file size. This may lead to a subsequent passing of a
negative number into the CircleBuf which leads to a endless
loop that writes data.
Thanks to Anton Blanchard for the analysis and initial patch.
LP: #1445239
|
|
|
|
Valid-Until protects us from long-living downgrade attacks, but not all
repositories have it and an attacker could still use older but still
valid files to downgrade us. While this makes it sounds like a security
improvement now, its a bit theoretical at best as an attacker with
capabilities to pull this off could just as well always keep us days
(but in the valid period) behind and always knows which state we have,
as we tell him with the If-Modified-Since header. This is also why this
is 'silently' ignored and treated as an IMSHit rather than screamed at
the user as this can at best be an annoyance for attackers.
An error here would 'regularily' be encountered by users by out-of-sync
mirrors serving a single run (e.g. load balancer) or in two consecutive
runs on the other hand, so it would just help teaching people ignore it.
That said, most of the code churn is caused by enforcing this additional
requirement. Crisscross from InRelease to Release.gpg is e.g. very
unlikely in practice, but if we would ignore it an attacker could
sidestep it this way.
|
|
Not all servers we are talking to support If-Modified-Since and some are
not even sending Last-Modified for us, so in an effort to detect such
hits we run a hashsum check on the 'old' compared to the 'new' file, we
got the hashes for the 'new' already for "free" from the methods anyway
and hence just need to calculated the old ones.
This allows us to detect hits even with unsupported servers, which in
turn means we benefit from all the new hit behavior also here.
|
|
It isn't used much compared to what the methodname suggests, but in the
remaining uses it can't hurt to check more than strictly necessary by
calculating and verifying with all hashes we can compare with rather
than "just" the best known hash.
|
|
If we have the expected hashes we can check with them if the file we
have in partial we got a 416 for is the expected file. We detected this
with same-size before, but not every server sends a good Content-Range
header with a 416 response.
|
|
While it is mostly busywork to rewrite all instances it actually fixes
bugs as the data storage used by the new method is std::string rather
than a char*, the later mostly created by c_str() from a std::string
which the caller has to ensure keeps in scope – something apt-ftparchive
actually didn't ensure and relied on copy-on-write behavior instead
which c++11 forbids and hence the new default gcc abi doesn't use it.
|
|
TFRewrite is okay, but it has obscure limitations (256 Tags), even more
obscure bugs (order for renames is defined by the old name) and the
interface is very c-style encouraging bad usage like we do it in
apt-ftparchive passing massive amounts of c_str() from std::string in.
The old-style is marked as deprecated accordingly. The next commit will
fix all places in the apt code to not use the old-style anymore.
|
|
In 66c3875df391b1120b43831efcbe88a78569fbfe we workaround/fixed a
problem where the code makes the assumption that the compiler uses
copy-on-write implementations for std::string. Turns out that for c++11
compatibility gcc >= 5 will stop doing this by default.
|
|
dpkg and dak know various field names and order them in their output,
while we have yet another order and have to play catch up with them as
we are sitting between chairs here and neither order is ideal for us,
too.
A little testcase is from now on supposed to help ensureing that we do
not derivate to far away from which fields dpkg knows and orders.
|
|
This rename with value is ordered by the 'old' name 'Source', but should
be ordered by the new name… by splitting the operation in a delete and a
new field we can easily fix this problem locally for now.
|
|
The helper expects to be told if it should generate messages, not where
these messages should be printed – as it isn't printing such messages,
but puts them in _error. apt-get uses in other methods a helper
specialisation which does also print stuff to a stream through, so this
is likely a copy&paste error.
Git-Dch: Ignore
|
|
Git-Dch: Ignore
|
|
Slightly rewriting the code to ensure we only use two sources for the
versions as it could otherwise be confusing to look at.
|
|
dpkg -l < 1.16.2 loads the available file and hence sees a package which
later versions do not see, leading to failures on travis-ci.
The different versions also have slightly different messages.
Git-Dch: Ignore
|
|
If the pin for a generic pin is 0, it get a value by strange looking
rules, if the pin is specific the rules are at least not strange, but
the value 989 is a magic number without any direct meaning… but both
never happens in practice as the parsing skips such entries with a
warning, so there always is a priority != 0 and the code therefore never
used.
|
|
The documentation says this, but the code only agreed while evaluating
specific packages, but not generics. These needed a pin above 1000 to
have the same effect.
The code causing this makes references to a 'second pesduo status file',
but nowhere is explained what this might stand for and/or what it was,
so we do the only reasonable thing: Remove all references and do as
documented.
|
|
Git-Dch: Ignore
|
|
Especially pdiff-enhanced downloads have the tendency to fail for
various reasons from which we can recover and even a successful download
used to leave the old unpatched index in partial/.
By adding a new method responsible for making the transaction of an
individual file happen we can at specialisations especially for abort
cases to deal with the cleanup.
This also helps in keeping the compressed indexes around if another
index failed instead of keeping the decompressed files, which we
wouldn't pick up in the next call.
|
|
|
|
|
|
The fix for #777760 causes packages of foreign (and the native)
architectures, to be created correctly, but invalidates (like the
previously existing, but policy-forbidden architecture-less packages
we had to support for some upgrade scenarios) the assumption that the
first (and only) package in the cache for a single architecture system
must be the package for the native architecture (as, where should the
other architectures come from, right? Wrong.).
Depending on the order of parsing sources more or less packages can be
effected by this. The effects are strange (for apt it mostly effects
simulation/debug output, but also apt-mark on these specific packages),
which complicates debugging, but relatively harmless if understood as
most actions do not need direct named access to packages.
The problem is fixed by removing the single-arch special casing in the
paths who had them (Cache.FindPkg), so they use the same code as
multi-arch systems, which use them as a wrapper for Grp.FindPkg.
Note that single-arch system code was using Grp.FindPkg before as well
if a Grp structure was handily available, so we don't introduce new
untested code here: We remove more brittle special cases which are less
tested instead (this was planed to be done for Stretch anyhow).
Note further that the method with the assumption itself isn't fixed. As
it is a private method I opted for declaring it deprecated instead and
remove all its call positions. As it is private no-one can call this
method legally (thanks to how c++ works by default its still an exported
symbol through) and fixing it basically means reimplementing code we
already have in Grp.FindPkg.
Removing rather than fixing seems hence like a good solution.
Closes: 782777
Thanks: Axel Beckert for testing
|
|
Conflicts:
apt-pkg/acquire-item.cc
cmdline/apt-key.in
methods/https.cc
test/integration/test-apt-key
test/integration/test-multiarch-foreign
|
|
The sibling of this message are all guarded as debug messages, just this
one had it missing an subsequently causes display issues if triggered.
Git-Dch: Ignore
|
|
If we get a IMSHit for the Transaction-Manager (= the InRelease file or
as its still supported fallback Release + Release.gpg combo) we can
assume that every file we would queue based on this manager, but already
have locally is current and hence would get an IMSHit, too. We therefore
save us and the server the trouble and skip the queuing in this case.
Beside speeding up repetative executions of 'apt-get update' this way we
also avoid hitting hashsum errors if the indexes are in fact already
updated, but the Release file isn't yet as it is the case on well
behaving mirrors as Release files is updated last.
The implementation is a bit harder than the theory makes it sound as we
still have to keep reverifying the Release files (e.g. to detect now expired
once to avoid an attacker being able to silently stale us) and have to
handle cases in which the Release file hits, but some indexes aren't
present (e.g. user added a new foreign architecture).
|
|
Calculating the final name of an item which it will have after
everything is done and verified successfully is suprisingly complicated
as while they all follow a simple pattern, the URI and where it is
stored varies between the items.
With some (abibreaking) redesign we can abstract this similar to how it
is already down for the partial file location.
Git-Dch: Ignore
|
|
Checking Valid-Until on an unsigned Release file doesn't give us any
security brownie points as an attacker could just change the date and in
practice repositories with unsigned Release files will very likely not
have a Valid-Until date, but for symetry and the fact that being
unsigned is currently just a warning, while expired is a fatal error.
|
|
Its a bit unpredictable which permissons and owners we will encounter on
a CD-ROM (or a USB stick, as apt-cdrom is responsible for those too),
so we have to ensure in this codepath as well that everything is nicely
setup without waiting for a 'apt-get update' to fix up the (potential)
mess.
|
|
We do this in HTTP already to give the CPU some exercise while the disk
is heavily spinning (or flashing?) to store the data avoiding the need
to reread the entire file again later on to calculate the hashes – which
happens outside of the eyes of progress reporting, so you might ended up
with a bunch of https workers 'stuck' at 100% while they were busy
calculating hashes.
This is a bummer for everyone using apt as a connection speedtest as the
https method works slower now (not really, it just isn't reporting done
too early anymore).
|
|
Closes: 782122
|
|
Methods get told which hashes are expected by the acquire system, which
means we can use this list to restrict what we calculate in the methods
as any extra we are calculating is wasted effort as we can't compare it
with anything anyway.
Adding support for a new hash algorithm is therefore 'free' now and if a
algorithm is no longer provided in a repository for a file, we
automatically stop calculating it.
In practice this results in a speed-up in Debian as we don't have SHA512
here (so far), so we practically stop calculating it.
|
|
Git-Dch: Ignore
|
|
Upstream claims its faster if combined with an optimizing compiler and
I can confirm that in some tests, so lets see how it works out in
practice.
Git-Dch: Ignore
|
|
Servers who advertise that they close the connection get the 'Closes'
encoding flag, but this conflicts with servers who response with a
transfer-encoding (e.g. encoding) as it is saved in the same flag.
We have a better flag for the keep-alive (or not) of the connection
anyway, so we check this instead of the encoding.
This is in practice not much of a problem as real servers we talk to are
HTTP1.1 servers (with keep-alive) and there isn't much point in doing
chunked encoding if you are going to close anyway, but our simple
testserver stumbles over this if pressed and its a bit cleaner, too.
Git-Dch: Ignore
|
|
file sends information about the uncompressed file if it can find it as
well as for the compressed file. This was done only for gzip so far, but
we support more compression types. That this information isn't used a
lot is a different story.
Git-Dch: Ignore
|
|
Git-Dch: Ignore
|
|
The worker expects that the methods tell him when they start or finish
downloading a file. Various information pieces are passed along in this
report including the (expected) filesize. https was using a "global"
struct for reporting which made it 'reuse' incorrect values in some
cases like a non-existent InRelease fallbacking to Release{,.gpg}
resulting in a size-mismatch warning. Reducing the scope and redesigning
the setting of the values we can fix this and related issues.
Closes: 777565, 781509
Thanks: Robert Edmonds and Anders Kaseorg for initial patchs
|