Age | Commit message (Collapse) | Author |
|
Reported-By: gcc
Git-Dch: Ignore
|
|
Some additional files like 'Contents' are very big and should therefore
kept compressed on the disk, which apt-file did in the past. It also
implemented pdiff patching of these files by un- and recompressing these
files on-the-fly, with this commit we can do the same – but we can do
this in both pdiff patching styles (client and server merging) and
secured by hashes.
Hashes are in so far slightly complicated as we can't compare the hashes
of the compressed files as we might compress them differently than the
server would (different compressor versions, options, …), so we must
compare the hashes of the uncompressed content.
While this commit has changes in public headers, the classes it changes
are marked as hidden, so nobody can use them directly, which means the
ABI break is internal only.
|
|
Reported-By: codespell
|
|
Thanks: Julian Andres Klode
Git-Dch: ignore
|
|
Also add "Debug::Acquire::SrvRecs" debug option and the option
"Acquire::EnableSrvRecods" to allow disabling this lookup.
|
|
|
|
feature/srv-records
|
|
Conflicts:
cmdline/apt-helper.cc
cmdline/makefile
|
|
This allows us to run the clang static analyzer and to run the
testsuite with the clang MemorySanitizer.
|
|
[Commiter comment: Untested, but looks and compiles fine, so what could
possibly go wrong]
Closes: 624727
|
|
C++11 adds the 'override' specifier to mark that a method is overriding
a base class method and error out if not. We hide it in the APT_OVERRIDE
macro to ensure that we keep compiling in pre-c++11 standards.
Reported-By: clang-modernize -add-override -override-macros
Git-Dch: Ignore
|
|
The previous commit returns to the possibility of using just gpgv for
verification proposes. There is one problem through: We can't enforce a
specific keyid without using gpg, but our acquire method can as it
parses gpgv output anyway, so it can deal with good signatures from not
expected signatures and treats them as unknown keys instead.
Git-Dch: Ignore
|
|
There is an option to keep all targets (Packages, Sources, …) compressed
for a while now, but the all-or-nothing approach is a bit limited for
our purposes with additional targets as some of them are very big
(Contents) and rarely used in comparison, so keeping them compressed by
default can make sense, while others are still unpacked.
Most interesting is the copy-change maybe: Copy is used by the acquire
system as an uncompressor and it is hence expected that it returns the
hashes for the "output", not the input. Now, in the case of keeping a
file compressed, the output is never written to disk, but generated in
memory and we should still validated it, so for compressed files copy is
expected to return the hashes of the uncompressed file. We used to use
the config option to enable on-the-fly decompress in the method, but in
reality copy is never used in a way where it shouldn't decompress a
compressed file to get its hashes, so we can save us the trouble of
sending this information to the method and just do it always.
|
|
Limits which key(s) can be used to sign a repository. Not immensely useful
from a security perspective all by itself, but if the user has
additional measures in place to confine a repository (like pinning) an
attacker who gets the key for such a repository is limited to its
potential and can't use the key to sign its attacks for an other (maybe
less limited) repository… (yes, this is as weak as it sounds, but having
the capability might come in handy for implementing other stuff later).
|
|
All other methods call it, so they should follow along even if the work
they do afterwards is hardly breathtaking and usually results in a
URIDone pretty soon, but the acquire system tells the individual item
about this via a virtual method call, so even through none of our
existing items contains any critical code in these, maybe one day they
might. Consistency at least once…
Which is also why this has a good sideeffect: file: and cdrom: requests
appear now in the 'apt-get update' output. Finally - it never made sense
to hide them for me. Okay, I guess it made before the new hit behavior,
but now that you can actually see the difference in an update it makes
sense to see if a file: repository changed or not as well.
|
|
'file' isn't using the destination file per-se, but returns another name
via "Filename" header. It still should deal with destination files as
they could exist (pkgAcqFile e.g. creates links in that location) and
are potentially bogus.
|
|
For some reason travis seems to be unhappy about it claiming it
is not defined. Well, lets not think to deeply about it…
Git-Dch: Ignore
|
|
At the moment we only have hashes for the uncompressed pdiff files, but
via the new '$HASH-Download' field in the .diff/Index hashes can be
provided for the .gz compressed pdiff file, which apt will pick up now
and use to verify the download. Now, we "just" need a buy in from the
creators of repositories…
|
|
The rred parser is very accepting regarding 'invalid' files. Given that
we can't trust the input it might be a bit too relaxed. In any case,
checking for more errors can't hurt given that we support only a very
specific subset of ed commands.
|
|
rred is responsible for unpacking and reading the patch files in one go,
but we currently only have hashes for the uncompressed patch files, so
the handler read the entire patch file before dispatching it to the
worker which would read it again – both with an implicit uncompress.
Worse, while the workers operate in parallel the handler is the central
orchestration unit, so having it busy with work means the workers do
(potentially) nothing.
This means rred is working with 'untrusted' data, which is bad. Yet,
having the unpack in the handler meant that the untrusted uncompress was
done as root which isn't better either. Now, we have it at least
contained in a binary which we can harden a bit better. In the long run,
we want hashes for the compressed patch files through to be safe.
|
|
Having every item having its own code to verify the file(s) it handles
is an errorprune process and easy to break, especially if items move
through various stages (download, uncompress, patching, …). With a giant
rework we centralize (most of) the verification to have a better
enforcement rate and (hopefully) less chance for bugs, but it breaks the
ABI bigtime in exchange – and as we break it anyway, it is broken even
harder.
It shouldn't effect most frontends as they don't deal with the acquire
system at all or implement their own items, but some do and will need to
be patched (might be an opportunity to use apt on-board material).
The theory is simple: Items implement methods to decide if hashes need to
be checked (in this stage) and to return the expected hashes for this
item (in this stage). The verification itself is done in worker message
passing which has the benefit that a hashsum error is now a proper error
for the acquire system rather than a Done() which is later revised to a
Failed().
|
|
Conflicts:
apt-pkg/pkgcache.h
debian/changelog
methods/https.cc
methods/server.cc
test/integration/test-apt-download-progress
|
|
Git-Dch: ignore
|
|
Conflicts:
apt-pkg/deb/dpkgpm.cc
|
|
The variable "Size" was misleading and caused bug #1445239. To
avoid similar issues in the future, rename it to make the meaning
more obvious.
git-dch: ignore
|
|
The apt http code parses Content-Length and Content-Range. For
both requests the variable "Size" is used and the semantic for
this Size is the total file size. However Content-Length is not
the entire file size for partital file requests. For servers that
send the Content-Range header first and then the Content-Length
header this can lead to globbing of Size so that its less than
the real file size. This may lead to a subsequent passing of a
negative number into the CircleBuf which leads to a endless
loop that writes data.
Thanks to Anton Blanchard for the analysis and initial patch.
LP: #1445239
|
|
Not all servers we are talking to support If-Modified-Since and some are
not even sending Last-Modified for us, so in an effort to detect such
hits we run a hashsum check on the 'old' compared to the 'new' file, we
got the hashes for the 'new' already for "free" from the methods anyway
and hence just need to calculated the old ones.
This allows us to detect hits even with unsupported servers, which in
turn means we benefit from all the new hit behavior also here.
|
|
If we have the expected hashes we can check with them if the file we
have in partial we got a 416 for is the expected file. We detected this
with same-size before, but not every server sends a good Content-Range
header with a 416 response.
|
|
We do this in HTTP already to give the CPU some exercise while the disk
is heavily spinning (or flashing?) to store the data avoiding the need
to reread the entire file again later on to calculate the hashes – which
happens outside of the eyes of progress reporting, so you might ended up
with a bunch of https workers 'stuck' at 100% while they were busy
calculating hashes.
This is a bummer for everyone using apt as a connection speedtest as the
https method works slower now (not really, it just isn't reporting done
too early anymore).
|
|
Methods get told which hashes are expected by the acquire system, which
means we can use this list to restrict what we calculate in the methods
as any extra we are calculating is wasted effort as we can't compare it
with anything anyway.
Adding support for a new hash algorithm is therefore 'free' now and if a
algorithm is no longer provided in a repository for a file, we
automatically stop calculating it.
In practice this results in a speed-up in Debian as we don't have SHA512
here (so far), so we practically stop calculating it.
|
|
Servers who advertise that they close the connection get the 'Closes'
encoding flag, but this conflicts with servers who response with a
transfer-encoding (e.g. encoding) as it is saved in the same flag.
We have a better flag for the keep-alive (or not) of the connection
anyway, so we check this instead of the encoding.
This is in practice not much of a problem as real servers we talk to are
HTTP1.1 servers (with keep-alive) and there isn't much point in doing
chunked encoding if you are going to close anyway, but our simple
testserver stumbles over this if pressed and its a bit cleaner, too.
Git-Dch: Ignore
|
|
file sends information about the uncompressed file if it can find it as
well as for the compressed file. This was done only for gzip so far, but
we support more compression types. That this information isn't used a
lot is a different story.
Git-Dch: Ignore
|
|
Git-Dch: Ignore
|
|
The worker expects that the methods tell him when they start or finish
downloading a file. Various information pieces are passed along in this
report including the (expected) filesize. https was using a "global"
struct for reporting which made it 'reuse' incorrect values in some
cases like a non-existent InRelease fallbacking to Release{,.gpg}
resulting in a size-mismatch warning. Reducing the scope and redesigning
the setting of the values we can fix this and related issues.
Closes: 777565, 781509
Thanks: Robert Edmonds and Anders Kaseorg for initial patchs
|
|
It might be quite interesting which file (content) made curl freak out
and other methods keep the file around as well.
Git-Dch: Ignore
|
|
to 404"
This reverts commit 1296bc7c466181a7978c313c40a041b34ce3eaeb.
|
|
Working with strings c-style is complicated and error-prune,
so by converting to c++ style we gain some simplicity and
avoid buffer overflows by later extensions.
Git-Dch: Ignore
|
|
The worker expects that the methods tell him when they start or finish
downloading a file. Various information pieces are passed along in this report
including the (expected) filesize. https is using a "global" struct for
reporting which made it 'reuse' incorrect values in some cases like a
non-existent InRelease fallbacking to Release{,.gpg} resulting in an incorrect
size-mismatch warning scaring and desensitizing users as well as being subject
to a race between the write_data and progress callbacks generating incorrect
progress reporting and potentially the same error message.
Other branches as well as the bugreports contain 'better' fixes making the
struct local and other sensible changes, but are larger as a result, so in
this version we opted for short diff with minimal effect above else instead.
Closes: 777565, 781509
Thanks: Robert Edmonds and Anders Kaseorg for initial patchs
|
|
|
|
Bug #778375 uncovered that https wasn't properly integrated in the class
family tree of http as it was supposed to be leading to a NULL pointer
dereference. Fixing this 'properly' was deemed to much diff for
practically no gain that late in the release, so commit
0c2dc43d4fe1d026650b5e2920a021557f9534a6 just fixed the synptom, while
this commit here is fixing the cause plus adding a test.
|
|
|
|
Do not crash in ServerState::HeaderLine if there is no Owner.
Closes: #778375
|
|
Add a explicit ReceivedData to HttpsMethod that indicates when
we got data from the connection so that we can send URISTart()
to the parent.
This is needed because URIStart got moved in f9b4f12d from
the progress_callback to write_data() and it only checks for
Res.Size. In the old code if progress_callback is called by
libcurl (and sets Res.Size) before write_data is called then
URIStart() is never send. Making this a explicit ReceivedData
variable fixes this issue.
|
|
Real webservers (like apache) actually send an error page with a 416
response, but our client didn't expect it leaving the page on the socket
to be parsed as response for the next request (http) or as file content
(https), which isn't what we want at all… Symptom is a "Bad header line"
as html usually doesn't parse that well to an http-header.
This manifests itself e.g. if we have a complete file (or larger) in
partial/ which isn't discarded by If-Range as the server doesn't support
it (or it is just newer, think: mirror rotation).
It is a sort-of regression of 78c72d0ce22e00b194251445aae306df357d5c1a,
which removed the filesize - 1 trick, but this had its own problems…
To properly test this our webserver gains the ability to reply with
transfer-encoding: chunked as most real webservers will use it to send
the dynamically generated error pages.
(The tests and their binary helpers had to be slightly modified to
apply, but the patch to fix the issue itself is unchanged.)
Closes: 768797
|
|
Real webservers (like apache) actually send an error page with a 416
response, but our client didn't expect it leaving the page on the socket
to be parsed as response for the next request (http) or as file content
(https), which isn't what we want at all… Symptom is a "Bad header line"
as html usually doesn't parse that well to an http-header.
This manifests itself e.g. if we have a complete file (or larger) in
partial/ which isn't discarded by If-Range as the server doesn't support
it (or it is just newer, think: mirror rotation).
It is a sort-of regression of 78c72d0ce22e00b194251445aae306df357d5c1a,
which removed the filesize - 1 trick, but this had its own problems…
To properly test this our webserver gains the ability to reply with
transfer-encoding: chunked as most real webservers will use it to send
the dynamically generated error pages.
Closes: 768797
|
|
We use it in other places already as well even though it is farly new
addition to the POSIX family with 2008, but rolling our own here is
really something which should be avoided in such a important method.
Git-Dch: Ignore
|
|
'pos_is_okay'
It does not have any desired sideeffect, so we just mark it as const to
properly advertise this fact to developer, compiler and linter alike.
Reported-By: cppcheck
Git-Dch: Ignore
|
|
|
|
Do not drop privileges in the methods when using a older version of
libapt that does not support the chown magic in partial/ yet. To
do this DropPrivileges() now will ignore a empty Apt::Sandbox::User.
Cleanup all hardcoded _apt along the way.
|
|
Instead of using strcat use a C++ std::string to avoid overflowing
this buffer. Thanks to David Garfield
Closes: #76442
|