Age | Commit message (Collapse) | Author |
|
We do this in HTTP already to give the CPU some exercise while the disk
is heavily spinning (or flashing?) to store the data avoiding the need
to reread the entire file again later on to calculate the hashes – which
happens outside of the eyes of progress reporting, so you might ended up
with a bunch of https workers 'stuck' at 100% while they were busy
calculating hashes.
This is a bummer for everyone using apt as a connection speedtest as the
https method works slower now (not really, it just isn't reporting done
too early anymore).
|
|
Methods get told which hashes are expected by the acquire system, which
means we can use this list to restrict what we calculate in the methods
as any extra we are calculating is wasted effort as we can't compare it
with anything anyway.
Adding support for a new hash algorithm is therefore 'free' now and if a
algorithm is no longer provided in a repository for a file, we
automatically stop calculating it.
In practice this results in a speed-up in Debian as we don't have SHA512
here (so far), so we practically stop calculating it.
|
|
The worker expects that the methods tell him when they start or finish
downloading a file. Various information pieces are passed along in this
report including the (expected) filesize. https was using a "global"
struct for reporting which made it 'reuse' incorrect values in some
cases like a non-existent InRelease fallbacking to Release{,.gpg}
resulting in a size-mismatch warning. Reducing the scope and redesigning
the setting of the values we can fix this and related issues.
Closes: 777565, 781509
Thanks: Robert Edmonds and Anders Kaseorg for initial patchs
|
|
Bug #778375 uncovered that https wasn't properly integrated in the class
family tree of http as it was supposed to be leading to a NULL pointer
dereference. Fixing this 'properly' was deemed to much diff for
practically no gain that late in the release, so commit
0c2dc43d4fe1d026650b5e2920a021557f9534a6 just fixed the synptom, while
this commit here is fixing the cause plus adding a test.
|
|
Real webservers (like apache) actually send an error page with a 416
response, but our client didn't expect it leaving the page on the socket
to be parsed as response for the next request (http) or as file content
(https), which isn't what we want at all… Symptom is a "Bad header line"
as html usually doesn't parse that well to an http-header.
This manifests itself e.g. if we have a complete file (or larger) in
partial/ which isn't discarded by If-Range as the server doesn't support
it (or it is just newer, think: mirror rotation).
It is a sort-of regression of 78c72d0ce22e00b194251445aae306df357d5c1a,
which removed the filesize - 1 trick, but this had its own problems…
To properly test this our webserver gains the ability to reply with
transfer-encoding: chunked as most real webservers will use it to send
the dynamically generated error pages.
Closes: 768797
|
|
The Maximum-Size protection breaks the http pipeline reorder code
because it relies on that the object got fetched entirely so that
it can compare the hash of the downloaded data. So instead of
stopping when the Maximum-Size of the expected item is reached we
only stop when the maximum size of the biggest item in the queue
is reached. This way the pipeline reoder code keeps working.
|
|
|
|
|
|
Conflicts:
apt-pkg/acquire-item.cc
apt-pkg/acquire-item.h
apt-pkg/cachefilter.h
configure.ac
debian/changelog
|
|
Prefix all answers with the URL that the answer is for. This
helps when debugging and pipeline is enabled.
|
|
This ensures that we can stop downloading if the server send
too much data by accident (or by a malicious attempt)
|
|
Now that methods have the expected hashes available they can check if
the response from the server is what they expected. Pipelining is one of
those areas in which servers can mess up by not supporting it properly,
which forced us to disable it for the time being. Now, we check if
we got a response out of order, which we can not only use to disable
pipelining automatically for the next requests, but we can fix it up
just like the server responded in proper order for the current requests.
To ensure that this little trick works pipelining is only attempt if we
have hashsums for all the files in the chain which in theory reduces the
use of pipelining usage even on the many servers which work properly,
but in practice only the InRelease file (or similar such) will be
requested without a hashsum – and as it is the only file requested in
that stage it can't be pipelined even if we wanted to.
Some minor annoyances remain: The display of the progress we have
doesn't reflect this change, so it looks like the same package gets
downloaded multiple times while others aren't at all. Further more,
partial files are not supported in this recovery as the received data
was appended to the wrong file, so the hashsum doesn't match.
Both seem to be minor enough to reenable pipelining by default until
further notice through to test if it really solves the problem.
This therefore reverts commit 8221431757c775ee875a061b184b5f6f2330f928.
|
|
Git-Dch: Ignore
Reported-By: gcc -Wsuggest-attribute={pure,const,noreturn}
|
|
Beside being a bit cleaner it hopefully also resolves oddball problems
I have with high levels of parallel jobs.
Git-Dch: Ignore
Reported-By: iwyu (include-what-you-use)
|
|
server.cc: In member function ‘bool ServerState::HeaderLine(std::string)’:
server.cc:198:72: warning: format ‘%llu’ expects argument of type ‘long long unsigned int*’, but argument 3 has type ‘long long int*’ [-Wformat=]
else if (sscanf(Val.c_str(),"bytes %llu-%*u/%llu",&StartPos,&Size) != 2)
Git-Dch: Ignore
Reported-By: gcc -Wpedantic
|
|
Git-Dch: Ignore
Reported-By: gcc -Wpedantic
|
|
The most "visible" change is from utime to utimensat/futimens
as the first one isn't part of POSIX anymore.
Reported-By: cppcheck
Git-Dch: Ignore
|
|
Servers might respond with a complete file either because they don't
support Ranges at all or the If-Range condition isn't statisfied, so we
have to parse the headers curl gets ourself to seek or truncate the file
we have so far.
This also finially adds the testcase testing a bunch of partial
situations for both, http and https - which is now all green.
Closes: 617643, 667699
LP: 1157943
|
|
No effective behavior change, just shuffling big junks of code between
methods and classes to split them into those strongly related to our
client implementation and those implementing HTTP.
The idea is to get HTTPS to a point in which most of the implementation
can be shared even though the client implementations itself is
completely different. This isn't anywhere near yet though, but it should
beenough to reuse at least a few lines from http in https now.
Git-Dch: Ignore
|