path: root/methods/server.cc
Age  Commit message  Author
2018-09-28  http: Stop pipeline after close only if it was not filled before  (Julian Andres Klode)
It is perfectly valid behavior for a server to respond with Connection: close eventually, even when pipelining. Turning off pipelining because of that is wrong. For example, some Ubuntu mirrors close the connection after 101 requests; if there are more packages to install, only the first 101 would benefit from pipelining. This commit introduces a new check that only turns off pipelining for future connections if the pipeline for this connection did not have 3 successful fetches before. That should work quite well to detect broken server/proxy combinations like the one in bug 832113. (cherry picked from commit df696650b7a8c58bbd92e0e1619e956f21010a96) LP: #1794957 (cherry picked from commit 3de7454c796f245371c33076ae01529d6d36d715)
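A minimal sketch of the check described above; the structure and the threshold of 3 follow the commit text, but the names and the code itself are hypothetical, not apt's actual implementation:

    // Hypothetical sketch: only give up on pipelining for future
    // connections if this connection's pipeline never got going.
    struct PipelineState {
        unsigned long AnswersReceived = 0;  // successful fetches so far
        bool PipelineAllowed = true;

        void HandleConnectionClose() {
            // A server may legitimately close a long-running pipeline
            // (e.g. after ~100 requests); that alone proves nothing.
            // Fewer than 3 successful fetches suggests a broken
            // server/proxy combination as in bug 832113.
            if (AnswersReceived < 3)
                PipelineAllowed = false;
        }
    };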
2018-03-06  Revert "http: A response with Content-Length: 0 has no content"  (Julian Andres Klode)
This reverts commit dd547ebaffd2aceb42e2908f1d5f0ab386af6cb1. LP: #1751225
2017-09-13  http: A response with Content-Length: 0 has no content  (Julian Andres Klode)
APT considered any response with a Content-Length header to have a body, even if the value of the header was 0. A body of length 0, however, is equal to no body. (cherry picked from commit d47fb34ae03566feec7fec6dccba80e45fa03e6f)
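In header-parsing terms the fix amounts to a condition like the following; this is an illustrative sketch, not apt's code (and note the change was later reverted, see the 2018-03-06 entry above):

    // Hypothetical: Content-Length: 0 means "no body", the same as
    // having no Content-Length header at all.
    bool HaveContent(bool haveContentLength, unsigned long long contentLength) {
        return haveContentLength && contentLength > 0;
    }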
2017-02-22  basehttp: Only read Content-Range on 416 and 206 responses  (Julian Andres Klode)
This fixes issues with sourceforge where the redirector includes such a Content-Range in a 302 redirect. Since we do not really know what file is meant in a redirect, let's just ignore it for all responses other than 416 and 206. Maybe we should also get rid of the other errors, and just ignore the field in those cases as well? LP: #1657567 (cherry picked from commit 4759a702081297bde66982efed8b2b7fd39ca27c) (cherry picked from commit b5d0e1be09fd07e693bae8046848059f578d029f)
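A hypothetical sketch of the gating described above (function and parameter names invented for illustration):

    #include <string>

    // Only 206 (Partial Content) and 416 (Range Not Satisfiable)
    // responses carry a Content-Range we should act on; one smuggled
    // into e.g. a 302 redirect is ignored.
    bool ShouldParseContentRange(int statusCode, const std::string &tag) {
        return tag == "Content-Range" &&
               (statusCode == 206 || statusCode == 416);
    }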
2016-10-05  don't try pipelining if server closes connections  (David Kalnischkies)
If a server closes a connection after sending us a file, that tends to mean it is the type of server which always closes the connection – it is therefore relatively pointless to try pipelining with it, even if that isn't a problem by itself: apt is just restarting the pipeline each time after it got served one file and the connection is closed. The problem starts if one or more proxies are between the server and apt and they disagree about how the connection should be handled, as in the bug reporter's case where the responses apt gets contain both Keep-Alive and Proxy-Connection headers (both of which apt ignores) indicating a proxy is trying to keep a connection open, while the response also contains "Connection: close" indicating the opposite, which apt understands and respects as it is required to do. We avoid stepping into this abyss by not performing pipelining anymore if we got a response with the indication to close the connection and the response was otherwise a success – error messages are sent by some servers via this method as their pages tend to be created dynamically and hence their size isn't known a priori to them. Closes: #832113 (cherry picked from commit 9714d522056e5256f5a2de587d88eba7cb3291c2)
2016-10-05  http(s): allow empty values for header fields  (David Kalnischkies)
It seems completely pointless from a server POV to send empty header fields, so most of them don't do it (simply proven by this limitation existing since day one) – but it is technically allowed by the RFC as the surrounding whitespace is optional, and Github seems to like sending "X-Geo-Block-List:\r\n" since recently (bug reports in other http clients indicate July), at least sometimes; the reporter claims to have seen it on https only, even though it can happen with both. Closes: 834048 (cherry picked from commit 148c049150cc39f2e40894c1684dc2aefea1117e)
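A sketch of header-line splitting that tolerates such empty values; this is an illustration under invented names, not apt's parser:

    #include <string>

    // Split "Name: value" while tolerating an empty value such as
    // "X-Geo-Block-List:\r\n". The RFC makes the whitespace around
    // the value optional, so "" is a legal value.
    bool SplitHeaderLine(const std::string &line,
                         std::string &name, std::string &value) {
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos)
            return false;                        // not a header field at all
        name = line.substr(0, colon);
        const std::string::size_type start =
            line.find_first_not_of(" \t", colon + 1);
        const std::string::size_type end = line.find_last_not_of(" \t\r\n");
        if (start == std::string::npos || end == std::string::npos || end < start)
            value.clear();                       // empty value: accept it
        else
            value = line.substr(start, end - start + 1);
        return true;
    }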
2016-08-31  use proper warning for automatic pipeline disable  (David Kalnischkies)
Also fixes the message itself to mention the correct option name, as noticed in #832113. (cherry picked from commit b9c20219dc17db1d29eaf297263a4b008bd1b90b)
2016-08-31  close server if parsing of header field failed  (David Kalnischkies)
As seen in #828011: if we fail to parse a header field like Last-Modified, we end up interpreting the data as response headers for coming requests in case we don't rotate to a new server in DNS rotation. (cherry picked from commit cc0a4c82b3c132abba9b9ec35fd61bc8b45a1b80)
2016-06-01  prevent C++ locale number formatting in text APIs  (David Kalnischkies)
Setting the C++ locale via std::locale::global(std::locale("")); – which would otherwise default to the default C locale (aka: unaffected by setlocale) – affects the formatting of numeric types in IO streams. For output meant for humans this is perfectly sensible, but it breaks our many text interfaces used and parsed by us and others who do not expect the numbers to be formatted. Closes: #825396 (cherry picked from commit b58e2c7c56b1416a343e81f9f80cb1f02c128e25)
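A small standalone demonstration of the effect (assuming an en_US-style locale is configured on the system):

    #include <iostream>
    #include <locale>
    #include <sstream>

    int main() {
        std::ostringstream plain;
        plain << 1578962;                     // "1578962": machine-friendly

        // After this, streams imbued with the global locale may apply
        // digit grouping, e.g. "1,578,962" under en_US.UTF-8 - fine
        // for humans, fatal for a parser expecting plain digits.
        std::locale::global(std::locale(""));
        std::ostringstream grouped;
        grouped.imbue(std::locale());
        grouped << 1578962;

        std::cout << plain.str() << " vs " << grouped.str() << '\n';
    }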
2016-01-26  act on various suggestions from cppcheck  (David Kalnischkies)
Reported-By: cppcheck Git-Dch: Ignore
2016-01-12  Only enable pipelining if server is HTTP/1.1  (Julian Andres Klode)
Just enabling it for everyone sometimes breaks with HTTP/1.0 servers and proxies. Closes: #810796
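The gist of the change, as a hypothetical sketch (names invented; apt's pipelining is also governed by the Acquire::http::Pipeline-Depth option):

    #include <string>

    // Pipeline only when the server announced HTTP/1.1 in its status
    // line; HTTP/1.0 servers and proxies often mishandle pipelined
    // requests.
    bool MayPipeline(const std::string &protocol, unsigned int pipelineDepth) {
        return pipelineDepth > 0 && protocol == "HTTP/1.1";
    }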
2015-12-27  Convert most callers of isspace() to isspace_ascii()  (Julian Andres Klode)
This converts all callers that read machine-generated data; callers that might work with user input are not converted.
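For illustration, a locale-independent whitespace test along these lines (a sketch, not necessarily apt's exact isspace_ascii()):

    // True only for the six ASCII whitespace characters, regardless
    // of the current locale.
    inline bool isspace_ascii(int const c) {
        return c == ' ' || c == '\t' || c == '\n'
            || c == '\r' || c == '\v' || c == '\f';
    }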
2015-11-05  allow acquire method specific options via Binary scope  (David Kalnischkies)
Allows users who know what they are getting themselves into with this trick to e.g. disable privilege dropping for file:// until they can fix up the permissions on those repositories. It also helps the test framework and people with a similar setup (= me) to run in less modified environments.
2015-11-04  wrap every unlink call to check for != /dev/null  (David Kalnischkies)
Unlinking /dev/null is bad; we shouldn't do that. Also, we should print at least a warning if we tried to unlink a file but didn't manage to pull it off (ignoring the case where the file is /dev/null or doesn't exist in the first place). This got triggered by a relatively unlikely-to-cause-problems spot in pkgAcquire::Worker::PrepareFiles which would, while handling temporary uncompressed files (which are set to be kept compressed), figure out that two files are the same and prepare for sharing by deleting them. Bad move. That also shows why not printing a warning is a bad idea, as this hid the error in non-root test runs. Git-Dch: Ignore
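A hypothetical sketch of such a wrapper (name and signature invented for illustration):

    #include <cerrno>
    #include <cstdio>
    #include <cstring>
    #include <string>
    #include <unistd.h>

    // Refuse to unlink /dev/null and warn when unlinking an existing
    // file fails; a missing file is not an error.
    bool RemoveFile(char const *who, std::string const &path) {
        if (path.empty() || path == "/dev/null")
            return true;
        if (unlink(path.c_str()) != 0 && errno != ENOENT) {
            std::fprintf(stderr, "W: %s: unlinking %s failed: %s\n",
                         who, path.c_str(), std::strerror(errno));
            return false;
        }
        return true;
    }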
2015-09-14  fix two memory leaks reported by gcc  (David Kalnischkies)
Reported-By: gcc -fsanitize=address -fno-sanitize=vptr Git-Dch: Ignore
2015-08-27  fix various typos reported by codespell  (David Kalnischkies)
Reported-By: codespell
2015-05-22  Merge branch 'debian/sid' into debian/experimental  (Michael Vogt)
Conflicts:
  apt-pkg/pkgcache.h
  debian/changelog
  methods/https.cc
  methods/server.cc
  test/integration/test-apt-download-progress
2015-05-22  Rename "Size" in ServerState to TotalFileSize  (Michael Vogt)
The variable "Size" was misleading and caused bug #1445239. To avoid similar issues in the future, rename it to make the meaning more obvious. git-dch: ignore
2015-05-22  Fix endless loop in apt-get update that can cause disk fillup  (Michael Vogt)
The apt http code parses Content-Length and Content-Range. In both cases the variable "Size" is used, and the semantic of this Size is the total file size. However, Content-Length is not the entire file size for partial file requests. For servers that send the Content-Range header first and then the Content-Length header, this can lead to clobbering of Size so that it is less than the real file size. This may lead to a subsequent passing of a negative number into the CircleBuf, which leads to an endless loop that writes data. Thanks to Anton Blanchard for the analysis and initial patch. LP: #1445239
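A hypothetical sketch of the order-independent bookkeeping the fix (together with the TotalFileSize rename above) aims for; the field names mirror the commit messages but the code is illustrative:

    // Content-Length is the size of this (possibly partial) answer;
    // the total file size comes from Content-Range, if present.
    struct DownloadState {
        unsigned long long DownloadSize = 0;   // from Content-Length
        unsigned long long TotalFileSize = 0;  // from "bytes a-b/total"

        // A partial reply may send Content-Range first and then
        // Content-Length (the partial size); the guard below keeps
        // the later, smaller value from clobbering the total - the
        // trigger of the endless-loop bug.
        void SeenContentLength(unsigned long long v) {
            DownloadSize = v;
            if (TotalFileSize == 0)
                TotalFileSize = v;  // full-file reply: length == total
        }
        void SeenContentRange(unsigned long long total) {
            TotalFileSize = total;
        }
    };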
2015-05-12  detect 416 complete file in partial by expected hash  (David Kalnischkies)
If we have the expected hashes we can use them to check whether the file in partial/ for which we got a 416 is the expected file. We detected this with a same-size check before, but not every server sends a good Content-Range header with a 416 response.
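A sketch of the idea as a free-standing helper; the name and signature are invented for illustration (the real code would use apt's Hashes infrastructure, so the hashing function is passed in here):

    #include <functional>
    #include <string>

    // On a 416, hash what is already sitting in partial/ and compare
    // it with the hash the acquire system expects, instead of
    // trusting a (possibly absent) Content-Range header.
    bool PartialIsAlreadyComplete(
            const std::string &partialPath,
            const std::string &expectedHash,
            const std::function<std::string(const std::string &)> &hashFile) {
        if (expectedHash.empty())
            return false;                      // nothing to compare against
        return hashFile(partialPath) == expectedHash;
    }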
2015-04-19  calculate hashes while downloading in https  (David Kalnischkies)
We do this in HTTP already to give the CPU some exercise while the disk is heavily spinning (or flashing?) to store the data, avoiding the need to reread the entire file again later on to calculate the hashes – which happens outside of the eyes of progress reporting, so you might end up with a bunch of https workers 'stuck' at 100% while they were busy calculating hashes. This is a bummer for everyone using apt as a connection speedtest, as the https method works slower now (not really, it just isn't reporting done too early anymore).
2015-04-19  calculate only expected hashes in methods  (David Kalnischkies)
Methods get told which hashes are expected by the acquire system, which means we can use this list to restrict what we calculate in the methods, as any extra we are calculating is wasted effort since we can't compare it with anything anyway. Adding support for a new hash algorithm is therefore 'free' now, and if an algorithm is no longer provided in a repository for a file, we automatically stop calculating it. In practice this results in a speed-up in Debian as we don't have SHA512 here (so far), so we practically stop calculating it.
2015-03-16  derive more of https from http method  (David Kalnischkies)
Bug #778375 uncovered that https wasn't properly integrated into the class family tree of http as it was supposed to be, leading to a NULL pointer dereference. Fixing this 'properly' was deemed too much diff for practically no gain that late in the release, so commit 0c2dc43d4fe1d026650b5e2920a021557f9534a6 just fixed the symptom, while this commit here is fixing the cause plus adding a test.
2015-03-16  merge debian/sid into debian/experimental  (David Kalnischkies)
2015-02-23  Fix crash in the apt-transport-https when Owner is NULL  (Tomasz Buchert)
Do not crash in ServerState::HeaderLine if there is no Owner. Closes: #778375
2014-12-22  dispose http(s) 416 error page as non-content  (David Kalnischkies)
Real webservers (like apache) actually send an error page with a 416 response, but our client didn't expect it, leaving the page on the socket to be parsed as a response for the next request (http) or as file content (https), which isn't what we want at all… Symptom is a "Bad header line" as html usually doesn't parse that well as an http header. This manifests itself e.g. if we have a complete file (or larger) in partial/ which isn't discarded by If-Range as the server doesn't support it (or the file is just newer, think: mirror rotation). It is a sort-of regression of 78c72d0ce22e00b194251445aae306df357d5c1a, which removed the filesize - 1 trick, but that had its own problems… To properly test this our webserver gains the ability to reply with transfer-encoding: chunked, as most real webservers will use it to send dynamically generated error pages. (The tests and their binary helpers had to be slightly modified to apply, but the patch to fix the issue itself is unchanged.) Closes: 768797
2014-12-09  dispose http(s) 416 error page as non-content  (David Kalnischkies)
Real webservers (like apache) actually send an error page with a 416 response, but our client didn't expect it, leaving the page on the socket to be parsed as a response for the next request (http) or as file content (https), which isn't what we want at all… Symptom is a "Bad header line" as html usually doesn't parse that well as an http header. This manifests itself e.g. if we have a complete file (or larger) in partial/ which isn't discarded by If-Range as the server doesn't support it (or the file is just newer, think: mirror rotation). It is a sort-of regression of 78c72d0ce22e00b194251445aae306df357d5c1a, which removed the filesize - 1 trick, but that had its own problems… To properly test this our webserver gains the ability to reply with transfer-encoding: chunked, as most real webservers will use it to send dynamically generated error pages. Closes: 768797
2014-10-08  Fix ServerMethod::FindMaximumObjectSizeInQueue()  (Michael Vogt)
Git-Dch: ignore
2014-10-08  Fix http pipeline messup detection  (Michael Vogt)
The Maximum-Size protection breaks the http pipeline reorder code because it relies on the object being fetched entirely so that it can compare the hash of the downloaded data. So instead of stopping when the Maximum-Size of the expected item is reached, we only stop when the maximum size of the biggest item in the queue is reached. This way the pipeline reorder code keeps working.
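A hypothetical sketch of what a FindMaximumObjectSizeInQueue() along these lines computes; the real method works on apt's queue structures, this illustration uses a plain vector:

    #include <algorithm>
    #include <vector>

    struct QueueItem { unsigned long long MaximumSize; };

    // With pipelining the next bytes on the wire may belong to any
    // queued item, so the abort threshold must be the largest
    // expected size in the whole queue, not the size of the item we
    // think is next.
    unsigned long long
    FindMaximumObjectSizeInQueue(const std::vector<QueueItem> &queue) {
        unsigned long long max = 0;
        for (const QueueItem &item : queue)
            max = std::max(max, item.MaximumSize);
        return max;
    }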
2014-10-07  make expected-size a maximum-size check as this is what we want at this point  (Michael Vogt)
2014-10-06  make http size check work  (Michael Vogt)
2014-09-27  fix: %i in format string (no. 1) requires 'int' but the argument type is 'unsigned int'  (David Kalnischkies)
Git-Dch: Ignore Reported-By: cppcheck
2014-09-23  Merge branch 'debian/sid' into debian/experimental  (Michael Vogt)
Conflicts:
  apt-pkg/acquire-item.cc
  apt-pkg/acquire-item.h
  apt-pkg/cachefilter.h
  configure.ac
  debian/changelog
2014-09-05  Improve Debug::Acquire::http debug output  (Michael Vogt)
Prefix all answers with the URL that the answer is for. This helps when debugging with pipelining enabled.
2014-08-26  Pass ExpectedSize to the backend method  (Michael Vogt)
This ensures that we can stop downloading if the server sends too much data by accident (or by a malicious attempt).
2014-05-09  reenable pipelining via hashsum reordering support  (David Kalnischkies)
Now that methods have the expected hashes available they can check if the response from the server is what they expected. Pipelining is one of those areas in which servers can mess up by not supporting it properly, which forced us to disable it for the time being. Now, we check if we got a response out of order, which we can not only use to disable pipelining automatically for the next requests, but we can also fix it up just as if the server had responded in the proper order for the current requests. To ensure that this little trick works, pipelining is only attempted if we have hashsums for all the files in the chain, which in theory reduces pipelining usage even on the many servers which work properly, but in practice only the InRelease file (or similar such) will be requested without a hashsum – and as it is the only file requested in that stage it can't be pipelined even if we wanted to. Some minor annoyances remain: The display of the progress we have doesn't reflect this change, so it looks like the same package gets downloaded multiple times while others aren't at all. Furthermore, partial files are not supported in this recovery as the received data was appended to the wrong file, so the hashsum doesn't match. Both seem minor enough to reenable pipelining by default until further notice, to test if it really solves the problem. This therefore reverts commit 8221431757c775ee875a061b184b5f6f2330f928.
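A hypothetical sketch of the recovery described above (types and names invented for illustration): if the body we received hashes to the expected hash of a different queued item, hand the data to that item instead of failing, and remember to stop pipelining with this server afterwards.

    #include <cstddef>
    #include <string>
    #include <vector>

    struct Item { std::string ExpectedHash; };

    // Returns the index of the queued item the received data really
    // belongs to, or -1 if it matches nothing we asked for.
    int FindRealOwner(const std::vector<Item> &queue,
                      const std::string &receivedHash,
                      bool &disablePipelining) {
        for (std::size_t i = 0; i < queue.size(); ++i)
            if (queue[i].ExpectedHash == receivedHash) {
                if (i != 0)
                    disablePipelining = true;  // answered out of order
                return static_cast<int>(i);
            }
        return -1;
    }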
2014-03-13  cleanup headers and especially #includes everywhere  (David Kalnischkies)
Besides being a bit cleaner, it hopefully also resolves oddball problems I have with high levels of parallel jobs. Git-Dch: Ignore Reported-By: iwyu (include-what-you-use)
2014-03-13  warning: extra ‘;’ [-Wpedantic]  (David Kalnischkies)
Git-Dch: Ignore Reported-By: gcc -Wpedantic
2014-02-22  Fix typos in documentation (codespell)  (Michael Vogt)
2014-02-14  allow http protocol to switch to https  (David Kalnischkies)
Switching protocols at random is a bad idea, as e.g. http could switch to file, so we limit the possibilities to http to http and http to https. As very few people (less than 1% according to popcon) have https installed, this likely changes nothing in terms of failure. The commit adds a friendly hint about which package needs to be installed though.
2014-02-11  use utimes instead of utimensat/futimens  (David Kalnischkies)
cppcheck complains about the obsolete utime as it was removed in POSIX.1-2008 and recommends usage of utimensat/futimens instead as those are in POSIX, so commit 9ce3cfc9 switched to them. It is just that they aren't as portable as the standard suggests: at least our kFreeBSD and Hurd ports stumble over them at runtime. So to make both the ports and cppcheck happy, we use utimes instead. Closes: 738567
2014-01-16  correct some style/performance/warnings from cppcheck  (David Kalnischkies)
The most "visible" change is from utime to utimensat/futimens as the first one isn't part of POSIX anymore. Reported-By: cppcheck Git-Dch: Ignore
2013-10-01  refactor http client implementation  (David Kalnischkies)
No effective behavior change, just shuffling big chunks of code between methods and classes to split them into those strongly related to our client implementation and those implementing HTTP. The idea is to get HTTPS to a point at which most of the implementation can be shared, even though the client implementations themselves are completely different. This isn't anywhere near done yet, but it should be enough to reuse at least a few lines from http in https now. Git-Dch: Ignore