path: root/methods/rred.cc
Age | Commit message | Author
2016-07-26 | verify hash of input file in rred | David Kalnischkies
We read the entire input file we want to patch anyhow, so we can also calculate the hash for that file and compare it with what we had expected it to be. Note that this isn't really a security improvement as a) the file we patch is trusted and b) if the input is incorrect, the result will hardly match, so this is just for failing slightly earlier with a more relevant error message (although, in terms of rred, it is ignored and a complete download is attempted instead).
2016-05-28 | use std::locale::global instead of setlocale | David Kalnischkies
We use a wild mixture of C and C++ ways of generating output, so having a consistent world-view in both styles sounds like a good idea and should help in preventing regressions.
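A minimal sketch of the idea, not the code from this commit: `init_locale` is a hypothetical helper name, and the fallback to the classic locale is an assumption about how a caller might want to handle an uninstalled locale.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper: set the C++ global locale from the environment.
// For a named locale, std::locale::global() also updates the C locale,
// so printf-style and iostream-style output follow the same settings.
inline bool init_locale() {
    try {
        std::locale::global(std::locale(""));
        return true;
    } catch (const std::runtime_error &) {
        // The environment names a locale that is not available:
        // fall back to the classic "C" locale.
        std::locale::global(std::locale::classic());
        return false;
    }
}
```

Streams that already exist (such as `std::cout`) keep the locale they were created with; they would need an explicit `imbue()` to follow the new global locale.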
2016-02-04 | rred: If there were I/O errors, fail | Julian Andres Klode
We basically ignored errors from writing and flushing, let's not do that.
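The general pattern can be sketched with plain C stdio as below. This is an illustration only: `write_file_checked` is a hypothetical name, and rred itself does not necessarily use `FILE*` for this.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: write `data` to `path` and report every failure,
// including those that only show up when buffered data is flushed.
inline bool write_file_checked(const char *path, const std::string &data) {
    std::FILE *out = std::fopen(path, "wb");
    if (out == nullptr)
        return false;
    bool ok = std::fwrite(data.data(), 1, data.size(), out) == data.size();
    // fflush() pushes buffered bytes to the OS; errors such as a full
    // disk may only be reported at this point.
    ok = (std::fflush(out) == 0) && ok;
    ok = (std::ferror(out) == 0) && ok;
    // fclose() always runs to release the handle, and it can fail too.
    ok = (std::fclose(out) == 0) && ok;
    return ok;
}
```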
2016-01-26 | act on various suggestions from cppcheck | David Kalnischkies
Reported-By: cppcheck Git-Dch: Ignore
2016-01-08 | allow pdiff bootstrap from all supported compressors | David Kalnischkies
There is no reason to enforce that the file we start the bootstrap with is compressed with a compressor which is available online. This allows us to change the on-disk format as well as deals with repositories adding/removing support for a specific compressor.
2016-01-07 | rred: Run in parallel | Julian Andres Klode
Remove the SingleInstance flag so we can use the new randomized queue feature to run in parallel.
2015-12-27 | rred: Use buffered writes | Julian Andres Klode
Buffered writes improve performance a lot, given that we spent about 78% of the time in _write.
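To illustrate why buffering helps, here is a small self-contained sketch. `BufferedWriter` and its `Sink` callback are hypothetical names for this example and are not the mechanism used in APT; the point is only that many small writes turn into a few large ones.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical buffered writer: collects small writes and hands them to
// `sink` in large chunks, so the expensive call happens far less often.
class BufferedWriter {
public:
    using Sink = std::function<bool(const char *, std::size_t)>;

    BufferedWriter(Sink sink, std::size_t capacity)
        : sink_(std::move(sink)), capacity_(capacity) {
        buffer_.reserve(capacity_);
    }

    bool write(const char *data, std::size_t size) {
        // Flush first if the new data would not fit.
        if (buffer_.size() + size > capacity_ && !flush())
            return false;
        // Data larger than the whole buffer bypasses it.
        if (size > capacity_)
            return sink_(data, size);
        buffer_.append(data, size);
        return true;
    }

    // Must be called once at the end; its result must be checked.
    bool flush() {
        if (buffer_.empty())
            return true;
        bool ok = sink_(buffer_.data(), buffer_.size());
        buffer_.clear();
        return ok;
    }

private:
    Sink sink_;
    std::size_t capacity_;
    std::string buffer_;
};
```

With an 8-byte buffer, three 3-byte writes followed by a final `flush()` reach the sink in two calls instead of three; with realistic buffer sizes the reduction is far larger.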
2015-12-27 | rred: Only call pkgInitConfig() in test mode | Julian Andres Klode
This accidentally slipped in in a previous commit, but it should be used only for testing mode. Reported-By: David Kalnischkies <david@kalnischkies.de>
2015-12-27 | Convert most callers of isspace() to isspace_ascii() | Julian Andres Klode
This converts all callers that read machine-generated data; callers that might work with user input are not converted.
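A locale-independent whitespace check could look like the sketch below. The name `is_space_ascii_sketch` is deliberately different from APT's `isspace_ascii()`, since this is an illustration and not the library's implementation.

```cpp
// Sketch: true only for the six ASCII whitespace characters, regardless
// of the current locale (unlike isspace(), which consults the locale).
inline bool is_space_ascii_sketch(int c) {
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\v' || c == '\f' || c == '\r';
}
```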
2015-12-26 | rred: Allow passing files as arguments for compressor testing | Julian Andres Klode
This introduces a -t mode in which the first argument is input, the second is output and the remaining are diffs. This allows us to test patching compressed files, which are detected using their file extension.
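The argument layout described above can be sketched as follows. `TestModeArgs` and `parse_test_mode` are hypothetical names, and accepting an empty list of diffs is an assumption of this sketch, not something the commit message states.

```cpp
#include <string>
#include <vector>

// Hypothetical container for the three kinds of arguments in -t mode.
struct TestModeArgs {
    std::string input;                 // file to patch
    std::string output;                // where the result is written
    std::vector<std::string> patches;  // diffs, applied in the given order
};

// Returns true and fills `out` if argv has the form:
//   prog -t INPUT OUTPUT [PATCH...]
inline bool parse_test_mode(int argc, const char *const argv[], TestModeArgs &out) {
    if (argc < 4 || std::string(argv[1]) != "-t")
        return false;
    out.input = argv[2];
    out.output = argv[3];
    out.patches.assign(argv + 4, argv + argc);
    return true;
}
```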
2015-11-05 | apply various suggestions made by cppcheck | David Kalnischkies
Reported-By: cppcheck Git-Dch: Ignore
2015-11-05 | allow acquire method specific options via Binary scope | David Kalnischkies
Allows users who know what they are getting themselves into with this trick to e.g. disable privilege dropping for e.g. file:// until they can fix up the permissions on those repositories. It helps also the test framework and people with a similar setup (= me) to run in less modified environments.
2015-09-14 | avoid using global PendingError to avoid failing too often too soon | David Kalnischkies
Our error reporting has historically grown into some kind of mess. A while ago I implemented stacking for the global error, which is used in this commit to wrap calls to functions which do not report (all) errors via return, so that only failures in those calls cause a failure to propagate down the chain rather than failing if anything (potentially totally unrelated) has failed at some point in the past. This way we can avoid stopping the entire acquire process just because a single source produced an error, for example. It also means that after the acquire process the cache is generated – even if the acquire process had failures – as we still have the old good data around, which we can and should generate a cache for (again). There are probably more instances of this hiding, but all these looked like the easiest to work with and fix with reasonable (aka net-positive) effects.
2015-08-28 | implement PDiff patching for compressed files | David Kalnischkies
Some additional files like 'Contents' are very big and should therefore be kept compressed on the disk, which apt-file did in the past. It also implemented pdiff patching of these files by un- and recompressing these files on-the-fly; with this commit we can do the same – but we can do this in both pdiff patching styles (client and server merging) and secured by hashes. Hashes are slightly complicated insofar as we can't compare the hashes of the compressed files, as we might compress them differently than the server would (different compressor versions, options, …), so we must compare the hashes of the uncompressed content. While this commit has changes in public headers, the classes it changes are marked as hidden, so nobody can use them directly, which means the ABI break is internal only.
2015-08-10 | add c++11 override marker to overridden methods | David Kalnischkies
C++11 adds the 'override' specifier to mark that a method is overriding a base class method and error out if not. We hide it in the APT_OVERRIDE macro to ensure that we keep compiling in pre-c++11 standards. Reported-By: clang-modernize -add-override -override-macros Git-Dch: Ignore
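A macro of this kind might be defined roughly as below. The macro name comes from the commit message, but the definition shown here and the two example classes (`Method`, `RredLikeMethod`) are assumptions made for illustration.

```cpp
// Sketch: expand to `override` only when compiling as C++11 or later.
#if __cplusplus >= 201103L
#define APT_OVERRIDE override
#else
#define APT_OVERRIDE /* no override specifier before C++11 */
#endif

// Hypothetical base class with one virtual method.
struct Method {
    virtual ~Method() {}
    virtual int Run() { return 0; }
};

// Hypothetical derived class. With C++11 the compiler rejects Run()
// here if it does not actually override a base class method.
struct RredLikeMethod : Method {
    virtual int Run() APT_OVERRIDE { return 1; }
};
```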
2015-06-09 | replace ULONG_MAX with c++ style std::numeric_limits | David Kalnischkies
For some reason travis seems to be unhappy about it, claiming it is not defined. Well, let's not think too deeply about it… Git-Dch: Ignore
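The replacement amounts to the following; `kMaxULong` is a hypothetical constant name used only in this sketch.

```cpp
#include <limits>

// Largest value an unsigned long can hold, obtained without relying on
// the ULONG_MAX macro from <climits>.
static const unsigned long kMaxULong = std::numeric_limits<unsigned long>::max();
```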
2015-06-09 | support hashes for compressed pdiff files | David Kalnischkies
At the moment we only have hashes for the uncompressed pdiff files, but via the new '$HASH-Download' field in the .diff/Index hashes can be provided for the .gz compressed pdiff file, which apt will pick up now and use to verify the download. Now, we "just" need buy-in from the creators of repositories…
2015-06-09 | add more parsing error checking for rred | David Kalnischkies
The rred parser is very accepting regarding 'invalid' files. Given that we can't trust the input it might be a bit too relaxed. In any case, checking for more errors can't hurt given that we support only a very specific subset of ed commands.
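As an illustration of strict parsing for a small subset of ed commands, the sketch below accepts only `N` or `N,M` followed by `a`, `c` or `d`. The exact subset and the rules enforced (no sign, no trailing characters, no descending range, line 0 only for `a`) are assumptions of this example, not a description of what rred checks.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical result type for one parsed command line.
struct EdCommand {
    unsigned long first;
    unsigned long last;
    char type; // 'a', 'c' or 'd'
};

inline bool is_digit_ascii(char c) { return c >= '0' && c <= '9'; }

// Parse one command line (without its trailing newline).
inline bool parse_ed_command(const std::string &line, EdCommand &cmd) {
    const char *p = line.c_str();
    // strtoul() would skip whitespace and accept a sign, so insist on a digit.
    if (!is_digit_ascii(*p))
        return false;
    char *end = nullptr;
    errno = 0;
    cmd.first = std::strtoul(p, &end, 10);
    if (errno == ERANGE)
        return false;
    cmd.last = cmd.first;
    if (*end == ',') {
        p = end + 1;
        if (!is_digit_ascii(*p))
            return false;
        errno = 0;
        cmd.last = std::strtoul(p, &end, 10);
        if (errno == ERANGE || cmd.last < cmd.first)
            return false;
    }
    cmd.type = *end;
    if (cmd.type != 'a' && cmd.type != 'c' && cmd.type != 'd')
        return false;
    if (end[1] != '\0') // trailing garbage after the command letter
        return false;
    if (cmd.type == 'a')
        return cmd.first == cmd.last; // append takes one address; 0 is allowed
    return cmd.first != 0;            // change and delete need a real line
}
```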
2015-06-09 | check patch hashes in rred worker instead of in the handler | David Kalnischkies
rred is responsible for unpacking and reading the patch files in one go, but we currently only have hashes for the uncompressed patch files, so the handler read the entire patch file before dispatching it to the worker which would read it again – both with an implicit uncompress. Worse, while the workers operate in parallel the handler is the central orchestration unit, so having it busy with work means the workers do (potentially) nothing. This means rred is working with 'untrusted' data, which is bad. Yet, having the unpack in the handler meant that the untrusted uncompress was done as root which isn't better either. Now, we have it at least contained in a binary which we can harden a bit better. In the long run, we want hashes for the compressed patch files though, to be safe.
2015-04-19 | calculate only expected hashes in methods | David Kalnischkies
Methods get told which hashes are expected by the acquire system, which means we can use this list to restrict what we calculate in the methods, as any extra we are calculating is wasted effort since we can't compare it with anything anyway. Adding support for a new hash algorithm is therefore 'free' now, and if an algorithm is no longer provided in a repository for a file, we automatically stop calculating it. In practice this results in a speed-up in Debian as we don't have SHA512 here (so far), so we practically stop calculating it.
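The selection step can be sketched as a bit mask built from the expected names. Everything here is hypothetical: the `HashFlag` values, the function name, and the spelling of the algorithm names are assumptions for illustration, and the hashing itself is left out.

```cpp
#include <string>
#include <vector>

// Hypothetical bit flags, one per supported algorithm.
enum HashFlag : unsigned {
    HASH_MD5 = 1u,
    HASH_SHA1 = 2u,
    HASH_SHA256 = 4u,
    HASH_SHA512 = 8u
};

// Build a mask of the algorithms worth calculating from the names the
// caller expects. Unknown names are skipped: nothing could verify them.
inline unsigned hashes_to_calculate(const std::vector<std::string> &expected) {
    unsigned mask = 0;
    for (const std::string &name : expected) {
        if (name == "MD5Sum")
            mask |= HASH_MD5;
        else if (name == "SHA1")
            mask |= HASH_SHA1;
        else if (name == "SHA256")
            mask |= HASH_SHA256;
        else if (name == "SHA512")
            mask |= HASH_SHA512;
    }
    return mask;
}
```

A caller would then feed data only to the calculators whose bit is set, so an algorithm nobody expects costs nothing.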
2014-11-08 | Assert statement calls a function which may have desired side effects: 'pos_is_okay' | David Kalnischkies
It does not have any desired side effect, so we just mark it as const to properly advertise this fact to developer, compiler and linter alike. Reported-By: cppcheck Git-Dch: Ignore
2014-03-13 | cleanup headers and especially #includes everywhere | David Kalnischkies
Besides being a bit cleaner it hopefully also resolves oddball problems I have with high levels of parallel jobs. Git-Dch: Ignore Reported-By: iwyu (include-what-you-use)
2014-03-13 | fix -Wformat= warnings about size_t != %lu on e.g. armel | David Kalnischkies
Git-Dch: Ignore Reported-By: gcc
2014-02-11 | use utimes instead of utimensat/futimens | David Kalnischkies
cppcheck complains about the obsolete utime as it was removed in POSIX.1-2008 and recommends usage of utimensat/futimens instead, as those are in POSIX, and so commit 9ce3cfc9 switched to them. It is just that they aren't as portable as the standard suggests: at least our kFreeBSD and Hurd ports stumble over it at runtime. So to make both the ports and cppcheck happy, we use utimes instead. Closes: 738567
2014-01-30 | fix various style/performance warnings in rred | David Kalnischkies
Reported-By: cppcheck Git-Dch: Ignore
2014-01-21 | methods/rred: minor robustness improvements | Anthony Towns
Use retry_fwrite to better handle partial fwrite successes, and to keep the Hashes in sync with what's actually written.
2014-01-15 | integrate Anthony's rred with POC for client-side merge | David Kalnischkies
Providing the benefits of both without the downsides :) (ABI breaks or external dependencies)
For this Anthony's rred is equipped with:
- magic-filename-pickup of patches rather than explicit messages
- use of FileFd instead of FILE* to get on-the-fly uncompress of the gzip compressed pdiff patches
The acquire code in turn stops checking for apt-file's helper as our own rred is now clever enough for our needs.
2014-01-15 | reimplement rred to allow applying all the diffs in a single pass | Anthony Towns
Based on the idea presented in: https://lists.debian.org/deity/2009/08/msg00169.html and https://lists.debian.org/debian-devel/2014/01/msg00081.html It reads all patches one by one and merges them in-memory before applying the merged changes to the index. Beware: This commit by David Kalnischkies rips out the rred binary rewrite unchanged (except for minor format issue corrections) from the proposed changes, so this commit alone BREAKS pdiff completely. The integration into the acquire system as it was prepared in the previous POC will be done in the next commit to have proper 'blame'.
2013-12-13 | implement POC client-side merging of pdiffs via apt-file | David Kalnischkies
The idea of pdiffs is to avoid downloading the whole file by patching the existing index. This works very well, but becomes slow if a lot of patches need to be applied to reconstruct an up-to-date index, and in recent years more and more dinstall (or similar) runs are executed, creating more and more pdiffs in the same amount of time, so pdiffs became less useful. The solution is simple: reduce the number of patches (which are very small) which need to be applied on top of the index we have available (which is usually pretty big). This can be done in two ways: either the patches are merged on the server side so that the client has to download only one patch, or the patches are all downloaded and merged on the client side. The first needs a client that does one step at a time and can also skip patches if needed (APT has supported this for a long time now). The latter is implemented by this commit, but depends on the server NOT merging the patches and the patches being in a strict order in which no patch is skipped. This is traditionally the case for dak, but other repository creators support merging – e.g. reprepro (which helpfully adds a flag indicating that the patches are merged). To support both or even mixes, a client needs more information which isn't available for now. This POC uses the external diffindex-rred included in apt-file to do the heavy lifting of merging & applying all patches in one pass, hence to test this feature apt-file needs to be installed.
2012-05-10 | we don't need zlib (anymore) in rred so don't include it | David Kalnischkies
2012-03-22 | make these retry_write methods static so that they don't end up as symbols | David Kalnischkies
2012-03-20 | * methods/rred.cc: | David Kalnischkies
- check return of writev() as gcc recommends
* methods/mirror.cc:
- check return of chdir() as gcc recommends
* apt-pkg/deb/dpkgpm.cc:
- check return of write() as gcc recommends
* apt-inst/deb/debfile.cc:
- check return of chdir() as gcc recommends
* apt-inst/deb/dpkgdb.cc:
- check return of chdir() as gcc recommends
2012-01-20 | fix a few esoteric cppcheck errors/warnings/infos | David Kalnischkies
2012-01-10 | as Size() can be quite expensive for compressed files let's store the result | David Kalnischkies
2011-12-18 | implement the fallback method of rred by using the FileFd and the included ReadLine instead of accessing the files directly with fgets() | David Kalnischkies
2011-12-11 | - add a ReadLine method | David Kalnischkies
- drop the explicit export of gz-compression handling
2011-12-10 | enable FileFd to guess the compressor based on the filename if requested or to search for compressed siblings of the given filename and use this guessing instead of hardcoding Gzip compression | David Kalnischkies
2011-09-19 | use forward declaration in headers if possible instead of includes | David Kalnischkies
2011-09-19 | do not pollute namespace in the headers with using (Closes: #500198) | David Kalnischkies
2011-09-13 | merge with debian/experimental | David Kalnischkies
2011-09-13 | Support large files in the complete toolset. Indexes of this size are pretty unlikely for now, but we need it for deb packages which could become bigger than 4GB now (LP: #815895) | David Kalnischkies
2011-09-13 | reorder includes: add <config.h> if needed and include it first | David Kalnischkies
2011-08-11 | follow the recommendation of cppcheck to make some method methods (scnr) const and initialize mostly Debug member values in the constructors | David Kalnischkies
2011-02-14 | * apt-pkg/contrib/fileutl.cc: | David Kalnischkies
- really detect bigendian machines by including config.h, so we can really (Closes: #612986)
* apt-pkg/contrib/mmap.cc:
- Base has as 'valid' failure states 0 and -1 so add a simple validData method to check for failure states
2011-02-13 | update size of dynamic MMap as we write in from the outside | David Kalnischkies
2011-02-12 | * methods/rred.cc: | David Kalnischkies
- read patch into MMap only if we work on uncompressed patches
2011-01-15 | * methods/rred.cc: | David Kalnischkies
- optionally operate on gzip compressed pdiffs
* apt-pkg/acquire-item.cc:
- don't uncompress downloaded pdiff files before feeding them to rred
2010-08-10 | apt-pkg, methods: Convert users of WriteEmpty to WriteAtomic. | Julian Andres Klode
2010-05-04 | * methods/rred.cc: | David Kalnischkies
- use the patchfile modification time instead of the one from the "old" file - thanks to Philipp Weis for noticing! (Closes: #571541)
2009-12-11 | Backport rred patches from my own sid branch to the 0.7.25 branch | David Kalnischkies
* rewrite and refactor the rred method to be able to handle even big (>30 MB) patches (Closes: #554349) and harden the method itself by using more constants and a return value which can't be misinterpreted as a line number
* Finally adopt the patch from Morten Hustveit <morten@debian.org> to be able to optionally use mmaps and iovec to increase patch speed - but as this increases memory usage we can always fall back to the "old" method which doesn't depend on mmaps.