Age | Commit message (Collapse) | Author |
|
This converts all callers that read machine-generated data,
callers that might work with user input are not converted.
|
|
This introduces a -t mode in which the first argument is input,
the second is output and the remaining are diffs.
This allows us to test patching compressed files, which are
detected using their file extension.
|
|
Reported-By: cppcheck
Git-Dch: Ignore
|
|
Allows users who know what they are getting themselves into with this
trick to e.g. disable privilege dropping for e.g. file:// until they can
fix up the permissions on those repositories. It helps also the test
framework and people with a similar setup (= me) to run in less modified
environments.
|
|
Our error reporting is historically grown into some kind of mess.
A while ago I implemented stacking for the global error which is used in
this commit now to wrap calls to functions which do not report (all)
errors via return, so that only failures in those calls cause a failure
to propergate down the chain rather than failing if anything
(potentially totally unrelated) has failed at some point in the past.
This way we can avoid stopping the entire acquire process just because a
single source produced an error for example. It also means that after
the acquire process the cache is generated – even if the acquire
process had failures – as we still have the old good data around we can and
should generate a cache for (again).
There are probably more instances of this hiding, but all these looked
like the easiest to work with and fix with reasonable (aka net-positive)
effects.
|
|
Some additional files like 'Contents' are very big and should therefore
kept compressed on the disk, which apt-file did in the past. It also
implemented pdiff patching of these files by un- and recompressing these
files on-the-fly, with this commit we can do the same – but we can do
this in both pdiff patching styles (client and server merging) and
secured by hashes.
Hashes are in so far slightly complicated as we can't compare the hashes
of the compressed files as we might compress them differently than the
server would (different compressor versions, options, …), so we must
compare the hashes of the uncompressed content.
While this commit has changes in public headers, the classes it changes
are marked as hidden, so nobody can use them directly, which means the
ABI break is internal only.
|
|
C++11 adds the 'override' specifier to mark that a method is overriding
a base class method and error out if not. We hide it in the APT_OVERRIDE
macro to ensure that we keep compiling in pre-c++11 standards.
Reported-By: clang-modernize -add-override -override-macros
Git-Dch: Ignore
|
|
For some reason travis seems to be unhappy about it claiming it
is not defined. Well, lets not think to deeply about it…
Git-Dch: Ignore
|
|
At the moment we only have hashes for the uncompressed pdiff files, but
via the new '$HASH-Download' field in the .diff/Index hashes can be
provided for the .gz compressed pdiff file, which apt will pick up now
and use to verify the download. Now, we "just" need a buy in from the
creators of repositories…
|
|
The rred parser is very accepting regarding 'invalid' files. Given that
we can't trust the input it might be a bit too relaxed. In any case,
checking for more errors can't hurt given that we support only a very
specific subset of ed commands.
|
|
rred is responsible for unpacking and reading the patch files in one go,
but we currently only have hashes for the uncompressed patch files, so
the handler read the entire patch file before dispatching it to the
worker which would read it again – both with an implicit uncompress.
Worse, while the workers operate in parallel the handler is the central
orchestration unit, so having it busy with work means the workers do
(potentially) nothing.
This means rred is working with 'untrusted' data, which is bad. Yet,
having the unpack in the handler meant that the untrusted uncompress was
done as root which isn't better either. Now, we have it at least
contained in a binary which we can harden a bit better. In the long run,
we want hashes for the compressed patch files through to be safe.
|
|
Methods get told which hashes are expected by the acquire system, which
means we can use this list to restrict what we calculate in the methods
as any extra we are calculating is wasted effort as we can't compare it
with anything anyway.
Adding support for a new hash algorithm is therefore 'free' now and if a
algorithm is no longer provided in a repository for a file, we
automatically stop calculating it.
In practice this results in a speed-up in Debian as we don't have SHA512
here (so far), so we practically stop calculating it.
|
|
'pos_is_okay'
It does not have any desired sideeffect, so we just mark it as const to
properly advertise this fact to developer, compiler and linter alike.
Reported-By: cppcheck
Git-Dch: Ignore
|
|
Beside being a bit cleaner it hopefully also resolves oddball problems
I have with high levels of parallel jobs.
Git-Dch: Ignore
Reported-By: iwyu (include-what-you-use)
|
|
Git-Dch: Ignore
Reported-By: gcc
|
|
cppcheck complains about the obsolete utime as it was removed in
POSIX1.2008 and recommends usage of utimensat/futimens instead
as those are in POSIX and so commit 9ce3cfc9 switched to them.
It is just that they aren't as portable as the standard suggests:
At least our kFreeBSD and Hurd ports stumble over it at runtime.
So to make both, the ports and cppcheck happy, we use utimes instead.
Closes: 738567
|
|
Reported-By: cppcheck
Git-Dch: Ignore
|
|
Use retry_fwrite to better handle partial fwrite successes, and to keep
the Hashes in sync with what's actually written.
|
|
Providing the benefits of both without the downsides :)
(ABI breaks or external dependencies)
For this Anthonys rred is equipped with:
- magic-filename-pickup of patches rather than explicit messages
- use of FileFd instead of FILE* to get on-the-fly uncompress
of the gzip compressed pdiff patches
The acquire code in turn stops checking for apt-file's helper
as our own rred is now clever enough for our needs.
|
|
Based on the idea presented in:
https://lists.debian.org/deity/2009/08/msg00169.html and
https://lists.debian.org/debian-devel/2014/01/msg00081.html
It reads all patches one by one and merges them in-memory before
applying the merged changes to the index.
Beware: This commit by David Kalnischkies rips out the rred binary
rewrite unchanged (expect minor format issue corrections) from the
proposed changes, so this commit alone BREAKS pdiff completely.
The integration into the acquire system as it was prepared in the
previous POC will be done in the next commit to have proper 'blame'.
|
|
The idea of pdiffs is to avoid downloading the hole file by patching the
existing index. This works very well, but becomes slow if a lot of
patches needs to be applied to reconstruct an up-to-date index and in
recent years more and more dinstall (or similar) runs are executed
creating more and more pdiffs in the same amount of time, so pdiffs
became less useful.
The solution is simple: Reduce the amount of patches (which are very
small) which need to be applied on top of the index we have available
(which is usually pretty big).
This can be done in two ways: Either merge the patches on the
server-side so that the client has to download only one patch or the
patches are all downloaded and merged on the client-side.
The first needs a client who is doing one step at a time who can also
skip patches if it needs (APT supports this for a long time now).
The later is implemented by this commit, but depends on the server NOT
merging the patches and the patches being in a strict order in which no
patch is skipped.
This is traditionally the case for dak, but other repository creators
support merging – e.g. reprepro (which helpfully adds a flag indicating
that the patches are merged). To support both or even mixes a client
needs more information which isn't available for now.
This POC uses the external diffindex-rred included in apt-file to
do the heavy lifting of merging & applying all patches in one pass,
hence to test this feature apt-file needs to be installed.
|
|
|
|
|
|
- check return of writev() as gcc recommends
* methods/mirror.cc:
- check return of chdir() as gcc recommends
* apt-pkg/deb/dpkgpm.cc:
- check return of write() a gcc recommends
* apt-inst/deb/debfile.cc:
- check return of chdir() as gcc recommends
* apt-inst/deb/dpkgdb.cc:
- check return of chdir() as gcc recommends
|
|
|
|
|
|
ReadLine instead of accessing the files directly with fgets()
|
|
- drop the explicit export of gz-compression handling
|
|
to search for compressed silbings of the given filename and use this guessing
instead of hardcoding Gzip compression
|
|
|
|
|
|
|
|
size are pretty unlikely for now, but we need it for deb
packages which could become bigger than 4GB now (LP: #815895)
|
|
|
|
const and initial mostly Debug member values in the constructors
|
|
- really detect bigendian machines by including config.h,
so we can really (Closes: #612986)
* apt-pkg/contrib/mmap.cc:
- Base has as 'valid' failure states 0 and -1 so add a simple
validData method to check for failure states
|
|
|
|
- read patch into MMap only if we work on uncompressed patches
|
|
- operate optional on gzip compressed pdiffs
* apt-pkg/acquire-item.cc:
- don't uncompress downloaded pdiff files before feeding it to rred
|
|
|
|
- use the patchfile modification time instead of the one from the
"old" file - thanks to Philipp Weis for noticing! (Closes: #571541)
|
|
* rewrite and refactor rred method to be able to handle even big (>30 MB)
patches (Closes: #554349) and hardening the method itself by using more
constants and a return value which can't be misinterpreted as linenumber
* Finally adope the patch from Morten Hustveit <morten@debian.org> to be
able to optional use mmaps and iovec to increase patch speed -
but as this increase memory usage we can always fall back to the "old"
method which doesn't depend on mmaps.
|
|
Patch from Bernhard R. Link, thanks!
|
|
|
|
|
|
|
|
make the QueueNextDiff() code more robust
|
|
|
|
|