Age | Commit message (Collapse) | Author |
|
rred is responsible for unpacking and reading the patch files in one go,
but we currently only have hashes for the uncompressed patch files, so
the handler read the entire patch file before dispatching it to the
worker which would read it again – both with an implicit uncompress.
Worse, while the workers operate in parallel the handler is the central
orchestration unit, so having it busy with work means the workers do
(potentially) nothing.
This means rred is working with 'untrusted' data, which is bad. Yet,
having the unpack in the handler meant that the untrusted uncompress was
done as root which isn't better either. Now, we have it at least
contained in a binary which we can harden a bit better. In the long run,
we want hashes for the compressed patch files through to be safe.
|
|
Methods get told which hashes are expected by the acquire system, which
means we can use this list to restrict what we calculate in the methods
as any extra we are calculating is wasted effort as we can't compare it
with anything anyway.
Adding support for a new hash algorithm is therefore 'free' now and if a
algorithm is no longer provided in a repository for a file, we
automatically stop calculating it.
In practice this results in a speed-up in Debian as we don't have SHA512
here (so far), so we practically stop calculating it.
|
|
'pos_is_okay'
It does not have any desired sideeffect, so we just mark it as const to
properly advertise this fact to developer, compiler and linter alike.
Reported-By: cppcheck
Git-Dch: Ignore
|
|
Beside being a bit cleaner it hopefully also resolves oddball problems
I have with high levels of parallel jobs.
Git-Dch: Ignore
Reported-By: iwyu (include-what-you-use)
|
|
Git-Dch: Ignore
Reported-By: gcc
|
|
cppcheck complains about the obsolete utime as it was removed in
POSIX1.2008 and recommends usage of utimensat/futimens instead
as those are in POSIX and so commit 9ce3cfc9 switched to them.
It is just that they aren't as portable as the standard suggests:
At least our kFreeBSD and Hurd ports stumble over it at runtime.
So to make both, the ports and cppcheck happy, we use utimes instead.
Closes: 738567
|
|
Reported-By: cppcheck
Git-Dch: Ignore
|
|
Use retry_fwrite to better handle partial fwrite successes, and to keep
the Hashes in sync with what's actually written.
|
|
Providing the benefits of both without the downsides :)
(ABI breaks or external dependencies)
For this Anthonys rred is equipped with:
- magic-filename-pickup of patches rather than explicit messages
- use of FileFd instead of FILE* to get on-the-fly uncompress
of the gzip compressed pdiff patches
The acquire code in turn stops checking for apt-file's helper
as our own rred is now clever enough for our needs.
|
|
Based on the idea presented in:
https://lists.debian.org/deity/2009/08/msg00169.html and
https://lists.debian.org/debian-devel/2014/01/msg00081.html
It reads all patches one by one and merges them in-memory before
applying the merged changes to the index.
Beware: This commit by David Kalnischkies rips out the rred binary
rewrite unchanged (expect minor format issue corrections) from the
proposed changes, so this commit alone BREAKS pdiff completely.
The integration into the acquire system as it was prepared in the
previous POC will be done in the next commit to have proper 'blame'.
|
|
The idea of pdiffs is to avoid downloading the hole file by patching the
existing index. This works very well, but becomes slow if a lot of
patches needs to be applied to reconstruct an up-to-date index and in
recent years more and more dinstall (or similar) runs are executed
creating more and more pdiffs in the same amount of time, so pdiffs
became less useful.
The solution is simple: Reduce the amount of patches (which are very
small) which need to be applied on top of the index we have available
(which is usually pretty big).
This can be done in two ways: Either merge the patches on the
server-side so that the client has to download only one patch or the
patches are all downloaded and merged on the client-side.
The first needs a client who is doing one step at a time who can also
skip patches if it needs (APT supports this for a long time now).
The later is implemented by this commit, but depends on the server NOT
merging the patches and the patches being in a strict order in which no
patch is skipped.
This is traditionally the case for dak, but other repository creators
support merging – e.g. reprepro (which helpfully adds a flag indicating
that the patches are merged). To support both or even mixes a client
needs more information which isn't available for now.
This POC uses the external diffindex-rred included in apt-file to
do the heavy lifting of merging & applying all patches in one pass,
hence to test this feature apt-file needs to be installed.
|
|
|
|
|
|
- check return of writev() as gcc recommends
* methods/mirror.cc:
- check return of chdir() as gcc recommends
* apt-pkg/deb/dpkgpm.cc:
- check return of write() a gcc recommends
* apt-inst/deb/debfile.cc:
- check return of chdir() as gcc recommends
* apt-inst/deb/dpkgdb.cc:
- check return of chdir() as gcc recommends
|
|
|
|
|
|
ReadLine instead of accessing the files directly with fgets()
|
|
- drop the explicit export of gz-compression handling
|
|
to search for compressed silbings of the given filename and use this guessing
instead of hardcoding Gzip compression
|
|
|
|
|
|
|
|
size are pretty unlikely for now, but we need it for deb
packages which could become bigger than 4GB now (LP: #815895)
|
|
|
|
const and initial mostly Debug member values in the constructors
|
|
- really detect bigendian machines by including config.h,
so we can really (Closes: #612986)
* apt-pkg/contrib/mmap.cc:
- Base has as 'valid' failure states 0 and -1 so add a simple
validData method to check for failure states
|
|
|
|
- read patch into MMap only if we work on uncompressed patches
|
|
- operate optional on gzip compressed pdiffs
* apt-pkg/acquire-item.cc:
- don't uncompress downloaded pdiff files before feeding it to rred
|
|
|
|
- use the patchfile modification time instead of the one from the
"old" file - thanks to Philipp Weis for noticing! (Closes: #571541)
|
|
* rewrite and refactor rred method to be able to handle even big (>30 MB)
patches (Closes: #554349) and hardening the method itself by using more
constants and a return value which can't be misinterpreted as linenumber
* Finally adope the patch from Morten Hustveit <morten@debian.org> to be
able to optional use mmaps and iovec to increase patch speed -
but as this increase memory usage we can always fall back to the "old"
method which doesn't depend on mmaps.
|
|
Patch from Bernhard R. Link, thanks!
|
|
|
|
|
|
|
|
make the QueueNextDiff() code more robust
|
|
|
|
|