Age | Commit message (Collapse) | Author |
|
No sensible file should include these, but even insensible files do not
gain unfair advantages with it as this parser does not deal with
security critical files before they haven't passed other checks like
signatures or hashsums.
The problem is that the parser accepts and parses empty tag names
correctly, but does not store the data parsed which will effect later
passes over the data resulting e.g. in the following tag containing
the name and value of the previous (empty) tag, its own tagname and its
own value or a crash due to an attempt to access invalid memory
depending on who passes over the data and what is done with it.
This commit fixes both, the incidient of the crash reported by
Anatoly Trosinenko who reproduced it via apt-sortpkgs:
| $ cat /tmp/Packages-null
| 0:
| PACKAGE:0
|
| :
| PACKAGE:
|
| PACKAGE::
| $ apt-sortpkgs /tmp/Packages-null
and the deeper parsing issue shown by the included testcase.
Reported-By: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
References: 8710a36a01c0cb1648926792c2ad05185535558e
|
|
Check that the index we're going to use is within the size
of the array.
|
|
This fixes a segmentation fault trying to read from nullptr+1,
aka address 1.
|
|
warning: but ‘pkgTagFilePrivate::FileChunk::FileChunk(bool, size_t)’
does not throw; perhaps it should be declared ‘noexcept’ [-Wnoexcept]
Reported-By: gcc-9
Gbp-Dch: Ignore
|
|
|
|
Implementing a parser with recursion isn't the best idea, but in
practice we should get away with it for the time being to avoid
needless codechurn.
Closes: #920317 #921037
|
|
Prompted-by: Jakub Wilk <jwilk@debian.org>
|
|
This makes it easier to see which headers includes what.
The changes were done by running
git grep -l '#\s*include' \
| grep -E '.(cc|h)$' \
| xargs sed -i -E 's/(^\s*)#(\s*)include/\1#\2 include/'
To modify all include lines by adding a space, and then running
./git-clang-format.sh.
|
|
Most of them in (old) code comments. The two instances of user visible
string changes the po files of the manpages are fixed up as well.
Gbp-Dch: Ignore
Reported-By: spellintian
|
|
Introduce a new enum class and add functions that can do a lookup
with that enum class. This uses triehash.
|
|
This allows us to add a perfect hash function to the tag file
without having to reimplement the methods a second time.
|
|
Move the use of the AlphaHash to a new second hash table in
preparation for the arrival of the new perfect hash function.
With the new perfect hash function hashing most of the keys for
us, having 128 slots for a fallback hash function seems enough
and prevents us from wasting space.
|
|
Adding 1 to the value of d->End - current makes restLength one byte
too long: If we pass memchr(current, ..., restLength) has thus
undefined behavior.
Also, reading the value of current has undefined behavior if
current >= d->End, not only for current > d->End:
Consider a string of length 1, that is d->End = d->Current + 1.
We can only read at d->Current + 0, but d->Current + 1 is beyond
the end of the string.
This probably caused several inexplicable build failures on hurd-i386
in the past, and just now caused a build failure on Ubuntu's amd64
builder.
Reported-By: valgrind
|
|
This improves performance of the cache generation on my
ARM platform (4x Cortex A15) by about 10% to 20% from
2.35-2.50 to 2.1 seconds.
|
|
APT usually deals with perfectly formatted files generated automatically
be other programs – and as it has to parse multiple MBs of such files it
tries to be fast rather than forgiving.
This was always a problem if we reused this parser for files with a
deb822 syntax which are mostly written by hand however, like
apt_preferences or the deb822-style sources as these can include stray
newlines and more importantly comments all over the place.
As a stopgap we had pkgUserTagSection which deals at least with comments
before and after a given stanza, but comments in between weren't really
supported and now that we support parsing debian/control for e.g.
build-dep we face the full comment problem e.g. with comments inbetween
multi-line fields (like Build-Depends).
We can't easily deal with this on the pkgTagSection level as the interface
gives access to 'raw' char-pointers for performance reasons so we would
need to optionally add a buffer here on which we could remove comments
to hand out pointers into this buffer instead. The interface is quite
large already and supports writing stanzas as well, which does not
support comments at all either. So while in future it might make sense
to have a parser setup which deals with and keeps comments in this
commit we opt for the simpler solution for now: We officially declare
that pkgTagSection does not support comments and instead expect the
caller to deal with them, which in our case is pkgTagFile:
pkgTagFile is extended with an additional mode which can deal with
comments by dropping them from the buffer which will later form the
input of pkgTagSection. The actual implementation is slightly more
complex than this sentence suggests at first on one hand to have good
performance and on the other to allow jumping directly to stanzas with
offsets collected in a previous run (like our cache generation does it
for example).
|
|
We ignored the boundary of the buffer we were reading in
while scanning for spaces.
|
|
Regression introduced in 8710a36a01c0cb1648926792c2ad05185535558e,
but such fields are unlikely in practice as it is just as simple to not
have a field at all with the same result of not having a value.
Closes: 808102
|
|
This converts all callers that read machine-generated data,
callers that might work with user input are not converted.
|
|
This makes the test suite work on 32 bit-long platforms.
Gbp-Dch: ignore
|
|
Treat invalid pin priorities and overflows as an error.
Closes: #429912
|
|
We store very few flags in the cache, so keeping storage space for 8 is
enough for all of them and still leaves a few unused bits remaining for
future extensions without wasting bytes for nothing.
Git-Dch: Ignore
|
|
We aren't and we will not be really compatible again with the previous
stable abi, so lets drop these markers (which never made it into a
released version) for good as they have outlived their intend already.
Git-Dch: Ignore
|
|
Having two different formats in the same file is very dirty and causes
external tools to fail hard trying to parse them. It is probably not a
good idea for them to parse them in the first place, but they do and we
shouldn't break them if there is a better way.
So we solve this issue for now by giving our deb822 format a new
filename extension ".sources" which unsupporting applications are likely
to ignore an can begin gradually moving forward rather than waiting for
the unknown applications to catch up.
Currently and for the forseeable future apt is going to support both
with the same feature set as documented in the manpage, with the
longtime plan of adopting the 'new' format as default, but that is a
long way to go and might get going more from having an easier time
setting options than from us pushing it explicitely.
|
|
Various small leaks here and there. Nothing particularily big, but still
good to fix. Found by the sanitizers while running our testcases.
Reported-By: gcc -fsanitize
Git-Dch: Ignore
|
|
Doing this disables the implicit copy assignment operator (among others)
which would cause hovac if used on the classes as it would just copy the
pointer, not the data the d-pointer points to. For most of the classes
we don't need a copy assignment operator anyway and in many classes it
was broken before as many contain a pointer of some sort.
Only for our Cacheset Container interfaces we define an explicit copy
assignment operator which could later be implemented to copy the data
from one d-pointer to the other if we need it.
Git-Dch: Ignore
|
|
Some of them modify the ABI, but given that we prepare a big one
already, these few hardly count for much.
Git-Dch: Ignore
|
|
TFRewrite is okay, but it has obscure limitations (256 Tags), even more
obscure bugs (order for renames is defined by the old name) and the
interface is very c-style encouraging bad usage like we do it in
apt-ftparchive passing massive amounts of c_str() from std::string in.
The old-style is marked as deprecated accordingly. The next commit will
fix all places in the apt code to not use the old-style anymore.
|
|
dpkg and dak know various field names and order them in their output,
while we have yet another order and have to play catch up with them as
we are sitting between chairs here and neither order is ideal for us,
too.
A little testcase is from now on supposed to help ensureing that we do
not derivate to far away from which fields dpkg knows and orders.
|
|
Implementing FileName() works for most cases for us, but other
frontends might need more and even for us its not very stable as
the normal Jump() implementation is pretty bad on a deb file and
produce errors on its own at times.
So, replacing this makeshift with a complete implementation by
mostly just shuffling code around.
|
|
We have a d-pointer available here, so go ahead and use it which also
helps in hidding some dirty details here. The "hard" part is keeping the
abi for the inlined methods so that they don't break – at least not more
than before as much of the point beside a speedup is support for more
than 256 fields in a single section.
|
|
Adding a new parameter (with a default) is an ABI break, but you can
overload a method, which is "just" an API break for everyone doing
references to this method (aka: nobody).
Git-Dch: Ignore
|
|
Git-Dch: Ignore
|
|
Reimplementing an inline method is opening a can of worms we don't want
to open if we ever want to us a d-pointer in those classes, so we do the
only thing which can save us from hell: move the destructors into the cc
sources and we are good.
Technically not an ABI break as the methods inline or not do the same
(nothing), so a program compiled against the old version still works
with the new version (beside that this version is still in experimental,
so nothing really has been build against this library anyway).
Git-Dch: Ignore
|
|
Conflicts:
apt-pkg/acquire-item.cc
apt-pkg/acquire-item.h
apt-pkg/cachefilter.h
configure.ac
debian/changelog
|
|
The iTFRewritePackageOrder is used in indexcopy to copy and normalize
cdrom Packages files. This change will ensure that there is no
"normalization" that changes MD5sum -> MD5Sum which alters the hash
of the Packages file on disk (oh the irony).
|
|
The by-hash can be configured on a per-hostname basis and a Release
file can indicate that it has by-hash support via a new flag.
The location of the hash now matches the AptByHash spec
|
|
Removes the 256 fields limit, deals consistently with spaces littered
all over the place and is even a tiny bit faster than before.
Even comes with a bunch of new tests to validate these claims.
|
|
|
|
Git-Dch: Ignore
Reported-By: gcc -Wsuggest-attribute={pure,const,noreturn}
|
|
Beside being a bit cleaner it hopefully also resolves oddball problems
I have with high levels of parallel jobs.
Git-Dch: Ignore
Reported-By: iwyu (include-what-you-use)
|
|
Reported-By: gcc -Wignored-qualifiers
Git-Dch: Ignore
|
|
Asking for more via Step() will notice that we are done with the file
already and will result in a fail, which means we can't find the last
sections anymore (which is especially painful if we haven't moved at
all as in the testcase we haven't even looked at one of the sources
leading to a strange behaviour)
Reported-By: Niall Walsh <niallwalsh@users.berlios.de>
|
|
The size/installed-size is displayed via SizeToStr() and Size
is rewriten to "Download-Size" to make clear what size is refered
to here.
|
|
|
|
The Eof check was added (by me of course) in
0aae6d14390193e25ab6d0fd49295bd7b131954f
as part of a fix up ~a month ago (at DebConf).
The idea was not that bad, but doesn't make that much sense either
as this bit is set by the FileFd based on Actual as well, so this is
basically doing the same check again – with the difference that the
HitEof bit can still linger from a previous Read we did at the end of
the file, but have seek'd away from it now.
Combined with the length of entries, entry order and other not that
easily controllable conditions you can be 'lucky' enough to hit this
problem in a way which even visible (truncating of other fields might
not be visible easily, like 'Tags' and others).
Closes: 723705
Thanks: Cyril Brulebois
|
|
|
|
Conflicts:
apt-pkg/tagfile.h
|
|
We don't need initialized memory for pkgTagFile, but more to the point
we can use realloc this way which hides the bloody details of increasing
the size of the buffer used.
Git-Dch: Ignore
|
|
In 91c4cc14d3654636edf997d23852f05ad3de4853 I removed the +256 from
the pkgTagFile call parsing Release files as I couldn't find a
mentioning of a reason for why and it was marked as XXX which suggested
that at least someone else was suspicious.
It turns out that it is indeed "documented", it just didn't found it at
first but the changelog of apt 0.6.6 (29. Dec 2003) mentions:
* Restore the ugly hack I removed from indexRecords::Load which set the
pkgTagFile buffer size to (file size)+256. This is concealing a bug,
but I can't fix it right now. This should fix the segfaults that
folks are seeing with 0.6.[45].
The bug it is "hiding" is that if pkgTagFile works with a file which doesn't
end in a double newline it will be adding it without checking if the Buffer
is big enough to store them. Its also not a good idea to let the End
pointer be past the end of our space, even if we don't access the data.
Closes: 719629
|
|
|