path: root/apt-pkg/tagfile.cc
Age | Commit message | Author
2020-02-26 | Parse records including empty tag names correctly | David Kalnischkies
No sensible file should include these, but even insensible files do not gain unfair advantages from it, as this parser does not deal with security-critical files before they have passed other checks like signatures or hashsums. The problem is that the parser accepts and parses empty tag names correctly, but does not store the parsed data, which affects later passes over the data. Depending on who passes over the data and what is done with it, the result is e.g. the following tag containing the name and value of the previous (empty) tag, its own tag name and its own value, or a crash due to an attempt to access invalid memory. This commit fixes both: the incident of the crash reported by Anatoly Trosinenko, who reproduced it via apt-sortpkgs:
  | $ cat /tmp/Packages-null
  | 0:
  | PACKAGE:0
  |
  | :
  | PACKAGE:
  |
  | PACKAGE::
  | $ apt-sortpkgs /tmp/Packages-null
and the deeper parsing issue shown by the included testcase.
Reported-By: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
References: 8710a36a01c0cb1648926792c2ad05185535558e
2020-02-20 | tagfile: Check out-of-bounds access to Tags vector | Julian Andres Klode
Check that the index we're going to use is within the size of the array.
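A minimal sketch of this kind of index check, not the code from the commit. The names `TagEntry` and `entry_at` are hypothetical and only illustrate the pattern of testing the index against the vector size before using it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one parsed tag; not APT's actual type.
struct TagEntry
{
   std::size_t start;
};

// Returns a pointer to the entry at index i, or nullptr if i is out of range.
const TagEntry *entry_at(const std::vector<TagEntry> &tags, std::size_t i)
{
   if (i >= tags.size())
      return nullptr; // refuse instead of reading past the end
   return &tags[i];
}
```

Callers then have to handle the nullptr case explicitly, which turns a silent out-of-bounds read into a visible failure path.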
2020-02-20 | tagfile: Check if memchr() returned null before using | Julian Andres Klode
This fixes a segmentation fault trying to read from nullptr+1, aka address 1.
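The failure mode can be illustrated with a small sketch: `memchr()` returns a null pointer when the byte is not found, so adding 1 to its result without a check yields address 1. The helper name `skip_line` is hypothetical and not taken from tagfile.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns the offset just past the first '\n' in [buf, buf + len),
// or len if the range contains no newline at all.
std::size_t skip_line(const char *buf, std::size_t len)
{
   const void *nl = std::memchr(buf, '\n', len);
   if (nl == nullptr)
      return len; // no newline: do not compute nullptr + 1
   return static_cast<std::size_t>(static_cast<const char *>(nl) - buf) + 1;
}
```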
2019-04-16 | Follow gcc-9 -Wnoexcept suggestion for FileChunk constructor | David Kalnischkies
warning: but ‘pkgTagFilePrivate::FileChunk::FileChunk(bool, size_t)’ does not throw; perhaps it should be declared ‘noexcept’ [-Wnoexcept]
Reported-By: gcc-9
Gbp-Dch: Ignore
2019-02-26 | tagfile: Remove deprecated pkgUserTagSection and TFRewrite | Julian Andres Klode
2019-02-01 | Step over empty sections in TagFiles with comments | David Kalnischkies
Implementing a parser with recursion isn't the best idea, but in practice we should get away with it for the time being to avoid needless code churn.
Closes: #920317 #921037
2018-05-07 | Remove obsolete RCS keywords | Guillem Jover
Prompted-by: Jakub Wilk <jwilk@debian.org>
2017-07-12 | Reformat and sort all includes with clang-format | Julian Andres Klode
This makes it easier to see which header includes what. The changes were done by running
  git grep -l '#\s*include' \
    | grep -E '.(cc|h)$' \
    | xargs sed -i -E 's/(^\s*)#(\s*)include/\1#\2 include/'
to modify all include lines by adding a space, and then running ./git-clang-format.sh.
2017-01-19 | fix various typos reported by spellintian | David Kalnischkies
Most of them in (old) code comments. For the two instances of user-visible string changes, the po files of the manpages are fixed up as well.
Gbp-Dch: Ignore
Reported-By: spellintian
2016-11-22 | TagSection: Introduce functions for looking up by key ids | Julian Andres Klode
Introduce a new enum class and add functions that can do a lookup with that enum class. This uses triehash.
2016-11-22 | TagSection: Extract Find() methods taking Pos instead of Key | Julian Andres Klode
This allows us to add a perfect hash function to the tag file without having to reimplement the methods a second time.
2016-11-22 | TagSection: Split AlphaIndexes into AlphaIndexes and BetaIndexes | Julian Andres Klode
Move the use of the AlphaHash to a new second hash table in preparation for the arrival of the new perfect hash function. With the new perfect hash function hashing most of the keys for us, having 128 slots for a fallback hash function seems enough and prevents us from wasting space.
2016-08-31 | TagFile: Fix off-by-one errors in comment stripping | Julian Andres Klode
Adding 1 to the value of d->End - current makes restLength one byte too long: passing it to memchr(current, ..., restLength) thus has undefined behavior. Also, reading the value of current has undefined behavior if current >= d->End, not only for current > d->End: Consider a string of length 1, that is d->End = d->Current + 1. We can only read at d->Current + 0, but d->Current + 1 is beyond the end of the string. This probably caused several inexplicable build failures on hurd-i386 in the past, and just now caused a build failure on Ubuntu's amd64 builder.
Reported-By: valgrind
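The two bounds described in this message can be sketched as follows. This is an illustration under assumptions, not the committed code: `find_newline` is a hypothetical helper, with `current` and `end` playing the roles of the cursor and `d->End`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Searches [current, end) for a newline; end points one past the last byte.
const char *find_newline(const char *current, const char *end)
{
   if (current >= end) // '>=' and not '>': *end itself must never be read
      return nullptr;
   // Exactly end - current bytes are readable; adding 1 would overrun.
   std::size_t restLength = static_cast<std::size_t>(end - current);
   return static_cast<const char *>(std::memchr(current, '\n', restLength));
}
```

For a buffer of length 1 the only readable byte is at offset 0, which is the case the commit message walks through.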
2016-01-07 | Switch performance critical code to use APT::StringView | Julian Andres Klode
This improves performance of the cache generation on my ARM platform (4x Cortex A15) by about 10% to 20% from 2.35-2.50 to 2.1 seconds.
2016-01-02 | add optional support for comments in pkgTagFile | David Kalnischkies
APT usually deals with perfectly formatted files generated automatically by other programs – and as it has to parse multiple MBs of such files it tries to be fast rather than forgiving. This was always a problem when we reused this parser for files with a deb822 syntax which are mostly written by hand, however, like apt_preferences or the deb822-style sources, as these can include stray newlines and, more importantly, comments all over the place.
As a stopgap we had pkgUserTagSection, which deals at least with comments before and after a given stanza, but comments in between weren't really supported, and now that we support parsing debian/control for e.g. build-dep we face the full comment problem, e.g. with comments in between multi-line fields (like Build-Depends).
We can't easily deal with this on the pkgTagSection level, as the interface gives access to 'raw' char-pointers for performance reasons, so we would need to optionally add a buffer here on which we could remove comments to hand out pointers into this buffer instead. The interface is quite large already and supports writing stanzas as well, which does not support comments at all either.
So while in the future it might make sense to have a parser setup which deals with and keeps comments, in this commit we opt for the simpler solution for now: we officially declare that pkgTagSection does not support comments and instead expect the caller to deal with them, which in our case is pkgTagFile. pkgTagFile is extended with an additional mode which can deal with comments by dropping them from the buffer which will later form the input of pkgTagSection. The actual implementation is slightly more complex than this sentence suggests, on one hand to have good performance and on the other to allow jumping directly to stanzas with offsets collected in a previous run (as our cache generation does, for example).
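The idea of dropping comments before the section parser sees the buffer can be sketched as below. This is a deliberately simplified illustration: it copies into a new string, whereas the message says the real implementation works on the buffer and also tracks offsets. The function name `drop_comment_lines` is hypothetical, and treating every line that starts with '#' as a comment is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns a copy of in with every line whose first character is '#' removed.
std::string drop_comment_lines(const std::string &in)
{
   std::string out;
   out.reserve(in.size());
   std::size_t pos = 0;
   while (pos < in.size())
   {
      std::size_t nl = in.find('\n', pos);
      // next is the start of the following line (or the end of the input)
      std::size_t next = (nl == std::string::npos) ? in.size() : nl + 1;
      if (in[pos] != '#')
         out.append(in, pos, next - pos); // keep non-comment lines verbatim
      pos = next;
   }
   return out;
}
```

With the comment line removed, a continuation line of a multi-line field directly follows its first line again, which is the form a comment-unaware section parser expects.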
2015-12-29 | pkgTagSection::Scan: Fix read of uninitialized value | Julian Andres Klode
We ignored the boundary of the buffer we were reading in while scanning for spaces.
2015-12-27 | deal with empty values properly in deb822 parser | David Kalnischkies
Regression introduced in 8710a36a01c0cb1648926792c2ad05185535558e, but such fields are unlikely in practice as it is just as simple to not have a field at all with the same result of not having a value.
Closes: 808102
2015-12-27 | Convert most callers of isspace() to isspace_ascii() | Julian Andres Klode
This converts all callers that read machine-generated data; callers that might work with user input are not converted.
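For readers unfamiliar with the distinction: the standard isspace() depends on the current locale, while machine-generated data only ever uses ASCII whitespace. A locale-independent check might look like the sketch below. The name `is_space_ascii_sketch` is hypothetical, and this is not claimed to be how APT implements isspace_ascii().

```cpp
#include <cassert>

// True only for the six whitespace characters of the "C" locale,
// regardless of which locale the process currently uses.
inline bool is_space_ascii_sketch(char c)
{
   return c == ' ' || c == '\t' || c == '\n' ||
          c == '\v' || c == '\f' || c == '\r';
}
```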
2015-12-14 | tagfile: Hardcode error message for out of range integer values | Julian Andres Klode
This makes the test suite work on platforms where long is 32 bits wide.
Gbp-Dch: ignore
2015-08-12 | policy: Be more strict about parsing pin files, and document prio 0 | Julian Andres Klode
Treat invalid pin priorities and overflows as an error. Closes: #429912
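Strict parsing of this kind is commonly built on strtol() with checks for leftover input and for range errors. The following is a sketch under assumptions, not the code from the commit: `parse_priority` is a hypothetical name, and the use of int as the target type is an assumption made for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Parses a decimal number into out. Returns false on empty input,
// trailing garbage, or a value that does not fit into an int.
bool parse_priority(const char *text, int &out)
{
   errno = 0;
   char *end = nullptr;
   long value = std::strtol(text, &end, 10);
   if (end == text || *end != '\0')
      return false; // nothing parsed, or junk after the digits
   if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
      return false; // overflow of long or of the target type
   out = static_cast<int>(value);
   return true;
}
```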
2015-08-10 | use a smaller type for flags storage in the cache | David Kalnischkies
We store very few flags in the cache, so keeping storage space for 8 is enough for all of them and still leaves a few unused bits remaining for future extensions without wasting bytes for nothing. Git-Dch: Ignore
2015-08-10 | remove the compatibility markers for 4.13 abi | David Kalnischkies
We aren't and we will not be really compatible again with the previous stable ABI, so let's drop these markers (which never made it into a released version) for good, as they have outlived their intent already.
Git-Dch: Ignore
2015-08-10 | bring back deb822 sources.list entries as .sources | David Kalnischkies
Having two different formats in the same file is very dirty and causes external tools to fail hard trying to parse them. It is probably not a good idea for them to parse them in the first place, but they do and we shouldn't break them if there is a better way. So we solve this issue for now by giving our deb822 format a new filename extension ".sources", which unsupporting applications are likely to ignore, and we can begin gradually moving forward rather than waiting for the unknown applications to catch up. Currently and for the foreseeable future apt is going to support both with the same feature set as documented in the manpage, with the long-term plan of adopting the 'new' format as default, but that is a long way to go and might get going more from having an easier time setting options than from us pushing it explicitly.
2015-08-10 | fix memory leaks reported by -fsanitize | David Kalnischkies
Various small leaks here and there. Nothing particularly big, but still good to fix. Found by the sanitizers while running our testcases.
Reported-By: gcc -fsanitize
Git-Dch: Ignore
2015-08-10 | make all d-pointer * const pointers | David Kalnischkies
Doing this disables the implicit copy assignment operator (among others), which would cause havoc if used on the classes as it would just copy the pointer, not the data the d-pointer points to. For most of the classes we don't need a copy assignment operator anyway, and in many classes it was broken before as many contain a pointer of some sort. Only for our Cacheset Container interfaces we define an explicit copy assignment operator which could later be implemented to copy the data from one d-pointer to the other if we need it.
Git-Dch: Ignore
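The language rule behind this can be shown with a small sketch: a class with a const pointer member has its implicit copy assignment operator defined as deleted, because the member cannot be reseated. `Widget` and `WidgetPrivate` are hypothetical names, and the deep-copying copy constructor is only there to keep the example leak-free and self-contained.

```cpp
#include <cassert>
#include <type_traits>

struct WidgetPrivate
{
   int value = 0;
};

class Widget
{
   WidgetPrivate *const d; // const pointer: cannot be reassigned after construction
 public:
   Widget() : d(new WidgetPrivate) {}
   Widget(const Widget &other) : d(new WidgetPrivate(*other.d)) {} // deep copy
   ~Widget() { delete d; }
   int value() const { return d->value; }
   void set(int v) { d->value = v; }
};

// The compiler can no longer generate 'a = b', which would otherwise copy
// only the pointer and leave two objects owning the same private data.
static_assert(!std::is_copy_assignable<Widget>::value,
              "const d-pointer disables implicit copy assignment");
```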
2015-08-10 | apply various style suggestions by cppcheck | David Kalnischkies
Some of them modify the ABI, but given that we prepare a big one already, these few hardly count for much. Git-Dch: Ignore
2015-05-11 | implement a more c++-style TFRewrite alternative | David Kalnischkies
TFRewrite is okay, but it has obscure limitations (256 Tags), even more obscure bugs (order for renames is defined by the old name) and the interface is very c-style encouraging bad usage like we do it in apt-ftparchive passing massive amounts of c_str() from std::string in. The old-style is marked as deprecated accordingly. The next commit will fix all places in the apt code to not use the old-style anymore.
2015-05-11 | sync TFRewrite*Order arrays with dpkg and dak | David Kalnischkies
dpkg and dak know various field names and order them in their output, while we have yet another order and have to play catch-up with them, as we are sitting between chairs here and neither order is ideal for us either. A little testcase is from now on supposed to help ensure that we do not deviate too far from which fields dpkg knows and orders.
2015-03-16 | properly implement pkgRecord::Parser for *.deb files | David Kalnischkies
Implementing FileName() works for most cases for us, but other frontends might need more, and even for us it's not very stable, as the normal Jump() implementation is pretty bad on a deb file and produces errors on its own at times. So we replace this makeshift with a complete implementation by mostly just shuffling code around.
2014-11-08 | restore ABI of pkgTagSection | David Kalnischkies
We have a d-pointer available here, so go ahead and use it, which also helps in hiding some dirty details here. The "hard" part is keeping the ABI for the inlined methods so that they don't break – at least not more than before, as much of the point besides a speedup is support for more than 256 fields in a single section.
2014-11-08 | explicit overload methods instead of adding parameters | David Kalnischkies
Adding a new parameter (with a default) is an ABI break, but you can overload a method, which is "just" an API break for everyone doing references to this method (aka: nobody). Git-Dch: Ignore
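A sketch of the pattern, with hypothetical names (`Scanner`, `count`) that are not from APT: the original one-argument signature stays in place and forwards to a new overload, instead of growing a defaulted second parameter that would change the existing symbol. In a real library both methods would be defined out of line in the .cc file; they are inline here only to keep the example short.

```cpp
#include <cassert>

struct Scanner
{
   // Pre-existing signature, kept unchanged so callers compiled against
   // the old library version still find the symbol they were linked to.
   int count(const char *text) const { return count(text, ' '); }

   // New overload carrying the extra argument.
   int count(const char *text, char sep) const
   {
      int n = 0;
      for (; *text != '\0'; ++text)
         if (*text == sep)
            ++n;
      return n;
   }
};
```

Source code calling `count(text)` compiles the same either way; the difference only shows for code that takes the address of the method, which is the API break the message mentions.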
2014-11-08 | guard const-ification API changes | David Kalnischkies
Git-Dch: Ignore
2014-10-13 | do not inline virtual destructors with d-pointers | David Kalnischkies
Reimplementing an inline method is opening a can of worms we don't want to open if we ever want to use a d-pointer in those classes, so we do the only thing which can save us from hell: move the destructors into the cc sources and we are good. Technically not an ABI break, as the methods, inline or not, do the same (nothing), so a program compiled against the old version still works with the new version (besides that, this version is still in experimental, so nothing really has been built against this library anyway).
Git-Dch: Ignore
2014-09-23 | Merge branch 'debian/sid' into debian/experimental | Michael Vogt
Conflicts: apt-pkg/acquire-item.cc apt-pkg/acquire-item.h apt-pkg/cachefilter.h configure.ac debian/changelog
2014-09-21 | Ensure that iTFRewritePackageOrder is "MD5sum" to match apt-ftparchive | Michael Vogt
The iTFRewritePackageOrder is used in indexcopy to copy and normalize cdrom Packages files. This change will ensure that there is no "normalization" that changes MD5sum -> MD5Sum which alters the hash of the Packages file on disk (oh the irony).
2014-05-22 | Add APT::Acquire::$(host)::By-Hash=1 knob, add Acquire-By-Hash to Release file | Michael Vogt
The by-hash can be configured on a per-hostname basis and a Release file can indicate that it has by-hash support via a new flag. The location of the hash now matches the AptByHash spec
2014-05-10 | improve pkgTagSection scanning and parsing | David Kalnischkies
Removes the 256 fields limit, deals consistently with spaces littered all over the place and is even a tiny bit faster than before. Even comes with a bunch of new tests to validate these claims.
2014-04-22 | add support for apt-get build-dep foo.dsc | Michael Vogt
2014-03-13 | follow method attribute suggestions by gcc | David Kalnischkies
Git-Dch: Ignore Reported-By: gcc -Wsuggest-attribute={pure,const,noreturn}
2014-03-13 | cleanup headers and especially #includes everywhere | David Kalnischkies
Besides being a bit cleaner, it hopefully also resolves oddball problems I have with high levels of parallel jobs.
Git-Dch: Ignore
Reported-By: iwyu (include-what-you-use)
2014-03-13 | warning: type qualifiers ignored on function return type [-Wignored-qualifiers] | David Kalnischkies
Reported-By: gcc -Wignored-qualifiers Git-Dch: Ignore
2014-01-30 | pkgTagFile: if we have seen the end, do not try to see more | David Kalnischkies
Asking for more via Step() will notice that we are done with the file already and will result in a fail, which means we can't find the last sections anymore (which is especially painful if we haven't moved at all, as in the testcase, where we haven't even looked at one of the sources, leading to strange behaviour).
Reported-By: Niall Walsh <niallwalsh@users.berlios.de>
2014-01-22 | "apt show" show user friendly size info | Michael Vogt
The size/installed-size is displayed via SizeToStr() and Size is rewritten to "Download-Size" to make clear what size is referred to here.
2013-12-21 | make /etc/apt/preferences parser deal with comment only sections | Michael Vogt
2013-09-20 | do not trust FileFd::Eof() in pkgTagFile::Fill() | David Kalnischkies
The Eof check was added (by me of course) in 0aae6d14390193e25ab6d0fd49295bd7b131954f as part of a fix up ~a month ago (at DebConf). The idea was not that bad, but doesn't make that much sense either as this bit is set by the FileFd based on Actual as well, so this is basically doing the same check again – with the difference that the HitEof bit can still linger from a previous Read we did at the end of the file, even though we have seek'd away from it since. Combined with the length of entries, entry order and other not that easily controllable conditions you can be 'lucky' enough to hit this problem in a way which is even visible (truncation of other fields, like 'Tags' and others, might not be easily visible).
Closes: 723705
Thanks: Cyril Brulebois
2013-08-22 | do chdir("/") after chroot() | Michael Vogt
2013-08-22 | Merge remote-tracking branch 'mvo/bugfix/coverity' into debian/sid | Michael Vogt
Conflicts: apt-pkg/tagfile.h
2013-08-15 | use malloc instead of new[] in pkgTagFile | David Kalnischkies
We don't need initialized memory for pkgTagFile, but more to the point we can use realloc this way which hides the bloody details of increasing the size of the buffer used. Git-Dch: Ignore
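The convenience referred to here is that realloc() grows an allocation and preserves its contents in one call, which new[] cannot do. A minimal sketch, assuming a hypothetical `Buffer` struct and `resize` helper that are not APT's actual members:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct Buffer
{
   char *data = nullptr;
   std::size_t size = 0;
};

// Resizes buf to newSize bytes, keeping the existing contents.
// On failure buf is left untouched and false is returned.
bool resize(Buffer &buf, std::size_t newSize)
{
   if (newSize == 0)
      return false; // realloc(p, 0) is implementation-defined; avoid it
   void *p = std::realloc(buf.data, newSize); // acts like malloc when data is nullptr
   if (p == nullptr)
      return false; // the old block is still valid in this case
   buf.data = static_cast<char *>(p);
   buf.size = newSize;
   return true;
}
```

Memory obtained this way is uninitialized beyond the old contents and must be released with free(), not delete[].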
2013-08-15 | ensure that pkgTagFile isn't writing past Buffer length | David Kalnischkies
In 91c4cc14d3654636edf997d23852f05ad3de4853 I removed the +256 from the pkgTagFile call parsing Release files, as I couldn't find any mention of a reason for it and it was marked as XXX, which suggested that at least someone else was suspicious. It turns out that it is indeed "documented", I just didn't find it at first, but the changelog of apt 0.6.6 (29. Dec 2003) mentions:
  * Restore the ugly hack I removed from indexRecords::Load which set the pkgTagFile buffer size to (file size)+256. This is concealing a bug, but I can't fix it right now. This should fix the segfaults that folks are seeing with 0.6.[45].
The bug it is "hiding" is that if pkgTagFile works with a file which doesn't end in a double newline, it will add one without checking whether the Buffer is big enough to store it. It's also not a good idea to let the End pointer be past the end of our space, even if we don't access the data.
Closes: 719629
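The missing check can be sketched like this: before appending the terminating double newline, verify that two more bytes fit into the buffer. The helper `terminate_stanza` and its parameters are hypothetical; the actual fix in pkgTagFile may be structured differently.

```cpp
#include <cassert>
#include <cstddef>

// Appends "\n\n" after the first used bytes of buf if capacity allows.
// Returns false and changes nothing when fewer than two bytes are free.
bool terminate_stanza(char *buf, std::size_t capacity, std::size_t &used)
{
   if (used > capacity || capacity - used < 2)
      return false; // caller has to grow the buffer first
   buf[used++] = '\n';
   buf[used++] = '\n';
   return true;
}
```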
2013-08-06 | memset() pkgTagSections data to make coverity happy | Michael Vogt