//===--- iwyu_include_picker.h - map to canonical #includes for iwyu ------===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//

// The include-picker provides a list of candidate #include-lines
// that iwyu can suggest in order to include a particular symbol
// or file.
//
// It seems like the 'file' case would be easy ("to include
// /usr/include/math.h, say '#include <math.h>'"), but it's
// not, because many header files are private and should not
// be included by users directly. A private header will have
// one or (occasionally) more public headers that it maps to.
// The include-picker keeps track of these mappings.
//
// It's also possible for a public file to have an include-picker
// mapping. This means: "it's ok to #include this file directly, but
// you can also get the contents of this file by #including this other
// file as well." One example is that <ostream> maps to both
// <ostream> and <iostream>. Other parts of iwyu can decide which
// #include to suggest based on their own heuristics (whether the file
// already needs to #include <iostream> for some other reason, for
// instance).
//
// Some of these mappings are hard-coded, based on my own examination
// of gcc headers on Ubuntu. Some mappings are determined at runtime,
// based on #pragmas or other markup in the source files themselves.
//
// Mapping a symbol to a file has the same issues. In most cases, a
// symbol maps to the file that defines it, and iwyu_include_picker
// has nothing useful to say. But some symbols -- which we hard-code
// -- can be provided by several files. NULL is a canonical example
// of this.
//
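// A rough usage sketch, assuming only the declarations in this header.
// The file names are made up for illustration, and 'dialect' stands for
// whatever RegexDialect value the caller has chosen (the enumerators
// are not declared in this file):
//
//   IncludePicker picker(/*no_default_mappings=*/false, dialect);
//   picker.AddDirectInclude("/src/foo/bar.h", "/src/foo/detail/bar_impl.h",
//                           "\"foo/detail/bar_impl.h\"");
//   picker.MarkIncludeAsPrivate("\"foo/detail/bar_impl.h\"");
//   picker.AddMapping("\"foo/detail/bar_impl.h\"",
//                     MappedInclude("\"foo/bar.h\""));
//   picker.FinalizeAddedIncludes();
//   vector<MappedInclude> candidates =
//       picker.GetCandidateHeadersForFilepath("/src/foo/detail/bar_impl.h");
//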
// The include-picker also provides some helper functions for
// converting from file-paths to #include paths, including routines to
// normalize a file-path to get rid of /usr/include/ prefixes.

#ifndef INCLUDE_WHAT_YOU_USE_IWYU_INCLUDE_PICKER_H_
#define INCLUDE_WHAT_YOU_USE_IWYU_INCLUDE_PICKER_H_

#include <cstddef>                      // for size_t
#include <map>                          // for map, map<>::value_compare
#include <set>                          // for set
#include <string>                       // for string
#include <utility>                      // for pair
#include <vector>                       // for vector

namespace clang {
class FileEntry;
}  // namespace clang

namespace include_what_you_use {

using std::map;
using std::pair;
using std::set;
using std::string;
using std::vector;

struct IncludeMapEntry;

enum class RegexDialect;

enum IncludeVisibility { kUnusedVisibility, kPublic, kPrivate };

// When a symbol or file is mapped to an include, that include is represented
// by this struct. It always has a quoted_include and may also have a path
// (depending on its origin).
struct MappedInclude {
  explicit MappedInclude(const string& quoted_include,
                         const string& path = {});

  string quoted_include;
  string path;

  bool HasAbsoluteQuotedInclude() const;
};

class IncludePicker {
 public:
  // The keys are either symbol names or quoted includes, and the values are
  // lists of candidate public headers to include for that symbol or quoted
  // include.
  typedef map<string, vector<MappedInclude>> IncludeMap;

  // Used to track visibility as specified either in mapping files or via
  // pragmas. The keys are quoted includes or paths. The values are the
  // visibility of the respective files.
  typedef map<string, IncludeVisibility> VisibilityMap;

  IncludePicker(bool no_default_mappings, RegexDialect regex_dialect);

  // ----- Routines to dynamically modify the include-picker

  // Call this for every #include seen during iwyu analysis. The
  // include-picker can use this data to better suggest #includes,
  // perhaps.
  void AddDirectInclude(const string& includer_filepath,
                        const string& includee_filepath,
                        const string& quoted_include_as_written);

  // Add this to say "map_to re-exports everything in file map_from".
  // map_from should be a quoted include.
  void AddMapping(const string& map_from, const MappedInclude& map_to);

  // Indicate that the given quoted include should be considered
  // a "private" include. If possible, we use the include-picker
  // mappings to map such includes to public (not-private) includes.
  void MarkIncludeAsPrivate(const string& quoted_include);

  // Indicate that the given path should be considered
  // a "private" include. If possible, we use the include-picker
  // mappings to map such includes to public (not-private) includes.
  void MarkPathAsPrivate(const string& path);

  // Add this to say that "any file whose name matches the
  // quoted_friend_regex is allowed to include includee_filepath". The
  // regex uses the POSIX Extended Regular Expression syntax and should
  // match a quoted-include (starting and ending with "" or <>).
  void AddFriendRegex(const string& includee_filepath,
                      const string& quoted_friend_regex);

  // Call this after iwyu preprocessing is done. No more calls to
  // AddDirectInclude() or AddMapping() are allowed after this.
  void FinalizeAddedIncludes();

  // ----- Include-picking API

  // Returns the set of all public header files that 'provide' the
  // given symbol. For instance, NULL can map to stddef.h, stdlib.h,
  // etc. Most symbols don't have pre-defined headers they map to,
  // and we return the empty vector in that case. Ordering is
  // important (which is why we return a vector, not a set): all else
  // being equal, the first element of the vector is the "best" (or
  // most standard) header for the symbol.
  vector<MappedInclude> GetCandidateHeadersForSymbol(
      const string& symbol) const;

  // As above, but given a specific including header it is possible to convert
  // mapped includes to quoted include strings (because we can for example know
  // the correct relative path for ""-style includes).
  vector<string> GetCandidateHeadersForSymbolUsedFrom(
      const string& symbol, const string& including_filepath) const;

  // Returns the set of all public header files that a given header
  // file -- specified as a full path -- would map to, as a set of
  // MappedIncludes. If the include-picker has no mapping information
  // for this file, the return vector has just the input file (now
  // include-quoted). Ordering is important (which is why we return a
  // vector, not a set): all else being equal, the first element of
  // the vector is the "best" (or most standard) header for the input
  // header.
  vector<MappedInclude> GetCandidateHeadersForFilepath(
      const string& filepath, const string& including_filepath = "") const;

  // This allows for special-casing of GetCandidateHeadersForFilepath
  // -- it's the same, but you give it the filepath that's doing the
  // #including. This lets us give a different answer for different
  // call-sites. For instance, "foo/internal/bar.h" is a fine
  // candidate header when #included from "foo/internal/baz.h", but
  // not when #included from "qux/quux.h". In the common case there's
  // no special-casing, and this falls back on
  // GetCandidateHeadersForFilepath().
  // Furthermore, knowing the including file allows us to convert each
  // MappedInclude in the result to a simple string (quoted include).
  vector<string> GetCandidateHeadersForFilepathIncludedFrom(
      const string& included_filepath, const string& including_filepath) const;

  // Returns true if there is a mapping (possibly indirect) from
  // map_from_filepath to map_to_filepath. This means that
  // map_to_filepath 're-exports' all the symbols from
  // map_from_filepath. Both map_from_filepath and map_to_filepath
  // should be full file-paths.
  bool HasMapping(const string& map_from_filepath,
                  const string& map_to_filepath) const;

  bool IsPublic(const clang::FileEntry* file) const;

  // Parses a YAML/JSON file containing mapping directives of various types.
  void AddMappingsFromFile(const string& filename);

 private:
  // Private implementation of mapping file parser, which takes
  // mapping file search path to allow recursion that builds up
  // search path incrementally.
  void AddMappingsFromFile(const string& filename,
                           const vector<string>& search_path);

  // Adds all hard-coded default mappings.
  void AddDefaultMappings();

  // Adds a mapping from one header to another, typically
  // from a private to a public quoted include.
  void AddIncludeMapping(
      const string& map_from, IncludeVisibility from_visibility,
      const MappedInclude& map_to, IncludeVisibility to_visibility);

  // Adds a mapping from a symbol to a quoted include. We use this to
  // maintain mappings of documented types, e.g.
  // for std::map<>, include <map>.
  void AddSymbolMapping(
      const string& map_from, const MappedInclude& map_to,
      IncludeVisibility to_visibility);

  // Adds mappings from sized arrays of IncludeMapEntry.
  void AddIncludeMappings(const IncludeMapEntry* entries, size_t count);
  void AddSymbolMappings(const IncludeMapEntry* entries, size_t count);

  void AddPublicIncludes(const char** includes, size_t count);

  // Expands the regex keys in filepath_include_map_ and
  // friend_to_headers_map_ by matching them against all source files
  // seen by iwyu.
  void ExpandRegexes();

  // Adds an entry to the given VisibilityMap, with error checking.
  void MarkVisibility(VisibilityMap* map, const string& key,
                      IncludeVisibility visibility);

  // Parse visibility from a string. Returns kUnusedVisibility if
  // string is not recognized.
  IncludeVisibility ParseVisibility(const string& visibility) const;

  // Return the visibility of a given mapped include if known, else
  // default_value (kUnusedVisibility unless the caller says otherwise).
  IncludeVisibility GetVisibility(
      const MappedInclude&,
      IncludeVisibility default_value = kUnusedVisibility) const;

  // For the given key, return the vector of values associated with
  // that key, or an empty vector if the key does not exist in the
  // map, filtering out private files.
  vector<MappedInclude> GetPublicValues(const IncludeMap& m,
                                        const string& key) const;

  // Given an includer-pathname and includee-pathname, return the
  // quoted-include of the includee, as written in the includer, or
  // "" if it's not found for some reason.
  string MaybeGetIncludeNameAsWritten(const string& includer_filepath,
                                      const string& includee_filepath) const;

  // Given a collection of MappedIncludes, and a path that might include them,
  // choose the best quoted include form for each MappedInclude.
  vector<string> BestQuotedIncludesForIncluder(
      const vector<MappedInclude>&, const string& including_filepath) const;

  // From symbols to includes.
  IncludeMap symbol_include_map_;

  // From quoted filepath patterns to includes, where a pattern can be
  // either a quoted filepath (e.g. "foo/bar.h" or <a/b.h>) or @
  // followed by a regular expression for matching a quoted filepath
  // (e.g. @"foo/.*"). If key-value pair (pattern, headers) is in
  // this map, it means that any header in 'headers' can be used to
  // get symbols exported by a header matching 'pattern'.
  IncludeMap filepath_include_map_;

  // A map of all quoted-includes to whether they're public or private.
  // Files whose visibility cannot be determined by this map or the one
  // below are assumed public.
  VisibilityMap include_visibility_map_;

  // A map of paths to whether they're public or private.
  // Files whose visibility cannot be determined by this map or the one
  // above are assumed public.
  // The include_visibility_map_ takes priority over this one.
  VisibilityMap path_visibility_map_;

  // All the includes we've seen so far, to help with globbing and
  // other dynamic mapping. For each file, we list who #includes it.
  map<string, set<string>> quoted_includes_to_quoted_includers_;

  // Given the filepaths of an includer and includee, give the
  // include-as-written (including <>'s or ""'s) that the includer
  // used to refer to the includee. We use this to return includes as
  // they were written in the source, when possible.
  map<pair<string, string>, string>
      includer_and_includee_to_include_as_written_;

  // Maps from a quoted filepath pattern to the set of files that used
  // a pragma declaring it as a friend. That is, if foo/bar/x.h has a
  // line "// IWYU pragma: friend foo/bar/.*" then "x.h" will be a
  // member of friend_to_headers_map_["@\"foo/bar/.*\""]. In a
  // postprocessing step, friend_to_headers_map_ will have its
  // regular expressions expanded, e.g. if foo/bar/x.cc is processed,
  // friend_to_headers_map_["foo/bar/x.cc"] will be augmented with the
  // contents of friend_to_headers_map_["@\"foo/bar/.*\""].
  map<string, set<string>> friend_to_headers_map_;

  // Make sure we don't do any non-const operations after finalizing.
  bool has_called_finalize_added_include_lines_;

  // Controls regex dialect to use for mappings.
  RegexDialect regex_dialect;
};  // class IncludePicker

}  // namespace include_what_you_use

#endif  // INCLUDE_WHAT_YOU_USE_IWYU_INCLUDE_PICKER_H_