
 Copyright 2007-2008 Nanorex, Inc.  See LICENSE file for details. 

 # $Id$
 
This describes how to use various tools to manage import statements
and to detect and analyze dependency loops created by those imports.


Summary of steps for a routine check:


For routine, incremental checks of the import cycle graph,
it's sufficient to run the following two commands in /bin/sh,
which are elaborated on below:

 $ tools/PackageDependency.py `tools/AllPyFiles.sh` --justCycles > depend.dot
 $ tools/splitDependDot.py > anotheroutputfile


Full explanation, and steps for more complete checks:


The first step is to make sure that all the necessary import
statements are in the code, and that they point to the right
locations.  The tools involved are:

* AllPyFiles.sh
   This script finds all .py files in the source tree under the point
   it is executed at, and filters out files that we're not interested
   in.  Things which are known not to be included in release builds
   can be removed from consideration at this stage.  Keeping some of
   these files can be a good idea, though.  It can help catch code
   that references a symbol from a no-longer-used file when a symbol
   of the same name is defined with a different value elsewhere.

* FindPythonGlobals.py
   This script analyzes a list of python files (generated by
   AllPyFiles) and emits a list of all of the global symbols defined
   in those files.

* FindExternalImports.py
   This script takes a list of python files and emits a list of all
   packages that the set of files imports from outside of itself.  So
   if the set is [A.py, B.py], and A imports B and C, and B imports D,
   the result would be C and D.

* SymbolsInPackage.py
   This script takes the package list generated by FindExternalImports
   and emits a list of all of the symbols defined in those packages,
   in the same format as FindPythonGlobals.  When those two lists are
   concatenated, they give the locations of all globals the processed
   python files could possibly reference.

* ResolveGlobals.py
   Reads the dataset produced by FindPythonGlobals and
   SymbolsInPackage and uses the data in one of several ways.  It can
   look at just the dataset itself and emit information about
   duplicate definitions of symbols within the dataset.  These are the
   symbols which could be confused with each other.  It can also
   analyze a set of import statements to verify that they are
   importing symbols from their actual definition location, and not
   indirectly.  Finally, it can parse the output of pychecker,
   converting its warnings about undefined globals into the
   appropriate import statements to resolve those warnings.

* pychecker
   An external tool to perform static checking on python source code.
   Here we just use it to find a list of global symbols which are not
   defined within a particular source file.
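
As a rough illustration of the FindExternalImports example above (the
set [A.py, B.py] yielding C and D), here is a minimal sketch.  It is
not the actual tool's code: it assumes Python 3 and the standard ast
module, and the function name external_imports and the in-memory
dictionary of sources are made up for this example.

```python
import ast

def external_imports(sources):
    """Return the sorted names of modules imported from outside the set.

    sources maps a module name (e.g. "A" for A.py) to its source text.
    Hypothetical helper, not part of the tools directory.
    """
    internal = set(sources)
    found = set()
    for text in sources.values():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.Import):
                # "import X, Y" names each module directly.
                found.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom):
                # Skip relative imports ("from . import x"); they stay inside the set.
                if node.module and node.level == 0:
                    found.add(node.module)
    return sorted(found - internal)

example = {
    "A": "import B\nimport C\n",
    "B": "from D import something\n",
}
print(external_imports(example))  # prints ['C', 'D']
```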

First, make sure you are working on the right copy of the source.  Do
a cvs update and make sure there are no merge conflicts.

Next, create a list of the global symbols defined in the source files:

$ tools/FindPythonGlobals.py `tools/AllPyFiles.sh` > allglobalsymbols
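
To make "global symbols defined in a file" concrete, here is a small
sketch of one way to collect them.  It assumes Python 3 and the
standard ast module, and it only looks at top-level functions,
classes, and simple assignments.  The name module_globals is
hypothetical, and the real FindPythonGlobals.py may use different
rules and a different output format.

```python
import ast

def module_globals(source):
    """List names bound at module level, in order of appearance."""
    names = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.append(node.name)
        elif isinstance(node, ast.Assign):
            # Only plain "NAME = value" targets; tuple unpacking is ignored here.
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.append(target.id)
    return names

sample = "LIMIT = 10\ndef helper():\n    local = 1\nclass Widget:\n    pass\n"
print(module_globals(sample))  # prints ['LIMIT', 'helper', 'Widget']
```

Note that the name "local" inside helper() is not reported, since it
is not bound at module level.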

Next, add the externally defined symbols to this list.  Note we are
writing to allglobalsymbols in append mode, via >>:

$ tools/FindExternalImports.py `tools/AllPyFiles.sh` | tools/SymbolsInPackage.py >> allglobalsymbols

This will send to stderr a list of modules which were imported by
the files that AllPyFiles.sh lists, but which could not be imported by
SymbolsInPackage.  If that list includes files which don't actually
work, improper imports from them would show up here.  If any programs
change the import search path, that could also cause a problem here.
The result can also be affected by running this in a directory other
than the one the program will actually be running in.

Next, check the symbol list for duplicates:

$ tools/ResolveGlobals.py allglobalsymbols --duplicates > dups

Examine the resulting file.  Duplicates are not necessarily a
problem, but you may notice something which will help later.
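
The idea behind the duplicates check can be sketched as follows.  This
is only an illustration: the (symbol, module) pairs are an assumed
in-memory form, not the actual format of allglobalsymbols, and the
names find_duplicates, chem, and old_chem are hypothetical.

```python
from collections import defaultdict

def find_duplicates(pairs):
    """Map each symbol defined in more than one module to its sorted modules.

    pairs is an iterable of (symbol, module) tuples.
    """
    where = defaultdict(set)
    for symbol, module in pairs:
        where[symbol].add(module)
    return {sym: sorted(mods) for sym, mods in where.items() if len(mods) > 1}

pairs = [
    ("Atom", "chem"),
    ("Atom", "old_chem"),   # same name defined in a second module
    ("Bond", "bonds"),
]
print(find_duplicates(pairs))  # prints {'Atom': ['chem', 'old_chem']}
```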

If you know some symbol values are identical, or perhaps that some
should never be used, you can remove them from allglobalsymbols.  For
example, PyQt4.Qt contains a complete copy of the symbols in
PyQt4.QtGui and PyQt4.QtCore.  Removing one of these sets will later
allow ResolveGlobals to print an import statement for a missing Qt
call, rather than saying it is ambiguous.

Next, run pychecker on each file.  Note that pychecker can fail on a
file before actually processing it if it cannot import it.  The
failure may happen several levels deep in imports, and pychecker
doesn't always tell you where the actual problem was.  When this
happens, you can try loading the file directly into python:

$ python problemfile.py

and you may get a better error message.

To do the whole set in one batch:

$ for i in `tools/AllPyFiles.sh`; do pychecker $i; done > pycheckerstdout 2> pycheckerstderr

This can take a while.

Examine the output.  Wherever you find the string "NOT PROCESSED
UNABLE TO IMPORT" you should have also gotten a message printed on
stderr.  Fix these and rerun the above command.

Look for "No global (...) found" messages.  These indicate that
pychecker was unable to resolve a global symbol, so you'll need to add
an import statement for it.  Or, it could indicate a bug that needs to
be fixed.  To determine the appropriate import statement, run:

$ pychecker problemfile.py | tools/ResolveGlobals.py allglobalsymbols

This prints from...import statements for the symbols it finds.
It prints a list of possible modules if the symbol appeared in the
duplicates list you examined above.  If the symbol is not found, it
just prints "import <symbolname>".
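
The three cases just described (one definition, several candidates, no
definition) can be sketched like this.  It is not the code of
ResolveGlobals.py.  The "file:line:" prefix on the sample warnings,
the symbol_modules dictionary, the wording of the ambiguity note, and
the module name chem are all assumptions made for this example.

```python
import re

# Matches the pychecker warning text quoted above.
NO_GLOBAL = re.compile(r"No global \((\w+)\) found")

def suggest_imports(pychecker_lines, symbol_modules):
    """Suggest one line per undefined symbol found in pychecker output.

    symbol_modules maps symbol -> list of modules defining it (hypothetical).
    """
    suggestions = []
    seen = set()
    for line in pychecker_lines:
        match = NO_GLOBAL.search(line)
        if not match or match.group(1) in seen:
            continue
        symbol = match.group(1)
        seen.add(symbol)
        modules = symbol_modules.get(symbol, [])
        if len(modules) == 1:
            suggestions.append("from %s import %s" % (modules[0], symbol))
        elif modules:
            # Several candidates: list them and let the reader choose.
            suggestions.append("# %s is ambiguous: %s" % (symbol, ", ".join(modules)))
        else:
            suggestions.append("import %s" % symbol)
    return suggestions

warnings = [
    "foo.py:3: No global (Atom) found",
    "foo.py:9: No global (QWidget) found",
    "foo.py:12: No global (mystery) found",
]
known = {"Atom": ["chem"], "QWidget": ["PyQt4.Qt", "PyQt4.QtGui"]}
for suggestion in suggest_imports(warnings, known):
    print(suggestion)
```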

When pychecker is happy that all global symbols are defined, you can
check to make sure everything is imported directly from the module it
is defined in:

$ grep '^[[:space:]]*from.*import' `tools/AllPyFiles.sh` | tools/ResolveGlobals.py allglobalsymbols --check-import > checkimportoutput

Examine the output.  Any remaining import *'s will be flagged, as will
import lines that end in a backslash.  Those should be removed, so
that the grep will find all symbol imports.  Lines in the output which
say "can't check up on file..." mean that the symbol is not in the
allglobalsymbols file.  Some of these may not be problems, for example
if you removed the symbol on purpose.

Lines including the string "elsewhere imported from" indicate that a
symbol is being imported from multiple sources in different modules.

Symbols which could be imported from one of several places have all
potential sources listed.
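
The kinds of findings described above can be illustrated with a small
sketch.  It is not the --check-import implementation.  It assumes each
input line is a single "from X import ..." statement with any grep
filename prefix already stripped, and the defined_in dictionary, the
finding labels, and the module names are hypothetical.

```python
import re

FROM_IMPORT = re.compile(r"^\s*from\s+([\w.]+)\s+import\s+(.+)$")

def check_import_line(line, defined_in):
    """Return a list of (kind, detail) findings for one import line.

    defined_in maps symbol -> the module that defines it (hypothetical).
    """
    match = FROM_IMPORT.match(line)
    if not match:
        return []
    module = match.group(1)
    rest = match.group(2).split("#")[0].strip()  # drop any trailing comment
    if rest.endswith("\\"):
        return [("continuation", module)]  # symbols continue on the next line
    if rest == "*":
        return [("star", module)]
    findings = []
    for item in rest.split(","):
        symbol = item.split(" as ")[0].strip()
        if not symbol:
            continue
        home = defined_in.get(symbol)
        if home is None:
            findings.append(("unknown", symbol))
        elif home != module:
            findings.append(("indirect", "%s is defined in %s" % (symbol, home)))
    return findings

defined_in = {"Atom": "chem", "Bond": "bonds"}
print(check_import_line("from chem import Atom, Bond", defined_in))
# prints [('indirect', 'Bond is defined in bonds')]
```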

When everything above has been fixed, the import statements should
accurately reflect the import dependencies between all modules.  At
this point, it's time to try graphing that structure.

$ tools/PackageDependency.py `tools/AllPyFiles.sh` --justCycles > depend.dot

(Optionally add "2> packageloopcounts" to capture the package loop counts
from stderr, but this may hide error messages printed to stderr,
including some we are about to add for imports with continuation lines
or without fully qualified module names. BTW the loop counts are no longer 
very useful, so they might be removed or made optional. 
[--bruce 072126 update])
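
For readers unfamiliar with what a cycles-only graph looks like, here
is a minimal sketch.  It assumes that the intent of --justCycles is to
keep only import edges that lie on some cycle.  That is a guess about
the option, not a description of PackageDependency.py, and the
function names and the sample graph are made up for this example.

```python
def reachable(graph, start):
    """Return every node reachable from start by following one or more edges."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen

def cycle_edges(graph):
    """Keep only edges (u, v) that lie on a cycle, i.e. where v can reach u."""
    reach = {}
    edges = []
    for u in sorted(graph):
        for v in sorted(graph[u]):
            if v not in reach:
                reach[v] = reachable(graph, v)
            if u in reach[v]:
                edges.append((u, v))
    return edges

def to_dot(edges, name="cycles"):
    """Format edges as a GraphViz digraph."""
    lines = ["digraph %s {" % name]
    lines += ['    "%s" -> "%s";' % edge for edge in edges]
    lines.append("}")
    return "\n".join(lines)

# A imports B, B imports C, C imports A and D; D imports nothing.
imports = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}
print(to_dot(cycle_edges(imports)))
```

The edge from C to D is dropped because D cannot reach C, so only the
A, B, C loop remains in the output.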

If you have the GraphViz package installed, the results can be plotted
with:

$ dot -Tpng depend.dot > depend.png

(To color-code the nodes by the tentative package assignments
hardcoded into PackageDependency.py, generate depend.dot with
the additional option --colorPackages.)

(To split depend.dot into disjoint connected subgraphs, assuming it
was made using the --justCycles option, run 

 $ tools/splitDependDot.py > anotheroutputfile

after making it. I don't know if that file is suitable input to
GraphViz, but the individual digraphs within it should be.)
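
Splitting a graph into disjoint connected subgraphs can be sketched as
below.  This is an illustration of the general technique (grouping
edges by weakly connected component with a union-find structure), not
the code of splitDependDot.py.  The function name split_components
and the list-of-edges input are assumptions for this example.

```python
def split_components(edges):
    """Group directed edges into weakly connected components.

    edges is a list of (source, target) pairs.  Returns a sorted list of
    edge lists, one per component.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for u, v in edges:
        parent[find(u)] = find(v)

    groups = {}
    for u, v in edges:
        groups.setdefault(find(u), []).append((u, v))
    return sorted(groups.values())

edges = [("A", "B"), ("B", "A"), ("X", "Y"), ("Y", "X")]
for number, component in enumerate(split_components(edges)):
    print(number, component)
```

Each returned edge list could then be written out as its own digraph.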