Modules Notes

Design

A more recent and fleshed-out version of these notes is in a Google Doc here: https://drive.google.com/open?id=1NNC4t5NjAhzOhiC_DauiYrw80h1cXjZhshdRTuAPSnc
Also some slides here: https://drive.google.com/open?id=1j0Het_5tRU0i8OxY1fH6EJ7L9ls1fuPobD2LfTekN_U
We conflate modules and compilation units. That is, each file f.p4 is a module.
Files are searched in a standard order -- e.g., according to an environment variable like $P4PATH and/or command-line arguments to the p4c compiler.
- It would be simplest to require that, for every name foo, importing module foo always resolves to the file foo.p4 in the first directory of $P4PATH that contains such a file, ignoring any identically named files in later directories of $P4PATH. Also, for a single execution of the compiler, there is only one value of $P4PATH, i.e. there is no way to temporarily modify $P4PATH for part of the compilation run, or when compiling some modules vs. other modules.
- Should import foo.bar be allowed, and mean searching for a file foo/bar.p4 inside a directory in $P4PATH? I cannot think of any problems this would introduce, and for some "multi-module libraries" it could be a useful way to let the library authors provide a directory tree of files, rather than requiring them to put all source files in the same directory.
- A question was raised about dealing with multiple versions of the same module, with different versions imported from different other modules, e.g. top level program A imports modules B and C. B imports module D version 28, but C imports module D version 31. Proposal: If someone has specific ideas on how to address this, please describe them. It is far simpler to just require that for every module name, there is a single version of it that is first in $P4PATH, no matter where it is imported from, and the P4 developer is responsible for arranging that it is the version they want to use everywhere in their program. I am not aware of any language with modules that has a good answer to using multiple versions of the same library/module from different parts of a program, other than explicitly giving them different names from each other, e.g. D_version_28 and D_version_31.
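
The lookup rule described in the bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `resolve_module` is hypothetical, the search path is passed in as a list because the notes do not say how $P4PATH would be formatted, and the dotted-name mapping to foo/bar.p4 is the optional extension asked about above, not a settled rule.

```python
import os
from typing import List, Optional


def resolve_module(name: str, search_path: List[str]) -> Optional[str]:
    """Return the path of the first file that matches module `name`, or None.

    Assumption: a dotted name such as "foo.bar" maps to "foo/bar.p4".
    Directories are tried strictly in order, so identically named files in
    later directories are ignored.
    """
    relative = os.path.join(*name.split(".")) + ".p4"
    for directory in search_path:
        candidate = os.path.join(directory, relative)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Because the first match always wins, the "module D version 28 vs. version 31" situation resolves to whichever D.p4 comes first in the search path, regardless of which module does the importing.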
We still run cpp over each file before loading.
- Opinions differ on whether cpp is a hack that should be deprecated and made unnecessary, vs. a useful technique that is sometimes a reasonable tool for the job.
- Should cpp "include path" be the same for every module compiled in a single invocation of the compiler?
  - If yes, then note that if a module is spread over multiple files, and two modules both do #include "defs.h", where defs.h is a file name used in both module implementations but with different contents, then if you are lucky compilation will give an error, and if unlucky your compiled code will be wrong. We could have a lint-like tool or a customization to cpp that warns about such name conflicts. That would prevent silently wrong compiler results, but would still require the developer to rename include files to make them unique across a project.
  - If no, then it seems like we would need a way to specify an independent cpp include path for each module. Should there be some kind of per-module file containing its own private additions to the include path? If so, should the module-specific files always be used in preference to the "project wide" include path? Probably yes. Even then, accidentally naming a project-wide include file the same as one inside a module could lead to hard-to-diagnose problems, and a warning like the one mentioned in the previous bullet could be useful.
- It has been suggested that this proposal could add conditional importing of modules, e.g. perhaps something like this:
```
const bool ENABLE_FEATURE_FOO = true;
// compile_time_if is a placeholder name.  It is similar to #ifdef but
// is not special to cpp.
// Note that unlike #ifdef, compile_time_if can examine the values of P4_16 compile-time
// constants like ENABLE_FEATURE_FOO.
compile_time_if ENABLE_FEATURE_FOO
import feature_foo_lib as foo;
end_compile_time_if
```
- If such a compile_time_if was only useful for the purpose of controlling which import statements are "active" and which are not, then it still seems that one of the most common use cases for #ifdef remains -- conditionally including or leaving out arbitrary lines of code, e.g. within controls or parsers. Question to those who would like to deprecate cpp: would you propose that compile_time_if be usable in arbitrary places like #ifdef can be? If so, why is that better than using cpp?
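
The lint-like check suggested above for clashing include file names could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the function name `find_include_name_conflicts`, the idea that each module keeps its include files in one directory, and the choice to look only at files ending in ".h". It reports a file name only when at least two modules provide different contents under that name.

```python
import os
from collections import defaultdict
from typing import Dict, List


def find_include_name_conflicts(module_dirs: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each conflicting include file name to the modules that provide it.

    `module_dirs` maps a module name to the directory holding its files.
    A name conflicts when two or more modules have a ".h" file with that
    name and the file contents are not all identical.
    """
    contents = defaultdict(dict)  # file name -> {module name: file bytes}
    for module, directory in module_dirs.items():
        for entry in os.listdir(directory):
            path = os.path.join(directory, entry)
            if entry.endswith(".h") and os.path.isfile(path):
                with open(path, "rb") as handle:
                    contents[entry][module] = handle.read()
    conflicts = {}
    for name, by_module in contents.items():
        if len(set(by_module.values())) > 1:
            conflicts[name] = sorted(by_module)
    return conflicts
```

A warning driven by this result would still leave the renaming work to the developer, as noted above.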

We import modules by name in a P4 Program.

import f;
// can now reference f.x and f.y

import f as g;
// can now reference g.x and g.y, especially handy if instead of 'f' it is 'some_very_long_module_name'

from f import x,y;
// can now reference x and y

from f import x as a, y as b, z;
// can now reference f.x using name a, and f.y using name b, and f.z using name z

from f import *;
// @jafingerhut argues against this.
// However, this would let us say that the compiler automagically does `from core import *`
// We could consider making core, and perhaps also one architecture per program, as a special
// case that behaves like `from core import *`, but still not let the P4 developer write
// `from foo import *` for any other modules.

Proposal: The things in a module B that might be imported by another module A must be explicitly exported in B. By default, all names in B are private to it. `export *;` would be allowed for module authors who want everything in the module to be importable by others. As long as no module uses `from B import *;`, adding new names to B cannot cause name conflicts in modules that import B.
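
To make the import forms and the export rule above concrete, here is a small Python model of the name bindings each form would create. It is a sketch, not part of the proposal: `bind_import`, `ModuleImportError`, and the representation of exports as a dictionary of name sets are all hypothetical.

```python
from typing import Dict, Optional, Set


class ModuleImportError(Exception):
    """Raised when an import names something the module does not export."""


def bind_import(exports: Dict[str, Set[str]], module: str,
                alias: Optional[str] = None,
                names: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return {local name: "module.name"} for one import statement.

    `import f;`              -> bind_import(exports, "f")
    `import f as g;`         -> bind_import(exports, "f", alias="g")
    `from f import x as a;`  -> bind_import(exports, "f", names={"x": "a"})
    Only names listed in exports[module] are visible; the rest are private.
    """
    exported = exports[module]
    if names is None:
        prefix = alias or module
        return {f"{prefix}.{n}": f"{module}.{n}" for n in sorted(exported)}
    bindings = {}
    for original, local in names.items():
        if original not in exported:
            raise ModuleImportError(f"{module} does not export {original}")
        bindings[local] = f"{module}.{original}"
    return bindings
```

Note that in this model the qualified forms never add an unqualified name to the importer, which is why new exports in f cannot collide with the importer's own names.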
What is in a name?
- Proposal: Only "top level" names can be imported from another module. For example, it doesn't make sense to import a local variable inside of a control defined in one module, into another. Top level things:
  - type, typedef, header, header_union, struct, enum, const, action, function, control, parser, package, extern functions and objects, instantiated objects of control/parser/packages/extern object.
  - error names are special in P4_16, in that they seem to be intended to be "global". Does it make sense to auto-import these always, with no way to prevent it? What would it mean to keep such an error private to a module?
  - similar questions for match_kind as for error
  - NOT #define symbols - cpp is only run on each module independently, not "across modules". P4 developers can still have common include files like defines.h that they #include from each module, if they wish.
- Multiple extern functions and/or actions can be defined with the same name but different signatures, with p4c as of 2019-May-08. Should import/export allow selection of individual signatures, or is importing all/none of those by the name good enough?
- p4c 2019-May-08 allows one to define a control with the same name as an extern function, e.g. clone is an extern function in v1model.p4 include file, but in a program that includes v1model.p4 there is no error if you define a control with the name clone, too. Should this be disallowed by p4c? If it should be allowed, should there be a way to import the extern function clone but not the control clone, or vice versa? Or do they both always get imported together because they have the same name?
- [@jafingerhut] I have not tested extensively with other combinations of "kinds" of top level things and whether they can have the same name or not, but it seems worth thinking about that and how it affects importing/exporting of names. Proposal: Modify p4c to disallow different "kinds" of things from having the same name, where it is allowed today.
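
The check proposed above, rejecting a name that is declared as more than one "kind" of top level thing, can be sketched as follows. The function name and the representation of declarations as (kind, name) pairs are assumptions for illustration; this is not how p4c represents declarations. Repeated declarations of the same kind, such as overloaded extern functions, are deliberately not flagged.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_mixed_kind_names(declarations: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each name declared with more than one kind to its sorted kinds.

    `declarations` is an iterable of (kind, name) pairs, for example
    ("control", "clone"). Overloads of a single kind are allowed.
    """
    kinds = defaultdict(set)
    for kind, name in declarations:
        kinds[name].add(kind)
    return {name: sorted(found) for name, found in kinds.items() if len(found) > 1}
```

For the clone example above, a declaration list containing both ("extern function", "clone") and ("control", "clone") would be reported, while two extern function overloads named clone alone would not.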
A few issues to think about:
- Do we write import core? Or do we just automatically import and open that module?
- Do we write import v1model?
  1. Could specify the architecture on the command line and automagically import, like core?
```
...
counter(1024) c;
...
switch( ... ) main;
```
  2. Could specify the architecture on the command line but manually import the architecture
```
import psa;
...
psa.counter(1024) c;
...
psa.switch( ... ) main;
```
  3. Could specify the architecture on the command line but automagically import with a wildcard
```
from psa import *;
...
counter(1024) c;
...
switch( ... ) main;
```
  4. Could include using cpp directives
```
#include <psa.p4>
...
counter(1024) c;
...
switch( ... ) main;
```
Circular dependencies between compilation units are not allowed.
import statements are definitely supported at the top level of the code. Proposal: It seems reasonable to disallow them anywhere except at the top level of a program, e.g. they are not permitted inside the body of a parser or control.
Proposal: Explicitly support an arbitrary directed acyclic graph structure of imports between modules in a single compilation run, i.e. compiling A, which imports B, C, and D, and all of B, C, and D import E (perhaps with different aliases for each) should be allowed.
- Does it ever make sense for the instantiation of main to be anywhere other than A, the file name / module that one directly invokes the compiler on? Anything after that instantiation seems like it cannot affect the behavior of earlier code, so seems like it must be dead code to me. P4 has no forward declarations.
- Should a module that is imported be allowed to have top level instantiations of extern objects, parsers, controls, or packages? Proposal: Yes. If yes, and if a module like E in the example above, which is imported from multiple other places, has such instantiations, should that imply exactly one instantiation of each of those objects, or one for each time that E is imported (3 in the example above)? Proposal: Exactly one instantiation seems simpler to implement and reason about, and still useful.
Should A be allowed to have multiple import statements for the same module B? If so, what does that mean? Proposal: It is allowed, and simply creates more aliases for the same objects of B inside of A. In the directed acyclic graph described in the issue above, it is simply multiple edges from A to B.
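
The import-graph rules above (no cycles, an arbitrary DAG, each module loaded and instantiated exactly once, repeated imports treated as extra edges) can be sketched with a depth-first traversal. This is a minimal Python illustration, not a description of p4c; `load_order`, `CircularImportError`, and the dictionary representation of the graph are hypothetical.

```python
from typing import Dict, List


class CircularImportError(Exception):
    """Raised when the import graph contains a cycle."""


def load_order(imports: Dict[str, List[str]], root: str) -> List[str]:
    """Return modules so that each one comes after everything it imports.

    Every module appears exactly once, no matter how many import edges
    reach it, which matches the "exactly one instantiation" proposal.
    """
    order: List[str] = []
    state: Dict[str, str] = {}  # module -> "visiting" or "done"

    def visit(module: str, stack: List[str]) -> None:
        if state.get(module) == "done":
            return
        if state.get(module) == "visiting":
            cycle = stack[stack.index(module):] + [module]
            raise CircularImportError(" -> ".join(cycle))
        state[module] = "visiting"
        for dependency in imports.get(module, []):
            visit(dependency, stack + [module])
        state[module] = "done"
        order.append(module)

    visit(root, [])
    return order


# A imports B, C, D (and B a second time); B, C, and D all import E.
example = {"A": ["B", "C", "D", "B"], "B": ["E"], "C": ["E"], "D": ["E"], "E": []}
print(load_order(example, "A"))  # ['E', 'B', 'C', 'D', 'A']
```

The duplicate edge from A to B changes nothing in the result, and a graph such as A importing B while B imports A raises CircularImportError with the offending path in the message.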
Module name conventions?
- Some discussion on Java style com.mycompany.module-name names vs. Python's more "free wheeling" approach, that in most cases seems to work fairly well because people are discouraged from colliding with names for anything that is published: https://stackoverflow.com/questions/709036/why-do-languages-like-java-use-hierarchical-package-names-while-python-does-not
- There is some discussion of an "ownership rule" for Python package names in PEP 423 here: https://www.python.org/dev/peps/pep-0423/ but I do not know if there is any kind of "official registry" for who owns what Python package names. Perhaps not.
