New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
General-purpose safe unions #896
Comments
I actually started working on the implementation, and I want to propose an alternative syntax:
@jafingerhut : what do you think? This does not handle the case of using an explicit existing enum as tags, but I think that we can accommodate that case easily with a small extension. What I like about this proposal is that is is very economic, and it can reuse many existing language mechanisms. |
In fact, we can make this even simpler: an assignment to U can be written as just |
One issue is something @jafingerhut had noted earlier: if
|
No, once assigned u should be immutable. Or, if we really want them mutable, we cannot allow changing the "tag", only the fields. |
So the expression |
This is really orthogonal to the syntax. Also, these values cannot be uninitialized if we want to be safe. The most liberal implementation would make the "tag" immutable within a switch statement that switches on the tag, but would allow mutating tags otherwise. |
With my proposal the type is actually the same, but you will be forbidden from accessing |
Oh, I misunderstood. Let me try to restate to see if I understand:
Is that right? |
With the exception that u.x cannot be assigned in a switch, but u.x.a can (if we decide that unions are not immutable). |
Of course, we are talking about a |
Right. A bit fiddly, but I get how this would work. |
This PR contains a working implementation p4lang/p4c#2615. |
Here is an example from the implementation with comments: #include <core.p4>
struct S {
bit<8> f;
}
union Safe {
bit<32> b;
bit<16> c;
S f;
}
// Generic unions are supported
union Either<T, U> {
T t;
U u;
}
control c(out bool o) {
apply {
Either<bit<32>, bit<16>> e;
Safe s; // union is uninitialized now. the union maintains an internal tag that says "uninitialized"
s.b = 4; // We can only assign to whole union fields. This sets the tag of the union.
// bool b = s.b == 4; // error: cannot read field except in a switch statement
// bool b = s.f.f == 0; // error: cannot access union sub-fields except in a switch statement
Safe s1;
s1.b = 2;
switch (s) {
// Union tags have the shape "UnionType.field". We could also go with just the field name, but I think this is cleaner.
Safe.b: { o = (s.b == 0); } // Within a label all values in the s.b field become available.
Safe.c: {
// s.c = 2; // error: cannot change the "tag" within a switch statement.
switch (s1) { // can have nested switch statements
Safe.b: { o = (s1.b == (bit<32>)s.c); }
// not all labels must exist
default: { o = false; }
}
}
Safe.f: { o = (s.f.f == 1); s.f.f = 2; }
default: {
o = true;
s1.c = 2;
// bool b = s.b == 0; // error: no union fields are available in the 'default' label
}
}
switch (e) {
Either.t: { o = e.t == 0; }
Either.u: { o = e.u == 0; }
}
}
}
control ctrl(out bool o);
package top(ctrl _c);
top(c()) main; |
Does the design permit void-valued tags? (Seems like it could be useful.) If so, (1) do unions subsume enums, and (2) how do we assign a void-valued tag? |
This is an interesting idea. |
However, this feature is unlikely to be valued by developers. |
If we can unify this design with the existing enums, I think we'd end up with a (c)leaner language. Maybe something like the following. This optimistically assumes that we can turn uninitialized reads into compile-time errors. I've also written union declarations as #include <core.p4>
struct S {
bit<8> f;
}
enum Safe {
B(bit<32>);
C(bit<16>);
F(S);
}
// Generic unions are supported
enum Either<T, U> {
T(T);
U(U);
}
control c(out bool o) {
apply {
Either<bit<32>, bit<16>> e;
Safe s; // Here, s is uninitialized. Reads of possibly uninitialized
// unions result in a compile-time error.
s = B(4); // This sets the tag of the union. `Safe.B` would be better,
// but this is incompatible with existing syntax for enums.
bool b1 = s == B(4); // True
bool b2 = s == B(5); // False: s is a B, but 4 != 5
bool b3 = s == C(4); // False: s is not a C
Safe s1;
s1 = B(2);
switch (s) {
B: {
// Within a label, the union has a `value` field that can be
// read and modified, but whose type cannot change. Here,
// `s.value` is a bit<32>, as declared by the `B` label of
// `Safe`.
o = (s.value == 0);
}
C: {
switch (s1) { // can have nested switch statements
B: { o = (s1.value == (bit<32>)s.value); }
}
}
F: { o = (s.value.f == 1); s.value.f = 2; }
}
switch (e) {
T: { o = e.value == 0; }
U: { o = e.value == 0; }
}
}
}
control ctrl(out bool o);
package top(ctrl _c);
top(c()) main; |
The language can be leaner, but it will look less familiar to C users. Moreover, we already have a bunch of code that deals with enums, including in P4Runtime. We would have to generalize that code. |
I'd like to highlight that there is no syntax for unsafe operations in the example I've written above, which makes for a cleaner design.
|
The original proposal should be equally safe. You can only compare labels using a switch, and you cannot change the tag within a switch. You can only access fields that depend on a tag only if the tag is present. |
BTW: the proposal is fully implemented, feel free to try it. |
unfortunately this implementation is flawed, as pointed out in p4lang/p4c#2615:
|
The way this is solved in other programming languages is either to make the unions immutable once assigned, or to have the switch perform a "pattern matching" operation copying the safe union fields into a temporary storage area. This removes the requirement for the union type to change in the rest of the block. |
Here is a possible way one could implement "pattern matching"; I am not particularly fond of this syntax, but we can use it as a basis for discussion:
|
Note that the switch labels have now become assignment statements that are constrained to have the RHS expression be exactly one of the union fields. |
Possibility for renaming the variable for switch cases:
|
|
Current proposal: the only assignments to |
Here is an example with the latest syntax:
|
In the interest of tidying up the set of active issues on the P4 specification repository, I'm marking this as "stalled" and closing it. Of course, we can always re-open it in the future if there is interest in resurrecting it. |
Personnel
Design
The actual implementation is inspired by this design, but diverges from it in several ways.
Implementation
Process
The text was updated successfully, but these errors were encountered: