From 4a14d9c9af995061723487d3a9f749246863078b Mon Sep 17 00:00:00 2001 From: Robert Griesemer Date: Mon, 31 Jul 2023 09:41:33 -0700 Subject: [PATCH] [release-branch.go1.21] spec: update spec to version at tip This updates the spec by copying over several recent CLs describing the new type inference mechanisms. Fixes #61659. Change-Id: I750c901e73e0404f782a3632f5cd936e3775ae13 Reviewed-on: https://go-review.googlesource.com/c/go/+/514435 Reviewed-by: Robert Griesemer Auto-Submit: Robert Griesemer TryBot-Bypass: Robert Griesemer Reviewed-by: Ian Lance Taylor --- doc/go_spec.html | 759 ++++++++++++++++++++++++++--------------------- 1 file changed, 413 insertions(+), 346 deletions(-) diff --git a/doc/go_spec.html b/doc/go_spec.html index c2fa871eaa..d1b8bf2a91 100644 --- a/doc/go_spec.html +++ b/doc/go_spec.html @@ -1,6 +1,6 @@ @@ -2511,7 +2511,7 @@ type (

A type definition creates a new, distinct type with the same -underlying type and operations as the given type +underlying type and operations as the given type and binds an identifier, the type name, to it.

@@ -4343,7 +4343,7 @@ type parameter list type arguments after substitution When using a generic function, type arguments may be provided explicitly, or they may be partially or completely inferred from the context in which the function is used. -Provided that they can be inferred, type arguments may be omitted entirely if the function is: +Provided that they can be inferred, type argument lists may be omitted entirely if the function is:

    @@ -4351,7 +4351,7 @@ Provided that they can be inferred, type arguments may be omitted entirely if th called with ordinary arguments,
  • - assigned to a variable with an explicitly declared type, + assigned to a variable with a known type
  • passed as an argument to another function, or @@ -4371,7 +4371,7 @@ must be inferrable from the context in which the function is used. // sum returns the sum (concatenation, for strings) of its arguments. func sum[T ~int | ~float64 | ~string](x... T) T { … } -x := sum // illegal: sum must have a type argument (x is a variable without a declared type) +x := sum // illegal: the type of x is unknown intSum := sum[int] // intSum has type func(x... int) int a := intSum(2, 3) // a has value 5 of type int b := sum[float64](2.0, 3) // b has value 5.0 of type float64 @@ -4406,402 +4406,323 @@ For a generic type, all type arguments must always be provided explicitly.

    Type inference

    -NOTE: This section is not yet up-to-date for Go 1.21. +A use of a generic function may omit some or all type arguments if they can be +inferred from the context within which the function is used, including +the constraints of the function's type parameters. +Type inference succeeds if it can infer the missing type arguments +and instantiation succeeds with the +inferred type arguments. +Otherwise, type inference fails and the program is invalid.

    -Missing function type arguments may be inferred by a series of steps, described below. -Each step attempts to use known information to infer additional type arguments. -Type inference stops as soon as all type arguments are known. -After type inference is complete, it is still necessary to substitute all type arguments -for type parameters and verify that each type argument -implements the relevant constraint; -it is possible for an inferred type argument to fail to implement a constraint, in which -case instantiation fails. +Type inference uses the type relationships between pairs of types for inference: +For instance, a function argument must be assignable +to its respective function parameter; this establishes a relationship between the +type of the argument and the type of the parameter. +If either of these two types contains type parameters, type inference looks for the +type arguments to substitute the type parameters with such that the assignability +relationship is satisfied. +Similarly, type inference uses the fact that a type argument must +satisfy the constraint of its respective +type parameter.

    -Type inference is based on +Each such pair of matched types corresponds to a type equation containing +one or multiple type parameters, from one or possibly multiple generic functions. +Inferring the missing type arguments means solving the resulting set of type +equations for the respective type parameters. +

    + +

    +For example, given +

    + +
    +// dedup returns a copy of the argument slice with any duplicate entries removed.
    +func dedup[S ~[]E, E comparable](S) S { … }
    +
    +type Slice []int
    +var s Slice
    +s = dedup(s)   // same as s = dedup[Slice, int](s)
    +
    + +

    +the variable s of type Slice must be assignable to +the function parameter type S for the program to be valid. +To reduce complexity, type inference ignores the directionality of assignments, +so the type relationship between Slice and S can be +expressed via the (symmetric) type equation Slice ≡A S +(or S ≡A Slice for that matter), +where the A in A +indicates that the LHS and RHS types must match per assignability rules +(see the section on type unification for +details). +Similarly, the type parameter S must satisfy its constraint +~[]E. This can be expressed as S ≡C ~[]E +where X ≡C Y stands for +"X satisfies constraint Y". +These observations lead to a set of two equations +

    + +
    +	Slice ≡A S      (1)
    +	S     ≡C ~[]E   (2)
    +
    + +

    +which now can be solved for the type parameters S and E. +From (1) a compiler can infer that the type argument for S is Slice. +Similarly, because the underlying type of Slice is []int +and []int must match []E of the constraint, +a compiler can infer that E must be int. +Thus, for these two equations, type inference infers +

    + +
    +	S ➞ Slice
    +	E ➞ int
    +
    + +

    +Given a set of type equations, the type parameters to solve for are +the type parameters of the functions that need to be instantiated +and for which no explicit type arguments is provided. +These type parameters are called bound type parameters. +For instance, in the dedup example above, the type parameters +P and E are bound to dedup. +An argument to a generic function call may be a generic function itself. +The type parameters of that function are included in the set of bound +type parameters. +The types of function arguments may contain type parameters from other +functions (such as a generic function enclosing a function call). +Those type parameters may also appear in type equations but they are +not bound in that context. +Type equations are always solved for the bound type parameters only. +

    + +

    +Type inference supports calls of generic functions and assignments +of generic functions to (explicitly function-typed) variables. +This includes passing generic functions as arguments to other +(possibly also generic) functions, and returning generic functions +as results. +Type inference operates on a set of equations specific to each of +these cases. +The equations are as follows (type argument lists are omitted for clarity):

    • - a type parameter list +

      + For a function call f(a0, a1, …) where + f or a function argument ai is + a generic function: +
      + Each pair (ai, pi) of corresponding + function arguments and parameters where ai is not an + untyped constant yields an equation + typeof(pi) ≡A typeof(ai). +
      + If ai is an untyped constant cj, + and typeof(pi) is a bound type parameter Pk, + the pair (cj, Pk) is collected separately from + the type equations. +

    • - a substitution map M initialized with the known type arguments, if any +

      + For an assignment v = f of a generic function f to a + (non-generic) variable v of function type: +
      + typeof(v) ≡A typeof(f). +

    • - a (possibly empty) list of ordinary function arguments (in case of a function call only) +

      + For a return statement return …, f, … where f is a + generic function returned as a result to a (non-generic) result variable + r of function type: +
      + typeof(r) ≡A typeof(f). +

    -and then proceeds with the following steps: +Additionally, each type parameter Pk and corresponding type constraint +Ck yields the type equation +PkC Ck. +

    + +

    +Type inference gives precedence to type information obtained from typed operands +before considering untyped constants. +Therefore, inference proceeds in two phases:

    1. - apply function argument type inference - to all typed ordinary function arguments +

      + The type equations are solved for the bound + type parameters using type unification. + If unification fails, type inference fails. +

    2. - apply constraint type inference -
    3. -
    4. - apply function argument type inference to all untyped ordinary function arguments - using the default type for each of the untyped function arguments -
    5. -
    6. - apply constraint type inference +

      + For each bound type parameter Pk for which no type argument + has been inferred yet and for which one or more pairs + (cj, Pk) with that same type parameter + were collected, determine the constant kind + of the constants cj in all those pairs the same way as for + constant expressions. + The type argument for Pk is the + default type for the determined constant kind. + If a constant kind cannot be determined due to conflicting constant kinds, + type inference fails. +

    -If there are no ordinary or untyped function arguments, the respective steps are skipped. -Constraint type inference is skipped if the previous step didn't infer any new type arguments, -but it is run at least once if there are missing type arguments. +If not all type arguments have been found after these two phases, type inference fails.

    -The substitution map M is carried through all steps, and each step may add entries to M. -The process stops as soon as M has a type argument for each type parameter or if an inference step fails. -If an inference step fails, or if M is still missing type arguments after the last step, type inference fails. +If the two phases are successful, type inference determined a type argument for each +bound type parameter: +

    + +
    +	Pk ➞ Ak
    +
    + +

    +A type argument Ak may be a composite type, +containing other bound type parameters Pk as element types +(or even be just another bound type parameter). +In a process of repeated simplification, the bound type parameters in each type +argument are substituted with the respective type arguments for those type +parameters until each type argument is free of bound type parameters. +

    + +

    +If type arguments contain cyclic references to themselves +through bound type parameters, simplification and thus type +inference fails. +Otherwise, type inference succeeds.

    Type unification

    -Type inference is based on type unification. A single unification step -applies to a substitution map and two types, either -or both of which may be or contain type parameters. The substitution map tracks -the known (explicitly provided or already inferred) type arguments: the map -contains an entry PA for each type -parameter P and corresponding known type argument A. -During unification, known type arguments take the place of their corresponding type -parameters when comparing types. Unification is the process of finding substitution -map entries that make the two types equivalent. +Type inference solves type equations through type unification. +Type unification recursively compares the LHS and RHS types of an +equation, where either or both types may be or contain bound type parameters, +and looks for type arguments for those type parameters such that the LHS +and RHS match (become identical or assignment-compatible, depending on +context). +To that effect, type inference maintains a map of bound type parameters +to inferred type arguments; this map is consulted and updated during type unification. +Initially, the bound type parameters are known but the map is empty. +During type unification, if a new type argument A is inferred, +the respective mapping P ➞ A from type parameter to argument +is added to the map. +Conversely, when comparing types, a known type argument +(a type argument for which a map entry already exists) +takes the place of its corresponding type parameter. +As type inference progresses, the map is populated more and more +until all equations have been considered, or until unification fails. +Type inference succeeds if no unification step fails and the map has +an entry for each type parameter.

    -

    -For unification, two types that don't contain any type parameters from the current type -parameter list are equivalent -if they are identical, or if they are channel types that are identical ignoring channel -direction, or if their underlying types are equivalent. -

    - -

    -Unification works by comparing the structure of pairs of types: their structure -disregarding type parameters must be identical, and types other than type parameters -must be equivalent. -A type parameter in one type may match any complete subtype in the other type; -each successful match causes an entry to be added to the substitution map. -If the structure differs, or types other than type parameters are not equivalent, -unification fails. -

    - - - -

    -For example, if T1 and T2 are type parameters, -[]map[int]bool can be unified with any of the following: + +For example, given the type equation with the bound type parameter +P

    -[]map[int]bool   // types are identical
    -T1               // adds T1 → []map[int]bool to substitution map
    -[]T1             // adds T1 → map[int]bool to substitution map
    -[]map[T1]T2      // adds T1 → int and T2 → bool to substitution map
    +	[10]struct{ elem P, list []P } ≡A [10]struct{ elem string; list []string }
     

    -On the other hand, []map[int]bool cannot be unified with any of -

    - -
    -int              // int is not a slice
    -struct{}         // a struct is not a slice
    -[]struct{}       // a struct is not a map
    -[]map[T1]string  // map element types don't match
    -
    - -

    -As an exception to this general rule, because a defined type -D and a type literal L are never equivalent, -unification compares the underlying type of D with L instead. -For example, given the defined type -

    - -
    -type Vector []float64
    -
    - -

    -and the type literal []E, unification compares []float64 with -[]E and adds an entry Efloat64 to -the substitution map. -

    - -

    Function argument type inference

    - - - -

    -Function argument type inference infers type arguments from function arguments: -if a function parameter is declared with a type T that uses -type parameters, -unifying the type of the corresponding -function argument with T may infer type arguments for the type -parameters used by T. +type inference starts with an empty map. +Unification first compares the top-level structure of the LHS and RHS +types. +Both are arrays of the same length; they unify if the element types unify. +Both element types are structs; they unify if they have +the same number of fields with the same names and if the +field types unify. +The type argument for P is not known yet (there is no map entry), +so unifying P with string adds +the mapping P ➞ string to the map. +Unifying the types of the list field requires +unifying []P and []string and +thus P and string. +Since the type argument for P is known at this point +(there is a map entry for P), its type argument +string takes the place of P. +And since string is identical to string, +this unification step succeeds as well. +Unification of the LHS and RHS of the equation is now finished. +Type inference succeeds because there is only one type equation, +no unification step failed, and the map is fully populated.

    -For instance, given the generic function -

    - -
    -func scale[Number ~int64|~float64|~complex128](v []Number, s Number) []Number
    -
    - -

    -and the call -

    - -
    -var vector []float64
    -scaledVector := scale(vector, 42)
    -
    - -

    -the type argument for Number can be inferred from the function argument -vector by unifying the type of vector with the corresponding -parameter type: []float64 and []Number -match in structure and float64 matches with Number. -This adds the entry Numberfloat64 to the -substitution map. -Untyped arguments, such as the second function argument 42 here, are ignored -in the first round of function argument type inference and only considered if there are -unresolved type parameters left. +Unification uses a combination of exact and loose +unification depending on whether two types have to be +identical, +assignment-compatible, or +only structurally equal. +The respective type unification rules +are spelled out in detail in the Appendix.

    -Inference happens in two separate phases; each phase operates on a specific list of -(parameter, argument) pairs: +For an equation of the form X ≡A Y, +where X and Y are types involved +in an assignment (including parameter passing and return statements), +the top-level type structures may unify loosely but element types +must unify exactly, matching the rules for assignments.

    -
      +

      +For an equation of the form P ≡C C, +where P is a type parameter and C +its corresponding constraint, the unification rules are bit +more complicated: +

      + +
      • - The list Lt contains all (parameter, argument) pairs where the parameter - type uses type parameters and where the function argument is typed. + If C has a core type + core(C) + and P has a known type argument A, + core(C) and A must unify loosely. + If P does not have a known type argument + and C contains exactly one type term T + that is not an underlying (tilde) type, unification adds the + mapping P ➞ T to the map.
      • - The list Lu contains all remaining pairs where the parameter type is a single - type parameter. In this list, the respective function arguments are untyped. + If C does not have a core type + and P has a known type argument A, + A must have all methods of C, if any, + and corresponding method types must unify exactly.
      • -
    +

-Any other (parameter, argument) pair is ignored. -

- -

-By construction, the arguments of the pairs in Lu are untyped constants -(or the untyped boolean result of a comparison). And because default types -of untyped values are always predeclared non-composite types, they can never match against -a composite type, so it is sufficient to only consider parameter types that are single type -parameters. -

- -

-Each list is processed in a separate phase: -

- -
    -
  1. - In the first phase, the parameter and argument types of each pair in Lt - are unified. If unification succeeds for a pair, it may yield new entries that - are added to the substitution map M. If unification fails, type inference - fails. -
  2. -
  3. - The second phase considers the entries of list Lu. Type parameters for - which the type argument has already been determined are ignored in this phase. - For each remaining pair, the parameter type (which is a single type parameter) and - the default type of the corresponding untyped argument is - unified. If unification fails, type inference fails. -
  4. -
- -

-While unification is successful, processing of each list continues until all list elements -are considered, even if all type arguments are inferred before the last list element has -been processed. -

- -

-Example: -

- -
-func min[T ~int|~float64](x, y T) T
-
-var x int
-min(x, 2.0)    // T is int, inferred from typed argument x; 2.0 is assignable to int
-min(1.0, 2.0)  // T is float64, inferred from default type for 1.0 and matches default type for 2.0
-min(1.0, 2)    // illegal: default type float64 (for 1.0) doesn't match default type int (for 2)
-
- -

-In the example min(1.0, 2), processing the function argument 1.0 -yields the substitution map entry Tfloat64. Because -processing continues until all untyped arguments are considered, an error is reported. This -ensures that type inference does not depend on the order of the untyped arguments. -

- -

Constraint type inference

- -

-Constraint type inference infers type arguments by considering type constraints. -If a type parameter P has a constraint with a -core type C, -unifying P with C -may infer additional type arguments, either the type argument for P, -or if that is already known, possibly the type arguments for type parameters -used in C. -

- -

-For instance, consider the type parameter list with type parameters List and -Elem: -

- -
-[List ~[]Elem, Elem any]
-
- -

-Constraint type inference can deduce the type of Elem from the type argument -for List because Elem is a type parameter in the core type -[]Elem of List. -If the type argument is Bytes: -

- -
-type Bytes []byte
-
- -

-unifying the underlying type of Bytes with the core type means -unifying []byte with []Elem. That unification succeeds and yields -the substitution map entry -Elembyte. -Thus, in this example, constraint type inference can infer the second type argument from the -first one. -

- -

-Using the core type of a constraint may lose some information: In the (unlikely) case that -the constraint's type set contains a single defined type -N, the corresponding core type is N's underlying type rather than -N itself. In this case, constraint type inference may succeed but instantiation -will fail because the inferred type is not in the type set of the constraint. -Thus, constraint type inference uses the adjusted core type of -a constraint: if the type set contains a single type, use that type; otherwise use the -constraint's core type. -

- -

-Generally, constraint type inference proceeds in two phases: Starting with a given -substitution map M -

- -
    -
  1. -For all type parameters with an adjusted core type, unify the type parameter with that -type. If any unification fails, constraint type inference fails. -
  2. - -
  3. -At this point, some entries in M may map type parameters to other -type parameters or to types containing type parameters. For each entry -PA in M where A is or -contains type parameters Q for which there exist entries -QB in M, substitute those -Q with the respective B in A. -Stop when no further substitution is possible. -
  4. -
- -

-The result of constraint type inference is the final substitution map M from type -parameters P to type arguments A where no type parameter P -appears in any of the A. -

- -

-For instance, given the type parameter list -

- -
-[A any, B []C, C *A]
-
- -

-and the single provided type argument int for type parameter A, -the initial substitution map M contains the entry Aint. -

- -

-In the first phase, the type parameters B and C are unified -with the core type of their respective constraints. This adds the entries -B[]C and C*A -to M. - -

-At this point there are two entries in M where the right-hand side -is or contains type parameters for which there exists other entries in M: -[]C and *A. -In the second phase, these type parameters are replaced with their respective -types. It doesn't matter in which order this happens. Starting with the state -of M after the first phase: -

- -

-Aint, -B[]C, -C*A -

- -

-Replace A on the right-hand side of → with int: -

- -

-Aint, -B[]C, -C*int -

- -

-Replace C on the right-hand side of → with *int: -

- -

-Aint, -B[]*int, -C*int -

- -

-At this point no further substitution is possible and the map is full. -Therefore, M represents the final map of type parameters -to type arguments for the given type parameter list. +When solving type equations from type constraints, +solving one equation may infer additional type arguments, +which in turn may enable solving other equations that depend +on those type arguments. +Type inference repeats type unification as long as new type +arguments are inferred.

Operators

@@ -5479,7 +5400,7 @@ in any of these cases: ignoring struct tags (see below), x's type and T are not type parameters but have - identical underlying types. + identical underlying types.
  • ignoring struct tags (see below), @@ -7324,7 +7245,8 @@ clear(t) type parameter see below

    -If the argument type is a type parameter, +If the type of the argument to clear is a +type parameter, all types in its type set must be maps or slices, and clear performs the operation corresponding to the actual type argument.

    @@ -8290,7 +8212,7 @@ of if the general conversion rules take care of this.

    A Pointer is a pointer type but a Pointer value may not be dereferenced. -Any pointer or value of underlying type uintptr can be +Any pointer or value of underlying type uintptr can be converted to a type of underlying type Pointer and vice versa. The effect of converting between Pointer and uintptr is implementation-defined.

    @@ -8438,3 +8360,148 @@ The following minimal alignment properties are guaranteed:

    A struct or array type has size zero if it contains no fields (or elements, respectively) that have a size greater than zero. Two distinct zero-size variables may have the same address in memory.

    + +

    Appendix

    + +

    Type unification rules

    + +

    +The type unification rules describe if and how two types unify. +The precise details are relevant for Go implementations, +affect the specifics of error messages (such as whether +a compiler reports a type inference or other error), +and may explain why type inference fails in unusual code situations. +But by and large these rules can be ignored when writing Go code: +type inference is designed to mostly "work as expected", +and the unification rules are fine-tuned accordingly. +

    + +

    +Type unification is controlled by a matching mode, which may +be exact or loose. +As unification recursively descends a composite type structure, +the matching mode used for elements of the type, the element matching mode, +remains the same as the matching mode except when two types are unified for +assignability (A): +in this case, the matching mode is loose at the top level but +then changes to exact for element types, reflecting the fact +that types don't have to be identical to be assignable. +

    + +

    +Two types that are not bound type parameters unify exactly if any of +following conditions is true: +

    + +
      +
    • + Both types are identical. +
    • +
    • + Both types have identical structure and their element types + unify exactly. +
    • +
    • + Exactly one type is an unbound + type parameter with a core type, + and that core type unifies with the other type per the + unification rules for A + (loose unification at the top level and exact unification + for element types). +
    • +
    + +

    +If both types are bound type parameters, they unify per the given +matching modes if: +

    + +
      +
    • + Both type parameters are identical. +
    • +
    • + At most one of the type parameters has a known type argument. + In this case, the type parameters are joined: + they both stand for the same type argument. + If neither type parameter has a known type argument yet, + a future type argument inferred for one the type parameters + is simultaneously inferred for both of them. +
    • +
    • + Both type parameters have a known type argument + and the type arguments unify per the given matching modes. +
    • +
    + +

    +A single bound type parameter P and another type T unify +per the given matching modes if: +

    + +
      +
    • + P doesn't have a known type argument. + In this case, T is inferred as the type argument for P. +
    • +
    • + P does have a known type argument A, + A and T unify per the given matching modes, + and one of the following conditions is true: +
        +
      • + Both A and T are interface types: + In this case, if both A and T are + also defined types, + they must be identical. + Otherwise, if neither of them is a defined type, they must + have the same number of methods + (unification of A and T already + established that the methods match). +
      • +
      • + Neither A nor T are interface types: + In this case, if T is a defined type, T + replaces A as the inferred type argument for P. +
      • +
      • + In all other cases unification of P and T fails. +
      • +
      +
    • +
    + +

    +Finally, two types that are not bound type parameters unify loosely +(and per the element matching mode) if: +

    + +
      +
    • + Both types unify exactly. +
    • +
    • + One type is a defined type, + the other type is a type literal, but not an interface, + and their underlying types unify per the element matching mode. +
    • +
    • + Both types are interfaces (but not type parameters) with + identical type terms, + both or neither embed the predeclared type + comparable, + corresponding method types unify per the element matching mode, + and the method set of one of the interfaces is a subset of + the method set of the other interface. +
    • +
    • + Only one type is an interface (but not a type parameter), + corresponding methods of the two types unify per the element matching mode, + and the method set of the interface is a subset of + the method set of the other type. +
    • +
    • + Both types have the same structure and their element types + unify per the element matching mode. +
    • +