From 014604fbf04e0343702c282712e3907d7d78347f Mon Sep 17 00:00:00 2001
From: Chris Garvis
Date: Tue, 12 May 2026 18:22:42 -0400
Subject: [PATCH 1/2] Set-theoretic type aliases: storage, resolution, and protocol unions, #15127
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This branch advances the Type aliases bullet of #15127. The work is split across four logical pieces; they're combined here as one PR since each piece depends on the previous one and they all share a test surface in lib/elixir/test/elixir/module/types/integration_test.exs.

## 1. Foundation: typed struct fields via @type t

A new `Module.Types.Typespec` converter walks typespec ASTs and produces `Module.Types.Descr` values. Built-ins, literals, unions, struct literals (`%__MODULE__{...}` and `%Mod{...}`), local references, and self-qualified references are all handled. Conversion is memoized, with the alias under conversion marked by a `:pending` sentinel, so a self-cycle surfaces as an error from the converter; the caller then degrades the whole alias to `dynamic()` rather than breaking the build.

`Kernel.Typespec.translate_typespecs_for_module/2` invokes the converter and stores the result in the module's data tables under `{:elixir, :types_descr}`. `Module.ParallelChecker.cache_from_module_map` snapshots it into the long-lived checker ETS table during `spawn_parallel_checker`, so the async checker can read it after the per-module data tables are torn down.

The chunk-write path in `elixir_erl:compile/2` adds an optional `:types` key to the ExCk chunk; older readers ignore it (additive, no version bump).

`Module.Types.Of.struct_instance/7` consults the struct's `t/0` (for example `User.t/0`) to type-check field values, emitting a new `:badstructfield` diagnostic on mismatch. Default values injected by defstruct are skipped to avoid noise.

To make this available to stdlib, `module/types/typespec.ex` is added to bootstrap MAIN. Without this, modules compiled before `Module.Types.Typespec` is loaded would never get `:types` in their chunks.

## 2. Function types and remote type references

The converter is extended to handle two more typespec shapes:

* Function types `(args -> result)` parse as a single-element list containing `{:->, _, [args, return]}`. The handler calls `Descr.fun/2` (or `Descr.fun/0` for variadic `... -> result`).
* Remote type references `Mod.t()` are resolved by reading the target's ExCk chunk via `:beam_lib.chunks` on the in-memory binary (`:code.get_object_code/1`, not the disk path, which fails when the beam isn't written yet). `@opaque` is dynamic from outside the defining module; parametric arity > 0 degrades silently because the offending typespec lives in another module.

## 3. Same-compile-unit ordering for typespec refs

When module B's typespec references `A.t()`, B needs A's beam to be loaded by the time the converter runs. Without help, the parallel compiler doesn't see typespec references as deps and may schedule B before A is loaded.

`Kernel.Typespec.typespec/4`'s remote-call branch now calls `Kernel.ErrorHandler.ensure_compiled` to block until A's beam is available, the same mechanism struct expansion uses. A `:type_reference` trace event is also emitted via `:elixir_env.trace` so external tools (mix xref, custom tracers) can consume it; the lexical tracker itself is intentionally not wired in, since existing tests assert that typespecs don't add compile/runtime deps.

To make `fetch_remote_types` see the in-memory checker ETS from each spawn subprocess, `elixir_module.erl` propagates `:elixir_checker_info` into the subprocess's process dictionary.

## 4. Protocol t/0 union

Protocol consolidation now writes the union of every implementing struct/built-in type into the consolidated chunk's `:types[{:t, 0}]`. `Protocol.consolidate/5` extracts the domain already computed by `new_signatures/5` (which uses `Module.Types.Of.impl/2` for each impl) and stores it.
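
As a rough illustration of how such a union behaves, the sketch below folds per-impl domains together. `UnionSketch`, its tag sets, and the `:term` marker are hypothetical stand-ins invented for this example; they are not the `Module.Types.Descr` representation, and the real code reuses the domain from `new_signatures/5` instead of recomputing it.

```elixir
defmodule UnionSketch do
  # Hypothetical model: a domain is either :term (every value) or a
  # MapSet of type tags. This is NOT how Module.Types.Descr stores types.

  # An `Any` impl accepts every value, so its domain is the top type.
  def impl_domain(Any), do: :term
  def impl_domain(module) when is_atom(module), do: MapSet.new([module])

  # :term absorbs any other domain, so one `Any` impl collapses the union.
  def union(:term, _other), do: :term
  def union(_other, :term), do: :term
  def union(left, right), do: MapSet.union(left, right)

  # Fold all impl domains together. With no impls the result is the
  # empty set, the analogue of none() in this toy model.
  def protocol_t(impls) do
    Enum.reduce(impls, MapSet.new(), fn impl, acc ->
      union(acc, impl_domain(impl))
    end)
  end
end

UnionSketch.protocol_t([List, Tuple])
UnionSketch.protocol_t([Any, Atom]) #=> :term
UnionSketch.protocol_t([])
```

In this toy model the first call yields a set holding both tags, the second collapses to `:term` because of the `Any` impl, and the third yields the empty set.
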
The result is that references like `Enumerable.t()` in downstream typespecs become semantically meaningful: a value typed as `Enumerable.t()` is statically known to be one of the implementing types.

### Trade-off rationale for the protocol approach

Three alternatives were considered:

* **Rust `dyn Trait`**: keep the type opaque even after link-time vtable construction. Gives up refinement entirely.
* **Scala/Java/Dart/Swift sealed traits**: the author opts into a closed-world definition. Cleanest answer overall, but requires a language addition.
* **Closed-world per-build-artifact** (this commit): open by language design, closed by consolidation for the current artifact. The union is correct for this build and useful for downstream type-checking on subsequent compilations.

For protocols with `@fallback_to_any true` (most stdlib protocols), `Of.impl(Any, :open)` is `term()` and the union collapses to `term()`. This is technically correct, since those protocols accept any value at runtime, but it provides no useful refinement. Third-party protocols without the fallback get the full benefit.

## What this branch does NOT address

* **Same-run downstream re-check after consolidation.** Consolidation runs after the rest of compile, so dependents type-checked during the same `mix compile` saw the unconsolidated chunk. Precision arrives on subsequent compiles.
* **Parametric `t(element)` of protocols and aliases.** The consolidator writes only `{:t, 0}`. Parametric protocol types are a separate bullet in #15127.
* **INIT bootstrap modules' missing `:types`.** Modules in the very early bootstrap (Macro, Range, Keyword, etc., before `kernel/typespec.ex`) still don't get `:types`. Moving `Module.Types.Descr` to INIT would unlock them but is a much larger reshuffling.

## Verification

* 508 checker tests pass (+15 from new typed-struct, function-type, remote-ref, dep-tracking, and protocol-union tests).
* 79 top-level typespec tests pass.
* 26 Protocol tests pass.
* 36 lexical tracker tests pass.
* 55 Mix.Dep tests pass. * `make clean && make compile` produces no new stdlib warnings. Signed-off-by: Chris Garvis --- lib/elixir/lib/kernel/typespec.ex | 71 +++- lib/elixir/lib/module/parallel_checker.ex | 45 +++ lib/elixir/lib/module/types/of.ex | 86 ++++- lib/elixir/lib/module/types/typespec.ex | 281 ++++++++++++++++ lib/elixir/lib/protocol.ex | 51 ++- lib/elixir/src/elixir_compiler.erl | 1 + lib/elixir/src/elixir_erl.erl | 19 +- lib/elixir/src/elixir_module.erl | 4 + .../elixir/module/types/integration_test.exs | 315 +++++++++++++++++- .../elixir/module/types/typespec_test.exs | 182 ++++++++++ 10 files changed, 1029 insertions(+), 26 deletions(-) create mode 100644 lib/elixir/lib/module/types/typespec.ex create mode 100644 lib/elixir/test/elixir/module/types/typespec_test.exs diff --git a/lib/elixir/lib/kernel/typespec.ex b/lib/elixir/lib/kernel/typespec.ex index 8c458c3267d..68a031faab2 100644 --- a/lib/elixir/lib/kernel/typespec.ex +++ b/lib/elixir/lib/kernel/typespec.ex @@ -225,7 +225,7 @@ defmodule Kernel.Typespec do ## Translation from Elixir AST to typespec AST @doc false - def translate_typespecs_for_module(_set, bag) do + def translate_typespecs_for_module(set, bag) do type_typespecs = take_typespecs(bag, [:type, :opaque, :typep]) defined_type_pairs = collect_defined_type_pairs(type_typespecs) @@ -246,9 +246,68 @@ defmodule Kernel.Typespec do optional_callbacks = :lists.flatten(get_typespecs(bag, :optional_callbacks)) used_types = filter_used_types(types, state) + store_types_descr(set, type_typespecs) + {used_types, specs, callbacks, macrocallbacks, optional_callbacks} end + # Convert `@type`/`@opaque`/`@typep` ASTs into `Module.Types.Descr` + # values and store them in the module's data table under + # `{:elixir, :types_descr}`. The checker reads this table to + # resolve struct field types. + # + # Parametric types (arity > 0) are skipped — they are not yet + # representable in `Descr`. Cycles raise a compile error. 
+ defp store_types_descr(_set, []), do: :ok + + defp store_types_descr(set, type_typespecs) do + # Skip if the converter isn't available yet (early bootstrap + # compilation order). The conversion is best-effort metadata for + # the checker; missing it just means struct field types fall back + # to `dynamic()` for those modules. + if :code.ensure_loaded(Module.Types.Typespec) == {:module, Module.Types.Typespec} do + {descr_map, _} = + :lists.foldl(&convert_type_to_descr/2, {%{}, %{}}, type_typespecs) + + :ets.insert(set, {{:elixir, :types_descr}, descr_map}) + end + + :ok + end + + # Convert one type's AST into a `Descr` and accumulate into `defined`. + # The type currently being converted is marked `:pending` so a + # self-reference inside its own body can be detected. Both recursive + # and parametric references degrade the whole type to `dynamic()` — + # those features will be lit up by later PRs; for now the alias is + # stored but treated as opaque-from-the-checker's-view. + defp convert_type_to_descr({kind, expr, pos}, {acc, defined}) do + with {:"::", _, [{name, _meta, args}, definition]} <- expr, + arity = arg_count(args), + true <- arity == 0 do + env = :elixir_module.get_cached_env(pos) + state = %{module: env.module, defined: Map.put(defined, {name, 0}, :pending)} + + descr = + case Module.Types.Typespec.to_descr(definition, state) do + {:ok, descr} -> descr + {:error, _reason} -> Module.Types.Descr.dynamic() + end + + entry = {descr_kind_for(kind), descr} + {Map.put(acc, {name, 0}, entry), Map.put(defined, {name, 0}, entry)} + else + _ -> {acc, defined} + end + end + + defp arg_count(args) when is_atom(args), do: 0 + defp arg_count(args) when is_list(args), do: length(args) + + defp descr_kind_for(:type), do: :type + defp descr_kind_for(:typep), do: :typep + defp descr_kind_for(:opaque), do: :opaque + defp collect_defined_type_pairs(type_typespecs) do fun = fn {_kind, expr, pos}, type_pairs -> %{file: file, line: line} = env = 
:elixir_module.get_cached_env(pos) @@ -801,6 +860,16 @@ defmodule Kernel.Typespec do typespec({name, meta, args}, vars, caller, state) true -> + :elixir_env.trace({:type_reference, meta, remote, {name, length(args)}}, caller) + + # Ensure the referenced module is compiled before we proceed, so that + # store_types_descr / fetch_remote_types can read the ExCk chunk from + # the in-memory binary. This mirrors how struct expansion waits for its + # module via Kernel.ErrorHandler.ensure_compiled. + if :erlang.get(:elixir_compiler_info) != :undefined do + Kernel.ErrorHandler.ensure_compiled(remote, :module, :soft, caller.line) + end + {remote_spec, state} = typespec(remote, vars, caller, state) {name_spec, state} = typespec(name, vars, caller, state) type = {remote_spec, meta, name_spec, args} diff --git a/lib/elixir/lib/module/parallel_checker.ex b/lib/elixir/lib/module/parallel_checker.ex index 4c5b1bee9ec..fbe2d4c5d5f 100644 --- a/lib/elixir/lib/module/parallel_checker.ex +++ b/lib/elixir/lib/module/parallel_checker.ex @@ -482,6 +482,7 @@ defmodule Module.ParallelChecker do for({function, :def, _meta, _clauses} <- map.definitions, do: function) cache_info(table, map.module, exports, Map.new(map.deprecated), signatures) + cache_types_descr_from_data_tables(table, map.module) {elixir_mode(map.attributes), module_map_to_module_tuple(map)} end @@ -492,6 +493,46 @@ defmodule Module.ParallelChecker do end) end + # Snapshot the module's `{:elixir, :types_descr}` ETS entry into the + # long-lived checker table so the checker can resolve type aliases + # after the module's own data tables have been torn down. 
+ defp cache_types_descr_from_data_tables(table, module) do + try do + {set, _bag} = :elixir_module.data_tables(module) + + case :ets.lookup(set, {:elixir, :types_descr}) do + [{_, descr_map}] when is_map(descr_map) -> + Enum.each(descr_map, fn {{name, arity}, entry} -> + :ets.insert(table, {{module, :type, name, arity}, entry}) + end) + + _ -> + :ok + end + catch + _, _ -> :ok + end + end + + @doc """ + Returns the `Descr` for a user-declared `@type`/`@opaque` if cached. + + Returns `nil` when the alias is unknown, parametric (arity > 0), or + the cache is unavailable. Returns the `:opaque`/`:type` entry tuple + so callers can implement opacity rules. + """ + def fetch_type(nil, _module, _name, _arity), do: nil + def fetch_type(:none, _module, _name, _arity), do: nil + + def fetch_type({_checker, table}, module, name, arity) + when is_atom(module) and is_atom(name) and is_integer(arity) do + case :ets.lookup(table, {module, :type, name, arity}) do + [{_, entry}] -> entry + _ -> nil + end + end + + defp cache_chunk(table, module, contents) do Enum.each(contents.exports, fn {{fun, arity}, info} -> sig = @@ -506,6 +547,10 @@ defmodule Module.ParallelChecker do ) end) + for {{name, arity}, entry} <- Map.get(contents, :types, %{}) do + :ets.insert(table, {{module, :type, name, arity}, entry}) + end + Map.get(contents, :mode, :elixir) end diff --git a/lib/elixir/lib/module/types/of.ex b/lib/elixir/lib/module/types/of.ex index 609dd7eedf3..9bb29282bfd 100644 --- a/lib/elixir/lib/module/types/of.ex +++ b/lib/elixir/lib/module/types/of.ex @@ -464,7 +464,6 @@ defmodule Module.Types.Of do This is expanded and validated by the compiler, so don't need to check the fields. 
""" - # TODO: Type check the fields match the struct def struct_instance(struct, args, expected, meta, stack, context, of_fun) when is_atom(struct) do {info, context} = struct_info(struct, :expr, meta, stack, context) @@ -473,22 +472,76 @@ defmodule Module.Types.Of do raise "expected #{inspect(struct)} to return struct metadata, but got none" end + typed_fields = fetch_struct_type_descr(struct, stack) + defaults_by_field = struct_defaults_by_field(info) + # The compiler has already checked the keys are atoms and which ones are required. {args_types, context} = Enum.map_reduce(args, context, fn {key, value}, context when is_atom(key) -> + typed_field_type = typed_field(typed_fields, key) + value_type = - case map_fetch_key(expected, key) do - {_, expected_value_type} -> expected_value_type - _ -> term() + if typed_field_type != nil do + typed_field_type + else + case map_fetch_key(expected, key) do + {_, expected_value_type} -> expected_value_type + _ -> term() + end end {type, context} = of_fun.(value, value_type, stack, context) + + context = + cond do + typed_field_type == nil -> context + # The compiler injects defaults into `args`. Don't warn for + # values that are exactly the defstruct default — those are + # not user-authored and would surface a noisy diagnostic at + # every struct construction site. 
+ value == Map.get(defaults_by_field, key, :__no_default__) -> context + compatible?(type, typed_field_type) -> context + true -> + error = + {:badstructfield, struct, key, value, typed_field_type, type, context} + + error(error, meta, stack, context) + end + {{key, type}, context} end) {closed_map([{:__struct__, atom([struct])} | args_types]), context} end + defp typed_field(nil, _key), do: nil + + defp typed_field(descr, key) do + case map_fetch_key(descr, key) do + {_optional?, type} -> type + _ -> nil + end + end + + defp struct_defaults_by_field(info) do + Map.new(info, fn %{field: field, default: default} -> {field, default} end) + end + + # Look up the `t/0` typespec Descr for `struct` from the parallel + # checker cache. The cache snapshot is written during + # `cache_from_module_map` (parallel_checker.ex) so it survives the + # teardown of the module's compile-time data tables. + # + # `@opaque t :: ...` is strict only inside the defining module; from + # any other module it is treated as `dynamic()` (i.e. nil here). + defp fetch_struct_type_descr(struct, stack) do + case Module.ParallelChecker.fetch_type(stack.cache, struct, :t, 0) do + {:type, descr} -> descr + {:opaque, descr} when struct == stack.module -> descr + _ -> nil + end + end + @doc """ Returns `__info__(:struct)` information about a struct. 
""" @@ -872,6 +925,31 @@ defmodule Module.Types.Of do } end + def format_diagnostic({:badstructfield, module, field, expr, expected_type, actual_type, context}) do + traces = collect_traces(expr, context) + + %{ + details: %{typing_traces: traces}, + message: + IO.iodata_to_binary([ + """ + incompatible value for field #{inspect(field)} of struct #{inspect(module)}: + + #{expr_to_string(expr) |> indent(4)} + + got type: + + #{to_quoted_string(actual_type) |> indent(4)} + + expected type: + + #{to_quoted_string(expected_type) |> indent(4)} + """, + format_traces(traces) + ]) + } + end + defp dot_var?(expr) do match?({{:., _, [var, _fun]}, _, _args} when is_var(var), expr) end diff --git a/lib/elixir/lib/module/types/typespec.ex b/lib/elixir/lib/module/types/typespec.ex new file mode 100644 index 00000000000..013d93092e5 --- /dev/null +++ b/lib/elixir/lib/module/types/typespec.ex @@ -0,0 +1,281 @@ +# SPDX-License-Identifier: Apache-2.0 +# SPDX-FileCopyrightText: 2021 The Elixir Team + +defmodule Module.Types.Typespec do + @moduledoc false + + # Converts a typespec AST to a `Module.Types.Descr` value. + # + # The conversion assumes the AST has already been validated by + # `Kernel.Typespec`. Unsupported subterms degrade to `dynamic()` + # so a single unrecognized form does not poison the whole alias. + # + # Two errors surface up the call stack instead of degrading: + # + # * `{:cycle, name, arity}` — re-entering an alias still marked + # `:pending` in `defined`. Recursive aliases are deferred to + # a future release. + # + # * `{:parametric_unsupported, name, arity}` — a reference to an + # alias with arity > 0. Parametric types are deferred. + + import Module.Types.Descr + + @elixir_checker_version :elixir_erl.checker_version() + + @doc """ + Convert `ast` to a `Descr`, using `state.defined` to resolve local references. + + `state` must contain: + + * `:module` — the module being compiled, used to expand `__MODULE__`. 
+ * `:defined` — map of `{name, arity} => :pending | {kind, descr}`. + + Returns `{:ok, descr}` or `{:error, reason}`. + """ + def to_descr(ast, state) do + try do + {:ok, do_to_descr(ast, state)} + catch + {:to_descr, reason} -> {:error, reason} + end + end + + # Unions + defp do_to_descr({:|, _, [left, right]}, state) do + left_descr = do_to_descr(left, state) + right_descr = do_to_descr(right, state) + union(left_descr, right_descr) + end + + # Parenthesized / annotated forms — strip and recurse. + defp do_to_descr({:"::", _, [_var, ast]}, state), do: do_to_descr(ast, state) + + # Atom literals + defp do_to_descr(atom, _state) when is_atom(atom) do + case atom do + :any -> term() + :term -> term() + _ -> atom([atom]) + end + end + + # Integer literals — represent as the full integer() set for now. + # Singleton integer types are not yet represented in Descr. + defp do_to_descr(int, _state) when is_integer(int), do: integer() + + # Empty list literal + defp do_to_descr([], _state), do: empty_list() + + # Function spec: `(args -> result)` is parsed as `[{:->, _, [args, result]}]`. + defp do_to_descr([{:->, _, [args, return]}], state) when is_list(args) do + cond do + # `(... -> result)` — variable arity. Not statically representable; + # degrade to the top function type. + Enum.any?(args, &match?({:..., _, _}, &1)) -> + fun() + + true -> + arg_types = Enum.map(args, &do_to_descr(&1, state)) + return_type = do_to_descr(return, state) + fun(arg_types, return_type) + end + end + + # Non-empty proper list literal: [type] + defp do_to_descr([elem], state), do: list(do_to_descr(elem, state)) + + # Tuple literals: {a, b} is parsed as {:{}, _, [...]} only for arity != 2. + # 2-tuples come through as plain {a, b}. 
+ defp do_to_descr({left, right}, state) do + tuple([do_to_descr(left, state), do_to_descr(right, state)]) + end + + defp do_to_descr({:{}, _, elements}, state) do + tuple(Enum.map(elements, &do_to_descr(&1, state))) + end + + # Empty map literal + defp do_to_descr({:%{}, _, []}, _state), do: empty_map() + + # Map literal with keyword-style atom keys: %{a: integer(), b: atom()} + defp do_to_descr({:%{}, _, pairs}, state) do + converted = + for {k, v} <- pairs, is_atom(k) do + {k, do_to_descr(v, state)} + end + + if length(converted) == length(pairs) do + closed_map(converted) + else + # Non-atom keys: not handled yet, fall back to open_map(). + open_map() + end + end + + # Struct literal: %Mod{field: type, ...} + defp do_to_descr({:%, _, [module_ast, {:%{}, _, pairs}]}, state) do + module = + case module_ast do + {:__MODULE__, _, _} -> state.module + {:__aliases__, _, parts} -> Module.concat(parts) + mod when is_atom(mod) -> mod + _ -> nil + end + + if is_atom(module) and module != nil do + field_pairs = + for {k, v} <- pairs, is_atom(k) do + {k, do_to_descr(v, state)} + end + + closed_map([{:__struct__, atom([module])} | field_pairs]) + else + dynamic() + end + end + + # Remote type reference: Mod.name(args). + defp do_to_descr({{:., _, [module_ast, name]}, _, args}, state) + when is_atom(name) and is_list(args) do + arity = length(args) + + case expand_module(module_ast) do + nil -> + dynamic() + + module when module == state.module -> + # Self-qualified reference — treat as a local lookup. + local_or_pending(name, arity, state) + + module when is_atom(module) -> + resolve_remote(module, name, arity) + end + end + + # Built-in type calls and local references: name(arg1, arg2, ...). + + defp do_to_descr({name, _meta, args}, state) when is_atom(name) and (is_list(args) or is_atom(args)) do + arg_list = if is_list(args), do: args, else: [] + arity = length(arg_list) + builtin(name, arity, arg_list, state) + end + + # Anything else — degrade. 
+ defp do_to_descr(_ast, _state), do: dynamic() + + ## Built-ins and local references + + defp builtin(:integer, 0, [], _state), do: integer() + defp builtin(:float, 0, [], _state), do: float() + defp builtin(:atom, 0, [], _state), do: atom() + defp builtin(:boolean, 0, [], _state), do: boolean() + defp builtin(:pid, 0, [], _state), do: pid() + defp builtin(:port, 0, [], _state), do: port() + defp builtin(:reference, 0, [], _state), do: reference() + defp builtin(:binary, 0, [], _state), do: binary() + defp builtin(:bitstring, 0, [], _state), do: bitstring() + defp builtin(:any, 0, [], _state), do: term() + defp builtin(:term, 0, [], _state), do: term() + defp builtin(:none, 0, [], _state), do: none() + defp builtin(:no_return, 0, [], _state), do: none() + defp builtin(:map, 0, [], _state), do: open_map() + defp builtin(:tuple, 0, [], _state), do: tuple() + + defp builtin(:list, 0, [], _state), do: list(term()) + defp builtin(:list, 1, [elem], state), do: list(do_to_descr(elem, state)) + + defp builtin(:non_empty_list, 0, [], _state), do: non_empty_list(term()) + defp builtin(:non_empty_list, 1, [elem], state), do: non_empty_list(do_to_descr(elem, state)) + + # Local reference into the typespec table. + defp builtin(name, arity, _args, state) do + local_or_pending(name, arity, state) + end + + ## Local + remote reference resolution + + defp local_or_pending(name, arity, state) do + case Map.get(state.defined, {name, arity}) do + nil -> + # Unknown name — could be a remote built-in we haven't covered, + # an as-yet-undefined helper, or a typevar. Degrade. + dynamic() + + :pending -> + throw({:to_descr, {:cycle, name, arity}}) + + _ when arity > 0 -> + throw({:to_descr, {:parametric_unsupported, name, arity}}) + + {_kind, descr} -> + descr + end + end + + defp expand_module({:__aliases__, _, parts}), do: Module.concat(parts) + defp expand_module(mod) when is_atom(mod), do: mod + defp expand_module(_), do: nil + + # Parametric types are unsupported. 
We degrade rather than throw + # because the offending typespec lives in another module — we can't + # surface a useful error at this site. + defp resolve_remote(_module, _name, arity) when arity > 0, do: dynamic() + + defp resolve_remote(module, name, 0) do + case fetch_remote_types(module) do + %{{^name, 0} => {:type, descr}} -> descr + # @opaque is dynamic from outside its defining module. + %{{^name, 0} => {:opaque, _descr}} -> dynamic() + _ -> dynamic() + end + end + + defp fetch_remote_types(module) do + # Try the parallel checker ETS first (same-compilation-unit modules that + # were compiled earlier in the same run). Fall back to the in-memory beam + # binary for cross-app references where the module is already loaded in the VM. + case fetch_remote_types_from_checker(module) do + {:ok, types} -> types + :not_found -> fetch_remote_types_from_beam(module) + end + end + + # Look up types from the parallel checker ETS table. Returns `{:ok, map}` + # when type entries for the module are present in the checker table (written + # by `cache_from_module_map` during a sibling module's `spawn_parallel_checker` + # call), or `:not_found` if no type entries exist yet or the checker is unavailable. + # + # Note: the `{module, mode}` mode entry is written later (when :start is called), + # but `{module, :type, name, arity}` entries are written eagerly by + # `cache_types_descr_from_data_tables` inside `cache_from_module_map`. We + # therefore check for the presence of any type entry rather than the mode entry. 
+ defp fetch_remote_types_from_checker(module) do + case :erlang.get(:elixir_checker_info) do + {_parent, {_checker, table}} -> + pairs = :ets.match(table, {{module, :type, :"$1", :"$2"}, :"$3"}) + + case pairs do + [] -> + :not_found + + _ -> + {:ok, Map.new(pairs, fn [name, arity, entry] -> {{name, arity}, entry} end)} + end + + _ -> + :not_found + end + end + + defp fetch_remote_types_from_beam(module) do + with {:module, ^module} <- :code.ensure_loaded(module), + {^module, binary, _path} <- :code.get_object_code(module), + {:ok, {^module, [{~c"ExCk", chunk}]}} <- :beam_lib.chunks(binary, [~c"ExCk"]), + {@elixir_checker_version, contents} <- :erlang.binary_to_term(chunk) do + Map.get(contents, :types, %{}) + else + _ -> %{} + end + end +end diff --git a/lib/elixir/lib/protocol.ex b/lib/elixir/lib/protocol.ex index b1536c5a4ba..ade3fea631e 100644 --- a/lib/elixir/lib/protocol.ex +++ b/lib/elixir/lib/protocol.ex @@ -645,17 +645,33 @@ defmodule Protocol do checker = if checker do - update_in(checker.exports, fn exports -> - signatures = new_signatures(definitions, protocol_funs, protocol, types, structs) - - for {fun, info} <- exports do - if sig = Map.get(signatures, fun) do - {fun, %{info | sig: sig}} - else - {fun, info} + {domain, signatures} = + new_signatures(definitions, protocol_funs, protocol, types, structs) + + checker = + update_in(checker.exports, fn exports -> + for {fun, info} <- exports do + if sig = Map.get(signatures, fun) do + {fun, %{info | sig: sig}} + else + {fun, info} + end end - end - end) + end) + + # After consolidation, `t/0` is the union of every type that + # implements the protocol. This makes `Protocol.t()` references + # in downstream typespecs semantically meaningful — a value + # typed as `Enumerable.t()` is statically known to be one of + # the implementing struct or built-in types. 
+ # + # Note: protocols are open-world by language design (anyone can + # `defimpl`); this union is closed-world *for the current build + # artifact*. Downstream code sees the closed-world view only on + # compilations that follow consolidation. + existing_types = Map.get(checker, :types, %{}) + updated_types = Map.put(existing_types, {:t, 0}, {:type, domain}) + Map.put(checker, :types, updated_types) end {:ok, definitions, checker} @@ -716,12 +732,15 @@ defmodule Protocol do {fun_arity, {:strong, nil, [{[domain | rest], Descr.dynamic()}]}} end - Map.new( - [ - {{:impl_for, 1}, {:strong, [Descr.term()], impl_for}}, - {{:impl_for!, 1}, {:strong, [domain], impl_for!}} - ] ++ new_signatures - ) + signatures = + Map.new( + [ + {{:impl_for, 1}, {:strong, [Descr.term()], impl_for}}, + {{:impl_for!, 1}, {:strong, [domain], impl_for!}} + ] ++ new_signatures + ) + + {domain, signatures} end defp get_protocol_functions({_name, _kind, _meta, clauses}) do diff --git a/lib/elixir/src/elixir_compiler.erl b/lib/elixir/src/elixir_compiler.erl index a94df794a41..7e03cabc73c 100644 --- a/lib/elixir/src/elixir_compiler.erl +++ b/lib/elixir/src/elixir_compiler.erl @@ -222,6 +222,7 @@ bootstrap_files() -> <<"module/behaviour.ex">>, <<"module/types/helpers.ex">>, <<"module/types/descr.ex">>, + <<"module/types/typespec.ex">>, <<"module/types/of.ex">>, <<"module/types/pattern.ex">>, <<"module/types/apply.ex">>, diff --git a/lib/elixir/src/elixir_erl.erl b/lib/elixir/src/elixir_erl.erl index b7f401ba688..bc88908b0f8 100644 --- a/lib/elixir/src/elixir_erl.erl +++ b/lib/elixir/src/elixir_erl.erl @@ -171,9 +171,16 @@ compile(#{module := Module, anno := Anno} = BaseMap, Signatures) -> ChunkOpts = chunk_opts(Map), DocsChunk = docs_chunk(Map, Set, Module, Anno, Def, Defmacro, Types, Callbacks, ChunkOpts), - CheckerChunk = checker_chunk(Map, Def, Signatures, ChunkOpts), + TypesDescr = types_descr_for_chunk(Set), + CheckerChunk = checker_chunk(Map, Def, Signatures, TypesDescr, ChunkOpts), 
load_form(Map, Prefix, Forms, TypeSpecs, DocsChunk ++ CheckerChunk). +types_descr_for_chunk(Set) -> + case ets:lookup(Set, {elixir, types_descr}) of + [{_, Map}] when is_map(Map) -> Map; + _ -> #{} + end. + chunk_opts(Map) -> case lists:member(deterministic, ?key(Map, compile_opts)) of true -> [deterministic]; @@ -649,7 +656,7 @@ signature_to_binary(_, Name, Signature) -> Doc = 'Elixir.Inspect.Algebra':format('Elixir.Code':quoted_to_algebra(Quoted), infinity), 'Elixir.IO':iodata_to_binary(Doc). -checker_chunk(Map, Def, Signatures, ChunkOpts) -> +checker_chunk(Map, Def, Signatures, TypesDescr, ChunkOpts) -> #{deprecated := Deprecated, defines_behaviour := DefinesBehaviour, attributes := Attributes} = Map, DeprecatedMap = maps:from_list(Deprecated), @@ -663,7 +670,7 @@ checker_chunk(Map, Def, Signatures, ChunkOpts) -> {FA, Info} end || {FA, _Meta} <- prepend_behaviour_info(DefinesBehaviour, Def)], - Contents = #{ + BaseContents = #{ exports => Exports, mode => case lists:keymember('__protocol__', 1, Attributes) of true -> protocol; @@ -671,6 +678,12 @@ checker_chunk(Map, Def, Signatures, ChunkOpts) -> end }, + %% Optional additive key — older readers ignore it, no version bump. + Contents = case map_size(TypesDescr) of + 0 -> BaseContents; + _ -> BaseContents#{types => TypesDescr} + end, + checker_chunk(Contents, ChunkOpts). 
prepend_behaviour_info(true, Def) -> [{{behaviour_info, 1}, []} | Def]; diff --git a/lib/elixir/src/elixir_module.erl b/lib/elixir/src/elixir_module.erl index 419d361fe9c..1294862cc26 100644 --- a/lib/elixir/src/elixir_module.erl +++ b/lib/elixir/src/elixir_module.erl @@ -171,6 +171,10 @@ compile(Meta, Module, ModuleAsCharlist, Block, Vars, Prune, E) -> undefined -> undefined; _ -> erlang:put(elixir_compiler_info, CompilerInfo) end, + case CheckerInfo of + {_, nil} -> ok; + _ -> erlang:put(elixir_checker_info, CheckerInfo) + end, PersistedAttributes = ets:lookup_element(DataBag, persisted_attributes, 2), Attributes = attributes(DataSet, DataBag, PersistedAttributes), diff --git a/lib/elixir/test/elixir/module/types/integration_test.exs b/lib/elixir/test/elixir/module/types/integration_test.exs index abd8a325720..511eed85f43 100644 --- a/lib/elixir/test/elixir/module/types/integration_test.exs +++ b/lib/elixir/test/elixir/module/types/integration_test.exs @@ -85,6 +85,44 @@ defmodule Module.Types.IntegrationTest do ] end + test "writes typed struct fields under :types" do + files = %{ + "user.ex" => """ + defmodule UserT do + defstruct [:name, :age] + @type t :: %__MODULE__{name: binary(), age: integer()} + end + """ + } + + modules = compile_modules(files) + chunk = read_chunk(modules[UserT]) + + assert %{types: %{{:t, 0} => {:type, descr}}} = chunk + + # Round-trip sanity: the Descr we read back must answer field queries. 
+ assert {_, age_type} = map_fetch_key(descr, :age) + assert equal?(age_type, integer()) + + assert {_, name_type} = map_fetch_key(descr, :name) + assert equal?(name_type, binary()) + end + + test "no :types key when module has no @type" do + files = %{ + "plain.ex" => """ + defmodule Plain do + def x, do: :ok + end + """ + } + + modules = compile_modules(files) + chunk = read_chunk(modules[Plain]) + + refute Map.has_key?(chunk, :types) + end + test "writes exports for implementations" do files = %{ "pi.ex" => """ @@ -154,6 +192,80 @@ defmodule Module.Types.IntegrationTest do assert itself_arg.(Itself.Unknown) == dynamic(open_map(__struct__: atom([Unknown]))) end + + test "consolidation writes union of impl types into protocol's t/0" do + files = %{ + "pi.ex" => """ + defprotocol Sized do + def size(data) + end + + defimpl Sized, for: List do + def size(data), do: length(data) + end + + defimpl Sized, for: Tuple do + def size(data), do: tuple_size(data) + end + """ + } + + modules = compile_modules(files, consolidate_protocols: true) + chunk = read_chunk(modules[Sized]) + + assert %{types: %{{:t, 0} => {:type, t_descr}}} = chunk + + # The union of all impls: List ∪ Tuple in their open form. + expected = + union( + union(empty_list(), non_empty_list(term(), term())), + tuple() + ) + + assert equal?(t_descr, expected) + end + + test "@fallback_to_any protocol consolidates t/0 to term()" do + files = %{ + "pi.ex" => """ + defprotocol Fallback do + @fallback_to_any true + def describe(data) + end + + defimpl Fallback, for: Any do + def describe(_), do: "any" + end + + defimpl Fallback, for: Atom do + def describe(data), do: Atom.to_string(data) + end + """ + } + + modules = compile_modules(files, consolidate_protocols: true) + chunk = read_chunk(modules[Fallback]) + + assert %{types: %{{:t, 0} => {:type, t_descr}}} = chunk + # `Any` in the impl list makes the protocol accept every value. 
+ assert equal?(t_descr, term()) + end + + test "protocol with no impls consolidates t/0 to none()" do + files = %{ + "pi.ex" => """ + defprotocol NoImpls do + def call(data) + end + """ + } + + modules = compile_modules(files, consolidate_protocols: true) + chunk = read_chunk(modules[NoImpls]) + + assert %{types: %{{:t, 0} => {:type, t_descr}}} = chunk + assert equal?(t_descr, none()) + end end describe "type checking" do @@ -1724,6 +1836,203 @@ defmodule Module.Types.IntegrationTest do end end + describe "typed struct fields via @type t" do + test "field type from @type t is used when constructing %__MODULE__{}" do + files = %{ + "user.ex" => """ + defmodule User do + defstruct [:name, :age] + @type t :: %__MODULE__{name: binary(), age: integer()} + + def make_bad, do: %__MODULE__{age: :not_an_int} + end + """ + } + + warnings = ["expected type:", "integer()", ":not_an_int"] + assert_warnings(files, warnings) + end + + test "struct without @type t falls back to dynamic() per field (no warning)" do + files = %{ + "user.ex" => """ + defmodule User do + defstruct [:name, :age] + + def make_bad, do: %__MODULE__{age: :not_an_int} + end + """ + } + + assert_no_warnings(files) + end + + test "field not declared in @type t body stays dynamic() (no spurious warning)" do + files = %{ + "user.ex" => """ + defmodule User do + defstruct [:name, :age] + # `age` is intentionally absent from t + @type t :: %__MODULE__{name: binary()} + + def make_anything, do: %__MODULE__{age: :anything} + end + """ + } + + assert_no_warnings(files) + end + + test "@opaque t is strict inside its defining module" do + files = %{ + "user.ex" => """ + defmodule User do + defstruct [:name, :age] + @opaque t :: %__MODULE__{name: binary(), age: integer()} + + def make_bad, do: %__MODULE__{age: :not_an_int} + end + """ + } + + warnings = ["expected type:", "integer()", ":not_an_int"] + assert_warnings(files, warnings) + end + + test "cross-module: @type t resolved via ExCk chunk" do + files = %{ + 
"user.ex" => """ + defmodule User do + defstruct [:name, :age] + @type t :: %__MODULE__{name: binary(), age: integer()} + end + """, + "caller.ex" => """ + defmodule Caller do + def go, do: %User{age: :not_an_int} + end + """ + } + + warnings = ["expected type:", "integer()", ":not_an_int"] + assert_warnings(files, warnings) + end + + test "function-typed field: wrong arity warns" do + files = %{ + "handler.ex" => """ + defmodule Handler do + defstruct [:cb] + @type t :: %__MODULE__{cb: (integer() -> :ok)} + + # &Kernel.+/2 has arity 2, expected arity 1 + def make_bad, do: %__MODULE__{cb: &Kernel.+/2} + end + """ + } + + assert_warnings(files, ["expected type:", "&Kernel.+/2"]) + end + + test "remote type reference resolves through ExCk (stdlib type)" do + # `String.t :: binary()` ships in the stdlib ExCk chunk. From any + # other module, `String.t()` resolves to the stored `binary()` + # descriptor. + # + # Same-compile-unit type-only references aren't yet supported + # because type references aren't tracked as compile dependencies + # — see #15127's second sub-bullet (dep tracking). + files = %{ + "user_string.ex" => """ + defmodule UserString do + defstruct [:name] + @type t :: %__MODULE__{name: String.t()} + + def make_bad, do: %__MODULE__{name: 42} + end + """ + } + + warnings = ["expected type:", "binary()", "42"] + assert_warnings(files, warnings) + end + + test "remote type reference resolves through ExCk (same compilation unit)" do + # When A and B are in the same compilation unit and B references A.t() + # in a typespec without any runtime call, the dep tracking must ensure + # A compiles before B so B's typespec translation reads A's beam correctly. 
+ files = %{ + "id.ex" => """ + defmodule Id do + @type t :: integer() + end + """, + "user_remote.ex" => """ + defmodule UserRemote do + defstruct [:id] + @type t :: %__MODULE__{id: Id.t()} + + def make_bad, do: %__MODULE__{id: :not_int} + end + """ + } + + warnings = ["expected type:", "integer()", ":not_int"] + assert_warnings(files, warnings) + end + + test "opaque remote type degrades to dynamic outside its module (no warning)" do + # `Version.Requirement.t` is `@opaque` in stdlib. From outside its + # defining module, it must degrade to dynamic — any value should + # type-check against the field without warning. + files = %{ + "uses_opaque.ex" => """ + defmodule UsesOpaque do + defstruct [:req] + @type t :: %__MODULE__{req: Version.Requirement.t()} + + def make_anything, do: %__MODULE__{req: :anything} + end + """ + } + + assert_no_warnings(files) + end + + test "function-typed field: compatible function does not warn" do + files = %{ + "handler.ex" => """ + defmodule Handler do + defstruct [:cb] + @type t :: %__MODULE__{cb: (atom() -> binary())} + + def make_ok, do: %__MODULE__{cb: &Atom.to_string/1} + end + """ + } + + assert_no_warnings(files) + end + + test "cross-module: @opaque t treated as dynamic() outside defining module (no warning)" do + files = %{ + "user.ex" => """ + defmodule User do + defstruct [:name, :age] + @opaque t :: %__MODULE__{name: binary(), age: integer()} + end + """, + "caller.ex" => """ + defmodule Caller do + def go, do: %User{age: :not_an_int} + end + """ + } + + assert_no_warnings(files) + end + end + defp assert_warnings(files, expected, opts \\ []) defp assert_warnings(files, expected, opts) when is_binary(expected) do @@ -1756,10 +2065,12 @@ defmodule Module.Types.IntegrationTest do end) end - defp compile_modules(files) do + defp compile_modules(files), do: compile_modules(files, []) + + defp compile_modules(files, opts) do in_tmp(fn -> paths = generate_files(files) - {modules, _warnings} = compile_to_path(paths, []) + 
{modules, _warnings} = compile_to_path(paths, opts) Map.new(modules, fn module -> {^module, binary, _filename} = :code.get_object_code(module) diff --git a/lib/elixir/test/elixir/module/types/typespec_test.exs b/lib/elixir/test/elixir/module/types/typespec_test.exs new file mode 100644 index 00000000000..3b51d6d1398 --- /dev/null +++ b/lib/elixir/test/elixir/module/types/typespec_test.exs @@ -0,0 +1,182 @@ +# SPDX-License-Identifier: Apache-2.0 +# SPDX-FileCopyrightText: 2021 The Elixir Team + +Code.require_file("type_helper.exs", __DIR__) + +defmodule Module.Types.TypespecTest.SomeStruct do + defstruct [:name, :age] +end + +defmodule Module.Types.TypespecTest do + use ExUnit.Case, async: true + + import Module.Types.Descr + alias Module.Types.Typespec + + defp to_descr(ast, defined \\ %{}) do + Typespec.to_descr(ast, %{module: __MODULE__, defined: defined}) + end + + describe "built-in types" do + test "atomic types" do + assert to_descr(quote(do: integer())) == {:ok, integer()} + assert to_descr(quote(do: float())) == {:ok, float()} + assert to_descr(quote(do: atom())) == {:ok, atom()} + assert to_descr(quote(do: boolean())) == {:ok, boolean()} + assert to_descr(quote(do: pid())) == {:ok, pid()} + assert to_descr(quote(do: port())) == {:ok, port()} + assert to_descr(quote(do: reference())) == {:ok, reference()} + assert to_descr(quote(do: binary())) == {:ok, binary()} + assert to_descr(quote(do: bitstring())) == {:ok, bitstring()} + end + + test "top and bottom" do + assert to_descr(quote(do: any())) == {:ok, term()} + assert to_descr(quote(do: term())) == {:ok, term()} + assert to_descr(quote(do: none())) == {:ok, none()} + assert to_descr(quote(do: no_return())) == {:ok, none()} + end + + test "collections" do + assert to_descr(quote(do: map())) == {:ok, open_map()} + assert to_descr(quote(do: tuple())) == {:ok, tuple()} + assert to_descr(quote(do: list(integer()))) == {:ok, list(integer())} + assert to_descr(quote(do: non_empty_list(integer()))) == {:ok, 
non_empty_list(integer())} + end + end + + describe "literals" do + test "atoms" do + assert to_descr(quote(do: :foo)) == {:ok, atom([:foo])} + assert to_descr(quote(do: nil)) == {:ok, atom([nil])} + assert to_descr(quote(do: true)) == {:ok, atom([true])} + end + + test "integers" do + assert to_descr(quote(do: 42)) == {:ok, integer()} + end + + test "empty list" do + assert to_descr(quote(do: [])) == {:ok, empty_list()} + end + + test "tuple literal" do + assert to_descr(quote(do: {integer(), atom()})) == + {:ok, tuple([integer(), atom()])} + end + + test "empty map literal" do + assert to_descr(quote(do: %{})) == {:ok, empty_map()} + end + end + + describe "structs" do + test "%__MODULE__{} expands to closed_map with struct tag" do + assert {:ok, descr} = + to_descr( + quote( + do: %Module.Types.TypespecTest.SomeStruct{name: binary(), age: integer()} + ) + ) + + assert equal?( + descr, + closed_map([ + {:__struct__, atom([Module.Types.TypespecTest.SomeStruct])}, + {:name, binary()}, + {:age, integer()} + ]) + ) + end + end + + describe "compositions" do + test "union" do + assert {:ok, descr} = to_descr(quote(do: integer() | atom())) + assert equal?(descr, union(integer(), atom())) + end + + test "three-way union" do + assert {:ok, descr} = to_descr(quote(do: integer() | atom() | binary())) + assert equal?(descr, union(integer(), union(atom(), binary()))) + end + end + + describe "local references" do + test "resolves a local @type alias" do + defined = %{{:id, 0} => {:type, integer()}} + assert to_descr(quote(do: id()), defined) == {:ok, integer()} + end + + test "resolves transitively through stored aliases" do + defined = %{ + {:id, 0} => {:type, integer()}, + {:maybe_id, 0} => {:type, union(integer(), atom([nil]))} + } + + assert {:ok, descr} = to_descr(quote(do: maybe_id()), defined) + assert equal?(descr, union(integer(), atom([nil]))) + end + end + + describe "function types" do + test "single-arg function" do + assert {:ok, descr} = to_descr(quote(do: 
(integer() -> integer()))) + assert equal?(descr, fun([integer()], integer())) + end + + test "multi-arg function" do + assert {:ok, descr} = to_descr(quote(do: (integer(), atom() -> :ok))) + assert equal?(descr, fun([integer(), atom()], atom([:ok]))) + end + + test "zero-arg function" do + assert {:ok, descr} = to_descr(quote(do: (-> :ok))) + assert equal?(descr, fun([], atom([:ok]))) + end + + test "variadic ... -> result degrades to top function" do + assert {:ok, descr} = to_descr(quote(do: (... -> integer()))) + assert equal?(descr, fun()) + end + end + + describe "errors" do + test "cycle is reported" do + defined = %{{:a, 0} => :pending} + assert {:error, {:cycle, :a, 0}} = to_descr(quote(do: a()), defined) + end + + test "parametric arity > 0 reference returns parametric_unsupported" do + defined = %{{:t, 1} => {:type, term()}} + assert {:error, {:parametric_unsupported, :t, 1}} = to_descr(quote(do: t(integer())), defined) + end + end + + describe "remote type references" do + test "resolves to the stored Descr via ExCk" do + # `String.t :: binary()` is a stable stdlib type that ships with + # an ExCk `:types` entry produced by this same converter. + assert {:ok, descr} = to_descr(quote(do: String.t())) + assert equal?(descr, binary()) + end + + test "unknown name in known module degrades to dynamic()" do + assert {:ok, descr} = to_descr(quote(do: String.nonexistent_type())) + assert equal?(descr, dynamic()) + end + + test "Erlang module reference degrades to dynamic()" do + # Erlang modules don't ship an ExCk chunk; we degrade quietly. + assert {:ok, descr} = to_descr(quote(do: :lists.boolean())) + assert equal?(descr, dynamic()) + end + + test "parametric remote ref degrades to dynamic() (graceful)" do + # The offending parametric typespec lives in another module; we + # can't throw a useful error from here, so degrade quietly. 
+      assert {:ok, descr} = to_descr(quote(do: Enumerable.t(integer())))
+      assert equal?(descr, dynamic())
+    end
+  end
+end

From 5d794ae0a91587d3ddd498b2429704fbc78161b8 Mon Sep 17 00:00:00 2001
From: Chris Garvis
Date: Tue, 12 May 2026 21:48:30 -0400
Subject: [PATCH 2/2] Polish: protocol false-positive fix, refactors, perf optimizations
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Follow-up commit on top of the type-aliases foundation: one bug fix,
refactors, performance optimizations, and a bootstrap fix. No new
features; the only user-visible changes are the suppressed false-positive
protocol warning and the reworded `:badstructfield` message.

## Bug fix: false-positive "protocol for missing module" warnings

`Protocol.__impl__!/3`'s missing-module check was treating transient
parallel-compiler deadlock states (`Code.ensure_compiled` returns
`{:error, :unavailable}`) as genuinely missing modules and emitting a
misleading warning. The new typespec `ensure_compiled` calls introduced
earlier in this series caused more of these transient states, surfacing
~33 false positives across the top 100 Hex packages.

Fix: distinguish `:unavailable` (transient; the compiler may still resolve
it) from `:nofile`/`:badfile`/`:embedded` (genuinely missing). Only the
latter group warrants the warning.

## Refactors (no behavior change)

* `Module.Types.Of.struct_instance/7`: extract a small helper for the
  typed-field-vs-expected fallback (`expected_field_type/2`).
* `Of.format_diagnostic({:badstructfield, ...})`: change message to
  "but expected type:" to match the convention used by `:badmap` and
  sibling diagnostics.
* `Module.Types.Typespec`: function-spec clause uses two pattern heads
  instead of a `cond`; collapsed an intermediate `case pairs` in
  `fetch_remote_types_from_checker`; reordered `do_to_descr/2` clauses
  so the hot remote-call shape matches earlier.
* `Kernel.Typespec.convert_type_to_descr/2`: inlined a bare `arity =`
  binding in a `with` chain.
* `Protocol.consolidate/5`: replaced the 3-line Map.get/put/put dance with
  `update_in`, matching the adjacent `update_in(checker.exports)` style.

## Performance (measurable)

* `Module.Types.Typespec.fetch_remote_types/1`: memoize via the process
  dictionary, scoped per compile worker. Eliminates redundant
  `:beam_lib.chunks` + `:erlang.binary_to_term` calls for the same
  remote module within one typespec block. Highest-payoff optimization
  for typespec-heavy modules (Ecto schemas reference `String.t()` etc.
  30+ times per module).
* `Kernel.Typespec.typespec/4` remote handler: dedupe
  `Kernel.ErrorHandler.ensure_compiled` calls per worker process via a
  map stored under the `:"$elixir_typespec_ensured"` process-dict key.
  Eliminates redundant `:waiting` round-trips to the parallel compiler.

## Bootstrap safety

`Kernel.Typespec.collect_defined_type_pairs/1`'s use of `Enum.any?` was
crashing during the bootstrap of `Range` (Enum isn't compiled yet at
that point). Replaced with `:lists.any/2`, an Erlang standard-library
function that is always available. Pure bootstrap fix; no runtime
behavior change.

## Tests

* Updated 6 `integration_test.exs` assertions to match the new
  `:badstructfield` diagnostic wording.
* Full checker suite: 508/508 pass on a clean rebuild.
* Protocol suite: 26/26.
* Top-level typespec: 79/79.
* `make clean && make compile` clean.
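
For reviewers, a minimal sketch of the per-worker memoization pattern
described under Performance. This is an illustration, not the code in
this patch: the module name `RemoteTypesCacheSketch`, the cache key, and
the `compute` callback are hypothetical, and it uses `Process.get/2` and
`Process.put/2` where the patch calls `:erlang.get/1` and `:erlang.put/2`
directly.

```elixir
defmodule RemoteTypesCacheSketch do
  # Hypothetical key, chosen for this sketch only.
  @cache_key :"$sketch_remote_types_cache"

  # Returns the cached value for `module`, calling the one-argument
  # function `compute` only on the first request made in this process.
  def fetch(module, compute) when is_atom(module) and is_function(compute, 1) do
    case Process.get(@cache_key, %{}) do
      # Cache hit: the pinned `module` is already a key in the map.
      %{^module => types} ->
        types

      # Cache miss: compute once, store the result, and return it.
      cache ->
        types = compute.(module)
        Process.put(@cache_key, Map.put(cache, module, types))
        types
    end
  end
end
```

Calling `fetch/2` twice for the same module in one process runs `compute`
once. Because the cache lives in the process dictionary, it goes away
when the worker process exits, so no explicit eviction is needed.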
Signed-off-by: Chris Garvis --- lib/elixir/lib/kernel/typespec.ex | 23 ++++- lib/elixir/lib/module/parallel_checker.ex | 1 - lib/elixir/lib/module/types/of.ex | 35 +++++--- lib/elixir/lib/module/types/typespec.ex | 87 +++++++++++-------- lib/elixir/lib/protocol.ex | 9 +- .../elixir/module/types/integration_test.exs | 12 +-- .../elixir/module/types/typespec_test.exs | 8 +- 7 files changed, 108 insertions(+), 67 deletions(-) diff --git a/lib/elixir/lib/kernel/typespec.ex b/lib/elixir/lib/kernel/typespec.ex index 68a031faab2..7b07d4359e2 100644 --- a/lib/elixir/lib/kernel/typespec.ex +++ b/lib/elixir/lib/kernel/typespec.ex @@ -283,8 +283,7 @@ defmodule Kernel.Typespec do # stored but treated as opaque-from-the-checker's-view. defp convert_type_to_descr({kind, expr, pos}, {acc, defined}) do with {:"::", _, [{name, _meta, args}, definition]} <- expr, - arity = arg_count(args), - true <- arity == 0 do + true <- arg_count(args) == 0 do env = :elixir_module.get_cached_env(pos) state = %{module: env.module, defined: Map.put(defined, {name, 0}, :pending)} @@ -675,7 +674,7 @@ defmodule Kernel.Typespec do ) fun = fn {field, _} -> - if not Enum.any?(struct_info, &(&1.field == field)) do + if not :lists.any(fn info -> info.field == field end, struct_info) do compile_error( caller, "undefined field #{inspect(field)} on struct #{inspect(module)}" @@ -866,8 +865,24 @@ defmodule Kernel.Typespec do # store_types_descr / fetch_remote_types can read the ExCk chunk from # the in-memory binary. This mirrors how struct expansion waits for its # module via Kernel.ErrorHandler.ensure_compiled. + # + # Deduplicate per-process: within one module compile, many @type + # declarations can reference the same remote (e.g. Calendar.year(), + # Calendar.month(), ...). After the first call the parallel compiler has + # already resolved that module, so subsequent calls are no-ops but still + # cross into the error handler. 
We cache resolved remotes in the process + # dictionary under a system-reserved key to avoid redundant calls. if :erlang.get(:elixir_compiler_info) != :undefined do - Kernel.ErrorHandler.ensure_compiled(remote, :module, :soft, caller.line) + ensured = + case :erlang.get(:"$elixir_typespec_ensured") do + :undefined -> %{} + map -> map + end + + unless is_map_key(ensured, remote) do + Kernel.ErrorHandler.ensure_compiled(remote, :module, :soft, caller.line) + :erlang.put(:"$elixir_typespec_ensured", Map.put(ensured, remote, true)) + end end {remote_spec, state} = typespec(remote, vars, caller, state) diff --git a/lib/elixir/lib/module/parallel_checker.ex b/lib/elixir/lib/module/parallel_checker.ex index fbe2d4c5d5f..520e2105642 100644 --- a/lib/elixir/lib/module/parallel_checker.ex +++ b/lib/elixir/lib/module/parallel_checker.ex @@ -532,7 +532,6 @@ defmodule Module.ParallelChecker do end end - defp cache_chunk(table, module, contents) do Enum.each(contents.exports, fn {{fun, arity}, info} -> sig = diff --git a/lib/elixir/lib/module/types/of.ex b/lib/elixir/lib/module/types/of.ex index 9bb29282bfd..43ca709744e 100644 --- a/lib/elixir/lib/module/types/of.ex +++ b/lib/elixir/lib/module/types/of.ex @@ -480,27 +480,25 @@ defmodule Module.Types.Of do Enum.map_reduce(args, context, fn {key, value}, context when is_atom(key) -> typed_field_type = typed_field(typed_fields, key) - value_type = - if typed_field_type != nil do - typed_field_type - else - case map_fetch_key(expected, key) do - {_, expected_value_type} -> expected_value_type - _ -> term() - end - end + value_type = typed_field_type || expected_field_type(expected, key) {type, context} = of_fun.(value, value_type, stack, context) context = cond do - typed_field_type == nil -> context + typed_field_type == nil -> + context + # The compiler injects defaults into `args`. 
Don't warn for # values that are exactly the defstruct default — those are # not user-authored and would surface a noisy diagnostic at # every struct construction site. - value == Map.get(defaults_by_field, key, :__no_default__) -> context - compatible?(type, typed_field_type) -> context + value == Map.get(defaults_by_field, key, :__no_default__) -> + context + + compatible?(type, typed_field_type) -> + context + true -> error = {:badstructfield, struct, key, value, typed_field_type, type, context} @@ -527,6 +525,13 @@ defmodule Module.Types.Of do Map.new(info, fn %{field: field, default: default} -> {field, default} end) end + defp expected_field_type(expected, key) do + case map_fetch_key(expected, key) do + {_, type} -> type + _ -> term() + end + end + # Look up the `t/0` typespec Descr for `struct` from the parallel # checker cache. The cache snapshot is written during # `cache_from_module_map` (parallel_checker.ex) so it survives the @@ -925,7 +930,9 @@ defmodule Module.Types.Of do } end - def format_diagnostic({:badstructfield, module, field, expr, expected_type, actual_type, context}) do + def format_diagnostic( + {:badstructfield, module, field, expr, expected_type, actual_type, context} + ) do traces = collect_traces(expr, context) %{ @@ -941,7 +948,7 @@ defmodule Module.Types.Of do #{to_quoted_string(actual_type) |> indent(4)} - expected type: + but expected type: #{to_quoted_string(expected_type) |> indent(4)} """, diff --git a/lib/elixir/lib/module/types/typespec.ex b/lib/elixir/lib/module/types/typespec.ex index 013d93092e5..e257f61e3a8 100644 --- a/lib/elixir/lib/module/types/typespec.ex +++ b/lib/elixir/lib/module/types/typespec.ex @@ -23,6 +23,10 @@ defmodule Module.Types.Typespec do @elixir_checker_version :elixir_erl.checker_version() + # Process-dictionary key for the per-worker remote-types cache. + # The leading "$" keeps it distinct from user-visible keys. 
+ @remote_cache_key :"$elixir_typespec_remote_cache" + @doc """ Convert `ast` to a `Descr`, using `state.defined` to resolve local references. @@ -48,6 +52,26 @@ defmodule Module.Types.Typespec do union(left_descr, right_descr) end + # Remote type reference: Mod.name(args). + # Listed early because this is the hottest pattern in typespec-heavy modules + # (e.g. every String.t(), Keyword.t(), URI.t() reference hits this clause). + defp do_to_descr({{:., _, [module_ast, name]}, _, args}, state) + when is_atom(name) and is_list(args) do + arity = length(args) + + case expand_module(module_ast) do + nil -> + dynamic() + + module when module == state.module -> + # Self-qualified reference — treat as a local lookup. + local_or_pending(name, arity, state) + + module when is_atom(module) -> + resolve_remote(module, name, arity) + end + end + # Parenthesized / annotated forms — strip and recurse. defp do_to_descr({:"::", _, [_var, ast]}, state), do: do_to_descr(ast, state) @@ -68,18 +92,11 @@ defmodule Module.Types.Typespec do defp do_to_descr([], _state), do: empty_list() # Function spec: `(args -> result)` is parsed as `[{:->, _, [args, result]}]`. + # `(... -> result)` — variable arity. Not statically representable; degrade to the top function type. + defp do_to_descr([{:->, _, [[{:..., _, _} | _], _return]}], _state), do: fun() + defp do_to_descr([{:->, _, [args, return]}], state) when is_list(args) do - cond do - # `(... -> result)` — variable arity. Not statically representable; - # degrade to the top function type. - Enum.any?(args, &match?({:..., _, _}, &1)) -> - fun() - - true -> - arg_types = Enum.map(args, &do_to_descr(&1, state)) - return_type = do_to_descr(return, state) - fun(arg_types, return_type) - end + fun(Enum.map(args, &do_to_descr(&1, state)), do_to_descr(return, state)) end # Non-empty proper list literal: [type] @@ -135,27 +152,10 @@ defmodule Module.Types.Typespec do end end - # Remote type reference: Mod.name(args). 
- defp do_to_descr({{:., _, [module_ast, name]}, _, args}, state) - when is_atom(name) and is_list(args) do - arity = length(args) - - case expand_module(module_ast) do - nil -> - dynamic() - - module when module == state.module -> - # Self-qualified reference — treat as a local lookup. - local_or_pending(name, arity, state) - - module when is_atom(module) -> - resolve_remote(module, name, arity) - end - end - # Built-in type calls and local references: name(arg1, arg2, ...). - defp do_to_descr({name, _meta, args}, state) when is_atom(name) and (is_list(args) or is_atom(args)) do + defp do_to_descr({name, _meta, args}, state) + when is_atom(name) and (is_list(args) or is_atom(args)) do arg_list = if is_list(args), do: args, else: [] arity = length(arg_list) builtin(name, arity, arg_list, state) @@ -232,6 +232,27 @@ defmodule Module.Types.Typespec do end defp fetch_remote_types(module) do + # Fast path: check the per-worker process-dict cache first. + # Each module compile runs in its own short-lived worker process, so the + # cache is naturally bounded — no explicit eviction needed. + case :erlang.get(@remote_cache_key) do + %{^module => types} -> + types + + cache when is_map(cache) -> + types = compute_remote_types(module) + :erlang.put(@remote_cache_key, Map.put(cache, module, types)) + types + + _ -> + # First call in this worker: initialise the cache and compute. + types = compute_remote_types(module) + :erlang.put(@remote_cache_key, %{module => types}) + types + end + end + + defp compute_remote_types(module) do # Try the parallel checker ETS first (same-compilation-unit modules that # were compiled earlier in the same run). Fall back to the in-memory beam # binary for cross-app references where the module is already loaded in the VM. 
@@ -253,13 +274,11 @@ defmodule Module.Types.Typespec do defp fetch_remote_types_from_checker(module) do case :erlang.get(:elixir_checker_info) do {_parent, {_checker, table}} -> - pairs = :ets.match(table, {{module, :type, :"$1", :"$2"}, :"$3"}) - - case pairs do + case :ets.match(table, {{module, :type, :"$1", :"$2"}, :"$3"}) do [] -> :not_found - _ -> + pairs -> {:ok, Map.new(pairs, fn [name, arity, entry] -> {{name, arity}, entry} end)} end diff --git a/lib/elixir/lib/protocol.ex b/lib/elixir/lib/protocol.ex index ade3fea631e..5fbbf08854b 100644 --- a/lib/elixir/lib/protocol.ex +++ b/lib/elixir/lib/protocol.ex @@ -669,9 +669,9 @@ defmodule Protocol do # `defimpl`); this union is closed-world *for the current build # artifact*. Downstream code sees the closed-world view only on # compilations that follow consolidation. - existing_types = Map.get(checker, :types, %{}) - updated_types = Map.put(existing_types, {:t, 0}, {:type, domain}) - Map.put(checker, :types, updated_types) + update_in(checker, [:types], fn types -> + Map.put(types || %{}, {:t, 0}, {:type, domain}) + end) end {:ok, definitions, checker} @@ -1213,7 +1213,8 @@ defmodule Protocol do # TODO: Make this an error on Elixir v2.0 if for != Any and not Keyword.has_key?(built_in(), for) and for != env.module and - for not in env.context_modules and Code.ensure_compiled(for) != {:module, for} do + for not in env.context_modules and + match?({:error, reason} when reason != :unavailable, Code.ensure_compiled(for)) do IO.warn( "you are implementing a protocol for #{inspect(for)} but said module is not available. " <> "Make sure the module name is correct. 
If #{inspect(for)} is an optional dependency, " <> diff --git a/lib/elixir/test/elixir/module/types/integration_test.exs b/lib/elixir/test/elixir/module/types/integration_test.exs index 511eed85f43..1e3ccc05cc3 100644 --- a/lib/elixir/test/elixir/module/types/integration_test.exs +++ b/lib/elixir/test/elixir/module/types/integration_test.exs @@ -1849,7 +1849,7 @@ defmodule Module.Types.IntegrationTest do """ } - warnings = ["expected type:", "integer()", ":not_an_int"] + warnings = ["but expected type:", "integer()", ":not_an_int"] assert_warnings(files, warnings) end @@ -1895,7 +1895,7 @@ defmodule Module.Types.IntegrationTest do """ } - warnings = ["expected type:", "integer()", ":not_an_int"] + warnings = ["but expected type:", "integer()", ":not_an_int"] assert_warnings(files, warnings) end @@ -1914,7 +1914,7 @@ defmodule Module.Types.IntegrationTest do """ } - warnings = ["expected type:", "integer()", ":not_an_int"] + warnings = ["but expected type:", "integer()", ":not_an_int"] assert_warnings(files, warnings) end @@ -1931,7 +1931,7 @@ defmodule Module.Types.IntegrationTest do """ } - assert_warnings(files, ["expected type:", "&Kernel.+/2"]) + assert_warnings(files, ["but expected type:", "&Kernel.+/2"]) end test "remote type reference resolves through ExCk (stdlib type)" do @@ -1953,7 +1953,7 @@ defmodule Module.Types.IntegrationTest do """ } - warnings = ["expected type:", "binary()", "42"] + warnings = ["but expected type:", "binary()", "42"] assert_warnings(files, warnings) end @@ -1977,7 +1977,7 @@ defmodule Module.Types.IntegrationTest do """ } - warnings = ["expected type:", "integer()", ":not_int"] + warnings = ["but expected type:", "integer()", ":not_int"] assert_warnings(files, warnings) end diff --git a/lib/elixir/test/elixir/module/types/typespec_test.exs b/lib/elixir/test/elixir/module/types/typespec_test.exs index 3b51d6d1398..e18abff9a52 100644 --- a/lib/elixir/test/elixir/module/types/typespec_test.exs +++ 
b/lib/elixir/test/elixir/module/types/typespec_test.exs @@ -74,9 +74,7 @@ defmodule Module.Types.TypespecTest do test "%__MODULE__{} expands to closed_map with struct tag" do assert {:ok, descr} = to_descr( - quote( - do: %Module.Types.TypespecTest.SomeStruct{name: binary(), age: integer()} - ) + quote(do: %Module.Types.TypespecTest.SomeStruct{name: binary(), age: integer()}) ) assert equal?( @@ -149,7 +147,9 @@ defmodule Module.Types.TypespecTest do test "parametric arity > 0 reference returns parametric_unsupported" do defined = %{{:t, 1} => {:type, term()}} - assert {:error, {:parametric_unsupported, :t, 1}} = to_descr(quote(do: t(integer())), defined) + + assert {:error, {:parametric_unsupported, :t, 1}} = + to_descr(quote(do: t(integer())), defined) end end