Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Resolved Values and Function Handler sections to formatting #728

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

eemeli
Copy link
Collaborator

@eemeli eemeli commented Mar 14, 2024

This PR is effectively an extended comment on the ongoing discussion in #515 / #645 / #646. It also closes #678.

The intent with the change proposed here is to keep our formatting spec close to its current shape, but more clearly define what "resolved values" are, and via a non-normative example guide the reader towards a mental model of what a "formattable" or "selectable" value might look like.

I think this could be included already in LDML45 as its doesn't actually change anything, but that's of course presuming a lot of consensus.

Copy link
Member

@aphillips aphillips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking at stab at this.

spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
@aphillips aphillips added the Agenda+ Requested for upcoming teleconference label Mar 15, 2024
Copy link
Collaborator

@catamorphism catamorphism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that this is non-normative text, but my conclusion after reading it is that there should be a design doc.

Some of it should be normative: specifically, the question of what functions take and return, which has to be answered (IMO) for the registry spec to make sense. How to describe that takes some thought, as I said in one of the comments. The task is to design a foreign function interface that's abstract enough to be adapted to a range of implementation languages, while concrete enough to make the spec something that can be implemented in a way that affords confidence that an implementation conforms to the spec. That's not easy.

Given the amount of discussion the question of "what is the domain and range of a function?" has generated, and how closely coupled that question is with "resolved values", I think it's worth a design doc. This could be #645 or someone could start afresh.

spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
@macchiati
Copy link
Member

macchiati commented Mar 18, 2024 via email

@aphillips aphillips added LDML46 LDML46 Release (Tech Preview - October 2024) and removed Agenda+ Requested for upcoming teleconference labels Mar 25, 2024
@aphillips
Copy link
Member

In to 2024-03-25 call we agreed to work on this in the Tech Preview period.

@mihnita mihnita self-requested a review May 6, 2024 17:10
@aphillips
Copy link
Member

In the 2024-05-06 call, we asked @mihnita to review this before discussing further.

>
> ```ts
> interface MessageValue {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the direction seems OK.
But I am uneasy with specifying this interface.
Because:

  • in "intertwines" even more the formatters and selectors (which I always maintained are different concepts, with different interfaces)
  • it is just specified, without explaining how to use it, how it works.

So it feels over-specified.

@mihnita
Copy link
Collaborator

mihnita commented May 12, 2024

I think this update strongly conflicts with #645 ("[DESIGN] dataflow for composability")

That one proposes to throw away "resolved values" as too ambiguous ("In the current spec, a "resolved value" can be a nameable value, formattable value, or preformatted value, depending on context.")

This PR seems to make resolved values even more ambiguous, while in the same time restricting the ability of implementations to implement things in a different (cleaner) way.

@catamorphism
Copy link
Collaborator

I experimented with something like the suggested MessageValue interface in the ICU4C implementation, in a branch. Despite my previous comments, I think it's pretty good except for: resolvedOptions(): { [key: string]: unknown }. I think it should be made clearer that an option can be a value with options. I realize that in TypeScript, that's already implicit, but someone coming from a different implementation language reading this might assume that unknown is any implementation-specific type and not notice that it has to be something with options (in order to make examples like the one from #515 work).

So I suggest that instead of unknown, it should be: resolvedOptions(): { [key: string]: MessageValue }.

spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
@eemeli eemeli requested a review from catamorphism June 18, 2024 09:12
Copy link
Collaborator

@catamorphism catamorphism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried changing ICU4C to use a value representation that's similar to this, and it works. The change in the PR shows what machinery needs to be there in order to implement examples like the one in #515. In an impl that uses the proposed MessageValue interface, it should be possible to write custom functions that use options that have options, and so on. Maybe we can get to a more normative version of this in the future, but I think this is a good step.

spec/formatting.md Outdated Show resolved Hide resolved
Co-authored-by: Tim Chevalier <tjc@igalia.com>
_Expressions_ are used in _declarations_, _selectors_, and _patterns_.
_Markup_ is only used in _patterns_.
This specification allows for the same value to be used for:
- formatting in a _placeholder_,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not right, I think. The result of formatting is a string or some kind of a "formatted part" type. When talking about resolved values, I think we want to identify the intermediate type that's the result of evaluating the placeholder, before it's formatted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the key thing here is the difference between the output of an expression and the evaluation of an expression. The "resolved value" is only important (or even visible) when it will be consumed by another expression later.

I think @stasm is correct that the output of formatting (or selection) is not the "resolved value". In the case of formatting, it is the string representation (or "formatted foo" parts representation), which is different from the "resolved value". Similarly, the output of selection is a filtered and stack ranked list of patterns (which is definitely not the "resolved value").

The "resolved value", to me, means "what is the value of the operand after the expression has been evaluated".

If I have a message You have {$x :integer} wildebeest, the value of $x is constrained to be an implementation-defined numeric type (it might not be an integer type, even though the formatting function formats it like one), but the output of the expression is a string.

I think that each function needs to explicitly say whether it "covers up" (changes) the operand's resolved value and what it changes it to be. Using my example:

Function: :integer

  • Operand: an implementation-defined type or a literal whose contents match the number-literal production in the ABNF
  • Format output: a string
  • Selection output: number selector (see design number-selection)
  • Resolved value: an implementation-defined numeric type

Notice that this eliminates the string representation between the operand (input) and resolved value. It also, probably, allows the implementation defined type to be changed by the function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operand: an implementation-defined type or a literal whose contents match the number-literal production in the ABNF

There is a third kind of entity that can be an operand.

An operand is a defined type, or literal, or the "result of a previous functional application" (call that a function-result for now). A function-result is (logically) not just an implementation-defined datatype. If so, it wouldn't have access to the information that any composing function would need.

Instead, I think the best conceptual model of a function-result is that of an object with methods to get information. Such as (of course, the names can be chosen to protect the innocent):

  • getSourceFunction() —the function used to create it
  • getSourceOperand() — the operand used to create it
  • getsourceOptions() — the options used to create it
  • getFormattedString()
  • getFormatToPartsStructure()
  • getSelector()
  • getResolvedValue()

Internally, it can lazy-evaluate. For example, it only needs to generate the formatted string if the method getFormattedString() is called. If the function-result is only used for selection, that never needs to be generated. If the function-result is passed to an :uppercase function, then the getResolvedValue() would never need to be called, etc.

In particular a function can look at its operand, and if it is a function-result, see what options it was passed and use them if the type of the function is known.

The description of a function needs to specify what its inputs are, and how its function-result behaves. Part of that involves specifying which functions it can compose with, and how it treats that function's operand and options.

spec/formatting.md Outdated Show resolved Hide resolved
> ```ts
> interface MessageValue {
> formatToString(): string
> formatToX(): X // where X is an implementation-defined type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may have missed some discussions about this, but why wouldn't we want to be more specific here about the parts?

Suggested change
> formatToX(): X // where X is an implementation-defined type
> formatToParts(): IterableIterator<MessagePart>; // MessagePart is implementation-defined.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is a non-normative example and we've not reached consensus on formatted parts, I'd rather not introduce them here implicitly and somewhat out of context, when the key aspect of the formatToX() inclusion is to remind the reader that a string is not the only possible formatting target.

@stasm
Copy link
Collaborator

stasm commented Jun 24, 2024

(My review comes in late. I'm OK with its not being considered blocking this PR. I can file a new PR if you prefer to discuss my suggestions elsewhere.)

@eemeli eemeli requested review from aphillips and mihnita July 13, 2024 12:21
@eemeli
Copy link
Collaborator Author

eemeli commented Jul 13, 2024

@lucacasonato, would be interested to hear if including the Resolved Values section and the examples would make it more readable from your PoV? https://github.com/unicode-org/message-format-wg/blob/define-resolved-values/spec/formatting.md

Copy link
Collaborator

@catamorphism catamorphism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- this will move things forward!

spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
Co-authored-by: Tim Chevalier <tjc@igalia.com>
Copy link

@lucacasonato lucacasonato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eemeli This does seem clearer than previously.

@aphillips aphillips added LDML46.1 MF2.0 Draft Candidate and removed LDML46 LDML46 Release (Tech Preview - October 2024) labels Sep 16, 2024
@eemeli
Copy link
Collaborator Author

eemeli commented Oct 1, 2024

@aphillips As discussed yesterday, I've updated this PR to incorporate the consensus we reached a few weeks ago:

A function MUST define its resolved value. The resolved value MAY be different from the value of the operand of the function. It MAY be an implementation specific type. It is not required to be the same type as the operand.

A function MUST define its resolved options. The resolved options MAY be different from the options of the function.

After some thought, I found the best place for this to be a new section defining function handlers as the real-world functions or methods that are called to get a resolved value. This now incorporates quite a bit of the language that was in step 4 of the Function Resolution process.

I opted for "handler" as we tend to use "implementation" to mean the whole of an MF2 formatter.

The exact MUSTard that's here is a little bit different from the above, as the requirements that a function "MUST define its resolved value" and "MUST define its resolved options" are not easily enforceable unless we were to define what "define" here means, and that's really rather tricky.

@eemeli eemeli changed the title Add Resolved Values section to formatting Add Resolved Values and Function Handler sections to formatting Oct 1, 2024
Copy link
Member

@aphillips aphillips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Doing this on a cell phone, so no suggestions... sorry.

spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Outdated Show resolved Hide resolved
spec/formatting.md Show resolved Hide resolved
@@ -400,6 +417,23 @@ _Option_ _identifiers_ and values are not included in the _fallback value_.

_Pattern selection_ is not supported for _fallback values_.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be optional? * might match?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a selector does not support pattern selection, then the * variant is selected. A selector is never asked about *, that's handled by the selection algorithm directly.

The fact that you need to consider the combined effects of Resolve Preferences 2.ii.b, Filter Variants 2.i.b, and Sort Variants 5.iii.c to see that is what I tried to mean by arguing that "best choice" is too complicated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went to the context and read the section carefully. I guess I have a few questions.

This text suggests that the fallback state is transitive--that implementations must track if a fallback has occurred. That probably needs to be stated more clearly, if so. Otherwise, the fallback value looks just like a string i.e. a literal, which is what lead to my comment.

Let's use my bad example:

.local $a = {0xbeef :num}       // this is Bad Operand with a fallback of `|0xbeef|`
.local $b = {$a :uppercase}     // this might work on `0xbeef`? It's probably Bad Operand if the first line isn't
.match $b                       // let's say :uppercase is a type of string selector too...
48879 {{Definitely not this}}
0xBEEF {{Matches?}}
* {{Could match if above doesn't}}

My point is, how does .match $b know that its resolved value is a fallback value?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Some of the current text is using "fails to resolve" in a way that's not clear. I'll apply some updates to improve this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a set of further changes fixing this that's now in ac7cdc5, a commit not included in this PR, which improve the text around fallback values. As this PR now has some approval for its current state, I'll hold off on including those changes here, and propose them separately after this (hopefully!) lands.

spec/registry.md Outdated Show resolved Hide resolved
spec/syntax.md Outdated Show resolved Hide resolved
@@ -269,7 +269,7 @@ a reference to a function which cannot be resolved.
### Bad Selector

A **_<dfn>Bad Selector</dfn>_** error occurs when a message includes a _selector_
with a resolved value which does not support selection.
with a _resolved value_ which does not support selection.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Can you give an example of a selector that supports selection where its resolved value doesn't?

That is, if I have
.input {$num :number}
.match $num
...
How could :number ever have a resolved value that didn't support selection? I think the notion of 'resolved value' doesn't help.

Suggested change
with a _resolved value_ which does not support selection.
that does not support selection.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could :number ever have a resolved value that didn't support selection?

Going with the example you posted above, if that message is formatted without a num argument, then the first step of Function Resolution applies:

If the expression includes an operand, resolve its value. If this fails, use a fallback value for the expression.

And then this part of Fallback Resolution applies:

Pattern selection is not supported for fallback values.

And so we end up emitting two errors when formatting:

  1. Unresolved Variable
  2. Bad Selector

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @macchiati's finger is on the right thing. It's not a technical problem. This is just hard to understand. Perhaps:

Suggested change
with a _resolved value_ which does not support selection.
A **_<dfn>Bad Selector</dfn>_** error is an error that occurs when a _message_
includes a _selector_ whose _resolved value_
contains a _function handler_ which does not support selection,
is missing a _function handler_,
or consists of a _fallback value_.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we spin this suggestion off into a separate PR? As is, the only change that is being applied here is linkifying resolved value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me as a solution.

or which depends on validation associated with a specific function.

Implementations SHOULD provide a way for _functions_ to emit
Implementations SHOULD provide a way for _function handlers_ to emit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find a definition of what a "function handler" is. Which file is that in?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's added by this PR to spec/formatting.md.

spec/formatting.md Outdated Show resolved Hide resolved
spec/syntax.md Outdated Show resolved Hide resolved
@eemeli
Copy link
Collaborator Author

eemeli commented Oct 2, 2024

Replying to @macchiati in #728 (comment):

I find this all somewhat muddled. Except in function composition, "resolved value" is not a useful abstraction.

That is, outside of function composition, the relevant features are

  • a formatted value, and
  • a selection value

A resolved value is also useful when considering the processing of a message like this:

.input {$num :number}
.match $num
1 {{You have a thing}}
* {{You have {$num} things}}

Because we're using $num for both selection and formatting, it's useful to be able to say that it has a resolved value that supports both selection and formatting.

@eemeli
Copy link
Collaborator Author

eemeli commented Oct 2, 2024

I've rephrased the resolved value definition a bit, and clarified that it applies to all message parts.

@aphillips
Copy link
Member

(chair hat 🎩 on)

I just re-requested reviews from everyone who'd left reviews on the old PR so that the status is clear.

Copy link
Member

@aphillips aphillips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note my comment about the resolved value definition

spec/formatting.md Outdated Show resolved Hide resolved
Co-authored-by: Eemeli Aro <eemeli@mozilla.com>
Co-authored-by: Addison Phillips <addison@unicode.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
formatting LDML46.1 MF2.0 Draft Candidate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Define resolved value formally
8 participants