- From: Lorenzo Moriondo <tunedconsulting@gmail.com>
- Date: Fri, 25 Jul 2025 13:00:28 +0100
- To: public-webagents <public-webagents@w3.org>
- Message-ID: <CAKgLLmu6b+KvG-VOECcu-Oo+WfNpCqZPzt48KWdJPviNDA8N4w@mail.gmail.com>
Hello,

After spending some time with the matter, I will try to summarise here some points I have found interesting before next week's session.

## Assumptions

My first question was: how can we generate NL contexts usable by LLMs starting from a formal protocol? This is relevant because the lingua franca for developers and LLMs is indeed natural language before formalism; we may say that the distinction between human language and machine-readable payloads blurs in the presence of a machine that is highly skilled in natural language. So I started from the assumption that, considering the pace of adoption of NL among LLMs, "in the context of contemporary LLM systems, natural language is now a machine-readable format". If this is true, a critical point is the transpiling operation between the layer that handles input/output as natural language and the layer that manages the actual task processing, where machines exchange data and, increasingly often, code and programs.

## Process modeling

Starting from this point of view, I focused my attention on two processes that can also be seen as ordered, stepwise layers (an abstract model of what happens in LLM systems such as MCP-based and graph-based implementations):

1. question:
   a. prompt (LLM)
   b. planning (LLM)
   c. structured data generation (LLM + protocol)
   d. operations (protocol + distributed computing + LLM)

2. and the reverse, answer:
   a. operations (protocol + distributed computing)
   b. structured data generation (LLM + protocol)
   c. response (protocol + distributed computing + LLM)

Let's start with this intra-system point of view. There is also the inter-system point of view, which can be extrapolated linearly from the intra-system model (i.e. layers 1d or 2c carrying out inter-system communication using question/answer processes self-similar to the intra-system ones; geometrically, there are "vertical" intra-system stacks that communicate in a distributed fashion via "horizontal" channels, but in this case stacks and channels use exactly the same protocols and technologies because both are LLM systems).

## Challenges

If this stack is assumed for intra-system processes, there is a critical step happening between (1b -> 1c) and (2b -> 2c). The challenge comes from two factors:

* the variety of LLMs present on the market
* the variety of system implementations present on the market

That is the step of transpiling from natural language (unstructured and arbitrary) to a structured data representation (i.e. BSPL, or any machine-readable format that uniquely and unambiguously defines data and operations). This is an interpretative step, and interpretation can introduce ambiguity that is then passed down to the operational layer. So I would say that we need to keep this step interpretable from the point of view of humans and LLMs, but also unambiguous in the operation of transpiling from natural language to "strict" (non-LLM) machine-readability.

## Solutions

After experimenting with automatic translations in the most popular LLMs on the market, it is evident that the results are very heterogeneous if the LLM is presented with structured data payloads as we are accustomed to using them; protocols like BSPL, Terraform descriptions, or any data defined in JSON/YAML format are not interpretable as such and are therefore not suited to defining a context in the sense that an LLM requires in the context/prompt-engineering step.
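To make the critical (1b -> 1c) step concrete, here is a minimal sketch in Python; it is my own illustration, not part of any proposal, and `call_llm` is a hypothetical stand-in for whatever model endpoint a given implementation uses. The only thing constraining the model's output is the protocol text it is given:

```
from typing import Callable
import json

# Hypothetical stand-in for any model endpoint: takes a prompt, returns text.
LLM = Callable[[str], str]

def transpile_to_structured(plan_nl: str, protocol_text: str, call_llm: LLM) -> dict:
    """Critical step (1b -> 1c): turn a natural-language plan into a
    structured payload constrained only by the protocol description."""
    prompt = (
        "You are the structured-data-generation layer of an agent system.\n"
        "Protocol definition:\n" + protocol_text + "\n\n"
        "Plan (natural language):\n" + plan_nl + "\n\n"
        "Emit a single JSON object with keys 'action' and 'parameters' that is "
        "valid under the protocol above. Output JSON only."
    )
    raw = call_llm(prompt)
    # Any ambiguity in the protocol description surfaces here as schema
    # drift or parse errors, and it differs from model to model.
    return json.loads(raw)
```

Nothing in this sketch is normative; the point is only that the quality of the structured payload depends entirely on how interpretable `protocol_text` is for the model, which is what motivates the annotations below.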
In general, any protocol that does not provide **natural language annotations** is not suitable for robust transpiling to/from a natural language frontend; it is possible to have robust transpiling inside the same system using the same tools, but the result may or may not be transpiled correctly by another system using different technologies.

Here is an example of a formal protocol with added semantic annotations, starting from a BSPL sample. This is a sample of valid BSPL defining a set of allowed transactions between different roles. As Amit pointed out, it packs a lot of useful information and it is designed to work with machines in a distributed computing setting. It provides some interesting features like expected signatures and direction of exchange (causality), and it is easy to turn into code/programs.

```
Purchase {
  roles B, S, Shipper
  parameters out ID key, out item, out price, out outcome
  private address, shipped, accept, reject, resp

  B -> S: rfq[out ID, out item]
  S -> B: quote[in ID, in item, out price]
  B -> S: accept[in ID, in item, in price, out address, out resp, out accept]
  B -> S: reject[in ID, in item, in price, out outcome, out resp, out reject]
  S -> Shipper: ship[in ID, in item, in address, out shipped]
  Shipper -> B: deliver[in ID, in item, in address, out outcome]
}
```

Two main points are missing to allow robust transpiling (payload -> NL and back) also by an LLM:

1. it lacks semantic annotations to be used to transpile to NL; it is designed to be transpiled into code, but it was not designed to be transpiled into NL
2. it does not define strict types for the variables

To make BSPL also LLM-ready, all the pieces in play need to be annotated and typed. Starting from this point, I developed a potential protocol that adds the features required for working with LLMs within the scope of the model defined above. Taking the same sample as above, let's try to apply annotations and types (comments in // are just for demonstration, they are not part of the mandatory protocol definition):

```
// the protocol is stated and annotated
Purchase <Protocol>("the generic action of acquiring a generic item in exchange of its countervalue in currency") {

  // --- Role Definitions ---
  // We now define each role's part using annotations.
  // The base type <Agent> signifies that this is an active participant.
  roles
    B <Agent>("the party wanting to buy an item"),
    S <Agent>("the party selling the item"),
    Shipper <Agent>("the third-party entity responsible for logistics")

  // --- Parameter Definitions ---
  // Parameters are the data fields exchanged in the protocol. We give each one a clear meaning.
  parameters
    ID <String>("a unique identifier for the request for quote"),
    item <String>("the name or description of the product being requested"),
    price <Float>("the cost of the item quoted by the seller"),
    address <String>("the physical destination for shipping"),
    shipped <Bool>("a confirmation status indicating the item has been dispatched"),
    accept <Bool>("a confirmation that the buyer agrees to the quote"),
    reject <Bool>("a confirmation that the buyer declines the quote"),
    outcome <String>("a final status message describing the result of the protocol")

  // --- Interaction Protocol ---
  // Buyer requests a quote from the Seller.
  B -> S: rfq <Action>("request for a price quote")[out ID, out item]

  // Seller responds with a price.
  S -> B: quote <Action>("provide a price quote for a requested item")[in ID, in item, out price]

  // Buyer accepts the quote.
  B -> S: accept <Action>("accept the seller's price quote")[in ID, in item, in price, out address, out accept]

  // Buyer rejects the quote.
  B -> S: reject <Action>("reject the seller's price quote")[in ID, in item, in price, out outcome, out reject]

  // If accepted, Seller instructs the Shipper.
  S -> Shipper: ship <Action>("request shipment of the purchased item")[in ID, in item, in address, out shipped]

  // Shipper completes the delivery to the Buyer.
  Shipper -> B: deliver <Action>("confirm delivery of the item to the buyer")[in ID, in item, in address, out outcome]
}
```

As you can see, it is easy to translate from BSPL to this annotated protocol for legacy support, but it adds types and annotations. This improves the robustness of translation to NL across all the models and makes their interpretation less arbitrary. Every role and action provides a <Type> and a "prompt" to reduce ambiguity in the context engineering of an arbitrary LLM. I leave further considerations about the pros and cons of this approach to you.
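To illustrate what the added types and annotations buy in practice, here is a small, hypothetical sketch of what the `quote` message could be generated into for a strictly-typed target, and of how the very same annotations can be rendered back into natural language for context engineering. The names and structure are mine, purely for demonstration:

```
from dataclasses import dataclass

# Hypothetical code-generation target for one annotated message:
# <String>/<Float> map to host-language types, and the annotation strings
# become docstrings/comments shared by the typed code and the NL context.

@dataclass
class Quote:
    """S -> B: provide a price quote for a requested item."""
    ID: str       # in:  "a unique identifier for the request for quote"
    item: str     # in:  "the name or description of the product being requested"
    price: float  # out: "the cost of the item quoted by the seller"

def to_nl_context(message_type: type) -> str:
    """Render the annotations of a message type as a natural language
    description usable in prompt/context engineering."""
    fields = ", ".join(
        f"{name} ({tp.__name__})" for name, tp in message_type.__annotations__.items()
    )
    return f"{message_type.__doc__} Parameters: {fields}."

print(to_nl_context(Quote))
# -> "S -> B: provide a price quote for a requested item. Parameters: ID (str), item (str), price (float)."
```

This is only one possible mapping; the point is that a single annotated source can drive both the generation of strictly-typed code and the NL rendering used as LLM context.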
## Conclusion

I have run tests on a similar protocol called MTP (Meaning Typed Prompting), achieving better preliminary results than plain BSPL in the generation of natural language descriptions and in code generation for strictly-typed programming languages. One model has also demonstrated almost flawless translation of a generated NL description back into its original annotated protocol. These are all preliminary results, but it may be a good path to explore for interoperability of LLM systems using BSPL.

Please take a look and let's discuss next Friday.

Best regards,

--
¤ acM ¤ Lorenzo Moriondo
@lorenzogotuned
https://www.linkedin.com/in/lorenzomoriondo
https://github.com/Mec-iS
Received on Friday, 25 July 2025 12:00:46 UTC