Preliminary considerations about a standard exchange protocol for LLM systems

Hello,

After spending some time on the matter, I will try to summarise here some
points I have found interesting, ahead of next week's session.

## Assumptions
My first question was: how can we generate NL contexts usable by LLMs
starting from a formal protocol? This is relevant because the lingua franca
between developers and LLMs is natural language before any formalism; we
may say that the distinction between human language and machine-readable
payloads blurs in the presence of a machine that is highly skilled in
natural language. So I started from the assumption that, given the pace of
adoption of NL around LLMs, "in the context of contemporary LLM systems,
natural language is now a machine-readable format". If this is true, a
critical point is the transpiling operation between the layer that handles
input/output as natural language and the layer that manages the actual
task processing, where machines exchange data and, increasingly often,
code and programs.

## Process modeling
Starting from this point of view, I focused my attention on two processes
that can also be seen as ordered, stepwise layers (an abstract model of
what happens in LLM systems such as MCP and graph implementations):
1. question:
    a. prompt (LLM)
    b. planning (LLM)
    c. structured data generation (LLM + protocol)
    d. operations (protocol + distributed computing + LLM)
2. and the reverse, answer:
    a. operations (protocol + distributed computing)
    b. structured data generation (LLM + protocol)
    c. response (protocol + distributed computing + LLM)

Let's start with this intra-system point of view. There is also an
inter-system point of view, which can be extrapolated linearly from the
intra-system model: layers 1d or 2c perform inter-system communication
using question/answer processes that are self-similar to the intra-system
ones. Geometrically, there are "vertical" intra-system stacks that
communicate in a distributed fashion via "horizontal" channels; in this
case stacks and channels use exactly the same protocols and technologies,
because both stacks and channels are LLM systems.
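
As a minimal sketch of this layering (in Python; all names are mine and
purely illustrative, not part of any proposal), the two processes can be
modelled as ordered pipelines, with the inter-system channel reusing the
same question/answer machinery:
```python
# Minimal sketch of the intra-system stacks described above.
# All names (Layer, Step, Pipeline, ...) are illustrative, not part of any spec.
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any, Callable


class Layer(Enum):
    PROMPT = auto()           # 1a: prompt (LLM)
    PLANNING = auto()         # 1b: planning (LLM)
    STRUCTURED_DATA = auto()  # 1c / 2b: structured data generation (LLM + protocol)
    OPERATIONS = auto()       # 1d / 2a: operations (protocol + distributed computing)
    RESPONSE = auto()         # 2c: response (protocol + distributed computing + LLM)


@dataclass
class Step:
    layer: Layer
    handler: Callable[[Any], Any]  # transforms this layer's input into its output


@dataclass
class Pipeline:
    """An ordered, stepwise stack: 'question' runs top-down, 'answer' bottom-up."""
    steps: list[Step]

    def run(self, payload: Any) -> Any:
        for step in self.steps:
            payload = step.handler(payload)
        return payload


# The inter-system ("horizontal") channel is self-similar: the operations layer
# of one stack simply issues a question to another stack built from the same layers.
def call_remote_system(remote: Pipeline, structured_request: Any) -> Any:
    return remote.run(structured_request)
```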

## Challenges
If this stack is assumed for intra-system processes, there is a critical
step between (1b -> 1c) and (2b -> 2c). The challenge comes from two
factors:
* the variety of LLMs on the market
* the variety of system implementations on the market

This is the step of transpiling from natural language (unstructured and
arbitrary) into a structured data representation (e.g. BSPL, or any
machine-readable format that uniquely and unambiguously defines data and
operations). It is an interpretative step, and interpretation can introduce
ambiguity that is then passed down to the operational layer. So I would say
that we need to keep this step interpretable from the point of view of
humans and LLMs, but unambiguous in the operation of transpiling from
natural language to "strict" (non-LLM) machine-readability.

## Solutions
After experimenting with automatic translations in the most popular LLMs on
the market, it is evident that the results are highly heterogeneous when
the LLM is presented with structured data payloads as we are accustomed to
using them: protocols like BSPL, Terraform descriptions, or any data
defined in JSON/YAML are not interpretable in a consistent way, and
therefore are not suited to defining a context in the sense that an LLM
requires in the context/prompt-engineering step. In general, any protocol
that does not provide **natural language annotations** is not suitable for
robust transpiling to/from a natural language frontend; robust transpiling
is possible inside the same system using the same tools, but the result may
or may not be transpiled correctly by another system using different
technologies.
Here is an example of a formal protocol with added semantic annotations,
starting from a BSPL sample.
The following is a valid BSPL protocol defining a set of allowed
transactions between different roles. As Amit pointed out, it packs a lot
of useful information and is designed to work with machines in a
distributed-computing setting. It provides some interesting features, such
as expected signatures and the direction of exchange (causality), and it is
easy to turn into code/programs.
```
Purchase {
 roles B, S, Shipper
 parameters out ID key, out item, out price, out outcome
 private address, shipped, accept, reject, resp

 B -> S: rfq[out ID, out item]
 S -> B: quote[in ID, in item, out price]
 B -> S: accept[in ID, in item, in price, out address, out resp, out accept]
 B -> S: reject[in ID, in item, in price, out outcome, out resp, out reject]

 S -> Shipper: ship[in ID, in item, in address, out shipped]
 Shipper -> B: deliver[in ID, in item, in address, out outcome]
}
```
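For instance, here is a minimal sketch (Python; my own, not part of any
BSPL tooling) that maps two of the messages above onto typed structures;
note that the parameter types are guesses on my part, which anticipates the
typing issue discussed just below:
```python
# Illustrative only: a hand-written mapping of two Purchase messages into typed
# structures, to show how directly a BSPL message maps onto code.
from dataclasses import dataclass


@dataclass
class Rfq:            # B -> S: rfq[out ID, out item]
    ID: str
    item: str


@dataclass
class Quote:          # S -> B: quote[in ID, in item, out price]
    ID: str
    item: str
    price: float      # note: the type is a guess, BSPL itself does not declare it


def handle_rfq(rfq: Rfq) -> Quote:
    # Seller-side handler: binds the 'out' parameter (price) for the quote message.
    return Quote(ID=rfq.ID, item=rfq.item, price=99.90)


print(handle_rfq(Rfq(ID="q-001", item="laptop")))
```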
Two main points are missing to allow robust transpiling (payload -> NL and
back) by an LLM:
1. it lacks semantic annotations that could be used to transpile to NL; it
was designed to be transpiled into code, not into NL
2. it does not define strict types for the variables

To make BSPL also LLM-ready, all the pieces in play need to be annotated
and typed.
Starting from this point, I developed a potential protocol that adds the
features required for working with LLMs within the scope of the model
defined above.
Taking the same sample as above, let's apply annotations and types (the //
comments are just for demonstration; they are not part of the mandatory
protocol definition):
```
// the protocol is stated and annotated
Purchase <Protocol>("the generic action of acquiring a generic item in exchange of its countervalue in currency") {
    // --- Role Definitions ---
    // We now define each role's part using annotations.
    // The base type <Agent> signifies that this is an active participant.
    roles
        B <Agent>("the party wanting to buy an item"),
        S <Agent>("the party selling the item"),
        Shipper <Agent>("the third-party entity responsible for logistics")

    // --- Parameter Definitions ---
    // Parameters are the data fields exchanged in the protocol.
    // We give each one a clear meaning.
    parameters
        ID <String>("a unique identifier for the request for quote"),
        item <String>("the name or description of the product being requested"),
        price <Float>("the cost of the item quoted by the seller"),
        address <String>("the physical destination for shipping"),
        shipped <Bool>("a confirmation status indicating the item has been dispatched"),
        accept <Bool>("a confirmation that the buyer agrees to the quote"),
        reject <Bool>("a confirmation that the buyer declines the quote"),
        outcome <String>("a final status message describing the result of the protocol")

    // --- Interaction Protocol ---

    // Buyer requests a quote from the Seller.
    B -> S: rfq <Action>("request for a price quote")[out ID, out item]

    // Seller responds with a price.
    S -> B: quote <Action>("provide a price quote for a requested item")[in ID, in item, out price]

    // Buyer accepts the quote.
    B -> S: accept <Action>("accept the seller's price quote")[in ID, in item, in price, out address, out accept]

    // Buyer rejects the quote.
    B -> S: reject <Action>("reject the seller's price quote")[in ID, in item, in price, out outcome, out reject]

    // If accepted, Seller instructs the Shipper.
    S -> Shipper: ship <Action>("request shipment of the purchased item")[in ID, in item, in address, out shipped]

    // Shipper completes the delivery to the Buyer.
    Shipper -> B: deliver <Action>("confirm delivery of the item to the buyer")[in ID, in item, in address, out outcome]
}
```
As you can see, it is easy to translate from BSPL to this annotated
protocol (so legacy support is preserved), while adding types and
annotations. This improves the robustness of translation to NL across
models and makes their interpretation less arbitrary: every role and action
provides a <Type> and a "prompt" that reduce ambiguity in the context
engineering of an arbitrary LLM. I leave further considerations about the
pros and cons of this approach to you.
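
To give an idea of what the annotations buy us, here is a minimal sketch
(Python; the data structures mirror the annotated sample above but are my
own, not a normative part of the proposal) of how a single annotated
message could be rendered into an NL context line for prompt engineering
while staying strictly machine-readable:
```python
# Hedged sketch: rendering one annotated message into an NL context line.
# AnnotatedParam/AnnotatedMessage mirror the sample above; they are illustrative only.
from dataclasses import dataclass


@dataclass
class AnnotatedParam:
    name: str
    type_: str          # e.g. "String", "Float", "Bool"
    annotation: str     # the NL "prompt" attached to the parameter
    direction: str      # "in" or "out"


@dataclass
class AnnotatedMessage:
    sender: str
    receiver: str
    name: str
    annotation: str
    params: list[AnnotatedParam]


def to_nl_context(msg: AnnotatedMessage) -> str:
    """Render the message as a natural-language line usable in a context prompt."""
    params = "; ".join(
        f"{p.name} ({p.type_}, {p.direction}): {p.annotation}" for p in msg.params
    )
    return f"{msg.sender} sends '{msg.name}' to {msg.receiver}: {msg.annotation}. Parameters: {params}."


rfq = AnnotatedMessage(
    sender="B", receiver="S", name="rfq",
    annotation="request for a price quote",
    params=[
        AnnotatedParam("ID", "String", "a unique identifier for the request for quote", "out"),
        AnnotatedParam("item", "String", "the name or description of the product being requested", "out"),
    ],
)
print(to_nl_context(rfq))
```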

## Conclusion
I have run tests on a similar protocol called MTP (Meaning Typed
Prompting), achieving better preliminary results than with plain BSPL both
in generating natural-language descriptions and in generating code for
strictly typed programming languages. One model has also demonstrated an
almost flawless translation of a generated NL description back into its
original annotated protocol. These are all preliminary results, but this
may be a good path to explore for the interoperability of LLM systems using
BSPL.

Please take a look and let's discuss next Friday.
Best regards,

-- 
¤ acM ¤
Lorenzo
Moriondo
@lorenzogotuned
https://www.linkedin.com/in/lorenzomoriondo
https://github.com/Mec-iS
