Overview/ait 189 intro token #3035

base: AIT-129-AIT-Docs-release-branch
Conversation
- Link to the pending `/ai-transport` overview page.
- [AIT-148] AI Transport example filter and product tile
- Add intro describing the pattern, its properties, and use cases.
- Includes continuous token streams, correlating tokens for distinct responses, and explicit start/end events.
- Splits each token streaming approach into distinct patterns and shows both the publish and subscribe side behaviour alongside one another.
- Includes hydration with rewind and hydration with persisted history + untilAttach. Describes the pattern for handling in-progress live responses with complete responses loaded from the database.
- General overview intro page for AIT, giving a summary of major feature groups
- Overview page for token streaming - set direction, link to later pages
GregHolmes left a comment:
I think this looks good! (I can't approve or anything as I raised it.)
@rainbowFi I've left some comments with my thoughts.
I also think we need to be careful and remember that if some of this (such as the full list of agents/frameworks) isn't available on release, we need to remove the TODO comments.
> * [Complex message patterns](#message)
> * [Enterprise controls](#enterprise)
Should these not be "Advanced messaging" and "User input"? (Maybe "user input" isn't a helpful title.)
But they're the sections defined in the AIT Docs IA Miro.
> ### Complex message patterns <a id="message"/>
>
> Truly interactive AI experiences require more than a simple HTTP request-response exchange between a single client and agent. AI transport allows the use of [complex messaging patterns](//TODO: Link here), for example:
I'm guessing if this is meant to be where Advanced messaging is, the link would be /docs/ai-transport/features/advanced-messaging. Yet to be created, though.
> ### Enterprise controls <a id="enterprise"/>
>
> Ably's platform provides [integrations](/docs/platform/integrations) and capabilities to ensure that your application will meet the requirements of enterprise environments, for example [message auditing](/docs/platform/integrations/streaming), [client identification](/docs/auth/identified-clients) and [RBAC](/docs/auth/capabilities).
We call it capabilities elsewhere in the docs; should we stick with capabilities instead of RBAC?
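Not part of the PR, but to make the capabilities point concrete, a minimal sketch of issuing a capability-scoped token with ably-js; the channel name and client id here are hypothetical:

```typescript
import * as Ably from 'ably';

// Server side: issue a token whose capabilities restrict this client to
// subscribing on a single response channel (RBAC via Ably capabilities).
const rest = new Ably.Rest({ key: process.env.ABLY_API_KEY! });

const tokenRequest = await rest.auth.createTokenRequest({
  clientId: 'user-123', // identified client; name is hypothetical
  capability: JSON.stringify({
    'ai:responses:user-123': ['subscribe'], // hypothetical channel name
  }),
});
// Hand tokenRequest to the client, which uses it to authenticate with Ably.
```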
> meta_description: "Learn about token streaming with Ably AI Transport, including common patterns and the features provided by the Ably solution."
> ---
>
> Token streaming is a technique used with Large Language Models (LLMs) where the model's response is transmitted progressively as each token is generated, rather than waiting for the complete response before transmission begins. This allows users to see the response appear incrementally, similar to watching someone type in real-time, giving an improved user experience. This is normally accomplished by streaming the tokens as the response to an HTTP request from the client.
I'm not sure this paragraph is necessarily correct. We're focusing specifically on streaming per token in this paragraph, but the preferred way, streaming per response, is also valid. Should we talk about that too?
Also, probably an AI addition, but it's "realtime" :D
I am talking about the general definition of token streaming here, rather than anything to do with our recommendations of how to token stream over Ably (which comes later).
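For context on that general definition, a rough sketch of plain HTTP token streaming as the quoted paragraph describes it: the client reads the response body incrementally as tokens are generated. The endpoint and request shape are made up:

```typescript
// Plain HTTP token streaming: read the response body incrementally as the
// model generates it, rather than waiting for the complete response.
const response = await fetch('https://example.com/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ prompt: 'Hello' }),
});

const reader = response.body!.getReader();
const decoder = new TextDecoder();
let text = '';
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  const chunk = decoder.decode(value, { stream: true });
  text += chunk; // accumulate tokens as they arrive
  console.log(chunk); // a real UI would re-render the partial response here
}
```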
> so you can choose the one that best fits your requirements and customise it for your application. The Realtime client maintains a persistent connection to the Ably service. This allows you to publish at very high message rates with the lowest possible latencies, while preserving guarantees around message delivery order. For more information, see [Realtime and REST](/docs/basics#realtime-and-rest).
>
> ### Message-per-response <a id="pattern-per-response"/>
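As a concrete anchor for the discussion, one way the publish side of message-per-response might look with the Ably Realtime client; the channel name, event name, and message shape are all assumptions, not the documented pattern:

```typescript
import * as Ably from 'ably';

// Agent side: once generation is complete, publish the entire response as a
// single Ably message over the Realtime client's persistent connection.
const realtime = new Ably.Realtime({ key: process.env.ABLY_API_KEY! });
const channel = realtime.channels.get('ai:responses:user-123'); // hypothetical name

const fullResponse = '...the entire model output...'; // stand-in for an LLM completion
await channel.publish('response', {
  requestId: 'req-42', // hypothetical id correlating request and response
  text: fullResponse,
});
```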
Do we have any possible use cases to list in here, like we do in pattern-per-token below?
> 
>
> If an HTTP stream is interrupted, for example because the client loses network connection, then any tokens that were transmitted during the interruption will be lost. Ably AI Transport solves this problem by streaming tokens to a [Pub/Sub channel](docs/channels), which is not tied to the connection state of either the client or the agent. A client that [reconnects](/docs/connect/states#connection-state-recovery) can receive any tokens transmitted while it was disconnected. If a new client connects, for example because the user has moved to a different device, then it is possible to hydrate the new client with all the tokens transmitted for the current request as well as the output from any previous requests. The exact mechanism for doing this will depend on which [token streaming pattern](#patterns) you choose to use.
Suggested change:

> - ...The exact mechanism for doing this will depend on which [token streaming pattern](#patterns) you choose to use.
> + ...in detail the mechanism for doing this will depend on which [token streaming pattern](#patterns) you choose to use.
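A hedged sketch of the rewind-based hydration mentioned in the commit list, using the `rewind` channel parameter so a reconnecting or newly connected client replays recent messages; the channel name and rewind window are illustrative:

```typescript
import * as Ably from 'ably';

// Client side: attach with the rewind channel parameter so a reconnecting
// (or brand-new) client replays recent tokens before live messages resume.
const realtime = new Ably.Realtime({ key: process.env.ABLY_API_KEY! });
const channel = realtime.channels.get('ai:responses:user-123', {
  params: { rewind: '2m' }, // replay up to the last two minutes of messages
});

await channel.subscribe((message) => {
  // Messages from the rewind window arrive first, then live ones follow.
  console.log(message.name, message.data);
});
```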
> ## Token streaming patterns <a id="patterns"/>
>
> Ably AI Transport is built on the Pub/Sub messaging platform, which allows you to use whatever message structure and pattern works best for your application. AI transport supports two token streaming patterns using a [Realtime](/docs/api/realtime-sdk) client,
Why the line breaks in this para?
No idea, fixed now.
> ### Message-per-token <a id="pattern-per-token"/>
>
> Token streaming with [message-per-token](/docs/ai-transport/features/token-streaming/message-per-token) is a pattern where every token generated by your model is published as its own Ably message. Each token then appears as one message in the channel history.
>
> This pattern is useful when clients only care about the most recent part of a response and you are happy to treat the channel history as a short sliding window rather than a full conversation log. For example:
The other possible reason for using message-per-token is where you want the Ably transport to preserve the specific breakdown of the response into separate fragments. This might be because some higher-level framework is dependent on knowing that breakdown, or is handling token concatenation in some way that is incompatible with Ably performing concatenation of fragments.
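To make that fragment-preservation point concrete, a small sketch of the subscribe side of message-per-token, where the client performs the concatenation itself; the event name and message fields are assumptions:

```typescript
import * as Ably from 'ably';

// Client side of message-per-token: each Ably message carries one token and
// the client owns the concatenation, so any framework-specific fragment
// boundaries are preserved end to end.
const realtime = new Ably.Realtime({ key: process.env.ABLY_API_KEY! });
const channel = realtime.channels.get('ai:responses:user-123'); // hypothetical name

const responses = new Map<string, string>(); // requestId -> text so far

await channel.subscribe('token', (message) => {
  const { requestId, token } = message.data as { requestId: string; token: string }; // assumed shape
  responses.set(requestId, (responses.get(requestId) ?? '') + token);
});
```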
> - /docs/products/ai-transport
> ---
>
> Ably AI Transport is a solution for building stateful, steerable, multi-device AI experiences into new or existing applications. You can use AI Transport as the transport layer with any LLM or agent framework, without rebuilding your existing stack or being locked to a particular vendor.
I'm not sure what point is being made about not being locked to a vendor.
Co-authored-by: Paddy Byers <paddy.byers@gmail.com>
Force-pushed from 400eb09 to f8056cb.
Description

Adds overview pages to the documentation covering:

- the overall AIT product, listing the major features and linking them to other documentation
- token streaming, including an overview of the proposed architecture and patterns