LLMs

Features (natively supported)

All LLMs implement the Runnable interface, which provides default implementations of all methods: ainvoke, batch, abatch, stream, and astream. This gives every LLM basic support for async, streaming, and batch, implemented by default as follows:

  • Async support defaults to calling the respective sync method in asyncio's default thread pool executor. Moving the call to a background thread lets other async functions in your application make progress while the LLM executes.
  • Streaming support defaults to returning an Iterator (or AsyncIterator, in the case of async streaming) that yields a single value: the final result returned by the underlying LLM provider. This does not provide token-by-token streaming, which requires native support from the LLM provider, but it ensures that code expecting an iterator of tokens works with any of our LLM integrations.
  • Batch support defaults to calling the underlying LLM in parallel for each input, using a thread pool executor (in the sync batch case) or asyncio.gather (in the async batch case). The concurrency can be controlled with the max_concurrency key in RunnableConfig.
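The defaults above can be sketched in plain Python. This is a minimal, self-contained illustration of the pattern, not LangChain's actual implementation; `MinimalLLM` and its `echo:` behavior are hypothetical stand-ins for a provider that only implements a sync call.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import AsyncIterator, Iterator, List


class MinimalLLM:
    """Hypothetical LLM with only a native sync call; everything else is derived."""

    def invoke(self, prompt: str) -> str:
        # A real integration would call the provider's API here.
        return f"echo: {prompt}"

    async def ainvoke(self, prompt: str) -> str:
        # Default async: run the sync call in the default thread pool executor
        # so the event loop stays responsive.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.invoke, prompt)

    def stream(self, prompt: str) -> Iterator[str]:
        # Default streaming: a one-item iterator holding the final result.
        yield self.invoke(prompt)

    async def astream(self, prompt: str) -> AsyncIterator[str]:
        # Default async streaming: a one-item async iterator.
        yield await self.ainvoke(prompt)

    def batch(self, prompts: List[str], max_concurrency: int = 4) -> List[str]:
        # Default sync batch: call the LLM in parallel with a thread pool.
        with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
            return list(pool.map(self.invoke, prompts))

    async def abatch(self, prompts: List[str]) -> List[str]:
        # Default async batch: fan out with asyncio.gather (order is preserved).
        return list(await asyncio.gather(*(self.ainvoke(p) for p in prompts)))


llm = MinimalLLM()
print(llm.invoke("hi"))                     # echo: hi
print(list(llm.stream("hi")))               # ['echo: hi']
print(llm.batch(["a", "b"]))                # ['echo: a', 'echo: b']
print(asyncio.run(llm.abatch(["a", "b"])))  # ['echo: a', 'echo: b']
```

An integration with native support would override these methods, e.g. replacing `stream` with a generator that yields tokens as the provider returns them.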

Each LLM integration can optionally provide native implementations for async, streaming, or batch, which can be more efficient for providers that support them. The table below shows, for each integration, which features are natively implemented.

| Model | Invoke | Async invoke | Stream | Async stream | Batch | Async batch |
|---|---|---|---|---|---|---|
| AI21 | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| AlephAlpha | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| AmazonAPIGateway | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Anthropic | βœ… | βœ… | βœ… | βœ… | ❌ | ❌ |
| Anyscale | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Aviary | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| AzureMLOnlineEndpoint | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| AzureOpenAI | βœ… | βœ… | βœ… | βœ… | βœ… | βœ… |
| Banana | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Baseten | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Beam | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Bedrock | βœ… | ❌ | βœ… | ❌ | ❌ | ❌ |
| CTransformers | βœ… | βœ… | ❌ | ❌ | ❌ | ❌ |
| CTranslate2 | βœ… | ❌ | ❌ | ❌ | βœ… | ❌ |
| CerebriumAI | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| ChatGLM | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Clarifai | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Cohere | βœ… | βœ… | ❌ | ❌ | ❌ | ❌ |
| Databricks | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| DeepInfra | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| DeepSparse | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| EdenAI | βœ… | βœ… | ❌ | ❌ | ❌ | ❌ |
| Fireworks | βœ… | βœ… | ❌ | ❌ | βœ… | βœ… |
| FireworksChat | βœ… | βœ… | ❌ | ❌ | βœ… | βœ… |
| ForefrontAI | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| GPT4All | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| GooglePalm | βœ… | ❌ | ❌ | ❌ | βœ… | ❌ |
| GooseAI | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| GradientLLM | βœ… | βœ… | ❌ | ❌ | ❌ | ❌ |
| HuggingFaceEndpoint | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| HuggingFaceHub | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| HuggingFacePipeline | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| HuggingFaceTextGenInference | βœ… | βœ… | βœ… | βœ… | ❌ | ❌ |
| HumanInputLLM | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| JavelinAIGateway | βœ… | βœ… | ❌ | ❌ | ❌ | ❌ |
| KoboldApiLLM | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| LlamaCpp | βœ… | ❌ | βœ… | ❌ | ❌ | ❌ |
| ManifestWrapper | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Minimax | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| MlflowAIGateway | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Modal | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| MosaicML | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| NIBittensorLLM | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| NLPCloud | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Nebula | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| OctoAIEndpoint | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Ollama | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| OpaquePrompts | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| OpenAI | βœ… | βœ… | βœ… | βœ… | βœ… | βœ… |
| OpenLLM | βœ… | βœ… | ❌ | ❌ | ❌ | ❌ |
| OpenLM | βœ… | βœ… | βœ… | βœ… | βœ… | βœ… |
| Petals | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| PipelineAI | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Predibase | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| PredictionGuard | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| PromptLayerOpenAI | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| QianfanLLMEndpoint | βœ… | βœ… | βœ… | βœ… | ❌ | ❌ |
| RWKV | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Replicate | βœ… | ❌ | βœ… | ❌ | ❌ | ❌ |
| SagemakerEndpoint | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| SelfHostedHuggingFaceLLM | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| SelfHostedPipeline | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| StochasticAI | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| TextGen | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| TitanTakeoff | βœ… | ❌ | βœ… | ❌ | ❌ | ❌ |
| Tongyi | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| VLLM | βœ… | ❌ | ❌ | ❌ | βœ… | ❌ |
| VLLMOpenAI | βœ… | βœ… | βœ… | βœ… | βœ… | βœ… |
| VertexAI | βœ… | βœ… | βœ… | ❌ | βœ… | βœ… |
| VertexAIModelGarden | βœ… | βœ… | ❌ | ❌ | βœ… | βœ… |
| Writer | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |
| Xinference | βœ… | ❌ | ❌ | ❌ | ❌ | ❌ |

πŸ“„οΈ Amazon API Gateway

Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. APIs act as the "front door" for applications to access data, business logic, or functionality from your backend services. Using API Gateway, you can create RESTful APIs and WebSocket APIs that enable real-time two-way communication applications. API Gateway supports containerized and serverless workloads, as well as web applications.

πŸ“„οΈ NLP Cloud

NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, paraphrasing, grammar and spelling correction, keyword and keyphrase extraction, chatbots, product description and ad generation, intent classification, text generation, image generation, blog post generation, code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic similarity, tokenization, POS tagging, embeddings, and dependency parsing. It is production-ready and served through a REST API.

πŸ“„οΈ OpaquePrompts

OpaquePrompts is a service that enables applications to leverage the power of language models without compromising user privacy. Designed for composability and ease of integration into existing applications and services, OpaquePrompts is consumable via a simple Python library as well as through LangChain. Perhaps more importantly, OpaquePrompts leverages the power of confidential computing to ensure that even the OpaquePrompts service itself cannot access the data it is protecting.