When deploying on Vespa Cloud, you don't need to supply your own machine-learned models; you can use the standard models provided by Vespa Cloud. These models are always available on Vespa Cloud and never change.
You specify a Vespa Cloud provided model by setting the model-id attribute on the config value for that model. This can be done with any config field of type model, whether the config is from Vespa or defined by you.
See the complete example below for more.
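As a minimal sketch of the attribute described above, a config value that refers to a Vespa Cloud model could look like the following. The element name myModel is hypothetical and stands for any model-typed field in your own or a Vespa-provided config definition; minilm-l6-v2 is the model id used elsewhere on this page.

```xml
<!-- Hypothetical config field "myModel"; substitute the name of your own model-typed field -->
<myModel model-id="minilm-l6-v2"/>
```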
You can also specify both a model id, which will be used on Vespa Cloud, and a URL or path, which will be used on self-hosted deployments:
<aModelConfigValue model-id="minilm-l6-v2" path="myAppPackageModels/myModel.onnx"/>
This can be useful, for example, to create an application package that uses models from Vespa Cloud in production and a scaled-down or dummy model for self-hosted development.
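The same pattern should carry over to a URL in place of a path. This sketch assumes the config value accepts a url attribute, as the url/path wording above suggests; the URL shown is a placeholder, not a real model location.

```xml
<!-- model-id applies on Vespa Cloud; the url (a placeholder here) applies on self-hosted deployments -->
<aModelConfigValue model-id="minilm-l6-v2" url="https://example.com/models/myModel.onnx"/>
```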
These are the models available on Vespa Cloud:
|minilm-l6-v2|A small, fast BERT transformer model in ONNX format. It maps sentences and paragraphs to a 384-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search.|
| |A larger BERT transformer model in ONNX format with higher accuracy and cost than minilm-l6-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search.|
|bert-base-uncased|A vocabulary text file in the format expected by WordPiece: one text token per line, suitable for use with BERT models.|
Models in this table can safely be used in production and will never be removed or changed.
Here is an example of configuring the Vespa-provided BERT embedder with models supplied by Vespa Cloud:
<component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
    <config name="embedding.bert-base-embedder">
        <transformerModel model-id="minilm-l6-v2"/>
        <tokenizerVocab model-id="bert-base-uncased"/>
    </config>
</component>
By putting this under the <container> element in your services.xml file, your application will have support for text embedding that is suitable for production.
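For context, a minimal services.xml sketch with the embedder placed under the container element might look like this. The container id and the version attributes are assumptions chosen for illustration; a real application would typically also declare its content cluster and any other services it needs.

```xml
<?xml version="1.0" encoding="utf-8"?>
<services version="1.0">
    <!-- Container id "default" is an assumed name for illustration -->
    <container id="default" version="1.0">
        <!-- Embedder component from the example above, using Vespa Cloud provided models -->
        <component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
            <config name="embedding.bert-base-embedder">
                <transformerModel model-id="minilm-l6-v2"/>
                <tokenizerVocab model-id="bert-base-uncased"/>
            </config>
        </component>
    </container>
</services>
```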