Vespa Cloud provides a set of machine-learned models that you can use in your applications. These models will always be available on Vespa Cloud and never change.
To use a model provided by Vespa Cloud, set the model-id attribute where you specify a model config. For example, when configuring the BertBaseEmbedder component provided by Vespa, you can write:
<component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
    <config name="embedding.bert-base-embedder">
        <transformerModel model-id="minilm-l6-v2"/>
        <tokenizerVocab model-id="bert-base-uncased"/>
    </config>
</component>
By putting this under the <container> element in your services.xml file, your application will have support for text embedding suitable for production.
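To illustrate, such an embedder can then be referenced from the embed indexing expression in a schema. A minimal sketch, where the schema name, field names, and the reference to myEmbedderId are illustrative assumptions:

```
schema doc {
    document doc {
        field text type string {
            indexing: summary | index
        }
    }
    # Generate an embedding from the text field at feed time.
    # 384 matches the output dimension of minilm-l6-v2.
    field embedding type tensor<float>(x[384]) {
        indexing: input text | embed myEmbedderId | attribute
        attribute {
            distance-metric: angular
        }
    }
}
```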
These models are currently available on Vespa Cloud:
|Model id|Description|
|minilm-l6-v2|A small, fast BERT transformer model in ONNX format. It maps sentences and paragraphs to a 384-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search.|
|mpnet-base-v2|A larger BERT transformer model in ONNX format, with higher accuracy and cost than minilm-l6-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search.|
|bert-base-uncased|A vocabulary text file in the format expected by WordPiece: one text token per line, suitable for use with BERT models.|
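As noted above, the embedding models are intended for nearest neighbor search with distance-metric: angular. A sketch of a query request using query-time embedding, where the field name embedding and the query tensor name q are illustrative assumptions:

```json
{
    "yql": "select * from doc where {targetHits: 10}nearestNeighbor(embedding, q)",
    "input.query(q)": "embed(what is semantic search?)"
}
```

The embed function invokes the configured embedder on the query text, so documents and queries are mapped into the same vector space.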
These models can safely be used in production and will never be removed or changed.
You can also specify both a model-id, which is used when deployed to Vespa Cloud, and a url or path, which is used in self-hosted deployments:
<transformerModel model-id="minilm-l6-v2" path="myAppPackageModels/myModel.onnx"/>
This is useful, for example, to create an application package that uses a model from Vespa Cloud in production and a scaled-down or dummy model for self-hosted development.
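A url can be combined with a model-id in the same way as a path; the URL below is a hypothetical placeholder:

```
<transformerModel model-id="minilm-l6-v2"
                  url="https://example.com/models/dev-model.onnx"/>
```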
Specifying a model-id can be done for any config field of type model, whether the config is from Vespa or defined by you.
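As a sketch of the latter, a model-type field in your own config definition could look like the following; the file name, namespace, and field name are illustrative assumptions:

```
# src/main/resources/configdefinitions/my-component.def
namespace=example
myModel model
```

This field can then be configured in services.xml like any Vespa-provided model field:

```
<config name="example.my-component">
    <myModel model-id="minilm-l6-v2" path="models/dummy.onnx"/>
</config>
```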