Vespa Cloud provides a set of machine-learned models that you can use in your applications. These models will always be available on Vespa Cloud and never change.
To use a model provided by Vespa Cloud, set the model-id attribute where you specify a model config. For example, when configuring the Bert embedder provided by Vespa, you can write:
<component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
    <config name="embedding.bert-base-embedder">
        <transformerModel model-id="minilm-l6-v2"/>
        <tokenizerVocab model-id="bert-base-uncased"/>
    </config>
</component>
By putting this under the <container> element in your services.xml file, your application will have support for text embedding suitable for production.
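For context, a minimal services.xml sketch wrapping the component above could look like this; the container id and the document-api and search elements are illustrative assumptions, not requirements of the embedder itself:

<!-- services.xml: minimal sketch; ids and sibling elements are illustrative -->
<services version="1.0">
    <container id="default" version="1.0">
        <component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
            <config name="embedding.bert-base-embedder">
                <transformerModel model-id="minilm-l6-v2"/>
                <tokenizerVocab model-id="bert-base-uncased"/>
            </config>
        </component>
        <document-api/>
        <search/>
    </container>
</services>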
These models are currently available on Vespa Cloud:
| minilm-l6-v2 | |
|---|---|
| A small, fast BERT transformer model in ONNX format. It maps sentences & paragraphs to a 384-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search. | |
| License | apache-2.0 |
| Source | https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2 |

| mpnet-base-v2 | |
|---|---|
| A larger BERT transformer model in ONNX format, with higher accuracy and higher cost than minilm-l6-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and is suitable for tasks like clustering or semantic search. Use with distance-metric: angular in nearest neighbor search. | |
| License | apache-2.0 |
| Source | https://huggingface.co/sentence-transformers/all-mpnet-base-v2 |

| bert-base-uncased | |
|---|---|
| A vocabulary text file in the format expected by WordPiece: one text token per line, suitable for use with BERT models. | |
| License | apache-2.0 |
| Source | https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2 |
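Both embedding models above are meant to be paired with distance-metric: angular in nearest neighbor search. As a sketch of what that looks like in a schema, assuming the embedder component configured earlier and illustrative schema and field names:

schema doc {
    document doc {
        field text type string {
            indexing: summary | index
        }
    }
    # Synthetic field: embeds the text field using the embedder component above.
    # 384 dimensions matches minilm-l6-v2; mpnet-base-v2 would need x[768].
    field embedding type tensor<float>(x[384]) {
        indexing: input text | embed myEmbedderId | attribute | index
        attribute {
            distance-metric: angular
        }
    }
}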
These models can safely be used in production and will never be removed or changed.
You can also specify both a model-id, which is used on Vespa Cloud, and a url/path, which is used in self-hosted deployments:
<transformerModel model-id="minilm-l6-v2" path="myAppPackageModels/myModel.onnx"/>
This is useful, for example, to create an application package that uses a model from Vespa Cloud in production and a scaled-down or dummy model for self-hosted development.
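For illustration, a sketch of the Bert embedder config from above with self-hosted fallbacks for both models; the fallback file names are placeholders, and the vocabulary path in particular is an assumption:

<component id="myEmbedderId" class="ai.vespa.embedding.BertBaseEmbedder" bundle="model-integration">
    <config name="embedding.bert-base-embedder">
        <!-- model-id is used on Vespa Cloud; path (relative to the application package) is used self-hosted -->
        <transformerModel model-id="minilm-l6-v2" path="myAppPackageModels/myModel.onnx"/>
        <tokenizerVocab model-id="bert-base-uncased" path="myAppPackageModels/myVocab.txt"/>
    </config>
</component>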
A model-id can be specified for any config field of type model, whether the config is from Vespa or defined by you.
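For illustration, assuming you have defined your own config with a field myModel of type model (the component class, bundle, and config names here are hypothetical), the same mechanism applies:

<component id="myComponent" class="com.example.MyComponent" bundle="my-bundle">
    <config name="com.example.my-component">
        <!-- any config field of type model accepts model-id, path, and/or url -->
        <myModel model-id="minilm-l6-v2"/>
    </config>
</component>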