Memory Visualizer

The schema defines fields, types of fields and settings per field, e.g.

schema product {

    document product {

        field productId type long {
            indexing: summary | attribute
            attribute: fast-search
            rank: filter
        }

        field description type string {
            indexing: summary | index
        }

        ...
    }
}

Field types are often dictated by the application’s data, but how the fields are used is also important. Examples:

  • High-speed updates to documents can be achieved by using attributes, where updates are memory writes only, even for a field that could otherwise be a summary or indexed field, use case permitting
  • String fields can be faster than numeric fields if the access is by equality (not by range, like “price < 100”)
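
The two usage patterns above can be sketched in the same schema language. This is a minimal illustration under assumptions, not a recommendation: the field names inStock and categoryCode are hypothetical, and whether a string attribute actually beats a numeric one for equality lookups depends on the data, so it is worth measuring.

```
schema product {

    document product {

        # Hypothetical field that is updated frequently.
        # Declared as an attribute so that updates are memory writes.
        field inStock type int {
            indexing: summary | attribute
        }

        # Hypothetical identifier that is only matched by equality,
        # never by range, so it is declared as a string rather than a number.
        field categoryCode type string {
            indexing: summary | attribute
            attribute: fast-search
            rank: filter
        }
    }
}
```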

In short, there are functional, performance and cost tradeoffs. There are guides to help estimate resource use, see attributes, but one often does not know factors like the number of unique values in the data. It might well be easier to feed the data to Vespa Cloud, make schema changes online and observe the effect. Vespa Cloud has two features that accelerate this process, the Memory Visualizer and Automated Reindexing:

[Figure: Memory Visualizer]

The Memory Visualizer lets you browse the attribute fields and observe their absolute and relative sizes. This can help find the cost drivers for memory-bound applications and identify bottlenecks to optimize.

Adding or changing fields

Use the Memory Visualizer to track memory when adding a field. Attribute, index and summary fields behave differently when it comes to empty fields and memory use, depending on data type. Here, the tool indicates headroom for more data to assist in the evaluation.
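
As an illustration, adding a field could look like the sketch below. The brand field is a hypothetical addition to the product schema shown earlier, and its type and settings are assumptions chosen only for the example. After deploying the change and feeding some data, its footprint can be compared with the existing attributes in the Memory Visualizer.

```
schema product {

    document product {

        # Existing field from the example schema; other fields omitted.
        field productId type long {
            indexing: summary | attribute
            attribute: fast-search
            rank: filter
        }

        # Hypothetical new attribute field whose memory use is to be tracked.
        field brand type string {
            indexing: summary | attribute
        }
    }
}
```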

Use the field change procedure to plan schema changes so that data remains available during the transition. The Console will display reindexing progress:

[Figure: Reindex progress]

This makes it easy to estimate when the reindexing is complete. Note that a node restart might be required for all attribute data structures to drain, so keep this in mind when using the Memory Visualizer again.

Using the visualizer

Some of the fields have a different color code. To understand the types of fields, read more about the content node data structures - in short:

  • Ready are indexed documents that might or might not be included in queries
  • Not Ready are document replicas stored on the nodes that might be indexed later
  • Removed are deleted documents, removed either by the application or because the document replica has been moved to another node
  • Documentmetastore is the document ID mapping - see attributes