Many applications running on Vespa Cloud handle data that would be classified as personal data under the European Union’s (EU) General Data Protection Regulation (GDPR).
When running an application on Vespa Cloud, the application owner has sole responsibility for ensuring that all data in the application is handled in compliance with GDPR requirements. The overall data retention policies for Vespa Cloud ensure that non-application data handled by Vespa is compliant.
As an application owner, you can take the following considerations into account to help stay GDPR-compliant:
The management of data stored in an application running on Vespa Cloud is the responsibility of the application owner and, as such, Vespa Cloud does not have any retention policy for this data as long as it is stored by the application.
The following data retention policies apply to Vespa Cloud:
After a node previously allocated to an application has been deallocated (e.g. due to the application being deleted by the application owner), all application data will be deleted within four hours.
All application log data will be deleted from Vespa servers after no more than 30 days (most often sooner), depending on log volume, allocated disk resources, etc. PLEASE NOTE: This is the theoretical maximum retention time. See the archive guide for how to ensure access to your application logs.
Below, we dive into the details around how we handle data in regard to GDPR. For most application owners, following the guidelines above should be sufficient, but an understanding of the underlying details can help make informed decisions to ensure your application is compliant.
Data inputs:
Vespa Cloud is a content-agnostic service with dedicated instances per application. Applications feed data (documents) into the system according to their application-specific schema. The data may or may not include user data depending on the application schema and supported use cases.
Application owners typically retrieve data out of their Vespa application by sending HTTP
requests containing a query for what data to retrieve. This data may or may not include user
data depending on application schema and supported use-cases.
Requests to Vespa Cloud most often come via an application’s middle tier - there is no direct user (e.g. browser) access to the Vespa application.
Authentication to the Vespa Cloud console happens through the external Auth0 identity management service which, in turn, supports various identity providers such as Google and GitHub.
All user data collected by the console comes from either information entered directly in the Vespa Cloud sign-up form or metadata associated with the identity used by the application owner to authenticate against the service, typically an e-mail address.
Collected metadata:
For all incoming requests - either document feed or query requests - we keep standard HTTP access and connection logs. As requests to hosted Vespa typically do not originate from the end user directly, but come via the application’s middle tier, whether there is actual user data stored in the access logs depends on what data the application passes on in its requests to Vespa.
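As an illustration of the point above, a middle tier can drop user-identifying parameters before forwarding a query, so that they never reach the access logs. The following is a minimal sketch, not a recommended implementation: the parameter names in PERSONAL_PARAMS, the function build_query_url, and the endpoint passed to it are all hypothetical, and the /search/ path is assumed to be the query endpoint of your application.

```python
from urllib.parse import urlencode

# Hypothetical names of request parameters that the middle tier
# treats as personal data. Adjust to your own application.
PERSONAL_PARAMS = {"email", "user_id", "ip"}


def build_query_url(endpoint: str, params: dict[str, str]) -> str:
    """Build a query URL, leaving out parameters that carry personal data.

    Anything placed in the URL may end up in HTTP access logs, so the
    filtering happens before the request is sent.
    """
    safe = {k: v for k, v in params.items() if k not in PERSONAL_PARAMS}
    # Assumption: queries are served under /search/ on the given endpoint.
    return f"{endpoint}/search/?{urlencode(safe)}"
```

Calling build_query_url with a dictionary containing both a yql entry and an email entry would yield a URL that carries only the yql parameter. Filtering by parameter name does not catch personal data embedded in the query text itself, which needs separate handling.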
Purpose for processing data:
Processing performed on data:
External parties/system receiving data: