What is the EU AI Act?
As AI adoption accelerates, driven by technologies like Generative AI, the EU recognised the importance of ensuring fairness and accountability in AI systems. The goal of the EU AI Act is to protect consumers from harmful outcomes caused by the output of AI models, much like the GDPR protects the rights of end users and gives them ownership of their data.
In essence, you can break the AI Act down into the following sections:
- Banned applications: these are applications that could threaten the rights of EU citizens, such as building facial recognition databases from scraped images or categorising people based on race, sexual orientation, and similar characteristics. It is easy to see how these systems could negatively impact some groups in society, which is why they are banned, apart from narrow exceptions such as specific law-enforcement use cases where they are deemed necessary.
- Obligations for high-risk systems: some applications are classified as high-risk due to the impact they have on the lives of the people interacting with them. These applications range from recruitment, healthcare, and the operation of critical infrastructure to more troubling cases such as systems intended to influence electoral outcomes. Due to their high level of impact, these systems must follow strict implementation guidelines. More on that later.
- Guardrails for general-purpose AI systems: generative AI is quickly becoming more prevalent in our society. These models are powerful, but they are prone to hallucinations, some have raised copyright concerns, and the possibility of misinformation makes them a challenge to govern. Generative AI applications will have to meet specific requirements, including drawing up technical documentation, complying with EU copyright law, and publishing detailed summaries of the content used for training. More on this later in this blog.
How can Devoteam’s Vertex AI Foundations help you comply?
Devoteam’s Vertex AI Foundations is an MLOps accelerator to get you up and running quickly with Google’s Vertex AI product suite. This is achieved by combining Google’s Vertex AI product suite with our experience implementing AI systems into production. These principles and the accelerator are not new, but the goal of this section is to explain how you can solve the requirements set forth by the AI Act.
1. Traceability
The AI Act grants individuals the right to seek explanations for decisions made using AI, highlighting the crucial concept of traceability. Traceability entails the ability to explain an AI model’s output by examining all the factors that influenced its training and prediction. These factors encompass the training data employed, the data transformations applied, the hyperparameters utilised during training, and the data utilised during runtime to generate predictions.
You can split the problem into two parts: training and inference.
- Traceability in training
To ensure traceability in training, you need to have a way to trace back from your model artefact (the trained model used in our application) to the source data while keeping track of any modifications that might have occurred. To achieve this, the Foundations uses Vertex AI Pipelines, an MLOps tool based on the Open Source framework Kubeflow.
Vertex AI Pipelines are made up of components, which execute specific code, and artefacts, which are the output of components being passed around. The important thing to note is that one pipeline execution bundles together all artefacts and versions them together. This allows you to look at one version of a specific model and trace it back to the version of the input data that you used to train on.
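To make the versioning idea concrete, here is a minimal, illustrative Python sketch — not the actual Vertex AI Pipelines API — of how one pipeline run bundles versioned artefacts so that a model version can be traced back to the input data it was trained on. All names (`Artifact`, `PipelineRun`, the versions) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Artifact:
    """A versioned input or output of a pipeline component."""
    name: str
    version: str

@dataclass
class PipelineRun:
    """One pipeline execution bundles every artefact it produced or consumed."""
    run_id: str
    artifacts: List[Artifact] = field(default_factory=list)

def trace_model_to_data(runs: List[PipelineRun], model_version: str) -> List[Artifact]:
    """Find the run that produced the given model version and return its other artefacts."""
    for run in runs:
        if any(a.name == "model" and a.version == model_version for a in run.artifacts):
            return [a for a in run.artifacts if a.name != "model"]
    return []

run = PipelineRun(
    run_id="run-001",
    artifacts=[
        Artifact("raw_dataset", "2024-05-01"),
        Artifact("transformed_dataset", "2024-05-01-v2"),
        Artifact("model", "v3"),
    ],
)
inputs = trace_model_to_data([run], "v3")
print([a.name for a in inputs])  # the data artefacts behind model v3
```

Because every run versions its artefacts together, answering "which data trained this model?" is a single lookup rather than an archaeology exercise.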
By tracking the lineage of a model’s output data back to its training data, Vertex AI Pipelines enables you to identify and address biases in the training data that may cause biased model behaviour. This proactive approach to preventing discriminatory outcomes aligns with the AI Act’s focus on promoting fairness and accountability in AI development.
Image 1: An example of a Vertex AI Pipeline
- Traceability at inference time
At inference time, predictions often rely on data that reflects the current state of events. If that data is not properly logged, it can be difficult to trace why a specific output was given. The Foundations solves this through tight integration with Google’s Cloud Monitoring suite, allowing you to keep track of every prediction the model makes.
Next to that, Vertex AI also offers its Explainable AI feature, providing explanations for specific requests based on the input features provided to the model. This can then be combined with traceability at training time to drill down and go from prediction to potential anomaly in the training dataset.
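As an illustration of what such prediction logging might capture, here is a hedged Python sketch. The field names are assumptions, and in a real setup the entry would be shipped to Cloud Logging or BigQuery rather than printed:

```python
import json
import time
import uuid

def log_prediction(model_version: str, features: dict, prediction) -> dict:
    """Build a structured log entry tying a prediction to its inputs and model version."""
    entry = {
        "prediction_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "model_version": model_version,
        "features": features,
        "prediction": prediction,
    }
    # Stand-in for a Cloud Logging / BigQuery sink: serialise and emit the entry.
    print(json.dumps(entry, default=str))
    return entry

entry = log_prediction("v3", {"age_group": "25-34", "region": "BE"}, 0.87)
```

With the model version and input features stored alongside each prediction, a surprising output can later be traced to the exact inputs that produced it.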
2. Operational Performance Quality
Another requirement put in place by the AI Act is monitoring the operational performance of the model. For AI systems, this means that companies must keep track of the predictions and the actual observations they should be compared to. Next to monitoring predictions, one can also monitor the input data provided to detect data drift. Data drift is when the distribution of input data changes so significantly that the model output could be negatively affected.
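One common way to quantify data drift is the Population Stability Index (PSI). The sketch below is a minimal, self-contained illustration for a categorical feature — not the detection mechanism Vertex AI uses internally — with the usual rule of thumb that a PSI above 0.2 signals significant drift:

```python
import math
from collections import Counter
from typing import List

def psi(reference: List[str], current: List[str]) -> float:
    """Population Stability Index between a reference and a current sample
    of a categorical feature. PSI > 0.2 is a common drift threshold."""
    categories = set(reference) | set(current)
    ref_counts, cur_counts = Counter(reference), Counter(current)
    score = 0.0
    for cat in categories:
        # A small floor avoids log(0) for categories absent from one sample.
        p = max(ref_counts[cat] / len(reference), 1e-6)
        q = max(cur_counts[cat] / len(current), 1e-6)
        score += (q - p) * math.log(q / p)
    return score

stable = psi(["a"] * 50 + ["b"] * 50, ["a"] * 48 + ["b"] * 52)   # tiny shift
drifted = psi(["a"] * 50 + ["b"] * 50, ["a"] * 5 + ["b"] * 95)   # large shift
```

A monitoring job would compute this per feature against the training distribution and raise an alert when the threshold is crossed.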
The Foundations support this by integrating with Google’s Monitoring suite and BigQuery to keep track of model outputs. You can use this to detect model degradation. Because the Foundations uses Vertex AI Pipelines, a new training pipeline can automatically be triggered once such a degradation has been detected. Next to that, there is the option for human intervention should this be necessary.
3. Security
Security is at the heart of everything we do at Devoteam, and the same is true for the Vertex AI Foundations. It complies with security best practices and leverages the technical expertise we have on Google Cloud to ensure a secure setup. Vertex AI Foundations manages all permissions, service accounts and networking through infrastructure as code using Terraform. This ensures least privilege access is enforced everywhere.
The Foundations also supports clear environment separation, so that production data is not replicated outside of the production environment.
4. Technical Documentation
Another requirement is that of technical documentation. AI systems need proper documentation when used in production. Using the Vertex AI Foundations, customers benefit from the fact that Devoteam has extensive documentation on how the Foundations are set up and why specific decisions have been made.
By providing comprehensive documentation templates, the Vertex AI Foundations streamlines the documentation process, allowing you to focus on writing content specific to your unique use cases. This substantial reduction in documentation time lets you allocate resources more effectively and prioritise tasks that contribute directly to business value.
5. Data Quality
One more requirement is the use of high-quality data, a broad subject in its own right. We will therefore leave the discussion on data warehousing out of scope and focus on what is specific to machine learning. Keep an eye out for a later blog post on how our Data Foundations framework can help achieve data quality goals and GDPR compliance.
Assuming high-quality data is available in the organisation, it is still important to set up a feature store. A feature store contains the input features for the models an organisation uses, and it is the place to standardise which features models can and should use. If sensitive data is not made available in the feature store, models cannot use it in training or prediction.
Next to the governance aspect, the feature store is also important because it provides a historical view of features over time. For example, your organisation’s customers might move to different regions in their lifetime. This will impact the predictions a model might make about them. To construct good quality training datasets, it is important to be able to link events to the features at the point in time of the event.
To give an illustration: let’s say there is a feature called “age group” which is used to see what products a customer might be interested in. It is easy to see that when a person is young, they buy significantly different products from when they are older. We must link purchases to the age group of a user at the time of the purchase.
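A point-in-time lookup like this can be sketched in a few lines of Python. The feature history and values below are hypothetical; a real feature store performs the same join at scale:

```python
from bisect import bisect_right

# Hypothetical feature history for one customer: (effective_from_year, value),
# sorted by the time the value came into effect.
age_group_history = [
    (2010, "18-24"),
    (2017, "25-34"),
    (2024, "35-44"),
]

def feature_as_of(history, event_time):
    """Return the feature value that was in effect at event_time."""
    idx = bisect_right([t for t, _ in history], event_time) - 1
    if idx < 0:
        return None  # no feature value known before the event
    return history[idx][1]

# Link each purchase to the customer's age group at the time of the purchase.
purchases = [(2012, "skateboard"), (2020, "office chair")]
labelled = [(item, feature_as_of(age_group_history, t)) for t, item in purchases]
print(labelled)  # [('skateboard', '18-24'), ('office chair', '25-34')]
```

Joining on the feature value as of the event time, rather than the current value, is what keeps training datasets free of this kind of temporal leakage.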
Lastly, a feature store is a great place to perform data drift detection. You can use this to automatically re-train models that use these features.
How can Devoteam’s Vertex AI Foundations help you work with Generative AI?
The Vertex AI Foundations also includes modules which support using Google’s Generative AI products. These products mostly rely on Google’s Generative AI models or Open Source models, meaning it is up to those foundation model providers to comply with the requirements of disclosing training datasets.
One exception is fine-tuning, where a company starts from a foundation model but tunes it on its own training data. In that case, the company will need to show what training data was used. Vertex AI integrates fine-tuning with its existing pipeline infrastructure, ensuring that model lineage is maintained throughout the fine-tuning process. This lineage allows users to trace the model’s output back to the training data, enabling them to identify and correct any biases that may arise due to changes in the data.
The Foundations enables easy use of these products and provides templates for common use cases such as Retrieval Augmented Generation (RAG). This approach fetches data from external sources to supplement the prompt given to a Generative AI model, which means everything needs to be logged to explain why the model gave a specific output based on the model version used and the data it was supplemented with. The Vertex AI Foundations again achieves this through a combination of Cloud Monitoring and Vertex AI Pipelines.
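To illustrate what such RAG logging needs to capture, here is a minimal Python sketch. All field names and identifiers (including the model version string) are hypothetical, and the print call stands in for a Cloud Monitoring or BigQuery sink:

```python
import json
from datetime import datetime, timezone

def log_rag_call(model_version: str, prompt: str, retrieved_docs: list, answer: str) -> dict:
    """Record everything needed to explain a RAG answer later:
    the model version, the user prompt, and the documents that augmented it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "prompt": prompt,
        "retrieved_doc_ids": [d["id"] for d in retrieved_docs],
        "answer": answer,
    }
    print(json.dumps(record))  # stand-in for a monitoring / warehouse sink
    return record

record = log_rag_call(
    "model-v1",  # hypothetical model identifier
    "What is our refund policy?",
    [{"id": "doc-42", "text": "Refunds are accepted within 30 days."}],
    "Refunds are accepted within 30 days of purchase.",
)
```

With the retrieved document IDs and the model version captured per request, a questionable answer can be traced back to the exact sources that shaped it.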
Google provides indemnity on a large part of its Generative AI stack. Provided you do not act with malicious intent and do not misuse the product suite, Google will indemnify you against third-party IP claims. This is an important safety net for any customer using Generative AI, as these lawsuits can be costly and time-consuming.
A solid basis for any ML system
The AI Act enforces some important requirements for the use of AI, which is a good thing for society and something everyone will benefit from in the future. Tackling these requirements can seem challenging at the start of your journey, but the good news is that there are solutions!
Devoteam’s Vertex AI Foundations can get you up and running with all the best practices described here in a matter of days, ensuring that any ML system you build and put into production rests on a solid base that complies with current regulations.