Google releases Gemma, a new AI model designed with AI researchers in mind

Google is building on the success of its Gemini launch with the release of a new family of lightweight AI models called Gemma. The Gemma models are open and are designed to be used by researchers and developers to innovate safely with AI. 

“We believe the responsible release of LLMs is critical for improving the safety of frontier models, for ensuring equitable access to this breakthrough technology, for enabling rigorous evaluation and analysis of current techniques, and for enabling the development of the next wave of innovations,” the researchers behind Gemma wrote in a technical report.  

Along with Gemma, Google is also releasing a new Responsible Generative AI Toolkit that includes capabilities for safety classification and debugging, as well as Google’s best practices for developing large language models.

Gemma comes in two sizes: 2B and 7B parameters. The models share many of the same technical and infrastructure components as Gemini, which Google says enables Gemma models to “achieve best-in-class performance for their sizes compared to other open models.”

Gemma also provides integration with JAX, TensorFlow, and PyTorch, allowing developers to switch between frameworks as needed. 

The models can run on a variety of devices, including laptops, desktops, IoT and mobile devices, and in the cloud. Google also partnered with NVIDIA to optimize Gemma for NVIDIA GPUs. 

Gemma has also been optimized for Google Cloud, which offers benefits like one-click deployment and built-in inference optimizations. It is accessible through Google Cloud’s Vertex AI Model Garden, which now contains over 130 AI models, and through Google Kubernetes Engine (GKE).

According to Google Cloud, Gemma can be used through Vertex AI to support real-time generative AI tasks that require low latency, or to build apps that handle lightweight AI tasks such as text generation, summarization, and Q&A. 
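For tasks like these, prompts sent to Gemma’s instruction-tuned variants are typically wrapped in a turn-based template. The sketch below assumes the control tokens shown here; the model card documents the exact format, so treat this as illustrative rather than authoritative:

```python
def format_gemma_prompt(user_message: str) -> str:
    """Wrap a user message in a turn-based prompt template.

    The <start_of_turn>/<end_of_turn> control tokens below are the
    commonly documented format for Gemma's instruction-tuned models;
    verify against the official model card before relying on them.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# Example: a summarization-style request formatted for the model.
print(format_gemma_prompt("Summarize this article in two sentences."))
```

The trailing `<start_of_turn>model\n` cues the model to begin its reply, which is why the template ends mid-turn.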

“With Vertex AI, builders can reduce operational overhead and focus on creating bespoke versions of Gemma that are optimized for their use case,” Burak Gokturk, VP and GM of Cloud AI at Google Cloud, wrote in a blog post.

On GKE, the potential use cases include deploying custom models in containers alongside applications, customizing model serving and infrastructure configuration without needing to provision nodes, and integrating AI infrastructure quickly and in a scalable way. 

Gemma was designed in accordance with Google’s Responsible AI Principles. Its development included automatic filtering techniques to remove personal data from training sets, reinforcement learning from human feedback (RLHF) to align the models with responsible behaviors, and manual evaluations that included red teaming, adversarial testing, and assessments of model capabilities for potentially harmful outcomes. 
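To give a flavor of the first of those techniques, a drastically simplified filter for personal data might look like the sketch below. Google has not published its actual pipeline, and production systems go far beyond pattern matching; this regex-based version is purely illustrative:

```python
import re

# Toy illustration of automatic filtering of personal data from
# training text. The patterns below catch only simple email addresses
# and US-style phone numbers; real pipelines are far more sophisticated.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def scrub_personal_data(text: str) -> str:
    """Replace detected personal identifiers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)

print(scrub_personal_data("Contact jane@example.com or 555-123-4567."))
# -> Contact [EMAIL] or [PHONE].
```

Replacing matches with placeholder tokens, rather than deleting them, keeps sentence structure intact so the filtered text remains usable for training.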

Because the models were designed to promote AI research, Google is offering free credits to developers and researchers who want to use Gemma. It can be accessed for free through Kaggle or Colab, and first-time Google Cloud users can get a $300 credit. Researchers can also apply for up to $500,000 for their projects. 

“Beyond state-of-the-art performance measures on benchmark tasks, we are excited to see what new use-cases arise from the community, and what new capabilities emerge as we advance the field together. We hope that researchers use Gemma to accelerate a broad array of research, and we hope that developers create beneficial new applications, user experiences, and other functionality,” the researchers wrote.
