Google I/O 2024 showcased a wide range of projects and announcements, but one stood out: India’s Project Navarasa. Built by Telugu LLM Labs, the initiative fine-tunes Google’s Gemma into a multilingual variant that supports 15 Indic languages, and it became a focal point of Google’s presentation.
Harsh Dhand, Google's Head of APAC Research Partnerships, emphasised the significance of cultural alignment in technological development. "When technology resonates with a culture, it grasps the intricacies of diverse nations like India," he remarked.
What is Gemma & Project Navarasa?
At the heart of Project Navarasa lies Gemma’s tokeniser, which underpins text generation across the many Indic languages the project serves.
Ramsri Goutham Golla, co-creator of Navarasa, credited Gemma’s large vocabulary as essential to adapting the model for initiatives like Navarasa. "Gemma's prowess lies in its robust tokeniser, accommodating myriad words, symbols, and characters across diverse alphabets and language systems," Golla explained.
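For developers curious what that means in practice, the tokeniser can be inspected directly. The sketch below is a minimal example, assuming the Hugging Face `transformers` library and access to the gated `google/gemma-7b` checkpoint (which requires accepting Google’s licence); it prints the vocabulary size and tokenises a Telugu sentence.

```python
# Minimal sketch: inspect Gemma's tokeniser on Indic text.
# Assumes `pip install transformers` and Hub access to google/gemma-7b.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")

# Gemma ships a large SentencePiece vocabulary (~256k entries),
# which keeps token counts low even for non-Latin scripts.
print(len(tokenizer))  # vocabulary size

telugu = "నమస్కారం, మీరు ఎలా ఉన్నారు?"  # "Hello, how are you?" in Telugu
tokens = tokenizer.tokenize(telugu)
print(len(tokens), tokens)
```

A smaller vocabulary would shatter the same sentence into many more byte-level fragments, which is exactly the overhead the Navarasa team was trying to avoid.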
Golla said the vision for Navarasa is inclusivity: users should be able to converse with the model in their native languages and receive responses in kind.
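As an illustration of that vision, a Navarasa-style fine-tune can be prompted directly in a native language. The Hub ID below follows Telugu LLM Labs’ published naming but is an assumption here; substitute whichever checkpoint you actually use.

```python
# Sketch: generate a Telugu answer with a Navarasa-style Gemma fine-tune.
# The model ID is assumed; swap in the checkpoint you have access to.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa-2.0"  # assumed ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "ప్రశ్న: భారతదేశ రాజధాని ఏమిటి?\nసమాధానం:"  # Telugu: "Q: What is India's capital? A:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```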
Developer feedback has favoured Gemma over Llama for Indic languages. Adithya S Kolavi, founder of Cognitive Lab, noted that Gemma handles Indic text more efficiently than the Llama 2 and Llama 3 models.
Vivek Raghavan, co-founder of Sarvam AI, echoed this, pointing to Gemma’s tokeniser advantage for Indic languages. "Gemma's tokeniser efficacy surpasses Llama's, particularly for Indic languages, where tokenisation overhead is notably higher," he explained.
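That overhead is straightforward to measure: tokenise the same sentence with both tokenisers and compare counts. Fewer tokens for the same text means less compute per request and a longer effective context window. A rough sketch, assuming access to both gated checkpoints on the Hugging Face Hub:

```python
# Sketch: compare tokenisation overhead on Hindi text.
# Assumes Hub access to both gated model repositories.
from transformers import AutoTokenizer

hindi = "भारत एक विशाल और विविधतापूर्ण देश है।"  # "India is a vast and diverse country."

for model_id in ("google/gemma-7b", "meta-llama/Llama-2-7b-hf"):
    tok = AutoTokenizer.from_pretrained(model_id)
    n = len(tok(hindi)["input_ids"])
    print(f"{model_id}: {n} tokens")
```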
Meanwhile, OpenAI’s newly unveiled GPT-4o ships an improved tokeniser with an expanded vocabulary, promising better support for Indian languages, including Hindi, Gujarati, Marathi, Telugu, Tamil, and Urdu.
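The gain is visible with OpenAI’s `tiktoken` library, which exposes both GPT-4’s `cl100k_base` encoding and GPT-4o’s roughly twice-as-large `o200k_base` encoding; the snippet below simply counts tokens for a Hindi greeting under each.

```python
# Sketch: count tokens for the same Hindi text under both encodings.
# Assumes `pip install tiktoken` (version 0.7+ for o200k_base).
import tiktoken

hindi = "नमस्ते, आप कैसे हैं?"  # "Hello, how are you?"

old = tiktoken.get_encoding("cl100k_base")  # GPT-4 / GPT-3.5
new = tiktoken.get_encoding("o200k_base")   # GPT-4o

print("cl100k_base:", len(old.encode(hindi)))
print("o200k_base:", len(new.encode(hindi)))
```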
At Google I/O, the company also unveiled PaliGemma, an open vision-language model (VLM), alongside a preview of Gemma 2, the next generation of the Gemma family.
Inspired by PaLI-3 and built on open components such as the SigLIP vision model, PaliGemma is designed for a wide range of vision-language tasks, from image and short-video captioning to object detection and segmentation.
Google has released pre-trained and fine-tuned checkpoints at multiple resolutions and across task combinations, giving researchers ample material for exploration.
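Below is a minimal captioning sketch with one of those checkpoints, assuming `transformers` 4.41 or later and access to the gated `google/paligemma-3b-mix-224` weights; the image URL is a placeholder.

```python
# Sketch: caption an image with a released PaliGemma mix checkpoint.
# Assumes transformers >= 4.41, Pillow, requests, and Hub access.
import requests
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-224"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)

url = "https://example.com/photo.jpg"  # placeholder image URL
image = Image.open(requests.get(url, stream=True).raw)

# The mix checkpoints use short task prompts such as "caption en".
inputs = processor(text="caption en", images=image, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)

# Strip the prompt tokens before decoding the generated caption.
caption = processor.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(caption)
```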
With PaliGemma, Google extends the Gemma family into multimodal territory, giving researchers an open, versatile architecture for vision-language work.