{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "Tce3stUlHN0L" }, "source": [ "##### Copyright 2026 Google LLC." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "tuOe1ymfHZPu" }, "outputs": [], "source": [ "# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "Y_lgX9omPXF-" }, "source": [ "## Gemini API: Getting started with information grounding for Gemini models" ] }, { "cell_type": "markdown", "metadata": { "id": "VkR4fWudrHCs" }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "id": "WDKKNfXWrHgs" }, "source": [ "In this notebook you will learn how to use information grounding with [Gemini models](https://ai.google.dev/gemini-api/docs/models/).\n", "\n", "Information grounding is the process of connecting these models to specific, verifiable information sources to enhance the accuracy, relevance, and factual correctness of their responses. While LLMs are trained on vast amounts of data, this knowledge can be general, outdated, or lack specific context for particular tasks or domains. 
Grounding helps to bridge this gap by providing the LLM with access to curated, up-to-date information.\n", "\n", "Here you will experiment with:\n", "- Grounding information using Google Search grounding\n", "- Grounding real-world information using Google Maps grounding\n", "- Adding YouTube links to your prompt to provide video context\n", "- Using URL context to include website, PDF, or image URLs as context to your prompt" ] }, { "cell_type": "markdown", "metadata": { "id": "vKu1tRBrQ7xj" }, "source": [ "## Set up the SDK and the client" ] }, { "cell_type": "markdown", "metadata": { "id": "vIWKUlPqP5NK" }, "source": [ "### Install SDK\n", "\n", "This guide uses the [`google-genai`](https://pypi.org/project/google-genai) Python SDK to connect to the Gemini models." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6Fr84vJuPSHb" }, "outputs": [], "source": [ "# Grounding with Google Maps was introduced in 1.43\n", "%pip install -q -U \"google-genai>=1.43.0\"" ] }, { "cell_type": "markdown", "metadata": { "id": "a503bnWNQoCL" }, "source": [ "### Set up your API key\n", "\n", "To run the following cell, your API key must be stored in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see the [Authentication ![image](https://storage.googleapis.com/generativeai-downloads/images/colab_icon16.png)](../quickstarts/Authentication.ipynb) quickstart for an example." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "RjvgYmdLQd5s" }, "outputs": [], "source": [ "from google.colab import userdata\n", "\n", "GOOGLE_API_KEY = userdata.get(\"GOOGLE_API_KEY\")" ] }, { "cell_type": "markdown", "metadata": { "id": "VhKXgMSNQrrV" }, "source": [ "### Select model and initialize SDK client\n", "\n", "Select the model you want to use in this guide, either by picking one from the list or typing its name. 
Keep in mind that some models, such as the 2.5 models, are thinking models and thus take slightly more time to respond (see the [thinking notebook](./Get_started_thinking.ipynb) for more details, in particular how to switch thinking off)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "C75s1LR9QmOz" }, "outputs": [], "source": [ "from google import genai\n", "from google.genai import types\n", "\n", "client = genai.Client(api_key=GOOGLE_API_KEY)\n", "\n", "MODEL_ID = \"gemini-3-flash-preview\" # @param [\"gemini-2.5-flash-lite\", \"gemini-2.5-flash\", \"gemini-2.5-pro\", \"gemini-2.5-flash-preview\", \"gemini-3.1-flash-lite-preview\", \"gemini-3.1-pro-preview\"] {\"allow-input\":true, isTemplate: true}" ] }, { "cell_type": "markdown", "metadata": { "id": "abb962246f15" }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "id": "8mDMScex1It5" }, "source": [ "## Use Google Search grounding\n", "\n", "Google Search grounding is particularly useful for queries that require current information or external knowledge. Using Google Search, Gemini can access near real-time information and provide better responses.\n", "\n", "To enable Google Search, add the `google_search` tool to the `config` argument of `generate_content`, like this:\n", "```\n", " config={\n", " \"tools\": [\n", " {\n", " \"google_search\": {}\n", " }\n", " ]\n", " },\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "FHIcazUO0-xU" }, "outputs": [ { "data": { "text/markdown": [ "**Response**:\n", " The latest Indian Premier League (IPL) match was the final of the IPL 2025 season, which took place on June 3, 2025. In this match, Royal Challengers Bengaluru defeated Punjab Kings by 6 runs to win their maiden title." 
], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Search Query: ['latest Indian Premier League match and winner', 'when did IPL 2025 finish', 'IPL 2024 final match and winner']\n", "Search Pages: olympics.com, wikipedia.org, thehindu.com, olympics.com, skysports.com, wikipedia.org, thehindu.com\n" ] }, { "data": { "text/html": [ "\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "
\n", " \n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.display import HTML, Markdown\n", "\n", "response = client.models.generate_content(\n", " model=MODEL_ID,\n", " contents=\"What was the latest Indian Premier League match and who won?\",\n", " config={\"tools\": [{\"google_search\": {}}]},\n", ")\n", "\n", "# print the response\n", "display(Markdown(f\"**Response**:\\n {response.text}\"))\n", "# print the search details\n", "print(f\"Search Query: {response.candidates[0].grounding_metadata.web_search_queries}\")\n", "# urls used for grounding\n", "print(f\"Search Pages: {', '.join([site.web.title for site in response.candidates[0].grounding_metadata.grounding_chunks])}\")\n", "\n", "display(HTML(response.candidates[0].grounding_metadata.search_entry_point.rendered_content))" ] }, { "cell_type": "markdown", "metadata": { "id": "wROLHEYLLBHX" }, "source": [ "You can see that running the same prompt without search grounding gives you outdated information:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "EdUkQ40cKaGX" }, "outputs": [ { "data": { "text/markdown": [ "The latest Indian Premier League (IPL) match was the **Final of the IPL 2024 season**.\n", "\n", "* **Match:** Kolkata Knight Riders (KKR) vs. Sunrisers Hyderabad (SRH)\n", "* **Date:** May 26, 2024\n", "* **Winner:** **Kolkata Knight Riders (KKR)** won by 8 wickets." 
], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.display import Markdown\n", "\n", "response = client.models.generate_content(\n", " model=MODEL_ID,\n", " contents=\"What was the latest Indian Premier League match and who won?\",\n", ")\n", "\n", "# print the response\n", "display(Markdown(response.text))" ] }, { "cell_type": "markdown", "metadata": { "id": "fE6Ft1wxSxO_" }, "source": [ "For more examples, please refer to the [dedicated notebook ![image](https://storage.googleapis.com/generativeai-downloads/images/colab_icon16.png)](./Search_Grounding.ipynb)." ] }, { "cell_type": "markdown", "metadata": { "id": "cf6f7711ac06" }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "id": "ylPa8XFoYCq_" }, "source": [ "## Use Google Maps grounding\n", "\n", "Google Maps grounding allows you to easily incorporate location-aware functionality into your applications. When a prompt has context related to Maps data, the Gemini model uses Google Maps to provide factually accurate and fresh answers that are relevant to the specified location or general area.\n", "\n", "To enable grounding with Google Maps, add the `google_maps` tool in the `config` argument of `generate_content`, and optionally provide a structured location in the `tool_config`.\n", "\n", "```python\n", "client.models.generate_content(\n", " ...,\n", " config=types.GenerateContentConfig(\n", " # Enable the tool.\n", " tools=[types.Tool(google_maps=types.GoogleMaps())],\n", " # Provide structured location.\n", " tool_config=types.ToolConfig(retrieval_config=types.RetrievalConfig(\n", " lat_lng=types.LatLng(\n", " latitude=34.050481, longitude=-118.248526))),\n", " )\n", ")\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5AoiEtX9hJRT" }, "outputs": [ { "data": { "text/markdown": [ "### Response\n", " Yes, there are several cafes around that do a good flat white within a 20-minute walk.\n", "\n", "* **Tiny Dancer 
Coffee** specifically mentions serving flat whites, along with espressos and other latte options. It's a cozy subway cafe with a 4.8-star rating and is about a 6.8-minute walk away.\n", "* **Solid State Coffee** is an easygoing roastery offering thoughtfully sourced brews and has a 4.7-star rating. It's approximately a 4.7-minute walk.\n", "* **Sote Coffee Roasters** is a warm, laid-back coffee shop serving freshly roasted brews with a 4.9-star rating, about a 5.9-minute walk.\n", "* **White Noise Coffee - Coffee Shop & Roastery** is an intimate cafe with globally sourced, in-house roasted beans, rated 4.7 stars, and is about a 5.0-minute walk away.\n", "* **Rex** offers pour-over coffee and espresso drinks and has a 4.6-star rating, located about a 4.8-minute walk from you.\n", "* **Thē Soirēe** is a cozy cafe featuring espresso drinks, teas, and pastries. It has a 4.7-star rating and is about a 4.4-minute walk away.\n", "* **Bibble & Sip** is a bakery and coffeehouse serving upscale coffees, rated 4.5 stars, and is about a 4.1-minute walk." ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Search Query: ['cafes with flat white near me']\n" ] } ], "source": [ "from IPython.display import Markdown\n", "\n", "response = client.models.generate_content(\n", " model=MODEL_ID,\n", " contents=\"Do any cafes around here do a good flat white? I will walk up to 20 minutes away\",\n", " config=types.GenerateContentConfig(\n", " tools=[types.Tool(google_maps=types.GoogleMaps())],\n", " tool_config=types.ToolConfig(\n", " retrieval_config=types.RetrievalConfig(\n", " lat_lng=types.LatLng(latitude=40.7680797, longitude=-73.9818957)\n", " )\n", " ),\n", " ),\n", ")\n", "\n", "Markdown(f\"### Response\\n {response.text}\")" ] }, { "cell_type": "markdown", "metadata": { "id": "dewokmssn2-x" }, "source": [ "All grounded outputs require sources to be displayed after the response text. 
This code snippet will display the sources." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "T3mktcrzoMCp" }, "outputs": [ { "data": { "text/markdown": [ "### Sources from Google Maps\n", "- [Sote Coffee Roasters](https://maps.google.com/?cid=13421224117575076881)\n", "- [Heaven on 7th Marketplace](https://maps.google.com/?cid=13100894621228039586)\n", "- [White Noise Coffee - Coffee Shop & Roastery](https://maps.google.com/?cid=9563404650783060353)\n", "- [Sip + Co.](https://maps.google.com/?cid=4785431035926753688)\n", "- [Weill Café](https://maps.google.com/?cid=16521712104323291061)\n", "- [Down Under Coffee](https://maps.google.com/?cid=3179851379461939943)" ], "text/plain": [ "" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def generate_sources(response: types.GenerateContentResponse):\n", " grounding = response.candidates[0].grounding_metadata\n", " # You only need to display sources that were part of the grounded response.\n", " supported_chunk_indices = {i for support in grounding.grounding_supports for i in support.grounding_chunk_indices}\n", "\n", " sources = []\n", " if supported_chunk_indices:\n", " sources.append(\"### Sources from Google Maps\")\n", " for i in supported_chunk_indices:\n", " ref = grounding.grounding_chunks[i].maps\n", " sources.append(f\"- [{ref.title}]({ref.uri})\")\n", "\n", " return \"\\n\".join(sources)\n", "\n", "\n", "Markdown(generate_sources(response))" ] }, { "cell_type": "markdown", "metadata": { "id": "Rpf3yVOnoTTO" }, "source": [ "The response also includes data you can use to assemble in-line links. See the [Grounding with Google Search docs](https://ai.google.dev/gemini-api/docs/google-search#attributing_sources_with_inline_citations) for an example of this." 
] }, { "cell_type": "markdown", "metadata": { "id": "i29-R5Y9ikuV" }, "source": [ "### Render the contextual Google Maps widget\n", "\n", "If you are building a web-based application, you can add an interactive widget that includes a map view, the contextual location, the places Gemini considered in the query, and review snippets.\n", "\n", "To load the widget, perform all of the following steps.\n", "1. [Acquire a Google Maps API key](https://developers.google.com/maps/documentation/javascript/get-api-key), enabled for the Places API and the Maps JavaScript API,\n", "1. Request the widget token in your request (with `GoogleMaps(enable_widget=True)`),\n", "1. [Load the Maps JavaScript API](https://developers.google.com/maps/documentation/javascript/load-maps-js-api) and enable the Places library,\n", "1. Render the [``](https://developers.google.com/maps/documentation/javascript/reference/places-widget#PlaceContextualElement) element, setting `context-token` to the value of the `google_maps_widget_context_token` returned in the Gemini API response.\n", "\n", "Note that generating a widget can add additional latency to the response, so it is recommended that you do not enable the widget if you are not displaying it.\n", "\n", "Assuming you have a Google Maps API key with both APIs enabled, the following code shows one way to render the widget." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ijM8VHHhtrns" }, "outputs": [ { "data": { "text/markdown": [ "### Response\n", " There are several highly-rated cafes within a 20-minute walk that serve coffee.\n", "\n", "If you're looking for a café open right now, **Heaven on 7th Marketplace** is open 24 hours, has a 4.8-star rating, and is approximately a 2.7-minute walk (576 meters) away. They serve coffee and smoothies along with sandwiches and bagels.\n", "\n", "For a café that explicitly mentions flat whites and has a high rating, **Tiny Dancer Coffee** is an excellent option, rated 4.8 stars. 
They serve espressos and flat whites, as well as oat and matcha latte options. It's about a 6.8-minute walk (1.3 kilometers) away and opens at 7:00 AM local time.\n", "\n", "Other well-rated cafes that open soon and are within a short walk include:\n", "\n", "* **Cafe aroma**, with a 4.7-star rating, opens at 6:30 AM and is a 1.6-minute walk (279 meters) away. They offer hot drinks along with bagels, sandwiches, and pastries.\n", "* **Down Under Coffee**, rated 4.8 stars, opens at 7:30 AM and is a 1.9-minute walk (321 meters) away.\n", "* **Masseria Caffè**, a 4.6-star rated café, opens at 7:00 AM and is a 2.3-minute walk (472 meters) away. They offer a variety of caffeinated beverages and pastries.\n", "* **Weill Café**, boasting a 4.9-star rating, opens at 8:00 AM and is a very short 1.7-minute walk (425 meters) away." ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "### Sources from Google Maps\n", "- [White Noise Coffee - Coffee Shop & Roastery](https://maps.google.com/?cid=9563404650783060353)\n", "- [Tiny Dancer Coffee](https://maps.google.com/?cid=14421445427760414557)\n", "- [Weill Café](https://maps.google.com/?cid=16521712104323291061)\n", "- [maman](https://maps.google.com/?cid=14208928559726348633)\n", "- [Bibble & Sip](https://maps.google.com/?cid=5234372605966457616)" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.display import HTML\n", "\n", "# Load or set your Maps API key here.\n", "MAPS_API_KEY = userdata.get(\"MAPS_API_KEY\")\n", "\n", "# This is the same request as above, except `enable_widget` is set.\n", "response = client.models.generate_content(\n", " model=MODEL_ID,\n", " contents=\"Do any cafes around here do a good flat white? 
I will walk up to 20 minutes away\",\n", " config=types.GenerateContentConfig(\n", " tools=[types.Tool(google_maps=types.GoogleMaps(enable_widget=True))],\n", " tool_config=types.ToolConfig(\n", " retrieval_config=types.RetrievalConfig(\n", " lat_lng=types.LatLng(latitude=40.7680797, longitude=-73.9818957)\n", " )\n", " ),\n", " ),\n", ")\n", "\n", "widget_token = response.candidates[0].grounding_metadata.google_maps_widget_context_token\n", "\n", "display(Markdown(f\"### Response\\n {response.text}\"))\n", "display(Markdown(generate_sources(response)))\n", "display(HTML(f\"\"\"\n", "\n", "\n", " \n", "
\n", " \n", " \n", "
\n", " \n", "\n", "\"\"\"))" ] }, { "cell_type": "markdown", "metadata": { "id": "WZRCc8M947nK" }, "source": [ "Running and rendering the above code will require a Maps API key. Once you have it working, the widget will look like this.\n", "\n", "![Rendered contextual Places widget](https://storage.googleapis.com/generativeai-downloads/images/maps-widget.png)" ] }, { "cell_type": "markdown", "metadata": { "id": "bef61bf2764f" }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "id": "9XfNrFR7j6F6" }, "source": [ "## Grounding with YouTube links\n", "\n", "You can directly include a public YouTube URL in your prompt. The Gemini models will then process the video content to perform tasks like summarization and answering questions about the content.\n", "\n", "This capability leverages Gemini's multimodal understanding, allowing it to analyze and interpret video data alongside any text prompts provided.\n", "\n", "You need to explicitly declare the video URL you want the model to process as part of the contents of the request, using a `FileData` part. Here is a simple interaction where you ask the model to summarize a YouTube video:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "akVTribOkgT2" }, "outputs": [ { "data": { "text/markdown": [ "This video introduces \"Gemma Chess,\" demonstrating how Google's large language model, Gemma, can enhance the game of chess by leveraging its linguistic abilities.\n", "\n", "The speaker, Ju-yeong Ji from Google DeepMind, explains that Gemma isn't intended to replace powerful chess engines that excel at calculating moves. Instead, it aims to bring a \"new dimension\" to chess through understanding and creating text.\n", "\n", "The video highlights three key applications:\n", "\n", "1. **Explainer:** Gemma can analyze chess games (e.g., Kasparov vs. 
Deep Blue) and explain the \"most interesting\" or strategically significant moves in plain language, detailing their impact, tactical considerations, and psychological aspects, making complex analyses more understandable.\n", "2. **Storytellers:** Gemma can generate narrative stories about chess games, transforming raw move data into engaging accounts that capture the tension, emotions, and key moments of a match, bringing the game to life beyond just the moves.\n", "3. **Supporting Chess Learning:** Gemma can act as a personalized chess tutor, explaining concepts like specific openings (e.g., Sicilian Defense) or tactics in an accessible way, even adapting to the user's language and skill level, effectively serving as an always-available, intelligent chess encyclopedia and coach.\n", "\n", "By combining the computational strength of traditional chess AI with Gemma's advanced language capabilities, this approach offers a more intuitive and human-friendly way to learn, analyze, and engage with chess." ], "text/plain": [ "" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yt_link = \"https://www.youtube.com/watch?v=XV1kOFo1C8M\"\n", "\n", "response = client.models.generate_content(\n", " model=MODEL_ID,\n", " contents=types.Content(\n", " parts=[\n", " types.Part(text=\"Summarize this video.\"),\n", " types.Part(file_data=types.FileData(file_uri=yt_link)),\n", " ]\n", " ),\n", ")\n", "\n", "Markdown(response.text)" ] }, { "cell_type": "markdown", "metadata": { "id": "AR7sQVlxy8Yr" }, "source": [ "But you can also use the link as the source of truth for your request. 
In this example, you will first ask how Gemma models can help on chess games:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "UTH4DqBAzx3H" }, "outputs": [ { "data": { "text/markdown": [ "Gemma models, as large language models (LLMs), can significantly enhance the chess experience by bridging the gap between raw computational power and human understanding. Unlike traditional chess engines that excel at brute-force calculation and generating optimal moves (often in cryptic notation or complex numerical evaluations), Gemma's strength lies in processing and generating human-like text. This allows it to translate intricate chess engine outputs into intuitive, prose-based explanations, elucidating the strategic and tactical rationale behind moves, clarifying complex game concepts like openings and endgames, and providing accessible insights for players of all skill levels, significantly enhancing understanding beyond mere data.\n", "\n", "Furthermore, Gemma can serve as an invaluable tool for personalized chess learning and engagement. It can act as a dynamic, interactive coach, offering tailored explanations of specific positions, identifying weaknesses in a player's understanding, or even detailing the historical and psychological context of famous matches. By summarizing complex game analyses, highlighting pivotal moments, and even crafting narrative descriptions of entire games, Gemma can make chess more approachable, immersive, and educational, transforming how players learn, analyze, and appreciate the strategic depth of the game." 
], "text/plain": [ "" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yt_link = \"https://www.youtube.com/watch?v=XV1kOFo1C8M\"\n", "\n", "response = client.models.generate_content(\n", " model=MODEL_ID,\n", " contents=types.Content(\n", " parts=[\n", " types.Part(\n", " text=\"In 2 paragraphs, how can Gemma models help with chess games?\"\n", " ),\n", " types.Part(file_data=types.FileData(file_uri=yt_link)),\n", " ]\n", " ),\n", ")\n", "\n", "Markdown(response.text)" ] }, { "cell_type": "markdown", "metadata": { "id": "RHhdfKqLz_D6" }, "source": [ "Now the answer is more insightful on the topic, because it draws on knowledge shared in the video that is not necessarily part of the model's own knowledge." ] }, { "cell_type": "markdown", "metadata": { "id": "2de8c7349137" }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "id": "mKBPhxA-0RiT" }, "source": [ "## Grounding information using URL context\n", "\n", "The URL Context tool lets Gemini models directly access and process content from specific web page URLs you provide within your API requests. This is useful because your applications can interact dynamically with live web information without you having to manually pre-process that content and feed it to the model.\n", "\n", "URL Context is effective because it allows the model to base its responses and analysis directly on the content of the designated web pages. Instead of relying solely on its general training data or broad web searches (which are also valuable grounding tools), URL Context anchors the model's understanding to the specific information present at those URLs." 
] }, { "cell_type": "markdown", "metadata": { "id": "Y7GrocBgYgrp" }, "source": [ "### Process website URLs\n", "\n", "If you want Gemini to specifically ground its answers thanks to the content of a specific website, just add the urls in your prompt and enable the tool by adding it to your config:\n", "```\n", "config = {\n", " \"tools\": [\n", " {\n", " \"url_context\": {}\n", " }\n", " ],\n", "}\n", "```\n", "\n", "You can add up to 20 links in your prompt." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "eOXM1Fh2D9Ai" }, "outputs": [ { "data": { "text/markdown": [ "The provided document details various Gemini model variants, including Gemini 1.5, Gemini 2.0, and Gemini 2.5, each with different \"Flash,\" \"Pro,\" and \"Lite\" versions optimized for specific use cases.\n", "\n", "Here's a comparison of the key differences:\n", "\n", "| Feature | Gemini 1.5 Pro | Gemini 1.5 Flash | Gemini 2.0 Flash | Gemini 2.5 Pro | Gemini 2.5 Flash | Gemini 2.5 Flash-Lite |\n", "| :---------------- | :---------------------------------------------- | :------------------------------------------------ | :--------------------------------------------------- | :------------------------------------------------------------------------------ | :-------------------------------------------------------------------------- | :------------------------------------------------------------------------ |\n", "| **Description** | Mid-size multimodal model, optimized for reasoning tasks, can process large amounts of data. | Fast and versatile multimodal model for diverse tasks. | Next-gen features, improved capabilities, superior speed, and native tool use. | Most powerful thinking model, maximum accuracy, state-of-the-art performance. | Best model in terms of price-performance, well-rounded capabilities. | Optimized for cost-efficiency and high throughput. |\n", "| **Input(s)** | Audio, images, video, text. | Audio, images, video, text. | Audio, images, video, text. 
| Audio, images, video, text, and PDF. | Audio, images, video, and text. | Text, image, video, audio. |\n", "| **Output(s)** | Text. | Text. | Text. | Text. | Text. | Text. |\n", "| **Input Token Limit** | 2,097,152. | 1,048,576. | 1,048,576. | 1,048,576. | 1,048,576. | 1,048,576. |\n", "| **Output Token Limit** | 8,192. | 8,192. | 8,192. | 65,536. | 65,536. | 65,536. |\n", "| **Key Use Cases** | Complex reasoning tasks. | Scaling across diverse tasks. | Next generation features, speed, realtime streaming. | Complex coding, reasoning, multimodal understanding, analyzing large data. | Low latency, high volume tasks that require thinking. | Real time, low latency use cases. |\n", "| **Thinking** | Not explicitly mentioned as a core capability, but optimized for reasoning tasks. | Not explicitly mentioned. | Experimental. | Supported (default on). | Supported (default on, can configure thinking budget). | Supported. |\n", "| **Live API** | Not supported. | Not supported. | Supported. | Not supported. | Not explicitly mentioned for the base Flash model, but Live variants exist. | Not supported. |\n", "| **Knowledge Cutoff** | September 2024. | September 2024. | August 2024. | January 2025. | January 2025. | January 2025. |\n", "| **Deprecation** | September 2025. | September 2025. | Not deprecated. | Not deprecated. | Not deprecated. | Not deprecated. 
|\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "prompt = \"\"\"\n", " Based on https://ai.google.dev/gemini-api/docs/models, what are the key\n", " differences between Gemini 1.5, Gemini 2.0 and Gemini 2.5 models?\n", " Create a markdown table comparing the differences.\n", "\"\"\"\n", "\n", "config = {\n", " \"tools\": [{\"url_context\": {}}],\n", "}\n", "\n", "response = client.models.generate_content(\n", " contents=[prompt], model=MODEL_ID, config=config\n", ")\n", "\n", "display(Markdown(response.text))" ] }, { "cell_type": "markdown", "metadata": { "id": "bPPCD5f2MSIx" }, "source": [ "You can see the status of the retrieval using `url_context_metadata`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "V5kKeAX5MUsP" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "url_metadata=[UrlMetadata(\n", " retrieved_url='https://ai.google.dev/gemini-api/docs/models',\n", " url_retrieval_status=\n", ")]\n" ] } ], "source": [ "# Show the URLs retrieved for context, with their retrieval status\n", "print(response.candidates[0].url_context_metadata)" ] }, { "cell_type": "markdown", "metadata": { "id": "rS2xyoMPY8u-" }, "source": [ "### Add PDFs by URL\n", "\n", "Gemini can also process PDFs from a URL. Here's an example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "SeMZX5C5sLe3" }, "outputs": [ { "data": { "text/markdown": [ "\n", "\n", "The PDF is Alphabet Inc.'s Second Quarter 2025 Earnings Release. 
It details the company's financial performance for the quarter ended June 30, 2025.\n", "\n", "Key highlights include:\n", "* **Total Revenues:** Consolidated Alphabet revenues increased 14% year-over-year to \\$96.4 billion.\n", "* **Google Services:** Revenues grew 12% to \\$82.5 billion, driven by strong performance in Google Search & other, Google subscriptions, platforms, devices, and YouTube ads.\n", "* **Google Cloud:** Revenues increased 32% to \\$13.6 billion, with growth in Google Cloud Platform (GCP) across core GCP products, AI Infrastructure, and Generative AI Solutions. Google Cloud's annual revenue run-rate is now over \\$50 billion.\n", "* **Operating Income:** Total operating income rose 14%, and the operating margin was 32.4%.\n", "* **Net Income and EPS:** Net income increased 19%, and diluted EPS grew 22% to \\$2.31.\n", "* **AI Impact:** CEO Sundar Pichai highlighted that AI is positively impacting every part of the business, driving strong momentum, with new features like AI Overviews and AI Mode performing well in Search.\n", "* **Capital Expenditures:** Alphabet plans to increase capital expenditures to approximately \\$85 billion in 2025 due to strong demand for Cloud products and services.\n", "* **Issuance of Senior Unsecured Notes:** In May 2025, Alphabet issued \\$12.5 billion in fixed-rate senior unsecured notes.\n", "\n", "The document also provides detailed financial tables, including consolidated balance sheets, statements of income, and statements of cash flows, as well as segment results for Google Services, Google Cloud, and Other Bets. It also includes reconciliations of GAAP to non-GAAP financial measures." 
], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "prompt = \"\"\"\n", " Can you give me an overview of the content of this PDF?\n", " https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf\n", "\n", "\"\"\"\n", "\n", "config = {\n", " \"tools\": [{\"url_context\": {}}],\n", "}\n", "\n", "response = client.models.generate_content(\n", " contents=[prompt], model=MODEL_ID, config=config\n", ")\n", "\n", "# Escape dollar signs so Markdown does not render them as math delimiters\n", "display(Markdown(response.text.replace(\"$\", \"\\\\$\")))" ] }, { "cell_type": "markdown", "metadata": { "id": "vAkMWXwAxiaT" }, "source": [ "### Add images by URL\n", "\n", "Gemini can also process images from a URL. Here's an example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "HPNxQYkx8WJN" }, "outputs": [ { "data": { "text/markdown": [ "I cannot directly interpret the numbered parts within the image you provided. However, I can give you the common names of trombone parts in French, which you can then match to the numbers on your image:\n", "\n", "Here are some common parts of a trombone in French:\n", "* **Embouchure** (Mouthpiece)\n", "* **Pavillon** (Bell)\n", "* **Coulisse** (Slide)\n", "* **Coulisse d'accord** or **Pompe d'accord** (Tuning slide)\n", "* **Clé d'eau** or **Barillet** (Water key or spit valve)\n", "* **Entretoise** (Brace/Cross-stay, often used for various connecting rods)\n", "* **Manchon** (Ferrule/Sleeve, connecting parts)\n", "\n", "Please match these names to the numbered parts in your image." 
], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "prompt = \"\"\"\n", " Can you help me name the numbered parts of that instrument, in French?\n", " https://upload.wikimedia.org/wikipedia/commons/thumb/4/40/Trombone.svg/960px-Trombone.svg.png\n", "\n", "\"\"\"\n", "\n", "config = {\n", " \"tools\": [{\"url_context\": {}}],\n", "}\n", "\n", "response = client.models.generate_content(\n", " contents=[prompt], model=MODEL_ID, config=config\n", ")\n", "\n", "display(Markdown(response.text))" ] }, { "cell_type": "markdown", "metadata": { "id": "NHs8FDfSsjt2" }, "source": [ "## Mix Search grounding and URL context\n", "\n", "The different tools can also be used in conjunction by adding them both to the config. It's a good way to steer Gemini in the right direction and then let it do its magic using search grounding." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "wEOpBpbssjbD" }, "outputs": [ { "data": { "text/markdown": [ "Alphabet Inc. announced strong financial results for the second quarter of 2025, ending June 30, 2025. Consolidated revenues increased by 14% year-over-year to \\$96.4 billion, or 13% in constant currency, with double-digit growth seen across Google Search & other, YouTube ads, Google subscriptions, platforms, and devices, and Google Cloud.\n", "\n", "Key financial highlights include:\n", "* **Total revenues** of \\$96.428 billion, up from \\$84.742 billion in Q2 2024.\n", "* **Net income** increased by 19% to \\$28.196 billion.\n", "* **Diluted EPS** rose by 22% to \\$2.31.\n", "* **Operating income** increased by 14% to \\$31.271 billion, with an **operating margin** of 32.4%.\n", "* **Google Services revenues** increased by 12% to \\$82.5 billion.\n", "* **Google Cloud revenues** significantly increased by 32% to \\$13.6 billion, driven by growth in Google Cloud Platform (GCP), AI Infrastructure, and Generative AI Solutions. 
Its annual revenue run-rate now exceeds \\$50 billion.\n", "* The company announced an increase in **capital expenditures** to approximately \\$85 billion for 2025 due to strong demand for Cloud products and services.\n", "\n", "Sundar Pichai, CEO of Alphabet, highlighted the company's \"standout quarter\" with robust growth, attributing success to leadership in AI and rapid shipping. He noted the positive impact of AI across the business, strong momentum in Search (including AI Overviews and AI Mode), and continued strong performance in YouTube and subscriptions.\n", "\n", "**Financial Analyst Reactions and Trends:**\n", "\n", "Financial analysts generally maintain a positive outlook on Alphabet, with a majority (43 out of 55) recommending \"buy\" or \"strong buy\" ratings. However, the average target price has seen a slight decline from approximately \\$215 in March to \\$202.05, indicating increased uncertainty. Despite this, the current consensus suggests a potential upside of 11% from recent trading levels as of mid-July 2025.\n", "\n", "Prior to the earnings release, analysts expected a moderation in growth for Q2 2025, with projected revenue of \\$93.8 billion (+10.7% YoY) and net income of \\$26.5 billion (+12.2% YoY). Alphabet's actual Q2 2025 results surpassed these expectations, with EPS of \\$2.31 beating the forecasted \\$2.17 and revenue of \\$96.43 billion exceeding projections.\n", "\n", "Despite the positive earnings beat, Alphabet's stock experienced a modest increase of 0.57% in after-hours trading, and in some instances, a slight decline (around 1.5%) post-announcement. 
This dip was primarily attributed to the company's decision to raise its 2025 capital expenditures guidance by \\$10 billion to \\$85 billion, reflecting increased investments in AI and technology infrastructure, which raised some investor concerns about higher costs.\n", "\n", "The key areas of focus for analysts include Alphabet's ability to expand its GenAI model's user base without impacting traditional Search revenue, and the continued growth of Google Cloud, which is seen as the company's largest growth opportunity. Analysts are closely monitoring AI developments, cloud growth, and regulatory challenges. The overall trend appears to be one of cautious optimism, with strong underlying business performance balanced by increased investment needs for future growth in AI and cloud services." ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "prompt = \"\"\"\n", " Can you give me an overview of the content of this pdf?\n", " https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf\n", " Search on the web for the reaction of the main financial analysts, what's the trend?\n", "\"\"\"\n", "\n", "config = {\n", " \"tools\": [\n", " {\"url_context\": {}},\n", " {\"google_search\": {}}\n", " ],\n", "}\n", "\n", "response = client.models.generate_content(\n", " contents=[prompt],\n", " model=MODEL_ID,\n", " config=config\n", ")\n", "\n", "# Escape dollar signs so Markdown does not treat them as LaTeX delimiters.\n", "display(Markdown(response.text.replace('$', r'\\$')))\n", "# Render the Google Search entry point returned in the grounding metadata.\n", "display(HTML(response.candidates[0].grounding_metadata.search_entry_point.rendered_content))" ] }, { "cell_type": "markdown", "metadata": { "id": "dOt32shZaEXj" }, "source": [ "## Next steps\n", "\n", "\n", "\n", "* For more details about using Google Search grounding, check out the [Search Grounding cookbook](./Search_Grounding.ipynb).\n", "* If you are looking for other scenarios using videos, take a look at the [Video understanding cookbook](./Video_understanding.ipynb).\n", "\n", "Also check out the other Gemini capabilities that you can find in the [Gemini quickstarts](https://github.com/google-gemini/cookbook/tree/main/quickstarts/)." ] } ], "metadata": { "colab": { "collapsed_sections": [ "Tce3stUlHN0L" ], "name": "Grounding.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }