Monday, March 31, 2025

Tutorial to Create a Data Science Agent: A Code Implementation using gemini-2.0-flash-lite model through Google API, google.generativeai, Pandas and IPython.display for Interactive Data Analysis


In this tutorial, we demonstrate the integration of Python’s robust data manipulation library Pandas with Google’s generative capabilities through the google.generativeai package and the Gemini model (gemini-2.0-flash-lite). By setting up the environment with the necessary libraries, configuring the Google API key, and leveraging the IPython display functionalities, the code provides a step-by-step approach to building a data science agent that analyzes a sample sales dataset. The example shows how to convert a DataFrame into markdown format and then use natural language queries to generate insights about the data, highlighting the potential of combining traditional data analysis tools with modern AI-driven methods.

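!pip install pandas google-generativeai tabulate --quiet

First, we install the Pandas and google-generativeai libraries quietly, setting up the environment for data manipulation and AI-powered analysis. The tabulate package is included because DataFrame.to_markdown, which is used later, depends on it.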

import pandas as pd
import google.generativeai as genai
from IPython.display import Markdown

We import Pandas for data manipulation, google.generativeai for accessing Google’s generative AI capabilities, and Markdown from IPython.display to render markdown-formatted outputs.

GOOGLE_API_KEY = "Use Your API Key Here"
genai.configure(api_key=GOOGLE_API_KEY)

model = genai.GenerativeModel('gemini-2.0-flash-lite')

We assign a placeholder API key, configure the google.generativeai client with it, and initialize the ‘gemini-2.0-flash-lite’ GenerativeModel for generating content.
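
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. The following is a minimal sketch of one alternative, assuming the key has already been exported in an environment variable named GOOGLE_API_KEY. The load_api_key helper and the variable name are illustrative assumptions and are not part of the original tutorial.

```python
import os


def load_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The helper name and the default variable name are hypothetical.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Usage in the notebook (commented out so this block runs without a key):
# genai.configure(api_key=load_api_key())
```

Failing early with a clear message is a design choice here; it avoids a less obvious authentication error on the first request.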

data = {'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Headphones'],
        'Category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics'],
        'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
        'Units Sold': [150, 200, 180, 120, 90, 250],
        'Price': [1200, 25, 75, 300, 50, 100]}
sales_df = pd.DataFrame(data)

print("Sample Sales Data:")
print(sales_df)
print("-" * 30)

Here, we create a Pandas DataFrame named sales_df containing sample sales data for various products, and then print the DataFrame followed by a separator line to visually distinguish the output.

def ask_gemini_about_data(dataframe, query):
    """
    Asks the Gemini model a question about the given Pandas DataFrame.

    Args:
        dataframe: The Pandas DataFrame to analyze.
        query: The natural language question about the DataFrame.

    Returns:
        The response from the Gemini model as a string.
    """
    # Embed the table as markdown so the model sees the column structure.
    prompt = f"""You are a data analysis agent. Analyze the following pandas DataFrame and answer the question.

    DataFrame:
    ```
    {dataframe.to_markdown(index=False)}
    ```

    Question: {query}

    Answer:
    """
    response = model.generate_content(prompt)
    return response.text

Here, we construct a markdown-formatted prompt from a Pandas DataFrame and a natural language query, then use the Gemini model to generate and return an analytical response.
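
The function returns response.text directly, which assumes the request succeeds and the response contains text. The following is a hedged sketch of a more defensive variant. It takes the model as a parameter so that it can be exercised with a stand-in object, and it catches exceptions broadly because the exact error types depend on the library version. The names build_prompt and safe_ask and the fallback message format are illustrative assumptions, not part of the original code.

```python
def build_prompt(table_markdown: str, query: str) -> str:
    """Assemble the prompt from an already-rendered table and a question."""
    fence = "`" * 3  # built at runtime to keep this example easy to embed
    return (
        "You are a data analysis agent. Analyze the following pandas "
        "DataFrame and answer the question.\n\n"
        f"DataFrame:\n{fence}\n{table_markdown}\n{fence}\n\n"
        f"Question: {query}\n\nAnswer:\n"
    )


def safe_ask(model, table_markdown: str, query: str) -> str:
    """Query any object exposing generate_content(prompt) and return its text.

    On any failure, a bracketed fallback string is returned instead of raising.
    """
    prompt = build_prompt(table_markdown, query)
    try:
        response = model.generate_content(prompt)
        return response.text
    except Exception as exc:  # deliberately broad: error types vary by version
        return f"[no answer: {type(exc).__name__}: {exc}]"
```

In the notebook this could be called as safe_ask(model, sales_df.to_markdown(index=False), "What is the average price of the products?"), so that one failed request does not stop the remaining queries.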

# Query 1: What is the total number of units sold across all products?
query1 = "What is the total number of units sold across all products?"
response1 = ask_gemini_about_data(sales_df, query1)
print(f"Question 1: {query1}")
print(f"Answer 1:\n{response1}")
print("-" * 30)

Query 1 Output

# Query 2: Which product had the highest number of units sold?
query2 = "Which product had the highest number of units sold?"
response2 = ask_gemini_about_data(sales_df, query2)
print(f"Question 2: {query2}")
print(f"Answer 2:\n{response2}")
print("-" * 30)

Query 2 Output

# Query 3: What is the average price of the products?
query3 = "What is the average price of the products?"
response3 = ask_gemini_about_data(sales_df, query3)
print(f"Question 3: {query3}")
print(f"Answer 3:\n{response3}")
print("-" * 30)

Query 3 Output

# Query 4: Show me the products sold in the 'North' region.
query4 = "Show me the products sold in the 'North' region."
response4 = ask_gemini_about_data(sales_df, query4)
print(f"Question 4: {query4}")
print(f"Answer 4:\n{response4}")
print("-" * 30)

Query 4 Output

# Query 5: A more complex query: calculate the total revenue for each product.
query5 = "Calculate the total revenue (Units Sold * Price) for each product and present it in a table."
response5 = ask_gemini_about_data(sales_df, query5)
print(f"Question 5: {query5}")
print(f"Answer 5:\n{response5}")
print("-" * 30)

Query 5 Output
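
Because a language model may make arithmetic mistakes, it can help to compute the same five answers directly with Pandas and compare them with the generated responses. The following is a minimal sketch of such a check; it rebuilds the sample data so that it runs on its own, and the variable names are illustrative rather than taken from the original notebook.

```python
import pandas as pd

# Same sample data as in the tutorial, rebuilt so this block is self-contained.
sales_df = pd.DataFrame({
    "Product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam", "Headphones"],
    "Category": ["Electronics"] * 6,
    "Region": ["North", "South", "East", "West", "North", "South"],
    "Units Sold": [150, 200, 180, 120, 90, 250],
    "Price": [1200, 25, 75, 300, 50, 100],
})

# Query 1: total units sold across all products.
total_units = int(sales_df["Units Sold"].sum())  # 990

# Query 2: product with the highest number of units sold.
top_product = sales_df.loc[sales_df["Units Sold"].idxmax(), "Product"]  # Headphones

# Query 3: average price of the products.
average_price = float(sales_df["Price"].mean())  # 1750 / 6

# Query 4: products sold in the North region.
north_products = sales_df.loc[sales_df["Region"] == "North", "Product"].tolist()

# Query 5: revenue per product (Units Sold * Price).
revenue = sales_df.assign(Revenue=sales_df["Units Sold"] * sales_df["Price"])

print(total_units)
print(top_product)
print(round(average_price, 2))
print(north_products)
print(revenue[["Product", "Revenue"]].to_string(index=False))
```

Comparing these values with the model's answers gives a quick sanity check before relying on generated analysis for larger datasets.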

In conclusion, the tutorial illustrates how combining Pandas, the google.generativeai package, and the Gemini model can turn data analysis tasks into a more interactive process. The approach simplifies querying and interpreting data and opens up avenues for advanced use cases such as data cleaning, feature engineering, and exploratory data analysis. By using these tools within the familiar Python ecosystem, data scientists can improve their productivity and more easily derive meaningful insights from complex datasets.


Here is the Colab Notebook.


Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts of over 2 million monthly views, illustrating its popularity among audiences.
