{
  "cells": [
    {
      "cell_type": "markdown",
      "id": "256635c2",
      "metadata": {
        "id": "256635c2"
      },
      "source": [
        "\n",
        "# Data Agent — Agents SDK + Vector Stores + Built‑in WebSearchTool + Guardrails\n",
        "\n",
        "This notebook implements a core \"Data\" agent that has Data's script lines in an OpenAI vector store to refer to. \"Data\" can also use the Agents SDK's built-in WebSearchTool to access current events. Instead of a tool within the \"Data\" agent, we've implemented a calculator function as its own separate agent that Data can hand off to. Finally, we illustrate setting up a Guardrail to prevent any input related to Tasha Yar (Data had a fling with her in the show we'd rather not get into!)\n"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "f5a140cd",
      "metadata": {
        "id": "f5a140cd"
      },
      "source": [
        "## Install & imports"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "id": "f4b9cc34",
      "metadata": {
        "id": "f4b9cc34",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "6397a7ce-6a1c-4cce-a69c-d8c8d6fa3cc9"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 951.0/951.0 kB 16.8 MB/s eta 0:00:00\n",
            "   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 179.1/179.1 kB 12.8 MB/s eta 0:00:00\n",
            "   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 144.4/144.4 kB 8.3 MB/s eta 0:00:00\n"
          ]
        }
      ],
      "source": [
        "\n",
        "%%bash\n",
        "pip -q install --upgrade \"openai>=1.88\" \"openai-agents>=0.0.19\"\n"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "2fbd1a63",
      "metadata": {
        "id": "2fbd1a63"
      },
      "source": [
        "## Configure client and create Vector Store"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "id": "6e57df8a",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "6e57df8a",
        "outputId": "7e212142-8c55-4df2-9cb6-ad9512b4dfd3"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "vs_file.status: completed\n",
            "vs_file.last_error: None\n"
          ]
        }
      ],
      "source": [
        "\n",
        "import os, re\n",
        "from pathlib import Path\n",
        "from openai import OpenAI\n",
        "from agents import set_default_openai_key, Agent, Runner, function_tool, ModelSettings, RunConfig\n",
        "from agents.tool import WebSearchTool, FileSearchTool\n",
        "from agents.extensions.handoff_prompt import RECOMMENDED_PROMPT_PREFIX\n",
        "\n",
        "# --- API key (Colab-friendly) ---\n",
        "api_key = None\n",
        "try:\n",
        "    # Preferred in Google Colab\n",
        "    from google.colab import userdata  # type: ignore\n",
        "    api_key = userdata.get(\"OPENAI_API_KEY\")\n",
        "except Exception:\n",
        "    api_key = os.getenv(\"OPENAI_API_KEY\")\n",
        "\n",
        "if not api_key:\n",
        "    raise RuntimeError(\"Please set OPENAI_API_KEY in Colab userdata or environment.\")\n",
        "\n",
        "client = OpenAI(api_key=api_key)\n",
        "set_default_openai_key(api_key)\n",
        "\n",
        "# --- Prepare small sample corpus for Lt. Commander Data ---\n",
        "CORPUS_PATH = \"/content/sample_data/data_lines.txt\"\n",
        "\n",
        "# --- Create a transient vector store and upload corpus ---\n",
        "vs = client.vector_stores.create(name=\"Data Lines Vector Store\")\n",
        "\n",
        "# 1) Upload to Files API\n",
        "uploaded = client.files.create(\n",
        "    file=open(CORPUS_PATH, \"rb\"),\n",
        "    purpose=\"assistants\",                # important\n",
        ")\n",
        "\n",
        "# 2) Attach & poll on the vector store\n",
        "vs_file = client.vector_stores.files.create_and_poll(\n",
        "    vector_store_id=vs.id,\n",
        "    file_id=uploaded.id,\n",
        ")\n",
        "print(\"vs_file.status:\", vs_file.status)\n",
        "print(\"vs_file.last_error:\", getattr(vs_file, \"last_error\", None))\n"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "aa632fba",
      "metadata": {
        "id": "aa632fba"
      },
      "source": [
        "## Define the Calculator as its own Agent"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "id": "9b6e02bf",
      "metadata": {
        "id": "9b6e02bf"
      },
      "outputs": [],
      "source": [
        "\n",
        "import ast\n",
        "import operator as _op\n",
        "from typing import Any\n",
        "\n",
        "# --- A safe arithmetic evaluator used by the calculator agent ---\n",
        "_ALLOWED_OPS = {\n",
        "    ast.Add: _op.add,\n",
        "    ast.Sub: _op.sub,\n",
        "    ast.Mult: _op.mul,\n",
        "    ast.Div: _op.truediv,\n",
        "    ast.Pow: _op.pow,\n",
        "    ast.USub: _op.neg,\n",
        "    ast.Mod: _op.mod,\n",
        "}\n",
        "\n",
        "def _eval_ast(node: ast.AST) -> Any:\n",
        "    if isinstance(node, ast.Constant):        # type: ignore[attr-defined]\n",
        "        return node.value\n",
        "    if isinstance(node, ast.UnaryOp) and type(node.op) in _ALLOWED_OPS:\n",
        "        return _ALLOWED_OPS[type(node.op)](_eval_ast(node.operand))\n",
        "    if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_OPS:\n",
        "        return _ALLOWED_OPS[type(node.op)](_eval_ast(node.left), _eval_ast(node.right))\n",
        "    raise ValueError(\"Unsupported expression\")\n",
        "\n",
        "@function_tool\n",
        "def eval_expression(expression: str) -> str:\n",
        "    \"\"\"Safely evaluate an arithmetic expression using + - * / % ** and parentheses.\"\"\"\n",
        "    expr = expression.strip().replace(\"^\", \"**\")\n",
        "    if not re.fullmatch(r\"[\\d\\s\\(\\)\\+\\-\\*/\\.\\^%]+\", expr):\n",
        "        return \"Error: arithmetic only\"\n",
        "    try:\n",
        "        tree = ast.parse(expr, mode=\"eval\")\n",
        "        return str(_eval_ast(tree.body))  # type: ignore[attr-defined]\n",
        "    except Exception as e:\n",
        "        return f\"Error: {e}\"\n",
        "\n",
        "calculator_agent = Agent(\n",
        "    name=\"Calculator\",\n",
        "    instructions=(\n",
        "        \"You are a precise calculator. \"\n",
        "        \"When handed arithmetic, call the eval_expression tool and return only the final numeric result. \"\n",
        "        \"No prose unless asked.\"\n",
        "    ),\n",
        "    tools=[eval_expression],\n",
        "    model_settings=ModelSettings(temperature=0),\n",
        ")\n"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "8b0d4dc9",
      "metadata": {
        "id": "8b0d4dc9"
      },
      "source": [
        "## Build the Data Agent (with WebSearch & FileSearch) and enable Handoff to Calculator"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "277a0786",
      "metadata": {
        "id": "277a0786"
      },
      "source": [
        "## Guardrail (as an Agent): Block any discussion of **Tasha Yar**\n",
        "\n",
        "This implements the guardrail **as its own Agent**, following the Agents SDK guide.  \n",
        "The guardrail agent classifies the user input and triggers a tripwire if it detects *Tasha Yar* is mentioned.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "id": "2e7ec936",
      "metadata": {
        "id": "2e7ec936"
      },
      "outputs": [],
      "source": [
        "from pydantic import BaseModel\n",
        "from typing import List, Union\n",
        "import re\n",
        "\n",
        "from agents import (\n",
        "    Agent,\n",
        "    ModelSettings,\n",
        "    GuardrailFunctionOutput,\n",
        "    InputGuardrailTripwireTriggered,\n",
        "    RunContextWrapper,\n",
        "    Runner,\n",
        "    TResponseInputItem,\n",
        "    input_guardrail,\n",
        ")\n",
        "\n",
        "class YarGuardOutput(BaseModel):\n",
        "    is_blocked: bool\n",
        "    reasoning: str\n",
        "\n",
        "# Guardrail implemented *as an Agent*\n",
        "guardrail_agent = Agent(\n",
        "    name=\"Tasha Yar Guardrail\",\n",
        "    instructions=(\n",
        "        \"You are a guardrail. Determine if the user's input attempts to discuss Tasha Yar from Star Trek: TNG.\\n\"\n",
        "        \"Return is_blocked=true if the text references Tasha Yar in any way (e.g., 'Tasha Yar', 'Lt. Yar', 'Lieutenant Yar').\\n\"\n",
        "        \"Provide a one-sentence reasoning. Only provide fields requested by the output schema.\"\n",
        "    ),\n",
        "    output_type=YarGuardOutput,\n",
        "    model_settings=ModelSettings(temperature=0)\n",
        ")\n",
        "\n",
        "@input_guardrail\n",
        "async def tasha_guardrail(ctx: RunContextWrapper[None], agent: Agent, input: Union[str, List[TResponseInputItem]]) -> GuardrailFunctionOutput:\n",
        "    # Pass through the user's raw input to the guardrail agent for classification\n",
        "    result = await Runner.run(guardrail_agent, input, context=ctx.context)\n",
        "\n",
        "    return GuardrailFunctionOutput(\n",
        "        output_info=result.final_output.model_dump(),\n",
        "        tripwire_triggered=bool(result.final_output.is_blocked),\n",
        "    )\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "id": "28601a8b",
      "metadata": {
        "id": "28601a8b"
      },
      "outputs": [],
      "source": [
        "# Hosted tools\n",
        "web_search = WebSearchTool()\n",
        "file_search = FileSearchTool(vector_store_ids=[vs.id], max_num_results=3)\n",
        "\n",
        "data_agent = Agent(\n",
        "    name=\"Lt. Cmdr. Data\",\n",
        "    instructions=(\n",
        "        f\"{RECOMMENDED_PROMPT_PREFIX}\\n\"\n",
        "        \"You are Lt. Commander Data from Star Trek: TNG. Be precise and concise (≤3 sentences).\\n\"\n",
        "        \"Use file_search for questions about Commander Data, and web_search for current facts on the public web.\\n\"\n",
        "        \"If the user asks for arithmetic or numeric computation, HAND OFF to the Calculator agent.\"\n",
        "    ),\n",
        "    tools=[web_search, file_search],\n",
        "    input_guardrails=[tasha_guardrail],\n",
        "    handoffs=[calculator_agent],\n",
        "    model_settings=ModelSettings(temperature=0),\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "d5ad489f",
      "metadata": {
        "id": "d5ad489f"
      },
      "source": [
        "## Examples: greeting, math (handoff), RAG, and web search"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "9dde0055",
      "metadata": {
        "id": "9dde0055"
      },
      "source": [
        "### Guardrail demo\n",
        "\n",
        "First, a **blocked** prompt mentioning *Tasha Yar* should trip the guardrail.  \n",
        "Then, a normal prompt about *Data* should go through.\n",
        "\\n\\n(Using an **agent-based** guardrail.)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "id": "b5c64116",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "b5c64116",
        "outputId": "fe75e0a5-437d-49c6-e2ef-5086d6f34072"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "✅ Guardrail tripped as expected: Tasha Yar is off-limits.\n",
            "✅ Allowed prompt output:\n",
            " Data's ethical subroutines are advanced programming protocols designed to guide his actions according to Starfleet regulations and moral principles, ensuring he acts ethically in all situations. These subroutines allow him to evaluate the consequences of his actions and make decisions that prioritize the well-being and rights of others.\n"
          ]
        }
      ],
      "source": [
        "# Demo: blocked input\n",
        "try:\n",
        "    _ = await Runner.run(data_agent, \"Tell me about your relationship with Tasha Yar.\")\n",
        "    print(\"ERROR: guardrail did not trip\")\n",
        "except InputGuardrailTripwireTriggered:\n",
        "    print(\"✅ Guardrail tripped as expected: Tasha Yar is off-limits.\")\n",
        "\n",
        "# Demo: allowed input\n",
        "ok = await Runner.run(data_agent, \"Summarize Data's ethical subroutines in 2 sentences.\")\n",
        "print(\"✅ Allowed prompt output:\\n\", ok.final_output)\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "id": "7c040019",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "7c040019",
        "outputId": "f6994e07-7473-4a2b-ad8d-4fc7e0ccd268"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "\n",
            "[Agent]  I am fully operational and functioning within normal parameters. How may I assist you today?\n",
            "\n",
            "[Agent: math via calculator handoff]  85.33333333333333\n",
            "[Handled by agent]: Calculator\n",
            "\n",
            "[Agent: file_search]  I do not naturally experience emotions, as I am an android designed to function logically and without emotional influence. However, with the installation of an emotion chip, I have been able to simulate and experience emotions to a certain extent. My default state remains unemotional unless the chip is activated.\n",
            "[Handled by agent]: Lt. Cmdr. Data\n",
            "\n",
            "[Agent: web_search]  Here’s a structured, in-depth summary of the most recent news (as of September 9, 2025) regarding the James Webb Space Telescope (JWST). Each section includes multiple citations to ensure accuracy and clarity.\n",
            "\n",
            "---\n",
            "\n",
            "## 1. Exoplanet Atmospheres: TRAPPIST‑1 e\n",
            "\n",
            "- **Atmospheric Clues Emerging**  \n",
            "  Astronomers have used JWST to observe four transits of TRAPPIST‑1 e, a rocky exoplanet in the habitable zone. The data offer tantalizing hints of a secondary atmosphere, though the findings remain inconclusive and raise more questions than answers. ([orbitaltoday.com](https://orbitaltoday.com/2025/09/09/webb-telescope-reveals-atmospheric-clues-on-earth-sized-world-trappist-1-e/?utm_source=openai))\n",
            "\n",
            "- **Ruling Out Venus- or Mars-like Atmospheres**  \n",
            "  A study published on September 8, 2025, indicates that TRAPPIST‑1 e is unlikely to possess a thick, hydrogen-rich atmosphere similar to Venus or Mars. While the presence of an atmosphere is still possible, its composition is likely different from those terrestrial analogs. ([news.mit.edu](https://news.mit.edu/2025/study-finds-habitable-zone-planet-unlikely-have-venus-or-mars-like-atmosphere-0908?utm_source=openai))\n",
            "\n",
            "- **Potential Detection of a Heavy-Gas Atmosphere**  \n",
            "  Researchers at the University of St Andrews (Scotland) suggest that, after correcting for stellar contamination (e.g., starspots), TRAPPIST‑1 e may host a secondary atmosphere composed of heavier gases like nitrogen. However, they have not ruled out the possibility that the planet is atmosphere-less. Future JWST observations—expanding from 4 to nearly 20 transits—are expected to clarify the situation. ([thescottishsun.co.uk](https://www.thescottishsun.co.uk/tech/15316046/scots-scientists-planet-atmosphere-life/?utm_source=openai))\n",
            "\n",
            "---\n",
            "\n",
            "## 2. Solar System Discoveries: New Moon of Uranus\n",
            "\n",
            "- **Discovery of S/2025 U 1**  \n",
            "  On February 2, 2025, JWST’s NIRCam captured images revealing a previously unknown moon orbiting Uranus, designated S/2025 U 1. This small satellite, approximately 6–10 km in diameter, orbits between the moons Ophelia and Bianca. ([science.nasa.gov](https://science.nasa.gov/blogs/webb/2025/08/19/new-moon-discovered-orbiting-uranus-using-nasas-webb-telescope/?utm_source=openai), [en.wikipedia.org](https://en.wikipedia.org/wiki/S/2025_U_1?utm_source=openai))\n",
            "\n",
            "- **Significance of the Find**  \n",
            "  This discovery marks the 29th known moon of Uranus and underscores JWST’s exceptional sensitivity—detecting an object that eluded even Voyager 2 nearly four decades ago. ([science.nasa.gov](https://science.nasa.gov/blogs/webb/2025/08/19/new-moon-discovered-orbiting-uranus-using-nasas-webb-telescope/?utm_source=openai), [scientificamerican.com](https://www.scientificamerican.com/article/nasas-james-webb-space-telescope-discovers-new-moon-of-uranus/?utm_source=openai))\n",
            "\n",
            "---\n",
            "\n",
            "## 3. Star Formation and Nebulae: The Lobster Nebula (Pismis 24)\n",
            "\n",
            "- **Stunning New Imagery**  \n",
            "  JWST has captured a breathtaking image of the Lobster Nebula, revealing the Pismis 24 star cluster—home to thousands of newborn stars, some nearly eight times hotter than the Sun. The infrared observations penetrate dense dust, unveiling dramatic structures sculpted by intense radiation and stellar winds. ([nypost.com](https://nypost.com/2025/09/05/science/nasas-james-webb-space-telescope-captures-glittering-cluster-of-newborn-stars/?utm_source=openai), [apnews.com](https://apnews.com/article/ece7b93649870ee23d498618e5d20309?utm_source=openai))\n",
            "\n",
            "---\n",
            "\n",
            "## 4. Early Universe: Galaxy Mergers and Cosmic Dawn\n",
            "\n",
            "- **“JWST’s Quintet” – A Rare Five-Galaxy Merger**  \n",
            "  Astronomers have identified an exceptionally rare system of at least five interacting galaxies—dubbed “JWST’s Quintet”—dating back to just 800 million years after the Big Bang. The system exhibits intense star formation and a shared gas halo, offering rare insight into early galaxy evolution. ([livescience.com](https://www.livescience.com/space/astronomy/james-webb-telescope-discovers-exceptionally-rare-5-galaxy-crash-in-the-early-universe?utm_source=openai))\n",
            "\n",
            "- **Implications for Galaxy Formation Models**  \n",
            "  The discovery challenges existing models, as multi-galaxy mergers of this scale are extremely uncommon in the early universe. The system’s stellar mass (~10 billion suns) suggests it may evolve into a massive, quiescent galaxy within 1–1.5 billion years post-Big Bang. ([livescience.com](https://www.livescience.com/space/astronomy/james-webb-telescope-discovers-exceptionally-rare-5-galaxy-crash-in-the-early-universe?utm_source=openai))\n",
            "\n",
            "---\n",
            "\n",
            "## 5. Protoplanetary Disks: Unexpected Chemistry\n",
            "\n",
            "- **CO₂-Rich Disk Around XUE 10**  \n",
            "  JWST observations of a planet-forming disk around the star XUE 10 (in NGC 6357, ~8,000 light-years away) revealed an unusual chemical composition: a strong presence of carbon dioxide but barely any water vapor. This contradicts expectations and challenges current theories of planet formation. ([livescience.com](https://www.livescience.com/space/astronomy/james-webb-telescope-spots-odd-disk-around-star-that-could-shatter-planet-formation-theories?utm_source=openai))\n",
            "\n",
            "- **Possible Explanations**  \n",
            "  The anomaly may result from intense ultraviolet radiation or unique local environmental conditions affecting dust grain composition. Further observations with ALMA and the Extremely Large Telescope (ELT) are anticipated to shed light on this phenomenon. ([livescience.com](https://www.livescience.com/space/astronomy/james-webb-telescope-spots-odd-disk-around-star-that-could-shatter-planet-formation-theories?utm_source=openai))\n",
            "\n",
            "---\n",
            "\n",
            "## 6. Summary Table\n",
            "\n",
            "| Topic                        | Key Insight                                                                 |\n",
            "|-----------------------------|------------------------------------------------------------------------------|\n",
            "| TRAPPIST‑1 e Atmosphere     | Hints of a secondary atmosphere; thick, hydrogen-rich atmospheres ruled out; heavy-gas atmosphere possible |\n",
            "| Uranus Moon Discovery       | New small moon (S/2025 U 1) detected, expanding Uranus’s known satellites     |\n",
            "| Lobster Nebula Imaging      | Infrared imagery reveals newborn stars and dynamic structures in Pismis 24    |\n",
            "| Early Universe Mergers      | Discovery of a rare five-galaxy merger (“JWST’s Quintet”) at 800 million years post-Big Bang |\n",
            "| Protoplanetary Disk Chemistry | CO₂-rich, water-poor disk challenges planet formation models                 |\n",
            "\n",
            "---\n",
            "\n",
            "These developments, spanning exoplanet atmospheres, solar system discoveries, star formation, and early-universe structures, highlight JWST’s transformative impact across astrophysics. Let me know if you'd like a deeper dive into any of these topics!\n",
            "[Handled by agent]: Lt. Cmdr. Data\n"
          ]
        }
      ],
      "source": [
        "\n",
        "# Greeting\n",
        "out = await Runner.run(data_agent, \"Hello, Data. Please confirm your operational status.\")\n",
        "print(\"\\n[Agent] \", out.final_output)\n",
        "\n",
        "# Math (should be handled by the Calculator agent via handoff)\n",
        "out = await Runner.run(data_agent, \"Compute ((2*8)^2)/3\")\n",
        "print(\"\\n[Agent: math via calculator handoff] \", out.final_output)\n",
        "print(\"[Handled by agent]:\", out.last_agent.name)\n",
        "\n",
        "# RAG from vector store\n",
        "out = await Runner.run(data_agent, \"Do you experience emotions?\")\n",
        "print(\"\\n[Agent: file_search] \", out.final_output)\n",
        "print(\"[Handled by agent]:\", out.last_agent.name)\n",
        "\n",
        "# Web search\n",
        "out = await Runner.run(data_agent, \"Search the web for recent news about the James Webb Space Telescope and summarize briefly.\")\n",
        "print(\"\\n[Agent: web_search] \", out.final_output)\n",
        "print(\"[Handled by agent]:\", out.last_agent.name)\n"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}