diff --git a/skills/ai-configs/aiconfig-ai-metrics/references/langchain-tracking.md b/skills/ai-configs/aiconfig-ai-metrics/references/langchain-tracking.md index 5e28e92..913836f 100644 --- a/skills/ai-configs/aiconfig-ai-metrics/references/langchain-tracking.md +++ b/skills/ai-configs/aiconfig-ai-metrics/references/langchain-tracking.md @@ -131,14 +131,20 @@ You will see examples in the wild that build the model by hand with `init_chat_m ## Tier 2 — LangGraph (agent workflows) -LangGraph's `create_react_agent` takes a `model`, `tools`, and `prompt`. Build the model the same way as the single-LangChain case — `create_langchain_model` — and pass it in. The tracker wraps the whole agent invocation, and the extractor aggregates token usage across every message the agent produced. +LangGraph's prebuilt agent takes a model, tools, and a system prompt. Build the model with `create_langchain_model` (Python) or `LangChainProvider.createLangChainModel` (Node) and pass it in. The tracker wraps the whole agent invocation; the extractor aggregates token usage across every message the agent produced, and tool-call telemetry is read off the result after the wrapped call returns. -**Python** — agent mode with a `MemorySaver` checkpointer: +> **API note (Python).** Use `from langchain.agents import create_agent`. The earlier `from langgraph.prebuilt import create_react_agent` is deprecated in LangGraph 1.0 and removed in 2.0 — same return shape; the only call-site rename is `prompt=` → `system_prompt=`. Node still uses `createReactAgent` from `@langchain/langgraph/prebuilt`. + +**Python** — agent mode with a `MemorySaver` checkpointer. 
The Python helper package ships `sum_token_usage_from_messages` (token aggregation across the agent's output messages) and `get_tool_calls_from_response` (tool-call name extraction per message); use them inside the `track_metrics_of_async` extractor / loop instead of hand-rolling either: ```python -from ldai.tracker import TokenUsage -from ldai_langchain import create_langchain_model, get_ai_metrics_from_response -from langgraph.prebuilt import create_react_agent +from ldai.providers.types import LDAIMetrics +from ldai_langchain import ( + create_langchain_model, + get_tool_calls_from_response, + sum_token_usage_from_messages, +) +from langchain.agents import create_agent from langgraph.checkpoint.memory import MemorySaver agent_config = ai_client.agent_config("my-agent-key", context) @@ -149,40 +155,34 @@ llm = create_langchain_model(agent_config) # MemorySaver gives the ReAct agent short-term memory per thread_id. checkpointer = MemorySaver() -agent = create_react_agent( +agent = create_agent( llm, - tools=[...], # application-owned tool handlers - prompt=agent_config.instructions, + [...], # application-owned tool handlers + system_prompt=agent_config.instructions, checkpointer=checkpointer, ) -async def track_langgraph_metrics(tracker, func): - """Aggregate token usage across every message the agent produced. 
- wraps track_duration_of + manual success/tokens/error tracking.""" - try: - result = await tracker.track_duration_of(func) - tracker.track_success() - total_in = total_out = total = 0 - for message in result.get("messages", []): - metrics = get_ai_metrics_from_response(message) - if metrics.usage: - total_in += metrics.usage.input - total_out += metrics.usage.output - total += metrics.usage.total - if total > 0: - tracker.track_tokens(TokenUsage(input=total_in, output=total_out, total=total)) - return result - except Exception: - tracker.track_error() - raise - -result = await track_langgraph_metrics( - agent_config.create_tracker(), - lambda: agent.ainvoke( - {"messages": [{"role": "user", "content": user_prompt}]}, - config={"configurable": {"thread_id": thread_id}}, - ), -) +# track_metrics_of_async records duration + success/error itself; the +# extractor only returns LDAIMetrics. The surrounding try/except is for +# local logging, not for tracker bookkeeping. +tracker = agent_config.create_tracker() +try: + result = await tracker.track_metrics_of_async( + lambda: agent.ainvoke( + {"messages": [{"role": "user", "content": user_prompt}]}, + config={"configurable": {"thread_id": thread_id}}, + ), + lambda res: LDAIMetrics( + success=True, + usage=sum_token_usage_from_messages(res.get("messages", [])), + ), + ) + for msg in result.get("messages", []): + for name in get_tool_calls_from_response(msg): + tracker.track_tool_call(name) +except Exception as e: + # Already recorded by track_metrics_of_async — log locally if needed. + raise ``` **Node** — same pattern with `trackMetricsOf` + a custom aggregator: @@ -219,6 +219,8 @@ const langgraphMetrics = (result: any): LDAIMetrics => { return { success: true, usage: total > 0 ? { input, output, total } : undefined }; }; +// trackMetricsOf records duration + success/error itself; do not call +// trackError after this — it would be a redundant second event. 
const agentTracker = agentConfig.createTracker!(); const result = await agentTracker.trackMetricsOf( langgraphMetrics, @@ -227,6 +229,14 @@ const result = await agentTracker.trackMetricsOf( { configurable: { thread_id: threadId } }, ), ); + +// Tool-call telemetry: walk the result messages. Once the JS SDK ships +// `LangChainProvider.getToolCallsFromResponse`, this collapses to one helper call. +for (const msg of result.messages ?? []) { + for (const tc of (msg as any).tool_calls ?? []) { + agentTracker.trackToolCall(tc.name); + } +} ``` ### Why aggregate per message diff --git a/skills/ai-configs/aiconfig-migrate/SKILL.md b/skills/ai-configs/aiconfig-migrate/SKILL.md index 9129861..2577957 100644 --- a/skills/ai-configs/aiconfig-migrate/SKILL.md +++ b/skills/ai-configs/aiconfig-migrate/SKILL.md @@ -33,7 +33,7 @@ The skill is optimized for Python and Node.js / TypeScript; other languages are | One-shot completion (direct OpenAI / Anthropic / Bedrock / Gemini call) | ✅ Worked example | ✅ Worked example | [before-after-examples.md](references/before-after-examples.md), per-provider docs in `aiconfig-ai-metrics/references/` | | Chat loop via managed runner (`ManagedModel` / `TrackedChat`) | ✅ Tier 1 pattern | ✅ Tier 1 pattern | [aiconfig-ai-metrics SKILL.md](../aiconfig-ai-metrics/SKILL.md) | | LangChain single-call | ✅ Worked example | ✅ Worked example | [langchain-tracking.md](../aiconfig-ai-metrics/references/langchain-tracking.md) | -| LangGraph `create_react_agent` / `createReactAgent` (prebuilt) | ✅ Worked example | ✅ Worked example | [agent-mode-frameworks.md § LangGraph](references/agent-mode-frameworks.md) | +| LangGraph prebuilt agent (Python `langchain.agents.create_agent`, Node `createReactAgent`) | ✅ Worked example | ✅ Worked example | [agent-mode-frameworks.md § LangGraph](references/agent-mode-frameworks.md) | | LangGraph custom `StateGraph` with run-scoped tracker (setup_run + call_model + finalize) | ✅ Deep worked example | ⚠️ Mentioned — translate 
from Python | [agent-mode-frameworks.md § Custom `StateGraph`](references/agent-mode-frameworks.md) | | CrewAI `Agent` | ✅ Worked example | — (not a Node framework) | [agent-mode-frameworks.md § CrewAI](references/agent-mode-frameworks.md) | | Strands `Agent` | ✅ Worked example | ⚠️ BedrockModel + OpenAIModel only (no Anthropic) | [agent-mode-frameworks.md § Strands](references/agent-mode-frameworks.md) | @@ -93,8 +93,9 @@ Use [phase-1-analysis-checklist.md](references/phase-1-analysis-checklist.md) to 3. **Existing LaunchDarkly usage** — any pre-existing `LDClient` or `ldclient` initialization to reuse 4. **Hardcoded model configs** — model name string literals, temperature / max_tokens / top_p, system prompts, instruction strings 5. **Template placeholders in prompts** — `.format()` calls, f-strings in prompt constants, JS/TS template literals, `%(var)s`, hand-rolled `str.replace("__VAR__", ...)`. Flag each placeholder name and its runtime-value source; all get rewritten to Mustache `{{ variable }}` in Stage 2. -6. **Hardcoded app-scoped knobs** — search-result limits, retry budgets, tool-timeout overrides, feature toggles, any config-dataclass field that isn't a prompt or model parameter but still governs agent behavior. These belong in `model.custom` on the variation (not `model.parameters`, which is forwarded to the provider SDK and will crash on unknown kwargs). -7. **Mode decision** — completion mode (chat messages array) or agent mode (single instructions string). Completion mode is the default and the only mode that supports judges attached in the UI. +6. **Externalized prompt files** — scan YAML / JSON / TOML / Markdown / `.prompt` / `.j2` files **and** prompt-template registries (`langchain.hub.pull(...)`, LangSmith `client.pull_prompt(...)`) for prompts loaded at runtime. Common shapes: CrewAI `agents.yaml` / `tasks.yaml`, LangChain Promptfiles, k8s ConfigMap overlays, Pydantic Settings classes with `prompt_*` fields. 
The same Mustache rewrite (sub-step 5 of Stage 2) applies when the file's placeholder syntax is not already Mustache. See [phase-1-analysis-checklist.md § 4](references/phase-1-analysis-checklist.md). +7. **Hardcoded app-scoped knobs** — search-result limits, retry budgets, tool-timeout overrides, feature toggles, any config-dataclass field that isn't a prompt or model parameter but still governs agent behavior. These belong in `model.custom` on the variation (not `model.parameters`, which is forwarded to the provider SDK and will crash on unknown kwargs). +8. **Mode decision** — completion mode (chat messages array) or agent mode (single instructions string). Completion mode is the default and the only mode that supports judges attached in the UI. For each hardcoded target the audit finds, record: @@ -118,10 +119,20 @@ Hardcoded targets: - src/chat.py:42 model="gpt-4o" - src/chat.py:43 temperature=0.7, max_tokens=2000 - src/chat.py:45 system="You are a helpful assistant..." +Externalized prompt files: none (or e.g. "prompts/agents.yaml — CrewAI role/goal/backstory") +Prompt-template registries: none (or e.g. langchain.hub.pull("rlm/rag-prompt") at app.py:14) +Coverage totals: 3 hardcoded code targets · 0 externalized prompt files · 0 registry pulls Proposed plan: single AI Config key `chat-assistant`, mirror fallback, Stage 3 (tools) skipped (no function calling), Stage 4 (tracking) inline, Stage 5 (evals) attach built-in accuracy judge. ``` -**STOP.** Present this summary and wait for the user to confirm before proceeding to Stage 2. **This is the most important checkpoint in the workflow** — if the audit is wrong, every stage after this will be wrong. The user should cross-check the hardcoded-targets list against what they know is in the code before giving the go-ahead. +**STOP.** Present this summary, state the coverage totals out loud (e.g. 
"I found **N** hardcoded code targets and **M** externalized prompt files — does that match what you expected?"), and wait for the user to reply with one of four explicit forms: + +- **`confirm`** — proceed to Stage 2. +- **`add: `** — re-run the audit with the new locations and present an updated summary. +- **`fix: `** — update a target in the list (provider, mode, prompt content, etc.) and ask again. +- **`stop`** — pause the migration here. + +Do not interpret any other word — including `skip`, `next`, `go`, `ok`, `proceed` — as confirmation; ask the user to pick one of the four forms. **This is the most important checkpoint in the workflow** — if the audit is wrong, every stage after this will be wrong. The user should cross-check the hardcoded-targets list against what they know is in the code before giving the go-ahead. ### Step 2: Wrap the call in the AI SDK (Stage 2) @@ -284,11 +295,13 @@ This is the first stage that writes code. It has nine sub-steps. params = config.model.parameters or {} # Pass model_name + instructions into your framework's agent constructor. - # Example: LangGraph create_react_agent - # agent = create_react_agent( - # model=load_chat_model(model_name), - # tools=TOOLS, # Stage 3 will replace this with a config.tools loader - # prompt=instructions, + # Example: LangGraph prebuilt agent (Python — `from langchain.agents import create_agent`; + # this replaces `langgraph.prebuilt.create_react_agent`, deprecated in LangGraph 1.0 + # and removed in 2.0. Same return shape; `prompt=` was renamed to `system_prompt=`.) + # agent = create_agent( + # create_langchain_model(config), # forwards every variation parameter + # TOOLS, # Stage 3 will replace this with a config.tools loader + # system_prompt=instructions, # ) ``` @@ -308,7 +321,8 @@ Skip this step if the audited app has no function calling / tools. 
Otherwise: - `openai.chat.completions.create(tools=[...])` — OpenAI direct - `anthropic.messages.create(tools=[...])` — Anthropic direct - - `create_react_agent(tools=[...])` — LangGraph prebuilt ReAct + - `create_agent(llm, tools=[...], system_prompt=...)` — LangGraph prebuilt (Python, `langchain.agents`; replaces deprecated `langgraph.prebuilt.create_react_agent`) + - `createReactAgent({ llm, tools: [...] })` — LangGraph.js prebuilt (Node, `@langchain/langgraph/prebuilt`) - `Agent(tools=[...])` — CrewAI - `Agent(tools=[...])` — Strands (Python `@tool`-decorated callables passed through the constructor; TS SDK uses Zod-schema tools) - **Custom `StateGraph`** — module-level `TOOLS = [...]` list referenced in **both** `model.bind_tools(TOOLS)` and `ToolNode(TOOLS)`. This is the `langchain-ai/react-agent` template shape; the list is usually in a `tools.py` module. Grep for `bind_tools(` and `ToolNode(` together — they will point at the same list. @@ -479,7 +493,7 @@ Delegate: **`aiconfig-online-evals`** (sub-step 3, optional — only for UI-atta | App uses LangChain `ChatOpenAI(model=...)` | Replace the hand-rolled model construction with `create_langchain_model(config)` (Python) or `LangChainProvider.createLangChainModel(config)` (Node). Do not read `config.model.name` and pass it to `ChatOpenAI(model=...)` by hand — that pattern drops every variation parameter except the ones you explicitly name | | Retry wrapper around the provider call | The tracker is minted once at the top of the user turn; the retry loop is inside that scope. Every retry attempt shares the same `runId`. 
Tracker calls (`track_duration` / `track_tokens` / `track_success` / `track_error`) live *outside* the retry body — one call at the end of the turn, on the success path or the final-failure path | | App has no tools — Stage 3 skipped | Move directly from Stage 2 verification to Stage 4 (tracking) | -| Mode mismatch: user said agent, audit shows one-shot chat | Choose completion mode unless the app uses LangGraph `create_react_agent`, CrewAI `Agent`, Strands `Agent`, or a similar goal-driven framework | +| Mode mismatch: user said agent, audit shows one-shot chat | Choose completion mode unless the app uses a LangGraph prebuilt agent (`langchain.agents.create_agent` in Python or `createReactAgent` in Node), CrewAI `Agent`, Strands `Agent`, or a similar goal-driven framework | | App uses Strands Agents (Python) | Agent mode. Build a `create_strands_model` dispatcher keyed on `agent_config.provider.name` that returns `AnthropicModel(model_id=..., max_tokens=...)` or `OpenAIModel(model_id=..., params=...)`. Drop `parameters.tools` before passing params to the model class — Strands receives tools via `Agent(tools=[...])`. Tracking is Tier 3: wrap `invoke_async` with `tracker.track_duration_of(...)` and record tokens from `result.metrics.accumulated_usage`. See [agent-mode-frameworks.md § Strands Agent](references/agent-mode-frameworks.md) and [strands-tracking.md](../aiconfig-ai-metrics/references/strands-tracking.md) | | Strands app on TypeScript | TS SDK ships `BedrockModel` and `OpenAIModel` only — cannot serve Anthropic-backed variations. Use the Python SDK if multi-provider variations are required | | TypeScript app using Anthropic SDK | No `trackAnthropicMetrics` helper exists. Use Tier 3: `trackMetricsOf` with a small custom extractor that reads `response.usage.input_tokens` / `response.usage.output_tokens` and returns `LDAIMetrics`. 
See [anthropic-tracking.md](../aiconfig-ai-metrics/references/anthropic-tracking.md) in the `aiconfig-ai-metrics` skill for the exact extractor | diff --git a/skills/ai-configs/aiconfig-migrate/references/agent-mode-frameworks.md b/skills/ai-configs/aiconfig-migrate/references/agent-mode-frameworks.md index 979e231..6843975 100644 --- a/skills/ai-configs/aiconfig-migrate/references/agent-mode-frameworks.md +++ b/skills/ai-configs/aiconfig-migrate/references/agent-mode-frameworks.md @@ -8,7 +8,7 @@ Completion mode is the default and covers direct provider calls (OpenAI, Anthrop | Signal | Framework | Example | |--------|-----------|---------| -| Takes a `prompt` or `instructions` string as a single argument | LangGraph `create_react_agent` | `create_react_agent(model, tools, prompt="You are...")` | +| Takes a `system_prompt` / `prompt` / `instructions` string as a single argument | LangGraph prebuilt agent | Python: `create_agent(llm, tools, system_prompt="You are...")` (`langchain.agents`); Node: `createReactAgent({ llm, tools, prompt: "You are..." })` (`@langchain/langgraph/prebuilt`) | | Takes `role`, `goal`, `backstory` | CrewAI `Agent` | `Agent(role="researcher", goal="...", backstory="...")` | | Custom ReAct loop with a system instruction separated from messages | hand-rolled | `system = "You can use search..."; while not done: ...` | | Multi-step tool use with persistent instructions across turns | LangGraph / LangChain `AgentExecutor` | The system prompt stays stable across a long interaction | @@ -18,7 +18,7 @@ Agent mode returns an `instructions` string. Completion mode returns a `messages **Caveat:** judges cannot be attached to agent-mode variations via the LaunchDarkly UI. Agent mode evaluations must go through the programmatic judge API (`create_judge(...).evaluate(input, output)`). See `aiconfig-online-evals` for the programmatic path. 
-**Model construction for LangChain / LangGraph.** When the framework runs on top of LangChain (which includes LangGraph's `create_react_agent` and most custom graphs), build the chat model with `create_langchain_model(ai_config)` (Python) or `LangChainProvider.createLangChainModel(aiConfig)` (Node). These helpers forward every variation parameter (`temperature`, `max_tokens`, `top_p`, …) and handle LaunchDarkly→LangChain provider-name mapping internally. Do not hand-roll `init_chat_model(model=..., model_provider=...)` — it silently drops every variation parameter. See [langchain-tracking.md](../../aiconfig-ai-metrics/references/langchain-tracking.md) for the canonical single-model and LangGraph patterns, including the per-message token-aggregation extractor used with `track_metrics_of_async` / `trackMetricsOf`. +**Model construction for LangChain / LangGraph.** When the framework runs on top of LangChain (which includes LangGraph's prebuilt agent and most custom graphs), build the chat model with `create_langchain_model(ai_config)` (Python) or `LangChainProvider.createLangChainModel(aiConfig)` (Node). These helpers forward every variation parameter (`temperature`, `max_tokens`, `top_p`, …) and handle LaunchDarkly→LangChain provider-name mapping internally. Do not hand-roll `init_chat_model(model=..., model_provider=...)` — it silently drops every variation parameter. See [langchain-tracking.md](../../aiconfig-ai-metrics/references/langchain-tracking.md) for the canonical single-model and LangGraph patterns, including the SDK helpers `sum_token_usage_from_messages` / `get_tool_calls_from_response` (Python, `ldai_langchain`) used inside the `track_metrics_of_async` / `trackMetricsOf` extractor. 
## Framework-agnostic invariants for the run-scoped pattern @@ -34,7 +34,7 @@ What "one user turn" means differs by app shape: |-----------|--------------| | Request/response HTTP handler | One request | | Chat loop (one session across many user inputs) | One user input (not the whole session) | -| LangGraph `app.ainvoke(...)` / `createReactAgent().invoke(...)` | One call to `ainvoke` / `invoke` | +| LangGraph `app.ainvoke(...)` / `create_agent().invoke(...)` (Python) / `createReactAgent().invoke(...)` (Node) | One call to `ainvoke` / `invoke` | | Custom ReAct loop with its own `for` iteration | The full loop run, not one iteration | | Batch job / dataset walk | One row — each sample is its own run | | Streaming response (SSE / WebSocket) | The full stream (open → last chunk), not one chunk | @@ -117,11 +117,13 @@ The only legitimate reason to re-fetch inside a tool is if the tool's behavior n ## Wiring `agent_config` into each framework -### LangGraph `create_react_agent` +### LangGraph prebuilt agent (Python — `langchain.agents.create_agent`) + +> **API note.** Use `from langchain.agents import create_agent`. The earlier `from langgraph.prebuilt import create_react_agent` is deprecated in LangGraph 1.0 and removed in 2.0. Same return shape; the only call-site rename is `prompt=` → `system_prompt=`. Node.js still uses `createReactAgent` from `@langchain/langgraph/prebuilt`. 
```python -from langchain_openai import ChatOpenAI -from langgraph.prebuilt import create_react_agent +from langchain.agents import create_agent +from ldai_langchain import create_langchain_model from ldai.client import LDAIClient, AIAgentConfigDefault, ModelConfig, ProviderConfig FALLBACK = AIAgentConfigDefault( @@ -138,23 +140,21 @@ def build_agent(ai_client: LDAIClient, user_id: str, tools: list): if not config.enabled: return None, None - params = config.model.parameters or {} - llm = ChatOpenAI( - model=config.model.name, - temperature=params.get("temperature", 0.3), - ) + # create_langchain_model forwards every variation parameter — do NOT hand-roll + # ChatOpenAI(model=..., temperature=...). It silently drops unnamed parameters. + llm = create_langchain_model(config) - agent = create_react_agent( - model=llm, - tools=tools, - prompt=config.instructions, + agent = create_agent( + llm, + tools, + system_prompt=config.instructions, ) return agent, config.create_tracker() ``` **Key points:** -- `prompt=config.instructions` — the instructions string replaces the hardcoded prompt -- `model=` and `temperature=` come from `config.model` +- `system_prompt=config.instructions` — the instructions string replaces the hardcoded prompt +- Model + parameters come from `create_langchain_model(config)` — forwards the whole `model.parameters` dict - A fresh tracker is minted via `config.create_tracker()` and returned alongside the agent so the caller can wire Stage 4 tracking around `agent.invoke(...)`. Each call to `create_tracker()` produces a new `runId`; the caller should treat the returned tracker as owning the execution. 
### CrewAI `Agent` @@ -298,9 +298,9 @@ async def run_turn(ai_client, user_id: str, user_input: str): ### Custom `StateGraph` (bind_tools + ToolNode) -The most common LangGraph pattern in the wild is not `create_react_agent` — it's a custom `StateGraph` with a `call_model` node that does `model.bind_tools(TOOLS)`, a separate `"tools"` node that runs `ToolNode(TOOLS)`, and a conditional edge between them. This is the shape of the `langchain-ai/react-agent` template. +The most common LangGraph pattern in the wild is not the prebuilt agent — it's a custom `StateGraph` with a `call_model` node that does `model.bind_tools(TOOLS)`, a separate `"tools"` node that runs `ToolNode(TOOLS)`, and a conditional edge between them. This is the shape of the `langchain-ai/react-agent` template. -Two things make it different from `create_react_agent`: +Two things make it different from the prebuilt `create_agent`: 1. **Tools appear in two places** — `bind_tools(TOOLS)` (so the LLM knows which tools exist) and `ToolNode(TOOLS)` (so the executor knows how to run them). Both must read from the same source. 2. **The system prompt is injected manually** in the `call_model` node body (usually as the first message in the `ainvoke([{"role": "system", ...}, *state.messages])` call), not passed as a constructor argument. 
@@ -790,10 +790,10 @@ Then at the call site: config = ai_client.agent_config("support-agent", context, FALLBACK) tools = create_dynamic_tools_from_launchdarkly(config) -agent = create_react_agent( - model=build_llm(config), - tools=tools, - prompt=config.instructions, +agent = create_agent( + build_llm(config), + tools, + system_prompt=config.instructions, ) ``` diff --git a/skills/ai-configs/aiconfig-migrate/references/before-after-examples.md b/skills/ai-configs/aiconfig-migrate/references/before-after-examples.md index 4e674df..0dfe851 100644 --- a/skills/ai-configs/aiconfig-migrate/references/before-after-examples.md +++ b/skills/ai-configs/aiconfig-migrate/references/before-after-examples.md @@ -176,21 +176,23 @@ export async function answer(userId: string, userQuestion: string): Promise **API note.** Use `from langchain.agents import create_agent`. The earlier `from langgraph.prebuilt import create_react_agent` is deprecated in LangGraph 1.0 and removed in 2.0. Same return shape; the only rename you'll feel at the call site is `prompt=` → `system_prompt=`. Node.js still uses `createReactAgent` from `@langchain/langgraph/prebuilt` — no JS deprecation. ### Before ```python from langchain_openai import ChatOpenAI -from langgraph.prebuilt import create_react_agent +from langchain.agents import create_agent from my_tools import search_kb, calculator llm = ChatOpenAI(model="gpt-4o", temperature=0.3) -agent = create_react_agent( - model=llm, - tools=[search_kb, calculator], - prompt=( +agent = create_agent( + llm, + [search_kb, calculator], + system_prompt=( "You are a technical support assistant. Use the search_kb tool to look up " "documentation, and the calculator tool for math. Always cite sources." 
), @@ -209,7 +211,7 @@ from ldclient import Context from ldclient.config import Config from ldai.client import LDAIClient, AIAgentConfigDefault, ModelConfig, ProviderConfig from ldai_langchain import create_langchain_model -from langgraph.prebuilt import create_react_agent +from langchain.agents import create_agent from my_tools import search_kb, calculator ldclient.set_config(Config(os.environ["LD_SDK_KEY"])) @@ -236,10 +238,10 @@ def run_support(user_id: str, user_question: str) -> str: # ChatOpenAI(model=...) — it drops unnamed parameters silently. llm = create_langchain_model(config) - agent = create_react_agent( - model=llm, - tools=[search_kb, calculator], # Stage 3 will replace this with config.tools loader - prompt=config.instructions, + agent = create_agent( + llm, + [search_kb, calculator], # Stage 3 will replace this with config.tools loader + system_prompt=config.instructions, ) result = agent.invoke({"messages": [{"role": "user", "content": user_question}]}) @@ -251,7 +253,7 @@ def run_support(user_id: str, user_question: str) -> str: - `agent_config()` is called instead of `completion_config()` because the framework expects an `instructions` string - `FALLBACK` is an `AIAgentConfigDefault` (note the different type — same fields as completion except `instructions` instead of `messages`) - Model construction goes through `create_langchain_model(config)` from the `ldai_langchain` helper package — forwards every variation parameter. The alternative of hand-rolling `ChatOpenAI(model=config.model.name, temperature=...)` would silently drop every parameter not explicitly named. 
-- `create_react_agent(prompt=...)` reads from `config.instructions` +- `create_agent(..., system_prompt=...)` reads from `config.instructions` - Tool list is still hardcoded — Stage 3 handles that move (see [agent-mode-frameworks.md](agent-mode-frameworks.md) for the tool-factory pattern that closes over per-run config) - **Stage 4 will add a run-scoped tracker** (mint in a `setup_run` entry node, consume in `call_model` and `finalize`) — see [agent-mode-frameworks.md § Custom `StateGraph`](agent-mode-frameworks.md) for the full architecture - Provider-side logic (LangGraph, ReAct loop) is unchanged diff --git a/skills/ai-configs/aiconfig-migrate/references/phase-1-analysis-checklist.md b/skills/ai-configs/aiconfig-migrate/references/phase-1-analysis-checklist.md index 3959db5..c758f6b 100644 --- a/skills/ai-configs/aiconfig-migrate/references/phase-1-analysis-checklist.md +++ b/skills/ai-configs/aiconfig-migrate/references/phase-1-analysis-checklist.md @@ -29,7 +29,7 @@ Grep the source tree for provider SDK imports so you know which one the app actu | Bedrock | `import boto3`, `bedrock-runtime` | `@aws-sdk/client-bedrock-runtime` | | Gemini | `from google import genai`, `google.generativeai` | `@google/generative-ai` | | LangChain | `from langchain`, `langchain_openai`, `langchain_anthropic` | `langchain`, `@langchain/openai` | -| LangGraph | `from langgraph`, `create_react_agent` | `@langchain/langgraph` | +| LangGraph | `from langgraph`, `from langchain.agents import create_agent`, `create_react_agent` (deprecated) | `@langchain/langgraph`, `createReactAgent` | | CrewAI | `from crewai` | — | ### 3. Hardcoded model configs @@ -44,12 +44,44 @@ Look for the three things that need to move into the AI Config: 3. 
**System prompts / instructions** — grep for: - `"role": "system"` (OpenAI/Anthropic completion) - `system="` or `system:` (Anthropic top-level system) - - `instructions="` (agent frameworks, CrewAI, LangGraph `create_react_agent(prompt=)`) + - `instructions="` (agent frameworks, CrewAI, LangGraph Python `create_agent(system_prompt=)` or legacy `create_react_agent(prompt=)`, LangGraph.js `createReactAgent({ prompt })`) - Long triple-quoted strings above provider calls For each hit, record the file path, line number, and current value. -### 4. Template placeholders in prompts +### 4. External prompt files & registries + +Prompts often live outside `.py` / `.ts` source. **Open every plausible config file and read it before declaring the audit complete** — code-only grep will miss prompts loaded from YAML, prompt-template registries, or framework-specific manifests, and the resulting AI Config will not cover the real prompt surface area. + +**File-extension scan targets** (run from repo root): + +- `**/*.yaml`, `**/*.yml` +- `**/*.json` (exclude `package.json`, `package-lock.json`, `tsconfig.json`, `*.lock`) +- `**/*.toml` (exclude `pyproject.toml`, already covered in §1) +- `**/*.md` under `prompts/`, `templates/`, `agents/`, `personas/` +- `**/*.prompt`, `**/*.prompty`, `**/*.j2`, `**/*.jinja`, `**/*.tmpl` + +**Content signals to grep inside those files**: + +- Keys named `system`, `system_prompt`, `instructions`, `prompt`, `template`, `role`, `goal`, `backstory`, `persona`, `messages` +- Mustache (`{{ }}`) or Jinja (`{% %}`) blocks suggesting a prompt template +- Long multi-line string values (>200 chars) under any of those keys + +**Framework-specific shapes to call out by name**: + +| Framework | Where prompts live | +|-----------|---------------------| +| **CrewAI** | `agents.yaml` and `tasks.yaml` carry `role` / `goal` / `backstory` / `description` / `expected_output` per agent or task | +| **LangChain** | `.prompt` files (Promptfile format); any 
`langchain.hub.pull("name/key")` call — pulled prompts are remote and must either be inlined into the AI Config or replaced with a `messages` array sourced from the hub at audit time | +| **LangSmith** | `client.pull_prompt(...)` calls referencing a remote prompt repo | +| **Pydantic / Settings** | classes with `prompt_*` fields backed by env vars or YAML overlays (Hydra / OmegaConf / Dynaconf) | +| **Helm / k8s ConfigMap** | prompts stored as values overrides — search `values*.yaml` and `templates/*configmap*.yaml` | + +**For each hit, record**: file path, line range, the key holding the prompt, the loader call site that reads it, and any template placeholder syntax (Mustache vs Jinja vs Python `.format()`). The placeholder rewrite in Stage 2 sub-step 5 needs to know the source syntax to convert it to Mustache. + +If a fallback file is already in use (see [fallback-defaults-pattern.md](fallback-defaults-pattern.md) — JSON/YAML loaded at startup), distinguish it from prompts that flow into the provider call. The fallback path is intentional infrastructure; only the latter migrates into the AI Config. + +### 5. Template placeholders in prompts Anything the app currently interpolates into a prompt at runtime must be rewritten to Mustache `{{ variable }}` syntax in Stage 2 so the fallback path renders identically to the LD-served path. Grep for: @@ -63,7 +95,7 @@ Anything the app currently interpolates into a prompt at runtime must be rewritt Record placeholder name + where the runtime value comes from (env var, function arg, `datetime.now()`, etc.). These get routed through `variables={...}` on `completion_config` / `agent_config` calls in Stage 2, and the literal prompt string gets rewritten to `{{ placeholder }}`. Leaving a non-Mustache placeholder in the fallback is a silent regression mode: LaunchDarkly-served prompts interpolate correctly, the fallback ships unrendered. -### 5. Hardcoded app-scoped knobs +### 6. 
Hardcoded app-scoped knobs Configuration that governs *tool* or *app* behavior rather than *model* behavior — easy to miss in an audit because it looks like ordinary application config. Common shapes: @@ -73,7 +105,7 @@ Configuration that governs *tool* or *app* behavior rather than *model* behavior If a value changes agent behavior between variations — it belongs in the AI Config. Stage 2 sub-step 5 (fallback) puts these in `ModelConfig(custom={...})`, **not** `parameters` (which is forwarded to the provider SDK and will crash on unknown kwargs). Tools read them via `ai_config.model.get_custom("key")`. -### 6. Existing LaunchDarkly SDK usage +### 7. Existing LaunchDarkly SDK usage If `LDClient` / `ldclient` is already initialized in the codebase, **reuse it** — do not create a second base client in Stage 2. Grep for: @@ -81,20 +113,21 @@ If `LDClient` / `ldclient` is already initialized in the codebase, **reuse it** - TypeScript/JS: `@launchdarkly/node-server-sdk`, `init(LD_SDK_KEY)`, `@launchdarkly/react-client-sdk` - Environment variables: `LD_SDK_KEY`, `LAUNCHDARKLY_SDK_KEY`, `LAUNCHDARKLY_API_KEY` -### 7. Mode decision: completion or agent +### 8. 
Mode decision: completion or agent Walk the decision tree once per call site, using the call shape as the primary signal: | Call shape | Mode | |------------|------| | Direct provider call with `messages=[...]` (OpenAI, Anthropic, Bedrock Converse) | **completion** | -| `create_react_agent(llm, tools, prompt=...)` | **agent** | +| `create_agent(llm, tools, system_prompt=...)` (Python, `langchain.agents`) or `create_react_agent(llm, tools, prompt=...)` (Python, deprecated) | **agent** | +| `createReactAgent({ llm, tools, prompt })` (Node, `@langchain/langgraph/prebuilt`) | **agent** | | CrewAI `Agent(role=..., goal=..., backstory=...)` | **agent** | | Custom react loop: LLM-call → tool-call → LLM-call | **agent** | **Default to completion mode** when unclear — it is more flexible and is the only mode that supports judges attached via the LaunchDarkly UI (Stage 5). -### 8. Monorepo / multi-service scope +### 9. Monorepo / multi-service scope If the repo contains multiple services, **ask the user which service to instrument**. Do not migrate every service in one pass. @@ -137,11 +170,15 @@ Hardcoded migration targets: - : temperature=0.7, max_tokens=2000 - : system="You are... 
{system_time}" (27 lines, Python .format placeholder) -Template placeholders: [{system_time} (Python .format, source=datetime.now().isoformat())] -App-scoped knobs: [Context.max_search_results=10 (tools.py:24, reads from runtime.context)] -Tools detected: -Retry wrapper: -Scope: +Externalized prompt files: +Prompt-template registries: +Template placeholders: [{system_time} (Python .format, source=datetime.now().isoformat())] +App-scoped knobs: [Context.max_search_results=10 (tools.py:24, reads from runtime.context)] +Tools detected: +Retry wrapper: +Scope: + +Coverage totals: N hardcoded code targets · M externalized prompt files · K registry pulls Proposed plan: Stage 1 (Audit): Read-only manifest of hardcoded targets; flag placeholders for Mustache rewrite and knobs for model.custom @@ -153,11 +190,19 @@ Proposed plan: ## STOP -Do not proceed to Stage 1 (Step 2 in the main workflow) until the user confirms: +Do not proceed to Stage 2 until the user confirms all of: 1. The service boundary is right 2. The hardcoded targets list is complete 3. The mode choice matches their intent 4. The stage plan is acceptable (e.g. skip tools? skip evals for now?) +5. **Coverage check.** State the totals out loud: "I scanned the repo and found **N** hardcoded code targets, **M** externalized prompt files, and **K** registry pulls. If you expected more (e.g. you know there's a `prompts/` directory, a CrewAI `agents.yaml`, or a config service I didn't reach), tell me where to look before we proceed." A number the user can react to surfaces under-detection that a yes/no on a list cannot. + +**Reply with one of:** + +- **`confirm`** — the audit looks complete; proceed to Stage 2. +- **`add: `** — I missed something; here's where to look. (Re-run the audit with the new locations and present an updated summary.) +- **`fix: `** — a target in the list is wrong (provider, mode, prompt content, etc.). (Update the summary and ask again.) +- **`stop`** — pause the migration here. 
-If the user corrects anything, update the summary and ask again. Do not proceed under ambiguity. +Do not interpret any other word — including `skip`, `next`, `go`, `ok`, `proceed` — as confirmation. If the reply doesn't match one of the four forms above, ask the user to pick one. Do not proceed under ambiguity.
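
The four-reply gate above can be sketched as a small strict parser. This is a minimal illustration rather than part of the skill: `ReplyAction` and `parse_reply` are hypothetical names, and the sketch assumes that matching is case-insensitive, that surrounding whitespace is ignored, and that whatever follows `add:` or `fix:` is free text.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical helper types: not part of any LaunchDarkly or LangChain package.
@dataclass(frozen=True)
class ReplyAction:
    kind: str         # one of "confirm", "add", "fix", "stop"
    detail: str = ""  # free text after "add:" / "fix:"; empty for bare keywords


def parse_reply(reply: str) -> Optional[ReplyAction]:
    """Map a user reply to one of the four accepted forms.

    Returns None for anything else, so the caller re-asks instead of guessing.
    """
    text = reply.strip()
    lowered = text.lower()  # assumption: keywords are matched case-insensitively

    # Bare keywords must match exactly; "ok", "go", "proceed" etc. fall through.
    if lowered in ("confirm", "stop"):
        return ReplyAction(kind=lowered)

    # "add:" and "fix:" are only valid with a non-empty detail after the colon.
    for kind in ("add", "fix"):
        prefix = kind + ":"
        if lowered.startswith(prefix):
            detail = text[len(prefix):].strip()
            return ReplyAction(kind=kind, detail=detail) if detail else None

    # Ambiguous reply: never treat it as confirmation.
    return None
```

A caller would proceed to Stage 2 only when `parse_reply(...)` returns an action whose `kind` is `"confirm"`, re-run the audit on `"add"`, update the summary on `"fix"`, and ask the user to pick one of the four forms whenever the result is `None`. Returning `None` rather than raising keeps the "do not proceed under ambiguity" rule as the default path.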