Phase 2-C: Backend Codex CLI / OpenAI GPT Session Source #43

Closed
opened 2026-05-24 16:55:17 -05:00 by null · 0 comments

Source plan: /home/kaspa/.claude/plans/with-our-backend-created-precious-owl.md

Feature context: Feature 2: Tool Use Analytics

Scope

Phase 2-C — Backend: Codex CLI / OpenAI GPT Session Source

Summary

Add a Codex/OpenAI session source that mirrors the Claude Code backend contract as closely as possible: list local Codex CLI sessions, read a session conversation, and aggregate tool analytics for Codex/GPT activity.

Problem

The Claude Code UI now exposes sessions, messages, and tool analytics, but the data model is still Claude-specific. Users of Codex CLI or OpenAI/GPT-backed agent workflows need the same operational visibility without a second, unrelated experience.

Affected area

  • Backend service: Codex/OpenAI session reader
  • API: provider/session source endpoints
  • Shared schemas: normalized session, message, tool-use, and analytics contracts

Affected files

  • backend/app/services/codex_session_reader.py — new reader for local Codex CLI history and OpenAI/GPT session sources
  • backend/app/services/agent_session_sources.py — optional provider registry/shared normalization helpers
  • backend/app/api/codex_sessions.py — new Codex/OpenAI API routes, or add provider-aware routes if the app standardizes under one endpoint family
  • backend/app/schemas/agent_sessions.py — shared provider-neutral schemas if extracting from claude_code.py
  • backend/app/schemas/claude_code.py — only touch if extending existing schemas is safer than introducing shared schemas
  • backend/tests/test_codex_session_reader.py — parser/fixture coverage
  • backend/tests/test_codex_sessions_api.py — API behavior coverage

Affected routes or endpoints

Preferred provider-specific routes:

  • GET /api/v1/codex/sessions
  • GET /api/v1/codex/sessions/{session_id}
  • GET /api/v1/codex/sessions/{session_id}/messages
  • GET /api/v1/codex/analytics/tools?days=7|30|90

Optional provider-neutral routes if the implementation chooses to generalize first:

  • GET /api/v1/agent-sessions/sources
  • GET /api/v1/agent-sessions/{source}/sessions
  • GET /api/v1/agent-sessions/{source}/sessions/{session_id}/messages
  • GET /api/v1/agent-sessions/{source}/analytics/tools?days=7|30|90

Data-source requirements

  • Codex CLI:
    • Discover local Codex CLI history/session files from ~/.codex when available.
    • Add an env override such as CODEX_SESSIONS_PATH so tests and nonstandard installs do not depend on a hard-coded path.
    • Detect and gracefully report when Codex CLI is installed but no readable session history exists.
    • Parse local session records into the same normalized concepts used by Claude:
      • session id
      • title
      • project/workspace directory
      • model(s)
      • token usage if present
      • message count
      • first/last message timestamps
      • active/completed status
      • tool calls and tool results if present
  • OpenAI/GPT API:
    • Do not attempt to scrape ChatGPT web history.
    • Only surface OpenAI/GPT sessions if Pipeline has a local/owned event source: gateway logs, stored API traces, an explicit import file, or a future collector.
    • Represent unavailable OpenAI/GPT history as a clear source status, not as an error.
    • Preserve provider labels so users can distinguish Codex CLI, OpenAI API, and future GPT sources.
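
The discovery rules above could be sketched roughly as follows. This is a minimal sketch, not the planned implementation: the SourceStatus type, the resolve_codex_source helper, the status strings ("available", "empty", "unavailable"), and the assumption that session records are *.jsonl files are all hypothetical. Only the ~/.codex default and the CODEX_SESSIONS_PATH override come from this issue.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

# Default location named in this issue; the layout beneath it is not specified here.
DEFAULT_CODEX_HOME = Path.home() / ".codex"


@dataclass(frozen=True)
class SourceStatus:
    """Hypothetical status record for one session source."""
    source: str
    source_status: str  # assumed values: "available", "empty", "unavailable"
    source_path: Optional[str]
    reason: Optional[str] = None


def resolve_codex_source(env: Mapping[str, str] = os.environ) -> SourceStatus:
    """Pick the sessions root (env override first) and report a status, never raise."""
    override = env.get("CODEX_SESSIONS_PATH")
    root = Path(override).expanduser() if override else DEFAULT_CODEX_HOME

    if not root.is_dir():
        return SourceStatus("codex_cli", "unavailable", None, f"{root} does not exist")

    # Assumption: session records are stored as JSONL files somewhere under the root.
    if not any(root.rglob("*.jsonl")):
        return SourceStatus("codex_cli", "empty", str(root), "no readable session files")

    return SourceStatus("codex_cli", "available", str(root))
```

Passing the environment as a parameter keeps tests independent of the real home directory, which is the stated purpose of the override.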

Normalized backend contract

  • Keep the frontend shape aligned with Claude Code wherever possible:
    • session_id
    • source (claude_code, codex_cli, openai_api)
    • provider_label
    • project_dir
    • cwd
    • title
    • models
    • tokens
    • cost_usd
    • billing_source
    • message_count
    • first_message_at
    • last_message_at
    • is_active
    • entrypoints
    • git_branch
  • Messages should map to the same UI concepts:
    • user text blocks
    • assistant text blocks
    • reasoning/thinking blocks if the source exposes them
    • tool calls
    • tool results
    • token usage per assistant turn when available
  • Tool analytics should return the same structure as Claude analytics:
    • tool_counts
    • top_files_read
    • top_files_written
    • top_commands
    • session_count
    • date_range_days
  • Include source metadata in responses when helpful:
    • source
    • source_status
    • source_path
    • last_scanned_at

Security and privacy requirements

  • Treat Codex/OpenAI session logs as sensitive local data.
  • Reuse the same organization/member auth requirements as Claude Code endpoints.
  • Redact secrets from tool inputs, commands, env values, headers, URLs, and API request bodies.
  • Never return raw credential files such as ~/.codex/auth.json.
  • Do not read outside discovered/explicitly configured session roots.
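
The redaction and root-containment rules above might look something like the sketch below. The regular expressions are illustrative assumptions and are not an exhaustive secret detector; the redact and is_readable_session_file names and the denied-filename set are hypothetical. Only the auth.json example comes from this issue.

```python
import re
from pathlib import Path

# Illustrative patterns only; a real deployment would likely need a broader list.
_SECRET_PATTERNS = [
    # "Authorization: Bearer <token>" headers
    re.compile(r"(authorization:\s*bearer\s+)\S+", re.IGNORECASE),
    # NAME=value or name: value pairs whose name suggests a credential
    re.compile(r"(\b\w*(?:api_key|token|secret|password)\w*\s*[=:]\s*)\S+", re.IGNORECASE),
    # Bare strings shaped like "sk-..." keys (empty group keeps the substitution uniform)
    re.compile(r"()\bsk-[A-Za-z0-9_-]{16,}"),
]

# Credential files that must never be served, per the issue's auth.json example.
_DENIED_NAMES = {"auth.json"}


def redact(text: str) -> str:
    """Replace anything that looks like a secret with a fixed marker."""
    for pattern in _SECRET_PATTERNS:
        text = pattern.sub(r"\1[REDACTED]", text)
    return text


def is_readable_session_file(candidate: Path, root: Path) -> bool:
    """Allow only files that resolve inside the configured root and are not denied."""
    resolved = candidate.resolve()
    resolved_root = root.resolve()
    inside = resolved == resolved_root or resolved_root in resolved.parents
    return inside and resolved.name not in _DENIED_NAMES
```

Resolving both paths before comparing them is what blocks ../ traversal and symlink escapes; comparing raw strings would not.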

Expected behavior

  • Codex CLI sessions appear through a backend API with the same shape as Claude sessions.
  • Codex/GPT messages can be rendered by the existing frontend message components with minimal branching.
  • Codex/GPT tool analytics can be rendered by the same analytics components used for Claude.
  • If Codex history is missing, unavailable, or in an unsupported format, the API returns a graceful empty/source-unavailable response.
  • OpenAI/GPT API history is only shown when Pipeline has an owned event source; otherwise, it is listed as unavailable with setup guidance.

Steps to reproduce (acceptance criteria)

  1. Configure a fixture Codex sessions path via CODEX_SESSIONS_PATH
  2. Call GET /api/v1/codex/sessions
  3. Response includes normalized sessions with provider/source metadata
  4. Call GET /api/v1/codex/sessions/{session_id}/messages
  5. Response includes normalized user/assistant turns and tool calls when fixture data contains them
  6. Call GET /api/v1/codex/analytics/tools?days=30
  7. Response shape matches Claude analytics and reflects fixture tool usage
  8. Missing Codex history returns an empty/source-unavailable response, not a server error
  9. Tests cover parsing, unavailable source behavior, redaction, pagination, and analytics aggregation

New opportunity to add

  • Add GET /api/v1/agent-sessions/sources to return source cards for the frontend:
    • Claude Code: available/unavailable, session count, last activity
    • Codex CLI: available/unavailable, session count, last activity
    • OpenAI API: available/unavailable, reason/setup hint
  • This would let the UI show a polished provider switcher with honest availability instead of hiding missing sources.
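
A rough sketch of how the source cards described above could be assembled. The card keys, the build_source_cards name, and the convention that None means "source unavailable" are assumptions for illustration; the issue only names the three sources and the facts each card should carry. The sketch also assumes last_message_at values are ISO 8601 strings in one timezone and format, so that string comparison orders them correctly.

```python
from typing import Any, Dict, List, Optional

Session = Dict[str, Any]


def _card(
    source: str,
    label: str,
    sessions: Optional[List[Session]],
    unavailable_reason: str,
) -> Dict[str, Any]:
    """Build one card; sessions=None is taken to mean the source is unavailable."""
    if sessions is None:
        return {
            "source": source,
            "provider_label": label,
            "available": False,
            "session_count": 0,
            "last_activity": None,
            "reason": unavailable_reason,
        }
    stamps = [s["last_message_at"] for s in sessions if s.get("last_message_at")]
    return {
        "source": source,
        "provider_label": label,
        "available": True,
        "session_count": len(sessions),
        "last_activity": max(stamps, default=None),
        "reason": None,
    }


def build_source_cards(
    claude_sessions: Optional[List[Session]],
    codex_sessions: Optional[List[Session]],
    openai_sessions: Optional[List[Session]],
) -> List[Dict[str, Any]]:
    """Return one card per known source, including sources with nothing to show."""
    return [
        _card("claude_code", "Claude Code", claude_sessions, "no Claude Code history found"),
        _card("codex_cli", "Codex CLI", codex_sessions, "no Codex CLI history found"),
        _card("openai_api", "OpenAI API", openai_sessions, "no owned event source configured"),
    ]
```

Returning a card for every source, available or not, is what lets the UI show availability honestly instead of hiding missing providers.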

null changed title from 2-C: Backend: Codex CLI / OpenAI GPT Session Source to Phase 2-C: Backend Codex CLI / OpenAI GPT Session Source 2026-05-24 16:56:09 -05:00
null closed this issue 2026-05-24 18:07:12 -05:00