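A Python-based scaffold for a multi-service agent development platform.
The repository is initialized as a monorepo containing:
- services/: core microservices
- libs/: shared domain models, DSL, events, database, and common components
- deployments/: placeholders for local and cluster deployment
- docs/: planning and database design documents
Services:
- api-gateway
- model-gateway-service
- session-service
- workflow-service
- runtime-service
- agent-service
- memory-service
- team-service
- skill-service
- human-service
- knowledge-service
- event-service
- auth-service
- scheduler-service
- tool-service
Each service provides a minimal FastAPI entry point and a health check endpoint. The database-backed services also include SQLAlchemy model skeletons and an Alembic directory.
Shared libraries:
- core-domain
- core-dsl
- core-events
- core-shared
- core-db
Create a virtual environment with uv or pip, then install each service's dependencies: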
cd D:\workspace\auto-platform
python -m venv .venv
.venv\Scripts\activate
pip install -e .\libs\core-shared
pip install -e .\libs\core-domain
pip install -e .\libs\core-dsl
pip install -e .\libs\core-events
pip install -e .\libs\core-db
pip install -e .\services\api-gateway
pip install -e .\services\session-service
pip install -e .\services\workflow-service
pip install -e .\services\runtime-service
pip install -e .\services\agent-service
pip install -e .\services\memory-service
pip install -e .\services\team-service
pip install -e .\services\skill-service
pip install -e .\services\human-service
pip install -e .\services\knowledge-service
pip install -e .\services\event-service
pip install -e .\services\auth-service
pip install -e .\services\scheduler-service
pip install -e .\services\tool-service
Run example:
cd D:\workspace\auto-platform\services\api-gateway
uvicorn app.main:app --reload --port 8000
By default, each service connects to a SQLite file in its own directory. Override the connection with an environment variable:
$env:AGENT_PLATFORM_DATABASE_URL="postgresql+psycopg://user:password@localhost:5432/workflow_db"
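Added in this iteration:
- libs/core-db: unified SQLAlchemy Base, common mixins, and naming conventions
- workflow-service: app and workflow definition models
- session-service: session and message models
- runtime-service: run and node execution models
- tool-service: tool definition and binding models
- alembic.ini, env.py, versions/
- workflow-service: repository / application service / CRUD API wired in
- session-service: repository / application service / CRUD API wired in
Migration example: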
cd D:\workspace\auto-platform\services\workflow-service
alembic upgrade head
The same applies to the other services:
- services/session-service
- services/runtime-service
- services/tool-service
API examples:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8002/workflows/apps `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","code":"sales_assistant","name":"Sales Assistant"}'
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8001/sessions `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","app_id":"app-1","user_id":"user-1","channel_type":"web"}'
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8002/workflows/versions `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","workflow_id":"wf-1","dsl_json":{"nodes":[],"edges":[]}}'
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8001/sessions/run-requests `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","session_id":"sess-1","app_version_id":"appv-1","workflow_version_id":"wfv-1"}'
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8003/runtime/runs `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","app_id":"app-1","app_version_id":"appv-1","workflow_id":"wf-1","workflow_version_id":"wfv-1","session_id":"sess-1","initial_node":{"node_id":"start","node_type":"llm"}}'
If initial_node is omitted, runtime-service calls workflow-service to read the corresponding workflow version and derives the first node from the DSL automatically:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8003/runtime/runs `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","app_id":"app-1","app_version_id":"appv-1","workflow_id":"wf-1","workflow_version_id":"wfv-1","session_id":"sess-1"}'
Dispatch a run request straight through to runtime in a single call:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8001/sessions/run-requests/dispatch `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","session_id":"sess-1","app_id":"app-1","app_version_id":"appv-1","workflow_id":"wf-1","workflow_version_id":"wfv-1","initial_node":{"node_id":"start","node_type":"llm"}}'
Tool definition examples:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8004/tools `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","code":"search_products","name":"Search Products","tool_type":"http"}'
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8004/tools/versions `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","tool_id":"tool-1","input_schema_json":{"query":{"type":"string"}},"invoke_config_json":{"method":"GET","path":"/products/search"}}'
Run status progression examples:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8003/runtime/node-runs/node-run-id/status `
-ContentType "application/json" `
-Body '{"status":"running","worker_key":"runtime-worker-1"}'
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8003/runtime/runs/run-id/status `
-ContentType "application/json" `
-Body '{"status":"completed"}'
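The notes that follow describe how runtime-service derives workflow_run.status from the node_run statuses of a run. A minimal Python sketch of that precedence, assuming statuses are plain strings. The function name, and the choice to return None when no rule applies (for example, when only queued nodes remain), are illustrative assumptions rather than the service's actual code:

```python
from typing import Iterable, Optional


def aggregate_run_status(node_statuses: Iterable[str]) -> Optional[str]:
    """Derive a run status from node statuses: failed > running > completed."""
    statuses = list(node_statuses)
    if not statuses:
        return None  # assumption: nothing to aggregate yet
    if "failed" in statuses:
        return "failed"
    if "running" in statuses:
        return "running"
    if all(status in ("completed", "skipped") for status in statuses):
        return "completed"
    # Assumption: in any other mix (e.g. queued nodes remain) the run
    # status is left unchanged by the caller.
    return None
```

A caller would update workflow_run.status only when the function returns a value.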
Notes:
- When a node status is updated through node-runs/{node_run_id}/status, runtime-service automatically aggregates the statuses of all node_run records in the current run and refreshes workflow_run.status.
- If any node is failed, the run is failed; if any node is running, the run is running; if all nodes are completed/skipped, the run is completed.
- When a node_run is updated to completed, runtime-service also looks up successor nodes in the workflow version's DSL and creates new node_run records in queued status.
services/
  api-gateway/
  session-service/
  workflow-service/
  runtime-service/
  skill-service/
  human-service/
  knowledge-service/
  event-service/
  auth-service/
  scheduler-service/
  tool-service/
libs/
  core-domain/
  core-dsl/
  core-events/
  core-shared/
  core-db/
deployments/
  docker/
  k8s/
docs/
tests/
The V0.1 repository / service layer
agent-service stores strongly typed agent definitions, versioned prompts/configuration, and agent run records.
Create an agent:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8007/agents `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","code":"sales_agent","name":"Sales Agent","agent_type":"assistant"}'
Create a published agent version:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8007/agents/versions `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","agent_id":"agent-id","status":"published","role":"sales_assistant","goal":"Help qualify leads","system_prompt":"You are a careful sales assistant."}'
Enable multi-step ReAct planning for an agent version:
{
"model_config": {
"react_enabled": true,
"react_max_steps": 5
}
}
When ReAct is enabled, the model can emit JSON tool actions such as
{"action":"tool","tool_code":"lookup_order","input_json":{"order_id":"123"}}
and then finish with {"action":"finish","answer":"..."}. Each tool call is
persisted in agent_tool_invocation.
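To make the action protocol concrete, here is a small Python sketch of a ReAct-style loop that parses the JSON actions shown above and stops at react_max_steps. It is an illustration under assumptions: the model is replaced by a pre-scripted list of outputs, tools are plain callables, and the invocation record fields are hypothetical rather than the agent_tool_invocation schema:

```python
import json
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

ToolFn = Callable[[Dict[str, Any]], Any]


def run_react(
    model_outputs: Iterable[str],
    tools: Dict[str, ToolFn],
    react_max_steps: int = 5,
) -> Tuple[Optional[str], List[Dict[str, Any]]]:
    """Consume JSON actions until a finish action or the step limit."""
    invocations: List[Dict[str, Any]] = []
    outputs = iter(model_outputs)
    for _ in range(react_max_steps):
        raw = next(outputs, None)
        if raw is None:
            break  # the scripted model has nothing more to say
        action = json.loads(raw)
        if action.get("action") == "finish":
            return action.get("answer"), invocations
        if action.get("action") == "tool":
            tool_code = action["tool_code"]
            input_json = action.get("input_json", {})
            output = tools[tool_code](input_json)
            # Hypothetical record shape; the real table has more columns.
            invocations.append(
                {"tool_code": tool_code, "input_json": input_json, "output_json": output}
            )
    return None, invocations  # step limit reached without a finish action
```

In a real loop, each tool output would be fed back into the next model call; the scripted list stands in for that round trip.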
Create an agent run. If agent_version_id is omitted, the latest published version is used:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8007/agents/runs `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","agent_id":"agent-id","session_id":"session-id","input_text":"Summarize this lead."}'
List tool invocation records for an agent run:
Invoke-RestMethod `
-Uri "http://127.0.0.1:8007/agents/runs/agent-run-id/tool-invocations?tenant_id=t1"
Agent execution now persists tool invocation audit records with selected,
running, skipped, completed, or failed status, including input/output payloads
and started_time / finished_time.
Through api-gateway, use /gateway/agents/**.
memory-service stores scoped memories for tenants, users, sessions, agents, and teams. The first version uses database text search so it works without vector infrastructure; pgvector can be added later behind the same API.
Memory search now stores a local deterministic embedding per memory and uses hybrid rerank:
- keyword_score: token overlap and frequency
- vector_score: cosine similarity over local hash embeddings
- importance_score: normalized memory importance boost
- rerank_mode: hybrid-local
Create a memory:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8008/memories `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","scope_type":"session","scope_id":"session-id","memory_type":"fact","content_text":"User prefers concise answers.","importance_score":80}'
Search memories:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8008/memories/search `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","query":"concise","scope_type":"session","scope_id":"session-id","limit":5}'
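The hybrid-local rerank described earlier combines keyword_score, vector_score, and importance_score. The Python sketch below shows one way such a score could be computed with no external dependencies. The embedding dimension, the SHA-256 bucketing, the 0.5/0.4/0.1 weights, and the 0-100 importance range are all assumptions for illustration, and keyword_score here measures token overlap only:

```python
import hashlib
import math
import re
from typing import List

EMBEDDING_DIM = 64  # hypothetical size


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def hash_embedding(text: str, dim: int = EMBEDDING_DIM) -> List[float]:
    """Deterministic bag-of-tokens embedding: each token increments one bucket."""
    vector = [0.0] * dim
    for token in tokenize(text):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vector[int.from_bytes(digest[:4], "big") % dim] += 1.0
    return vector


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of distinct query tokens that appear in the text."""
    query_tokens = set(tokenize(query))
    if not query_tokens:
        return 0.0
    return len(query_tokens & set(tokenize(text))) / len(query_tokens)


def hybrid_score(query: str, content_text: str, importance_score: float) -> float:
    kw = keyword_score(query, content_text)
    vec = cosine(hash_embedding(query), hash_embedding(content_text))
    importance = max(0.0, min(1.0, importance_score / 100.0))
    return 0.5 * kw + 0.4 * vec + 0.1 * importance  # assumed weights
```

Because the embedding depends only on the text, the same memory always receives the same vector, which is what makes the search reproducible without a vector database.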
Through api-gateway, use /gateway/memories/**.
team-service stores multi-agent team definitions, versioned member composition, coordination mode, and team run records. The first version provides the team management backbone; later versions can connect team runs to supervisor/planner/member agent execution.
Create a team:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8009/teams `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","code":"research_team","name":"Research Team","team_type":"collaborative"}'
Create a published team version:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8009/teams/versions `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","team_id":"team-id","status":"published","coordination_mode":"supervisor","objective":"Research and summarize complex questions","member_refs":[{"member_key":"lead","agent_id":"agent-lead","role":"supervisor","responsibility":"Plan and assign work"},{"member_key":"writer","agent_id":"agent-writer","role":"executor","responsibility":"Draft final answer"}]}'
Create a team run. If team_version_id is omitted, the latest published version is used:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8009/teams/runs `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","team_id":"team-id","session_id":"session-id","input_text":"Analyze this customer request."}'
Through api-gateway, use /gateway/teams/**.
Execute a team run. The first implementation creates and executes one agent run
per member, then stores a team-level summary. dry_run=true lets this work
without model API keys:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8009/teams/runs/team-run-id/execute `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","worker_key":"team-worker-1","dry_run":true}'
Execute one queued team run through the worker claim API:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8009/teams/workers/execute-next `
-ContentType "application/json" `
-Body '{"worker_key":"team-worker-1","lease_seconds":300,"dry_run":true}'
Run a standalone team worker process:
Push-Location .\services\team-service
$env:AGENT_PLATFORM_DATABASE_URL="sqlite:///./team_service.db"
$env:AGENT_PLATFORM_WORKER_DRY_RUN="true"
..\..\.venv\Scripts\python -m app.worker
Pop-Location
skill-service stores reusable skill definitions, versioned parameter/output schemas,
marketplace-style installations, and executable skill runs. The first executor supports a
dependency-free template runtime so local development works without API keys.
Create a skill:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8010/skills `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","code":"hello_user","name":"Hello User","skill_type":"template"}'
Create a published skill version:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8010/skills/versions `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","skill_id":"skill-id","status":"published","runtime_type":"template","parameter_schema_json":{"name":{"type":"string"}},"implementation_json":{"template":"Hello $name"}}'
Install the skill for a tenant, agent, team, app, or user scope:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8010/skills/installations `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","skill_id":"skill-id","install_scope":"tenant","scope_id":"t1","installed_by":"user-1"}'
Create and execute a skill run:
$run = Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8010/skills/runs `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","skill_id":"skill-id","input_json":{"name":"Lucas"}}'
Invoke-RestMethod -Method Post `
-Uri "http://127.0.0.1:8010/skills/runs/$($run.id)/execute" `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","worker_key":"skill-worker-1"}'
Through api-gateway, use /gateway/skills/**.
human-service stores human-in-the-loop tasks for approval, input collection,
takeover, pause, and resume flows.
Create an approval task:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8011/human/tasks `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","task_type":"approval","title":"Approve refund","run_id":"run-id","node_run_id":"node-run-id","assigned_to":"ops-1","request_payload_json":{"amount":99}}'
Claim and complete a task:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8011/human/tasks/human-task-id/claim `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","claimed_by":"ops-1"}'
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8011/human/tasks/human-task-id/complete `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","status":"approved","response_payload_json":{"approved":true}}'
Through api-gateway, use /gateway/human/**.
Runtime human-in-the-loop nodes now create human-service tasks and pause the
node in pending status until the task is completed. Supported node types:
- human
- approval
- human-input
- human-takeover
After completing the human task, resume the blocked node:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8003/runtime/node-runs/node-run-id/resume-human `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","human_task_id":"human-task-id","worker_key":"runtime-worker-1"}'
knowledge-service stores independent knowledge bases, documents, chunks, and
retrieval results. It defaults to deterministic local hash embeddings plus keyword
scoring, so it works without external API keys. For production, set
AGENT_PLATFORM_EMBEDDING_PROVIDER=http with an OpenAI-compatible
/embeddings endpoint; if the provider fails and fallback is enabled, indexing
and search fall back to local hash embeddings.
When running on PostgreSQL with pgvector, knowledge_chunk.embedding_vector
is populated and search uses pgvector cosine similarity first, then combines it
with keyword scoring. SQLite and other databases automatically fall back to the
JSON embedding hybrid search path.
Create a knowledge base:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8012/knowledge/bases `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","code":"support_kb","name":"Support Knowledge Base"}'
Create and index a document:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8012/knowledge/documents `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","knowledge_base_id":"kb-id","title":"Refund Policy","content_text":"Refunds are available within seven days for eligible orders.","source_type":"text"}'
Search the knowledge base:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8012/knowledge/search `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","knowledge_base_id":"kb-id","query":"refund within seven days","top_k":3}'
Through api-gateway, use /gateway/knowledge/**.
event-service stores platform events with delivery status so services can use
a durable outbox pattern now, and later swap delivery to Kafka/RabbitMQ behind
the same API.
Publish an event:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8013/events `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","event_type":"run.created","source_service":"runtime-service","aggregate_type":"workflow_run","aggregate_id":"run-id","payload_json":{"run_id":"run-id"}}'
Claim pending events for a delivery worker:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8013/events/claim-pending `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","limit":50}'
Through api-gateway, use /gateway/events/**.
auth-service stores users, roles, role assignments, and permission checks.
This is the first RBAC layer for tenant governance.
$user = Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8014/auth/users `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","username":"alice","display_name":"Alice"}'
$role = Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8014/auth/roles `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","code":"admin","name":"Admin","permissions_json":["*"]}'
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8014/auth/assignments `
-ContentType "application/json" `
-Body "{`"tenant_id`":`"t1`",`"user_id`":`"$($user.id)`",`"role_id`":`"$($role.id)`"}"
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8014/auth/permissions/check `
-ContentType "application/json" `
-Body "{`"tenant_id`":`"t1`",`"user_id`":`"$($user.id)`",`"permission`":`"workflow:write`"}"
Through api-gateway, use /gateway/auth/**.
scheduler-service stores delayed jobs and due-job leases for time-based
automation. It is intentionally service-neutral: jobs can target HTTP,
event, runtime, agent, or team execution.
Create a scheduled job:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8015/scheduler/jobs `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","job_type":"runtime","name":"Run workflow later","schedule_time":"2026-04-26T12:00:00Z","payload_json":{"workflow_run_id":"run-id"}}'
Claim due jobs for a worker:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8015/scheduler/jobs/claim-due `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","worker_key":"scheduler-worker-1","limit":20}'
Mark a job completed or failed:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8015/scheduler/jobs/job-id/status `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","status":"completed"}'
Through api-gateway, use /gateway/scheduler/**.
Run the scheduler worker locally:
Push-Location .\services\scheduler-service
$env:AGENT_PLATFORM_DATABASE_URL="sqlite:///./scheduler_service.db"
$env:AGENT_PLATFORM_EVENT_SERVICE_URL="http://127.0.0.1:8013"
python -m app.worker
Pop-Location
Execute an agent run without calling an external model:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8007/agents/runs/agent-run-id/execute `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","worker_key":"agent-worker-1","dry_run":true}'
Execute with model-gateway-service:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8007/agents/runs/agent-run-id/execute `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","worker_key":"agent-worker-1"}'
Agent memory policy is stored on agent_version.memory_policy_json:
- enabled: read memories before execution
- memory_scope: one of tenant, user, session, agent, or team
- read_top_k: maximum memories to inject into the prompt
- write_enabled: write a conversation memory after successful model execution
- config_json.write_importance_score: optional importance score for written memories
Agent capability refs are stored on agent_version.tool_refs_json and agent_version.skill_refs_json.
- required=true, config_json.auto_invoke=true, or selection_keywords match the run input.
- config_json.auto_invoke=false; selection_keywords can also select them.
- selected_tool_refs and selected_skill_refs without calling downstream tools/skills.
Example version with session memory:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8007/agents/versions `
-ContentType "application/json" `
-Body '{"tenant_id":"t1","agent_id":"agent-id","status":"published","role":"assistant","system_prompt":"Use relevant memory when helpful.","memory_policy":{"enabled":true,"memory_scope":"session","read_top_k":5,"write_enabled":true,"config_json":{"write_importance_score":60}}}'
Execute one queued agent run through the worker claim API:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8007/agents/workers/execute-next `
-ContentType "application/json" `
-Body '{"worker_key":"agent-worker-1","lease_seconds":300,"dry_run":true}'
Run a standalone agent worker process:
Push-Location .\services\agent-service
$env:AGENT_PLATFORM_DATABASE_URL="sqlite:///./agent_service.db"
$env:AGENT_PLATFORM_WORKER_DRY_RUN="true"
..\..\.venv\Scripts\python -m app.worker
Pop-Location
runtime-service now includes a typed executor skeleton for these node types:
- llm
- tool
- code
- human
- approval
- human-input
- human-takeover
- answer
- if-else
- assigner
- knowledge-retrieval
- template-transform
Execute a specific queued node:
Invoke-RestMethod -Method Post `
-Uri http://127.0.0.1:8003/runtime/node-runs/node-run-id/execute `
-ContentType "application/json" `
-Body '{"worker_key":"runtime-worker-1"}'
Execute the next queued node in a run:
Invoke-RestMethod -Method Post `
-Uri "http://127.0.0.1:8003/runtime/runs/run-id/execute-next?tenant_id=t1" `
-ContentType "application/json" `
-Body '{"worker_key":"runtime-worker-1"}'
Execute queued nodes in sequence until the run is finished, blocked, or reaches max_steps:
Invoke-RestMethod -Method Post `
-Uri "http://127.0.0.1:8003/runtime/runs/run-id/execute?tenant_id=t1" `
-ContentType "application/json" `
-Body '{"worker_key":"runtime-worker-1","max_steps":16}'
Execute one queued node through the worker claim API:
Invoke-RestMethod -Method Post `
-Uri "http://127.0.0.1:8003/runtime/workers/execute-next" `
-ContentType "application/json" `
-Body '{"worker_key":"runtime-worker-1","lease_seconds":300}'
Run a standalone runtime worker process:
Push-Location .\services\runtime-service
$env:AGENT_PLATFORM_DATABASE_URL="sqlite:///./runtime_service.db"
..\..\.venv\Scripts\python -m app.worker
Pop-Location
The worker uses node_run.status plus lease_expire_time as a DB-backed queue. This keeps the first scalable version dependency-light; for heavier production concurrency, move AGENT_PLATFORM_DATABASE_URL to PostgreSQL before scaling many workers.
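The claim step of such a DB-backed queue can be sketched in Python with the standard sqlite3 module. This is an illustration under assumptions: the table below keeps only the columns needed for the idea (worker_key and the exact schema are hypothetical), and the real worker goes through the service's own models:

```python
import sqlite3
import time
from typing import Optional


def create_schema(conn: sqlite3.Connection) -> None:
    # Simplified stand-in for the real node_run table.
    conn.execute(
        "CREATE TABLE node_run ("
        "id TEXT PRIMARY KEY, status TEXT NOT NULL, "
        "worker_key TEXT, lease_expire_time REAL)"
    )


def claim_next(
    conn: sqlite3.Connection,
    worker_key: str,
    lease_seconds: float,
    now: Optional[float] = None,
) -> Optional[str]:
    """Claim one queued node, or one whose lease has expired."""
    now = time.time() if now is None else now
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT id FROM node_run "
            "WHERE status = 'queued' "
            "OR (status = 'running' AND lease_expire_time < ?) "
            "ORDER BY id LIMIT 1",
            (now,),
        ).fetchone()
        if row is None:
            return None
        # The guarded UPDATE only succeeds if the row is still claimable,
        # so a worker that lost the race sees rowcount == 0.
        updated = conn.execute(
            "UPDATE node_run SET status = 'running', worker_key = ?, "
            "lease_expire_time = ? "
            "WHERE id = ? AND (status = 'queued' OR lease_expire_time < ?)",
            (worker_key, now + lease_seconds, row[0], now),
        )
        return row[0] if updated.rowcount == 1 else None
```

An expired lease makes a node claimable again, which is how work from a crashed worker gets picked up by another one.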
Node execution results are now persisted on node_run:
- output_text
- output_json
Node execution artifacts are also persisted on node_artifact:
- artifact_type
- content_text
- content_json
- storage_uri
- size_bytes
Query artifacts:
Invoke-RestMethod `
-Uri "http://127.0.0.1:8003/runtime/node-artifacts?tenant_id=t1&run_id=run-id"
Trace spans are persisted on trace_span for timeline and latency analysis:
- span_type
- name
- status
- started_time
- ended_time
- duration_ms
- attributes_json
- error_code
- error_message
Query trace spans:
Invoke-RestMethod `
-Uri "http://127.0.0.1:8003/runtime/trace-spans?tenant_id=t1&run_id=run-id"
Current behavior:
- answer nodes persist rendered text to output_text
- assigner nodes write state_updates to output_json
- condition / if-else nodes write condition_result and route to output_json
- template-transform nodes render text or JSON using previous node outputs and run state
- knowledge-retrieval / retriever nodes run keyword retrieval over inline or HTTP JSON documents
- tool nodes persist resolved binding/tool metadata to output_json
- output_json
- config.join_policy
- config.allow_loop=true and config.max_iterations
- config.retry_policy.max_attempts and retry_delay_seconds
- config.delay_seconds and config.timeout_seconds
- config.compensation_node_id
Runtime template context:
- state.xxx: values written by previous assigner nodes
- nodes.node_id.output.xxx: structured output from a previous node
- nodes.node_id.text: text output from a previous node
- current.node_id: current node id
Assigner node config example:
{
"id": "seed-state",
"type": "assigner",
"config": {
"assignments": {
"score": 7,
"user_name": "Alice"
}
}
}
Condition node config example:
{
"id": "check-score",
"type": "if-else",
"config": {
"expression": "state.score >= 5"
}
}
Conditional edge example:
[
{"source": "check-score", "target": "high-path", "condition": "true"},
{"source": "check-score", "target": "low-path", "condition": "false"}
]
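The condition node and its conditional edges work together: the expression yields a boolean, and the edge whose condition matches that result is followed. A minimal Python sketch, assuming expressions of the form `state.<key> <operator> <literal>`. The tiny parser and the function names are illustrative assumptions; the expression language actually supported by runtime-service is not specified here:

```python
import ast
import operator
from typing import Any, Dict, List

_OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def evaluate_condition(expression: str, state: Dict[str, Any]) -> bool:
    """Evaluate 'state.<key> <op> <literal>' without using eval()."""
    left, op, right = expression.split(maxsplit=2)
    if not left.startswith("state."):
        raise ValueError(f"unsupported operand: {left}")
    value = state[left[len("state."):]]
    return bool(_OPERATORS[op](value, ast.literal_eval(right)))


def next_targets(node_id: str, result: bool, edges: List[Dict[str, str]]) -> List[str]:
    """Return targets of edges leaving node_id whose condition matches the result."""
    wanted = "true" if result else "false"
    return [
        edge["target"]
        for edge in edges
        if edge["source"] == node_id and edge.get("condition") == wanted
    ]
```

With the assigner example above (score 7), `state.score >= 5` is true, so the run continues at high-path.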
Join node config example:
{
"id": "join-results",
"type": "join",
"config": {
"join_policy": "all_completed"
}
}
Loop and retry config example:
{
"id": "poll-status",
"type": "tool",
"config": {
"allow_loop": true,
"max_iterations": 5,
"timeout_seconds": 30,
"retry_policy": {
"max_attempts": 3,
"retry_delay_seconds": 2
}
}
}
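The retry_policy fields in this example can be read as a bounded retry loop. The Python sketch below shows that reading, assuming max_attempts counts the first try and retry_delay_seconds is a fixed pause between attempts. Both are assumptions about semantics, and run_with_retry is a hypothetical helper, not a runtime-service function:

```python
import time
from typing import Any, Callable, Dict, Optional


def run_with_retry(
    action: Callable[[], Any],
    retry_policy: Dict[str, Any],
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call action until it succeeds or max_attempts is exhausted."""
    max_attempts = max(1, int(retry_policy.get("max_attempts", 1)))
    delay = float(retry_policy.get("retry_delay_seconds", 0))
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as exc:  # a real executor would catch narrower errors
            last_error = exc
            if attempt < max_attempts:
                sleep(delay)  # fixed delay, no backoff
    assert last_error is not None
    raise last_error
```

Passing sleep as a parameter keeps the helper easy to exercise without real waiting.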
Compensation config example:
{
"id": "charge-card",
"type": "tool",
"config": {
"compensation_node_id": "refund-card"
}
}
Template node config example:
{
"id": "high-path",
"type": "template-transform",
"config": {
"template": "{{state.user_name}} passed with score {{state.score}}"
}
}
Retriever node config example:
{
"id": "retrieve-docs",
"type": "knowledge-retrieval",
"config": {
"query_template": "{{state.query}}",
"top_k": 2,
"documents": [
{
"id": "refund",
"title": "Refund Policy",
"text": "Refund policy allows returns within seven days."
},
{
"id": "shipping",
"title": "Shipping Policy",
"text": "Shipping usually takes three to five business days."
}
]
}
}
Retriever nodes can call knowledge-service directly:
{
"id": "retrieve-kb",
"type": "knowledge-retrieval",
"config": {
"knowledge_base_id": "kb-id",
"query_template": "{{state.query}}",
"top_k": 3,
"filters_json": {
"source_type": "text"
}
}
}
Retriever output is persisted to node_run.output_json.retrieved_documents. Template nodes can consume it:
{
"id": "render-answer",
"type": "template-transform",
"config": {
"template": "Top doc: {{nodes.retrieve-docs.output.retrieved_documents.0.title}}"
}
}
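The `{{...}}` placeholders address values by dotted path, including list indexes such as `retrieved_documents.0.title`. A minimal Python sketch of that lookup, assuming a context dictionary with state, nodes, and current keys shaped like the runtime template context described earlier. The renderer itself is illustrative and may differ from the runtime's implementation:

```python
import re
from typing import Any, Dict

_PLACEHOLDER = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def resolve_path(context: Dict[str, Any], path: str) -> Any:
    """Walk a dotted path; numeric parts index into lists."""
    current: Any = context
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]
        else:
            current = current[part]
    return current


def render_template(template: str, context: Dict[str, Any]) -> str:
    """Replace every {{path}} placeholder with its value from the context."""
    return _PLACEHOLDER.sub(
        lambda match: str(resolve_path(context, match.group(1))), template
    )
```

Because paths are split on dots, node ids containing hyphens (retrieve-docs) resolve fine, while ids containing dots would not in this sketch.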
Retriever nodes can also load documents from an HTTP JSON source:
{
"id": "retrieve-remote-docs",
"type": "retriever",
"config": {
"query": "refund policy",
"source_url": "http://127.0.0.1:9000/documents",
"top_k": 3
}
}
The HTTP source should return either a document list or an object with a documents list.
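Accepting both response shapes comes down to a small normalization step. A Python sketch, with extract_documents as a hypothetical helper name:

```python
from typing import Any, Dict, List


def extract_documents(payload: Any) -> List[Dict[str, Any]]:
    """Accept either a bare document list or {"documents": [...]}."""
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict) and isinstance(payload.get("documents"), list):
        return payload["documents"]
    raise ValueError("expected a document list or an object with a 'documents' list")
```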
Run the no-key runtime smoke test after local services are running:
.\.venv\Scripts\python scripts\smoke_runtime_no_key.py
Run the same smoke test through api-gateway:
$env:AGENT_PLATFORM_SMOKE_WORKFLOW_URL="http://127.0.0.1:8000/gateway/workflows"
$env:AGENT_PLATFORM_SMOKE_RUNTIME_URL="http://127.0.0.1:8000/gateway/runtime"
.\.venv\Scripts\python scripts\smoke_runtime_no_key.py
api-gateway provides a unified entrypoint:
- GET /gateway/services/health
- /gateway/workflows/** -> workflow-service /workflows/**
- /gateway/sessions/** -> session-service /sessions/**
- /gateway/runtime/** -> runtime-service /runtime/**
- /gateway/agents/** -> agent-service /agents/**
- /gateway/memories/** -> memory-service /memories/**
- /gateway/teams/** -> team-service /teams/**
- /gateway/skills/** -> skill-service /skills/**
- /gateway/human/** -> human-service /human/**
- /gateway/knowledge/** -> knowledge-service /knowledge/**
- /gateway/events/** -> event-service /events/**
- /gateway/auth/** -> auth-service /auth/**
- /gateway/scheduler/** -> scheduler-service /scheduler/**
- /gateway/tools/** -> tool-service /tools/**
- /gateway/models/** -> model-gateway-service /models/**
- /gateway/code/** -> code-runner-service /code/**
Gateway readiness:
Invoke-RestMethod -Uri "http://127.0.0.1:8000/ready"
Downstream health:
Invoke-RestMethod -Uri "http://127.0.0.1:8000/gateway/services/health"
Gateway request context:
- An incoming x-request-id is reused; otherwise the gateway generates one.
- An incoming x-tenant-id is reused; otherwise the gateway falls back to the tenant_id query parameter, then to public.
- The gateway forwards x-request-id and x-tenant-id to downstream services.
- Requests are recorded in gateway_request_audit.
Query gateway audits:
Invoke-RestMethod `
-Uri "http://127.0.0.1:8000/gateway/audits?tenant_id=t1&limit=20" `
-Headers @{"x-tenant-id"="t1"}
Query gateway audit stats:
Invoke-RestMethod `
-Uri "http://127.0.0.1:8000/gateway/audits/stats?tenant_id=t1" `
-Headers @{"x-tenant-id"="t1"}
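The request-context rules above (reuse incoming headers, otherwise fall back) can be sketched in a few lines of Python. The function name and the UUID format of generated request ids are assumptions; only the header names, the tenant_id query parameter, and the public default come from the description:

```python
import uuid
from typing import Dict, Mapping


def resolve_request_context(
    headers: Mapping[str, str], query: Mapping[str, str]
) -> Dict[str, str]:
    """Compute the x-request-id and x-tenant-id to forward downstream."""
    normalized = {key.lower(): value for key, value in headers.items()}
    # Reuse the caller's request id, or generate one (format is an assumption).
    request_id = normalized.get("x-request-id") or uuid.uuid4().hex
    # Header first, then the tenant_id query parameter, then "public".
    tenant_id = normalized.get("x-tenant-id") or query.get("tenant_id") or "public"
    return {"x-request-id": request_id, "x-tenant-id": tenant_id}
```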
Gateway API Key auth:
- AGENT_PLATFORM_AUTH_REQUIRED=false by default for local development.
- Set AGENT_PLATFORM_AUTH_REQUIRED=true to protect /gateway/**, except /gateway/services/health.
- POST /gateway/api-keys is allowed as bootstrap.
- API key status is active, disabled, or revoked; only active keys are accepted.
- If an API key has scopes, the gateway checks them before proxying. Use *, gateway:agents:*, or exact permissions such as gateway:agents:read.
- Set AGENT_PLATFORM_AUTHZ_REQUIRED=true to require x-user-id and call auth-service /auth/permissions/check for the derived permission.
Create an API key:
$body = @{
tenant_id = "t1"
name = "local-dev"
scopes = "gateway:agents:* gateway:runtime:read"
} | ConvertTo-Json
$created = Invoke-RestMethod `
-Method Post `
-Uri "http://127.0.0.1:8000/gateway/api-keys" `
-ContentType "application/json" `
-Body $body
$created.api_key
Use an API key:
Invoke-RestMethod `
-Uri "http://127.0.0.1:8000/gateway/audits?tenant_id=t1" `
-Headers @{"x-tenant-id"="t1"; "x-api-key"=$created.api_key}
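Scope checking against values such as `*`, `gateway:agents:*`, and `gateway:agents:read` can be sketched as simple pattern matching. This Python example assumes scopes arrive as a space-separated string (as in the key-creation example), that a trailing `:*` matches any permission under that prefix, and that a key without scopes is unrestricted. All three are assumptions about the gateway's behavior:

```python
def scope_allows(scope: str, permission: str) -> bool:
    """Match one scope against one permission."""
    if scope == "*":
        return True
    if scope.endswith(":*"):
        # "gateway:agents:*" -> prefix "gateway:agents:"
        return permission.startswith(scope[:-1])
    return scope == permission


def key_allows(scopes: str, permission: str) -> bool:
    """Check a space-separated scope string against a required permission."""
    parts = scopes.split()
    if not parts:
        return True  # assumption: a key without scopes is not restricted
    return any(scope_allows(scope, permission) for scope in parts)
```

With the scopes from the example key, agent reads and writes pass, runtime reads pass, and runtime writes are rejected.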
Disable or revoke an API key:
$body = @{
tenant_id = "t1"
status = "revoked"
} | ConvertTo-Json
Invoke-RestMethod `
-Method Patch `
-Uri "http://127.0.0.1:8000/gateway/api-keys/$($created.id)/status" `
-ContentType "application/json" `
-Headers @{"x-tenant-id"="t1"; "x-api-key"=$created.api_key} `
-Body $body
Run smoke test through an authenticated gateway:
$env:AGENT_PLATFORM_SMOKE_WORKFLOW_URL="http://127.0.0.1:8000/gateway/workflows"
$env:AGENT_PLATFORM_SMOKE_RUNTIME_URL="http://127.0.0.1:8000/gateway/runtime"
$env:AGENT_PLATFORM_SMOKE_TENANT_ID="t1"
$env:AGENT_PLATFORM_SMOKE_API_KEY=$created.api_key
.\.venv\Scripts\python scripts\smoke_runtime_no_key.py
HTTP tool node config example:
{
"id": "search-products",
"type": "tool",
"config": {
"tool_binding_id": "binding-1",
"query": {
"keyword": "milk"
}
}
}
Supported HTTP tool config resolution order:
- config.url or invoke_config_json.url
- config.base_url or binding.config_json.base_url or invoke_config_json.base_url
- config.path or invoke_config_json.path
- invoke_config_json.method, default GET
- invoke_config_json.query + config.query
- invoke_config_json.body + config.body
- invoke_config_json.headers + binding.config_json.headers + config.headers
LLM node config example:
{
"id": "draft-answer",
"type": "llm",
"config": {
"model": "gpt-4o-mini",
"system_prompt": "You are a customer support assistant.",
"prompt": "Summarize the user intent in Chinese.",
"temperature": 0.2,
"max_tokens": 400
}
}
llm nodes also support explicit messages:
{
"id": "rewrite-message",
"type": "llm",
"config": {
"model": "gpt-4o-mini",
"messages": [
{"role": "system", "content": "You are a concise editor."},
{"role": "user", "content": "Rewrite this sentence in a warmer tone."}
]
}
}
runtime-service sends llm execution requests to model-gateway-service, and the gateway forwards them to an OpenAI-compatible /chat/completions provider.
Recommended environment variables for model-gateway-service:
$env:AGENT_PLATFORM_PROVIDER_BASE_URL="https://api.openai.com/v1"
$env:AGENT_PLATFORM_PROVIDER_API_KEY="your-api-key"
$env:AGENT_PLATFORM_DEFAULT_MODEL="gpt-4o-mini"
Code node config example:
{
"id": "compute-summary",
"type": "code",
"config": {
"language": "python",
"timeout_seconds": 5,
"input_json": {
"numbers": [1, 2, 3, 4]
},
"code": "total = sum(payload['numbers'])\nresult = {'total': total, 'count': len(payload['numbers'])}\nprint(f'total={total}')"
}
}
runtime-service sends code execution requests to code-runner-service. Current python execution contract:
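- payload: the input variable read by the code (input_json in the example above)
- result: the variable the code assigns its structured output to
- print(...) output is captured into node_run.output_text
- result is captured into node_run.output_json.result_json
Recommended environment variables for code-runner-service: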
$env:AGENT_PLATFORM_PYTHON_BIN="python"
$env:AGENT_PLATFORM_MAX_TIMEOUT_SECONDS="30"
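The payload / result / print contract can be illustrated with a short in-process Python sketch. It only demonstrates the data flow: it runs the snippet with exec in the current interpreter, with no sandboxing and no timeout, whereas the AGENT_PLATFORM_PYTHON_BIN setting suggests that code-runner-service launches a separate interpreter. The function name and return shape are assumptions:

```python
import contextlib
import io
from typing import Any, Dict


def run_python_snippet(code: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Run a snippet that reads `payload`, may set `result`, and may print."""
    namespace: Dict[str, Any] = {"payload": payload}
    stdout = io.StringIO()
    with contextlib.redirect_stdout(stdout):
        exec(code, namespace)  # illustration only: never exec untrusted code
    return {
        "output_text": stdout.getvalue(),      # captured print(...) output
        "result_json": namespace.get("result"),  # value assigned to `result`
    }
```

Running the code string from the node config above with numbers 1 through 4 yields the printed total in output_text and the total/count dictionary in result_json.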
Files:
- deployments/docker/docker-compose.yml
- deployments/docker/python-service.Dockerfile
- deployments/docker/.env.example
Start all services locally:
cd D:\workspace\auto-platform
Copy-Item .\deployments\docker\.env.example .\.env
docker compose -f .\deployments\docker\docker-compose.yml up --build
Start in detached mode:
docker compose -f .\deployments\docker\docker-compose.yml up --build -d
Production-like infrastructure:
- postgres uses the pgvector image and runs CREATE EXTENSION IF NOT EXISTS vector.
- redis runs with append-only persistence.
- Copy deployments/docker/.env.example to .env to use per-service PostgreSQL databases such as workflow_service, agent_service, and knowledge_service.
- Set AGENT_PLATFORM_REDIS_URL=redis://redis:6379/0 to enable shared Redis-backed locks, idempotency keys, and queues.
Run all service migrations:
python .\scripts\migrate_all.py
Run only selected migrations:
python .\scripts\migrate_all.py --only agent-service --only runtime-service
Run the automated smoke tests:
pip install pytest
pytest -q
The repository includes .gitlab-ci.yml with a Python 3.11 test job that
installs the core libraries plus Agent/Knowledge services, runs compileall,
and executes the pytest smoke suite.
Scale runtime workers:
docker compose -f .\deployments\docker\docker-compose.yml up --build -d --scale runtime-worker=3
Scale agent workers:
docker compose -f .\deployments\docker\docker-compose.yml up --build -d --scale agent-worker=3
Scale team workers:
docker compose -f .\deployments\docker\docker-compose.yml up --build -d --scale team-worker=3
Scale scheduler workers:
docker compose -f .\deployments\docker\docker-compose.yml up --build -d --scale scheduler-worker=3
Stop and remove containers:
docker compose -f .\deployments\docker\docker-compose.yml down
Important notes:
- Service data is stored under /data if AGENT_PLATFORM_DATABASE_URL is not set.
- core-shared.redis_primitives provides DistributedLock, IdempotencyStore, and RedisQueue for services that need cross-process coordination.
- agent-worker, runtime-worker, and scheduler-worker use Redis locks/idempotency when Redis is available, and fall back to DB leases when Redis is not available.
- agent-service stores agent definitions, prompt/config versions, and agent run records under /data
- memory-service stores scoped memories under /data; move it to PostgreSQL before enabling high-volume memory writes
- team-service stores multi-agent team definitions, team versions, and team run records under /data
- team-worker executes queued team runs by orchestrating member agent runs; it can be scaled independently
- skill-service stores skill definitions, versions, marketplace-style installations, and skill execution runs under /data
- human-service stores human approval, input, pause/resume, and takeover task records under /data
- knowledge-service stores knowledge bases, documents, chunks, and local retrieval metadata under /data
- event-service stores platform events and delivery status under /data
- auth-service stores users, roles, assignments, and permission policy metadata under /data
- scheduler-service stores delayed jobs, due-job leases, and retry status under /data
- agent-worker has no exposed port and can be scaled independently; set AGENT_PLATFORM_AGENT_WORKER_DRY_RUN=true for no-key local smoke runs
- scheduler-worker has no exposed port and can be scaled independently; prefer PostgreSQL for real multi-worker write concurrency
- runtime-worker has no exposed port and can be scaled independently; prefer PostgreSQL for real multi-worker write concurrency
- runtime-service automatically resolves internal URLs to workflow-service, tool-service, model-gateway-service, and code-runner-service
- model-gateway-service defaults to http://host.docker.internal:11434/v1; replace it in .env if you want OpenAI or another OpenAI-compatible provider