Call the Responses API to get the model-generated response data. This endpoint is compatible with the OpenAI Responses API request and response format.

Endpoint

POST https://api.stepfun.ai/v1/responses

Request Parameters

  • model string required
    Name of the model to use. Currently only step-3.7-flash is supported.
  • input string or object array required
    Input content. Either a plain text string, or an ordered array of messages / events.
    Image and video URLs must be publicly reachable from the server; if the server cannot fetch the URL, a parameter error will be returned. For image input, base64 data URLs are recommended to avoid authentication, hotlink protection, or network access failures on external URLs.
  • instructions string optional
    Top-level system instructions.
  • stream bool optional
    Whether to enable SSE streaming. Default is false.
  • temperature float optional
    Sampling temperature, between 0.0 and 2.0.
  • top_p float optional
    Nucleus sampling parameter.
  • max_output_tokens int optional
    Maximum number of output tokens for this response.
    max_output_tokens limits both the reasoning process and the final output. When using medium / high reasoning effort, JSON Schema, video, or other complex inputs, reserve a larger output budget. If the budget is insufficient, the response may return status="incomplete", and output may contain only a reasoning item without a final message.
  • reasoning object optional
    Reasoning configuration.
  • tools object array optional
    List of tool definitions. Currently only function-type tools are supported.
  • tool_choice string or object optional
    Tool-call strategy. Currently only the string "auto" is supported (the model decides whether to call a tool).
  • text object optional
    Text output format configuration.
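
The note on image input recommends base64 data URLs. A minimal sketch of building one and placing it in a request body follows. The content-part layout (input_text / input_image) is an assumption taken from the OpenAI Responses format that this endpoint states compatibility with; this page does not spell it out. build_request is a hypothetical helper and the max_output_tokens value is arbitrary.

```python
import base64
import json


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_request(prompt: str, image_bytes: bytes) -> dict:
    """Build a request body (hypothetical helper).

    The content-part layout is an assumption based on the OpenAI
    Responses format; verify it against the service before relying on it.
    """
    return {
        "model": "step-3.7-flash",
        "instructions": "Answer concisely.",
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_text", "text": prompt},
                    {"type": "input_image", "image_url": to_data_url(image_bytes)},
                ],
            }
        ],
        # The budget covers reasoning and final output together, so leave headroom.
        "max_output_tokens": 2048,
        "stream": False,
    }


if __name__ == "__main__":
    body = build_request("Describe this image.", b"placeholder image bytes")
    print(json.dumps(body, indent=2))
```

Embedding the image in the request avoids the case where the server cannot fetch an external URL, at the cost of a larger request body.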

Response Format

Non-streaming response

When stream=false (default), a single Response object is returned.

Properties

  • id string
    Unique response ID, in the form resp_xxx.
  • object string
    Always response.
  • created_at int
    Creation time as a Unix timestamp (seconds).
  • completed_at int or null
    Completion time as a Unix timestamp (seconds).
  • status string
    Response status. One of completed, incomplete, or failed.
  • error object or null
    Error information. Non-null only when status=failed.
  • incomplete_details object or null
    Incomplete details. Non-null only when status=incomplete; commonly { "reason": "max_output_tokens" }.
  • model string
    The model ID actually used.
  • output object array
    Array of output items. The examples on this page show items of type reasoning, message, and function_call.
  • usage object
    Token usage statistics.
  • instructions string or null
    Echoes the top-level instructions from the request.
  • max_output_tokens int or null
    Echoes the request parameter.
  • reasoning object or null
    Echoes the reasoning configuration.
  • temperature float or null
    Echoes the sampling temperature.
  • top_p float or null
    Echoes the nucleus sampling parameter.
  • text object
    Echoes the text output format configuration.
  • tool_choice string or object
    Echoes the tool-call strategy.
  • tools object array
    Echoes the tool definitions.

Example

{
  "id": "resp_xxxxxxxxxxxxxxxx",
  "object": "response",
  "created_at": 1772624997,
  "completed_at": 1772624998,
  "model": "step-3.7-flash",
  "status": "completed",
  "error": null,
  "incomplete_details": null,
  "output": [
    {
      "type": "reasoning",
      "id": "rs_xxxxxxxxxxxxxxxx",
      "summary": [],
      "content": null,
      "encrypted_content": null,
      "status": null
    },
    {
      "type": "message",
      "id": "msg_xxxxxxxxxxxxxxxx",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Quantum computing is a new computational paradigm that leverages principles of quantum mechanics (such as superposition and entanglement) to process information.",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 14,
    "input_tokens_details": { "cached_tokens": 0 },
    "output_tokens": 52,
    "output_tokens_details": { "reasoning_tokens": 0, "tool_output_tokens": 0 },
    "total_tokens": 66
  },
  "instructions": null,
  "max_output_tokens": null,
  "reasoning": { "effort": "medium", "summary": null },
  "temperature": 1.0,
  "top_p": 1.0,
  "text": { "format": { "type": "text" } },
  "tool_choice": "auto",
  "tools": []
}
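
As a rough sketch of consuming this object, the function below reads a Response that has already been decoded into a Python dict. It checks status, skips reasoning items, and joins the output_text parts of message items. read_response is a hypothetical name, and the error-handling policy (raising RuntimeError) is only one possible choice.

```python
from typing import Any


def read_response(response: dict[str, Any]) -> str:
    """Return the final text of a Response object, or raise if there is none."""
    status = response.get("status")
    if status == "failed":
        raise RuntimeError(f"generation failed: {response.get('error')}")

    # Collect output_text parts from message items; reasoning items are skipped.
    texts = [
        part["text"]
        for item in response.get("output", [])
        if item.get("type") == "message"
        for part in item.get("content") or []
        if part.get("type") == "output_text"
    ]

    if status == "incomplete" and not texts:
        # The output holds only a reasoning item: the budget ran out too early.
        reason = (response.get("incomplete_details") or {}).get("reason")
        raise RuntimeError(
            f"no final message (reason: {reason}); try a larger max_output_tokens"
        )

    return "".join(texts)


if __name__ == "__main__":
    sample = {
        "status": "completed",
        "output": [
            {"type": "reasoning", "content": None},
            {"type": "message", "content": [{"type": "output_text", "text": "Hi"}]},
        ],
    }
    print(read_response(sample))
```

An incomplete response that still contains a message is returned as-is here; callers that need to distinguish truncated text can inspect status themselves.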

Streaming response

When stream=true, the response is delivered as Server-Sent Events (SSE). Each event consists of an event: line and a data: line. The data object of every event contains type and sequence_number: type matches the event name, and sequence_number starts at 0 and increments, so clients can process events in order.
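
The framing described above can be parsed with a few lines of code. The sketch below assumes the stream has already been split into text lines and that events are separated by blank lines, as in the examples on this page; parse_sse and in_order are hypothetical helper names, and a production client would more likely rely on an existing SSE library.

```python
import json
from typing import Iterable, Iterator, Optional


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[Optional[str], dict]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event_name: Optional[str] = None
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and data_lines:
            # A blank line terminates the current event.
            yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = None, []
    if data_lines:
        # The stream ended without a trailing blank line.
        yield event_name, json.loads("\n".join(data_lines))


def in_order(events: Iterable[tuple[Optional[str], dict]]) -> list:
    """Sort parsed events by the sequence_number in their data object."""
    return sorted(events, key=lambda pair: pair[1]["sequence_number"])


if __name__ == "__main__":
    sample = [
        "event: response.created",
        'data: {"type":"response.created","sequence_number":0}',
        "",
    ]
    for name, data in parse_sse(sample):
        print(name, data["sequence_number"])
```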

Event types

Event                                   Triggered when
response.created                        Response is created
response.in_progress                    Generation begins
response.output_item.added              A new output item is created
response.reasoning_part.added           A reasoning content part begins
response.reasoning_text.delta           Reasoning text delta
response.reasoning_text.done            Reasoning text finishes
response.reasoning_part.done            Reasoning content part finishes
response.content_part.added             A text content part begins
response.output_text.delta              Text delta
response.output_text.done               Text part finishes
response.content_part.done              Content part finishes
response.function_call_arguments.delta  Tool argument delta
response.function_call_arguments.done   Tool arguments finish
response.output_item.done               Output item finishes
response.completed                      Response completes
response.incomplete                     Ended due to output truncation
response.failed                         Generation failed
error                                   Transport-layer error

Example

Text streaming:
event: response.created
data: {"type":"response.created","sequence_number":0,"response":{"id":"resp_xxx","object":"response","created_at":1772624997,"model":"step-3.7-flash","status":"in_progress","output":[]}}

event: response.in_progress
data: {"type":"response.in_progress","sequence_number":1,"response":{"id":"resp_xxx","status":"in_progress"}}

event: response.output_item.added
data: {"type":"response.output_item.added","sequence_number":2,"output_index":0,"item":{"id":"rs_xxx","type":"reasoning","summary":[],"content":null,"encrypted_content":null,"status":"in_progress"}}

event: response.reasoning_part.added
data: {"type":"response.reasoning_part.added","sequence_number":3,"output_index":0,"item_id":"rs_xxx","content_index":0,"part":{"type":"reasoning_text","text":""}}

event: response.reasoning_text.delta
data: {"type":"response.reasoning_text.delta","sequence_number":4,"output_index":0,"item_id":"rs_xxx","content_index":0,"delta":"User asked for a greeting."}

event: response.reasoning_text.done
data: {"type":"response.reasoning_text.done","sequence_number":5,"output_index":0,"item_id":"rs_xxx","content_index":0,"text":"User asked for a greeting."}

event: response.reasoning_part.done
data: {"type":"response.reasoning_part.done","sequence_number":6,"output_index":0,"item_id":"rs_xxx","content_index":0,"part":{"type":"reasoning_text","text":"User asked for a greeting."}}

event: response.output_item.done
data: {"type":"response.output_item.done","sequence_number":7,"output_index":0,"item":{"id":"rs_xxx","type":"reasoning","summary":[],"content":null,"encrypted_content":null,"status":"completed"}}

event: response.output_item.added
data: {"type":"response.output_item.added","sequence_number":8,"output_index":1,"item":{"id":"msg_xxx","type":"message","role":"assistant","status":"in_progress","content":[]}}

event: response.content_part.added
data: {"type":"response.content_part.added","sequence_number":9,"item_id":"msg_xxx","output_index":1,"content_index":0,"part":{"type":"output_text","text":"","annotations":[]}}

event: response.output_text.delta
data: {"type":"response.output_text.delta","sequence_number":10,"item_id":"msg_xxx","output_index":1,"content_index":0,"delta":"Hello"}

event: response.output_text.done
data: {"type":"response.output_text.done","sequence_number":11,"item_id":"msg_xxx","output_index":1,"content_index":0,"text":"Hello"}

event: response.content_part.done
data: {"type":"response.content_part.done","sequence_number":12,"item_id":"msg_xxx","output_index":1,"content_index":0,"part":{"type":"output_text","text":"Hello","annotations":[]}}

event: response.output_item.done
data: {"type":"response.output_item.done","sequence_number":13,"output_index":1,"item":{"id":"msg_xxx","type":"message","role":"assistant","status":"completed","content":[{"type":"output_text","text":"Hello","annotations":[]}]}}

event: response.completed
data: {"type":"response.completed","sequence_number":14,"response":{"id":"resp_xxx","object":"response","status":"completed","output":[{"id":"rs_xxx","type":"reasoning","summary":[],"content":null,"encrypted_content":null,"status":"completed"},{"id":"msg_xxx","type":"message","role":"assistant","status":"completed","content":[{"type":"output_text","text":"Hello","annotations":[]}]}],"usage":{"input_tokens":10,"output_tokens":2,"total_tokens":12}}}
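
A client typically folds a stream like the one above into a final result. The sketch below takes the already-decoded data objects, orders them by sequence_number, and accumulates reasoning and answer text from the delta events. collect_stream is a hypothetical helper, and the shape of its return value is an illustration only.

```python
from collections import defaultdict

TERMINAL_EVENTS = ("response.completed", "response.incomplete", "response.failed")


def collect_stream(events: list[dict]) -> dict:
    """Accumulate reasoning and answer text from parsed event data objects."""
    reasoning = defaultdict(str)  # keyed by (output_index, content_index)
    answer = defaultdict(str)
    status = None
    for event in sorted(events, key=lambda e: e["sequence_number"]):
        kind = event["type"]
        key = (event.get("output_index", 0), event.get("content_index", 0))
        if kind == "response.reasoning_text.delta":
            reasoning[key] += event["delta"]
        elif kind == "response.output_text.delta":
            answer[key] += event["delta"]
        elif kind in TERMINAL_EVENTS:
            # "response.completed" -> "completed", and so on.
            status = kind.split(".", 1)[1]
    return {
        "reasoning": "".join(reasoning[k] for k in sorted(reasoning)),
        "text": "".join(answer[k] for k in sorted(answer)),
        "status": status,
    }


if __name__ == "__main__":
    demo = [
        {"type": "response.output_text.delta", "sequence_number": 0,
         "output_index": 1, "content_index": 0, "delta": "Hello"},
        {"type": "response.completed", "sequence_number": 1, "response": {}},
    ]
    print(collect_stream(demo))
```

Keying the buffers by (output_index, content_index) keeps parts separate if a response ever carries more than one text part.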

Function-call streaming (excerpt):
event: response.output_item.added
data: {"type":"response.output_item.added","sequence_number":0,"output_index":0,"item":{"id":"fc_xxx","type":"function_call","call_id":"call_xxx","name":"get_weather","arguments":"","status":"in_progress"}}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","sequence_number":1,"item_id":"fc_xxx","output_index":0,"delta":"{\"city\":\"Beijing\"}"}

event: response.function_call_arguments.done
data: {"type":"response.function_call_arguments.done","sequence_number":2,"item_id":"fc_xxx","output_index":0,"arguments":"{\"city\":\"Beijing\"}","name":"get_weather"}

event: response.output_item.done
data: {"type":"response.output_item.done","sequence_number":3,"output_index":0,"item":{"id":"fc_xxx","type":"function_call","call_id":"call_xxx","name":"get_weather","arguments":"{\"city\":\"Beijing\"}","status":"completed"}}

event: response.completed
data: {"type":"response.completed","sequence_number":4,"response":{"id":"resp_xxx","object":"response","status":"completed","output":[{"id":"fc_xxx","type":"function_call","call_id":"call_xxx","name":"get_weather","arguments":"{\"city\":\"Beijing\"}","status":"completed"}]}}
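
To illustrate how the function-call events above might be consumed, the sketch below assembles the argument string per item and invokes a matching local function. The get_weather implementation, the TOOLS registry, and run_function_calls are all hypothetical; how tool results are sent back to the model is not covered on this page and is left out.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Weather for {city}: (placeholder)"


TOOLS = {"get_weather": get_weather}


def run_function_calls(events: list[dict]) -> list[dict]:
    """Assemble streamed arguments per item and invoke the matching local tool."""
    calls = {}  # item_id -> {"name", "call_id", "arguments"}
    results = []
    for event in sorted(events, key=lambda e: e["sequence_number"]):
        kind = event["type"]
        if kind == "response.output_item.added" and event["item"]["type"] == "function_call":
            item = event["item"]
            calls[item["id"]] = {
                "name": item["name"],
                "call_id": item["call_id"],
                "arguments": "",
            }
        elif kind == "response.function_call_arguments.delta":
            calls[event["item_id"]]["arguments"] += event["delta"]
        elif kind == "response.function_call_arguments.done":
            call = calls[event["item_id"]]
            # The done event carries the full argument string; fall back to the deltas.
            args = json.loads(event.get("arguments") or call["arguments"])
            output = TOOLS[call["name"]](**args)
            results.append({"call_id": call["call_id"], "output": output})
    return results


if __name__ == "__main__":
    demo = [
        {"type": "response.output_item.added", "sequence_number": 0, "output_index": 0,
         "item": {"id": "fc_xxx", "type": "function_call", "call_id": "call_xxx",
                  "name": "get_weather", "arguments": "", "status": "in_progress"}},
        {"type": "response.function_call_arguments.done", "sequence_number": 1,
         "item_id": "fc_xxx", "output_index": 0,
         "arguments": '{"city":"Beijing"}', "name": "get_weather"},
    ]
    print(run_function_calls(demo))
```

Arguments are parsed only once the done event arrives, because an individual delta is not guaranteed to be valid JSON on its own.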

Examples

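import os

from openai import OpenAI

# Assumes the API key is exported as the STEP_API_KEY environment variable
# rather than hard-coded in the source.
client = OpenAI(
    api_key=os.environ["STEP_API_KEY"],
    base_url="https://api.stepfun.ai/v1",
)

response = client.responses.create(
    model="step-3.7-flash",
    input="Briefly introduce quantum computing in one sentence",
)

# output_text is the SDK's convenience accessor for the message text.
print(response.output_text)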