Feature request: Add support for OpenAI Responses api #2028

@srampal

Description

What would you like to be added:
The current code supports only the OpenAI completions and chat/completions APIs. The Responses API is the API OpenAI now recommends going forward, and it is important that GAIE support it asap. Inference engines such as vLLM have supported this API for a while, but when I tried to test it against vLLM running as part of GIE/LLM-D, it did not work because of this missing support. Requests sent to the Responses API endpoint failed with errors such as "invalid chat-completions request:".

Places in the code that need to be fixed include the parsing and types utility packages, such as the ExtractRequestBody() method in pkg/epp/util/request and the type definitions in pkg/epp.scheduling/types.go, among other places.
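
To make the requested change concrete, here is a minimal sketch of how request-body extraction could tell the three API shapes apart. It assumes the body can be classified by its top-level key (`input` for Responses, `messages` for chat/completions, `prompt` for completions). All type and function names below are hypothetical and for illustration only; they are not the actual GAIE types or the real ExtractRequestBody() signature.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ResponsesRequest is a hypothetical type covering only the Responses API
// fields used in this issue's example. Input is kept raw because the API
// accepts either a plain string or an array of input items.
type ResponsesRequest struct {
	Input        json.RawMessage `json:"input"`
	Instructions string          `json:"instructions,omitempty"`
}

// ChatMessage and ChatCompletionsRequest are simplified, hypothetical types.
type ChatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type ChatCompletionsRequest struct {
	Messages []ChatMessage `json:"messages"`
}

// CompletionsRequest is simplified: this sketch only handles a string prompt.
type CompletionsRequest struct {
	Prompt string `json:"prompt"`
}

// LLMRequestBody is a hypothetical union: exactly one field is non-nil.
type LLMRequestBody struct {
	Completions     *CompletionsRequest
	ChatCompletions *ChatCompletionsRequest
	Responses       *ResponsesRequest
}

func hasKey(m map[string]json.RawMessage, key string) bool {
	_, ok := m[key]
	return ok
}

// extractRequestBody classifies the body by its top-level keys and decodes
// it into the matching request type.
func extractRequestBody(raw []byte) (*LLMRequestBody, error) {
	var probe map[string]json.RawMessage
	if err := json.Unmarshal(raw, &probe); err != nil {
		return nil, fmt.Errorf("invalid JSON body: %w", err)
	}
	switch {
	case hasKey(probe, "input"):
		var r ResponsesRequest
		if err := json.Unmarshal(raw, &r); err != nil {
			return nil, fmt.Errorf("invalid responses request: %w", err)
		}
		return &LLMRequestBody{Responses: &r}, nil
	case hasKey(probe, "messages"):
		var c ChatCompletionsRequest
		if err := json.Unmarshal(raw, &c); err != nil {
			return nil, fmt.Errorf("invalid chat-completions request: %w", err)
		}
		if len(c.Messages) == 0 {
			return nil, errors.New("chat-completions request must have at least one message")
		}
		return &LLMRequestBody{ChatCompletions: &c}, nil
	case hasKey(probe, "prompt"):
		var p CompletionsRequest
		if err := json.Unmarshal(raw, &p); err != nil {
			return nil, fmt.Errorf("invalid completions request: %w", err)
		}
		return &LLMRequestBody{Completions: &p}, nil
	}
	return nil, errors.New("request must contain one of: input, messages, prompt")
}

func main() {
	body := []byte(`{"model": "Qwen/Qwen3-0.6B", "input": "Tell me a joke.", "max_output_tokens": 100}`)
	req, err := extractRequestBody(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("parsed as responses request:", req.Responses != nil)
}
```

A real fix would more likely select the parser from the request path (/v1/responses) than from the body shape, and would need to carry the new request type through the scheduling types as well; the sketch only shows the parsing step.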

Here is an example request that failed with these errors because of the missing support:

$ curl -X POST http://192.168.49.2:32324/v1/responses -H "Content-Type: application/json" -d '{
  "model": "Qwen/Qwen3-0.6B",
  "input": "Tell me a joke.",
  "instructions": "Remember my name is John Doe. I will ask you about it later!",
  "temperature": 0.2,
  "max_output_tokens": 100
}'

inference gateway: BadRequest - failed to extract request data: inference gateway: BadRequest - invalid chat-completions request: inference gateway: BadRequest - chat-completions request must have at least one message

Why is this needed:
The Responses API is the new recommended API from OpenAI and includes a number of improvements over the chat/completions API. Not supporting it is currently a significant restriction in GAIE; I would almost consider the lack of support for this important API a bug rather than a feature request. Support should land in GAIE/EPP first to resolve this issue. If downstream projects such as LLM-D then need additional fixes, further changes may be needed in the llm-d repo as well.
