Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add 3 type of reasoning_content support (+deepseek-r1@OpenAI @Alibaba @ByteDance), parse <think></think> from SSE #6204

Merged
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
220 changes: 105 additions & 115 deletions app/client/platforms/alibaba.ts
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,14 @@ import {
ALIBABA_BASE_URL,
REQUEST_TIMEOUT_MS,
} from "@/app/constant";
import { useAccessStore, useAppConfig, useChatStore } from "@/app/store";

import {
useAccessStore,
useAppConfig,
useChatStore,
ChatMessageTool,
usePluginStore,
} from "@/app/store";
import { streamWithThink } from "@/app/utils/chat";
import {
ChatOptions,
getHeaders,
Expand All @@ -15,14 +21,11 @@ import {
SpeechOptions,
MultimodalContent,
} from "../api";
import Locale from "../../locales";
import {
EventStreamContentType,
fetchEventSource,
} from "@fortaine/fetch-event-source";
import { prettyObject } from "@/app/utils/format";
import { getClientConfig } from "@/app/config/client";
import { getMessageTextContent } from "@/app/utils";
import {
getMessageTextContent,
getMessageTextContentWithoutThinking,
} from "@/app/utils";
import { fetch } from "@/app/utils/stream";

export interface OpenAIListModelResponse {
Expand Down Expand Up @@ -92,7 +95,10 @@ export class QwenApi implements LLMApi {
async chat(options: ChatOptions) {
const messages = options.messages.map((v) => ({
role: v.role,
content: getMessageTextContent(v),
content:
v.role === "assistant"
? getMessageTextContentWithoutThinking(v)
: getMessageTextContent(v),
}));

const modelConfig = {
Expand Down Expand Up @@ -122,15 +128,17 @@ export class QwenApi implements LLMApi {
options.onController?.(controller);

try {
const headers = {
...getHeaders(),
"X-DashScope-SSE": shouldStream ? "enable" : "disable",
};

const chatPath = this.path(Alibaba.ChatPath);
const chatPayload = {
method: "POST",
body: JSON.stringify(requestPayload),
signal: controller.signal,
headers: {
...getHeaders(),
"X-DashScope-SSE": shouldStream ? "enable" : "disable",
},
headers: headers,
};

// make a fetch request
Expand All @@ -140,116 +148,98 @@ export class QwenApi implements LLMApi {
);

if (shouldStream) {
let responseText = "";
let remainText = "";
let finished = false;
let responseRes: Response;

// animate response to make it looks smooth
function animateResponseText() {
if (finished || controller.signal.aborted) {
responseText += remainText;
console.log("[Response Animation] finished");
if (responseText?.length === 0) {
options.onError?.(new Error("empty response from server"));
const [tools, funcs] = usePluginStore
.getState()
.getAsTools(
useChatStore.getState().currentSession().mask?.plugin || [],
);
return streamWithThink(
chatPath,
requestPayload,
headers,
tools as any,
funcs,
controller,
// parseSSE
(text: string, runTools: ChatMessageTool[]) => {
// console.log("parseSSE", text, runTools);
const json = JSON.parse(text);
const choices = json.output.choices as Array<{
message: {
content: string | null;
tool_calls: ChatMessageTool[];
reasoning_content: string | null;
};
}>;

if (!choices?.length) return { isThinking: false, content: "" };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Verify array access safety.

The code accesses array elements without checking if the array exists or has elements.

-            if (!choices?.length) return { isThinking: false, content: "" };
-
-            const tool_calls = choices[0]?.message?.tool_calls;
+            if (!choices?.length || !choices[0]?.message) {
+              return { isThinking: false, content: "" };
+            }
+            const tool_calls = choices[0].message.tool_calls;

Also applies to: 173-190


const tool_calls = choices[0]?.message?.tool_calls;
if (tool_calls?.length > 0) {
const index = tool_calls[0]?.index;
const id = tool_calls[0]?.id;
const args = tool_calls[0]?.function?.arguments;
if (id) {
runTools.push({
id,
type: tool_calls[0]?.type,
function: {
name: tool_calls[0]?.function?.name as string,
arguments: args,
},
});
} else {
// @ts-ignore
runTools[index]["function"]["arguments"] += args;
}
Comment on lines +188 to +190
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Remove type assertion and improve type safety.

The code uses @ts-ignore to bypass type checking, which could lead to runtime errors.

-                // @ts-ignore
-                runTools[index]["function"]["arguments"] += args;
+                if (index !== undefined && index < runTools.length && runTools[index]?.function) {
+                  runTools[index].function.arguments += args;
+                }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// @ts-ignore
runTools[index]["function"]["arguments"] += args;
}
if (index !== undefined && index < runTools.length && runTools[index]?.function) {
runTools[index].function.arguments += args;
}

}
return;
}

if (remainText.length > 0) {
const fetchCount = Math.max(1, Math.round(remainText.length / 60));
const fetchText = remainText.slice(0, fetchCount);
responseText += fetchText;
remainText = remainText.slice(fetchCount);
options.onUpdate?.(responseText, fetchText);
}

requestAnimationFrame(animateResponseText);
}

// start animaion
animateResponseText();

const finish = () => {
if (!finished) {
finished = true;
options.onFinish(responseText + remainText, responseRes);
}
};

controller.signal.onabort = finish;

fetchEventSource(chatPath, {
fetch: fetch as any,
...chatPayload,
async onopen(res) {
clearTimeout(requestTimeoutId);
const contentType = res.headers.get("content-type");
console.log(
"[Alibaba] request response content type: ",
contentType,
);
responseRes = res;

if (contentType?.startsWith("text/plain")) {
responseText = await res.clone().text();
return finish();
}
const reasoning = choices[0]?.message?.reasoning_content;
const content = choices[0]?.message?.content;

// Skip if both content and reasoning_content are empty or null
if (
!res.ok ||
!res.headers
.get("content-type")
?.startsWith(EventStreamContentType) ||
res.status !== 200
(!reasoning || reasoning.trim().length === 0) &&
(!content || content.trim().length === 0)
) {
const responseTexts = [responseText];
let extraInfo = await res.clone().text();
try {
const resJson = await res.clone().json();
extraInfo = prettyObject(resJson);
} catch {}

if (res.status === 401) {
responseTexts.push(Locale.Error.Unauthorized);
}

if (extraInfo) {
responseTexts.push(extraInfo);
}

responseText = responseTexts.join("\n\n");

return finish();
return {
isThinking: false,
content: "",
};
}
},
onmessage(msg) {
if (msg.data === "[DONE]" || finished) {
return finish();
}
const text = msg.data;
try {
const json = JSON.parse(text);
const choices = json.output.choices as Array<{
message: { content: string };
}>;
const delta = choices[0]?.message?.content;
if (delta) {
remainText += delta;
}
} catch (e) {
console.error("[Request] parse error", text, msg);

if (reasoning && reasoning.trim().length > 0) {
return {
isThinking: true,
content: reasoning,
};
} else if (content && content.trim().length > 0) {
return {
isThinking: false,
content: content,
};
}

return {
isThinking: false,
content: "",
};
},
onclose() {
finish();
},
onerror(e) {
options.onError?.(e);
throw e;
// processToolMessage, include tool_calls message and tool call results
(
requestPayload: RequestPayload,
toolCallMessage: any,
toolCallResult: any[],
) => {
requestPayload?.input?.messages?.splice(
requestPayload?.input?.messages?.length,
0,
toolCallMessage,
...toolCallResult,
);
Comment on lines +230 to +235
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Improve array manipulation safety.

The splice operation could fail if the messages array is undefined.

-            requestPayload?.input?.messages?.splice(
-              requestPayload?.input?.messages?.length,
-              0,
-              toolCallMessage,
-              ...toolCallResult,
-            );
+            if (requestPayload?.input?.messages) {
+              requestPayload.input.messages.splice(
+                requestPayload.input.messages.length,
+                0,
+                toolCallMessage,
+                ...toolCallResult,
+              );
+            }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
requestPayload?.input?.messages?.splice(
requestPayload?.input?.messages?.length,
0,
toolCallMessage,
...toolCallResult,
);
if (requestPayload?.input?.messages) {
requestPayload.input.messages.splice(
requestPayload.input.messages.length,
0,
toolCallMessage,
...toolCallResult,
);
}

},
openWhenHidden: true,
});
options,
);
} else {
const res = await fetch(chatPath, chatPayload);
clearTimeout(requestTimeoutId);
Expand Down
Loading