perf: improve streaming generation pipeline

Add CUDA inference hot-path optimizations, safer attention fallback handling, and generation profiling hooks. Improve SSE streaming, browser buffering telemetry, and playback recovery while preserving default audio quality settings.
2026-04-30 18:54:14 +01:00
parent a39ec536fd
commit 75b84b211b
9 changed files with 459 additions and 48 deletions
@@ -27,6 +27,7 @@ export async function GET() {
message: data.message,
progress: data.progress ?? null,
voices: data.voices ?? [],
config: data.config ?? null,
},
COMMON_OPTIONS
);
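Note on the hunk above: it appears to pass an upstream `config` field through to the status response, using the same nullish-coalescing defaults as the neighboring fields. A minimal sketch of that normalization follows. The interface names, the `normalizeStatus` helper, and the field types are assumptions for illustration; only the four field names and their `??` defaults come from the diff.

```typescript
// Hypothetical shape of the upstream payload; types are assumed, not from the source.
interface UpstreamStatus {
  message: string;
  progress?: number | null;
  voices?: string[] | null;
  config?: Record<string, unknown> | null;
}

// Shape returned to the client: every key is always present.
interface StatusResponse {
  message: string;
  progress: number | null;
  voices: string[];
  config: Record<string, unknown> | null;
}

// Hypothetical helper mirroring the object literal in the hunk.
// `??` only substitutes the default for null/undefined, so a valid
// progress of 0 or an empty config object is passed through unchanged.
function normalizeStatus(data: UpstreamStatus): StatusResponse {
  return {
    message: data.message,
    progress: data.progress ?? null,
    voices: data.voices ?? [],
    config: data.config ?? null,
  };
}

// Usage: missing optional fields fall back to their defaults.
const status = normalizeStatus({ message: "ready", progress: 0 });
console.log(JSON.stringify(status));
```

Using `??` rather than `||` matters here: with `||`, a legitimate `progress` of `0` would be replaced by `null`.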