mirror of
https://github.com/JezzWTF/vibepod.git
synced 2026-06-01 15:22:14 +00:00
style: apply prettier formatting across all source files
@@ -8,10 +8,10 @@ This file gives AI coding agents (Jules, Copilot, Claude Code, etc.) the context
|
|||||||
|
|
||||||
VibePod is a text-to-speech web app. It has two services that must both run for the app to work:
|
VibePod is a text-to-speech web app. It has two services that must both run for the app to work:
|
||||||
|
|
||||||
| Service | Language | Entry point | Port |
|
| Service | Language | Entry point | Port |
|
||||||
|---------|----------|-------------|------|
|
| ---------- | ---------------------------------- | ------------------------------- | ---- |
|
||||||
| **server** | Python 3.10+ (FastAPI + VibeVoice) | `server/start.sh` | 8000 |
|
| **server** | Python 3.10+ (FastAPI + VibeVoice) | `server/start.sh` | 8000 |
|
||||||
| **web** | TypeScript (Next.js 15, React 19) | `pnpm --filter vibepod-web dev` | 3000 |
|
| **web** | TypeScript (Next.js 15, React 19) | `pnpm --filter vibepod-web dev` | 3000 |
|
||||||
|
|
||||||
The Next.js frontend proxies all model requests through its own API routes to the FastAPI server — it never calls the Python server directly from the browser.
|
The Next.js frontend proxies all model requests through its own API routes to the FastAPI server — it never calls the Python server directly from the browser.
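As a rough sketch of that proxy step, the helper below builds the upstream request a Next.js API route could send to the FastAPI server. `buildUpstreamRequest` and the `UpstreamRequest` shape are hypothetical names used only for illustration; the `/generate` path, the JSON fields, and the `http://localhost:8000` default are taken from this repository.

```typescript
// Request body accepted by POST /generate (fields as documented in the API reference).
interface GenerateBody {
  text: string;
  speaker?: string;
  cfg_scale?: number;
  inference_steps?: number;
}

// Hypothetical container for the URL and fetch options of the upstream call.
interface UpstreamRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the server-side request; the browser only ever talks to the Next.js route.
function buildUpstreamRequest(
  body: GenerateBody,
  serverUrl: string = "http://localhost:8000",
): UpstreamRequest {
  // Strip trailing slashes so the joined URL never contains "//generate".
  const base = serverUrl.replace(/\/+$/, "");
  return {
    url: `${base}/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}
```

Inside a route handler, the result would be passed to `fetch(url, init)` and the upstream response body streamed back to the browser.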
|
||||||
|
|
||||||
@@ -51,12 +51,12 @@ pnpm build
|
|||||||
|
|
||||||
The `--cpu` flag in `start.sh` sets `VIBEPOD_DEVICE=cpu` and uses a separate venv (`server/.venv-cpu`) so CUDA and CPU installs never conflict. `vibevoice_server.py` reads `VIBEPOD_DEVICE` at startup via `_resolve_device()` — do not remove or rename that function.
|
The `--cpu` flag in `start.sh` sets `VIBEPOD_DEVICE=cpu` and uses a separate venv (`server/.venv-cpu`) so CUDA and CPU installs never conflict. `vibevoice_server.py` reads `VIBEPOD_DEVICE` at startup via `_resolve_device()` — do not remove or rename that function.
|
||||||
|
|
||||||
| Env var | Values | Set by |
|
| Env var | Values | Set by |
|
||||||
|---------|--------|--------|
|
| ------------------------ | ----------------------- | --------------------------- |
|
||||||
| `VIBEPOD_DEVICE` | `cpu` \| `cuda` | `server/start.sh` |
|
| `VIBEPOD_DEVICE` | `cpu` \| `cuda` | `server/start.sh` |
|
||||||
| `UV_PROJECT_ENVIRONMENT` | `.venv-cpu` \| `.venv` | `server/start.sh` |
|
| `UV_PROJECT_ENVIRONMENT` | `.venv-cpu` \| `.venv` | `server/start.sh` |
|
||||||
| `HF_TOKEN` | HuggingFace token | Jules secret / `.env.local` |
|
| `HF_TOKEN` | HuggingFace token | Jules secret / `.env.local` |
|
||||||
| `VIBEVOICE_SERVER_URL` | `http://localhost:8000` | `.env.local` |
|
| `VIBEVOICE_SERVER_URL` | `http://localhost:8000` | `.env.local` |
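The pairing of device and virtual environment in the table can be summarized in a few lines. This is only an illustrative sketch: the real logic lives in `server/start.sh`, which is not shown here, and `serverEnvFor` is a hypothetical name.

```typescript
type Device = "cpu" | "cuda";

interface ServerEnv {
  VIBEPOD_DEVICE: Device;
  UV_PROJECT_ENVIRONMENT: ".venv-cpu" | ".venv";
}

// Map launcher flags to the environment variables listed in the table.
// Assumption: the presence of "--cpu" is the only switch that matters.
function serverEnvFor(args: string[]): ServerEnv {
  if (args.includes("--cpu")) {
    return { VIBEPOD_DEVICE: "cpu", UV_PROJECT_ENVIRONMENT: ".venv-cpu" };
  }
  return { VIBEPOD_DEVICE: "cuda", UV_PROJECT_ENVIRONMENT: ".venv" };
}
```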
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -94,7 +94,9 @@ dev.sh Concurrent launcher (forwards flags to start.sh)
|
|||||||
## API reference
|
## API reference
|
||||||
|
|
||||||
### `GET /health`
|
### `GET /health`
|
||||||
|
|
||||||
Returns server status. Safe to poll.
|
Returns server status. Safe to poll.
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"status": "online",
|
"status": "online",
|
||||||
@@ -103,13 +105,17 @@ Returns server status. Safe to poll.
|
|||||||
"voices": ["carter", "davis", "emma", "frank", "grace", "mike"]
|
"voices": ["carter", "davis", "emma", "frank", "grace", "mike"]
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
`status` values: `downloading` | `loading` | `online` | `error`
|
`status` values: `downloading` | `loading` | `online` | `error`
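A client that polls `/health` might narrow the response to a known status before acting on it. The sketch below is one way to do that; `parseStatus` and `isReady` are hypothetical helpers, and treating an unrecognized payload as `offline` is an assumption modeled on the `offline` status the web app uses when the server cannot be reached.

```typescript
// Status values the server reports, as listed above.
const SERVER_STATUSES = ["downloading", "loading", "online", "error"] as const;

// "offline" is a client-side value for an unreachable server.
type ServerStatus = (typeof SERVER_STATUSES)[number] | "offline";

// Narrow an arbitrary JSON payload to a known status.
// Assumption: anything unrecognized is treated as "offline".
function parseStatus(payload: unknown): ServerStatus {
  if (typeof payload === "object" && payload !== null) {
    const status = (payload as { status?: unknown }).status;
    const match = SERVER_STATUSES.find((s) => s === status);
    if (match) return match;
  }
  return "offline";
}

// Generation requests only make sense once the model is loaded.
function isReady(status: ServerStatus): boolean {
  return status === "online";
}
```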
|
||||||
|
|
||||||
### `POST /generate`
|
### `POST /generate`
|
||||||
|
|
||||||
Streams audio as SSE events.
|
Streams audio as SSE events.
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{ "text": "Hello world", "speaker": "carter", "cfg_scale": 1.5, "inference_steps": 10 }
|
{ "text": "Hello world", "speaker": "carter", "cfg_scale": 1.5, "inference_steps": 10 }
|
||||||
```
|
```
|
||||||
|
|
||||||
Event types: `audio_chunk` (base64 float32 PCM) | `complete` | `error` | `cancelled`
|
Event types: `audio_chunk` (base64 float32 PCM) | `complete` | `error` | `cancelled`
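Decoding an `audio_chunk` payload could look roughly like the sketch below. It assumes the float32 samples are little-endian and that the base64 string has already been extracted from the event; the name of the field carrying it is not shown in this document, and `decodeAudioChunk` is a hypothetical helper.

```typescript
// Convert a base64 string of raw float32 PCM into samples.
// Assumption: samples are little-endian; trailing bytes that do not
// form a full 4-byte sample are ignored.
function decodeAudioChunk(base64: string): Float32Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }

  const view = new DataView(bytes.buffer);
  const samples = new Float32Array(Math.floor(bytes.length / 4));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getFloat32(i * 4, true); // true = little-endian
  }
  return samples;
}
```

Reading through a `DataView` keeps the byte order explicit rather than relying on the platform's native order.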
|
||||||
|
|
||||||
---
|
---
|
||||||
@@ -117,12 +123,14 @@ Event types: `audio_chunk` (base64 float32 PCM) | `complete` | `error` | `cancel
|
|||||||
## Do / Don't
|
## Do / Don't
|
||||||
|
|
||||||
**Do:**
|
**Do:**
|
||||||
|
|
||||||
- Use `pnpm dev:cpu` in Jules — never plain `pnpm dev`
|
- Use `pnpm dev:cpu` in Jules — never plain `pnpm dev`
|
||||||
- Run `git checkout server/uv.lock` if uv rewrites it during setup
|
- Run `git checkout server/uv.lock` if uv rewrites it during setup
|
||||||
- Keep `_resolve_device()` in `vibevoice_server.py` — it's the CPU/CUDA switching logic
|
- Keep `_resolve_device()` in `vibevoice_server.py` — it's the CPU/CUDA switching logic
|
||||||
- Test server changes against `GET /health` and `POST /generate`
|
- Test server changes against `GET /health` and `POST /generate`
|
||||||
|
|
||||||
**Don't:**
|
**Don't:**
|
||||||
|
|
||||||
- Run `uv sync` without `UV_PROJECT_ENVIRONMENT=.venv-cpu` in the Jules sandbox
|
- Run `uv sync` without `UV_PROJECT_ENVIRONMENT=.venv-cpu` in the Jules sandbox
|
||||||
- Install Python packages with pip
|
- Install Python packages with pip
|
||||||
- Modify `server/uv.lock` manually
|
- Modify `server/uv.lock` manually
|
||||||
|
|||||||
@@ -173,16 +173,21 @@ The shape language is a hybrid of structural precision and tactile softness.
|
|||||||
## Components
|
## Components
|
||||||
|
|
||||||
### Card Containers
|
### Card Containers
|
||||||
|
|
||||||
The fundamental building block of the UI. Every distinct section (Script, Player, Controls, Logs) is housed in a card featuring the `card-bg`, a 1px `border`, and `rounded-xl` corners. The internal layout always features an uppercase teal header for immediate section identification.
|
The fundamental building block of the UI. Every distinct section (Script, Player, Controls, Logs) is housed in a card featuring the `card-bg`, a 1px `border`, and `rounded-xl` corners. The internal layout always features an uppercase teal header for immediate section identification.
|
||||||
|
|
||||||
### Primary Action Buttons
|
### Primary Action Buttons
|
||||||
|
|
||||||
Used for high-leverage actions like "Generate Audio" and "Play/Pause." These buttons utilize the `gradient-primary-dim` background, bold white text, and emit a soft teal glow to draw the eye and signify their importance.
|
Used for high-leverage actions like "Generate Audio" and "Play/Pause." These buttons utilize the `gradient-primary-dim` background, bold white text, and emit a soft teal glow to draw the eye and signify their importance.
|
||||||
|
|
||||||
### Range Sliders
|
### Range Sliders
|
||||||
|
|
||||||
Custom-styled input ranges replace default browser styles. The tracks are muted and slim, while the thumbs are bright teal, fully rounded, and emit a glow that intensifies on hover, providing a premium, tactile scrubbing experience.
|
Custom-styled input ranges replace default browser styles. The tracks are muted and slim, while the thumbs are bright teal, fully rounded, and emit a glow that intensifies on hover, providing a premium, tactile scrubbing experience.
|
||||||
|
|
||||||
### Status Indicators & Logs
|
### Status Indicators & Logs
|
||||||
|
|
||||||
A critical component of the application. Status badges utilize a minimalist pill shape with a pulsing ring animation to indicate active server processing. The log panel explicitly uses monospace typography and color-codes messages (green for success, red for error, white for neutral) to provide a terminal-like readout of the backend systems.
|
A critical component of the application. Status badges utilize a minimalist pill shape with a pulsing ring animation to indicate active server processing. The log panel explicitly uses monospace typography and color-codes messages (green for success, red for error, white for neutral) to provide a terminal-like readout of the backend systems.
|
||||||
|
|
||||||
### Gradients
|
### Gradients
|
||||||
|
|
||||||
Gradients are used purposefully to indicate progress, activity, or brand presence. The primary gradient (`135deg` from teal to violet) is used for branding (the logo icon and text) and primary buttons. Horizontal gradients (`90deg`) are used dynamically in progress bars to represent the flow of data over time (e.g., loading, downloading, and audio generation).
|
Gradients are used purposefully to indicate progress, activity, or brand presence. The primary gradient (`135deg` from teal to violet) is used for branding (the logo icon and text) and primary buttons. Horizontal gradients (`90deg`) are used dynamically in progress bars to represent the flow of data over time (e.g., loading, downloading, and audio generation).
|
||||||
|
|||||||
@@ -14,12 +14,12 @@ The Next.js app proxies audio generation requests to the FastAPI server, keeping
|
|||||||
|
|
||||||
## Prerequisites
|
## Prerequisites
|
||||||
|
|
||||||
| Tool | Install |
|
| Tool | Install |
|
||||||
|------|---------|
|
| ---------------------------------- | ----------------------------------- |
|
||||||
| [Node.js 20+](https://nodejs.org) | `winget install OpenJS.NodeJS.LTS` |
|
| [Node.js 20+](https://nodejs.org) | `winget install OpenJS.NodeJS.LTS` |
|
||||||
| [pnpm](https://pnpm.io) | `npm i -g pnpm` |
|
| [pnpm](https://pnpm.io) | `npm i -g pnpm` |
|
||||||
| [Python 3.10+](https://python.org) | `winget install Python.Python.3.13` |
|
| [Python 3.10+](https://python.org) | `winget install Python.Python.3.13` |
|
||||||
| [uv](https://docs.astral.sh/uv/) | `winget install astral-sh.uv` |
|
| [uv](https://docs.astral.sh/uv/) | `winget install astral-sh.uv` |
|
||||||
|
|
||||||
## Getting started
|
## Getting started
|
||||||
|
|
||||||
@@ -50,10 +50,10 @@ The frontend shows a loading indicator while the model downloads. Once the serve
|
|||||||
|
|
||||||
VibePod maintains two completely separate Python virtual environments so CUDA and CPU torch installs never conflict:
|
VibePod maintains two completely separate Python virtual environments so CUDA and CPU torch installs never conflict:
|
||||||
|
|
||||||
| Mode | Command | venv | torch source |
|
| Mode | Command | venv | torch source |
|
||||||
|------|---------|------|--------------|
|
| -------------- | -------------- | ------------------ | ----------------------- |
|
||||||
| CUDA (default) | `pnpm dev` | `server/.venv` | PyTorch CUDA 12.4 index |
|
| CUDA (default) | `pnpm dev` | `server/.venv` | PyTorch CUDA 12.4 index |
|
||||||
| CPU-only | `pnpm dev:cpu` | `server/.venv-cpu` | PyPI (CPU wheel) |
|
| CPU-only | `pnpm dev:cpu` | `server/.venv-cpu` | PyPI (CPU wheel) |
|
||||||
|
|
||||||
On first run, each mode creates its own venv automatically. You can switch between them freely — they are fully independent. The active device is reported by the `/health` endpoint as `"device": "cpu"` or `"device": "cuda"`.
|
On first run, each mode creates its own venv automatically. You can switch between them freely — they are fully independent. The active device is reported by the `/health` endpoint as `"device": "cpu"` or `"device": "cuda"`.
|
||||||
|
|
||||||
@@ -74,11 +74,11 @@ pnpm build # Production build of the frontend
|
|||||||
|
|
||||||
Copy `.env.example` to `.env.local` and set:
|
Copy `.env.example` to `.env.local` and set:
|
||||||
|
|
||||||
| Variable | Default | Description |
|
| Variable | Default | Description |
|
||||||
|----------|---------|-------------|
|
| ---------------------- | ----------------------- | --------------------------------------------------------- |
|
||||||
| `VIBEVOICE_SERVER_URL` | `http://localhost:8000` | URL the Next.js API routes use to reach the Python server |
|
| `VIBEVOICE_SERVER_URL` | `http://localhost:8000` | URL the Next.js API routes use to reach the Python server |
|
||||||
| `HF_TOKEN` | — | HuggingFace token (required if the model repo is gated) |
|
| `HF_TOKEN` | — | HuggingFace token (required if the model repo is gated) |
|
||||||
| `HF_HOME` | — | Override the HuggingFace model cache directory |
|
| `HF_HOME` | — | Override the HuggingFace model cache directory |
|
||||||
|
|
||||||
## Project structure
|
## Project structure
|
||||||
|
|
||||||
@@ -107,11 +107,11 @@ server/
|
|||||||
|
|
||||||
## Generation parameters
|
## Generation parameters
|
||||||
|
|
||||||
| Parameter | Range | Default | Effect |
|
| Parameter | Range | Default | Effect |
|
||||||
|-----------|-------|---------|--------|
|
| ----------------- | --------------------------------------------------- | -------- | ---------------------------------------------- |
|
||||||
| `speaker` | `carter`, `davis`, `emma`, `frank`, `grace`, `mike` | `carter` | Voice preset used for the generated audio |
|
| `speaker` | `carter`, `davis`, `emma`, `frank`, `grace`, `mike` | `carter` | Voice preset used for the generated audio |
|
||||||
| `cfg_scale` | 0.5 – 4.0 | 1.5 | Higher = more expressive guidance |
|
| `cfg_scale` | 0.5 – 4.0 | 1.5 | Higher = more expressive guidance |
|
||||||
| `inference_steps` | 5 – 20 | 10 | More steps = higher quality, slower generation |
|
| `inference_steps` | 5 – 20 | 10 | More steps = higher quality, slower generation |
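The ranges and defaults in the table can be enforced before a request is sent. The following is a minimal sketch under the assumption that out-of-range values should be clamped rather than rejected; how the server itself treats such values is not stated here, and `normalizeParams` and `clamp` are hypothetical helpers.

```typescript
const SPEAKERS = ["carter", "davis", "emma", "frank", "grace", "mike"] as const;
type Speaker = (typeof SPEAKERS)[number];

interface GenerationParams {
  speaker: Speaker;
  cfg_scale: number;
  inference_steps: number;
}

// Clamp a number into [min, max]; fall back to the default when the
// value is missing or not a finite number.
function clamp(value: unknown, min: number, max: number, fallback: number): number {
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Apply the documented ranges and defaults to user-supplied settings.
function normalizeParams(input: {
  speaker?: string;
  cfg_scale?: number;
  inference_steps?: number;
}): GenerationParams {
  const speaker = SPEAKERS.find((s) => s === input.speaker) ?? "carter";
  return {
    speaker,
    cfg_scale: clamp(input.cfg_scale, 0.5, 4.0, 1.5),
    // Steps are whole numbers, so round after clamping.
    inference_steps: Math.round(clamp(input.inference_steps, 5, 20, 10)),
  };
}
```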
|
||||||
|
|
||||||
## How it works
|
## How it works
|
||||||
|
|
||||||
|
|||||||
+1
-1
@@ -1,2 +1,2 @@
|
|||||||
packages:
|
packages:
|
||||||
- 'web'
|
- "web"
|
||||||
|
|||||||
@@ -7,7 +7,7 @@ export async function POST(request: NextRequest) {
|
|||||||
const pythonServerUrl = process.env.VIBEVOICE_SERVER_URL ?? "http://localhost:8000";
|
const pythonServerUrl = process.env.VIBEVOICE_SERVER_URL ?? "http://localhost:8000";
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const body = await request.json() as {
|
const body = (await request.json()) as {
|
||||||
text: string;
|
text: string;
|
||||||
speaker?: string;
|
speaker?: string;
|
||||||
cfg_scale?: number;
|
cfg_scale?: number;
|
||||||
@@ -41,7 +41,7 @@ export async function POST(request: NextRequest) {
|
|||||||
headers: {
|
headers: {
|
||||||
"Content-Type": "text/event-stream",
|
"Content-Type": "text/event-stream",
|
||||||
"Cache-Control": "no-cache, no-transform",
|
"Cache-Control": "no-cache, no-transform",
|
||||||
"Connection": "keep-alive",
|
Connection: "keep-alive",
|
||||||
"X-Content-Type-Options": "nosniff",
|
"X-Content-Type-Options": "nosniff",
|
||||||
"X-Accel-Buffering": "no",
|
"X-Accel-Buffering": "no",
|
||||||
},
|
},
|
||||||
|
|||||||
@@ -4,8 +4,7 @@ const OFFLINE_RESPONSE = { status: "offline" };
|
|||||||
const COMMON_OPTIONS = { headers: { "Cache-Control": "no-store" } };
|
const COMMON_OPTIONS = { headers: { "Cache-Control": "no-store" } };
|
||||||
|
|
||||||
export async function GET() {
|
export async function GET() {
|
||||||
- const pythonServerUrl =
- process.env.VIBEVOICE_SERVER_URL ?? "http://localhost:8000";
+ const pythonServerUrl = process.env.VIBEVOICE_SERVER_URL ?? "http://localhost:8000";
|
|
||||||
try {
|
try {
|
||||||
const res = await fetch(`${pythonServerUrl}/health`, {
|
const res = await fetch(`${pythonServerUrl}/health`, {
|
||||||
|
|||||||
+4
-2
@@ -12,8 +12,10 @@
|
|||||||
--muted: #64748b;
|
--muted: #64748b;
|
||||||
--success: #22c55e;
|
--success: #22c55e;
|
||||||
--error: #ef4444;
|
--error: #ef4444;
|
||||||
- --font-sans: ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
- --font-mono: ui-monospace, SFMono-Regular, "SF Mono", Menlo, Consolas, "Liberation Mono", monospace;
+ --font-sans:
+ ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
+ --font-mono:
+ ui-monospace, SFMono-Regular, "SF Mono", Menlo, Consolas, "Liberation Mono", monospace;
}
|
}
|
||||||
|
|
||||||
@theme inline {
|
@theme inline {
|
||||||
|
|||||||
+58
-26
@@ -69,19 +69,39 @@ type AppAction =
|
|||||||
|
|
||||||
function reducer(state: AppState, action: AppAction): AppState {
|
function reducer(state: AppState, action: AppAction): AppState {
|
||||||
switch (action.type) {
|
switch (action.type) {
|
||||||
- case "SET_SCRIPT": return { ...state, script: action.payload };
- case "SET_SPEAKER": return { ...state, speaker: action.payload };
- case "SET_CFG_SCALE": return { ...state, cfgScale: action.payload };
- case "SET_INFERENCE_STEPS": return { ...state, inferenceSteps: action.payload };
- case "SET_PREBUFFER_SECS": return { ...state, prebufferSecs: action.payload };
- case "SET_REBUFFER_THRESHOLD": return { ...state, rebufferThresholdSecs: action.payload };
- case "SET_RESUME_THRESHOLD": return { ...state, resumeThresholdSecs: action.payload };
+ case "SET_SCRIPT":
+ return { ...state, script: action.payload };
+ case "SET_SPEAKER":
+ return { ...state, speaker: action.payload };
+ case "SET_CFG_SCALE":
+ return { ...state, cfgScale: action.payload };
+ case "SET_INFERENCE_STEPS":
+ return { ...state, inferenceSteps: action.payload };
+ case "SET_PREBUFFER_SECS":
+ return { ...state, prebufferSecs: action.payload };
+ case "SET_REBUFFER_THRESHOLD":
+ return { ...state, rebufferThresholdSecs: action.payload };
+ case "SET_RESUME_THRESHOLD":
+ return { ...state, resumeThresholdSecs: action.payload };
case "START_GENERATION":
|
case "START_GENERATION":
|
||||||
- return { ...state, isGenerating: true, audioUrl: null, logs: [], genElapsed: 0, genPct: null };
+ return {
+ ...state,
+ isGenerating: true,
+ audioUrl: null,
+ logs: [],
+ genElapsed: 0,
+ genPct: null,
+ };
case "GEN_PROGRESS":
|
case "GEN_PROGRESS":
|
||||||
return { ...state, genElapsed: action.elapsed, genPct: action.pct };
|
return { ...state, genElapsed: action.elapsed, genPct: action.pct };
|
||||||
case "GENERATION_SUCCESS":
|
case "GENERATION_SUCCESS":
|
||||||
- return { ...state, isGenerating: false, genElapsed: 0, genPct: null, audioUrl: action.payload };
+ return {
+ ...state,
+ isGenerating: false,
+ genElapsed: 0,
+ genPct: null,
+ audioUrl: action.payload,
+ };
case "GENERATION_CANCELLED":
|
case "GENERATION_CANCELLED":
|
||||||
case "GENERATION_ERROR":
|
case "GENERATION_ERROR":
|
||||||
return { ...state, isGenerating: false, genElapsed: 0, genPct: null };
|
return { ...state, isGenerating: false, genElapsed: 0, genPct: null };
|
||||||
@@ -89,21 +109,27 @@ function reducer(state: AppState, action: AppAction): AppState {
|
|||||||
return { ...state, logs: [...state.logs, action.payload] };
|
return { ...state, logs: [...state.logs, action.payload] };
|
||||||
case "SET_SERVER_STATUS": {
|
case "SET_SERVER_STATUS": {
|
||||||
const isNewConfig = !state.serverConfig && action.payload.config;
|
const isNewConfig = !state.serverConfig && action.payload.config;
|
||||||
- const deviceChanged = !!(state.serverConfig && action.payload.config && state.serverConfig.device !== action.payload.config.device);
+ const deviceChanged = !!(
+ state.serverConfig &&
+ action.payload.config &&
+ state.serverConfig.device !== action.payload.config.device
+ );
|
|
||||||
- const nextSteps = (isNewConfig || deviceChanged)
+ const nextSteps =
+ isNewConfig || deviceChanged
? action.payload.config!.default_inference_steps
|
? action.payload.config!.default_inference_steps
|
||||||
: state.inferenceSteps;
|
: state.inferenceSteps;
|
||||||
|
|
||||||
- const nextPrebuffer = (isNewConfig || deviceChanged)
- ? action.payload.config!.prebuffer_secs
- : state.prebufferSecs;
+ const nextPrebuffer =
+ isNewConfig || deviceChanged ? action.payload.config!.prebuffer_secs : state.prebufferSecs;
|
|
||||||
- const nextRebuffer = (isNewConfig || deviceChanged)
+ const nextRebuffer =
+ isNewConfig || deviceChanged
? action.payload.config!.rebuffer_threshold_secs
|
? action.payload.config!.rebuffer_threshold_secs
|
||||||
: state.rebufferThresholdSecs;
|
: state.rebufferThresholdSecs;
|
||||||
|
|
||||||
- const nextResume = (isNewConfig || deviceChanged)
+ const nextResume =
+ isNewConfig || deviceChanged
? action.payload.config!.resume_threshold_secs
|
? action.payload.config!.resume_threshold_secs
|
||||||
: state.resumeThresholdSecs;
|
: state.resumeThresholdSecs;
|
||||||
|
|
||||||
@@ -121,7 +147,8 @@ function reducer(state: AppState, action: AppAction): AppState {
|
|||||||
resumeThresholdSecs: nextResume,
|
resumeThresholdSecs: nextResume,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
- default: return state;
+ default:
+ return state;
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -213,7 +240,10 @@ export default function HomePage() {
|
|||||||
}
|
}
|
||||||
|
|
||||||
poll();
|
poll();
|
||||||
- return () => { cancelled = true; clearTimeout(timeoutId); };
+ return () => {
+ cancelled = true;
+ clearTimeout(timeoutId);
+ };
}, []);
|
}, []);
|
||||||
|
|
||||||
const handleGenerate = useCallback(async () => {
|
const handleGenerate = useCallback(async () => {
|
||||||
@@ -241,7 +271,6 @@ export default function HomePage() {
|
|||||||
<Header />
|
<Header />
|
||||||
<main className="flex-1 container mx-auto px-4 py-6 max-w-6xl">
|
<main className="flex-1 container mx-auto px-4 py-6 max-w-6xl">
|
||||||
<div className="grid grid-cols-1 lg:grid-cols-3 gap-6">
|
<div className="grid grid-cols-1 lg:grid-cols-3 gap-6">
|
||||||
|
|
||||||
{/* Left: script + audio player */}
|
{/* Left: script + audio player */}
|
||||||
<div className="lg:col-span-2 flex flex-col gap-6">
|
<div className="lg:col-span-2 flex flex-col gap-6">
|
||||||
<TextInputPanel
|
<TextInputPanel
|
||||||
@@ -261,12 +290,16 @@ export default function HomePage() {
|
|||||||
onCfgScaleChange={(v) => dispatch({ type: "SET_CFG_SCALE", payload: v })}
|
onCfgScaleChange={(v) => dispatch({ type: "SET_CFG_SCALE", payload: v })}
|
||||||
inferenceSteps={state.inferenceSteps}
|
inferenceSteps={state.inferenceSteps}
|
||||||
onInferenceStepsChange={(v) => dispatch({ type: "SET_INFERENCE_STEPS", payload: v })}
|
onInferenceStepsChange={(v) => dispatch({ type: "SET_INFERENCE_STEPS", payload: v })}
|
||||||
prebufferSecs={state.prebufferSecs}
|
prebufferSecs={state.prebufferSecs}
|
||||||
onPrebufferSecsChange={(v) => dispatch({ type: "SET_PREBUFFER_SECS", payload: v })}
|
onPrebufferSecsChange={(v) => dispatch({ type: "SET_PREBUFFER_SECS", payload: v })}
|
||||||
rebufferThresholdSecs={state.rebufferThresholdSecs}
|
rebufferThresholdSecs={state.rebufferThresholdSecs}
|
||||||
- onRebufferThresholdChange={(v) => dispatch({ type: "SET_REBUFFER_THRESHOLD", payload: v })}
+ onRebufferThresholdChange={(v) =>
+ dispatch({ type: "SET_REBUFFER_THRESHOLD", payload: v })
+ }
  resumeThresholdSecs={state.resumeThresholdSecs}
- onResumeThresholdChange={(v) => dispatch({ type: "SET_RESUME_THRESHOLD", payload: v })}
+ onResumeThresholdChange={(v) =>
+ dispatch({ type: "SET_RESUME_THRESHOLD", payload: v })
+ }
onGenerate={handleGenerate}
|
onGenerate={handleGenerate}
|
||||||
onStop={stop}
|
onStop={stop}
|
||||||
onPauseStream={pauseStream}
|
onPauseStream={pauseStream}
|
||||||
@@ -281,7 +314,6 @@ export default function HomePage() {
|
|||||||
/>
|
/>
|
||||||
<StatusLog messages={state.logs} />
|
<StatusLog messages={state.logs} />
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
</div>
|
</div>
|
||||||
</main>
|
</main>
|
||||||
</div>
|
</div>
|
||||||
|
|||||||
@@ -14,15 +14,8 @@ function formatTime(seconds: number): string {
|
|||||||
}
|
}
|
||||||
|
|
||||||
export default function AudioPlayer({ audioUrl }: AudioPlayerProps) {
|
export default function AudioPlayer({ audioUrl }: AudioPlayerProps) {
|
||||||
- const {
- isPlaying,
- currentTime,
- duration,
- volume,
- toggle,
- seek,
- setVolume,
- } = useAudioPlayer(audioUrl);
+ const { isPlaying, currentTime, duration, volume, toggle, seek, setVolume } =
+ useAudioPlayer(audioUrl);
|
|
||||||
if (!audioUrl) return null;
|
if (!audioUrl) return null;
|
||||||
|
|
||||||
@@ -56,12 +49,10 @@ export default function AudioPlayer({ audioUrl }: AudioPlayerProps) {
|
|||||||
background: "rgba(45, 212, 191, 0.05)",
|
background: "rgba(45, 212, 191, 0.05)",
|
||||||
}}
|
}}
|
||||||
onMouseEnter={(e) => {
|
onMouseEnter={(e) => {
|
||||||
- (e.currentTarget as HTMLButtonElement).style.background =
- "rgba(45, 212, 191, 0.15)";
+ (e.currentTarget as HTMLButtonElement).style.background = "rgba(45, 212, 191, 0.15)";
}}
|
}}
|
||||||
onMouseLeave={(e) => {
|
onMouseLeave={(e) => {
|
||||||
- (e.currentTarget as HTMLButtonElement).style.background =
- "rgba(45, 212, 191, 0.05)";
+ (e.currentTarget as HTMLButtonElement).style.background = "rgba(45, 212, 191, 0.05)";
}}
|
}}
|
||||||
>
|
>
|
||||||
<svg
|
<svg
|
||||||
@@ -115,27 +106,18 @@ export default function AudioPlayer({ audioUrl }: AudioPlayerProps) {
|
|||||||
onClick={toggle}
|
onClick={toggle}
|
||||||
className="w-10 h-10 rounded-full flex items-center justify-center transition-transform active:scale-95 cursor-pointer"
|
className="w-10 h-10 rounded-full flex items-center justify-center transition-transform active:scale-95 cursor-pointer"
|
||||||
style={{
|
style={{
|
||||||
- background:
- "linear-gradient(135deg, var(--accent-teal-dim), var(--accent-violet-dim))",
+ background: "linear-gradient(135deg, var(--accent-teal-dim), var(--accent-violet-dim))",
boxShadow: "0 4px 12px rgba(45, 212, 191, 0.3)",
|
boxShadow: "0 4px 12px rgba(45, 212, 191, 0.3)",
|
||||||
}}
|
}}
|
||||||
aria-label={isPlaying ? "Pause" : "Play"}
|
aria-label={isPlaying ? "Pause" : "Play"}
|
||||||
>
|
>
|
||||||
{isPlaying ? (
|
{isPlaying ? (
|
||||||
- <svg
- className="w-4 h-4 text-white"
- viewBox="0 0 24 24"
- fill="currentColor"
- >
+ <svg className="w-4 h-4 text-white" viewBox="0 0 24 24" fill="currentColor">
<rect x="6" y="4" width="4" height="16" />
|
<rect x="6" y="4" width="4" height="16" />
|
||||||
<rect x="14" y="4" width="4" height="16" />
|
<rect x="14" y="4" width="4" height="16" />
|
||||||
</svg>
|
</svg>
|
||||||
) : (
|
) : (
|
||||||
- <svg
- className="w-4 h-4 text-white"
- viewBox="0 0 24 24"
- fill="currentColor"
- >
+ <svg className="w-4 h-4 text-white" viewBox="0 0 24 24" fill="currentColor">
<polygon points="5 3 19 12 5 21 5 3" />
|
<polygon points="5 3 19 12 5 21 5 3" />
|
||||||
</svg>
|
</svg>
|
||||||
)}
|
)}
|
||||||
@@ -143,9 +125,7 @@ export default function AudioPlayer({ audioUrl }: AudioPlayerProps) {
|
|||||||
|
|
||||||
{/* Duration info */}
|
{/* Duration info */}
|
||||||
<div className="flex-1 flex items-center gap-1 text-sm">
|
<div className="flex-1 flex items-center gap-1 text-sm">
|
||||||
- <span style={{ color: "var(--foreground)" }}>
- {formatTime(currentTime)}
- </span>
+ <span style={{ color: "var(--foreground)" }}>{formatTime(currentTime)}</span>
<span style={{ color: "var(--muted)" }}>/</span>
|
<span style={{ color: "var(--muted)" }}>/</span>
|
||||||
<span style={{ color: "var(--muted)" }}>{formatTime(duration)}</span>
|
<span style={{ color: "var(--muted)" }}>{formatTime(duration)}</span>
|
||||||
</div>
|
</div>
|
||||||
|
|||||||
@@ -36,18 +36,27 @@ const STATUS_CONFIG: Record<
|
|||||||
Exclude<ServerStatus, "online">,
|
Exclude<ServerStatus, "online">,
|
||||||
{ color: string; label: (p: DownloadProgress | null) => string }
|
{ color: string; label: (p: DownloadProgress | null) => string }
|
||||||
> = {
|
> = {
|
||||||
offline: { color: "var(--error)", label: () => "Server offline — waiting for connection..." },
|
offline: { color: "var(--error)", label: () => "Server offline — waiting for connection..." },
|
||||||
- downloading: { color: "#60a5fa", label: (p) => p && p.total > 0 ? `Downloading model... (${p.done} / ${p.total} files)` : "Downloading model (~1 GB)..." },
- loading: { color: "#fbbf24", label: () => "Loading model into memory..." },
- error: { color: "var(--error)", label: () => "Server error — check the terminal for details." },
+ downloading: {
+ color: "#60a5fa",
+ label: (p) =>
+ p && p.total > 0
+ ? `Downloading model... (${p.done} / ${p.total} files)`
+ : "Downloading model (~1 GB)...",
+ },
+ loading: { color: "#fbbf24", label: () => "Loading model into memory..." },
+ error: { color: "var(--error)", label: () => "Server error — check the terminal for details." },
};
|
};
|
||||||
|
|
||||||
|
|
||||||
function SpinnerIcon() {
|
function SpinnerIcon() {
|
||||||
return (
|
return (
|
||||||
<svg className="animate-spin w-4 h-4" viewBox="0 0 24 24" fill="none">
|
<svg className="animate-spin w-4 h-4" viewBox="0 0 24 24" fill="none">
|
||||||
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
||||||
- <path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
+ <path
+ className="opacity-75"
+ fill="currentColor"
+ d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z"
+ />
</svg>
|
</svg>
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
@@ -146,7 +155,10 @@ export default function GenerationControls({
|
|||||||
onChange={(e) => onCfgScaleChange(parseFloat(e.target.value))}
|
onChange={(e) => onCfgScaleChange(parseFloat(e.target.value))}
|
||||||
className="w-full"
|
className="w-full"
|
||||||
/>
|
/>
|
||||||
- <div className="flex items-center justify-between text-xs" style={{ color: "var(--muted)" }}>
+ <div
+ className="flex items-center justify-between text-xs"
+ style={{ color: "var(--muted)" }}
+ >
<span>Flat (0.5)</span>
|
<span>Flat (0.5)</span>
|
||||||
<span>CFG Scale</span>
|
<span>CFG Scale</span>
|
||||||
<span>Expressive (4.0)</span>
|
<span>Expressive (4.0)</span>
|
||||||
@@ -176,7 +188,10 @@ export default function GenerationControls({
|
|||||||
className="w-full"
|
className="w-full"
|
||||||
style={{ "--thumb-color": "var(--accent-violet)" } as React.CSSProperties}
|
style={{ "--thumb-color": "var(--accent-violet)" } as React.CSSProperties}
|
||||||
/>
|
/>
|
||||||
- <div className="flex items-center justify-between text-xs" style={{ color: "var(--muted)" }}>
+ <div
+ className="flex items-center justify-between text-xs"
+ style={{ color: "var(--muted)" }}
+ >
<span>Faster (5)</span>
|
<span>Faster (5)</span>
|
||||||
<span>Diffusion Steps</span>
|
<span>Diffusion Steps</span>
|
||||||
<span>Better (20)</span>
|
<span>Better (20)</span>
|
||||||
@@ -207,7 +222,11 @@ export default function GenerationControls({
|
|||||||
</div>
|
</div>
|
||||||
|
|
||||||
{showAdvanced && (
|
{showAdvanced && (
|
||||||
- <div id="advanced-buffering-panel" className="flex flex-col gap-4 pl-2 border-l" style={{ borderColor: "var(--border)" }}>
+ <div
+ id="advanced-buffering-panel"
+ className="flex flex-col gap-4 pl-2 border-l"
+ style={{ borderColor: "var(--border)" }}
+ >
{/* Pre-buffer */}
|
{/* Pre-buffer */}
|
||||||
<div className="flex flex-col gap-2">
|
<div className="flex flex-col gap-2">
|
||||||
<div className="flex items-center justify-between">
|
<div className="flex items-center justify-between">
|
||||||
@@ -232,7 +251,11 @@ export default function GenerationControls({
|
|||||||
{/* Re-buffer threshold */}
|
{/* Re-buffer threshold */}
|
||||||
<div className="flex flex-col gap-2">
|
<div className="flex flex-col gap-2">
|
||||||
<div className="flex items-center justify-between">
|
<div className="flex items-center justify-between">
|
||||||
<label htmlFor="rebuffer-threshold" className="text-xs font-medium" style={{ color: "var(--foreground)" }}>
|
<label
|
||||||
|
htmlFor="rebuffer-threshold"
|
||||||
|
className="text-xs font-medium"
|
||||||
|
style={{ color: "var(--foreground)" }}
|
||||||
|
>
|
||||||
Re-buffer Threshold
|
Re-buffer Threshold
|
||||||
</label>
|
</label>
|
||||||
<span className="text-xs font-mono" style={{ color: "var(--accent-teal)" }}>
|
<span className="text-xs font-mono" style={{ color: "var(--accent-teal)" }}>
|
||||||
@@ -260,7 +283,11 @@ export default function GenerationControls({
|
|||||||
{/* Resume threshold */}
|
{/* Resume threshold */}
|
||||||
<div className="flex flex-col gap-2">
|
<div className="flex flex-col gap-2">
|
||||||
<div className="flex items-center justify-between">
|
<div className="flex items-center justify-between">
|
||||||
<label htmlFor="resume-threshold" className="text-xs font-medium" style={{ color: "var(--foreground)" }}>
|
<label
|
||||||
|
htmlFor="resume-threshold"
|
||||||
|
className="text-xs font-medium"
|
||||||
|
style={{ color: "var(--foreground)" }}
|
||||||
|
>
|
||||||
Resume Threshold
|
Resume Threshold
|
||||||
</label>
|
</label>
|
||||||
<span className="text-xs font-mono" style={{ color: "var(--accent-teal)" }}>
|
<span className="text-xs font-mono" style={{ color: "var(--accent-teal)" }}>
|
||||||
@@ -302,7 +329,10 @@ export default function GenerationControls({
|
|||||||
</div>
|
</div>
|
||||||
|
|
||||||
{serverStatus === "downloading" && (
|
{serverStatus === "downloading" && (
|
||||||
<div className="w-full rounded-full h-1.5 overflow-hidden" style={{ background: "var(--border)" }}>
|
<div
|
||||||
|
className="w-full rounded-full h-1.5 overflow-hidden"
|
||||||
|
style={{ background: "var(--border)" }}
|
||||||
|
>
|
||||||
<div
|
<div
|
||||||
className="h-1.5 rounded-full transition-all duration-500"
|
className="h-1.5 rounded-full transition-all duration-500"
|
||||||
style={{
|
style={{
|
||||||
@@ -315,10 +345,16 @@ export default function GenerationControls({
|
|||||||
)}
|
)}
|
||||||
|
|
||||||
{serverStatus === "loading" && (
|
{serverStatus === "loading" && (
|
||||||
<div className="w-full rounded-full h-1.5 overflow-hidden" style={{ background: "var(--border)" }}>
|
<div
|
||||||
|
className="w-full rounded-full h-1.5 overflow-hidden"
|
||||||
|
style={{ background: "var(--border)" }}
|
||||||
|
>
|
||||||
<div
|
<div
|
||||||
className="h-1.5 rounded-full animate-pulse"
|
className="h-1.5 rounded-full animate-pulse"
|
||||||
style={{ width: "60%", background: "linear-gradient(90deg, #fbbf24, var(--accent-teal))" }}
|
style={{
|
||||||
|
width: "60%",
|
||||||
|
background: "linear-gradient(90deg, #fbbf24, var(--accent-teal))",
|
||||||
|
}}
|
||||||
/>
|
/>
|
||||||
</div>
|
</div>
|
||||||
)}
|
)}
|
||||||
@@ -328,11 +364,17 @@ export default function GenerationControls({
|
|||||||
{/* Generation progress bar */}
|
{/* Generation progress bar */}
|
||||||
{isGenerating && (
|
{isGenerating && (
|
||||||
<div className="flex flex-col gap-1.5">
|
<div className="flex flex-col gap-1.5">
|
||||||
<div className="flex items-center justify-between text-xs" style={{ color: "var(--muted)" }}>
|
<div
|
||||||
|
className="flex items-center justify-between text-xs"
|
||||||
|
style={{ color: "var(--muted)" }}
|
||||||
|
>
|
||||||
<span>{genElapsed}s elapsed</span>
|
<span>{genElapsed}s elapsed</span>
|
||||||
<span>{genPct !== null ? `${genPct}%` : "starting..."}</span>
|
<span>{genPct !== null ? `${genPct}%` : "starting..."}</span>
|
||||||
</div>
|
</div>
|
||||||
<div className="w-full rounded-full h-1.5 overflow-hidden" style={{ background: "var(--border)" }}>
|
<div
|
||||||
|
className="w-full rounded-full h-1.5 overflow-hidden"
|
||||||
|
style={{ background: "var(--border)" }}
|
||||||
|
>
|
||||||
<div
|
<div
|
||||||
className="h-1.5 rounded-full transition-all duration-500"
|
className="h-1.5 rounded-full transition-all duration-500"
|
||||||
style={{
|
style={{
|
||||||
@@ -355,7 +397,8 @@ export default function GenerationControls({
|
|||||||
buttonDisabled
|
buttonDisabled
|
||||||
? { background: "var(--border)", color: "var(--muted)" }
|
? { background: "var(--border)", color: "var(--muted)" }
|
||||||
: {
|
: {
|
||||||
background: "linear-gradient(135deg, var(--accent-teal-dim), var(--accent-violet-dim))",
|
background:
|
||||||
|
"linear-gradient(135deg, var(--accent-teal-dim), var(--accent-violet-dim))",
|
||||||
color: "#fff",
|
color: "#fff",
|
||||||
boxShadow: "0 4px 15px rgba(45, 212, 191, 0.2)",
|
boxShadow: "0 4px 15px rgba(45, 212, 191, 0.2)",
|
||||||
}
|
}
|
||||||
@@ -373,7 +416,13 @@ export default function GenerationControls({
|
|||||||
</>
|
</>
|
||||||
) : (
|
) : (
|
||||||
<>
|
<>
|
||||||
<svg className="w-4 h-4" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
|
<svg
|
||||||
|
className="w-4 h-4"
|
||||||
|
viewBox="0 0 24 24"
|
||||||
|
fill="none"
|
||||||
|
stroke="currentColor"
|
||||||
|
strokeWidth="2"
|
||||||
|
>
|
||||||
<polygon points="5 3 19 12 5 21 5 3" />
|
<polygon points="5 3 19 12 5 21 5 3" />
|
||||||
</svg>
|
</svg>
|
||||||
Generate Audio
|
Generate Audio
|
||||||
|
|||||||
+22
-25
@@ -6,8 +6,8 @@ type ServerStatus = "checking" | "downloading" | "loading" | "online" | "error"
|
|||||||
type Device = "cpu" | "cuda" | null;
|
type Device = "cpu" | "cuda" | null;
|
||||||
|
|
||||||
// Polling intervals: poll quickly until the server is online, then slow down.
|
// Polling intervals: poll quickly until the server is online, then slow down.
|
||||||
const FAST_INTERVAL_MS = 3000; // while checking / loading
|
const FAST_INTERVAL_MS = 3000; // while checking / loading
|
||||||
const SLOW_INTERVAL_MS = 30000; // once online
|
const SLOW_INTERVAL_MS = 30000; // once online
|
||||||
|
|
||||||
export default function Header() {
|
export default function Header() {
|
||||||
const [status, setStatus] = useState<ServerStatus>("checking");
|
const [status, setStatus] = useState<ServerStatus>("checking");
|
||||||
@@ -31,7 +31,10 @@ export default function Header() {
|
|||||||
intervalRef.current = setInterval(checkHealth, SLOW_INTERVAL_MS);
|
intervalRef.current = setInterval(checkHealth, SLOW_INTERVAL_MS);
|
||||||
}
|
}
|
||||||
// Switch to fast polling if we detect the server went offline/loading
|
// Switch to fast polling if we detect the server went offline/loading
|
||||||
if ((newStatus === "offline" || newStatus === "downloading" || newStatus === "loading") && intervalRef.current) {
|
if (
|
||||||
|
(newStatus === "offline" || newStatus === "downloading" || newStatus === "loading") &&
|
||||||
|
intervalRef.current
|
||||||
|
) {
|
||||||
clearInterval(intervalRef.current);
|
clearInterval(intervalRef.current);
|
||||||
intervalRef.current = setInterval(checkHealth, FAST_INTERVAL_MS);
|
intervalRef.current = setInterval(checkHealth, FAST_INTERVAL_MS);
|
||||||
}
|
}
|
||||||
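Review note: the two hunks above only re-wrap the polling condition, but the rule they encode is easy to lose in the diff. As a rough sketch, the interval choice can be read as a pure function of the latest status. `nextIntervalMs` and the `PollStatus` union below are hypothetical names for illustration (the real `ServerStatus` type is truncated in the hunk header); only the two constants and the status comparisons come from the diff.

```typescript
// Sketch under assumptions: `PollStatus` and `nextIntervalMs` are illustrative,
// not names from the repository.
type PollStatus = "checking" | "downloading" | "loading" | "online" | "error" | "offline";

const FAST_INTERVAL_MS = 3000; // while checking / loading
const SLOW_INTERVAL_MS = 30000; // once online

// Returns the polling interval to use after observing `status`.
// Statuses the diff does not mention (e.g. "error") keep the current interval.
function nextIntervalMs(status: PollStatus, currentMs: number): number {
  if (status === "online") return SLOW_INTERVAL_MS;
  if (status === "offline" || status === "downloading" || status === "loading") {
    return FAST_INTERVAL_MS;
  }
  return currentMs;
}
```

In the component itself the switch is done by clearing and re-creating the `setInterval` timer; the sketch covers only the decision, not the timer handling.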
@@ -95,23 +98,20 @@ export default function Header() {
|
|||||||
const cfg = statusConfig[status];
|
const cfg = statusConfig[status];
|
||||||
|
|
||||||
// Device badge — only shown once the server is online and device is known
|
// Device badge — only shown once the server is online and device is known
|
||||||
const deviceBadge = status === "online" && device ? (
|
const deviceBadge =
|
||||||
<span
|
status === "online" && device ? (
|
||||||
className="px-2 py-0.5 rounded-full text-xs font-semibold tracking-wide uppercase"
|
<span
|
||||||
style={{
|
className="px-2 py-0.5 rounded-full text-xs font-semibold tracking-wide uppercase"
|
||||||
background: device === "cuda"
|
style={{
|
||||||
? "var(--accent-violet-dim)"
|
background: device === "cuda" ? "var(--accent-violet-dim)" : "var(--accent-teal-dim)",
|
||||||
: "var(--accent-teal-dim)",
|
color: device === "cuda" ? "var(--accent-violet)" : "var(--accent-teal)",
|
||||||
color: device === "cuda"
|
border: `1px solid ${device === "cuda" ? "var(--accent-violet-dim)" : "var(--accent-teal-dim)"}`,
|
||||||
? "var(--accent-violet)"
|
}}
|
||||||
: "var(--accent-teal)",
|
title={device === "cuda" ? "Running on NVIDIA GPU" : "Running on CPU"}
|
||||||
border: `1px solid ${device === "cuda" ? "var(--accent-violet-dim)" : "var(--accent-teal-dim)"}`,
|
>
|
||||||
}}
|
{device.toUpperCase()}
|
||||||
title={device === "cuda" ? "Running on NVIDIA GPU" : "Running on CPU"}
|
</span>
|
||||||
>
|
) : null;
|
||||||
{device.toUpperCase()}
|
|
||||||
</span>
|
|
||||||
) : null;
|
|
||||||
|
|
||||||
return (
|
return (
|
||||||
<header
|
<header
|
||||||
@@ -136,8 +136,7 @@ export default function Header() {
|
|||||||
<h1
|
<h1
|
||||||
className="text-xl font-bold tracking-tight"
|
className="text-xl font-bold tracking-tight"
|
||||||
style={{
|
style={{
|
||||||
background:
|
background: "linear-gradient(135deg, var(--accent-teal), var(--accent-violet))",
|
||||||
"linear-gradient(135deg, var(--accent-teal), var(--accent-violet))",
|
|
||||||
WebkitBackgroundClip: "text",
|
WebkitBackgroundClip: "text",
|
||||||
WebkitTextFillColor: "transparent",
|
WebkitTextFillColor: "transparent",
|
||||||
}}
|
}}
|
||||||
@@ -167,9 +166,7 @@ export default function Header() {
|
|||||||
className={`animate-ping absolute inline-flex h-full w-full rounded-full opacity-75 ${cfg.color}`}
|
className={`animate-ping absolute inline-flex h-full w-full rounded-full opacity-75 ${cfg.color}`}
|
||||||
/>
|
/>
|
||||||
)}
|
)}
|
||||||
<span
|
<span className={`relative inline-flex rounded-full h-2 w-2 ${cfg.color}`} />
|
||||||
className={`relative inline-flex rounded-full h-2 w-2 ${cfg.color}`}
|
|
||||||
/>
|
|
||||||
</span>
|
</span>
|
||||||
<span style={{ color: "var(--foreground)" }}>{cfg.label}</span>
|
<span style={{ color: "var(--foreground)" }}>{cfg.label}</span>
|
||||||
</div>
|
</div>
|
||||||
|
|||||||
@@ -47,8 +47,7 @@ export default function StatusLog({ messages }: StatusLogProps) {
|
|||||||
) : (
|
) : (
|
||||||
messages.map((msg, i) => {
|
messages.map((msg, i) => {
|
||||||
const isError =
|
const isError =
|
||||||
msg.toLowerCase().includes("error") ||
|
msg.toLowerCase().includes("error") || msg.toLowerCase().includes("failed");
|
||||||
msg.toLowerCase().includes("failed");
|
|
||||||
const isSuccess =
|
const isSuccess =
|
||||||
msg.toLowerCase().includes("done") ||
|
msg.toLowerCase().includes("done") ||
|
||||||
msg.toLowerCase().includes("complete") ||
|
msg.toLowerCase().includes("complete") ||
|
||||||
|
|||||||
@@ -15,10 +15,7 @@ interface TextInputPanelProps {
|
|||||||
onChange: (text: string) => void;
|
onChange: (text: string) => void;
|
||||||
}
|
}
|
||||||
|
|
||||||
export default function TextInputPanel({
|
export default function TextInputPanel({ value, onChange }: TextInputPanelProps) {
|
||||||
value,
|
|
||||||
onChange,
|
|
||||||
}: TextInputPanelProps) {
|
|
||||||
const charCount = value.length;
|
const charCount = value.length;
|
||||||
const wordCount = value.trim() === "" ? 0 : value.trim().split(/\s+/).length;
|
const wordCount = value.trim() === "" ? 0 : value.trim().split(/\s+/).length;
|
||||||
|
|
||||||
@@ -43,15 +40,12 @@ export default function TextInputPanel({
|
|||||||
color: "var(--muted)",
|
color: "var(--muted)",
|
||||||
}}
|
}}
|
||||||
onMouseEnter={(e) => {
|
onMouseEnter={(e) => {
|
||||||
(e.target as HTMLButtonElement).style.color =
|
(e.target as HTMLButtonElement).style.color = "var(--accent-violet)";
|
||||||
"var(--accent-violet)";
|
(e.target as HTMLButtonElement).style.borderColor = "var(--accent-violet)";
|
||||||
(e.target as HTMLButtonElement).style.borderColor =
|
|
||||||
"var(--accent-violet)";
|
|
||||||
}}
|
}}
|
||||||
onMouseLeave={(e) => {
|
onMouseLeave={(e) => {
|
||||||
(e.target as HTMLButtonElement).style.color = "var(--muted)";
|
(e.target as HTMLButtonElement).style.color = "var(--muted)";
|
||||||
(e.target as HTMLButtonElement).style.borderColor =
|
(e.target as HTMLButtonElement).style.borderColor = "var(--border)";
|
||||||
"var(--border)";
|
|
||||||
}}
|
}}
|
||||||
>
|
>
|
||||||
Load sample script
|
Load sample script
|
||||||
@@ -69,8 +63,7 @@ export default function TextInputPanel({
|
|||||||
}}
|
}}
|
||||||
onMouseLeave={(e) => {
|
onMouseLeave={(e) => {
|
||||||
(e.target as HTMLButtonElement).style.color = "var(--muted)";
|
(e.target as HTMLButtonElement).style.color = "var(--muted)";
|
||||||
(e.target as HTMLButtonElement).style.borderColor =
|
(e.target as HTMLButtonElement).style.borderColor = "var(--border)";
|
||||||
"var(--border)";
|
|
||||||
}}
|
}}
|
||||||
>
|
>
|
||||||
Clear
|
Clear
|
||||||
@@ -98,10 +91,7 @@ export default function TextInputPanel({
|
|||||||
}}
|
}}
|
||||||
/>
|
/>
|
||||||
|
|
||||||
<div
|
<div className="flex items-center justify-between text-xs" style={{ color: "var(--muted)" }}>
|
||||||
className="flex items-center justify-between text-xs"
|
|
||||||
style={{ color: "var(--muted)" }}
|
|
||||||
>
|
|
||||||
<span>
|
<span>
|
||||||
{wordCount} word{wordCount !== 1 ? "s" : ""}
|
{wordCount} word{wordCount !== 1 ? "s" : ""}
|
||||||
</span>
|
</span>
|
||||||
|
|||||||
@@ -55,16 +55,12 @@ export function useAudioPlayer(audioUrl: string | null) {
|
|||||||
() => setState((prev) => ({ ...prev, isPlaying: false, currentTime: 0 })),
|
() => setState((prev) => ({ ...prev, isPlaying: false, currentTime: 0 })),
|
||||||
{ signal }
|
{ signal }
|
||||||
);
|
);
|
||||||
audio.addEventListener(
|
audio.addEventListener("play", () => setState((prev) => ({ ...prev, isPlaying: true })), {
|
||||||
"play",
|
signal,
|
||||||
() => setState((prev) => ({ ...prev, isPlaying: true })),
|
});
|
||||||
{ signal }
|
audio.addEventListener("pause", () => setState((prev) => ({ ...prev, isPlaying: false })), {
|
||||||
);
|
signal,
|
||||||
audio.addEventListener(
|
});
|
||||||
"pause",
|
|
||||||
() => setState((prev) => ({ ...prev, isPlaying: false })),
|
|
||||||
{ signal }
|
|
||||||
);
|
|
||||||
|
|
||||||
return () => {
|
return () => {
|
||||||
audio.pause();
|
audio.pause();
|
||||||
|
|||||||
+159
-158
@@ -92,7 +92,7 @@ export function useStreamingGeneration({
|
|||||||
let resumeThresholdSecs = rawResumeThresholdSecs;
|
let resumeThresholdSecs = rawResumeThresholdSecs;
|
||||||
if (resumeThresholdSecs <= rebufferThresholdSecs) {
|
if (resumeThresholdSecs <= rebufferThresholdSecs) {
|
||||||
console.warn(
|
console.warn(
|
||||||
`[useStreamingGeneration] resumeThresholdSecs (${resumeThresholdSecs}) must be greater than rebufferThresholdSecs (${rebufferThresholdSecs}). Clamping resumeThresholdSecs to ${rebufferThresholdSecs + 0.5}.`,
|
`[useStreamingGeneration] resumeThresholdSecs (${resumeThresholdSecs}) must be greater than rebufferThresholdSecs (${rebufferThresholdSecs}). Clamping resumeThresholdSecs to ${rebufferThresholdSecs + 0.5}.`
|
||||||
);
|
);
|
||||||
resumeThresholdSecs = rebufferThresholdSecs + 0.5;
|
resumeThresholdSecs = rebufferThresholdSecs + 0.5;
|
||||||
}
|
}
|
||||||
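Review note: the clamp in this hunk and the adaptive refill target computed later in `handleAudioChunk` together form a simple hysteresis: playback pauses below one threshold and resumes only above a higher one. A minimal sketch of the two calculations as pure functions follows. The function names are hypothetical, and the cap is passed in as a parameter because the value of `MAX_ADAPTIVE_RESUME_SECS` is not visible in this diff.

```typescript
// Sketch under assumptions: function names are illustrative, not from the repository.

// Mirrors the clamp shown above: the resume threshold must exceed the
// re-buffer threshold, otherwise it is pushed 0.5s above it.
function clampResumeThreshold(resumeSecs: number, rebufferSecs: number): number {
  return resumeSecs <= rebufferSecs ? rebufferSecs + 0.5 : resumeSecs;
}

// Mirrors the adaptive target: each underrun raises the refill target by 2s
// on top of the pre-buffer, never below the resume threshold and never above
// the cap (`maxSecs` stands in for MAX_ADAPTIVE_RESUME_SECS).
function adaptiveResumeSecs(
  resumeSecs: number,
  prebufferSecs: number,
  underrunCount: number,
  maxSecs: number
): number {
  return Math.min(maxSecs, Math.max(resumeSecs, prebufferSecs + underrunCount * 2));
}
```

Keeping the resume threshold strictly above the re-buffer threshold is what stops the player from flapping between suspended and running when the buffer hovers near a single value.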
@@ -162,177 +162,178 @@ export function useStreamingGeneration({
|
|||||||
hasStartedPlaybackRef.current = true;
|
hasStartedPlaybackRef.current = true;
|
||||||
}, [enqueue]);
|
}, [enqueue]);
|
||||||
|
|
||||||
const handleAudioChunk = useCallback((chunk: Float32Array<ArrayBuffer>) => {
|
const handleAudioChunk = useCallback(
|
||||||
const ctx = audioCtxRef.current;
|
(chunk: Float32Array<ArrayBuffer>) => {
|
||||||
if (!ctx) return;
|
const ctx = audioCtxRef.current;
|
||||||
|
if (!ctx) return;
|
||||||
|
|
||||||
chunksRef.current.push(chunk);
|
chunksRef.current.push(chunk);
|
||||||
totalAudioSamplesRef.current += chunk.length;
|
totalAudioSamplesRef.current += chunk.length;
|
||||||
|
|
||||||
if (!firstChunkSeenRef.current) {
|
if (!firstChunkSeenRef.current) {
|
||||||
firstChunkSeenRef.current = true;
|
firstChunkSeenRef.current = true;
|
||||||
onLog("First audio chunk received");
|
onLog("First audio chunk received");
|
||||||
}
|
|
||||||
|
|
||||||
if (!hasStartedPlaybackRef.current) {
|
|
||||||
const bufferedSecs = chunksRef.current.reduce((sum, c) => sum + c.length, 0) / SAMPLE_RATE;
|
|
||||||
if (bufferedSecs >= prebufferSecs) {
|
|
||||||
onLog(`Playback started after ${bufferedSecs.toFixed(1)}s buffered`);
|
|
||||||
flushBufferedAudio();
|
|
||||||
}
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
enqueue(ctx, chunk);
|
|
||||||
if (isUserPausedRef.current) return;
|
|
||||||
|
|
||||||
const ahead = nextStartTimeRef.current - ctx.currentTime;
|
|
||||||
if (
|
|
||||||
ctx.state === "running" &&
|
|
||||||
!isAutoBufferingRef.current &&
|
|
||||||
ahead < rebufferThresholdSecs
|
|
||||||
) {
|
|
||||||
isAutoBufferingRef.current = true;
|
|
||||||
underrunCountRef.current += 1;
|
|
||||||
adaptiveResumeSecsRef.current = Math.min(
|
|
||||||
MAX_ADAPTIVE_RESUME_SECS,
|
|
||||||
Math.max(resumeThresholdSecs, prebufferSecs + underrunCountRef.current * 2),
|
|
||||||
);
|
|
||||||
ctx.suspend().catch(() => {});
|
|
||||||
onLog(
|
|
||||||
`Buffer underrun ${underrunCountRef.current}; refilling to ${adaptiveResumeSecsRef.current.toFixed(1)}s`,
|
|
||||||
);
|
|
||||||
} else if (
|
|
||||||
isAutoBufferingRef.current &&
|
|
||||||
ahead >= adaptiveResumeSecsRef.current
|
|
||||||
) {
|
|
||||||
isAutoBufferingRef.current = false;
|
|
||||||
ctx.resume().catch(() => {});
|
|
||||||
onLog(`Buffer recovered with ${ahead.toFixed(1)}s queued`);
|
|
||||||
}
|
|
||||||
}, [enqueue, flushBufferedAudio, onLog, prebufferSecs, rebufferThresholdSecs, resumeThresholdSecs]);
|
|
||||||
|
|
||||||
const generate = useCallback(async (options: GenerateOptions) => {
|
|
||||||
if (!options.text.trim()) return;
|
|
||||||
|
|
||||||
resetPlayback();
|
|
||||||
revokeCurrentUrl();
|
|
||||||
audioCtxRef.current = new AudioContext({ sampleRate: SAMPLE_RATE });
|
|
||||||
|
|
||||||
const controller = new AbortController();
|
|
||||||
abortRef.current = controller;
|
|
||||||
|
|
||||||
onStart();
|
|
||||||
onLog(`Voice: ${options.speaker}`);
|
|
||||||
onLog(`CFG ${options.cfgScale.toFixed(1)}, steps ${options.inferenceSteps}`);
|
|
||||||
|
|
||||||
const startedAt = Date.now();
|
|
||||||
const timerId = window.setInterval(() => {
|
|
||||||
onProgress((Date.now() - startedAt) / 1000, null);
|
|
||||||
}, 500);
|
|
||||||
|
|
||||||
try {
|
|
||||||
const res = await fetch("/api/generate", {
|
|
||||||
method: "POST",
|
|
||||||
headers: { "Content-Type": "application/json" },
|
|
||||||
body: JSON.stringify({
|
|
||||||
text: options.text,
|
|
||||||
speaker: options.speaker,
|
|
||||||
cfg_scale: options.cfgScale,
|
|
||||||
inference_steps: options.inferenceSteps,
|
|
||||||
}),
|
|
||||||
signal: controller.signal,
|
|
||||||
});
|
|
||||||
|
|
||||||
if (!res.ok || !res.body) {
|
|
||||||
const err = await res.json().catch(() => ({})) as { error?: string };
|
|
||||||
throw new Error(err.error ?? `HTTP ${res.status}`);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
const reader = res.body.getReader();
|
if (!hasStartedPlaybackRef.current) {
|
||||||
const decoder = new TextDecoder();
|
const bufferedSecs = chunksRef.current.reduce((sum, c) => sum + c.length, 0) / SAMPLE_RATE;
|
||||||
let buffer = "";
|
if (bufferedSecs >= prebufferSecs) {
|
||||||
|
onLog(`Playback started after ${bufferedSecs.toFixed(1)}s buffered`);
|
||||||
|
flushBufferedAudio();
|
||||||
|
}
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
while (true) {
|
enqueue(ctx, chunk);
|
||||||
const { done, value } = await reader.read();
|
if (isUserPausedRef.current) return;
|
||||||
if (done) break;
|
|
||||||
|
|
||||||
buffer += decoder.decode(value, { stream: true });
|
const ahead = nextStartTimeRef.current - ctx.currentTime;
|
||||||
const lines = buffer.split("\n");
|
if (ctx.state === "running" && !isAutoBufferingRef.current && ahead < rebufferThresholdSecs) {
|
||||||
buffer = lines.pop() ?? "";
|
isAutoBufferingRef.current = true;
|
||||||
|
underrunCountRef.current += 1;
|
||||||
|
adaptiveResumeSecsRef.current = Math.min(
|
||||||
|
MAX_ADAPTIVE_RESUME_SECS,
|
||||||
|
Math.max(resumeThresholdSecs, prebufferSecs + underrunCountRef.current * 2)
|
||||||
|
);
|
||||||
|
ctx.suspend().catch(() => {});
|
||||||
|
onLog(
|
||||||
|
`Buffer underrun ${underrunCountRef.current}; refilling to ${adaptiveResumeSecsRef.current.toFixed(1)}s`
|
||||||
|
);
|
||||||
|
} else if (isAutoBufferingRef.current && ahead >= adaptiveResumeSecsRef.current) {
|
||||||
|
isAutoBufferingRef.current = false;
|
||||||
|
ctx.resume().catch(() => {});
|
||||||
|
onLog(`Buffer recovered with ${ahead.toFixed(1)}s queued`);
|
||||||
|
}
|
||||||
|
},
|
||||||
|
[enqueue, flushBufferedAudio, onLog, prebufferSecs, rebufferThresholdSecs, resumeThresholdSecs]
|
||||||
|
);
|
||||||
|
|
||||||
for (const line of lines) {
|
const generate = useCallback(
|
||||||
if (!line.startsWith("data: ")) continue;
|
async (options: GenerateOptions) => {
|
||||||
const event = JSON.parse(line.slice(6)) as {
|
if (!options.text.trim()) return;
|
||||||
type: "audio_chunk" | "complete" | "error" | "cancelled";
|
|
||||||
data?: string;
|
|
||||||
elapsed?: number;
|
|
||||||
audio_secs?: number;
|
|
||||||
realtime_factor?: number | null;
|
|
||||||
chunks?: number;
|
|
||||||
first_chunk_secs?: number | null;
|
|
||||||
max_chunk_gap_secs?: number;
|
|
||||||
message?: string;
|
|
||||||
};
|
|
||||||
|
|
||||||
if (event.type === "audio_chunk" && event.data) {
|
resetPlayback();
|
||||||
handleAudioChunk(decodeFloat32Chunk(event.data));
|
revokeCurrentUrl();
|
||||||
} else if (event.type === "complete") {
|
audioCtxRef.current = new AudioContext({ sampleRate: SAMPLE_RATE });
|
||||||
if (!hasStartedPlaybackRef.current) {
|
|
||||||
flushBufferedAudio();
|
const controller = new AbortController();
|
||||||
} else if (isAutoBufferingRef.current) {
|
abortRef.current = controller;
|
||||||
isAutoBufferingRef.current = false;
|
|
||||||
audioCtxRef.current?.resume().catch(() => {});
|
onStart();
|
||||||
}
|
onLog(`Voice: ${options.speaker}`);
|
||||||
const wavBlob = buildWav(mergeFloat32Arrays(chunksRef.current), SAMPLE_RATE);
|
onLog(`CFG ${options.cfgScale.toFixed(1)}, steps ${options.inferenceSteps}`);
|
||||||
const audioUrl = URL.createObjectURL(wavBlob);
|
|
||||||
audioUrlRef.current = audioUrl;
|
const startedAt = Date.now();
|
||||||
const kb = (wavBlob.size / 1024).toFixed(0);
|
const timerId = window.setInterval(() => {
|
||||||
const audioSecs = event.audio_secs ?? totalAudioSamplesRef.current / SAMPLE_RATE;
|
onProgress((Date.now() - startedAt) / 1000, null);
|
||||||
const realtimeFactor =
|
}, 500);
|
||||||
event.realtime_factor ??
|
|
||||||
(event.elapsed && event.elapsed > 0 ? audioSecs / event.elapsed : null);
|
try {
|
||||||
const speedText =
|
const res = await fetch("/api/generate", {
|
||||||
realtimeFactor === null ? "" : ` - ${realtimeFactor.toFixed(2)}x realtime`;
|
method: "POST",
|
||||||
onLog(`Done in ${event.elapsed}s - ${audioSecs.toFixed(1)}s audio${speedText} - ${kb} KB`);
|
headers: { "Content-Type": "application/json" },
|
||||||
if (event.chunks && event.first_chunk_secs !== undefined) {
|
body: JSON.stringify({
|
||||||
|
text: options.text,
|
||||||
|
speaker: options.speaker,
|
||||||
|
cfg_scale: options.cfgScale,
|
||||||
|
inference_steps: options.inferenceSteps,
|
||||||
|
}),
|
||||||
|
signal: controller.signal,
|
||||||
|
});
|
||||||
|
|
||||||
|
if (!res.ok || !res.body) {
|
||||||
|
const err = (await res.json().catch(() => ({}))) as { error?: string };
|
||||||
|
throw new Error(err.error ?? `HTTP ${res.status}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
const reader = res.body.getReader();
|
||||||
|
const decoder = new TextDecoder();
|
||||||
|
let buffer = "";
|
||||||
|
|
||||||
|
while (true) {
|
||||||
|
const { done, value } = await reader.read();
|
||||||
|
if (done) break;
|
||||||
|
|
||||||
|
buffer += decoder.decode(value, { stream: true });
|
||||||
|
const lines = buffer.split("\n");
|
||||||
|
buffer = lines.pop() ?? "";
|
||||||
|
|
||||||
|
for (const line of lines) {
|
||||||
|
if (!line.startsWith("data: ")) continue;
|
||||||
|
const event = JSON.parse(line.slice(6)) as {
|
||||||
|
type: "audio_chunk" | "complete" | "error" | "cancelled";
|
||||||
|
data?: string;
|
||||||
|
elapsed?: number;
|
||||||
|
audio_secs?: number;
|
||||||
|
realtime_factor?: number | null;
|
||||||
|
chunks?: number;
|
||||||
|
first_chunk_secs?: number | null;
|
||||||
|
max_chunk_gap_secs?: number;
|
||||||
|
message?: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
if (event.type === "audio_chunk" && event.data) {
|
||||||
|
handleAudioChunk(decodeFloat32Chunk(event.data));
|
||||||
|
} else if (event.type === "complete") {
|
||||||
|
if (!hasStartedPlaybackRef.current) {
|
||||||
|
flushBufferedAudio();
|
||||||
|
} else if (isAutoBufferingRef.current) {
|
||||||
|
isAutoBufferingRef.current = false;
|
||||||
|
audioCtxRef.current?.resume().catch(() => {});
|
||||||
|
}
|
||||||
|
const wavBlob = buildWav(mergeFloat32Arrays(chunksRef.current), SAMPLE_RATE);
|
||||||
|
const audioUrl = URL.createObjectURL(wavBlob);
|
||||||
|
audioUrlRef.current = audioUrl;
|
||||||
|
const kb = (wavBlob.size / 1024).toFixed(0);
|
||||||
|
const audioSecs = event.audio_secs ?? totalAudioSamplesRef.current / SAMPLE_RATE;
|
||||||
|
const realtimeFactor =
|
||||||
|
event.realtime_factor ??
|
||||||
|
(event.elapsed && event.elapsed > 0 ? audioSecs / event.elapsed : null);
|
||||||
|
const speedText =
|
||||||
|
realtimeFactor === null ? "" : ` - ${realtimeFactor.toFixed(2)}x realtime`;
|
||||||
onLog(
|
onLog(
|
||||||
`Stream: first chunk ${event.first_chunk_secs}s, ${event.chunks} chunks, max gap ${event.max_chunk_gap_secs}s`,
|
`Done in ${event.elapsed}s - ${audioSecs.toFixed(1)}s audio${speedText} - ${kb} KB`
|
||||||
);
|
);
|
||||||
|
if (event.chunks && event.first_chunk_secs !== undefined) {
|
||||||
|
onLog(
|
||||||
|
`Stream: first chunk ${event.first_chunk_secs}s, ${event.chunks} chunks, max gap ${event.max_chunk_gap_secs}s`
|
||||||
|
);
|
||||||
|
}
|
||||||
|
onSuccess(audioUrl);
|
||||||
|
} else if (event.type === "cancelled") {
|
||||||
|
throw new DOMException("Generation cancelled", "AbortError");
|
||||||
|
} else if (event.type === "error") {
|
||||||
|
throw new Error(event.message ?? "Generation failed");
|
||||||
}
|
}
|
||||||
onSuccess(audioUrl);
|
|
||||||
} else if (event.type === "cancelled") {
|
|
||||||
throw new DOMException("Generation cancelled", "AbortError");
|
|
||||||
} else if (event.type === "error") {
|
|
||||||
throw new Error(event.message ?? "Generation failed");
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
} catch (err) {
|
||||||
|
if (err instanceof Error && err.name === "AbortError") {
|
||||||
|
onLog("Cancelled.");
|
||||||
|
onCancel();
|
||||||
|
} else {
|
||||||
|
const message = err instanceof Error ? err.message : "Unknown error";
|
||||||
|
onLog(`Error: ${message}`);
|
||||||
|
onError();
|
||||||
|
}
|
||||||
|
} finally {
|
||||||
|
window.clearInterval(timerId);
|
||||||
|
abortRef.current = null;
|
||||||
}
|
}
|
||||||
} catch (err) {
|
},
|
||||||
if (err instanceof Error && err.name === "AbortError") {
|
[
|
||||||
onLog("Cancelled.");
|
flushBufferedAudio,
|
||||||
onCancel();
|
handleAudioChunk,
|
||||||
} else {
|
onCancel,
|
||||||
const message = err instanceof Error ? err.message : "Unknown error";
|
onError,
|
||||||
onLog(`Error: ${message}`);
|
onLog,
|
||||||
onError();
|
onProgress,
|
||||||
}
|
onStart,
|
||||||
} finally {
|
onSuccess,
|
||||||
window.clearInterval(timerId);
|
resetPlayback,
|
||||||
abortRef.current = null;
|
revokeCurrentUrl,
|
||||||
}
|
]
|
||||||
}, [
|
);
|
||||||
flushBufferedAudio,
|
|
||||||
handleAudioChunk,
|
|
||||||
onCancel,
|
|
||||||
onError,
|
|
||||||
onLog,
|
|
||||||
onProgress,
|
|
||||||
onStart,
|
|
||||||
onSuccess,
|
|
||||||
resetPlayback,
|
|
||||||
revokeCurrentUrl,
|
|
||||||
]);
|
|
||||||
|
|
||||||
const pauseStream = useCallback(() => {
|
const pauseStream = useCallback(() => {
|
||||||
isUserPausedRef.current = true;
|
isUserPausedRef.current = true;
|
||||||
|
|||||||
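Review note: the stream-reading loop in `generate` is hard to follow in the re-indented hunk above, so here is the line-buffering step in isolation. It is a sketch, not the hook's code: `extractDataLines` is a hypothetical helper, and it assumes, as the diff does, that events arrive as newline-terminated `data: ` lines and that a network chunk may end mid-line.

```typescript
// Sketch under assumptions: `extractDataLines` is an illustrative helper,
// not a function from the repository.

// Splits accumulated stream text into complete `data: ` payloads and returns
// the trailing partial line so it can be prepended to the next chunk.
function extractDataLines(buffer: string): { payloads: string[]; rest: string } {
  const lines = buffer.split("\n");
  // The last element is either "" (buffer ended on a newline) or an incomplete line.
  const rest = lines.pop() ?? "";
  const payloads = lines
    .filter((line) => line.startsWith("data: "))
    .map((line) => line.slice(6)); // drop the "data: " prefix
  return { payloads, rest };
}

// Usage: feed two chunks where the second event is split across them.
const first = extractDataLines('data: {"a":1}\n\ndata: {"b"');
const second = extractDataLines(first.rest + ":2}\n");
```

Each payload would then go through `JSON.parse` and be dispatched on its `type` field, as the loop in the diff does.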