Installation and first run

Detailed setup guide

This page walks through installation, Local mode setup with FFmpeg, Ollama and a model, Cloud BYOK setup with your own provider key, and the first successful test. Tutorial videos are a highlighted use case, but the app is not limited to them.

Why is setup required?

VoxBridge Studio supports two setup paths. Local mode runs recognition, translation and file processing on your own computer, so FFmpeg, Ollama, a translation model and a license need to be checked. Cloud BYOK uses your own Groq or Gemini API key for selected cloud recognition or translation providers, so the key and provider choice also need to be checked.

Stuck? Start here

If this is your first run, do not start with a long video or the largest model. Use a 30–60 second test file first. On lighter hardware, start with translategemma:4b. Open VoxBridge Studio and let the first-run setup assistant check FFmpeg, the license, and Local or Cloud BYOK readiness. Then fix only the missing step and run the check again.

translategemma:4b is a compatibility first-test model, not the quality baseline. translategemma:27b is for strong machines and should be treated as a heavier mode.

Recommended path: Download → choose Local or Cloud BYOK → setup → short test video → if the workflow works, the personal license can come later.

Simple mode

Start with Local or Cloud BYOK

For new users, the Subtitles tab starts in Simple mode. Choose Local when you want the processing path on your PC. Choose Cloud BYOK when you intentionally want to use your own cloud provider key for Groq ASR or Gemini/Groq translation.

  • Local: FFmpeg, Ollama and a downloaded TranslateGemma model.
  • Cloud BYOK: your own provider account and API key.
  • Advanced: detailed provider combinations and mixed modes.

API keys

Session-only by default

Cloud BYOK keys are used only for the current session by default. If you explicitly enable “Remember API keys on this PC”, VoxBridge Studio stores them with Windows Credential Manager. Treat keys as secrets: they should not be handled as plain settings, log entries or config-file values.

If you already set GROQ_API_KEY, GEMINI_API_KEY or GOOGLE_API_KEY in the environment, the app can use those as a fallback.
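
The fallback idea can be sketched in a few lines of Python. This is only an illustration, not the app's actual code: the three variable names come from this guide, while the function name resolve_api_key, the provider labels and the lookup order (session key first, GEMINI_API_KEY before GOOGLE_API_KEY) are assumptions.

```python
import os

# Environment variable names come from the guide; the lookup order is an assumption.
ENV_FALLBACKS = {
    "groq": ("GROQ_API_KEY",),
    "gemini": ("GEMINI_API_KEY", "GOOGLE_API_KEY"),
}


def resolve_api_key(provider, session_key=None, env=None):
    """Return (key, source) for a provider, or (None, None) if nothing is set."""
    env = os.environ if env is None else env
    # A key typed in for the current session wins over any environment value.
    if session_key:
        return session_key, "session"
    for name in ENV_FALLBACKS.get(provider.lower(), ()):
        value = env.get(name, "").strip()
        if value:
            return value, name
    return None, None
```

Calling resolve_api_key("gemini") with no session key returns the first non-empty environment value together with the variable name it came from, which makes it easy to see which source is in use.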

What you need

Minimum working set

You do not need a lot, but a few base components must be in place. Once they are ready, the setup assistant can check the whole chain quickly.

  • Windows 10 or Windows 11
  • VoxBridge Studio portable ZIP package
  • FFmpeg
  • Ollama for Local mode
  • at least one downloaded translation model for Local mode
  • your own Groq or Gemini API key for Cloud BYOK mode
  • enough RAM or VRAM for the selected model
  • internet for the initial model download and license activation

Download

Where should you start?

Start on the download page. The portable ZIP package should be extracted into a separate folder, and VoxBridge Studio should be started from that extracted folder.

  • open the download page
  • download the portable ZIP package
  • extract it into a separate folder
  • always start the app from the extracted folder

Hardware and model choice

Check hardware before downloading models

Local translation and speech recognition can need a lot of memory. The GB size of a model mainly describes download and storage size; while running, it also needs RAM or VRAM. On weaker machines, do not start with the largest models.

Translation model, download size and recommended start:

  • translategemma:4b, 3.3 GB: weak-PC/no-GPU first test and compatibility fallback; quality baseline is not guaranteed
  • translategemma:12b, 8.1 GB: recommended base model; 24 GB RAM or 12 GB+ VRAM; do not start here below 16 GB RAM
  • translategemma:27b, 17 GB: quality mode for strong machines; expect a warning before use and avoid it below 32 GB RAM
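
As a rough aid, the memory thresholds above can be turned into a small Python helper. This is a sketch under stated assumptions: the function names are hypothetical, the numbers are the ones quoted in this guide, and the app's own recommendation logic may differ.

```python
def suggest_start_model(ram_gb, vram_gb=0.0):
    """Pick a first translation model from the thresholds quoted in this guide."""
    # 12b is the recommended base model with 24 GB RAM or 12 GB+ VRAM.
    if ram_gb >= 24 or vram_gb >= 12:
        return "translategemma:12b"
    # Everything else starts with the 4b compatibility model.
    return "translategemma:4b"


def heavy_mode_is_reasonable(ram_gb, vram_gb=0.0):
    """Return True if trying translategemma:27b later is reasonable."""
    # The guide advises against 27b below 32 GB RAM and recommends 24 GB+ VRAM.
    return ram_gb >= 32 or vram_gb >= 24
```

For example, suggest_start_model(16) returns "translategemma:4b", while suggest_start_model(16, vram_gb=12) returns "translategemma:12b".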

ASR / speech to subtitles

Speech recognition is also compute-heavy

  • distil-large-v3: recommended base ASR, 16 GB RAM and 6-8 GB+ VRAM recommended; slower on CPU.
  • large-v3-turbo: faster large ASR option, 16 GB RAM and 6-8 GB+ VRAM recommended.
  • large-v3: quality mode, 16-32 GB RAM and 8 GB+ VRAM recommended; can be very slow on CPU.

For the first test, use an existing SRT subtitle file or a 1-3 minute video with clear speech.

Featured official downloads

  • FFmpeg: video and audio processing
  • Ollama: local model runtime

Both come from official sources.

The current app gives a guided first-run setup: FFmpeg download, Ollama installer launch, model selection, model download, license checks and Cloud BYOK readiness can all be started or checked from inside VoxBridge Studio. After the Ollama installer finishes, return to VoxBridge Studio and run the check again. The commands below remain as manual troubleshooting options.

Step 1

Install VoxBridge Studio

  1. Open the download page.
  2. Download the portable ZIP package.
  3. Extract it into a separate, easy-to-find folder.
  4. Start VoxBridge Studio from the extracted folder.
  5. Let the first-run setup assistant appear after launch.

Screenshot: VoxBridge Studio first-run setup assistant
First-run check: The setup assistant checks FFmpeg, license, Local-mode Ollama/model readiness, and Cloud BYOK provider/key readiness for the first trial.

Step 2

Install and configure FFmpeg

FFmpeg is required for video and audio processing. During first-run setup, the app can download, extract and configure FFmpeg for you.

Manual download and PATH setup are fallback options if the app-based setup does not work.

  1. Start VoxBridge Studio.
  2. In the first-run setup, click the Download and configure FFmpeg button.
  3. Wait while the app downloads, extracts and configures FFmpeg.
  4. Run the check again inside the app.
  5. If it does not work, select the ffmpeg.exe path manually.

Quick check

ffmpeg -version

If this prints a version number, FFmpeg is visible to the system.

If it is not found, either it is not installed or it is not in PATH. Manual path entry inside the app is the quickest fallback.
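
If you prefer a script to typing the command, the same check can be sketched in Python. This is a minimal sketch, not what the app does internally: the function names are hypothetical, and the only assumption about FFmpeg is that the first line of its -version output starts with "ffmpeg version".

```python
import os
import re
import shutil
import subprocess


def find_ffmpeg(explicit_path=None):
    """Return a usable ffmpeg path: the explicit one if it exists, else a PATH lookup."""
    if explicit_path and os.path.isfile(explicit_path):
        return explicit_path
    # shutil.which returns None when ffmpeg is not on PATH.
    return shutil.which("ffmpeg")


def parse_ffmpeg_version(output):
    """Extract the version token from the start of `ffmpeg -version` output."""
    match = re.match(r"ffmpeg version (\S+)", output)
    return match.group(1) if match else None


def ffmpeg_version(path):
    """Run `<path> -version` and return the version string, or None on failure."""
    try:
        result = subprocess.run(
            [path, "-version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_ffmpeg_version(result.stdout) if result.returncode == 0 else None
```

A typical use is path = find_ffmpeg(r"C:\Tools\ffmpeg\bin\ffmpeg.exe"), followed by ffmpeg_version(path) if a path was found. A None result at either step corresponds to the "not installed or not in PATH" case described above.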

Step by step in PowerShell

# 1) Create the folder
New-Item -ItemType Directory -Force -Path C:\Tools\ffmpeg

# 2) Extract the FFmpeg package here
# Skip this if you already extracted it

# 3) Check whether ffmpeg.exe exists
Get-ChildItem C:\Tools\ffmpeg -Recurse -Filter ffmpeg.exe

# 4) Add the folder that contains ffmpeg.exe to the current session PATH
#    (adjust the path if step 3 found ffmpeg.exe in a different folder)
$env:Path += ";C:\Tools\ffmpeg\bin"

# 5) Verify it
ffmpeg -version
where.exe ffmpeg

The same in cmd

mkdir C:\Tools\ffmpeg
dir /s /b C:\Tools\ffmpeg\ffmpeg.exe
set PATH=%PATH%;C:\Tools\ffmpeg\bin
ffmpeg -version
where ffmpeg

If ffmpeg -version works only in the current terminal, the PATH change is temporary. In that case either set PATH permanently in Windows, or point the app directly to ffmpeg.exe.
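
To see whether a folder really is part of PATH in a newly opened terminal, a small Python check can help. This is only a sketch: the function name is_on_path is hypothetical, and it compares entries the way Windows treats paths (case-insensitive, trailing backslash ignored).

```python
import ntpath


def _normalize(folder):
    # Windows paths are case-insensitive; also drop quotes and trailing slashes.
    return ntpath.normcase(ntpath.normpath(folder.strip().strip('"')))


def is_on_path(folder, path_value):
    """Return True if `folder` appears as an entry in a Windows PATH string."""
    wanted = _normalize(folder)
    entries = [entry for entry in path_value.split(";") if entry.strip()]
    return any(_normalize(entry) == wanted for entry in entries)
```

In a fresh terminal, is_on_path(r"C:\Tools\ffmpeg\bin", os.environ["PATH"]) (after import os) tells you whether the change survived. If it returns False, the earlier PATH change was session-only.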

Step 3

Install and start Ollama

Ollama runs the local translation model. The app can open the official Windows installer for you, so you do not have to search for the download manually.

You still approve and finish the installation in the Windows installer. The app does not install it silently.

  1. In the first-run setup, click the Download Ollama and start installer button.
  2. Finish the Ollama installation in the Windows installer.
  3. Return to VoxBridge Studio and run the app check again.
  4. If Ollama is reachable, continue to model download.
  5. If it does not respond, restart Ollama or Windows, then check again.

Quick check

ollama --version
ollama list

These are manual troubleshooting commands. In normal use, the app check is enough.

Step by step in PowerShell

# 1) Check the version
ollama --version

# 2) See whether it responds
ollama list

# 3) Start the local service manually if needed
ollama serve

The same in cmd

ollama --version
ollama list
ollama serve

ollama serve is usually only needed if Ollama is not already running. If ollama list works right after install, you often do not need to start it separately.
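
The "is Ollama reachable?" question can also be asked over Ollama's local HTTP API, sketched here in Python. Assumptions: Ollama listens on its default address http://localhost:11434, and the /api/tags endpoint returns the downloaded models. The function names are hypothetical, and this guide does not say whether VoxBridge Studio performs its check this way.

```python
import json
import urllib.request

# Ollama's default local address; change it if your install listens elsewhere.
DEFAULT_OLLAMA_URL = "http://localhost:11434"


def parse_model_names(payload):
    """Return model names from the JSON body of Ollama's /api/tags endpoint."""
    data = json.loads(payload)
    return [model["name"] for model in data.get("models", [])]


def list_ollama_models(base_url=DEFAULT_OLLAMA_URL, timeout=3.0):
    """Return downloaded model names, or None if Ollama does not answer."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            return parse_model_names(response.read().decode("utf-8"))
    except (OSError, ValueError, KeyError, TypeError, AttributeError):
        # Connection refused, timeout, HTTP error or an unexpected response body.
        return None
```

A None result means the service did not answer, which is the case where ollama serve or a restart is worth trying. An empty list means Ollama is running but no model has been downloaded yet.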

Step 4

Download the translation model

The setup window lets you choose a translation model and start the download in one place. For the first run, one model is enough.

  1. Choose a model in the Translation model section of first-run setup.
  2. If the machine is weaker or you are unsure, start with translategemma:4b.
  3. If you have 24 GB RAM or 12 GB+ VRAM, translategemma:12b is a good base model.
  4. Click Download selected model.
  5. Wait until the download is complete.
  6. Run the check again; model status should be ready.

Recommended models

ollama pull translategemma:4b
ollama pull translategemma:12b
ollama pull translategemma:27b

translategemma:4b is the weak-PC first-test and compatibility model, not the quality baseline. translategemma:12b is the recommended base model, but it is not a good first choice below 16 GB RAM. translategemma:27b usually gives better translation quality, but it needs strong hardware, should show a warning before use, and is not recommended below 32 GB RAM.

You do not need all three models. Pick one for the first test, and try another later if needed.

PowerShell / cmd commands

Lines starting with # are PowerShell comments. In cmd, type only the ollama lines.

# Faster model for weaker machines
ollama pull translategemma:4b

# Balanced default model
ollama pull translategemma:12b

# Larger, usually better-quality model
ollama pull translategemma:27b

# Verify
ollama list

Quick runtime test

ollama run translategemma:4b "Translate to Hungarian: Hello, this is a short test."

ollama run translategemma:12b "Translate to Hungarian: Hello, this is a short test."

ollama run translategemma:27b "Translate to Hungarian: Hello, this is a short test."

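If ollama run returns a sensible translation, the model is really running. Run only the line for the model you downloaded; otherwise ollama run starts downloading the missing model first. The 27b line is there for strong machines only, because it is a heavy quality mode.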
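
The same quick test can be scripted against Ollama's local HTTP API. This is a hedged sketch: it assumes the default address http://localhost:11434 and the /api/generate endpoint with streaming turned off. The function names are hypothetical, and the prompt wording is simply copied from the commands in this guide.

```python
import json
import urllib.request


def build_generate_request(model, text, target_language="Hungarian"):
    """Build the JSON body for Ollama's /api/generate endpoint (non-streaming)."""
    return {
        "model": model,
        "prompt": f"Translate to {target_language}: {text}",
        "stream": False,
    }


def quick_translate(model, text, base_url="http://localhost:11434", timeout=120.0):
    """Return the model's reply text, or None if the request fails."""
    body = json.dumps(build_generate_request(model, text)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8")).get("response")
    except (OSError, ValueError, AttributeError):
        # Service unreachable, timeout, HTTP error or an unexpected response body.
        return None
```

quick_translate("translategemma:4b", "Hello, this is a short test.") should return translated text when the model is loaded and working. The generous default timeout is deliberate, because the first call also has to load the model into memory.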

System requirements

What is needed for a stable start?

Minimum:

  • Windows 10 or newer
  • FFmpeg installed
  • running Ollama
  • 16 GB RAM or 6 GB+ VRAM for a 4b test
  • 12b is not recommended below 16 GB RAM
  • SSD storage

Recommended:

  • Windows 11
  • 24 GB RAM or 12 GB+ VRAM for the 12b model
  • 32-64 GB RAM or 24 GB+ VRAM for the 27b model
  • fast SSD
  • stronger CPU or GPU
  • stable internet for the initial model download

4b, 12b or 27b?

  • translategemma:4b: 3.3 GB, weak-PC first test or compatibility fallback; not the quality baseline
  • translategemma:12b: 8.1 GB, recommended base model; start here with 24 GB RAM or 12 GB+ VRAM
  • translategemma:27b: 17 GB, heavy quality mode with an explicit warning before use; 32-64 GB RAM or 24 GB+ VRAM recommended
  • if you are unsure or the machine is weaker, start with 4b and a short test
  • if quality matters more, try 27b only on strong hardware

Step 5

First-run setup inside the app

  1. Launch VoxBridge Studio.
  2. Open the first-run setup assistant if it does not appear automatically.
  3. Check FFmpeg, Ollama, model and license status.
  4. If setup reports a missing step, fix it and run the check again.
  5. Activate the trial or license if you already have one.
  6. Run the first short trial check.

The goal here is not perfection. The goal is the first green status.

What should you verify inside the app?

  1. FFmpeg status is ready.
  2. Ollama is reachable.
  3. The downloaded model is visible.
  4. Trial or license status is green.
  5. The first test can be started.

What should be green?

  • FFmpeg available
  • Ollama running
  • at least one translation model downloaded
  • license or trial status ready
  • the first test can start

On weaker machines, Ollama may respond slowly. A smaller model and another setup check are the best first steps.
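
The "fix only the missing step" idea can be written down as a tiny Python checklist. It is purely illustrative: the check names and the function missing_steps are hypothetical and mirror the items listed in this guide, not the app's internal state.

```python
# Check names are illustrative; they mirror the items this guide lists.
COMMON_CHECKS = ("ffmpeg", "license")
MODE_CHECKS = {
    "local": ("ollama", "model"),
    "cloud_byok": ("provider", "api_key"),
}


def missing_steps(mode, status):
    """Return the checks that are not yet ready, in the order they should be fixed."""
    required = COMMON_CHECKS + MODE_CHECKS[mode]
    # A check that is absent from `status` counts as not ready.
    return [name for name in required if not status.get(name, False)]
```

With status = {"ffmpeg": True, "license": True, "ollama": True, "model": False}, missing_steps("local", status) returns ["model"], so the model download is the only thing left before the first test.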

Step 6

First successful trial

Start with a short, non-private video, not with an important long course recording. A sidecar subtitle is the easiest first case.

  1. Choose a 1-3 minute clip.
  2. If a sidecar subtitle exists, start with that.
  3. If not, try the audio-to-subtitle path.
  4. Check the language-tagged output, for example VideoName.hu.srt.
  5. Once that works, you are ready for longer material.
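
To know which file names to look for after the test, the naming pattern can be sketched in Python. The VideoName.hu.srt pattern comes from this guide. The function names are hypothetical, and the sidecar rule (same base name with an .srt extension) is an assumption based on a common convention.

```python
from pathlib import PurePath


def tagged_subtitle_name(video_name, language_code):
    """Return a language-tagged subtitle name, e.g. VideoName.hu.srt."""
    path = PurePath(video_name)
    # Keep the full stem so names with extra dots are preserved.
    return str(path.with_name(f"{path.stem}.{language_code}.srt"))


def sidecar_candidate(video_name):
    """Return the sidecar subtitle name that would sit next to the video."""
    return str(PurePath(video_name).with_suffix(".srt"))
```

For VideoName.mp4 and the language code hu, the expected output name is VideoName.hu.srt, and the sidecar file to look for first is VideoName.srt.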

Screenshot: VoxBridge Studio files and queue view
Main workflow: Files, queue and output are visible in one place, which keeps the first trial easy to follow.

If you get stuck

What should you check first?

  • is FFmpeg actually visible?
  • is Ollama running?
  • is the model downloaded?
  • is the trial or license status green?
  • are you testing with a short, non-private file?

Support export

If it still does not start, create a support bundle and send it to support. It can include logs, settings and technical state information.

Summary: the whole route is short. Download and install, run the FFmpeg and license checks, then the Local-mode Ollama/model or Cloud BYOK provider/key checks, and finish with a short test video. Once this is in place, the basic workflow is working, and the only remaining decision is the personal license, which can come later.