newai.today

Discover, compare and track AI models, their public releases and benchmark scores.

© 2025 newai.today. All rights reserved.

Models Directory

Browse and compare AI models across providers, modalities, and use cases.

Showing 20 of 84 models

Active Filters

In: Audio

ElevenLabs Audio Isolation

Isolate audio tracks using ElevenLabs' advanced audio isolation technology.

ElevenLabs Speech to Text

Generate text from speech using ElevenLabs' advanced speech-to-text model.

Speech-to-Text

Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.

Speech-to-Text

Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.

Speech-to-Text

Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.

Speech-to-Text

Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.

Whisper

Whisper is a model for speech transcription and translation.

Wizper (Whisper v3 -- fal.ai edition)

[Experimental] Whisper v3 Large -- but optimized by our inference wizards. Same WER, double the performance!

Eleven English Sts V1

10,000

Eleven English Sts V2

English-only voice changer model (Speech to Speech)

Context

10.0K

Eleven Multilingual Sts V2

State-of-the-art multilingual voice changer model (Speech to Speech)

GPT-4o Audio

This is a preview release of the GPT-4o Audio models. These models accept audio inputs, produce audio outputs, and can be used in the Chat Completions REST API.

Pricing

Input: $2.50 / 1M tokens
Output: $10.00 / 1M tokens

Context

128.0K

GPT-4o Realtime

This is a preview release of the GPT-4o Realtime model, capable of responding to audio and text inputs in real time over a WebRTC or WebSocket interface.

Pricing

Input: $5.00 / 1M tokens
Output: $20.00 / 1M tokens

Context

128.0K

GPT-4o Transcribe

GPT-4o Transcribe is a speech-to-text model that uses GPT-4o to transcribe audio. It offers a lower word error rate and better language recognition and accuracy than the original Whisper models. Use it when you need more accurate transcripts.

Pricing

Input: $2.50 / 1M tokens
Output: $10.00 / 1M tokens

Context

16.0K

GPT-4o mini Audio

This is a preview release of the smaller GPT-4o mini Audio model. It is designed to accept audio inputs or generate audio outputs via the REST API.

Pricing

Input: $0.15 / 1M tokens
Output: $0.60 / 1M tokens

Context

128.0K

GPT-4o mini Realtime

This is a preview release of the GPT-4o mini Realtime model, capable of responding to audio and text inputs in real time over a WebRTC or WebSocket interface.

Pricing

Input: $0.60 / 1M tokens
Output: $2.40 / 1M tokens

Context

128.0K

GPT-4o mini Transcribe

GPT-4o mini Transcribe is a speech-to-text model that uses GPT-4o mini to transcribe audio. It offers a lower word error rate and better language recognition and accuracy than the original Whisper models. Use it when you need more accurate transcripts.

Pricing

Input: $1.25 / 1M tokens
Output: $5.00 / 1M tokens

Context

16.0K

Gemini 1.5 Flash

Gemini 1.5 Flash is a fast and versatile multimodal model for scaling across diverse tasks.

Pricing

Input: $0.07 / 1M tokens
Output: $0.30 / 1M tokens

Context

1.0M

Gemini 1.5 Flash-8B

Gemini 1.5 Flash-8B is a small model designed for lower-intelligence tasks.

Pricing

Input: $0.04 / 1M tokens
Output: $0.15 / 1M tokens

Context

1.0M

Gemini 1.5 Pro

Gemini 1.5 Pro is a mid-size multimodal model optimized for a wide range of reasoning tasks.

Pricing

Input: $1.25 / 1M tokens
Output: $5.00 / 1M tokens

Context

2.1M
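
The per-1M-token prices in this listing can be turned into a rough cost estimate for a given workload. The sketch below is illustrative only: the `PRICES` table and the `estimate_cost` helper are hypothetical names, the figures are copied from a few entries above, and it assumes the listed rate applies uniformly to every token, which may not hold for audio tokens on a given provider.

```python
from decimal import Decimal

# (input, output) prices in USD per 1M tokens, copied from the listing.
# Assumption: one flat rate per direction; real audio-token pricing may differ.
PRICES = {
    "GPT-4o Audio": (Decimal("2.50"), Decimal("10.00")),
    "GPT-4o Realtime": (Decimal("5.00"), Decimal("20.00")),
    "GPT-4o mini Audio": (Decimal("0.15"), Decimal("0.60")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one workload for a listed model."""
    input_price, output_price = PRICES[model]
    return (
        input_price * input_tokens / TOKENS_PER_UNIT
        + output_price * output_tokens / TOKENS_PER_UNIT
    )


if __name__ == "__main__":
    # Compare the same hypothetical workload across the listed models.
    for name in PRICES:
        print(name, estimate_cost(name, 200_000, 50_000))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding noise.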
