newai.today

Discover, compare and track AI models, their public releases and benchmark scores.

© 2025 newai.today. All rights reserved.


Models Directory

Browse and compare AI models across providers, modalities, and use cases.

Showing 20 of 522 models


Active Filters

Output: Text

Any LLM

Use any large language model from our selected catalogue (powered by OpenRouter).


Any VLM

Use any vision-language model from our selected catalogue (powered by OpenRouter).


ElevenLabs Speech to Text

Generate text from speech using the ElevenLabs advanced speech-to-text model.


FFmpeg API Metadata

Get encoding metadata from video and audio files using the FFmpeg API.


FFmpeg API Waveform

Get waveform data from audio files using the FFmpeg API.


Florence-2 Large

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.


Florence-2 Large

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.


Florence-2 Large

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.


Florence-2 Large

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.


Florence-2 Large

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.


Florence-2 Large

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.


GOT OCR 2.0

GOT-OCR2 works on a wide range of tasks, including plain document OCR, scene text OCR, formatted document OCR, and even OCR for tables, charts, mathematical formulas, geometric shapes, molecular formulas and sheet music.


LLaVA v1.6 34B

Vision


MiniCPM-V 2.6

Multimodal vision-language model for single- and multi-image understanding.


MiniCPM-V 2.6

Multimodal vision-language model for video understanding.


MoonDreamNext

MoonDreamNext is a multimodal vision-language model for captioning, gaze detection, bounding-box detection, point detection, and more.


MoonDreamNext Batch

MoonDreamNext Batch is a multimodal vision-language model for batch captioning.


Moondream

Answers questions about images.


NSFW Filter

Predict the probability of an image being NSFW.


Sa2VA 4B Image

Sa2VA is a multimodal large language model (MLLM) capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels.
