https://research.nvidia.com/labs/adlr/NVLM-1/
NVLM: Open Frontier-Class Multimodal LLMs - NVIDIA ADLR
We introduce NVLM 1.0, a family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks,...
frontier class, multimodal llms, open, nvidia
https://arxiv.org/abs/2404.13784
[2404.13784] Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images
https://aiedresearcher.org/articles/mt-video-bench-a-holistic-video-understanding-benchmark-for-evaluating-multimodal-llms-in-multi-turn-dialogues-2/
MT-Video-Bench: A Holistic Video Understanding Benchmark for Evaluating Multimodal LLMs in...
https://www.siliconflow.com/models/glm-5v-turbo
SiliconFlow – AI Infrastructure for LLMs & Multimodal Models
Lightning-fast AI platform for developers. Deploy, fine-tune, and run 200+ optimized LLMs and multimodal models with simple APIs - SiliconFlow.
ai infrastructure, for llms, siliconflow, multimodal, models