Robuta

https://openreview.net/forum?id=F1ff8zcjPp&referrer=%5Bthe%20profile%20of%20Yue%20Dong%5D(%2Fprofile%3Fid%3D~Yue_Dong2)
Vision-language models (VLMs) have improved significantly in their capabilities, but their complex architecture makes their safety alignment challenging. In...
layerwise, alignment, examining, safety
https://openreview.net/forum?id=IQTGyOh8f9&referrer=%5Bthe%20profile%20of%20Seung%20Hyun%20Lee%5D(%2Fprofile%3Fid%3D~Seung_Hyun_Lee2)
The goal of image cropping is to identify visually appealing crops in an image. Conventional methods are trained on specific datasets and fail to adapt to new...
vision language model, image cropping, cropper, context
https://openreview.net/forum?id=ELd5Kn0StP&referrer=%5Bthe%20profile%20of%20Xiaohong%20Liu%5D(%2Fprofile%3Fid%3D~Xiaohong_Liu2)
The development of multimodal large language models (MLLMs) enables the evaluation of image quality through natural language descriptions. This advancement...
image quality assessment, multimodal language, grounding, iqa, model
https://similarlabs.com/p/banana-ai
Professional AI image generation platform featuring nano banana AI technology integration with advanced natural language image editing capabilities
banana ai, natural language, image editing, google gemini, pro
https://huggingface.co/papers/2201.12086
Join the discussion on this paper page
pre training, paper, blip, bootstrapping, language
https://openreview.net/forum?id=KCZU12jzfC&referrer=%5Bthe%20profile%20of%20Harold%20Haodong%20Chen%5D(%2Fprofile%3Fid%3D~Harold_Haodong_Chen1)
Accurately profiling urban regions in terms of social, economic and environmental indicators is crucial for urban planning and sustainable development. The...
urban region, learning, text, enhanced, profiling
https://scirp.org/journal/papercitationdetails?paperid=124836&JournalID=2431
Research on specific domain question-answering technology has become important with the increasing demand for intelligent question-answering systems. This...
question answering, domain, algorithm, based, contrastive
https://www.showmebest.ai/ai-tools/nanoimg-ai
Create and edit professional-quality images instantly using natural language commands with advanced AI technology.
nano banana ai, natural language, image editor, generator
https://j-min.io/publication/sevila_neurips2023/
Sep 22, 2023 - To handle video QA, we self-chain BLIP-2 for 2-stage inference (localize+QA) & refining localization via QA feedback - *[NeurIPS...
image language, video localization, selfchained, model
https://www.hindustantimes.com/videos/news/china-rocket-image-row-bjps-mandarin-language-birthday-wish-for-mk-stalin-isro-tamil-nadu-101709316328813-amp.html
BJP took a fresh jibe at Tamil Nadu CM MK Stalin while extending birthday wishes. The BJP, in a post on 'X' platform, wished CM Stalin in Mandarin - a form of...
mandarin language, birthday wish, china, rocket, image
https://fluxkontext.io/
Flux Kontext AI: The ultimate AI-powered image editing platform. Transform your photos with natural language prompts using advanced FLUX.1 models. Edit...
flux kontext ai, image editor, revolutionary, transform, images
https://creati.ai/ai-tools/nanobanana-ai/
Nano Banana offers powerful AI image editing through natural language prompts. Maintain character consistency and create high-quality visuals effortlessly.
ai image editing, nano banana, natural language, advanced, creati
https://aclanthology.org/2023.findings-emnlp.982/
Guojun Wu. Findings of the Association for Computational Linguistics: EMNLP 2023. 2023.
language barriers, icu, conquering, vision, modeling
https://huggingface.co/papers/2306.09093
Join the discussion on this paper page
language modeling, paper, macaw, llm, multi
https://arxiv.org/abs/2209.06794
Abstract page for arXiv paper 2209.06794: PaLI: A Jointly-Scaled Multilingual Language-Image Model
multilingual language, image model, pali, jointly, scaled
https://imgur.com/a/0G8MCHJ
Discover the magic of the internet at Imgur, a community powered entertainment destination. Lift your spirits with funny jokes, trending memes, entertaining...
english language, image, post, original, facebook
https://www.nature.com/articles/s41550-025-02670-z?error=cookies_not_supported&code=b1e206de-5480-4e3f-9d79-85ce84503c14
Oct 8, 2025 - Large language models can describe and classify changing objects in astronomical images with high accuracy. This enables searches for visual features using...
large language models, textual, interpretation, transient, image
https://www.wolfram.com/language/11/neural-networks/out-of-core-image-classification.html.en?footer=lang
core image, wolfram language, classification, new
https://showmebest.ai/ai-tools/nanobanana-im
Transform photos with text prompts - advanced AI editing with character consistency and scene preservation.
natural language ai, nano banana, image editor