Daily AI Digest - 2026-05-31

Image Generation/Editing

arXiv

GitHub

HuggingFace Datasets

MONET (Massive, Open, Non-redundant and Enriched Text-to-image dataset) is a large-scale, curated image-text dat...

Video Generation/Editing

arXiv

GitHub

HuggingFace Models

Audio Generation

arXiv

GitHub

HuggingFace Models

Large Language Models

arXiv

GitHub

HuggingFace Models

HuggingFace Datasets

📦 UltraData Collection | 🌐 UltraData | 🤗 MiniCPM5 Series

    📚 Introduction

Ult...

📜 Ultra-FineWeb Technical Report | 📦 UltraData Collection | 🌐 UltraData | 🤗 MiniCPM5 Series

...

Ended up with some tokens to burn on a Claude Max plan. Assembly began during 4.6 and moved to 4.7. Model is tagged. The develop...

15 trillion tokens of the finest data the 🌐 web has to offer

    What is it?

The 🍷 FineWeb dataset consists of m...

Multimodal Models

arXiv

GitHub

HuggingFace Models

Reinforcement Learning

arXiv

GitHub

World Action Model

arXiv

GitHub


Generated automatically by Daily AI Digest Agent. Generated at: 2026-05-31 01:00:24