physical-ai
- GRUtopia: Dream General Robots in a City at Scale 01 Feb 2025
- Genesis: A Generative and Universal Physics Engine for Robotics and Beyond 30 Jan 2025
- NVIDIA Cosmos: a world foundation model platform 20 Jan 2025
- Three 3DGS-based Sim2Real Projects 10 Jan 2025
- Fei-Fei Li’s World Labs is giving machines 3D spatial intelligence 06 Jan 2025
3d-modelling
- GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control 06 Mar 2025
- AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation 24 Feb 2025
- GaussianBody: Clothed Human Reconstruction via 3d Gaussian Splatting 01 Feb 2025
- Large-scale 3d neural shape modeling: representation, generation, and controllability 30 Jan 2025
- Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass 25 Jan 2025
- Tencent Hunyuan3D: An open-source high-quality 3D-DiT generative model 22 Jan 2025
- GVA: Reconstructing Vivid 3D Gaussian Avatars from Monocular Videos 06 Jan 2025
- FaceLift: Single Image to 3D Head with View Generation and GS-LRM 06 Jan 2025
llm
- Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review 08 Mar 2025
- Multi-Agent Collaboration Mechanisms: A Survey of LLMs 06 Mar 2025
- BIG-Bench Extra Hard 05 Mar 2025
- Andrej Karpathy LLM talks 05 Mar 2025
- Visual-RFT: Visual Reinforcement Fine-Tuning 05 Mar 2025
- VisualThinker-R1-Zero: First ever R1-Zero's Aha Moment on just a 2B non-SFT Model 05 Mar 2025
- Liquid: Language Models are Scalable and Unified Multi-modal Generators 01 Mar 2025
- LLM Post-Training: A Deep Dive into Reasoning Large Language Models 01 Mar 2025
- Self-rewarding Correction for Mathematical Reasoning 01 Mar 2025
- The Ultra-Scale Playbook: Training LLMs on GPU Clusters 01 Mar 2025
- Empowering innovation: The next generation of the Phi family 28 Feb 2025
- OpenAI: Introducing GPT-4.5 28 Feb 2025
- LangGraph 0.3 Release: Prebuilt Agents 28 Feb 2025
- WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models 24 Feb 2025
- Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review 24 Feb 2025
- Exploring the Potential of Encoder-free Architectures in 3D LMMs 20 Feb 2025
- SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? 20 Feb 2025
- s1: Simple test-time scaling 04 Feb 2025
- KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation 01 Feb 2025
- FLAME: Learning to Navigate with Multimodal LLM in Urban Environments 01 Feb 2025
- Recommender Systems Meet Large Language Model Agents: A Survey 01 Feb 2025
- Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots 01 Feb 2025
- Reasoning Language Models: A Blueprint 30 Jan 2025
- A Survey on LLM-based Multi-Agent System: Recent Advances and New Frontiers in Application 30 Jan 2025
- Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models 25 Jan 2025
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning 22 Jan 2025
- Five Open-source LLM-based Web Scrapers 22 Jan 2025
- Reinforcement Learning Enhanced LLMs: A Survey 06 Jan 2025
rl
- Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models 25 Jan 2025
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning 22 Jan 2025
- Reinforcement Learning Enhanced LLMs: A Survey 06 Jan 2025
survey
- Generative Artificial Intelligence: A Historical Perspective 11 Mar 2025
- Biomedical Foundation Model: A Survey 10 Mar 2025
- Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review 08 Mar 2025
- Generative Artificial Intelligence in Robotic Manipulation: A Survey 08 Mar 2025
- Simulating the Real World: A Unified Survey of Multimodal Generative Models 08 Mar 2025
- Multi-Agent Collaboration Mechanisms: A Survey of LLMs 06 Mar 2025
- When Continue Learning Meets Multimodal Large Language Model: A Survey 06 Mar 2025
- Generative Artificial Intelligence in Robotic Manipulation: A Survey 06 Mar 2025
- LLM Post-Training: A Deep Dive into Reasoning Large Language Models 01 Mar 2025
- Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation 24 Feb 2025
- Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review 24 Feb 2025
- Recommender Systems Meet Large Language Model Agents: A Survey 01 Feb 2025
- A Survey on LLM-based Multi-Agent System: Recent Advances and New Frontiers in Application 30 Jan 2025
- Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models 25 Jan 2025
- Retrieval-Augmented Generation with Graphs (GraphRAG) 20 Jan 2025
- Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG 20 Jan 2025
- Reinforcement Learning Enhanced LLMs: A Survey 06 Jan 2025
time-series
rag
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents 05 Mar 2025
- HippoRAG 2: From RAG to Memory 28 Feb 2025
- PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation 24 Feb 2025
- Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation 24 Feb 2025
- Retrieval-Augmented Generation with Graphs (GraphRAG) 20 Jan 2025
- Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG 20 Jan 2025
- TimeRAF: Retrieval-Augmented Foundation model for Zero-shot Time Series Forecasting 06 Jan 2025
genai
- Generative Artificial Intelligence: A Historical Perspective 11 Mar 2025
- MIT CS Course: Generative AI with Stochastic Differential Equations 11 Mar 2025
- Generative Artificial Intelligence in Robotic Manipulation: A Survey 08 Mar 2025
- Simulating the Real World: A Unified Survey of Multimodal Generative Models 08 Mar 2025
- GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control 06 Mar 2025
- Generating Multi-Image Synthetic Data for Text-to-Image Customization 04 Feb 2025
- SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models 01 Feb 2025
- ComfyUI - Visual storytelling, simplified by AI 01 Feb 2025
- Large-scale 3d neural shape modeling: representation, generation, and controllability 30 Jan 2025
- AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models 25 Jan 2025
- Tencent Hunyuan3D: An open-source high-quality 3D-DiT generative model 22 Jan 2025
- 4M: Massively Multimodal Masked Modeling 06 Jan 2025
codebase
- OpenManus: An open-source framework for building general AI agents 11 Mar 2025
- BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities 11 Mar 2025
- GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control 06 Mar 2025
- Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention 05 Mar 2025
- Visual-RFT: Visual Reinforcement Fine-Tuning 05 Mar 2025
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents 05 Mar 2025
- VisualThinker-R1-Zero: First ever R1-Zero's Aha Moment on just a 2B non-SFT Model 05 Mar 2025
- Self-rewarding Correction for Mathematical Reasoning 01 Mar 2025
- Magma: A Foundation Model for Multimodal AI Agents 28 Feb 2025
- HippoRAG 2: From RAG to Memory 28 Feb 2025
- YOLOv12: Attention-Centric Real-Time Object Detectors 24 Feb 2025
- WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models 24 Feb 2025
- PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation 24 Feb 2025
- AI system predicts protein fragments that can bind to or inhibit a target 24 Feb 2025
- Generating Multi-Image Synthetic Data for Text-to-Image Customization 04 Feb 2025
- s1: Simple test-time scaling 04 Feb 2025
- FLAME: Learning to Navigate with Multimodal LLM in Urban Environments 01 Feb 2025
- Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots 01 Feb 2025
- SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models 01 Feb 2025
- GRUtopia: Dream General Robots in a City at Scale 01 Feb 2025
- ComfyUI - Visual storytelling, simplified by AI 01 Feb 2025
- Genesis: A Generative and Universal Physics Engine for Robotics and Beyond 30 Jan 2025
- Tencent Hunyuan3D: An open-source high-quality 3D-DiT generative model 22 Jan 2025
- Five Open-source LLM-based Web Scrapers 22 Jan 2025
- NVIDIA Cosmos: a world foundation model platform 20 Jan 2025
- Three 3DGS-based Sim2Real Projects 10 Jan 2025
- Controllable Human-Object Interaction Synthesis 06 Jan 2025
- 4M: Massively Multimodal Masked Modeling 06 Jan 2025
pretrained-models
- Empowering innovation: The next generation of the Phi family 28 Feb 2025
- Magma: A Foundation Model for Multimodal AI Agents 28 Feb 2025
- SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models 01 Feb 2025
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning 22 Jan 2025
- Tencent Hunyuan3D: An open-source high-quality 3D-DiT generative model 22 Jan 2025
- NVIDIA Cosmos: a world foundation model platform 20 Jan 2025
- 4M: Massively Multimodal Masked Modeling 06 Jan 2025
embodied-ai
- AgiBot World, the first large-scale robotic learning dataset 11 Mar 2025
- BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities 11 Mar 2025
- Generative Artificial Intelligence in Robotic Manipulation: A Survey 08 Mar 2025
- Generative Artificial Intelligence in Robotic Manipulation: A Survey 06 Mar 2025
- FLAME: Learning to Navigate with Multimodal LLM in Urban Environments 01 Feb 2025
- GRUtopia: Dream General Robots in a City at Scale 01 Feb 2025
- Genesis: A Generative and Universal Physics Engine for Robotics and Beyond 30 Jan 2025
- AgiBot World: the first large-scale robotic learning dataset 10 Jan 2025
- Three 3DGS-based Sim2Real Projects 10 Jan 2025
- Controllable Human-Object Interaction Synthesis 06 Jan 2025
quantum-computing
- PsiQuantum Announces Omega, a Manufacturable Chipset for Photonic Quantum Computing 01 Mar 2025
- Microsoft’s Majorana 1 chip carves new path for quantum computing 20 Feb 2025
- Integrating Quantum Computing Resources into Scientific HPC Ecosystems 10 Jan 2025
collections
- CVPR 2025 Accepted Papers 10 Mar 2025
- AAAI-25 Tutorial and Lab Forum 08 Mar 2025
- Five Open-source LLM-based Web Scrapers 22 Jan 2025
- Three 3DGS-based Sim2Real Projects 10 Jan 2025
dataset
- AgiBot World, the first large-scale robotic learning dataset 11 Mar 2025
- BIG-Bench Extra Hard 05 Mar 2025
- Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention 05 Mar 2025
- Visual-RFT: Visual Reinforcement Fine-Tuning 05 Mar 2025
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents 05 Mar 2025
- Magma: A Foundation Model for Multimodal AI Agents 28 Feb 2025
- AI system predicts protein fragments that can bind to or inhibit a target 24 Feb 2025
- SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? 20 Feb 2025
- Generating Multi-Image Synthetic Data for Text-to-Image Customization 04 Feb 2025
- s1: Simple test-time scaling 04 Feb 2025
- AgiBot World: the first large-scale robotic learning dataset 10 Jan 2025
nlu
- Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models 25 Jan 2025
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning 22 Jan 2025
- Retrieval-Augmented Generation with Graphs (GraphRAG) 20 Jan 2025
- Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG 20 Jan 2025
web-scraping
materials
3D-printing
tbc
- GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control 06 Mar 2025
- Generative Artificial Intelligence in Robotic Manipulation: A Survey 06 Mar 2025
- Liquid: Language Models are Scalable and Unified Multi-modal Generators 01 Mar 2025
- AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation 24 Feb 2025
- GRUtopia: Dream General Robots in a City at Scale 01 Feb 2025
- AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models 25 Jan 2025
ai-ethics
recommender
multimodal
- Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review 08 Mar 2025
- Simulating the Real World: A Unified Survey of Multimodal Generative Models 08 Mar 2025
- When Continue Learning Meets Multimodal Large Language Model: A Survey 06 Mar 2025
- Visual-RFT: Visual Reinforcement Fine-Tuning 05 Mar 2025
- VisualThinker-R1-Zero: First ever R1-Zero's Aha Moment on just a 2B non-SFT Model 05 Mar 2025
- Liquid: Language Models are Scalable and Unified Multi-modal Generators 01 Mar 2025
- Empowering innovation: The next generation of the Phi family 28 Feb 2025
- Magma: A Foundation Model for Multimodal AI Agents 28 Feb 2025
- Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation 24 Feb 2025
- Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review 24 Feb 2025
- Exploring the Potential of Encoder-free Architectures in 3D LMMs 20 Feb 2025
- FLAME: Learning to Navigate with Multimodal LLM in Urban Environments 01 Feb 2025
biomed
- Biomedical Foundation Model: A Survey 10 Mar 2025
- AI system predicts protein fragments that can bind to or inhibit a target 24 Feb 2025
object-detection
- Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention 05 Mar 2025
- YOLOv12: Attention-Centric Real-Time Object Detectors 24 Feb 2025
book
big-data
communication
battery
tutorial
- MIT CS Course: Generative AI with Stochastic Differential Equations 11 Mar 2025
- AAAI-25 Tutorial and Lab Forum 08 Mar 2025
- Andrej Karpathy LLM talks 05 Mar 2025