Data Discovery Agent and Data Markets
Autonomous agents increasingly rely on external data to complete downstream tasks such as model training and decision support. However, existing data discovery systems remain largely retrieval-oriented: they surface candidate datasets from heterogeneous sources, but provide limited support for estimating task-specific utility, selecting cost-effective datasets under budget constraints, or incorporating trustworthy feedback from prior usage. This project presents Guixu, a valuation-driven data discovery system for autonomous agents. Guixu employs a three-phase valuation pipeline with proxy-label propagation and multi-round knapsack optimization for task-aware data valuation. It integrates an agentic payment protocol to enable budget-constrained data procurement workflows, and it leverages an on-chain data market and attestation signals for verifiable data discovery. Our demonstration highlights how Guixu enables an agent to move beyond keyword-based dataset retrieval toward task- and budget-aware, trustworthy data discovery and procurement. Attendees can interactively explore the full workflow, from natural-language task specification and multi-source search to data valuation and verifiable transaction feedback.
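To make the budget-constrained selection step more concrete, the sketch below shows a single round of classic 0/1 knapsack selection over candidate datasets. It is an illustration only, not Guixu's implementation: the function name `select_datasets`, the integer costs, and the utility scores are all hypothetical, and the description above does not specify how the multi-round procedure or the valuation scores are actually computed.

```python
from typing import List, Tuple


def select_datasets(
    candidates: List[Tuple[str, int, float]], budget: int
) -> Tuple[float, List[str]]:
    """Pick datasets that maximize total estimated utility within a budget.

    candidates: (name, integer cost, estimated utility) triples (hypothetical).
    Returns the best total utility and the names of the chosen datasets.
    """
    n = len(candidates)
    # best[i][b] = max utility using the first i candidates with budget b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i, (_, cost, utility) in enumerate(candidates, start=1):
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip candidate i
            if cost <= b:
                with_item = best[i - 1][b - cost] + utility
                if with_item > best[i][b]:
                    best[i][b] = with_item  # option 2: buy candidate i

    # Walk the table backwards to recover which datasets were chosen.
    chosen: List[str] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            name, cost, _ = candidates[i - 1]
            chosen.append(name)
            b -= cost
    chosen.reverse()
    return best[n][budget], chosen


if __name__ == "__main__":
    demo = [("a", 4, 0.5), ("b", 3, 0.4), ("c", 2, 0.3)]
    print(select_datasets(demo, budget=5))
```

In this toy instance, two cheaper datasets together outscore the single most useful one under the same budget, which is the kind of trade-off that keyword-based retrieval alone cannot express. A multi-round variant could plausibly re-estimate utilities after each purchase and re-run the selection on the remaining budget, but that is an assumption rather than a description of the system.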
Codebase
Publications
Guixu: Valuation-Driven Data Discovery for Autonomous AI Agents with On-Chain Attestation
Yifan Wu, Yuchen Peng, Jiaqi Chai, Yufei Qian, Xilin Li, Ke Chen, Lidan Shou
VLDB 2026, 52nd International Conference on Very Large Data Bases
Code
Token Economics for LLM Agents: A Dual-View Study from Computing and Economics
Yuxi Chen, Junming Chen, Chenyu He, Yiwei Li, Yicheng Ji, Yifan Wu, Dingyu Yang, Lansong Diao, Lidan Shou, Hongliang Zhang, Huan Li, Gang Chen
arXiv:2605.09104, 2026
Team
- Lidan Shou (Advisor)
- Ke Chen
- Yifan Wu