40 Results
Type | Time | Title | Authors
Workshop | | OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text - Poster | Keiran Paster · Marco Dos Santos · Zhangir Azerbayev · Jimmy Ba
Poster | Thu 8:45 | Unified Embedding: Battle-Tested Feature Representations for Web-Scale ML Systems | Benjamin Coleman · Wang-Cheng Kang · Matthew Fahrbach · Ruoxi Wang · Lichan Hong · Ed Chi · Derek Cheng
Poster | Thu 15:00 | OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents | Hugo Laurençon · Lucile Saulnier · Leo Tronchon · Stas Bekman · Amanpreet Singh · Anton Lozhkov · Thomas Wang · Siddharth Karamcheti · Alexander Rush · Douwe Kiela · Matthieu Cord · Victor Sanh
Workshop | | Deploying Reinforcement Learning based Economizer Optimization at Scale | Ivan Cui · Wei Yih Yap · Charles Prosper · Bharathan Balaji · Jake Chen
Workshop | | Exploring Dataset-Scale Indicators of Data Quality | Benjamin Feuer · Chinmay Hegde
Workshop | Fri 11:30 | OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text |
Workshop | | OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text - Oral |
Workshop | | Navigating Dataset Documentation in ML: A Large-Scale Analysis of Dataset Cards on Hugging Face | Xinyu Yang · Weixin Liang · James Zou
Poster | Thu 15:00 | SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model | Di Wang · Jing Zhang · Bo Du · Minqiang Xu · Lin Liu · Dacheng Tao · Liangpei Zhang
Poster | Wed 15:00 | VCC: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens | Zhanpeng Zeng · Cole Hawkins · Mingyi Hong · Aston Zhang · Nikolaos Pappas · Vikas Singh · Shuai Zheng
Poster | Thu 15:00 | A Massive Scale Semantic Similarity Dataset of Historical English | Emily Silcock · Abhishek Arora · Melissa Dell
Poster | Tue 8:45 | Expanding Small-Scale Datasets with Guided Imagination | Yifan Zhang · Daquan Zhou · Bryan Hooi · Kai Wang · Jiashi Feng