Machine learning and data science competitions, wherein
contestants submit predictions about held-out data points, are an
increasingly common way to gather information and identify experts.
One of the most prominent platforms is Kaggle, which has run
competitions with prizes up to 3 million USD. The traditional
mechanism for selecting the winner is simple: score each prediction on
each held-out data point, and the contestant with the highest total
score wins. Perhaps surprisingly, this reasonable and popular
mechanism can incentivize contestants to submit wildly inaccurate
predictions. The talk will begin with intuition for the incentive
issues and what sort of strategic behavior one would expect, and
when. One takeaway is that, despite conventional wisdom, large
held-out data sets do not always alleviate these incentive issues, and
small ones do not necessarily suffer from them, as we confirm with
formal results. We will then discuss a new mechanism that is
approximately truthful, in the sense that rational contestants will
submit predictions that are close to their best guess. If time
permits, we will see how the same mechanism solves an open question
for online learning from strategic experts.
27/01/2025
The seminar is MANDATORY for PhD students!
When: January 27th 2025 at 10:00
Where: Room 1L, via del Castro Laurenziano
Rafael (Raf) Frongillo is an Associate Professor of Computer
Science at the University of Colorado Boulder. His research lies at
the interface between theoretical machine learning and economics,
primarily focusing on information elicitation mechanisms, which
incentivize humans or algorithms to predict accurately. Before
Boulder, Raf was a postdoc at the Center for Research on Computation
and Society at Harvard University and at Microsoft Research New York.
He received his PhD in Computer Science from UC Berkeley, advised by
Christos Papadimitriou and supported by the NDSEG Fellowship.