Mauro Sozio - "Fully-Dynamic Decision Trees"

Historically, machine learning (ML) algorithms have been developed assuming the input datasets to be static. However, as new data is collected or removed because being noisy or obsolete, one soon faces the need to efficiently update an ML model without sacrificing its accuracy. Decision trees are a cornerstone of ML and an essential tool in any ML library. Moreover, decision-tree based algorithms such as XGBoost and random forests have been consistently embraced in real-world applications and data challenges. In our talk, we present the first fully-dynamic algorithm that maintains an ``accurate'' decision tree with ``low'' amortized cost, over an arbitrary sequence of insertions and deletions of labeled examples. We also argue that our algorithm is optimal both in terms of memory usage and running time, up to polylogarithmic factors. We conclude our talk with an extensive experimental evaluation on real-world data showing the effectiveness of our approach. Our work (presented at AAAI23) represents one of the first studies on fully-dynamic supervised ML, which has been mostly unexplored so far, to the best of our knowledge.

31/05/2023