Using a proxy variable in place of an unobserved categorical target variable can introduce bias. The literature addresses this mainly through weighting adjustments and model-based imputation, both of which are widely used. In many cases, however, a proxy whose overall distribution is biased still provides accurate individual-level predictions for most of the population. This work introduces a deterministic, distribution-consistent alternative to standard adjustment methods. The core idea is to treat proxy variable correction as a constrained reconstruction problem: the proxy is adjusted to match the target distribution while preserving, as far as possible, the individual-level values it takes. The approach extends classical Optimal Transport solutions beyond simple alignment of marginal distributions. Whereas standard Optimal Transport seeks a cost-minimizing transport plan between two distributions, here the problem is reformulated to include supervised information from the empirical confusion matrix between the proxy and target variables. The result is a three-dimensional transportation framework in which transitions are constrained by both the marginal totals and the observed measurement structure. An application to educational attainment data demonstrates that the adjusted proxies are interpretable, reproducible, and statistically valid for both aggregate and individual-level analyses.
22 May 2026, 12:00
Fabrizio Solari
ISTAT
In person: Room V (4th floor), building CU002 Scienze Statistiche
Webinar: https://uniroma1.zoom.us/j/83625004899?pwd=bXCtz0mp759PUh2lkqT0BUoVa0Uegg.1
Meeting ID: 836 2500 4899
Passcode: 123456