Submodular functions capture the diminishing-returns property, which appears in many application areas, including information summarization, sensor placement, and viral marketing. Optimizing submodular functions has a rich history in mathematics and operations research, and recently the subject has received increased attention due to the prevalent role of submodular functions in a broad range of data-science applications. In this talk we will discuss two recent projects on the topic of interpretable classification, both of which make interesting connections with submodular optimization. In the first project, we address the problem of multi-label classification via concise and discriminative rule sets. Submodularity is used to account for diversity, which helps avoid redundancy and thus control the number of rules in the solution set. In the second project, we aim to find accurate decision trees of small size, which are therefore interpretable. We study a general family of impurity functions, including the popular entropy and Gini-index functions, and show that a simple enhancement, relying on the framework of adaptive submodular ranking, yields a logarithmic approximation guarantee on the tree complexity.
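For readers unfamiliar with the term, the diminishing-returns property mentioned above corresponds to the standard definition of submodularity (the notation below is not from the abstract itself): a set function $f : 2^V \to \mathbb{R}$ over a ground set $V$ is submodular if, for all $A \subseteq B \subseteq V$ and every element $e \in V \setminus B$,
\[
  f(A \cup \{e\}) - f(A) \;\ge\; f(B \cup \{e\}) - f(B),
\]
that is, adding an element to a smaller set yields at least as much marginal gain as adding it to a larger set.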