Poster
Differentially Private n-gram Extraction
Kunho Kim · Sivakanth Gopi · Janardhan Kulkarni · Sergey Yekhanin
We revisit the problem of $n$-gram extraction in the differential privacy setting. In this problem, given a corpus of private text data, the goal is to release as many $n$-grams as possible while preserving user-level privacy. Extracting $n$-grams is a fundamental subroutine in many NLP applications, such as sentence completion and auto-response generation for emails. The problem also arises in other applications, such as sequence mining and trajectory analysis, and is a generalization of the recently studied differentially private set union (DPSU) problem of Gopi et al. (2020). In this paper, we develop a new differentially private algorithm for this problem which, in our experiments, significantly outperforms the state of the art. Our improvements stem from combining recent advances in DPSU, privacy accounting, and new heuristics for pruning in the tree-based approach initiated by Chen et al. (2012).
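To make the problem statement concrete, the following is a minimal sketch of a generic noisy-count-and-threshold baseline for releasing $n$-grams. It is not the algorithm developed in the paper. The function names, the per-user cap `max_per_user`, and the parameters `noise_std` and `threshold` are all assumptions introduced for illustration. Calibrating the noise and the threshold to a target $(\varepsilon, \delta)$ guarantee is deliberately left out of the sketch.

```python
import random
from collections import Counter


def extract_ngrams(tokens, n):
    """Return the set of distinct n-grams (as tuples) in one token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def noisy_ngram_release(user_texts, n, max_per_user, noise_std, threshold, rng=None):
    """Illustrative baseline (hypothetical helper, not the paper's method).

    Each user contributes at most `max_per_user` distinct n-grams, which
    bounds the influence of any single user on the counts. Gaussian noise
    is added to each count, and only n-grams whose noisy count exceeds
    `threshold` are released. Choosing `noise_std` and `threshold` to meet
    a specific (epsilon, delta) guarantee is NOT handled here.
    """
    rng = rng or random.Random()
    counts = Counter()
    for tokens in user_texts:
        grams = sorted(extract_ngrams(tokens, n))
        rng.shuffle(grams)  # pick a random subset when the user has too many
        for gram in grams[:max_per_user]:
            counts[gram] += 1

    released = set()
    for gram, count in counts.items():
        if count + rng.gauss(0.0, noise_std) > threshold:
            released.add(gram)
    return released
```

Only $n$-grams that appear in at least one user's data are ever considered, so rare $n$-grams must be suppressed by the threshold; this is why the number of released $n$-grams depends heavily on how the noise and threshold are chosen.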
Author Information
Kunho Kim (Microsoft)
Sivakanth Gopi (Microsoft Research)
Sivakanth Gopi is a senior researcher in the Algorithms group at Microsoft Research Redmond. He is interested in Coding Theory and Differential Privacy.
Janardhan Kulkarni (Microsoft Research)
Sergey Yekhanin (Microsoft)
More from the Same Authors
- 2021 Spotlight: Numerical Composition of Differential Privacy
  Sivakanth Gopi · Yin Tat Lee · Lukas Wutschitz
- 2021 Spotlight: Private Non-smooth ERM and SCO in Subquadratic Steps
  Janardhan Kulkarni · Yin Tat Lee · Daogao Liu
- 2022 Poster: When Does Differentially Private Learning Not Suffer in High Dimensions?
  Xuechen Li · Daogao Liu · Tatsunori Hashimoto · Huseyin A. Inan · Janardhan Kulkarni · Yin-Tat Lee · Abhradeep Guha Thakurta
- 2022 Poster: Differentially Private Model Compression
  FatemehSadat Mireshghallah · Arturs Backurs · Huseyin A. Inan · Lukas Wutschitz · Janardhan Kulkarni
- 2021 Poster: Private Non-smooth ERM and SCO in Subquadratic Steps
  Janardhan Kulkarni · Yin Tat Lee · Daogao Liu
- 2021 Poster: Fast and Memory Efficient Differentially Private-SGD via JL Projections
  Zhiqi Bu · Sivakanth Gopi · Janardhan Kulkarni · Yin Tat Lee · Judy Hanwen Shen · Uthaipon Tantipongpipat
- 2021 Poster: Numerical Composition of Differential Privacy
  Sivakanth Gopi · Yin Tat Lee · Lukas Wutschitz
- 2019 Poster: An Algorithmic Framework For Differentially Private Data Analysis on Trusted Processors
  Janardhan Kulkarni · Olga Ohrimenko · Bolin Ding · Sergey Yekhanin · Joshua Allen · Harsha Nori
- 2019 Poster: Locally Private Gaussian Estimation
  Matthew Joseph · Janardhan Kulkarni · Jieming Mao · Steven Wu
- 2017 Poster: Collecting Telemetry Data Privately
  Bolin Ding · Janardhan Kulkarni · Sergey Yekhanin
- 2017 Poster: Clustering Billions of Reads for DNA Data Storage
  Cyrus Rashtchian · Konstantin Makarychev · Miklos Racz · Siena Ang · Djordje Jevdjic · Sergey Yekhanin · Luis Ceze · Karin Strauss
- 2017 Spotlight: Clustering Billions of Reads for DNA Data Storage
  Cyrus Rashtchian · Konstantin Makarychev · Miklos Racz · Siena Ang · Djordje Jevdjic · Sergey Yekhanin · Luis Ceze · Karin Strauss