Shuguang Chen

Ph.D. Candidate, University of Houston


I completed my Ph.D. in Computer Science at the University of Houston in December 2022. I worked in the RiTUAL (Research in Text Understanding and Analysis of Language) lab under the supervision of Dr. Thamar Solorio, and my dissertation focused on named entity recognition on social media.

My research interests lie in Natural Language Processing (NLP), with a special focus on neural sequence labeling. My work aims to overcome linguistic challenges in noisy environments, making NLP models robust and reliable under varied conditions.

Specifically, I'm interested in tackling the following challenges:

  • Sequence labeling on user-generated text
  • Data transformation for model robustness
  • Knowledge transfer in linguistic code-switching


  • [2022.12]: Will give an oral presentation on style transfer as data augmentation at EMNLP 2022.
  • [2022.11]: Successfully defended my Ph.D. dissertation!
  • [2022.10]: 1 paper accepted to EMNLP 2022.
  • [2022.05]: Started my summer internship at Microsoft Research as a Research Intern.
  • [2022.05]: 1 paper accepted to SRW at NAACL 2022.
  • [2022.04]: Honored to be selected as a recipient of the Cullen Graduate Student Success Fellowship!
  • [2022.01]: Successfully defended my proposal and officially became a Ph.D. candidate!
  • [2021.09]: 1 paper accepted to W-NUT at EMNLP 2021.
  • [2021.08]: 1 paper accepted to EMNLP 2021.
  • [2021.06]: Completed the Build Basic Generative Adversarial Networks (GANs) course on Coursera.
  • [2021.06]: Co-organized the fifth CALCS workshop at NAACL 2021.
  • [2021.06]: 1 paper accepted to CALCS at NAACL 2021.
  • [2021.05]: Started my internship at Melax Tech as an NLP Intern.
  • [2021.02]: 1 paper accepted to SocialNLP at NAACL 2021.


  • CALCS 2021 Shared Task: Machine Translation for Code-Switched Data
    Shuguang Chen, Gustavo Aguilar, Anirudh Srinivasan, Mona Diab, Thamar Solorio


  • Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition
    Shuguang Chen, Leonardo Neves, Thamar Solorio
    EMNLP 2022
    [Paper] [Code]

  • A Simple Approach to Jointly Rank Passages and Select Relevant Sentences in the OBQA Context
    Man Luo, Shuguang Chen, Chitta Baral
    NAACL 2022 SRW
    [Paper] [Code]

  • Data Augmentation for Cross-Domain Named Entity Recognition
    Shuguang Chen, Gustavo Aguilar, Leonardo Neves, Thamar Solorio
    EMNLP 2021
    [Paper] [Code]

  • Can images help recognize entities? A study of the role of images for Multimodal NER
    Shuguang Chen, Gustavo Aguilar, Leonardo Neves, Thamar Solorio
    EMNLP 2021 W-NUT
    [Paper] [Code]

  • Mitigating Temporal-Drift: A Simple Approach to Keep NER Models Crisp
    Shuguang Chen, Leonardo Neves, Thamar Solorio
    NAACL 2021 SocialNLP
    [Paper] [Code]

  • Proceedings of the Fifth Workshop on Computational Approaches to Linguistic Code-Switching
    Thamar Solorio, Shuguang Chen, Alan W. Black, Mona Diab, Sunayana Sitaram, Victor Soto, Emre Yilmaz, Anirudh Srinivasan
    NAACL 2021 CALCS