Semantic Parsing & Code Generation
Research on improving semantic parsing and code generation through neural approaches
This project focuses on advancing semantic parsing and code generation through novel neural architectures, training techniques, and data synthesis approaches.
Natural Language to Code/SQL
Code Generation with Monolingual Data

We showed that large amounts of monolingual (unpaired) programming-language data can substantially improve code generation, reaching state-of-the-art performance with minimal task-specific architecture design.
TURING: Cross-Domain Natural Language Database Interface
We developed TURING, an interpretable multi-hypothesis system for Text-to-SQL parsing that achieves 75.1% execution accuracy on Spider. The system provides natural language explanations of SQL queries to help users select the correct interpretation.
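Execution accuracy, the metric quoted above, counts a prediction as correct when running it against the database returns the same result as the reference query, even if the SQL text differs. The sketch below is a simplified illustration of that idea using Python's built-in `sqlite3` module. It is not Spider's official evaluation script and not TURING's code: the `singer` table, the helper names, and the choice to ignore row order are all assumptions made for the example.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries run and return the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as wrong.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Simplification: compare as multisets, so row order is ignored.
    return Counter(predicted_rows) == Counter(gold_rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) query pairs whose results match."""
    matches = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return matches / len(pairs)


# Toy database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Ben', 45), (3, 'Cy', 27);
""")

pairs = [
    # Different SQL text, same result: counts as correct.
    ("SELECT name FROM singer WHERE age >= 31",
     "SELECT name FROM singer WHERE age > 30"),
    # Wrong filter: counts as incorrect.
    ("SELECT name FROM singer WHERE age < 30",
     "SELECT name FROM singer WHERE age > 30"),
    # Invalid column name: fails to execute, counts as incorrect.
    ("SELECT nam FROM singer",
     "SELECT name FROM singer"),
]
accuracy = execution_accuracy(conn, pairs)
```

Only the first of the three pairs matches here, so `accuracy` is one third. A fuller evaluation would also need to respect row order when the reference query contains `ORDER BY`.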
Training & Architecture Innovations
Optimizing Deeper Transformers on Small Datasets

We introduced DT-Fixup, an initialization and learning-rate scheme that enables training deep transformers on small datasets without learning-rate warmup or layer normalization, yielding both better performance and faster training.
Globally Normalized Neural Model
We proposed a semantic parsing model that normalizes scores globally over complete outputs rather than locally at each decoding step, which helps avoid the label bias problem and improves parsing accuracy on small datasets.
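To make the contrast between local and global normalization concrete, here is a toy sketch. It does not reproduce the model described above; the two-step decoder and all scores are invented for illustration. With local normalization, a prefix that has only one possible continuation passes all of its probability forward no matter how poorly that continuation scores, which is the label bias problem. Normalizing once over complete sequences lets the poor score count.

```python
import math

# Hypothetical per-step scores for a two-step decoder.
# Keys are prefixes (tuples); values map next token -> unnormalized score.
STEP_SCORES = {
    (): {"A": 1.0, "B": 0.5},
    ("A",): {"x": -3.0},            # only one continuation, and it scores poorly
    ("B",): {"y": 2.0, "z": -1.0},
}


def complete_sequences():
    """Enumerate every complete two-token sequence."""
    return [
        (first, second)
        for first in STEP_SCORES[()]
        for second in STEP_SCORES[(first,)]
    ]


def softmax(scores):
    """Normalize a dict of scores into probabilities."""
    peak = max(scores.values())
    exps = {key: math.exp(value - peak) for key, value in scores.items()}
    total = sum(exps.values())
    return {key: e / total for key, e in exps.items()}


def local_probs():
    """Product of per-step softmaxes: each step is normalized on its own."""
    probs = {}
    for seq in complete_sequences():
        p = 1.0
        for t, token in enumerate(seq):
            p *= softmax(STEP_SCORES[seq[:t]])[token]
        probs[seq] = p
    return probs


def global_probs():
    """One softmax over the summed scores of complete sequences."""
    totals = {
        seq: sum(STEP_SCORES[seq[:t]][token] for t, token in enumerate(seq))
        for seq in complete_sequences()
    }
    return softmax(totals)


local = local_probs()
glob = global_probs()
best_local = max(local, key=local.get)    # ("A", "x")
best_global = max(glob, key=glob.get)     # ("B", "y")
```

The locally normalized model prefers `("A", "x")` because the score of -3.0 is normalized away at the second step, while the globally normalized model prefers `("B", "y")`. Enumerating every sequence is only feasible for a toy example like this one; a real parser would have to approximate the global normalizer, for instance over a beam of candidates.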
Data Generation & Synthesis
Hierarchical Neural Data Synthesis

We developed a purely neural approach for synthesizing semantic parsing training data that removes the need for grammar engineering while achieving higher accuracy. The method enables zero-shot synthesis using only schema information.
Impact
This research has advanced semantic parsing and code generation by:
- Developing more efficient training methods for deep transformers
- Creating techniques for leveraging unlabeled data effectively
- Building interpretable multi-hypothesis parsing systems
- Enabling zero-shot data synthesis for new domains
- Improving model architectures through global normalization
These methods have been applied to code generation, Text-to-SQL parsing, and semantic parsing across different domains, making these technologies more practical and accessible.