AI

Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization

Abstract

We present a framework for cross-lingual transfer of sequence information from a resource-rich source language to a resource-impoverished target language that incorporates soft constraints via posterior regularization. To this end, we use automatically word aligned bitext between the source and target language pair, and learn a discriminative conditional random field model on the target side. Our posterior regularization constraints are derived from simple intuitions about the task at hand and from cross-lingual alignment information. We show improvements over strong baselines for two tasks: part-of-speech tagging and named-entity segmentation.