Papers

1. Transfer Learning for Sequence Labeling Using Source Model and Target Data

Figure 2: Our Proposed Neural Adapter

This paper proposes a progressive sequence labeling model that consists of two main parts: a tagger pre-trained on the source data, and a neural adapter (Figure 2) that feeds the source model's predictions into the target model (a sketch follows the excerpt below):

The surface form of a new category type may already appear in the source data (DSDS), but it is not annotated with that label, because at that point it was not yet considered a concept to be recognized.
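As a rough illustration of the two parts, here is a minimal sketch of a neural adapter that maps a frozen source tagger's per-token label distributions into the target tagger's label space. The module names, dimensions, and the additive combination at the end are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class NeuralAdapter(nn.Module):
    """BiLSTM adapter over the source tagger's per-token label distributions
    (a sketch; the real adapter's inputs and wiring may differ)."""
    def __init__(self, num_src_labels, hidden_dim, num_tgt_labels):
        super().__init__()
        self.bilstm = nn.LSTM(num_src_labels, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden_dim, num_tgt_labels)

    def forward(self, src_label_probs):
        # src_label_probs: (batch, seq_len, num_src_labels)
        h, _ = self.bilstm(src_label_probs)
        return self.proj(h)

# Hypothetical usage: combine adapter scores with the target tagger's logits.
# logits = target_model(tokens) + adapter(source_model(tokens).softmax(-1))
```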

2. Adversarial Active Learning for Sequences Labeling and Generation

This paper, published at IJCAI 2018, applies active learning to sequence problems. Most existing active learning methods rely on probability-based classifiers, which are ill-suited to sequences (the space of label sequences is too large); the authors propose an adversarial-learning framework to address this.

Figure 1: An overview of Adversarial Active Learning for sequences (ALISE). The black and blue arrows respectively indicate flows for labeled and unlabeled samples.

As in a GAN, training alternates between two steps (a sketch of the loop follows below):

  1. Encoder & decoder: the encoder M and the task decoder are updated on labeled data while trying to fool the discriminator; mathematically, this step encourages the discriminator D to output a score of 1 for both $z_{L}$ and $z_{U}$.
  2. Discriminator: D is updated to separate the two pools, scoring labeled encodings $z_{L}$ as 1 and unlabeled encodings $z_{U}$ as 0.

Therefore, the score from this discriminator already serves as an informativeness similarity score that can be used directly in Eq. 7.
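A minimal sketch of this alternating loop, assuming M (encoder), C (task decoder), and D (discriminator with sigmoid output) are PyTorch modules with the obvious shapes; the module definitions and the unweighted loss sum are assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def train_step(M, C, D, opt_MC, opt_D, x_lab, y_lab, x_unl):
    # Step 1: encoder & decoder. Supervised task loss on labeled data, plus an
    # adversarial term pushing D toward score 1 for both z_L and z_U.
    z_lab, z_unl = M(x_lab), M(x_unl)
    task_loss = F.cross_entropy(C(z_lab).transpose(1, 2), y_lab)
    d_lab, d_unl = D(z_lab), D(z_unl)
    adv_loss = F.binary_cross_entropy(d_lab, torch.ones_like(d_lab)) \
             + F.binary_cross_entropy(d_unl, torch.ones_like(d_unl))
    opt_MC.zero_grad()
    (task_loss + adv_loss).backward()
    opt_MC.step()

    # Step 2: discriminator. D learns to output 1 for labeled encodings
    # and 0 for unlabeled ones (the encoder is frozen via detach()).
    d_lab, d_unl = D(M(x_lab).detach()), D(M(x_unl).detach())
    d_loss = F.binary_cross_entropy(d_lab, torch.ones_like(d_lab)) \
           + F.binary_cross_entropy(d_unl, torch.zeros_like(d_unl))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()
```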

After training, every unlabeled sample is passed through M and D to obtain a matching score:

The samples with the lowest scores should be sent out for labeling, because they carry the most valuable information complementary to the current labeled data.
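A sketch of this selection step, assuming D produces one scalar score per sample; the function name and the `budget` parameter are illustrative, not from the paper.

```python
import torch

@torch.no_grad()
def select_for_labeling(M, D, x_unlabeled, budget):
    # Matching score per sample: how "labeled-like" D considers each encoding.
    scores = D(M(x_unlabeled)).view(-1)
    # Ascending sort: the least labeled-like samples are the most informative.
    order = torch.argsort(scores)
    return order[:budget]  # indices of samples to send to annotators
```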

3. Zero-Shot Adaptive Transfer for Conversational Language Understanding

The proposed Zero-Shot Adaptive Transfer model (ZAT) borrows from zero-shot learning: whereas traditional sequence labeling treats the slot types as prediction outputs, ZAT takes the slot descriptions as model input, as illustrated below:

Figure 1: (a) Traditional slot tagging approaches with the BIO representation. (b) For each slot, zero-shot models independently detect spans that contain values for the slot. Detected spans are then merged to produce a final prediction.

For a given utterance, the model is run independently once per slot type, and the per-slot predictions are then merged into the final output (see the sketch below). The authors assume that the semantic information in slot descriptions is shared across domains; on this assumption, one can train a source model on abundant source-domain data and then fine-tune it on a small amount of target data, with no explicit slot alignment required.
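A sketch of the per-slot prediction loop from Figure 1(b): the same utterance is tagged once per slot, conditioned on that slot's description, and the per-slot spans are merged. Here `zat_model`, `slot_descriptions`, and the keep-the-highest-score merge rule are assumptions for illustration, not the paper's exact procedure.

```python
def predict(zat_model, tokens, slot_descriptions):
    """Run one BIO tagging pass per slot type, then merge the results."""
    merged = ["O"] * len(tokens)
    best_score = [float("-inf")] * len(tokens)
    for slot, description in slot_descriptions.items():
        # Each pass detects spans for a single slot, given its description.
        # Assumed to return per-token tags in {B, I, O} with confidence scores.
        tags, scores = zat_model(tokens, description)
        for i, (tag, score) in enumerate(zip(tags, scores)):
            # When slots overlap, keep the highest-scoring non-O tag.
            if tag != "O" and score > best_score[i]:
                merged[i] = f"{tag}-{slot}"
                best_score[i] = score
    return merged
```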

Figure 2: Network architecture for the Zero-Shot Adaptive Transfer model.

4. Improving Domain Adaptation Translation with Domain Invariant and Specific Information

The training stage uses a scheme different from the papers above and deserves attention.

Insight