# pathlm_schema **Repository Path**: billy_liu/pathlm_schema ## Basic Information - **Project Name**: pathlm_schema - **Description**: Code for EMNLP 2020 paper `Connecting the Dots: Event Graph Schema Induction with Path Language Modeling` - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2020-12-30 - **Last Updated**: 2020-12-30 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # Cross-media Structured Common Space for Multimedia Event Extraction Table of Contents ================= * [Overview](#overview) * [Requirements](#requirements) * [Data](#data) * [Training](#training) * [Testing](#testing) * [Citation](#citation) ## Overview The code for paper [Connecting the Dots: Event Graph Schema Induction with Path Language Modeling](http://blender.cs.illinois.edu/software/pathlm/).

## Requirements ``` Python=3.7 PyTorch=1.4 ``` ## Data ACE (Text Event Extraction Data): We preprcoessed ACE following [OneIE](http://blender.cs.illinois.edu/software/oneie/). Due to license reason, the ACE 2005 dataset is only accessible to those with LDC2006T06 license, please drop me an email (manling2@illinois.edu) showing your possession of the license for the processed data. ## Training Step 1. Prepare ACE data. Put the preprocessed ACE data under `data/ace`. The example of preprocessed ACE data is `example.json`in `Data.zip`. Step 2. Generate paths. ``` cd data_utils/preprocessing/ace python path_discover.py ``` Step 3. Generate training data for autoregressive language model and neighbor path classfication. ``` cd data_utils/preprocessing/ace python path_tsv_vocab.py ``` Step 4. Train PathLM on two tasks, ``` sh path_xlnet_ft.sh ``` The variant of PathLM removing neighbor path classification can be trained as follows, ``` sh path_xlnet_ft_clm.sh ``` ## Testing ``` cd data_utils/preprocessing/ace python evaluate_path.py ``` ## Citation Manling Li, Qi Zeng, Ying Lin, Kyunghyun Cho, Heng Ji, Jonathan May, Nathanael Chambers, Clare Voss. "Connecting the Dots: Event Graph Schema Induction with Path Language Modeling." In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 684-695. 2020. ``` @inproceedings{li2020connecting, title={Connecting the Dots: Event Graph Schema Induction with Path Language Modeling}, author={Li, Manling and Zeng, Qi and Lin, Ying and Cho, Kyunghyun and Ji, Heng and May, Jonathan and Chambers, Nathanael and Voss, Clare}, booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)}, pages={684--695}, year={2020} } ```