EMTerms 1.0: A Terminological Resource for Crisis Tweets

Irina Temnikova, Carlos Castillo, Sarah Vieweg

ABSTRACT

We present the first release of EMTerms (Emergency Management Terms), the largest crisis-related terminological resource to date, containing over 7,000 terms used on Twitter to describe various crises. Practitioners can use this resource to search for relevant messages on Twitter during crises, and computer scientists can use it to develop new automatic methods for processing crisis-related tweets.

The terms were derived from a seed set manually annotated by a linguist and an emergency manager in tweets broadcast during 4 crisis events. A Conditional Random Fields (CRF) method was then applied to tweets from 35 crisis events to expand the set of terms while overcoming the difficulty of obtaining further annotations from emergency managers.

The terms are classified into 23 information-specific categories using a combination of expert annotations and crowdsourcing. This article presents the detailed terminology extraction methodology, as well as the final results.
