Towards Efficient Online Topic Detection through Automated Bursty Feature Detection from Arabic Twitter Streams
Detecting trending topics or events on Twitter is an active research area. The first step in detecting such topics is to efficiently capture textual features that appear at an unusually high rate during a specific time frame. Previous work refers to this step as "detecting bursty features". In this paper, TF-IDF, entropy, and stream chunking are adapted into a new technique for detecting bursty features in an Arabic Twitter stream. In experiments, the bursty features extracted from Twitter streams were compared with Twitter's trending hashtags and with headlines from local news agencies published during the time frame in which the tweets were collected. The comparison shows a great deal of overlap, indicating that the presented algorithm is capable of detecting meaningful bursty features.
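
To make the three ingredients named above concrete, the following is a minimal illustrative sketch and not the algorithm presented in this paper. It assumes that the stream is cut into fixed-size chunks, that each chunk is treated as one document for TF-IDF, and that a term's entropy is computed over its distribution across chunks, so that a term concentrated in the latest chunk scores high. The function names, the whitespace tokenizer, and the scoring formula `tf * idf / (1 + entropy)` are all assumptions made for illustration.

```python
import math
from collections import Counter


def chunk_stream(tweets, chunk_size):
    """Split a list of tweets into consecutive fixed-size chunks."""
    return [tweets[i:i + chunk_size] for i in range(0, len(tweets), chunk_size)]


def term_counts(chunk):
    """Count whitespace-separated tokens in one chunk (assumed tokenizer)."""
    counts = Counter()
    for tweet in chunk:
        counts.update(tweet.split())
    return counts


def bursty_terms(tweets, chunk_size, top_k=3):
    """Rank terms of the latest chunk by a hypothetical burstiness score."""
    chunks = chunk_stream(tweets, chunk_size)
    if not chunks:
        return []
    per_chunk = [term_counts(c) for c in chunks]
    current = per_chunk[-1]
    total_current = sum(current.values())
    scores = {}
    for term, count in current.items():
        # Term frequency within the latest chunk.
        tf = count / total_current
        # Inverse document frequency, with each chunk acting as a document.
        df = sum(1 for c in per_chunk if term in c)
        idf = math.log(len(per_chunk) / df)
        # Shannon entropy of the term's distribution across chunks:
        # low entropy means the term is concentrated in few chunks.
        freqs = [c[term] for c in per_chunk if c[term] > 0]
        total = sum(freqs)
        entropy = -sum((f / total) * math.log2(f / total) for f in freqs)
        # Assumed combination: high TF-IDF and low entropy suggest a burst.
        scores[term] = tf * idf / (1.0 + entropy)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, score in ranked[:top_k] if score > 0]
```

Under these assumptions, a term that appears in every chunk receives an IDF of zero and is never reported, while a term that appears only in the latest chunk has zero entropy and is ranked by its TF-IDF alone. A real Arabic pipeline would also need normalization and tokenization steps that this sketch omits.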