TIMELINE GENERATION AND RECAPITULATION OF PROGRESSIVE TWEET STREAMS IN A DISTRIBUTED SYSTEM
Authors: Amila.H | Kirthana.S | Nithra.G | Neelamegam.
Volume 03 Issue 02 Year 2016 ISSN No: 2349-3828 Page no: 17-22
Short messages such as tweets are created and shared at an unprecedented rate. In raw form, tweets are informative but also overwhelming: they contain a large amount of noise and redundancy, which burdens both end users and data analysts. To address this problem, a novel continuous summarization framework called Sumblr (continuouS sUMmarization By stream cLusteRing) is used. Here, a multi-topic version of Sumblr summarizes and clusters large datasets in a distributed system. Traditional summarization methods focus on static, small-scale datasets on a single system, whereas Sumblr is designed to handle dynamic, fast-arriving, large-scale datasets. The framework has three major components. First, tweets are clustered with a tweet stream clustering algorithm that maintains concise statistics in a data structure called the Tweet Cluster Vector (TCV). Second, a TCV-Rank summarization technique generates online summaries and historical summaries of arbitrary time durations. Third, a topic evolution detection method monitors summary-based variations to produce timelines automatically. Experiments on large-scale real tweets demonstrate the efficiency and effectiveness of the framework.
Tweet stream, continuous summarization, historical summary, online summary.
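The first component described above, clustering a tweet stream while keeping only per-cluster statistics, can be illustrated with a minimal sketch. It is not the Sumblr algorithm itself: the `ClusterSummary` class is a hypothetical, simplified stand-in for a Tweet Cluster Vector, and its fields, the whitespace tokenizer, the cosine similarity measure, and the `threshold` value are all assumptions made for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real preprocessing)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class ClusterSummary:
    """Hypothetical, simplified stand-in for a Tweet Cluster Vector."""
    term_sum: Counter = field(default_factory=Counter)  # summed term counts
    size: int = 0            # number of tweets absorbed so far
    time_sum: float = 0.0    # sum of timestamps (gives a mean arrival time)
    samples: list = field(default_factory=list)  # a few member tweets

    def add(self, text, timestamp, max_samples=3):
        """Fold one tweet into the running statistics."""
        self.term_sum.update(tokenize(text))
        self.size += 1
        self.time_sum += timestamp
        if len(self.samples) < max_samples:
            self.samples.append(text)


def cluster_stream(stream, threshold=0.3):
    """Assign each (timestamp, text) pair to the most similar cluster,
    or open a new cluster when no similarity reaches the threshold."""
    clusters = []
    for timestamp, text in stream:
        vector = Counter(tokenize(text))
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vector, cluster.term_sum)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is None or best_sim < threshold:
            best = ClusterSummary()
            clusters.append(best)
        best.add(text, timestamp)
    return clusters
```

Because each tweet is folded into running sums and then discarded (apart from a few samples), memory grows with the number of clusters rather than the number of tweets, which is the property that makes this style of summary suitable for fast-arriving streams.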
- Mehrotra, “On Summarization and Timeline Generation for Evolutionary Tweet Streams,” IEEE, 2015.
- Yang Gao, Yue Xu, and Yuefeng Li, “Pattern-Based Topics for Document Modeling in Information Filtering,” IEEE, 2015.
- Maryam Habibi and Andrei Popescu-Belis, “Keyword Extraction and Clustering for Document Recommendation in Conversations,” IEEE, 2015.
- Altug Akay, Andrei Dragomir, Erlandson, “Network-Based Modeling and Intelligent Data Mining of Social Media for Improving Care,” IEEE, 2015.