Analyzing the Life-Cycle of a News Story
Updated: February 11, 2010
Each video shows an example of the data analysis process for the set of articles (e.g. news, blogs, or tweets) corresponding to a specific story within the Thoora platform. Green markers show individual articles, placed on the date on which they were published. The green bars show the total number of articles received on a given date. Using signal processing and statistical analysis, the platform automatically detects events of interest in the timeline of the story. These stand out as large blue peaks. Additionally, each separate event is modeled in terms of its importance and duration; the model is depicted by the red curves, one for each event.
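To make the idea concrete, here is a minimal sketch of this kind of timeline analysis. It is not the platform's actual algorithm, which is not described here. It assumes a simple moving average for smoothing, a local-maximum rule for detecting events, and width at half height as a stand-in for duration. All function names and the min_height threshold are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(article_dates):
    """Count articles per day over the full span, including empty days."""
    counts = Counter(article_dates)
    start, end = min(counts), max(counts)
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return days, [counts.get(d, 0) for d in days]


def smooth(values, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def find_events(values, min_height):
    """Return (peak_index, height, duration_in_days) for each local maximum.

    Height stands in for importance. Duration is the number of consecutive
    days around the peak whose value is at least half the peak height.
    """
    events = []
    n = len(values)
    for i, v in enumerate(values):
        left = values[i - 1] if i > 0 else float("-inf")
        right = values[i + 1] if i < n - 1 else float("-inf")
        # Strict on the left, non-strict on the right: a flat top counts once.
        if v >= min_height and v > left and v >= right:
            lo = i
            while lo > 0 and values[lo - 1] >= v / 2:
                lo -= 1
            hi = i
            while hi < n - 1 and values[hi + 1] >= v / 2:
                hi += 1
            events.append((i, v, hi - lo + 1))
    return events


if __name__ == "__main__":
    # Hypothetical story: one article on Jan 1, a burst on Jan 3, a tail on Jan 4.
    dates = [date(2010, 1, 1)] + [date(2010, 1, 3)] * 6 + [date(2010, 1, 4)] * 2
    days, counts = daily_counts(dates)
    for index, height, duration in find_events(smooth(counts), min_height=2):
        print(days[index], height, duration)
```

In this sketch the smoothed daily counts play the role of the blue peaks, and each (height, duration) pair is a crude substitute for the red curve fitted to an event.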
Video1
Shows a much more complex (and more common) case. Here the events are not clearly separated, and the task of identifying relevant developments is challenging.
see this story
Video2
Shows a story with two major events. Here each event consists of a group of documents distributed more widely over time, rather than a tight cluster. Still, the platform is able to determine the location and extent of the two relevant developments in this story.
see this story
Video3
Shows a recent story with several noteworthy events, well separated in time. This is a reasonably straightforward case illustrating the process described above.
see this story
The most important property of our data analysis process is that it determines probabilistically whether an event is meaningful. It is not enough for the platform to find a large bar indicating a large number of incoming articles. Instead, all data is examined, and the relevance of each article is considered in terms of previous and following events.
This kind of processing is useful for identifying the components that form a story's timeline, for identifying important articles for each of the principal events, and for studying the evolution of a story over time.
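The point that a large bar alone is not enough can be illustrated with a small sketch. It assumes, purely for illustration, that daily article counts follow a Poisson distribution whose rate is estimated from the neighboring days. The platform's real probabilistic model is not specified here, and the names poisson_tail and surprising_days, as well as the window and alpha parameters, are hypothetical.

```python
import math


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed in log space for stability."""
    if k <= 0:
        return 1.0
    cdf = sum(
        math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
        for i in range(k)
    )
    return max(0.0, 1.0 - cdf)


def surprising_days(counts, window=3, alpha=0.01):
    """Flag days whose count is unlikely given the days before and after.

    The baseline rate for day i is the mean count of up to `window` days on
    each side, excluding day i itself. This is an assumed model, not the
    platform's method.
    """
    flagged = []
    for i, k in enumerate(counts):
        neighbors = counts[max(0, i - window): i] + counts[i + 1: i + 1 + window]
        if not neighbors:
            continue
        # Floor the rate so a run of empty days does not give a zero baseline.
        lam = max(sum(neighbors) / len(neighbors), 0.5)
        p = poisson_tail(k, lam)
        if p < alpha:
            flagged.append((i, p))
    return flagged


if __name__ == "__main__":
    quiet_story = [1, 1, 0, 8, 1, 1, 0]       # small bar, but far above its baseline
    busy_story = [18, 19, 20, 21, 19, 18, 20]  # large bars, all close to the baseline
    print(surprising_days(quiet_story))
    print(surprising_days(busy_story))
```

Under these assumptions, a day with 8 articles in an otherwise quiet story is flagged, while a day with 21 articles in a story that routinely receives about 19 per day is not.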