Emergent.info Blog

Updates and analysis from Emergent.info, a real-time rumor tracker.
  • How we get the data for Emergent

    Emergent is a data-driven rumor tracker that relies on a mix of human and machine processes. Here’s a look at how it comes together.

    Identifying rumors

    We use Twitter searches, RSS feeds, tips from you, Google Alerts and other means to identify unconfirmed reports early in their lifecycle. Once we’ve identified an unconfirmed report, we look for news articles about it by searching Google News.

    Tracking/Classifying Rumors

    We enter the URLs of news articles in the Emergent database. Along with the URL, we capture the headline, byline, news outlet, body text and publication date and time. (For the publication time, we use what’s listed on the webpage or grab the time listed in the source code of the page, whichever is more recent.) 
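
As a rough illustration, the record described above could be modeled along these lines. This is a minimal sketch and not Emergent's actual schema: the `ArticleRecord` class, its field names, and the `pick_publication_time` helper are hypothetical. The only rule taken from the text is that the publication time is whichever of the two candidate times is more recent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def pick_publication_time(displayed: Optional[datetime],
                          in_source: Optional[datetime]) -> Optional[datetime]:
    """Return the more recent of the two candidate timestamps.

    `displayed` is the time shown on the webpage; `in_source` is the time
    found in the page's source code. Either one may be missing. Both values
    are assumed to be in the same timezone convention so they can be compared.
    """
    candidates = [t for t in (displayed, in_source) if t is not None]
    return max(candidates) if candidates else None


@dataclass
class ArticleRecord:
    """Hypothetical shape of one tracked article (field names are assumptions)."""
    url: str
    headline: str
    byline: str
    outlet: str
    body_text: str
    published_at: Optional[datetime]
```

For example, if the page displays 1:00 pm but its source code lists 1:14 pm, `pick_publication_time` returns the 1:14 pm value.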

    We also classify the headline and body text of the article on what we call a truthiness scale:

    • The headline/text is for the claim
    • The headline/text is against the claim
    • The headline/text is merely repeating the claim
    • The headline/text does not mention the claim
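
The four ratings above map naturally onto an enumeration. The sketch below is one possible encoding, assuming the headline and the body text each get their own rating; the names `Stance`, `ArticleRating`, and `takes_position` are hypothetical and are not taken from the Emergent codebase.

```python
from dataclasses import dataclass
from enum import Enum


class Stance(Enum):
    """One rating on the truthiness scale (member names are assumptions)."""
    FOR = "for the claim"
    AGAINST = "against the claim"
    REPEATING = "merely repeating the claim"
    NO_MENTION = "does not mention the claim"


@dataclass
class ArticleRating:
    """The headline and the body text are classified separately."""
    headline: Stance
    body: Stance


def takes_position(stance: Stance) -> bool:
    """True only when the text declares the claim true or false."""
    return stance in (Stance.FOR, Stance.AGAINST)
```

An article that only repeats a rumor in its headline but debunks it in the body would be `ArticleRating(headline=Stance.REPEATING, body=Stance.AGAINST)`.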

    Once entered in the database, we track two types of changes to the articles over time:

    1. Updates to the headline/body text. We want to track if new information is added to the article as the rumor evolves. This is done partly through the system capturing changes automatically, and partly by us checking back at intervals. (We’ve written parsers to automatically check URLs every hour for lots of news sites, but have to manually check the ones we haven’t yet written a parser for.) When updates are made, we see if they require a change in the truthiness rating. For example, articles often start out by repeating the claim, which means they aren’t declaring it true or false. But as new information emerges, we sometimes see news organizations go back and update the article to note that a rumor has been debunked or proven true.

    2. Social shares. We capture the number of social shares for the URL on Twitter, Facebook and Google Plus, divided hour by hour. Why? Because we can then see which articles about a rumor generate the most shares, and whether a subsequent debunking or confirmation of the rumor attracts a similar amount of social engagement.
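
The two kinds of tracking described above can be sketched as a single hourly check per article. This is an illustration under stated assumptions, not Emergent's parser code: `TrackedArticle`, `fingerprint`, and `record_check` are hypothetical names, fetching the page and the share counts is left out, and the share figure passed in is assumed to be a cumulative total across networks, so the per-hour numbers are differences between consecutive checks.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


def fingerprint(headline: str, body_text: str) -> str:
    """Hash the headline and body, ignoring differences in whitespace only."""
    parts = [" ".join(part.split()) for part in (headline, body_text)]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


@dataclass
class TrackedArticle:
    url: str
    # Fingerprints of each distinct revision seen, oldest first.
    revisions: List[str] = field(default_factory=list)
    # Cumulative share total recorded at each hourly check (an assumption).
    cumulative_shares: List[int] = field(default_factory=list)

    def record_check(self, headline: str, body_text: str, total_shares: int) -> bool:
        """Store one hourly check; return True if the text changed since the last one."""
        fp = fingerprint(headline, body_text)
        changed = bool(self.revisions) and self.revisions[-1] != fp
        if not self.revisions or changed:
            self.revisions.append(fp)
        self.cumulative_shares.append(total_shares)
        return changed

    def hourly_shares(self) -> List[int]:
        """Shares gained in each hour, derived from the cumulative totals."""
        totals = self.cumulative_shares
        if not totals:
            return []
        return [totals[0]] + [later - earlier for earlier, later in zip(totals, totals[1:])]
```

When `record_check` returns `True`, a person would still need to review the new text and decide whether the truthiness rating should change; the sketch only flags that something was edited.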

    Once a rumor has been definitively debunked or confirmed, we change the overall claim state of the rumor to reflect that (from unverified to either Confirmed True or Confirmed to be False). This provides a marker against which subsequent changes and shares can be compared.
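
To show how the claim state can act as a marker, the sketch below splits an article's hourly share counts into those recorded before and after the rumor was resolved. The `ClaimState` enum and the `shares_around_marker` function are hypothetical, and the dictionary of hour-to-count values is an assumed input format.

```python
from datetime import datetime
from enum import Enum
from typing import Dict, Tuple


class ClaimState(Enum):
    """Overall state of a rumor (member names are assumptions)."""
    UNVERIFIED = "unverified"
    CONFIRMED_TRUE = "confirmed true"
    CONFIRMED_FALSE = "confirmed false"


def shares_around_marker(hourly: Dict[datetime, int],
                         resolved_at: datetime) -> Tuple[int, int]:
    """Return (shares before the marker, shares from the marker onward)."""
    before = sum(count for hour, count in hourly.items() if hour < resolved_at)
    after = sum(count for hour, count in hourly.items() if hour >= resolved_at)
    return before, after
```

Comparing the two totals for the original report and for the later debunking or confirmation gives a simple view of whether the correction attracted a similar amount of sharing.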

    Visualizing Rumors

    By doing the above, we are able to visualize how rumors and the articles about them evolve over time. We call this the lifecycle of a rumor. Each rumor’s page on the Emergent site shows the sharing statistics, articles and truthiness ratings for that rumor. You can also click on a specific article’s headline to go to a page that shows a more detailed sharing breakdown, as well as a listing of the revisions to the article over time.

    We’re always interested in hearing from people with feedback about the site and the project. Please drop us an email!

    Craig Silverman

    • September 25, 2014 (1:14 pm)
© 2014–2017 Emergent.info Blog