Learn Data Science

06-26-2017
10:16 PM



Centrality as a concept is largely subjective because it answers the question of which nodes are most important in a network. Importance depends heavily on the question you're trying to answer: if you want to know who has the most connections in a network, you only need to count degrees. If you're trying to understand information transmission, Betweenness is great. For cases where you want to understand the ability to transmit information over time, given specific nodal properties, you need something like Percolation Centrality (PC), which is today's topic of discussion!
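To make the degree case concrete, here is a minimal sketch (the toy graph and names are invented for illustration):

```python
# Toy adjacency list; the nodes and edges are hypothetical.
graph = {
    "alice": ["bob", "carol", "dan"],
    "bob": ["alice"],
    "carol": ["alice"],
    "dan": ["alice"],
}

# Degree centrality really is just a count of each node's neighbors.
degree = {node: len(neighbors) for node, neighbors in graph.items()}
```

By this measure, "alice" (degree 3) is the most important node; the other centralities below refine that notion.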

I want to talk about Betweenness first, as it's the basis for PC. Betweenness as a concept assumes that information travels along the "path of least resistance": if your mother-in-law knows a joke, you're more likely to hear it from her than from the Pope, who may have told his friend, who tells his gardener, who tells her cousin, who tells her accountant, who finally tells you. Formally, Betweenness is calculated as the proportion of shortest paths that pass through a given node. Under the shortest-path assumption above, that proportion describes the relative control a node has over the transmission of information - in this case, the telling of a joke.
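The "proportion of shortest paths" definition can be sketched in plain Python. This is a brute-force version for tiny graphs (real implementations, including Aster's, use far more efficient algorithms such as Brandes'):

```python
from itertools import combinations

def shortest_paths(graph, s, t):
    """Enumerate all shortest paths from s to t by exhaustive DFS.
    Fine for toy graphs, far too slow for real networks."""
    found = []
    def dfs(node, path):
        if node == t:
            found.append(path)
            return
        for nxt in graph[node]:
            if nxt not in path:
                dfs(nxt, path + [nxt])
    dfs(s, [s])
    if not found:
        return []
    shortest = min(len(p) for p in found)
    return [p for p in found if len(p) == shortest]

def betweenness(graph):
    """Unnormalized betweenness: for each node, the summed fraction of
    shortest paths (over all node pairs) passing through it."""
    score = {v: 0.0 for v in graph}
    for s, t in combinations(graph, 2):
        paths = shortest_paths(graph, s, t)
        if not paths:
            continue
        for v in graph:
            if v in (s, t):
                continue
            score[v] += sum(v in p for p in paths) / len(paths)
    return score

# Mother-in-law as the middle of a chain: she sits on the only
# shortest path between you and the joke's origin.
chain = {"origin": ["mother_in_law"],
         "mother_in_law": ["origin", "you"],
         "you": ["mother_in_law"]}
scores = betweenness(chain)
```

Here `scores["mother_in_law"]` is 1.0 (she controls the one shortest path between the endpoints), while the endpoints score 0.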

The question then becomes: how does this control change over time and across nodal states? If your mother-in-law doesn't know the joke, she can't tell you the joke - her ability to transmit the information (the joke) is 0. This ability is something I'll refer to as a nodal *state*, and it is one of the key additions PC makes over Betweenness Centrality. At each point in time, PC is computed as the proportion of *percolated* paths that pass through a given node, where percolated paths are paths that originate from a percolated source. If you're interested in learning more about the mathematical foundation of Percolation Centrality, please refer to the PLoS ONE publication by Piraveenan et al.
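That definition can be sketched by restricting the betweenness computation to paths that start at a percolated source. Note this is a simplified illustration of the idea described above; the published formula in Piraveenan et al. also normalizes by the percolation states, which is omitted here for clarity:

```python
from itertools import permutations

def shortest_paths(graph, s, t):
    """All shortest s-t paths by exhaustive DFS (toy graphs only)."""
    found = []
    def dfs(node, path):
        if node == t:
            found.append(path)
            return
        for nxt in graph[node]:
            if nxt not in path:
                dfs(nxt, path + [nxt])
    dfs(s, [s])
    if not found:
        return []
    shortest = min(len(p) for p in found)
    return [p for p in found if len(p) == shortest]

def percolation_sketch(graph, states):
    """Simplified PC at one point in time: like betweenness, but only
    paths originating at a percolated source (state == 1) contribute."""
    score = {v: 0.0 for v in graph}
    for s, t in permutations(graph, 2):
        if not states[s]:  # non-percolated source: it can't transmit
            continue
        paths = shortest_paths(graph, s, t)
        if not paths:
            continue
        for v in graph:
            if v in (s, t):
                continue
            score[v] += sum(v in p for p in paths) / len(paths)
    return score

chain = {"origin": ["mother_in_law"],
         "mother_in_law": ["origin", "you"],
         "you": ["mother_in_law"]}

# If the origin knows the joke (percolated), mother-in-law controls its
# spread; if nobody is percolated, nothing can flow through her at all.
knows = percolation_sketch(chain, {"origin": 1, "mother_in_law": 0, "you": 0})
nobody = percolation_sketch(chain, {"origin": 0, "mother_in_law": 0, "you": 0})
```

Recomputing the scores as states change over time is exactly what makes PC a temporal measure, and what makes it expensive enough to be worth parallelizing.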

As part of my M.S. thesis, I developed logic to parallelize the computation of Percolation Centrality and extended Aster's Betweenness code to accommodate the necessary changes. The key files are attached to this post; if dependencies are needed, please contact Mike Rote (Mike.Rote@Teradata.com) for the complete software package.

The Percolation Centrality function takes the same arguments as the Betweenness function, with the exception that the "source" input is mandatory and should be a table of "percolated" nodes (node ID is the only required column). More information regarding the Aster implementation can be found in my thesis document ("Bulk Synchronous Parallel Implementation of Percolation Centrality for Large Scale Graphs"), which will be published through OhioLink.

Percolation Centrality can be used in applications from diverse fields. Examples include:

- Epidemiology (identifying high-risk contagion points in a social network)
- Marketing (identifying potential recruiters in a network)
- Transportation (issue tracking)

Cheers!
