This week I had the honor of attending and presenting at ICDM.
The conference was hosted in Atlantic City, NJ, at Bally’s Hotel and Casino on the boardwalk. It was certainly an interesting choice for an academic conference venue. Though I grew up just a few hours north of Atlantic City, and now live about four hours from Las Vegas, I’ve never really indulged in the delights of “gaming,” as the conference program referred to it. I didn’t really know what to expect, but I must say that it was a lot of fun to wander through the casino during session breaks or at the end of the day. The boardwalk itself was a lot of fun (and impressively recovered after Sandy), not to mention the big outlet mall and the Aquarium, which was the destination of a group excursion on the second day of the conference. The organizers did a great job of pulling together a diverse conference in a less-than-conventional place; I think everyone had a great time.
The conference took place over four days. The first day was devoted to workshops and a PhD forum for students to discuss their dissertation work, while the full conference program took place during the remaining three days. The program boasted three excellent keynote speakers: Robert F. Engle, Michael I. Jordan, and Lada Adamic.
Unfortunately I was unable to attend Engle’s talk, but I will give an overview of Jordan’s and Adamic’s talks.
On Computational Thinking, Inferential Thinking, and “Big Data.” Michael I. Jordan spoke about the need to adapt traditional methods for data analysis to address the more modern concerns of Big Data. According to Jordan, problems in Big Data require both a computer science approach and inferential thinking. On the computer science side, these problems benefit from the concepts of abstraction, modularity, and scalability. On the inferential/statistics side, there is a formalization of risk and confidence. However, the two fields tend to miss each other on key concepts, which stem from the trade-off between quality and the time and space constraints inherent in Big Data.
Jordan discussed how the interface of computer science and statistics has evolved with respect to privacy constraints, compression, and implementation mechanisms such as subsampling, parallelization, and algorithmic weakening involved in estimating from a smaller empirical dataset.
Information in Social Networks. Lada Adamic is a computational social scientist at Facebook leading the Product Science team. In her work she studies trends in social network data, and how these trends affect both users’ online experience and the growth of the network. In this talk, Adamic focused on how news propagates in social networks. In particular, she looked at what determines the content a user sees, and the potential for diverse exposure. The case study was political affiliation, and how it affects what gets posted, what gets exposed, and what gets clicked.
The potential for higher or lower diversity was summarized by three contributing factors. First is the potential exposure associated with the connections a user has. That is, if a user has many homogeneous (in terms of political affiliation) friends, they are less likely to see content from an opposing viewpoint. On the other hand, if a user has a multitude of strong ties and weak ties with diverse users, they have a better chance of being exposed to opposing ideas. Second is actual exposure: what is actually presented in a user’s News Feed. This is primarily the output of Facebook’s ranking algorithm. Third is selective exposure. This comes down to the user’s own actions: what the user clicks on, and whether or not they are willing to consult alternate sources or “cross-cut” their typical exposure.
Adamic then presented a series of results demonstrating how different groups experienced potential for diversity. Among the key findings:
- On average, 23% of people’s friends claim an opposing political ideology.
- 29.5% of the political news content shared by people’s friends cuts across ideological lines.
- On average, 28.9% of the political news encountered in News Feed cuts across ideological lines.
- 24.9% of the political news content that people actually click on cuts across ideological lines.
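To get a feel for where the narrowing happens, here is a small sketch (my own arithmetic, not something shown in the talk) that takes the last three percentages above and computes the relative drop in cross-cutting content from one stage to the next. The stage labels and the `relative_drops` helper are just names I made up for illustration, and the 23% friend figure is left out because it measures people rather than content.

```python
# Share of political news that is cross-cutting at each stage (percent),
# using the figures quoted in the list above.
stages = [
    ("shared by friends", 29.5),
    ("shown in News Feed", 28.9),
    ("clicked on", 24.9),
]


def relative_drops(stages):
    """Return (from_stage, to_stage, relative drop in percent) for consecutive stages."""
    drops = []
    for (name_a, pct_a), (name_b, pct_b) in zip(stages, stages[1:]):
        # Relative drop: how much of the earlier stage's share is lost.
        drops.append((name_a, name_b, 100.0 * (pct_a - pct_b) / pct_a))
    return drops


for name_a, name_b, drop in relative_drops(stages):
    print(f"{name_a} -> {name_b}: {drop:.1f}% relative drop")
```

By this rough measure, the step from what friends share to what appears in News Feed is a relative drop of about 2%, while the step from News Feed to what people click on is closer to 14%. These are averages, so this is only a back-of-the-envelope comparison of the stages.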
We also saw that “soft news,” such as sports, fashion, or culture, was shared more by self-proclaimed moderates, while “hard news,” meaning political and government-related stories, was shared more by “charged” self-proclaimed conservatives or liberals.
In all, her talk outlined the multitudes of social insights available from social network data.
In coming posts, I will address some of the thematic concepts seen at the conference during the concurrent sessions – and maybe even a bit about the paper I presented 😉