Very interesting talk from Microsoft (if you study sociology and group dynamics).
Much of our routine online activity leaves behind so-called ‘traces,’ or electronic records of those actions. These traces have attracted the attention of social, behavioral, and computational scientists, legal scholars, and marketing experts, among others. As people increasingly incorporate web-based technologies into their daily lives, there are myriad opportunities to examine human behavior from those traces. How can we use these data, and what can they tell us about human social behavior? In this talk I share some of the tools, methods, and applications from my research on these web-based records of activity. Specifically, I demonstrate their application to social behavioral research contexts, with an emphasis on extracting and studying the social networks that emerge from these data.
Some points from the talk:
scrapeR by Acton is a tool for scraping data from web-based documents directly into R. It lets you parse XML/HTML documents, follow links, and retrieve information.
R is excellent for data organization and statistical analysis.
Potential applications for scrapeR:
- Extract geographic coordinates (spatial analysis)
- Extract news headlines (content analysis)
- Track edits on Wikipedia (longitudinal analysis)
- Examine link structure of a website (network analysis)
Combined with some customized scripting, all of this could be automated:
- Retrieve weather data every 24 hours
- Capture news headlines every hour
- Trace link structure over a set of websites, observe change over time
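
To make the link-structure idea a bit more concrete, here is a minimal sketch of how one might pull links out of a page and compare two snapshots over time. It is not scrapeR and not anything shown in the talk: it uses only the Python standard library, and the names `LinkCollector`, `extract_edges`, and `diff_snapshots`, along with the example.com pages, are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href target of every <a> tag in an HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_edges(page_url, html):
    """Return the set of (source, target) link pairs found on one page."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return {(page_url, target) for target in collector.links}


def diff_snapshots(old_edges, new_edges):
    """Report which links appeared or disappeared between two snapshots."""
    return {"added": new_edges - old_edges, "removed": old_edges - new_edges}


# Hypothetical snapshots of the same page on two different days.
monday = '<a href="/about">About</a> <a href="http://example.org/">Ext</a>'
tuesday = '<a href="/about">About</a> <a href="/news">News</a>'

old = extract_edges("http://example.com/", monday)
new = extract_edges("http://example.com/", tuesday)
changes = diff_snapshots(old, new)

print(sorted(changes["added"]))
print(sorted(changes["removed"]))
```

The edge sets can be fed to any network analysis tool as an edge list. Fetching the live HTML and running the script on a schedule (every hour, every 24 hours) are left out here; those would be the "customized scripting" part.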