I've created a dataset for the globe applet where each point represents someone in my Friendster network, and double-clicking on the point will launch that person's Friendster web page.
Last week I joined Friendster and quickly became frustrated with the limited functionality of the site. The site is basically a meat market of foolishness, but I find the underlying network structure appealing in concept. However, Friendster does not make it easy for you to visualize the network information, probably because more advanced features will be part of the eventual paid member service.
Saturday morning I decided to write a crawler to explore my own personal network on Friendster. Because I'm a masochist, I decided to try to write it in a programming language I had never used before: Python. Python is a new programming language all the kids keep yapping about these days, so I figured I should finally take some time to look it over. Nine hours later I had a (somewhat) working spider and a significant level of respect for Python. It's a strangely beautiful language. Many thanks to #python and #infoanarchy on freenet! I'll release the spider code as soon as I get it in a distributable state. I also want to alter the search algorithm from depth-first to breadth-first graph traversal.
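For anyone curious what the breadth-first version mentioned above might look like, here is a minimal sketch. It is not the spider's actual code: `crawl_breadth_first` and the `get_friends` callable are hypothetical names, and `get_friends` stands in for whatever fetches and parses a user's friend list. The only real difference from depth-first is that pending users sit in a queue rather than a stack, so everyone one hop away is visited before anyone two hops away.

```python
from collections import deque


def crawl_breadth_first(start_id, get_friends, max_people=None):
    """Return user ids in breadth-first order, starting from start_id.

    get_friends is a hypothetical callable: given a user id, it returns an
    iterable of that user's friend ids (in a real spider it would fetch and
    parse a page).
    """
    seen = {start_id}          # ids already queued, so nobody is visited twice
    queue = deque([start_id])  # FIFO queue is what makes this breadth-first
    order = []
    while queue:
        if max_people is not None and len(order) >= max_people:
            break
        user_id = queue.popleft()
        order.append(user_id)
        for friend_id in get_friends(user_id):
            if friend_id not in seen:
                seen.add(friend_id)
                queue.append(friend_id)
    return order


if __name__ == "__main__":
    # Toy network standing in for real pages.
    toy = {"me": ["a", "b"], "a": ["me", "c"], "b": ["c"], "c": []}
    print(crawl_breadth_first("me", lambda u: toy.get(u, [])))
```

Swapping `queue.popleft()` for `queue.pop()` would turn the same loop back into a depth-first crawl, which is a handy way to compare the two orderings on a small network.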
The spider is grabbing each user's name, id and location. It is rather slow since I pause 2-6 seconds between each page request to avoid overtaxing Friendster's servers. So far I have about 5000 out of 15000 people in my network.
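The random pause between requests can be sketched roughly like this. Again, this is an illustration under assumptions rather than the real spider: `polite_fetch_all` and the `fetch` callable are made-up names, and the `sleep` parameter exists only so the delay can be swapped out when testing.

```python
import random
import time


def polite_fetch_all(urls, fetch, min_delay=2.0, max_delay=6.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing a random 2-6 seconds between requests.

    fetch is a hypothetical callable that takes a URL and returns the page.
    """
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # No need to wait before the very first request.
            sleep(random.uniform(min_delay, max_delay))
        pages.append(fetch(url))
    return pages


if __name__ == "__main__":
    # Fake fetch and a recording "sleep" so the demo finishes instantly.
    waits = []
    pages = polite_fetch_all(["u1", "u2", "u3"], lambda u: "page for " + u,
                             sleep=waits.append)
    print(pages)
    print(len(waits), "pauses")
```

If each person costs one page request, an average pause of 4 seconds works out to roughly 16.7 hours of waiting for 15000 people, which is why the crawl takes so long.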
Once I had enough data to be interesting, my next step was to convert the location string data to latitude/longitude pairs. There are two types of location information on Friendster. If you are in the US or Canada you specify a ZIP code, which is displayed as "City, State"; otherwise you can only specify your country. It was a bit of a challenge, as I had to write some Perl scripts to massage the data and do some manual input, but I was able to convert the bulk of the locations to a lat/lon pair.
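The actual conversion was done with Perl scripts plus manual cleanup, but the basic idea of handling the two location styles can be sketched in Python. Everything here is an assumption for illustration: the `locate` function, the two lookup tables, and the rough coordinates in them. A string with a comma is treated as "City, State"; anything else is treated as a country name.

```python
def locate(location, city_coords, country_coords):
    """Map a location string to a (lat, lon) pair, or None if unknown.

    city_coords is keyed by (city, state) and country_coords by country name,
    all lowercase. Both tables are hypothetical and would have to be built
    from some gazetteer.
    """
    text = location.strip()
    if "," in text:
        # "City, State" style: split on the last comma.
        city, state = (part.strip().lower() for part in text.rsplit(",", 1))
        return city_coords.get((city, state))
    # Country-only style.
    return country_coords.get(text.lower())


if __name__ == "__main__":
    # Approximate coordinates, for illustration only.
    cities = {("seattle", "wa"): (47.61, -122.33)}
    countries = {"japan": (36.2, 138.25)}
    for loc in ["Seattle, WA", "Japan", "Springfield, ZZ"]:
        print(loc, "->", locate(loc, cities, countries))
```

Anything that comes back as `None` is a candidate for the manual-input pile.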
I wrote a Perl script to convert the output from the spider into a headmap XML file that the globe applet could understand, then I was done! Well, almost: I found a bug in the XML parser library the applet uses that kept it from loading the Friendster URLs. Luckily it is an Open Source library, so I was able to fix the bug myself.
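The conversion step itself was a Perl script, and the real headmap XML format is not reproduced here. As a rough Python sketch of the same kind of transformation, the element and attribute names below (`points`, `point`, `name`, `lat`, `lon`, `url`) and the URL template are all invented for illustration and are not the applet's actual schema.

```python
import xml.etree.ElementTree as ET


def build_points_xml(people, profile_url_template):
    """Build an XML string with one <point> per person that has coordinates.

    people is a list of dicts with "id", "name" and "latlon" keys.
    The element and attribute names are hypothetical, not the headmap format.
    """
    root = ET.Element("points")
    for person in people:
        if person.get("latlon") is None:
            continue  # skip people whose location could not be converted
        lat, lon = person["latlon"]
        ET.SubElement(root, "point", {
            "name": person["name"],
            "lat": f"{lat:.4f}",
            "lon": f"{lon:.4f}",
            "url": profile_url_template.format(id=person["id"]),
        })
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    people = [
        {"id": "1", "name": "Alice", "latlon": (47.61, -122.33)},
        {"id": "2", "name": "Bob", "latlon": None},
    ]
    # example.com is a placeholder, not the real profile URL pattern.
    print(build_points_xml(people, "http://example.com/user?id={id}&view=full"))
```

One nice side effect of going through `ElementTree` instead of pasting strings together is that characters like `&` in a URL are escaped as `&amp;` in the attribute value automatically.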
So, click the link below to see what I ended up with. It's a big dataset. In fact, it's probably past the reasonable limit of what the globe should be handling, but it is a good start. With some more work I could map this same dataset onto a flat map with the connections between people visible. Imagine that as you hover over a point, all of the direct connections to that point become visible. There are lots of interesting geogeeking possibilities, and that's not even getting into the graph theory stuff or using the rest of the data associated with each user.
Cool. I've just spent an hour or so working on a Friendster spider -- my thought was to export a FoaF file from the data. It wouldn't be a very good FoaF file, since it wouldn't have mboxes, but...
Anyway, like you, I'm just learning Python, but I did come across this cookie library which might help you get closer to publishing your code (and save me the trouble of finishing... er, reinventing the wheel!)
I just discovered headmap from your archived email message related to this posting. It looks right up my alley...
Posted by: Joe Germuska | 2003.08.16 at 11:33 PM
FYI: http://www.qrivy.net/friendgrapher.html
My take on friendster visualisation. Requires a Java VM.
Posted by: Michael | 2003.08.20 at 11:35 PM
Dav-
I'm a student at the University of Washington in Seattle, and I am interested in doing an econometrics-based project on Friendster. Essentially, I'm looking to crawl the Friendster network and find the degree of correlation between one's sex, possibly location, and interests, and the number of people in their network.
My coding skills are dismal at best, and I would very much appreciate your help, or perhaps your sharing some of your code that I could modify to complete my project.
Posted by: Andy | 2003.11.09 at 07:01 PM