Mon, April 28, 2003
More BlogMatcher...
I've got a prototype of the BMD (BlogMatcher Daemon) up and running, and I've also integrated it with a PHP front end. The BMD basically has all the link <-> blog relationships stored in RAM in a 350 million node graph, allowing it to do searches in well under 0.1 second. Right now, the slowest part seems to be the PHP front end, which still does most of the scoring calculations.
One of the benefits of having a graph in memory is that I can do all kinds of things. For example, searching for all blogs that have a particular link is just as easy as looking up the links on a particular blog, and in theory, it's also possible to do all kinds of "shortest path" types of calculations (e.g. shortest path from one blog to another).
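To make that concrete, here is a rough Python sketch of the idea. It is not the BMD's actual code or memory layout; the `LinkGraph` class and its method names are made up for illustration, and it simply keeps two dictionaries so that lookups work in both directions. The shortest path is a plain breadth-first search that hops blog -> link -> blog.

```python
from collections import defaultdict, deque


class LinkGraph:
    """Toy in-memory blog <-> link graph (hypothetical, not the real BMD)."""

    def __init__(self):
        self.links_of = defaultdict(set)  # blog -> URLs it links to
        self.blogs_of = defaultdict(set)  # URL -> blogs that link to it

    def add(self, blog, url):
        # Store the relationship in both directions so either lookup is cheap.
        self.links_of[blog].add(url)
        self.blogs_of[url].add(blog)

    def blogs_linking_to(self, url):
        # Reverse lookup: all blogs that contain this link.
        return set(self.blogs_of.get(url, ()))

    def shortest_path(self, src, dst):
        """Breadth-first search from blog `src` to blog `dst`.

        Returns the alternating blog/link/blog node names, or None if the
        two blogs are not connected.
        """
        if src == dst:
            return [src]
        start, goal = ("blog", src), ("blog", dst)
        prev = {start: None}  # visited nodes and the node we reached them from
        queue = deque([start])
        while queue:
            node = queue.popleft()
            kind, name = node
            if kind == "blog":
                neighbours = [("link", u) for u in self.links_of.get(name, ())]
            else:
                neighbours = [("blog", b) for b in self.blogs_of.get(name, ())]
            for nxt in neighbours:
                if nxt in prev:
                    continue
                prev[nxt] = node
                if nxt == goal:
                    # Walk the predecessor chain back to the start.
                    path = []
                    while nxt is not None:
                        path.append(nxt[1])
                        nxt = prev[nxt]
                    return path[::-1]
                queue.append(nxt)
        return None
```

With something like this, `graph.blogs_linking_to(url)` answers "who links here?" and `graph.shortest_path(blog_a, blog_b)` returns the chain of shared links connecting two blogs, if there is one.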
Having said that, I want to work on the link scoring algorithm now. With blogs that have a lot of links to various sites, the results are next to useless. I need to somehow figure out a way to determine which links are significant and which links aren't...
I just thought of something. Maybe one way to do this would be to track "link shares" over time. The basic idea is that a certain percentage of links will always be to certain sites, like Google or Slashdot, and that percentage will probably remain fairly constant. On the other hand, a hot new site or an interesting article is more likely to pop up suddenly and receive a lot of attention, then fade away again. Maybe I need to start logging that information...
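A minimal sketch of what tracking link shares might look like, in Python. Everything here is an assumption for illustration: the function names, the idea of bucketing links into periods (say, weeks), and the `factor` threshold are all hypothetical, and none of this is implemented in BlogMatcher yet. A link's share is its fraction of all links seen in a period, and a link counts as "rising" when its current share is several times its average past share.

```python
from collections import Counter


def link_shares(links):
    """Map each link target to its fraction of all links seen in one period."""
    counts = Counter(links)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {site: n / total for site, n in counts.items()}


def rising_links(history, current, factor=3.0):
    """Return targets whose current share is well above their past average.

    history: list of share dicts for earlier periods
    current: share dict for the latest period
    factor:  assumed threshold; how many times the past average counts as a spike
    """
    rising = []
    for site, share in current.items():
        if history:
            past = sum(p.get(site, 0.0) for p in history) / len(history)
        else:
            past = 0.0
        # Never seen before, or sharply above its usual share.
        if past == 0.0 or share >= factor * past:
            rising.append(site)
    return sorted(rising)
```

Sites with a steady share (the Googles and Slashdots) never cross the threshold, while a new article that suddenly shows up does. The threshold would need tuning against real logged data.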
Ryo Chijiiwa
I'm a biologically Japanese, culturally American, Germany-raised, socially liberal, politically independent, gun-totin', code writin' dude. My life is currently sponsored by Google.
Posted Mon, April 28, 2003 07:45 by ktpupp http://www.sumbler.com/blog/
I just checked out BlogMatcher, it's pretty cool! One thing I wondered about though... Does it have a way to catch links that are done through Blogrolling? It doesn't seem to show those links, but I thought maybe that was something you might be working on! Keep up the good work! I'm gonna mention BlogMatcher in my blog to get more folks interested in your work! -=kt=-