ryochiji's blog
Brought to you fresh from the depths of Ryo Chijiiwa


 

Mon, April 21, 2003

Holy Sh*t!
Holy mother of llamas... the core search now takes well under a second (0.24 seconds for my blog)! I almost crapped in my pants. Well, I didn't, but my hands are shaking.

Here's how it works. Basically, the indexer generates a list of links for each blog and saves them into a file. What the search program has to do is go through each of the links files and see how many of the links in there match the links for the reference site (also stored in a file).

Here's what the PHP scripts did: The PHP search feature read the reference site's links into an array, then went through each of the indexed links files and ran an array_intersect().

Here's what the rev 1 C program did: It read the reference site's links into an array, then went through each of the links files. To look up each link from a file, it did a linear search through that array.
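
For illustration, here's a minimal sketch of what a rev 1 style lookup could look like. It assumes the links have already been read from the files into arrays of C strings; the function names are made up for this example and are not taken from the actual program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `link` appears in ref[0..n_ref), scanning front to back.
 * Hypothetical helper: the array layout is an assumption. */
int ref_contains_linear(const char *const *ref, size_t n_ref, const char *link)
{
    assert(link != NULL);
    for (size_t i = 0; i < n_ref; i++) {
        if (strcmp(ref[i], link) == 0)
            return 1;
    }
    return 0;
}

/* Count how many of one blog's links also appear in the reference list.
 * Every lookup is a full scan, so this costs up to n_links * n_ref
 * string comparisons per links file. */
size_t count_shared_links_linear(const char *const *ref, size_t n_ref,
                                 const char *const *links, size_t n_links)
{
    size_t shared = 0;
    for (size_t i = 0; i < n_links; i++)
        shared += (size_t)ref_contains_linear(ref, n_ref, links[i]);
    return shared;
}
```

The cost that matters is the inner scan: it gets repeated for every link in every indexed file.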

Here's what my rev 2 C program does: It reads the reference site's links into a binary search tree and does everything else like before, except instead of an exhaustive linear search, it just has to search the tree.

Does it get any better? Well, possibly... Plain binary search trees aren't the best if the data isn't inserted in random order (i.e. if it's already sorted, you get lop-sided trees). If I did a self-balancing tree like an AVL tree it might be a little bit more efficient. Having said that, the tree will only contain at most 200 nodes, usually somewhere in the range of 30-60, so I doubt there's much room left for optimization there. At this point, I think the limiting factor is disk I/O. The only way to get around that would be to create a daemon that has all the links stored in RAM, and have a bunch of pre-spawned processes ready. I'm not sure if I'm that desperate...
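
To make the tree idea and the lop-sided case concrete, here's a rough sketch of a plain (unbalanced) binary search tree keyed by URL. It is an assumption about the general shape of such a structure, not the program's real code; `link_node` and the `link_tree_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One node of a plain, unbalanced binary search tree keyed by URL. */
struct link_node {
    const char *url;              /* borrowed pointer; caller owns the string */
    struct link_node *left;
    struct link_node *right;
};

/* Insert `url` into the tree rooted at *root; duplicates are ignored.
 * Returns 0 on success (or duplicate), -1 if allocation fails. */
int link_tree_insert(struct link_node **root, const char *url)
{
    assert(root != NULL && url != NULL);
    while (*root != NULL) {
        int cmp = strcmp(url, (*root)->url);
        if (cmp == 0)
            return 0;
        root = (cmp < 0) ? &(*root)->left : &(*root)->right;
    }
    struct link_node *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->url = url;
    node->left = node->right = NULL;
    *root = node;
    return 0;
}

/* Return 1 if `url` is in the tree, 0 otherwise. Each comparison
 * descends one level, so the cost is bounded by the tree's height. */
int link_tree_contains(const struct link_node *root, const char *url)
{
    while (root != NULL) {
        int cmp = strcmp(url, root->url);
        if (cmp == 0)
            return 1;
        root = (cmp < 0) ? root->left : root->right;
    }
    return 0;
}

/* Height counted in nodes; an empty tree has height 0. */
size_t link_tree_height(const struct link_node *root)
{
    if (root == NULL)
        return 0;
    size_t l = link_tree_height(root->left);
    size_t r = link_tree_height(root->right);
    return 1 + (l > r ? l : r);
}

/* Free every node (the URL strings themselves are not freed). */
void link_tree_free(struct link_node *root)
{
    if (root == NULL)
        return;
    link_tree_free(root->left);
    link_tree_free(root->right);
    free(root);
}
```

Inserting already-sorted keys into this tree makes every node hang off the right side, so the height equals the node count and a lookup is no better than the linear search. A perfectly balanced tree of 30-60 nodes would be only 5 or 6 levels deep, and one of 200 nodes would be 8 levels deep.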



Ryo Chijiiwa

I'm a biologically Japanese, culturally American, Germany-raised, socially liberal, politically independent, gun-totin', code writin' dude. My life is currently sponsored by Google.