Search engine barriers to entry

One of the myths that I can dispel about developing search engines is that there is a massive barrier to entry when it comes to hardware.

Google have hundreds of thousands of servers running their search engine, so that’s what it takes to compete and enter the market – or so the argument goes. This couldn’t be further from the truth.

This argument forgets that Google only need that many servers because a vast number of people search on the site. In any start-up, traffic starts small and builds along a growth curve, which usually lets the start-up add PCs as the traffic grows. Each individual index is actually fairly small, and sits across only a few hard drives.

Another popular myth is that the volume of data is so high that hundreds of PCs are required to begin a new engine. I challenge people to do the calculation: how many bytes does it take to store each web page? How many web pages are required for a useful service? You may be surprised.
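For anyone who wants to try that calculation, here is a rough sketch in Python. The figures in it are my own illustrative assumptions, not measurements: 100 million pages, 10,000 bytes stored per page, and 1 TB drives. The function names are made up for this example too. Swap in your own numbers and see what comes out.

```python
def index_storage_bytes(num_pages: int, bytes_per_page: int) -> int:
    """Total bytes needed to store the given number of pages."""
    return num_pages * bytes_per_page


def drives_needed(total_bytes: int, drive_capacity_bytes: int) -> int:
    """Number of drives needed, rounded up (integer ceiling division)."""
    return -(-total_bytes // drive_capacity_bytes)


# Assumed, illustrative figures only; replace them with your own estimates.
NUM_PAGES = 100_000_000             # pages in the index (assumption)
BYTES_PER_PAGE = 10_000             # stored bytes per page (assumption)
DRIVE_CAPACITY = 1_000_000_000_000  # 1 TB per drive (assumption)

total = index_storage_bytes(NUM_PAGES, BYTES_PER_PAGE)
drives = drives_needed(total, DRIVE_CAPACITY)
print(f"{total / 1e12:.1f} TB of storage, fitting on {drives} drive(s)")
```

With these assumed figures the total comes to about 1 TB. Even if the per-page figure is out by a factor of ten, the result is still a handful of drives rather than hundreds of PCs.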

What I can personally confirm is that the hardest part about bringing a search engine to market is not the hardware, but the software. 90% of the software can be finished in a short space of time, but the other 10% can take years to complete to a satisfactory level. This is the real barrier to entry.

The story behind our local shopping search engine goes back several years. You can keep in touch with the story on this blog.
