Horus Kol

11:34 pm, June 9, 2009 - Bing: and your answer is….

Microsoft Live Search is dead – long live Bing!

Microsoft, despite some not terribly heroic attempts to gain ground, hasn’t been too big in search. Google takes about 90% of the market share globally, with Yahoo grabbing another 5%, and all the rest (including Microsoft) fighting to get a look-in. This is a major shift from where search was at in 2005.

Recently, Microsoft has sought to break away from the search peloton, and become a significant third (or even second) choice of search engine by killing the Live Search brand and replacing it with Bing.

First Impressions

The first thing you notice is that they have dropped the plain white background of Live Search, the same look that has served Google so well for over a decade. The background is now occupied by a nice photographic vista which seems to update on an irregular basis.

The next thing that hit me was that the default location seems to be the UK (at least for me coming in from Australia). Thankfully, http://www.bing.com.au/ gives me access to the option to only search for sites in Australia. Not sure why Bing can’t auto-detect my location like Google does, but the option is cookied for now.

The results page is quite a tidy design – looks a little bit more fresh than Google, but that might just be because I’m used to Google.

One nice thing is what happens when you hover the mouse over a result – a floating panel opens with more content from the page so you get a neat little preview.

Instant Answers

From the press release:

Bing provides Instant Answers that immediately return highly relevant direct answers in response to a specific search. For example, entering a flight number will return the most recent flight information and display it prominently in the results, saving the hassle of going to a separate page. Other Instant Answers on Bing include stock prices, local weather, sports scores and more.

That’s kinda cool – the flight from Adelaide to Singapore is SQ268, and the last flight arrived 35 minutes early. Google apparently does this too, so I’m not so impressed. And both search engines get their data from the same source – Flight Stats.

More Search: Images and Relevance

Image search only turned up 7 images associated with “horuskol” – while Google returned 214. This may be a result of Bing’s attempt to return ‘more relevant’ content, although one of the seven images is rather tangential to my alias.

Image search does have some nice filtering – size, type (illustration or photograph), people (faces/portrait), colour and shape – these finer controls over results here are most welcome.

A vanity search works nicely, though. This blog hits the top 2 spots for “horuskol” on Bing, while Google nets me my Twitter and Wikipedia profiles ahead of my own site. Fair enough, but I’d rather have this site on top, thanks – these are the problems of placing PageRank over relevance (considering my Wikipedia profile page has been untouched by myself for about a year, I’d rather it dropped off the face of the internet – but Wikipedia has a very high PR).

Layout and Features

I quite like the results layout of Bing – it enhances what is already a well-developed formula for results pages: sponsored sites, followed by image/video results, followed by the main results. Placing related searches on the left hand side is nice, and there is a quicklink to search ‘filters’ (maps, news, images, etc) to indicate that there are a good number of results in those sections.

Bing is missing one trick, though – I misspelt “lord of the erings” when I was testing, but Bing didn’t offer the alternative spelling. Google very handily places the alternative text right up the top of the page.

Overall

Bing saw a leap in usage right at the start, and outdid Yahoo for a couple of days. This could of course be due to a reported ‘glitch’ in some version of Internet Explorer (which Microsoft had previously said wouldn’t happen), but I think there was a fair bit of media interest which people followed.

The blip is down again now, and Bing usage is about the same as where Live Search was before the change.

Then again, I’ve become tempted to make the switch myself – at least for a little while.

10:00 pm, June 1, 2009 - Knowledge Search: Wolfram Alpha launches softly

Last year, a self-proclaimed ‘Google-killer’ was launched – Cuil. They came out with a huge bang, which quickly fizzled into a farce. Apart from making ridiculous claims, they also made a large number of technical mistakes which caused a lot of angst amongst webmasters around the world. All in all, Cuil has pretty much dropped out of the media, and doesn’t really factor in any discussion on search.

Last week saw a very different launch of a rather different search engine – WolframAlpha. Despite the media attempting to label it as another ‘Google-killer’, the developers and team behind it were very careful to show how their search engine would be different from Google and not in direct competition.

(more…)

6:13 pm, September 2, 2008 - Cuil is getting into hot water

Just about a month ago, I wrote about the release of a new search engine named Cuil.

Things have been a bit quiet on that front for the last month or so (at least as far as I could be bothered with it). It seems that the odd misplacement of images against results from unrelated pages and the strange missing results (try comparing Cuil with Google) are still cursing this self-proclaimed Google-killer.

But that isn’t why the not-so-hot search engine is getting in the news today.

Cuil is Killing Websites

This is the main gist of the headlines covering Cuil today (for example, TechCrunch).

It appears that for some strange reason, the search engine’s indexer is hitting websites with hundreds, sometimes thousands, of useless requests which, according to the site managers, make no sense.

The way a normal indexer works is to access a page, pull the content out of it, and then list any links that appear in that page into a record so that it can then index the pages that those links point to. A lot of indexers get a bit more complicated than that – like remembering when a page was viewed, and only visiting it again after a delay, or only going so many links down from the homepage (using the theory that if you have to click 17 links in succession from the frontpage to get to a particular page, it probably isn’t that important). I’ve written a couple myself – and these rules aren’t all that hard to set up (the struggle comes in the bit where discovered content is made easily searchable).
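To make those rules a bit more concrete, here is a minimal sketch of a link-following indexer in Python. It is not how any real search engine does it: the `crawl` function, the `LinkExtractor` class, the `fetch` callback and the in-memory `site` dictionary are all made up for illustration, and the "index" is just a dictionary of raw page content. The point is only that pages get discovered through links that actually exist, with a depth limit and a revisit delay.

```python
from collections import deque
from html.parser import HTMLParser
import time


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(fetch, start, max_depth=3, revisit_after=86400.0,
          last_seen=None, now=None):
    """Breadth-first crawl that only requests URLs found as links.

    fetch(url) is a hypothetical callback returning the page HTML,
    or None if the page does not exist.
    last_seen maps url -> timestamp of the previous visit.
    """
    last_seen = {} if last_seen is None else last_seen
    now = time.time() if now is None else now
    index = {}
    queue = deque([(start, 0)])
    queued = {start}
    while queue:
        url, depth = queue.popleft()
        # Rule 1: leave a page alone if it was visited too recently
        # (this simple version then does not follow its links either).
        if url in last_seen and now - last_seen[url] < revisit_after:
            continue
        html = fetch(url)
        if html is None:
            continue
        last_seen[url] = now
        index[url] = html
        # Rule 2: stop following links too many clicks from the start page.
        if depth >= max_depth:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in queued:
                queued.add(link)
                queue.append((link, depth + 1))
    return index


# A tiny pretend website: path -> HTML.
site = {
    "/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "/about": "About me",
    "/blog": '<a href="/blog/1">Post</a>',
    "/blog/1": '<a href="/blog/1/comments">Comments</a>',
    "/blog/1/comments": "Deep page",
}

index = crawl(site.get, "/", max_depth=2)
print(sorted(index))
```

With `max_depth=2` the comments page is never requested, because it sits three clicks from the front page. Nothing in this sketch ever asks for a URL that wasn't found in a link.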

What Cuil’s indexer, Twiceler, seems to be doing is generating pseudo-random page names and requesting them on sites (apparently using the idea that many sites have a page called “about.php”, so surely any random site should have it) in an effort to index pages that aren’t linked.

Fallout

This is a real problem – each request that fails eats up computer time on the server while it works out that the requested file isn’t there, AND then it eats bandwidth (both coming and going) as the request from the indexer comes in, and the “not found” response goes out. And if, like this site, you have a friendly error page in the same style as the rest of the site, then you are delivering the page and the images and associated stylesheet files to an idiot indexer (idiot as in dumb – without internal decision-making facility).

I haven’t seen the indexer in my logs yet (which begs the question as to how they got my content in their search engine in the first place). However, a growing number of webmasters are becoming increasingly disgruntled by this upstart. Some are blocking the indexer, while others are going so far as to request a cease and desist.
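For anyone wanting to try the blocking route, the usual first step is a robots.txt rule. The sketch below uses Python's standard `urllib.robotparser` to check what such a rule would allow. It rests on two assumptions that I can't confirm from the reports: that the indexer identifies itself with the user-agent token `twiceler`, and that it actually honours robots.txt. The rules themselves are only an example.

```python
from urllib.robotparser import RobotFileParser

# Example rules: shut out one crawler, leave everyone else alone.
# The "twiceler" token is an assumption about how the indexer names itself.
ROBOTS_TXT = """\
User-agent: twiceler
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# can_fetch() returns True if the named agent may request the path.
twiceler_ok = parser.can_fetch("twiceler", "/about.php")
other_ok = parser.can_fetch("Googlebot", "/about.php")

print("twiceler allowed:", twiceler_ok)
print("other crawlers allowed:", other_ok)
```

A rule like this only helps against a crawler that reads and respects robots.txt; one that ignores it has to be blocked at the server instead.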

Response

By all accounts, Cuil is honoring the cease and desist requests, and quite promptly.

They have also crafted detailed responses to the requesters, which are full of nice fuzzy non-statements, including a bunch of “oh, it’s not us, it’s people mimicking us” (shown to be false). They also point to the rather stark Webmaster Info page on their site.

Cuil are pointing out that they are a startup, and as such expect a few teething issues – but the number of reports from sites experiencing problems is really more than teething issues.

I’m starting to think that the Cuil team have bitten off a lot more than they can chew right now, and that the 120 billion pages being indexed are probably mostly those pseudo-randomly generated URLs that return errors.