Wednesday, July 9, 2003

Google is soon going to reach the end of its useful life in my opinion, as are the other search engines even with their “advanced” search options. The signal-to-noise ratio is getting ludicrous, something I notice particularly when I’m looking for something in Welsh. For some reason, when I search for Welsh words, I often end up with a long list of wholly useless Japanese pages.

What I want, instead of relying completely on someone else’s algorithm to sort the wheat from the chaff, is a desktop app that learns my search preferences and has a whole range of options I can use to narrow my search in exactly the way I wish.

Take the language issue. You can already use an advanced search with Google, which allows you to pick out certain languages, but firstly, the list doesn’t include Welsh and secondly, you can’t tell it to exclude a certain language. It can’t be too difficult to write an app that recognises languages either from metatags or from the content of the page – would it not just be a matter of specifying a number of common words for each language which the app could look for in the page?

For English, for example, if you have a page with several instances of, say, “the, and, a, an, you, I, be, go, come, of, not, from” on it, then it’s a good bet it’s in English. For Welsh you could look for “y, yr, mynd, dod, cael, dim, ddim, yn, wedi, mae, ydy, yw” to name just a few. I’m sure every language has a similar set of common words that could be used to identify it. This way, you could include user-specified languages, so that the app writer doesn’t have to be a linguist and doesn’t even have to be aware of minority languages in order for them to be accommodated.
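To make the idea a bit more concrete, here is a rough Python sketch of the common-word approach. The word lists are the ones above; the function name `guess_language`, the `min_hits` threshold and the tokenising regex are my own assumptions for illustration, and a real tool would no doubt need longer lists and smarter scoring.

```python
import re

# Common-word lists taken from the post; a user could add any language here.
COMMON_WORDS = {
    "English": {"the", "and", "a", "an", "you", "i", "be", "go", "come", "of", "not", "from"},
    "Welsh": {"y", "yr", "mynd", "dod", "cael", "dim", "ddim", "yn", "wedi", "mae", "ydy", "yw"},
}


def guess_language(text, word_lists=COMMON_WORDS, min_hits=3):
    """Return the language whose common words occur most often, or None.

    min_hits is an assumed threshold: below it we refuse to guess.
    """
    if not word_lists:
        return None
    # Split into runs of letters (Unicode-aware), lower-cased.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in word_lists.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_hits else None
```

Calling `guess_language(page_text)` on a page would give a language name that the app could then use to keep or throw away the result, and a page with no hits in any list simply comes back as `None`.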

Then there’s the proliferation of commercial pages. I was looking for information on Wagner’s Ring Cycle (please don’t ask, it was work, ok?) the other day, but all I could find were pages trying to sell me CDs. For various reasons, I needed a page which had certain specific information on it, but which was not selling anything, and it took me ages to sift through the gazillion Amazon pages that Google threw up. So why not provide a way to filter these out? You could either do it by user-specified domain, or by filtering out key phrases such as “buy now”. Equally, if you actually want to buy the CD, you could search specifically for these sorts of pages.
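Something like the following Python sketch is what I have in mind for the filter. It assumes the app already has each result as a (URL, page text) pair; the names `is_commercial` and `filter_results` are made up, the blocked domain is only an example, and “add to basket” is a phrase I have added purely for illustration alongside “buy now”.

```python
from urllib.parse import urlparse

# User-specified settings; these values are examples only.
BLOCKED_DOMAINS = {"amazon.com"}
SALES_PHRASES = ("buy now", "add to basket")


def is_commercial(url, page_text, blocked_domains=BLOCKED_DOMAINS, phrases=SALES_PHRASES):
    """True if the URL is on a blocked domain or the text contains a sales phrase."""
    host = (urlparse(url).hostname or "").lower()
    # Match the domain itself and any subdomain of it (e.g. www.amazon.com).
    domain_hit = any(host == d or host.endswith("." + d) for d in blocked_domains)
    text = page_text.lower()
    phrase_hit = any(phrase in text for phrase in phrases)
    return domain_hit or phrase_hit


def filter_results(results, want_commercial=False):
    """Keep only non-commercial results, or only commercial ones if asked.

    results is an iterable of (url, page_text) pairs.
    """
    return [r for r in results if is_commercial(*r) == want_commercial]
```

With `want_commercial=False` the shop pages drop out of the list, and flipping it to `True` gives the opposite search for when you really do want to buy the CD.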

In the same way, if you wanted to filter out or filter in blogs, then you could catch the majority simply on domain, but I’m sure that there are recognisable facets to the code too that could be searched for, such as repeated use of the word “blog” in the styles.
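As a very rough Python sketch of that, assuming the app has the raw HTML of each page: check the domain first, then count how often “blog” turns up in class and id attributes and in style blocks. The function name `looks_like_blog`, the example domains and the `min_mentions` threshold are all my own guesses, and the regular expressions are a crude stand-in for proper HTML parsing.

```python
import re
from urllib.parse import urlparse

# Example blog-hosting domains; the user would supply their own list.
BLOG_DOMAINS = {"blogspot.com", "typepad.com"}


def looks_like_blog(url, html, blog_domains=BLOG_DOMAINS, min_mentions=3):
    """Guess whether a page is a blog from its domain or its styling code."""
    host = (urlparse(url).hostname or "").lower()
    if any(host == d or host.endswith("." + d) for d in blog_domains):
        return True
    # Values of class="..." and id="..." attributes.
    attr_values = re.findall(r'(?:class|id)\s*=\s*["\']([^"\']*)["\']', html, flags=re.I)
    # Contents of <style>...</style> blocks.
    style_blocks = re.findall(r"<style[^>]*>(.*?)</style>", html, flags=re.I | re.S)
    mentions = sum(chunk.lower().count("blog") for chunk in attr_values + style_blocks)
    return mentions >= min_mentions
```

Note that the word “blog” in the visible text of a page doesn’t count here, only its use in the markup, so an article that merely talks about blogs shouldn’t be caught.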

There are a host of criteria that people could use to narrow down their search, and although advanced searches give you some access to additional criteria, they’re not very flexible. A desktop app could have a whole load of options for filtering certain types of pages; it could learn your preferences and be customised to your needs, and thus provide a much better searching experience.

Admittedly, I don’t know what kind of metadata databases like DMOZ collect, but surely an app like this could sit on top of the current search algorithms, providing extra functionality and specificity. Or maybe it’s time that someone came up with a more comprehensive set of metatags that would actually provide more useful information than a list of random keywords and a creation date. Of course, then the problem is getting people to actually use those metatags, but it would be a start. (Or maybe there already is one, but I’m not aware of it. I’m not 100% up to date on this stuff, admittedly.)

There are so many millions of web pages now, and searching through them for the tiny nugget of information that you want is becoming more and more laborious. Some time soon there’s going to have to be a serious paradigm shift in the way these searches are carried out. I don’t know if I’m representative (I guess I am), but I usually have very specific ideas about what I’m looking for, and for me to be able to filter out certain genres of web site would be just great.

Of course, after the rapid response from Ralph to my last “I wish someone would” post, I’m fully expecting someone to comment and say “Why don’t you download…”. 😉 Guess we’ll see.
