25 May 2008

Who will search the microblogs?

I noticed something interesting when I was searching for Jaiku-related blog posts with Google Blog Search recently. Many of the results of my search weren't exactly what I was looking for -- I wanted to read long-form blog posts related to Jaiku -- but they were nonetheless quite relevant to my search term. Instead of just giving me blog posts about Jaiku, Google also returned actual microblog posts on Jaiku. To me this is interesting because Google seems to be treating microblogs and blogs as similar entities that can both be searched on Google Blog Search. Personally, I tend to think that microblogs and blogs are quite different species and that search engines should treat them accordingly.

Why shouldn't microblogs and blogs be lumped together? I'm looking at this issue primarily from a searching perspective. When I do a blog search about something, I expect to find more or less developed articles and/or collections of links. I'm not going to be satisfied with a sentence or two that happen to contain my keyword(s). If I'm looking for long-form content, microblogs appearing in my blog search results are simply noise. Even if I did want microblogs to be in the mix, filtering the useful microblog postings from the chaff is an unusually difficult challenge. Useful blog posts will attract links on the outside Web; useful Twitter postings probably won't, though they might be responded to more. How does a search engine compare blog posts having inlinks with Twitter tweets that aren't linked to? Does one give more weight to Twitter folks who have more followers or more links to their Twitter feeds than others? While intelligent and useful searching of microblogs is important, I don't think the solution involves treating conventional blogs and microblogs as if they were the same. Instead, I think we need "conversational search" that is just for microblogs, forums, and any other searchable forms of online chat. Thus far, the giants have been slow to recognize this need.
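To make the ranking question a bit more concrete, here is a minimal sketch in Python of what keeping the two kinds of posts apart might look like. Everything in it is an assumption made for illustration: the `Post` fields, the `authority` function, and the 0.7/0.3 weights are hypothetical and do not describe how Google, Summize, or any other engine actually ranks anything. The idea is simply that blog posts are scored by inlinks, microblog posts by replies and followers, and the two are never sorted into the same result list.

```python
import math
from dataclasses import dataclass


@dataclass
class Post:
    text: str
    kind: str           # "blog" or "microblog"
    inlinks: int = 0    # links from the outside Web (mostly relevant to blogs)
    replies: int = 0    # responses to the post (mostly relevant to microblogs)
    followers: int = 0  # followers of the author (microblogs only)


def authority(post: Post) -> float:
    """Score a post using only the signals that make sense for its kind."""
    if post.kind == "blog":
        return math.log1p(post.inlinks)
    # Hypothetical weighting: replies count for more than raw follower numbers.
    return 0.7 * math.log1p(post.replies) + 0.3 * math.log1p(post.followers)


def search(posts: list[Post], keyword: str, kind: str) -> list[Post]:
    """Return matching posts of one kind only, best-scored first."""
    hits = [
        p for p in posts
        if p.kind == kind and keyword.lower() in p.text.lower()
    ]
    return sorted(hits, key=authority, reverse=True)
```

Because each kind is scored on its own scale, the awkward comparison between a linked-to blog post and an unlinked tweet never has to be made; a blog search and a "conversational" search are just two calls to `search` with a different `kind`.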

You might well ask yourself, "Is this really necessary?" After all, don't search engines search everything...isn't that what they're supposed to do? Sure. When I go to a search engine, I do expect to see everything in the general web index. Specialized search -- be it image, video, blog, or whatever -- makes things easier for me when I really want to narrow things down, though. If I do a general web search for tennis, then I expect to get a bunch of different stuff back: tennis news and results, the rules of the game, shops selling tennis supplies, etc are all appropriate first page search results for my very broad query. If I do a blog search for tennis, then I expect to get back more opinionated but still well-developed content. I don't expect just the news and results, but rather different personal takes on the news and results. I don't expect to find stores, but rather opinions about the stores and general posts with affiliate links. If I do a microblog search, I'm looking for small morsels of content: "Tennis sucks," "Tennis rocks," and "Tennis is hard on the knees." A tweet might convince me to start following someone and make a new friend. Alternatively, maybe I'm searching the microblogs just so I can explore a kaleidoscope of thought. Are people liking tennis more or less these days? Microblog search can give us a more personalized picture of shifting opinions than Google Trends can. No search engine can read minds, but I think it's safe to say that someone who is looking for blog posts about tennis does not want his search to lead him to a "Tennis sucks" microblog post. That post could be just what someone else is looking for, but I think more often than not microblog posts will just be adding noise to blog search engine results. This isn't a problem if we have conversational search.

There are already some quite decent Twitter search engines out there. At least one of them, Summize, unabashedly says that conversational search is what it does. The problem I'm seeing with these engines is that they're only searching Twitter right now. Twitter is the top dog in the microblogging world, for sure, but that doesn't mean other conversations should be ignored. As I mentioned earlier, I even think forum posts should be a part of conversational search (after all, they ARE conversations!). They already quite often show up in general web search results and have often helped me solve very specific problems; frankly, forums have proved a lot more useful to me personally over the years than microblogs have so far. There are other conversations going on elsewhere on the Net that could be indexed: for instance, I think IRSeek, which searches IRC chats, is a great service though it's been somewhat controversial. A really good conversational search engine will look for conversation everywhere and index it like mad.

Of course, the very phrase "conversational search" implies that microblogging is all about conversations. Truthfully, it isn't always. You can certainly tweet about anything you want without having any followers. You can also Jaiku haikus to your heart's content -- in that case, you're microblogging to express yourself, not to converse with others. In such instances, perhaps those particular microblog posts would be more at home amongst traditional blog posts rather than forum posts and IRC logs. Perhaps, then, "conversational search" isn't the answer, but I still think we need a way to conveniently search microblog posts and that it is best to segregate regular blog posts from microblog posts. Whoever does it will have to tackle some tough questions. I already mentioned the difficulty in determining how to rank microblog posts. What about the difficulty in actually determining what a microblog is? I assume this determination would be based on platform (for example, WordPress = blog while Twitter = microblog), but if someone writes really short posts on a Blogger or WordPress blog isn't that person really microblogging instead of blogging? Anyway, it'll be very interesting to see if one of the big Internet companies will tackle this problem or if one of the independent search engines will dominate this still fairly fringe interest instead. Google seems to be the most natural home for conversational search to me, especially since it has its own microblogging service which needs to be promoted more, but it would be a good addition to Yahoo! Search or Live Search as well.
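The "what is a microblog?" question can be illustrated with a small Python sketch that applies both tests side by side: one based on platform and one based on post length. The host list, the 140-character cutoff, and the `classify` function are all assumptions chosen for the example, not rules that any search engine is known to use.

```python
from urllib.parse import urlparse

# Hypothetical list of hosts treated as microblogging platforms.
MICROBLOG_HOSTS = {"twitter.com", "jaiku.com"}

# Assumed cutoff: anything this short "reads" like a microblog post.
MAX_MICRO_CHARS = 140


def classify(url: str, text: str) -> tuple[str, str]:
    """Return (label by platform, label by length) for one post."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    on_micro_platform = host in MICROBLOG_HOSTS or any(
        host.endswith("." + h) for h in MICROBLOG_HOSTS
    )
    by_platform = "microblog" if on_micro_platform else "blog"
    by_length = "microblog" if len(text) <= MAX_MICRO_CHARS else "blog"
    return by_platform, by_length
```

A two-word post on a WordPress-hosted blog comes back as `("blog", "microblog")`, which is exactly the ambiguous case described above: the platform says one thing and the content says another, and somebody has to decide which test wins.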

20 May 2008

Twitter may be down frequently, but Jaiku is always in beta.

Twitter has quickly become the king of the microblogging world. The darling of early adopters, Twitter is increasingly attracting mainstream Internet users as well. It should keep growing, too, because people are going to want to use the same service as their friends are already using. Twitter does have a weakness, though, and it's not something it can hide: it goes down, and fairly regularly at that. For the most part, Twitter users have proven to be an understanding lot; many of them realize that Twitter is an independent venture that has grown very big very quickly. Still, there's little that's more annoying than an unreliable communications network; Twitter's downtime ought to be fueling competition in the microblogging world.

Google's microblogging acquisition, Jaiku, has not taken advantage of Twitter's weakness. Although still frequently mentioned as being Twitter's primary competitor, Jaiku hardly offers a refuge for those seeking a more reliable microblogging alternative. If you want a Jaiku account, you have to go to the website, request an invitation, and wait. (You could also get invited directly by an existing Jaiku user, but invitations are limited.) If you want a Twitter account, you go to the website, sign up, and start tweeting. By the time Jaiku sends you an invitation, you could already have built up a network of Twitter friends. Granted, Jaiku is in beta, still a work in progress. Google has missed the boat by not putting more resources into its microblogging platform; there may never again be as good a time to build up such a service. In all likelihood, Twitter will overcome its uptime issues and consolidate its position as the top microblogging service. Jaiku may have to settle for second place or worse if it doesn't come out of beta soon or at least start allowing open signups.

On the other hand, it is indisputable that Google achieved a lot of success following a similar strategy with Gmail. The invitation-only model there created massive amounts of interest prior to the service opening up. There isn't such a huge drive for Jaiku invitations as far as I can see, and I think this is largely due to the perception that Jaiku is not that different from Twitter. Gmail quickly gained a reputation as being something revolutionary; Jaiku, on the other hand, seems to be widely considered merely a pretty Twitter alternative. Additionally, I think many people are going to prefer to stick to one microblogging service; in contrast, few people seem to have just one email address nowadays. I could be wrong, but I think Jaiku would be better off if it were open and out of beta.

12 May 2008

Yahoo! tracks backlinks where Google fears to tread.

If you've ever created a web site, you've also probably gone to the search engines to check to see who, if anyone, is linking to you. If you use both Yahoo! and Google, you've probably noticed a big difference in the number of links to your site that these two engines are reporting. Although Google is the world's most popular search engine, Yahoo! is much more thorough when it comes to counting backlinks. Indeed, many people who use Google as their main search engine use Yahoo! only to look for backlinks -- it's great to be the best at something, isn't it?

I'm not so sure Google even wants to be the best backlink counter on the Net, however. The "no-follow" attribute that is added to more and more links these days is part of the problem. I totally understand why Google doesn't want paid links and spammy links to improve a site's ranking in the search results, but I don't like the idea of the Google bot seeing a link attribute and saying to itself, "Well, I won't look there." The Google bot is supposed to look everywhere. It should know about every link on the Web, in my opinion, whether those links be no-follow or do-follow. If people want to keep their content off the search engines, robots.txt and password-protecting pages are methods that still work. Granted, no-follow makes the process of avoiding the Google bot easier -- it even has the effect of democratizing the process because users of Blogger and Geocities and other similar services as well as non-technically inclined web publishers everywhere can easily utilize no-follow links. Still, the Web is basically a public place, and I'm just not convinced at all that that many people want to have their content on the Web freely accessible to all but still hidden from search engines. After all, people who want to share content within a group but not with the outside world can use services like Google Docs and Blogger to do just that and totally control who can access their content. In my opinion, no-follow shouldn't be taken too literally; the search bots should still follow, but they should only consider do-follow links to be "votes" for a given website that need to be reflected in the search result rankings. As no-follow begins to be used more and more by people who simply don't want to pass PageRank around (except, perhaps, to their own sites and to their friends' sites), I think it'll become only more important that search engines know where all links on the Web lead.
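The "know about every link, but only count some as votes" idea can be sketched in a few lines of Python using the standard library's HTML parser. This is only an illustration of the proposal in this post, not a description of how the Google bot or Yahoo's crawler actually behaves; the `LinkCollector` class and `collect_links` function are hypothetical names. The one real detail is that no-follow links are marked in HTML with `rel="nofollow"` on the anchor tag.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Record every link on a page and whether it should count as a vote."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, is_vote) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel can hold several space-separated tokens, e.g. "external nofollow".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" not in rel_tokens))


def collect_links(html: str) -> tuple[list[str], list[str]]:
    """Return (all links the bot knows about, links that count as votes)."""
    parser = LinkCollector()
    parser.feed(html)
    known = [href for href, _ in parser.links]
    votes = [href for href, is_vote in parser.links if is_vote]
    return known, votes
```

Under this scheme a backlink report would be built from the first list, while anything resembling a ranking calculation would only ever see the second.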

Yahoo's more open-minded attitude towards no-follow lets webmasters and other interested parties find links, no-follow or not, that Google doesn't seem to even know exist. It's really not just about no-follow; Yahoo simply takes tracking and reporting links in general more seriously than Google does. When I go to Google and type in "link:del.icio.us" I want to find out who is linking to the world's biggest social bookmarking site. Google does find more than 400,000 links, so that's plenty to keep me busy and an indicator of just how popular Delicious has become. When I go to Yahoo! Site Explorer to explore del.icio.us, however, I find over 33 million links which is on another level. The difference in reported links is staggering for all sites, large and small. I still like Google for search better than Yahoo overall, but when it comes to counting links Yahoo! has a clear edge. Eventually, that link advantage could help Yahoo improve its search as well.

08 May 2008

Microsoft and Yahoo! have reembraced the status quo.

So the deal that seemed fated to so many never actually happened: Yahoo! and Microsoft remain separate companies, competitors rather than allies. As I've noted before, I think this is best from the user's perspective. I have a feeling it might be best for the two companies as well; true, there was a chance that the combined entity could pose a serious challenge to Google, but I felt there was also a chance that it could prove the downfall of Microsoft if they mismanaged their newly acquired properties. Personally, I never quite subscribed to the theory that the combination of Yahoo! and Microsoft automatically creates a major Google competitor -- it really would just create a larger competitor to Google in the short run for sure. People unhappy with the ensuing changes caused by the combination could have very well ended up migrating to Google, making Google actually a little bigger than it was prior to the deal. In Microsoft's defense, I will say that they surely viewed the acquisition as just one part of a long-term Internet strategy that would involve much more.

It didn't happen, though. All those bloggers who were so sure a deal would take place were wrong. Many financial analysts were wrong. I realize that I was also pretty wrong to take what those people were saying so seriously. Even though I'm not an expert on business acquisitions, I'm going to take any prediction of an impending deal with a grain of salt from now on. Sometimes, the experts can't really use their knowledge to make good predictions because a particular situation is unusual. Few seemed to consider how much Yahoo! did not want to be acquired and also that there would be some resistance to the deal within Microsoft as well. Understandably, I'm feeling quite skeptical now that the common expectation has become that Microsoft will launch another bid later this year after Yahoo's stock price has declined. This time, I'll believe it when it happens and not a moment before.

YouTube's Partner Program has adopted a closed model.

One of the things I admire about Google AdSense is that it is simultaneously one of the world's most open advertising networks and one of its most successful. This "open" model for advertising online has always made sense to me -- why wouldn't you want your ads to be seen by the largest number of people possible? -- but few networks can provide the considerable administrative and enforcement manpower needed to ensure that advertising will continue to work for both publisher and advertiser. AdSense and AdWords aren't perfect, but they do still work for a lot of people, including me.

Google decided to follow a very different route to sharing revenue with the video publishers of YouTube. The YouTube Partner Program requires prospective earners to meet three criteria before they can join and start making money with their videos: publishers need to put out "original videos suitable for online streaming," they must have the legal right to upload whatever they are uploading, and their videos must be popular. The last point is what this post is about, though the first two help explain why the third exists. If you are a budding video publisher, you probably would rather not do as YouTube is forcing you to do. Why would you want to put out a bunch of videos, wait to become popular, and only then start monetizing your work? Given the sudden (and often brief) explosions of popularity that online videos are prone to, waiting to be accepted into the program means losing revenue. You might well wonder, then, why YouTube won't just accept anyone who doesn't violate the terms of service into the partner program. Why can't it be easy like AdSense?

The first two criteria for joining the partner program are essentially warning those who upload copyrighted content that they need not apply. Nonetheless, copyrighted content remains a big draw for YouTube; plenty of people upload it, and many more people view it. It is probably true that most video publishers who regularly put out original content that gets a lot of views are going to be less interested in getting booted off YouTube and losing out on future revenues on their videos just so they can get some quick views by uploading copyrighted content. If your only video is thirty seconds of your baby sleeping, you might just be a little more tempted to try to make some quick bucks using someone else's work. Additionally, the fewer people that apply to the YouTube Partner Program, the less stress is placed on the staff that must review the applications. Thus, YouTube has strong organizational and legal motivations for experimenting with a closed revenue sharing model.

In the long run, I do hope the YouTube Partner Program opens up to everyone. It shouldn't be harder than AdSense -- video content shouldn't be discriminated against just because video copyright issues are more of a hot button issue than web site copyright issues. As of now, this isn't a big deal because YouTube is such a force in the web video world; it has the audience already, so publishers come to it in droves. Still, some publishers will be tempted to monetize their videos in other ways and at other venues instead of trying to first prove themselves to YouTube in order to be allowed to make money. An easy way to monetize creative work encourages creativity, but barriers to entry, even minor ones, tend to dissuade it.

It's interesting that so many people still use free hosting for their original videos even when their videos are the main content of their sites -- bandwidth concerns seem to have created this situation which has put the hosts in a position to dictate the rules to the publishers. There is, however, plenty of competition in the video sharing world despite YouTube's dominance. The YouTube Partner Program will have to compete with Revver and other sites that might offer publishers a better deal (and a smaller audience).

01 May 2008

On Web 2.0, there are hundreds of ways to bookmark.

Just about everyone who uses the Web has at least a few URLs they need to save or need to be able to access quickly -- it's a very basic need, and has been since the very beginning of the WWW. Indeed, bookmarking has been a feature offered within the browser for a very long time. For just as long, however, people have been saving URLs in notebooks, in documents, and in link collections on the Web. Social bookmarking and other online bookmarking solutions have grown at a rapid pace over the past few years, but nonetheless many people still use their browser's bookmarking utility whenever they want to save something or go to a favorite destination on the Web. What is the future of bookmarking, then? Will there continue to be many online bookmarking sites? Will old-fashioned methods of bookmarking still continue to find widespread use?

I don't think browser-based bookmarking is in any danger. (I'm afraid my term "browser-based bookmarking" might be confusing -- the idea is that the bookmarks are stored on the local computer or local home/work network rather than on the external Internet.) It doesn't go without saying that a person would want to share his or her bookmarks with the general public, so social bookmarking isn't something that will appeal to everyone. Indeed, I doubt it is very wise to let everyone on the Internet know who you bank with and have credit cards with, so some bookmarks really are better kept private. You can still keep your bookmarks accessible only to you while still using web-based services, but it is more intuitive to store private data locally. Saving copies of your local bookmarks collection is also simple and straightforward. Additionally, browser-based bookmarking has the advantage of widespread acceptance; people whose bookmarking needs are already met inside the browser may not want to learn new interfaces and use new features even if they are really cool. I expect the browsers will continue to add features to their own bookmarking utilities to keep up with the online innovators as well.

Clutter-averse individuals may particularly try to avoid online bookmarking because of the browser add-ons/toolbars that bookmarking sites tend to encourage their users to download, though often the download is optional. The biggest advantage of online bookmarks, however, cannot be matched on the browser side of things: only online bookmarking can free bookmarks from a particular computer or particular home/work network. Still, plenty of people only surf the Internet at home or work on the same computers every day; what might be vital for the traveler and the college student isn't so necessary for others.

With that said, I am sure that online bookmarking is here to stay and I expect there will continue to be many competitors in this space who will do all sorts of cool things. People like me already use multiple online bookmarking sites as well as browser-based bookmarking -- yeah, bookmark junkies do indeed exist -- and I think that could very well become much more common in the future. I use all my bookmark collections a little differently. My Firefox bookmarks are a dozen or so sites that I use often and extensively; quick access is the name of the game. My Opera bookmarks contain more categorized links than many web directories; I've been building the collection up since I was a teenager. Indeed, I've even considered using it as a basis for a web directory more than once, but laziness has prevented me from acting on this impulse. It would make a great directory, though...nothing but quality links to very informative sites. On the other hand, the two bookmark collections I maintain on Yahoo! services would make pathetic web directories. On del.icio.us, I primarily bookmark individual blog posts and other "standalone" web content. Appropriately tagged, I can find this miscellaneous material anytime I want via the search utility; a lot of it I may never actually look at again, but that doesn't matter. In fact, I don't think I've ever gone through and purged my del.icio.us bookmarks of dead links -- if I realize something no longer exists then of course I'll remove it, but I never specifically set out to prune my bookmarks there. I do prune my local bookmark collections semi-regularly. Finally, I use Yahoo! Bookmarks to save interesting URLs I find on the Web so that I can figure out what to do with them later. Some bookmarks will be incorporated into a browser-based collection while others will end up on del.icio.us; most of them, though, will probably be looked at more closely and then discarded. So Yahoo! Bookmarks isn't a permanent collection of bookmarks for me; it's sort of the Ellis Island of my bookmarking world. I doubt that my way of doing things is the most efficient, nor do I think I get the most out of any of the bookmarking methods I utilize, but I'm nonetheless quite satisfied with my present arrangement. I can't wait to think of new methods of organizing my bookmarks in even more places.

If anything, I suspect this post has shown that bookmarking can be a pretty complicated thing. The beauty is that the tools that are out there allow us bookmarkers to bookmark however we want. You don't need to make it complicated if you don't want it to be; it's all up to you. Want to sign up with a bookmarking service just so you can stash away your links to your favorite web games? You can do it while simultaneously keeping all of your serious links on another service or in your browser. Freedom is wonderful.