I came across another interesting paper on multilingual search.
Internet Searching and Browsing in a Multilingual World: An Experiment on the Chinese Business Intelligence Portal (CBizPort) (PDF), Journal of the American Society for Information Science and Technology, July 2004.
It gives some good practical background on a project to build a Chinese-language business search portal. It also describes an approach to automatically summarizing and categorizing search results. Here is its description of the Chinese search landscape, which illustrates some interesting language-locale subtleties.
“Chinese is the primary language used by people in mainland China, Taiwan, and Hong Kong. Language encoding, vocabularies, economies, and societies of the three regions differ significantly. Regional search engines, therefore, have been developed to provide Internet searching.
In mainland China, the major search engines include Sina and Baidu. Baidu currently powers over 80% of Internet search services in China, including ChinaRen, 163.net, etc. The database of Baidu stores over 60 million Web pages collected from mainland China, Hong Kong, Taiwan, and Singapore, and grows at a speed of several hundreds of thousands of Web pages per day. Sina is an Internet portal providing comprehensive services such as Web searching, e-mail, news, business directory, entertainment, weather forecast, etc. From our review of search engines in mainland China, we found that Baidu has better search capabilities than the others, as shown by its content coverage. Sina has a wider scope of functions than Baidu.
In Taiwan, the two major Internet search portals are Openfind and Yam. Openfind, established in 1998, is one of the largest portals in Taiwan. In addition to basic searching, Openfind suggests terms that are highly associated with users’ queries to help them refine their search. It also allows users to find more related items from each search result and highlights the query terms in the results. Established in 1995, Yam provides comprehensive online services. Its four major focuses are content, communication, community, and commerce (4C). Yam’s search engine allows users to search various media: Web sites, Web pages, news, Internet forum messages, and activities (in 18 Taiwan cities or regions). We found that Openfind has better functionality and content coverage, but Yam was better established in the local market (e.g., it powers the search function of the Taiwan government’s Web sites).
In Hong Kong, due to its bilingual culture, people rely on both English and Chinese when accessing and searching the Internet. Major search portals include Yahoo Hong Kong and Timway. Of these, Yahoo Hong Kong is one of the most popular. Yahoo Hong Kong’s search engine returns results in different categories, Web sites, Web pages, and news. Headquartered in Hong Kong, Timway provides services such as Web searching, Web directory, e-mail, news, forums, etc. Its database stores over 30,000 Hong Kong Web sites and over 10 million Web pages.”
In other words, just because Chinese is a single language does not mean that one Chinese search engine will be enough for everyone who wants to search in Chinese. Different groups of people search in Chinese, each with different local requirements, and these requirements have given rise to a number of different search engines for Chinese.