Struggle for scholarly search

With free search service IngentaConnect adding 50,000 digitised articles from academic publisher Brill to its collection, how worried should Scopus and Google Scholar be?

By Davey Winder 05 Mar 2008

The first quarter of 2008 has seen the digitisation of 80% of Brill’s content for the IngentaConnect database. As well as the obvious benefit of Brill articles containing forwards and backwards citation links to the likes of CrossRef, PubMed and CSA Illumina, there will be an option to check with Google Scholar for citation links not within IngentaConnect itself.

But the biggest advance in functionality and usability is likely to be an article recommendation feature which will look for associated articles on a ‘show me more like this one’ basis.

This is much more than just the same old, same old. With this feature IngentaConnect is moving away from just keyword and content searches. Instead, it is moving towards the ‘wisdom of crowds’ concept, and associating content with user behaviour and recommendations from a like-minded peer group.

Too much should not be read into the integration of Google Scholar’s search functionality with IngentaConnect as far as archive searching is concerned; that’s more of a notional nod to the search giant than a truly useful move forward.

Why so? Well, Google Scholar is hamstrung by its business strategy, which adopts a particularly inflexible stance as far as data reuse is concerned. Google is equally rigid, almost ridiculously so, about disclosure: it has repeatedly refused to publish full details of the scientific journals it crawls to generate its database, or to reveal how often those journals are updated. Ingenta is probably not too worried anyway, as it appears well ahead of Google Scholar in the field of serious scholarly research.

Quick and dirty
This conclusion is not some kind of search snobbery, but reflects a basic evaluation of the nature of the services on offer. IngentaConnect with its bilateral links and growing citation resource is simply a more serious research option while remaining free for personal use.

One reason why Google Scholar is better suited to ‘quick and dirty’ research is that it gathers its information from many disparate sources using the proprietary algorithm on which the Google empire is built.

Because the Google data can come from the automatic extraction of information from text or citations, publication and date details are often incomplete or missing altogether. Indeed, the overall inaccuracy of results has been the main target of the criticism poured on Google Scholar.

It can often be easier to work out what is excluded from the Google database than what is included in it. There are, for example, no peer-reviewed articles published by Elsevier (no real surprise given that Elsevier has a competing search offering in the form of Scirus). There are even some reports of bizarre database quirks which result in the number of hits increasing when a search is date-limited.

Yet Google Scholar does have a place in the overall scheme of things. It undoubtedly shook up the whole scholarly search genre by offering a totally free access service, with a searchable archive of peer-reviewed papers, books, preprints, theses and technical reports.

It’s just that the gap in the search market that it spotted and filled is in the low-rent part of town rather than prime real estate. The likes of IngentaConnect offer free access for personal account users in the posher part of Searchville. It comes back to the quick and dirty search principle, as opposed to the in-depth approach of the research professional.

Google Scholar is best used through the Advanced Scholar Search interface. From here, you can not only restrict searches to a specific subject area – such as biology, life sciences and environmental science, or social sciences, arts and humanities – but also hugely increase hit accuracy by adding operators such as author search, publication and date restriction.

Most useful, perhaps, is the author operator, which comes into its own when a word is both a person’s name and a common noun. Placing a minus sign before the operator will exclude papers by that author from all results.
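As a rough illustration of the author operator, the short Python sketch below assembles the kind of query string you might type into the search box. The helper names (scholar_query, author_clause) and the example terms are hypothetical, and the author: and -author: forms shown are assumed to match the operator syntax Google documents; nothing here calls Google itself.

```python
def author_clause(name):
    """Return an author: clause, quoting names that contain spaces."""
    return f'author:"{name}"' if " " in name else f"author:{name}"


def scholar_query(terms, author=None, exclude_author=None):
    """Build a search-box string from keywords plus optional author filters."""
    parts = [terms]
    if author:
        # Keep only papers written by this author.
        parts.append(author_clause(author))
    if exclude_author:
        # A leading minus sign drops papers written by this author.
        parts.append("-" + author_clause(exclude_author))
    return " ".join(parts)


# "Flowers" as a surname versus "flowers" as a subject.
by_flowers = scholar_query("pollination", author="flowers")
not_by_flowers = scholar_query("flowers", exclude_author="flowers")
print(by_flowers)
print(not_by_flowers)
```

The first string looks for pollination papers written by someone called Flowers; the second looks for papers about flowers while leaving out anything written by an author of that name.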

The ability to view results using a Cited By filter, which lists documents citing the article originally searched for, and the Group Of function, which quickly locates other articles in the same scholarly work, are great examples of why Google Scholar can be the perfect place to start your research.

The Library Link function is even a persuasive argument to stay there and continue with that research: libraries that have their holdings listed in the OCLC Open WorldCat database get a link on every book result that leads directly to their holdings, enabling users to find the book concerned in the local library.

The British Library agreement, which sees Google Scholar match its results against the British Library Direct document delivery service, is another example of how researchers can get quick results, with PDF scans available at relatively low cost. That holds if you are only doing the odd bit of research; at any great volume the article and service fees would soon add up, which is why a subscription service makes more sense for professional researchers.

But the part-time information worker, the home user doing a bit of academic sleuthing and the student on an assignment will all find Google Scholar great value for money. After all, by looking for preprints and early drafts of articles, it is possible to save on the fees that subscription-based services charge for access to published papers.

Again, it comes down to your definition of value. The omissions and errors contained within the full text of a preprint or draft paper might be of little consequence to students and amateurs, but could be career-threatening mistakes for professional researchers.

Like IngentaConnect and Google Scholar, Scirus is a free service. But it is one that covers more serious-searcher ground than Google Scholar thanks to a superior interface which lets users filter searches by ISSN, author affiliation, file format, journal or web source preference and information type. The information type attribute is particularly helpful in restricting searches to just patents, theses and dissertations, articles or abstracts, for example.

Subscription service
So where does Scopus fit into the scheme of things, competing with the free search services for its share of the scholarly search market?

Right at the top of the tree would seem to be the logical answer, given that it allows abstract and citation searches of more than 15,000 peer-reviewed science journals.

Scopus is as easy to use as the free services, but, as you might expect, throws in some extra bangs for your buck, such as the ability to track article citations and get email alerts as new search results surface.

You can dig deep into the citations for any article just by clicking on the citation count, and get all articles by a given author just by clicking on their name. And not only does Scopus simultaneously search its patents database, but it also lets you go straight to the patent record at the relevant patent office with a single click on the results.

An author search offers a details link in the results that can provide information about co-authors, citations received and so on. But it’s the advanced search functionality of Scopus that is its most impressive quality.

While not the most user-friendly of beasts at first glance, Scopus proves the point about the difference between professional researchers and knowledge tourists. The professional will appreciate the power of the search codes that can be added to a search from this interface (ABS = abstract, AUTH = author, AFFIL = affiliation, CONFLOC = conference location and SUBJAREA = subject area, to name but a few).

You can even mix and match, so TITLE-ABS-KEY, for example, will search article titles, abstracts and keywords together. Combining these codes with standard search operators (such as PRE/, which finds search terms separated by a specified number of words) releases the full power of a serious search engine.
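To make the mix-and-match idea concrete, here is a minimal Python sketch that assembles an advanced search string from the codes mentioned above. The helper names (field, pre, all_of) and the search terms are hypothetical, and the CODE(terms) layout and the AND connector are assumptions about how the advanced search box expects input; the script only builds text to paste in and does not contact Scopus.

```python
def field(code, expression):
    """Wrap an expression in a search code, e.g. AUTH(smith)."""
    return f"{code}({expression})"


def pre(first, second, distance):
    """first must come before second, at most `distance` words apart."""
    return f"{first} PRE/{distance} {second}"


def all_of(*clauses):
    """Require every clause to match by joining them with AND."""
    return " AND ".join(clauses)


query = all_of(
    # Titles, abstracts and keywords searched together.
    field("TITLE-ABS-KEY", pre("citation", "analysis", 2)),
    # Narrow the hits by author and by affiliation.
    field("AUTH", "smith"),
    field("AFFIL", "oxford"),
)
print(query)
```

Keeping each code in its own small helper means a clause such as CONFLOC or SUBJAREA can be added or dropped without retyping the rest of the query.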

Unlike Google Scholar, Scopus is literally upfront about the publications included in its database. Pressing the sources button at the top of every search page will list them.

IngentaConnect also makes this information available, although you have to access it via a drop-down menu, so it is not quite so obvious. Google just says: “You’ll find works from a wide variety of academic publishers and professional societies, as well as scholarly articles available across the web,” which is about as useful as a chocolate teapot.

But at the end of the day, you have to come back to that value proposition, and there is a price to pay for the Scopus functionality. Although the precise cost will vary according to the number of end-users and the type of establishment, you can expect to fork out at least £10,000 to get started.

Google will argue, with some merit, that the people paying that kind of money are not the target market for Google Scholar. But the argument rapidly falls apart when you compare Google Scholar to IngentaConnect, which also offers a no-cost model.

The truth is that outside the commercial subscription arena, there is no panacea for serious scholarly research right now. There are strengths and weaknesses in every free model, but when these tools are used with their limitations understood, there can be no denying that they have helped open up what was a closed research community to a much wider audience, and that has to be a good thing.

Ball in Google’s court
Perhaps it is still too soon to say how well Ingenta’s Brill deal or its Google Scholar integration will turn out in day-to-day use. But if IngentaConnect delivers on the promises, then Google has got a lot to be worried about.

Unless Google can revamp both the Google Scholar service and the business strategy underlying it – in particular its inflexible policies and refusal to be open about sources included in the database – then its academic search product looks in real danger of remaining a plaything for the curious rather than a serious competitor in the scholarly search marketplace.

The limits of free search
One point often overlooked by many amateur researchers is that while the searching at IngentaConnect and Google Scholar may well be free, the hits generated often involve payment of some kind to access.

If you want something more valuable than that preprint text, then you have to pay to retrieve it. That payment might take the form of a PubMed or Elsevier subscription or, if you are lucky, a pay-per-article fee.

Indeed, such free services as Scirus have often been accused of being little more than a driver pushing users firmly in the direction of other subscription-based Elsevier products such as ScienceDirect or Scopus.