15- Microsoft Academic Search: An Overview and Future Directions
Lee Dirks¹
Microsoft Research Connections
I would like to brief you on what we have been doing lately with the Microsoft Academic Search
service. It started as a research project at our Beijing lab, where it has been running for almost
eight years now. Over the last eighteen months, our team in Redmond has become closely
involved in providing strategic guidance and input. Currently, we are in the process of
transitioning it from a research project into an operational service that Microsoft Research will
provide to the community. It will be a free academic search engine for tracking academic
papers, citation links, and all the various characteristics that can be extracted from papers.
What we have been doing over the last six to nine months is working directly with open access
repositories and publishers around the world to sign content agreements so we can get access to
their papers. This is all about facilitating access to the papers. At present, we have 27 million
papers across 14 domains, and we have another 100 million papers across more than 20 domains
in the queue, pending indexing. We are going to expand our content about every three months,
and are already actively evolving the site.
The content agreements I mentioned earlier, with the various open access repositories and
publishers, are there to make sure that content providers are aware that we are
making their data available for free. We are very interested in having the community use this
service as widely as possible.
I also would like to stress that we are being as transparent as possible in talking about the number
of publications and authors that we have. As soon as possible, we are going to post a list of the
publishers and all the sources of this material. We are also waiting for ORCID to come online, at
which point we intend to leverage their work and use their identifiers to help in the name
disambiguation process.
Through the Academic Search service, people will have the ability to look at citations or
publications on a cumulative or on an annual basis. The service also has some powerful
visualization abilities. For example, we will be able to show a single author in
connection with all the people they have worked with in the past (e.g., co-authors).
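As a rough illustration of the annual and cumulative views and the co-author view described above, here is a minimal Python sketch. The paper records, author names, and function names are all hypothetical; they are not taken from the Academic Search service and say nothing about how it is actually implemented.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical records: (publication year, list of authors). Not real data.
papers = [
    (2008, ["A. Author", "B. Author"]),
    (2009, ["A. Author", "C. Author"]),
    (2009, ["A. Author", "B. Author", "D. Author"]),
    (2011, ["B. Author", "C. Author"]),
]

def annual_counts(records, author):
    """Number of papers per year for one author, ordered by year."""
    counts = Counter(year for year, authors in records if author in authors)
    return dict(sorted(counts.items()))

def cumulative_counts(annual):
    """Running total of the annual counts, keyed by the same years."""
    return dict(zip(annual, accumulate(annual.values())))

def coauthors(records, author):
    """Everyone who shares at least one paper with the given author."""
    return {
        name
        for _, authors in records
        if author in authors
        for name in authors
        if name != author
    }

annual = annual_counts(papers, "A. Author")
cumulative = cumulative_counts(annual)
collaborators = coauthors(papers, "A. Author")
```

The same two functions would work for citations instead of publications if each record carried the year of the citing paper.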
Another thing I would like to highlight is the system's ability to drill down into fields and
sub-fields. For computer science, for example, you can look at the top authors, top publications, top
conferences, journals, organizations, and other characteristics. (Note that this ranking is solely
based on citation counts we have calculated.) Also, you can drill down into a sub-domain of
computer science and visualize, for example, publication activity using what we call the Domain
Trend. We believe that Domain Trend is a very useful tool for helping researchers find
co-authors, principal investigators, and even awards and people to invite to conferences. There is
also the ability to do ranking across institutions and across countries.
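To make the point about ranking by citation counts concrete, the following Python sketch totals citations per author and sorts by that total. It is only an assumption about what such a ranking could look like; the records and the function name `rank_by_citations` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical (author, citations received by one paper) pairs. Not real data.
paper_citations = [
    ("A. Author", 120),
    ("B. Author", 45),
    ("A. Author", 30),
    ("C. Author", 200),
]

def rank_by_citations(records, top_n=10):
    """Sum citations per author and return the top_n, highest total first."""
    totals = defaultdict(int)
    for name, citations in records:
        totals[name] += citations
    # Sort by descending total; break ties alphabetically for stable output.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]

ranking = rank_by_citations(paper_citations)
```

Grouping on an institution or country field instead of the author name would give the cross-institution and cross-country rankings in the same way.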
¹ Presentation slides are available at http://sites.nationalacademies.org/PGA/brdi/PGA_064019.
108 DEVELOPING DATA ATTRIBUTION AND CITATION PRACTICES AND STANDARDS
Again, all of that information is free. We have been getting some good coverage lately,
especially about some of the new features of the system. Here is a recent quote from
Nature²:
"...Meanwhile, Microsoft Academic Search (MAS), which launched in 2009 and has a
tool similar to Google Scholar, has over the past few months added a suite of nifty new
tools based on its citation metrics (go.nature.com/u1ouut). These include visualizations of
citation networks (see 'Mapping the structure of science'); publication trends; and
rankings of the leading researchers in a field."
I would like to stress the fact that the work that we are doing here is for researchers and by
researchers. That is something that we will always keep in mind as we grow and make this a
more sustainable service. We are also very interested in changing our interface and in doing
citation analysis not just of papers, but eventually also of data. We are very interested in conducting
research projects with the community. From our perspective, Microsoft Academic Search is an
open platform and we are going to be as transparent as we can about our work. We want to make
sure that this service will accurately represent how science and academia work. We are going to
make our domain coverage more extensive. We are also working on more partnerships. For
example, we are an associate member of DataCite and we are a founding sponsor of ORCID.
Finally, we are tracking these and other activities to see when and how we can integrate them
into our service.
² Butler, D. Computing giants launch free science metrics. Nature 476, 18 (4 August 2011).
doi:10.1038/476018a.