Code for Librarians
The NIH Library Information Architecture Branch creates solutions for Librarians and Informationists. The following code is being shared with the library community to assist others in meeting the complex demands of our respective customers.
Excel VBA code to assist in obtaining citation counts from Web of Science ®
Customers of Thomson Reuters can obtain access to the Links Article Match Retrieval Service (Links AMR) API here. Once you obtain authorization from Thomson Reuters your IP address will be able to submit queries to the API.
IAB was asked to provide access to the API in the simplest form possible. The customer had provided an Excel spreadsheet with over 17,000 articles listed. The columns included PMID, Article Title, Authors, Journal Name, and Year of Publication. We added the Time Cited column.
In the Excel spreadsheet we enabled macros and then clicked on the Visual Basic for Application icon to insert the macro code (shown above). We created four modules for this project. The lines the begin with an apostrophe are comments. Important note - within the Visual Basic for Application window that opens, you need to click on Tools, References, and check the box next to Microsoft XML 6.0. This reference can change over time so your experience may vary.
Module 1 - SendXML() - Creates the XML and Posts it to the Links AMR API. Uses a PMID for Lookup and Returns the timesCited
Sub SendXML()
'Excel Macro that utilizes Web of Science Links Article Match Retrieval Service API to retrieve timesCited when a PMID is provided
'HTTP variable
Dim myHTTP As MSXML2.XMLHTTP
'HTTP object
Set myHTTP = CreateObject("msxml2.xmlhttp")
'Create dom document variable stores the xml to send
Dim myDom As MSXML2.DOMDocument
'Create the DomDocument Object
Set myDom = CreateObject("MSXML2.DOMDocument")
'Load entire Document before moving on
myDom.async = False
'xml string variable
Dim myxml As String
'This is the XML that will be submitted to the API. Notice the val name="pmid" and the ActiveCell.Value
'You could substitute pmid with doi or any of the other variables that the API will accept. Make sure the cursor is on the value that is being used to do the lookup.
'In this case we would placethe cursor on the first PMID in column A and have a emtpy column B to receive the timesCited number retrieved.
myxml = "<?xml version=""1.0"" encoding=""UTF-8"" ?>" & _
"<request xmlns=""http://www.isinet.com/xrpc42"" src=""app.id=API Test"">" & _
"<fn name=""LinksAMR.retrieve"">" & _
"<list>" & _
"<map>" & _
"</map>" & _
"<map>" & _
"<list name=""WOS"">" & _
"<val>timesCited</val>" & _
"</list>" & _
"</map>" & _
"<!-- LOOKUP DATA -->" & _
"<map>" & _
"<map name=""cite_1"">" & _
"<val name=""pmid"">" & ActiveCell.Value & "</val>" & _
"</map>" & _
"</map>" & _
"</list>" & _
"</fn>" & _
"</request>"
'loads the xml
myDom.LoadXML (myxml)
'open the connection to the URL
myHTTP.Open "POST", _
"https://ws.isiknowledge.com/cps/xrpc", False
'send the XML
myHTTP.send (myDom.XML)
'Display the response - remove the apostrophe before MsgBox if you want to see the actual results in a simple pop up box
'MsgBox myHTTP.responseText
' Get the single value that is returned. You need to modify this if you want more than one value returned.
Dim objHTML As Object
Set objHTML = myHTTP.responseXML.DocumentElement.SelectSingleNode("//map")
strHTML = objHTML.Text
Set objHTML = Nothing
'Place the value in the adjacent column
ActiveCell.Offset(0, 1).Value = strHTML
Set objXML = Nothing
End Sub
Module 2 - DoCited() - Simple macro to run the SendXML module submitting all of the PMIDs in column A until it encounters an empty cell.
Sub DoCited()
' DoCited Macro while A1 is not empty
Do While ActiveCell.Value <> Empty
' Run the SendXML macro
Call SendXML
' Move down one cell
ActiveCell.Offset(1, 0).Range("A1").Select
Loop
End Sub
Module 3 - Lookup DOI numbers when you have the PMID.
We found it necessary to lookup citaitons by DOI numbers because some records in Web of Science did not have the PMID number. This module uses the PMID2DOI JSON based service at http://www.pmid2doi.org/. This module could use some additional code to parse the results to isolate the DOI. We chose to place the whole string returned in the adjacent column and then used Excel functions to clean up the return. When there is a successful match the return looks like this:
{"pmid":15353548,"doi":"10.1083/jcb.200404024"}
So assuming that result was in B15 we then used the function =LEFT(MID(B15,FIND("doi",B15)+6,LEN(B15)),LEN(MID(B15,FIND("doi",B15)+6,LEN(B15)))-2)
This is the module code we used to retrieve the DOI given a PMID
Sub SendPMID()
Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP")
URL = "http://www.pmid2doi.org/rest/json/doi/" & ActiveCell.Value
objHTTP.Open "GET", URL, False
objHTTP.setRequestHeader "User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
objHTTP.send ("")
SourceHTMLText = objHTTP.responseText
'If the return has the word Apache in it then there was no DOI returned. Put no DOI in the cell and move on.
If InStr(SourceHTMLText, "Apache") > 0 Then
ActiveCell.Offset(0, 1).Value = "no DOI"
Else
' Place the value in the adjacent cell
ActiveCell.Offset(0, 1).Value = SourceHTMLText
End If
End Sub
Module 4 - A simple macro to run through all of the PMIDs in column A and place the DOI found in column B.
Sub DoPMID()
' DoPMID Macro while A1 is not empty
Do While ActiveCell.Value <> Empty
' Run the SendPMID macro
Call SendPMID
' Move down one cell
ActiveCell.Offset(1, 0).Range("A1").Select
Loop
End Sub
For more information, please contact Bridget Burns.
Last modified date: Tue, 03/09/2021 - 11:41am