Linking Name Authorities and Wikipedia Entries

I love the idea of linked data. As a comic book, science fiction, and fantasy geek, I find the relationships between different works, titles, series, and the characters in them fascinating, and the idea of being able to explore all those relationships just by following links is basically Nerdvana. I will also confess to having spent entire evenings following link to link to link on TV Tropes (it’s roughly as addictive as breathing oxygen) and reading Night Court/Doctor Who crossover fan fiction on Archive of Our Own (yes, it really is a thing).

But I try to read papers on linked data and they start to talk about FRBR and triples and RDF and my eyes glaze over. I just don’t get it very well without seeing it work. But I found a way to at least create links between data that doesn’t require a massive database or knowing SPARQL or having access to paid subscription tools. I don’t think it’s real linked data because it’s one way and there’s not a described relationship, but it’s a start, and it’s something I understand so I’m going to roll with it for now and try to build on this later.

The short version is that there is a really easy way to add Library of Congress Control Numbers (LCCNs, the unique identifiers for Name Authority Records, or NARs) to Wikidata pages, which are in turn linked to Wikipedia pages. So that’s what I’m going to describe here.

The first thing we need to do is find a Wikipedia entry without an LCCN. The LCCNs are shown in the Authority Control template of the Wikipedia page. As a counter-example of what we’re looking for, I chose the Wikipedia page for Weird Al Yankovic. Of course Weird Al’s page has his LCCN listed, because he’s Polka Megastar Weird Al Yankovic. The arrow shows the location of the LCCN, which links to the Name Authority Record. Some Wikipedia pages don’t have the Authority Control template, and that is fixable, but involves editing the page and is beyond what we’re doing here so for now we’re just looking at pages with that template.

weird_al

In keeping with the novelty music theme, I found a Wikipedia page without an LCCN. That page is for Barnes & Barnes, the musical duo best known for their 1978 hit song “Fish Heads” (the video was directed by Bill Paxton, no joke, which is another reason I want linked data to take off, because the world needs to know this when they Google Bill Paxton). As you can see below, there’s a MusicBrainz ID, but no LCCN for Barnes & Barnes on their Wikipedia page. We’re going to change that.

barnes_and_barnes

We need to get the LCCN, so we search authorities.loc.gov to find the Barnes & Barnes Name Authority Record. Once we find the right record we copy the LCCN, but please make sure you have the right record so you don’t add an incorrect LCCN. In this case the 670 is for Barnes & Barnes’ 1980 album Voobaha (the reissue includes “The Vomit Song”), so we’ve got the right record. I’m trying to keep this open and accessible to everyone, but if you’re not familiar with NARs and searching authorities it might be wise to get a foundational knowledge of those before trying this. Copy the LCCN (shown here as LC control no: ).

barnes_and_barnes_record.PNG

And then go back to the Barnes & Barnes Wikipedia page and click on the Wikidata item link.

barnes_and_barnes_wikidata_item

This takes you to the Barnes & Barnes Wikidata page, which is where we’re going to add the LCCN we copied earlier. Scroll down to Identifiers, and then click on + add statement at the bottom.

barnes_and_barnes_wikidata_page.PNG

This will open up the box. Just type in LCCN and the Library of Congress authority ID option will appear. Click on that option.

barnes_and_barnes_LCCN.PNG

Then paste the LCCN you copied earlier into the text box. For this entry we need to (1) delete the space between the n and the 9, since URLs don’t like spaces. If there is no space in the LCCN you copied, you can just paste the number into the box as is. Then (2) click on save. If you’re not logged in, you’ll have to do a captcha to save.

LCCN_added.PNG
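
The space-removal step above can be sketched in a few lines of Python. This is only an illustration, assuming the only cleanup needed is stripping whitespace; the function names and the sample LCCN are made up for the example (it is not the real Barnes & Barnes number), and the id.loc.gov URL pattern is the public one for name authorities, which may not be exactly the link Wikidata generates.

```python
import re

# Public URL prefix for Library of Congress name authority records.
LOC_NAMES_BASE = "https://id.loc.gov/authorities/names/"


def normalize_lccn(raw: str) -> str:
    """Remove every whitespace character from an LCCN copied from a record."""
    return re.sub(r"\s+", "", raw)


def loc_authority_url(raw: str) -> str:
    """Build an id.loc.gov URL from a copied LCCN (hypothetical helper)."""
    return LOC_NAMES_BASE + normalize_lccn(raw)


if __name__ == "__main__":
    copied = "n 12345678"  # hypothetical LCCN, for illustration only
    print(normalize_lccn(copied))
    print(loc_authority_url(copied))
```

If the LCCN you copied has no space in it, the function returns it unchanged, which matches the "paste it as is" case.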

And once we’ve saved, we can see the LCCN in the Wikidata page. It’s a link, and you can click on it to make sure you’ve entered the right number (recommended; I really suggest checking so you don’t create an incorrect link). And there it is.

lccn_added.PNG
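
If you would rather check the saved value with a script than by eye, here is a rough Python sketch. It assumes the standard Wikidata entity JSON layout (claims keyed by property ID, each with a mainsnak) and that P244 is the property behind "Library of Congress authority ID". The function names, the User-Agent string, and the sample data are my own placeholders, and the fetch helper is untested against redirected items.

```python
import json
from typing import List
from urllib.request import Request, urlopen

LCCN_PROPERTY = "P244"  # "Library of Congress authority ID" on Wikidata


def lccns_in_entity(entity: dict) -> List[str]:
    """Return every LCCN value recorded on a Wikidata entity's JSON."""
    values = []
    for statement in entity.get("claims", {}).get(LCCN_PROPERTY, []):
        snak = statement.get("mainsnak", {})
        # Skip "no value" / "unknown value" statements, which carry no string.
        if snak.get("snaktype") == "value":
            values.append(snak["datavalue"]["value"])
    return values


def fetch_entity(qid: str) -> dict:
    """Download one entity's JSON from Wikidata (needs network access)."""
    url = f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
    request = Request(url, headers={"User-Agent": "lccn-check-sketch/0.1 (example)"})
    with urlopen(request) as response:
        data = json.load(response)
    return data["entities"][qid]


if __name__ == "__main__":
    # Hand-made sample in the same shape as the real JSON; not real data.
    sample = {
        "claims": {
            "P244": [
                {"mainsnak": {"snaktype": "value",
                              "datavalue": {"value": "n12345678", "type": "string"}}}
            ]
        }
    }
    print(lccns_in_entity(sample))
```

An empty list back from `lccns_in_entity` means the item has no LCCN yet, which is exactly the kind of page this post is looking for.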

This won’t appear in the Wikipedia page for Barnes & Barnes right away (it seems to take about 24 hours to refresh, but that’s just a guess), but when it does we get this:

fish_heads_woohoo.PNG

Success! I didn’t add the WorldCat Identities link. I suspect that was done automatically when the LCCN was added, but I’m not sure, and there may be more people interested in adding links to Barnes & Barnes than I think. And now the Wikidata page (and the Wikipedia page through it) is linked to the Library of Congress Name Authorities. The link is only one way (we didn’t create a link from the NAR page to the Wikidata page), and there’s no relationship described so I’m pretty sure it’s not linked data, but it’s at least something I understand and can tinker with for five or ten minutes while I’m waiting on something else.

On the other hand: so what, who cares? It’s a fair question. And I don’t know what this accomplishes at the moment, beyond the obvious (anyone who does know, please comment). But it at least connects NARs to a big open database, and hopefully people experimenting with that database will find a use for it. Until then, I’m happy just doing it for the sake of doing it.

If you’re looking to experiment with this yourself, for some reason it seems like a lot of musical groups don’t have their LCCN on their Wikipedia page. They’re easy to pull out of authorities.loc.gov: just do a keyword authorities (all) search for music group, then pick one and check the Wikipedia page.

Thanks!

Creating Name Authority Records

I’m going to start off by saying what this post is not. It is not comprehensive instructions on creating Name Authority Records (NARs). It is not a how-to guide on creating simple NARs. It is most emphatically not all you need to know to create NARs. It is just a quick description of the mechanical steps that are (roughly) what you would do to create an NAR. So don’t take this as a guide, or even a reference, because things change over time. But with that very lengthy disclaimer out of the way, if you’re curious about how people create NARs, this might be helpful.

You do need to go through NACO training to add records to the file, but you can experiment with this up to the point of adding them to the authority file if you’re curious. If you try it and like it the training is done remotely for new institutions, and there is an option to become a part of a funnel if that works better for you than becoming a contributing institution (it would depend on how many records you think you’ll generate). You can find more information on all this at https://www.loc.gov/aba/pcc/naco/training/, and everyone I have worked with through NACO has been fantastic.

I work in OCLC Connexion by the way, so that’s where my images are from. Hopefully it generalizes to other people’s cataloging software.

Step 1 is to catalog something that has an author without an existing NAR. For me, this usually means something older, like this 1909 pamphlet in our Abraham Lincoln Collection. The author is a U.S. Representative from Illinois, so when someone put the author in the 100 field they found the life dates and built the 100 field like an NAR. They didn’t create the NAR to control it, but they used the basic formatting and information you would find in an NAR, and they could do that because he was well known enough that information was available. This makes it quite a bit simpler to create the NAR, and sometimes you only get the author’s name (or even less) from the work in hand.

01

So you search the authorities to see if Rodenberg has an existing NAR. I did and he didn’t, but it’s also a good idea to search for personal names in OCLC records to see which name he used when writing his works. He might also have written something under a variant name that might have an NAR attached (Bill Rodenberg, for example). You go through the OCLC records and the statement of responsibility in the 245 field to determine what name he actually used when writing, and once you have that figured out you can begin creating your NAR. I use the macro in OCLC.

02

It’s literally as simple as dropping your cursor in the 100 field (or 700 field if you’re doing an additional creator) and running the macro, which gives you the below. This is the bare minimum NAR, and if we didn’t have those life dates and the name was all we could find, that would satisfy the requirements for an NAR as long as it was a unique string that wasn’t the same as another NAR’s heading. If another author already has an NAR with that same heading, you would need to do something to differentiate them (adding a middle name or year of birth and/or death are probably the two most common ways). In this example, we would also need to add a 670 citing OCLC for the life dates, and the 046 field with those years. But that would be enough to create a valid NAR.

03
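
To make the "unique string" idea a bit more concrete, here is a rough Python sketch of checking a proposed heading against headings that already exist. It is a deliberate simplification and assumes a crude comparison (ignore case, accents, and punctuation); the real NACO normalization rules are more detailed than this, and the function names and sample headings are made up for the example.

```python
import re
import unicodedata
from typing import Iterable, List


def comparison_key(heading: str) -> str:
    """Crude comparison form: no accents, no punctuation, lower case."""
    decomposed = unicodedata.normalize("NFKD", heading)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Turn punctuation into spaces, then collapse runs of whitespace.
    cleaned = re.sub(r"[^\w\s]", " ", without_marks.lower())
    return " ".join(cleaned.split())


def conflicting_headings(candidate: str, existing: Iterable[str]) -> List[str]:
    """Return the existing headings that compare equal to the candidate."""
    key = comparison_key(candidate)
    return [heading for heading in existing if comparison_key(heading) == key]


if __name__ == "__main__":
    # Hypothetical headings, not taken from the actual authority file.
    existing = ["Doe, Jane", "Doe, Jane, 1850-1920"]
    print(conflicting_headings("Doe, Jane.", existing))
    print(conflicting_headings("Doe, Jane, 1901-", existing))
```

A non-empty result is the signal that the heading needs something extra, such as a middle name or dates, before it can stand as its own NAR.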

But that’s no fun. We can add a lot of other information too, and I’m a big enough believer in linked data that I’ll usually take the time to do it, particularly if the creator is relevant to my area of Illinois or to Illinois State University (the people who sign my paychecks). The record below is considerably more fleshed out, and has:

- 024 field which links it to his entry in the Biographical Directory of the United States Congress
- 046 field giving his full date of birth and death (should be in EDTF format)
- 370 field giving his place of birth and (in subfield b) place of death
- 373 field giving an affiliated institution
- 374 field giving his profession
- 375 field giving his gender (be careful with this one, mis-gendering people is not cool)
- 377 field showing which language he is associated with
- 378 field showing a fuller form of his name (if he had written some of his works as William Augustus Rodenberg I would have included this in the 100, but I couldn’t find any where he did)
- 670 fields: one showing the work in hand I generated the NAR for, and a second showing where I got all that other information from

How much of this you add depends on how much research you want to do. I had a Wikipedia article handy so I could add things quickly (and could have added more information if I wanted to or thought it was relevant).

04
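
As a rough way to keep track of those optional fields, here is a small Python sketch. It assumes you have a plain list of the MARC tags present in a draft record (however you get that out of your cataloging software), and it only checks the simplest EDTF date shapes (YYYY, YYYY-MM, YYYY-MM-DD). The names and the tag descriptions are mine, paraphrased from the list above, and it is not a substitute for real validation.

```python
import re
from typing import Iterable, List

# Optional fields mentioned in this post, in the order they were listed.
OPTIONAL_TAGS = {
    "024": "identifier from another source",
    "046": "dates of birth and death",
    "370": "places of birth and death",
    "373": "affiliated institution",
    "374": "profession",
    "375": "gender",
    "377": "associated language",
    "378": "fuller form of name",
}

# Simplest EDTF dates only: year, year-month, or year-month-day.
SIMPLE_EDTF_DATE = re.compile(r"\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?")


def is_simple_edtf_date(value: str) -> bool:
    """True for YYYY, YYYY-MM, or YYYY-MM-DD (no per-month day check)."""
    return SIMPLE_EDTF_DATE.fullmatch(value) is not None


def missing_optional_tags(tags_in_record: Iterable[str]) -> List[str]:
    """List the optional tags from this post that the draft does not have."""
    present = set(tags_in_record)
    return [tag for tag in OPTIONAL_TAGS if tag not in present]


if __name__ == "__main__":
    draft_tags = ["100", "046", "670"]  # a bare-minimum draft, for illustration
    for tag in missing_optional_tags(draft_tags):
        print(tag, "-", OPTIONAL_TAGS[tag])
    print(is_simple_edtf_date("1909-02-12"))
```

Nothing here says a record needs all of these fields; the list is just a reminder of what could be added if the research is easy to do.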

We cross-check each other’s NARs here, so we save them to the online file and email each other numbers to check. Generally, though, once you’ve done all that optional (but fun) stuff and are satisfied with your record, you check again to make sure no one else has added an NAR for the creator you’re working on in the meantime (all this checking is because multiple NARs for the same creator are really bad for authority control), and then you submit it to the authority file. This is as easy as clicking a button in OCLC (normally it would be blue, but I was logged into an account that doesn’t have permission to add authority records at the time).

06

It then goes into distribution, gets an ARN and LCCN, and in about a week will be in the public view for the authority file.

05

Once it’s in distribution you get the joy of going through OCLC and controlling the headings in the bib records and seeing them turn blue (I know how to have a good time!). We don’t do it for non-English language bib records (other countries don’t necessarily use Library of Congress NARs and dislike us changing their records), but for some of the more historical authors it can really pull a lot of variant 100 fields together under a common heading.

07

And then we get into the super optional area here, but if I used a Wikipedia article to generate an NAR, I try to remember to drop the LCCN into Wikidata so the two are linked. It’s super simple to do: you just go into Wikidata, paste the LCCN into the field, and hit save.

09

Takes less than a minute and you’re done.

10

So this is super basic and very general (and I omitted or glossed over a huge number of things and special circumstances you need to be aware of if you’re creating NARs), but I think it describes the basic mechanical process of how a creator gets from the item in hand to the Authority File well enough. Questions, comments, suggestions always welcome.

Thanks!