
6:58 a.m. • 2-11-12

Tara Calishain kittie and her Kindle

TechTalk Blog: Consumer Tech News

 Want to stay current on the latest tech issues and trends? Find out about cool stuff you can use, news you should be aware of and resources that should come in handy with WRAL's TechTalk with technology writer and researcher Tara Calishain.


Duke University puts old yearbooks online

Duke University has made back issues of its yearbook, The Chanticleer, available on The Internet Archive. The back issues cover 1918 to 1960 and are available via an Internet Archive search.

This isn't an extensive presentation where you can look up individuals by name or class year. Instead it's digitized versions of the books themselves, with each yearbook as an individual entry. They're free to download and are available in several formats, including PDF and text files (which you're probably familiar with) and DjVu and Flip formats (which you may not be familiar with).

Note that these are pretty big files unless you just want the text version. Even a plain ol' PDF is over 25MB. Make sure you have a good connection before you start downloading.

Scanning in all these yearbooks and putting them online was a tremendous amount of work. However. It was driving me CRAZY that there was no way to search the text of all this yearbook content at once. You suspect Uncle Fred went to Duke, but you're not sure what year, and tracking him down might take some digging. I wondered if Google was indexing the content of the yearbooks, but running this search:

"duke university" chanticleer site:archive.org filetype:pdf

didn't find anything. So I pulled out links to all the text-only versions of the yearbooks and built a Google CSE.

A Google CSE -- a Custom Search Engine -- is just a search engine that looks and operates like Google but searches only the pages and sites that you've specified. It allows you to create your own search engine for little slices of the Web -- in this case, a very tiny slice.

The problem is, when I first made the Google CSE for searching the Duke Chanticleers, it didn't work. I didn't get any results for any of the searches I did. That's probably because Google had not indexed that content yet. But after I let the CSE sit for a couple of days, I could get results when I ran searches.

If you want to search the text of the Duke Chanticleer yearbooks, visit my Chanticleers Google Custom Search Engine at its Google Custom Search page. (WRAL's content management system doesn't allow me to embed search engines in my blog posts. But if you visit that page you'll see that there are ways for you to put the Chanticleer search on your own Web site if you like.) You'll see that this little search engine is only searching 43 pages, but each of these pages is the content of a yearbook.

A note on searching this content: if you want to find a name, be sure to search for it both backwards ("Smith Fred") and forwards ("Fred Smith"), as the names are sometimes listed backwards in the yearbooks.
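If you have a whole list of relatives to look up, building both versions of each name by hand gets tedious. Here's a minimal Python sketch of the idea. The function name `name_queries` is my own invention for illustration, and it assumes the last word of the name is the surname, which won't hold for every name.

```python
from typing import List


def name_queries(full_name: str) -> List[str]:
    """Return quoted search phrases for a name, forwards and backwards.

    Assumption: the last word is the surname, so "Fred Smith"
    becomes "Smith Fred" in the backwards version.
    """
    parts = full_name.split()
    if len(parts) < 2:
        # A single word has no backwards form; return it once.
        return ['"{}"'.format(full_name.strip())]
    forwards = " ".join(parts)
    # Move the surname to the front and keep the rest in order.
    backwards = " ".join([parts[-1]] + parts[:-1])
    return ['"{}"'.format(forwards), '"{}"'.format(backwards)]


if __name__ == "__main__":
    for query in name_queries("Fred Smith"):
        print(query)
```

Paste each phrase into the search box in turn. The quotation marks ask the search engine to treat the name as an exact phrase.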

Now, what do you do once you find a name? Let's try an example. I'll try searching for "Smith Fred". I get three results, so I'm ready to go back to the Internet Archive and get copies of the yearbooks in order to go rummaging around for Fred. But how do I know which yearbooks to download?

Look in the URL of the search result. The first result of my Fred Smith search had this page URL:

www.archive.org/stream/chanticleerseria1932duke/chanticleerseria1932duke_djvu.txt

The URL will tell you what year of the Chanticleer to look at -- in this case, 1932. The only one that won't tell you that is this URL:

www.archive.org/stream/chanticleerseria00duke/chanticleerseria00duke_djvu.txt

... and I can tell you that one's 1950.
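If you end up with a lot of results, you can pull the year out of each URL automatically. This is a rough Python sketch, not anything official from the Internet Archive. The function name `yearbook_year` and the exceptions table are my own, and it assumes every other yearbook URL carries a four-digit year starting with "19", as the 1932 example does.

```python
import re
from typing import Optional

# The one identifier that doesn't carry its year, per the note above.
KNOWN_EXCEPTIONS = {"chanticleerseria00duke": 1950}


def yearbook_year(url: str) -> Optional[int]:
    """Guess the yearbook year from an archive.org /stream/ URL."""
    match = re.search(r"/stream/([^/]+)/", url)
    if match is None:
        return None  # Not a URL shaped like the examples above.
    identifier = match.group(1)
    if identifier in KNOWN_EXCEPTIONS:
        return KNOWN_EXCEPTIONS[identifier]
    # Assumption: the identifier embeds a year such as 1932.
    year = re.search(r"19\d{2}", identifier)
    return int(year.group(0)) if year else None


if __name__ == "__main__":
    print(yearbook_year(
        "www.archive.org/stream/chanticleerseria1932duke/"
        "chanticleerseria1932duke_djvu.txt"
    ))
```

If the function returns `None`, the URL didn't match the pattern, and you'll need to open the page and check the year yourself.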

Is this an ideal solution to searching the yearbooks? Absolutely not. There's a lot of data here that's not divided up, you have to have a search engine in one window and the yearbooks collection in the other, and because the text was produced by OCR software, a search might miss a name once in a while. But it's something to have the yearbooks' text aggregated all in one place, especially as a starting point if you think your Uncle Fred went to Duke but you're not sure when, and you want a quick tool to narrow down the search for his phiz.
