British Library sticks 1 million pics on Flickr, asks for help making them useful

In 2008, the British Library, in partnership with Microsoft, embarked on a project to digitize thousands of out-of-copyright books from the 17th, 18th, and 19th centuries. Included within those books were maps, diagrams, illustrations, photographs, and more. The library has uploaded more than a million of them onto Flickr and released them into the public domain. It's now asking for help.

Though the library knows which book each image is taken from, its knowledge largely ends there. While some images have useful titles, many do not, so the majority of the million-picture collection is uncatalogued, its subject matter unknown.

Next year, it plans to launch a crowdsourced application to fill the gap by letting people describe the images. Those descriptions will then be used to train an automated classifier that will be run against the entire corpus.

The library is also soliciting ideas for presenting the collection in ways that aid tagging and metadata generation and make the pictures easier to navigate.

Promoted Comments

Dan Homerick (Wise, Aged Ars Veteran)

Some years back, Google had a labs project to help categorize images. They made a game out of it: two players would simultaneously see an image, and they would both secretly write a list of appropriate tags while a timer counted down. When time was up, the tags were compared and both players got points for having tags that matched.

In practice, I don't think the incentives were quite right. It encouraged the most simplistic descriptions only, since those were the most likely to be matched by the other player. But with some tweaking, I think it could be a good foundation for the library's efforts, in that the timed game aspect was fun enough to keep you playing "one more round."
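
For readers who want to see the matching rule spelled out, here is a minimal Python sketch of how one round of such a game might be scored. It is an illustration only, not Google's or the library's actual code: the function names, the one-point-per-match value, and the sample tags are all assumptions made for this example.

```python
def normalize(tag: str) -> str:
    """Lowercase and collapse whitespace so 'Old  Map' and 'old map' compare equal."""
    return " ".join(tag.lower().split())


def score_round(tags_a, tags_b, points_per_match=1):
    """Return (matched_tags, points) for one timed round.

    Hypothetical scoring rule: both players earn the same points,
    one award for each tag that appears in both lists after normalization.
    """
    matched = {normalize(t) for t in tags_a} & {normalize(t) for t in tags_b}
    matched.discard("")  # ignore empty entries
    return matched, points_per_match * len(matched)


# Made-up tags for a single image, as two players might type them.
player_one = ["Map", "coastline", "engraving of harbour"]
player_two = ["map", "ship", "Coastline "]

matched, points = score_round(player_one, player_two)
print(sorted(matched), points)  # ['coastline', 'map'] 2
```

Note that the more specific tag, "engraving of harbour", earns nothing because the other player did not type it, which is the incentive problem the comment describes.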

Finding the right picture is hard when you don't even know what they all are.
