Category: Classification

Dec 09 2009

A Little Star Wars Political Library Geekery


The part of me that is something of a Star Wars geek finds it fascinating that, according to one of the catalogers at my place of work, Barack Obama’s Dewey Decimal Cutter Number is…

Ob1


Sep 22 2009

Drinking from the Firehose of Metadata


Lorcan Dempsey discusses a post about metadata that he wrote a couple of years ago, and the implications for how we approach the creation and selection of information about information.  His four categories:

  1. Professional. Produced by staff in support of particular business aims. Think of cataloging, or data produced within the book industry, or A&I data.
  2. Crowdsourced. Produced by users of systems. Think of tags, reviews and ratings on consumer sites.
  3. Programmatically promoted. Think of automatic extraction of metadata from digital files, automatic classification, entity identification, and so on.
  4. Intentional. Data about choices and transactions which support analytics or business intelligence services. Think about ranking, relating, recommending in consumer sites (e.g. people who like this also like this) based on collected transaction data.

The traditional library approach has been the first category (Professional).  The downside is that it is far too time-consuming to keep up with the firehose of new resources.  When was the last time you heard someone discuss cataloging the internet?

The challenge with the remaining options is the opposite.  There is a great deal of metadata being generated, and the challenge is to organize and/or standardize what we use.

Where does this leave library catalogers, and libraries in general?  How should we focus our efforts?  Should we focus on traditional metadata creation, or should we attempt to update and adapt our processes and standards to a changing world?  Potential rewards and possible troubles await either choice.  Can we forge a path that allows us to do both, or is that doomed to failure?

Just some things to think about….

Jul 01 2009

NASA Needs A Library Solution (But So Do Libraries)


In a merging of two of my great interests, NASA has issued a Request For Information (RFI) on how best to “analyze and catalog notes from spaceflight pioneer Wernher von Braun into an electronic, searchable database or other system.”

Sample Page from Von Braun's Notes

At first glance, this is something that would be solved by using library tools and software.  However, the list of potential ways to set this up seems to illustrate the gaps in library technology (all points are mine):

  • Users should be able to see the notes as they exist.
  • The text in the notes, as well as all labels and notations, should be fully keyword searchable.
  • All elements of the notes, including text, formulas, diagrams, etc. should be able to be targeted and described in a way that allows for keyword searching.  This includes “tagging”, but also commentary, description and critique.
  • Users should be able to define relationships (create links) between ideas within the notes, as well as documents and other resources from other collections.  For instance, someone seeking information on the Saturn V Engine Bell should find all drawings, notes, diagrams, and formulas within the notes, as well as outside resources relating to all of these.

This project begs for a combination of a traditional database (for storing and searching text) with the added functionality provided by social software products.  Nothing in the list is beyond the current means of technology… think of a wiki combined with Flickr-type functionality that can utilize PDF documents and you have a good starting point.
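
To make the list above a little more concrete, here is a rough sketch of what a single annotation record and a keyword search over such records might look like.  Everything in it (the `Annotation` class, its field names, the `search` function, the sample data) is my own invention for illustration; it is not drawn from NASA's request or from any existing product.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """One described region of a scanned page (hypothetical data model)."""
    page: int                # page number within the notes
    kind: str                # e.g. "text", "formula", "diagram"
    description: str         # transcription, commentary, or critique
    tags: set = field(default_factory=set)      # user-supplied tags
    links: list = field(default_factory=list)   # ids of related items, inside or outside the notes


def search(annotations, keyword):
    """Return annotations whose description contains the keyword or whose tags match it exactly."""
    needle = keyword.lower()
    return [
        a for a in annotations
        if needle in a.description.lower()
        or needle in {t.lower() for t in a.tags}
    ]


# Illustrative data only; not taken from the actual notes.
notes = [
    Annotation(1, "diagram", "Sketch of an engine bell", {"Saturn V"}, ["external:report-1"]),
    Annotation(2, "formula", "Thrust calculation", {"propulsion"}),
]
```

With this toy data, `search(notes, "engine bell")` and `search(notes, "saturn v")` both find the first record, one through its description and one through its tag.  The `links` field is where user-defined relationships to other notes or outside collections would live.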

Why hasn’t this been done?  How many libraries and archives have document sets like this that could become a researcher’s favorite collection, with the right application of technology?  Have any been digitized with a social annotation feature?

Why do I suspect that the development of this will come from outside the library community?  We should already have projects that utilize this kind of mash-up philosophy… our collections could be so much more powerful.

There are some great things being done in the library technology realm, and many people and projects that are worth praising.  But now and then I see something like this and wonder how we missed this obvious application of existing technologies.

found via ResourceShelf and Wired Science

Apr 22 2009

Neutral Pleasure, Medium Arousal


In its continuing examination of library blogs, HotStuff 2.0 has added a visualization of emotional content.

Here is the current visualization for Libology:

Libology's Emotional Content

How to read the information, from HotStuff’s description:

  1. The overall scatter of words in the ANEW list are shown as small blue dots. This is shown simply as a guide to indicate the overall shape (as per the previous image that resembled the map of Australia).
  2. The average emotional content of each blog post is shown as a small green cross. This is calculated by looking for all occurrences of ANEW words in the blog post. The average position is then calculated. Therefore, if a blog post contained lots of strongly negative content, you would expect the green cross to be towards the bottom-left.
  3. The average emotional content of all the blog posts is shown as a larger red cross. This is calculated as before, but is the average for all of the content on the blog. Therefore, if a blog contained lots of posts with strongly positive content, you would expect the red cross to be towards the bottom-right.
  4. Word usage frequency is indicated by the transparent circles. This gives an indication of the type of words being used on the blog. Larger circles indicate that words with the same pleasure & arousal values have been used more frequently.
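
For anyone curious how the averaging in points 2 and 3 might work, here is a minimal sketch.  It is my own guess at the calculation, not HotStuff's actual code, and the tiny lexicon below uses made-up pleasure/arousal values rather than real ANEW scores.

```python
from statistics import mean

# Hypothetical (pleasure, arousal) scores; illustrative values only, not real ANEW data.
SAMPLE_LEXICON = {
    "happy": (8.0, 6.0),
    "calm": (7.0, 2.0),
    "angry": (2.0, 8.0),
}


def average_emotion(text, lexicon):
    """Average (pleasure, arousal) over every occurrence of a lexicon word in the text."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    hits = [lexicon[w] for w in words if w in lexicon]
    if not hits:
        return None  # no scored words, so there is no position to plot
    return (mean(p for p, _ in hits), mean(a for _, a in hits))


def blog_average(posts, lexicon):
    """Average over all of the content on the blog, pooling the words from every post."""
    return average_emotion(" ".join(posts), lexicon)
```

Under these assumptions, each post's green cross would be `average_emotion(post, lexicon)` and the blog's red cross would be `blog_average(posts, lexicon)`.  Pooling all of the words, rather than averaging the per-post averages, is one reading of "the average for all of the content on the blog"; the other reading would weight short and long posts equally.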

The red X falls in the Neutral Pleasure, Medium Arousal section, but definitely far enough towards the right to suggest that there is Positive Pleasure at work as well.  This seems about right, as I don’t tend to go negative all that often (and when I do I try to remain constructive), and my writing voice tends to be more formal, leading to a Neutral/Medium tone.

I haven’t seen anything on the HotStuff site that makes me feel that there is a grand truth in their categorization of various library blogs, but they are doing some really interesting experiments that provide glimpses of what is there.  I recommend checking out their listings for the library blogs that you follow.

And the title of this post?  Well, I had my Myers-Briggs type, INFP, as my automobile’s license plate for several years, so this seemed to be the way to go.

found through Walt at Random

Apr 20 2009

A Periodic Table of Visualization Methods


A Periodic Table of Visualization Methods is a great resource if you have information you need to present visually, but don’t know the best way to express it.  Simply go to the site and let your mouse hover over the examples in the different categories.  Note the creative use of the Periodic Table structure.

A word of caution:  too often people fall into the trap of using a visualization method that they find appealing, but that doesn’t quite fit their information.  In addition, focusing on the presentation of information at the expense of Keeping It Simple (a.k.a. Style vs. Substance) can lead to a result that looks great but doesn’t say anything relevant.

found via Dysart & Jones

Mar 27 2009

Free Drinks Tomorrow


Karen Coyle writes about the Library of Congress and their follow-up to the lcsh.info shuttering last fall.  In LC discovers infinity, she points out that at ALA Midwinter they not only stated that they recognized the value of the service, but that they were planning on re-releasing it as “Library of Congress Subject Headings” within 6-8 weeks.

Then she points out that 9 weeks have passed, with no changes on the website, nor updates as to the progress being made.

Any project, including re-implementing a service that was fully functional, can run into complications.  We have all experienced this.  The key is keeping people informed, and being realistic about solving problems.

LC should not have shut down lcsh.info in the manner they did; they should have implemented their version first, then allowed for an overlap (6 months, for instance) to give those who had integrated the service into their systems time to switch over.  What we have is a mess, and the pressure is on the Library of Congress to clean it up.

Feb 07 2009

LibraryThing and Authors


LibraryThing has implemented the start of a solution for the problem of distinguishing authors with the same names.

This has been a challenge for libraries since the beginning of cataloging.  The accepted solution thus far has been Authority Records.  I like that LibraryThing has found a simple, elegant solution that matches what people think and say when distinguishing between two authors with the same name.

I also like that they will be following the Disambiguation model used by Wikipedia; it works well and oftentimes leads users to serendipitous information.

Nov 19 2008

A Useful Amplification


A Useful Amplification of Records That Are Unavoidably Needed Anyway is an essay by Brett Bonfield which, dare I phrase it this way, usefully amplifies several of the major web-based entities which are intertwined with libraries.  These include (but aren’t limited to) OCLC’s WorldCat, Amazon, and LibraryThing.  Brett clearly understands libraries, and does a great job detailing the interrelationships between all involved.

Not directly related to the essay, LibraryThing has posted an expansion of their Common Knowledge fields for Authors and Events.  This is an interesting read, as it addresses in a real-world way the need for authorities and relationships.

Sep 01 2008

Lakes and Rivers


Lorcan Dempsey has a post on metadata that does a great job of illustrating two types of data collections by describing them as lakes and rivers.  The idea did not originate with him; rather, he encountered it via OCLC’s Eric Hellman.

  • Lakes are repositories of information that change little over time, and are fed from a few well-defined sources, supplemented by occasional “springs”.  A good analog for this is the library catalog.
  • Rivers are cascading flows of information, changing rapidly and fed by many sources.  The quote that describes this most effectively is often attributed to Heraclitus: “You cannot step into the same river twice.”*

This is a fantastic way to frame the ongoing transition that libraries face.  We are transforming ourselves (being forced to transform?  some combination of the two?) from a lake-based information service to a river-based information service.  We are having to learn as we go to navigate ever-changing waterways, dodging sandbars and debris in a boat that was designed over a century ago for lake use.

Keep this analogy in mind… it lends itself well.

* Wikipedia offers the following quote listed within their page on Heraclitus: “We both step and do not step in the same rivers. We are and are not.” This quote is simultaneously much more illustrative of the complexity of our situation, and much more confusing.

Jul 09 2008

Classify


Classify is a new service from OCLC which returns class numbers (Dewey, LC, and National Library of Medicine) assigned to books in WorldCat. This could be a good way to use the “wisdom of the crowd” when you are not 100% sure where to group a particular book.

I noticed that the URL had a “2” at the end, so I removed it to see what would happen. It appears to be an earlier version of the service. I didn’t have any luck with the first few ISBNs I entered, but the example links work well.

Any other changes to the URL bounce the user to the DeweyBrowser, which is a lookup service from a couple of years back. Although it also has a “2” at the end of the URL, nothing happens when one changes it. So much for rewarding curiosity ;-)

Classify found via Lorcan Dempsey’s weblog

Jul 08 2008

Open Shelves Classification


Tim Spalding of LibraryThing has started an ambitious new project: develop a new shelf classification system that would eliminate the baggage of the 100+ year-old systems many libraries have in place, as well as create a system free from the trademark, copyright, and license issues connected with Dewey.

He is looking for a few librarians (one to five) to manage the project, and has started a LibraryThing group for everyone to join in the conversation.

This just started up this morning, folks… they’re still talking letters vs. numbers and general classification philosophy. We’re talking ground floor timing, so sign up and begin discussing!

found via Tim Spalding’s post to the Web4Lib list
