Statistics


Think of Infochimps.org as not only a Wikipedia of data sets, but also as potentially the greatest data mash-up tool yet.

Imagine having loads of census, weather, sports, and other statistical data available in one big database.  Then standardize the fields so that you can interconnect the data sets with each other.  From the Infochimps site:

“A central, community-driven repository solves these problems and presents amazing possibilities. Once we interconnect the datasets along concepts they share, instead of 100,000 datasets, there’s just one. Study the physics of baseball by comparing the hourly weather during every single baseball game to game outcomes. Uncover political campaign irregularities by comparing neighborhood per-capita income, historical voter trends, and public campaign finance records. Plan real-estate decisions based on what news-and-other-media keywords rank highly in each area.”
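The interconnection described in that quote comes down to joining data sets on the fields they share. Here is a minimal Python sketch of the idea. The sample rows and the field names (date, city, runs, temp_f) are made up for illustration and do not come from Infochimps, and the join_on helper is hypothetical rather than anything the site provides.

```python
# Hypothetical sample data; none of it comes from Infochimps.
games = [
    {"date": "2007-06-01", "city": "Chicago", "runs": 9},
    {"date": "2007-06-02", "city": "Chicago", "runs": 4},
    {"date": "2007-06-02", "city": "Boston", "runs": 7},
]
weather = [
    {"date": "2007-06-01", "city": "Chicago", "temp_f": 84},
    {"date": "2007-06-02", "city": "Chicago", "temp_f": 61},
]


def join_on(left, right, keys):
    """Inner-join two lists of dicts on the shared field names in `keys`."""
    # Index the right-hand rows by their key values for quick lookup.
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    joined = []
    for row in left:
        # Rows with no match on the right are dropped (inner join).
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**match, **row})
    return joined


combined = join_on(games, weather, ["date", "city"])
for row in combined:
    print(row["date"], row["city"], row["temp_f"], row["runs"])
```

The Boston game drops out because there is no weather row for it, which is exactly why standardized fields and broad coverage matter for this kind of mash-up.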

Still don’t see the possibilities?  Browse through the datasets that are already loaded.  Then check out what is coming in the near future.  This will likely be the first place we will want to go for statistical information, as long as it is fast and easy to extract what we need.  I am looking forward to seeing what they (or some other enterprising web designers) come up with to work with data on the web.

(Does this sound vaguely familiar? You might be thinking of Freebase.com, previously discussed here.)

found via Open Access News

Marshall Breeding provides his annual overview of the shifts and trends in the world of the Integrated Library System (ILS) in the current issue of Library Journal.

Of interest is the definite movement of the open source systems, Koha and Evergreen, into the mix.  I’m surprised that they haven’t made this much impact in the past year or two, but I suspect that contracts and switching effort have made the library world very hesitant to try out new technologies at the level of the ILS and OPAC until they sense that others are using them effectively.

We could stand more experimenting, even on the sandbox level… who knows how much better we could be with a bit of time and effort?

BookLamp.org is a web 2.0 application that does something new with book recommendations. Their approach is to avoid any book selling sites and focus only on responses from readers. This provides benefits when one thinks about libraries — people often don’t buy the books they are reading.

The other new approach is how they break down the book information: they create a chart showing, chapter by chapter, the Pacing, Density, Action, Description and Dialog within the book. Here is a chart for one of my favorite reads, To Say Nothing of the Dog (click the image for the larger version):

BookLamp.org Chart

In the end, this will show similarities beneath the surface, as well as justifying why (in my own opinion) there is a natural progression for readers to begin with Harry Potter, move on to Lord of the Rings (with a brief visit to Narnia), and then on to The Dark Tower series.

If this seems like an interesting approach, sign up for an account. There isn’t much there yet, but what is there looks really promising. I hope they will consider OPAC integration sometime in the future…

found via LISNews

UNdata is a search tool for the many informational databases that the United Nations maintains. It is straightforward, easy to use, and effective at retrieving what you need.

If only the UN as a whole worked so well ;-)
via OSDir

Remember how, about 10 years ago, the concept of a “paperless office” began to seem like a weird joke? The proliferation of the desktop computer and the ascent of the internet introduced the potential of forgoing paper documents, relying instead on electronic versions. The source of the joke was that instead of reducing our paper use, having access to all these e-mails, websites and electronic documents increased our print output.

An article titled Pushing Paper Out the Door in today’s New York Times documents that paper use has plateaued, and is currently in decline. The actual cause? People saving money on ink, toner and paper. This matches what I have seen in various libraries: people tend to be more conservative when their own resources are being used.

As far as printing from public workstations and labs is concerned, I like the idea of including a certain number of printed pages in one’s account, then charging for any additional printing. This seems to strike a balance that allows for modest printing without being overly commercial about it.
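The allowance idea is simple enough to sketch in a few lines of Python. The function name, the 100-page allowance, and the 10-cent price are all hypothetical numbers chosen for illustration; they are not any library's actual policy. Amounts are kept in whole cents to avoid floating-point rounding.

```python
def printing_charge_cents(pages_printed, free_pages=100, cents_per_page=10):
    """Return the amount owed, in cents, after the free allowance is used up.

    The default allowance and price are made-up example values.
    """
    billable_pages = max(0, pages_printed - free_pages)
    return billable_pages * cents_per_page


# A patron under the allowance owes nothing; one over it pays only for the extra.
print(printing_charge_cents(80))
print(printing_charge_cents(130))
```

With these example settings, a patron who prints 130 pages pays only for the 30 pages beyond the allowance.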

link passed to the Web4lib list by Bernie Sloan

Visualizing the Bible is a project by Chris Harrison, a doctoral student at Carnegie Mellon University. It consists of visualizations of biblical references and social networks. Check out his other projects as well, such as his Wikipedia Top 50 and Clusterball.

found via if:book

Does your library track reference statistics?  If the answer is “yes” (or even “possibly in the future”), then check out the READ Scale website.  Their system for categorizing and recording reference transactions via a six-level hierarchy is both straightforward and powerful.  They even provide a guide for implementing the READ Scale at your own library.

found via Extensible Librarian

The Pew Internet & American Life Project has released a report (PDF here) that you should read. Really. It will likely challenge assumptions that we make regarding who uses libraries and why. Here are a few of their findings as examples:

Problem Solving Behavior (from Major Questions and Findings):

  • 58% of those who had recently experienced one of those problems said they used the internet (at home, work, a public library or some other place) to get help.
  • 53% said they turned to professionals such as doctors, lawyers or financial experts.
  • 45% said they sought out friends and family members for advice and help.
  • 36% said they consulted newspapers and magazines.
  • 34% said they directly contacted a government office or agency.
  • 16% said they consulted television and radio.
  • 13% said they went to the public library.

Public Library Use, by Generation (from Chapter 3):

  • After Work (age 72+) - 32%
  • Matures (62-71) - 42%
  • Leading Boomers (53-61) - 46%
  • Trailing Boomers (43-52) - 57%
  • Generation X (31-42) - 59%
  • Generation Y (18-30) - 62%

Regarding the second set of statistics, this is a dramatic turnaround from a survey in 1996 (from Chapter 9) which showed 18- to 24-year-olds being the “least supportive” of libraries.

Another interesting note is that those with broadband access to the internet are more likely to use a public library than those with lower or no access to the internet (from Chapter 3). This finding surprised me.

Read the report; there is a lot more there to catch your attention. What surprises you? What confirms your circumstances? What does it all mean?

We are in a time of great change for libraries. The internet, social networking, wireless access, and broad access to computers are all radical forces that are going to alter our jobs and environments in ways we still cannot fully imagine. Understanding and implementing Library 2.0 concepts is only a start (but a necessary one).

We need to understand that this is a revolution in information. Storing, seeking, accessing, using, and understanding information is going to be different. Different is not necessarily good. Different is not necessarily bad. It will simply be… different.

We in Libraryland need to be on top of this moving colossus, and to be doing our best to anticipate and understand where it is going. This is not only important for ourselves, but for the good we can do for society as a whole.

found on Search Engine Land

ResourceShelf has a list of highlights of the Survey of Library Database Licensing Practices.  The highlights are pretty interesting, and the cost of the complete survey results ($80 for a paper version and a whopping $89.50 for a downloadable PDF) makes the highlights that much more interesting.

Marshall Breeding has posted a chart detailing the various brands of Integrated Library System (ILS) software used by members of the Association of Research Libraries (ARL).

Most interesting, from my own perspective, is that Voyager and Millennium are the top two systems. I used Voyager for many years, and feel very comfortable with it; I have been working with Millennium for about 9 months, and am still getting used to it.

Note his brief comment about Open Source ILS use within the ARL. I personally think the shift is coming.

American FactFinder is a lookup service from the U.S. Census Bureau that will provide a decent snapshot of statistical information for a given geographical area.  Enter your zip code, city or county and there you go!

found in the third comment to this MetaFilter post

They’ve been around for a while, but I haven’t blogged about them yet, and they keep adding great features, so here is another great site for finding info related to location:

Melissa DATA has links to resources that give you information based on zip code, street address, and more.  Want to find out if an address is valid?  Know the address number and zip code, but don’t remember the street?  Mail delivery routes?  School districts?  These searches and more are at your fingertips.

found at ResourceShelf

A newly released study, Taxpayer Return-on-Investment (ROI) in Pennsylvania Public Libraries (PDF), comes to an interesting and positive conclusion: for every $10 of tax money invested in public libraries, Pennsylvania taxpayers receive a return of $55.

found on ResourceShelf

The Library Salary Database (press release) from the American Library Association is an online database where you can find out what people earn in various library positions at various places in the country.

It seems like a good resource, but the ALA is charging quite a bit for access:  $250 per year for non-members and $150 per year for members.  Their “special deal” is $30 for one month’s access.  I could understand a reasonable fee for access if you were a non-member, but I suspect that they have obtained these figures from surveys of their membership, and to charge this much for access seems excessive.

The book format of this costs quite a bit less than database access:   between $63 and $100, depending on membership and whether you want MLS or non-MLS information.  And why does the non-MLS book cost at least $30 more?

To the ALA:  if you make this resource much more reasonably priced, perhaps making it a no-added-cost benefit for members (or perhaps even “in exchange for providing salary information”), you will find it a much more popular and respected resource.  I don’t believe it will get wide use in its present form.

from ResourceShelf 

Wouldn’t it be nice if there existed a web page containing links to all the Blue Books for the states? There are two:

ALA’s GODORT Wiki

Bradley University’s Wiki

If you look under the history tab for each of the pages, you will see that Bradley University’s page was the likely source for the GODORT page.

from ResourceShelf

The ECAR Study of Undergraduate Students and Information Technology, 2006 (3MB PDF here), the third annual report, has been released.  These have been very informative reports, containing information on a broad range of student/IT interactions.

thanks to Bill Drew for posting this at Web4Lib

Yes, you are a library geek if you understand that the title does not refer to a quick workout plan, but to one of the best statistics resources for the U.S.

The Statistical Abstract of the United States: 2007 is available online.

from ResourceShelf (other great links in this post, so check it out)

Do you have a favorite word?  A favorite words list?  Wordie is for you!

Wordie is, according to their page, “Like Flickr, but without the photos.”  You can look up words, create your own list of words you like, and see people’s lists of words.

from TechCrunch

Diet Television is an interesting site for comparing and contrasting the many diet offerings out there.

Adjust the sliders along the left side of the screen to show the relative importance of various diet-related issues (speed of weight loss, eating out, feeling hungry, and various food choices), and the ranking of diets changes to reflect the responses of people who have used those diets.

I found it just a tad strange that it popped up with one particular diet plan’s rating at 100%, but it seems to choose the various diets randomly, so you can reload the page to see different diets and how they were ranked.
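I don't know how Diet Television actually computes its rankings, but the slider behavior can be approximated with a weighted average. In this Python sketch the diet names, the 0-10 ratings, the issue names, and the rank_diets function are all invented for illustration; treat it as a guess at the general technique, not the site's method.

```python
# Hypothetical ratings (0-10) from people who tried each diet; not real site data.
ratings = {
    "Diet A": {"speed": 8, "eating_out": 3, "hunger": 4},
    "Diet B": {"speed": 5, "eating_out": 7, "hunger": 8},
    "Diet C": {"speed": 6, "eating_out": 6, "hunger": 5},
}


def rank_diets(ratings, weights):
    """Rank diets by weighted average rating, best first.

    `weights` maps each issue to a slider position (0.0 = ignore it).
    """
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one slider must be above zero")
    scores = {
        name: sum(weights[issue] * diet[issue] for issue in weights) / total
        for name, diet in ratings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Only speed of weight loss matters.
print(rank_diets(ratings, {"speed": 1.0, "eating_out": 0.0, "hunger": 0.0}))
# Eating out and not feeling hungry matter most.
print(rank_diets(ratings, {"speed": 0.2, "eating_out": 1.0, "hunger": 1.0}))
```

Moving the sliders changes the weights, and the order of the diets shifts accordingly: with these made-up numbers, Diet A leads when only speed counts and falls to last when the other two issues dominate.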

from TechCrunch

Neighboroo is one of the many mashups that use Google Maps as the user interface. It gives you a visual geographic layout of a variety of statistics, does a great job explaining those stats, and highlights them by specific location.

Go to your chosen location, run through the categories, and see what I mean….

from Monkey Bites
