
Monthly Archives: March 2011

e-journal digital preservation: to read list

CC image courtesy of Mot on Flickr

I’m in the process of researching digital preservation of e-journals for a side project, so I thought I’d share some of the things I plan on reading.  I’m only listing items that are open access, so my real list is much longer.  If you have any recommendations, I’d love to hear them!


hyping links

Taken Too Early: Remembering the children of Native American and First Nations’ Boarding and Residential Schools
This post by Jennifer O’Neal, Head Archivist of the Smithsonian’s National Museum of the American Indian Archive Center, highlights an important yet often unknown part of history.

How to Track Conferences Virtually
Miss out on SXSW?  Check out this list of resources from ReadWriteWeb.

Research Libraries See Google Decision as Just a Bump on the Road to Widespread Digital Access
This week’s decision on the Google settlement doesn’t mean the goal of creating a comprehensive digital library is over.

 

Posted by on March 25, 2011 in links

 


computers in libraries 2011: data curation workshop

This post-conference workshop was the first of many as part of the Data Curation Profiles Toolkit project at Purdue University Libraries, which is funded by the Institute of Museum and Library Services (IMLS).  Below are my notes from the workshop, but be sure to visit the website if you’d like to learn more.

Reasons for the workshop and project:

  • assess information needs of researchers
  • raise awareness of curation issues with researchers
  • grow the collection of data curation profiles to analyze and develop a community of practice
  • understand the needs of librarians in engaging with faculty

First, what is research data?  It’s recorded factual material commonly accepted in the scientific community as necessary to validate research findings.  Data can be in the form of images, samples, surveys, tapes, raw numbers, algorithms, etc.

What is curation?  Curation is the activity of managing and promoting the use of data, starting from the point of creation, to ensure its fitness for contemporary purposes and availability for discovery and reuse.

Data used to be viewed as a byproduct of research, so it wasn’t preserved — now, data is seen more as an informational asset, part of the historical record.  Data can be reused and repurposed for future research.

A data curation profile (DCP) is basically the story of a data set: how it starts and how it is used.  The purpose of DCPs is to investigate what data researchers have, what they are currently doing with the data, and what they’d like to do with it.  The larger the collection of DCPs, the more conclusions you can draw from analyzing them and finding patterns, which increases understanding of faculty needs.

DCPs are created after interviewing researchers about a specific data set.  The DCP Toolkit has four components:

  1. user guide: describes the rationale and process of DCPs and provides guidance
  2. interviewer’s manual: used in tandem with…
  3. interview worksheet: filled out by the researcher
  4. DCP template: informed by the researcher’s perspective

Uses of the DCP:

  • guide for discussing data issues with researchers
  • give insight into areas that need attention
  • give insight into the differences between data among various disciplines
  • understand faculty research from a production perspective (instead of consumption)
  • help liaison librarians engage with researchers

The DCP Toolkit is an interesting approach to understanding what new services librarians can offer.  The process is hard work, but if you really want to work with data, this is the information you need to know.

 

Posted by on March 24, 2011 in conferences

 


computers in libraries: day three

How Libraries Add Value to Communities
Keynote by Lee Rainie, Director, Pew Internet and American Life Project
You can view the entire speech here.

  • tweckle: to heckle a speaker via Twitter
  • The Pew Internet and American Life Project has researched three revolutions so far: internet/broadband, wireless connectivity, and social networking.
  • Regarding the internet/broadband revolution, libraries add value by covering access and participatory divides.
  • Regarding the wireless revolution, libraries add value by helping the conversation of finding information and providing access to real-time information.  The library as a place becomes the library as a placeless resource.
  • Regarding the social networking revolution, libraries add value by becoming embedded into people’s networks — watch people’s needs and respond/contribute without being asked directly.
  • Libraries can be nodes in social networks by acting as sentries (spreading information by word of mouth), information evaluators, and forums for action.
  • three attention zones: continuous partial attention, deep dives, and info snacking.
  • four media zones: social streams, immersive, creative/participatory, and study/work.
  • Libraries provide greater cosmic value by being teachers of new literacies (especially ethical literacy — how to behave in this new online world) and filling civic information gaps.

Digital Preservation Strategies: Value Through Longevity

  • digital preservation: making sure that born-digital files are accessible over time
  • two strategies for digital preservation: emulation (maintaining the original hardware and software to view a file, or creating software that mimics the original) and migration (transforming files to a stable format)
  • Open source programs may be free upfront, but there are other considerations to take into account.  The developer may be one person or a thousand, with a heavy level of investment to very low investment.  The learning curve can be quite steep, and documentation may not exist for the program.
  • When starting a digital preservation program, be sure to build in time for tool installation (specific for your configuration), troubleshooting (can be difficult without documentation), and Googling for assistance.
  • challenges to implementation: staff time, resources, IT knowledge
  • files at greater risk: older formats, AV files
  • Creating files in open source formats initially makes digital preservation easier because you can already see what the files are made of.

Libraries in the Semantic Web
This session was definitely my favorite of the day (possibly the entire conference).  It’s a complicated topic, but the speakers (Lisa Goddard and Gillian Byrne, Memorial University Libraries) explained it very well.  My notes might not make very much sense, so take a look at their article in D-Lib Magazine.

  • Why do we need a new web?  We forget how stupid existing search engines are — they have high recall, but low precision because they are vocabulary dependent.  Results are a series of individual webpages that are not linked in any way, and the deep web isn’t searchable because it isn’t indexed.  You can’t do very complex queries or comparisons.
  • The semantic solution is made of structured data (RDF), controlled vocabulary, and linking.
  • goal: machine-actionable data
  • Using reasoning — such as equivalent, symmetric, and inverse — machines can create new information from existing information.
  • When linking data, it is not at the page/document level but at the entity level.
  • natural language processing: a machine can take unstructured text to identify people/places/things and disambiguate the terms.
  • RDF publishing tools: Drupal 7 CMS incorporates RDF into its core; Semantic MediaWiki has you enter information into a structured form; Zemanta for blogging
  • DBpedia extracts structured content from Wikipedia and models it as RDF.
  • Some obstacles to implementing the semantic web are competing vocabularies, co-referencing (multiple unique identifiers), finding linked data, and preservation (what happens when a unique identifier disappears?).
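To make the reasoning point in my notes a bit more concrete, here is a minimal sketch of how a machine could derive new statements from existing ones using equivalent, symmetric, and inverse relationships.  This is my own toy illustration in plain Python, not anything the speakers showed, and every name in it (colleagueOf, authorOf, writtenBy, sameAs, Alice, Bob) is made up rather than taken from a real vocabulary.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# All names are hypothetical and exist only for this illustration.
facts = {
    ("Alice", "colleagueOf", "Bob"),
    ("Alice", "authorOf", "Report1"),
    ("Bob", "sameAs", "Robert"),
}

# Assumed rule tables: which predicates are symmetric, which are inverses
# of each other, and which predicate means "these two names are equivalent".
SYMMETRIC = {"colleagueOf", "sameAs"}
INVERSE = {"authorOf": "writtenBy", "writtenBy": "authorOf"}
EQUIVALENT = "sameAs"


def infer(triples):
    """Apply the three rules repeatedly until no new triples appear."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            # Symmetric: if A relates to B, then B relates to A.
            if p in SYMMETRIC:
                new.add((o, p, s))
            # Inverse: "A authorOf B" implies "B writtenBy A".
            if p in INVERSE:
                new.add((o, INVERSE[p], s))
            # Equivalent: whatever is said about s also holds for o.
            if p == EQUIVALENT:
                for s2, p2, o2 in known:
                    if p2 == EQUIVALENT:
                        continue
                    if s2 == s:
                        new.add((o, p2, o2))
                    if o2 == s:
                        new.add((s2, p2, o))
        if new <= known:
            return known
        known |= new


if __name__ == "__main__":
    for triple in sorted(infer(facts) - facts):
        print(triple)
```

Running it prints only the statements that were not in the original set, such as Report1 being written by Alice.  Real semantic web tools express these rules in standard vocabularies and work on RDF data rather than Python tuples.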

Repositories: Strategies and Practices

  • Lessons learned by Goddard Library: talk to the end user early and often (we think like librarians, they don’t); publisher metadata is problematic; human quality control is necessary (automation can only go so far); there’s always something new — be aware and adaptable.
  • Institutional repositories (IR) must be cohesive and involve everyone in the library — IR can’t just be a project worked on by a few people.
  • Answer: Who is the IR for — who is the community?  What materials will be collected?
  • IR is a chronically understaffed program in libraries.  Digitization and dealing with copyright issues are hard work that requires staff.

Preserving User-Generated Content

  • What will scholars of the future want access to?  Are we providing for that?
  • Streaming media can’t be archived (if the video/audio isn’t recorded).
  • Widgets, such as embedded Google maps, can’t be replicated in an archive yet.  Don’t even think about apps…
  • The value of preserving user-generated content lies in documenting change over time — the historical view.  Specifically, the archive captures both cultural and technological changes.
  • Regarding the Twitter archive, it is very difficult to index it all.
  • Web Archiving at the Library of Congress

Transliteracies: Libraries as the Critical “Classroom”

  • Transliteracy: the ability to read, write, and interact across a range of platforms, tools, and media.
  • For kids, content is more important than the container.
  • Transliteracy is an umbrella that unifies the different types of literacy.  We must be proactive, not reactive, in preparing for the future.
  • Start with training your staff about different technologies.
  • Libraries must be a place of creation, not just consumption.
  • Create a structured plan, and ease your way into it.
  • Learn multiple formats to understand your patrons’ needs.
  • There is more than one divide — it’s more of a multi-level caste divide.
  • It will not be easy, but you have to do it — not just for your job but to help your friends and family too.
  • Learn more by visiting the blog Libraries and Transliteracy.
 

Posted by on March 23, 2011 in conferences

 


computers in libraries 2011: day two

3 Keys to Engaging Digital Natives

Keynote by Michelle Manafy, Director of Content, Free Pint Limited
You can view the entire speech here.

  1. public opinion, not private lives
  2. knowledge sharing, not knowledge hoarding
  3. interactions, not transactions
  • Use options such as Social Sign-on to blur the line between your environment and their environment.
  • For a good example of knowledge sharing, check out Digitalkoot from the National Library of Finland.  It’s a very cool cultural heritage project — users play games that help improve the indexing accuracy of the library’s archives.
  • Listen, respond, and react: digital natives are engaged by genuine communication.
  • We have to think more like the digital native in order to survive.

Learning from Inspirational Libraries

This session was mostly visual, with photos of libraries from all over the world.  I didn’t take very good notes, but here are a couple:

  • The Yonsei Samsung Library in Korea embraces Web 2.0 in how their building uses space — interactive with digital maps and digital signage.
  • The library in Medellin, Colombia has book pick-up and drop-off locations at subway stations.  They also built new library facilities in the “worst areas” of town, which ended up transforming the neighborhoods.

Metasocial: Making Online & Mobile Interactions Rock
Speakers: David Lee King, Sarah Houghton-Jan, and Nate Hill

  • Libraries should friend and follow people in their local area — focus on your customers first, not on those in the librarian community who don’t live in your area and won’t be using your library.
  • Treat your foursquare mayor!
  • Augmented Reality (AR) combines a mobile device’s GPS, camera, and accelerometer to provide a digital view of the physical world connected to digital objects (or a digital world).
  • For examples of augmented reality layers, check out Layar.
  • The San Jose Public Library created the San Jose Now mobile app using Drupal 7 CMS, HTML5/CSS3, Google Maps API v3, and the jQuery Mobile framework.

New Alignments, Structures, and Services

  • Don’t have a service model where your clients or patrons only see you when they want or need something.
  • Increase your visibility within your organization by attending other meetings and adopting the same accountability measures as other projects.
  • Adopting a pilot mentality lets you compare different products or services, allowing one to rise above the others by seeing how they would actually be implemented (instead of making a decision based on paper or theory).
  • The daily “scrum” meeting: your team meets every morning for 10-15 minutes to share what everyone worked on yesterday and will work on today — provides accountability for project goals.
  • Using a variety of commercial sources is difficult because they don’t all work well together due to restrictions.
  • Moving to a model of open source programs gives you much more control (you can get exactly what you want), and data can be more easily used in multiple contexts.

Integrating iPads into Learning and Libraries

  • The Ryerson University Library and Archives in Toronto started a research project to see how students can use iPads for school.  Four students were selected from their library Student Advisory Committee, and in order to keep their iPad at the end of the research project, the students must participate in meetings and blog weekly.  The research project is not over yet, so more analysis is to come.
  • Because of the number and variety of available iPad apps, students from different majors and fields were able to find ones to fit their studies.  For example, the myPANTONE app lets one student color match items in both digital and print forms.
  • If considering conducting a similar study, be sure to get ethics approval early.  Consider other tablets (now that there are more on the market) and providing participants with peripherals such as a keyboard or stylus.  Decrease blogging frequency to once every other week to allow students more time to reflect.
  • Is the iPad useful on a temporary basis?  For information and book consumption, yes.  Loaning iPads also introduces people to new technologies, letting them test one out to decide if they want to get one or not.  If producing work or using it as a personal device (using apps for email, calendar, etc.), the iPad is not ideal for a temporary loan.
  • The iPad is a hybrid device, used for both consumption and production.

Getting to the Eureka Moment
My notes for this session also aren’t very thorough — for good reason, though.  The speaker (Julian Aiken, Access Services Librarian at the Yale Law Library) was so funny, it was hard to take notes.  If you don’t believe me, check Twitter.

  • Every librarian should have the opportunity to experiment with a “rummy” (odd) but potentially brilliant idea.
  • Incorporate Google’s 80/20 innovation model in your library as a staff reward.
  • Allowing a scheduled time to work on another project can greatly benefit the staff member’s professional development in addition to the library.
 

Posted by on March 22, 2011 in conferences

 


computers in libraries 2011: day one

I’ll be honest.  This morning I wasn’t overly excited about day one of the conference compared to the rest of the week, but I did enjoy the searching sessions I attended.  Overall, the sessions scheduled for Monday just didn’t seem as interesting to me.  Maybe my expectations for Wednesday’s Content Management and Preservation Track and Thursday’s Data Curation workshop are too high.  We’ll see.

Today I only attended three sessions, and none of the keynotes.  I planned for a partial day at the conference since I would be working a full shift at work tonight (I didn’t want to ask someone else to cover my closing shift).  Below are some of the highlights from the sessions I attended.

Super Searcher Strategies and Tips

  • Google has a word proximity search operator: AROUND(#), where # is the maximum number of words allowed between the two search terms.
  • Google Books’ data mining lab: http://ngrams.googlelabs.com
  • Yahoo Clues shows search trends and demographics.
  • DuckDuckGo.com doesn’t track your searches, so past searches won’t influence future ones.
  • Blekko.com blocks spam and content farms.  You can also use specialized slash tags, such as /relevance and /date to organize results.

Building Community with Faculty and Suppliers

  • When creating a digital repository, start small with a focus but also dream big for the future.
  • Be ready to do it all yourself.
  • Plan ahead, but stay flexible.
  • After implementing the repository, constantly assess what people are looking at and downloading.
  • You need to have both anecdotes and data to demonstrate the repository’s worth.

Search: Quick Tips for Adding Value

  • In Google, use an asterisk as a wildcard that stands in for a whole word (e.g. john * kennedy), but be careful: each asterisk requires a word to be present.
  • Microsoft Academic Search: Google Scholar on steroids?
  • Topsy.com archives tweets and allows you to search images and videos linked in tweets.
  • IssueMap.org and Many Eyes allow you to create data visualizations without knowing code.
  • Google has a command for searching only specific file types: filetype:[extension] (e.g. [search terms] filetype:ppt)
  • Google also has an operator for searching within number ranges: .. (e.g. 1984..1999)
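If you want to reuse these operators without retyping them, the tips above can be sketched as a few small Python helpers that build the query text.  This is just my own illustration: the function names are made up, and the only things taken from the session are the operators themselves (the asterisk wildcard, filetype:, and the .. number range).

```python
from urllib.parse import urlencode


def wildcard_phrase(*words):
    """Join words into a query, using * wherever a word is given as None."""
    return " ".join("*" if word is None else word for word in words)


def with_filetype(terms, extension):
    """Restrict results to one file type, e.g. ppt or pdf."""
    return f"{terms} filetype:{extension}"


def with_number_range(terms, low, high):
    """Restrict results to pages mentioning a number between low and high."""
    return f"{terms} {low}..{high}"


def search_url(query):
    """Turn the query text into a Google search URL (URL-encoded)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(wildcard_phrase("john", None, "kennedy"))
    print(with_filetype("data curation", "ppt"))
    print(with_number_range("olympics", 1984, 1999))
    print(search_url(with_filetype("digital preservation", "pdf")))
```

The helpers only assemble strings, so they can be combined (for example, a wildcard phrase passed into with_filetype) before handing the result to search_url.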

If you’re interested in more details about the searching sessions, Librarian in Black (who moderated the sessions) live-blogged each one.

 

Posted by on March 21, 2011 in conferences

 


hyping links

Why Some E-Books Cost More than the Hardcover
Finally, an explanation that really breaks down the costs of selling physical books vs. ebooks that I can understand.

Information Is Beautiful on the books everyone must read
Here’s a cloud visualization aggregating popular lists and polls that name the must-read books.

SXSW 2011: The Year of the Librarian
This post from The Atlantic shares how awesome librarians are.

LibConf.com
I’ll be attending Computers in Libraries 2011 next week and intend to blog about my experience.  If you’re not attending in person, you can still watch all three keynote speeches that will be streamed live.

 

Posted by on March 18, 2011 in conferences, ebooks, links

 
