Monday, November 10, 2014

Week Eleven:

REQUIRED READING NOTES

Dewey Meets Turing: Librarians, Computer Scientists, and the Digital Libraries Initiative
by Andreas Paepcke, Hector Garcia-Molina, and Rebecca Wesley
http://www.dlib.org/dlib/july05/paepcke/07paepcke.html
  • The Digital Libraries Initiative (DLI) was launched in 1994 by the National Science Foundation.
    • Combining the words 'digital' and 'library' defined three interested parties: librarians, computer scientists, and publishers.
  • The impact reached past these three groups: Google and its search engine emerged from DLI-funded work and changed working styles across professions that involve computers.
  • The initiative has reached the work of historians and anthropologists, as well as the political science and law professions.
  • Librarians and computer scientists were brought together - but where does this leave the publisher?
  • For librarians, the Web was more difficult to integrate.
    • Disruption to the library community occurred with journal publishers charging a premium for digital content.
  • Digital library projects in the computer science realm relieved the tension between conducting 'pure' research and having an impact on day-to-day society.
    • scientists had been trained to use libraries, so this provided an exciting new framework.
  • Librarians who immersed themselves in the initiative understood that information technologies were important to ensuring libraries' impact on scholarly work.
    • At the time, OPACs (online public access catalogs) constituted the entirety of digital facilities in libraries.
  • The growth of the Web changed many significant plans for the Digital Libraries Initiative, propelling computer scientists and libraries in new directions.
    • blurred the distinction between consumers and producers of information
    • dispersed across the world, and under diverse ownership, most items that in the aggregate should have formed collections
  • Early results in the project demonstrated a nagging downside of the existing DLI research environment. 
    • The environment was bound to special deals with publishers, and communities were not able to share results.
    • These restrictions were serious because computer scientists traditionally make their systems functionally public.
  • DLI researchers with results that were bound by per-project agreements with publishers realized they could only share a small teaser with colleagues. 
    • Working online removed these specific restrictions.
  • Computer scientists' embrace of the Internet was also natural because the information link is a much-employed concept in computer programming.
  • The accomplishments of the digital initiative have broadened opportunities for library science rather than marginalizing the field.
  • Hubs - collections of web sites whose primary goal is to direct visitors to other web sites that specialize in the hub's topics.
    • The computer was teaching and producing relevance to topics.

Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age
by Clifford A. Lynch
http://www.arl.org/storage/documents/publications/arl-br-226.pdf
  • The fall of 2002 introduced the institutional repository, a new strategy that allows universities to apply serious, systematic leverage to accelerate changes taking place in scholarship and scholarly communication.
  • Many technology trends came together to make the IR possible - online storage costs dropped, making repositories more affordable.
  • The development of free, publicly accessible journal article collections in some disciplines has demonstrated ways in which the network can change scholarly communication by altering access patterns.
    • Separately, the development of a series of extraordinary digital works suggests the potential of creative authorship specifically for a digital medium to transform the presentation of scholarship.
  • Lynch defines institutional repositories in the academic setting as a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members.
    • Essentially, it is an organizational commitment to the stewardship of digital materials, including long-term preservation where appropriate.
  • At a basic and fundamental level, the IR is a recognition that the intellectual life and scholarship of our institutions will be increasingly represented, documented, and shared in digital form, and that a primary responsibility of our universities is to exercise stewardship over these riches: to make them available and to preserve them.
  • Scholarship and scholarly communication are changing, with the changes extending slowly through cultural shifts at the disciplinary level.
    • higher education has overlooked an opportunity to support our most innovative and creative faculty.
  • Future developments in IRs are covered, with suggestions that there are unexplored and interesting extensions that could be put to use by the public.
  • Lynch has argued that IR is a powerful idea that serves as an engine for change in higher education.

Web Search Engines: Part 1 and 2
by David Hawking
http://web.mst.edu/~ercal/253/Papers/WebSearchEngines-1.pdf
  • Google, Yahoo!, and Microsoft are indexing almost a thousand times as much data as search engines did a decade earlier, providing reliable, subsecond responses to around a billion queries a day in a plethora of languages.
  • Part 1 gives a behind-the-scenes look at search engine infrastructure; Part 2 details the algorithms and data structures required to index 400 terabytes of Web page text and deliver high-quality results in response to hundreds of millions of queries each day.
  • "Crawlers" are the algorithms that traverse the Web to fetch pages for indexing.
  • Limited space prevents discussion of many aspects of search engine operation.
    • A high priority for search engine operations is monitoring search quality to ensure it does not decrease when a new index is rolled out.
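
The indexing and query answering described in Part 2 rest on an inverted index, which maps each term to the documents that contain it. The following is a minimal sketch of that idea, not Hawking's implementation: the sample pages, the `build_index` and `search` names, and the whitespace tokenizer are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample "pages"; a real engine would index crawled Web text.
PAGES = {
    "page1": "digital libraries and web search",
    "page2": "web crawlers fetch pages for the index",
    "page3": "search engines index web pages",
}


def build_index(pages):
    """Map each lowercase term to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index


def search(index, query):
    """Return the sorted page ids that contain every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


INDEX = build_index(PAGES)
```

Calling `search(INDEX, "web index")` looks up each term's posting set and intersects them, so only pages containing both words are returned. Production engines add ranking, compression, and distribution across many machines, none of which this sketch attempts.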

Current developments and future trends for the OAI protocol for metadata harvesting
by Sarah L. Shreeves, Thomas G. Habing, Kat Hagedorn, and Jeffrey A. Young
http://www.lib.umich.edu/files/services/dlps/Shreevesetal_5
  • The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) has been widely adopted since its initial release in 2001.
    • Developed to federate access to diverse e-print archives through metadata harvesting, the protocol has demonstrated its potential usefulness to a broad range of communities.
  • The article gives an overview of the OAI environment.
    • Lit review - the OAI's mission is to "develop and promote interoperability standards that aim to facilitate the efficient dissemination of content."
  • There are two focal points: community- and domain-specific OAI services.
  • Ongoing challenges exist for the communities:
    • Metadata variation - normalizing an element such as subject, for which the different data providers use many different controlled vocabularies, is resource intensive for most service providers.
    • Metadata formats - supporting the many new formats means adding additional paths to data processing routines.
    • OAI Data Provider implementation practices
    • Communication issues - the community is a loosely federated group, with gaps in general or technical skills among its members.
  • The OAI community does have a future if guidelines are developed to deal with the issues above.
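
To make the harvesting and metadata-variation points above concrete, here is a minimal sketch of what a service provider might do: build a `ListRecords` request and pull the `dc:subject` values out of a response. The `verb`, `metadataPrefix`, and `resumptionToken` parameters and the Dublin Core namespace come from the OAI-PMH and Dublin Core specifications; the base URL, the sample response, and the function names are hypothetical, and lowercasing is only a crude stand-in for real vocabulary normalization.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Dublin Core element namespace used inside oai_dc records.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Hypothetical, hand-written response fragment for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:subject>Digital Libraries</dc:subject>
          <dc:subject>  digital   libraries </dc:subject>
          <dc:subject>Metadata</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a data provider's base URL.

    A resumptionToken is an exclusive argument in OAI-PMH, so when one is
    given it replaces metadataPrefix rather than accompanying it.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def normalized_subjects(xml_text):
    """Collect dc:subject values, collapsing whitespace and case differences."""
    root = ET.fromstring(xml_text)
    subjects = set()
    for element in root.iterfind(".//dc:subject", NS):
        if element.text and element.text.strip():
            subjects.add(" ".join(element.text.split()).lower())
    return sorted(subjects)
```

Even this toy response shows the problem the authors describe: two of the three subject strings are the same heading written differently, and reconciling headings drawn from different controlled vocabularies takes far more than case folding.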

Thoughts

These articles are decent and interesting to a point, but it feels like there is a lot to know and not enough written. It is a vast topic with many points, but these readings cover the digital library quite well. I actually referenced the Lynch article for a paper that I wrote on Digital Humanities this semester. It's all pretty interesting to see how library science adapts to e-science these days.
