Friday, November 26, 2010

Muddiest Point 11/22/10

In class we discussed Z39.50 search. I also remember seeing this in our Koha project assignment. The slides stated that, in general, these types of searches are not used because they are difficult to implement, prone to semantic problems, and subject to network and server problems. So then who really uses it? And why would they use it over another type of search, especially since it does not retrieve full documents or other objects?
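
From what I can tell, the main users are libraries themselves: Z39.50 is how one catalog queries another to copy bibliographic records, which is presumably why it showed up in our Koha assignment. Out of curiosity, here is a rough sketch of what a Z39.50 title search might look like in Python. It assumes the old PyZ3950 package and the Library of Congress's public server at z3950.loc.gov:7090; both of those details are my own assumptions, not something from the slides.

    # Sketch only: assumes the PyZ3950 package and that the Library of
    # Congress Z39.50 server (z3950.loc.gov:7090, database VOYAGER) is reachable.
    from PyZ3950 import zoom

    conn = zoom.Connection('z3950.loc.gov', 7090)
    conn.databaseName = 'VOYAGER'
    conn.preferredRecordSyntax = 'USMARC'   # ask for MARC records back

    query = zoom.Query('CCL', 'ti="harry potter"')  # CCL title search
    results = conn.search(query)
    for i in range(min(5, len(results))):
        print(results[i])   # each hit is a bibliographic record, not a document
    conn.close()

Notice that what comes back is only bibliographic records, which I suppose answers my own question about why it never became a general-purpose search.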

Saturday, November 20, 2010

Comments for 11/22/10

http://bds46.blogspot.com/2010/11/reading-notes-week-11-reposted.html?showComment=1290312763178#c8319111753134264728


http://maj66.blogspot.com/2010/11/week-11-readings.html?showComment=1290313663883#c6596095573235399331

Reading Notes for 11/22/10

1.) I really liked the David Hawking articles about the basics of web crawlers: how they function, the problems they have, and why they are such great tools. They must leave out "low-value automated content," since that is not necessary and makes the search harder. They must be able to handle multiple languages, misspellings, and even made-up words (e.g., Google, Yahoo). They must work with a vast amount of information, avoid tricky spamming sites, and do it all within a few milliseconds. It's quite amazing when you think about it like that!
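
To make the basics concrete for myself, I sketched a toy crawler in Python: a frontier queue, a set of already-seen URLs, and a politeness check against each site's robots.txt. This is a minimal sketch using only the standard library; none of it comes from the Hawking articles themselves.

    from collections import deque
    from html.parser import HTMLParser
    from urllib import robotparser
    from urllib.parse import urljoin, urlparse
    from urllib.request import urlopen

    class LinkParser(HTMLParser):
        """Collects every href found in <a> tags."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(seed, max_pages=10):
        frontier = deque([seed])   # URLs waiting to be fetched
        seen = {seed}              # URLs already queued, to avoid loops
        robots = {}                # one robots.txt parser per host
        fetched = []
        while frontier and len(fetched) < max_pages:
            url = frontier.popleft()
            parts = urlparse(url)
            base = parts.scheme + "://" + parts.netloc
            if base not in robots:
                rp = robotparser.RobotFileParser(base + "/robots.txt")
                try:
                    rp.read()
                except OSError:
                    pass  # unread parser stays conservative: can_fetch() is False
                robots[base] = rp
            if not robots[base].can_fetch("*", url):
                continue  # politeness: skip pages robots.txt disallows
            try:
                with urlopen(url, timeout=10) as resp:
                    page = resp.read().decode("utf-8", errors="replace")
            except OSError:
                continue  # dead link or network error; move on
            fetched.append(url)
            parser = LinkParser()
            parser.feed(page)
            for link in parser.links:
                absolute = urljoin(url, link.split("#")[0])
                if absolute.startswith("http") and absolute not in seen:
                    seen.add(absolute)
                    frontier.append(absolute)
        return fetched

    print(crawl("https://example.com/"))

Even this toy version hints at why the real thing is hard: it fetches one page at a time, does nothing about spam traps or duplicate content, and would never get through billions of pages in milliseconds.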


2.) The Shreeves et al. article had an example that I thought was intriguing. The Sheet Music Consortium is trying to digitize sheet music, and initially they had some trouble figuring out how to digitize the various components of their information, such as the "cover art, the sheet music itself, the lyrics, etc." and also, I assume, all the other small additions written onto the music, such as markings for piano, forte, or staccato. In class we sometimes discuss the concept of using other languages for digitizing data, but music was not something I had thought of before. Essentially, music is like another language.
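
Since the Shreeves article is about harvesting metadata with OAI-PMH, here is a minimal sketch of what pulling Dublin Core records from a repository looks like in Python. The base URL below is a made-up placeholder (the consortium's real endpoint isn't given in the article), but the verb and metadataPrefix parameters are standard OAI-PMH.

    import urllib.parse
    import urllib.request
    import xml.etree.ElementTree as ET

    BASE = "http://example.org/oai"  # placeholder endpoint, not a real repository

    # ListRecords with the oai_dc prefix asks for simple Dublin Core metadata.
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    url = BASE + "?" + urllib.parse.urlencode(params)

    with urllib.request.urlopen(url, timeout=30) as resp:
        tree = ET.parse(resp)

    ns = {
        "oai": "http://www.openarchives.org/OAI/2.0/",
        "dc": "http://purl.org/dc/elements/1.1/",
    }
    for record in tree.findall(".//oai:record", ns):
        print(record.findtext(".//dc:title", default="(no title)", namespaces=ns))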


3.) The Bergman article was great because it explained the true depth of the deep web and the difficulty crawlers have in finding this information. Even though crawlers do a great job with all that they are responsible for, there is still a massive amount of information left out of the equation. The article states that "Internet searchers are therefore searching only 0.03% — or one in 3,000 — of the pages available to them today." That is such a tiny fraction of the information we could be accessing! I find that number almost unbelievable. The article also points out that crawlers can't even look inside firewalled or "Intranet" sites within institutions, so there is even more information there that we cannot access.

Saturday, November 13, 2010

Comments for 11/15

http://bds46.blogspot.com/2010/11/reading-notes-week-10_09.html?showComment=1289692064230#c41115158330661988


http://maj66.blogspot.com/2010/11/week-10-readings.html?showComment=1289692536059#c1843605878669020642

Reading Notes for 11/15

In the Mischo article it was interesting to read about the university projects funded by DLI-1 (the first phase of the Digital Libraries Initiative) for networking and computing technologies. It was good to see Carnegie Mellon listed there, as I'm sure these grants went only to the most qualified and capable groups. It said that CMU had a grant for the "study of integrated speech, image, video, and language understanding software under its Informedia system." I wonder if that is for speech recognition technology? Or if it was something specific for individuals who are deaf, hearing impaired, or blind?


The Paepcke et al. article is interesting because it talks about "the binary union between academic librarians and computer scientists." I feel that in many ways that's what Information Science is about: a fusion of computer science and library science. I don't think I really understood, until I started my MLIS, the interconnected weave that technology and information have with each other. I already feel very knowledgeable about many computer-based concepts. Granted, I have a long, long way to go until I can really feel proficient. But I still feel I understand the concept of this union.


The article by Clifford Lynch was a very heavy read, but as I was reading I started wondering about the University of Pittsburgh's institutional repository. I wondered whether we have one, what it is called, and who uses it. At first I thought it might be something like Blackboard, but the article describes a repository more as a place where individuals can access anything by a professor or a graduate student, and on Blackboard you can only access the classes that you take or teach. Therefore, I don't think Blackboard fits the description. So I guess I would like to know more about the University of Pittsburgh and institutional repositories.

Saturday, November 6, 2010

Comments for 11/8/10

http://bds46.blogspot.com/2010/10/reading-notes-week-9.html?showComment=1289098437807#c7534108398756973370


http://archivist-amy-in-training.blogspot.com/2010/11/week-9-xml.html?showComment=1289098943136#c7587698470070509100

Reading Notes for 11/8/10

The Martin Bryan article was a good introductory article. I had trouble accessing it, but I read on some of my classmates' blogs where to find a working link. The article gave a good background on the XML framework and organization. I think this was one of the best articles for this week.

The Uche Ogbuji article, I think, was very complex and went a bit over my head. I did notice it had some great links to other XML tutorials, many of them quite basic. I followed a few and found them to be very good references.

I liked the first few pages of the Andre Bergholz article, especially the examples he used on the side of the page. Of Figures 1a and 1b, for HTML and XML, he states, "The HTML description is layout oriented, while the XML description is structure oriented." From the readings, this seems to be one of the major differences between the two, and also why people may prefer XML: it can offer more because of its structured nature.
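
Here is a tiny example of that difference, sketched in Python with made-up element names (nothing below comes from the article itself). The HTML version only says how the text should look; the XML version names each piece of data, so a program can pull fields out directly.

    import xml.etree.ElementTree as ET

    # Layout oriented: bold and italics, but no hint of what the text means.
    html_version = "<p><b>Harry Potter</b><br><i>J. K. Rowling</i></p>"

    # Structure oriented: each value is labeled with an element name I made up.
    xml_version = """
    <book>
      <title>Harry Potter</title>
      <author>J. K. Rowling</author>
    </book>
    """

    book = ET.fromstring(xml_version)
    print(book.findtext("title"))   # Harry Potter
    print(book.findtext("author"))  # J. K. Rowling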

The XML Schema Tutorial was also a very good read. I like the format of the w3schools.com content; we also read from this site last week, and I find it rather clear and concise. Reading about schemas and how much better they are than DTDs reminded me of last week's reading about how HTML has CSS sheets to help expand its abilities. It also brings to mind the fact that technology is constantly changing and that in a few years something may have superseded XML Schema.
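
One concrete advantage the tutorial highlights is that schemas support data types, while DTDs treat almost everything as plain text. Here is a minimal sketch of that idea, using the third-party lxml package (my choice, not something the tutorial uses):

    from lxml import etree   # third-party; pip install lxml

    # An XSD can say an element must hold an integer; a DTD could only say #PCDATA.
    schema_doc = etree.XML("""
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="year" type="xs:integer"/>
    </xs:schema>
    """)
    schema = etree.XMLSchema(schema_doc)

    print(schema.validate(etree.XML("<year>2010</year>")))    # True
    print(schema.validate(etree.XML("<year>twenty</year>")))  # False: not an integer

A DTD would have accepted "twenty" without complaint, which is exactly the kind of gap schemas close.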

Thursday, November 4, 2010

Assignment 5- Koha Bookshelf

http://upitt01-staff.kwc.kohalibrary.com/cgi-bin/koha/virtualshelves/shelves.pl?viewshelf=96

The title is: Priya's Harry Potter List

(It is on PAGE FOUR; for some reason it does not show up in the catalog when I type the name, only when I look on page four. And it is a public account. Thanks!)

My username: PRS38