This room is pretty empty; maybe people will come in a bit late. Seriously, maybe 25 people right now, at 3:48. Danny jokes about the number of people in the audience. He introduces the book search deal.
Dan Rose from Amazon was up first. He is talking about the "Search Inside the Book" program. It lets searches look inside the book, instead of just the title, metadata or description. They launched it about two years ago and work closely with publishers (they do not scan books; you must submit the content electronically). Amazon and publishers are cool with this because it increases book sales. He shows a screen capture of a search at Amazon on "energy swaps" and how it uses the text inside the book to find this result; "excerpt on page 35" is where the phrase is found in context. He shows how you can flip through pages. Access is limited to authorized Amazon customers. Viewing is limited to 2 pages forward and back, not more than that. Search Inside also allows them to innovate new features like: first sentence of the book, SIPs (phrases found in this book at a higher rate than in other books) and CAPs (phrases in caps). Then other features like books on related topics, concordance (visual demo of the 100 most used words in this book) and text stats (interesting text stats on the book). Pretty cool stuff. On Nov 3rd they announced Amazon Upgrade, which enables those who buy the book to buy an upgrade to search inside the book with unlimited access. They also announced the ability to purchase individual pages of the book.
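For the curious, here is a rough sketch of how the concordance and SIP features described above could work in principle. This is not Amazon's implementation, which was not shown in the session. The function names, the two-word phrase length and the add-one smoothing are all my own assumptions for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(text, top_n=100):
    """Return the top_n most frequent words in the text with their counts."""
    return Counter(tokenize(text)).most_common(top_n)


def bigrams(tokens):
    """Return all adjacent two-word phrases (an assumed phrase length)."""
    return list(zip(tokens, tokens[1:]))


def improbable_phrases(book_text, corpus_texts, top_n=5):
    """Rank two-word phrases by how much more often they occur in the book
    than in a comparison corpus of other books (hypothetical scoring)."""
    book_counts = Counter(bigrams(tokenize(book_text)))
    book_total = sum(book_counts.values())

    corpus_counts = Counter()
    corpus_total = 0
    for other in corpus_texts:
        other_bigrams = bigrams(tokenize(other))
        corpus_counts.update(other_bigrams)
        corpus_total += len(other_bigrams)

    scores = {}
    for phrase, count in book_counts.items():
        book_rate = count / book_total
        # Add-one smoothing so phrases unseen in the corpus do not divide by zero.
        corpus_rate = (corpus_counts[phrase] + 1) / (corpus_total + 1)
        scores[phrase] = book_rate / corpus_rate
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Calling `concordance(book_text)` gives the 100 most used words, and `improbable_phrases(book_text, other_books)` gives the phrases that stand out in this book relative to the others. A real system would presumably strip common stop words and compare against a much larger corpus.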
Tom Turvey from Google. He goes over Google's mission of making the world's info accessible, yada yada. The online channel is increasing its presence in the book market: 13% of all books sold are sold online. The book market itself publishes more books each year; 195,000 books were published in 2004. Google already drives traffic to online book sites. Google is the global leader among search engine referrals for book sales (~60%). Google Books Partner Program: thousands of publishers are already on board. It's not just about the most popular searches; it's about everything. The partner program philosophy is to enable consumers to search, discover and then buy, and then to provide publishers with reports. Google wants to make every page searchable. Google users can search just books at books.google.com, or search at Google.com. They are looking to add "buy this book" links and add the partner's brand on each page. They also allow publishers to place ads at the bottom of the page. When a user views one of your book's scanned pages, Google reads that page and adds text ads for related products and services. Reports in the Partner Program cover total page views, total clicks on "buy this book", CTR, total clicks on ads, ad CTR and net ad revenue. Content is protected by the same high level of security as Google's own data, and Google monitors for large scale attacks. They also have page level security, restricted pages and login with a Google account ID. Google Print: publishers provide a PDF or physical copy of a book to Google, Google digitizes it, a user searches Google, the user is linked to the book and the user can buy it. Library Project: only about 20% of published books are in print and more than 80% are now out of print; the goal is to make them visible. In a typical library collection, 60% or more of the books have unclear copyright status, but 20% are in the public domain and Google has those 20%.
Three user experiences: there is a partial view (up to 20% of the book), there is a snippet view (for books under copyright) and a Full Book view (no restrictions). He goes into more detail on the snippet view (the controversial one), which only shows a snippet of the content.
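As a side note on the partner reports mentioned above, the two CTR figures are simple ratios of clicks to page views. Here is a small sketch of that arithmetic; the function names, field names and sample numbers are made up for illustration and are not from Google's actual reports.

```python
def click_through_rate(clicks, page_views):
    """Clicks divided by page views; returns 0.0 when there were no views."""
    if page_views == 0:
        return 0.0
    return clicks / page_views


def partner_report(page_views, buy_clicks, ad_clicks, net_ad_revenue):
    """Assemble the metrics listed in the talk into one summary
    (a hypothetical layout, not Google's report format)."""
    return {
        "page_views": page_views,
        "buy_clicks": buy_clicks,
        "buy_ctr": click_through_rate(buy_clicks, page_views),
        "ad_clicks": ad_clicks,
        "ad_ctr": click_through_rate(ad_clicks, page_views),
        "net_ad_revenue": net_ad_revenue,
    }


# Made-up numbers purely to show the calculation.
report = partner_report(page_views=1000, buy_clicks=50, ad_clicks=20, net_ad_revenue=12.5)
```

With these made-up inputs, 50 "buy this book" clicks on 1,000 page views works out to a 5% CTR, and 20 ad clicks to a 2% ad CTR.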
Sumir Meghani from Yahoo is here to discuss the Open Content Alliance. The aim is to enable people to find, use, share and expand all human knowledge, by expanding the amount of content available online and making it accessible in an open manner. The OCA represents the collaborative efforts of a group of cultural, tech, nonprofit and govt organizations... from around the world... that will help build a permanent archive of multilingual digitized text and multimedia content. Founding contributors were 9 in total (Adobe, HP, Internet Archive, O'Reilly, the Universities of California and Toronto, etc.). OCA policies: content is only made available on an opt-in basis; accessibility will be huge, and anyone can index it and search it (metadata accessible via OAI and RSS); third parties are encouraged to build services on top of content in the collection; existing digitized collections can be included in the repository with permission of the content providers.
Thiru Anandanpillai from MSN Search, without a PPT. MSN is doing book search because the vast majority of very complex queries go unanswered, and it takes up to 7 to 11 minutes to get the answer. There is also an element of trust; people want trusted sources. They want to make it easier for people to find answers. MSN's approach: they figured it was important to work with the publishers, who know what people want in books, and it's also easier to work together with others to digitize the content. So they joined the OCA (Open Content Alliance). They will ensure that end users can interact with the book as much as possible, but they won't say when. And from a publisher perspective, they will ensure there are monetization methods. Publishers care a lot about control.
Tony Sanfilippo from Penn State University Press. He discusses all the papers published by universities. He goes over the long tail when it comes to people who buy books: 50% of books sold fall outside the top 150 or so titles. His main market is libraries and then students (libraries want hard cover books). His director asked how people are going to find these books. That is when Google Print came by. Of their 1,000+ titles, they signed up 972 publications with Google. They have had great success with this: 300,000 accesses to their site, which amazes them. He discusses NetLibrary, which brings in $600,000 per year. ebrary isn't great because it requires a proprietary reader. They have 130 titles in a program named Questia and get about $2,000 per year from that. They are looking into joining the OCA, but they have some internal politics. They have issues with Amazon's license agreement; specifically, Amazon retains the right to decide which content comes down if there is an issue (that is a deal breaker). Google offered a 45 hour takedown on content. They wanted to make the content more accessible.
Where is Gary Price? We miss you in this session. Chatting with Gary now; here are some live virtual gems from our link to the library world. Some additional resources from Gary:
- Public Domain Books: More than 25,000 Full Text Books in a Single Database
- More Online Books Resources
- Book Search Without the Fuss
Danny just beat up on bloggers, kidding... And then embarrassed me... kidding again.
Update: Also check out Lee's coverage on Advanced Search term Research Tools.