
Seven Steps to Creating the One Millionth E-Book

1. Choosing the right book
The one millionth e-book at the HKU Libraries needed to be a unique title, not available elsewhere and free of copyright restrictions. These were the major factors considered by the team of library staff led by Dr Ferguson, the Librarian. After a careful search of the Libraries’ rare book collection, one title emerged: the 1798 edition of George Staunton’s An authentic account of an embassy from the King of Great Britain to the Emperor of China. Besides its content, which recounts an event of historical importance, the book contains fine illustrations that readers will admire.

2. Pre-production preparations
Drawing on the digitisation experience of the Libraries’ technical team, an overall plan was laid out for the production of this unique e-book. The team identified two special issues that needed to be addressed. Firstly, many of the illustrations are over-sized and therefore had to be sent to a professional scanning house for digitisation. The rest of the book, which is A4 in size, was handled by the in-house team. The second issue related to preservation. Staff needed to exercise extreme care in handling this five-volume edition. Prior to digitisation, a page-by-page inspection was made to ensure that there were no missing pages and that each page was in good condition. Special paper boxes were tailor-made for each of the five volumes to ensure their safe journey between the library and the scanning house. For preservation reasons, the scanners employed for this digitisation project were equipped with special book care features: an SMA 21 ScanFox colour scanner for the A4 text pages and a Zeutschel Omniscan OS10000 A1 colour scanner for the over-sized illustrations.

3. Image capturing
The production process began once the five volumes had been inspected and the scanning house appointed for the job. Two sets of images were produced. The master files maintain a high resolution of 600 dpi and a 24-bit colour depth, achieving "true colour" quality. However, because images at such a high resolution produce very large files, another set of images was produced for web release. This set has a lower resolution of 75 dpi, which facilitates online viewing. Stringent quality control was exercised throughout the image capturing process. Library staff checked every digital image, looking for images that were out of focus, images that failed to capture the entire page, and any missing pages. These pages were then arranged for rescanning.
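To give a sense of why a separate web set was needed, the following minimal sketch estimates the raw, uncompressed pixel data for one page at each resolution. The A4 dimensions in inches and the assumption of no compression are illustrative only; the actual file formats and sizes used in the project are not stated here.

```python
# Rough uncompressed size of one scanned page.
# Assumptions (not from the project): an A4 page of 8.27 x 11.69 inches
# and raw pixel data with no compression.
A4_INCHES = (8.27, 11.69)  # width, height in inches


def uncompressed_bytes(dpi, bit_depth=24, page_inches=A4_INCHES):
    """Return the raw pixel-data size in bytes for one page at the given dpi."""
    width_px = round(page_inches[0] * dpi)
    height_px = round(page_inches[1] * dpi)
    return width_px * height_px * bit_depth // 8


master = uncompressed_bytes(600)  # master file settings
web = uncompressed_bytes(75)      # web release settings

print(f"600 dpi master: {master / 1e6:.1f} MB per page")
print(f"75 dpi web copy: {web / 1e6:.1f} MB per page")
print(f"ratio: about {master / web:.0f} to 1")
```

Because resolution applies in both dimensions, reducing 600 dpi to 75 dpi (a factor of 8) cuts the pixel count by a factor of roughly 64.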

4. Image manipulation
Once rescanning was completed, refinishing work began to ensure that all images were in good shape. Staff opened each and every image to check the orientation, crop unnecessary borders and clean up unwanted spots. Another staff member was assigned to double-check the quality. This work ensured that all of the printed pages had received the best possible digitisation and that all images were free from defects.

5. Creating table of contents/indexes
To create the table of contents and entries for illustrations and maps, staff manually input the required information into a database to produce an Extensible Markup Language (XML) file based on the Metadata Encoding & Transmission Standard (METS) (http://www.loc.gov/standards/mets/) of the Library of Congress.
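As an illustration of what such a file might look like, the sketch below builds a very small METS-style document with Python's standard library. It is not the Libraries' actual schema or workflow: the entries, IDs, labels and file names are hypothetical placeholders, and the output is not validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

# Hypothetical entries standing in for rows keyed into the database.
ENTRIES = [
    {"id": "F001", "type": "chapter", "label": "Chapter I", "href": "vol1_ch01.pdf"},
    {"id": "F002", "type": "illustration", "label": "Plate 1", "href": "vol1_plate01.pdf"},
]


def build_mets(entries):
    """Build a minimal tree: a fileSec listing files and a structMap pointing at them."""
    def q(tag):
        # Qualify a tag name with the METS namespace.
        return f"{{{METS_NS}}}{tag}"

    root = ET.Element(q("mets"))
    file_grp = ET.SubElement(ET.SubElement(root, q("fileSec")), q("fileGrp"))
    book = ET.SubElement(ET.SubElement(root, q("structMap")), q("div"), {"TYPE": "book"})
    for e in entries:
        # One file element per PDF, with its location as an xlink:href.
        f = ET.SubElement(file_grp, q("file"), {"ID": e["id"]})
        ET.SubElement(f, q("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": e["href"]})
        # One structural div per entry, pointing back at the file by ID.
        div = ET.SubElement(book, q("div"), {"TYPE": e["type"], "LABEL": e["label"]})
        ET.SubElement(div, q("fptr"), {"FILEID": e["id"]})
    return root


xml_text = ET.tostring(build_mets(ENTRIES), encoding="unicode")
print(xml_text)
```

Keeping the file list separate from the structural map means the same PDF can be referenced from more than one place in the contents without duplicating its location.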

6. Output images
This served as the final touch-up for the web release of the e-book. Firstly, the digital images were converted to Portable Document Format (PDF) files. Secondly, watermark stamps and security measures were added to the files to prevent unauthorised use of the e-book. Thirdly, a "table of contents" page with hyperlinks to the respective PDF files was generated based on the XML file created in step 5. With the table of contents in place, users can go straight to a chapter and navigate within the e-book.
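The third task, generating a linked table of contents from the XML file, could be sketched roughly as follows. This is an assumption about the general approach rather than the Libraries' actual tooling: the sample XML, labels and file names are hypothetical, and the output is a bare HTML list rather than a finished page.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {"mets": "http://www.loc.gov/METS/", "xlink": "http://www.w3.org/1999/xlink"}

# Hypothetical METS fragment; the labels and file names are placeholders.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec><mets:fileGrp>
    <mets:file ID="F001"><mets:FLocat LOCTYPE="URL" xlink:href="vol1_ch01.pdf"/></mets:file>
    <mets:file ID="F002"><mets:FLocat LOCTYPE="URL" xlink:href="vol1_plate01.pdf"/></mets:file>
  </mets:fileGrp></mets:fileSec>
  <mets:structMap><mets:div TYPE="book">
    <mets:div TYPE="chapter" LABEL="Chapter I"><mets:fptr FILEID="F001"/></mets:div>
    <mets:div TYPE="illustration" LABEL="Plate 1"><mets:fptr FILEID="F002"/></mets:div>
  </mets:div></mets:structMap>
</mets:mets>"""


def toc_html(mets_xml):
    """Return an HTML list linking each labelled entry to its PDF file."""
    root = ET.fromstring(mets_xml)
    href_attr = f"{{{NS['xlink']}}}href"
    # Map each file ID to the location of its PDF.
    hrefs = {}
    for f in root.iterfind(".//mets:file", NS):
        hrefs[f.get("ID")] = f.find("mets:FLocat", NS).get(href_attr)
    # Emit one link per labelled div, in document order.
    items = []
    for div in root.iterfind(".//mets:div[@LABEL]", NS):
        fptr = div.find("mets:fptr", NS)
        href = hrefs[fptr.get("FILEID")]
        items.append(f'<li><a href="{escape(href)}">{escape(div.get("LABEL"))}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


print(toc_html(SAMPLE))
```

Generating the page from the XML file, rather than writing it by hand, keeps the links consistent with the metadata if entries are later corrected.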

7. Compile the e-book
In this step the e-book PDF files were loaded into Greenstone, a digital library system with a web interface. After access points and navigation methods had been checked and double-checked, the resulting web pages were made available to the public and to search engines such as Google and Yahoo, so that universal “discovery” of this new and unique online title could begin.

Seven Steps to Creating the One Millionth E-Book (Chinese version, translated)

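(1) Choosing a suitable book
After careful consideration, a library task force led by the Librarian, 彭仁賢 (Dr Ferguson), decided to select a title that is unique to the Libraries and no longer restricted by copyright as the one millionth e-book of the University of Hong Kong Libraries. Among the many rare books in the collection, a 1798 edition of 《英使謁見乾隆紀實》 best met these requirements. The book not only gives a vivid record of a historical event; its beautiful illustrations also win readers' admiration.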

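(2) Pre-production preparations
Before digitisation began, the Libraries' technical support team drew on its extensive experience to set out an overall plan for producing this unique e-book and to resolve two main problems in the production workflow. First, some of the illustrations are very large, so the book had to be sent to a professional production company to be scanned with special equipment; the parts no larger than A4 were scanned in-house by library staff. The second problem was how to ensure that this precious book would not be damaged while the e-book was being made. To this end, staff had to take extreme care in every procedure that involved handling the book, and also had to inspect the five-volume set carefully to confirm that no pages were missing and that every page was intact. The Libraries had a paper box made to order for each volume to protect the books from damage in transit to the production company. The scanners used have special book-care features: an SMA 21 ScanFox colour scanner for A4-sized pages and a Zeutschel Omniscan OS10000 A1 colour scanner for pages larger than A4.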

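(3) Capturing images
Scanning began as soon as the books had been inspected and the professional production company selected. The scanners captured images at a resolution of 600 dpi and a colour depth of 24 bits to preserve the original appearance. However, image files scanned at such a high resolution are very large and unsuitable for transmission over the web, so the Libraries had to create a separate version stored at 75 dpi for readers to browse conveniently. Quality control throughout image capture was very strict: staff checked every image, examining whether each was in focus, whether it covered the whole page, and whether all images had been captured. Images that failed were rescanned.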

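(4) Tidying the images
After the failed images had been rescanned, staff again checked every image one by one, correcting orientation, cropping unneeded borders and removing stray spots. Another staff member then reviewed all the images to confirm that their quality met the Libraries' requirements and that they were entirely free of defects.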

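(5) Building the table of contents and indexes
Staff keyed in details of the chapters, illustrations and maps to build a database, which was used to compile an Extensible Markup Language (XML) file conforming to the Metadata Encoding & Transmission Standard (METS) (http://www.loc.gov/standards/mets/) established by the Library of Congress.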

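(6) Outputting images
This is the final stage of e-book production and consists of three steps. The first is to convert all images into Portable Document Format (PDF) files. The second is to add watermarks and other security measures to the files to prevent misuse of the e-book. The last is to use the XML file described above to build the e-book's table of contents page, which lets readers link to the PDF files for individual chapters, illustrations and maps for convenient online reading.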

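(7) Compiling the e-book
Finally, all the PDF files are uploaded to Greenstone, the online library system used by the HKU Libraries. Once staff have confirmed that every link on the web pages works properly, readers can read this unique e-book of the University of Hong Kong Libraries either by logging in to the HKU Libraries directly or through search engines such as Google and Yahoo.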

 
