Python Gutenberg E-text Project
Project Gutenberg produces versions of literary works that are free of copyright restrictions in the form of e-texts (electronic texts). These e-texts can be downloaded from the Project Gutenberg web site or from any of the many mirror sites that are located worldwide.
E-texts can be found in many different formats, some open and some proprietary. PyGERS and other applications from the PyGE project are designed to support e-text material produced by Project Gutenberg in either of two forms: plain text files and zipped text files. However, PyGERS does not directly read these files, but instead reads e-texts that have been converted to a format known as zTxt.
From its inception, Project Gutenberg has distributed e-texts as plain text files to maintain maximum compatibility with the many computing platforms and reading programs that were available. Plain text files contain only the words of the original literary work, without any of the illustrations or specialized formatting often found in published material. These files may usually be identified by their .txt file extension.
To reduce the size of files for downloading, Project Gutenberg also distributes plain text files that have been compressed in the zip format. Zipped files can be uncompressed with commonly available utility programs such as unzip and WinZip. These files may usually be identified by their .zip file extension.
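As an alternative to a separate utility program, a zipped e-text can also be unpacked with Python's standard zipfile module. The following is only a minimal sketch and is not part of the PyGE software; the function name extract_text_members and the choice to keep only .txt members are assumptions made for illustration.

```python
import zipfile
from pathlib import Path


def extract_text_members(zip_path, dest_dir):
    """Extract every .txt member of a zip archive into dest_dir.

    Hypothetical helper for illustration; returns the extracted paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Keep only plain text members; skip anything else in the archive.
            if name.lower().endswith(".txt"):
                extracted.append(Path(archive.extract(name, path=dest)))
    return extracted
```

Calling extract_text_members("book.zip", "etexts") would place any .txt files from the archive under the etexts directory; both names here are placeholders.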
PyGERS reads e-text files that have been converted to the zTxt format. This format uses compression to reduce the size of e-text files, much like zipped text files, while adding support for features such as user-modifiable bookmarks and annotations. Plain text and zipped text files can be converted into zTxt files with the PyGEMZ utility included in the PyGE project software distribution. zTxt files may usually be identified by their .pdb file extension.
A further benefit of zTxt files is that the same files can be read on Palm PDA devices using the free Weasel Reader e-text reading program.
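Because the .pdb extension is shared by other kinds of Palm database files, the extension alone does not confirm that a file holds a zTxt e-text. The sketch below reads the fixed 78-byte Palm database header and compares its four-character type code. It is not taken from PyGERS; the function names are hypothetical, and the type code "zTXT" is an assumption that should be checked against the zTxt format documentation.

```python
import struct

PDB_HEADER_SIZE = 78  # fixed-size header at the start of a Palm database file


def read_pdb_identity(path):
    """Return (name, type_id, creator_id, record_count) from a PDB header."""
    with open(path, "rb") as f:
        header = f.read(PDB_HEADER_SIZE)
    if len(header) < PDB_HEADER_SIZE:
        raise ValueError("file is too short to be a Palm database")
    # Bytes 0-31: NUL-padded database name.
    name = header[:32].split(b"\x00", 1)[0].decode("latin-1")
    # Bytes 60-63: type code; bytes 64-67: creator code.
    type_id = header[60:64].decode("latin-1")
    creator_id = header[64:68].decode("latin-1")
    # Bytes 76-77: number of records, big-endian unsigned short.
    (record_count,) = struct.unpack(">H", header[76:78])
    return name, type_id, creator_id, record_count


def looks_like_ztxt(path, expected_type="zTXT"):
    """Assumption: zTxt databases carry the type code given in expected_type."""
    try:
        return read_pdb_identity(path)[1] == expected_type
    except (OSError, ValueError):
        return False
```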
E-text files may be opened for display by PyGERS in one of two ways: invoking a file open dialog or selecting a previously opened file from the history list.
Invoking the File->Open menu command will result in a dialog box with the title "Choose a file to read". This dialog box can be used to locate a zTxt file to open and display in PyGERS. The e-text file will be opened and have its title page displayed when the OK button of the dialog box is clicked. Clicking on the Cancel button will abort the open operation.
PyGERS automatically remembers e-texts that have recently been opened for reading. Clicking on the File menu entry will reveal a history list containing the file locations of up to nine previously opened e-texts. Clicking on one of the items in the history will open it in PyGERS.
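A bounded history list of this kind can be sketched in a few lines of Python. This is an illustration only, not the PyGERS implementation: the function name remember_opened is hypothetical, and the most-recent-first ordering and the removal of duplicate entries are assumptions that the description above does not specify.

```python
MAX_HISTORY = 9  # the history list holds up to nine entries


def remember_opened(history, path, limit=MAX_HISTORY):
    """Return a new history with path at the front, trimmed to limit entries.

    Assumptions: newest entry first, and a re-opened file moves to the
    front instead of appearing twice.
    """
    updated = [path] + [p for p in history if p != path]
    return updated[:limit]
```

For example, re-opening a file that is already third in the list would move it to the top, and opening a tenth distinct file would drop the oldest entry.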
Finding a complete and accurate book title or author name within the contents of an e-text is not always easy. The information is often hard to isolate reliably from the surrounding text, and it may have been truncated or abbreviated. To assign accurate title and author information to e-texts, PyGERS can use an index file that relates title and author data to the e-text number that identifies each Project Gutenberg work.
The index file may be created with the PyGETS program. A sample index file is included with the PyGE software distribution under the "SampleData" directory. The name of the sample index file is "gutenberg.idx".
The location of an index file needs to be entered into PyGERS only once. Once the location is set, PyGERS will use the same index file in all future sessions unless a new file location is entered. An index file is read with the File->Read index menu command, which opens a dialog box with the title "Identify a Project Gutenberg index file" that can be used to locate the index. Index files will typically have the file extension .idx.
Last modified: Mon Aug 04 19:45:06 PST 2003
Copyright © Gary Shao, 2003. All rights reserved.