Tuesday, August 18, 2020
Thursday, August 13, 2020
FIELD SELECTION TABLE (FST)
4.6 FIELD SELECTION TABLE (FST)
This is perhaps the most difficult of the four forms to understand.
CDS/ISIS has two ways of finding information in the database, which can be compared with the two ways of finding information in a book. Suppose we have a book on architecture and we want to find any mention of cathedrals. One method is to start at page 1 and scan each page in turn to see whether ‘cathedrals’ occurs on that page. This is known as a ‘serial’ or ‘sequential’ search, because we are searching through the pages in sequence. It would be quite a reliable method (provided we could keep up the concentration) but it would take a long time if the book had several hundred pages.
A much quicker method is to make use of the index (provided that the book has one). We look under C, find ‘cathedrals’, and then see an entry something like:
cathedrals 30, 212, 360
Now we can go straight to those page numbers and read what is said about cathedrals. This method might not be quite so reliable, since it depends on the skills of the indexer. He or she might have considered some mentions of ‘cathedrals’ to be too insignificant to index.
CDS/ISIS allows both these approaches to information retrieval. The first method, scanning through the records sequentially and examining the text contained in each record, is known as free-text searching. It is likely to be a slow process when the database contains more than a few hundred records. The second method, using an index, is the normal way of searching. CDS/ISIS allows you to set up the index automatically and refers to it as the index or inverted file. (The list of terms in the index without the details of their occurrences is also referred to as the terms dictionary.)
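The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration of the idea, not how CDS/ISIS stores its inverted file; the sample records and the function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical sample records, keyed by record number (MFN).
records = {
    1: "Gothic cathedrals of France",
    2: "Office buildings",
    3: "Cathedrals and abbeys of England",
}

def free_text_search(records, word):
    """Sequential search: examine every record in turn."""
    word = word.lower()
    return [mfn for mfn, text in sorted(records.items())
            if word in text.lower().split()]

def build_inverted_file(records):
    """Map each word to the sorted list of records that contain it."""
    index = defaultdict(set)
    for mfn, text in records.items():
        for word in text.lower().split():
            index[word].add(mfn)
    return {term: sorted(mfns) for term, mfns in index.items()}

inverted_file = build_inverted_file(records)
# The terms dictionary is the list of terms without their occurrences.
terms_dictionary = sorted(inverted_file)

print(free_text_search(records, "cathedrals"))  # scans all records
print(inverted_file.get("cathedrals", []))      # a single lookup
```

Both calls find the same records, but the sequential search has to read every record each time, whereas the index is built once and then consulted directly.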
The selection of terms from the database records to go on to the index file is controlled by the Field Selection Table. It is not possible for the computer to select terms according to their significance. Instead the selection depends upon three rules:
i. Which fields from the record are to be indexed (e.g. you probably want authors indexed but not the publisher or the number of pages).
ii. How the index terms are to be constructed from the data in these fields (called the indexing technique). For example, do you want the title ‘Good secretarial practice’ as a whole field under ‘G’, or do you want it split up into separate words so that ‘secretarial’ can be searched under ‘s’?
iii. You can specify a list of stopwords which are not to be used on their own as index terms, e.g. ‘in’, ‘of’ and ‘the’.
CDS/ISIS allows much flexibility in specifying each of these three rules. It is important to consider them carefully, since they determine what searches will be possible on the database. For instance, if you index authors as separate words, then ‘Walpole, Horace’ will appear under ‘Horace’ and under ‘Walpole’: you cannot search for him as ‘Walpole, Horace’. If you index titles as whole fields, then ‘The Concise Oxford Dictionary of Quotations’ cannot be searched under ‘Dictionary’ or under ‘Quotations’. It is, in fact, possible in CDS/ISIS to index the same field in more than one way.
If you have divided the field into subfields, you can index different subfields by different techniques (or some subfields but not others).
Each line of the Field Selection Table comprises three elements: the Tag or Name, the Technique and the Format. You need to make an entry in the table for each field you want to index (i.e. to make searchable) and if the same field is indexed in two ways you need two entries for it.
Again, if you are unsure about writing FSTs, it would be a good idea to engage the services of the Dictionary Assistant. This will give you a dialog box like the one in Figure 4.3.
Figure 4.3 Dictionary Assistant dialog box
All you need do is to choose which technique to apply and which fields to index. The listbox on the right shows the techniques available. The two most commonly used are 0 – by line and 4 – by word.
0 means that the whole field contents will be indexed as a single term.
1 means index each subfield separately and so is relevant only if the field is divided into subfields.
2 means index only words or phrases which have been entered between angle brackets, e.g. <inflation rate>. This technique can be used to select particular terms from a lengthy piece of text such as an abstract. Some CDS/ISIS users like to enter descriptors this way and use technique 2 to index them.
3 is similar to 2 but indexes terms entered between slashes, e.g. /Windward Islands/.
4 signifies that each word in the field will be indexed separately (except stopwords – see Section 4.7). If the field is divided into subfields, you must specify mode mhl or mdl in the extraction format – see Section 5.2.
Other values are also available and are explained in the Reference manual. If you choose one of the values 5 to 8 you will have to edit the format manually to put in the required prefix. For help on choosing the right technique please see Section 4.8.
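To make the difference between the techniques concrete, here is a rough Python sketch of techniques 0, 2, 3 and 4 applied to a piece of text. It is a simplification for illustration only: the function name and the stopword list are assumptions, technique 1 and the prefix techniques 5 to 8 are left out, and any further processing that CDS/ISIS applies to terms is not modelled.

```python
import re

# Hypothetical stopword list; in CDS/ISIS this comes from the stopword file.
STOPWORDS = {"in", "of", "the"}

def extract_terms(technique, text, stopwords=STOPWORDS):
    """Return the index terms produced from text by the given technique."""
    if technique == 0:
        # Whole field contents as a single term.
        return [text] if text else []
    if technique == 2:
        # Only words or phrases entered between angle brackets.
        return re.findall(r"<([^<>]+)>", text)
    if technique == 3:
        # Only words or phrases entered between slashes.
        return re.findall(r"/([^/]+)/", text)
    if technique == 4:
        # Every word separately, except stopwords.
        words = re.findall(r"[^\W_]+", text)
        return [w for w in words if w.lower() not in stopwords]
    raise ValueError(f"technique {technique} is not covered by this sketch")

title = "The Concise Oxford Dictionary of Quotations"
print(extract_terms(0, title))
print(extract_terms(4, title))
print(extract_terms(2, "A rise in the <inflation rate> during a <recession>"))
print(extract_terms(3, "Bananas from the /Windward Islands/"))
```

With technique 0 the title yields one term and can be found only under its first word; with technique 4 it yields four terms, since ‘The’ and ‘of’ are dropped as stopwords.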
Now click the check boxes against the fields you want to be indexed (i.e. searchable) and finally click OK. The FST is then displayed and you can edit it if necessary. Using the Dictionary Assistant, all the fields selected are indexed by the same technique: if you want to apply different techniques to different fields, you will need to make changes here.
Each entry in the FST has three parts. In the top part of the dialog box the entry being edited is shown in three separate boxes. In the Entries box each entry is shown on one line with spaces between the three parts.
The first value, which was called the ID in the DOS version of CDS/ISIS, is normally the same as the tag of the field from which the terms come. (It does not have to be, but this usually makes searching easier.) It can be used to specify the type of term when searching, as we shall see in chapter 7. If you choose a number that corresponds to a field tag, Winisis will show the field name in the Tag/Name box when you are editing it. If you choose a number that does not correspond to a field tag, it will be shown as the number followed by “FST Tag”.
The second value, the indexing technique, specifies how the index terms are to be extracted as explained above.
The third value, the format, shows which field in the record the terms are to come from. As in the display format, fields are specified with v in front of their tags.
So, if the title field has a tag 200 and we want to index each individual word, the entries would be:
Tag/Name: 200 Title Technique: 4 Format: v200
and if the author field is 100 and we want to index the author name as a whole:
Tag/Name: 100 Author Technique: 0 Format: v100
If we want to index only subfield a of field 100 we could specify:
Tag/Name: 100 Author Technique: 0 Format: v100^a
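As a rough illustration of how such entries could be read and applied, the Python sketch below parses entries written one per line with spaces between the three parts, and resolves only the two simplest format forms used above (vNNN and vNNN^x). The function names, the sample record and the way subfields are stored are assumptions made for the example; the real formatting language is far richer (see Chapter 5).

```python
import re

def parse_fst(text):
    """Split each non-empty line into (ID, technique, format)."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        ident, technique, fmt = line.split(None, 2)
        entries.append((int(ident), int(technique), fmt))
    return entries

def extract_field(record, fmt):
    """Resolve a format of the form vNNN or vNNN^x against a record."""
    match = re.fullmatch(r"v(\d+)(?:\^(\w))?", fmt)
    if match is None:
        raise ValueError(f"format {fmt!r} is not covered by this sketch")
    value = record.get(int(match.group(1)), "")
    subfield = match.group(2)
    if subfield is None:
        return value
    # Subfields are assumed to be stored as ^a..., ^b... within the field.
    found = re.search(r"\^" + subfield + r"([^^]*)", value)
    return found.group(1) if found else ""

# Hypothetical record: tag -> field contents.
record = {100: "^aWalpole^bHorace", 200: "The Castle of Otranto"}

fst = parse_fst("""
200 4 v200
100 0 v100^a
""")

for ident, technique, fmt in fst:
    print(ident, technique, repr(extract_field(record, fmt)))
```

The extracted text would then be turned into index terms according to the technique given in the second part of each entry.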
This dialog box works in a similar way to the one for the FDT. When you have entered the data for each field, the focus will be on the Add button. Either click on the button or press {Enter} to add the field to the table (displayed in the Entries box). If you need to correct the details for any entry, just click on that entry in the Entries box and the details will be copied into the boxes used for editing. If you need to remove an entry, highlight it and click the Delete button. An example of an FST is shown in Figure 4.4.
Figure 4.4 Example of a Field Selection Table (FST)
For more information on writing the data extraction format, please see Chapter 5, especially Section 5.2 for dealing with subfield markers and Section 5.5 for dealing with repeated fields.
Again, do not be too concerned to get the Field Selection Table right first time. It is best to try it out on a few sample records and look at the index terms produced. If they are not what you want, edit the FST and then regenerate the inverted file.
When you have completed your entries in the Field Selection Table, click the Terminate button. You are then asked to confirm that you want the database to be created. Click Yes and your wish should be granted. You are then invited to select a database to work on: you can choose the one you have just created or a previous one.
DEFAULT DISPLAY FORMAT
4.5 DEFAULT DISPLAY FORMAT
The display format means the way that the records will appear when you browse the database or display search results. Display formats can also be used in producing printed output. There must be at least one display format for the database and that must have the same filename as the database. You can always create more formats, or modify existing ones, later.
A message box appears asking “Do you want Winisis to launch the Print Format Assistant?” and you can choose Yes or No. If you are new to CDS/ISIS, or if you just want an off-the-peg format to save time, click Yes. You are then given the choice of five pre-defined formats. The order of fields will be the same as in the Field Definition Table.
Normal style. This uses font 2 (normally Times Roman) and colour 4 (normally blue) and gives a display with the field names in one column and the data in the next.
CDS/ISIS DOS compatible format. This is similar to the Normal style but it uses only black text and Courier font, and features which are within the capabilities of the DOS version of CDS/ISIS.
Decorated format. Three fonts and various colours feature in the format. The record number (MFN) and the name of the database appear in a box and the field names appear in italics.
HTML normal. This is a format using very basic HTML (HyperText Markup Language), the language used to create pages for the World Wide Web. No HTML tags are included to separate the contents of one record from the next.
HTML table with headers. This again incorporates HTML tags and displays field names and their contents in the form of a table.
Once the format has been created, it will be displayed in case you wish to edit it. The next chapter describes the formatting language in some detail, but just to give you a taste:
a) Fields are specified by using v (for variable) in front of the tag: thus v10 means display the contents of field 10.
b) Text between single or double inverted commas forms a literal and will appear in the display just as it is written.
c) The slash (/) means start a new line here.
A simple format for a database containing fields 10, 20 and 30 could be:
v10,v20,v30
This would display field 10, immediately followed by field 20, immediately followed by field 30, e.g.
Walton, C.Good office management practice1990
To display the fields on different lines, they should be separated by slashes, e.g.
v10/v20/v30
This would display the above example as:
Walton, C.
Good office management practice
1990
Unlike in the DOS version of CDS/ISIS, you can use carriage returns in the format to make it easier to read, e.g.
'Author: ' v10/
'Title: ' v20/
'Date: ' v30
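The three rules above are enough to write a toy interpreter, which may help to show how a format is read. The Python sketch below handles only field selectors, literals, slashes and commas, and treats single-quoted and double-quoted literals alike; the function name and the sample record are assumptions for the example, and the full formatting language described in the next chapter does much more.

```python
import re

# One token of the simplified format language.
TOKEN = re.compile(r"""
    v(?P<tag>\d+)            # field selector such as v10
  | '(?P<single>[^']*)'      # literal in single inverted commas
  | "(?P<double>[^"]*)"      # literal in double inverted commas
  | (?P<newline>/)           # slash: start a new line
  | [,\s]+                   # commas, spaces, carriage returns: ignored
""", re.VERBOSE)

def render(fmt, record):
    """Produce the display text for one record (tag -> contents)."""
    out = []
    pos = 0
    while pos < len(fmt):
        match = TOKEN.match(fmt, pos)
        if match is None:
            raise ValueError(f"unsupported format syntax at position {pos}")
        if match.group("tag") is not None:
            out.append(record.get(int(match.group("tag")), ""))
        elif match.group("single") is not None:
            out.append(match.group("single"))
        elif match.group("double") is not None:
            out.append(match.group("double"))
        elif match.group("newline"):
            out.append("\n")
        pos = match.end()
    return "".join(out)

# Hypothetical record matching the example in the text.
record = {10: "Walton, C.", 20: "Good office management practice", 30: "1990"}

print(render("v10,v20,v30", record))
print(render("v10/v20/v30", record))
print(render("'Author: ' v10/\n'Title: ' v20/\n'Date: ' v30", record))
```

Because spaces and carriage returns outside literals are skipped, the last format can be spread over several lines without affecting the output.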
Do not worry about getting your display format right first time. It is best to try the format out when you have entered a few records and then edit it as necessary. When you have used the services of the Assistant, or you have written your own format, click the green arrow to go on.
FIELD DEFINITION TABLE (FDT)
4.3 FIELD DEFINITION TABLE (FDT)
The FDT defines the fields that may be present in the database and certain parameters for each field. You enter the values in the boxes at the top of the dialog box. In the DOS version, the FDT had little effect – you could repeat a field or enter subfields regardless of what the FDT said. The Windows version is much stricter and you need to be more careful about your definition (although you can always change it later).
The boxes are as follows:
(a) Tag -- see above. You can use the up and down arrows if you like to select the number, or type it in.
(b) Name -- this is to help you identify the field. It can be up to 31 characters long and can contain spaces. When you come to set up the data entry worksheet, this name will be used as the prompt for the field, but you can override it there. It is also used to specify the field in the “Guided Search” form.
(d) Type. Unless you can predict that the field will contain only letters (no spaces or punctuation) or only figures (no symbols or decimal point) it is best to leave this as Alphanumeric. The other possible values are Alphabetic or Numeric. The beginner is strongly advised to use Alphanumeric.
(e) Rep[eatable]. If you want to allow multiple occurrences of this field, e.g. several authors or several descriptors, click this checkbox.
(f) Pattern/subfields. If you are dividing the field into subfields, you should list the subfields here (without punctuation or spaces) e.g.
abc
If you are not using subfields, press the ( key to leave this box blank. Pattern fields are not supported in Version 1.4.
When you have entered the data for each field, the focus will be on the Add button. Either click the button or press {Enter} to add the field to the table (displayed in the large box). If you need to correct the details for any field, just click on that entry in the large box and the details will be copied into the boxes used for editing. If you need to remove an entry, highlight it and click the Delete Entry button. You can alter the order of fields by selecting a field and clicking the up-arrow or down-arrow key: they do not have to be in numeric order, though that is usually clearest. An example of an FDT is shown in Figure 4.1.
Figure 4.1 Example of a Field Definition Table (FDT)
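For readers who find it easier to think in terms of data structures, the FDT can be pictured as a list of entries like the ones in this Python sketch. The class name, attribute names and sample fields are invented for illustration and do not reflect how Winisis stores the table; the checks simply mirror the rules given above (name of up to 31 characters, one of three types, subfields listed without punctuation or spaces).

```python
from dataclasses import dataclass

FIELD_TYPES = {"Alphanumeric", "Alphabetic", "Numeric"}

@dataclass
class FieldDefinition:
    tag: int
    name: str
    field_type: str = "Alphanumeric"  # the value recommended for beginners
    repeatable: bool = False
    subfields: str = ""               # e.g. "abc"; empty if no subfields

    def __post_init__(self):
        if len(self.name) > 31:
            raise ValueError("field name may be at most 31 characters")
        if self.field_type not in FIELD_TYPES:
            raise ValueError(f"unknown field type: {self.field_type}")
        if self.subfields and not self.subfields.isalnum():
            raise ValueError("list subfields without punctuation or spaces")

# A hypothetical FDT for a small bibliographic database.
fdt = [
    FieldDefinition(100, "Author", repeatable=True, subfields="ab"),
    FieldDefinition(200, "Title"),
    FieldDefinition(210, "Date of publication"),
]

for field in fdt:
    print(field.tag, field.name, field.field_type,
          "repeatable" if field.repeatable else "not repeatable",
          field.subfields or "-")
```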
After entering all the fields, click the button with the green arrow to go on.