Create a Stopword File - Full-Text Retrieval (FTR) - Help

Stopword files are supported with Directa, but not SmartPlant Foundation.

A stopword file lists words that occur so frequently in a collection that they provide no search value (such as and, the, of, to, and for). To reduce indexing overhead, you can exclude these words from the index.

The optional stopword file helps make searches on your collections more effective. Be careful when deciding which words to include in it. Include a word only if it is known to have no search value in most contexts. For example, do not include the word a, because the letter a can carry meaning in some contexts, such as Appendix A or Section A. If a is included in the stopword file, these entries cannot be searched.
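The effect described above can be illustrated with a minimal Python sketch. It is not FTR's tokenizer or index format, neither of which is described in this topic; the function name index_terms and the whitespace-based, lowercased splitting are assumptions made only for illustration.

```python
# Illustrative only: a toy stand-in for an indexer, not FTR's actual behavior.
def index_terms(text, stopwords):
    """Return the lowercased terms that would be kept for indexing."""
    return [word for word in text.lower().split() if word not in stopwords]


safe_list = {"the", "of", "to", "and"}
risky_list = safe_list | {"a"}  # adding "a" also drops the A in "Appendix A"

print(index_terms("See Appendix A of the manual", safe_list))
# ['see', 'appendix', 'a', 'manual']
print(index_terms("See Appendix A of the manual", risky_list))
# ['see', 'appendix', 'manual']
```

With the second list, the term a is never indexed, so a search for Appendix A has nothing to match on the letter.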

The maximum size of a stopword file is 32,768 bytes (32KB). Additionally, the stopword list must be smaller than 48KB after internal translation to UTF-8.
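Both limits can be checked before indexing with a short script. The following is a sketch under stated assumptions: 48KB is taken to mean 49,152 bytes, and the file is assumed to be stored in a single-byte encoding (latin-1 here), because this topic does not say which encoding FTR expects. The function names are hypothetical.

```python
MAX_FILE_BYTES = 32_768      # file size limit stated in this topic (32KB)
MAX_UTF8_BYTES = 48 * 1024   # assumption: 48KB means 49,152 bytes


def check_stopword_bytes(raw, source_encoding="latin-1"):
    """Return a list of problems; an empty list means both limits are met.

    source_encoding is an assumption; use the encoding your file is saved in.
    """
    problems = []
    if len(raw) > MAX_FILE_BYTES:
        problems.append(f"file is {len(raw)} bytes; the limit is {MAX_FILE_BYTES}")
    # Approximate the internal translation by re-encoding the text as UTF-8.
    utf8_size = len(raw.decode(source_encoding).encode("utf-8"))
    if utf8_size >= MAX_UTF8_BYTES:
        problems.append(
            f"UTF-8 form is {utf8_size} bytes; it must be smaller than {MAX_UTF8_BYTES}"
        )
    return problems


def check_stopword_file(path, source_encoding="latin-1"):
    """Read the file as raw bytes and apply both checks."""
    with open(path, "rb") as f:
        return check_stopword_bytes(f.read(), source_encoding)


print(check_stopword_bytes(b"the\nof\nto\n"))  # []
```

The second check matters mainly for lists with many non-ASCII characters, which take more than one byte each in UTF-8.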

You can create a stopword file with any text editor. The words are listed with a single word on each line. The following is a sample stopword file:

the
of
to
and
in
that
for
by
as
be
or
this
which
with
at
an
from
under
such
there
other
if
but
upon
where
these
when
whether
also
than
after
within
before
because
without
however
between
those
since
into
out
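A file in this layout can also be generated from a word list instead of typed by hand. The sketch below assumes only what this topic states, that each line holds a single word; the function names, the duplicate removal, and the latin-1 encoding are illustrative choices, not FTR requirements.

```python
def build_stopword_text(words):
    """Return the file content: one word per line, duplicates removed."""
    kept = []
    for word in words:
        word = word.strip()
        if not word:
            continue  # skip empty entries
        if len(word.split()) != 1:
            raise ValueError(f"not a single word: {word!r}")
        if word not in kept:
            kept.append(word)
    return "".join(w + "\n" for w in kept)


def write_stopword_file(path, words, encoding="latin-1"):
    """Write the list to path. The encoding is an assumption, not an FTR rule."""
    with open(path, "w", encoding=encoding) as f:
        f.write(build_stopword_text(words))


print(build_stopword_text(["the", "of", " to ", "the", ""]), end="")
# the
# of
# to
```

Rejecting entries that contain spaces keeps a phrase such as "such as" from being written as one line by mistake.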

If you change the stopword file after a collection has been indexed, you must re-index the entire collection.

For more information on character classes, see Character Classes.

After you create the stopword file, you must edit the my_collection.cfg file and re-index the entire collection before the stopword file takes effect.

To edit my_collection.cfg, open the file in a text editor such as WordPad. Then add the STP variable and set it to the full path of the stopword file. For example, add the following line to the my_collection.cfg file:

STP = D:\ftr\config\cc_all.stp

The following is an example of the my_collection.cfg file. FTR creates this file when the first operation that writes information to the collection completes.

CAT = my_collection.cat
CIX = my_collection.cix
DCT = my_collection.dct
REF = my_collection.ref
DUP = my_collection.dup
RUP = my_collection.rup
SRT = my_collection.srt
LOG = my_collection.log
PTH = D:\fultext
STP = D:\ftr\config\cc_all.stp
OPT:a
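If several collections need the same change, the STP edit can be scripted. The following is a minimal sketch, assuming the file is plain text with one NAME = value entry per line as in the example above. The function name set_stp is hypothetical, and placing a new STP line ahead of the OPT line only mirrors the example; this topic does not say whether the order matters.

```python
def set_stp(cfg_text, stopword_path):
    """Return cfg_text with the STP variable set to stopword_path."""
    new_line = f"STP = {stopword_path}"
    out = []
    replaced = False
    for line in cfg_text.splitlines():
        name = line.split("=", 1)[0].strip().upper()
        if "=" in line and name == "STP":
            out.append(new_line)  # update an existing STP entry
            replaced = True
        else:
            out.append(line)
    if not replaced:
        # Mirror the example layout: put STP ahead of any OPT line.
        index = next(
            (i for i, l in enumerate(out) if l.strip().upper().startswith("OPT")),
            len(out),
        )
        out.insert(index, new_line)
    return "\n".join(out) + "\n"


cfg = "CAT = my_collection.cat\nPTH = D:\\fultext\nOPT:a\n"
print(set_stp(cfg, r"D:\ftr\config\cc_all.stp"), end="")
# CAT = my_collection.cat
# PTH = D:\fultext
# STP = D:\ftr\config\cc_all.stp
# OPT:a
```

Editing the file this way does not replace re-indexing: the entire collection must still be re-indexed before the stopword file is used.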

See Also

FTR-Supplied Stopword Files