Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
Rich H.J for C programmers.2006.pdf
Скачиваний:
19
Добавлен:
23.08.2013
Размер:
1.79 Mб
Скачать

Fast String Searching: s: (Symbols)

If you find your program taking a lot of time matching strings, you can create symbols representing the strings and then match the symbols rather than the strings themselves. The interpreter uses special code to make symbol-matching very fast.

Symbol is an atomic data type (like numeric, literal, and box). In a noun of the symbol type, each atom represents a boxed character string. You create a symbol with monad s: which has infinite rank. s: y takes an array of boxed strings y and creates an array of symbols of the same shape as y :

]sym =. s: 2 2$'abc';'def';'ghi';'jk' `abc `def

`ghi `jk $sym

2 2

The '`' characters are a clue that sym is an array of symbols. The value of the top-left atom of sym is not '`abc' or 'abc'; it is a value understood only by the interpreter. The interpreter chooses to display the text associated with the symbol, but that text is actually stored in the interpreter's private memory.

y in s: y can be a character string which is chopped into pieces using the leading character as a separator; each piece is then converted to a symbol. This is a handy way of creating a short list of symbols:

s: '`abc`ghi' `abc `ghi

Symbols can be operands of any verb that does not perform arithmetic; in addition, comparison between symbols is allowed with 'less than' defined to mean 'earlier in alphabetical order'.

a =. s: '`abc`def`ghi`jk' defines a list of 4 symbols.

a i. s:<'ghi'

2

We create a symbol to represent 'ghi' and find that in the list. a i. <'ghi'

4

Note: the boxed string <'ghi' is not a symbol, so it is not found in the list.

Dyad s: has a number of forms for operating on symbols. The only one of interest to us here is 5 s: y which converts each symbol in y to its corresponding boxed string:

5 s: 3 1 { a +--+---+ |jk|def| +--+---+

When a string is converted to a symbol, the interpreter allocates internal resources to hold the string's value and other information. There is no way to tell the interpreter to free the resources for a single string; this can be a problem if your symbol table is large and changes dynamically. It is possible to clear the entire symbol table (using

215

y=.0 s: 10 and 10 s: y), but doing so invalidates any symbols previously created by s: y .

If you would like to do high-speed matching but what you want to match is not a string, consider converting to strings using 5!:5 <'y' which converts the variable named y to string form.

Fast Searching: m&i.

x i. y usually starts by creating a hash table of x and y and then looks for matches in the hash. If you repeatedly use the same x, in other words if you are doing many lookups in the same table, the has of x is recomputed at each invocation of i. . You can have the hash computed once by defining a search verb for the table with

search =: x&i.

The hash is computed when the verb is defined, and subsequent lookup via search y will be faster.

CRC Calculation

x 128!:3 y computes the CRC of the string y . x is the CRC polynomial, a Boolean list. Normally the initial CRC is _1 but you can specify a different initial value by making x a 2-element boxed list of polynomial;initial_value .

If you define a CRC verb as crc =: x&(128!:3)

then the interpreter will precompile the CRC polynomial for x, making subsequent CRC calculations faster.

Unicode Characters: u:

2-byte unicode characters can be represented by variables that have the unicode atomic data type. Such variables are created by the verb u: . Its use is described in the Dictionary.

Window Driver And Form Editor

Designing user interfaces is quick and painless with J's Form Editor. The Lab named Form Editor will show you how.

216