Добавил:

Upload Опубликованный материал нарушает ваши авторские права? Сообщите нам.

Вуз:

Национальный исследовательский университет «Высшая школа экономики»

Предмет:

[НЕСОРТИРОВАННОЕ]

Файл:

An_Introduction_to_Information_Retrieval.pdf

Скачиваний:

421

Добавлен:

26.03.2016

Размер:

6.9 Mб

Скачать

☆

<<< < Предыдущая 1 23 / 1213 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 > Следующая >>>

DRAFT! © April 1, 2009 Cambridge University Press. Feedback welcome.

xxvii

Table of Notation

Symbol	Page	Meaning
γ	p. 98	γ code

γp. 256 Classiﬁcation or clustering function: γ(d) is d’s class

or cluster

p. 256 Supervised learning method in Chapters 13 and 14:

(D) is the classiﬁcation function γ learned from training set D

λp. 404 Eigenvalue

~µ(.)	p. 292 Centroid of a class (in Rocchio classiﬁcation) or a
	cluster (in K-means and centroid clustering)

Θ(·)

ω, ωk

arg maxx f (x) arg minx f (x) c, cj

cft

p. 114 Training example p. 408 Singular value

p. 11 A tight bound on the complexity of an algorithm p. 357 Cluster in clustering

p. 357 Clustering or set of clusters {ω1, . . . , ωK}

p. 181 The value of x for which f reaches its maximum p. 181 The value of x for which f reaches its minimum p. 256 Class or category in classiﬁcation

p.89 The collection frequency of term t (the total number of times the term appears in the document collection)

Cp. 256 Set {c1, . . . , cJ} of all classes

Cp. 268 A random variable that takes as values members of

xxviii

Table of Notation

Cp. 403 Term-document matrix

dp. 4 Index of the dth document in the collection D

dp. 71 A document

~	p. 181 Document vector, query vector
d,~q

Dp. 354 Set {d1, . . . , dN } of all documents

Dc	p. 292 Set of documents that is in class c

Dp. 256 Set {hd1, c1i, . . . , hdN, cNi} of all labeled documents

	in Chapters 13–15
dft	p. 118 The document frequency of term t (the total number
	of documents in the collection the term appears in)

Hp. 99 Entropy

HM	p. 101	Mth harmonic number
I(X; Y)	p. 272	Mutual information of random variables X and Y
idft	p. 118	Inverse document frequency of term t

Jp. 256 Number of classes

kp. 290 Top k items from a set, e.g., k nearest neighbors in

kNN, top k retrieved documents, top k selected features from the vocabulary V

kp. 54 Sequence of k characters

Kp. 354 Number of clusters

Ld	p. 233	Length of document d (in tokens)
La	p. 262	Length of the test document (or application docu-
		ment) in tokens
Lave	p. 70	Average length of a document (in tokens)

Mp. 5 Size of the vocabulary (|V|)

Ma	p. 262	Size of the vocabulary of the test document (or ap-
		plication document)
Mave	p. 78	Average size of the vocabulary in a document in the
		collection
Md	p. 237	Language model for document d

Np. 4 Number of documents in the retrieval or training

		collection
Nc	p. 259	Number of documents in class c
N(ω)	p. 298	Number of times the event ω occurred

Table of Notation

O(·)

P(·)

si si

sim(d1, d2)

Tct

t tft,d

xxix

p. 11 A bound on the complexity of an algorithm p. 221 The odds of an event

p. 155 Precision p. 220 Probability

p. 465 Transition probability matrix p. 59 A query

p. 155 Recall p. 58 A string

p. 112 Boolean values for zone scoring

p. 121 Similarity score for documents d1, d2

p. 43 Total number of tokens in the document collection

p.259 Number of occurrences of word t in documents of class c

p. 4 Index of the tth term in the vocabulary V p. 61 A term in the vocabulary

p.117 The term frequency of term t in document d (the total number of occurrences of t in d)

p.266 Random variable taking values 0 (term t is present) and 1 (t is not present)

Vp. 208 Vocabulary of terms {t1, . . . , tM} in a collection (a.k.a.

		the lexicon)
~v(d)	p. 122	Length-normalized document vector
~	p. 120	Vector of document d, not length-normalized
V(d)	p. 120	Vector of document d, not length-normalized
wft,d	p. 125	Weight of term t in document d

wp. 112 A weight, for example for zones or terms

w~ T~x = b	p. 293	Hyperplane; w~ is the normal vector of the hyper-
		plane and wi component i of w~
~x	p. 222	Term incidence vector ~x = (x1, . . . , xM); more gen-
		erally: document feature representation

Xp. 266 Random variable taking values in V, the vocabulary

(e.g., at a given position k in a document)

Xp. 256 Document space in text classiﬁcation

\|A\|	p. 61	Set cardinality: the number of members of set A
\|S\|	p. 404	Determinant of the square matrix S

xxx		Table of Notation
\|si\|	p. 58	Length in characters of string si
\|~x\|	p. 139	Length of vector ~x
\|~x −~y\|	p. 131	Euclidean distance of ~x and ~y (which is the length of
		(~x −~y))

<<< < Предыдущая 1 23 / 1213 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 > Следующая >>>

Соседние файлы в предмете [НЕСОРТИРОВАННОЕ]

#
01.04.2025356.35 Кб0analiz.doc
#
01.05.2025448.51 Кб0Analiz_i_upravlenie_zatratami_Verkholantseva_O_...doc
#
02.06.20151.08 Mб6Anderson_Rio_Gangster.pdf
#
18.12.2018721.41 Кб6antigtu.ru-shpora_po_teorii_veroyatnosti_disper....doc
#
02.06.2015108.54 Кб13Antipeva_chto_to_25_04_14.doc
#
26.03.20166.9 Mб421An_Introduction_to_Information_Retrieval.pdf
#
02.06.2015833.24 Кб3APK_(01.01.2012).rtf
#
02.06.2015846.45 Кб4APK_(24.09.2012).rtf
#
26.03.2016355.36 Кб14Arabic_London.docx
#
01.05.2025175.1 Кб0Arkhitektura.doc
#
07.09.201923.88 Кб4Armenia.docx