Bioinformatics_lectures / lecture9.pptx

LECTURE 9

MACHINE LEARNING OVERVIEW

Ultimately about writing programs which improve with experience

Experience through data

Experience through knowledge

Experience through experimentation (active)

Some common tasks:

Concept learning for prediction

Clustering

Association rule mining

ML

Machine learning consists in programming computers to optimize a performance criterion by using example data or past experience.

The optimized criterion can be the accuracy of a predictive model (in a modelling problem) or the value of a fitness or evaluation function (in an optimization problem).

LEARNING

In a modelling problem, the term ‘learning’ refers to running a computer program to induce a model from training data or past experience.

Machine learning uses statistical theory when building computational models since the objective is to make inferences from a sample.

The two main steps in this process are to induce the model by processing a large amount of data, and to represent the model and make inferences efficiently.
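The two steps above (inducing a model from training data, then using it to make inferences) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nearest-centroid model, the function names induce_model, predict and accuracy, and the toy data are all invented here and do not come from the lecture.

```python
from collections import defaultdict

def induce_model(training_data):
    """Induce a nearest-centroid model: one mean feature vector per class label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in training_data:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def predict(model, features):
    """Infer the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def accuracy(model, labelled_data):
    """Performance criterion: fraction of examples predicted correctly."""
    correct = sum(1 for features, label in labelled_data if predict(model, features) == label)
    return correct / len(labelled_data)

# Hypothetical toy data: (features, label) pairs invented for illustration.
training = [([1.0, 1.0], "a"), ([1.2, 0.8], "a"), ([4.0, 4.2], "b"), ([3.8, 4.0], "b")]
held_out = [([0.9, 1.1], "a"), ([4.1, 3.9], "b")]

model = induce_model(training)
held_out_accuracy = accuracy(model, held_out)
```

Measuring accuracy on examples that were not used for induction reflects the point that the objective is to make inferences from a sample, not to memorise it.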

MAINTAINING A BALANCE

Predictive tasks

Supervised learning

Descriptive tasks

Unsupervised learning

Know what you’re looking for

Don’t know what you’re looking for

Don’t know you’re even looking
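To make the unsupervised end of this spectrum concrete, here is a minimal clustering sketch in Python. The data carry no labels, so the program is not told what to look for; it only groups similar points. Plain k-means is used as an assumed example technique, and the function name kmeans, the seeding rule and the toy points are invented for illustration.

```python
def kmeans(points, k, iterations=10):
    """Group unlabelled points into k clusters with plain k-means."""
    # Assumption: seed the centroids with the first k points (deterministic, not robust).
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids

# Hypothetical unlabelled points, invented for illustration.
points = [[1.0, 1.0], [5.0, 5.0], [1.2, 0.9], [5.1, 4.8], [0.8, 1.1]]
assignment, centroids = kmeans(points, k=2)
```

Unlike a supervised learner, this procedure has no accuracy to report against known labels, so the value of its output has to be judged some other way.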

A PARTIAL CHARACTERISATION OF LEARNING TASKS

Concept learning

Outlier/anomaly detection

Clustering

Concept formation

Conjecture making

Puzzle generation

Theory formation

MAINTAINING A BALANCE IN PREDICTIVE/DESCRIPTIVE TASKS

Predictive tasks

From accuracy to understanding

Need to show statistical significance

But hypotheses generated often need to be understandable

Difference between the stock market and biology

Descriptive tasks

From pebbles to pearls

Lots of rubbish produced

Cannot rely on statistical significance

Have to worry about notions of

MAINTAINING A BALANCE IN SCIENTIFIC DISCOVERY TASKS

Machine learning researchers

Are generally not also domain scientists

Extremely important to collaborate

To provide interesting projects

Remembering that we are scientists not IT consultants

To gain materials

Data, background knowledge, heuristics,

To assess the value of the output

INDUCTIVE LOGIC PROGRAMMING

Concept/rule learning technique (usually)

Hypotheses represented as Logic Programs

Search for LPs

From general to specific or vice-versa

One method is inverse entailment

Use measures to guide the search

Predictive accuracy and compression (info. theory)

Search performed within a language bias

Produces good accuracy and understanding

Logic programs are easier to decipher than
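The idea of a general-to-specific search guided by a measure and restricted by a language bias can be sketched as follows. This is a deliberately simplified propositional analogue in Python, not inverse entailment and not a real ILP system: rules are plain sets of attribute=value conditions instead of logic programs, the guide measure is accuracy only, and all names and data are hypothetical.

```python
def covers(rule, example):
    """A rule (set of attribute=value conditions) covers an example if all conditions hold."""
    return all(example.get(attr) == value for attr, value in rule)

def score(rule, positives, negatives):
    """Guide measure: accuracy of the rule on the labelled examples."""
    true_pos = sum(covers(rule, e) for e in positives)
    true_neg = sum(not covers(rule, e) for e in negatives)
    return (true_pos + true_neg) / (len(positives) + len(negatives))

def general_to_specific(positives, negatives, language_bias):
    """Greedy top-down search: start from the empty (most general) rule and
    repeatedly add the one allowed condition that most improves the score."""
    rule = frozenset()
    best = score(rule, positives, negatives)
    while True:
        candidates = [rule | {cond} for cond in language_bias if cond not in rule]
        if not candidates:
            return rule, best
        top = max(candidates, key=lambda r: score(r, positives, negatives))
        top_score = score(top, positives, negatives)
        if top_score <= best:
            return rule, best  # no specialisation helps, so stop
        rule, best = top, top_score

# Hypothetical examples and language bias, invented for illustration.
positives = [{"helix_length": "hi", "position": "start"},
             {"helix_length": "hi", "position": "start"}]
negatives = [{"helix_length": "lo", "position": "start"},
             {"helix_length": "hi", "position": "end"}]
language_bias = [("helix_length", "hi"), ("helix_length", "lo"),
                 ("position", "start"), ("position", "end")]

rule, best = general_to_specific(positives, negatives, language_bias)
```

The language bias is what keeps the search tractable: only conditions listed in it can ever appear in a hypothesis.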

EXAMPLE LEARNED LP

fold('Four-helical up-and-down bundle', P) :-
    helix(P, H1),
    length(H1, hi),
    position(P, H1, Pos),
    interval(1 =< Pos =< 3),
    adjacent(P, H1, H2),
    helix(P, H2).

Predicting protein folds from helices
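One way to read the learned clause is as a test applied to the known facts about a protein. The Python sketch below mirrors the clause body under stated assumptions: the dictionary layout, the helix names and every fact in it are invented for illustration, and the length value hi is treated as an opaque label exactly as it appears in the clause.

```python
def four_helical_bundle(p):
    """Mirror the clause body: some helix H1 with length 'hi' and position
    between 1 and 3 is adjacent to another helix H2 of the same protein."""
    for h1 in p["helices"]:
        if p["length"][h1] != "hi":
            continue
        if not 1 <= p["position"][h1] <= 3:
            continue
        if any((h1, h2) in p["adjacent"] for h2 in p["helices"]):
            return True
    return False

# Hypothetical background facts; every value here is invented.
protein_a = {
    "helices": {"h1", "h2", "h3"},
    "length": {"h1": "hi", "h2": "lo", "h3": "hi"},
    "position": {"h1": 2, "h2": 3, "h3": 7},
    "adjacent": {("h1", "h2"), ("h2", "h3")},
}
protein_b = {
    "helices": {"h1", "h2"},
    "length": {"h1": "hi", "h2": "lo"},
    "position": {"h1": 5, "h2": 6},  # the only 'hi' helix lies outside positions 1..3
    "adjacent": {("h1", "h2")},
}

prediction_a = four_helical_bundle(protein_a)
prediction_b = four_helical_bundle(protein_b)
```

Each condition in the function corresponds to one literal of the clause, which is the sense in which a learned logic program can be read and checked by a domain scientist.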
