LearnModel Class Reference

A unified interface for learning models.

#include <learnmodel.h>


Public Member Functions
	LearnModel (UINT n_in=0, UINT n_out=0)
virtual Output	operator() (const Input &) const =0
virtual Output	get_output (UINT idx) const
	Get the output of the hypothesis on the idx-th input.
bool	valid_dimensions (UINT, UINT) const
bool	valid_dimensions (const LearnModel &l) const
bool	exact_dimensions (UINT i, UINT o) const
bool	exact_dimensions (const LearnModel &l) const
bool	exact_dimensions (const DataSet &d) const

virtual LearnModel *	create () const =0
	Create a new object using the default constructor.
virtual LearnModel *	clone () const =0
	Create a new object by replicating itself.
UINT	n_input () const
UINT	n_output () const
void	set_log_file (FILE *f)

virtual bool	support_weighted_data () const
	Whether the learning model/algorithm supports unequally weighted data.
virtual REAL	r_error (const Output &out, const Output &y) const
	Error measure for regression problems.
virtual REAL	c_error (const Output &out, const Output &y) const
	Error measure for classification problems.
REAL	train_r_error () const
	Training error (regression).
REAL	train_c_error () const
	Training error (classification).
REAL	test_r_error (const pDataSet &) const
	Test error (regression).
REAL	test_c_error (const pDataSet &) const
	Test error (classification).
virtual void	initialize ()
virtual void	set_train_data (const pDataSet &, const pDataWgt &=0)
	Set the data set and sample weight to be used in training.
const pDataSet &	train_data () const
	Return pointer to the embedded training data set.
virtual void	train ()=0
	Train with preset data set and sample weight.
virtual void	reset ()

virtual REAL	margin_norm () const
	The normalization term for margins.
virtual REAL	margin_of (const Input &x, const Output &y) const
	Report the (unnormalized) margin of an example (x, y).
virtual REAL	margin (UINT i) const
	Report the (unnormalized) margin of the example i.
REAL	min_margin () const
	The minimal (unnormalized) in-sample margin.
Protected Member Functions
void	set_dimensions (UINT, UINT)
void	set_dimensions (const LearnModel &l)
void	set_dimensions (const DataSet &d)
virtual bool	serialize (std::ostream &, ver_list &) const
virtual bool	unserialize (std::istream &, ver_list &, const id_t &=NIL_ID)
Protected Attributes
UINT	_n_in
	input dimension of the model
UINT	_n_out
	output dimension of the model
pDataSet	ptd
	pointer to the training data set
pDataWgt	ptw
	pointer to the sample weight (for training)
UINT	n_samples
	equal to `ptd->size()`
FILE *	logf
	file to record train/validate error

Detailed Description

A unified interface for learning models.

Two error measures are provided, r_error and c_error. For regression problems, r_error should be defined; for classification problems, c_error should be defined. Both errors can be defined at the same time.

The training data is stored with the learning model (as a pointer). Still to be documented: why the data is stored with the model and the benefit of using a pointer (maybe it should not be a pointer); the impact of doing this, i.e., what changes compared with a normal implementation; and the weight, which could be null if the model doesn't support ... and otherwise should be a probability vector (random_sample) ...

The flowchart of the learning ...

1. lm->reset();
	Create a new instance, load from a file, and/or reset an existing one.
2. lm->set_train_data(sample_data);
	Specify the training data.
3. lm->train();
	Train the model. train() is declared void, so there is no return value to check.
4. y = (*lm)(x);
	Apply the learning model to new data.
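
The call sequence above can be sketched with a toy model. This is only an illustration under stated assumptions: MeanModel, run_workflow, and the typedefs below are hypothetical stand-ins, not the LEMGA classes, and the real DataSet and pDataSet types are likely richer than a vector of pairs behind a shared pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the library types (assumptions, not LEMGA code).
typedef double REAL;
typedef std::vector<REAL> Input;
typedef std::vector<REAL> Output;
typedef std::vector<std::pair<Input, Output> > DataSet;
typedef std::shared_ptr<const DataSet> pDataSet;

// Toy model that follows the documented call sequence. It predicts the
// mean of the training outputs, whatever the input is.
class MeanModel {
    pDataSet ptd;   // training data, held by pointer as in LearnModel
    Output mean;    // learned parameters
public:
    void reset() { mean.clear(); }
    void set_train_data(const pDataSet& pd) { ptd = pd; }
    void train() {
        assert(ptd && !ptd->empty());
        mean.assign((*ptd)[0].second.size(), 0);
        for (std::size_t i = 0; i < ptd->size(); ++i)
            for (std::size_t j = 0; j < mean.size(); ++j)
                mean[j] += (*ptd)[i].second[j] / ptd->size();
    }
    Output operator()(const Input&) const { return mean; }
};

// Runs the four steps in order and returns the prediction for x.
Output run_workflow(const pDataSet& sample_data, const Input& x) {
    MeanModel lm;
    lm.reset();                       // 1. start from a clean model
    lm.set_train_data(sample_data);   // 2. specify the training data
    lm.train();                       // 3. train
    return lm(x);                     // 4. apply the model to new data
}
```

Because the model keeps only a pointer to the data, the same data set can be shared by several models without copying.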

Todo: documentation
Do we really need two errors?

Definition at line 64 of file learnmodel.h.

Constructor & Destructor Documentation

LearnModel ( UINT n_in = 0,

UINT n_out = 0

)

Parameters:

n_in is the dimension of input.

n_out is the dimension of output.

Definition at line 70 of file learnmodel.cpp.

Member Function Documentation

REAL c_error ( const Output & out,

const Output & y

) const [virtual]

Error measure for classification problems.

Parameters:

out is the output from the learned hypothesis.

y is the real output.

Returns:
Classification error between out and y. The error measure is not necessarily symmetric. A commonly used measure is out != y.
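
As an illustration of the out != y measure, here is a minimal sketch of a 0/1 error on vector outputs. The name zero_one_error, the typedefs, and the tolerance value are assumptions made for this example; the library code references a constant called INFINITESIMAL, whose value is not given on this page.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef double REAL;                // stand-in for the library's REAL
typedef std::vector<REAL> Output;   // stand-in for the library's Output

// Hypothetical tolerance for comparing floating-point outputs.
const REAL TOLERANCE = 1e-9;

// 0/1 classification error: 0 if every component matches, 1 otherwise.
REAL zero_one_error(const Output& out, const Output& y) {
    assert(out.size() == y.size());
    for (std::size_t i = 0; i < out.size(); ++i)
        if (std::fabs(out[i] - y[i]) > TOLERANCE) return 1;
    return 0;
}
```

This particular measure happens to be symmetric; a cost-sensitive measure that penalizes some mistakes more than others would not be.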

Reimplemented in MultiClass_ECOC, and Ordinal_BLE.
Definition at line 112 of file learnmodel.cpp.
References INFINITESIMAL, and LearnModel::n_output().
Referenced by CGBoost::linear_weight(), AdaBoost::linear_weight(), lemga::lp_add_hypothesis(), LearnModel::test_c_error(), and LearnModel::train_c_error().

virtual LearnModel* clone ( ) const [pure virtual]

Create a new object by replicating itself.

Returns:
A pointer to the new copy.
The code for a derived class Derived is always
return new Derived(*this);
Though seemingly redundant, it helps to copy an object without knowing the real type of the object.
See also:
C++ FAQ Lite 20.6

Implements Object.
Implemented in AdaBoost, AdaBoost_ECOC, AdaBoost_ERP, Aggregating, Bagging, Boosting, Cascade, CGBoost, CrossVal, vFoldCrossVal, HoldoutCrossVal, FeedForwardNN, LPBoost, MgnBoost, MultiClass_ECOC, NNLayer, Ordinal_BLE, Perceptron, Pulse, Stump, and SVM.
Referenced by CrossVal::add_model(), Aggregating::set_base_model(), and Ordinal_BLE::set_model().

virtual LearnModel* create ( ) const [pure virtual]

Create a new object using the default constructor.
The code for a derived class Derived is always
return new Derived();
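
The clone() and create() pattern is often called the virtual constructor idiom. The sketch below shows it on a hypothetical two-class hierarchy; Model, Derived, param(), and duplicate() are names invented for this example and are not part of LEMGA.

```cpp
#include <cassert>
#include <memory>

// Hypothetical minimal base class; only the two factory methods matter.
class Model {
public:
    virtual ~Model() {}
    virtual Model* create() const = 0;  // new default-constructed object
    virtual Model* clone() const = 0;   // new copy of *this
    virtual int param() const = 0;
};

class Derived : public Model {
    int p;
public:
    explicit Derived(int p_ = 0) : p(p_) {}
    // Covariant return types: callers holding a Derived get a Derived*.
    virtual Derived* create() const { return new Derived(); }
    virtual Derived* clone() const { return new Derived(*this); }
    virtual int param() const { return p; }
};

// Copies a model through a base-class reference without knowing its type.
std::unique_ptr<Model> duplicate(const Model& m) {
    return std::unique_ptr<Model>(m.clone());
}
```

Wrapping the raw pointer in a smart pointer at the call site is a choice made for this sketch; the documented functions themselves return plain pointers.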

Implements Object.
Implemented in AdaBoost, AdaBoost_ECOC, AdaBoost_ERP, Aggregating, Bagging, Boosting, Cascade, CGBoost, CrossVal, vFoldCrossVal, HoldoutCrossVal, FeedForwardNN, LPBoost, MgnBoost, MultiClass_ECOC, NNLayer, Ordinal_BLE, Perceptron, Pulse, Stump, and SVM.

bool exact_dimensions ( const DataSet & d ) const [inline]

Definition at line 179 of file learnmodel.h.
References LearnModel::exact_dimensions(), dataset::size(), dataset::x(), and dataset::y().

bool exact_dimensions ( const LearnModel & l ) const [inline]

Definition at line 177 of file learnmodel.h.
References LearnModel::exact_dimensions(), LearnModel::n_input(), and LearnModel::n_output().

bool exact_dimensions ( UINT i,

UINT o

) const [inline]

Definition at line 175 of file learnmodel.h.
References LearnModel::valid_dimensions().
Referenced by LearnModel::exact_dimensions(), Boosting::get_output(), CGBoost::linear_weight(), AdaBoost::linear_weight(), Boosting::operator()(), Bagging::operator()(), LearnModel::set_dimensions(), and Aggregating::unserialize().

virtual Output get_output ( UINT idx ) const [inline, virtual]

Get the output of the hypothesis on the idx-th input.

Note:
It is possible to cache results to save computational effort.

Reimplemented in Boosting, CrossVal, and MultiClass_ECOC.
Definition at line 139 of file learnmodel.h.
References LearnModel::operator()(), LearnModel::ptd, and LearnModel::ptw.
Referenced by FeedForwardNN::cost(), lemga::op::inner_product(), CGBoost::linear_weight(), AdaBoost::linear_weight(), lemga::lp_add_hypothesis(), LearnModel::train_c_error(), and LearnModel::train_r_error().

virtual void initialize ( ) [inline, virtual]

Reimplemented in FeedForwardNN, NNLayer, Perceptron, and SVM.
Definition at line 110 of file learnmodel.h.

virtual REAL margin ( UINT i ) const [inline, virtual]

Report the (unnormalized) margin of the example i.

Note:
It is possible to cache results to save computational effort.

Reimplemented in Boosting, CrossVal, and MultiClass_ECOC.
Definition at line 164 of file learnmodel.h.
References LearnModel::margin_of(), LearnModel::ptd, and LearnModel::ptw.
Referenced by LearnModel::min_margin().

virtual REAL margin_norm ( ) const [inline, virtual]

The normalization term for margins.
The margin concept can be normalized or unnormalized. For example, for a perceptron model, the unnormalized margin would be the weighted sum of the input features, the normalized margin would be the distance to the hyperplane, and the normalization term is the norm of the hyperplane weight.
Since the normalization term is usually a constant, it is more efficient to precompute it than to calculate it every time a margin is requested. The best way is to use a cache. Here an easier way is used: let the users decide when to compute the normalization term.
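
The perceptron example can be made concrete with a small sketch. It assumes a linear model without a bias term and labels of +1 or -1, so that the unnormalized margin is y times the weighted sum; LinearModel and update_norm() are hypothetical names, not the LEMGA Perceptron class.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef double REAL;
typedef std::vector<REAL> Vec;

// Hypothetical linear model without a bias term.
struct LinearModel {
    Vec w;
    REAL norm;   // precomputed; the caller decides when to refresh it

    explicit LinearModel(const Vec& w_) : w(w_), norm(0) { update_norm(); }

    // Recomputes the Euclidean norm of the weight vector.
    void update_norm() {
        REAL s = 0;
        for (std::size_t i = 0; i < w.size(); ++i) s += w[i] * w[i];
        norm = std::sqrt(s);
    }

    // Normalization term for margins.
    REAL margin_norm() const { return norm; }

    // Unnormalized margin of (x, y), assuming y is +1 or -1.
    REAL margin_of(const Vec& x, REAL y) const {
        assert(x.size() == w.size());
        REAL s = 0;
        for (std::size_t i = 0; i < w.size(); ++i) s += w[i] * x[i];
        return y * s;
    }
};
```

Dividing margin_of(x, y) by margin_norm() gives the signed distance of x to the hyperplane. If w is modified directly, update_norm() must be called again before the norm is used.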
Reimplemented in Bagging, Boosting, CrossVal, Perceptron, and SVM.
Definition at line 158 of file learnmodel.h.

REAL margin_of ( const Input & x,

const Output & y

) const [virtual]

Report the (unnormalized) margin of an example (x, y).

Reimplemented in Bagging, Boosting, CrossVal, MultiClass_ECOC, Perceptron, and SVM.
Definition at line 199 of file learnmodel.cpp.
References OBJ_FUNC_UNDEFINED.
Referenced by LearnModel::margin().

REAL min_margin ( ) const

The minimal (unnormalized) in-sample margin.

Definition at line 203 of file learnmodel.cpp.
References INFINITESIMAL, INFINITY, LearnModel::margin(), LearnModel::n_samples, and LearnModel::ptw.

UINT n_input ( ) const [inline]

Definition at line 81 of file learnmodel.h.
References LearnModel::_n_in.
Referenced by FeedForwardNN::add_top(), NNLayer::back_propagate(), LearnModel::exact_dimensions(), NNLayer::feed_forward(), SVM::operator()(), Stump::operator()(), Pulse::operator()(), Perceptron::operator()(), FeedForwardNN::operator()(), LearnModel::set_dimensions(), Pulse::set_index(), SVM::signed_margin(), and LearnModel::valid_dimensions().

UINT n_output ( ) const [inline]

Definition at line 82 of file learnmodel.h.
References LearnModel::_n_out.
Referenced by FeedForwardNN::_cost_deriv(), FeedForwardNN::add_top(), NNLayer::back_propagate(), Cascade::belief(), Ordinal_BLE::c_error(), MultiClass_ECOC::c_error(), LearnModel::c_error(), MultiClass_ECOC::distances(), LearnModel::exact_dimensions(), NNLayer::feed_forward(), NNLayer::operator()(), Ordinal_BLE::r_error(), LearnModel::r_error(), LearnModel::set_dimensions(), NNLayer::size(), and LearnModel::valid_dimensions().

virtual Output operator() ( const Input & ) const [pure virtual]

Implemented in Bagging, Boosting, Cascade, CrossVal, FeedForwardNN, MultiClass_ECOC, NNLayer, Ordinal_BLE, Perceptron, Pulse, Stump, and SVM.
Referenced by LearnModel::get_output().

REAL r_error ( const Output & out,

const Output & y

) const [virtual]

Error measure for regression problems.

Parameters:

out is the output from the learned hypothesis.

y is the real output.

Returns:
Regression error between out and y. A commonly used measure is the squared error.
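
A minimal sketch of the squared error mentioned above, assuming vector-valued outputs. The function name and typedefs are hypothetical, and the library's own r_error may scale the sum differently (for instance by a constant factor).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef double REAL;                // stand-in for the library's REAL
typedef std::vector<REAL> Output;   // stand-in for the library's Output

// Sum of squared component-wise differences between out and y.
REAL squared_error(const Output& out, const Output& y) {
    assert(out.size() == y.size());
    REAL err = 0;
    for (std::size_t i = 0; i < out.size(); ++i) {
        const REAL d = out[i] - y[i];
        err += d * d;
    }
    return err;
}
```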

Reimplemented in Ordinal_BLE.
Definition at line 94 of file learnmodel.cpp.
References LearnModel::_n_out, and LearnModel::n_output().
Referenced by FeedForwardNN::_cost(), LearnModel::test_r_error(), and LearnModel::train_r_error().

void reset ( ) [virtual]

Cleans up the learning model but keeps most settings.
Note:
This is probably needed after training or loading from a file, and before training again.

Reimplemented in Aggregating, Boosting, CGBoost, CrossVal, MultiClass_ECOC, and Ordinal_BLE.
Definition at line 195 of file learnmodel.cpp.
References LearnModel::_n_in, and LearnModel::_n_out.
Referenced by Ordinal_BLE::reset(), CrossVal::reset(), and Aggregating::reset().

bool serialize ( std::ostream & ,

ver_list &

) const [protected, virtual]

Reimplemented in Aggregating, Boosting, Cascade, CGBoost, CrossVal, vFoldCrossVal, HoldoutCrossVal, FeedForwardNN, MultiClass_ECOC, NNLayer, Ordinal_BLE, Perceptron, Pulse, Stump, and SVM.
Definition at line 74 of file learnmodel.cpp.
References LearnModel::_n_in, LearnModel::_n_out, and SERIALIZE_PARENT.

void set_dimensions ( const DataSet & d ) [inline, protected]

Definition at line 187 of file learnmodel.h.
References LearnModel::exact_dimensions(), LearnModel::set_dimensions(), dataset::x(), and dataset::y().

void set_dimensions ( const LearnModel & l ) [inline, protected]

Definition at line 185 of file learnmodel.h.
References LearnModel::n_input(), LearnModel::n_output(), and LearnModel::set_dimensions().

void set_dimensions ( UINT ,

UINT

) [protected]

Definition at line 219 of file learnmodel.cpp.
References LearnModel::_n_in, LearnModel::_n_out, and LearnModel::valid_dimensions().
Referenced by CrossVal::add_model(), Perceptron::initialize(), MultiClass_ECOC::MultiClass_ECOC(), Ordinal_BLE::Ordinal_BLE(), LearnModel::set_dimensions(), Perceptron::set_weight(), _boost_gd::set_weight(), SVM::train(), Stump::train(), Pulse::train(), Perceptron::train(), Ordinal_BLE::train(), MultiClass_ECOC::train(), LPBoost::train(), CrossVal::train(), Boosting::train(), and Bagging::train().

void set_log_file ( FILE * f ) [inline]

Definition at line 84 of file learnmodel.h.
References LearnModel::logf.

void set_train_data ( const pDataSet & pd,

const pDataWgt & pw = 0

) [virtual]

Set the data set and sample weight to be used in training.
If the learning model/algorithm can only be trained with uniform sample weight, i.e., support_weighted_data() returns false, a "bootstrapped" copy of the original data set will be generated and used in the subsequent training. The bootstrapping is done by randomly picking samples (with replacement) w.r.t. the given weight pw.
To make life easier, when support_weighted_data() returns true, a null pw will be replaced by a uniformly distributed probability vector. So we have the following invariant
Invariant:
support_weighted_data() == (ptw != 0)

Parameters:

pd gives the data set.

pw gives the sample weight, whose default value is 0.

See also:
support_weighted_data(), train()
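
The two cases described above can be sketched as follows. This is an illustration only: uniform_weight and bootstrap_indices are hypothetical helpers, and the use of std::discrete_distribution is a choice made here, not necessarily how the library draws its samples.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

typedef double REAL;
typedef std::vector<REAL> DataWgt;   // stand-in for the sample weight type

// Uniform probability vector, used when weights are supported but none
// were supplied.
DataWgt uniform_weight(std::size_t n) {
    return DataWgt(n, n ? REAL(1) / n : REAL(0));
}

// Bootstrapping for models that cannot use weights: draws n indices with
// replacement, picking index i with probability proportional to w[i].
std::vector<std::size_t> bootstrap_indices(const DataWgt& w, std::size_t n,
                                           std::mt19937& rng) {
    assert(!w.empty());
    std::discrete_distribution<std::size_t> pick(w.begin(), w.end());
    std::vector<std::size_t> idx(n);
    for (std::size_t i = 0; i < n; ++i) idx[i] = pick(rng);
    return idx;
}
```

The bootstrapped data set would then consist of the samples at the returned indices, each carrying equal weight.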

Reimplemented in Aggregating, Boosting, CrossVal, MultiClass_ECOC, and Ordinal_BLE.
Definition at line 165 of file learnmodel.cpp.
References EPSILON, LearnModel::n_samples, LearnModel::ptd, LearnModel::ptw, and LearnModel::support_weighted_data().
Referenced by Ordinal_BLE::set_train_data(), CrossVal::set_train_data(), Aggregating::set_train_data(), Bagging::train(), and AdaBoost_ECOC::train_with_full_partition().

virtual bool support_weighted_data ( ) const [inline, virtual]

Whether the learning model/algorithm supports unequally weighted data.

Returns:
true if unequally weighted data is supported; false otherwise. The default is false, to be safe.

See also:
set_train_data()

Reimplemented in Bagging, Boosting, Cascade, FeedForwardNN, MultiClass_ECOC, Ordinal_BLE, Perceptron, Pulse, Stump, and SVM.
Definition at line 94 of file learnmodel.h.
Referenced by LearnModel::set_train_data().

REAL test_c_error ( const pDataSet & ) const

Test error (classification).

Definition at line 142 of file learnmodel.cpp.
References LearnModel::c_error().

REAL test_r_error ( const pDataSet & ) const

Test error (regression).

Definition at line 134 of file learnmodel.cpp.
References LearnModel::r_error().

virtual void train ( ) [pure virtual]

Train with preset data set and sample weight.

Implemented in AdaBoost, Bagging, Boosting, Cascade, CGBoost, CrossVal, FeedForwardNN, LPBoost, MgnBoost, MultiClass_ECOC, NNLayer, Ordinal_BLE, Perceptron, Pulse, Stump, and SVM.
Referenced by Bagging::train(), and AdaBoost_ECOC::train_with_full_partition().

REAL train_c_error ( ) const

Training error (classification).

Definition at line 126 of file learnmodel.cpp.
References LearnModel::c_error(), LearnModel::get_output(), LearnModel::n_samples, LearnModel::ptd, and LearnModel::ptw.
Referenced by Perceptron::log_error(), and Boosting::train().

const pDataSet& train_data ( ) const [inline]

Return pointer to the embedded training data set.

Definition at line 118 of file learnmodel.h.
References LearnModel::ptd.
Referenced by Boosting::get_output(), lemga::op::inner_product(), CGBoost::linear_weight(), AdaBoost::linear_weight(), and lemga::lp_add_hypothesis().

REAL train_r_error ( ) const

Training error (regression).

Definition at line 118 of file learnmodel.cpp.
References LearnModel::get_output(), LearnModel::n_samples, LearnModel::ptd, LearnModel::ptw, and LearnModel::r_error().

bool unserialize ( std::istream & ,

ver_list & ,

const id_t & = NIL_ID

) [protected, virtual]

Reimplemented in Aggregating, Boosting, Cascade, CGBoost, CrossVal, vFoldCrossVal, HoldoutCrossVal, FeedForwardNN, MultiClass_ECOC, NNLayer, Ordinal_BLE, Perceptron, Pulse, Stump, and SVM.
Definition at line 80 of file learnmodel.cpp.
References LearnModel::_n_in, LearnModel::_n_out, LearnModel::n_samples, Object::NIL_ID, LearnModel::ptd, LearnModel::ptw, and UNSERIALIZE_PARENT.

bool valid_dimensions ( const LearnModel & l ) const [inline]

Definition at line 172 of file learnmodel.h.
References LearnModel::n_input(), LearnModel::n_output(), and LearnModel::valid_dimensions().

bool valid_dimensions ( UINT ,

UINT

) const

Definition at line 214 of file learnmodel.cpp.
References LearnModel::_n_in, and LearnModel::_n_out.
Referenced by LearnModel::exact_dimensions(), Ordinal_BLE::operator()(), Aggregating::reset(), Aggregating::set_base_model(), LearnModel::set_dimensions(), CrossVal::unserialize(), Aggregating::unserialize(), and LearnModel::valid_dimensions().

Member Data Documentation

UINT _n_in [protected]

input dimension of the model

Definition at line 66 of file learnmodel.h.
Referenced by FeedForwardNN::add_top(), NNLayer::back_propagate(), NNLayer::feed_forward(), Perceptron::fld(), Perceptron::initialize(), LearnModel::n_input(), Perceptron::Perceptron(), LearnModel::reset(), SVM::serialize(), Perceptron::serialize(), NNLayer::serialize(), LearnModel::serialize(), LearnModel::set_dimensions(), NNLayer::set_weight(), SVM::train(), Stump::train(), Pulse::train(), SVM::unserialize(), Stump::unserialize(), Perceptron::unserialize(), NNLayer::unserialize(), LearnModel::unserialize(), FeedForwardNN::unserialize(), Aggregating::unserialize(), and LearnModel::valid_dimensions().

UINT _n_out [protected]

output dimension of the model

Definition at line 67 of file learnmodel.h.
Referenced by FeedForwardNN::_cost_deriv(), FeedForwardNN::add_top(), NNLayer::back_propagate(), NNLayer::feed_forward(), Boosting::get_output(), FeedForwardNN::gradient(), LearnModel::n_output(), Boosting::operator()(), Bagging::operator()(), LearnModel::r_error(), LearnModel::reset(), NNLayer::serialize(), LearnModel::serialize(), LearnModel::set_dimensions(), NNLayer::set_weight(), MultiClass_ECOC::train(), Stump::unserialize(), NNLayer::unserialize(), LearnModel::unserialize(), FeedForwardNN::unserialize(), Aggregating::unserialize(), and LearnModel::valid_dimensions().

FILE* logf [protected]

file to record train/validate error

Definition at line 72 of file learnmodel.h.
Referenced by FeedForwardNN::log_cost(), Perceptron::log_error(), and LearnModel::set_log_file().

UINT n_samples [protected]

equal to ptd->size()

Definition at line 70 of file learnmodel.h.
Referenced by Boosting::assign_weight(), Boosting::clear_cache(), AdaBoost_ECOC::confusion_matrix(), Boosting::cost(), Ordinal_BLE::extend_data(), Perceptron::fld(), CGBoost::linear_smpwgt(), AdaBoost::linear_smpwgt(), CGBoost::linear_weight(), AdaBoost::linear_weight(), LearnModel::min_margin(), Boosting::sample_weight(), Ordinal_BLE::set_train_data(), MultiClass_ECOC::set_train_data(), LearnModel::set_train_data(), AdaBoost_ECOC::setup_aux(), AdaBoost_ECOC::smpwgt_with_partition(), Stump::train(), Pulse::train(), LPBoost::train(), CGBoost::train(), Bagging::train(), AdaBoost::train(), LearnModel::train_c_error(), LearnModel::train_r_error(), AdaBoost_ERP::train_with_partial_partition(), AdaBoost_ERP::train_with_partition(), LearnModel::unserialize(), and Boosting::update_smpwgt().

pDataSet ptd [protected]

pointer to the training data set

Definition at line 68 of file learnmodel.h.
Referenced by MultiClass_ECOC::cost(), FeedForwardNN::cost(), Boosting::cost(), HoldoutCrossVal::cv_round(), vFoldCrossVal::cv_round(), Ordinal_BLE::extend_data(), Perceptron::fld(), LearnModel::get_output(), CrossVal::get_output(), Boosting::get_output(), FeedForwardNN::gradient(), Perceptron::initialize(), CGBoost::linear_smpwgt(), CGBoost::linear_weight(), AdaBoost::linear_weight(), LearnModel::margin(), CrossVal::margin(), Boosting::margin(), Perceptron::matrix(), Boosting::sample_weight(), Perceptron::set_data(), Ordinal_BLE::set_train_data(), MultiClass_ECOC::set_train_data(), LearnModel::set_train_data(), Boosting::set_train_data(), Aggregating::set_train_data(), Stump::train(), Pulse::train(), Perceptron::train(), Ordinal_BLE::train(), MultiClass_ECOC::train(), LPBoost::train(), FeedForwardNN::train(), CrossVal::train(), Boosting::train(), Bagging::train(), LearnModel::train_c_error(), LearnModel::train_data(), LearnModel::train_r_error(), AdaBoost_ECOC::train_with_full_partition(), AdaBoost_ERP::train_with_partial_partition(), MultiClass_ECOC::train_with_partition(), AdaBoost_ERP::train_with_partition(), AdaBoost_ECOC::train_with_partition(), Boosting::train_with_smpwgt(), Ordinal_BLE::unserialize(), and LearnModel::unserialize().

pDataWgt ptw [protected]

pointer to the sample weight (for training)

Definition at line 69 of file learnmodel.h.
Referenced by MultiClass_ECOC::cost(), FeedForwardNN::cost(), Boosting::cost(), Perceptron::fld(), MultiClass_ECOC::get_output(), LearnModel::get_output(), Boosting::get_output(), FeedForwardNN::gradient(), CGBoost::linear_smpwgt(), MultiClass_ECOC::margin(), LearnModel::margin(), LearnModel::min_margin(), Boosting::sample_weight(), LearnModel::set_train_data(), Aggregating::set_train_data(), AdaBoost_ECOC::setup_aux(), Stump::train(), Pulse::train(), Perceptron::train(), Ordinal_BLE::train(), MultiClass_ECOC::train(), LPBoost::train(), FeedForwardNN::train(), CrossVal::train(), CGBoost::train(), Boosting::train(), Bagging::train(), LearnModel::train_c_error(), LearnModel::train_r_error(), MultiClass_ECOC::train_with_partition(), AdaBoost_ERP::train_with_partition(), AdaBoost_ECOC::train_with_partition(), Boosting::train_with_smpwgt(), Ordinal_BLE::unserialize(), and LearnModel::unserialize().

The documentation for this class was generated from the following files:

learnmodel.h
learnmodel.cpp

Generated on Wed Nov 8 08:16:59 2006 for LEMGA by Doxygen 1.4.6
