Table data as input to estimation functions in R

As mentioned earlier in related articles, functions written in other languages like R, matlab or ILNumerics can be integrated with table functions using the high-level constructs like function router1 and subtable transformer2.

Following example scenarios demonstrate how different estimation functions written with the popular free and open-source language R can be applied on selected parts (i.e. subtables) of an input table containing some historical data for price estimations.

You can find all the relevant code in R and C# at the download page, including the C# code for generating test data.

Price estimators in R

For the example scenarios below we have three estimation functions in R, all based on Linear Regression (LR) of the global indicators market index and oil price for delivering price predictions.

Price predictor based on Linear Regression (LR)

  1. PriceEstimatorA: Estimated product price based on estimated market index and oil price
  2. PriceEstimatorB: Estimated product price based on estimated market index only
  3. PriceEstimatorC: Estimated product price based on estimated oil price only
R code for estimation functions A, B, C
# help function for displaying variables
display_variable = function(v, vname) { 
print(paste(vname,' ='),quote = FALSE)
print(v,quote = FALSE)
}
 
# Price Estimation Function A
# Estimation with 1st order Linear Regression
# based on historical market index and oil price
#
# Columns of input matrix HistoricalData:
# market index, oil price, product price
#
# MarketIndex: Estimated market index for the next year
# OilPrice: Estimated oil price for the next year
EstimatedProductPriceA = function(HistoricalData, MarketIndex, OilPrice) { 
# historical market index and oil price
RowCount = nrow(HistoricalData)
X_train = rbind(matrix(1,1,RowCount), t(HistoricalData[,1:2]))
 
# hostorical product prices
Y_train = matrix(HistoricalData[,3], RowCount, 1);
 
# optimal coefficient vector to minimize MSE
Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train
 
# get estimated product price
X_test = matrix(c(1.0, MarketIndex, OilPrice), nrow=1)
Y_test = X_test %*% Bopt
return (Y_test)
}
 
# Price Estimator B
# Estimation with 1st order Linear Regression
# based on historical market index only
#
# Columns of input matrix HistoricalData:
# market index, product price
#
# MarketIndex: Estimated market index for the next year
EstimatedProductPriceB = function(HistoricalData, MarketIndex) { 
# historical market index and oil price
RowCount = nrow(HistoricalData)
X_train = rbind(matrix(1,1,RowCount), t(HistoricalData[,1]))
 
# hostorical product prices
Y_train = matrix(HistoricalData[,2], RowCount, 1);
 
# optimal coefficient vector to minimize MSE
Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train
 
# get estimated product price
X_test = matrix(c(1.0, MarketIndex), nrow=1)
Y_test = X_test %*% Bopt
return (Y_test)
}
 
# Price Estimator C
# Estimation with 1st order Linear Regression
# based on historical oil price only
#
# Columns of matrix HistoricalData:
# oil price, product price
#
# OilPrice: Estimated oil price for the next year
EstimatedProductPriceC = function(HistoricalData, OilPrice) { 
# historical market index and oil price
RowCount = nrow(HistoricalData)
X_train = rbind(matrix(1,1,RowCount), t(HistoricalData[,1]))
 
# hostorical product prices
Y_train = matrix(HistoricalData[,2], RowCount, 1);
 
# optimal coefficient vector to minimize MSE
Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train
 
# get estimated product price
X_test = matrix(c(1.0, OilPrice), nrow=1)
Y_test = X_test %*% Bopt
return (Y_test)
}
 
HistoricalData = matrix(c(1.48, 94.38, 2.89, 0.91, 91.68, 1.83, 1.27, 85.61, 2.74), nrow=3, byrow=TRUE)
print('HistoricalData')
print(HistoricalData)
 
MarketIndex = 1.2
OilPrice = 100
 
# method A
EstPrdPrice = EstimatedProductPriceA(HistoricalData, MarketIndex, OilPrice);
display_variable(EstPrdPrice,'Estimated product price with estimator A')
 
# method B
EstPrdPrice = EstimatedProductPriceB(HistoricalData[,c(1,3)], MarketIndex);
display_variable(EstPrdPrice,'Estimated product price with estimator B')
 
# method C
EstPrdPrice = EstimatedProductPriceC(HistoricalData[,c(2,3)], OilPrice);
display_variable(EstPrdPrice,'Estimated product price with estimator C')

 

Note that the details of the estimator functions (and whether they make sense in reality) are unimportant for the scenarios below. We will focus on the big picture and integration: How can we apply selected price predictors in R on selected subtables of an input table?

You may see the related articles listed below for more information about estimations based on Linear Regression (LR):

Historical data

Following tables with historical data are used as input for price estimations:

  1. PriceTable: Historical prices of primary products per country and product
  2. IndicatorsTable: Historical values of global indicators market index and oil price

Historical data required for price estimations

Initializing R functions in C#

// initialize R connection
var envPath = Environment.GetEnvironmentVariable("PATH");
var rBinPath = @"C:\Program Files\R\R-2.14.1\bin\x64";  // check this path on your computer
Environment.SetEnvironmentVariable("PATH", envPath + System.IO.Path.PathSeparator + rBinPath);
 
REngine r_engine = REngine.CreateInstance("RDotNet");
r_engine.Initialize();
 
// initialize price estimation functions in R
 
// price estimator A
string funcstr = @"EstProdPriceFuncA <- function(HistData, EstMarketInd, EstOilPrice) { 
		RowCount = nrow(HistData);
		X_train = rbind(matrix(1,1,RowCount), t(HistData[,1:2]));
		Y_train = matrix(HistData[,3], RowCount, 1);
		Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train;
		X_test = matrix(c(1.0, EstMarketInd, EstOilPrice), nrow=1);
		Y_test = X_test %*% Bopt;
		return (Y_test);
		}";
Function EstProdPriceFuncA = r_engine.Evaluate(funcstr).AsFunction();
 
// price estimator B
funcstr = @"EstProdPriceFuncB <- function(HistData, EstMarketInd) { 
	RowCount = nrow(HistData);
	X_train = rbind(matrix(1,1,RowCount), t(HistData[,1]));
	Y_train = matrix(HistData[,2], RowCount, 1);
	Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train;
	X_test = matrix(c(1.0, EstMarketInd), nrow=1);
	Y_test = X_test %*% Bopt;
	return (Y_test);
	}";
Function EstProdPriceFuncB = r_engine.Evaluate(funcstr).AsFunction();
 
// price estimator C
funcstr = @"EstProdPriceFuncC <- function(HistData, EstOilPrice) { 
	RowCount = nrow(HistData);
	X_train = rbind(matrix(1,1,RowCount), t(HistData[,1]));
	Y_train = matrix(HistData[,2], RowCount, 1);
	Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train;
	X_test = matrix(c(1.0, EstOilPrice), nrow=1);
	Y_test = X_test %*% Bopt;
	return (Y_test);
	}";
Function EstProdPriceFuncC = r_engine.Evaluate(funcstr).AsFunction();

Integration of R with table functions in C#

See related article for the installation of R.NET that we use here for the integration of C# and R.

Function routers and subtable transformers expect delegate table functions as input parameters in the following format:

// Generic delegate table function
// InputTbl: Input table
// OtherParameters: Other parameters of any type
public delegate MatrixTable TransformTableFunc_OP(MatrixTable InputTbl, 
	params Object[] OtherParameters);

The initialized REngine and estimation functions are passed to the enveloping table functions (with the signature above) as additional parameters. The R functions are then called within the enveloping table functions. You may find the code examples below for these table functions:

C# code for table function PriceEstimatorA()

using FinaquantProtos;
using RDotNet;
 
// Table function for price estimation, estimator A (LR based on global market index and oil price)
// Input parameters:
// InputTbl: Historical product prices with fields year and price
// OtherParameters[0]: Table of historical indicators with fields year, market_index and oil_price
// OtherParameters[1]: REngine
// OtherParameters[2]: R Function
// OtherParameters[3]: Estimated market index for next year (double)
// OtherParameters[4]: Estimated oil price for next year (double)
public static MatrixTable PriceEstimatorA(MatrixTable InputTbl, params Object[] OtherParameters)
{
	// get all input parameters
	MatrixTable HistoricalProductPrices = InputTbl;
	MatrixTable HistoricalIndicators = (MatrixTable)OtherParameters[0];
	REngine engine = (REngine)OtherParameters[1];
	Function EstProdPriceFuncA = (Function)OtherParameters[2];
	double[] EstMarketInd = {(double) OtherParameters[3]};
	double[] EstOilPrice = {(double) OtherParameters[4]};
 
	// combine tables
	MatrixTable HistoricalDataTbl = MatrixTable.CombineTables(HistoricalIndicators, HistoricalProductPrices);
 
	// get matrix of key figures from table
	KeyMatrix HistData = HistoricalDataTbl.KeyFigValues;
 
	// convert data types of input parameters from C# to R
	NumericMatrix HistDataR = engine.CreateNumericMatrix(HistData.toArray);
	engine.SetSymbol("HistDataR", HistDataR);
 
	NumericVector EstMarketIndR = engine.CreateNumericVector(EstMarketInd);
	engine.SetSymbol("EstMarketIndR", EstMarketIndR);
 
	NumericVector EstOilPriceR = engine.CreateNumericVector(EstOilPrice);
	engine.SetSymbol("EstOilPriceR", EstOilPriceR);
 
	// call function in R from c#
	NumericMatrix EstProductPrice = engine.Evaluate(
	@"EstProductPrice <- EstProdPriceFuncA(HistDataR,EstMarketIndR,EstOilPriceR)").AsNumericMatrix();
 
	// return a single-element table with estimated product price
	double EstProdPrice = EstProductPrice[0, 0];
 
	// return a single-element table with estimated product price
	TableFields tf = new TableFields(InputTbl.metaData);
	tf.AddNewField("product_price");
	MatrixTable ResultTbl = MatrixTable.CreateTableWithElements_A(tf, EstProdPrice);
	return ResultTbl;
}

C# code for table function PriceEstimatorB()

// Table function for price estimation, estimator B (LR based on global market index only)
// Input parameters:
// InputTbl: Historical product prices with fields year and price
// OtherParameters[0]: Table of historical indicators with fields year, market_index and oil_price
// OtherParameters[1]: REngine
// OtherParameters[2]: R Function
// OtherParameters[3]: Estimated market index for next year (double)
// OtherParameters[4]: Estimated oil price for next year (double)
public static MatrixTable PriceEstimatorB(MatrixTable InputTbl, params Object[] OtherParameters)
{
	// get all input parameters
	MatrixTable HistoricalProductPrices = InputTbl;
	MatrixTable HistoricalIndicators = (MatrixTable)OtherParameters[0];
	REngine engine = (REngine)OtherParameters[1];
	Function EstProdPriceFuncB = (Function)OtherParameters[2];
	double[] EstMarketInd = { (double)OtherParameters[3] };
	double[] EstOilPrice = { (double)OtherParameters[4] };
 
	// combine tables
	MatrixTable HistoricalDataTbl = MatrixTable.CombineTables(HistoricalIndicators, HistoricalProductPrices);
 
	// exclude column oil_price from table
	HistoricalDataTbl = MatrixTable.ExcludeColumns(HistoricalDataTbl, 
		TextVector.CreateVectorWithElements("oil_price"));
 
	// get matrix of key figures from table
	KeyMatrix HistData = HistoricalDataTbl.KeyFigValues;
 
	// convert data types of input parameters from C# to R
	NumericMatrix HistDataR = engine.CreateNumericMatrix(HistData.toArray);
	engine.SetSymbol("HistDataR", HistDataR);
 
	NumericVector EstMarketIndR = engine.CreateNumericVector(EstMarketInd);
	engine.SetSymbol("EstMarketIndR", EstMarketIndR);
 
	NumericVector EstOilPriceR = engine.CreateNumericVector(EstOilPrice);
	engine.SetSymbol("EstOilPriceR", EstOilPriceR);
 
	// call function in R from c#
	NumericMatrix EstProductPrice = engine.Evaluate(
	@"EstProductPrice <- EstProdPriceFuncB(HistDataR,EstMarketIndR)").AsNumericMatrix();
 
	// return a single-element table with estimated product price
	double EstProdPrice = EstProductPrice[0, 0];
 
	// return a single-element table with estimated product price
	TableFields tf = new TableFields(InputTbl.metaData);
	tf.AddNewField("product_price");
	MatrixTable ResultTbl = MatrixTable.CreateTableWithElements_A(tf, EstProdPrice);
	return ResultTbl;
}

C# code for table function PriceEstimatorC()
// Table function for price estimation, estimator C (LR based on global oil price only)
// Input parameters:
// InputTbl: Historical product prices with fields year and price
// OtherParameters[0]: Table of historical indicators with fields year, market_index and oil_price
// OtherParameters[1]: REngine
// OtherParameters[2]: R Function
// OtherParameters[3]: Estimated market index for next year (double)
// OtherParameters[4]: Estimated oil price for next year (double)
public static MatrixTable PriceEstimatorC(MatrixTable InputTbl, params Object[] OtherParameters)
{
	// get all input parameters
	MatrixTable HistoricalProductPrices = InputTbl;
	MatrixTable HistoricalIndicators = (MatrixTable)OtherParameters[0];
	REngine engine = (REngine)OtherParameters[1];
	Function EstProdPriceFuncC = (Function)OtherParameters[2];
	double[] EstMarketInd = { (double)OtherParameters[3] };
	double[] EstOilPrice = { (double)OtherParameters[4] };
 
	// combine tables
	MatrixTable HistoricalDataTbl = MatrixTable.CombineTables(HistoricalIndicators, HistoricalProductPrices);
 
	// exclude column market_index from table
	HistoricalDataTbl = MatrixTable.ExcludeColumns(HistoricalDataTbl,
		TextVector.CreateVectorWithElements("market_index"));
 
	// get matrix of key figures from table
	KeyMatrix HistData = HistoricalDataTbl.KeyFigValues;
 
	// convert data types of input parameters from C# to R
	NumericMatrix HistDataR = engine.CreateNumericMatrix(HistData.toArray);
	engine.SetSymbol("HistDataR", HistDataR);
 
	NumericVector EstMarketIndR = engine.CreateNumericVector(EstMarketInd);
	engine.SetSymbol("EstMarketIndR", EstMarketIndR);
 
	NumericVector EstOilPriceR = engine.CreateNumericVector(EstOilPrice);
	engine.SetSymbol("EstOilPriceR", EstOilPriceR);
 
	// call function in R from c#
	NumericMatrix EstProductPrice = engine.Evaluate(
	@"EstProductPrice <- EstProdPriceFuncC(HistDataR,EstOilPriceR)").AsNumericMatrix();
 
	// return a single-element table with estimated product price
	double EstProdPrice = EstProductPrice[0, 0];
 
	// return a single-element table with estimated product price
	TableFields tf = new TableFields(InputTbl.metaData);
	tf.AddNewField("product_price");
	MatrixTable ResultTbl = MatrixTable.CreateTableWithElements_A(tf, EstProdPrice);
	return ResultTbl;
}

 

Scenario 1: Applying PriceEstimatorA on a single subtable for country-product pair “Utopia” and “bread”

This example is designed rather for testing if the table function PriceEstimatorA() which envelopes the corresponding R function works as expected.

Subtable for Utopia and bread

// apply table function on a single subtable for country-product pair "Utopia", "bread"
 
// filter TemperatureTable with condition table to obtain a subtable
var CondTblFields = TableFields.CreateEmptyTableFields(md);
TableFields.AddNewField(CondTblFields, "country");
TableFields.AddNewField(CondTblFields, "product");
 
var CondTbl = MatrixTable.CreateTableWithElements_A(CondTblFields,
	"Utopia", "bread"
	);
var SubTable = MatrixTable.FilterTableA(PriceTable, CondTbl);
 
// exclude fields country and product from subtable
SubTable = MatrixTable.ExcludeColumns(SubTable,
	TextVector.CreateVectorWithElements("country", "product"));
 
// view subtable
// MatrixTable.View_MatrixTable(SubTable, "Subtable of PriceTable (Utopia, bread)");
 
// execute user-defined table function to calculate expected energy consumption (Sedrun, 2012)
MatrixTable EstimatedPriceTbl = PriceEstimatorA(SubTable, GlobalIndTable, 
	r_engine, EstProdPriceFuncA, 1.2, 90.0);
 
// view result table
MatrixTable.View_MatrixTable(EstimatedPriceTbl, "Estimated price for subtable (Utopia, bread)");

Result table:
Result table for subtable Utopia and bread

Yes, the table function PriceEstimatorA() works as expected.

Scenario 2: Applying PriceEstimatorA on every subtable of price table

In this example, the same prediction function is applied for all countries and products.

Applying the same table (or matrix) function on all subtables of an input table… This is what a subtable transformer is created for.

Applying same estimation function on all subtables

// Apply the same table function on all subtables of Price Table
// subtable transformer
 
// define subtable fields
var SubTblFields = TableFields.CreateEmptyTableFields(md);
TableFields.AddNewField(SubTblFields, "year");
TableFields.AddNewField(SubTblFields, "product_price");
 
double EstMarketIndex = 1.2;
double EstOilPrice = 100.0;
 
// execute subtable transformer
MatrixTable ResultTbl = MatrixTable.TransformSubTables(PriceTable, SubTblFields,
	PriceEstimatorA, GlobalIndTable, r_engine, EstProdPriceFuncA, EstMarketIndex, EstOilPrice);
 
// view result table
MatrixTable.View_MatrixTable(MatrixTable.Round(ResultTbl,2), 
	"Estimator-A applied for all countries and products");

Result table:
Result table for 2. scenario

Scenario 3: Applying different price estimators to different countries and products

This is where the fabulous function router comes into play.

What we want to do is summarized in the following table:
Function table, 3. scenario

In the result table we want to see explicitly which estimation function is used to generate each estimation.

// Function Router and R
// extend MetaData
md.AddNewField("estimation_method", FieldType.TextAttribute);
md.AddNewField("country_cell", FieldType.TextAttribute);
md.AddNewField("product_cell", FieldType.TextAttribute);
 
// create condition matrix table
CondTblFields = TableFields.CreateEmptyTableFields(md);
TableFields.AddNewField(CondTblFields, "country_cell");
TableFields.AddNewField(CondTblFields, "product_cell");
 
var CondMatTbl = MatrixTable.CreateTableWithElements_A(CondTblFields,
	"Utopia", "bread, cheese",      // estimator A
	"Utopia", "butter, honig",      // estimator B
	"Ignoria", "ALL",               // estimator C
	"Euphoria", "ALL"               // estimator A
	);
 
// create associated table to see the estimation method in the result table
var AssocTblFields = TableFields.CreateEmptyTableFields(md);
TableFields.AddNewField(AssocTblFields, "estimation_method");
 
var AssocTbl = MatrixTable.CreateTableWithElements_A(AssocTblFields,
	"estimator A",
	"estimator B",
	"estimator C",
	"estimator A"
	);
// array for delegate functions; assign a function for each row of condition matrix table
var MyTableFuncList = new TransformTableFunc_OP[4];
MyTableFuncList[0] = PriceEstimatorA;
MyTableFuncList[1] = PriceEstimatorB;
MyTableFuncList[2] = PriceEstimatorC;
MyTableFuncList[3] = PriceEstimatorA;
 
// array for other parameters
// pass same parameters for all subtables excluding R Function (estimators A, B, C)
EstMarketIndex = 1.2;
EstOilPrice = 100.0;
 
var OtherParametersList = new object[4][];
OtherParametersList[0] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncA, EstMarketIndex, EstOilPrice };
OtherParametersList[1] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncB, EstMarketIndex, EstOilPrice };
OtherParametersList[2] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncC, EstMarketIndex, EstOilPrice };
OtherParametersList[3] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncA, EstMarketIndex, EstOilPrice };
 
// call function router C (with condition cell table)
ResultTbl = MatrixTable.FunctionRouterC(PriceTable, SubTblFields,
	CondMatTbl, MyTableFuncList, AssocTbl, FirstMatchOnly: true, IgnoreHierarchy: true,
	ErrorIfConditionNotRelevant: false, OtherParametersList: OtherParametersList);
 
// round all key figures to 2 digits after decimal point
ResultTbl = MatrixTable.Round(ResultTbl, 2);
 
// view result table
MatrixTable.View_MatrixTable(ResultTbl, "Result table: Apply different estimators for different subtables");

Result table:
Result table, 3. scenario

Scenario 4: Applying all price estimators for all countries and products

In this last scenario we want to apply all available estimation functions on all subtables of the price table in order to compare the estimation results. This is again a task for the function router.

// Function Router and R
// Apply all price estimators to all countries and products
 
// create condition table (not condition matrix table with condition cells!)
CondTblFields = TableFields.CreateEmptyTableFields(md);
TableFields.AddNewField(CondTblFields, "country");
 
CondTbl = MatrixTable.CreateTableWithElements_A(CondTblFields,
	"ALL",                          // estimator A
	"ALL",                          // estimator B
	"ALL"                           // estimator C
	);
// create associated table to see the estimation method in the result table
AssocTbl = MatrixTable.CreateTableWithElements_A(AssocTblFields,
	"estimator A",
	"estimator B",
	"estimator C"
	);
// array for delegate functions; assign a function for each row of condition matrix table
MyTableFuncList = new TransformTableFunc_OP[3];
MyTableFuncList[0] = PriceEstimatorA;
MyTableFuncList[1] = PriceEstimatorB;
MyTableFuncList[2] = PriceEstimatorC;
 
// array for other parameters
// pass same parameters for all subtables excluding R Function (estimators A, B, C)
EstMarketIndex = 1.2;
EstOilPrice = 100.0;
 
OtherParametersList = new object[3][];
OtherParametersList[0] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncA, EstMarketIndex, EstOilPrice };
OtherParametersList[1] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncB, EstMarketIndex, EstOilPrice };
OtherParametersList[2] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncC, EstMarketIndex, EstOilPrice };
 
// call function router A (with condition table)
ResultTbl = MatrixTable.FunctionRouterA(PriceTable, SubTblFields,
	CondTbl, MyTableFuncList, AssocTbl, 
	JokerMatchesAllvalues: true, TextJoker: "ALL", NumJoker: 0,
	FirstMatchOnly: false, OtherParametersList: OtherParametersList);
 
// round all key figures to 2 digits after decimal point
ResultTbl = MatrixTable.Round(ResultTbl, 2);
 
// view result table
MatrixTable.View_MatrixTable(ResultTbl, "Result table: Apply all estimators for all subtables");

Result table:
Result table, 4. scenario

Conclusions

In this article we demonstrated how selected functions in R can be applied on selected parts (subtables) of input tables using flexible constructs like subtable transformer and function router. In that sense, these constructs can be used to integrate R with data tables containing any number of attributes and key figures.

The same (or similar) integration approach can be used for other computing languages like matlab.

Copyright secured by Digiprove © 2013 Tunc Ali Kütükcüoglu
  1. A function router applies selected table (or matrix) functions on selected subtables of an input table. []
  2. A subtable transformer applies the same table (or matrix) function on every subtable of an input table. []
This entry was posted in Calculation engine and tagged , , . Bookmark the permalink.

Leave a Reply