Iterating over multiple arrays using Reduce()
- Viswanathan Baskaran
- May 22
- 5 min read
The SCAN and REDUCE functions in MS Excel allow us to iterate over an array to produce a dynamic array results. While it is often used for aggregating a single variable, in this post we explain how to handle more complex models involving multiple inter-dependent variables through "Stack and Dice" approach.
(If you are not aware of how the Scan() and Reduce() function works, you can first check out this video: When to use Scan and Reduce)
The idea behind the "Stack and Dice" approach is that all the related variables are stacked together and passed as a single variable to the reduce function. Then inside the function, these variables are diced into individual variables.
We illustrate this using a standard transportation problem for which we are going to find an initial basic feasible solution (IBFS) using the Vogol's Approximation Method VAM).
We chose this problem because each variable that is involved here are table in itself, which adds its own complexity.
We have three tables to start with: (i) The cost table: Contains the transportation cost per unit between a given origin and destination
(ii) The supply table: Contains the information on amount of supply available at origin stations
(iii) The demand table: Contains the information on quantity of demand at each destination
VAM uses a penalty function to iteratively match an origin with a destination with the objective of reducing the transportation cost. The output is the allocation table, which is also built in an iterative manner. If you are unfamiliar with this method, you can see the manual step-by-step illustration in this MS Excel file to understand it better.
The MS Excel file has the entire Lambda with detailed comments and illustrations. You can read further to get an overview of the technique applied in the same.
The "Stacking and Dicing" Technique
The core idea is to combine (stack) all relevant tables or data structures into a single array. This composite array serves as the initial_value for the REDUCE function. Then, within the LAMBDA function that REDUCE applies at each iteration, this composite array (now the accumulator) is broken down (diced) back into its constituent parts. The algorithm's logic is applied to these individual parts, and the updated parts are then re-stitched into a new composite array, which becomes the accumulator for the next iteration.
Syntax of REDUCE:=REDUCE([initial_value], array, LAMBDA(accumulator, value, body))
[initial_value]: Sets the starting value for the accumulator. In our approach, this will be the initially stitched set of tables.
array: An array to be reduced or iterated over. This typically provides the number of iterations or values to process.
LAMBDA: The function called for each iteration. It takes the current accumulator and the current value from the array to perform calculations defined in its body.
Illustration: Vogel's Approximation Method (VAM) with REDUCE
Vogel's Approximation Method (VAM) is an iterative procedure to find an initial feasible solution for transportation problems by minimizing total costs. It involves calculating penalties, allocating resources, and updating supply, demand, and cost matrices in each step
The provided Excel LAMBDA function IBFS demonstrates how to implement VAM using the "stitching and dicing" technique with REDUCE.
1. Stitching the Initial Tables: The user passes three tables as input variables in to the function:
=Lambda(CostTbl, SupplyTbl, DemandTbl,
(The function does a certain quality check on the cost table to remove origins and destinations that are not relevant.) Initially, the cost, supply, and demand tables, along with a placeholder for the results (allocations), are combined into a single array. In the IBFS function, this is done using HSTACK:
Let(FirstTbl, HSTACK(RvCostTbl,
SupplyTbl,
DemandTbl,
HSTACK("NA", "NA", 0, 0)
)
Here, RvCostTbl (a filtered version of the input cost table), SupplyTbl, DemandTbl, and an initial empty structure for results (HSTACK("NA", "NA", 0, 0)) are horizontally stacked. This FirstTbl becomes the initial_value passed to REDUCE. The total row of the horizontally stacked array would be equal to the number of rows of the tallest table in the stack. MS Excel would pad the other tables with additional rows so that the stack has same rows across columns. All padded items will be #NA.
2. Iteration with REDUCE:The REDUCE function then iterates a set number of times, controlled by the Ns sequence (SEQUENCE(ROWS(_SupplyTbl) + ROWS(_DemandTbl) + 1)).
FinalTbl, REDUCE(FirstTbl, // Initial stitched table
Ns, // Sequence for iteration (1 to N)
LAMBDA(CombTbl, Iter, // CombTbl is the accumulator IF(SUM(TAKE(CombTbl, 1, 3)) = -PI(), // Termination condition CombTbl,
LET(// Dicing, VAM logic, and Re-stitching happens here ) //close let
)//close IF
)//close Lambda
)//close Reduce
The LAMBDA's first parameter, CombTbl, represents the accumulator. In each iteration, CombTbl holds the stitched set of tables from the previous step (or FirstTbl in the first iteration).
3. Dicing the Tables Inside LAMBDA:Within the LET function inside the LAMBDA, the CombTbl is "diced" back into individual tables using the take and drop functions. The padded rows with #NA of the diced tables needs to be removed and this is done with a custom function RemoveNARows. The dicing itself is done using Take() and Drop() functions
CostTbl, TAKE(RemoveNARows(CombTbl, 1), , 3),
SupplyTbl, DROP(TAKE(RemoveNARows(CombTbl, 4), , 5), , 3),
DemandTbl, DROP(TAKE(RemoveNARows(CombTbl, 6), , 7), , 5),
Results, TAKE(RemoveNARows(CombTbl, 8), , -4),
Since the columns of tables are static, the static column numbers are directly passed on to the function.
4. Performing VAM Calculations:The core VAM logic is then applied to these diced-out tables. This includes:
Calculating penalties for supply centres and demand centres (Diff1, Diff2). (Penalty equals the difference between lowest cost and second lowest cost for each demand and supply centres)
Identifying the cell with the highest penalty for allocation (Nthrow, SelectS, SelectD).
Determining the quantity to allocate (FilledQty).
Updating the cost, supply, and demand tables (NewCostTbl, NewSupplytbl, NewDemandtbl).
Recording the allocation in an updated results table (NewOutput).
For example, NewCostTbl is derived by filtering the CostTbl to remove rows or columns where supply or demand has been fully met:
NewCostTbl, FILTER( CostTbl, FirstCondtn * SecondCondtn, // Conditions to eliminate satisfied rows/columns HSTACK("NA", "NA", -PI()) // Placeholder if all met ),
5. Re-stacking the Updated Tables: After the VAM calculations for the current iteration, the newly updated tables (NewCostTbl, NewSupplytbl, NewDemandtbl, NewOutput) are stitched back together using HSTACK. This re-stitched composite table becomes the return value of the LAMBDA for the current iteration.
HSTACK(NewCostTbl, NewSupplytbl, NewDemandtbl, NewOutput)
This returned value then serves as the CombTbl (accumulator) for the subsequent iteration of REDUCE.
6. Termination and Final Result: The REDUCE function continues this process of dicing, calculating, and re-stitching until a termination condition is met (e.g., all supply and demand are satisfied, checked by SUM(TAKE(CombTbl, 1, 3)) = -PI()) or the iteration sequence Ns is exhausted. The final CombTbl returned by REDUCE (FinalTbl) contains all the tables after the VAM process is complete.
The desired output, typically the allocation table, is then extracted from this FinalTbl.
Advantages of this Approach
Handles Multiple Interdependent Variables: Allows complex algorithms requiring multiple data structures to be updated iteratively within REDUCE.
Avoids recursion: This approach eliminates the need for recursive Lamdas which are more complicated to debug and has performance limitations.
By using this "Stacking and Dicing" approach, the REDUCE function in Excel can be extended beyond simple accumulations to manage sophisticated iterative calculations involving multiple, evolving datasets, as demonstrated by the VAM implementation.
Download the file to see the result
Comments