Name

CurveBuilder — Reads in curve files and writes out CSV files that are suitable as input to the Matrix Builder.

Syntax

ca.spatial.tracks.curves.CurveBuilder {config}

The CurveBuilder command has the following arguments:

config

The name of a YAML format configuration file containing the parameters to the CurveBuilder process

Description

The CurveBuilder tool is a general purpose yield curve pre-processor. The CurveBuilder program reads in yield curve files and writes out CSV files that are suitable to input in to the Matrix Builder application. This tools was originally written to assist in preparing FVS-format yield curve files, but has been generalized as much as possible so as to operate on a variety of yield curve inputs from various forecasting systems.

As it works it optionally performs a number of transformation to the input values, including

  • Appending of multiple input curve files for single-pass processing.

  • Recoding the silvicultural regime name based on pattern matching. For example, the regime names can be decoded from the name of the keyword file that was used to run the yield simulation.

  • Applying netdown factors or expressions to the input curve values. These factors will scale the original curve values according to an arbitrary expression, for example to convert from gross to net values.

  • Aggregating the input curves into output values using arbitrary expressions. For example, summing several individual inputs into a single product group curve.

  • Calculating post-harvest values for inventory curves in periods where management events such as thinnings occur. The process is to interpolate a pre-harvest value in the year prior to the event, and set the post-harvest volume to be the remainder from subtracting harvest volumes from pre-harvest inventory volumes (checking to ensure the result is non-negative). In the figure below a reformatted post-harvest volume curve is depicted in blue.

  • Calculate the harvest curves so that they appear as a spike of volume in the year that the event takes place, with zero values in all other years. In the figure above these are shown as the red curve.

  • Calculate additional derived value curves as required.

  • Expansion of strata columns in to presence/absence curves, with one additional curve for each unique stratification value.

  • Calculate of multiple duration and retention curves. Both duration and retention curves have a delay value.

    • Duration curves are zero until an event occurs, and then they step up to a value of one and remain there until the delay value in years is met, and then they step back down to zero. Duration curves are used to indicate when a stand condition is under the influence of a previous event.

    • Retention have a value of one from the start curve set until the number of years specified by the delay value past the final event, at which point the curve steps down to a value of zero. Retention curves are used to indicate when the complete event sequence is finished, and are useful for indicating treatment eligibility after a full uneven age regime has been completed.

    The figure below shows a 10-year delay retention curve in red and a 20-year delay duration curve in blue for an event that occurred in year 130. The retention curve turns off 10 years after the event, and the duration curve is on from the time that the event occurred until 20 years later.

  • Produce an "age of first treatment" lookup table.

  • Optionally producing an operability lookup table, showing the early and late operability timings for each treatment regime.

  • Deduplication of regimes with identical values. A regime is considered identical if

    • it is in the same strata, and

    • all outputs after processing will be identical. Outputs include all of the feature and product curves values.

    Duplicated curvesets are not written to the output file. A value of -3 is recorded in the first year lookup table for this strata/regime combo.

  • Error checking and reporting on:

    • Strata codes read from the yield curve input file that do not match to strata in the block file.

    • Strata in the block file which did not have matching curve sets from the yield curve file.

    • Curve sets that are entirely zero values.

The transformation process results in three files:

  • The growing stock curves (also known as feature curves).

  • The harvested volumes (also known as product curves) typically from uneven aged or partial harvest treatments such as thinnings. Note that clearcut harvest volumes are derived from the growing stock curves, so are not duplicated in this harvested volume curve file.

  • A lookup table by strata of the age of the first management event, by each regime. This table has unique strata combinations in rows, and regime names in the columns. The cell values will be either:

    • the age when the first management event occurs;

    • -1 if no event occurs over the age of the simulated growth of the stand;

    • -2 if there are no curves for this strata/regime combination; or

    • -3 if curves existed for this strata/regime combo but were discarded because they were duplicates of another regime in the same strata.

Configuration file

The configuration file contains parameters to the transformation process in YAML format. A 'CurveBuilder: key is required at the top level of the file. All subsequent keys described below are children of this top level key.

The following keys are required:

blockTable:

The value of this key is an expression used to access the block attribute file, using a path name relative to the config file location.

blockTable: ../blocks/blocks.dbf

If you specify a block file and all other related block table information the unpacking tool will:

  • make the related fields in the block table available for queries; and

  • perform quality control checks to make sure that yield curves are available for every block record.

The Patchworks toolkit also supports reading directly from the database files produced by using the JDBC adapter, such as in the following example:

blockTable: JDBC "jdbc:sqlite:../../SpatialDataPrep_pll.gpkg" "select * from preblocks"

If you use a database connection you must provide a JDBC driver: these are not distributed with Patchworks. See the documentation for the JDBCStore class for details about how to download and specify a JDBC driver.

If the block file is not specified then some checks will be skipped.

blockStrata:

The name of a column in the block table that contains a unique identifier that will match up with the stratification field in the yield curve input files. If a blockTable is specified then this field is required.

blockStrata:  STANDID
blockQuery:

A query used to identify the records in the block file that require yield curve volumes. This query will be used to check that the matching records from the block file have corresponding yield records. Mismatches will be identified. This query does not limit the yield curve input records to be processed.

"FORNAME = 'Apollo'"
deduplicate:

A flag to indicate if regimes in the same strata with identical volumes and events should be de-duplicated. Identical means having the same inventory and harvest yield values for all ages and columns. Curve sets that are duplicates to previously written curve sets are discarded.

deduplicate:  true
dedupSkip:

A regular expression for regimes that should not be dropped due to de-duplication. These regimes will be included in the output set regardless of being duplicates of previously processed treatments.

dedupSkip: (GROW|REGE)
outputFolder

The output folder where the results will be stored, relative to the config file location. The folder must exist.

outputFolder: FVS_UNPACK_20220916
ageAdjustment:

A Patchworks query language expression convert yield curve ages to stand ages. This can be used to adjust the age for various reasons. A common reason for New Forests is that the assumption is that the regime is starting on bare ground, but the recorded age is still the age from the initial inventory. The adjustment to make in this case is to subtract the initial age from the yield curve recorded age for these specific regimes, so the resulting ages start at zero. For example;

ageAdjustment: Age - if(_regime_ = 'REGE' or endsWith(_regime_,'_b'),InitialAge,0);

If no adjustment is required then just enter the yield curve age field name. The adjusted age will be stored in the internal table as a column named '_age_' and may be used in the column output expressions.

curveStrata:

Enter the column name in the input yield curve file that contains the stratification value. The value from this field will be stored in the internal column named '_strata_' and may be used in the column output expressions.

curveStrata: StandID
curveRegime:

Specifies the name to be used as the column heading in the output files for the regime field.

curveRegime: MgmtId
curveQuery:

A query expression that can be used to limit the records that are processed from the yield curve input files. Only records that match to the expression are included. This is typically only used for debugging and diagnostic purposes and not for production runs. The expression must evaluate to true or false. This line may be commented out or set to true if not needed.

curveQuery: StandID eq '29N11W0002'
regimeTransform:

The regimeTransform entry contains a list of entries that will be used as arguments to a substitutionlist function. Each row in the list contains a sublist with the input, regex, translation template arguments required as a list element in the argument to the substitutionList function.

The triplets required in each row are:

  • Input expression - the expression read from the input file

  • regex - the regular expression that is pattern matched against the expression value

  • template - the name extraction template used when a match occurs

The template string may have substitution symbols that consists of a capture group number enclosed in braces (e.g. {1}). The symbol including the braces will be replaced with the corresponding capture group value from the regular expression.

Note that due to YAML formatting rules single quotes and braces must be enclosed in double quoted strings. Escaped regular expression symbols must have a double escape (e.g. 'Unev_(\d+)' becomes "'Unev_(\\d+)'", and 'SOFR_{1}' becomes "'SOFR_{1}'").

The result of transforming the regime name will be stored in the internal table in a column named '_regime_'. This column is available for use in other Patchworks Query Language expressions.

When the function is executed it will test each row in order, returning the result of the first match If a match cannot be found in any of the rows then the input record will be ignored.

  regimeTransform: &RT
    - [KeywordFile, "'GROW'", "'GROW'"]
    - [KeywordFile, "'Owl_Mgnt'", "'SOFR'"]
    - [KeywordFile, "'Owl_Mgntregen'", "'SOFR_b'"]
    - [KeywordFile, "'Owl_Mgnt_p(\\d)'", "'SOFR_{1}'"]
    - [KeywordFile, "'Regen'", "'REGE'"]
    - [KeywordFile, "'Riparian'", "'RIPR'"]
    - [KeywordFile, "'Riparianregen'", "'RIPR_b'"]
    - [KeywordFile, "'Riparian_p(\\d)'", "'RIPR_{1}'"]
    - [KeywordFile, "'Thin'", "'THRD'"]
    - [KeywordFile, "'Thin_p(\\d)'", "'THRD_{1}'"]
    - [KeywordFile, "'Unev_(\\d+)'", "'U{1}'"]
    - [KeywordFile, "'Unev_(\\d+)regen'", "'U{1}_b'"]
    - [KeywordFile, "'Unev_(\\d+)_p(\\d)'", "'U{1}_{2}'"]

In the above example the regimeTransform line used an alias of &RT. This alias may be used elsewhere in the configuration file, and reduces the need for the same transformations to be repeated multiple times.

curveInputFiles:

This entry contains a list of yield curve input files, along with the regime name transformation rules to use for each. Also provide a field name that will be used in the warning message when the regime transform expression fails to find a match.

  curveInputFiles:
    - filename: JDBC "jdbc:sqlite:../../SpatialDataPrep_pll.gpkg" "select * from ws_yields_plus"
      regimeTransform: *RT
      unmatchedWarning: KeywordFile

The filename entry contains a string that can be parsed by the TableSpec.open method. In the above example the JDBC driver is used to open a Sqlite data base using a select statement. Because the string includes the colon and space characters it must be enclosed in double quotes. Sqlite is one of the native output formats from FVS, so by using the JDBC driver the data can be read directly without requiring an intermediate export step. It may be useful to define a view within the database system in order to provide all of the columns needed to classify and process curves.

The regimeTransform entry contains a list of entries used to form a substitutionlist function, as described previously. In the above example the *RT value is a pointer that refers to the &RT alias that was defined in the previous step. The regime transform will be applied as the yield records are read, and the value will be stored in the internal table in a column named '_regime_'.

The unmatchedWarning entry provides and expression that will be evaluated and printed in the error message when the substitutionList command fails to find a match. This code can assist in determining the additional matching rules that need to be added.

During processing each input file is read, transformed, sorted and appended to a master table. If there is more than one input file then all input files must have the same column structure.

eventQuery:

An expression that determines if the current yield curve record contains an 'event' such as a thinning. The expression must return a true/false value. A typical test is that one or more of the harvest columns returns above non-zero value. This query is executed on the internal table that is a join of the yield curve and block files.

eventQuery: "summarizeColumns('yCHarvAG|yCHarvBG|Y.*HV')>0"

Double quotes are used to surround this expression because it contains characters that may be confusing to the YAML parser.

operabilityQuery:

An expression that determines if the current yield curve record is eligible for a treatment, such as a thinning. The expression must return a true/false value. A typical test is that one or more of the harvest columns returns above non-zero value. This query is executed on the internal table that is a join of the yield curve and block files.

operabilityQuery:
                "summarizeColumns('YQth.*')>0"

Double quotes are used to surround this expression because it contains characters that may be confusing to the YAML parser.

If specified this will cause the CurveBuilder app to write out an additional file named 'cb_Operability_lu.csv'. The contents of this file contains the strata and regime values, the earliest and latest operability ages, and an expression that represents the operability limits in case the operability range is discontinuous. The early operability age is the age of the initial yield curve record that passes the operability query. The late operability age is the age one year prior to the age in the next yield curve record after the last yield curve record where the operability query is true. If there are no subsequent yield curve records then the late operability age will be 999.

inputCurveScaling:

Input curve scaling is used to transform the curve values from the input files as they are being read in from the data base, after the management regime names have been translated but prior to any other manipulations or summations. This entry allows for a list of scaling factor sets, each potentially applying to different management regimes or input curve columns. Each list item may contain:

  • regime: a regular expression that will be applied to the regime name for this input set. If the regime name matches to the regular expression then this list item is a candidate match.

  • curves: a regular expression that will be applied to the input column names. If the expression matches then this list item is above candidate for scaling the curve.

  • factor: an expression that computes a scaling factor that will be applied to every point on the curve. The expression may use any value in the joined yield curve intput and block table.

  • expression: a Patchworks Query Language expression that will transform the input value. The expression may have subsitution symbols that consists of a match group number enclosed in braces (e.g. {0}). The symbol including the braces will be replaced with the corresponding match group value from the 'curves' regular expression.

Values for the curves key are required. The regime key is optional and if not present then the list item applies to all regimes.

  inputCurveScaling:
    - {curves: Yoh\d(HV)?, factor: 0.89}
    - {curves: Ydf\d(HV)?, factor: 0.92}
    - {curves: Yoc\d(HV)?, factor: 0.89}
    - {curves: Yce\d(HV)?, factor: 0.89}
    - {curves: Yww\d(HV)?, factor: 0.92}
    - {curves: Ypn\d(HV)?, expression: {0}*if(_age_<100,0.95,0.50}

During processing the first list item that matched to both the curve and regime will be used. If a matching item is not found the input value will not be scaled.

The first five lines in the previous example use constant scaling factors that are applied to each matching input value. The final line uses an age-based expression. The current matching column is substituted for the {0} symbol. This value is then multiplied by the value from the remainder of the expression, which will be 0.95 for ages less than 100, and 0.50 otherwise. The expression may use any of the values from the combined block and curve input files, as well as the special values of _age_, _regime_, and _strata_.

Either factor or expression values must be provided, but it is invalid to provide both.

The substitution symbol syntax of {0} in the expression conflicts with the Patchworks Query Language list syntax that uses braces to enclose list members. Do not use the brace list syntax in these expressions, instead use the list() function to form lists.

expandStrataColumns:

This list specifies column names that contain a discrete classification, such as a forest type coding. This column will be expanded to above series of output columns, one for each unique stratification code. Each column will contain a presence/absence (1/0) curve that has a 1 for ages when the stratification code is present, and a 0 for other ages.

Each list contains:

  • column: The column name containing the values of interest

  • extract: A regular expression that will extract the relevant part of the code. This is optional and if not specified an expression of .* (all values accepted) will be used.

  • template: A formatting template that will combine the extracted parts of the code to form a label. Capture groups from the extract expression can be referred to by numbers within braces, with {0} referring to the entire code. Since brace characters are a syntactic element of the YAML format the template value must be enclosed in double quotes.

  expandStrataColumns:
    - {column: ForTyp, extract: .*, template: "fortype_{0}"}
idColumns:

This entry contains a list of column specifications that describe values that will be output to the feature and product curve table. Id columns are not interpreted in any other way.

Each row in the list contains the following values:

  • label: The label field provides the column heading to be used in the output file.

  • expression: The expression field contains a Patchworks Query Language expression that will be used to calculate the output value for this column. The expression can use any of the values from the joined yield cufve input and block table.

  idColumns:
    - {label: ForTyp, expression: ForTyp}

inventoryColumns:

This entry contains a list of column specifications that describe how to sum up columns that will be output in the feature curve table.

Each row in the list contains the following values:

  • label: The label field provides the column heading to be used in the output file.

  • expression: The expression field contains a Patchworks Query Language expression that will be used to calculate the output value for this column. The expression can use any of the values from the joined yield cufve input and block table.

  • The hrvColumn specifies a corresponding column in the harvested columns list. For years that correspond to a management event the harvested value will be subtracted from the value computed by the expression to determine the amount left after the treatment has been applied. An interpolation value will be inserted in the year previous to the treatment year. For years that do not correspond to a management event the value calculated by the expression will not be altered.

    If the hrvColumn value is not specified then expression value is not altered to be a post-treatment value.

  inventoryColumns:
    - {label: yCGrnAG, expression: yCGrnAG,                   hrvColumn: yCHarvAG}
    - {label: yCGrnBG, expression: yCGrnBG,                   hrvColumn: yCHarvBG}
    - {label: yCInv,   expression: yCInv,                     hrvColumn: yCOnsite}
    - {label: Ytotal,  expression: summarizeColumns('Y.*\d'), hrvColumn: Ytotal}
    - {label: Yoha,    expression: summarizeColumns('Yoh.*'), hrvColumn: Yoha}
    - {label: Ydfl,    expression: Ydf1,                      hrvColumn: Ydfl}
    - {label: Ydfm,    expression: Ydf2 + Ydf3,               hrvColumn: Ydfm}
    - {label: Ydfs,    expression: Ydf4,                      hrvColumn: Ydfs}
    - {label: Yoca,    expression: summarizeColumns('Yoc.*'), hrvColumn: Yoca}
    - {label: Ycel,    expression: Yce1 + Yce2,               hrvColumn: Ycel}
    - {label: Yces,    expression: Yce3 + Yce4,               hrvColumn: Yces}
    - {label: Ywwl,    expression: Yww1,                      hrvColumn: Ywwl}
    - {label: Ywwm,    expression: Yww2 + Yww3,               hrvColumn: Ywwm}
    - {label: Ywws,    expression: Yww4,                      hrvColumn: Ywws}
    - {label: Ypnl,    expression: Ypn1,                      hrvColumn: Ypnl}
    - {label: Ypnm,    expression: Ypn2 + Ypn3,               hrvColumn: Ypnm}
    - {label: Ypns,    expression: Ypn4,                      hrvColumn: Ypns}

harvestColumns:

The harvestColumns entry is similar to the inventoryColumns entry. It is a list of columns that will be output to the product curves file. The harvestColumns entry is a list with rows for each column that is to be output.

Each row contain the label and expression fields that have the same meaning as the given in the inventoryColumns entry.

  harvestColumns:
    - {label: yCHarvAG, expression: yCHarvAG}
    - {label: yCHarvBG, expression: yCHarvBG}
    - {label: yCOnsite, expression: yCHarvAG + yCHarvBG}
    - {label: Ytotal, expression: summarizeColumns('Y.*\dHV')}
    - {label: Yoha, expression: summarizeColumns('Yoh.*HV')}
    - {label: Ydfl, expression: Ydf1HV}
    - {label: Ydfm, expression: Ydf2HV + Ydf3HV}
    - {label: Ydfs, expression: Ydf4HV}
    - {label: Yoca, expression: summarizeColumns('Yoc.*HV')}
    - {label: Ycel, expression: Yce1HV + Yce2HV}
    - {label: Yces, expression: Yce3HV + Yce4HV}
    - {label: Ywwl, expression: Yww1HV}
    - {label: Ywwm, expression: Yww2HV + Yww3HV}
    - {label: Ywws, expression: Yww4HV}
    - {label: Ypnl, expression: Ypn1HV}
    - {label: Ypnm, expression: Ypn2HV + Ypn3HV}
    - {label: Ypns, expression: Ypn4HV}
    - {label: obtTotal, expression: "setVar('tot',summarizeColumns('Y.*HV'));
         Number(tableRange('obtCost','Vol',getVar('tot'),if(SLOPE<35,'Slope0',
         if(SLOPE<55,'Slope1','Slope2'))))*getVar('tot')"}
retentionColumns:

A list of entries having a label and a Patchworks query language expression identifying the retention timings for various treatments. Each expression should return a numeric value being the number of years of retention after the final treatment. Presence/absence (0/1) curves will be output using the column label with a value of 1 prior to the number of retained years past the final event, and a value of 0 otherwise.

  retentionColumns:
    - {label:  retention, expression: "if(startsWith(_regime_,'THRD'),10,0)"}

If the expression returns 0 or less then no retention will be recorded.

durationColumns:

A list of entries having a label and a Patchwork Query Language expression that identify the disturbance duration after each event. Each expression should return a numeric value being the number of years of duration after the each treatment. Presence/absence (0/1) curves will be output using the column label with a value of 1 when the regime is under the disturbance duration, and zero otherwise.

  durationColumns:
    - {label:  fp20, expression: "if(matches(_regime_,'RIPR|SOFR|U.*'),20,0)"}

If the expression returns 0 or less then no duration will be recorded.

lookupTables:

This entry contains a list of lookup table entries. Each row in the list contains an id key for the table identifier, and either a data section or a filename entry (but not both). The lookup table references will be loaded in to the Patchworks Query Language environment and can be used in expressions, such as with the tablemap and tablerange functions.

  • id: the name of the table (required).

  • data: A list of rows in the lookup table, starting with the column headings

  • filename: A name of a external table that should be read in and used as a lookup table

  lookupTables:
    # The obtCost table provides on-board trucking costs by volume density and slope class.
    # The first column is the average volume of the harvest in Mbf/acre.  Choose the
    # row with the stand volume closest but not exceeding the row value.  Next choose the
    # column with the matching slope class.
    #
    # If the stand volume is greater than all values in the table, the last row will be used.
    #
    # The rows in the lookup table must be sorted by the Vol lookup column.
    #
    # If the OBT costs do not vary by average volume then only a single row is required
    # and it does not matter what value is in the Vol column.
    - id: obtCost
      data:
        - [Vol, Slope0, Slope1, Slope2]
        - [2, 199, 279, 432]
        - [4, 172, 241, 374]
        - [6, 157, 220, 341]
        - [8, 147, 206, 319]
        - [10, 140, 196, 304]
        - [12, 133, 186, 288]
        - [14, 127, 178, 276]
        - [16, 121, 169, 262]
        - [18, 115, 161, 250]
        - [20, 109, 153, 237]
        - [22, 105, 147, 228]
        - [24, 101, 141, 219]
        - [26, 98,  137, 212]
        - [28, 95,  133, 206]
        - [30, 93,  130, 202]
        - [32, 91,  127, 197]
        - [34, 89,  125, 194]
        - [36, 87,  122, 189]
        - [38, 86,  120, 186]
        - [40, 85,  119, 184]

Processing notes

Processing follows these steps:

  1. The configuration file is read and syntax is checked.

  2. The block file is opened.

  3. Each yield curve input file in the list is read. As the file is read:

    • The _strata_ field is populated from the curveStrata column.

    • The regimeTransform factor is applies and the _regime_ field is populated.

    • The ageAdjustment factor is applied and the _age_ field is populated.

    • The inputCurveScaling netdown factors or expression are applied to the selected input columns.

    • The table is sorted by strata, regime and age.

  4. The yield curve input files are appended together, and the block table is joined to the appended set. This is the table that the Patchworks Query Language (PQL) expressions will be compiled against. Besides all of the yield curve and block table columns, the following three columns are present and available for use:

    • _strata_ - the stratification value for this record, as copied from the field mentioned in the curveStrata: entry.

    • _regime_ - the regime value as calculated by the regimeTransform: entry.

    • _age_ - the stand age value as calculated by the ageAdjustment: entry.

  5. The Patchworks Query Language (PQL) expressions are compiled and compilation errors are reported.

  6. Record processing begins. Records are read and collected into groups by unique strata/regime combinations. When a group has been completely read it will be processed for output, including:

    • Inventory and harvest column values are calculated, included post-harvest volume calculations.

    • Calculation of duration, retention and expanded strata columns.

    • Duplicate curve sets are detected.

    • Zero volume curve sets are detected.

    • Curve values are written to the output files.

    • Operability lookups are written, but only if an operability query has been provided.

  7. After all records have been processed the "first treatment year" lookup table is written, and messages are printed about stands which have not been matched to yield curve sets.

Examples

The following examples show different methods to invoke the CurveBuilder command.

Example 1: From the Application Launcher

  1. Select the CurveBuilder tool from the Application Launcher menu:

  2. Select the file containing the configuration and press the Finish button:

Example 2: From within a BeanShell script

Enter the following line within a script:

AppChooser.invokeInline("ca.spatial.tracks.curves.CurveBuilder",
    new String[] {"F:/Projects/version_1/yield/unpackFvsConfig.yaml"});

This command will start the CurveBuilder command as a separate process. The name of the config file is passed as a parameter. The unpack process will run until completion and display output in the terminal window of the controlling script.

For more information about invoking applications in this way see the Application Launcher documentation.

Example 3. From the operating system command line

The command can be executed from the command line by using the following syntax:

java -jar "c:/Program Files/Spatial Planning Systems/Patchworks/patchworks.jar"
    ca.spatial.tracks.curves.CurveBuilder F:/Projects/version_1/yield/unpackFvsConfig.yaml

The above command is shown continued on to two lines, but when entered at the terminal it should be contained on a single line. Double quotes should surround path names that include spaces or other special characters.