The program v7d_transform is a command-line utility which can perform many useful space and time transformations and conversions on volumes of sparse georeferenced data coming from various sources. The data are internally imported into a vol7d_class::vol7d structure.
The general syntax is:
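As a sketch based on the argument descriptions that follow (the original usage line is reconstructed here, not quoted), the invocation has the general shape:

  v7d_transform [options] inputfile outputfile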
Sparse data can be imported from the following sources: BUFR/CREX files, v7d native binary files, a DB-All.e database or an Oracle SIM database.
BUFR/CREX and v7d native binary files can also be created by means of the vg6d_getpoint utility, starting from gridded datasets like grib files.
The input format has to be declared with the --input-format= command line argument, which can take the values dba, BUFR, CREX, native or orsim.
For file-type input data, multiple sources of the same type can be indicated, and they will be merged into a single dataset for computation and output; a single - character indicates input from stdin (not supported for all formats).
For database-type input formats, the inputfile argument specifies database access information in the form user/password@dsn; if it is empty or -, a suitable default is used.
Data can be output in the following formats, regardless of the input format: BUFR, CREX, native, csv or grib_api. The output format has to be declared with the --output-format=name[:template] command line argument, where name takes one of the above values; for BUFR, CREX and grib_api a template can also be specified.
The name of the output file is given as the last argument on the command line; a - character indicates output to stdout (not supported for all formats).
The native format and the BUFR/CREX formats contain more or less the same information, but they serve different purposes: BUFR/CREX are stable formats, compatible with other applications, so they are suitable for long-term storage and exchange of data, although reading and writing them takes a considerable extra amount of time. The v7d native format, conversely, can be read and written very quickly and is suitable for working in a "pipe" of commands (stdin/stdout), but it is purely an internal format of libsim, not guaranteed to be portable among different versions or platforms, so it should be used only as fast temporary storage within a single application.
When importing from a database, rather than from a file, more information has to be specified in order to determine what is going to be imported:
--start-date=  initial date for extracting data, in ISO format 'YYYY-MM-DD hh:mm'; hour and minute can be omitted
--end-date=  final date for extracting data, in the same format as --start-date=
--network-list=  list of station networks (report types) to be extracted, as a comma-separated list of alphanumeric network identifiers
--variable-list=  list of data variables to be extracted, as a comma-separated list of B-table alphanumeric codes
--anavariable-list=  list of station variables to be extracted, as a comma-separated list of B-table alphanumeric codes
--attribute-list=  list of data attributes to be extracted, as a comma-separated list of B-table alphanumeric codes
--set-network=  if provided, all the input data are merged into a single pseudo-network with the given name, discarding the original network information

When importing data, --start-date, --end-date, --network-list and --variable-list are mandatory; when importing station information, --network-list and --anavariable-list are mandatory.
Common values for B-table data variable codes are:

B10003  Geopotential
B10004  Pressure
B11001  Wind direction
B11002  Wind speed
B11003  U-component of wind
B11004  V-component of wind
B12101  Temperature
B12103  Dew-point temperature
B13003  Relative humidity
B13011  Total precipitation

Common values for B-table data attribute codes are (Oracle SIM specific):

B33192  [SIM] Climatological and consistency check
B33193  [SIM] Time consistency
B33194  [SIM] Space consistency
B33195  [SIM] MeteoDB variable ID
B33196  [SIM] Data has been invalidated
B33197  [SIM] Manual replacement in substitution

Common values for B-table station variable codes are:

B07001  Station height
B07031  Barometer height
B01192  MeteoDB station id
B01019  Station name
B01194  Report (network) mnemonic

This example imports two months of precipitation from two networks in the database, with output on a native binary file:
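A plausible invocation, built only from the options described above (the dates, the network names and the output file name are illustrative):

  v7d_transform --input-format=dba --start-date='2012-01-01 00:00' --end-date='2012-02-29 23:59' --network-list=synop,agrmet --variable-list=B13011 --output-format=native - prec_2months.v7d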
Notice the - character in place of the input file, indicating that the default access credentials will be used for connecting to the database.
This example imports only station information from two networks in the database, merging the networks in a single one, with output on screen in csv format:
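A sketch of such an invocation, again with illustrative network and pseudo-network names; note that - is used both for the input (default database credentials) and for the output (stdout):

  v7d_transform --input-format=dba --network-list=synop,agrmet --anavariable-list=B07001,B01019,B01192 --set-network=merged --output-format=csv - -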
The result is a csv table printed on the screen.
With v7d_transform it is possible to apply simple statistical processing to data, using the command-line argument --comp-stat-proc=[isp:]osp, where isp is the statistical process of the input data to be processed and osp is the statistical process to apply, which will appear in the output. The possible values (from the grib2 code table) are: 0 average, 1 accumulation, 2 maximum, 3 minimum, 254 instantaneous.
If isp is not provided it is assumed to be equal to osp.
Other important command-line arguments for statistical processing are:
--comp-step=  length of the statistical processing step, in the form 'YYYYMMDDDD hh:mm:ss.msc'; it can be shortened down to the form 'D hh'; it is not recommended to mix variable-length intervals like months or years with fixed intervals like days or hours
--comp-start=  start of the statistical processing interval; an empty value indicates taking the initial time step of the available data; the format is the same as for the --start-date= parameter

Depending on the values of isp and osp, different strategies are applied to the data for computing the statistical processing.
When isp is 254, e.g. in --comp-stat-proc=254:0, instantaneous data are processed by computing their average, minimum or maximum over regular intervals. This can be applied to observed or analyzed data (not to forecast data). In the case of average, every value contributing to the processing is weighted according to the length of the interval it spans, in order to take into account an uneven distribution of data in time.
In all cases, if a gap without data longer than 10% of the overall processing interval is encountered, the whole interval is discarded (to be tuned).
When isp and osp are 0, 1, 2 or 3, statistically processed data are reprocessed by recomputing the statistical processing over a time interval different from the input one. In principle isp and osp should be equal; different values are nevertheless allowed, in which case osp determines the statistical processing that will be applied, while isp is only used for selecting the input data. For this kind of processing, the following two methods are applied in succession.
The first method supports average, accumulation, maximum and minimum operations, and it can be applied to observed or analyzed data (not to forecast data).
The additional argument --frac-valid= specifies the minimum fraction of data that must be present in input for each output interval in order to consider the processing on that interval valid; it should be a value between 0 and 1.
When this method is applicable, it is recommended to provide the argument --comp-start=.
The second method supports average and accumulation operations, and it can be applied to observed, analyzed or forecast data.
When this method is applied, the argument --comp-start= is ignored.
This is quite a stupid operation, but it is required by some users, so it is allowed to set --comp-stat-proc=1:254 in order to transform average data into instantaneous data, obtaining an output reference time in the middle of the input averaging interval (see result Out1 in the next figure).
For this method the --comp-step= argument indicates the length of the time step of the statistically processed input data to be used, not the expected step of the output data, while the argument --comp-start= is ignored.
If no data averaged over the requested interval length are available, the algorithm looks for data averaged over half the requested interval length; in that case it first applies an average processing by aggregation in order to obtain data on the requested interval, including both odd and even intervals, and then transforms them into instantaneous data (see result Out2 in the figure).
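As a sketch of this operation, assuming a native file tavg.v7d containing data averaged over one-hour intervals (file names are illustrative):

  v7d_transform --input-format=native --output-format=native --comp-stat-proc=1:254 --comp-step='0 01' tavg.v7d tinst.v7d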
This example accumulates precipitation over 24-hour intervals, starting from data accumulated over shorter intervals, and outputs the data to a BUFR file:
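A plausible command, with illustrative file names and start date; the 24-hour step is expressed in the simplified 'D hh' form:

  v7d_transform --input-format=BUFR --output-format=BUFR --comp-stat-proc=1 --comp-step='1 00' --comp-start='2012-01-01 00:00' prec_hourly.bufr prec_daily.bufr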
This example computes maximum temperatures over 24-hour intervals, starting from instantaneous data, and outputs the data to a native binary file:
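A corresponding sketch (file names illustrative):

  v7d_transform --input-format=BUFR --output-format=native --comp-stat-proc=254:2 --comp-step='1 00' temp.bufr tmax_daily.v7d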
This example takes the output of the previous example and computes the average of the maximum temperatures over monthly intervals:
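A possible invocation, assuming the one-month step is expressed through the month field of the 'YYYYMMDDDD hh:mm:ss.msc' step format (file names illustrative):

  v7d_transform --input-format=native --output-format=native --comp-stat-proc=2:0 --comp-step='0000010000 00:00:00.000' tmax_daily.v7d tmax_monthly.v7d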
Notice that in this case the volume data descriptors (variable and timerange) suggest that the dataset contains average temperatures, while these are actually averages of daily maximum temperatures; part of the information is thus lost and the user must keep track of it autonomously.
With v7d_transform it is possible to perform geographical transformations before and/or after the statistical processing.
For a complete description of the possible transformations available in libsim see the Overview of space transformations page; only a subset of the available transformations is applicable here.
Before the statistical processing, it is possible to perform a sparse-point to sparse-point transformation, which has to be specified in the form --pre-trans-type=trans-type:subtype.
The only transformation type that makes sense here is polyinter:average, which averages the data over a set of polygons provided in shapefile format with the options --coord-file= and --coord-format=shp.
The output volume will contain one pseudo-station per polygon contained in the shapefile, in the same order, with coordinates equal to those of the polygon centroid.
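A sketch of a polygon-averaging invocation (the shapefile and data file names are illustrative):

  v7d_transform --input-format=BUFR --pre-trans-type=polyinter:average --coord-file=zones.shp --coord-format=shp --output-format=BUFR obs.bufr zone_avg.bufr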
After the statistical processing, it is possible to perform a sparse-point to grid transformation, which has to be specified in the form --post-trans-type=trans-type:subtype.
The transformation types that make sense here are inter:linear, which linearly interpolates the sparse data onto a regular grid by triangulation, and boxinter:average, which computes the value at each grid point as the average of the sparse input data over the corresponding grid box.
The only output format that makes sense here is grib_api:template, where template indicates a grib file which will be used as a template for the output grib file and which also defines the grid over which the interpolation is performed.
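A final sketch combining linear interpolation to a grid with grib output; grid_template.grib stands for any grib file defining the target grid (file names are illustrative):

  v7d_transform --input-format=BUFR --post-trans-type=inter:linear --output-format=grib_api:grid_template.grib obs.bufr gridded.grib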