Program Description pwlCopula and VGen

J.Ch.Strelen

Copulas capture the entire dependence structure of multivariate distributions, not only the correlations. Together with the marginal distributions of the vector elements, a copula defines a multivariate distribution, which can be used to generate random vectors with this distribution. The MATLAB program pwlCopula implements input models with this method, for random vectors and time series. The copulas are estimated from observed samples of random vectors. The program is fast and handles random vectors of high dimension, for example 100. The generation algorithm is also implemented with Java methods in VGen.

The technique is described in the paper J.Ch.Strelen, Tools for Dependent Simulation Input with Copulas, submitted for publication. See also www.informs-sim.org/wsc07papers/058.pdf

Basically, the MATLAB program pwlCopula calculates the copula, provides statistics and diagrams for examining the quality of this model, and can generate random vectors and time series. The Java classes generate random vectors and time series using a copula which was calculated with pwlCopula. Java classes are easier to integrate into simulation models than MATLAB programs.

To estimate the copula, the MATLAB program uses a sample of independent vectors with dimension D, or a time series whose elements can themselves be vectors, with dimension D' (Ds in the program). The sample is stored in a file as follows: the first value is the dimension D or D', respectively, the second is the sample size n, and then the vectors follow one after the other, all without line feed.

The copula can be stored in a copula file (.cop). During the calculation, the program needs empirical marginal distributions. They can be stored, too (.emp).

For these calculations, the user must specify some parameters:
* K, integer, the granularity, which determines the accuracy: the higher K, the more accurate the model. We used values between 10 and 4000.
* n_by_K, integer, defines the sample size n = n_by_K * K. Thus K divides n.
* The name of the file from which the sample is read.
* The window width m, only if the copula models a time series. It determines how accurately the dependence between successive time series elements is modelled. We tried m = 2, 3, 4.

Using this copula, pwlCopula can generate random vectors or a time series, which can be stored in a file for later use in a simulation model. For the generation, the user specifies
* The random number stream
* How many vectors are to be generated
* The kind of inverse transformation
  o One method (2) generates only values which occur in the sample
  o The other (1), with linear interpolation of the empirical distribution function, also generates values in between

Moreover, the program can calculate statistics and plot diagrams for the generated random vectors or the time series, and the corresponding statistics and diagrams for the given sample. The modeller can compare them to gain insight into how good the copula model is.

The statistics comprise the means and the variances in each dimension, and the correlations between pairs of dimensions. They are calculated for the original sample on the one hand, and for the generated vectors on the other. The absolute values of the differences are taken as measures of accuracy:
* Means: we consider the absolute difference if at least one of the absolute values of the means is less than 0.00001, the relative difference otherwise.
* Variances: we consider the difference of the two coefficients of variation if both corresponding absolute values of the means are greater than 0.00001, the difference of the standard deviations otherwise.
* Correlations: we consider the difference of the two correlations if both corresponding standard deviations are greater than 0.00001, the difference of the covariances otherwise.

The greatest absolute value of these differences, the maximum statistical deviation, is a combined measure of accuracy.
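The comparison rules above can be sketched in Java as follows. This is only an illustration, not code taken from pwlCopula or VGen: the class and method names are hypothetical, the relative difference of means is assumed to be taken relative to the sample mean, the coefficient of variation is assumed to be the standard deviation divided by the absolute mean, and "both standard deviations" is read as the standard deviations of both dimensions in both data sets.

```java
/** Sketch of the deviation rules; all names here are hypothetical. */
final class StatDeviation {
    /** Threshold 0.00001 from the description. */
    static final double EPS = 0.00001;

    /** Means: absolute difference if either mean is tiny, relative otherwise.
     *  Assumption: "relative" means relative to the sample mean. */
    static double meanDeviation(double meanSample, double meanGenerated) {
        double diff = Math.abs(meanSample - meanGenerated);
        if (Math.abs(meanSample) < EPS || Math.abs(meanGenerated) < EPS) {
            return diff;
        }
        return diff / Math.abs(meanSample);
    }

    /** Variances: coefficients of variation if both means are large enough,
     *  standard deviations otherwise. Assumption: CV = sd / |mean|. */
    static double spreadDeviation(double meanSample, double sdSample,
                                  double meanGenerated, double sdGenerated) {
        if (Math.abs(meanSample) > EPS && Math.abs(meanGenerated) > EPS) {
            double cvSample = sdSample / Math.abs(meanSample);
            double cvGenerated = sdGenerated / Math.abs(meanGenerated);
            return Math.abs(cvSample - cvGenerated);
        }
        return Math.abs(sdSample - sdGenerated);
    }

    /** Correlations for one pair of dimensions: correlations if all standard
     *  deviations are large enough (assumed reading), covariances otherwise. */
    static double dependenceDeviation(double covSample, double sdSample1, double sdSample2,
                                      double covGenerated, double sdGenerated1, double sdGenerated2) {
        boolean allLarge = sdSample1 > EPS && sdSample2 > EPS
                && sdGenerated1 > EPS && sdGenerated2 > EPS;
        if (allLarge) {
            double corrSample = covSample / (sdSample1 * sdSample2);
            double corrGenerated = covGenerated / (sdGenerated1 * sdGenerated2);
            return Math.abs(corrSample - corrGenerated);
        }
        return Math.abs(covSample - covGenerated);
    }

    /** Maximum statistical deviation: the greatest of the individual deviations. */
    static double maxStatisticalDeviation(double... deviations) {
        double max = 0.0;
        for (double d : deviations) {
            max = Math.max(max, d);
        }
        return max;
    }
}
```

With these assumptions, sample and generated means of 2.0 and 2.2 give a relative deviation of about 0.1, whereas means of 0.0 and 0.5 fall under the absolute rule and give 0.5.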
If one replicates the generation process, say r times, the smallest and the greatest observed maximum statistical deviation form an (approximate) confidence interval with confidence level 1 - 0.5^(r-1).

Scatter diagrams serve for visual inspection. In each of them, the value pairs of two different elements of the vectors are plotted as points. Looking at the diagram, one gains insight into the dependence structure of these two dimensions: there may be regions with no points, where the corresponding value pairs evidently do not occur at all or occur only with small probability. In the other regions, the density of the points may vary, which indicates different probabilities of occurrence. The modeller can compare corresponding scatter diagrams of the original sample on the one hand, and of the generated vectors on the other. If the regions without points correspond, and if the visual impression of the frequencies is similar, this is a hint that the copula model is accurate.

For time series, we also calculate correlations between two vector elements in the same dimension but at different times i_1 and i_2, with lag |i_1 - i_2|. Again, these correlations are calculated for the original sample and for the generated vectors, and the absolute value of their difference is taken as a measure of accuracy. In general, these differences grow with growing lag. Therefore it makes no sense to consider only their maximum value. Instead, we provide diagrams with the differences for different lags.

The Java classes serve only for the generation of random vectors and time series; they implement the same algorithms as the corresponding part of the MATLAB program pwlCopula. They import a copula and empirical distributions which were previously calculated with pwlCopula and stored in .cop and .emp files. They are not interactive: the parameters must be passed to the Java objects via method calls, with the file name given without extension.
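The lag correlations used above to assess time series models can be sketched as follows for a single dimension. This is a minimal illustration under assumptions, not the estimator used in pwlCopula: the class and method names are hypothetical, and the correlation is taken as the plain Pearson correlation of all pairs (x[t], x[t + lag]).

```java
/** Sketch of lag correlations for one dimension; all names are hypothetical. */
final class LagCorrelation {

    /** Pearson correlation between x[t] and x[t + lag] over all valid t.
     *  Returns NaN if one of the two sub-series is constant. */
    static double correlation(double[] x, int lag) {
        int m = x.length - lag; // number of available pairs
        if (lag < 0 || m < 2) {
            throw new IllegalArgumentException("series too short for this lag");
        }
        double meanA = 0.0, meanB = 0.0;
        for (int t = 0; t < m; t++) {
            meanA += x[t];
            meanB += x[t + lag];
        }
        meanA /= m;
        meanB /= m;
        double cov = 0.0, varA = 0.0, varB = 0.0;
        for (int t = 0; t < m; t++) {
            double a = x[t] - meanA;
            double b = x[t + lag] - meanB;
            cov += a * b;
            varA += a * a;
            varB += b * b;
        }
        return cov / Math.sqrt(varA * varB);
    }

    /** Absolute differences of the lag correlations of the sample and the
     *  generated series, for lags 1..maxLag (index 0 holds lag 1). */
    static double[] deviations(double[] sample, double[] generated, int maxLag) {
        double[] result = new double[maxLag];
        for (int lag = 1; lag <= maxLag; lag++) {
            result[lag - 1] = Math.abs(correlation(sample, lag) - correlation(generated, lag));
        }
        return result;
    }
}
```

Plotting the array returned by the hypothetical deviations method against the lag gives a diagram of the kind described above.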
The Java generation is about 80 times faster than the MATLAB generation.

The class VectorGenerator contains the methods for setting up the program, buildCopula and buildEmpDistr, and for generating vectors, gen_u_vector, gen_u_ar, and gen_z. The class Zufalls_Zahlen provides univariate uniform random numbers. If the copula is for random vectors, successive calls of gen_u_vector and gen_z generate one vector with dimension D. If the copula is for time series, successive calls of gen_u_ar and gen_z generate one element of a time series, a vector with dimension D'. The first generated elements of the time series are not stationary and should be skipped.

pwlCopula and the Java programs VGen are copyright the original author and the University of Bonn, and are published here under the GNU General Public License (see http://www.fsf.org/licenses/licenses.html).