require "../../inc/base.php"; PageInfo("SAMPLER - Dokumentace", // Titulek stránky "Jirka Kosek", // Autor stránky "SAMPLER - Dokumentace", // Popis "", // Klíčová slova "cs", // Jazyk stránky "", // Alternativní jazyky "sampler.php", // Předchozí stránka "", // Následující stránka "index.php" // Stránka s obsahem ); PageHead(); ?>
SORITEC SAMPLER Version 2.01

The SORITEC SAMPLER (TM)
Version 2.01
Based on SORITEC Version 6.4.033

to accompany "Business Forecasting"
by J. Holton Wilson and Barry Keating
(c) Richard D. Irwin, Inc., 1990

Published by:
The Sorites Group, Inc.
P. O. Box 2939
Springfield, VA 22152 USA
Phone: (703)569-1400
Telex/TWX: (710)831-0339

March 1, 1990

The SORITEC SAMPLER is a subset of the SORITEC fourth-generation language for econometric analysis. SORITEC was selected by PC Magazine as its Editor's Choice for econometrics (March 14, 1989). It is the standard package for econometric computation at the World Bank, the OECD (Organization for Economic Cooperation and Development), and the British Treasury. SORITEC is currently installed at over 700 sites worldwide, and SAMPLER is used at thousands more.

The software and documentation are copyrighted, (c) 1990 Sorites Group, Inc., but permission is granted for non-commercial reproduction.

March 1, 1990 Rev. 1

SAMPLER USER REGISTRATION FORM

If you fill out this registration form and answer the questions below, we will put you on the mailing list to receive information about new releases of SORITEC SAMPLER and the SAMPLER manual.

If you would like to receive more information about SORITEC and other Sorites Group products, check one of these boxes:

( ) Send more information on the PC version only.
( ) Send more information on all versions, from mainframes to PCs.

Name: __________________________________________________________________
Title: _________________________________________________________________
Firm/Institution: ______________________________________________________
Address: _______________________________________________________________
________________________________________________________________________
City: ________________________________ State: __________________________
Country: _____________________________ Postal Code: ____________________
Date: _________________ Phone: (_____)_______________________________

What type of computer do you own or use? ______________________________

Organizational affiliation:
( ) Commercial   ( ) Government   ( ) Academic/Faculty   ( ) Academic/Student

I received SAMPLER from ________________________________________________

Please help us determine how our reputation spreads by answering these questions. Before you got the SAMPLER, were you aware that SORITEC was:

Selected as PC Magazine Editor's Choice for Econometrics (March 14, 1989)?   ( ) Yes ( ) No
Selected as the standard econometric software at the World Bank?   ( ) Yes ( ) No
Selected as the standard econometric software at the Organization for Economic Cooperation and Development (OECD, Paris)?   ( ) Yes ( ) No

When you received the SAMPLER, was that the first time you ever heard of SORITEC?   ( ) Yes ( ) No

PLEASE DUPLICATE THE SORITEC SAMPLER AND PASS IT ON.

Your comments and suggestions on SORITEC SAMPLER will be appreciated. If you would like us to send information to someone you know, please supply us with their name and complete mailing address on the reverse.

Complete and mail to:
The Sorites Group, Inc.
P.O. Box 2939
Springfield, VA 22152 USA

SORITEC TABLE OF CONTENTS

SECTION (1) PRIMER

Chapter 1 -- SORITEC SAMPLER for the New User
  1.1 Introduction
  1.2 What is SORITEC?
  1.3 Getting Started
  1.4 Hardware considerations
  1.5 Invoking SORITEC
    1.5.1 Interactive Processing
    1.5.2 Batch Processing
  1.6 Executing SAC Files

Chapter 2 -- SORITEC Syntax
  2.1 Introduction
  2.2 Variable Names
  2.3 Special Symbols
  2.4 Variable Types
  2.5 Selection of the Observation Set
  2.6 Transformations
  2.7 Revising Data in SORITEC
  2.8 Missing Data Handling
  2.9 Imputation of Missing Values
  2.10 Wildcards
  2.11 Recovering Internal SORITEC Variables
  2.12 Flags
  2.13 SORITEC's Symbol Table
  2.14 Minor Control Statements

Chapter 3 -- Data Entry and Output
  3.1 Introduction
  3.2 SORITEC Alternate Load (SAL) Files
    3.2.1 SAL File Input
    3.2.2 SAL File Output
  3.3 Data Interchange Format (DIF) Files
    3.3.1 DIF File Input
    3.3.2 DIF File Output
  3.4 Formatted Input and Output
    3.4.1 FORTRAN Formatted Input
    3.4.2 FORTRAN Formatted Output
  3.5 Keyboard Entry
  3.6 Output of Data to the Terminal
    3.6.1 Tabular Display
    3.6.2 Graphical Display
  3.7 SORITEC Databanks

February 1, 1990

Chapter 4 -- SORITEC Databank (SDB) Files
  4.1 Introduction
  4.2 Create a Databank
  4.3 Access a Databank
  4.4 Access a Databank in Read-Only Mode
  4.5 Release a Databank from SORITEC
  4.6 Delete a Databank
  4.7 Store Items in a Databank
  4.8 Retrieve Items from a Databank
  4.9 Replace Items in a Databank
  4.10 Rename Items in a Databank
  4.11 Switch the Names of Two Items in a Databank
  4.12 Discard Items from a Databank
  4.13 Generate a Directory Listing of a Databank

Chapter 5 -- Programming Constructs
  5.1 Introduction
  5.2 Numeric Looping
  5.3 Unconditional Branching
  5.4 Conditional Branching
  5.5 Null (Continuation) Statement
  5.6 Alpha Looping

Chapter 6 -- Special Generation and Transformation Commands
  6.1 Introduction
  6.2 Create a Time Trend Dummy Series
  6.3 Create Seasonal Dummies
  6.4 Recode a Variable
  6.5 Conversion of Time-Series from One Periodicity to Another
  6.6 Maximum Function
  6.7 Minimum Function
  6.8 Modular Division
  6.9 Compute Moving Average
  6.10 Compute Moving Sum
  6.11 Statistical Operations
    6.10.1 Correlation Matrix Calculation
    6.10.2 Covariance Matrix Calculation
    6.10.3 Other Statistical Operations

Chapter 7 -- SORITEC Financial Functions
  7.1 Financial Functions in SORITEC
  7.2 Internal Rate of Return
  7.3 Present Value
  7.4 Loan Amortization

Chapter 8 -- Cross-Section Techniques
  8.1 Introduction
  8.2 Synopsis
  8.3 Crosstabulation Analysis
  8.4 Frequency Analysis

Chapter 9 -- Estimation and Forecasting
  9.1 Introduction
  9.2 Ordinary Least Squares (OLS) Estimation
  9.3 Autocorrelation Techniques for the Single Equation Model
    9.3.1 Cochrane-Orcutt Iterative Technique
    9.3.2 Hildreth-Lu Scanning Technique
  9.4 Two-Stage Least Squares (2SLS) Estimation
  9.5 Forecasting Single Equation Models

Chapter 10 -- SORITEC Interactive Print Server
  10.1 Introduction
  10.2 Entering Interactive Mode
  10.3 Tableau Descriptions
    10.3.1 Coefficient Display (E)
    10.3.2 Regression Summary Table (G)
    10.3.3 Regression ANOVA Table (A)
    10.3.4 Beta Coefficients, Elasticities and Partial R (B)
    10.3.5 Correlation Matrix of Coefficient Estimates (C)
    10.3.6 PDF and Histogram of Standardized Residuals (H)
    10.3.7 Convergence Path for Autocorrelated Estimators (M)
    10.3.8 Non-Parametric Residual Distribution Tests (N)
    10.3.9 Actual vs Fitted Plot and Standardized Residuals (P)
    10.3.10 Residual Autocorrelation Summary (R)
    10.3.11 Statistical Summary of Exogenous Variables (S)
    10.3.12 Covariance Matrix of Coefficient Estimates (V)
    10.3.13 Exogenous Variables List
  10.4 Interactive Crosstabs

Chapter 11 -- Simulation
  11.1 Introduction
  11.2 Defining the Equations of Your Model
  11.3 Combining the Equations into a Superformula
  11.4 Simulating a Model Using the Gauss-Seidel Algorithm
  11.5 Putting Add Factors into Equations
  11.6 Comparing Scenarios

Chapter 12 -- Forecasting with Time Series Techniques
  12.1 Introduction
  12.2 Identification of a Time Series Model
  12.3 Estimation and Forecasting of Time Series Models

Chapter 13 -- Forecasting with Smoothing Techniques
  13.1 Introduction
  13.2 Moving Average of a Time Series
  13.3 Exponential Smoothing

SECTION (2) Survey articles from the SORITEC Reference Manual

This section contains:

  FLAGS(2)   : Global Option Settings
  RECOVER(2) : Recoverable Results

These articles contain more information about the operation of certain commands than the Primer does. However, some information is relevant only to the full SORITEC language.

SECTION (3) Command summaries from the SORITEC Reference Manual

This section contains:

  BUILD(3)    : Build a SORITEC Simulation Model
  COMPUTE(3)  : Transformations of Time Series Data
  FORECAST(3) : Basic Forecasting Command
  MARMA(3)    : Rational Distributed Lag Estimation
  REVISE(3)   : Revision and Splicing of Data
  SMOOTH(3)   : Exponential Smoothing
  SUPERF(3)   : Superformula Construction (Gauss-Seidel Method)

These articles contain more information about the operation of certain commands than the Primer does. However, some information is relevant only to the full SORITEC language.

SECTIONS (4) through (7) from the SORITEC Reference Manual are not relevant to the use of the SAMPLER.

SECTION (8) Index and references from the SORITEC Reference Manual

This section contains:

  INDEX(8) : Index

PRIMER -- Chapter 1 -- SORITEC SAMPLER for the New User

1.1 Introduction

The SORITEC SAMPLER is a freely reproducible subset of the SORITEC system. It makes high-quality, tested, reliable econometric computation readily available to students, and to organizations that need to expose large numbers of analysts to econometric methods, without the expense associated with the purchase of commercial software. The SAMPLER fulfills the needs of many analysts at low or no cost, and provides an upgrade path to a more extensive and powerful system, the SORITEC language.

SORITEC is a fourth-generation language for econometric analysis, developed by the Sorites Group, Inc. (SGI) of Springfield, Virginia, USA. SORITEC is now supported on about thirty different mainframes, minicomputers and microcomputers. SORITEC jobs are portable; all versions of SORITEC use the same reference manual. The system's command syntax is identical on all machines.

The first mainframe release of SORITEC was in December 1978, on CDC Cyber equipment. Until 1981, SORITEC was available only on time-sharing networks. In 1981, in-house copy leasing for mainframes and minicomputers began. In the spring of 1984, the first version of SORITEC for the IBM PC was released. Unlike most econometric packages for microcomputers, the PC version of SORITEC has the full functionality of the mainframe system. In early 1985, a reduced version of the system, the SORITEC SAMPLER, was released as a freely duplicable package for the IBM PC only.

This Primer is intended to serve both as an introduction to SORITEC and as the main documentation for the SORITEC SAMPLER. Accordingly, only those commands common to both SORITEC and the SORITEC SAMPLER are discussed in this document. Section 2 of the SORITEC Reference Manual contains more in-depth discussions of the topics covered here, and also describes the full SORITEC command set.

SAMPLER users, please note: Many software companies provide only a demo disk, which can handle only a few records, or does not allow you to save your work, or cannot write to hard disk, or is deliberately crippled in some other way. They are determined not to provide any real value to you for free. If you like the SAMPLER, and want to ensure that it will be continued and improved over time, we ask that you make at least three copies and send them to (preferably far-flung) colleagues. This is one of our main advertising vehicles. You defeat the purpose and diminish our incentive if you don't "pass it on".

1.2 What is SORITEC?

SORITEC is a sophisticated econometric modeling and forecasting system that can be used to estimate, solve or simulate a broad range of econometric and statistical models. The program supports econometric time-series analysis within an easy-to-use command language syntax. SORITEC can support models with hundreds of equations, either linear or non-linear, in either a static or dynamic framework. Models can be specified, built, rearranged, manipulated by name, or kept on a databank. Once a model is constructed, it can be simulated by a single command. SORITEC provides a report writer capable of producing detailed and complex reports with minimal effort and training. SORITEC is also a complete data processing language that permits varied and complex data reduction operations.

The SAMPLER supports a very large subset of the SORITEC commands. Its databanking facilities are upward-compatible with the full SORITEC language. If you decide some day to move up to the full SORITEC product, as we hope that you will, all the work you have done on SAMPLER will transfer with absolutely no conversion effort.

In this document, when we refer to "SORITEC", you may generally assume that the statement applies to both the full product and to the SORITEC SAMPLER, unless we say otherwise.

1.3 Getting Started

Because SORITEC is available on such a variety of machines, there is no standard installation procedure applicable to all environments. Before you start, you should be familiar with the computer and operating system where you plan to use the SORITEC system. If you are going to work on a mainframe or minicomputer, consult your system manager and find out about any local variations. If you are using the full SORITEC product, you will find any machine-dependent information in Section 7 of the SORITEC Reference Manual.

If you are installing SORITEC, you will have received a set of Installation Notes that will give you precise instructions about what to do. If you are installing SAMPLER, there is a README file on the documentation diskette (diskette #3) that gives all the necessary installation information.

1.4 Hardware considerations

SORITEC does not require any particular hardware configuration or special equipment. However, since SORITEC tasks usually involve large amounts of floating-point arithmetic, we highly recommend using a floating-point processor (FPP, known as a "math co-processor" on PCs) on any system where one is optional. Many UNIX workstations support FPPs only as an add-on option. If you use an FPP, SORITEC will run much faster. Minicomputers and mainframes almost invariably have FPPs as part of their basic configuration, so the question does not arise.

PC versions of SORITEC and SORITEC SAMPLER require a hard disk and 512K of RAM. 640K of RAM is strongly recommended for the full SORITEC system. An FPP (i.e., "math co-processor") is not required for operation of either SORITEC or SORITEC SAMPLER on IBM PC-compatible machines, although it is highly recommended. Any other system requirements for proper operation of the full SORITEC system on other types of computers can be found in the machine-specific articles in Section 7 of the SORITEC Reference Manual.

1.5 Invoking SORITEC

SORITEC is both an interactive and a batch processor. However, before describing how each mode is invoked, it is important to distinguish SORITEC interactive and batch processing modes from the foreground and background processing modes that are typically associated with these terms. When SORITEC is in interactive mode, the program takes each line of input and processes it as it is received. In batch processing mode, on the other hand, SORITEC accepts input lines until they are logically concluded with an END statement.
At that point, batch job execution begins. Note that SORITEC interactive and batch modes can run in both foreground and background processing environments.

Batch job processing in SORITEC has certain characteristics that sometimes make it more convenient to use than interactive mode. First, it compiles a complete listing of the commands of a job and outputs it without line prompts to the output device before execution begins. The command lines are thereby separated from the output for more presentable reports. Second, batch processing mode provides for the labeling of the job and the insertion of titles into the output listing. Batch processing mode is often useful when output is too wide to be displayed legibly on the terminal. Through redirection and respecification of the output width, output that would otherwise be difficult to read on a terminal can be routed to other output devices, such as line printers. Although most of these features can be replicated in interactive mode, they are generally more convenient to use in a batch job.

1.5.1 Interactive Processing

On the initial banner page, SORITEC prints version information, date and time, default settings for input (SCAN) and output (WIDTH) device size, and workspace size.

SORITEC will open an input journal file in the current directory. Its name will depend upon the installation type. Please consult your system manager. SAMPLER users running from floppy diskettes must take special care not to run the SAMPLER from the A: drive, because SAMPLER will attempt to write a journal file on the A: diskette, which may fail due to write-protection of the diskette. Even when a non-protected copy of the SAMPLER is used, it is generally undesirable to have anything written on the program diskettes in the A: drive.

The journal file stores all commands that are entered during a session so that you can archive the command sequence for future use. The file can later be executed as a SAC file.
Journal files can be edited and re-executed to produce a "final draft" of a particular statistical or estimation problem. Depending upon your machine, any journal file that exists in the current directory may be automatically erased when a new journal file is written. Be sure to rename any journal files you wish to keep. If you wish to execute the journal file as a command file, then you must change the filename extension to ".sac".

SORITEC prompts you for input after printing the banner and the message indicating the name of the journal file. Prompts in SORITEC are of the form 1--, 2--, 3--, and so on. When you see the first prompt, 1--, interactive processing mode has been selected by default, and you may begin entering commands. In general, SORITEC allows you to enter any legal command at any time. There is no particular order in which tasks must be done.

Interactive processing is terminated by entering the command:

    QUIT

The QUIT command closes and returns any files that are currently attached and returns control to the operating system. All items in the workspace are discarded when the QUIT command is executed.

1.5.2 Batch Processing

SORITEC batch job processing is initiated by the JOB command. The JOB command allows for a label of up to 120 characters. Its format is as follows:

    JOB [job_label]

The JOB command supplies a label for the entire batch run. As such, only one JOB command may appear in any single job deck. Batch processing is terminated by the END command, which has no arguments, i.e.:

    END

At the end of a JOB, the END command causes SORITEC to return and close any databanks or other files which are attached. The workspace is discarded after the END statement is processed.

Besides being used to indicate the end of a batch job, the END command is used to terminate SORITEC SAL files, and to close procedures, DOT loops, and some types of DO loops.
SORITEC keeps track of END statements when compiling batch job statements and takes an END statement to indicate the end of a job only after END commands have been processed to close each DO loop, DOT loop, procedure, etc. Descriptions of these other commands that use the END command are provided later in this documentation.

A label of up to 120 characters can be specified using the TITLE command. The format of the command is as follows:

    TITLE [label]

The label will appear on the third line of each output page. A TITLE command with no argument causes the third line of succeeding pages to be blank. As many TITLE commands as needed can be placed in a job. They are executed as they are encountered in the job stream and label all succeeding pages until another TITLE command is executed.

1.6 Executing SAC Files

SORITEC accepts input both from the terminal and from SORITEC Alternate Command files (SAC files). The EXECUTE command causes SORITEC to read commands from a SAC file, execute them, and then return control to the terminal. A SAC file is simply a file that contains legal SORITEC commands. It may be structured as a batch job for SORITEC's batch processing facility or may simply be a set of commands as you would enter them from the terminal.

For SORITEC to recognize it as a SAC file, the filename must have the appropriate file type extension. This extension varies from machine to machine. On most UNIX systems, and on IBM PCs, the extension used is '.sac'. If you are using any other system, please consult Section 7 of the SORITEC Reference Manual to determine the extension used on your hardware. When you execute a SAC file, your current directory is searched for a file with the file name specified and the SAC file extension required by the computer you are using.

You can construct a SAC file using a text editor, or any commercially available word processor.
Be sure, when constructing a SAC file, that the editor or word processor used writes out your SAC file as an ASCII file, or an ordinary text file. Many word processors have two modes of operation. One mode is an internal mode, which is designed only to be read back into the word processor. The other mode is an external mode, which is designed to write a file that can be used by other applications. In general, the internal mode may contain formatting information that is not recognized by SORITEC, and may provoke SORITEC error messages. Also, internal mode files often contain garbage after the end of the file that the word processor knows not to use, but which will interfere with the proper functioning of SORITEC and other applications software. If you have trouble using files in SORITEC which have been created with a word processor, try adding a blank line at the end of the file.

SORITEC will execute command files at any point in an interactive processing session. Command file processing is started by an EXECUTE command, i.e.:

    EXECUTE 'filename'

where "filename" is the name of the command file you wish to have executed. You should always put file names in single quotes, although in many cases this is not absolutely required. Do not enter the file extension with the filename on the EXECUTE command line. In other words, this is legal:

    EXECUTE 'myfile'

but this is not:

    EXECUTE 'myfile.sac'

If the SAC file exists on a drive or directory other than the current one, its full name must be enclosed in single quotes. For example, on a UNIX workstation:

    EXECUTE '/io4way/sorites/electric'

would execute commands from the file /io4way/sorites/electric.sac. On the IBM PC, the following command forms might be used:

    EXECUTE 'd:filename'
        -or-
    EXECUTE '\path\filename'

Command file output is always displayed on the terminal unless it has been redirected to another output device.
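
The file-naming rule for EXECUTE can be summarized in a few lines. The sketch below is written in Python purely as an illustration and is not part of SORITEC; the function name sac_path and its default extension argument are assumptions made for this example. It models only the rule stated above: the name is given without its extension, and the machine's SAC extension ('.sac' on most UNIX systems and on IBM PCs) is supplied automatically.

```python
def sac_path(name, extension=".sac"):
    """Return the file that EXECUTE 'name' would look for under the rule above.

    Hypothetical helper: SORITEC performs this lookup internally.
    """
    # The manual says not to type the extension on the EXECUTE command line.
    if name.lower().endswith(extension.lower()):
        raise ValueError("omit the %s extension: %r" % (extension, name))
    # The extension for the current machine is appended to the given name.
    return name + extension


print(sac_path("myfile"))
print(sac_path("/io4way/sorites/electric"))
```

The same function accepts the PC forms shown above, such as 'd:filename', because it treats the drive or path prefix as part of the name.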

PRIMER -- Chapter 2 -- SORITEC Syntax

2.1 Introduction

SORITEC syntax has been constructed to make the entire package easy to learn and use. Before considering the command structure, we need to examine the form of a SORITEC variable name.

The most important fact to keep in mind when you are using SORITEC is that the language is oriented toward handling time series of numbers rather than toward handling individual data points. In FORTRAN or BASIC, the statement X=Y sets the value of a variable X to the single numerical value currently stored in variable Y. In SORITEC, X=Y replaces the entire time series X with the time series Y. In SORITEC, variables having single, or "scalar", values are more the exception than the rule.

2.2 Variable Names

SORITEC variable names are composed of the letters A-Z, the digits 0-9, and the symbols @, %, ^, _, and :. The name MUST begin with a letter and must be no more than 32 characters (or symbols) long. Mathematical operators may not be used in variable names.

SORITEC allows you to indicate leading or lagging of variables by specifying the lead or lag in curly brackets following the name of the time series. For example, GNP{1} and GNP{-1} are the first lead and the first lag of GNP, respectively. In commands that expect multiple arguments, SORITEC will accept ranged values of leads and lags, e.g., GNP{2 TO -3} is automatically expanded to GNP{2} GNP{1} GNP GNP{-1} GNP{-2} GNP{-3}. Note that lead arguments may not carry an explicit plus sign. Arguments for leads, lags or ranges thereof may be integer constants or SORITEC variables.

Previously, parentheses were used to indicate leads or lags, and they may still be used to do so. From now on, though, curly brackets will be the preferred way to write leads and lags. This change is being made because there are some ambiguities between function references and lead/lag references that arise because both have, in the past, been indicated using parentheses.
(However, parentheses will always remain an acceptable way to indicate leads and lags.)

2.3 Special Symbols

SORITEC defines several special symbols to provide a simplifying shorthand in using the package. The semi-colon [;], exclamation mark [!], comma [,], equal sign [=], plus sign [+], minus sign [-], asterisk [*], slash [/], period [.], left angle-bracket [<], right angle-bracket [>], left parenthesis [(], right parenthesis [)], question mark [?], ampersand [&], and ellipsis [...] have special meaning in SORITEC.

A semi-colon is used to delimit commands when several are stacked on a single line, e.g.,

    USE 1984M1 1984M6 ; PRINT GNP

An exclamation mark in column 1 is used to add comments within a SORITEC command sequence. The exclamation mark functions as a comment identifier only if placed in column 1.

A comma is used only as an argument separator and is interpreted as a blank everywhere except in a format statement and within parentheses or square brackets (to separate arguments or subscripts).

The symbols + - * / . < > and = are used for mathematical operations.

The left and right parentheses, ( and ), are reserved for designating command modifiers, arguments, and subscripts. They can also be used as the less preferred way of indicating lead and lag operations on time series.

The left and right curly brackets, { and }, are reserved to indicate lead and lag operations on time series.

The asterisk, *, and question mark, ?, are used as wildcard references, described later in this chapter.

Lastly, the ampersand, &, and ellipsis, ..., indicate that the current command continues onto the next line.

2.4 Variable Types

There are many types of variables in the SORITEC language. Time series variables are the default data type in SORITEC. This means that, unless a statement contains some indication to the contrary, a reference to variable X implicitly references the time series X.
Variable assignment implicitly assumes the variable is a time series unless you state otherwise, or there is some other indication to the contrary. (For instance, if you used a matrix function in the statement, then SORITEC would assume instead that matrices, rather than time series, are being manipulated.) So, if X has not been previously defined as some other data type, a simple statement such as X=2 creates a time series of numbers, all of which are equal to 2, NOT a single value.

The next most common data types are parameters and constants. Both are scalars, i.e. they have one value, not a series of values changing over time. Parameter values can be changed by the nonlinear estimation commands, but constants cannot. Parameters and constants are created by the PARAMETER and CONSTANT commands. For example, the following commands would create parameters ALPHA, BETA, and GAMMA, and a constant LAMBDA. ALPHA and GAMMA would be given values of 1.0 and 0.5, respectively, while BETA and LAMBDA would be given a value of zero.

    PARAMETER ALPHA 1.0 BETA GAMMA 0.5
    CONSTANT LAMBDA

Constants and parameters can be set or reset using any standard transformation by prefixing the transformation with the SET command, for example:

    CONSTANT A .5 B .3
    SET A = A**0.5 * LOG(B)

Although you may find few uses for parameters that cannot be equally satisfied with constants, it is recommended that you always use parameters in equations for "estimated" coefficients, i.e. ones that have either been estimated elsewhere, or which may need to be estimated in the future when you expand your model.

SORITEC also defines vector and matrix data types. These types are created by using the VECTOR or MATRIX commands, respectively. For example, the following command would create a vector V with values 2, 4, and 6:

    VECTOR V 2 4 6

SORITEC keeps track of the length of the vector when it is created.
Individual elements of a vector can be manipulated like scalar values in SORITEC commands using subscript notation. For example:

    SET S = V(1) + V(3)

would set the value of the scalar S to 8.

SORITEC also allows you to name and manipulate formulas as a separate data construct using the EQUATION command. For example, the following command would define an equation EQ_Y whose formula is Y=ALPHA+BETA*X:

    EQUATION EQ_Y Y = ALPHA + BETA*X

Formulas are structured much as they are in FORTRAN, and in most other computer languages. There are rules as to which operations in a complex formula get done first, and so forth, and these will be discussed later in this chapter, in the section on "Transformations".

Equations can be stored in databanks and can be computed by name, once values have been assigned to their constants, parameters and variables. Use the COMPUTE command to recompute values for the left-hand side variables. For example:

    USE 1 7
    SERIES B 1 2 3 4 5 6 7
    EQUATION MOVING A = B + .25*B{-1} + .5*B{-2} + .25*B{-3}
    USE 4 7
    COMPUTE MOVING
    EQUATION SIMPLE G = H - J
    CONSTANT H 17 J 63
    SET SIMPLE
    PRINT A G

    Constant G = -46.00000

    A
    ................
    4 .    6.00000
    5 .    8.00000
    6 .    10.0000
    7 .    12.0000

The primary uses of equations in SORITEC are for forecasting, simulation, and non-linear estimation.

The group data type allows you to define one name as the set of several other names. It is very useful in repetitive commands (see the DOT command, for instance), and for avoiding re-typing of names which are commonly used together. A group is defined by the GROUP command. For example, to create a group G consisting of X, Y, and Z, the following command would be used:

    GROUP G X Y Z

To then use the group, group expansion must first be enabled via the ON GROUP command. The group name will then be replaced by the individual names in the group, wherever the group name occurs.
This avoids the need to type the same set of names repeatedly. For example, the following commands greatly simplify testing the inclusion of variables in a regression equation.

     GROUP BASIC_VARIABLES GNP M1 TAXES GOV_EXP PRIME
     ON GROUP
     REGRESS DEFICIT BASIC_VARIABLES PARTY
     REGRESS DEFICIT BASIC_VARIABLES TIME
     ... etc

You can also refer to an individual element within a GROUP by the number of its position in the group. For example, you could refer to "BASIC_VARIABLES(2)" in place of M1 in the example given above. Referring to individual group elements by index number is particularly useful in DO loops.

2.5 Selection of the Observation Set

Periodicity and length of data series are defined by the USE command in SORITEC. The USE period defined by this command is active in all subsequent SORITEC commands until explicitly changed by another USE. For example, the following command would specify a period that begins in 1980 and continues through 1989:

     USE 1980 1989

Data need not be continuous over the range of observations, but instead may consist of a series of intervals. For example, the following command specifies a USE period that runs from the first quarter of 1982 through the second quarter of 1983, from the fourth quarter of 1983 through the third quarter of 1985, and from the fourth quarter of 1986 through the fourth quarter of 1989:

     USE 1982Q1 1983Q2 1983Q4 1985Q3 1986Q4 1989Q4

USE requires zero, one or an even number of arguments which may be positive integers, constants, parameters or a vector. Each pair of arguments defines a range of observations within the overall observation range. The second argument of each pair must not be less than the first. If no arguments are included in the command line, SORITEC returns the currently active USE period. If only one argument is included in the command line, the USE period is set to that one period.
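
The argument rules for USE can be sketched in a few lines of Python. This is only an illustration of the rules as stated above, not SORITEC code: the function name parse_use_args is hypothetical, and periods are simplified to plain integers (as for annual or undated data).

```python
def parse_use_args(args):
    """Turn USE-style arguments into a list of (first, last) ranges.

    Hypothetical helper: periods are plain integers here, which is a
    simplification of SORITEC's dated periods such as 1982Q1.
    """
    if len(args) == 0:
        # No arguments: the caller should report the current USE period.
        return None
    if len(args) == 1:
        # A single argument selects that one period.
        return [(args[0], args[0])]
    if len(args) % 2 != 0:
        raise ValueError("USE needs zero, one, or an even number of arguments")
    ranges = []
    for first, last in zip(args[0::2], args[1::2]):
        if last < first:
            raise ValueError(f"range end {last} is less than range start {first}")
        ranges.append((first, last))
    return ranges


print(parse_use_args([1980, 1989]))
print(parse_use_args([1980]))
print(parse_use_args([1, 5, 8, 12]))
```

Under these assumptions a single argument yields a one-period range, which mirrors the statement that USE 1980 is equivalent to USE 1980 1980.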
SORITEC allows you to define annual, semi-annual, quarterly, monthly, ten day, weekly, daily, business daily (Monday-Friday) and undated data types. Periodicity of time-series data is defined by appending an appropriate suffix to the data year, as shown below:

     PERIODICITY       SUFFIX    RANGE(x)
     -----------       ------    --------
     Annual            none      ---
     Semi-annual       Sx        [1,2]
     Quarterly         Qx        [1,4]
     Monthly           Mx        [1,12]
     Ten Day           Tx        [1,37]
     Weekly            Wx        [1,52]
     Daily             Dx        [1,366] or [0101,1231]
     Business Daily    Bx        [1,366] or [0101,1231]
     Undated           none      ---

The permissible range of years for dated data types (other than annual) is from 1801 through 2099. Note that Ten Day data consists of first and second ten-day periods of the month, and a remaining period of 8 to 11 days. Weekly data span Sunday through Saturday.

SORITEC will convert data series from one periodicity to another, but certain restrictions apply. Data conversion is discussed in CONVERT(3) of the SORITEC Reference Manual.

The following are examples of USE commands:

     USE 1980Q1 1984Q4
     USE 1942M12 1955M6

Note that the command USE 1980 is equivalent to USE 1980 1980.

SORITEC permits conditional selection of the sample period based on a logical variable. For example, USEIF Z resets the USE period to select only the observations corresponding to non-zero entries of Z. To select all individuals with income between $12,000 and $24,500, the following commands could be used:

     NEW_SAMPLE = INCOME > 12000 & INCOME < 24500
     USEIF NEW_SAMPLE

2.6 Transformations

The COMPUTE command is the basic SORITEC transformation command. The command line consists of the COMPUTE command name followed by either an EQUATION name or any legal SORITEC transformation expression.
For example:

     EQUATION EQ1 Y = X/Z
     COMPUTE EQ1
     COMPUTE C = A + B

In either case, the COMPUTE command name may be omitted:

     EQ1
     C = A + B

Transformations are straightforward in SORITEC since syntax considerations conform to standard algebraic notation. Here are the legal SORITEC operators:

     ARITHMETIC OPERATORS     LOGICAL OPERATORS
     --------------------     ---------------------------------------
     +   addition             .eq.                  equal
     -   subtraction          .ne. or <> or ><      not-equal
     *   multiplication       .ge. or >= or =>      greater-or-equal
     /   division             .le. or <= or =<      less-than-or-equal
     **  exponentiation       .gt. or >             greater-than
                              .lt. or <             less-than
                              .not. or ~            negation
                              .and. or &            conjunction
                              .or. or |             alternation

The transformations common to both SORITEC and SAMPLER include the following:

     LOG      Natural Logarithm        LOG10   Logarithm Base 10
     EXP      Exponentiate (Base e)    ABS     Absolute Value
     CEILING  Next Largest Integer     FLOOR   Next Smallest Integer
     ROUND    Round to Nearest Integer SIGN    Extract Sign (+1,0,-1)
     SQRT     Square Root              TRUNC   Truncate Fractional Part
     SIN      Sine                     ASIN    Arcsine
     COS      Cosine                   ACOS    Arccosine
     TAN      Tangent                  ATAN    Arctangent
     SINH     Hyperbolic Sine          ASINH   Hyperbolic Arcsine
     COSH     Hyperbolic Cosine        ACOSH   Hyperbolic Arccosine
     TANH     Hyperbolic Tangent       ATANH   Hyperbolic Arctangent

Arguments associated with these functions must be enclosed in parentheses.

Use of operators in SORITEC transformations must conform to the following conventions:

     (1) Two operators (+, -, .and., .or., etc.) cannot occur in sequence unless separated by one or more left parentheses.

     (2) The number of left and right parentheses must be equal.

     (3) The mathematical operators *, /, and ** cannot occur immediately after a left parenthesis.

     (4) An operator cannot occur immediately before a right parenthesis.

Transformations are parsed according to standard programming conventions.
Therefore, subformulae in parentheses are evaluated first, followed by all function evaluations, then all "**" operations, then all "*" and "/" operations, and lastly all "+" and "-" operations. Logical operators are evaluated after parentheses and mathematical operators. Within this group, mathematical comparisons (.eq., .ne. or <> or ><, .ge. or >= or =>, .le. or <= or =<, .gt. or >, .lt. or <) are evaluated first, followed by logical negation (.not.), and lastly by .and. and .or.. When in doubt about the order of evaluation, use parentheses to avoid errors.

Note that you can combine mathematical and logical operations in a single transformation. This allows complex conditional structures to be embedded directly into equations and expressions in a highly flexible manner. For example, the expression

     log(x)*(b.gt.1)+x*(b.le.1)

is legal in SORITEC. The logical portion of the expression is merely evaluated to 1 or 0 and then used in the computation.

2.7 Revising Data in SORITEC

Data series may be extended or revised easily in SORITEC using the REVISE command and the USE command. The format of the command is the same as that of the COMPUTE command, except that "REVISE" replaces "COMPUTE". A data item being revised must have been previously defined in SORITEC. The command cannot be used to initialize a variable.

REVISE updates a variable by temporarily deactivating values for the variable that lie outside the range of the currently active USE period. In other words, to update a data series you must first define with the USE command the observations of the series that you wish to revise before changing the data with the REVISE command. For example, revision of the third observation of an undated series OLD_DATA, defined below, requires the following commands to generate the series on the right:

                                              OLD_DATA
                                          ................
     SERIES OLD_DATA 1 2 3 4 5                    .
     USE 3                                   1    .    1.00000
     REVISE OLD_DATA=3.5                     2    .    2.00000
     USE 1 5                                 3    .    3.50000
     PRINT OLD_DATA                          4    .    4.00000
                                             5    .    5.00000

Since any legal transformation is permitted as an argument, the right hand side of the equation can be a constant, time-series or other valid SORITEC expression. Revision of the third and fourth observations of the original OLD_DATA, for example, requires the following commands to produce the output on the right:

                                              OLD_DATA
     USE 3 4                              ................
     SERIES NEW_DATA 4 5                          .
     REVISE OLD_DATA = NEW_DATA - 1.5        1    .    1.00000
     USE 1 5                                 2    .    2.00000
     PRINT OLD_DATA                          3    .    2.50000
                                             4    .    3.50000
                                             5    .    5.00000

Extending a data series by one or more observations simply requires redefining the USE period to the period you wish to update and revising the data as before. For example, the output on the right is produced by the following commands:

                                              OLD_DATA
     USE 6 6                              ................
     REVISE OLD_DATA = 6                          .
     USE 1 6                                 1    .    1.00000
     PRINT OLD_DATA                          2    .    2.00000
                                             3    .    3.00000
                                             4    .    4.00000
                                             5    .    5.00000
                                             6    .    6.00000

A similar procedure is used when splicing two series together. For example, the command sequence on the left splices observations 6 through 10 of NEW_DATA to the original five observations of OLD_DATA:

                                              OLD_DATA
     USE 6 10                             ................
     SERIES NEW_DATA 6 7 8 9 10                   .
     REVISE OLD_DATA = NEW_DATA              1    .    1.00000
     USE 1 10                                2    .    2.00000
     PRINT OLD_DATA                          3    .    3.00000
                                             4    .    4.00000
                                             5    .    5.00000
                                             6    .    6.00000
                                             7    .    7.00000
                                             8    .    8.00000
                                             9    .    9.00000
                                            10    .    10.0000

Data revision can also be automatically implemented through the COMPUTE and SERIES commands by enabling the ON REVISE global option. Values for data in the currently active USE period are overwritten when these commands are executed, but values outside the USE period are retained, until an OFF REVISE command is encountered.

2.8 Missing Data Handling

In general, SORITEC does not do casewise or any other type of deletion when it encounters missing data. An exception to this rule occurs in the cross-sectional procedures.
Here, categorical techniques treat missing data as a separate category while the command SYNOPSIS, non-parametric and other statistical techniques ignore missing values.

Several methods for handling missing data are available:

     (1) SORITEC generates a value MISSING in transformations that involve missing data, except when missing data are multiplied by zero. Here, a zero value for the transformation results.

     (2) The PUNCH command generates the word "MISSING" for each missing value.

     (3) The READ command recognizes the words "MISSING" and "NA" in input data.

     (4) The MISSING command assigns a missing value to a SORITEC constant.

     (5) The LEGAL function scans a data item for missing values.

SORITEC constants can be assigned missing values with the MISSING command. For example:

     MISSING constant_name

The argument "constant_name" is defined to be a SORITEC constant with the value MISSING assigned. Only one argument is permitted in the command line. Regardless of its prior type, the argument is always redefined as a SORITEC constant.

MISSING cannot assign a missing value to any other variable type. You can, however, assign missing values to other variable types using the REVISE command, as the following example shows.

     The commands:                        yield:

     USE 1 3                                   SERIES
     SERIES SERIES 1 2 3                   .............
     USE 3                                   1    .    1.00000
     MISSING X                               2    .    2.00000
     REVISE SERIES=X                         3    .    MISSING
     USE 1 3
     PRINT SERIES

The LEGAL function returns the value 1 if a data item is not MISSING and zero otherwise. This enables easy conversion of MISSING values to another value.

2.9 Imputation of Missing Values

SORITEC provides four options for replacing missing values. Missing values may be substituted by zero, the series mean, the interpolated value or the trend forecast. The option is set globally by the IMPUTE command, i.e.,

     IMPUTE [ZERO|MEAN|INTER|TREND|NONE]

Normal missing value processing is resumed when the NONE option is executed.
Entering the command IMPUTE with no arguments returns the option currently in effect. The details of each option are as follows:

To substitute zero for each missing observation:

     IMPUTE ZERO

To replace each missing observation with the mean of the series during the current USE period:

     IMPUTE MEAN

To interpolate the range between the last two known non-missing values over the missing observations:

     IMPUTE INTER

To fill in missing values with the simple trend forecast for the series over the current USE period:

     IMPUTE TREND

To stop implicit imputation of missing values:

     IMPUTE NONE

2.10 Wildcards

SORITEC recognizes the "*" and "?" symbols as wildcard characters. The wildcarding scheme is a simple way to reduce the time spent typing and viewing output (e.g. from the SYMBOLS command, described later in this chapter).

The rules for wildcard construction are simple. An asterisk represents zero or more alphanumeric characters and a question mark represents any single character. Commands that permit wildcards match all the names in the workspace against the wildcard pattern and expand the command line appropriately.

The following examples demonstrate wildcard processing in SORITEC. Assume that the workspace contains the variables X, XY, XXY, BBYB, BB, ABXYZ, and ABXY. Then:

     THESE WILDCARDS:     WOULD REFERENCE THESE ITEMS:

     *                    X, XY, XXY, BBYB, BB, ABXYZ, ABXY
     ?                    X
     B*                   BBYB, BB
     *B*                  BBYB, BB, ABXYZ, ABXY
     ?B??                 BBYB, ABXY

2.11 Recovering Internal SORITEC Variables

The RECOVER command allows you to access and manipulate secondary results which have been generated and stored under internal names by SORITEC commands. Either one or two arguments are associated with the command, which has the syntax:

     RECOVER [name] internal_name

The "internal_name" is an internal name which identifies which secondary result to RECOVER from SORITEC for later use. Legal names of secondary results that can be recovered are given in RECOVER(2).
The first argument, "name", is optional and is a name assigned to the recovered item. If omitted, the recovered name is identical to the internal name.

In addition to the RECOVER command, SORITEC allows you to directly reference internal names by prefixing a caret (^) to the variable name. For example, either of the following commands would recover the fitted values of the dependent variable and copy them into the variable FITTED_VALUES:

     RECOVER FITTED_VALUES YFIT
     FITTED_VALUES = ^YFIT

SORITEC internal names can be referenced directly in most situations. For example, parameters and time-series variables that are internal names can be reassigned with the SET command and can be referenced in transformation operations. Equations, matrices, vectors and GROUPS can also be referenced. However, reassignment still requires the use of the RECOVER command. Internal names cannot be saved to a databank without being reassigned to another variable.

SORITEC will not confuse its own internal names with variables or other identically-named data items in your program. The type of the first argument (variable, vector, constant, or other SORITEC form of data organization) is automatically defined or redefined to the type required by the second argument.

Secondary results need not be recovered immediately. All such results remain available until a command is executed which stores other results under the same internal name. In that event, the prior results held under that internal name are lost. Note that the raw forecasting equation, ^RAWEQ, is retained only if the ON RAWEQ command has been issued.

2.12 Flags

Several global "flags" are available to control the amount of printing, depth of analysis, etc. These options are enabled and disabled by the ON and OFF commands. For example, the command

     ON PLOT

will cause residual plots to be produced when an equation is estimated.
A complete list of available options with current settings will be displayed by SORITEC if an ON command is entered with no arguments. Global options in SORITEC, with their default settings, are described in FLAGS(2).

After every ON or OFF command which changes an option, an internal variable called ^FLAGS is stored as a vector. ^FLAGS contains information on the global options which are in effect immediately after the ON or OFF command is executed. It can be RECOVERed, retained in SORITEC's workspace or stored in SORITEC databanks, and can later be used to restore global options to settings that were in effect when they were recovered. Global options are restored with the FLAGS command, which has the format:

     FLAGS flag_vector

The argument "flag_vector" is the name of the vector to which the recovered SORITEC internal variable ^FLAGS has been written.

Flag vectors must not be changed in any way, or unpredictable results may occur. The FLAGS command exists solely to restore previous global option settings. Furthermore, the ordering and number of the global options is subject to change in future releases so flag vectors stored on SORITEC databanks may not restore the options desired if retrieved by a later release of SORITEC.

2.13 SORITEC's Symbol Table

Any time during an interactive or batch session you can determine what item names are currently active in SORITEC's workspace by examining the symbol table. SORITEC's symbol table is listed on the output device when the following command is entered:

     SYMBOLS ([ALL] [FULL]) [filter]

The symbol table lists each item's name, item type, and length. Including the optional modifier "ALL" prints all currently active internal names in addition to user-defined items. Including the modifier "FULL" displays the full details of the items listed. The optional filter consists of a string of characters and wildcards. Only those items that match the filter are printed. If the filter is omitted then SYMBOLS prints all items.
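
How a wildcard filter selects names can be sketched in Python with the standard fnmatch module. This is an illustration only, not SORITEC code: match_names is a hypothetical helper, the workspace is the example list from section 2.10, and fnmatch's "*" matches any characters rather than strictly alphanumeric ones, which makes no difference for these names.

```python
from fnmatch import fnmatchcase

# Example workspace taken from the wildcard table in section 2.10.
WORKSPACE = ["X", "XY", "XXY", "BBYB", "BB", "ABXYZ", "ABXY"]


def match_names(pattern, names=WORKSPACE):
    """Return the names that match a wildcard pattern, in workspace order.

    '*' stands for zero or more characters and '?' for exactly one.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


for pattern in ["*", "?", "B*", "*B*", "?B??"]:
    print(pattern, "->", ", ".join(match_names(pattern)))
```

Run against the example workspace, each pattern selects the same items as the table in section 2.10.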
Items can be removed from SORITEC's symbol table by invoking the FORGET command, which is of the form:

     FORGET item...

Each item is a currently active item in SORITEC's workspace, as identified from the symbol table. FORGET accepts wildcards so that selected items from the symbol table can be removed. For example,

     FORGET AB*

removes all items that begin with the characters "AB" from the symbol table. All items from the symbol table are removed by entering the wildcard symbol "*" in place of item names, i.e.,

     FORGET *

Note that FORGET operates only on the workspace. It has no effect on databanks.

2.14 Minor Control Statements

Several commands alter default settings other than those identified with global options (ON/OFF) or pass information to SORITEC for use in output listings.

The width of output from SORITEC can be adjusted using the WIDTH command, i.e.,

     WIDTH number

The argument, "number", must be a numeric value between 50 and 150. Arguments outside this range will generate an error message, leaving the previous WIDTH definition intact. The default value for interactive usage is 80 characters; in batch mode, the default value is 132 characters.

The length of the input line that SORITEC can accept may be changed by the SCAN command, which has the format:

     SCAN number

The argument, "number", must be a numeric quantity between 50 and 150. Arguments outside this range will cause an error message and the existing SCAN will remain in effect. The default value for SCAN in interactive and batch modes is 80 characters.

The maximum error limit can be reset in SORITEC batch jobs to alter the number of NONFATAL and SERIOUS errors a job can commit before the batch processor abandons compilation and execution. The syntax of the command is:

     MAXERR number

where "number" is a numeric quantity that defines the new error limit. The default setting for MAXERR is 25.
Listings of batch job commands are turned on or off by the ONLIST and OFFLIST commands, respectively. The default setting is ONLIST.

SORITEC PRIMER -- Chapter 3 (1)

PRIMER -- Chapter 3 -- Data Entry and Output

3.1 Introduction

Data may be transferred to or from SORITEC in several formats: SORITEC Alternate Load (SAL) files, DIF files, FORTRAN formatted files, and SORITEC Database Files (SDB). In addition, data can be entered directly into SORITEC through the keyboard. Data may be displayed at the terminal in either tabular or graphical format. This section describes the available data input and output options and provides examples and detailed descriptions of the syntax.

The most common mistakes that users make with data entry are (a) forgetting to specify the directory in which the data file resides or, alternatively, to move the data file into the current working directory, (b) forgetting to add the correct file extension to the file when it is created with an editor, and (c) using a file extension when specifying the file within SORITEC.

SORITEC always appends the appropriate file extension to the file name. You must not specify the extension in SORITEC file manipulation commands. If you try to read a SAL file with "READ('myfile.sal')", SORITEC will look for the file "myfile.sal.sal". On the other hand, "READ('myfile')" will not execute if you have forgotten to append a .SAL extension to the name of the stored DOS file that you want to read.

3.2 SORITEC Alternate Load (SAL) Files

SAL files are the easiest way to import large amounts of data into SORITEC. They are also a convenient means of exporting data, particularly if you want to move data to SORITEC on another computer. SAL files are essentially free-field ASCII files. If you already have data in a tabular format, you can quickly create a SAL file by editing the table with any standard text editor, word processor, or spreadsheet program.
An example demonstrates the structure of a SORITEC SAL file. We wish to import the following data into SORITEC:

     YEAR     GNP    TAXES   PRIME
     ----   ------   -----   -----
     1970   1423.5   455.6   10.75
     1971   1564.2   678.3    9.76
     1972   1688.9   778.4   13.45

We could create the following file, naming it "macro.sal" (names of SAL files must end with a ".sal" extension):

     USE 1970 1972
     READ GNP TAXES
     1423.5 455.6
     1564.2 678.3
     1688.9 778.4
     ;
     READ PRIME
     10.75 9.76 13.45
     ;
     END

SAL files can contain any number of data series. Furthermore, data sections (sections of a SAL file delimited by an END statement) can be stacked as necessary and imported using multiple reads (or exported using multiple writes) in SORITEC. More than one variable can be read with a single READ. The USE period can be changed as often as necessary to conform to the data.

3.2.1 SAL File Input

SAL files are imported into SORITEC using the READ command. To read the file "macro.sal" we could use the following command:

     READ('macro')

As the USE period and all variable names are already predefined in the SAL file, no further information is needed. If referenced simply as above, the SAL file must exist in the current directory. If the SAL file exists on a drive or directory other than the current one, the explicit file name (without extension) must be referenced within single quotes. For MS/DOS systems, one would use:

     READ('A:MACRO')   or   READ('\DATA\MACRO')

if macro.sal resided in the root of a floppy in the A: drive or in the directory \DATA.

A READ command imports data from a SAL file until it encounters an END statement. A later READ of the same file would then begin importing data following this delimiter until the next END statement is reached, and so on. No section of a SAL file can be re-read, since the file is sequentially organized.

3.2.2 SAL File Output

Data may be exported from SORITEC in SAL file format using the PUNCH command.
The format of the PUNCH command is:

     PUNCH (['filename'] [ALL]) series...

If the modifier ALL is omitted then only the observations in the current USE period are written to the SAL file. If the modifier ALL is used then all observations are written. SORITEC appends the extension ".sal" to the filename when the file is opened. If you omit the filename, then SORITEC will assign an arbitrary name. If a SAL file already exists that has the same name as the one you are creating, then SORITEC will overwrite the existing file with the new one.

Note that SAL files remain open until a QUIT command is issued to end the session. Multiple PUNCH commands to the same file will therefore append the data to the referenced SAL file. An END delimiter is appended to the file when it is closed.

3.3 Data Interchange Format (DIF) Files

The Data Interchange Format (DIF) file format has emerged as a de facto standard for exchanging data between popular PC packages (such as LOTUS 1-2-3, DBASE III, SUPERCALC, and various stand-alone graphics packages), and between PCs and mainframes or minicomputers. SORITEC supports DIF file input and output.

3.3.1 DIF File Input

SORITEC imports DIF files through the READDIF command. There are two forms of the READDIF command. If variable names are in the DIF file (in this example, "filename.dif"), then the command is simply:

     READDIF('filename')

If variable names are not in the DIF file, the command line is:

     READDIF('filename') series...

SORITEC supports subdirectory addressing within the filename reference. If the DIF file exists on a drive or directory other than the current one, it must be referenced within single quotes. Again, using an MS/DOS example:

     READDIF('d:filename') [series...]
or
     READDIF('\path\filename') [series...]

READDIF does not read dates in DIF files so an appropriate USE period must be in effect before the command is executed.
READDIF expects to find ONLY time-series data in the input DIF file. Any spreadsheet cells that do not contain legal numbers are interpreted as missing values by SORITEC. As a consequence, SORITEC-generated DIF files that contain data other than time-series and that are later read by SORITEC will NOT generally produce useful results.

There are two ways that data can be organized in LOTUS to pass it to SORITEC: with and without labels. In either case, the data are interpreted under the currently active USE period in SORITEC.

If certain rules are followed, the USE interval for the data being read in can be derived from a DIF file's contents. If an entire column or an entire row of the spreadsheet being translated to DIF format contains legal SORITEC dates, in text mode, then SORITEC will recognize them as dates, and align the data accordingly. Note that for undated or annual data, if the dates are entered into the spreadsheet as numeric quantities, then they will not be recognized by READDIF as dates.

If the columns are to be labeled, the names must appear in the first row being translated to DIF. If the rows are to be labeled, the names must appear in the first column being translated to DIF. For example, if the following worksheet is written to "national.dif" using the LOTUS translate function:

             A         B         C         D
       +------------------------------------
     1 |   GNP      TAXES     PRIME
     2 | 1423.5     455.6     10.75
     3 | 1564.2     678.3      9.76
     4 | 1688.9     778.4     13.45

then "national.dif" can be read into SORITEC using the following commands:

     USE 1970 1972
     READDIF('national')

READDIF can read variable names up to 32 characters in length.

If the columns are not labeled then correct variable names must be specified in the READDIF command. In the following example, READDIF assumes that the desired variables are stored in column order. If column D were not empty and the USE specified four observations, then the data would be interpreted in row order.
The following table written from LOTUS to the file "national.dif":

             A         B         C         D
       +------------------------------------
     1 | 1423.5     455.6     10.75
     2 | 1564.2     678.3      9.76
     3 | 1688.9     778.4     13.45

can be read into SORITEC with the commands:

     USE 1970 1972
     READDIF('national') GNP TAXES PRIME

with the same results as in the previous example.

Input data outside the current USE interval are ignored. If insufficient data exist to satisfy the current USE period, the remaining observations are set to "MISSING". READDIF tries to do something reasonable with any input DIF file by first considering the current USE interval, then examining the DIF file contents. One should spot check READDIF input results to ensure that the rows and columns are interpreted as intended.

3.3.2 DIF File Output

DIF files may be exported from SORITEC using the WRITEDIF command. This command has the format:

     WRITEDIF[('filename')] argument...

where the arguments may be time-series, parameters, constants, vectors or matrices. Variable names in the argument list can be no longer than 10 characters. Longer names are truncated. SORITEC creates a file called "filename.dif" which can be translated into a LOTUS worksheet using the LOTUS Translate utility. If the filename is omitted, SORITEC assigns a name. You can redirect DIF file output to a file on another drive or directory other than the current one using the same conventions as the READDIF command.

Note that the following rules apply:

     (1) Only observations active under the current USE command are written to the file.

     (2) WRITEDIF re-orders its arguments (if required) so that all SERIES are written first, followed by CONSTANT items, and lastly, VECTOR items.

     (3) PARAMETERS are output as CONSTANTS.

     (4) MATRICES are output as VECTORS with M * N elements.

     (5) SORITEC missing values are output as "NA".
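
Rules (2) through (5), together with the 10-character name limit, can be sketched in Python as follows. This illustrates the stated rules only and does not write a DIF file; the names plan_writedif and format_value and the sample items are hypothetical. The rules do not say how items of the same kind are ordered among themselves, so keeping the caller's order within each kind is an assumption.

```python
# Output kind for each SORITEC item type: parameters are written as
# constants and matrices as vectors (rules 3 and 4).
OUTPUT_KIND = {
    "SERIES": "SERIES",
    "CONSTANT": "CONSTANT",
    "PARAMETER": "CONSTANT",
    "VECTOR": "VECTOR",
    "MATRIX": "VECTOR",
}
# Series first, then constants, then vectors (rule 2).
KIND_ORDER = {"SERIES": 0, "CONSTANT": 1, "VECTOR": 2}


def plan_writedif(items):
    """Return (name, output_kind) pairs in the order they would be written.

    Names longer than 10 characters are truncated. Order within a kind is
    kept as given, which is an assumption rather than a documented rule.
    """
    planned = [(name[:10], OUTPUT_KIND[kind]) for name, kind in items]
    return sorted(planned, key=lambda item: KIND_ORDER[item[1]])


def format_value(value):
    """Render a value for output, using 'NA' for a missing value (rule 5)."""
    return "NA" if value is None else str(value)


items = [
    ("WEIGHTS", "VECTOR"),
    ("ALPHA", "PARAMETER"),
    ("GNP", "SERIES"),
    ("COVARIANCE_MATRIX", "MATRIX"),
    ("TAXES", "SERIES"),
]
for name, kind in plan_writedif(items):
    print(kind, name)
print(format_value(None), format_value(1564.2))
```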
Most of these considerations are demonstrated by the following example:

     USE 1984Q1 1984Q3
     SERIES GNP 1423.5 1564.2 1688.9
     SERIES TAXES 455.6 678.3 778.4
     SERIES PRIME 10.75 9.76 13.45
     SET CONST=35.
     CONSTANT CONST2 223
     PARAMETER C3
     VECTOR VVV 1 2 3
     VECTOR V2 4 3 2 1
     USE 1984Q2 1984Q4
     WRITEDIF('adiffile') V2 VVV C3 CONST2 CONST &
        GNP TAXES PRIME

The file adiffile.dif is created and results in the following spreadsheet after being read into LOTUS 1-2-3:

              A         B         C         D         E         F
       +--------------------------------------------------------
     1 |     TIME       GNP     TAXES     PRIME
     2 |   1984Q2    1564.2     678.3      9.76
     3 |   1984Q3    1688.9     778.4     13.45
     4 |   1984Q4        NA        NA        NA
     5 | CONSTANT        C3         0
     6 | CONSTANT    CONST2       223
     7 | CONSTANT     CONST        35
     8 |   VECTOR       VVV         1         2         3
     9 |   VECTOR        V2         4         3         2         1

3.4 Formatted Input and Output

SORITEC supports formatted input and output of data and text. The command syntax for formatted I/O is similar to FORTRAN formatted I/O. In other words, the read or write statement refers to a FORMAT statement number that contains the format for the input or output.

The FORMAT command has a statement number, the command name FORMAT, and a legal format specification enclosed in parentheses, i.e.,

     statement_number FORMAT (format_specification)

The statement_number is always a positive integer between 1 and 9999. It must be unique within any given session or batch job. In other words, once a FORMAT is entered and identified by a statement number, no other command can have the same statement number during that session.

Allowable "format_specifications" are identical to those permitted in FORTRAN programs. Consult any FORTRAN reference manual for details on FORMAT statements.

3.4.1 FORTRAN Formatted Input

Although free-format SAL files are the preferred way to import data to SORITEC, there may be occasions when data are structured so that it is necessary to use an explicit format statement. Standard FORTRAN-style format statements are used.
SORITEC can read formatted data directly from the terminal or from a file. The syntax for reading formatted data is:

     READ(['filename'] [statement_number]) series...

Here, the "statement_number" refers to a previously defined format statement. The optional data file identified by "filename" must have a ".sal" file extension. If the filename is omitted, SORITEC reads the data from the current input device, i.e. the terminal or a SAC file if a command file is being executed. If the format statement number is omitted, data are assumed to be free-formatted.

Input file redirection is supported by the READ statement so that you can read a formatted file from a drive or directory other than the current one if it is referenced within single quotes. In MS/DOS:

     READ('d:filename' statement_number) series...
or
     READ('\path\filename' statement_number) series...

Unlike SAL files, formatted files cannot be read by multiple READ statements; all data from the file must be imported at one time.

Normally, formatted READ commands expect data to be organized in columns. However, if the STREAMIO option is enabled by the ON STREAMIO command, data can be read by rows. For example, to read the text file macro1.sal, including the headers, given below:

     KEY MACROECONOMIC INDICATORS

                   1970    1971    1972
     GNP         1423.5  1564.2  1688.9
     TAXES        455.6   678.3   778.4
     PRIME RATE   10.75    9.76   13.45;

the following command sequence would be required:

     ON STREAMIO
     USE 1970 1972
     101 FORMAT(///10X,3F8.1)
     READ('macro1' 101) GNP
     102 FORMAT(10X,3F8.2)
     READ('macro1' 102) TAXES
     READ('macro1' 102) PRIME

In order to read a formatted file, you must use a FORMAT statement and refer to the FORMAT statement number in the READ statement. You must explicitly list the variables to be read in the READ statement. The USE period must be set in the main program before the READ command is executed. The file must be terminated with a semi-colon.
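
What a specification such as (10X,3F8.1) asks for can be sketched in Python: skip ten characters, then read three numeric fields of eight characters each. This is a simplified illustration, not SORITEC or FORTRAN code. The helper read_fixed_fields is hypothetical, the sample records are built in the code rather than read from macro1.sal, and every field is assumed to contain an explicit decimal point.

```python
def read_fixed_fields(line, skip=10, width=8, count=3):
    """Read `count` numeric fields of `width` characters after `skip` characters.

    Mirrors a specification like (10X,3F8.1) under the assumption that each
    field holds a number written with an explicit decimal point.
    """
    fields = []
    for i in range(count):
        start = skip + i * width
        fields.append(float(line[start:start + width]))
    return fields


# Build sample records with a 10-character label and three 8-character fields.
gnp_record = "GNP".ljust(10) + "".join(f"{v:8.1f}" for v in (1423.5, 1564.2, 1688.9))
prime_record = "PRIME RATE" + "".join(f"{v:8.2f}" for v in (10.75, 9.76, 13.45)) + ";"

print(read_fixed_fields(gnp_record))
print(read_fixed_fields(prime_record))
```

Because only the three fixed-width fields are sliced out, the label in the first ten columns and the trailing semi-colon are never parsed as data.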
3.4.2 FORTRAN Formatted Output

Data and text may be printed in a prespecified format by the WRITE
command. FORTRAN formatted output can be directed to either the terminal
or a file. The general format for the formatted write command is:

     WRITE(['filename'] [statement_number]) var...

The statement number refers to a previously defined FORMAT statement. If
the optional "filename" is included, SORITEC writes the data according
to the format statement associated with "statement_number" to the file
filename.lst. Otherwise, the data are written to the terminal or the
current output device if DOS redirection has been invoked. If the
statement number is omitted, data are printed in a list format similar
to the format used to PRINT variables at the terminal, e.g.,

               VAR_A
          ................
          .
        1 .    1.00000
        2 .    2.00000
        3 .    2.50000
        4 .    3.50000
        5 .    5.00000

Variables in the variable list may be time-series, constants or
parameters. When time-series or vectors are encountered in the variable
list, SORITEC writes all active observations to the terminal before
writing the next variable in the list. Placing parentheses around time
series variables in the variable list, however, will direct SORITEC to
print one value from each variable in turn, allowing you to print time
series in columns. For example, the commands:

     USE 1973Q1 1973Q3
     102 FORMAT(15X,' GNP CONSUMPTION INVESTMENT'/(10X,3F11.1))
     WRITE(102) (GNP CONSUMP INVEST)

produce the following output:

                     GNP CONSUMPTION INVESTMENT
                     475.7      301.4       71.0
                     468.3      306.2       70.1
                     487.7      312.8       82.3

Constants and parameters cannot be included in parentheses.

3.5 Keyboard Entry

Data may be entered directly from the keyboard using the SERIES command,
which has the format:

     SERIES variable_name value_list

where "value_list" is the set of values assigned to the variable
"variable_name". For example,

     SERIES VAR_A 1 4 2 5 7 8

creates a new series VAR_A with the six specified values.
When there is no USE command in effect, a SERIES command counts the data
items, stores them as undated data and defines an appropriate USE
interval which is assumed in later commands or until the USE period is
redefined.

If there are too many or too few observations entered for the current
USE period, an error message is generated unless the ON RAGGED option is
enabled. The option command ON RAGGED permits entry, through SERIES, of
data series that are shorter than the current USE interval without
generating an error. Unaccounted data are assigned MISSING values when
this condition is encountered. SERIES will not accept data series longer
than the current USE period under any circumstances.

SERIES is commonly used to enter data series that consist of few
observations or to extend current series.

3.6 Output of Data to the Terminal

Data may be output to the terminal or printer in both tabular and
graphical form.

3.6.1 Tabular Display

The simplest data display is produced by the PRINT command. Any series,
vector, constant, parameter, equation or group can be displayed as
follows:

     PRINT argument...

Types of arguments to be printed may be mixed, but this is generally
inadvisable. Since SORITEC does not put unlike items on the same lines,
mixing types or periodicities indiscriminately can generate lengthy
outputs.

Lagged variables may be specified in a PRINT command. To display data
from the members of a GROUP, the ON GROUP option must be active. PRINT
displays the names of GROUP members if OFF GROUP is enabled.

Data may be output to the terminal in specified formats and mixed with
text using the WRITE command. See "FORTRAN Formatted Output" for a
description of this command.

3.6.2 Graphical Display

Two types of graphical displays are available from SORITEC, time-series
plots and scatter diagrams. They are available in character mode on all
types of computers.
They are also available in medium-resolution color-graphics mode on
personal computers.

Multi-variable plots of time-series or cross-section data are generated
by the PLOT command, which has the following form in character mode:

     PLOT series symbol [series symbol]...

The PLOT command produces a time-series plot of as many as nine
variables. Plotting symbols must be specified in the command line for
each variable to distinguish plotted values. Plotting symbols may be
alphanumeric (A-Z, 0-9) or the characters +, -, *, /, =. If two
variables, at some observation, are nearly equal so that they occupy the
same position on the screen, only the symbol for the later-named
variable is displayed.

The horizontal scale is determined automatically so that all data values
can be plotted. The WIDTH command can be used to inform SORITEC that
more (or fewer) than 72 characters can be printed on a single line. In
that case, the width of the plot is adjusted accordingly, e.g.,
WIDTH 132.

To generate meaningful output, all plotted variables should have roughly
the same range of values. Otherwise, some multiplicative or additive
scaling may be necessary.

In color-graphics mode the PLOT command has the following form:

     PLOT series...

The relationship between two variables can be illustrated graphically
via the SCATTER command, which is specified as:

     SCATTER series_1 series_2

SCATTER generates a scatter diagram with the variable referenced in the
first argument plotted with respect to the vertical or Y-axis and the
variable referenced in the second argument plotted against the
horizontal or X-axis. Lagged variables are permitted.

The graph size is dependent upon the number of characters that can
appear on a line. The default value is 72 but can be changed by the
WIDTH command.

3.7 SORITEC Databanks

SORITEC databanks (SDB files) are the most convenient means of storing
data once the data have been entered into SORITEC.
The databanking facility has its own set of commands for accessing and
managing data. Those commands are described in the following chapter.


PRIMER -- Chapter 4 -- SORITEC Databank (SDB) Files

4.1 Introduction

SORITEC databanks, also known as SDB files, can store data series,
equations, matrices, vectors, scalars, parameters, groups, procedures,
and models. The number of items that can be stored on a SORITEC databank
is limited only by the amount of available disk space.

SDB files are constructed in a "knapsack" databank arrangement. In
effect, you can throw anything you want into an SDB file and later
recall it by name. There is no need to specify the type or length of the
data item; SORITEC keeps track of that for you.

Databanks are opened in either of two modes: read/write, and read-only.
You can have as many as five databanks open at any time, only one of
which may be open in read/write mode.

The commands necessary to create and manipulate SDB files are
straightforward and easy to learn. Instructions on how to use each of
the databanking commands follow.

4.2 Create a Databank

CREATE constructs and initializes a SORITEC databank. The only argument
in the command line is the name of the databank that you want to create.
The format of the CREATE command is as follows:

     CREATE filename

The CREATE command appends the extension ".sdb" to the filename
specified on the command.

The CREATE command normally creates the databank on the default drive
and directory. However, the file can be created on an alternative drive
or directory by enclosing the drive specification and filename in single
quotes. In MS/DOS:

     CREATE 'd:filename'
or
     CREATE '\path\filename'

Once the databank is created, it remains open for reading and writing
until either (a) another databank is created or accessed, (b) the file
is closed, or (c) SORITEC is terminated.
4.3 Access a Databank

ACCESS opens a SORITEC databank in read/write mode for use in the
current session. The general form of the command is:

     ACCESS filename

The databank must already exist in the current directory as
"filename.sdb" or an error message is generated.

Once a databank is accessed, SORITEC automatically copies the data items
referenced in a command into the workspace if they are not already
there. ACCESS automatically closes any databank that is currently open.

Databanks residing on drives other than the current drive may be
referenced by enclosing the drive designation and filename within single
quote marks, as noted above for CREATE.

4.4 Access a Databank in Read-Only Mode

LIBRARY accesses a databank in read-only mode. The syntax of the command
is as follows:

     LIBRARY filename

4.5 Release a Databank from SORITEC

CLOSE closes a databank which is currently open and releases it from
SORITEC's control. The format of the command is:

     CLOSE [filename]

If the filename is omitted then the current read/write databank is
closed.

4.6 Delete a Databank

Databanks may be deleted from a directory with the PURGE command. The
format is as follows:

     PURGE filename

Since the databank is permanently erased, this command should be used
with care! Reference to a databank on a directory or drive other than
the current one works just as it does with CREATE and ACCESS.

4.7 Store Items in a Databank

Items in SORITEC's workspace are stored on the current read/write
databank with the KEEP command. The syntax of the command is:

     KEEP item...

Any item in the workspace can be stored on a databank. If you try to
KEEP an item that has the same name as an item that already exists in
the databank, a nonfatal error is reported and the item is not replaced.

There are three ways to replace an item that already exists on a SORITEC
databank.
First, the item stored in the databank can be explicitly discarded using
the DISCARD command and then stored using the KEEP command. Second, the
item can be replaced explicitly with the REPLACE command. Lastly, items
in a databank can be implicitly replaced with the KEEP command if the
ON REPLACE option has been enabled.

KEEP stores all observations associated with a given time series,
regardless of the observation period currently defined by the USE
command. For example, if the series GNP is defined for 1950Q1 to 1984Q2
and the current USE period is for 1980Q1 to 1983Q4, the command

     KEEP GNP

stores the series for 1950Q1-1984Q2. You may save only the active
observations by using the ACTIVE modifier:

     KEEP(ACTIVE) item...

4.8 Retrieve Items from a Databank

Data are explicitly copied from the current read/write databank into the
workspace by the COPY command. The command syntax is:

     COPY item...

Since the databank is always implicitly searched for items needed by
SORITEC commands, this command is generally used only when you need to
retrieve data from a second databank. If, for example, you wish to
regress a measure of inflation, such as CPI, stored on one databank,
against some measures of final demand, such as PCE and DEFENSE, stored
on another, the command sequence would be:

     ACCESS 'inflate'
     COPY CPI
     ACCESS 'fdemand'
     REGRESS CPI PCE DEFENSE

4.9 Replace Items in a Databank

Items in databanks are replaced by items of the same name in the current
workspace with the REPLACE command. The command syntax is:

     REPLACE item...

If the item is not currently stored on the databank, a warning message
is generated but the item is still saved.

4.10 Rename Items in a Databank

The names of items in a SORITEC databank are changed with the RENAME
command, which has the form:

     RENAME new_name old_name [new_name old_name]...
RENAME takes an even number of arguments consisting of pairs of item
names.

4.11 Switch the Names of Two Items in a Databank

Pairs of items in a SORITEC databank can have their names swapped by the
SWITCH command. The syntax of the command is:

     SWITCH item_1 item_2

It is equivalent to the following series of commands:

     RENAME temp item_1
     RENAME item_1 item_2
     RENAME item_2 temp

4.12 Discard Items from a Databank

Items are erased from a databank with the DISCARD command. The format of
DISCARD is:

     DISCARD item...

Once discarded, an item is irretrievably lost.

4.13 Generate a Directory Listing of a Databank

An alphabetically sorted directory listing of a SORITEC databank is
produced with the CONTENTS command, which has the form:

     CONTENTS ['filename']

If the filename is omitted from the command line, SORITEC produces a
directory listing of the currently active databank. If no databank is
active, an error message is returned.

The optional argument "filename" is the name of a SORITEC databank in
the current directory. Reference to a databank on a directory or drive
other than the current one is done as for the CREATE, ACCESS, and PURGE
commands.


PRIMER -- Chapter 5 -- Programming Constructs

5.1 Introduction

SORITEC provides a powerful interpretive programming language that
enables you to simplify complex and repetitive estimation procedures
into a smaller set of commands that can be executed interactively or
through SORITEC's batch processing facility.

SORITEC's programming language supports numeric and alphanumeric
looping, and conditional and unconditional transfer of control to other
statements. When organized in a SORITEC Alternate Command (SAC) file,
these programming constructs provide a convenient means for developing
estimators and diagnostic statistics in addition to those provided
directly by SORITEC.
The SAC file facility enables command files to call other command files
so that a series of command sequences can be executed.

SORITEC also provides a procedure facility that allows you to structure
a sequence of commands into a subprogram that, once defined, can be
passed arguments and repetitively called from a SORITEC command line.

The commands associated with SORITEC's programming capabilities follow.

5.2 Numeric Looping

Repetitive execution of commands in SORITEC is accomplished by DO loops.
The DO loop has the following general format:

     DO [index = beginning_value TO end_value [BY increment]]
          .
          .
          .
     (SORITEC commands)
          .
          .
          .
     END

The DO loop index, beginning_value, end_value and increment may be
integer or real scalars or parameters and you can proceed forward or
backward through the loop by assigning a positive or negative value to
the increment.

Both the end_value and increment may be reset dynamically within the
loop. If so, the new values are used to determine whether the loop is
executed again. If the BY increment is omitted from the DO command line,
it is set to 1.

A DO command, with no specified values for "beginning_value",
"end_value" and "increment", will cause the statements in the loop to be
executed once.

If the DO variable's initial value exceeds its maximum value before a
positive increment is added, an error message is generated and the
statements between the DO and END statements are not executed. The same
situation results if the variable's initial value is set lower than a
final value to be reached by negative increments.

You can construct a DO loop to index through members of a group. For
example, the commands:

     GROUP VARS A B C D
     ON GROUP
     DO I = 1 TO 4
        REGRESS Y VARS(I)
     END

would regress the dependent variable Y against each of the time-series
in the group VARS.

5.3 Unconditional Branching

SORITEC allows you to transfer control to any command prefixed by a
statement number.
The format of the command is as follows:

     GO TO statement_number

Alternatively, the command may be specified as GOTO. The argument of the
GOTO command may be a number, constant, or parameter, and must be in the
range from 1 to 9999. A statement number may be prefixed to most
commands.

In batch mode, if the specified statement number does not exist, an
error message is generated, and control passes to the statement which
follows the GO TO command. In interactive mode, the system responds with
a query for the missing statement number until the statement number is
entered.

5.4 Conditional Branching

Conditional branching is enabled through an IF/THEN/ELSE command
structure. The general format for the command sequence is:

     IF condition
     THEN
        task1
     ELSE
        task2

The "condition" must be an arithmetic expression that may include
logical and relational operators, as needed. If the condition is
satisfied, control transfers to "task1". Otherwise control is
transferred to "task2". Both "task1" and "task2" can be single
statements, DO loops, or DOT loops.

Note that IF, THEN, and ELSE are three distinct commands. So if you wish
to type the IF/THEN/ELSE structure on a single line, you must use
semicolons as follows:

     IF condition; THEN; task1; ELSE; task2

An IF/THEN/ELSE command structure can be nested provided it is enclosed
in a DO or DOT loop.

Here is an example of an IF/THEN/ELSE structure in which both "task1"
and "task2" are DO loops:

     IF A > B
     THEN
        DO
           C = B*LOG(A)
           PRINT A B C
        END
     ELSE
        DO
           C=A*LOG(B)
           PLOT A # B *
        END

Obviously, a DO loop in an IF/THEN/ELSE sequence can be executed
repetitively by specifying the index, initial value, final value and,
optionally, the increment in the DO command line.

Either the THEN or the ELSE clause may be omitted from a conditional
branching command sequence.

The IF command can also be used with the GO TO command to control the
order of execution, e.g.

     IF X < Y .AND. A > B; THEN; GO TO 300

5.5 Null (Continuation) Statement

The CONTINUE statement is generally used in SORITEC to position a
statement number within a SORITEC program. Its syntax is:

     statement_number CONTINUE

5.6 Alpha Looping

SORITEC will repetitively execute a sequence of commands by indexing
over a set of alphabetic loop control variables. On each pass through
the loop, SORITEC supplies succeeding alphabetic arguments in the DOT
command. The DOT command is functionally similar to a DO command. The
format of the command is:

     DOT variable...
          .
          .
          .
     (SORITEC commands)
          .
          .
          .
     ENDDOT

Alpha loop control variables are successively entered into expressions
within the DOT loop by replacing every colon (":") within the DOT loop
with the currently active alpha variable. For example,

     DOT A B C                            REGRESS Y A
        REGRESS Y :      is executed as   REGRESS Y B
     ENDDOT                               REGRESS Y C

You may also use the colons as prefixes or suffixes to construct new
variables within DOT loops, e.g.,

     DOT VAR1 VAR2 VAR3
        OUT: = INP: * Z
     ENDDOT

is executed as

     OUTVAR1 = INPVAR1 * Z
     OUTVAR2 = INPVAR2 * Z
     OUTVAR3 = INPVAR3 * Z

All commands in the DOT loop are executed as many times as there are
variables in the DOT command.

Note that if group expansion is enabled by the ON GROUP switch, a DOT
loop can index through a GROUP. For example,

     GROUP VARS A B C
     ON GROUP
     DOT VARS
        REGRESS Y :
     ENDDOT

would regress the variable Y against each of the time series in the
group VARS.


PRIMER -- Chapter 6 -- Special Generation and Transformation Commands

6.1 Introduction

SORITEC provides several commands that generate or transform
time-series. These commands create dummy variables or they transform
existing data series into new time-series. They include facilities for
converting time-series from one periodicity to another and for
transforming continuous variables into discrete variables.
SORITEC also provides commands that perform modular division and invoke
maximum and minimum functions.

6.2 Create a Time Trend Dummy Series

SORITEC generates a time trend dummy series with the TIME command. The
syntax of this command is:

     TIME [series_name]

TIME sets the first observation of the "series_name" associated with the
currently active USE period equal to one and increments successive
observations by one, so that the second observation is set to two, the
third to three, etc.

If the "series_name" is omitted from the command line, TIME stores the
time trend dummy in a series named "time". If a variable by that name
already exists in the workspace, it will be overwritten by the TIME
command.

The TIME command may be invoked only when there are no internal gaps in
the current USE period, i.e., the current USE period must have been
invoked with only two arguments.

6.3 Create Seasonal Dummies

A periodic dummy variable can be created using the DUMMY command, which
has the form:

     DUMMY output_series first_observation skip_increment

In the command line, "first_observation" is the first observation set to
one. Series elements are then set to one every "skip_increment". The
remaining values of the series are set to zero. For example, the
following commands create a dummy variable QTR1 that is equal to one in
the first quarter of each year:

     USE 1970Q1 1980Q4
     DUMMY QTR1 1970Q1 4

6.4 Recode a Variable

SORITEC allows you to convert a continuous variable into a discrete
variable via the RECODE command. The form of the command is:

     RECODE output_series input_series number...

In the above command line, "input_series" is the series to be recoded
and "output_series" is the categorized output variable. The numbers are
the interval boundaries for the recoding process.
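The way interval boundaries partition a series can be sketched in Python. This is an emulation for illustration, not SORITEC code: the function name `recode` is hypothetical, the boundaries are assumed to be given in ascending order, and the category numbering (a value below the first boundary is coded 0) is an assumption taken from the worked RECODE example in this section, whose input values are reused here.

```python
from bisect import bisect_right


def recode(values, boundaries):
    """Code each value by the number of boundaries that are <= the value.

    Assumes `boundaries` is sorted in ascending order. A value lying
    exactly on a boundary falls into the higher category.
    """
    return [bisect_right(boundaries, v) for v in values]


a = [3, 17, 21, 28, 31, 35, 26, 41]
b = recode(a, [10, 20, 25, 30, 35, 40])
print(b)
```

With six boundaries there are seven possible categories, one below the first boundary, five between neighbouring boundaries, and one at or above the last.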
For example,

     SERIES A 3 17 21 28 31 35 26 41
     RECODE B A 10 20 25 30 35 40
     PRINT A B

produce these results:

               A     B
          1    3     0
          2   17     1
          3   21     2
          4   28     3
          5   31     4
          6   35     5
          7   26     3
          8   41     6

Let p(1) through p(n) be the "n" numbers specified on a RECODE command.
For each element, i, of the series, RECODE uses the following formula:

     output_series(i) = k    if p(k-1) =< input_series(i) < p(k)

when p(k-1) < p(k), and

     output_series(i) = k    if p(k-1) = input_series(i) = p(k)

p(0) is always considered to be -infinity, and p(n+1) is always
considered to be +infinity.

6.5 Conversion of Time-Series from One Periodicity to Another

SORITEC converts time series from one periodicity to another with the
CONVERT command. The command has the following syntax:

     CONVERT [(modifier)] output_series = input_series

When the command is executed, data of one periodicity are converted to
the periodicity specified by the current USE statement. In other words,
the periodicity of the "input_series" does not have to be explicitly
specified, since SORITEC determines it internally. Lags are not allowed
in CONVERT arguments and the entire series is always converted,
regardless of the range specified in the USE command.

While the standard syntax of the convert command requires the
specification of both an output (result) series and an input series, the
converted series can be written to the input series name simply by
specifying:

     CONVERT [(modifier)] input_series

After the conversion, the old values of the input series, in the old
periodicity, are lost.

The modifier argument in the command line is optional and controls the
type of conversion which takes place. There are two sets of modifiers,
one for aggregation (such as monthly to annual), and one for
disaggregation (such as annual to monthly), plus a special MOVE modifier
for converting to and from undated data.
The modifiers are as follows:

     AGGREGATION

       SUM       Sum observations in each period (default)
       AVERAGE   Average observations in each period
       MIN       Find the minimum observation in each period
       MAX       Find the maximum observation in each period
       LAST      Use the last observation in each period

     DISAGGREGATION

       FILL      Use the data point for the entire period for each
                 sub-period
       SHARE     Divide the data value for the entire period equally
                 across all sub-periods (default)

     UNDATED TO DATED CONVERSIONS

       MOVE      Move the data from an undated to a dated variable or
                 vice versa without alteration (default)

The default is selected whenever no modifier is entered on the command
line.

Conversion is currently permitted only between annual, semi-annual,
quarterly, monthly, ten-day and undated data types. In addition,
conversion from monthly to ten-day periodicity produces incorrect
results because of the way the ten-day data type is defined.

6.6 Maximum Function

SORITEC can determine the maximum of a series or the
observation-by-observation maximum of a collection of series. The
maximum value of a series is found by entering the MAX command with only
two arguments, i.e.,

     MAX maximum_value input_series

When entered like this, "input_series" is the data series over which the
maximum is to be taken. The result is stored in "maximum_value" which
must be a constant or parameter. If the "maximum_value" name is
undefined prior to entering the command, SORITEC defines it to be a
constant.

A new series consisting of the set of maximum values, by observation,
associated with several series is generated by the MAX command when more
than two arguments are entered in the command line, i.e.,

     MAX output_series input_series...

In this case, all arguments in the command line must be data series. The
resulting "output_series" contains the observation-by-observation
maximum of all the remaining arguments.
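The two behaviours of MAX, a scalar result for one input series and an observation-by-observation result for several, can be sketched in Python as follows. This is an illustration, not SORITEC code; the function name `soritec_max` is hypothetical, the series are assumed to be equal-length lists, and missing values are not handled.

```python
def soritec_max(*input_series):
    """Emulate the two forms of MAX described above.

    One input series    -> a single maximum value.
    Several input series -> a new series holding, for each observation,
                            the largest value across the inputs.
    """
    if len(input_series) == 1:
        return max(input_series[0])
    return [max(observation) for observation in zip(*input_series)]


peak = soritec_max([1, 5, 3])                 # scalar form
envelope = soritec_max([1, 5, 3], [4, 2, 6])  # series form
```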
6.7 Minimum Function

The minimum value of a data series or a series of minimum values, by
observation, of several series is obtained using the MIN command. The
format and use of MIN is identical to the MAX command except for the
result it computes. In other words, the minimum value of a data series
is determined when the MIN command is followed by two arguments:

     MIN minimum_value input_series

where the first argument is a constant or parameter and the second is
the series you wish to evaluate. A series containing
observation-by-observation minima is generated when more than two
arguments, all of which must be data series, appear on the MIN command,
i.e.,

     MIN output_series input_series...

The same rules as apply to the MAX function apply to MIN.

6.8 Modular Division

SORITEC performs modular division via the MOD command, which has the
format:

     MOD remainder dividend divisor

In mathematical notation, the formula used is:

     remainder = dividend - (INT(dividend/divisor) * divisor)

where INT is the function that computes the integer part of a number.
The dividend and divisor must be of the same type and may be constants,
parameters, or series. The resulting remainder will be the same type.

6.9 Compute Moving Average

The moving average of a series is calculated by the MA command.

     MA output_series input_series length

In the command line, "input_series" is the series to be averaged,
"length" is the length of the moving average, and "output_series" is the
resulting series. The argument, "length", may be a constant, parameter,
or a numeric quantity. The first "length-1" observations of the
output_series are treated as missing data.

6.10 Compute Moving Sum

The MSUM command computes the moving sum of a series.

     MSUM output_series input_series length

Arguments in the command line have the same meaning as the MA command.
The first "length-1" observations of the output_series are treated as
missing data.
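The MOD formula and the moving-window commands can be sketched in Python as shown below. This is an illustration rather than SORITEC code. The function names are hypothetical, `None` stands in for a missing observation, INT is assumed to truncate toward zero, and the window is assumed to end at the current observation, which is consistent with the first "length-1" results being missing.

```python
import math


def mod(dividend, divisor):
    """remainder = dividend - INT(dividend/divisor) * divisor,
    with INT assumed to truncate toward zero."""
    return dividend - math.trunc(dividend / divisor) * divisor


def moving_sum(series, length):
    """Trailing moving sum; the first length-1 results are missing (None)."""
    out = [None] * len(series)
    for i in range(length - 1, len(series)):
        out[i] = sum(series[i - length + 1:i + 1])
    return out


def moving_average(series, length):
    """Trailing moving average built on the moving sum."""
    return [None if s is None else s / length
            for s in moving_sum(series, length)]


data = [1, 2, 3, 4, 5]
msum3 = moving_sum(data, 3)
ma3 = moving_average(data, 3)
```

Under the truncation assumption the remainder takes the sign of the dividend, so `mod(-7, 3)` gives -1 rather than the 2 that Python's own `%` operator returns.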
6.11 Statistical Operations

Several statistical functions are available for analyzing and
manipulating data. They are described in the following sections.

6.11.1 Correlation Matrix Calculation

A correlation matrix for the variables in an argument list is generated
by the CORREL command. The format of the command is:

     CORREL series...

Only observations active in the currently defined USE period are used in
correlation matrix calculations. While only the correlation matrix is
output to the terminal, the correlation matrix (COR), vector of means
(MEANS), vector of standard deviations (DEVS) and covariance matrix
(COV) are calculated by CORREL and stored as SORITEC internal variables.
These results may be accessed with a RECOVER command.

6.11.2 Covariance Matrix Calculation

The COVA command computes, stores, and prints a covariance matrix for
the variables named as arguments in the command line. The format of the
command is:

     COVA series...

Similar to the CORREL command, only observations associated with the
currently active USE period are used in calculations. The vector of
means (MEANS), vector of standard deviations (DEVS) and covariance
matrix (COV) are stored as SORITEC internal variables when the COVA
command is executed, and may be accessed by the RECOVER command.

6.11.3 Other Statistical Operations

Several specialized statistical operations are supported by SORITEC to
describe the properties of a time-series. The common format consists of
the command name followed by the output variable and the input series,
i.e.,

     command_name output_constant input_series

Statistics are calculated over the currently active USE period.
Some of the statistical operations available in SORITEC and the commands
for executing them are:

     Command                                 Description
     -------                                 -----------
     MEAN mean input_series                  Arithmetic Mean
     RMS root_mean_square input_series       Root Mean Square
     SUM sum input_series                    Arithmetic Sum
     SSR sum_squared_resids input_series     Sum of Squared Residuals


PRIMER -- Chapter 7 -- SORITEC Financial Functions

7.1 Financial Functions in SORITEC

SORITEC contains most of the common financial analysis functions. These
functions, used alone or with SORITEC's forecasting commands, provide
extremely powerful tools for performing financial project evaluation.
The functions currently provided include internal rate of return,
present value, and various loan amortization schedules.

Note that in all SORITEC financial functions, interest rates are treated
as decimal quantities unless otherwise noted; e.g., 15% is represented
as 0.15.

7.2 Internal Rate of Return

The internal rate of return command calculates the internal rate of
return for an arbitrary series via a modified Newton-Raphson search
algorithm. The format of the command is as follows:

     IRR ([CAPITAL=scalar] [INITIALR=scalar]) interest_rate income [cost]

where "interest_rate" is a legal SORITEC constant name for the interest
rate that discounts the "income" series (minus the "cost" series, if
present) to a net present value of zero.

The optional modifiers in the command line allow you to control the
parameters determining convergence for the algorithm as well as
specification of an arbitrary start-up capital cost. Specifically:

     CAPITAL   is the start-up cost of the project. It is automatically
               subtracted from the first-period profits.

     INITIALR  allows you to specify a starting value for the internal
               rate of return. This is of special value in finding
               multiple roots to the IRR equation when cash flows change
               signs more than once during the life of the project.
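The idea behind the search can be sketched in Python with a plain Newton-Raphson iteration. This is not SORITEC's modified algorithm, whose details are not given here. The names `npv` and `irr`, the tolerance, the iteration limit, and the default starting rate are hypothetical, and the sketch assumes that the first observation is not discounted (exponent zero), with each later observation discounted one further period.

```python
def npv(flows, rate):
    """Net present value, assuming the first flow is undiscounted."""
    return sum(f / (1 + rate) ** t for t, f in enumerate(flows))


def irr(income, cost=None, capital=0.0, initial_r=0.1,
        tol=1e-10, max_iter=100):
    """Find a rate at which income minus cost discounts to zero.

    capital:   start-up cost, subtracted from the first-period flow
    initial_r: starting value for the search; different starting values
               may lead to different roots when flows change sign more
               than once
    """
    if cost is None:
        cost = [0.0] * len(income)
    flows = [i - c for i, c in zip(income, cost)]
    flows[0] -= capital

    rate = initial_r
    for _ in range(max_iter):
        value = npv(flows, rate)
        # Derivative of the NPV with respect to the rate.
        slope = sum(-t * f / (1 + rate) ** (t + 1)
                    for t, f in enumerate(flows))
        if slope == 0:
            raise ArithmeticError("flat NPV curve; try another initial_r")
        step = value / slope
        rate -= step
        if abs(step) < tol:
            return rate
    raise RuntimeError("no convergence; try another initial_r")


# A 100 outlay returning 110 one period later has a 10% rate of return.
rate = irr([0.0, 110.0], capital=100.0, initial_r=0.05)
```

Because Newton-Raphson converges to a root near its starting point, the starting value plays the same role here as INITIALR does in the command.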
7.3 Present Value

The present value command, PV, calculates the net present value of a
stream of net benefits (or profits) associated with a financial venture.
PV will take either a scalar value for the interest rate or a time
series of forecast values. This latter feature, when combined with the
estimation and forecasting capabilities of SORITEC, provides a powerful
tool for simulating and evaluating financial projects. The syntax of the
command is:

     PV ([PERIOD=D|W|T|M|Q|S|A SIMPLE|COMPOUND]) &
        present_value net_income_stream [costs] interest_rate

where "present_value" is a scalar value equal to the present value of
the income stream, "net_income_stream" is the net income stream to be
discounted, and "interest_rate" is the interest rate used in calculating
the present value.

The interest rate can be either a scalar, fixed for all periods, or a
time series of interest rates. This allows for easy incorporation of
interest rate forecasts into project evaluation.

The "net_income_stream" can be followed by an optional cost series. This
second argument in the command line can be either a single net income
stream or a pair of series describing the revenues and costs of the
project.

The optional modifiers in the command line allow you to convert the
periodicity of the interest rate to conform to the net income stream and
to specify the type of conversion to be performed. Specifically,

     PERIOD    allows an interest rate conversion to be specified;
               specifically, setting PERIOD equal to one of the options
               results in the specified interest rate being converted
               from the selected periodicity to the periodicity of the
               current USE period. The periodicity may be (D)aily,
               (W)eekly, (T)en-Day, (M)onthly, (Q)uarterly,
               (S)emiannual, or (A)nnual. A second option, specified
               either as SIMPLE or COMPOUND, is the type of conversion
               to be used. The default is COMPOUND conversion.
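The difference between SIMPLE and COMPOUND conversion can be sketched in Python. This is an illustration, not SORITEC code: the function name `convert_rate` and the `PERIODS_PER_YEAR` table are hypothetical, and only annual, semiannual, quarterly and monthly periodicities are covered because the day counts behind the daily, weekly and ten-day options are not stated here.

```python
PERIODS_PER_YEAR = {"A": 1, "S": 2, "Q": 4, "M": 12}


def convert_rate(rate, from_period, to_period, method="COMPOUND"):
    """Convert a per-period interest rate between periodicities.

    SIMPLE   divides the rate in proportion to the number of periods.
    COMPOUND finds the rate that compounds to the same effective yield.
    """
    ratio = PERIODS_PER_YEAR[to_period] / PERIODS_PER_YEAR[from_period]
    if method == "SIMPLE":
        return rate / ratio
    if method == "COMPOUND":
        return (1 + rate) ** (1 / ratio) - 1
    raise ValueError("method must be SIMPLE or COMPOUND")


# A 15% annual percentage rate, simple conversion to a monthly rate.
monthly_simple = convert_rate(0.15, "A", "M", "SIMPLE")
# A 4% quarterly yield, compound conversion to a monthly rate.
monthly_compound = convert_rate(0.04, "Q", "M", "COMPOUND")
# 1.25% per month compounded over twelve months, as an annual yield.
annual_effective = convert_rate(0.0125, "M", "A", "COMPOUND")
```

Simple conversion suits rates quoted as nominal annual percentage rates, while compound conversion suits rates quoted as effective yields.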
The PERIOD modifier used with the conversion option can handle transformations between annual or effective interest rates and the effective periodic percentage rates. If the annual rate is given as 15%, the effective annual percentage rate is 16.0754%, calculated as .15/12 = 1.25% compounded monthly. For example, suppose that the current USE period is monthly. In that case,

     PV (PERIOD=A, SIMPLE) PV PROFIT .15

will correctly convert the 15% annual percentage rate to a 1.25% monthly rate before calculating the present value.

If the available data are given in terms of effective yields, the COMPOUND option should be used to correctly convert rates between periods. A loan requiring 4% per quarter is equivalent to a loan rate of 1.316% compounded monthly [exp(ln(1.04)/3)-1]. Here, the appropriate command would be:

     PV (PERIOD=Q, COMPOUND) PV PROFIT .04

Again we suppose that the current USE period is monthly.

7.4 Loan Amortization

The loan amortization command, AMORT, provides a convenient technique for calculating the monthly payment for a given loan situation. In addition to the standard loan value and interest rate setup, AMORT also allows an arbitrary number of loan payment series, balloon payments, variable interest rates, as well as options for dynamically extending the amount of the loan through additional borrowings. The format of the command is:

     AMORT ( [PERIOD=D|W|T|M|Q|S|A SIMPLE|COMPOUND] [RULEOF78] &
             [BALLOON=number] ) payment loan interest_rate [aux_pay]...

where "payment" is the resulting per period payment to fully amortize the loan during the current USE period, and "loan" is the amount of the loan. The loan can be either a constant or a time series, if the loan is allocated over the time period set in the USE command. "interest_rate" is the interest rate of the loan. It must be the same type, either constant or time series, as the "loan".
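For the simplest case (a constant loan, a constant per-period rate and no auxiliary payments), the level payment that fully amortizes a loan can be sketched in Python as below. The function names are hypothetical. The treatment of the balloon, as a balance left outstanding after the last regular payment, is an assumption for illustration and may differ from what AMORT does.

```python
def level_payment(loan, rate, periods, balloon=0.0):
    """Per-period payment that leaves exactly 'balloon' owing after 'periods'."""
    if rate == 0.0:
        return (loan - balloon) / periods
    growth = (1.0 + rate) ** periods
    # Solve loan * growth - payment * (growth - 1) / rate = balloon for payment.
    return (loan * growth - balloon) * rate / (growth - 1.0)


def remaining_balance(loan, rate, periods, payment):
    """Roll the balance forward: accrue interest, then subtract the payment."""
    balance = loan
    for _ in range(periods):
        balance = balance * (1.0 + rate) - payment
    return balance


# Hypothetical loan: 1000 at 1% per month over 12 months.
payment = level_payment(1000.0, 0.01, 12)
leftover = remaining_balance(1000.0, 0.01, 12, payment)
```

Rolling the balance forward with the computed payment is a convenient check: the balance should come out at zero, or at the balloon amount when one is specified.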
The optional command line arguments, "aux_pay", are time series of auxiliary payments in addition to the monthly loan payment. These can be used to enter payments to principal that are awkwardly or randomly timed. For example, a loan which required balloon payments of $5000 every five years can be handled as a time series with value 5000 for every fifth year and zeros elsewhere.

The optional modifiers in the command line allow you to change the amortization schedule as follows:

     PERIOD     is the same as for the PV command. It allows an interest rate conversion to be specified: setting PERIOD equal to one of the options results in the specified interest rate being converted from the selected periodicity to the periodicity of the current USE period. The periodicity may be (D)aily, (W)eekly, (T)en-Day, (M)onthly, (Q)uarterly, (S)emiannual or (A)nnual.

     RULEOF78   constructs a principal and interest payment series for the loan according to the "Rule of 78" (sum of the months). This option is valid only for loans with a single period of borrowing and a fixed interest rate.

     BALLOON    specifies the amount of a balloon payment in the final period.

PRIMER -- Chapter 8 -- Cross-Section Techniques

8.1 Introduction

SORITEC contains many common techniques for processing and analyzing cross-sectional data sets. Access is provided to most of the intermediate and final results. The specific techniques currently implemented in SORITEC and SORITEC SAMPLER are:

     SYNOPSIS   provides a quick statistical summary of a data series.

     XTAB       performs cross-tabulation analysis.

     FREQ       provides a complete frequency analysis, including an optional histogram display.

Techniques implemented only in the full SORITEC language include:

     ANOVA      performs a two-way analysis of variance.

     BRKDWN     performs a breakdown analysis on a pair of variables. This procedure provides a frequency breakdown, optional histogram, and ANOVA test.
     MWHITNEY   carries out the Mann-Whitney U test for equality of the means for two series of observations.

     WILCOXON   performs the equivalent Wilcoxon rank-sum test.

     NCOR       provides Spearman's non-parametric correlation coefficient and Kendall's tau statistic as general measures of association.

     PROBIT     estimates a binary probit model.

     TTEST      calculates grouped and paired t-tests for a list of variables.

8.2 Synopsis

The SYNOPSIS command returns a detailed summary analysis of a data series including mean, standard deviation, median (including a 95% confidence interval), quartiles, deciles, variance, skewness, kurtosis, coefficient of variation, number of observations, number of missing values, minimum, maximum, range, mode, and the frequency of the mode. The command format of SYNOPSIS is:

     SYNOPSIS series...

In addition to outputting them to the terminal, SYNOPSIS stores the summary statistics as SORITEC internal variables, which may be recovered either explicitly with the RECOVER command or by implicit reference. See RECOVER(2) for the method of retrieving these data.

Except for decile and quartile statistics, internal variables associated with the SYNOPSIS command are stored as vectors that have the same number of elements as arguments in the SYNOPSIS command line. Recoverable SORITEC internal variables stored as vectors are:

     ^COUNTS  = number of non-missing observations for each variable
     ^MEDIAN  = median value for each variable
     ^MIN     = minimum values
     ^MAX     = maximum values
     ^MEANS   = mean values
     ^VAR     = variances for each variable
     ^DEVS    = standard deviations
     ^CV      = coefficient of variation for each variable
     ^KURT    = kurtosis of each variable
     ^SKEW    = skewness for each variable
     ^MODE    = mode values for each variable

Two other internal variables are stored upon execution of the SYNOPSIS command.
The variables are:

     ^DECILE  = decile values of a series
     ^QUARTIL = quartile values of a series

The ^DECILE and ^QUARTIL internal variables are stored as matrices. Quantiles are defined as the first observations less than or equal to the true mathematical quantiles (n/4 and n/10) in both cases.

Note that SYNOPSIS exercises casewise deletion of missing values on each variable when it computes the summary statistics. Because of this, the statistics may not compare with those from other SORITEC statistics commands like STATS, KURTOSIS, etc.

8.3 Crosstabulation Analysis

The XTAB command calculates the standard row-column crosstabulation report. The format of the command is:

     XTAB series_1 series_2

The arguments "series_1" and "series_2" must be discrete data. If the series you wish to crosstabulate are continuous, they must be converted via the RECODE command.

XTAB doesn't delete missing values, but instead reports them as a separate category "MISSING" in the appropriate row or column.

In addition to printer-oriented output, XTAB has an interactive screen display mode which allows scrolling through the table in a "spreadsheet" mode.

8.4 Frequency Analysis

The FREQ command calculates a frequency distribution for a data series, and can optionally produce a histogram to display the distribution. The command can also recode the observations of the series from continuous to discrete values "on the fly". The format of the command is:

     FREQ [( [HISTGRM] [CLASS=vector] )] series

The HISTGRM option generates the histogram of the frequency distribution. If there are ten or fewer discrete values, the histogram is plotted across the screen or page. Otherwise, the display runs down the screen.

The elements of the CLASS vector are used as the endpoints of intervals according to which the data series is categorized.
For example, if "a" and "b" are two adjacent elements of the CLASS vector, then FREQ would report the number of observations that are greater than or equal to "a" and less than "b".

PRIMER -- Chapter 9 -- Estimation and Forecasting

9.1 Introduction

SORITEC provides you with many estimation techniques for both single-equation and simultaneous-equation models. In this section, we will discuss the most frequently used single-equation techniques: ordinary least squares (REGRESS command), two-stage least squares (TWOSLS), and the Cochrane-Orcutt (CORC) and Hildreth-Lu (HILU) autocorrelation correction techniques. These procedures may be applied to either time series or cross-sectional data. The structure of the equations in any model may be recursive or simultaneous. The fitted equations estimated by SORITEC can be recovered and used for forecasting.

The standard output from a SORITEC estimation command consists of a coefficient tableau and a summary tableau of regression diagnostics which includes the number of observations, the standard error of the regression, sum of squared residuals, R-squared, adjusted R-squared, Durbin-Watson, F test of overall significance, the log-likelihood, and the Akaike and Schwarz statistics for model selection.

You may have the estimator generate additional diagnostics by setting one or more options with ON commands, which must be executed before the regression command. Use of these options is described above. SORITEC estimation procedures support ON VCOV, ON STATS, ON CCOR, ON ANOVA, ON PLOT, ON RESIDUAL and ON BETA commands. These options are associated with SORITEC's interactive tableaux and are described below.

When the ON CRT option is invoked, all estimation commands described here support the display in interactive tableaux of regression diagnostics.
These tableaux provide you with a greater number of regression diagnostics than are output by the estimation commands in their default modes. Commands for invoking the interactive tableaux and descriptions of their contents are detailed below.

9.2 Ordinary Least Squares (OLS) Estimation

The ordinary least squares estimator is invoked by the REGRESS command, which has the following syntax:

     REGRESS [(ORIGIN)] dep_var ind_var...

The dependent variable must be the first argument in the variable list, with the independent variables following immediately as the second through last arguments. The keyword ORIGIN is optional and, if specified, forces SORITEC to estimate the equation without a constant term. Otherwise, the constant term is supplied automatically. If ORIGIN is specified in the command line, it must be enclosed within parentheses. When the regression plane is forced through the origin, the regression diagnostics are adjusted accordingly.

9.3 Autocorrelation Techniques for the Single Equation Model

Two estimation techniques are available for estimating single equation models when you believe that the error terms are not independent, but that a disturbance in one period depends on the previous disturbance. The Cochrane-Orcutt (CORC) iterative technique and the Hildreth-Lu (HILU) scanning technique estimate models assuming first order serial autocorrelation of the disturbances.

When either autocorrelation technique is invoked, SORITEC temporarily shortens the USE period by one observation at the beginning of the sample and by one observation after every gap to calculate the required data transformations. The current USE period, therefore, should include the observations which will be lost in the transformation of variables. The USE period is restored to its original interval(s) after the command is completed. Regression diagnostics are calculated from the residuals of the regression on the transformed variables.
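The data transformation that costs one observation can be sketched in Python. The sketch assumes the usual first-order quasi-difference x[t] - rho*x[t-1], a single regressor plus a constant, and a sample with no gaps. It then scans a grid of rho values for the smallest sum of squared residuals. The function names, the grid limits and the data are hypothetical, and nothing here is taken from SORITEC's internals.

```python
def quasi_difference(series, rho):
    """Return x[t] - rho * x[t-1]; the first observation is lost."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]


def ols_simple(y, x):
    """OLS of y on a constant and one regressor: (constant, slope, ssr)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("regressor has no variation after transformation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    constant = mean_y - slope * mean_x
    ssr = sum((yi - constant - slope * xi) ** 2 for xi, yi in zip(x, y))
    return constant, slope, ssr


def scan_rho(y, x, low=0.0, high=1.0, step=0.1):
    """Try each rho on the grid; keep the one with the smallest SSR."""
    best = None
    k = 0
    while low + k * step <= high + 1e-12:
        rho = low + k * step
        _, slope, ssr = ols_simple(quasi_difference(y, rho),
                                   quasi_difference(x, rho))
        if best is None or ssr < best[2]:
            best = (rho, slope, ssr)
        k += 1
    return best


# Hypothetical data, roughly y = 2x plus small disturbances.
x_data = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0, 3.0, 9.0]
y_data = [2.1, 7.9, 4.2, 16.1, 9.8, 14.2, 6.1, 17.9]
best_rho, best_slope, best_ssr = scan_rho(y_data, x_data)
```

Each transformed series is one element shorter than the original, which is why the USE period should include the observation that will be lost.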
9.3.1 Cochrane-Orcutt Iterative Technique

The Cochrane-Orcutt estimator is invoked by the command:

     CORC [(ORIGIN)] dep_var ind_var...

Command syntax is identical to the REGRESS command described in the previous section.

The Hildreth-Lu technique is often to be preferred, since the Cochrane-Orcutt technique will occasionally result in a value for rho which is a local rather than a global optimum, and of obviously wrong sign. In particular, if, in ordinary least-squares regression, a Durbin-Watson statistic significantly less than 2.0 is observed, yet the Cochrane-Orcutt technique gives a negative rho value, the Hildreth-Lu technique should be used.

9.3.2 Hildreth-Lu Scanning Technique

In addition to the dependent and independent variable lists, the HILU command requires that the lower and upper limits to the value of rho and its stepsize during the scanning process be initialized. These values are entered onto the command line through a set of optional positional modifiers. The syntax of the HILU command is:

     HILU [ ( [ORIGIN] [lowlim [uplim [inc]]] ) ] dep_var ind_var...

where the dependent and independent variable lists are positioned as in the other regression commands. "lowlim" is the lower limit of rho. Similarly, "uplim" is the upper limit of rho. Finally, "inc" is the stepsize of the scanning process. If omitted from the command line, these modifiers assume default values of 0.0, 1.0 and 0.1, respectively.

In previous versions of SORITEC and SORITEC SAMPLER, you could use an asterisk ("*") to select defaults for "lowlim" and "uplim". Since the generalization of wildcard syntax throughout the package, this capability has been removed.

9.4 Two-Stage Least Squares (2SLS) Estimates

Consistent estimates for a single equation from a simultaneous equation system can be obtained by using a two-stage least squares (2SLS) estimator.
Unlike the other estimation commands in this section, the 2SLS procedure requires you to enter two commands to estimate an equation. First, all exogenous variables must be identified in an EXOGENOUS statement, which has the form:

     EXOGENOUS exog_var...

All arguments associated with this command are exogenous variable names. The EXOGENOUS command must be specified before invoking the 2SLS estimator. After execution, all later 2SLS commands use the same list of exogenous variables until another EXOGENOUS command is entered.

Two-stage least squares estimation is invoked by the TWOSLS command which has the form:

     TWOSLS [(ORIGIN)] dep_var ind_var...

All arguments plus the ORIGIN keyword in the command line have the same interpretation as used in the REGRESS command. Two-stage least squares commands that detect omitted or mis-specified exogenous variables generate error messages until a valid EXOGENOUS command is executed.

9.5 Forecasting Single-Equation Models

Any single-equation model that has been estimated by SORITEC can be forecast using the fitted equation ^FOREQ, which is stored as a SORITEC internal variable. To forecast an equation, all of the independent or right-hand variables that were used to estimate it must be defined for the period over which the forecast is to be made. These values may be observed, projected, assumed or may be the product of other forecasts.

To forecast using a single-equation model, the following steps are performed:

     (1) Estimate a single equation model using the REGRESS, CORC, HILU or TWOSLS command.

     (2) RECOVER the fitted equation from its internal system name of FOREQ.

     (3) Change the active observation period to the forecast period with the USE command.

     (4) Revise independent variables to include assumptions for the forecast period, if the data is not already present.

     (5) Use the FORECAST command to forecast the fitted equation over the desired time period.
The format of the FORECAST command is:

     FORECAST fitted_equation_name

Since SORITEC internal system names may be referenced directly from the FORECAST command, step (2) is optional. In this case, the fitted equation is forecast simply by entering:

     FORECAST ^FOREQ

Use of the RECOVER command is necessary, however, if you want to FORECAST the fitted equation after estimating other models, since SORITEC replaces ^FOREQ each time an equation is estimated. Fitted equations can be saved on a databank like any other SORITEC item.

Forecasting single-equation models in SORITEC is illustrated below.

     USE 1975Q1 1982Q4
     REGRESS GNP CONSUMPTION INVESTMENT{-1}
     RECOVER GNP_EQUATION FOREQ
     USE 1983Q1 1984Q3
     REVISE CONSUMPTION 1 2 3 4 5 6 7
     REVISE INVESTMENT 7 6 5 4 3 2 1
     FORECAST GNP_EQUATION
     PRINT GNP

If the fitted equation is not needed after being forecast, the command sequence is as follows:

     USE 1975Q1 1982Q4
     REGRESS GNP CONSUMPTION INVESTMENT{-1}
     USE 1983Q1 1984Q3
     REVISE CONSUMPTION 1 2 3 4 5 6 7
     REVISE INVESTMENT 7 6 5 4 3 2 1
     FORECAST ^FOREQ
     PRINT GNP

The FORECAST command provides a static forecast by default. This means that lagged dependent variables are not automatically generated for each successive period, unless the DYNAMIC flag is on. In other words, the command sequence:

     USE 1980Q1 1989Q4
     REGRESS GNP GNP{-1}
     ON DYNAMIC
     USE 1990Q1 1990Q4
     FORECAST ^FOREQ

will produce a dynamic forecast for GNP, i.e. the 1990Q2 forecast will be computed using the 1990Q1 forecast as a right-hand variable. If the "ON DYNAMIC" statement were not present, SORITEC would attempt to use the actual value of GNP in 1990Q1 to forecast the 1990Q2 value. If no data had been entered before this sequence for 1990Q1, an error message would be generated, and the values of GNP from 1990Q2 to 1990Q4 would be missing values.
The 1990Q1 forecast value would, however, be the same as if the DYNAMIC flag were ON, since it would rely only on the 1989Q4 value for GNP.

Note that the FORECAST command stores the forecasted values of the dependent variable under the same name as the dependent variable previously defined. This means that any existing values for the dependent variable over the forecast period are replaced and cannot be retrieved. All existing values for the dependent variable outside the forecast period are retained, however, with the result that forecasted values are spliced into the original series as though the REVISE command had been used. To preserve existing values, the dependent variable forecast should be deflected to another variable with the TAG modifier on the FORECAST command:

     USE 1975Q1 1982Q4
     REGRESS GNP CONSUMPTION INVESTMENT{-1}
     RECOVER GNP_EQUATION FOREQ
     USE 1983Q1 1984Q3
     FORECAST(TAG=TEMP_GNP)GNP_EQUATION
     PRINT GNP TEMP_GNP

PRIMER -- Chapter 10 -- Interactive Print Server

10.1 Introduction

SORITEC allows complete control over the output presentation for selected procedures. In the REGRESS and XTAB commands, the user can control the order and depth of the presentation of results. REGRESS generates ten separate output summaries which may be selected, or repeated, in any order that you desire. XTAB allows you to scroll through the crosstabs table in a "spreadsheet" mode. A menu is provided which describes each display option.
The interactive regression display supports 10 different screen displays including 3 tables of residual summaries, a residual plot, the covariance matrix of coefficients, the correlation matrix of coefficients, extended regression reports (beta coefficient, partial r and elasticities), a regression summary table, the ANOVA table for goodness of fit, means and standard deviations of the independent variables and of course the regression estimates.

When the interactive mode is in effect, a selection menu appears on the last line of the screen. Entering a question mark (?) will bring up a more detailed help menu regarding the contents of each display. Selecting an invalid choice sounds the "bell" and prompts you for another choice.

There are several special keystrokes, in addition to those in the selection menu, that control interactive display. Entering a carriage return, a "+", or a space advances the display to the next tableau in the selection menu. Entering a backspace returns you to the previously displayed tableau. Entering a "-" displays the previous screen in the selection menu.

The interactive option is available for REGRESS, TWOSLS, CORC, and HILU, which are supported by both SORITEC and SORITEC SAMPLER. In SORITEC only, this option is also available for CORC2, HILU2, TSCORC, TSCORC2, TSHILU, and TSHILU2.

10.2 Entering Interactive Mode

To enable the interactive mode you must turn on the option by entering the command:

     ON CRT

When this option is enabled, SORITEC automatically switches into a screen-oriented presentation whenever a command is executed that supports the interactive tableaux. To stop the interactive presentation, enter OFF CRT. SORITEC will switch to printer-oriented output.

10.3 Tableau Descriptions

The following sections discuss each tableau and its associated menu selection code available with SORITEC estimation commands.
10.3.1 Coefficient Display (E)

Coefficient estimates are automatically displayed when the regression equation is estimated. The presentation shows the technique, the current sample period, coefficients, standard errors, t-values and the significance levels of the t statistic.

10.3.2 Regression Summary Table (G)

The regression summary table provides a quick synopsis of the regression. The table reports the number of observations, sum of squared residuals, the value of the log-likelihood function, Schwarz and Akaike criteria, R-squared, adjusted R-squared, the standard error of the regression, Durbin-Watson and F-statistics and the significance of the F-statistic. If the ORIGIN option is specified, the statistics are adjusted appropriately.

10.3.3 Regression ANOVA Table (A)

This is the standard ANOVA table showing the derivation of the F-statistic reported in the summary table. All reported statistics are adjusted appropriately when the regression equation is constrained through the origin. ON ANOVA will activate this output when the OFF CRT flag, or non-interactive mode, is set.

10.3.4 Beta Coefficients, Elasticities and Partial R (B)

This tableau presents coefficient estimates and their associated Beta coefficients, elasticities and partial correlation coefficients. ON BETA enables this display when the OFF CRT option is set.

10.3.5 Correlation Matrix of Coefficient Estimates (C)

Although there is little theory regarding this matrix, which is a normalized variance-covariance matrix of the coefficients (which we call the "correlation matrix of coefficient estimates"), it does provide a quick way to examine the relationship between pairs of coefficients, and detect multicollinearity. ON CCOR will present this display when SORITEC is in OFF CRT mode.
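The normalization mentioned above can be sketched in Python: each covariance is divided by the product of the two corresponding standard errors. The function name and the sample matrix are hypothetical, and this only illustrates the arithmetic, not SORITEC's code.

```python
import math


def cov_to_corr(cov):
    """Normalize a coefficient covariance matrix into a correlation matrix."""
    n = len(cov)
    # Standard errors are the square roots of the diagonal elements.
    std_err = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std_err[i] * std_err[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical 2x2 covariance matrix of two coefficient estimates.
corr = cov_to_corr([[4.0, 2.0],
                    [2.0, 9.0]])
```

Off-diagonal entries close to +1 or -1 indicate pairs of coefficients whose estimates move together, which is one symptom of multicollinearity.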
10.3.6 PDF and Histogram of Standardized Residuals (H)

This table provides a quick summary of the distribution of the residuals for quick identification of outliers or a skewed distribution, and shows the percentage of residuals falling between each integer multiple of the regression standard error, including a histogram of the same information. The histogram information has a higher resolution than the table since each line of the screen represents 1/3 of a standard deviation. Therefore, the scale of the histogram is about 1/3 that of the residual PDF table; specifically, if the maximum PDF table value is 50%, the maximum vertical value on the plot would be on the order of 17%.

10.3.7 Convergence Path for Autocorrelated Estimators (M)

This display shows the path of values of the coefficient of autocorrelation ("rho") for the Cochrane-Orcutt and Hildreth-Lu techniques (CORC and HILU) in the SORITEC SAMPLER, and in most of the CORC- and HILU-related techniques in the full SORITEC language.

10.3.8 Non-Parametric Residual Distribution Tests (N)

This table provides a set of statistical tests on the normality of the residual distribution as well as tests of the randomness of the residuals. Specifically, SORITEC SAMPLER carries out a "Run of Signs" test for randomness, a chi-square test against the normal distribution, and a Kolmogorov-Smirnov test for normality.

10.3.9 Actual vs Fitted Plot and Standardized Residuals (P)

This display shows the actual versus fitted and standardized residuals for the regression. The plot is produced in a form that is reproducible by line printers, except that on an IBM PC or compatible, the plots appear in 3-color medium resolution mode. ON PLOT activates this output when the OFF CRT option is set.

10.3.10 Residual Autocorrelation Summary (R)

The residual summary table provides information on the distribution of the residuals (sum of squared residuals, skewness, kurtosis, etc.)
and the autocorrelation structure of the residuals with Durbin-Watson (for one, four and 12 periods) and every fourth one of the first 24 Box-Pierce statistics. All these statistics, along with the first 24 autocorrelation coefficients, may be recovered for later analysis.

10.3.11 Statistical Summary of Exogenous Variables (S)

This table reports the mean and standard deviation of the independent variables, as well as the mean of the dependent variable. When the OFF CRT option is set, this display is activated by ON STATS.

10.3.12 Covariance Matrix of Coefficient Estimates (V)

This tableau displays a covariance matrix of the coefficients. It is equivalent to the display produced by the ON VCOV option when the OFF CRT option is set.

10.3.13 Exogenous Variables List (X)

This display lists the variables named in the most recent EXOGENOUS statement and used in the current TWOSLS.

10.4 Interactive Crosstabs

The XTAB command allows for interactive scrolling through the table in a spreadsheet manner. In this mode, keys are interpreted as follows:

     Key     Interpretation
     ---     --------------
     X       Move down one screen
     S       Move left one screen
     D       Move right one screen
     E       Move up one screen
     Q       Quit the XTAB command

PRIMER -- Chapter 11 -- Simulation

11.1 Introduction

SORITEC allows you to simulate simultaneous systems of equations, either linear or non-linear. SORITEC offers you a choice of either Newton's method or the Gauss-Seidel algorithm for simulating your model. In this article we will consider only the latter option.

11.2 Defining the Equations of your Model

The first step in simulating a model is to specify the equations. When using the Gauss-Seidel algorithm the equations must be in normalized form. That is, each endogenous variable must appear as the left-hand variable in exactly one equation. (Newton's method allows you to simulate models that are not normalized.)
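To see why normalized form matters, here is a minimal Python sketch of Gauss-Seidel iteration for a single period. Each equation is evaluated in turn, and the new value of its left-hand variable is used immediately by the equations that follow. The function name, the coefficients, the stopping rule based on the largest relative change, and the default tolerance and iteration cap are all assumptions for the illustration. They are not taken from SORITEC.

```python
def gauss_seidel(equations, start, tol=1e-4, max_iter=50):
    """Solve a normalized system by repeated sweeps through the equations.

    'equations' maps each endogenous variable to a function of the current
    values; 'start' holds starting values and any exogenous variables.
    """
    values = dict(start)
    for iteration in range(1, max_iter + 1):
        max_rel_change = 0.0
        for name, formula in equations.items():
            new_value = formula(values)
            scale = abs(new_value) if new_value != 0.0 else 1.0
            change = abs(new_value - values[name]) / scale
            max_rel_change = max(max_rel_change, change)
            values[name] = new_value      # used at once by later equations
        if max_rel_change <= tol:
            return values, iteration
    raise RuntimeError("Gauss-Seidel did not converge")


# Hypothetical normalized two-equation system: each endogenous variable
# appears on the left-hand side of exactly one equation.
equations = {
    "PRICE": lambda v: 10.0 - 0.5 * v["QUANTITY"] + 0.1 * v["TIME"],
    "QUANTITY": lambda v: 2.0 + 0.4 * v["PRICE"] + 0.2 * v["TIME"],
}
solution, sweeps = gauss_seidel(
    equations, {"PRICE": 0.0, "QUANTITY": 0.0, "TIME": 1.0})
```

Gauss-Seidel is not guaranteed to converge for every system. In this sketch the feedback between the two equations is weak (0.5 times 0.4), so successive sweeps settle quickly. With strong feedback the same iteration can diverge.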
Let's consider a two-equation model of demand and supply. We'll need to use an EQUATION command to define a demand equation and another to define a supply equation. Let's suppose that both equations are linear, and are shifting over time. The following commands could be used to specify our equations:

     EQUATION DEMAND PRICE = a + b*QUANTITY + c*TIME
     EQUATION SUPPLY QUANTITY = d + e*PRICE + f*TIME

Note that PRICE and QUANTITY each appear as a left-hand variable.

Now we need to tell SORITEC that a, b, c, d, e, and f are parameters, and to give them values. We use the PARAMETER command as follows:

     PARAMETER a 100 b -4 c .75 d 10 e 3 f 1

(Note that we follow the SORITEC convention of defining as parameters quantities that we may wish to estimate in the future. For the immediate purposes of this example, we could have defined a, b, c, d, e, and f as constants.)

11.3 Combining the Equations into a Superformula

The next step is to use the SUPERF command to combine the equations of the model into a superformula suitable for simulation by the Gauss-Seidel algorithm. We simply specify a name for the superformula and list the equations:

     SUPERF superformula equation1 equation2 ...

In the demand and supply example, the following command would be used to create a superformula:

     SUPERF D_AND_S DEMAND SUPPLY

The previous command would combine our equations into a superformula called D_AND_S. You might think of a superformula as a system of equations. Once you have created the superformula, the individual equations are no longer needed for simulation.

11.4 Simulating a Model Using the Gauss-Seidel Algorithm

Before simulating our model, we need to specify the period over which the simulation is to be performed. Let's assume that we want to simulate the model from the first quarter of 1988 through the final quarter of 1989.
Then we would enter the following command:

     USE 1988Q1 1989Q4

We also have to supply values for the exogenous variables. In our model we have only one such variable, namely TIME. We could supply values as follows:

     SERIES TIME 1 2 3 4 5 6 7 8

Now we are ready to simulate the model. To do so we use the FORECAST command:

     FORECAST D_AND_S

The general form of the FORECAST command is as follows:

     FORECAST [ ( [TOL=s1] [MAXIT=n1] [MAXPRT=n2] [TAG=name1] [STATIC|DYNAMIC] [NOBASE] ) ] superformula

The modifier TOL specifies the convergence criterion. Convergence is declared when the maximum absolute relative error is no greater than s1. {Default: .0001}

The modifier MAXIT specifies the maximum number of iterations within any simultaneous block. {Default: 50}

The modifier MAXPRT specifies the maximum number of iterations for which to print intermediate solution values. {Default: 0 if the PRINT flag is off, 5 if the PRINT flag is on}

The TAG modifier specifies a tag to give to all solution values. For example, if you were to specify TAG=ALT, and one of the left-hand variables was GNP, then the simulated values for GNP would be stored in the variable ALT^GNP. The TAG modifier therefore allows you to run several versions of a model and store the results in different variables. The default action is to store the results in the variables specified in the equations.

The STATIC modifier instructs SORITEC to perform a static simulation, whereas the DYNAMIC modifier instructs SORITEC to perform a dynamic simulation. The default is to perform a DYNAMIC simulation.

The NOBASE modifier instructs SORITEC to not use any existing values as starting values for solutions within simultaneous blocks. The default is to use all such values as starting values.

11.5 Putting Add Factors into Equations

SORITEC allows you to specify add factors for your equations. To do so you use the ADDFAC command, whose general form is as follows:

     ADDFAC equation...
ADDFAC inserts an add factor into each equation specified. The add factor will have the same name as the left-hand variable of the equation, prepended by an ampersand (&). For example, to put add factors into the equations of our demand and supply model, you would use the following command:

     ADDFAC DEMAND SUPPLY

Since PRICE is the left-hand variable of the equation DEMAND, the add factor for that equation would be &PRICE. And since QUANTITY is the left-hand variable of the equation SUPPLY, the add factor for that equation would be &QUANTITY.

ADDFAC does not declare or initialize the add factor. It only places the add factor into the equation. Thus an add factor can be any type of item that is legal in the equation. To suppress the effect of the add factor, set its value or values to zero.

If you place an add factor in an equation that was previously combined into a superformula, and you then want to simulate the model with the new (add-factored) equation, you must first use the SUPERF command to re-make the superformula. Remember, the superformula, once created, does not reference the individual equations. Thus, changes to the equations are not reflected in the superformula unless the SUPERF command is run again.

11.6 Comparing Scenarios

The TAG modifier of the FORECAST command allows you to run different scenarios of a model and store the results in different variables. Once you have done so you can use the COMPARE command to compare the various scenarios with the baseline values. For example, we could run the demand and supply example above with different values for the parameters, and then compare the results to historical data.

The general format of the COMPARE command is as follows:

     COMPARE [( [TAG=name] [REVERSE] [DIFF|%CH] )] series...

The modifier TAG compares each series to the corresponding series whose name begins with the tag.
For example, the following command would result in comparisons between PRICE and BULL^PRICE, and QUANTITY and BULL^QUANTITY:

    COMPARE (TAG=BULL) PRICE QUANTITY

The modifier REVERSE reverses the comparison values from "baseline vs. scenario" to "scenario vs. baseline".

The DIFF or %CH options perform a comparison of the first difference or percentage growth between the baseline and simulation values, respectively.

PRIMER -- Chapter 12 -- Forecasting with Time Series Techniques

12.1 Introduction

SORITEC provides you with the ability to forecast time series using Box-Jenkins techniques. The INSPECT command will provide you with a correlogram with autocorrelation and partial autocorrelation information, which will aid you in the identification of an appropriate model. The MARMA command will allow you to estimate your model and forecast with it.

12.2 Identification of a Time Series Model

Identification of the lag structure of a time series model is aided by the INSPECT command, which has the following syntax:

    INSPECT [(YULE)] data numlag [ ndiff [ nseasonal lseasonal ]]

The YULE modifier specifies a Yule-Walker approximation to partial correlation coefficients. These are calculated more rapidly than the exact coefficients, but are significantly less accurate, particularly in small samples.

The "numlag" argument specifies the number of lags over which the autocorrelation and partial autocorrelation coefficients are to be calculated.

The "ndiff" argument is optional and specifies the order of regular differencing to be applied to the time series before calculation of coefficients.

The "nseasonal" and "lseasonal" arguments specify the order of seasonal differencing and season length, respectively. They are optional, but must both be specified if they are to be used at all.
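The quantities that INSPECT reports can be illustrated outside SORITEC. The Python sketch below is not SORITEC's implementation: it assumes the textbook sample autocorrelation estimator and obtains Yule-Walker-style partial autocorrelations with the Durbin-Levinson recursion. The function names (difference, acf, pacf_from_acf) are hypothetical, and their arguments mirror "ndiff", "nseasonal", "lseasonal" and "numlag" only by analogy.

```python
def difference(data, ndiff=0, nseasonal=0, lseasonal=1):
    """Apply regular differencing ndiff times, then seasonal differencing
    nseasonal times with season length lseasonal (illustrative helper)."""
    x = list(data)
    for _ in range(ndiff):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    for _ in range(nseasonal):
        x = [x[t] - x[t - lseasonal] for t in range(lseasonal, len(x))]
    return x


def acf(data, numlag):
    """Sample autocorrelations r_1 .. r_numlag (divisor n at every lag).
    A constant series (zero variance) is not handled in this sketch."""
    n = len(data)
    mean = sum(data) / n
    dev = [v - mean for v in data]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / n / c0
        for k in range(1, numlag + 1)
    ]


def pacf_from_acf(r):
    """Partial autocorrelations from autocorrelations r = [r_1, r_2, ...]
    using the Durbin-Levinson recursion (a Yule-Walker-type estimate)."""
    pacf = []
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, len(r) + 1):
        if k == 1:
            phi_kk = r[0]
            cur = [phi_kk]
        else:
            num = r[k - 1] - sum(prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf
```

For autocorrelations that decay geometrically, such as 0.5, 0.25, 0.125, pacf_from_acf returns a single spike of 0.5 at lag 1 followed by zeros, which is the pattern associated with a first-order autoregressive process.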
Issuing the ON PLOT command before executing INSPECT causes the correlogram to be produced, in addition to a table containing the calculated coefficients.

A detailed discussion of the identification of a time series model is beyond the scope of this document, but here are a few simple guidelines:

1) A moving average (MA) process is characterized by a small number of significant spikes in the Autocorrelation Function (ACF) and a gradual approach to zero by the Partial Autocorrelation Function (PACF). The number of significant spikes provides an estimate of the order of the MA process.

2) An autoregressive (AR) process is characterized by a small number of significant spikes in the Partial Autocorrelation Function and a gradual approach to zero by the Autocorrelation Function. The number of significant spikes provides an estimate of the order of the AR process.

3) An ambiguous pattern, with significant spikes in both the ACF and the PACF, may signify an ARMA or ARIMA process.

For a more detailed discussion of identification of and forecasting with Box-Jenkins models, please refer to Business Forecasting by J. Holton Wilson and Barry Keating (Richard D. Irwin, Inc., 1990), or Time Series Analysis: Forecasting and Control by G. E. P. Box and Gwilym M. Jenkins (Holden-Day, 1976).

12.3 Estimation and Forecasting of Time Series Models

Estimation and forecasting of a time series model is accomplished by the MARMA command, which has the following syntax:

    MARMA ([P=nar] [Q=nma] [D=ndiff] [S=nseas SL=seasl] [F=fper] [ORIGIN] [CENTER]) data

The P modifier specifies the order of the AR process.

The Q modifier specifies the order of the MA process.

The D modifier specifies the order of regular differencing to be applied to the time series.

The S and SL modifiers specify the order of seasonal differencing and the season length, respectively. They are optional, but must both be specified if they are to be used at all.
The F modifier specifies the number of periods at the end of your active period for which to calculate forecast values. If you do not specify extra periods beyond your sample in the last USE command before MARMA, the final "fper" observations will be overwritten with forecast values.

The ORIGIN modifier specifies suppression of the constant term.

The CENTER modifier removes the mean from the variable "data" before doing the analysis.

The MARMA command places residuals from the estimation in ^RES. You can obtain the correlogram for the residuals from the INSPECT command, as discussed above. If the correlogram indicates none of the patterns discussed above, then all that is left in the residuals is white noise, and your model specification explains the data well. If there is a noticeable pattern in the correlogram of the residuals, you should try another model specification (different P, Q, etc.).

PRIMER -- Chapter 13 -- Forecasting with Smoothing Techniques

13.1 Introduction

SORITEC provides you with facilities to forecast time series using smoothing methods. The MA and CMA commands will calculate the regular and centered moving averages of a time series, respectively. The SMOOTH command will forecast a time series by exponential smoothing.

13.2 Moving Average of a Time Series

The moving average of a time series can be calculated by the MA command, which has the following syntax:

    MA result data length

The "result" argument specifies the time series where the calculated moving average is placed.

The "data" argument specifies the original time series.

The "length" argument specifies the length of the moving average.

The CMA command has the same syntax, but computes the centered moving average instead of the regular moving average.
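A small Python sketch may make the difference between the two averages concrete. It is not SORITEC code, and the manual does not state how MA aligns its window: the sketch assumes a trailing window for the regular moving average and, for even lengths, the common 2-by-length centering convention for the centered one. The function names are hypothetical, and None marks periods for which the window is incomplete.

```python
def trailing_ma(data, length):
    """Regular moving average, assumed here to be trailing: the value at
    period t averages observations t-length+1 .. t."""
    out = []
    for t in range(len(data)):
        if t + 1 < length:
            out.append(None)  # not enough history yet
        else:
            window = data[t - length + 1 : t + 1]
            out.append(sum(window) / length)
    return out


def centered_ma(data, length):
    """Centered moving average: the window is symmetric around period t.
    For an even length the two end points get half weight (assumption)."""
    half = length // 2
    out = []
    for t in range(len(data)):
        if t - half < 0 or t + half >= len(data):
            out.append(None)  # window would run off the series
            continue
        window = data[t - half : t + half + 1]
        if length % 2 == 1:
            out.append(sum(window) / length)
        else:
            total = sum(window) - 0.5 * (window[0] + window[-1])
            out.append(total / length)
    return out


print(trailing_ma([1, 2, 3, 4, 5], 3))  # [None, None, 2.0, 3.0, 4.0]
print(centered_ma([1, 2, 3, 4, 5], 3))  # [None, 2.0, 3.0, 4.0, None]
```

The same data give the same averages in both cases; only the period to which each average is assigned differs.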
13.3 Exponential Smoothing

Exponential smoothing of a time series is accomplished by the SMOOTH command, which has the following syntax:

    SMOOTH ( type F=nper [L=seasl] ) result data

The "type" modifier specifies the type of smoothing required. Valid types are SIMPLE, HOLT, and WINTER.

The F modifier specifies the number of periods after the end of your active period for which to calculate forecast values.

The L modifier specifies the season length. This modifier is used only with WINTER specified as the type of smoothing.

SIMPLE smoothing does not take trend or seasonality into account. This technique is a one-parameter method that calculates a simple weighted average of past values, assigning a greater weight to recent values than to older values.

HOLT smoothing takes trend into account, but not seasonality. This technique is a two-parameter method that calculates a weighted trend component of the series in addition to a simple weighted average of past observations.

WINTER smoothing takes both trend and seasonality into account. This technique is a three-parameter method that calculates a weighted seasonal component, weighted trend component, and simple weighted average of past observations.

In all types of exponential smoothing, SORITEC automatically selects the optimal values for all parameters, allowing you to avoid trial and error in parameter value selection.

SORITEC FLAGS(2)

FLAGS -- GLOBAL OPTION SETTINGS

Many SORITEC commands and operations are controlled in part by flags, which are global SORITEC option switches. These flags can be used to increase or decrease the amount of standard output from a command, reset default options for ways of handling data, turn journalling on and off, or otherwise reconfigure the SORITEC computing environment to suit the user and the task at hand. The flags can also be stored, retrieved and printed.
Flags are primarily manipulated using the following commands:

ON/OFF -- Change One of the Flags

ON or OFF is used to turn on or off one or more of the flags, or to display the current settings.

    ON|OFF [flagname...]

RECOVER

RECOVER initializes and names the current value stored under an internal name so that it can be preserved from being overwritten by a new operation. In the context of flag manipulation, the internal name ^FLAGS can be recovered for restoration with the FLAGS command:

    RECOVER vector FLAGS

It is often desirable, when writing procedures, to routinely recover the flag settings and the USE period upon entering the procedure, and to reset both the flags and the USE period upon exiting the procedure.

FLAGS -- Specification of Global Flag Configuration

FLAGS resets an entire flag configuration, which was dumped into a vector by the use of the RECOVER command:

    FLAGS vector

FLAG CONTROL AND PRINTING

The ON and OFF commands turn flags on and off. Any number of flags can be turned on or off with a single command, but there is no way in a single command to turn some flags on and others off. Two commands, one ON and one OFF command, would be needed.

Entering an ON or OFF command without any arguments prints a table showing the current status of the selectable options:

    ON

    Flag settings:
    OFF ALIAS     ON  DIVZERO   ON  HEAD      OFF PRINT     OFF STREAMIO
    OFF ANOVA     OFF DOLLAR    ON  JOURNAL   ON  PROMPT    OFF TOKENS
    OFF AUTOLOG   OFF DYNAMIC   OFF LOG       OFF QUIET     OFF TRAIL
    OFF BETA      OFF ECHO      ON  MISSING   OFF RAGGED    ON  UPRINT
    ON  BREAK     OFF ECS       OFF NOEJECT   OFF RAWEQ     OFF USE
    OFF BRIEF     OFF EXACT     OFF NOERROR   OFF REPLACE   OFF VCOV
    ON  CAUTION   OFF EXPDAMP   ON  NOTE      OFF RESIDUAL
    OFF CCOR      OFF FASTDIF   OFF PATH      OFF REVISE
    OFF CRT       ON  GLOBAL    OFF PERFECT   OFF ROBUSTSE
    OFF DETAIL    OFF GROUP     OFF PLOT      OFF STATS

Every ON or OFF command that changes an option stores an internal vector called ^FLAGS.
This vector contains enough information for the FLAGS command to restore the option values to those in effect immediately after the command executed. RECOVERed ^FLAGS vectors may be stored in SORITEC databanks and subsequently used to restore the flags environment with the FLAGS command.

The FLAGS command has one argument, the name of a vector containing the options settings. The only way to create an options vector is by RECOVERing the ^FLAGS vector.

CAVEATS: The ^FLAGS vector must NOT be changed in any way, or unreliable or unpredictable results may occur. The FLAGS command exists solely to restore a previous ON/OFF pattern. The ordering and number of the ON/OFF options may change in future releases, so flag vectors stored in databanks may not restore the options desired if used by a later release of SORITEC. Any flag vectors stored in databanks should be reconstructed and replaced whenever a new SORITEC release is received.

Flag settings:

FLAG      DEFAULT FUNCTION WHEN TURNED ON
--------- ------- ------------------------------------------------------
ALIAS     OFF     Aliases of names rather than the names themselves appear in all output.
ANOVA     OFF*    ANOVA table is printed for all regressions in which it is calculated.
AUTOLOG   OFF     This is an experimental flag. Allows logarithmic expressions to be used as a command argument, i.e. "REGRESS LN(Y) LN(X)". This is not a formally released facility.
BETA      OFF*    Beta coefficients, partial r's, and elasticities are printed for all regressions in which they are calculated.
BREAK     ON      Enables keystroke interrupt (usually CTRL-C). When ON, eight successive interrupts will terminate SORITEC. This is a system-dependent feature. There may be system-specific information regarding this feature in the Section 7 article specific to your computer.
BRIEF     OFF     Suppresses interactive prompt and other output which the user might not want when designing a SORITEC job to be run by a non-SORITEC user.
CAUTION   ON      Print CAUTION level error messages.
CCOR      OFF*    Correlation matrix of the coefficients is printed for all regressions in which it is calculated. This flag also affects the printing of the ^CCORT matrix, which is generated by the ANALYZE command.
CRT       OFF     Causes SORITEC to generate screen-oriented output for many commands, and to pause every PAGESIZE lines for commands not using screen-oriented output. This is a system-dependent feature. There may be system-specific information regarding this feature in the Section 7 article specific to your computer.
DETAIL    OFF     Causes SORITEC to print details of particular calculations, particularly non-linear estimation. May cause voluminous output.
DIVZERO   ON      Not used.
DOLLAR    OFF     Dollar signs are interpreted as semicolons.
DYNAMIC   OFF     Transformations involving lags of the result variable are computed dynamically, not statically.
ECHO      OFF     Echoes each line typed in or read from a file.
ECS       OFF     Extended Character Set -- allows ASCII characters 128 to 255 to be entered in a command. This enables, on an IBM PC and some other systems, the "national" character sets, including accented characters, umlaut characters, and currency symbols, such as the pound sterling symbol.
EXACT     OFF     Suppresses standardization of validation mode round-off treatment (used only by Sorites Group in testing).
EXPDAMP   OFF     Makes EXP and LOG functions become very, very steep outside the "normal" range encountered in well-behaved nonlinear problems. Both functions will then not issue error messages, but will generate function values that are extreme in value. The damped functions have the same first derivatives as the EXP and LOG functions at the tie points, so they are well-behaved substitutes for EXP and LOG in many nonlinear problems.
FASTDIF   OFF     Makes DIF files read much more quickly. However, error messages produced by the FASTDIF facility are relatively unhelpful in identifying problems.
                  Should only be used as part of completely debugged processing systems.
GLOBAL    ON      Unreleased facility.
GROUP     OFF     Groups are expanded, so that a command containing a group is interpreted as if the elements of the group, not the group name, were all present in the command.
HEAD      ON      The batch page heading appears in batch jobs.
JOURNAL   ON      A journal file is produced containing all interactive commands entered.
LOG       OFF     Will write most SORITEC output on an output log file.
MISSING   ON      Missing values are recognized, and various commands deal with them by casewise deletion, imputation, or error processing, as the case may be. When this flag is OFF, missing values are treated as zero.
NOEJECT   OFF     Suppresses all page ejects in batch jobs.
NOERROR   OFF     Kills a SORITEC batch job on any error message. Obsolete flag, being replaced by PERFECT.
NOTE      ON      Print NOTE level error messages.
PATH      OFF     Causes printing of the iteration log for iterative autocorrelation estimators.
PERFECT   OFF     Kills a SORITEC batch job on any error message.
PLOT      OFF*    A plot of actual values, fitted values, and residuals is produced after every regression, and other plots of relevant data are produced after every use of certain other commands, such as INSPECT.
PRINT     OFF     Data from SAL files is printed as it is read in, and many intermediate results from the more complicated commands (e.g., FIML, MARMA) are printed during command execution.
QUIET     OFF     Suppresses much of SORITEC's output. This flag is generally used when the user wishes to take control of the output screen for a fourth-generation language application.
PROMPT    ON      SORITEC prompts for user input in interactive mode.
RAGGED    OFF     SORITEC will NOT enforce the usual requirement that the number of data values processed by a FILL, READ or LOAD command be compatible with the USE period. An error message will still be produced if too much data is entered, but too little data will result in padding with missing values.
RAWEQ     OFF     The SORITEC internal result ^RAWEQ, and all necessary accompanying parameters, will be produced after each regression. This uses a lot of symbol table space.
REPLACE   OFF     An attempt to KEEP a variable on a databank which already contains a variable with that name will be successful, and no error message will be produced.
RESIDUAL  OFF*    A residual analysis table will be produced after each regression.
REVISE    OFF     Every command which would ordinarily create a time series will be treated as a revision. This means that every time series which would have been created by, for instance, a transformation (COMPUTE command) must already exist so long as this flag is ON, and values of the series outside the current USE interval will be preserved, and merged with the new values.
ROBUSTSE  OFF     Regression standard errors will be computed in accordance with a robust technique developed by Halbert White.
STATS     OFF*    A table showing the mean and standard deviation of all independent variables will be produced for each linear regression.
STREAMIO  OFF     During a formatted read in which the number of fields in the format being used does not conform to the number of variables being read, SORITEC normally will reuse the format from the start for each new observation. When this flag is ON, all data for the command will be read using the format once, or reusing the format according to the way FORTRAN reuses a format when more data is read than specified by the number of fields in the format statement.
TOKENS    OFF     Not used.
TRAIL     OFF     Produces a sometimes voluminous trail of internal temporary results and debug output. Much of what is produced would only be meaningful to Sorites personnel, but it may occasionally be helpful or suggestive to the user.
UPRINT    ON      Underscores in variable names are printed whenever the variable name appears in SORITEC output.
USE       OFF     The USE period will be printed out whenever it is changed.
VCOV      OFF*    The variance-covariance matrix is printed after each regression. This flag also affects the printing of the ^VCOVT matrix, which is generated by the ANALYZE command.

* Not relevant when (a) the CRT flag is ON, AND (b) the computer system supports the tableau form of presentation of results. See CRT(2).

EXAMPLES AND EXTENDED DISCUSSION

ON/OFF REVISE [DEFAULT OFF]

When the REVISE flag is turned ON, new time series may not be created, and attempts to assign values to undefined variables with any command whose result would be a time series will result in an error. With REVISE set ON, all time-series assignment and FILL statements will behave as though they were prefixed by a REVISE command.

When the REVISE flag is ON, observations are added to existing series if the current USE period is a superset of the USE period under which the symbol was last defined; if the current USE period is a subset of the USE period under which the variable was created, or if the current USE period does not overlap the old USE period at all, none of the old values of the series are lost.

ON/OFF REPLACE [DEFAULT OFF]

When the REPLACE flag is ON, the databanking KEEP command will KEEP items on the currently ACCESSed databank regardless of whether name conflicts occur with previously stored items. In other words, KEEP acts like a REPLACE command when necessary to do an unconditional KEEP.

ON/OFF DOLLAR [DEFAULT OFF]

When the DOLLAR flag is turned ON, dollar signs appearing in input are interpreted as semicolons (command separators). Use of this feature is not recommended; this flag will be removed in a future release.

ON/OFF JOURNAL [DEFAULT ON]

When SORITEC is executing in interactive mode, each command entered by the user from the keyboard is written to a journal file, provided that the JOURNAL flag is ON (which is the default).
When an interactive session is finished, it is a simple matter to re-create the session or extract a portion of the journalled commands for later re-use by editing the journal file. Since user discretion in controlling the journal file is contrary to the whole journal file philosophy, using the ON/OFF JOURNAL command is not recommended.

A new journal file is written each time SORITEC is executed with the JOURNAL flag ON. On some systems, such as the IBM PC, the journal file always has the same name (in this case, SORITEC.JNL), and thus each session of SORITEC that uses journalling will overwrite the previous session's journal file. On most UNIX systems, the file name used for each journal is constructed from either the system time and date or the process identifier for SORITEC. It is not guaranteed to be unique, but it is likely that overwrites of the old journal will be rare. This does mean, however, that one must occasionally clean out the accumulated old journal files, usually with the command "rm *.jnl".

ON/OFF ALIAS [DEFAULT OFF]

SORITEC can show either the actual or formal parameter name of items passed to procedures. This is controlled by the ALIAS flag.

The ALIAS option controls the printing of variable names in output produced by commands invoked from within a PROCEDURE. When ALIAS is OFF, arguments to a PROCEDURE are shown with the name of the formal parameter used in the procedure, i.e. with exactly the same name as the dummy variable in the PROCEDURE statement defining the subroutine. When ALIAS is ON, variables are shown bearing the name used in the CALL statement that invoked the procedure.

Parameter passing in most languages transmits only values to formal procedure arguments. SORITEC also passes the symbolic identity of the actual passed parameters. This facility is called aliasing.
Example:

    PROCEDURE SIMPLE(X)
    PRINT X
    END SIMPLE
    PARAMETER A 3.0
    OFF ALIAS ; CALL SIMPLE(A)
    ON ALIAS ; CALL SIMPLE(A)

    Parameter X = 3.000000
    Parameter A = 3.000000

NOTE: It is anticipated that the default ALIAS setting will be changed to ON at a later release.

ON/OFF CRT [DEFAULT OFF]

The ON CRT command is used with the PAGESIZE command to control output to the CRT terminal. When the CRT option is ON, SORITEC will print only PAGESIZE or fewer lines of information before pausing. Entering a carriage return resumes output.

Many SORITEC commands will produce screen-oriented output when the CRT flag is ON. For instance, most of the linear regression variations, such as REGRESS, TWOSTAGE, HILU, CORC, and so on, will produce information on the regression being performed in sections that will fit on a CRT screen. Using single keystrokes, the user may page back and forth among the various reports, such as the estimated coefficients, the ANOVA table, residual plots, and so forth, without any re-calculation of those results being done.

SEE ALSO

CRT(2), FLAGS(3), ON(3)

SORITEC RECOVER(2)

RECOVER -- Recoverable Results

Many SORITEC commands not only print results, but store those results, and other secondary results that are normally not printed, under "internal names". These internal names start with a caret (^), and you generally do not change them. For example, the coefficients from an ordinary least squares (OLS) regression are automatically stored under the name "^COEF", so that you may easily reference the coefficients without having to type them back into SORITEC.

RECOVER(3) may be used to change the internal names to a legal SORITEC name. However, in most instances, the internal name may simply be used directly in SORITEC commands. For instance, the fitted values from an OLS regression can be recalled by referencing ^YFIT. For example:

    resid = y - ^yfit

will calculate the residuals for an estimated equation.
Note that it is still necessary to use the RECOVER command to reassign the names of non-mathematical items, such as equations and groups. Internal variables may not be saved on a databank without being assigned a new name, e.g. KEEP ^YFIT will fail.

Results stored under an internal name need not be recovered immediately since all such results remain available until a later command overwrites the values by using the same internal name.

These are the internal names which are created by SORITEC:

INTERNAL   CREATED    ITEM  DIMENSIONS
RESULT     BY         TYPE  OR LENGTH        DESCRIPTION
--------   --------   ----  ----------       --------------------------------------
^A         SCURV      C                      Intercept term of linearized model
           EXGRO
^ACOR      INSPECT    V     **               Autocorrelation coefficients
^ACORSE    INSPECT    V     **               Standard error of autocorrelation coefficients
^ACOV      INSPECT    V     **               Autocovariance coefficients
^AKAIKE    (1-3)      C                      Akaike information criterion
^AUTOCOR   (1-3)      V     24               First 24 autocorrelation coefficients for residual series
^B         SCURV      C                      Slope term of linearized model
           EXGRO
^BPQ       (1-3)      V     24               First 24 Box-Pierce Q statistics for residual series
^CCOR      PROBIT     P     (^NV,^NV)        Correlation matrix of coefficient
           (1-6)
^CCORT     ANALYZE    M     (^NV,^NV)        Correlation matrix of transformed coefficients
^COEF      (1-6)      V     ^NV              Estimated coefficients
           MARMA3
           MINIMAX
           PROBIT
           ACTFIT
^CONST     *(a)       T                      Time series predefined to 1.0
^COR       CORREL     M     (^NARGS,^NARGS)  Correlation matrix
^COUNTS    TTEST      V     ^NV              Number of non-missing observations in each variable
           SYNOPSIS   V     ^NV
^COV       COVA       M     (^NARGS,^NARGS)  Covariance matrix
           CORREL
^COVTR     MVR        P     (^NEQ,^NEQ)      Covariance matrix of transformed residuals
           THREESLS
^COVUTR    MVR        P     (^NEQ,^NEQ)      Covariance matrix of untransformed residuals
           THREESLS
^CV        SYNOPSIS   V     ^NV              Coefficient of variation of each variable
^DATE      *(b)       S                      Today's date
^DECILE    SYNOPSIS   V     10               Derived
                                             decile points
^DEP       (1-5)      G     1                Name of dependent variable
           RIDGE
           PROBIT
^DEVS      (1-)       V     ^NARGS           Standard deviation of each variable
           CORREL
           STATS
           SYNOPSIS
           TTEST
^DF1,^DF2, DISCRIM    E                      Equations that represent the various discriminant functions
^DF3, etc.
^DLEST     MARMA      V                      SGI internal use
^DURBINH   (1-3)      C                      Durbin's H statistic
^DW        (1-6)      C                      Durbin-Watson statistic
^DW12      (1-3)      C                      Durbin-Watson statistic, order 12
^DW4       (1-3)      C                      Durbin-Watson statistic, order 4
^ENDOG     ENDOGENOUS G                      Current endogenous variable list
^EQORD     BUILD      G                      Optimal equation block ordering for Gauss-Seidel solution
^ERRNO     *(c)       C                      Error number last encountered
^EXOG      EXOGENOUS  G                      Current exogenous variable list
^FACTOR    ADJUST     T                      Seasonal factor series
^FINFUNC   MARMA      T                      Final white noise function values in transformed model
^FINFUNC   DISCRIM    M     (^OBS,#classes)  Matrix that contains for each observation the values of all of the discriminant functions
^FLAGS     ON         V     n/a              Vector to use in restoring flag settings
           OFF
^FOREQ     (1-5)      E                      Forecasting equation
           MINIMAX
^FRAG      DBGROUP    n/a   **               Largest possible subset storable
^FSTAT     (1-3,6)    C                      F-statistic for estimation
^GAPS      USE        C                      Number of gaps in USE period
           USEADD
           USEALL
           USEIF
^I         AMORT      T                      Interest series for RULEOF78 method
^IFCONV    CORC       C                      1 if last CORC-class command converged, else 0
           TSCORC
           CORC2
           TSCORC2
           ARC
^INFAR     MARMA      V     **               SGI internal use
^INFMA     MARMA      V     **               SGI internal use
^ITERS     (2,3,6     C                      Number of iterations used
           except
           FIML)
^KENDALL   NCOR       M     (^NV,^NV)        Kendall correlation coefficients
^KURT      (1-3)      C                      Kurtosis of residuals
^KURT      SYNOPSIS   V     ^NV              Kurtosis of each variable
^LAGCOi    (4)        V     ^NDEG(i)+1       Lag coefficients on the i'th distributed lag structure
^LAGSEi    (4)        V     ^NDEG(i)+1       Standard errors of lag coefficients stored in ^LAGCOi
^LAGSUMi   (4)        C                      Sum of lag coefficients for i'th distributed lag variable
^LENGTH    INPUT      C
                                             Length of item read in
^LOGDET    MCOMPUTE   C                      Natural log of determinant
           MINV
^LOGLIK    (1-6)      C                      Log-likelihood function
           PROBIT
^MAPE      SMOOTH     C                      Mean absolute percentage error of forecasted values
           TREND
           SCURVE
^MAX       SYNOPSIS   V     ^NV              Maximum of each variable
^MAX       DISCRIM    T                      Maximum of the values of the discriminant functions
^MAXSDB    *(a)       C                      Maximum number of databanks that can be active during run
^MEANABS   (1-3)      C                      Mean absolute error
           ACTFIT
^MEANERR   ACTFIT     C                      Mean error
^MEANS     (1-3)      V     ^NV              Mean of each independent variable
^MEANS     SYNOPSIS   V     ^NV              Mean of each variable
           TTEST
^MEANS     CORREL     V     ^NARGS           Mean of each variable
           STATS
^MEANSE    TTEST      V     ^NV              Standard error of the mean for each variable
^MEDIAN    SYNOPSIS   V     ^NV              Median of each variable
^MIN       SYNOPSIS   V     ^NV              Minimum of each variable
^MINMAX    MINIMAX    C                      Maximum absolute error achieved
^MLAGi     (4)        C                      Mean lag for i'th distributed lag variable
^MODE      SYNOPSIS   V     ^NV              Mode of each variable
^NAMES     (1-3,5,6)  G                      Independent variable names
           RIDGE
           PROBIT
^NARGS     GROUP      C                      Number of arguments in command
           COVA
           CORREL
           STATS
           INPUT
^NCOL      XTAB       C                      Number of columns in crosstabs table
^NDEG      (4)        V     **               Degree of i'th distributed lag variable, reduced if necessary for calculability
^NEQ       MVR        C                      Number of equations estimated
           THREESLS
^NET       CAPITAL    T                      Net investment series
^NGAPS     (4,5)      C                      Value of ^GAPS at time of last command which stores this internal result
^NOBS      (1-6)      C                      Number of observations actually used by analysis command
           RANK
           ACTFIT
           CORREL
           COVA
^NPER      (4)        V     **               Number of periods for i'th distributed lag variable
^NROW      XTAB       C                      Number of rows in crosstabs table
^NV        (1-3,5)    C                      Number of variables used in command
           SYNOPSIS
           TTEST
^NV        (4,6)      C                      Number of coefficients actually estimated
^OBS       USE        C                      Number of observations in USE period
           USEADD
           USEALL
           USEIF
^ORIGIN    *(d)       C                      1.0 if ORIGIN modifier present in last
                                             command which could have had one, else 0.0
^P         AMORT      T                      Payment series for RULEOF78 method
^P         DISCRIM    T                      Estimated probability of being in the class predicted by ^PRED
^PACOR     INSPECT    V     **               Partial autocorrelation coefficients
^PACORSE   INSPECT    V     **               Standard errors of partial autocorrelation coefficients
^PERYR     USE        C                      Number of time periods per year under the current USE period
           USEALL
^PRED      DISCRIM    T                      Predicted classes (class corresponding to the maximum value of any discriminant function)
^PV        IRR        C                      Present value
^QUARTIL   SYNOPSIS   V     4                Derived quartile points
^R2ADJC    (2,3)      C                      R-squared from differenced model
^RAWEQ     (1-5)(f)   E                      Raw forecasting equation
           MINIMAX
^REGSE     (4-6)      C                      Standard error of regression (see ^SEE)
^REP       CAPITAL    T                      Net replacement series
^RES       MARMA      T                      Residual series
^RESID     (1-3)      T                      Estimated residual series
^REVERR    REVISE     T                      Values that you attempted to REVISE into a non-existent variable
^RHO       (2)        C                      First-order autocorrelation coefficient
           ACTFIT
^RHO       (3)        V     2                First-order and second-order autocorrelation coefficients
^RHO1      (3)        C                      First order autocorrelation coefficient
^RHO2      (3)        C                      Second order autocorrelation coefficient
^RHOSE     (2)        C                      Standard error of ^RHO (first-order only)
^RMEANS    (1-4)      V     ^NV              Means of independent variables
           FASTREG
           FAST2SLS
^RMSERR    (1-3)      C                      Root mean squared error between actual and fitted values
           ACTFIT
^RSQ       (1-6)      C                      R-squared
^RSQADJ    (1-6)      C                      R-squared adjusted for degrees of freedom
^SCHWARZ   (1-3)      C                      Schwarz information criterion
^SE        (1-6)      V     ^NV              Standard errors of estimated coefficients
           MARMA
^SEE       (1-3)      C                      Standard error of estimate (see ^REGSE)
^SEED      *(e)       C                      Random number generator seed
^SIGNDET   MCOMPUTE   C                      Sign of determinant of last matrix inverted
           MINV
^SKEW      (1-3)      C                      Skewness of residuals
^SKEW      SYNOPSIS   V     ^NV              Skewness of each variable
^SPEARMAN  NCOR       M     (^NV,^NV)        Spearman
                                             correlation coefficients
^SSR       (1-6)      C                      Sum of squared residuals
^SUM       SYNOPSIS   V     ^NV              Sum of each variable
^SUMR      (1-6)      C                      Sum of residuals
^SYSERR    SYSTEM     C                      System error code (machine-dependent)
^THEILU    ACTFIT     C                      Theil U statistic
^TIME      *(b)       S                      Time of day
^TYPE      INPUT      C                      Type code of item read in
^USE       USE        V     2*^GAPS+2        Current use specification
           USEADD
           USEALL
           USEIF
^VALIMAG   EIGEN      V                      Imaginary portion of eigenvalues
^VALREAL   EIGEN      V                      Real portion of eigenvalues
^VAR       SYNOPSIS   V     ^NV              Variance of each variable
^VCOV      (1-6)      P     (^NV,^NV)        Variance-covariance matrix of coefficient
           PROBIT
^VCOVT     ANALYZE    M     (^NV,^NV)        Variance-covariance matrix of transformed coefficients
^VECIMAG   EIGEN      P                      Imaginary portion of eigenvectors
^VECREAL   EIGEN      P                      Real portion of eigenvectors
^WGT       MVR        P     (^NEQ,^NEQ)      Weighting matrix
           THREESLS
^XCOR      CROSSCOR   V     **               Vector of cross-correlations
^XTABLE    XTAB       M     (^NROW,^NCOL)    Cross-tabulation matrix, containing cell counts
^YFIT      (1-6)      T                      Fitted values
^YMEAN     (1-6)      C                      Mean of dependent variable
^YVAR      (1-3)      C                      Variance of dependent variable
^ZVALUE    PROBIT     T                      Fitted Z-values

* Indicates an internal result for which one of the following applies:
    (a) predefined by SORITEC
    (b) updated every time it is referenced
    (c) affected by most or all commands
    (d) stored according to syntactic considerations rather than command name
    (e) changed every time any command causes a random number to be generated
    (f) only stored if RAWEQ flag is ON

** Indicates that the length of an item is dependent on an argument or the number of arguments you supplied in the command which created the item.
Classes of commands: (1) Linear REGRESS and TWOSLS (2) CORC, HILU, TSCORC, TSHILU, ARC, ARH (3) CORC2, HILU2, TSCORC2, TSHILU2 (4) ALMON, ALMONC, ALMONH (5) FASTREG, FAST2SLS, GLS, MIXED, RLS, SHILLER (6) MVR, THREESLS, FIML, nonlinear REGRESS and TWOSLS Variable types: C -- Constant E -- Equation G -- Group M -- Matrix P -- Pseudo-matrix S -- String T -- Time series V -- Vector 6 February 1, 1990 SORITEC BUILD(3) BUILD -- Build a SORITEC Simulation Model DESCRIPTION BUILD links all the necessary elements together into a model for use by the SIMULATE command. In essence, BUILD constructs a "road map" that directs the simulation routine to the variables, parameters, and equations necessary to simulate the model. SYNTAX BUILD eqs vars mdl [density] ARGUMENT TYPES ARGUMENT TYPE I/O DESCRIPTION eqs group I group of equations vars group I group of endogenous variables mdl model O model being built density scalar density DISCUSSION The first two arguments are SORITEC groups containing lists of the equations and identities in the model and the list of the endogenous variables in the model, respectively. These two groups should contain the same number of elements, i.e. there must be as many equations as unknowns. This version of SORITEC requires that the model be normalized, that is, that each equation have a different left-hand variable. Also, although SORITEC can accept equation definitions in implicit form for use, for example, in the FIML command, the simulation facility cannot handle such equations. The third argument is the name to be given to the model. Models can be saved on a databank by KEEPing the model name. If a model is kept in this fashion, then all the component equations, and all SORITEC variables which appear in them (time series, constants, and parameters) must also be kept. The "density" argument needs to be specified only when an "insufficient memory" message is generated during the BUILD command. 
The density must be set higher than the density of the linkage matrix, which SORITEC prints out before doing its internal memory calculations. (If this item is not printed out, then the use of the density argument will not correct the problem.) Generally, rounding up to the next integer will suffice. A discussion of the linkage matrix is in section 5.

ALGORITHM USED

Stewart's algorithm is used to order the model equations into separate blocks for solution.

LIMITATIONS

The current algorithm does not optimally order equations within a simultaneous block, but rather leaves the equations in each block in the order that they appear in the equations group. Judicious work by the user to order logically related equations in such a way that each variable is computed as soon as possible after the other variables upon which it depends will greatly improve model performance. This applies only to simultaneous blocks, as recursive blocks are optimized by the current procedure.

NOTES

The syntax of this command does not follow the usual convention of the result being the first argument.

EXAMPLES

EQUATION EQ1 A=X1+X2*B+X3*C
EQUATION EQ2 D=Y1+Y2*D+Y3*E
GROUP G1 EQ1 EQ2
GROUP G2 A D
BUILD G1 G2 SIMPLE_MODEL

Linkage Statistics
2 Equations
3 Endogenous Linkages
Density of Linkage Matrix is 1.50

Equations will be solved in the following order:

Equation        Associated Variable
1  1 EQ1        1 A
2  2 EQ2        2 D

Recursive block 1 contains 2 equations.

SEE ALSO

SIMULATE(3), SOLVE(3), ADDFAC(3), SIMULATION(6)

SORITEC COMPUTE(3)

COMPUTE -- Transformations of Time Series Data

DESCRIPTION

COMPUTE is used to evaluate one or more expressions involving time series. It is the time series analogue of the MCOMPUTE and SET commands.
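As a rough illustration only (written in Python, not SORITEC), the period-by-period evaluation that COMPUTE performs over the current USE period might be sketched as below. The helper name `compute` and the dictionary-based storage of series are assumptions made for this sketch; they are not part of SORITEC. The data are patterned on the EXAMPLES section of this page (Y = 3 5 4 4 5, T = 1..5, X=ABS(Y**2-24)/T).

```python
def compute(formula, series, use_start, use_end):
    """Evaluate `formula` once for each period in the USE range.

    Hypothetical helper: `series` maps a variable name to {period: value}.
    Returns a new {period: value} series defined only inside the USE range.
    """
    result = {}
    for t in range(use_start, use_end + 1):
        # Gather the value of every right-hand variable in period t.
        values = {name: obs[t] for name, obs in series.items()}
        result[t] = formula(values)
    return result


# Data patterned on "SERIES Y 3 5 4 4 5" and a time index T = 1..5.
y = dict(zip(range(1, 6), [3, 5, 4, 4, 5]))
time_index = {t: float(t) for t in range(1, 6)}

# Counterpart of X=ABS(Y**2-24)/T over USE 1 5.
x = compute(lambda v: abs(v["Y"] ** 2 - 24) / v["T"],
            {"Y": y, "T": time_index}, 1, 5)

for t in sorted(x):
    print(t, round(x[t], 5))
```

The sketch ignores lags, missing values, and the handling of observations outside the USE period, all of which the real command deals with.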
SYNTAX

Single equation form:

[ COMPUTE ] formula|equation|identity

Multiple equation simulation form:

[ COMPUTE ( [TOL=s1] [MAXIT=n1] [MAXPRT=n2] [TAG=name1] [STATIC|DYNAMIC] [NOBASE] ) ] superformula

MODIFIERS

Modifiers are used in the multiple equation simulation form only.

TOL=s1 : Convergence criterion for simultaneous equation blocks. This is a maximum absolute relative error criterion. {Default: .0001}
MAXIT=n1 : Maximum number of iterations within any single simultaneous block. {Default: 50}
MAXPRT=n2 : Maximum number of iterations to print intermediate solution values. {Default: 0 if PRINT flag is OFF; 5 if PRINT flag is ON}
TAG=name1 : Tag to give to all solution values. For instance, if TAG is FITTED, and one of the variables to be solved for is GNP, the solution values for that variable will be stored in FITTED^GNP. {Default: null, i.e. solutions are stored in the variables used in the equations, wiping out any values already there.}
STATIC : Perform a static simulation. {Default: not selected.}
DYNAMIC : Perform a dynamic simulation. {Default: selected.}
NOBASE : Do not use any existing variable values as starting values for solutions within simultaneous blocks. {Default: not selected, i.e. values already existing in the variables are used as starting values.}

DISCUSSION

In single equation usage:

The COMPUTE keyword is almost always omitted. It need not appear if the formula, equation, or identity contains at least one right-hand side operand that has been previously defined as a time series.

In multiple equation usage (superformula simulation):

The COMPUTE command name is usually omitted whenever modifiers are not needed.

In all usages:

The COMPUTE command takes one argument, either an explicit formula, or an equation, identity, or superformula.
For example, the statement:

X=Y/T

Could also be expressed as:

EQUATION DIVIDE X=Y/T
DIVIDE

-or-

EQUATION DIVIDE X=Y/T
COMPUTE DIVIDE

The elementary math operations, "+", "-", "*", and "/", are defined in a manner consistent with intuitive notions of time series manipulation. For each time period in the current USE period, the transformation is carried out using the values of the right-hand variables in that time period (unless a lag is specified explicitly in the formula, equation, identity, or superformula). The result in each period is then stored in the corresponding observation of the left-hand variable(s) for that period.

Scalars and single vector elements are allowed in time series expressions, but vectors are not. Scalars are treated as time series for which each observation has the value indicated by the scalar.

Let "x" and "y" be time series or scalars, and "n" be a pos_integer. Then the available time series operations are:

x+y : addition of "x" and "y".
x-y : subtraction of "y" from "x".
x*y : multiplication of "x" by "y".
x/y : division of "x" by "y".
x**y : raising of "x" to the "y" power.
-x : negate "x".
x.EQ.y : one if "x" equals "y", zero otherwise.
x.NE.y : one if "x" does not equal "y", zero otherwise.
x.LT.y : one if "x" is less than "y", zero otherwise.
x.LE.y : one if "x" is less than or equal to "y", zero otherwise.
x.GT.y : one if "x" is more than "y", zero otherwise.
x.GE.y : one if "x" is more than or equal to "y", zero otherwise.
x.OR.y : one if "x" or "y" is one, zero otherwise.
x.AND.y : one if "x" and "y" are one, zero otherwise.
ABS(x) : absolute value of "x".
EXP(x) : exponential of "x".
LOG(x) : natural log of "x".
SIN(x) : sine of "x".
COS(x) : cosine of "x".
TAN(x) : tangent of "x".
ASIN(x) : arc sine of "x".
ACOS(x) : arc cosine of "x".
ATAN(x) : arc tangent of "x".
ATAN2(x,y) : arc tangent of "x" divided by "y", where some values of "y" can be very small or zero.
SINH(x) : hyperbolic sine of "x". COSH(x) : hyperbolic cosine of "x". TANH(x) : hyperbolic tangent of "x". ACOSH(x) : arc hyperbolic cosine of "x". ASINH(x) : arc hyperbolic sine of "x". ATANH(x) : arc hyperbolic tangent of "x". CEILING(x) : rounds "x" to the next higher integer. FLOOR(x) : rounds "x" to the next lower integer. 2 February 1, 1990 SORITEC COMPUTE(3) LEGAL(x) : zero if "x" is missing, one otherwise. LOG10(x) : logarithm base 10 of "x". RANDOM(x) : uniform random number generator, distributed [0,"x") ROUND(x) : rounds "x" to the nearest integer. SIGN(x) : -1 if "x" is negative; 1, if positive; 0, if zero. SQRT(x) : square root of "x". TRUNC(x) : truncates fractional part of "x". EXAMPLES ! This example illustrates the single-equation form of ! the COMPUTE command, and omits the COMPUTE command name, ! which is normal SORITEC programming practice. For an ! example of superformula simulation using the COMPUTE ! command, see SUPERF(3). USE 1 5 SERIES Y 3 5 4 4 5 TIME T X=ABS(Y**2-24)/T P X X ................ 1 . 15.0000 2 . 0.500000 3 . 2.66667 4 . 2.00000 5 . 0.200000 SEE ALSO EXPRESSIONS(2), TRANSFORMATIONS(2) FORECAST(3), MCOMPUTE(3), REVISE(3), SET(3), SUPERF(3) Section 4 February 1, 1990 3 SORITEC FORECAST(3) FORECAST -- Basic Forecasting Command DESCRIPTION FORECAST takes one or more equations that have either been pre-specified or have been estimated and recovered, and computes the values of the left-hand variable(s) over the current USE period. SYNTAX Single-equation form: FORECAST [ ( TAG=name1 ) ] equation|identity Multiple-equation simulation form: FORECAST [ ( [TOL=s1] [MAXIT=n1] [MAXPRT=n2] [TAG=name1] [STATIC|DYNAMIC] [NOBASE] ) ] superformula MODIFIERS Single-equation form: TAG=name1 : Tag to give to the computed values. For instance, if TAG is FITTED, but the name of the left-hand variable in the equation or identity is GNP, then the computed values will be stored in FITTED, not in GNP. {Default: null, i.e. 
computed values are stored in the variable whose name appears on the left-hand side of the equation or identity, wiping out any values already there.} Multiple-equation simulation form: TOL=s1 : Convergence criterion for simultaneous equation blocks. This is a maximum absolute relative error criterion. {Default: .0001} MAXIT=n1 : Maximum number of iterations within any single simultaneous block. {Default: 50} MAXPRT=n2 : Maximum number of iterations to print intermediate solution values. {Default: 0 if PRINT flag is OFF; 5 if PRINT flag is ON} TAG=name1 : Tag to give to all solution values. For instance, if TAG is FITTED, and one of the variables to be solved for is GNP, the solution values for that variable will be stored in FITTED^GNP. {Default: null, i.e. solutions are stored in the variables used in the equations, wiping out any values already there.} STATIC : Perform a static simulation. {Default: not selected.} DYNAMIC : Perform a dynamic simulation. {Default: selected.} NOBASE : Do not use any existing variable values as starting values for solutions within simultaneous blocks. {Default: not selected, i.e. values already existing in the variables are used as starting values.} February 1, 1990 1 SORITEC FORECAST(3) DISCUSSION FORECAST acts like a COMPUTE command for the equation, identity or superformula specified. It will produce values for the dependent variable(s) that will overwrite or be spliced onto the existing series, depending on the USE period in force. The REVISE flag does not have to be on for the splicing to occur. For equations with lagged values, these must exist outside the USE period, which is not shortened as with CORC to ensure that values are available. In single equation usage: FORECAST performs a static computation (i.e. all the independent variables must exist throughout the required period) unless the DYNAMIC flag is ON. EXAMPLES ! This example illustrates the single equation form of the ! FORECAST command. 
For an example of superformula simulation
! using the FORECAST command, see SUPERF(3).
SERIES X 2 4 6 8
SERIES Y 1 2 3 4
USE 2 4
EQUATION EQ Y = 10 + 5*X**2
FORECAST EQ
FORECAST Y2 = X+17
USE 1 4
PRINT Y Y2

          Y           Y2
...............................
1 .    1.00000     MISSING
2 .    90.0000     21.0000
3 .    190.000     23.0000
4 .    330.000     25.0000

SEE ALSO

SIMULATE(3), COMPUTE(3)

SORITEC MARMA(3)

MARMA -- Rational Distributed Lag Estimation

DESCRIPTION

MARMA performs nonlinear estimation of a variety of dynamic time series models. Most common time series models can be handled by using a subset of the extensive modifiers to the MARMA command.

SYNTAX

MARMA ( [P=num1] [Q=num2] [S=num3 SL=num4] [D=vect1] [F=num5] [ORIGIN] [CENTER] [NDEG=vect2] [DDEG=vect3] [CRF] [LINFORM] [INITIAL=vect4] [MISSLAG=vect5] [MAXIT=num6] [TOLB=num7] [TOL=num8] [EZERO] [EVEC=vect6] [ZERO] [HOLDOUT=num9] ) depvar [indvar...]

ARGUMENT TYPES

ARGUMENT   TYPE     I/O   DESCRIPTION
depvar     series   I     dependent variable
indvar     series   I     independent variable

MODIFIERS

Model specification modifiers:

P=num1 : number of terms in AR lag structure.
Q=num2 : number of terms in MA lag structure.
S=num3 : order of seasonal differencing.
SL=num4 : seasonal length - used with S above.
F=num5 : number of forecast periods included at the end of the current USE period. For example, if the USE period is 1970Q1 to 1990Q4, and F=8, then forecasts are produced for 1989Q1 to 1990Q4. MARMA will overwrite the dependent variable with the forecast values in these periods.
D=vect1 : order of differencing to be applied to dependent and independent variables. The differencing order for the variables must be the same in this vector as in the arguments list. For an ARIMA model (i.e. no independent variables), supply a simple integer or a constant containing an integer.
NDEG=vect2 : number of terms in the numerators of linear filters to be applied to the independent variables. Do not confuse this with the order of the numerator polynomials.
In general, the number of terms is one greater than the order of the polynomial, due to the appearance of the parameter representing the coefficient of the unlagged variable (but see MISSLAG).
DDEG=vect3 : number of terms in the denominators of linear filters to be applied to the independent variables. In general, the order and number of terms in the denominator of each filter are the same, since the implied zero-order parameter is identical to one (but see MISSLAG).
ORIGIN : suppress the constant term. In contrast to regression techniques, this is often appropriate for MARMA models.
MISSLAG=vect5 : includes only specified lags in the structure of some of the lag polynomials. See BOXJENKINS(6).
CENTER : centers each variable around a mean of zero, generally making the inclusion of a constant term unnecessary.

Estimation initialization modifiers:

MARMA models are very sensitive to starting values. Final specifications should be estimated with different initializations, and solutions should be examined for stability.

INITIAL=vect4 : initial parameter estimates for the non-linear optimization process. For the order in which these initial estimates are presented, see BOXJENKINS(6).
EZERO : all starting values for the disturbance series before the sample period are set to zero.
EVEC=vect6 : provides a set of starting values for the disturbance series before the sample period.
ZERO : if specified, all values of the independent variables before the sample period are considered to be zero.
HOLDOUT=num9 : The USE period is effectively shortened by this value on the "front" end. This allows the actual values of the independent variables to be used to initialize the estimation process.

Estimation technique modifiers:

CRF : if specified, common rational coefficients are assumed. All denominator polynomials are constrained to be the same for all dependent variables.
LINFORM : if specified, the linearized form rather than the rational lag form is estimated. This modifier is not recommended unless the user is thoroughly familiar with the technique. SHILLER and ALMON are preferable. Iteration control modifiers: MAXIT=num6 : maximum permitted number of iterations (default 20) TOLB=num7 : convergence criteria (default .0001) TOL=num8 : convergence criteria (default .0001) DISCUSSION SORITEC uses the conventions established by Box and Jenkins for assigning signs to the various parameters in the model. The user should be aware that the signs of the coefficients may thus differ from the traditional regression conventions. Familiarity with the Box and Jenkins notation is assumed. This command covers a family of techniques rather than a single one. For a full discussion, see BOXJENKINS(6). ARMA and ARIMA models may be specified within MARMA, and this is to be preferred due to the greater degree of control that is possible. At this release, forecasting is not available with ARIMA(3). ALGORITHM USED The minimization technique used is a modified Marquardt search routine. 2 February 1, 1990 SORITEC MARMA(3) EXAMPLES ! ! This is the gas furnace example from Box and Jenkins (BJ). ! The entire data series is not used, since SORITEC is implemented ! on some systems which cannot run the full example as presented ! by BJ. The truncated estimation is provided to give an example ! which will run identically (and successfully) on all machines. ! Therefore, the coefficients arrived at here will not match the ! BJ results. ! 
USE 4 120 ACCESS 'bj' *** File opened ( 2): bj.sdb VECTOR INIT .1 .1 .1 -.1 -.1 .1 .1 MARMA (ORIGIN CENTER INITIAL=INIT NDEG=3 DDEG=2 P=2) JY JX(-3) Multivariate ARMA Estimation ---------------------------- Using 4 - 120 1 Independent Variables 2-Term Autoregressive Process Data Centering Selected Number of Terms in 1 Numerators 3 Number of Terms in 1 Denominators (of which 1 are of nonzero length) Orders are 2 Constant Term Suppressed 7 Parameters to be Estimated Initial parameter values and their associated subscripts ( 1) 0.10000 ( 2) 0.10000 ( 0) 0.10000 ( 1)-0.10000 ( 2)-0.10000 ( 1) 0.10000 ( 2) 0.10000 Non-linear Gaussian Estimation Procedure 117 Observations, 7 Parameters Convergence achieved at 6 iterations. Relative change in sum of squares less than 0.1000d-03 Variance of residuals = 0.2432d-01, 110 degrees of freedom Parameter Estimates ------------------- Dependent Variable is JY Coefficient Estimated Standard t- February 1, 1990 3 SORITEC MARMA(3) Description Coefficient Error Statistic /JY(-1) 0.192089 0.238528 0.805309 /JY(-2) 0.305170 0.157192 1.94138 JX(-3) -0.986701 0.596770d-01 -16.5340 JX(-4) 0.451271 0.238678 1.89071 JX(-5) 0.217134 0.928007d-01 2.33979 /_AR-TERM(-1) 0.874158 0.318978d-01 27.4050 /_AR-TERM(-2) -0.589668d-02 0.219979d-02 -2.68056 Transfer Function Information for JX(-3) Distributed Lag Estimates in Final Form -0.98670 -0.64080 -0.64133 -0.31874 -0.25694 -0.14662 -0.10657 -0.65219d-01-0.45052d-01-0.28557d-01-0.19234d-01 -0.12409d-01-0.82534d-02-0.53724d-02-0.35507d-02-0.23215d-02 -0.15295d-02-0.10023d-02-0.65928d-03-0.43250d-03-0.28427d-03 -0.18659d-03-0.12259d-03-0.80491d-04-0.52873d-04-0.34720d-04 -0.22805d-04-0.14976d-04-0.98360d-05-0.64596d-05-0.42425d-05 -0.27862d-05-0.18299d-05-0.12018d-05 Total Multiplier = -3.2921 t-value = 55.166 Coefficients in the Infinite Moving Average 1.0000 0.87415 0.75825 0.65768 0.57044 0.49478 0.42915 0.37223 0.32285 0.28003 0.24289 0.21067 0.18272 0.15849 0.13747 0.11923 0.10342 0.89703d-01 
0.77804d-01 0.67484d-01 0.58533d-01 0.50769d-01 0.44035d-01 0.38194d-01 0.33128d-01 0.28734d-01 0.24923d-01 0.21617d-01 0.18750d-01 0.16263d-01 0.14106d-01 0.12235d-01 0.10612d-01 0.92043d-02 0.79835d-02 0.69245d-02 0.60061d-02 0.52094d-02 0.45184d-02 0.39191d-02 0.33993d-02 0.29484d-02 0.25573d-02 0.22181d-02 0.19239d-02 0.16687d-02 0.14474d-02 0.12554d-02 0.10889d-02 0.94445d-03 0.81918d-03 0.71052d-03 0.61628d-03 0.53454d-03 0.46364d-03 0.40214d-03 0.34880d-03 0.30253d-03 0.26241d-03 0.22760d-03 0.19741d-03 0.17123d-03 0.14852d-03 0.12882d-03 0.11173d-03 0.96910d-04 0.84056d-04 0.72907d-04 0.63236d-04 0.54849d-04 0.47573d-04 0.41263d-04 0.35790d-04 0.31043d-04 0.26925d-04 0.23354d-04 0.20256d-04 0.17570d-04 0.15239d-04 0.13218d-04 0.11465d-04 0.99439d-05 0.86249d-05 0.74809d-05 0.64887d-05 0.56280d-05 0.48815d-05 0.42340d-05 0.36724d-05 0.31853d-05 0.27628d-05 0.23963d-05 0.20785d-05 0.18028d-05 0.15637d-05 0.13563d-05 0.11764d-05 0.10203d-05 Total Multiplier = 7.5907 Autocorrelations of Residuals Lags N SUM(R(k)**2) 1- 5 0.108 0.099 0.029 -0.118 -0.105 5.58 4 February 1, 1990 SORITEC MARMA(3) 6-10 -0.185 -0.182 0.042 -0.047 0.070 14.5 11-15 0.076 0.154 0.191 0.045 0.002 22.5 16-20 -0.059 -0.056 0.027 -0.097 -0.028 24.6 21-25 -0.077 0.037 -0.077 -0.039 0.128 28.3 26-30 0.026 0.009 0.065 -0.076 0.007 29.5 Sum of Squares of Residuals = 2.6751 Variance of Residuals = 0.24319d-01 Durbin-Watson Statistic = 1.79632 R-Squared = 0.9980 SEE ALSO TESTDATA(2) ARIMA(3), CROSSCOR(3), INSPECT(3) BOXJENKINS(6) February 1, 1990 5 SORITEC REVISE(3) REVISE -- Revision and Splicing of Data DESCRIPTION REVISE modifies active observations of one or more existing series while preserving the non-active observations. SYNTAX Single-equation form: REVISE equation|identity Multiple-equation simulation form: REVISE [ ( [TOL=s1] [MAXIT=n1] [MAXPRT=n2] [TAG=name1] [STATIC|DYNAMIC] [NOBASE] ) ] superformula MODIFIERS Modifiers are used in the multiple equation simulation form only. 
TOL=s1 : Convergence criterion for simultaneous equation blocks. This is a maximum absolute relative error criterion. {Default: .0001}
MAXIT=n1 : Maximum number of iterations within any single simultaneous block. {Default: 50}
MAXPRT=n2 : Maximum number of iterations to print intermediate solution values. {Default: 0 if PRINT flag is OFF; 5 if PRINT flag is ON}
TAG=name1 : Tag to give to all solution values. For instance, if TAG is FITTED, and one of the variables to be solved for is GNP, the solution values for that variable will be stored in FITTED^GNP. {Default: null, i.e. solutions are stored in the variables used in the equations, wiping out any values already there.}
STATIC : Perform a static simulation. {Default: not selected.}
DYNAMIC : Perform a dynamic simulation. {Default: selected.}
NOBASE : Do not use any existing variable values as starting values for solutions within simultaneous blocks. {Default: not selected, i.e. values already existing in the variables are used as starting values.}

DISCUSSION

The REVISE command, in either single equation or multiple equation (superformula) usage, acts exactly as the corresponding COMPUTE command does if the REVISE flag is ON. The only difference between REVISE and COMPUTE is that COMPUTE erases values of the variable(s) being calculated for all time periods outside the current USE period, while REVISE preserves those values.

REVISE cannot create a variable, i.e. the left-hand side variable(s) must already exist.

EXAMPLES

! This example illustrates the single equation form of the
! REVISE command. For an example of superformula simulation
! using the REVISE command, see SUPERF(3).
USE 1 10
TIME T
USE 2 4
REVISE T = T + 3
EQUATION EQ T = T**2
USE 9 10
REVISE EQ
USE 1 10
PRINT T

          T
................
1 .    1.00000
2 .    5.00000
3 .    6.00000
4 .    7.00000
5 .    5.00000
6 .    6.00000
7 .    7.00000
8 .    8.00000
9 .    81.0000
10 .
100.000

SEE ALSO

REVISE flag in FLAGS(2)
COMPUTE(3), FORECAST(3)

SORITEC SMOOTH(3)

SMOOTH -- Exponential Smoothing

DESCRIPTION

SMOOTH applies one of six methods to smooth a time series and to project the smoothed series forward for a specified number of periods.

SYNTAX

SMOOTH ( SIMPLE|LINEAR|QUAD|HOLT|ADRES|WINTER ALPHA=num1 [BETA=num2] [GAMMA=num3] F=num4 [L=num5] [SEARCH] ) result data

ARGUMENT TYPES

ARGUMENT   TYPE     I/O   DESCRIPTION
result     series   O     smoothed and projected series
data       series   I     original data

MODIFIERS

num1 : smoothing constant (0 < num1 < 1)
num2 : trend and season smoothing constant for HOLT and WINTER (0 < num2 < 1)
num3 : linear trend smoothing constant for WINTER (0 < num3 < 1)
num4 : integer number of forecast periods
num5 : integer number for season length
SEARCH : search, from the initial values, if provided, for the parameters which best fit the data

DISCUSSION

Smoothing techniques are used primarily for short-term projection of time series in situations where data on other potential explanatory variables are not available. These techniques smooth data by assigning exponentially decreasing weights to past values of a given series, and then assembling projected values which are wholly determined by the past values of that series, appropriately weighted.

SORITEC provides six smoothing methods:

SIMPLE performs single exponential smoothing for a stationary series.
LINEAR performs Brown's one parameter linear exponential smoothing, which estimates and smooths a linear trend in non-stationary data.
QUAD performs Brown's one parameter quadratic method, for a non-linear trend.
HOLT performs Holt's two parameter linear method, which is similar to LINEAR except that the user must supply a constant to estimate the linear trend.
ADRES performs the adaptive response method, where ALPHA is not specified but automatically adjusts to the pattern of residual errors of successive observations.
The response rate parameter BETA is typically set at .1 or .2.
WINTER performs Winters' three parameter linear and seasonal method, which expands HOLT to cover both season and trend. Note that values for BETA and GAMMA are usually smaller than the value chosen for ALPHA. "L" is used only with the last method.

SORITEC automatically appends the projected values and prints out the full series, i.e. existing plus smoothed and projected. The USE command remains unchanged. If the "result" is omitted, the smoothed and projected values overwrite the existing "data".

Small values of the smoothing constants are usually best when a great deal of randomness is present, and larger values when there are small fluctuations in the data.

ALGORITHM USED

When the SEARCH modifier is used, the parameters entered by the user with the ALPHA, BETA, and GAMMA modifiers are taken to be starting values, and a search is performed to find the parameter combination which minimizes the mean square error, i.e. which best fits the data. The BFGS (Broyden, Fletcher, Goldfarb and Shanno) algorithm is used to find the best parameter set.

LIMITATIONS

If the user specifies the SEARCH modifier, but does not give initial guesses for the parameters, then SORITEC uses a starting value of 0.1 for each parameter. This will generally, but not always, produce useful results. Exponential smoothing is, in fact, a minimization problem that is non-linear in the parameters, and is sometimes badly behaved. SORITEC restricts each parameter to the range (.001,.999). If the lower bound is attained for any parameter, minimization is attempted for the other parameters. If the upper bound is attained, a failure to converge is declared. It is often the case that the mean square error has a single local minimum at a set of small parameter values, rises as you move away, attains a maximum within the permitted range (i.e.
between .001 and .999), then declines with a very low slope "forever". Only repeated attempts to start with a small parameter set will attain convergence in these situations.

NOTES

The older modifiers for method selection are still recognized by SORITEC. The correspondences between old and new method modifiers are:

EXP  <==> SIMPLE
EXP2 <==> LINEAR
EXP3 <==> QUAD
EXP4 <==> HOLT
EXP5 <==> ADRES
EXP6 <==> WINTER

EXAMPLES

USE 1 7
SERIES A 4 6 5 7 6 7 8
SMOOTH ( QUAD ALPHA=.2 F=3 ) B A

Brown's One Parameter Quadratic Exponential Smoothing -- Alpha=0.200

Period   Actual   Forecast   Error     Pct Error
1        4.0000
2        6.0000   4.0000     2.0000    33.3333%
3        5.0000   5.2000     -0.2000   4.0000%
4        7.0000   5.3200     1.6800    24.0000%
5        6.0000   6.5600     -0.5600   9.3333%
6        7.0000   6.6720     0.3280    4.6857%
7        8.0000   7.2774     0.7225    9.0320%
8                 8.1823
9                 8.7663
10                9.3822

Mean Pct Error (MPE) or bias = 9.6196%
Mean Squared Error (MSE) = 1.30094
Mean Absolute Pct Error (MAPE) = 14.0640%

SMOOTH ( QUAD SEARCH ) B A

Brown's One Parameter Quadratic Exponential Smoothing -- Alpha=0.229

Period   Actual   Forecast   Error     Pct Error
1        4.0000
2        6.0000   4.0000     2.0000    33.3333%
3        5.0000   5.3751     -0.3751   7.5021%
4        7.0000   5.4323     1.5676    22.3949%
5        6.0000   6.7903     -0.7903   13.1719%
6        7.0000   6.7936     0.2063    2.9480%
7        8.0000   7.3961     0.6038    7.5484%
8                 8.3333
9                 8.9819
10                9.6692

Mean Pct Error (MPE) or bias = 7.5917%
Mean Squared Error (MSE) = 1.27167
Mean Absolute Pct Error (MAPE) = 14.4831%

SEE ALSO

EXGRO(3), SCURV(3)
PROJECTION(6)

SORITEC SUPERF(3)

SUPERF -- Superformula Construction (Gauss-Seidel Method)

DESCRIPTION

SUPERF combines a set of equations and identities into a single "super-formula" which calculates many variables with a single command. It is the primary command used for simulation.

SYNTAX

SUPERF result eqname...
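The Gauss-Seidel iteration named in the title of this page can be sketched outside SORITEC. The Python below is a minimal illustration, not SORITEC's implementation: the function name `gauss_seidel`, the zero starting values, and the exact form of the relative-change test are assumptions made for this sketch. The defaults `tol=1e-4` and `maxit=50` echo the TOL and MAXIT defaults documented for COMPUTE, FORECAST, and REVISE. The two-equation model (Y = ALPHA*X + U, X = BETA*Y + V) and its data are patterned on the first example in the EXAMPLES section of this page.

```python
def gauss_seidel(alpha, beta, u, v, tol=1e-4, maxit=50):
    """Solve Y = alpha*X + u and X = beta*Y + v for a single period.

    Hypothetical helper. Each sweep reuses the most recently computed
    value of the other unknown, which is what distinguishes Gauss-Seidel
    from a Jacobi sweep.
    """
    x = y = 0.0  # assumed starting values
    for iteration in range(1, maxit + 1):
        y_new = alpha * x + u       # uses the latest X
        x_new = beta * y_new + v    # uses the Y just computed
        # Largest absolute relative change over both unknowns (assumed test).
        change = max(abs(y_new - y) / max(abs(y_new), 1e-12),
                     abs(x_new - x) / max(abs(x_new), 1e-12))
        x, y = x_new, y_new
        if change < tol:
            return x, y, iteration
    raise RuntimeError("no convergence within maxit iterations")


# One solve per period: U = 11 12 13, V = -2 -1 0, ALPHA = .25, BETA = .5.
solutions = [gauss_seidel(0.25, 0.5, u, v)[:2]
             for u, v in zip([11, 12, 13], [-2, -1, 0])]

for period, (x, y) in enumerate(solutions, start=1):
    print(period, round(x, 4), round(y, 4))
```

Because iteration stops as soon as the change falls below the tolerance, the solution values are approximate to roughly that tolerance rather than exact.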
ARGUMENT TYPES

ARGUMENT   TYPE                I/O   DESCRIPTION
result     superformula        O     resulting superformula
eqname     equation|identity   I     the equations or identities making up the superformula

DISCUSSION

The superformula is a much faster way of solving a set of simultaneous equations than SIMULATE(3). The SIMULATE command will be removed, as soon as the superformula facility can do everything that SIMULATE now does. The SIMULATE command allows "ragged" simulation, and allows several different convergence criteria to be used. It also has a KILL facility for simulations that are obviously not converging.

Doing simulations with a superformula is about fifteen times faster than with SIMULATE. The superformula is also more space efficient. Model storage takes about one-fourth as much space with a superformula as with a SIMULATE command.

Unlike the SIMULATE facility, when a model is linked with SUPERF, the individual equations that make up that model need not be kept any longer. The SIMULATE-type model is simply a "map" telling SORITEC in what order to solve the various equations, which retain their individual names. The superformula actually combines the various equations into one large named entity.

Using SIMULATE, it was possible (although not recommended) to make minor changes in an equation that did not alter the basic structure of the model without re-BUILDing the model. This is not possible with a superformula, since the equations are combined at the time that the superformula is created. Later changes to the equations are not incorporated into the superformula unless another SUPERF command is executed.

The SUPERF command generally is used in conjunction with a BUILD command. The default output from BUILD is a SIMULATE-type model, which will not be kept or used in the process of linking a model together using a superformula. BUILD creates an internal result, ^EQORD, which specifies the various recursive and simultaneous blocks into which the model can be resolved.
This ordering minimizes the size of each set of simultaneous equations in the model. SUPERF uses this information to build the superformula, and takes added steps to (a) move as many recursive equations outside the Gauss-Seidel "loop" as possible, and (b) reorder the equations in each block to reduce the number of forward references in each block. Therefore, the order in which BUILD arranges the equations is not the same as the order in which SUPERF arranges them. SUPERF does a superior job of in-loop optimization, but BUILD does a superior job of block decomposition.

It is not required that the equations be ordered with BUILD before using SUPERF. However, for any but the smallest models, or models known to consist of just one simultaneous block, using BUILD to derive ^EQORD will result in a great increase in solution speed.

A superformula is solved by using it as the argument of a COMPUTE, REVISE, or FORECAST command, just as if it were a simple equation or identity. As with a simple equation or identity, a COMPUTE command name may be omitted if no modifiers are used.

ALGORITHM USED

A superformula is constructed with embedded looping instructions that cause the COMPUTE, FORECAST and REVISE commands to perform Gauss-Seidel iterations over those operations that require simultaneous solution. Operations that involve only operands that are not changed within the loop (i.e. exogenous variables, scalar constants, and variables that have already been solved for) are moved outside the loop. As a result, much of the benefit of ordering the equations will be achieved by the loop optimization done by SUPERF, even if BUILD is not used.

LIMITATIONS

At this release, the convergence criterion (TOL) used in the solution phase (see COMPUTE(3), FORECAST(3), REVISE(3)) tends to be too tight for the default maximum number of iterations (MAXIT). The default for TOL is .0001; for MAXIT, 50.
     It is suggested that a TOL of .001 is more appropriate for most
     applications.

EXAMPLES

     ! This is the simple form, not using the BUILD command
     ! to order the equations. The model is obviously a single
     ! simultaneous block.
     USE 1 3
     SERIES U 11 12 13
     SERIES V -2 -1 0
     EQUATION EQ1 Y = ALPHA*X + U
     EQUATION EQ2 X = BETA*Y + V
     PARAMETER ALPHA .25 BETA .5
     SUPERF MDL EQ1 EQ2
     MDL
     P X Y

                       X            Y
            ...............................
       1    .      3.99998      12.0000
       2    .      5.71423      13.4285
       3    .      7.42852      14.8570

     ! The next example is a bit more complicated, and more typical
     ! of the steps that are generally used in model construction
     ! and simulation of a model that is more involved. This example
     ! uses Klein Model I. The coefficients are entered here in
     ! numeric form, as if each had been processed by FREEZE(3).
     ! This is only for clarity of presentation. All the coefficients
     ! could be specified as parameters with the corresponding values.
     ! (Frozen equations do run faster, though, and are generally used
     ! for models that are going to be run many times between
     ! re-estimations.)
     FORGET *
     USE 1922 1941
     ACCESS 'klein'
     *** File opened ( 2): klein.sdb
     EQUATION CEQ C = 18.3411 - 0.232351*P + 0.385697*P(-1) + 0.801809*W
     EQUATION IEQ I = 27.2765 - 0.801452*P + 1.05221*P(-1) - 0.148196*K(-1)
     EQUATION WEQ W1= 5.78943 + 0.234238*X + 0.284589*X(-1) + 0.234766*A
     IDENTITY PID P=X-T-W1
     IDENTITY WID W=W1+W2
     IDENTITY XID X=C+I+G
     GROUP EQS CEQ IEQ WEQ PID WID XID
     GROUP VARS C I W1 P W X
     BUILD EQS VARS ZZZ

     Linkage Statistics

          6 Equations
         15 Endogenous Linkages
     Density of Linkage Matrix is   2.50

     Equations will be solved in the following order:

               Equation        Associated Variable
        1      1  CEQ          1  C
        2      2  IEQ          2  I
        3      3  WEQ          3  W1
        4      4  PID          4  P
        5      5  WID          5  W
        6      6  XID          6  X

     Recursive block 1 is empty.
     Linear simultaneous block 1 contains 6 equations.

     ON GROUP
     SUPERF MDL ^EQORD
     ! Although FORECAST is used here, REVISE or COMPUTE would
     ! produce the same answers, except that COMPUTE would
     ! erase the values for C, I, W1, P, W, and X outside the
     ! USE period.
     FORECAST(TAG=FIT TOL=.001)MDL
     USE 1939 1941
     DOT VARS
     COMPARE : FIT^:
     END

                            1939          1940          1941
                      ---------------------------------------
     C               |    61.6000       65.0000       69.7000
     FIT^C           |    61.5355       63.4028       65.4778
     Difference      |    0.644800d-01  1.59715       4.22213
     Pct. Difference |    0.10%         2.51%         6.44%

                            1939          1940          1941
                      ---------------------------------------
     I               |    1.30000       3.30000       4.90000
     FIT^I           |    1.68737       1.85620      -0.506301d-01
     Difference      |   -0.387371      1.44379       4.95063
     Pct. Difference |    22.95%        77.78%        >1000.0%

                            1939          1940          1941
                      ---------------------------------------
     W1              |    41.6000       45.0000       53.3000
     FIT^W1          |    42.8406       44.7910       47.3741
     Difference      |   -1.24068       0.208960      5.92583
     Pct. Difference |    2.89%         0.46%         12.50%

                            1939          1940          1941
                      ---------------------------------------
     P               |    19.0000       21.1000       23.5000
     FIT^P           |    18.0877       18.2610       20.2576
     Difference      |    0.912234      2.83891       3.24233
     Pct. Difference |    5.04%         15.54%        16.00%

                            1939          1940          1941
                      ---------------------------------------
     W               |    49.4000       53.0000       61.8000
     FIT^W           |    50.6406       52.7910       55.8741
     Difference      |   -1.24068       0.208960      5.92583
     Pct. Difference |    2.45%         0.39%         10.60%

                            1939          1940          1941
                      ---------------------------------------
     X               |    69.5000       75.7000       88.4000
     FIT^X           |    69.8228       72.6590       79.2272
     Difference      |   -0.322891      3.04094       9.17276
     Pct. Difference |    0.46%         4.18%         11.57%

SEE ALSO

     BUILD(3), FREEZE(3), NEWTON(3), SIMULATE(3)


SORITEC SAMPLER MANUAL INDEX

Articles with numeric names refer to chapters of the Primer (Section 1).

TOPIC                                   Article(Section)Page

ACCESS command                          4(1)1
ADDFAC command                          11(1)3
ALIAS flag                              FLAGS(2)2
ALIAS flag, discussion                  FLAGS(2)6
AMORT command                           7(1)2
ANOVA flag                              FLAGS(2)2
AUTOLOG flag                            FLAGS(2)2
Accented characters                     FLAGS(2)3
Add factors                             11(1)3
Arithmetic operators                    2(1)6
Autocorrelation, ARIMA models           MARMA(3)
Autoregression, ARIMA models            MARMA(3)
BETA flag                               FLAGS(2)2
BREAK flag                              FLAGS(2)2
BRIEF flag                              FLAGS(2)3
BUILD command                           BUILD(3)
Batch mode                              1(1)3, 1(1)4
CAUTION flag                            FLAGS(2)3
CCOR flag                               FLAGS(2)3
CLOSE command                           4(1)2
COMPUTE command                         2(1)3, 2(1)5, COMPUTE(3)
CONTENTS command                        4(1)5
CONTINUE command                        5(1)3
CONVERT command                         6(1)2
COPY command                            4(1)3
CORC command                            9(1)2
CORREL command                          6(1)5
COVA command                            6(1)5
CREATE command                          4(1)1
CRT flag                                FLAGS(2)3
CRT flag, discussed                     FLAGS(2)6
Character set                           FLAGS(2)3
Comments                                2(1)1
Comparison of scenarios                 11(1)3
Constants                               2(1)2
Cross-section techniques                8(1)1
DETAIL flag                             FLAGS(2)3
DIF files                               3(1)3
DISCARD command                         4(1)4
DO command                              5(1)1
DOLLAR flag                             FLAGS(2)3
DOLLAR flag, discussion                 FLAGS(2)5
DOT command                             5(1)3
DUMMY command                           6(1)1
DYNAMIC flag                            FLAGS(2)3
Databanks                               4(1)1
ECS flag                                FLAGS(2)3
END command                             1(1)4
EQUATION command                        11(1)1
EXACT flag                              FLAGS(2)3
EXECUTE command                         1(1)5
EXOGENOUS command                       9(1)3
EXPDAMP flag                            FLAGS(2)3
Equations                               2(1)3, 2(1)5, COMPUTE(3), FORECAST(3),
                                        REVISE(3)
Exponential smoothing                   SMOOTH(3)
FASTDIF flag                            FLAGS(2)3
FLAGS article                           FLAGS(2)1
FORECAST command                        9(1)3, 11(1)2, FORECAST(3)
FORGET command                          2(1)12
FORMAT command                          3(1)6
Financial functions                     7(1)1
Flags                                   2(1)12, FLAGS(2)1
Flags, resetting entire configuration   FLAGS(2)1
Forecasting                             FORECAST(3)
Frequencies                             8(1)2
Functions                               2(1)6
GLOBAL flag                             FLAGS(2)3
GOTO command                            5(1)2
GROUP flag                              FLAGS(2)3
Gauss-Seidel method                     11(1)2, SUPERF(3)
Groups                                  2(1)4
HEAD flag                               FLAGS(2)3
HILU command                            9(1)2
Histogram                               8(1)2
IF command                              5(1)2
IMPUTE command                          2(1)10
INSPECT command                         12(1)1
IRR command                             7(1)1
Input/output                            3(1)1
Input/output, databanks                 3(1)10
Interactive mode                        1(1)3, 10(1)1
Internal names                          FLAGS(2)1
JOB command                             1(1)4
JOURNAL flag                            FLAGS(2)3, FLAGS(2)5
Journal files                           1(1)3
KEEP command                            4(1)2
LEGAL function                          2(1)9
LIBRARY command                         4(1)2
LOG flag                                FLAGS(2)3
Lags                                    2(1)1
Leads                                   2(1)1
Logical operators                       2(1)6
MA command                              6(1)4, 13(1)1
MARMA command                           12(1)2, MARMA(3)
MAX command                             6(1)3
MAXERR command                          2(1)13
MIN command                             6(1)4
MISSING command                         2(1)9
MISSING flag                            FLAGS(2)4
MOD command                             6(1)4
MSUM command                            6(1)5
Matrices                                2(1)3
Missing values                          2(1)9
NOEJECT flag                            FLAGS(2)4
NOERROR flag                            FLAGS(2)4
NOTE flag                               FLAGS(2)4
Names of variables                      2(1)1
Names, printing (UPRINT)                FLAGS(2)5
National character sets                 FLAGS(2)3
OFF command                             FLAGS(2)1
OFFLIST command                         2(1)13
ON CRT command                          10(1)1
ON REVISE command                       2(1)8
ON command                              FLAGS(2)1
ONLIST command                          2(1)13
Operators                               2(1)6
PAGESIZE command, with ON CRT           FLAGS(2)6
PARAMETER command                       11(1)1
PATH flag                               FLAGS(2)4
PERFECT flag                            FLAGS(2)4
PLOT command                            3(1)9
PLOT flag                               FLAGS(2)4
PRINT command                           3(1)8
PRINT flag                              FLAGS(2)4
PROMPT flag                             FLAGS(2)4
PUNCH command                           3(1)2
PURGE command                           4(1)2
PV command                              7(1)1
Parameters                              2(1)2
Periodicities                           2(1)5
Pound sign                              FLAGS(2)3
Printing, in procedures                 FLAGS(2)6
Procedures, use of PRINT command        FLAGS(2)6
Programming                             5(1)1
QUIET flag                              FLAGS(2)4
QUIT command                            1(1)4
RAGGED flag                             FLAGS(2)4
RAWEQ flag                              FLAGS(2)4
READ command                            3(1)2, 3(1)6
READDIF command                         3(1)3
RECODE command                          6(1)1
RECOVER article                         RECOVER(2)1
RECOVER command                         2(1)11, RECOVER(2)1
REGRESS command                         9(1)1
RENAME command                          4(1)3
REPLACE command                         4(1)3
REPLACE flag                            FLAGS(2)4, FLAGS(2)5
RESIDUAL flag                           FLAGS(2)4
REVISE command                          2(1)7, REVISE(3)
REVISE flag                             2(1)8, FLAGS(2)4, FLAGS(2)5
ROBUSTSE flag                           FLAGS(2)4
Recovering ^FLAGS                       FLAGS(2)1
Regression, rational distributed lags   MARMA(3)
Regression, with ARMA errors            MARMA(3)
Revising series                         REVISE(3)
SAL files                               3(1)1
SCAN command                            1(1)3, 2(1)13
SCATTER command                         3(1)9
SERIES command                          3(1)8
SET command                             2(1)2
SMOOTH command                          13(1)1, SMOOTH(3)
STATS flag                              FLAGS(2)4
STREAMIO flag                           FLAGS(2)4
SUPERF command                          11(1)1, SUPERF(3)
SWITCH command                          4(1)3
SYMBOLS command                         2(1)12
SYNOPSIS command                        8(1)1
Series                                  2(1)2
Series, forecasting                     FORECAST(3)
Series, periodicities                   2(1)5
Series, revising                        REVISE(3)
Series, smoothing                       SMOOTH(3)
Series, splicing                        REVISE(3)
Series, transformation                  COMPUTE(3)
Simulation                              11(1)1, COMPUTE(3), FORECAST(3),
                                        REVISE(3), SUPERF(3)
Simulation, building a model            BUILD(3)
Simultaneous equations, solution        SUPERF(3)
Smoothing of series                     SMOOTH(3)
Splicing series                         REVISE(3)
Standard errors (ROBUSTSE)              FLAGS(2)4
Superformula                            11(1)1, BUILD(3), COMPUTE(3),
                                        FORECAST(3), REVISE(3), SUPERF(3)
Symbols, ampersand [&]                  2(1)2
Symbols, asterisk [*]                   2(1)2, 2(1)10
Symbols, comma [,]                      2(1)2
Symbols, dollar sign [$]                FLAGS(2)5
Symbols, ellipsis [...]                 2(1)2
Symbols, equal sign [=]                 2(1)2
Symbols, exclamation mark [!]           2(1)1
Symbols, left angle-bracket [<]         2(1)2
Symbols, left curly bracket [{]         2(1)2
Symbols, left parenthesis [(]           2(1)2
Symbols, minus sign [-]                 2(1)2
Symbols, period [.]                     2(1)2
Symbols, plus sign [+]                  2(1)2
Symbols, question mark [?]              2(1)10
Symbols, question mark [?]              2(1)2
Symbols, right angle-bracket [>]        2(1)2
Symbols, right curly bracket [}]        2(1)2
Symbols, right parenthesis [)]          2(1)2
Symbols, semi-colon [;]                 2(1)1
Symbols, slash [/]                      2(1)2
TIME command                            6(1)1
TITLE command                           1(1)4
TOKENS flag                             FLAGS(2)5
TRAIL flag                              FLAGS(2)5
TWOSLS command                          9(1)3
Two-stage least squares                 9(1)3
Tableau mode                            10(1)1
Transfer functions                      MARMA(3)
Transformation                          2(1)5
Transformation, series                  COMPUTE(3)
UPRINT flag                             FLAGS(2)5
USE command                             2(1)4
USE flag                                FLAGS(2)5
USE period, resetting after PROCEDURE   FLAGS(2)1
USEIF command                           2(1)5
VCOV flag                               FLAGS(2)5
Variable names                          2(1)1
Vectors                                 2(1)3
WIDTH command                           1(1)3, 2(1)13
WRITE command                           3(1)7, 3(1)9
WRITEDIF command                        3(1)4
Wildcards                               2(1)10
XTAB command                            8(1)2
PageFoot(); ?>