## The Data Step

### Subsetting IF

In a DATA step, a subsetting IF statement keeps only the observations for which the condition is true; all other observations are excluded from the output data set.

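```sas
data tornados_1980s;
    infile FileName;
    * city is character, and plain $ would truncate it at 8 characters, so read up to 20 (assumed width);
    input year city :$20. damages;
    * keep only observations from the 1980s;
    if 1980 <= year <= 1989;
run;
```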
• IF ... IN statement:

```sas
if year in (1980, 1981, 1982);
```

• AND, OR (year is numeric, so it is compared without quotes):

```sas
if year = 1980 and city = 'Baltimore';
```
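The conditions above can also be combined with OR. The following is a minimal sketch, not taken from the original notes: the data set name `subset` and the four data rows are hypothetical, made up purely for illustration. The parentheses make the grouping of AND and OR explicit.

```sas
data subset;
    * :$20. reads city as character, up to 20 characters (assumed width);
    input year city :$20. damages;
    * keep Baltimore rows from 1980, plus every row from 1985;
    if (year = 1980 and city = 'Baltimore') or year = 1985;
    * hypothetical sample rows, made up for illustration only;
    datalines;
1979 Baltimore 40
1980 Baltimore 100
1980 Tulsa 250
1985 Tulsa 300
;
run;
```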

### infile 'filename'

In the DATA step, read data from an external file with the INFILE statement.

```sas
data tornados;
    infile 'tornados.dat';
    * city is character; :$20. reads up to 20 characters (assumed width);
    input year city :$20. cost;
run;
```

### Set

• Use the SET statement to create a new data set from an existing one. The following creates a data set of 1980s tornado data from the larger set of tornado data.

```sas
data tornados_1980s;
    set tornados;
    if 1980 <= year <= 1989;
run;
```

## The PROC Step

### PROC SORT

• The SORT procedure sorts a data set. You can sort by multiple variables.
• Once the data is sorted, PROC PRINT can print it grouped by a variable.

```sas
proc sort data=tornados;
    by year city;
run;

proc print data=tornados;
    by year;
run;
```

### PROC Univariate

PROC Univariate generates descriptive statistics. The HISTOGRAM statement adds a histogram of the named variable.

```sas
proc univariate data=tornados;
    histogram year;
run;
```

### PROC means

Use proc means when you are only interested in basic descriptive statistics.
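As a minimal sketch of what this could look like for the tornado data used earlier: the rows below are hypothetical, made up for illustration only, and the choice of statistics and of `year` as the grouping variable is an assumption, not something the original notes specify.

```sas
* hypothetical sample rows, made up for illustration only;
data tornados;
    input year city :$20. cost;
    datalines;
1980 Baltimore 100
1980 Tulsa 250
1981 Baltimore 75
1981 Tulsa 300
;
run;

* count, mean, standard deviation, minimum and maximum of cost within each year;
proc means data=tornados n mean std min max;
    class year;
    var cost;
run;
```

Dropping the CLASS statement would summarize `cost` over the whole data set instead of by year.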

### PROC freq

• PROC FREQ generates frequency tables for categorical data.
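A short sketch of a frequency table request, assuming the tornado variables used earlier. The data rows are hypothetical and made up for illustration only. The TABLES statement here asks for a one-way table of `city` and a two-way cross-tabulation of `year` by `city`.

```sas
* hypothetical sample rows, made up for illustration only;
data tornados;
    input year city :$20. cost;
    datalines;
1980 Baltimore 100
1980 Tulsa 250
1981 Baltimore 75
1981 Tulsa 300
;
run;

proc freq data=tornados;
    * one-way counts for city, then a year-by-city cross-tabulation;
    tables city year*city;
run;
```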

### PROC gplot

```sas
proc gplot data=tornados;
    plot year*cost;
    title 'Year by Cost tornados';
run;
```

### PROC corr

#### compute the correlation

```sas
proc corr data=grades;
    var exam1 exam2 hwscore;
run;
```

### PROC reg

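• p: prints the observed values, predicted values, and residuals
• r: everything p prints, plus a residual analysis (standard errors, studentized residuals, Cook's D)
• clm: 95% confidence limits for the mean response at each observation
• cli: 95% prediction limits for an individual response at each observation

```sas
proc reg data=grades;
    model final=exam1 hwscore / p r cli clm;
    plot final*hwscore;
run;
```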

## Multiple Regression Analysis

### Variable Selection

SAS has several methods for selecting variables.

```sas
proc reg data=cdi;
    model y = x1-x8 / selection=rsquare best=1;
    model y = x1-x8 / selection=adjrsq best=5;
    model y = x1-x8 / selection=cp best=10;
    model y = x1-x8 / selection=forward slentry=0.10;
    model y = x1-x8 / selection=stepwise slentry=0.10 slstay=0.10;
    model y = x1-x8 / selection=backward slstay=0.10;
run;
```

Additional pages to try: [more sas](http://www.ats.ucla.edu/stat/sas/modules/graph.htm)