# Entering Data Using the SPSS Data Editor

Reading: SPSS Base 9.0 User's Guide, Chapter 4, Data Editor
Activities: Run the SPSS tutorial "Getting Data: Using the Data Editor"
Homework: Data Editor

1. Overview
2. Variable Definition
   - Integer Numeric without Value Labels (ID) (EXTRAV)
   - Integer Numeric with Value Labels (CLASS) (ESSAY)
   - Integer Numeric with Common Value Labels (SHOCK, TUITION, PHDPSY)
   - Decimal Numeric (INCOME)
   - Date Variable (BIRTHDAY)
   - String Variable with Value Labels (GENDER)
3. Missing Values Summary
   - System Missing Values (BIRTHDAY) (CLASS)
   - User Missing Values (SHOCK, TUITION, PHDPSY)
   - String Missing Values (GENDER)

## 1. Overview

There are three steps to creating a data file using the SPSS Data Editor:

1. Define the Variables
2. Enter the Values
3. Save the Data File

For this discussion, let's assume that we have run an experiment to test the hypothesis that writing an essay in favor of doubling tuition at UCCS would make people more accepting of an increase in tuition. The participants were randomly divided into two groups: one group wrote an essay about why tuition should be doubled, and the other group wrote an essay about the value of intercollegiate sports. We then administered a questionnaire to gather some demographic information about the participants. The questionnaire also measured attitudes towards several issues, including the issue of doubling tuition. These variables are summarized in Table 1. The responses from 4 participants are shown in Table 2.

Table 1. Codebook for the Tuition Study

| Name | Variable Type | Variable Label / Value Labels |
| --- | --- | --- |
| ID | Numeric 3.0 | |
| Demographic data | | |
| BIRTHDAY | Date (mm/dd/yyyy) | Date of birth of the respondent (mm/dd/yyyy) |
| GENDER | String 1 | Gender of respondent / "F" "FEMALE", "M" "MALE" |
| CLASS | Numeric 1.0 | Year in college / 1 'Freshman', 2 'Sophomore', 3 'Junior', 4 'Senior' |
| INCOME | Dollar 11.2 | Monthly income of respondent in dollars and cents |
| Treatment Conditions (Independent Variables) | | |
| ESSAY | Numeric 1.0 | Wrote essay about / 1 "doubling tuition", 2 "intercollegiate athletics" |
| Individual Difference Variables | | |
| EXTRAV | Numeric 2.0 | Extraversion score |
| Dependent Variables | | |
| SHOCK | Numeric 1.0 | Electric shock should not be used in experiments / 1 "Strongly Disagree", 2 "Disagree", 3 "Slightly Disagree", 4 "Slightly Agree", 5 "Agree", 6 "Strongly Agree", 9 "No Opinion on This Issue" |
| TUITION | Numeric 1.0 | Tuition at UCCS should be doubled / same value labels as SHOCK |
| PHDPSY | Numeric 1.0 | UCCS should have a Ph.D. program in Psychology / same value labels as SHOCK |

The entries in the Variable Type column indicate whether the variable is a numeric, date, dollar, or string variable; the width of the variable; and, for numeric variables, how many decimal places it has. The first number after the type is the maximum width of the variable (including the whole digits, the decimal point, if any, and the decimal digits, if any). The number after the decimal point is the number of decimal places for the variable. Note that string variables and date variables do not have decimal places.
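The width-and-decimals notation used in the codebook can be unpacked mechanically. The following is a minimal Python sketch, not part of SPSS; the names `VarFormat` and `parse_format` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import NamedTuple


class VarFormat(NamedTuple):
    kind: str      # e.g. "Numeric", "Dollar", "String"
    width: int     # maximum display width
    decimals: int  # decimal places (0 when none are given)


def parse_format(spec: str) -> VarFormat:
    """Split a codebook entry such as 'Numeric 3.0' into its parts."""
    match = re.fullmatch(r"\s*([A-Za-z]+)\s*(\d+)(?:\.(\d+))?\s*", spec)
    if match is None:
        raise ValueError(f"unrecognized format: {spec!r}")
    kind, width, decimals = match.groups()
    return VarFormat(kind.capitalize(), int(width), int(decimals or 0))


for entry in ["Numeric 3.0", "Dollar 11.2", "String 1", "Numeric8.2"]:
    print(entry, "->", parse_format(entry))
```

Reading "Dollar 11.2" this way gives a width of 11 and 2 decimal places, which matches the description of INCOME in Table 1.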

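Table 2. Responses from Four Participants

| ID | BIRTHDAY | GENDER | CLASS | INCOME | ESSAY | EXTRAV | SHOCK | TUITION | PHDPSY |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 3/20/1975 | F | 1 | 1823.62 | 1 | 50 | 5 | 2 | 6 |
| 2 | 5/32/1977 | M | 3 | 128.50 | 2 | | 6 | 2 | 4 |
| 3 | 10/3/68 | F | 4 | 1239252 | 2 | 23 | 5 | 9 | 5 |
| 4 | 10/10/1582 | F | 2 | 879 | 1 | 72 | 9 | 9 | 5 |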

There are no hard and fast rules about how to organize the variables within the data file. I typically begin with an ID number followed by demographic data (age, gender, SES, etc.), the independent variable condition(s), individual difference measures, and finally the responses to the dependent variables. The data in Table 2 follow that organization. The variables in this example were chosen to illustrate some of the most common data types that students in the department have used over the years.

Parenthetically, you will normally be entering your data directly from the response sheets that you give to your participants. I don't recommend copying over the responses into a tabular format as in Table 2. The extra step of copying over the data onto a sheet to give to a data entry person who then enters the data into the computer will most surely cause errors to creep into the data.


## 2. Variable Definition

(Recommendation: Open SPSS in another window and create the data file described in these notes.)

Variable definition includes naming the variable (Variable Name), defining the type of variable, e.g., numeric or string (Variable Type), giving a long name for the variable (Variable Label), providing descriptions of the values that are entered into the data file (Value Labels), and defining missing values (Missing Values). Each of those elements is described below.

### ID (integer variable)

a. Variable Name

Let's enter the variable definition information for each of the variables in Table 1. First, double left click on the word "var" in the 1st column (the upper left corner) of the worksheet. The Define Variable dialog box will open and the word "VAR00001" will be highlighted in the Variable Name text box. You could use the default name for the variable, VAR00001, but it is better to use a name that is descriptive of the variable. Enter ID as the variable name. You should become familiar with the rules for naming variables (see the SPSS Help window under variable naming rules). If you use names that begin with a letter, that contain only letters, numbers, and the symbols @, #, _, \$, or period, and that are no longer than 8 characters, you should run into no problems.

[Note: To find the rules for naming variables press: Help, Topics, Index. Then enter the phrase variable names:rules.  Then press the Display button.]

Click OK at the bottom of the Define Variable window and enter the values for ID for the first four cases. An easy way to do this is to move the cursor to the 1:ID cell, highlight the cell and then key in the value, 1. Then press the arrow key in the direction that you want to move, down in this instance. The value will be entered into the data file and the next cell, 2:ID, will be highlighted. Continue until all four values have been entered. Notice that the values are displayed with two decimal places, even though you only entered a whole number.  Open the Define Variable window again.

In the Variable Description section of the Define Variable window the Type is defined as Numeric8.2, there is no Variable Label, there are no Missing Values, and the Alignment of the data is Right justified within the space allotted to the variable. These are the default values for every new variable. A default value is the value that is assigned by SPSS in the absence of any information provided by the user. Each of these elements is described in more detail below.

Variable names are not case sensitive. The following names are identical: ID, Id, and id. The SPSS Data Editor always displays the SPSS variable name in lower case letters.
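The naming guidance in this section can be written down as a quick check. This is a rough Python sketch of only the rules stated in these notes (start with a letter; letters, digits, @, #, _, $, or period; at most 8 characters; case ignored). The function names are hypothetical, and the full SPSS rules in the Help file may include conditions not modeled here.

```python
import re

# First character a letter, then up to 7 more allowed characters (8 total).
_SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9@#_$.]{0,7}")


def is_safe_name(name: str) -> bool:
    """True if the name follows the conservative rules given in the notes."""
    return _SAFE_NAME.fullmatch(name) is not None


def same_variable(first: str, second: str) -> bool:
    """Variable names are compared without regard to case."""
    return first.lower() == second.lower()


for candidate in ["ID", "BIRTHDAY", "PHDPSY", "1ID", "EXTRAVERT", "MY VAR"]:
    print(f"{candidate!r}: {is_safe_name(candidate)}")

print(same_variable("ID", "id"))
```

All ten names in Table 1 pass this check; a name such as "EXTRAVERT" fails only because it is nine characters long.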

b. Variable Type

Move the cursor to the Type... button and left click. (If you press the ENTER key instead, the Define Variable box will close.) After pressing the Type... button several variable type options are presented. In psychology the most commonly used variable types are numeric, string, and date. Numeric variables can consist of the digits from 0 through 9 and an optional decimal point. String variables can contain any letters, numbers, and symbols. Date variables typically consist of a year, month, and day, but they can also include hours, minutes, and seconds. Typical date variables are date of birth, date and time of testing, etc.

ID is a numeric variable. Note that the "numeric" box is already checked. The width of the variable refers to how many spaces will be reserved for the variable when its values are displayed. Decimal places refers to how many of the width digits will be reserved for the decimal point and the decimal part of the number. The width does not refer to how many digits are stored in the data file; it refers to how many digits will be displayed in the Data Editor and in the output. For example, if you set the width at 2 digits, you can still enter a value that is 3 or more digits wide into the data file. Values that are wider than the defined width are displayed as asterisks (*).

The default width is 8 digits and the default number of decimal places is 2, resulting in the data type of Numeric8.2. Notice that the values you entered (1, 2, 3, and 4) are displayed as 1.00, 2.00, 3.00, and 4.00. The optimal width of a numeric variable is determined by the range of values that are possible for the variable. If you have, say, between 10 and 99 cases, then the width of the ID variable should be set at 2. If you have between 100 and 999 cases, then the width of the ID variable should be set at 3. Let's set the Width to 2 digits and the number of decimal places to 0. Press Continue to close the Type... dialog box. Then click "OK" to close the Define Variable window. Note that the values are displayed as whole numbers rather than as decimal numbers. Try entering a decimal number. Note that decimals will be rounded to whole numbers in the display. Remember that the width and number of decimal places refer to the display of the values, not to the actual number that is stored in the data file.
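The difference between what is stored and what is displayed can be imitated in a few lines of Python. This is only an analogy, not SPSS code: the `display` function is hypothetical, and filling the whole field with asterisks on overflow is an assumption based on the description in these notes.

```python
def display(value: float, width: int, decimals: int) -> str:
    """Format a stored value for display; the stored value is untouched."""
    text = f"{value:.{decimals}f}"
    if len(text) > width:
        # Assumption: an overflowing value is shown as a field of asterisks.
        return "*" * width
    return text.rjust(width)  # numeric values are right justified


stored = 3.7
print(repr(display(1, 8, 2)))       # default Numeric8.2 display of the value 1
print(repr(display(stored, 2, 0)))  # rounded in the display only
print(repr(display(123, 2, 0)))     # too wide for a width of 2
print(stored * 2)                   # computations still use the stored 3.7
```

The last line is the point of the exercise: changing the display format never changes the number used in later calculations.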

In SPSS versions 8.0 and 9.0, assigning the width of a numeric variable seems to have no effect on how that variable is either saved or displayed.

c. Variable Label

Next, press the Labels... button. Two options appear in the dialog box: Variable Label and Value Labels. A variable label is a longer description of the variable. Recall that the name of the variable can be no longer than eight characters. It is not mandatory to have a variable label. For example, ID is descriptive in itself, so you probably do not need to add a longer variable label such as "Participant Identification Number."

Variable labels will preserve the case (upper and lower case) as entered.

d. Value Labels

Value labels identify the coding scheme for the values. Value labels are not mandatory, and they would not be used for ID values or other interval type data such as temperature values or scores on tests (e.g., you wouldn't label each value of an IQ score). Value labels are typically used when the value refers to a specific category such as "male" and "female," or to the scale values for a Likert-type response scale, e.g., "strongly agree." Let's leave the Labels section blank for the ID variable. Click the Cancel button to exit the dialog box.

e. Missing Values

Because you assign the values of ID, there are no missing values.

f. Column Format

Column format refers to how the values are displayed in the Data Editor. We have already altered how the values of ID are displayed by assigning values for the width and number of decimal places. Entering a value for Column Width will change the width of the display for the Data Editor only. The values you entered for the Variable Type will be in effect for any output involving those values. To see how this works, change the column width to 2, press Continue, and then press OK to exit the Define Variable dialog box. The width of the ID column has been narrowed to two print columns. Any number that is wider than 2 digits is displayed as asterisks, "**". Try it for yourself.

You can also change the display width of a variable by moving the cursor to the edge of the name of the variable and then dragging the column to make it wider or narrower.

Numeric variables are always aligned to the right.

g. Measurement

Measurement refers to the scale of measurement: nominal, ordinal, interval, or ratio. SPSS allows you to assign one of three categories of measurement: nominal, ordinal, or scale. "Scale" refers to both interval and ratio scales. There is only one place in SPSS for Windows where this information is used: in some chart (graphics) procedures that identify the measurement type. The help files also indicate that this information is used when you use an SPSS data file with a program called "Answer Tree." Answer Tree is not a part of SPSS 8.0 or 9.0.

For a review of the scales of measurement see Scales of Measurement


### BIRTHDAY (date variable)

a. Variable Name

Click on the 2nd variable column on the worksheet. Enter BIRTHDAY as the variable name.

b. Variable Type

Open Type... and click the button in front of Date. Select mm/dd/yyyy from the date options that appear and click the Continue button. Variable and value labels are not necessary or desirable for this variable so you can close the Define Variable dialog box.

Enter the first birthday value from Table 2 in the 1:BIRTHDAY cell (3/20/1975).

Enter the date for case #002 (5/32/1977). The Data Editor should beep and refuse to enter a value because the date is not possible. Reenter the date as 5/30/1977.

Enter the date for case #003 (10/3/68). Note that the year in the data file has been changed to 1968, while the display remains at 68. If you enter a 2-digit year, SPSS will automatically add "19" to the year. If you want to enter a date that is not in the 20th century, then you must enter all the digits of the date.  Try entering a date from the year 2001.

Enter the date for case #004 (10/10/1582). The Data Editor beeps and does not enter the date. Why? SPSS stores the date as the number of seconds from October 14, 1582 (the start of the Gregorian calendar). As a consequence you cannot enter a date that is on or before October 14, 1582. Enter the date as 10/15/1582.
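The date behavior described above (impossible dates refused, two-digit years read as 19xx, dates held as seconds counted from October 14, 1582) can be mimicked with Python's standard `datetime` module. This is an illustrative sketch under those stated assumptions, not a description of SPSS internals; `parse_mdy` and `seconds_since_epoch` are hypothetical helper names.

```python
from datetime import datetime

# Reference point named in the notes; Python's calendar is proleptic Gregorian.
SPSS_EPOCH = datetime(1582, 10, 14)


def parse_mdy(text: str) -> datetime:
    """Parse an mm/dd/yyyy entry, reading a two-digit year as 19xx."""
    month, day, year = (int(part) for part in text.split("/"))
    if year < 100:
        year += 1900
    # datetime raises ValueError for impossible dates such as 5/32/1977.
    return datetime(year, month, day)


def seconds_since_epoch(moment: datetime) -> float:
    """Number of seconds between the reference date and the given date."""
    return (moment - SPSS_EPOCH).total_seconds()


for entry in ["3/20/1975", "5/32/1977", "10/3/68", "10/15/1582"]:
    try:
        parsed = parse_mdy(entry)
        print(entry, "->", parsed.date(), seconds_since_epoch(parsed))
    except ValueError as error:
        print(entry, "-> rejected:", error)
```

Because a date is just a count of seconds, subtracting two dates gives an elapsed time directly, which is why the measurement level of a date variable is "scale."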

c. Variable Label

Not needed.

d. Value Label

Not needed.

e. Missing Values

The easiest way to deal with missing values for date variables is to just leave the value blank. A blank numeric value will be displayed as a single period (.). Try deleting one of the date values and note that the result is a period for that case. A blank numeric value or a blank date value is defined as a system missing value. SPSS will correctly handle system missing values.

f. Column Formats

Date variables are 10 digits wide (2 for the month, 2 for the day, 4 for the year, and 1 each for the two "/" separators). Try changing the column format for this date variable from 10 digits to 8 digits. What happens?

g. Measurement

The scale of measurement for a date variable is "scale."


### GENDER (string variable)

a. Variable Name

Click on the 3rd variable column of the worksheet. Enter GENDER as the name for the third variable.

b. Variable Type

Click string as the variable type. Because only one letter is needed to enter the M and F codes, enter 1 as the number of characters. The term "string" variable is synonymous with "alphanumeric" variable.

The data editor will only allow you to enter as many characters as you have defined in this dialog box. Because the number of characters was defined as 1, you will only be allowed to enter single characters. The data editor will beep at you if you try to enter more than one character.

c. Variable Label

This is another instance where the SPSS name for the variable is very descriptive. You may or may not wish to enter a variable label such as "Gender of the Respondent." Here is a rule of thumb: think about another person working on the research project after you are gone. Will that person clearly understand the SPSS variable name? If there is any possibility for misinterpretation, you should include a longer variable label. SPSS variable names can be no longer than 8 characters; variable labels can be up to 256 characters long.

d. Value Label

SPSS is sensitive to the case of string values. The value "F" is different from the value "f". The reason for this is that strings are coded according to their ASCII codes. For example, the ASCII code for a capital F is 070 while the ASCII code for a small f is 102. This adds a complication to entering values for string variables: you must be consistent in the case that you are using. In this example gender is coded as upper case M's and F's. If you used both upper and lower case F's and M's as values for gender, then SPSS would think you had four different genders: F's, f's, M's, and m's.

Suppose you decide to enter the value M for males and the value F for females in the datafile. Then you can attach the value label "male" to the letter M and the value label "female" to the letter F.  Enter M in the Value: box.  Then enter "male" (without the quotes) in the Value Label: box. Then press the Add button to add the value label to the list of value labels.  Do the same for F and "female."

e. Missing Values

There is no such thing as a system missing value for a string variable. Blank string values are considered to be valid values. If you have missing data for gender and happen to leave the value blank, SPSS will think you have three valid genders, M, F, and blank. Therefore you must set up a user-defined missing value for string variables.

To enter user-defined missing values click -

Define Variable
Missing Values...
Discrete missing values

Your cursor should be in the first of the three boxes. Press the space bar once and then click the Continue button. You have now defined a single blank as a user missing value.

f. Column Formats

Recall that the column formats dialog box refers to the display of values in the Data Editor. The column width was automatically set to the width of the string variable, 1 in this instance. The text alignment was automatically set to left. By default, string or text variables are left justified. SPSS assumes that all string variables are left justified. Numeric variables are right justified.

g. Measurement

When you identified the variable type as "string," the measurement type was automatically set to "nominal."  Nominal is the correct measurement type for gender.

Close the Define Variable dialog box.

#### Entering string data

Variables that have defined value labels can be entered in one of two ways: (a) by entering the value itself, or (b) by clicking on the label.

Note that only the first letter of the variable name, "g," is displayed in the data editor. In order to see what is happening when we enter the data, use your cursor to drag the column width wider so that you can read the whole variable name, "gender."  As you expand the display width of the variable you probably see the value label for gender rather than the value itself. You can toggle between displaying values and value labels.  The toggle switch is located in the Value Labels icon near the right end of the icon bar in the Data Editor, and in the

-Value Labels

Let's start by turning on the display of value labels.

#### Entering the value itself

Move the cursor to cell 1:gender and enter the value for the first case, "F." The value label that you defined for "F," Female, appears in the cell. Move the cursor to cell 2:gender and enter the value for the second case, "M." The value label that you defined for "M," Male, appears in the cell. The values for string variables are stored in the SPSS system file as ASCII values; the Data Editor can display either the value or its associated value label. Try toggling back and forth between displaying the ASCII values and the associated value labels.

The third and fourth cases in our example data are females. This time enter their values as lowercase "f"s and then switch back and forth between viewing the value and the value label. Note that "f" is not associated with a label, so only the value is displayed. You now have two different values in your dataset for females, "F" and "f". If you ran a frequencies procedure on gender, it would provide a separate count for all the "F"s and all the "f"s in the data file. Remember to be consistent in the case that you use when entering string data.
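The effect of mixed case on a frequency count is easy to reproduce outside SPSS. The short Python sketch below uses the four gender entries from this exercise; the `labels` dictionary stands in for the value labels and is only an illustration.

```python
from collections import Counter

# Gender as keyed in during this exercise: cases 3 and 4 in lower case.
genders = ["F", "M", "f", "f"]
labels = {"F": "female", "M": "male"}

print(ord("F"), ord("f"))  # the two letters have different ASCII codes

# A frequency table treats "F" and "f" as separate categories.
print(Counter(genders))

# Only exact matches pick up a value label; "f" is shown as itself.
print([labels.get(value, value) for value in genders])

# Entering the data consistently (here, all upper case) avoids the split.
print(Counter(value.upper() for value in genders))
```

With consistent case the count collapses to three females and one male, which is what the data actually contain.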

(Note: My version of SPSS, 8.0.0, seems to have a bug. When the "column width" is set to 1 the value "M" does not appear in the cell, although the value "F" does.)

#### Using Value Labels to enter the value

The data editor will allow you to enter values by clicking on the value label itself. To try this out toggle the Value Labels switch to display value labels. Highlight the cell to enter data and press Ctrl-(left)click. A value labels box will open. You can double (left)click the appropriate value to enter it into the data file. The value that is coded into the data file is shown at the top in the cell editor window.

This seems like a very long way to go about entering the values M and F. However, I can see some value in this approach if you have a very long list of value labels. Suppose that you are classifying psychological traumas and have a list of 20 different codes. It might be helpful to be able to choose from a list that comes up.

This string example was simplified by limiting the width of the string to a single character. Things get more complicated if your string values are longer than one character. Let's consider some of the complications that occur if we set the width of the gender variable to two characters rather than one character. There are now eight possible ways of entering upper and lower case M's and F's: M-space, space-M, m-space, space-m, F-space, space-F, f-space, and space-f. If you entered each of those variations, SPSS would identify eight different values for gender.

Suppose you enter M and then press Enter or an arrow key. Will SPSS interpret this as M-space or space-M? By default SPSS always left justifies string variables. If there are spaces left over, then SPSS pads with blanks to the right. So, M would be entered into the data file as M-space. (Numeric variables are always right justified.)

You need to remember those rules when you assign value labels. If you enter M without any spaces, the value label will look like this: M = "Male". SPSS will pad to the right with blanks and look for the value M-blank. That is, blank-M will not be assigned the value label of "Male."

What do you suppose happens if you define a single space as a user-missing value?   If a string variable that is two characters wide (String 2) is left blank will that blank value be defined as missing?

To summarize: (a) string variables are left justified, (b) SPSS pads string variables to the right with blanks.
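The padding rule summarized above can be sketched in Python for a string variable that is two characters wide. The helper names `store` and `label_for` are hypothetical; the sketch only imitates the left-justify-and-pad behavior described in these notes.

```python
WIDTH = 2  # a String 2 variable


def store(text: str, width: int = WIDTH) -> str:
    """Left justify the entry and pad on the right with blanks."""
    if len(text) > width:
        raise ValueError(f"{text!r} is wider than {width} characters")
    return text.ljust(width)


# Value labels are keyed by the padded value, e.g. "M " rather than "M".
labels = {store("M"): "Male", store("F"): "Female"}


def label_for(stored: str) -> str:
    """Return the value label, or the raw value when no label matches."""
    return labels.get(stored, stored)


print(repr(store("M")))      # 'M ' : M followed by a blank
print(label_for(store("M")))  # the label is found
print(repr(label_for(" M")))  # blank-M is a different value, so no label
```

The same reasoning shows why space-M and M-space are counted as two separate categories.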


### CLASS (integer numeric with value labels)

a. Variable Name

Click on the 4th variable column of the worksheet.

Define Variable
Variable Name:

Enter CLASS as the variable name.

b. Variable Type

Define Variable
Type...

Variable Type is Numeric1.0 (width = 1, decimal places = 0)

c. Variable Label

Define Variable
Labels...
Variable Label

The variable name "class" may not be readily understood by someone else so you should enter a variable label. How about "Year in school?"

d. Value Label

Define Variable
Labels...
Value Labels
Value:
Value Label:

Enter the four value labels: 1 = 'Freshman', 2 = 'Sophomore', 3 = 'Junior', and 4 = 'Senior.'

e. Missing Values

Missing values should be left blank. SPSS will consider the blank numeric values to be system missing values.

f. Column Formats

The column format is set to the width of the variable, 1 in this instance. It can be left as is, or you could set it to a wider value.

g. Measurement

What scale of measurement is "year in school?" It is at least ordinal: a person has earned more credit hours if he or she is a senior than if he or she is a junior. Is the interval between freshman and sophomore the same as the interval between sophomore and junior?


### INCOME (Decimal Numeric)

Income is saved in dollars and cents. The variable type is "dollar." Because INCOME has values both to the right and left of the decimal point, it is a decimal numeric number. You could define the width as 11 and the decimal places as 2, indicating that the total number of digits displayed, including the decimal point and the dollar sign, is 11 and the number of digits displayed after the decimal place is 2. Remember that this format determines how the values will be displayed. The format does not affect how the values are stored, or how the values are used in computations.

a. Variable Name

Click on the 5th variable column of the worksheet.

Define Variable
Variable Name:

Enter INCOME as the variable name.

b. Variable Type

Define Variable
Type...

The variable type is Dollar.

c. Variable Label

Define Variable
Labels...
Variable Label:

Enter "Monthly income" as the Variable Label.

d. Value Label

There is no reason to label each individual value for the INCOME variable.

e. Missing Values

I recommend using system missing values for this variable. Simply leave the cell blank if the income data is missing.

f. Column Formats

The column format for the data editor has been set automatically to the width of the dollar variable.

Commas will be displayed in the Data Editor if the width of the variable will accommodate them; otherwise the display of commas is suppressed. Decimal places will be displayed if the width of the variable will accommodate them; otherwise the display of the decimal part of the number is suppressed. The dollar sign will be displayed if the width of the variable will accommodate it; otherwise the display of the dollar sign is suppressed. After entering the income data, use your cursor to change the width of the income variable and note the changing display for commas, cents, and the dollar sign.

g. Measurement

The scale of measurement for the dollar variable is "scale." It is actually a ratio variable, because a dollar amount has a rational zero point. It is right justified, as are all numeric variables.


### ESSAY (Integer Numeric with Value Labels)

a. Variable Name

Click on the 6th variable column of the worksheet.

Define Variable
Variable Name:

Enter ESSAY as the variable name.

b. Variable Type

Define Variable
Type...

It takes only a single digit to code the values of ESSAY, the independent variable. The essay condition code is a one-digit whole number. Therefore the variable type is Numeric. Set the width to 1 and the number of decimal places to 0.

c. Variable Label

Define Variable
Labels...
Variable Label:

The Variable Label could be "Essay Condition."

d. Value Label

Define Variable
Labels...
Value:
Value Label:

Enter the two value labels: 1 = "doubling tuition" 2 = "intercollegiate athletics."

e. Missing Values

Everyone should have been assigned to a treatment condition so there should be no missing values.

f. Column Formats

Column format has been set to a width of 1, the width of the variable set in variable type.  This is a numeric variable so it is right justified.

g. Measurement

What is the scale of measurement for this variable?


### EXTRAV (Integer Numeric without Value Labels)

a. Variable Name

Click on the 7th variable column of the worksheet.

Define Variable
Variable Name:

Enter EXTRAV as the variable name.

b. Variable Type

The extraversion score is a two-digit, whole number. Therefore the variable type is Numeric. Set the width to 2 and number of decimal places to 0.

c. Variable Label

Define Variable
Labels...
Variable Label:

The extraversion score is a subscale of the Eysenck Personality Inventory (EPI). The Variable Label might be "Eysenck Personality Inventory (EPI): Extraversion Subscale."

d. Value Label

Individual values of scales (e.g., personality measures, IQ, etc.) are not labeled.

e. Missing Values

I recommend using system missing values for this variable. Simply leave the cell blank if the extraversion score is missing.

f. Column Formats

Changing the Column Formats will change how the values are displayed in the Data Editor.

g. Measurement

The scale of measurement for nearly all personality scales is interval.  This extraversion scale is measured as an interval scale so the measurement type should be set to "scale."


### SHOCK, TUITION, PHDPSY (Integer Numeric with Common Value Labels)

SHOCK, TUITION, and PHDPSY have a common set of value labels. You can set up a template and then use that template to define the common value labels, missing values, and column format for the other variables. To define a template click

Data (on the top row of buttons)
Templates...
Define>>
Name:

Enter a name for the template, e.g., "6-PT SCALE"  and click the Add button to add this to the list of templates.

Type...

Set the variable type as Numeric1.0 (width = 1, decimal places = 0)

Value Labels...

Enter the following value labels: 1 = "Strongly Disagree"
2 = "Disagree"
3 = "Slightly Disagree"
4 = "Slightly Agree"
5 = "Agree"
6 = "Strongly Agree"
9 = "No Opinion on This Issue"

Missing Values...

The value "9" (no opinion on this issue) is an out-of-range value that should not be considered when finding the mean or other statistics for this variable. All 9s will be considered to be valid values unless you explicitly define them as missing. Because you define which values are to be considered missing those values are called user-defined missing values.

There is only a single missing value for this variable. Click the discrete missing values option, enter "9" (without the quotes) in the first box, and then press Continue.

Column Format...

Enter a value for Column Format, e.g., 4 or 5 and press Continue.

To save the new template press Close.

The template can now be used to define new variables. First, define the variable name and variable label and then use the template to add the variable type, value labels, missing values, and column format.

SHOCK

a. Variable Name

Click on the 8th variable column of the worksheet.

Define Variable
Variable Name:

Enter SHOCK as the variable name.

b. Variable Label

Define Variable
Labels...
Variable Label:

Enter "Electric shock should not be used in experiments" and click Continue. Close the Define Variable dialog box by clicking the OK button.

Add the values from the 6-PT SCALE template.

Data
Templates...
Select the 6-PT SCALE template
Click all the options in the Apply section.
Click OK to add the template values and close the template.

You can reopen the Define Variable dialog box to verify that the template has been appropriately applied.

The scale of measurement for this type of variable is a source of some controversy among statisticians.  The most conservative statisticians argue that this 6-point Likert type scale is at most ordinal. The majority would agree that the scale is interval.  I know of no case where a journal editor has rejected a paper where t-tests or analyses of variance (statistics for interval level of measurement) had been performed on Likert-type scales.  I go with "interval" as the scale of measurement for this type of scale.

Follow the same steps to define TUITION and PHDPSY.
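The idea behind a template, one shared definition applied to several variables, can be pictured as a small data structure. The Python sketch below is only an analogy for what the Templates dialog does; the dictionary layout, the `define_variable` helper, and the column width of 5 are illustrative choices, not SPSS features.

```python
from copy import deepcopy

# One shared definition, mirroring the "6-PT SCALE" template.
SIX_POINT_SCALE = {
    "type": ("Numeric", 1, 0),  # kind, width, decimal places
    "value_labels": {
        1: "Strongly Disagree",
        2: "Disagree",
        3: "Slightly Disagree",
        4: "Slightly Agree",
        5: "Agree",
        6: "Strongly Agree",
        9: "No Opinion on This Issue",
    },
    "missing_values": {9},
    "column_width": 5,
}


def define_variable(name: str, label: str, template: dict) -> dict:
    """Combine a name and variable label with a copy of the template."""
    definition = deepcopy(template)  # each variable gets its own copy
    definition["name"] = name
    definition["label"] = label
    return definition


variables = {
    name: define_variable(name, label, SIX_POINT_SCALE)
    for name, label in [
        ("SHOCK", "Electric shock should not be used in experiments"),
        ("TUITION", "Tuition at UCCS should be doubled"),
        ("PHDPSY", "UCCS should have a Ph.D. program in Psychology"),
    ]
}

for name, definition in variables.items():
    print(name, definition["type"], sorted(definition["missing_values"]))
```

Copying the template for each variable means a later change to one variable's labels does not silently alter the others.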

COMMENT: Templates are stored as a part of the SPSS program on your computer, not as a part of the data file that you have created. There is no way that I know of to save a template to a floppy disk. Unless you log onto the same computer again, you will probably have to reenter the template.


## 3. Missing Values Summary

System missing values

-Leave the cell blank.
-No special action needs to be taken when system missing values are used in an SPSS data file.

User-defined missing values

-User-defined missing values should be clearly "out-of-range" of the regular values for the variable, e.g., "-9" or "-99" for age.

-If user-defined missing values are used then you must specify which values are to be considered missing. If you forget to define the user-missing values, they will be considered to be valid values by SPSS. For example, the -99s will be used as valid ages when computing the mean age for the participants.
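The consequence of forgetting to declare a user-missing value shows up immediately in a mean. The Python sketch below uses made-up ages purely for illustration; `None` stands in for a blank (system missing) cell, and `valid_values` is a hypothetical helper.

```python
from statistics import mean


def valid_values(values, user_missing=frozenset()):
    """Drop blank cells (None) and any declared user-missing codes."""
    return [v for v in values if v is not None and v not in user_missing]


# Illustrative ages: -99 is the missing-data code, None is a blank cell.
ages = [19, 22, -99, 25, None]

# Code not declared: -99 is treated as a real age and distorts the mean.
print(mean(valid_values(ages)))

# Code declared as user missing: only the three real ages are averaged.
print(mean(valid_values(ages, user_missing={-99})))
```

The blank cell is excluded in both runs, which is why system missing values need no special action, while the -99 is excluded only after it has been declared.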


## 4. Enter the Values

In this set of notes we have entered the values for most of the variables as we defined each variable. Normally you would enter the values on a case by case basis. If you have not done so, please go back and enter the remainder of the data from Table 2. Key in the data and then use the arrow key to both enter the data into the data file and move to the next cell.

Editing already entered values is as simple as going to the cell and changing the value.

You can insert a new case or a new variable. Move the cursor to the row or column where you want to make the insertion and then click on Data and then click either Insert Case or Insert Variable.

You can go to a particular case by clicking on Data then Go to Case then enter the case number.

It is possible to cut, copy, paste, and clear entire rows, columns, or blocks of data. Highlight the data and use the editing tools under the Edit button or press the right mouse button to bring up the editing tools.


## 5. Save the Data File

You can run SPSS procedures using the data file you have just created. But if you exit SPSS without saving the file it will be lost. To save the file click on File (top row of buttons) then Save. Enter a file name and then click Save. The file will be saved as an SPSS for Windows 9.0 system file with the extension ".sav." A system file is a file created by SPSS that includes all the values plus all the variable definition information.

If you choose the Save As option you can save the file in other formats, including several different SPSS formats, ASCII, and several spreadsheet formats. Of particular interest is the SPSS portable format. The SPSS portable format is a generic format that can be read by any version of SPSS (e.g., Unix, Vax, and IBM mainframe versions, SPSS/PC+, and SPSS version 7.0). I used to recommend that when you leave UCCS you save your files in the portable format so that you will be able to read them in any version of SPSS. Most institutions now use SPSS for Windows rather than mainframe versions, so this is no longer a problem.

IMPORTANT: DO NOT EDIT AN SPSS SYSTEM FILE WITH A WORD OR TEXT PROCESSOR.

An SPSS system file is a specially formatted file that should not be edited with a word or text processor. If you try to edit a system file outside of the SPSS Data Editor you may destroy the structure of the file and you may not be able to open it again within SPSS. If you happen to open an SPSS system file with a word processor, close the file without making any changes to it.

Recall that string variables are stored as ASCII values. If you opened an SPSS system file in a text editor you would be able to read the values of the string variables. Numeric variables are stored as their binary equivalents. You will not be able to read the values of numeric variables because your text editor only reads ASCII values. A word processor will attempt to interpret the value of a binary, numeric variable as if it were an ASCII value, and it would give you strange values. If you happen to change even one character in a system file you could damage the values of one or more variables, or even make the file inaccessible to SPSS.
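The contrast between readable string bytes and unreadable numeric bytes can be demonstrated with Python's standard `struct` module. This sketch assumes an 8-byte binary floating-point layout purely for illustration; it does not reproduce the actual layout of an SPSS system file.

```python
import struct

# A string value is stored as its character codes, so it stays readable.
gender_bytes = "F".encode("ascii")

# A numeric value packed as an 8-byte binary double (assumed layout).
income_bytes = struct.pack("<d", 1823.62)

print(gender_bytes)                    # readable: b'F'
print(income_bytes)                    # raw bytes, not the digits 1823.62
print(income_bytes.decode("latin-1"))  # roughly what a text editor would show

# Reading the bytes back as a number recovers the value exactly.
print(struct.unpack("<d", income_bytes)[0])

# Changing a single byte, as a careless edit might, alters the number.
damaged = bytes([income_bytes[0] ^ 0xFF]) + income_bytes[1:]
damaged_value = struct.unpack("<d", damaged)[0]
print(damaged_value)
```

A one-byte change is enough to turn a valid income into a different number, which is the reason for the warning above.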

Caution: You should always back up your data files to a floppy disk.


© Lee A. Becker, 1997-1999 - revised 09/10/99