(1) Script files.
Recall that early on in these tutorial notes, it was mentioned that
there are generally three windows you will use frequently in R: the
console window, the graphics window, and the script window. The script
window is not required, but it is often preferred for building
script or code and proofreading it prior to submitting it (much like
the syntax windows/editors found in SPSS and SAS). To open a new script
window, simply click on ‘File’ → ‘New script’. We can write as much script here as desired and
highlight, right click, then submit individual elements or the entire
script as necessary. Another benefit to using a script window comes
when saving our work. Script files are extremely small (i.e. virtually
identical in size to text files [.txt] with the same number of lines) and, if
appropriately thorough, can be loaded and run to reproduce the entirety of
our work from a given session. The alternative is to save both the script
file and the workspace image (which contains every object created during the
session), but a workspace image can grow quite large and thus consume
an often undesirable amount of space. The reason for discussing
script files here is that we will be using R Commander (Rcmdr) to
import data, but Rcmdr is not necessary to import data; only the script
is necessary. In future tutorial notes, a script file (e.g. filename.R)
will be all that is provided.
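As a minimal sketch of the idea that a script alone can rebuild a session, the following writes a tiny script to a temporary file and runs it with source(). The file name and the objects it creates are made up for illustration; save.image() appears only as a comment because it would write the whole workspace to disk.

```r
# Write a small, hypothetical script file (contents are illustrative only).
script_path <- file.path(tempdir(), "example_session.R")
writeLines(c(
  "# Everything needed to rebuild the session lives in this file.",
  "scores <- c(4, 8, 15, 16, 23, 42)",
  "mean_score <- mean(scores)"
), script_path)

# Running the script recreates the objects it defines.
source(script_path)
mean_score

# Script files stay small: size in bytes of the file just written.
file.info(script_path)$size

# The alternative discussed above saves every object in the workspace:
# save.image("my_session.RData")
```

Because the script holds the instructions rather than the results, it stays small no matter how large the objects it creates become.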
(2) Initial Rcmdr orientation.
*Note: if you’ve been following
these tutorial notes from the first, you should have downloaded,
installed, and updated all the packages used on this site (see
THIS script). Although this may seem excessive, some tasks in
Rcmdr require other packages (and Rcmdr will load them as needed only
if they have been downloaded and installed), so please take the time to
get all the packages used on this site.
For those of us (me
included) who started out with point-and-click statistical software,
Rcmdr represents a somewhat familiar interface for some basic tasks. It
also provides script for each task specified through point-and-click
operations, which allows us to see how we might
graduate away from R Commander as we progress to more complex tasks not
available in Rcmdr. So, let’s get started by loading the Rcmdr package:
library(Rcmdr)
Right away, you can
likely see that Rcmdr was made to be user friendly. Let’s orient
ourselves by starting at the bottom and working upward. At the bottom
of Rcmdr we find a ‘Messages’ window that generally serves to let us
know when something didn’t go quite right; in other words, error
messages will appear here along with warning messages. Error messages
will appear in red and reflect an error which prevented a function/task
from being carried out. Warning messages (and note messages) will be
displayed in blue and do not necessarily reflect a failure of a
function/task to be carried out. The output window is, as the name
implies, where all output will be displayed, with the exception of
graphics, which will be displayed in a separate window outside of Rcmdr.
The script window, again as the name implies, displays the script that
results from specifying some task, analysis, or function through the use
of the point-and-click menus. You can also write/type script directly
into the Rcmdr script window and then highlight and submit it using the
‘Submit’ button between the script and output windows. As an example,
type the following into the Rcmdr script window and then highlight and
submit it:
x <- function
You will notice the
script appears in red in the output window and no actual output (which would
be displayed in blue) appears. No output appeared because
there was an error, which is displayed in the messages window at the bottom. The
rest of the Rcmdr buttons and menu items are fairly self-explanatory,
and we will be using some of them here.
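The difference between an error and a warning can also be seen in plain R, outside of Rcmdr. The sketch below is only an illustration in base R: it catches the error raised by the incomplete statement above, and then catches a warning from an unrelated conversion chosen for this example. The object names are made up.

```r
# An incomplete statement such as 'x <- function' cannot be parsed,
# so R signals an error and nothing is assigned.
result_error <- tryCatch(
  parse(text = "x <- function"),
  error = function(e) "error"
)

# A warning, by contrast, does not stop the task: as.numeric() still
# returns a value (NA) while warning that the conversion failed.
result_warning <- withCallingHandlers(
  as.numeric("abc"),
  warning = function(w) {
    message("Warning caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

result_error    # "error"
result_warning  # NA
```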
(3) Using Rcmdr to import SPSS data.
For now, let’s get some data imported.
First, you need to download the example data files from the web page (Example Data 1,
Example Data 2, and Example Data 3) and save them to your source directory.
In Rcmdr, click on ‘Data’
and take note of the available choices. First, let’s create a simple
data file: click on ‘New data set…’, enter the name ‘ex1’, and
then hit the ‘OK’ button. This brings up the Data Editor where you can
type in data values. Notice too that you can click on ‘var1’ and give the
first variable a name, as well as specify it as numeric or string. For
now, go ahead and close the data editor and close the ‘New Data Set’
naming window. Again, you will see some script, output, and an error
stating “empty data set”. If we again click on ‘Data’ and then ‘Load
data set…’, we could open an existing R data file. However, we generally
want to open an existing data file that is not in an R format. But
before we do that, left-mouse click then right-mouse click in the
Script Window and select ‘Clear Window’. You can do this in the Output
Window as well. To clear the Messages window, you need to highlight all
the text in that window and then delete it.
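For comparison, a small data set can also be built directly in script rather than typed into the Data Editor. This is only a sketch: the name ‘ex1’ is taken from the steps above, while the second variable and all values are invented for illustration.

```r
# Build a small data set directly in script instead of the Data Editor.
# The values (and the second variable) are invented for illustration.
ex1 <- data.frame(
  var1 = c(10.5, 12.1, 9.8),   # a numeric variable
  var2 = c("a", "b", "c"),     # a character (string) variable
  stringsAsFactors = FALSE
)

str(ex1)    # shows each variable's type
nrow(ex1)   # number of rows entered
```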
In Rcmdr, click on ‘Data’
and then hold the cursor over ‘Import data’. We will import an SPSS
file first, so click on ‘from SPSS data set…’. Next, we will be
prompted to name our data file; use the name ‘example1’. We also see
that by default the value labels will be converted to factor levels and
the maximum number of value labels for factor conversion is set to
infinite. Once you have typed in the name (example1), click ‘OK’. If
you set your source directory correctly and you downloaded the example
data sets into that source directory, you should be looking at them
now. Highlight ‘ExampleData1.sav’ and then click the ‘Open’ button.
Now, looking at the Script Window (and Output Window) you should see
the appropriate script for importing an SPSS data file into R:
example1 <- read.spss("C:/Users/jons/Desktop/Work_Stuff/Jon_R/Example Data/ExampleData1.sav",
  use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
All you would need to
change for future use is the file name (and path if the data is not
located in your source directory).
*You will need the
‘foreign’ package loaded, e.g. with library(foreign), if you are working with just the R console or
from the R console and a script file (i.e. not using Rcmdr). This is
why I have the foreign package listed in my Rprofile.site file as one
of the packages to load upon start up of R.
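A minimal sketch of the same import run from the plain R console might look like the following. It assumes the ‘foreign’ package is installed (it ships with standard R installations) and that the file sits in the current working directory; the relative path is a placeholder, so the code checks that the file exists before reading it.

```r
# The 'foreign' package supplies read.spss(); it must be loaded
# when Rcmdr is not doing it for you.
library(foreign)

# Placeholder path: replace with the location of your own .sav file.
sav_path <- "ExampleData1.sav"

if (file.exists(sav_path)) {
  example1 <- read.spss(sav_path, use.value.labels = TRUE,
                        max.value.labels = Inf, to.data.frame = TRUE)
  str(example1)
} else {
  message("File not found: ", sav_path)
}
```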
Notice also some key
features of the script: we have created an object ‘example1’ and
assigned to it (using ‘<-’) the result of the ‘read.spss’ function, and our object
was created as a data frame (R’s term for a rectangular data set with
cases in rows and variables in columns). Also notice our specified options from above (e.g.
use.value.labels=TRUE, max.value.labels=Inf), which are important as
examples of the way R specifies conditions using =TRUE or =FALSE. Many
functions use logical (true/false) arguments like these
as options for further specifying some task
within the function. If you would like more information on the
‘read.spss’ function, or you would like an example of using the console
help, then type the following in the R console and hit enter:
help(read.spss)
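To see how a TRUE/FALSE argument changes what a function does, here is a small illustration using the base function mean() rather than read.spss(); the example values are made up.

```r
values <- c(2, 4, NA)

# na.rm is a logical (TRUE/FALSE) argument of mean().
mean(values, na.rm = FALSE)  # NA: the missing value is kept
mean(values, na.rm = TRUE)   # 3: the missing value is dropped first

# args() lists a function's arguments and their defaults.
args(mean.default)
```

The same pattern applies to the read.spss options above: each one switches part of the function's behavior on or off, or sets a limit.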
You should also take note
that we now have a current data set specified just above the script
window in Rcmdr. Therefore, we can click on the ‘Edit data set’ button
to edit the data or we can click on the ‘View data set’ button to view
it—both of which produce script: fix(example1) and showData(example1,
…). Now, close both the data view window and the data editor
window if you have not done so already. An extremely common
practice when working with script is to use # to comment out anything
that is not used as working script (i.e. notes, reminders, comments,
etc.). For example, type (or copy and paste) the following in the
Script Window of Rcmdr:
# This is how we view a data set loaded in Rcmdr.
showData(example1, placement='-20+200', font=getRcmdr('logFont'), maxwidth=80, maxheight=30)
Now, highlight those two
lines and click the ‘Submit’ button between the Script Window and the
Output Window in Rcmdr. Next, close the data view window
and then copy and paste those two lines into your R console. Next,
close the data view window again and then, in the R console,
click on ‘File’ and ‘New script’. Now, copy and paste those two lines
into the new script window you have just opened. Next, highlight those
two lines in the new script window and then right-mouse-click on the
highlighted text and select ‘Run line or selection’. Now go ahead and close
the data view window for a final time.
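When working without Rcmdr, base R offers a few console-only ways to look at a data set. The sketch below uses a small stand-in data frame (invented for illustration) in place of example1, and also shows how a commented line is skipped when the script is run.

```r
# A small stand-in data frame; in practice this would be example1.
demo <- data.frame(id = 1:5, score = c(3.2, 4.8, 5.1, 2.9, 4.0))

# head() prints the first rows; this line is working code.
head(demo, n = 3)

# str() summarises the structure (variable names and types).
str(demo)

# summary(demo)  # commented out, so R skips this line entirely
```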
(4) Using Rcmdr to import Excel data.
Click on ‘Data’ and then hold the
cursor over ‘Import data’. Then click on ‘from Excel, Access, or
dBase data set…’. Next, we will be prompted to name our data file; use
the name ‘example2’ and click ‘OK’. Highlight ‘ExampleData2.xls’
and click ‘Open’. It’s just that simple. One thing to note is that
when you click on the ‘Data set: example2’ button in Rcmdr, above the Script Window, you
can now select which data set you would like to use (example1 or
example2). This means you can have multiple data sets available during a
single session and simply switch between them as necessary.
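Outside of Rcmdr, one possible way to read an Excel file from script is the ‘readxl’ package. This is an assumption made for illustration, not necessarily the route Rcmdr itself takes, and the relative path is a placeholder. The sketch checks for both the package and the file before reading anything.

```r
# Placeholder path: replace with the location of your own .xls file.
xls_path <- "ExampleData2.xls"

# readxl is an assumption here, not necessarily what Rcmdr uses.
if (requireNamespace("readxl", quietly = TRUE) && file.exists(xls_path)) {
  example2 <- as.data.frame(readxl::read_excel(xls_path, sheet = 1))
  str(example2)
} else {
  message("readxl not installed or file not found: ", xls_path)
}

# Several data sets can coexist in one session; ls() lists the objects.
ls()
```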
(5) Using Rcmdr to import text data.
Click on ‘Data’ and then hold the cursor
over ‘Import data’. Then click on ‘from text file, clipboard, or
URL…’. Next, we will be prompted to name our data file; use the name
‘example3’ and notice all the options for specifying the nature of the
data. None of these default options needs to be changed for
this file, so click ‘OK’. Highlight ‘ExampleData3.txt’
and click ‘Open’. Again, it’s just that simple.
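The text import can likewise be sketched with the base function read.table(). To keep the example self-contained, it first writes a tiny whitespace-delimited file to a temporary location; that file and its values are invented stand-ins for ExampleData3.txt, and the argument values shown are assumptions about sensible settings rather than a record of what Rcmdr generates.

```r
# Create a small whitespace-delimited text file to import
# (a stand-in for ExampleData3.txt; the values are invented).
txt_path <- file.path(tempdir(), "demo_data.txt")
writeLines(c(
  "id group score",
  "1 a 3.5",
  "2 b NA",
  "3 a 4.1"
), txt_path)

# header = TRUE: the first line holds variable names.
# sep = "": fields are separated by any whitespace.
# na.strings = "NA": this text marks a missing value.
example3 <- read.table(txt_path, header = TRUE, sep = "",
                       na.strings = "NA", dec = ".", strip.white = TRUE)

str(example3)
```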
You will now notice that
with a data file loaded into Rcmdr you can click on the different menu
options and a variety of functions, analyses, graphs, etc. are only a
mouse-click away. Also remember, virtually any script generated in
Rcmdr can be used in the R console with the necessary libraries loaded.
In future tutorial notes,
we will be using the R console and script files, but remember that all scripts
can be copied and pasted into the R Console. The script files can also
be downloaded and then opened with the R Console or in R Commander
using ‘File’, ‘Open script file…’ in the Console or Rcmdr top task bar.
When reading the script
files, you'll notice the common convention of using # to start a
comment line (which is not working code), while lines without # are
working code.