http://uit.unt.edu/research

Data Science and Analytics


(1) Script files. Recall that early in these tutorial notes we mentioned three windows you will use frequently in R: the console window, the graphics window, and the script window. The script window is not strictly necessary, but it is often preferred for building script (code) and proofreading it before submitting it, much like the syntax windows/editors found in SPSS and SAS. To open a new script window, simply click on ‘File’ → ‘New script’. We can write as much script here as desired, then highlight, right-click, and submit individual elements or the entire script as necessary. Another benefit of using a script window comes when saving our work. Script files are extremely small (i.e. virtually identical in size to text files [.txt] with the same number of lines) and, if appropriately thorough, can be loaded and run to reproduce the entirety of our work from a given session. The alternative is to save both the script file and the workspace image (all the objects created during the session); however, a workspace image can grow quite large and consume an undesirable amount of disk space. The reason for discussing script files here is that we will be using R Commander (Rcmdr) to import data, but Rcmdr is not necessary to import data; only the script is. In future tutorial notes, a script file (e.g. filename.R) will be all that is provided.
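The difference between re-running a script and saving the objects themselves can be sketched as follows. This is only an illustration under stated assumptions: the script contents and the object names (scores, avg_score) are hypothetical, and save() on a scratch environment stands in for the console's workspace image (save.image()) so that your own workspace is left untouched.

```r
# Write a tiny script to a temporary file (a stand-in for filename.R).
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "# Recreate the session's objects from scratch",
  "scores <- c(90, 85, 77)",
  "avg_score <- mean(scores)"
), script_path)

# Running the script rebuilds the objects in a scratch environment.
session <- new.env()
source(script_path, local = session)

# The alternative: save the objects themselves, as a workspace image does.
image_path <- tempfile(fileext = ".RData")
save(list = ls(session), envir = session, file = image_path)

print(session$avg_score)
print(file.exists(image_path))
```

The script stores only the instructions, so its size depends on the number of lines; the saved image stores the data, so its size grows with the objects in the session.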

(2) Initial Rcmdr orientation. *Note: if you’ve been following these tutorial notes from the beginning, you should have downloaded, installed, and updated all the packages used on this site (see THIS script). Although this may seem excessive, some tasks in Rcmdr require other packages (and Rcmdr will load them as needed only if they have already been downloaded and installed), so please take the time to get all the packages used on this site.

For those of us (me included) who started out with point-and-click statistical software, Rcmdr offers a somewhat familiar interface for some basic tasks. It also provides the script for each task specified through point-and-click operations, which allows us to see how we might graduate away from R Commander as we progress to more complex tasks not available in Rcmdr. So, let’s get started by loading the Rcmdr package:

library(Rcmdr)

Right away, you can likely see that Rcmdr was made to be user friendly. Let’s orient ourselves by starting at the bottom and working upward. At the bottom of Rcmdr we find a ‘Messages’ window, which generally lets us know when something didn’t go quite right; in other words, error messages appear here along with warning messages. Error messages appear in red and reflect an error that prevented a function/task from being carried out. Warning messages (and note messages) are displayed in blue and do not necessarily mean that a function/task failed. The Output Window is, as the name implies, where all output is displayed, with the exception of graphics, which are displayed in a separate window outside of Rcmdr. The Script Window, again as the name implies, displays the script that results from specifying some task, analysis, or function through the point-and-click menus. You can also type script directly into the Rcmdr Script Window and then highlight and submit it using the ‘Submit’ button between the Script and Output windows. As an example, type the following into the Rcmdr Script Window and then highlight and submit it:

x <- function

You will notice that the script appears in red in the Output Window and that no actual output (which would be displayed in blue) appears. No output appeared because the incomplete statement caused an error, which is displayed in the Messages window at the bottom. The rest of the Rcmdr buttons and menu items are fairly self-explanatory, and we will be using some of them here.
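The distinction between an error and a warning can also be seen in plain R script. The following is a minimal sketch rather than anything Rcmdr itself runs; the variable names result and value are hypothetical, and as.numeric("abc") is used only as a convenient way to trigger a warning.

```r
# Parsing the incomplete statement raises an error, so nothing is assigned;
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(
  eval(parse(text = "x <- function")),
  error = function(e) paste("Error:", conditionMessage(e))
)
print(result)

# A warning does not stop the task: a value (here NA) is still returned.
# suppressWarnings() only silences the warning text for this illustration.
value <- suppressWarnings(as.numeric("abc"))
print(value)
```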

(3) Using Rcmdr to import SPSS data. For now, let’s get some data imported. First, you need to download the example data files from the web page (Example Data 1, Example Data 2, & Example Data 3) and save them to your source directory.

In Rcmdr, click on ‘Data’ and take note of the available choices. First, let’s create a simple data file: click on ‘New data set…’, enter the name ‘ex1’, and hit the ‘OK’ button. This brings up the Data Editor, where you can type in data values. Notice too that you can click on ‘var1’ to give the first variable a name and to specify it as numeric or string. For now, go ahead and close the Data Editor and the ‘New Data Set’ naming window. Again, you will see some script, output, and an error stating “empty data set”. If we again click on ‘Data’ and then ‘Load data set…’, we could open an existing R data file. However, we generally want to open an existing data file that is not in an R format. Before we do that, left-click and then right-click in the Script Window and select ‘Clear Window’. You can do this in the Output Window as well. To clear the Messages window, you need to highlight all the text in that window and then delete it.

In Rcmdr, click on ‘Data’ and then hold the cursor over ‘Import data’. We will import an SPSS file first, so click on ‘from SPSS data set…’. Next, we will be prompted to name our data file; use the name ‘example1’. We also see that, by default, the value labels will be converted to factor levels and the maximum number of value labels for factor conversion is set to infinite. Once you have typed in the name (example1), click ‘OK’. If you set your source directory correctly and downloaded the example data sets into that directory, you should be looking at them now. Highlight ‘ExampleData1.sav’ and then click the ‘Open’ button. Now, looking at the Script Window (and Output Window), you should see the appropriate script for importing an SPSS data file into R:

example1 <- read.spss("C:/Users/jons/Desktop/Work_Stuff/Jon_R/Example Data/ExampleData1.sav", use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)

All you would need to change for future use is the file name (and path if the data is not located in your source directory).

*You will need the ‘foreign’ package loaded if you are working with just the R console, or with the R console and a script file (i.e. not using Rcmdr). This is why I have the foreign package listed in my Rprofile.site file as one of the packages to load when R starts.
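Working from the console or a script file, the import might look like the sketch below. The relative path "ExampleData1.sav" is an assumption (it presumes the file sits in your working directory); replace it with the location of your own copy. The file.exists() check is added here only so the sketch runs cleanly when the file is absent.

```r
# 'foreign' ships with standard R installations; Rcmdr loads it for you,
# but in the plain console it must be loaded explicitly.
library(foreign)

# Assumed path: adjust to wherever you saved the example data.
spss_path <- "ExampleData1.sav"

if (file.exists(spss_path)) {
  example1 <- read.spss(spss_path, use.value.labels = TRUE,
                        max.value.labels = Inf, to.data.frame = TRUE)
  str(example1)  # compact summary of the imported data frame
} else {
  message("File not found: ", spss_path)
}
```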

Notice also some key features of the script: we have created an object, ‘example1’, and used the assignment operator ‘<-’ to assign it the result of the ‘read.spss’ function, and our object was created as a data frame (R’s term for a rectangular data set with cases in rows and variables in columns). Also notice our specified options from above (e.g. use.value.labels=TRUE, max.value.labels=Inf), which are important as examples of the way R specifies options using =TRUE or =FALSE. Many functions use logical true/false arguments like these as options for further specifying some task within the function. If you would like more information on the ‘read.spss’ function, or you would like an example of using the console help, type the following in the R console and hit enter: help(read.spss)
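The same TRUE/FALSE pattern appears throughout base R. As a small illustration unrelated to the example data (the vector values is hypothetical), the na.rm argument of mean() switches the handling of missing values on or off:

```r
values <- c(1, 2, NA)

# With na.rm = FALSE (the default) the missing value propagates.
with_na <- mean(values, na.rm = FALSE)

# With na.rm = TRUE the missing value is dropped before averaging.
without_na <- mean(values, na.rm = TRUE)

print(with_na)     # NA
print(without_na)  # 1.5
```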

You should also take note that we now have a current data set specified just above the Script Window in Rcmdr. Therefore, we can click on the ‘Edit data set’ button to edit the data, or we can click on the ‘View data set’ button to view it; both buttons produce script: fix(example1) and showData(example1, …). Now, close both the data view window and the data editor window if you have not done so already. An extremely common practice when working with script is to use # to comment out anything that is not working script (i.e. notes, reminders, comments, etc.). For example, type (or copy and paste) the following in the Script Window of Rcmdr:

# This is how we view a data set loaded in Rcmdr.

showData(example1, placement='-20+200', font=getRcmdr('logFont'), maxwidth=80, maxheight=30)

Now, highlight those two lines and click the ‘Submit’ button between the Script Window and the Output Window in Rcmdr. Next, close the data view window and then copy and paste those two lines into your R console. Close the data view window again and then, in the R console, click on ‘File’ and ‘New script’. Copy and paste those two lines into the new script window you have just opened. Then highlight those two lines in the new script window, right-click on the highlighted text, and select ‘Run line or selection’. Now go ahead and close the data view window for a final time.

(4) Using Rcmdr to import Excel data. Click on ‘Data’ and then hold the cursor over ‘Import data’. Then click on ‘from Excel, Access, or dBase data set…’. Next, we will be prompted to name our data file; use the name ‘example2’ and click ‘OK’. Highlight ‘ExampleData2.xls’ and click ‘Open’. It’s just that simple. One thing to note is that when you now click on the ‘Data set: example2’ button above the Script Window in Rcmdr, you can select which data set you would like to use (example1 or example2). In other words, you can have multiple data sets available during a single session and simply switch between them as necessary.

(5) Using Rcmdr to import text data. Click on ‘Data’ and then hold the cursor over ‘Import data’. Then click on ‘from text file, clipboard, or URL…’. Next, we will be prompted to name our data file; use the name ‘example3’ and notice all the options for specifying the nature of the data. None of these default options needs to be changed for this file, so click ‘OK’. Highlight ‘ExampleData3.txt’ and click ‘Open’. Again, it’s just that simple.
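For a text file, the script involved is a call to base R's read.table(). The sketch below is an approximation, since the exact call Rcmdr generates may differ by version. To keep it self-contained, it first writes a small whitespace-delimited file to a temporary location; the contents (id, group, score) are hypothetical and merely stand in for ExampleData3.txt.

```r
# Create a small text file with a header row (hypothetical contents).
txt_path <- tempfile(fileext = ".txt")
writeLines(c("id group score",
             "1 a 10.5",
             "2 b NA",
             "3 a 12.0"), txt_path)

# header = TRUE: the first row holds variable names.
# sep = "": fields are separated by any amount of white space.
# na.strings = "NA": the text NA marks a missing value.
example3 <- read.table(txt_path, header = TRUE, sep = "",
                       na.strings = "NA", dec = ".", strip.white = TRUE)

print(dim(example3))
print(names(example3))
```

For your own data, only the path (and, if the file uses a different delimiter or missing-value code, the sep and na.strings arguments) would need to change.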

You will now notice that, with a data file loaded into Rcmdr, you can click on the different menu options and a variety of functions, analyses, graphs, etc. are only a mouse-click away. Also remember that virtually any script generated in Rcmdr can be used in the R console, provided the necessary packages are loaded.

In future tutorial notes, we will be using R console and script files; but remember all scripts can be copied and pasted into the R Console. The script files can also be downloaded and then opened with the R Console or in R Commander using ‘File’, ‘Open script file…’ in the Console or Rcmdr top task bar.

When reading the script files, you'll notice the common convention of using # to start a comment line (which is not working code), while lines without # are working code.

 


Contact Information

Jon Starkweather, PhD

Jonathan.Starkweather@unt.edu

940-565-4066

Richard Herrington, PhD

Richard.Herrington@unt.edu

940-565-2140

Last updated: 2018.11.06 by Jon Starkweather.
