4.1 Programming Preliminaries
- Look at a sentence in a language you don’t know, and look carefully at the symbols, spacing, and characters.
- Recall learning a foreign language, and how you had to learn its syntax and grammar rules.
- Now think about English (or another language you know well) and think about the syntax and grammar rules that you take for granted.
All human languages rely on a set of rules called grammar, which describe how the language should be used to communicate. When two humans communicate with a language, they both must agree on the rules of that language.
R also has rules that must be followed in order for a human (you) to communicate with a computer, i.e. in order to tell the computer what to do. In human language, grammar is often fluid and evolving, and two people may have to adapt their use of the language in order to communicate. With R, the rules are fixed, and the computer “knows” them perfectly. It is up to you to learn the rules in order to make the computer do exactly what you want it to do.
Since any computer programming language will do exactly what you tell it to do, it’s important to cover some of the basic rules of the R programming language before you can learn what it can do.
So let’s get started:
4.1.1 R Commands
Like most programming languages, R code consists of commands, which form the sequence of instructions that the computer carries out. You can think of commands as the verbs of R: they are the actions the computer will take. Here is an example of a command, followed by the result.
[1] "hello, world!"
This command is telling R to print out a message.
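The command itself is missing from the example above. Assuming it uses the print function named in the text, it probably looked like this:

```r
# Ask R to print a message to the console
print("hello, world!")
```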
R code usually contains more than one command, and typically each command is put on a separate line.
Here are multiple commands, each on a separate line:
[1] "The air is fine!"
[1] 2
[1] FALSE
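The three commands behind this output are not shown. The sketch below produces the same three results; the arithmetic and the comparison are assumptions chosen to match the output, not necessarily the original commands.

```r
# Print a message
print("The air is fine!")

# Do some arithmetic; R prints the result automatically
1 + 1

# Evaluate a comparison; this statement is false
3 > 5
```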
The first command prints another message, the second command does some math then prints the result, and the third command evaluates whether the statement is true or false and prints the result. Generally, it’s a good idea to put separate commands on separate lines, but you can put multiple commands on the same line, as long as you separate them by a semicolon. See this code for example:
[1] 2
[1] 4
In this example, three commands are given on one line. The first command creates a new variable called x, the second command prints the value of x, and the third command prints the value of x squared. We see that the semicolon (;) serves as the command terminator, because it tells R where one command ends and another begins.
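The one-line example described here can be sketched as follows. It follows the description in the text (create x from 1 + 1, print x, print x squared); the exact original line is an assumption.

```r
# Three commands on one line, separated by semicolons
x <- 1 + 1; print(x); print(x^2)
```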
When a line contains a single command, no semicolon is necessary at the end, but including a semicolon doesn’t have any effect either.
[1] "This line doesn't have a semicolon"
[1] "This line does have a semicolon"
Including multiple semicolons (e.g. print("hello");;) does not work!
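Here is a small sketch of both cases. The first two lines are assumed from the output shown above. The last part uses parse(), which checks R code for syntax problems without running it, so the double-semicolon error can be caught instead of stopping the script; that wrapper is for illustration only.

```r
print("This line doesn't have a semicolon")
print("This line does have a semicolon");

# Two semicolons in a row are a syntax error. parse() only checks the text,
# and tryCatch() turns the resulting error into a short message.
result <- tryCatch(
  parse(text = 'print("hello");;'),
  error = function(e) "syntax error"
)
print(result)
```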
You’ve just seen your first example of assignment. That is, we created a thing called x, and assigned to it the value of 1 + 1 using the assignment operator, <-. Formally x is called an object, but we’ll talk more about objects and assignment later.
At this point, we’ve introduced several new types of R commands (assigning a variable, squaring a number, etc.), and we will talk more specifically about these later. The important part of this section is how R code is arranged into different commands.
Lastly, commands can be “grouped together” using left and right curly braces: { and }.
Here’s an example:
[1] "here's some code that's all grouped together"
[1] 1
[1] "hello"
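The grouped code that produces this output is not shown. A sketch consistent with the output, assuming three print commands inside one pair of curly braces, is:

```r
{
  print("here's some code that's all grouped together")
  print(1)
  print("hello")
}
```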
The above grouped code is indented so that it looks nice, but it doesn’t have to be:
[1] "here's some code that's all grouped together"
[1] 1
[1] "hello"
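The un-indented version might look like the sketch below. As before, the three print commands are assumed from the output; only the indentation differs.

```r
{
print("here's some code that's all grouped together")
print(1)
print("hello")
}
```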
Indenting is an example of coding style: formatting decisions that don’t affect the results of the code, but are meant to enhance readability. We’ll talk more about coding style later. In some programming languages, Python for example, white space matters. That is, code indents and other spaces change the way the code runs. In R, indentation and extra spaces between symbols do not change what the code does, so things like indents are used purely for readability.
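As a quick illustration, the same calculation can be written with different spacing. The variable names compact and spaced are made up for this sketch.

```r
# The same calculation written with different spacing
compact <- (1+2)*3
spaced  <- ( 1 + 2 ) * 3

# Check that both give the same value
print(identical(compact, spaced))
```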
What does it mean to “group” code? At this point there is no practical difference: each command gets executed whether or not it is grouped inside curly braces. However, code grouping will become very important when we discuss control flow later.
There are several helpful shortcuts that you can use in RStudio. If you forget to put quotes around something, you can highlight it and press the quote key, and RStudio will add quotes to both sides. This works with parentheses too.
You can also use tab completion with functions and defined variables. Tab completion allows you to use longer, more descriptive variable names without the additional typing time. This can save you a lot of time and reduce mistakes!
In RStudio, open a new R script and type in all the R commands from this section, to verify that you get the same result. It’s good practice!
4.1.3 Blank Lines
Blank lines in R are ignored, but they can be used to organize code and enhance readability:
[1] "The sky is blue"
[1] "The grass is green"
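The code for this example is not shown. A sketch that matches the output, assuming two print commands separated by blank lines, is:

```r
print("The sky is blue")


print("The grass is green")
```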
4.1.4 CaSe SeNsItIvItY
In R, variables, functions, and other objects (all of which we’ll talk about later) have names. These names are case sensitive, so you must be careful when referencing an object by name. Here we create two variables and give them different values. Notice how they are different from each other:
[1] 5
[1] 4
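The two variables are not shown above. A sketch that gives the same output, using the hypothetical names a and A, is:

```r
# Two different variables whose names differ only by case
a <- 5
A <- 4

print(a)
print(A)
```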
This may seem obvious, but case sensitivity applies to functions (which we’ll talk about later) too. We’ve been using the print function a lot in the above examples, which begins with a lower case p. There is no Print function:
Error in Print("testing"): could not find function "Print"
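The error above comes from calling Print("testing"). To try this without stopping a script, one option is to check the names with exists() and catch the error with tryCatch(). This is a sketch for experimenting, not part of the original example.

```r
# print exists, but Print (capital P) does not
print(exists("print"))
print(exists("Print"))

# Calling the misspelled name raises an error; catch it and keep its message
msg <- tryCatch(Print("testing"), error = function(e) conditionMessage(e))
print(msg)
```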
4.1.5 ?
One very nice thing in R is the documentation that accompanies it. Every function included in R (like print) has documentation that explains how that function works. To access the documentation, use a ? followed by the name of the function, like so:
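For the print function, the request looks like this. The help() call is the longer spelling of the same request and is included here only for comparison.

```r
# Open the documentation for the print function
?print

# help() does the same thing as ?
help("print")
```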
The output of the above code chunk is not shown, because the result of this code is best viewed in RStudio. Go to RStudio and type in ?print and observe what happens!
4.1.6 ??
If you don’t remember the exact name of a function, or would like to search for general matches to a topic, then you can use ??.
For example, trying ?Print does not find anything, because there is no Print function (remember, R is case sensitive), so there’s no documentation to go with it.
However, the following should still work:
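The code for this step is not shown. Assuming it searches for the misspelled name from the previous sentence, it would look like the sketch below; help.search() is the longer spelling of ?? and ignores case by default.

```r
# Search the installed documentation for anything matching "Print"
??Print

# help.search() does the same thing as ??
help.search("Print")
```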
Programmers have a sense of humor, too! Try running ????print to see a small joke. Remember, comedic taste varies!
This is a lot to remember. As you get more familiar with R, you’ll begin to memorize basic functions - and Google is always there for the rest.
Want to know more about R syntax? Try typing ?Syntax in the R console (then press Enter).
As we’ve seen, symbols and characters have specific meaning in R. You must be careful not to ignore things like semicolons, curly braces, and parentheses when reading R code. This takes practice!
Okay, now that we’ve covered some of the basics, it’s time to start learning how to do useful things in R! The next few sections will describe the different types of data that R can handle.
4.1.2 Comments
When writing R code, you may wish to include notes which explain the code to your future self or to other humans. This can be done with comments, which are ignored by R when it is running the code. The “#” symbol initiates a comment.
Here’s an example of some comments:
Notice that it’s possible for a line to contain only a comment, or for part of a line to be a comment. R decides which part of a line is a comment by looking for the first “#” that is not inside quotes, and everything after that will be treated as a comment and ignored.
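The original example is not shown. The sketch below illustrates a comment on its own line, a comment at the end of a line, and a command that never runs because it is commented out; the messages are made up for illustration.

```r
# This whole line is a comment, so R ignores it
print("R runs this command")  # this part of the line is a comment

# print("R never runs this command, because it is commented out")

# A "#" inside quotes is part of the text, not the start of a comment
print("# is the comment symbol")
```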
R ignores comments, but you should not! If you’re reading code that someone else has written, paying attention to their comments will likely help you understand what their code is doing. It’s also courteous to write good comments in your own code, if only because you may have to return to it in the future and re-learn what it is doing! In this book, we will use comments to help explain the R code that you will see.