4.1 Programming Preliminaries

  1. Look at a sentence in a language you don’t know, paying close attention to the symbols, spacing, and characters.
  2. Recall learning a foreign language, and how you had to learn its syntax and grammar rules.
  3. Now think about English (or another language you know well) and the syntax and grammar rules that you take for granted.

All human languages rely on a set of rules called grammar, which describe how the language should be used to communicate. When two humans communicate with a language, they both must agree on the rules of that language.

R also has rules that must be followed in order for a human (you) to communicate with a computer, that is, to tell the computer what to do. In human language, grammar is often fluid and evolving, and two people may have to adapt their use of the language in order to communicate. With R, the rules are fixed, and the computer “knows” them perfectly. It is up to you to learn the rules in order to make the computer do exactly what you want it to do.

Since any computer programming language will do exactly what you tell it to do, it’s important to cover some of the basic rules of the R programming language before you can learn what it can do.

So let’s get started:

4.1.1 R Commands

Like most programming languages, R consists of a set of commands, which form the sequence of instructions that the computer carries out. You can think of commands as the verbs of R: they are the actions the computer will take. Here is an example of a command, followed by the result.

print("hello, world!")
[1] "hello, world!"

This command is telling R to print out a message. R code usually contains more than one command, and typically each command is put on a separate line. Here are multiple commands, each on a separate line:

print("The air is fine!")
print(1 + 1)
print(4 > 5)
[1] "The air is fine!"
[1] 2
[1] FALSE

The first command prints another message, the second command does some math then prints the result, and the third command evaluates whether the statement is true or false and prints the result. Generally, it’s a good idea to put separate commands on separate lines, but you can put multiple commands on the same line, as long as you separate them by a semicolon. See this code for example:

x <- 1+1; print(x); print(x^2)
[1] 2
[1] 4

In this example, three commands are given on one line. The first command creates a new variable called x, the second command prints the value of x, and the third command prints the value of x squared. We see that the semicolon, ;, serves as the command terminator, because it tells R where one command ends and another begins. When a line contains a single command, no semicolon is necessary at the end, but including one doesn’t have any effect either.

print("This line doesn't have a semicolon")
print("This line does have a semicolon");
[1] "This line doesn't have a semicolon"
[1] "This line does have a semicolon"

Including multiple semicolons in a row (e.g. print("hello");;) does not work! R reports a syntax error instead of running the command.

You’ve just seen your first example of assignment. That is, we created a thing called x, and assigned to it the value of 1 + 1 using the assignment operator, <-. Formally, x is called an object, but we’ll talk more about objects and assignment later.

So far, we’ve seen that you can place one command on one line, multiple commands on multiple lines, and multiple commands on one line, so you may ask: can you place one command on multiple lines? The answer is sometimes, depending on the command, but we will not discuss this now.

At this point, we’ve introduced several new types of R commands (assigning a variable, squaring a number, etc.), and we will talk more specifically about these later. The important part of this section is how R code is arranged into different commands.

Lastly, commands can be “grouped together” using left and right curly braces: { and }. Here’s an example:

{
  print("here's some code that's all grouped together")
  print(2^3 - 7)
  w <- "hello"
  print(w)
}
[1] "here's some code that's all grouped together"
[1] 1
[1] "hello"

The above grouped code is indented so that it looks nice, but it doesn’t have to be:

{
print("here's some code that's all grouped together")
print(2^3 - 7)
w <- "hello"
print(w)
}
[1] "here's some code that's all grouped together"
[1] 1
[1] "hello"

Indenting is an example of coding style: formatting decisions that don’t affect the results of the code but are meant to enhance readability. We’ll talk more about coding style later. In some programming languages, Python for example, white space matters. That is, indentation and other spacing can change the way the code runs. In R, indentation and extra spaces between the parts of a command do not change the result, so things like indents are used purely for readability.
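As a small illustration of this point, the sketch below writes the same assignment with three different spacings (the names s1, s2 and s3 are made up for this example). It also shows one caution: a space cannot be placed inside an operator such as <-, because R then reads it as two separate symbols.

```r
# Three ways of spacing the same command; each stores the value 2
s1 <- 1+1
s2    <-    1 + 1
      s3 <- 1 +    1

print(s1 == s2)
print(s2 == s3)

# Caution: a space inside the assignment operator changes the meaning.
# The line below compares s1 with -1; it does not assign anything.
print(s1 < -1)
```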

What does it mean to “group” code? At this point there is no practical difference: each command gets executed whether or not it is grouped inside curly braces. However, code grouping will become very important later on, when we discuss control flow.
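One property of grouping can already be observed: a braced group as a whole takes on the value of its last command. Here is a minimal sketch, with variable names (total, first, second) chosen only for illustration:

```r
# A braced group evaluates to the value of its last command
total <- {
  first <- 2
  second <- 3
  first + second
}
print(total)
```

This is shown only as a preview; it does not change how the individual commands inside the braces are run.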

There are several helpful shortcuts that you can use in RStudio. If you forget to put quotes around something, you can highlight it and press the quote key, and RStudio will add quotes to both sides. This works with parentheses too.

You can also use tab completion with functions and defined variables. Tab completion allows you to use longer, more descriptive variable names without the additional typing time. This can save you a lot of time and reduce mistakes!

In RStudio, open a new R script and type in all the R commands from this section, to verify that you get the same result. It’s good practice!

4.1.2 Comments

When writing R code, you may wish to include notes which explain the code to your future self or to other humans. This can be done with comments, which are ignored by R when it is running the code. The “#” symbol initiates a comment.
Here’s an example of some comments:

# Let's define y and z
y <- 8
z <- y + 5  # Adding 5 to y and assigning the result to z
## This is still a comment, even though we're using two #'s

Notice that it’s possible for a line to contain only a comment, or for part of a line to be a comment. R decides which part of a line is a comment by looking for the first “#” that is not inside quotes; everything after it is treated as a comment and ignored.
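One detail worth checking for yourself: a “#” that appears inside quotes is part of the text, not the start of a comment. A short sketch, using a made-up variable called label:

```r
# The # inside the quotes belongs to the text
label <- "item #3"   # this trailing part is a comment
print(label)

# nchar() counts the characters in the text, including the #
print(nchar(label))
```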

R ignores comments, but you should not! If you’re reading code that someone else has written, paying attention to their comments will likely help you understand what the code is doing. It’s also courteous to write good comments in your own code, if only because you may have to return to it in the future and re-learn what it is doing! In this book, we will use comments to help explain the R code that you will see.

4.1.3 Blank Lines

Blank lines in R are ignored, but they can be used to organize code and enhance readability:

print("The sky is blue")
# The blank line below here is ignored

print("The grass is green")
[1] "The sky is blue"
[1] "The grass is green"

4.1.4 CaSe SeNsItIvItY

In R, variables, functions, and other objects (all of which we’ll talk about later) have names. These names are case sensitive, so you must be careful when referencing an object by name. Here we create two variables and give them different values. Notice how they are different from each other:

A <- 4
a <- 5

print(a)
print(A)
[1] 5
[1] 4

This may seem obvious, but case sensitivity applies to functions (which we’ll talk about later) too. We’ve been using the print function a lot in the above examples, which begins with a lower case p. There is no Print function:

Print("testing")
Error in Print("testing"): could not find function "Print"
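If you are unsure whether a name is defined, one way to check without triggering an error is the exists function, which takes the name in quotes. The sketch below assumes that nothing called Print has been created in your session or loaded from a package:

```r
# exists() reports whether a name is currently defined
print(exists("print"))   # the built-in function, lower case p
print(exists("Print"))   # upper case P: a different name altogether
```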

4.1.5 ?

One very nice thing in R is the documentation that accompanies it. Every function included in R (like print) has documentation that explains how that function works. To access the documentation, use a ? followed by the name of the function, like so:

?print

The output of the above code chunk is not shown, because the result of this code is best viewed in RStudio. Go to RStudio, type in ?print, and observe what happens!
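The ? shortcut also has a function form, help, which takes the name in quotes. A brief sketch; the quoted form is handy for names that are awkward to type after a ?, such as operators:

```r
# help() with a quoted name opens the same documentation as ?print
help("print")

# Quoting also works for operators, such as the assignment operator
help("<-")
```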

4.1.6 ??

If you don’t remember the exact name of a function, or would like to search for general matches to a topic, then you can use ??. For example, trying ?Print finds nothing, because there is no Print function (remember, R is case sensitive), so there’s no documentation to go with it. However, the following should still work:

??Print
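Like ?, the ?? shortcut has a function form, help.search, which takes the search term in quotes. By default the search ignores case, which is why a capitalized term can still find matches. A minimal sketch (the variable name matches is just for illustration, and the search can take a moment the first time it runs):

```r
# help.search() is the function form of ??; the term goes in quotes
matches <- help.search("Print")

# The result is a search-results object; printing it lists the matches
print(class(matches))
```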

Programmers have a sense of humor, too! Try running ????print to see a small joke. Remember, comedic taste varies!

This is a lot to remember. As you get more familiar with R, you’ll begin to memorize basic functions - and Google is always there for the rest.

Want to know more about R syntax? Try typing ?Syntax in the R console (then press Enter).

As we’ve seen, symbols and characters have specific meanings in R. You must be careful not to ignore things like semicolons, curly braces, and parentheses when reading R code. This takes practice!

Okay, now that we’ve covered some of the basics, it’s time to start learning how to do useful things in R! The next few sections will describe the different types of data that R can handle.

This video discusses programming preliminaries.

https://youtu.be/EShV_T2P7sw