A **scalar** is a single number. It can be an integer, a real number (stored in double precision), or a complex number. A scalar can be assigned to a variable with the following command (here the variable name is ```x```):
```
# R

x <- 3.1415926535897932384626433832795028841971693993751
```
The symbol ```<-``` represents the assignment operator. Alternatively, the symbol ```=``` can be used, but ```<-``` is preferred.
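As a small illustration of the scalar types listed above, the following sketch stores an integer, a real and a complex number and inspects them with ```class```. The variable names are only illustrative.

```r
# R

xInt <- 3L           # The suffix L makes the literal an integer
xReal <- 3.14        # Real numbers are stored in double precision
xComplex <- 2 + 3i   # The suffix i marks the imaginary part

class(xInt)          # "integer"
class(xReal)         # "numeric"
class(xComplex)      # "complex"
```

A double holds roughly 15 to 17 significant digits, so most of the digits in the long literal assigned to ```x``` above are not retained.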

A **vector** is an ordered collection of scalars.
For instance, a 4-dimensional vector can be defined using the command ```c()```:
```
# R

X <- c(2, 3, 4, 5)
```
The elements of a vector can be selected by passing the index of the element of interest. For instance, to read the second element of a vector:
```
# R

myVector <- c(4, 5, 6, 7)
secElement <- myVector[2] # Index 2 selects the second element, so secElement is now equal to 5
```
The index can also be a scalar variable. The following gives the same result:
```
# R

myVector <- c(4, 5, 6, 7)
myIndex <- 2
secElement <- myVector[myIndex] # myIndex is 2, so secElement is again equal to 5
```
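Indices in R start at 1. As a short sketch of two further indexing forms that are not covered above, a vector of indices selects several elements at once, and a negative index drops the corresponding element. The variable names are illustrative.

```r
# R

myVector <- c(4, 5, 6, 7)

firstAndThird <- myVector[c(1, 3)]  # Elements 1 and 3: 4 6
withoutSecond <- myVector[-2]       # All but the second element: 4 6 7
```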
The length of a vector is given by the command ```length```:
```
# R

myVector <- c(4, 5, 6, 7)
length(myVector)
```
A **character** variable can be defined using the symbols ```'``` or ```"```:
```
# R

myChar <- "A"
myChar <- 'A' # Identical results
```
A **string** is a sequence of characters:
```
# R

myString <- 'Hello, World!'
```
Unlike the elements of a vector, the characters of a string cannot be accessed by their index. Specific functions are available to work with strings.
Note that a string is a 1-element object, unlike a character vector, which is a collection of *n* character variables.
The length of a string is given by the command ```nchar```:
```
# R

myString <- 'Hello, World!'
nchar(myString)
```
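A minimal sketch of a few base R string functions (```substr```, ```toupper```, ```strsplit```, ```paste```), chosen here only as examples. It also shows the difference between a string and a character vector: ```length``` counts elements, while ```nchar``` counts characters.

```r
# R

myString <- 'Hello, World!'

substr(myString, 1, 5)                  # Characters 1 to 5: "Hello"
toupper(myString)                       # "HELLO, WORLD!"
words <- strsplit(myString, ", ")[[1]]  # Character vector: "Hello" "World!"
paste(words, collapse = " ")            # "Hello World!"

length(myString)                        # 1: a string is a single element
nchar(myString)                         # 13 characters
length(words)                           # 2: a character vector of 2 strings
```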

A **factor** is a special vector of labelled elements. Usually its elements are discrete and can be either strings or scalars.
A factor is generated by passing a vector to the function ```factor```.
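A minimal sketch of a factor, using a hypothetical vector ```bloodType``` of labels. The functions ```levels``` and ```table``` are base R.

```r
# R

# A character vector with repeated, discrete values (hypothetical data)
bloodType <- c('A', 'B', 'A', 'O', 'AB', 'O', 'A')

# Convert it to a factor
bloodFactor <- factor(bloodType)

levels(bloodFactor)   # The distinct labels, sorted: "A" "AB" "B" "O"
table(bloodFactor)    # Number of elements for each level
```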

A numeric **matrix** can be defined by the command ```matrix```. The first argument of this function is the vector of values that will be used as matrix elements (filled column by column). The second and third arguments are the number of rows and columns, respectively. The number of elements should be equal to the product of the matrix dimensions. For instance, a random permutation of the integers from 1 to 20 can be used to fill a 4x5 matrix:
```
# R

matElements <- sample(20)
Xmat <- matrix(matElements, 4, 5)
```
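To make the column-by-column filling visible, this sketch uses the ordered values 1 to 6 instead of a random sample. The argument ```byrow = TRUE``` of ```matrix``` switches to row-by-row filling. The variable names are illustrative.

```r
# R

byColumn <- matrix(1:6, 2, 3)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

byRow <- matrix(1:6, 2, 3, byrow = TRUE)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
```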
The dimensions of a matrix are given by the following commands:
```
# R

# Define a matrix
matElements <- sample(20)
Xmat <- matrix(matElements, 4, 5)

# Number of rows
nrow(Xmat)
# Number of columns
ncol(Xmat)
# Both
dim(Xmat)
```
Matrix dimensions can be named using the commands `dimnames`, `rownames`, or `colnames`.
Names can also be assigned at definition time:
```
# R

Xmat <- matrix(sample(20), 4, 5)

# Assign the row names
rownames(Xmat) <- c(1:4)

# Assign the column names
colnames(Xmat) <- c(1:5)

# Read the row names and column names
rownames(Xmat)
colnames(Xmat)

# Assign using dimnames
dimnames(Xmat) <- list(c(1:4), c(1:5)) # Notice that in this case we need a list

# Assign at the definition
Xmat <- matrix(sample(20), 4, 5, dimnames = list(c(1:4), c(1:5))) # Same as dimnames command
```

An **array** is the extension of a matrix to more than 2 dimensions. For instance, the following command assigns a 3-dimensional array of dimensions (5 x 6 x 10) to the variable ```myArray```:
```
# R

myArray <- array(sample(300), c(5, 6, 10))
```
Elements of arrays can be accessed in the same way as those of vectors and matrices:
```
# R

myElement <- myArray[1, 3, 2] # myElement corresponds to the element (1, 3, 2) of myArray
```
A **list** is a more complex data structure. It can be seen as a vector whose elements can be of different types or dimensions. For instance, a list containing a vector and a matrix can be defined as follows:
```
# R

# Direct assignment: the first element will be named 'myVector',
# and the second element 'myMatrix'
myList <- list(myVector = sample(20),
               myMatrix = matrix(sample(100, 20), 4, 5))
```
The elements of a list can be accessed by their index or by their name, as defined in the list. Using the previous example:
```
# R

X <- myList[[1]] # X is now equal to myVector. NOTE: [[ ]] instead of [ ]
X <- myList$myVector # Access by name through the operator $
```
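The difference between ```[[ ]]``` and ```[ ]``` can be sketched as follows: ```[[ ]]``` extracts the element itself, while ```[ ]``` returns a shorter list that still wraps it. The list is redefined here with fixed values so that the example is self-contained.

```r
# R

myList <- list(myVector = c(4, 5, 6),
               myMatrix = matrix(1:6, 2, 3))

class(myList[[1]])      # "numeric": the vector itself
class(myList[1])        # "list": a list of length 1
myList[['myVector']]    # Access by name, same as myList$myVector
names(myList)           # "myVector" "myMatrix"
```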

Finally, a **data frame** is a matrix-like structure (columns of the same length) whose columns can be vectors of different data types. For instance, a character vector and a numeric vector can be joined to form a data frame.
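A minimal sketch of such a data frame, built with the base function ```data.frame```. The column names ```name``` and ```age``` and their values are hypothetical.

```r
# R

myNames <- c('Ann', 'Bob', 'Carl')   # Character vector
myAges <- c(31, 25, 47)              # Numeric vector

myDataFrame <- data.frame(name = myNames, age = myAges)

dim(myDataFrame)          # 3 rows and 2 columns
myDataFrame$age           # Columns are accessed like list elements: 31 25 47
myDataFrame[2, 'name']    # Row 2 of the column 'name'
```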
As seen in the previous section, elements of vectors, arrays, etc. can be accessed by their indices.
Single elements can be accessed by the value of their index (which can also be stored in an integer variable). Multiple elements can also be accessed at once, using the following commands:
```
# R

# Define a matrix
myMatrix <- matrix(sample(30), 5, 6)

# Read the 4th row
myMatrix[4, ]

# Read the 2nd column
myMatrix[, 2]

# Read the first 3 elements of the 4th column
myMatrix[1:3, 4]
```
The symbol ```a:b``` is equivalent to ```c(a, a+1, a+2, a+3, ..., b-2, b-1, b)```.
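For example, the following sketch checks this equivalence and shows the related base function ```seq```, which is not mentioned above and allows a step other than 1:

```r
# R

2:5                                 # 2 3 4 5
identical(2:5, c(2L, 3L, 4L, 5L))   # TRUE
seq(1, 9, by = 2)                   # 1 3 5 7 9
5:2                                 # Descending when a > b: 5 4 3 2
```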
## Functions

Repeated operations can be assembled into **functions**.
Functions are often exported by packages, or can be defined by the user.
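As a sketch of the general form of a user-defined function, the hypothetical function ```myFunction``` below takes two arguments and returns their sum:

```r
# R

myFunction <- function(x, y) {

    # Perform some operations on the arguments
    result <- x + y

    # Return the result
    return(result)
}

myValue <- myFunction(2, 3)   # myValue is now equal to 5
```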
Once defined, a function can be called by its name:
```
# R

myValue <- myFunction(x, y, ...)
```
A function typically ends with the command ```return```, which defines the value returned by the function. This value can be of any data type.
For instance, a function that calculates the factorial of a non-negative integer can be defined as follows:
```
# R

myFactorial <- function(n) {

    # Check that the argument is a single non-negative whole number.
    # is.integer() is not used here because it is FALSE for a literal such as 25,
    # which R stores as a double
    stopifnot(is.numeric(n), length(n) == 1, n >= 0, n == round(n))

    # Calculate 1 * 2 * ... * (n-1) * n
    f <- 1
    for (i in seq_len(n)) # seq_len(0) is empty, so that 0! = 1
        f <- f * i

    # Then return the value
    return(f)
}
```
Then, the factorial of an integer can be calculated by calling the function:
```
# R

myFactorial(25) # Returns the value of 25!
```
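Base R already provides the function ```factorial```, and ```prod``` gives the same product without an explicit loop. A short sketch, useful for checking a hand-written implementation:

```r
# R

n <- 5
factorial(n)        # 120
prod(seq_len(n))    # 120, the product 1 * 2 * ... * n
prod(seq_len(0))    # 1, consistent with 0! = 1
```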

**Resources:**

[Examples of builtin functions](https://www.statmethods.net/management/functions.html)

[Practice on writing R functions](https://www.datacamp.com/courses/writing-functions-in-r)

## For loops, apply, sapply, lapply

In R, repeated operations (iterations) can be implemented in different ways. The canonical *for loops* are written in this way:
```
# R

for (iterator in firstValue:lastValue)
{
    # Perform some operations
    doSomething(iterator)
}
```
In this example, the third power of ```x``` is calculated using a for loop:
```
# R

# A very inefficient power calculation (use x^3 in real life)
x <- 2
y <- x
for (i in 1:2)
{
    y <- y * x # After two iterations, y is equal to x^3
}
```
R also allows iterations to be run with the commands ```apply```, ```sapply```, and ```lapply```.
The function ```apply``` returns the values of a function calculated along one dimension (margin) of a variable (e.g. the columns of a matrix). For instance, to calculate the sum of the elements of each column of a matrix:
```
# R

apply(myMatrix, 2, sum) # 2 defines the calculation over columns (1 for rows)
```
To apply more complex operations, we can pass a function defined on the elements:
```
# R

# Calculate the sum of squares of the elements of each column
apply(myMatrix, 2, function(x) sum(x^2))
```
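With a small fixed matrix the results are easy to verify by hand. This sketch applies ```sum``` over rows and columns; the matrix ```smallMatrix``` is hypothetical.

```r
# R

smallMatrix <- matrix(1:6, 2, 3)   # Columns: (1, 2), (3, 4), (5, 6)

apply(smallMatrix, 1, sum)                   # Row sums: 9 12
apply(smallMatrix, 2, sum)                   # Column sums: 3 7 11
apply(smallMatrix, 2, function(x) sum(x^2))  # Column sums of squares: 5 25 61
```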
The functions ```sapply``` and ```lapply``` behave similarly, but they are applied to the elements of a vector or a list. ```sapply``` simplifies the result to a vector where possible, whereas ```lapply``` always returns a list:
```
# R

# Use sapply to avoid a for loop. Calculate the square of the elements of a vector
myVector <- c(4, 2, 10)
sapply(myVector, function(x) x^2)

# Example of lapply
myList <- list(x = 'my', y = 'list', z = 'is', w = 'cool')
# Calculate the number of characters of each element of myList
lapply(myList, nchar) # This will give a list with 4 numbers: 2, 4, 2, 4
```
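The difference in return type can be sketched with a hypothetical list of numeric vectors: ```lapply``` keeps the results in a list, while ```sapply``` collapses them into a named vector.

```r
# R

myList <- list(a = 1:3, b = c(10, 20), c = 7)

asList <- lapply(myList, sum)     # List with elements a = 6, b = 30, c = 7
asVector <- sapply(myList, sum)   # Named numeric vector: a = 6, b = 30, c = 7

class(asList)      # "list"
class(asVector)    # "numeric"
```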
**Resources:**

[Using apply, sapply, lapply in R](https://www.r-bloggers.com/using-apply-sapply-lapply-in-r/)

## If-else
Like most other languages, R provides conditional statements (if-else):
```
# R

if (conditionIsTrue)
{
    doSomething
} else
{
    doSomethingElse
}
```
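A concrete sketch of the template above, computing an absolute value with a hypothetical variable ```x```:

```r
# R

x <- -3

if (x >= 0) {
    absValue <- x
} else {
    absValue <- -x
}

absValue   # 3
```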
Basic comparison and logical operators are:
```
# R

x == y # Returns TRUE if x is equal to y
x != y # Returns TRUE if x is not equal to y
x <- TRUE # x contains the logical value TRUE
x <- FALSE # x contains the logical value FALSE
!x # Returns the negation of the logical value x
x && y # Logical AND
x || y # Logical OR
```
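Note that ```&&``` and ```||``` are meant for single logical values, as in the condition of an ```if```. For element-wise operations on vectors, R provides ```&``` and ```|```. A brief sketch with two hypothetical logical vectors:

```r
# R

a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

a & b                  # Element-wise AND: TRUE FALSE FALSE
a | b                  # Element-wise OR:  TRUE TRUE FALSE
!a                     # Element-wise negation: FALSE FALSE TRUE

(3 > 2) && (1 == 1)    # Single values: TRUE
```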
## Plotting
Besides the built-in plotting functions, there are several packages designed to produce high-quality graphs. Probably the most famous among these is ```ggplot2```.
Examples of graphs generated using ```ggplot2``` can be found here:
[R Graphs](http://www.cookbook-r.com/Graphs/)

# Python
"Python is an interpreted high-level programming language for general-purpose programming."