diff --git a/README.md b/README.md
index ee0608f..2d20607 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,6 @@ Welcome to my personal notes on various computer science topics, gathered over 3
 I am sharing them in the hope that they would be useful to you as well.
-
 Available notes:
 
 * [Operating Systems](./operating_systems.org)
@@ -13,7 +12,9 @@ Available notes:
 * [Artificial Intelligence](./artificial_intelligence.org)
 * [Machine Learning (Learning from data)](./learning_from_data.org)
 * [Pro Python](./pro_python.org)
+* [Introduction to Cloud Infrastructure Technologies](./cloud.org)
 * [Computer Arch](./comp_arch.org)
+* [Kubernetes](./kubernetes.org)
 * [Ansible](./ansible.org)
 * [C language](./C_language.org)
 * [Emacs](./emacs.org)
@@ -50,3 +51,5 @@ I welcome you to contribute your notes on the topics or add new topics entirely!
 ##### DISCLAIMER
 A large chunk of these notes are not organized very nicely. Especially the ones from "pre-emacs, pre-org mode" era. I will structure them when I get time.
+
+Also, I have liberally scooped diagrams, explanations, or even tables from the sources I was studying. This is more true for some notes (like [Ansible](./ansible.org)) than others (like [Operating Systems](./operating_systems.org)).
diff --git a/algorithms.org b/algorithms.org
index 9a22630..ffc7a84 100644
--- a/algorithms.org
+++ b/algorithms.org
@@ -3270,14 +3270,11 @@ BUT
 Another example:
 2^{n+10} = O(2^{n})
 BUT
-2^{n+10} = o(2^{n+1}^{})
+2^{n+10} = o(2^{n+11})
 
 (Note: 2^{10n} = O(2^{10n}) and not O(2^{n}))
 
-Similarly little omega notation
-
-
-
-
+Little oh, T(n) = o(f(n)), says that the bound f(n) is strictly greater than T(n): not just greater up to a constant factor, but asymptotically greater (for every constant c > 0, T(n) <= c*f(n) for all sufficiently large n).
+Similarly for the little omega notation.
diff --git a/ansible.org b/ansible.org
index b332653..05c1eb8 100644
--- a/ansible.org
+++ b/ansible.org
@@ -1,7 +1,554 @@
-Crash course in Ansible
+* Ansible Up and Running
+
+Ansible is simple, and that is the best part.
+Eliminating management daemons and relying instead on OpenSSH meant the system could start managing a computer fleet immediately, without having to set up anything on the managed machines. Further, the system was apt to be more reliable and secure.
+
+You can wire up these services by hand: spinning up the servers you need, SSHing to each one, installing packages, editing config files, and so forth, but it's a pain. It's time-consuming, error-prone, and just plain dull to do this kind of work manually, especially around the third or fourth time. And for more complex tasks, like standing up an OpenStack cloud inside your application, doing it by hand is madness. There's a better way.
+** What's in the name?
+An ansible is a fictional communication device that can transfer information faster than the speed of light.
+
+Ansible is a great tool for deployment as well as configuration management. It can orchestrate deployment (imposing an ordering on actions) and also provision infrastructure.
+** How Ansible Works
+In Ansible, a script is called a playbook. A playbook describes which hosts (what Ansible calls remote servers) to configure, and an ordered list of tasks to perform on those hosts.
+
+Run a playbook with ~$ ansible-playbook webservers.yml~
+Ansible will make parallel SSH connections to the hosts.
+
+When you use:
+
+#+begin_src yaml
+- name: Install nginx
+  apt: name=nginx
+#+end_src
+
+Ansible will do the following:
+1. Generate a Python script that installs the Nginx package
+2. Copy the script to web1, web2, and web3
+3. Execute the script on web1, web2, and web3
+4. Wait for the script to complete execution on all hosts
+
+For each task, Ansible generates a Python script and executes it in parallel on all the hosts.
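+
+A minimal sketch of what such a ~webservers.yml~ playbook might look like, assembled from the snippets above (the ~webservers~ group name, ~become~, and the ~service~ task are assumptions added for illustration):
+
+#+begin_src yaml
+# webservers.yml -- illustrative sketch, not taken from the book
+- name: Configure webservers
+  hosts: webservers
+  become: True
+  tasks:
+    - name: Install nginx
+      apt: name=nginx update_cache=yes
+    - name: Make sure nginx is running
+      service: name=nginx state=started
+#+end_src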
+
+Ansible is push based: you run the playbook and the update is done. In a pull-based system, by contrast, you push the configuration updates to a central configuration management service, and agents running on all the machines periodically check for updates and apply the changes.
+
+Ansible supports a pull-based model with ~ansible-pull~
+
+Ansible obeys Alan Kay's maxim: "Simple things should be simple; complex things should be possible."
+
+Ansible's modules (the ones it ships with, and community-contributed ones too) are declarative.
+They are also idempotent. If the deploy user doesn't exist, Ansible will create it. If it does exist, Ansible won't do anything.
+
+The primary unit of reuse in the Ansible community is the module.
+
+Ansible playbooks aren't really intended to be reused across different contexts. Roles are a way of collecting playbooks together so they are more reusable.
+
+*** Using Vagrant to Set Up a Test Server
+If you prefer not to spend the money on a public cloud, I recommend you install Vagrant on your machine. Vagrant is an excellent open source tool for managing virtual machines. You can use Vagrant to boot a Linux virtual machine inside your laptop, and you can use that as a test server.
+
+Vagrant is responsible for building and maintaining portable virtual environments. It is like a Dockerfile for virtual machines.
+
+
+Vagrant creates a vagrant user and an SSH key to enable the ~vagrant ssh~ command:
+
+#+begin_src
+$ ssh vagrant@127.0.0.1 -p 2222 -i /path/to/.vagrant/machines/default/virtualbox/private_key
+#+end_src
+
+
+*** Telling Ansible about the test server
+
+Ansible can manage only the servers it explicitly knows about. You provide Ansible with information about servers by specifying them in an inventory file. Create a file called hosts in the playbooks directory. This file will serve as the inventory file.
+
+Each server needs a name that Ansible will use to identify it. You can use the hostname of the server, or you can give it an alias and pass additional arguments to tell Ansible how to connect to it.
+
+To tell Ansible about the Vagrant machine, we have to specify the user, the SSH key, etc.:
+
+#+begin_src
+# Example 1-1. playbooks/hosts. We can call our vagrant machine testserver
+testserver ansible_host=127.0.0.1 ansible_port=2222 \
+   ansible_user=vagrant \
+   ansible_private_key_file=.vagrant/machines/default/virtualbox/private_key
+#+end_src
+
+We can also use the ~ansible.cfg~ file to avoid having to be so verbose in the inventory file.
+For an EC2 instance, it can be something like:
+
+#+begin_src
+testserver ansible_host=ec2-203-0-113-120.compute-1.amazonaws.com \
+   ansible_user=ubuntu ansible_private_key_file=/path/to/keyfile.pem
+#+end_src
+
+Example ansible.cfg file:
+
+#+begin_src
+# Example 1-2. ansible.cfg
+[defaults]
+inventory = hosts
+remote_user = vagrant
+private_key_file = .vagrant/machines/default/virtualbox/private_key
+host_key_checking = False
+#+end_src
+
+
+The ~command~ module is the default module, so all of these are valid:
+
+1. $ ansible testserver -m command -a uptime
+2. $ ansible testserver -a uptime
+3. $ ansible testserver -a "tail /var/log/dmesg"
+
+
+Ansible playbook syntax is YAML:
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-09 22:44:01
+[[file:assets/screenshot_2018-08-09_22-44-00.png]]
+
+Ansible also accepts more truthy/falsey arguments for modules:
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-09 22:44:27
+[[file:assets/screenshot_2018-08-09_22-44-27.png]]
+
+
+An Ansible convention is to keep files in a subdirectory named files, and Jinja2 templates in a subdirectory named templates.
+
+We can create groups of hosts and mention them in our inventory file (aka the hosts file). Inventory files are in the .ini file format.
+
+So, now our hosts file looks like so:
+
+#+begin_src
+[webservers]
+testserver ansible_host=127.0.0.1 ansible_port=2222
+#+end_src
+
+Execute the playbook with: ~$ ansible-playbook web-notls.yml~
+If your playbook file is marked as executable and starts with a line that looks like this ~#!/usr/bin/env ansible-playbook~ (the shebang), then you can execute it by invoking it directly, like this:
+~$ ./web-notls.yml~
+
+
+
+** YAML syntax
+
+*** Start of File
+YAML files are supposed to start with three dashes to indicate the beginning of the document:
+---
+
+*** Comments
+Comments start with a number sign and apply to the end of the line, the same as in shell scripts, Python, and Ruby:
+~# This is a YAML comment~
+
+
+*** Strings
+In general, YAML strings don't have to be quoted, although you can quote them if you prefer. Even if there are spaces, you don't need to quote them. For example, this is a string in YAML:
+~this is a lovely sentence~
+
+Ansible will need you to quote strings if you use variable substitution, indicated by the use of ~{{ braces }}~
+
+
+
+*** Lists
+YAML lists are like arrays in JSON and Ruby, or lists in Python. Technically, these are called *sequences* in YAML, but I call them lists here to be consistent with the official Ansible documentation.
+
+They are delimited with hyphens, like this:
+ - My Fair Lady
+ - Oklahoma
+ - The Pirates of Penzance
+
+Note, no quoting needed.
+
+YAML also supports an inline format for lists, which looks like this:
+~[My Fair Lady, Oklahoma, The Pirates of Penzance]~
+
+*** Dictionary
+YAML dictionaries are like objects in JSON, dictionaries in Python, or hashes in Ruby. Technically, these are called mappings in YAML, but I call them dictionaries here to be consistent with the official Ansible documentation.
+
+
+They look like this:
+#+begin_src yaml
+  address: 742 Evergreen Terrace
+  city: Springfield
+  state: North Takoma
+#+end_src
+
+The JSON equivalent is shown here:
+#+begin_src js
+{
+  "address": "742 Evergreen Terrace",
+  "city": "Springfield",
+  "state": "North Takoma"
+}
+#+end_src
+
+YAML also supports an inline format for dictionaries, which looks like this:
+~{address: 742 Evergreen Terrace, city: Springfield, state: North Takoma}~
+
+
+*** Line folding
+
+You can break a long string across several lines by using line folding with the greater-than (>) character. The YAML parser will replace the line breaks with spaces. For example:
+
+#+begin_src yaml
+  address: >
+    Department of Computer Science,
+    A.V. Williams Building,
+    University of Maryland
+  city: College Park
+  state: Maryland
+#+end_src
+
+The JSON equivalent is as follows:
+
+#+begin_src json
+{
+  "address": "Department of Computer Science, A.V. Williams Building, University of Maryland",
+  "city": "College Park",
+  "state": "Maryland"
+}
+#+end_src
+
+*A valid JSON file is also a valid YAML file. This is because YAML allows strings to be quoted, considers true and false to be valid Booleans, and has inline lists and dictionary syntaxes that are the same as JSON arrays and objects.*
+
+If you think about it, a playbook is a list of dictionaries.
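+
+For example, the following playbook is literally a YAML list containing one dictionary; the keys of that dictionary (~name~, ~hosts~, ~tasks~) describe a play, and each task is again a dictionary (a minimal illustrative sketch, the group name is an assumption):
+
+#+begin_src yaml
+- name: a play is a dictionary inside a list
+  hosts: webservers
+  tasks:
+    - name: each task is a dictionary too
+      debug: msg="hello from a task"
+#+end_src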
+
+Modules are scripts that come packaged with Ansible and perform some kind of action on a host.
+
+Ansible ships with the ansible-doc command-line tool, which shows documentation about modules. For example, to show the documentation for the service module, run this:
+~$ ansible-doc service~
+
+The modules that ship with Ansible are all written in Python, but modules can be written in any language.
+
+Recall from the first chapter that Ansible executes a task on a host by generating a custom script based on the module name and arguments, then copying this script to the host and running it.
+
+
+** Ansible components
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-09 23:07:43
+[[file:assets/screenshot_2018-08-09_23-07-43.png]]
+
+A playbook has a list of plays, which are just a series of tasks (using one module each) running on a series of hosts.
+
+
+You can define ~vars~ at a playbook level:
+
+#+begin_src yaml
+- name: Configure webserver with nginx and tls
+  hosts: webservers
+  become: True
+  vars:
+    key_file: /etc/nginx/ssl/nginx.key
+    cert_file: /etc/nginx/ssl/nginx.crt
+    conf_file: /etc/nginx/sites-available/default
+    server_name: localhost
+  tasks:
+    - name: Install nginx
+      apt: name=nginx update_cache=yes cache_valid_time=3600
+    - name: create directories for ssl certificates
+      file: path=/etc/nginx/ssl state=directory
+#+end_src
+
+
+In our example, each value is a string (e.g., /etc/nginx/ssl/nginx.key), but any valid YAML can be used as the value of a variable. You can use lists and dictionaries in addition to strings and Booleans.
+
+
+** Handlers
+They are just tasks that run only when they are ~notified~ by other tasks. A task notifies its handlers only when it detects a state change that it caused. Handlers are useful when, for example, you want to restart a service on a configuration change.
+
+
+#+begin_src yaml
+- name: copy TLS key
+  copy: src=files/nginx.key dest={{ key_file }} owner=root mode=0600
+  notify: restart nginx
+
+handlers:
+  - name: restart nginx
+    service: name=nginx state=restarted
+#+end_src
+
+** The Inventory File
+
+The default way to describe your hosts in Ansible is to list them in text files, called inventory files.
+
+
+Ansible has several behavioral inventory parameters that control how it connects to and runs commands on each host.
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-10 22:48:58
+[[file:assets/screenshot_2018-08-10_22-48-58.png]]
+
+The ~ansible_connection~ parameter supports multiple transports for connecting to the host. With the default "smart" transport, if the installed SSH client supports ControlPersist (also known as SSH multiplexing), Ansible will use the local SSH client; if it doesn't, the smart transport falls back to a Python-based SSH client library called Paramiko.
+
+
+If the inventory file is marked executable, Ansible will assume it is a dynamic inventory script and will execute the file instead of reading it.
+
+The Interface for a Dynamic Inventory Script
+An Ansible dynamic inventory script must support two command-line flags:
+
+1. --host=<hostname> for showing host details
+   + this shows the details of a particular host
+2. --list for listing groups
+   + this shows a listing of all the groups, and details about the individual hosts in them
+
+
+#+begin_src
+# assuming our inventory file is dynamic.py
+$ ./dynamic.py --host=vagrant2
+{ "ansible_host": "127.0.0.1", "ansible_port": 2200, "ansible_user": "vagrant"}
+
+$ ./dynamic.py --list
+{"lb": ["delaware.example.com"],
+ "web": ["georgia.example.com", "newhampshire.example.com",
+         "newjersey.example.com", "ontario.example.com", "vagrant1"]}
+#+end_src
+
+Ansible ships with several dynamic inventory scripts that you can use. You can grab these by going to the Ansible GitHub repo and browsing to the contrib/inventory directory. Many of these inventory scripts have an accompanying configuration file.
+
+If you want to have both a regular inventory file and a dynamic inventory script, just put them all in the same directory and configure Ansible to use that directory as the inventory.
+
+Ansible will let you add hosts and groups to the inventory during the execution of a playbook using ~add_host~.
+
+Even if you're using dynamic inventory scripts, the add_host module is useful for scenarios where you start up new virtual machine instances and configure those instances in the same playbook. If a new host comes online while a playbook is executing, the dynamic inventory script will not pick up this new host.
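+
+A minimal sketch of that pattern (the hostname, group, and nginx task for the freshly booted machine are illustrative assumptions):
+
+#+begin_src yaml
+- name: boot a new instance and register it in the inventory
+  hosts: localhost
+  tasks:
+    # ... the task that actually creates the instance would go here ...
+    - name: add the new machine to the in-memory inventory
+      add_host:
+        name: newserver.example.com
+        groups: webservers
+
+- name: configure the machine we just added
+  hosts: newserver.example.com
+  tasks:
+    - name: install nginx
+      apt: name=nginx
+#+end_src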
+
+** Variables
+
+Ansible has variables, and a certain type of variable that Ansible calls a fact.
+The simplest way to define variables is to put a vars section in your playbook with the names and values of variables.
+
+Ansible also allows you to put variables into one or more files, using a section called vars_files.
+
+For debugging, it's often handy to be able to view the value of a variable. We saw how to use the debug module to print out an arbitrary message. We can also use it to output the value of a variable. It works like this:
+~- debug: var=myvarname~
+
+Often, you'll find that you need to set the value of a variable based on the result (output) of a task. To do so, we create a registered variable using the register clause when invoking a module.
+
+The value of a variable set using the register clause is always a dictionary. Some of the keys commonly present are:
+- changed -> indicates if the task changed state
+- cmd -> the invoked command as a list of strings
+- stderr
+- stdout
+
+Ansible uses Jinja2 to implement variable dereferencing on these dicts.
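+
+A small sketch tying register and debug together (the command being run is just an example):
+
+#+begin_src yaml
+- name: demonstrate registered variables
+  hosts: webservers
+  tasks:
+    - name: capture the kernel version
+      command: uname -r
+      register: uname_result
+
+    - name: show the whole registered dictionary
+      debug: var=uname_result
+
+    - name: show just the stdout key
+      debug: msg="kernel is {{ uname_result.stdout }}"
+#+end_src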
+
+
+When Ansible runs a playbook, before the first task runs, this happens:
+
+    GATHERING FACTS **************************************************
+    ok: [servername]
+
+When Ansible gathers facts, it connects to the host and queries it for all kinds of details about the host: CPU architecture, operating system, IP addresses, memory info, disk info, and more. This information is stored in variables that are called facts, and they behave just like any other variable.
+
+
+Here's a simple playbook that prints out the operating system of each server:
+
+#+begin_src
+ - name: print out operating system
+   hosts: all
+   gather_facts: True
+   tasks:
+     - debug: var=ansible_distribution
+#+end_src
+
+Ansible implements fact collecting through the use of a special module called the setup module.
+
+Any Module Can Return Facts. The use of ansible_facts in the return value is an Ansible idiom. If a module returns a dictionary that contains ansible_facts as a key, Ansible will create variables in the environment from those values and associate them with the active host.
+
+For example:
+
+#+begin_src
+- name: get ec2 facts
+  ec2_facts:
+
+- debug: var=ansible_ec2_instance_id
+#+end_src
+
+Here, the variable ansible_ec2_instance_id printed by the debug task was returned by the ec2_facts module.
+
+
+Some of Ansible's variables that are always available:
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-10 23:40:38
+[[file:assets/screenshot_2018-08-10_23-40-38.png]]
+
+In Ansible, variables are scoped by host. It only makes sense to talk about the value of a variable relative to a given host.
+
+Eg:
+~{{ hostvars['db.example.com'].ansible_eth1.ipv4.address }}~
+
+This evaluates to the ansible_eth1.ipv4.address fact associated with the host named db.example.com.
+
+Here, we did not use ~hostvars.db.example.com~ since the string "db.example.com" has periods.
+
+The ~groups~ var can be useful to access variables for a group of hosts.
+
+#+begin_src
+# sample configuration file
+backend web-backend
+  {% for host in groups.web %}
+  server {{ hostvars[host].inventory_hostname }} \
+    {{ hostvars[host].ansible_default_ipv4.address }}:80
+  {% endfor %}
+
+# generated file
+backend web-backend
+  server georgia.example.com 203.0.113.15:80
+  server newhampshire.example.com 203.0.113.25:80
+  server newjersey.example.com 203.0.113.38:80
+#+end_src
+
+Variables set by passing -e var=value to ansible-playbook have the highest precedence (-e is the short form of ~--extra-vars~).
+
+
+Django implements the standard Web Server Gateway Interface (WSGI), so any Python HTTP server that supports WSGI is suitable for running a Django application such as Mezzanine. We'll use Gunicorn, one of the most popular WSGI HTTP servers.
+
+Gunicorn will execute our Django application, just like the development server does. However, Gunicorn won't serve any of the static assets associated with the application.
+
+Although Gunicorn can handle TLS encryption, it's common to configure Nginx to handle the encryption.
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-10 23:52:45
+[[file:assets/screenshot_2018-08-10_23-52-45.png]]
+
+We need to run Gunicorn as a daemon, and we'd like to be able to easily stop it and restart it. Numerous service managers can do this job. We're going to use Supervisor, because that's what the Mezzanine deployment scripts use.
+
+
+** Listing tasks in a Playbook
+Useful for getting the list of tasks that will be run:
+~$ ansible-playbook --list-tasks mezzanine.yml~
+
+
+Ansible ships with a django_manage module that invokes manage.py commands. We could invoke it like this:
+  - name: initialize the database
+    django_manage:
+      command: createdb --noinput --nodata
+      app_path: "{{ proj_path }}"
+      virtualenv: "{{ venv_path }}"
+
+
+We can use the ~script~ module instead. It will copy over a custom script and execute it.
+
+In order to run these scripts in the context of the virtualenv, I also needed to set the PATH variable so that the first Python executable in the path would be the one inside the virtualenv.
+
+#+begin_src
+- name: set the site id
+  script: scripts/setsite.py
+  environment:
+    PATH: "{{ venv_path }}/bin"
+    PROJECT_DIR: "{{ proj_path }}"
+    PROJECT_APP: "{{ proj_app }}"
+#+end_src
+
+We have the ~cron~ module as well:
+#+begin_src
+- name: install poll twitter cron job
+  cron: name="poll twitter" minute="*/5" user={{ user }} job="{{ manage }} \
+    poll_twitter"
+#+end_src
+
+
+** Roles - Scaling up your playbooks
+
+One of the things I like about Ansible is how it scales both up and down. I'm not referring to the number of hosts you're managing, but rather the complexity of the jobs you're trying to automate.
+
+Ansible scales down well because simple tasks are easy to implement. It scales up well because it provides mechanisms for decomposing complex jobs into smaller pieces.
+
+In Ansible, the role is the primary mechanism for breaking a playbook into multiple files. This simplifies writing complex playbooks, and it makes them easier to reuse. Think of a role as something you assign to one or more hosts. For example, you'd assign a database role to the hosts that will act as database servers.
+
+*** Basic Structure of a Role
+
+Say we have a role called ~database~. It lives in the roles/database directory.
+
+- roles/database/tasks/main.yml
+  - Tasks
+- roles/database/files/
+  - Holds files to be uploaded to hosts
+- roles/database/templates/
+  - Holds Jinja2 template files
+- roles/database/handlers/main.yml
+  - Handlers
+- roles/database/vars/main.yml
+  - Variables that shouldn't be overridden
+- roles/database/defaults/main.yml
+  - Default variables that can be overridden
+- roles/database/meta/main.yml
+  - Dependency information about a role
+
+Ansible looks for roles in the roles directory alongside your playbooks. It also looks for systemwide roles in /etc/ansible/roles. You can customize the systemwide location of roles by setting the roles_path setting in the defaults section of your ansible.cfg, or by setting the ANSIBLE_ROLES_PATH env var.
+
+When we are done writing roles, we can assign them to our hosts like so:
+
+#+begin_src
+- name: deploy mezzanine on vagrant
+  hosts: web
+  vars_files:
+    - secrets.yml
+  roles:
+    - role: database
+      database_name: "{{ mezzanine_proj_name }}"  # these vars can be defined in vars/main.yml, or defaults/main.yml
+      database_user: "{{ mezzanine_proj_name }}"
+
+    - role: mezzanine
+      live_hostname: 192.168.33.10.xip.io
+      domains:
+        - 192.168.33.10.xip.io
+        - www.192.168.33.10.xip.io
+#+end_src
+
+Ansible allows you to define a list of tasks that execute before the roles with a pre_tasks section, and a list of tasks that execute after the roles with a post_tasks section.
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-11 09:03:52
+[[file:assets/screenshot_2018-08-11_09-03-52.png]]
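+
+A sketch of what that ordering might look like (the tasks shown are placeholders for illustration):
+
+#+begin_src yaml
+- name: deploy with pre and post tasks
+  hosts: web
+  pre_tasks:
+    - name: runs before any role is applied
+      debug: msg="e.g. take the host out of the load balancer"
+  roles:
+    - role: mezzanine
+  post_tasks:
+    - name: runs after all roles have finished
+      debug: msg="e.g. put the host back into the load balancer"
+#+end_src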
+
+
+*** Writing the ~database~ role
+
+**** roles/database/tasks/main.yml
+***** has all the tasks, like in a regular playbook
+**** roles/database/defaults/main.yml
+***** here we can give the default values of the variables that we use in the tasks/main.yml playbook (see the sketch below)
+**** roles/database/handlers/main.yml
+***** defines a handler (say, restart postgres). Any task can use notify to call this handler based on its result - execute if state changed
+**** roles/database/files/pg_hba.conf
+**** roles/database/files/postgresql.conf
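+
+For instance, the role's defaults and handlers files might look roughly like this (the variable name, port value, and service name are illustrative assumptions, not taken from the book):
+
+#+begin_src yaml
+# roles/database/defaults/main.yml -- defaults that callers may override
+database_port: 5432
+
+# roles/database/handlers/main.yml -- handlers that tasks can notify
+- name: restart postgres
+  service: name=postgresql state=restarted
+#+end_src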
+
+
+So, Ansible roles are just long playbooks that have been broken down into organized, smaller files.
+
+Ansible doesn't have any notion of namespace across roles. This means that variables that are defined in other roles, or elsewhere in a playbook, will be accessible everywhere. So, it's good practice to prefix variables in a role with the name of the role.
+
+
+
+Ansible ships with another command-line tool, ~ansible-galaxy~. Its primary purpose is to download roles that have been shared by the Ansible community. It can also be used to generate scaffolding, an initial set of files and directories involved in a role:
+~$ ansible-galaxy init -p playbooks/roles web~
+
+The -p flag tells ansible-galaxy where your roles directory is. If not specified, the role files will be created in your current directory.
+
+When you have a role that depends on another role already having been executed, you can leverage Ansible's support for dependent roles.
+E.g., for the django role, you could have specified:
+
+#+begin_src yaml
+dependencies:
+  - { role: web }
+  - { role: memcached }
+#+end_src
+
+* Crash course in Ansible
 http://people.redhat.com/mlessard/qc/presentations/Mai2016/AnsibleWorkshopWA.pdf
 
-* Introduction to ansible
+** Introduction to ansible
 
 "Ansible" is a fictional machine capable of superluminal communication
 (faster than light communication)
@@ -32,7 +579,7 @@ Key components:
 - Playbook
 
-* Ansible Commands
+** Ansible Commands
 
 We can run commands using one of the several modules and giving it the required arguments and specifying the hosts file
 - each command needs to have an inventory specified with -i
@@ -56,7 +603,7 @@ ansible all -i ./hosts -m command -a "ping"
 
 Or :top: we can use the ping module!
 ansible all -i ./hosts -m ping
 
-* Ansible playbooks
+** Ansible playbooks
 
 #+begin_src yaml
 - name: This is a play # this is the name of the play
@@ -264,7 +811,7 @@ We can add ignore_error parameter to skip potential errors
 Now, we can run this and pass the vars:
 ansible-playbook -i ../hosts lab2.yml -e "selinux=permissive"
 
-* Ansible variables
+** Ansible variables
 
 The precedence of variables:
 1. Extra vars
 2.
Task vars (only for the task) @@ -330,7 +877,7 @@ vars: #+end_src -* Ansible roles +** Ansible roles Roles are a redistributable and reusable collection of: - tasks diff --git a/assets/2022-06-11_12-21-46_screenshot.png b/assets/2022-06-11_12-21-46_screenshot.png new file mode 100644 index 0000000..58acf78 Binary files /dev/null and b/assets/2022-06-11_12-21-46_screenshot.png differ diff --git a/assets/2022-06-11_12-22-02_screenshot.png b/assets/2022-06-11_12-22-02_screenshot.png new file mode 100644 index 0000000..68e803b Binary files /dev/null and b/assets/2022-06-11_12-22-02_screenshot.png differ diff --git a/assets/2022-06-11_12-22-19_screenshot.png b/assets/2022-06-11_12-22-19_screenshot.png new file mode 100644 index 0000000..b65780e Binary files /dev/null and b/assets/2022-06-11_12-22-19_screenshot.png differ diff --git a/assets/2022-06-11_12-23-27_screenshot.png b/assets/2022-06-11_12-23-27_screenshot.png new file mode 100644 index 0000000..561cf06 Binary files /dev/null and b/assets/2022-06-11_12-23-27_screenshot.png differ diff --git a/assets/screenshot_2018-05-23_18-09-10.png b/assets/screenshot_2018-05-23_18-09-10.png new file mode 100644 index 0000000..5b05571 Binary files /dev/null and b/assets/screenshot_2018-05-23_18-09-10.png differ diff --git a/assets/screenshot_2018-05-23_18-18-25.png b/assets/screenshot_2018-05-23_18-18-25.png new file mode 100644 index 0000000..4ebd04d Binary files /dev/null and b/assets/screenshot_2018-05-23_18-18-25.png differ diff --git a/assets/screenshot_2018-05-23_19-36-21.png b/assets/screenshot_2018-05-23_19-36-21.png new file mode 100644 index 0000000..f69d8ba Binary files /dev/null and b/assets/screenshot_2018-05-23_19-36-21.png differ diff --git a/assets/screenshot_2018-05-23_20-16-23.png b/assets/screenshot_2018-05-23_20-16-23.png new file mode 100644 index 0000000..2feafbb Binary files /dev/null and b/assets/screenshot_2018-05-23_20-16-23.png differ diff --git a/assets/screenshot_2018-05-23_20-20-21.png b/assets/screenshot_2018-05-23_20-20-21.png new file mode 100644 index 0000000..9c5f80c Binary files /dev/null and b/assets/screenshot_2018-05-23_20-20-21.png differ diff --git a/assets/screenshot_2018-05-23_20-26-11.png b/assets/screenshot_2018-05-23_20-26-11.png new file mode 100644 index 0000000..b5ea37a Binary files /dev/null and b/assets/screenshot_2018-05-23_20-26-11.png differ diff --git a/assets/screenshot_2018-05-23_20-29-19.png b/assets/screenshot_2018-05-23_20-29-19.png new file mode 100644 index 0000000..ed9088d Binary files /dev/null and b/assets/screenshot_2018-05-23_20-29-19.png differ diff --git a/assets/screenshot_2018-05-23_22-16-00.png b/assets/screenshot_2018-05-23_22-16-00.png new file mode 100644 index 0000000..3df4947 Binary files /dev/null and b/assets/screenshot_2018-05-23_22-16-00.png differ diff --git a/assets/screenshot_2018-05-23_22-26-44.png b/assets/screenshot_2018-05-23_22-26-44.png new file mode 100644 index 0000000..fd16aaf Binary files /dev/null and b/assets/screenshot_2018-05-23_22-26-44.png differ diff --git a/assets/screenshot_2018-05-23_22-28-58.png b/assets/screenshot_2018-05-23_22-28-58.png new file mode 100644 index 0000000..2274cb2 Binary files /dev/null and b/assets/screenshot_2018-05-23_22-28-58.png differ diff --git a/assets/screenshot_2018-05-23_22-54-47.png b/assets/screenshot_2018-05-23_22-54-47.png new file mode 100644 index 0000000..c63af6d Binary files /dev/null and b/assets/screenshot_2018-05-23_22-54-47.png differ diff --git a/assets/screenshot_2018-05-24_21-27-56.png 
b/assets/screenshot_2018-05-24_21-27-56.png new file mode 100644 index 0000000..2637c65 Binary files /dev/null and b/assets/screenshot_2018-05-24_21-27-56.png differ diff --git a/assets/screenshot_2018-05-24_21-35-38.png b/assets/screenshot_2018-05-24_21-35-38.png new file mode 100644 index 0000000..4fd9dda Binary files /dev/null and b/assets/screenshot_2018-05-24_21-35-38.png differ diff --git a/assets/screenshot_2018-05-24_22-02-30.png b/assets/screenshot_2018-05-24_22-02-30.png new file mode 100644 index 0000000..df18f72 Binary files /dev/null and b/assets/screenshot_2018-05-24_22-02-30.png differ diff --git a/assets/screenshot_2018-05-24_22-30-20.png b/assets/screenshot_2018-05-24_22-30-20.png new file mode 100644 index 0000000..ebf1dcb Binary files /dev/null and b/assets/screenshot_2018-05-24_22-30-20.png differ diff --git a/assets/screenshot_2018-05-24_22-38-12.png b/assets/screenshot_2018-05-24_22-38-12.png new file mode 100644 index 0000000..2fc6777 Binary files /dev/null and b/assets/screenshot_2018-05-24_22-38-12.png differ diff --git a/assets/screenshot_2018-05-24_23-00-31.png b/assets/screenshot_2018-05-24_23-00-31.png new file mode 100644 index 0000000..50ab66e Binary files /dev/null and b/assets/screenshot_2018-05-24_23-00-31.png differ diff --git a/assets/screenshot_2018-05-28_23-42-00.png b/assets/screenshot_2018-05-28_23-42-00.png new file mode 100644 index 0000000..f032482 Binary files /dev/null and b/assets/screenshot_2018-05-28_23-42-00.png differ diff --git a/assets/screenshot_2018-05-29_00-25-59.png b/assets/screenshot_2018-05-29_00-25-59.png new file mode 100644 index 0000000..b1f3bfe Binary files /dev/null and b/assets/screenshot_2018-05-29_00-25-59.png differ diff --git a/assets/screenshot_2018-05-29_00-50-04.png b/assets/screenshot_2018-05-29_00-50-04.png new file mode 100644 index 0000000..7883278 Binary files /dev/null and b/assets/screenshot_2018-05-29_00-50-04.png differ diff --git a/assets/screenshot_2018-05-29_00-50-16.png b/assets/screenshot_2018-05-29_00-50-16.png new file mode 100644 index 0000000..e1d3793 Binary files /dev/null and b/assets/screenshot_2018-05-29_00-50-16.png differ diff --git a/assets/screenshot_2018-05-29_01-00-26.png b/assets/screenshot_2018-05-29_01-00-26.png new file mode 100644 index 0000000..eea701e Binary files /dev/null and b/assets/screenshot_2018-05-29_01-00-26.png differ diff --git a/assets/screenshot_2018-05-31_09-55-27.png b/assets/screenshot_2018-05-31_09-55-27.png new file mode 100644 index 0000000..a354f76 Binary files /dev/null and b/assets/screenshot_2018-05-31_09-55-27.png differ diff --git a/assets/screenshot_2018-06-03_23-10-15.png b/assets/screenshot_2018-06-03_23-10-15.png new file mode 100644 index 0000000..5623472 Binary files /dev/null and b/assets/screenshot_2018-06-03_23-10-15.png differ diff --git a/assets/screenshot_2018-06-03_23-15-35.png b/assets/screenshot_2018-06-03_23-15-35.png new file mode 100644 index 0000000..bcbdc41 Binary files /dev/null and b/assets/screenshot_2018-06-03_23-15-35.png differ diff --git a/assets/screenshot_2018-06-03_23-22-06.png b/assets/screenshot_2018-06-03_23-22-06.png new file mode 100644 index 0000000..f1f2653 Binary files /dev/null and b/assets/screenshot_2018-06-03_23-22-06.png differ diff --git a/assets/screenshot_2018-06-03_23-39-40.png b/assets/screenshot_2018-06-03_23-39-40.png new file mode 100644 index 0000000..9e4dc09 Binary files /dev/null and b/assets/screenshot_2018-06-03_23-39-40.png differ diff --git a/assets/screenshot_2018-06-10_19-31-15.png 
b/assets/screenshot_2018-06-10_19-31-15.png new file mode 100644 index 0000000..bd83247 Binary files /dev/null and b/assets/screenshot_2018-06-10_19-31-15.png differ diff --git a/assets/screenshot_2018-06-10_19-33-57.png b/assets/screenshot_2018-06-10_19-33-57.png new file mode 100644 index 0000000..b3a54c8 Binary files /dev/null and b/assets/screenshot_2018-06-10_19-33-57.png differ diff --git a/assets/screenshot_2018-06-10_19-38-23.png b/assets/screenshot_2018-06-10_19-38-23.png new file mode 100644 index 0000000..9917db8 Binary files /dev/null and b/assets/screenshot_2018-06-10_19-38-23.png differ diff --git a/assets/screenshot_2018-06-10_19-39-34.png b/assets/screenshot_2018-06-10_19-39-34.png new file mode 100644 index 0000000..9dfc52e Binary files /dev/null and b/assets/screenshot_2018-06-10_19-39-34.png differ diff --git a/assets/screenshot_2018-06-10_19-41-06.png b/assets/screenshot_2018-06-10_19-41-06.png new file mode 100644 index 0000000..b967a1c Binary files /dev/null and b/assets/screenshot_2018-06-10_19-41-06.png differ diff --git a/assets/screenshot_2018-06-10_19-48-33.png b/assets/screenshot_2018-06-10_19-48-33.png new file mode 100644 index 0000000..2fd6bb0 Binary files /dev/null and b/assets/screenshot_2018-06-10_19-48-33.png differ diff --git a/assets/screenshot_2018-06-10_19-49-52.png b/assets/screenshot_2018-06-10_19-49-52.png new file mode 100644 index 0000000..85155ad Binary files /dev/null and b/assets/screenshot_2018-06-10_19-49-52.png differ diff --git a/assets/screenshot_2018-06-10_19-50-24.png b/assets/screenshot_2018-06-10_19-50-24.png new file mode 100644 index 0000000..85155ad Binary files /dev/null and b/assets/screenshot_2018-06-10_19-50-24.png differ diff --git a/assets/screenshot_2018-06-10_20-23-39.png b/assets/screenshot_2018-06-10_20-23-39.png new file mode 100644 index 0000000..f0f13c5 Binary files /dev/null and b/assets/screenshot_2018-06-10_20-23-39.png differ diff --git a/assets/screenshot_2018-06-10_20-46-31.png b/assets/screenshot_2018-06-10_20-46-31.png new file mode 100644 index 0000000..caefbda Binary files /dev/null and b/assets/screenshot_2018-06-10_20-46-31.png differ diff --git a/assets/screenshot_2018-06-10_20-58-49.png b/assets/screenshot_2018-06-10_20-58-49.png new file mode 100644 index 0000000..d1211e2 Binary files /dev/null and b/assets/screenshot_2018-06-10_20-58-49.png differ diff --git a/assets/screenshot_2018-06-10_23-55-41.png b/assets/screenshot_2018-06-10_23-55-41.png new file mode 100644 index 0000000..92b82d6 Binary files /dev/null and b/assets/screenshot_2018-06-10_23-55-41.png differ diff --git a/assets/screenshot_2018-06-11_00-16-55.png b/assets/screenshot_2018-06-11_00-16-55.png new file mode 100644 index 0000000..640300a Binary files /dev/null and b/assets/screenshot_2018-06-11_00-16-55.png differ diff --git a/assets/screenshot_2018-06-11_00-22-33.png b/assets/screenshot_2018-06-11_00-22-33.png new file mode 100644 index 0000000..59616c5 Binary files /dev/null and b/assets/screenshot_2018-06-11_00-22-33.png differ diff --git a/assets/screenshot_2018-06-11_23-48-34.png b/assets/screenshot_2018-06-11_23-48-34.png new file mode 100644 index 0000000..ef382aa Binary files /dev/null and b/assets/screenshot_2018-06-11_23-48-34.png differ diff --git a/assets/screenshot_2018-06-11_23-50-28.png b/assets/screenshot_2018-06-11_23-50-28.png new file mode 100644 index 0000000..25bd41d Binary files /dev/null and b/assets/screenshot_2018-06-11_23-50-28.png differ diff --git a/assets/screenshot_2018-06-12_20-58-57.png 
b/assets/screenshot_2018-06-12_20-58-57.png new file mode 100644 index 0000000..5b36266 Binary files /dev/null and b/assets/screenshot_2018-06-12_20-58-57.png differ diff --git a/assets/screenshot_2018-06-12_20-59-07.png b/assets/screenshot_2018-06-12_20-59-07.png new file mode 100644 index 0000000..9ef0a70 Binary files /dev/null and b/assets/screenshot_2018-06-12_20-59-07.png differ diff --git a/assets/screenshot_2018-06-12_22-20-29.png b/assets/screenshot_2018-06-12_22-20-29.png new file mode 100644 index 0000000..d6bcda3 Binary files /dev/null and b/assets/screenshot_2018-06-12_22-20-29.png differ diff --git a/assets/screenshot_2018-06-12_22-30-46.png b/assets/screenshot_2018-06-12_22-30-46.png new file mode 100644 index 0000000..0743a1a Binary files /dev/null and b/assets/screenshot_2018-06-12_22-30-46.png differ diff --git a/assets/screenshot_2018-06-12_22-46-44.png b/assets/screenshot_2018-06-12_22-46-44.png new file mode 100644 index 0000000..c361b2b Binary files /dev/null and b/assets/screenshot_2018-06-12_22-46-44.png differ diff --git a/assets/screenshot_2018-06-12_22-47-03.png b/assets/screenshot_2018-06-12_22-47-03.png new file mode 100644 index 0000000..9e9d443 Binary files /dev/null and b/assets/screenshot_2018-06-12_22-47-03.png differ diff --git a/assets/screenshot_2018-06-12_23-27-32.png b/assets/screenshot_2018-06-12_23-27-32.png new file mode 100644 index 0000000..cb22840 Binary files /dev/null and b/assets/screenshot_2018-06-12_23-27-32.png differ diff --git a/assets/screenshot_2018-06-12_23-44-47.png b/assets/screenshot_2018-06-12_23-44-47.png new file mode 100644 index 0000000..0c36aab Binary files /dev/null and b/assets/screenshot_2018-06-12_23-44-47.png differ diff --git a/assets/screenshot_2018-06-12_23-47-28.png b/assets/screenshot_2018-06-12_23-47-28.png new file mode 100644 index 0000000..5273120 Binary files /dev/null and b/assets/screenshot_2018-06-12_23-47-28.png differ diff --git a/assets/screenshot_2018-06-12_23-51-33.png b/assets/screenshot_2018-06-12_23-51-33.png new file mode 100644 index 0000000..a93f72b Binary files /dev/null and b/assets/screenshot_2018-06-12_23-51-33.png differ diff --git a/assets/screenshot_2018-06-13_00-17-51.png b/assets/screenshot_2018-06-13_00-17-51.png new file mode 100644 index 0000000..d86d428 Binary files /dev/null and b/assets/screenshot_2018-06-13_00-17-51.png differ diff --git a/assets/screenshot_2018-06-13_00-28-09.png b/assets/screenshot_2018-06-13_00-28-09.png new file mode 100644 index 0000000..73b0850 Binary files /dev/null and b/assets/screenshot_2018-06-13_00-28-09.png differ diff --git a/assets/screenshot_2018-06-13_00-32-19.png b/assets/screenshot_2018-06-13_00-32-19.png new file mode 100644 index 0000000..88a7bc1 Binary files /dev/null and b/assets/screenshot_2018-06-13_00-32-19.png differ diff --git a/assets/screenshot_2018-06-13_00-35-17.png b/assets/screenshot_2018-06-13_00-35-17.png new file mode 100644 index 0000000..b53e698 Binary files /dev/null and b/assets/screenshot_2018-06-13_00-35-17.png differ diff --git a/assets/screenshot_2018-06-13_00-52-15.png b/assets/screenshot_2018-06-13_00-52-15.png new file mode 100644 index 0000000..93baf4b Binary files /dev/null and b/assets/screenshot_2018-06-13_00-52-15.png differ diff --git a/assets/screenshot_2018-06-13_23-48-48.png b/assets/screenshot_2018-06-13_23-48-48.png new file mode 100644 index 0000000..39970f2 Binary files /dev/null and b/assets/screenshot_2018-06-13_23-48-48.png differ diff --git a/assets/screenshot_2018-06-14_19-31-56.png 
b/assets/screenshot_2018-06-14_19-31-56.png new file mode 100644 index 0000000..92df8c3 Binary files /dev/null and b/assets/screenshot_2018-06-14_19-31-56.png differ diff --git a/assets/screenshot_2018-06-14_20-02-04.png b/assets/screenshot_2018-06-14_20-02-04.png new file mode 100644 index 0000000..7a116cc Binary files /dev/null and b/assets/screenshot_2018-06-14_20-02-04.png differ diff --git a/assets/screenshot_2018-06-14_20-20-16.png b/assets/screenshot_2018-06-14_20-20-16.png new file mode 100644 index 0000000..29227c1 Binary files /dev/null and b/assets/screenshot_2018-06-14_20-20-16.png differ diff --git a/assets/screenshot_2018-06-14_20-20-27.png b/assets/screenshot_2018-06-14_20-20-27.png new file mode 100644 index 0000000..1b7bea1 Binary files /dev/null and b/assets/screenshot_2018-06-14_20-20-27.png differ diff --git a/assets/screenshot_2018-06-14_20-20-37.png b/assets/screenshot_2018-06-14_20-20-37.png new file mode 100644 index 0000000..ffa3514 Binary files /dev/null and b/assets/screenshot_2018-06-14_20-20-37.png differ diff --git a/assets/screenshot_2018-06-14_20-23-38.png b/assets/screenshot_2018-06-14_20-23-38.png new file mode 100644 index 0000000..df91464 Binary files /dev/null and b/assets/screenshot_2018-06-14_20-23-38.png differ diff --git a/assets/screenshot_2018-06-14_20-24-16.png b/assets/screenshot_2018-06-14_20-24-16.png new file mode 100644 index 0000000..cb65570 Binary files /dev/null and b/assets/screenshot_2018-06-14_20-24-16.png differ diff --git a/assets/screenshot_2018-06-14_20-42-57.png b/assets/screenshot_2018-06-14_20-42-57.png new file mode 100644 index 0000000..cacce50 Binary files /dev/null and b/assets/screenshot_2018-06-14_20-42-57.png differ diff --git a/assets/screenshot_2018-06-16_23-33-07.png b/assets/screenshot_2018-06-16_23-33-07.png new file mode 100644 index 0000000..f142567 Binary files /dev/null and b/assets/screenshot_2018-06-16_23-33-07.png differ diff --git a/assets/screenshot_2018-06-16_23-36-44.png b/assets/screenshot_2018-06-16_23-36-44.png new file mode 100644 index 0000000..0ab8daf Binary files /dev/null and b/assets/screenshot_2018-06-16_23-36-44.png differ diff --git a/assets/screenshot_2018-06-16_23-48-17.png b/assets/screenshot_2018-06-16_23-48-17.png new file mode 100644 index 0000000..cbdcef6 Binary files /dev/null and b/assets/screenshot_2018-06-16_23-48-17.png differ diff --git a/assets/screenshot_2018-06-17_00-04-39.png b/assets/screenshot_2018-06-17_00-04-39.png new file mode 100644 index 0000000..7758465 Binary files /dev/null and b/assets/screenshot_2018-06-17_00-04-39.png differ diff --git a/assets/screenshot_2018-06-17_00-09-07.png b/assets/screenshot_2018-06-17_00-09-07.png new file mode 100644 index 0000000..2e9d23d Binary files /dev/null and b/assets/screenshot_2018-06-17_00-09-07.png differ diff --git a/assets/screenshot_2018-06-17_00-13-04.png b/assets/screenshot_2018-06-17_00-13-04.png new file mode 100644 index 0000000..e3abdc2 Binary files /dev/null and b/assets/screenshot_2018-06-17_00-13-04.png differ diff --git a/assets/screenshot_2018-06-17_00-35-34.png b/assets/screenshot_2018-06-17_00-35-34.png new file mode 100644 index 0000000..e3abdc2 Binary files /dev/null and b/assets/screenshot_2018-06-17_00-35-34.png differ diff --git a/assets/screenshot_2018-06-17_11-29-10.png b/assets/screenshot_2018-06-17_11-29-10.png new file mode 100644 index 0000000..2d0c9cf Binary files /dev/null and b/assets/screenshot_2018-06-17_11-29-10.png differ diff --git a/assets/screenshot_2018-06-17_11-38-36.png 
b/assets/screenshot_2018-06-17_11-38-36.png new file mode 100644 index 0000000..d7b78b3 Binary files /dev/null and b/assets/screenshot_2018-06-17_11-38-36.png differ diff --git a/assets/screenshot_2018-06-17_11-44-02.png b/assets/screenshot_2018-06-17_11-44-02.png new file mode 100644 index 0000000..d35f49e Binary files /dev/null and b/assets/screenshot_2018-06-17_11-44-02.png differ diff --git a/assets/screenshot_2018-06-17_11-44-52.png b/assets/screenshot_2018-06-17_11-44-52.png new file mode 100644 index 0000000..6cc97c6 Binary files /dev/null and b/assets/screenshot_2018-06-17_11-44-52.png differ diff --git a/assets/screenshot_2018-06-17_12-45-17.png b/assets/screenshot_2018-06-17_12-45-17.png new file mode 100644 index 0000000..a7baaff Binary files /dev/null and b/assets/screenshot_2018-06-17_12-45-17.png differ diff --git a/assets/screenshot_2018-06-17_13-08-20.png b/assets/screenshot_2018-06-17_13-08-20.png new file mode 100644 index 0000000..668102f Binary files /dev/null and b/assets/screenshot_2018-06-17_13-08-20.png differ diff --git a/assets/screenshot_2018-06-17_13-09-04.png b/assets/screenshot_2018-06-17_13-09-04.png new file mode 100644 index 0000000..86bc375 Binary files /dev/null and b/assets/screenshot_2018-06-17_13-09-04.png differ diff --git a/assets/screenshot_2018-06-17_13-55-28.png b/assets/screenshot_2018-06-17_13-55-28.png new file mode 100644 index 0000000..44a31c6 Binary files /dev/null and b/assets/screenshot_2018-06-17_13-55-28.png differ diff --git a/assets/screenshot_2018-06-17_13-55-43.png b/assets/screenshot_2018-06-17_13-55-43.png new file mode 100644 index 0000000..c50cc3c Binary files /dev/null and b/assets/screenshot_2018-06-17_13-55-43.png differ diff --git a/assets/screenshot_2018-07-03_18-45-34.png b/assets/screenshot_2018-07-03_18-45-34.png new file mode 100644 index 0000000..248907e Binary files /dev/null and b/assets/screenshot_2018-07-03_18-45-34.png differ diff --git a/assets/screenshot_2018-07-03_19-19-39.png b/assets/screenshot_2018-07-03_19-19-39.png new file mode 100644 index 0000000..778505e Binary files /dev/null and b/assets/screenshot_2018-07-03_19-19-39.png differ diff --git a/assets/screenshot_2018-07-06_19-40-02.png b/assets/screenshot_2018-07-06_19-40-02.png new file mode 100644 index 0000000..8dc417e Binary files /dev/null and b/assets/screenshot_2018-07-06_19-40-02.png differ diff --git a/assets/screenshot_2018-07-06_22-01-53.png b/assets/screenshot_2018-07-06_22-01-53.png new file mode 100644 index 0000000..bed2628 Binary files /dev/null and b/assets/screenshot_2018-07-06_22-01-53.png differ diff --git a/assets/screenshot_2018-07-06_22-16-25.png b/assets/screenshot_2018-07-06_22-16-25.png new file mode 100644 index 0000000..cd0f6da Binary files /dev/null and b/assets/screenshot_2018-07-06_22-16-25.png differ diff --git a/assets/screenshot_2018-07-06_22-21-33.png b/assets/screenshot_2018-07-06_22-21-33.png new file mode 100644 index 0000000..1fb8453 Binary files /dev/null and b/assets/screenshot_2018-07-06_22-21-33.png differ diff --git a/assets/screenshot_2018-07-06_22-45-48.png b/assets/screenshot_2018-07-06_22-45-48.png new file mode 100644 index 0000000..bad4d1c Binary files /dev/null and b/assets/screenshot_2018-07-06_22-45-48.png differ diff --git a/assets/screenshot_2018-07-06_22-47-06.png b/assets/screenshot_2018-07-06_22-47-06.png new file mode 100644 index 0000000..9af69bd Binary files /dev/null and b/assets/screenshot_2018-07-06_22-47-06.png differ diff --git a/assets/screenshot_2018-07-06_22-47-27.png 
b/assets/screenshot_2018-07-06_22-47-27.png new file mode 100644 index 0000000..0353005 Binary files /dev/null and b/assets/screenshot_2018-07-06_22-47-27.png differ diff --git a/assets/screenshot_2018-07-06_22-47-40.png b/assets/screenshot_2018-07-06_22-47-40.png new file mode 100644 index 0000000..29c122f Binary files /dev/null and b/assets/screenshot_2018-07-06_22-47-40.png differ diff --git a/assets/screenshot_2018-07-06_22-50-32.png b/assets/screenshot_2018-07-06_22-50-32.png new file mode 100644 index 0000000..8f6f6bb Binary files /dev/null and b/assets/screenshot_2018-07-06_22-50-32.png differ diff --git a/assets/screenshot_2018-08-07_10-37-15.png b/assets/screenshot_2018-08-07_10-37-15.png new file mode 100644 index 0000000..3d96f17 Binary files /dev/null and b/assets/screenshot_2018-08-07_10-37-15.png differ diff --git a/assets/screenshot_2018-08-07_10-48-01.png b/assets/screenshot_2018-08-07_10-48-01.png new file mode 100644 index 0000000..2eec1ce Binary files /dev/null and b/assets/screenshot_2018-08-07_10-48-01.png differ diff --git a/assets/screenshot_2018-08-09_22-44-00.png b/assets/screenshot_2018-08-09_22-44-00.png new file mode 100644 index 0000000..22ce245 Binary files /dev/null and b/assets/screenshot_2018-08-09_22-44-00.png differ diff --git a/assets/screenshot_2018-08-09_22-44-27.png b/assets/screenshot_2018-08-09_22-44-27.png new file mode 100644 index 0000000..6c0ad48 Binary files /dev/null and b/assets/screenshot_2018-08-09_22-44-27.png differ diff --git a/assets/screenshot_2018-08-09_23-07-43.png b/assets/screenshot_2018-08-09_23-07-43.png new file mode 100644 index 0000000..1a96750 Binary files /dev/null and b/assets/screenshot_2018-08-09_23-07-43.png differ diff --git a/assets/screenshot_2018-08-10_22-48-58.png b/assets/screenshot_2018-08-10_22-48-58.png new file mode 100644 index 0000000..d1b658e Binary files /dev/null and b/assets/screenshot_2018-08-10_22-48-58.png differ diff --git a/assets/screenshot_2018-08-10_23-40-38.png b/assets/screenshot_2018-08-10_23-40-38.png new file mode 100644 index 0000000..ec68185 Binary files /dev/null and b/assets/screenshot_2018-08-10_23-40-38.png differ diff --git a/assets/screenshot_2018-08-10_23-52-45.png b/assets/screenshot_2018-08-10_23-52-45.png new file mode 100644 index 0000000..8434344 Binary files /dev/null and b/assets/screenshot_2018-08-10_23-52-45.png differ diff --git a/assets/screenshot_2018-08-11_09-03-52.png b/assets/screenshot_2018-08-11_09-03-52.png new file mode 100644 index 0000000..b488715 Binary files /dev/null and b/assets/screenshot_2018-08-11_09-03-52.png differ diff --git a/assets/screenshot_2018-08-11_23-07-03.png b/assets/screenshot_2018-08-11_23-07-03.png new file mode 100644 index 0000000..0e45845 Binary files /dev/null and b/assets/screenshot_2018-08-11_23-07-03.png differ diff --git a/assets/screenshot_2018-08-11_23-07-13.png b/assets/screenshot_2018-08-11_23-07-13.png new file mode 100644 index 0000000..5755251 Binary files /dev/null and b/assets/screenshot_2018-08-11_23-07-13.png differ diff --git a/assets/screenshot_2018-08-11_23-15-15.png b/assets/screenshot_2018-08-11_23-15-15.png new file mode 100644 index 0000000..cf8d441 Binary files /dev/null and b/assets/screenshot_2018-08-11_23-15-15.png differ diff --git a/assets/screenshot_2018-08-11_23-27-18.png b/assets/screenshot_2018-08-11_23-27-18.png new file mode 100644 index 0000000..edce8a4 Binary files /dev/null and b/assets/screenshot_2018-08-11_23-27-18.png differ diff --git a/assets/screenshot_2018-08-11_23-53-04.png 
b/assets/screenshot_2018-08-11_23-53-04.png new file mode 100644 index 0000000..8c77ce8 Binary files /dev/null and b/assets/screenshot_2018-08-11_23-53-04.png differ diff --git a/assets/screenshot_2018-08-11_23-55-06.png b/assets/screenshot_2018-08-11_23-55-06.png new file mode 100644 index 0000000..a8a1d7c Binary files /dev/null and b/assets/screenshot_2018-08-11_23-55-06.png differ diff --git a/assets/screenshot_2018-08-11_23-56-57.png b/assets/screenshot_2018-08-11_23-56-57.png new file mode 100644 index 0000000..6b49cfc Binary files /dev/null and b/assets/screenshot_2018-08-11_23-56-57.png differ diff --git a/assets/screenshot_2018-08-12_00-02-16.png b/assets/screenshot_2018-08-12_00-02-16.png new file mode 100644 index 0000000..b114607 Binary files /dev/null and b/assets/screenshot_2018-08-12_00-02-16.png differ diff --git a/assets/screenshot_2018-08-12_15-12-58.png b/assets/screenshot_2018-08-12_15-12-58.png new file mode 100644 index 0000000..223c494 Binary files /dev/null and b/assets/screenshot_2018-08-12_15-12-58.png differ diff --git a/assets/screenshot_2018-08-12_20-35-27.png b/assets/screenshot_2018-08-12_20-35-27.png new file mode 100644 index 0000000..9035cc6 Binary files /dev/null and b/assets/screenshot_2018-08-12_20-35-27.png differ diff --git a/assets/screenshot_2018-08-15_14-49-29.png b/assets/screenshot_2018-08-15_14-49-29.png new file mode 100644 index 0000000..085f7a1 Binary files /dev/null and b/assets/screenshot_2018-08-15_14-49-29.png differ diff --git a/assets/screenshot_2018-08-15_15-00-49.png b/assets/screenshot_2018-08-15_15-00-49.png new file mode 100644 index 0000000..c33c36e Binary files /dev/null and b/assets/screenshot_2018-08-15_15-00-49.png differ diff --git a/assets/screenshot_2018-08-15_22-42-20.png b/assets/screenshot_2018-08-15_22-42-20.png new file mode 100644 index 0000000..c33c36e Binary files /dev/null and b/assets/screenshot_2018-08-15_22-42-20.png differ diff --git a/assets/screenshot_2018-08-15_22-42-36.png b/assets/screenshot_2018-08-15_22-42-36.png new file mode 100644 index 0000000..011fba0 Binary files /dev/null and b/assets/screenshot_2018-08-15_22-42-36.png differ diff --git a/assets/screenshot_2018-08-16_21-44-19.png b/assets/screenshot_2018-08-16_21-44-19.png new file mode 100644 index 0000000..93e0ed4 Binary files /dev/null and b/assets/screenshot_2018-08-16_21-44-19.png differ diff --git a/assets/screenshot_2018-08-17_13-44-49.png b/assets/screenshot_2018-08-17_13-44-49.png new file mode 100644 index 0000000..9573a80 Binary files /dev/null and b/assets/screenshot_2018-08-17_13-44-49.png differ diff --git a/assets/screenshot_2018-08-22_22-15-39.png b/assets/screenshot_2018-08-22_22-15-39.png new file mode 100644 index 0000000..f9f3f06 Binary files /dev/null and b/assets/screenshot_2018-08-22_22-15-39.png differ diff --git a/assets/screenshot_2018-09-02_10-17-16.png b/assets/screenshot_2018-09-02_10-17-16.png new file mode 100644 index 0000000..5451dae Binary files /dev/null and b/assets/screenshot_2018-09-02_10-17-16.png differ diff --git a/assets/screenshot_2018-09-02_10-45-29.png b/assets/screenshot_2018-09-02_10-45-29.png new file mode 100644 index 0000000..8b297b0 Binary files /dev/null and b/assets/screenshot_2018-09-02_10-45-29.png differ diff --git a/assets/screenshot_2018-09-02_10-55-16.png b/assets/screenshot_2018-09-02_10-55-16.png new file mode 100644 index 0000000..cf437d2 Binary files /dev/null and b/assets/screenshot_2018-09-02_10-55-16.png differ diff --git a/assets/screenshot_2018-09-02_11-38-06.png 
b/assets/screenshot_2018-09-02_11-38-06.png new file mode 100644 index 0000000..e50d69a Binary files /dev/null and b/assets/screenshot_2018-09-02_11-38-06.png differ diff --git a/assets/screenshot_2018-09-02_11-39-40.png b/assets/screenshot_2018-09-02_11-39-40.png new file mode 100644 index 0000000..f07535f Binary files /dev/null and b/assets/screenshot_2018-09-02_11-39-40.png differ diff --git a/assets/screenshot_2018-09-02_11-41-01.png b/assets/screenshot_2018-09-02_11-41-01.png new file mode 100644 index 0000000..9e9ad44 Binary files /dev/null and b/assets/screenshot_2018-09-02_11-41-01.png differ diff --git a/assets/screenshot_2018-09-02_16-19-21.png b/assets/screenshot_2018-09-02_16-19-21.png new file mode 100644 index 0000000..e1afd04 Binary files /dev/null and b/assets/screenshot_2018-09-02_16-19-21.png differ diff --git a/assets/screenshot_2018-09-02_16-19-40.png b/assets/screenshot_2018-09-02_16-19-40.png new file mode 100644 index 0000000..0e5d1a7 Binary files /dev/null and b/assets/screenshot_2018-09-02_16-19-40.png differ diff --git a/assets/screenshot_2018-09-10_11-15-33.png b/assets/screenshot_2018-09-10_11-15-33.png new file mode 100644 index 0000000..af78ba4 Binary files /dev/null and b/assets/screenshot_2018-09-10_11-15-33.png differ diff --git a/assets/screenshot_2018-09-10_11-58-53.png b/assets/screenshot_2018-09-10_11-58-53.png new file mode 100644 index 0000000..043825e Binary files /dev/null and b/assets/screenshot_2018-09-10_11-58-53.png differ diff --git a/assets/screenshot_2018-09-10_12-02-58.png b/assets/screenshot_2018-09-10_12-02-58.png new file mode 100644 index 0000000..76fd3f8 Binary files /dev/null and b/assets/screenshot_2018-09-10_12-02-58.png differ diff --git a/assets/screenshot_2018-09-10_12-14-41.png b/assets/screenshot_2018-09-10_12-14-41.png new file mode 100644 index 0000000..0ac864e Binary files /dev/null and b/assets/screenshot_2018-09-10_12-14-41.png differ diff --git a/assets/screenshot_2018-09-10_12-54-56.png b/assets/screenshot_2018-09-10_12-54-56.png new file mode 100644 index 0000000..f7ce6d4 Binary files /dev/null and b/assets/screenshot_2018-09-10_12-54-56.png differ diff --git a/assets/screenshot_2018-09-10_13-27-02.png b/assets/screenshot_2018-09-10_13-27-02.png new file mode 100644 index 0000000..a468be9 Binary files /dev/null and b/assets/screenshot_2018-09-10_13-27-02.png differ diff --git a/assets/screenshot_2018-09-10_14-23-45.png b/assets/screenshot_2018-09-10_14-23-45.png new file mode 100644 index 0000000..6be2def Binary files /dev/null and b/assets/screenshot_2018-09-10_14-23-45.png differ diff --git a/assets/screenshot_2018-09-16_13-57-58.png b/assets/screenshot_2018-09-16_13-57-58.png new file mode 100644 index 0000000..afaa964 Binary files /dev/null and b/assets/screenshot_2018-09-16_13-57-58.png differ diff --git a/assets/screenshot_2018-09-16_13-58-12.png b/assets/screenshot_2018-09-16_13-58-12.png new file mode 100644 index 0000000..326f1d2 Binary files /dev/null and b/assets/screenshot_2018-09-16_13-58-12.png differ diff --git a/assets/screenshot_2018-09-16_16-47-15.png b/assets/screenshot_2018-09-16_16-47-15.png new file mode 100644 index 0000000..2f49a61 Binary files /dev/null and b/assets/screenshot_2018-09-16_16-47-15.png differ diff --git a/assets/screenshot_2018-09-16_17-03-45.png b/assets/screenshot_2018-09-16_17-03-45.png new file mode 100644 index 0000000..464752a Binary files /dev/null and b/assets/screenshot_2018-09-16_17-03-45.png differ diff --git a/assets/screenshot_2018-09-16_17-05-43.png 
b/assets/screenshot_2018-09-16_17-05-43.png new file mode 100644 index 0000000..7b12581 Binary files /dev/null and b/assets/screenshot_2018-09-16_17-05-43.png differ diff --git a/assets/screenshot_2018-09-16_17-07-30.png b/assets/screenshot_2018-09-16_17-07-30.png new file mode 100644 index 0000000..7b12581 Binary files /dev/null and b/assets/screenshot_2018-09-16_17-07-30.png differ diff --git a/assets/screenshot_2018-09-16_17-07-49.png b/assets/screenshot_2018-09-16_17-07-49.png new file mode 100644 index 0000000..b343508 Binary files /dev/null and b/assets/screenshot_2018-09-16_17-07-49.png differ diff --git a/assets/screenshot_2018-09-16_17-54-10.png b/assets/screenshot_2018-09-16_17-54-10.png new file mode 100644 index 0000000..c8f2db9 Binary files /dev/null and b/assets/screenshot_2018-09-16_17-54-10.png differ diff --git a/assets/screenshot_2018-09-16_17-58-56.png b/assets/screenshot_2018-09-16_17-58-56.png new file mode 100644 index 0000000..431db98 Binary files /dev/null and b/assets/screenshot_2018-09-16_17-58-56.png differ diff --git a/assets/screenshot_2018-09-16_19-28-37.png b/assets/screenshot_2018-09-16_19-28-37.png new file mode 100644 index 0000000..14d6922 Binary files /dev/null and b/assets/screenshot_2018-09-16_19-28-37.png differ diff --git a/assets/screenshot_2018-11-08_18-49-33.png b/assets/screenshot_2018-11-08_18-49-33.png new file mode 100644 index 0000000..0705bb7 Binary files /dev/null and b/assets/screenshot_2018-11-08_18-49-33.png differ diff --git a/assets/screenshot_2018-11-08_20-38-52.png b/assets/screenshot_2018-11-08_20-38-52.png new file mode 100644 index 0000000..fa3bb0f Binary files /dev/null and b/assets/screenshot_2018-11-08_20-38-52.png differ diff --git a/assets/screenshot_2018-11-08_21-52-37.png b/assets/screenshot_2018-11-08_21-52-37.png new file mode 100644 index 0000000..8c8fa30 Binary files /dev/null and b/assets/screenshot_2018-11-08_21-52-37.png differ diff --git a/assets/screenshot_2018-11-08_23-04-36.png b/assets/screenshot_2018-11-08_23-04-36.png new file mode 100644 index 0000000..c92abc6 Binary files /dev/null and b/assets/screenshot_2018-11-08_23-04-36.png differ diff --git a/assets/screenshot_2018-11-16_16-42-50.png b/assets/screenshot_2018-11-16_16-42-50.png new file mode 100644 index 0000000..ec268b1 Binary files /dev/null and b/assets/screenshot_2018-11-16_16-42-50.png differ diff --git a/assets/screenshot_2018-11-17_22-49-49.png b/assets/screenshot_2018-11-17_22-49-49.png new file mode 100644 index 0000000..3f8a196 Binary files /dev/null and b/assets/screenshot_2018-11-17_22-49-49.png differ diff --git a/assets/screenshot_2018-11-17_22-51-15.png b/assets/screenshot_2018-11-17_22-51-15.png new file mode 100644 index 0000000..2d5998d Binary files /dev/null and b/assets/screenshot_2018-11-17_22-51-15.png differ diff --git a/assets/screenshot_2018-11-18_06-44-19.png b/assets/screenshot_2018-11-18_06-44-19.png new file mode 100644 index 0000000..bcd0ceb Binary files /dev/null and b/assets/screenshot_2018-11-18_06-44-19.png differ diff --git a/assets/screenshot_2018-11-18_06-44-59.png b/assets/screenshot_2018-11-18_06-44-59.png new file mode 100644 index 0000000..9cd5b14 Binary files /dev/null and b/assets/screenshot_2018-11-18_06-44-59.png differ diff --git a/assets/screenshot_2018-11-18_11-42-34.png b/assets/screenshot_2018-11-18_11-42-34.png new file mode 100644 index 0000000..ab4370b Binary files /dev/null and b/assets/screenshot_2018-11-18_11-42-34.png differ diff --git a/assets/screenshot_2018-11-18_15-22-09.png 
b/assets/screenshot_2018-11-18_15-22-09.png new file mode 100644 index 0000000..0d1c7d6 Binary files /dev/null and b/assets/screenshot_2018-11-18_15-22-09.png differ diff --git a/assets/screenshot_2018-11-18_15-25-42.png b/assets/screenshot_2018-11-18_15-25-42.png new file mode 100644 index 0000000..687c9ad Binary files /dev/null and b/assets/screenshot_2018-11-18_15-25-42.png differ diff --git a/assets/screenshot_2018-11-18_15-51-47.png b/assets/screenshot_2018-11-18_15-51-47.png new file mode 100644 index 0000000..9a6495a Binary files /dev/null and b/assets/screenshot_2018-11-18_15-51-47.png differ diff --git a/assets/screenshot_2018-11-18_19-51-31.png b/assets/screenshot_2018-11-18_19-51-31.png new file mode 100644 index 0000000..ec00be7 Binary files /dev/null and b/assets/screenshot_2018-11-18_19-51-31.png differ diff --git a/assets/screenshot_2018-11-18_20-04-17.png b/assets/screenshot_2018-11-18_20-04-17.png new file mode 100644 index 0000000..5a79dbb Binary files /dev/null and b/assets/screenshot_2018-11-18_20-04-17.png differ diff --git a/assets/screenshot_2018-11-18_20-05-10.png b/assets/screenshot_2018-11-18_20-05-10.png new file mode 100644 index 0000000..189595e Binary files /dev/null and b/assets/screenshot_2018-11-18_20-05-10.png differ diff --git a/assets/screenshot_2018-11-19_08-44-35.png b/assets/screenshot_2018-11-19_08-44-35.png new file mode 100644 index 0000000..d109d86 Binary files /dev/null and b/assets/screenshot_2018-11-19_08-44-35.png differ diff --git a/assets/screenshot_2018-11-19_09-00-39.png b/assets/screenshot_2018-11-19_09-00-39.png new file mode 100644 index 0000000..8c61afb Binary files /dev/null and b/assets/screenshot_2018-11-19_09-00-39.png differ diff --git a/assets/screenshot_2018-11-19_09-00-59.png b/assets/screenshot_2018-11-19_09-00-59.png new file mode 100644 index 0000000..2696764 Binary files /dev/null and b/assets/screenshot_2018-11-19_09-00-59.png differ diff --git a/assets/screenshot_2020-08-23_13-09-51.png b/assets/screenshot_2020-08-23_13-09-51.png new file mode 100644 index 0000000..9cb4c65 Binary files /dev/null and b/assets/screenshot_2020-08-23_13-09-51.png differ diff --git a/assets/screenshot_2020-08-23_13-10-03.png b/assets/screenshot_2020-08-23_13-10-03.png new file mode 100644 index 0000000..9cb4c65 Binary files /dev/null and b/assets/screenshot_2020-08-23_13-10-03.png differ diff --git a/assets/screenshot_2020-08-23_13-25-57.png b/assets/screenshot_2020-08-23_13-25-57.png new file mode 100644 index 0000000..f8ec715 Binary files /dev/null and b/assets/screenshot_2020-08-23_13-25-57.png differ diff --git a/assets/screenshot_2020-08-23_15-08-28.png b/assets/screenshot_2020-08-23_15-08-28.png new file mode 100644 index 0000000..6ba871c Binary files /dev/null and b/assets/screenshot_2020-08-23_15-08-28.png differ diff --git a/assets/screenshot_2020-08-23_15-16-46.png b/assets/screenshot_2020-08-23_15-16-46.png new file mode 100644 index 0000000..08ef4c7 Binary files /dev/null and b/assets/screenshot_2020-08-23_15-16-46.png differ diff --git a/assets/screenshot_2020-08-23_15-19-56.png b/assets/screenshot_2020-08-23_15-19-56.png new file mode 100644 index 0000000..0434a52 Binary files /dev/null and b/assets/screenshot_2020-08-23_15-19-56.png differ diff --git a/assets/screenshot_2020-08-23_15-24-47.png b/assets/screenshot_2020-08-23_15-24-47.png new file mode 100644 index 0000000..0876af0 Binary files /dev/null and b/assets/screenshot_2020-08-23_15-24-47.png differ diff --git a/assets/screenshot_2020-08-23_17-04-45.png 
b/assets/screenshot_2020-08-23_17-04-45.png new file mode 100644 index 0000000..c86344e Binary files /dev/null and b/assets/screenshot_2020-08-23_17-04-45.png differ diff --git a/assets/screenshot_2020-08-23_19-10-05.png b/assets/screenshot_2020-08-23_19-10-05.png new file mode 100644 index 0000000..49b2790 Binary files /dev/null and b/assets/screenshot_2020-08-23_19-10-05.png differ diff --git a/assets/screenshot_2020-08-23_19-10-46.png b/assets/screenshot_2020-08-23_19-10-46.png new file mode 100644 index 0000000..1958fc2 Binary files /dev/null and b/assets/screenshot_2020-08-23_19-10-46.png differ diff --git a/assets/screenshot_2020-08-23_19-10-49.png b/assets/screenshot_2020-08-23_19-10-49.png new file mode 100644 index 0000000..eb70e2f Binary files /dev/null and b/assets/screenshot_2020-08-23_19-10-49.png differ diff --git a/cloud.org b/cloud.org new file mode 100644 index 0000000..d91759e --- /dev/null +++ b/cloud.org @@ -0,0 +1,1904 @@ +* Introduction to Cloud Infrastructure Technologies +LFS151.x + +Cloud was used to refer to the internet. Now it refers to "remote systems" which you can use. +Cloud computing is the use of on demand network accessible pool of remote resources (networks, servers, storage, applications, services) + +Clouds can be private (your own datacenter, managed by you or externally - walmart has a largest private cloud in the world - you'd generally use openstack to manage it), public cloud (aws), hybrid cloud (you use aws to augment your internal cloud etc) + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 18:18:25 +[[file:assets/screenshot_2018-05-23_18-18-25.png]] +** Virtualization + +It is the act of creating a virtual (rather than an actual version of some computer hardware/operating systems/storage devices/other computer resources + +Virtual Machines are created on top of a "hypervisor" - which runs on top of the Host Machine's OS +Hypervisors allow us to emulate hardware like CPU, disk, network, memory etc - it also allows us to install Guest Machines on it + +What we would do is, install Linux on a bare-metal and after setting up the Hypervisor, create multiple Guest Machines with Windows (for eg) + +Some of the hypervisors are: + - KVM + - VMWare + - Virtualbox + +Hypervisors can be hardware or software. Most recent CPUs now have hardware virtualizatioin support. 
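+
+A quick way to check whether a host CPU advertises hardware virtualization is to look for the ~vmx~ (Intel) or ~svm~ (AMD) flags (a rough sketch, run on the host you want to virtualize on):
+
+#+begin_src
+$ egrep -c '(vmx|svm)' /proc/cpuinfo
+# a non-zero count means the CPU exposes hardware virtualization extensions;
+# 0 means hypervisors like KVM would have to fall back to much slower emulation
+#+end_src
+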
+
+** KVM
+"Kernel-based Virtual Machine (KVM) is a full virtualization solution for Linux on x86 hardware"
+
+It's a part of the Linux kernel and has been ported to some other architectures as well now
+
+It basically provides "an API" that other tools like QEMU can use to build virtual machines on the Linux kernel
+
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 18:09:10
+[[file:assets/screenshot_2018-05-23_18-09-10.png]]
+
+KVM exposes the /dev/kvm interface, using which an external userspace program (eg QEMU) can emulate an OS (like Windows, Solaris, Linux etc)
+You can have applications running on QEMU, which passes the syscalls from the application to the host kernel via the /dev/kvm interface
+
+VirtualBox, by Oracle, is an alternative to the KVM+QEMU combo (KVM can be used by other host software too)
+
+** Vagrant
+
+Using VMs gives us numerous benefits:
+ - reproducible environments - which can be deployed/shared easily
+ - managing (and isolating) different projects with a sandbox env for each
+
+Vagrant allows us to automate the setup of one or more VMs by providing an end-to-end lifecycle CLI tool
+It has support for multiple providers (hypervisors) - and even for docker now
+
+*** Vagrantfile
+We have to write a Vagrantfile to describe our VM and vagrant will do the rest
+
+#+begin_src ruby
+# -*- mode: ruby -*-
+# vi: set ft=ruby :
+
+Vagrant.configure(2) do |config|
+  # Every Vagrant development environment requires a box. You can search for
+  # boxes at https://atlas.hashicorp.com/search.
+  config.vm.box = "centos/7"
+
+  # Create a private network, which allows host-only access to the machine
+  # using a specific IP.
+  config.vm.network "private_network", ip: "192.168.33.10"
+
+  # config.vm.synced_folder "../data", "/vagrant_data"
+
+  config.vm.provider "virtualbox" do |vb|
+    # Customize the amount of memory on the VM:
+    vb.memory = "1024"
+  end
+
+  config.vm.provision "shell", inline: <<-SHELL
+    yum install vim -y
+  SHELL
+end
+#+end_src
+
+The vagrant command can do operations like ssh, up, destroy etc
+
+*** Boxes
+
+You need to provide an image in the Vagrantfile (like the FROM directive in a Dockerfile) which can be used to instantiate the machines.
+In the example above, we have used ~centos/7~
+Atlas is a central repository of the base images.
+
+A box is the actual image of the VM that you built from the base image (after following the steps in your Vagrantfile) - it is analogous to the docker image that you build from a Dockerfile
+
+Like docker images, you can version these images/boxes
+
+*** Vagrant providers
+These are the underlying hypervisors used - like KVM, virtualbox (which is the default), now docker etc
+
+*** Synced folders
+These allow you to "mount" a local directory on the host into the VM
+
+*** Provisioning
+
+Provisioners allow us to install software, make configuration changes etc after the machine is booted - this is part of the ~vagrant up~ process.
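+
+Since provisioning runs during ~vagrant up~ (and can be re-triggered later), a typical CLI cycle looks roughly like this (a sketch, not an exhaustive list of subcommands):
+
+#+begin_src
+$ vagrant up          # create the VM and run the provisioners
+$ vagrant ssh         # log into the guest
+$ vagrant provision   # re-run the provisioners on a running VM
+$ vagrant halt        # shut the VM down
+$ vagrant destroy     # delete the VM entirely
+#+end_src
+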
You can use provisioners like Ansible, shell, chef, docker etc + +So you need to provide 2 things to vagrant - provider and provisioner (eg: kvm, ansible respectively) + +*** Plugins +Vagrant has plugins as well to extend functionality + + +** Infrastructure as a service + +IaaS is the on-demand supply of physical and virtual computing resources (storage, network, firewall, load balancers etc) +IaaS uses some form of hypervisor (eg kvm, vmware etc) + +AWS uses the Xen hypervisor +Google uses the KVM hypervisor + +When you request an EC2 instance for eg, AWS creates a virtual machine using some hypervisor and then gives you access to that VM + + +You can become a IaaS provider yourself using OpenStack. +OpenStack very modular and has several components for different virtual components etc: + - keystone + - for identity, token, catalog etc + - nova + - for compute resources + - with Nova we can select an underneath Hypervisor depending on the requirement, which can be either libvirt (qemu/KVM), Hyper-V, VMware, XenServer, Xen via libvirt. + - horizon + - web based UI + - neutron + - network as a service +etc + +** Platform as a service + +PaaS is a class of services that allow users to develop, run and manage applications without worrying about the underlying infrastructure. + +Eg: openshift origin, deis, heroku etc +PaaS can be deployed on top of IaaS or independently on VMs, baremetal and containers - I.e the "thing" powering your applications (which you don't have to worry about) can be a VM (via IaaS or otherwise), baremetal servers, containers etc + +*** Cloud Foundry +It is an open source PaaS that provides a choice of clouds, developer frameworks, application servers +It can be deployed on premise, or on an IaaS like aws, openstack etc + +There are many commercial cloud foundry prooviders as well - like IBM bluemix etc + +CF gives you: + - application portability + - application auto scaling + - dynamic routing + - centralized logging + - security + - support for different IaaS + +CF runs on top of VMs from existing IaaS like aws, openstack etc +CF uses some VMs as components VMs - these run all the different components of CF to provide different PaaS functionalities +and Application VMs - these run ~Garden containers~ inside which your application is deployed + +CF has 3 major components: + - Bosh + - it is the system orchestration to configure VMs into well defined state thru manifest files. It provisions VMs automatically (sitting on top of IaaS - like terraform), then using the manifest files, it configures CF on them + + - cloud controller + - it runs the applications and other processes on provisioned VMs + - Go router + - it routes the incoming traffic to the right place (cloud controller or application) + +CF uses ~buildpacks~ that provide the framework and runtime support for the applications. There are buildpacks for Java, Python, Go etc + +You can have custom buildpacks as well. 
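+
+As a rough sketch (the app name and buildpack URL here are only for illustration), pushing an app and pinning a specific buildpack looks like:
+
+#+begin_src
+$ cf push myapp                                                       # let CF auto-detect the buildpack
+$ cf push myapp -b https://github.com/cloudfoundry/python-buildpack   # pin a buildpack explicitly
+#+end_src
+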
When an application is pushed to CF: + - it detects the required buildpack and installs it on the droplet execution agent (DEA) where the application needs to run + - the droplet containers OS-specific pre-built root filesystem called stack, the buildpack and source code of the application + - the droplet is then given to the application VM (diego cell) which unpacks, compiles and runs it + +So, (everything) -> dea -> droplet -> VMs + +The application runs a container using the ~Garden runtime~ +It supports running docker images as well, but it uses the garden runtime to run them + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 19:36:21 +[[file:assets/screenshot_2018-05-23_19-36-21.png]] + + +The messaging layer is for the component VMs to communicate with each other internally thru HTTP/HTTPS protocols. Note it uses ~consul~ for long-lived control data, such as IP addresses of component VMs + +Hasura (and heroku etc) are PaaS too just like cloudfoundry + +The difference b/w CF and hasura is that hasura uses k8s to manage your applications, CF has it's own thing :top: +(bosh, garden etc) + +CF can be integrated with CI/CD pipelines as well + +*** Open Shift +This is an open source PaaS solution by RedHat. +OpenShift v3 uses Docker and Kubernetes underneath, (so hasura is just a commercial provider of openshift like platform at this point) + +It can be deployed on CoreOS + +There are 3 different paths for OpenShift as offered by RedHat + - openshift online + - you deploy your applications on openshift cluster managed by redhat and pay for the usage + - openshift dedicated + - you get your own dedicated openshift cluster managed by RH + - openshift enterprise + - you can create your own private PaaS on your hardware (on premise installation of OpenShift?) + +Upsteam development of openshift happens on GH and it is called as OpenShift Origin + +OpenShift Origin is like open source Hasura + +OSv3 (the latest Open Shift) has a framework called ~source to image~ which creates Docker images from the source code directly +OSv3 integrates well with CI/CD etc + +OS Enterprise gives you GUI, access control etc + +RedHat and Google are collaborating to offer OS Dedicated on Google Cloud Platform + +OS creates an internal docker registry and pushes docker images of your application to it etc + +The pitch for OS is that: + - it enables developers to be more efficient and productive by allowing them to quickly develop, host and scale apps in the cloud via a user friendly UI and out of the box features like logging, security etc + +It's written in Go + +*** Heroku + +It is a fully managed container based PaaS company. Heroku supports many languages like Python, Go, Clojure etc +To use Heroku, you have to follow the Heroku way of doing things: + - mention the commands used in a Procfile + - mention the steps to execute to compile/built the app using a buildpack + - the application is fetched from GH/dropbox/via API etc and the buildpack is run on the fetched application code + - The runtime created by running the buildpack on the code, (fetching the dependency, configuring variables etc) is called a ~slug~ + - you can add ~add-ons~ that provide more functionality like logging, monitoring etc + - a combination of slug, configuration variables, and add-ons is referred to as a release, on which we can perform upgrade or rollback. + +Each process is run in a virtualized UNIX container called a ~dyno~. Each dyno gets its own ephemeral storage. 
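+
+A minimal sketch of the pieces mentioned above (the process names and commands are illustrative): a Procfile declaring the app's processes, and the CLI call that scales the dynos running them:
+
+#+begin_src
+# Procfile
+web: gunicorn app:app
+worker: python worker.py
+#+end_src
+
+#+begin_src
+$ heroku ps:scale web=2 worker=1   # run 2 web dynos and 1 worker dyno
+#+end_src
+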
The ~dyno manager~ manages the dynos across all applications running on Heroku + +Individual components of an application can be scaled up or down using dynos. + +The UI can be used to manage the entire application (create, release, rollback etc) + +Hasura is just like Heroku (heroku uses the the git push to a custom remote too) - just using k8s + + +*** Deis + +It is like OpenShift, just it does not have a GUI but a cli only. It helps you make the k8s experience smoother, in that it manages (like a PaaS should), the release, logging, rollback, CI/CD etc + +CoreOS is a lightweight OS to run just containers. It supports the Docker and rkt container runtimes right now. + +Overview of Deis: +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 20:16:23 +[[file:assets/screenshot_2018-05-23_20-16-23.png]] + + +The data plane is where the containers run - the router mesh routes traffic to the data plane +There is also a control plane that is for admins, which accepts logs etc, and can be accessed via the deis api +the router mesh again routes deis api traffic to the control plane + +Deis can deploy applications from Dockerfiles, docker images, heroku buildpacks (which was what we used at appknox) + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 20:20:21 +[[file:assets/screenshot_2018-05-23_20-20-21.png]] + +The deis workflow :top: + +~etcd~ is a distributed key-value database which contains the IPs of the containers so that it can route the traffic it gets from the router to the right container + + +** Containers + +Containers are "operating system level virtualization" that provide us with "isolated user-space instances" (aka containers) +These user-space instances have the application code, required dependencies for our code, the required runtime to run the application etc + +*** The Challenge + +Often our applications have specific dependency requirements. And they need to run on a myriad of machines + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 20:26:11 +[[file:assets/screenshot_2018-05-23_20-26-11.png]] + +As developers, we don't want to worry about this mess. We want our application to work irrespective of the underlying platform and other applications that might be running on the platform. Also, we want them to run efficiently, using only the resources they need and not bogging down the host machines + +Docker allows us to bundle our applications with all it's dependencies "in a box" - basically a binary that has a isolated worldview, is agnostic of other things running on the host machine. The binary cannot run directly, it needs to be run a runtime (eg docker runtime, rkt runtime, garden runtime etc) + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 20:29:19 +[[file:assets/screenshot_2018-05-23_20-29-19.png]] + +The container will run identically on all the platforms - the runtime will make sure of that + +This container (having our application and it's dependencies and it's runtime) is called the image. +A running instance of the image is referred to as a container. +We can spin multiple containers (objects) from the image (class) +The image is built using a dockerfile + +Dockerfile -> docker image -> docker containers + +The docker container runs as a normal process on the host's kernel + +*** Building blocks + +The Linux kernel provides all the building blocks for the containers. 
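+
+You can poke at these building blocks on any Linux host, even without a container runtime installed (a sketch; ~unshare~ is part of util-linux):
+
+#+begin_src
+$ ls /proc/self/ns                              # the namespaces the current process lives in
+$ cat /proc/cgroups                             # cgroup controllers known to this kernel
+$ sudo unshare --fork --pid --mount-proc bash   # a shell in fresh pid+mount namespaces; ps shows it as PID 1
+#+end_src
+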
The runtimes are just opinionated APIs around the base kernel API
+
+**** Namespaces
+A namespace wraps a particular system resource (like the network, or process IDs) in an abstraction and makes it appear to the processes within the namespace that they have their own isolated instance of the global resource.
+The resources that are namespaced are:
+ - pid - gives each namespace its own PID number space - each container can have its own PID 1
+ - net - provides each namespace with its own network stack - each container has its own IP address
+ - mnt - provides each namespace with its own view of the filesystem
+ - ipc - provides each namespace with its own interprocess communication
+ - uts - provides each namespace with its own hostname and domainname
+ - user - provides each namespace with its own user and group id number spaces.
+   - *the root user inside a namespace is not the root user on the host machine*
+
+
+**** cgroups
+
+Control groups are used to organize processes hierarchically and *distribute system resources along the hierarchy in a controlled and configurable manner* - so cgroups are mostly about distributing system resources among the namespaced processes above
+
+The following ~cgroups~ are available for linux:
+ - blkio - to share block io
+ - cpu - to share compute
+ - cpuacct
+ - cpuset
+ - devices
+ - freezer
+ - memory
+
+
+**** Union filesystem
+
+The union filesystem allows files and directories of separate filesystems (aka layers) to be transparently overlaid on each other to create a new virtual filesystem
+
+An image used in docker is made of multiple layers which are merged to create a read-only filesystem. The container gets a read-write layer on top, which is ephemeral and local to the container
+
+*** Container runtimes
+
+Namespaces and cgroups have existed in the kernel for a long time. The runtimes are just wrappers around those APIs and provide an easy workflow to work with them - in some talks, developers show how you can play with the APIs directly
+
+Like POSIX, which is a specification of the API surface that the kernel should provide for applications so that they are portable, for containers we have the OCI - the Open Container Initiative (under the auspices of The Linux Foundation)
+
+The OCI governance body has specifications to create standards on operating system processes and application containers.
+
+This is so that there is cross compatibility between different container runtimes and operating systems - no vendor lock-in etc. Also, the same containers can then be run under different runtimes (this is how CF runs docker containers under its garden runtime)
+
+~runC~ is a CLI tool for spawning and running containers according to the specifications.
+
+Docker uses the ~runC~ container runtime - so docker is fully compatible with the OCI specification
+Docker uses the ~containerd~ daemon to control ~runC~ containers
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 22:16:00
+[[file:assets/screenshot_2018-05-23_22-16-00.png]]
+
+
+Docker CLI -> docker engine -> ~containerd~ -> ~runC~
+
+Another container runtime is ~rkt~ (rock-it)
+~rkt~ does not support OCI containers currently - but it is in the pipeline - https://github.com/rkt/rkt/projects/4
+But ~rkt~ can run docker images.
+
+Since version 1.11, the Docker daemon no longer handles the execution of containers itself. Instead, this is now handled by containerd.
More precisely, the Docker daemon prepares the image as an Open Container Image (OCI) bundle and makes an API call to containerd to start the OCI bundle. containerd then starts the container using runC. +~rkt~ takes the same docker image runs it without bundling it as an OCI bundle. + +~rkt~ can run "App Container Images" specified by the "App Container Specification" + +*** Containers vs VMs + +A VM runs on top of a Hypervisor, which emulates the different hardware - CPU, memory etc +Between an application and a Guest OS, there are multiple layers - guest OS, hypervisor, host OS + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 22:26:44 +[[file:assets/screenshot_2018-05-23_22-26-44.png]] + +In contrast to this, Containers run directly as processes on top of the host OS. This helps containers get near native performance and we can have a large number of containers running on a single host machine + +*** Docker runtime + +Docker follows a client-server architecture. +The docker client connects to the docker server (docker host) and executes the commands + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 22:28:58 +[[file:assets/screenshot_2018-05-23_22-28-58.png]] + +Docker Inc. has multiple products: + + - Docker Datacenter + - Docker Trusted Registry + - Universal Control Plane + - Docker Cloud + - Docker Hub + + +** Operating systems for containers + +Ideally, it would be awesome if our OSes just live to run our containers - we can rid them of all the packages and services that aren't used in running containers + +Once we remove the packages which are not required to boot the base OS and run container-related services, we are left with specialized OSes, which are referred to as *Micro OSes for containers.* + +Examples: + - atomic host (redhat) + - coreos + - ubuntu snappy + - vmware photon + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-23 22:54:47 +[[file:assets/screenshot_2018-05-23_22-54-47.png]] + + + +*** Atomic Host +Atomic Host is a lightweight operating system that is based on the fedora, centos, rhel family +It is a sub-project of Project Atomic - which has other projects like Atomic Registry etc + +With Atomic Host we can develop, run, administer and deploy containerized applications + +Atomic Host, though having a minimal base OS, has systemd and journald. +It is built on top of the following: + - rpm-ostree + - one cannot manage individual packages as there is no ~rpm~ + - to get any required service, you have to start a respective container + - there are 2 bootable, immutable and versioned filesystems - one used to boot the system, other used to fetch updates from upstream. Both are managed using rpm-ostree + - systemd + - to manage system services for atomic host + - docker + - AH supports docker as a container runtime (which means ~runC~ as the container runtime) + - k8s + - with k8s, we can create a cluster of AH to run applications at scale + +We have the usual docker command, but we get the ~atomic~ command to control the base host OS. AH can be managed using Cockpit which is another project under Project Atomic + +*** CoreOS + +CoreOS is a minimal operating system for running containers. It supports ~docker~ (so basically, ~runC~) and ~rkt~ container runtimes. 
It is designed to operate in a cluster mode + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-24 21:27:56 +[[file:assets/screenshot_2018-05-24_21-27-56.png]] + +Note how the CoreOS machines are all connected to etcd and are controlled via a local machine + +It is available on most cloud providers + +CoreOS does not have any package managers and the OS is treated as a single unit. There are 2 root partitions, active and passive. +When the system is booted with the active partition, the passive partition can be used to download the latest updates. + +Self updates are also possible, and the ops team can choose specific release channels to deploy and control the application with update strategies. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-24 21:35:38 +[[file:assets/screenshot_2018-05-24_21-35-38.png]] + +*booted off of partition A + +:top: +partition A was initially active and updates were getting installed on partition B. After the reboot, partition B becomes active and updates are installed on partition A, if available. + +CoreOS is built on top of the following: + +**** docker/rkt +CoreOS supports both these runtimes + +**** etcd +It is a distributed key-value pair, used to save the cluster state, configuration etc + +**** systemd + +It is an ~init~ system which helps us manage services on Linux + +example +#+begin_src +[Unit] +Description=My Service +Required=docker.service +After=docker.service + +[Service] +ExecStart=/usr/bin/docker run busybox /bin/sh -c "while true; do echo foobar; sleep 1; done" + +[Install] +WantedBy=multi-user.target +#+end_src + +**** fleet + +It is used to launch applications using the ~systemd~ unit files. With ~fleet~ we can treat the CoreOS cluster as a single ~init~ system + +#+begin_src +[Unit] +Description=My Advanced Service +After=etcd.service # we need etcd and docker to be running before our service starts +After=docker.service + +[Service] +TimeoutStartSec=0 +ExecStartPre=-/usr/bin/docker kill apache1 +ExecStartPre=-/usr/bin/docker rm apache1 # do this before running my service command +ExecStartPre=/usr/bin/docker pull coreos/apache +ExecStart=/usr/bin/docker run --name apache1 -p 8081:80 coreos/apache /usr/sbin/apache2ctl -D FOREGROUND # our service command +ExecStartPost=/usr/bin/etcdctl set /domains/example.com/10.10.10.123:8081 running # do this after running our service command +ExecStop=/usr/bin/docker stop apache1 # run this to stop the service +ExecStopPost=/usr/bin/etcdctl rm /domains/example.com/10.10.10.123:8081 + +[Install] +WantedBy=multi-user.target +#+end_src + +CoreOS has a registry product (like docker registry) called Quay. Their enterprise k8s solution is called ~Tectonic~ + + +*** VMware Photon + +Photon OS is a minimal Linux container host developer by VMware and runs blazingly fast on VMware platforms + +It supports the docker, rkt and pivotal garden runtimes and is available on aws ec2, gcp, azure +it has a yum compatible package manager as well + +It is written in Python + Shell + +*** RancherOS + +It is a 20MB linux distribution that runs docker containers. +It runs directly on top of the linux kernel. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-24 22:02:30 +[[file:assets/screenshot_2018-05-24_22-02-30.png]] + +RancherOS runs 2 instances of the docker daemon. 
The first one is used to run the system containers (~dhcp~, ~udev~ etc) +The 2nd is used to run user level containers + +How about running docker containers in rancheros with gvisor? +The system containers will be run with gvisor + +We can use rancher to setup k8s and swarm clusters. It is the most "minimal" of all minimal OSes + + + +** Container Orchestration + +Running containers on a single host is okay, not so fancy. What we want is to run containers at scale. The problems we want to solve are: + + - who can bring multiple hosts together and make them part of a cluster - so that the hosts are abstracted away and all you have is a pool of resources + - who will schedule the containers to run on specific hosts + - who will connect the containers running on different hosts so that they can access each other? + - who will take care of the storage for the containers when they run on the hosts + + +Container orchestration tools solve all these problems - along with different plugins +CO is an umbrella term that encompasses container scheduling and cluster management. +Container Scheduling - which host a container or group of containers should be deployed +Cluster Management Orchestrater - manages the underlying nodes - add/delete them etc + +Some options: + - docker swarm + - k8s + - mesos marathon + - cloud foundry diego + - amazon ecs + - azure container service + + + +*** Docker Swarm + +It is a native CO tool from Docker, Inc +It logically groups multiple docker engines to create a virtual engine on which we can deploy and scale applications + +The main components of a swarm cluster are: + - swarm manager + - it accepts commands on behalf of the cluster and takes the scheduling decisions. One or more nodes can be configured as managers (they work in active/passive modes) + - swarm agents + - they are the hosts which run the docker engine and participate in the cluster + - swarm discovery service + - docker has a project called ~libkv~ which abstracts out the various kv stores and provides a uniform interface. It supports etcd, consul, zookeeper currently + - overlay networking + - swarm uses ~libnetwork~ to configure the overlay network and employs ~VxLAN~ between different hosts + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-24 22:30:20 +[[file:assets/screenshot_2018-05-24_22-30-20.png]] + +**** Features + - it is compatible with docker tools and api so the workflow is the same + - native support to docker networking and volumes + - built in scheduler supporting flexible scheduling + - filters: + - node filters (constraint, health) + - container filters (affinity, dependency, port) + - strategies + - spreak + - binpack + - random + - can scale to 1000 nodes with 50K containers + - supports failover, HA + - pluggable scheduler architecture, which means you can use mesos or k8s as scheduler + - node discovery can be done via - hosted discovery service, etcd/consul, static file + + + +**** Docker Machine +It helps us configure and manage local or remote docker engines - we can start/inspect/stop/restart a managed host, upgrade the docker client and daemon, configure a docker client to talk to our host etc + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-24 22:38:12 +[[file:assets/screenshot_2018-05-24_22-38-12.png]] + + +It has drivers for ec2, google cloud, vagrant etc. 
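+
+A rough sketch of driving Docker Machine (the machine name ~dev~ is made up):
+
+#+begin_src
+$ docker-machine create --driver virtualbox dev   # boot a VM and install a docker engine in it
+$ docker-machine ls                               # list the engines docker-machine knows about
+$ eval "$(docker-machine env dev)"                # point the local docker client at that engine
+$ docker ps                                       # this now talks to the engine inside 'dev'
+#+end_src
+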
We can also add existing docker engines to docker machines + +Docker machine can also be used to configure a swarm cluster + +**** Docker Compose + +It allows us to define and run multi-container applications on a single host thru a configuration file. + +*** Kubernetes + +It is an open source project for automating deployment, operations, scaling of containerized applications. +It was the first project to be accepted as the hosted project of Cloud Native Computing Foundation - CNCF + +It currently only supports docker as the container runtime, in the future it plans to add support for ~rkt~ + + +The high level architecture of k8s is: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-24 23:00:31 +[[file:assets/screenshot_2018-05-24_23-00-31.png]] + + +Each node is labeled as a minion. It has a docker engine on which runs the kubelet, (the k8s "agent"), cAdvisor (?), a proxy and one or more pods. In the pods run the containers. + +Then we have the management guys - including the scheduler, replication controller, authorization/authenticator, rest api etc + +**** Key Components of the k8s architecture + +***** Cluster +The cluster is a group of nodes (virtual or physical) and other infra resources that k8s uses to run containerized applications + +***** Node +The node is a system on which pods are scheduled and run. The node runs a daemon called kubelet which allows communication with the master node + +***** Master +The master is a system that takes pod scheduling decisions and manages replication and manager nodes + +***** Pod +The Pod is a co-located (located on the same place/node) group of containers with shared volumes. It is the smallest deployment unit in k8s. A pod can be created independently but its recommended to use replication controller + +***** Replication controller +It manages the lifecycle of the pods +Makes sure there are the desired number of pods running at any given point of time + +Example of replication controller: + +#+begin_src +apiVersion: v1 +kind: ReplicationController +metadata: + name: fronted +spec: + replicas: 2 + templates: + metadata: + labels: + app: dockerchat + tier: frontend + spec: + containers: + - name: chat + image: nkhare/dockerchat:v1 + env: + - name: GET_HOSTS_FROM + value: dns + ports: + - containerPort: 5000 +#+end_src + +***** Replica sets + - "they are the next generation replication controller" + - RS supports set-based selector requirements, whereas RC only supports equality based selector support + - Deployments + - with k8s 1.2, a new object has been added - deployment + - it provides declarative updates for pods and RSes + - you need to describe the desired state in a deployment object and the deployment controller will change the actual state to the desired state at a controlled rate for you + - can be used to create new resources, replace existing ones by new ones etc + +A typical use case: +- Create a Deployment to bring up a Replica Set and Pods. +- Check the status of a Deployment to see if it succeeds or not. +- Later, update that Deployment to recreate the Pods (for example, to use a new image). +- Rollback to an earlier Deployment revision if the current Deployment isn’t stable. 
+- Pause and resume a Deployment + +Example deployment: + +#+begin_src +apiVersion: extensions/v1beta1 +kind: Deployment +metadata: + name: nginx-deployment +spec: + replicas: 3 + template: + metadata: + labels: + app: nginx + spec: + containers: + - name: nginx + image: nginx:1.7.9 + ports: + - containerPort: 8 +#+end_src + +***** Service +A service groups sets of pods together and provides a way to refer to them from a single static IP address and the corresponding DNS name. + +Example of a service file +#+begin_src +apiVersion: v1 +kind: Service +metadata: + name: frontend + labels: + app: dockchat + tier: frontend +spec: + type: LoadBalancer + ports: + - port: 5000 + selector: + app: dockchat + tier: frontend +#+end_src + +***** Label +It is an arbitrary key-value pair attached to a resource like pod, replication controller etc +in the eg above :top:, we defined ~app~ and ~tier~ as the labels. + +***** Selector +They allow us to group resources based on labels. +In the example above, the ~frontend~ service will select all ~pods~ which have the labels app=dockerchat, tier=frontend + +***** Volume +The volume is an external filesystem or storage which is available to pods. They are built on top of docker volumes + +***** Namespace +It adds a prefix to the name of the resources so that it is easy to distinguish between different projects, teams etc in the same cluster. + +**** Features + - placement of containers based on resource requirements and other constraints + - horizontal scaling thru cli and ui, auto-scaling based on cpu load as well + - rolling updates and rollbacks + - supports multiple volume plugins like gcp/aws disk, ceph, cinder, flocker etc to attach volumes to pods - recall the pods share volumes + - self healing by restarting the failed pods etc + - secrets management + - supports batch execution + - packages all the necessary tools - orchestration, service discovery, load balancing + + + +*** Apache Mesos + +Mesos is a higher level orchestrater, in that it can be used to treat a cluster of nodes as one big computer, and allows us to run different applications on the pool of nodes (eg: hadoop, jenkins, web server etc) + +It has functionality that crosses between IaaS and PaaS + +**** Mesos Components + +***** Master + +It is the "brain" of the mesos cluster and provides a single source of truth. +The master node mediates between schedulers and slaves. +The slaves advertise their resources to the master node. The master node forwards them to the scheduler who gives the task to run on the slave to the master and the master forwards them to the slave. + + +Slaves -> master -> scheduler -> master -> slaves + +***** Slaves +They execute the tasks send by the scheduler via the master node + +***** Frameworks + +They are distributed applications that solve a particular use case. +It consists of a scheduler and an executor. The scheduler gets a resource offer, which it can accept or decline. The executor accepts the jobs from the scheduler and runs them + +Examples of existing frameworks - hadoop, spark, auror etc. +We can create our own too + +***** Executor + +They are used to run jobs on slaves. 
+ +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-28 23:42:00 +[[file:assets/screenshot_2018-05-28_23-42-00.png]] + + +**** Features + - it can scale to 10k nodes + - uses Zookeeper for fault tolerant replicated master and slaves + - provides support for docker containers + - allows multi-resource scheduling (memory, CPU, disk, ports) + - has Java, Python, C++ APIs for developing new parallel applications + +Mesos ships binaries for different components (master, slaves, frameworks etc) which can be used to create the mesos cluster + +**** Mesosphere + +Mesophere offers a commercial solution on top of Apache Mesos called Mesosphere Enterprise DC/OS (which is also opensource). +It comes with the Marathon framework which has the features: + - HA + - supports docker natively + - logging, web api etc + +DC/OS stands for Datacenter Operating System +It treats the entire data center as one large computer + +DC/OS is in Python! + +**** DC/OS + +It has 2 main components: + +***** DC/OS Master +It has the following components +- mesos master process + - similar to the *master* component in mesos +- mesos dns + - provides service discovery within the cluster, so applications and services within the cluster can reach each other +- marathon + - framework which comes by default with dc/os and provides the _init system_ +- zookeeper + - high performance coordination service that manages dc/os services +- admin router + - open source nginx config which provides central authentication and proxy to dc/os services within the cluster + +***** DC/OS Agent +- Memos agent process + - runs the ~mesos-slave~ process, which is similar to the *slave* component of Mesos +- Mesos containerization + - lightweight containerization and resource isolation of executors + - uses cgroups and namespaces +- docker container + - provides support for launching tasks that contain docker images + +*** Hashicorp Nomad +It is a cluster manager and resource scheduler which is distributed, HA and sclaes to thousands of nodes. + +Designed to run micro services and batch jobs. +Supports different workloads, like containers, VMs, individual applications + +Since it is a Go project, it is distributed as a single statically linked binary and runs in a server and client mode. + +To submit a job, use the HCL - hashicorp configuration language. 
+Once submitted, Nomad will find available resources in the cluster and run it to maximize the resource utilization + +Sample job file: + +#+begin_src +# Define the hashicorp/web/frontend job +job "hashicorp/web/frontend" { + # Run in two datacenters + datacenters = ["us-west-1", "us-east-1"] + + # Only run our workload on linux + constraint { + attribute = "$attr.kernel.name" + value = "linux" + } + + # Configure the job to do rolling updates + update { + # Stagger updates every 30 seconds + stagger = "30s" + + # Update a single task at a time + max_parallel = 1 + } + + # Define the task group + group "frontend" { + # Ensure we have enough servers to handle traffic + count = 10 + + task "web" { + # Use Docker to run our server + driver = "docker" + config { + image = "hashicorp/web-frontend:latest" + } + + # Ask for some resources + resources { + cpu = 500 + memory = 128 + network { + mbits = 10 + dynamic_ports = ["http"] + } + } + } + } +} +#+end_src + +This would start 10 containers from the ~hashicord/web-frontend:latest~ docker image + +**** Features +- Supports both cluster management and resource scheduling +- supports multiple workloads like containers, VMs, unikernels, individual applications (like apache mesos) +- ships with just one binary +- has multi-datacenter support and multi-region support - we can run nomad client/server running in different clouds to get a logical nomad cluster +- bin packs applications onto servers to achieve high resource utilization + + + +*** Amazon ECS +It is a service provided by AWS offering container orchestration and management on top of EC2 instances using Docker. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-29 00:25:59 +[[file:assets/screenshot_2018-05-29_00-25-59.png]] + +Some of the components: + +- Cluster +Logical grouping of container instances on which tasks are placed + +- container instances +It is an ec2 instance with ecs agent that has been registered with a cluster + +- task definition +Specifies the blueprint of an application which consists of 1 or more containers + +- scheduler +Places tasks on the container instances. + +- service +1 or more instances of tasks to run depending on task definition. +- task +Running container instance from the task definition +- container +Docker container created from task definition + + +The features are that it fits in nicely with the rest of AWS ecosystem - cloudwatch for monitoring, cloudtrail for logging etc +It can support 3rd party schedulers like mesos marathon + + +*** Google Container Engine + +GKE is a fully managed solution for running k8s on google cloud. +It is like coreos' tectonic, redhat's openorigin and aws' ecs - a fully managed k8s service + +*** Azure container service + +It simplifies creation, configuration, management of containerized applications on microsoft azure +it uses either apache mesos, or docker swarm to orchestrate applications which are containerized using the docker runtime. + +** Unikernels + +One trend has been towards removing unnecessary components from our servers. We have VMs, then we moved to containers which removed a lot of redundant components - like a kernel etc (instead replying on the host os). Then we have mini oses like coreos' container linux which was made specially to run containers. 
+One extreme trend here is to strip the host OS down even further, to "specialized, single address space machine images" constructed solely to run our application - we don't even need containers now; our application is linked directly with the kernel code
+
+The single address space executable has both the application and kernel components.
+It only contains:
+- the application code
+- configuration files of the application
+- user space libraries needed by the application (like the tcp stack maybe)
+- application runtime (like the jvm for eg)
+- system libraries of the unikernel which allow it to communicate with the hypervisor
+
+x86 has protection rings - the kernel runs on ~ring0~ with maximum privileges, the application on ~ring3~ with least privileges.
+
+With unikernels, everything runs on ~ring0~
+Unikernels run directly on top of hypervisors like Xen, or even on bare metal
+
+Example of a unikernel created by the ~mirage compiler~
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-29 00:50:16
+[[file:assets/screenshot_2018-05-29_00-50-16.png]]
+
+
+Benefits include faster boot times, maximized resource utilization, and an easily reproducible VM environment.
+It is also a safer environment, since the attack surface has been reduced
+
+*** Implementations
+There are many implementations, mainly falling in 2 categories:
+- specialized and purpose-built unikernels
+  - they utilize all the modern features of the hardware, and aren't POSIX compliant. Eg: ING, Clive, MirageOS
+
+- Generalized "fat" unikernels
+  - they run unmodified applications, which makes them fat
+  - examples: OSv, BSD Rump kernels
+
+*** Docker and Unikernels
+
+In Jan 2016, Docker acquired Unikernel Systems, making unikernels 1st class citizens of the Docker ecosystem.
+Both containers and unikernels can coexist on the same host, and they can be managed by the same docker tooling
+
+Unikernel technology powers Docker Engine on top of Alpine Linux on Mac and Windows, with their default hypervisors (the xhyve Virtual Machine and Hyper-V VM, respectively)
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-29 01:00:26
+[[file:assets/screenshot_2018-05-29_01-00-26.png]]
+
+*** Microservices
+
+They are small independent processes that communicate with each other to form complex applications, using language-agnostic APIs.
+
+The components (aka services) are highly decoupled, do one thing and do it well (the UNIX philosophy), and allow a modular approach
+
+In monoliths, the entire application is built as a single code base (repo). In microservices, the application is built with many small components (services) which communicate with each other using rest apis/grpc etc
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-05-31 09:55:27
+[[file:assets/screenshot_2018-05-31_09-55-27.png]]
+
+The graphic above is very insightful :top:
+
+**** Advantages
+- Microservices allow us to scale just the components that are currently under load, and not have to redeploy the entire thing with each scale-up
+
+- Microservices allow us to be polyglots - we can choose any language to write any service in.
It doesn't matter because the application talks to one another using APIs etc + +- Cascading failure is averted - if one instance of a service fails, others continue to work etc +There is a catch however, if all instances of a particular service is slow to respond/fails, it can lead to cascading failures + +- Services can be reused as well + +**** Disadvantages + +- Need to find the right "size" of the services + +- Deploying a monolith is simple, deploying a microservice is tricky, needs a orchestrater like k8s + +- End to end testing becomes difficult because of so many moving parts + +- managing databases can be difficult + +- monitoring can be a little difficult + + +*** Containers as a Service + +There are companies providing containers on demand. +A CaaS sits between IaaS and PaaS. Examples include - Docker Universe Control Plane. +When you demand containers, you don't have to worry about infrastructure, you get it on demand for you. Also, your application gets deployed and taken care of. This is close the AWS Lambda, the serverless tech where you don't have to worry about infra/deployment too + +Examples of CaaS providers: +- OpenStack Magnum +- Docker Universe Control Plane + +Other solutions that enable CaaS are (or what CaaS uses under the hood): +- Kubernetes +- aws ecs +- tectonic (coreos' k8s as a service) +- rancher (the miniOS) + + +**** Docker Universe Control Plane + +UCP provides a centralized container management solution (on premise or on the cloud) + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-03 23:10:15 +[[file:assets/screenshot_2018-06-03_23-10-15.png]] + +UCP works with Docker Machine, Docker Swarm etc so adding and removing nodes is simpler. UCP also integrates well with auth mechanisms like LDAP/AD so one can define fine grained policies and roles. + +***** Features +- works with existing auth tools like LDAP, LDAD, SSO with docker trusted registry +- works with existing docker tools like DM, DC +- has a web gui +- provides a centralized container management solution + + +**** Docker Datacenter + +Docker has another project - Docker Datacenter, which builds on top of UCP and DTR. It is hosted completely behind a firewall. + +It leverages Docker Swarm under the hood and has out of the box logging, monitoring etc + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-03 23:15:35 +[[file:assets/screenshot_2018-06-03_23-15-35.png]] + +In UCP, we can define and start containers on demand using the UI which also has logs etc for that container +The developers can deploy applications without worrying about the infra etc. + + +**** Project Magnum + +Openstack Magnum is a CaaS service built on top of OpenStack. + +We can choose the underlying orchestrater from k8s, swarm, or mesos. + +There are 2 components to Magnum +- Server API +Magnum Client talks to this service +- Conductor +It manages the cluster lifecycle thru *Heat* and communicates with the *container orchestration enginer* (COE) + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-03 23:22:06 +[[file:assets/screenshot_2018-06-03_23-22-06.png]] + +***** Magnum Components +- Bay +Bays are the nodes on which the COE sets up the cluster +- BayModels +Stores metadata information about Bays like COE, keypairs, images to use etc +- COE +It is the container orchestrater used by magnum. Currently supported orchestrater are k8s, swarm, or mesos. 
COEs can run on top of Micro OSes like CoreOS, atomic host etc +- Pod +A colocated (located close to one another) group of application containers that run with a shared context +- Service +An abstraction which defines a logical set of pods and a policy to access them +- Replication Controller +Abstraction for managing a group of pods to ensure that a specified number of resources are running +- Container +The docker container running the actual user application + +***** Features +- Magnum offers an asynchronous API that is compatible with Keystone +- multi-tenant +- HA, scalable + + +*** Software defined networking and networking for containers + +SDN decouples the network control layer from the layer which controls the traffic. This allows SDN to program the control layer to create custom rules in order to meet the networking requirements. + +**** SDN Architecture +In networking (in general), we have 3 planes defined: + +- Data Plane (aka Forwarding Plane) +It is responsible for handling data packets and applying actions to them based on rules in lookup-tables + +- Control Plane +It is tasked with calculating and programming the actions for the data plane. Here the forwarding decisions are made and the services (Quality of Service, VLANs) are implemented + +- Management Plane +Here we can configure and manage the network devices + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-03 23:39:40 +[[file:assets/screenshot_2018-06-03_23-39-40.png]] + + +**** Activities performed by a network device + +The network device facilitates the + +Every network device performs 3 activities: +- Ingress and Egress packets +Done at the lowest level, which decides what to do with the ingress packets - weather to forward them or not (based on the forwarding tables) - these activities are mapped as data plane activities. +All routers, switches, modem etc are part of this plane + +- Collect, Process, Manage network information +Using this information, the network device makes the forwarding decisions, which the data plane follows. + +- Monitor and manage the network +We can use the tools available in the Management Plane to manage the network devices eg: SNMP - Simple Network Management Protocol + +In SDN, we decouple the control plane with the data plane. The control plane has a centralized view of the overall network, which allows it to create forwarding tables that the data plane uses to manage network traffic (the network devices follow the rules) + +The Control Plane has APIs that take requests from applications to configure the network. After preparing the desired state of the network, it is given to the Data Plane (aka Forwarding Plane) using a well defined protocol like OpenFlow + +We can also use tools like Ansible, Chef etc to configure SDN. + + +**** Introduction to Networking for Containers +Containers need to be connected on the same host and across hosts. The host kernel uses the Network Namespace feature of the kernel to isolate the network from one container to another on the host. +The network namespace can be shared as well + +On a single host, we can use the *Virtual Ethernet (~vnet~)* feature with Linux bridging to give a virtual network interface to each container and assign it an IP address - this is as if each container was a full machine on itself and had an ethernet port, a unique IP on the network. 
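+
+Roughly what a runtime does for single-host networking can be reproduced by hand with ~ip~ (a sketch; the names and addresses are made up, and ~docker0~ is the bridge Docker creates by default):
+
+#+begin_src
+$ sudo ip netns add c1                                     # a network namespace standing in for a container
+$ sudo ip link add veth-host type veth peer name veth-c1   # a veth pair: two connected virtual NICs
+$ sudo ip link set veth-c1 netns c1                        # push one end into the "container"
+$ sudo ip link set veth-host master docker0                # attach the host end to the bridge
+$ sudo ip link set veth-host up
+$ sudo ip netns exec c1 ip addr add 172.17.0.10/16 dev veth-c1
+$ sudo ip netns exec c1 ip link set veth-c1 up             # the namespace now has its own IP on the bridge
+#+end_src
+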
+ +With kernel features like IPVLAN, we can configure each container to have a unique and world-wide routable IP address - this will allow us to do connect containers on any host to other containers on any other host. It is a recent feature and the support for different container runtimes is coming soon. + +Currently, to do multi-host networking with containers, the most common solution is to use some form of *Overlay* network driver, which encapsulates the Layer 2 traffic to a higher layer. (Recall Layer 2 is the layer that transfers frames (which are the smallest units of bits on L2) between hosts on the same local network) + +Examples of this type of implementation are Docker Overlay Driver, Flannel, Weave etc. Project Calico allows multi-host networking at Layer 3 using BGP - border gateway protocol. L3 is basically the IP layer + +**** Container Networking Standards + +There are 2 different standards for container networking +- The Container Network Model - CNM +Docker Inc. Is the primary driver for this networking model. It is implemented using the libnetwork model which has the follow utilizations: + - Null + - the NOOP (no operation) implementation of the driver. It is used when no networking is required + - Bridge + - It provides a Linux specific bridging implementation based on Linux Bridge + - Overlay + - It provides a multi host communication over VXLAN (recall the technology where we encapsulated the L3 packets etc) + - Remote + - it does not provide a driver. Instead, it provides a means of supporting drivers over a remote transport, by which we can write 3rd party drivers + +- Container Networking Interface +CoreOS is the primary driver for this networking model. It is derived from the ~rkt~ networking proposal. k8s supports CNI. + + +**** Service Discovery + +Service discovery is important for when we do multi host networking, and some form of orchestration. +SD is a mechanism by which processes can find each other automatically and talk. For k8s, it means mapping a container name with it's IP address so that we can access the container without worrying about it's exact location (the node on which it resides etc) + +SD has 2 parts: +- Registration + - The k8s scheduler registers the container in some key value store like etcd, consul etc when the container starts/stops +- lookup + - services and applications use lookup to get the address of a container so that they can connect to it. This is done using some form of DNS. In k8s, the DNS resolves the requests by looking up the entries in the key-value store used for *Registration*. Examples of such DNS services include SkyDNS, Mesos-DNS etc + + +*** Networks on docker +We can list the available networks on docker (on our PC) with: + +$ docker network ls + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 19:31:15 +[[file:assets/screenshot_2018-06-10_19-31-15.png]] + +Here, we have 3 different types of networks, ~bridge~, ~host~, ~none~ + +**** Bridge + +The bridge is a hardware device that has 2 ports - it passes traffic from one network on to another network. It operates on L2, which means it transfers the frames (which make up the packages) using the MAC address of the attached devices + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 19:33:57 +[[file:assets/screenshot_2018-06-10_19-33-57.png]] + +Here, when we say bridge, we mean a virtual bridge - actually, a virtual switch. +A networking switch has multiple ports (interfaces. 
Very loosely, an interface is just a source/sink of frames). It can accept frames from one port and forward it to the destination device attached on another port. +It operates on L2, which means it uses hardware address (like MAC) to identify the destination device + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 19:38:23 +[[file:assets/screenshot_2018-06-10_19-38-23.png]] + + +A classical use is to connect 2 devices to an external network say (like the internet) + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 19:39:34 +[[file:assets/screenshot_2018-06-10_19-39-34.png]] + +Ref: http://www.innervoice.in/blogs/2012/08/16/understanding-virtual-networks-the-basics/ + +So, here the bridge is actually a virtual switch like the one above. +It routes traffic from our container to a physical interface on the host + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 19:41:06 +[[file:assets/screenshot_2018-06-10_19-41-06.png]] + +By default, docker uses a virtual bridge called ~docker0~ and all the containers get an IP from this bridge. Docker uses a virtual ethernet ~vnet~ to create 2 virtual interfaces, one end of which is attached to the container and the other end to the ~docker0~ bridge. + +When we install docker on a single host, the ~docker0~ interface is created: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 19:48:33 +[[file:assets/screenshot_2018-06-10_19-48-33.png]] + +Creating a new container and looking at it's interfaces shows us it got an IP from the range 172.17.0.0/16, catered by the bridge network. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 19:49:52 +[[file:assets/screenshot_2018-06-10_19-49-52.png]] + + + +Getting more info about the bridge network is easy: + +#+begin_src json +$ docker network inspect bridge +[ + { + "Name": "bridge", + "Id": "6f30debc5baff467d437e3c7c3de673f21b51f821588aca2e30a7db68f10260c", + "Scope": "local", + "Driver": "bridge", + "EnableIPv6": false, + "IPAM": { + "Driver": "default", + "Options": null, + "Config": [ + { + "Subnet": "172.17.0.0/16" + } + ] + }, + "Internal": false, + "Containers": { + "613f1c7812a9db597e7e0efbd1cc102426edea02d9b281061967e25a4841733f": { + "Name": "c1", + "EndpointID": "80070f69de6d147732eb119e02d161326f40b47a0cc0f7f14ac7d207ac09a695", + "MacAddress": "02:42:ac:11:00:02", + "IPv4Address": "172.17.0.2/16", + "IPv6Address": "" + } + }, + "Options": { + "com.docker.network.bridge.default_bridge": "true", + "com.docker.network.bridge.enable_icc": "true", + "com.docker.network.bridge.enable_ip_masquerade": "true" + "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0", + "com.docker.network.bridge.name": "docker0", + "com.docker.network.driver.mtu": "1500" + }, + "Labels": {} + } + ] +#+end_src + +***** Creating a new bridge network is simple + +~docker network create --driver bridge mybridge~ + +Now starting a container to use the new bridge is simple also: +~docker run --net=mybridge -itd --name=c2 busybox~ + +Bridge network does not support automatic service discovery, so you have to use the legacy ~--link~ option, which will connect the other container to the same bridge + + +**** NULL + +Null means no networking. If we attach a container to a ~null~ driver, we just get the ~loopback~ (~lo~) interface. 
The container won't be accessible from the outside + +~docker run -it --name=c3 --net=none busybox /bin/sh~ + + +**** Host +If we don't want the container to have a separate network namespace, we can use the ~host~ driver + +~docker run -it --name=c4 --net=host busybox /bin/sh~ +The container will have full access to the host network. + +A container with ~host~ driver has access to all the interfaces on the host machine + + +**** Sharing network namespaces +We can have 2 or more containers share the same network namespaces. This means they be able to refer to each other by referring to ~localhost~ + +Start a container: +~docker run -it --name=c5 busybox /bin/sh~ + +Now, start another container +~docker run -it --name=c6 --net=container:c5 busybox /bin/sh~ + +K8s uses this feature to share the network namespace among all the containers in a pod + +*** Docker Multi-host networking + +Most of multi host networking solutions for docker are based on *Overlay* network. +We encapsulate the container's IP packet, transfer it over the wire, decapsulate it and then forward it to the destination container. + +Examples of projects using the overlay networks are: +- docker overlay driver +- flannel +- waeve + +Calico uses border gateway protocol (BGP) to do IP based routing instead of encapsulation, so it operates on L3 + +**** libnetwork +Docker's implementation of the overlay network driver is in _libnetwork, a built-in VXLAN based overlay network driver_ and the ~libkv~ library. + +To configure the overlay network, we configure a key-value store and connect it with docker engine in each host. +Docker uses ~libkv~ to configure the k-v store which supports etcd, consule and zookeeper as the backend store + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 20:23:39 +[[file:assets/screenshot_2018-06-10_20-23-39.png]] + +Once the k-v store is configured, we can create an overlay network using +~docker network create --driver overlay multi-host-network~ + +~multi-host-network~ is the name of the network we created + + +To create a container which uses the multi-host Overlay network we created, we have to start the container with a command like the following: + +~$ docker run -itd --net=multi-host-network busybox~ + +What happens under the hood is that each packet is encapsulated, sent to the destination host node having the destination container (the k-v store is used to find the IP of the destination host node), which decapsulates it and sends it to the destination container + +When we create a new docker engine on a new host, we can give it the location of the k-v store using + +~docker-machine create -d virtualbox --engine-opt="cluster-store=consul://$(docker-machine ip keystore):8500" --engine-opt="cluster-advertise=eth1:2376" node2~ + +In docker swarm, the central k-v store has been implemented in the swarm core itself, so we don't need to create it outside. +We have to if we don't use swarm and connect the containers directly + +The containers on ~node2~ above, get 2 interfaces (the overlay network, the bridge for connecting to host machine) (each having it's own IP addresses) + +The driver makes sure that only the packets from the overlay network interface are encapsulated (and decapsulated) etc + + +*** Docker networking Plugins + +We implement Docker Remote Driver APIs to write custom network plugins for docker. 
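As a rough sketch of how such a plugin hangs together (this is my recollection of the plugin handshake, so treat the endpoint and payload as assumptions to verify against the Remote Driver docs): the plugin is just an HTTP service that Docker activates by POSTing to ~/Plugin.Activate~, and the plugin replies with the driver APIs it implements. A real network driver would then also serve the ~NetworkDriver.*~ endpoints.

#+begin_src go
package main

import (
	"fmt"
	"net/http"
)

// Minimal handshake sketch: Docker POSTs to /Plugin.Activate and the
// plugin advertises which driver APIs it implements. The listen address
// here is arbitrary; real plugins are discovered via spec files/sockets.
func main() {
	http.HandleFunc("/Plugin.Activate", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"Implements": ["NetworkDriver"]}`)
	})
	http.ListenAndServe(":8080", nil)
}
#+end_src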
+Docker has plugins for network and volumes (so we can use them to provision glusterfs volumes for containers eg) + +Examples: +- weave network plugin +Weave net provides multi-host container networking for docker. + +In Software Defined Networking, we decouple the control plane (use to control the containers etc, do the admin stuff) with the data plane (which has the traffic for our containers) + +** Software defined Storage (ADS) + +Used to manage storage hardware with software. Software can provide different features, like replication, erasure coding, snapshot etc on top of pooled resources + +SDS allows multiple access methods like File, Block, Object + +Examples of software defined storage: +- Ceph +- Gluster +- FreeNAS +- Nexenta +- VMware Virtual SAN + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 20:46:31 +[[file:assets/screenshot_2018-06-10_20-46-31.png]] + +Here, the storage on the different individual hosts has been abstracted away and SDS provices a pool of storage to the containers via the network + + +*** Ceph + +Ceph is a distributed: +- object store + - which means it allows us to store objects like S3 +- block storage + - which allows you to mount ceph as a block device, write a filesystem on it etc. Ceph will automatically replicate the contents on the block etc +- file system + - ceph provides a traditional file system API with POSIX semantics, so you can do open('/path/on/ceph/fs') + + +Minio is also an object store like ceph etc. What it does too is, manage replication and sharding of the objects given to it + + +**** Ceph architecture + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 20:58:49 +[[file:assets/screenshot_2018-06-10_20-58-49.png]] + +***** Reliable Autonomic Distributed Object Store - RADOS + +It is the object store which stores objects. This layer makes sure data is consistent and in a reliable state. +It performs the following operations: +- replication +- failure detection (of a node in a ceph cluster) +- recovery +- data migration +- rebalancing data across cluster nodes + +RADOS has 3 main components: +- Object Storage Device + - user content is written and retrieved using read operations. OSD daemon is typically tied to one physical disk in the cluster +- Ceph Monitors + - responsible for monitoring the cluster state +- Ceph Metadata Server + - needed only by cephFS to store file hierarchy and metadata for files + +***** Librados + +It allows direct access to Rados from languages like C, C++, Python, Java etc. +Ceph Block Device, CephFS are implemented on top o librados + +***** Ceph Block Device +This provides the block interface for Ceph. It allows ceph block devices to be mounted as block devices and used as such + +***** Rados Gateway (RadosGW) +It provices a REST API interface for Ceph, which is compatible with AWS S3 + +***** Ceph File System (CephFS) +It provides a POSIX compliant distributed filesystem on top of Ceph + +**** Advantages of using Ceph +- Open source storage supporting Object, Block and File System storage +- Runs on commodity hardware, without vendor lock in +- Distributed file syste, no single point of failure + + +*** Gluster + +Gluster is a scalable network filesystem which can run on common off-the-shelf hardware. It can be used to create large, distributed storage solutions for media streaming, data analysis and other data-and-bandwith intensive tasks. 
GlusterFS is free and open source + +**** GlusterFS volumes + +We need to start by grouping machines in a trusted pool. Then, we group the directories (called bricks) from those machines in a GlusterFS volume using FUSE (file system in user space). + +So, add machines to form a pool. On the machines, create partitions (aka bricks) and group them to form glusterfs volumes + +GlusterFS supports different kinds of volumes: +- distributed glusterfs volumes +- replicated glusterfs volumes +- distributed replicated glusterfs volumes +- stripped glusterfs volumes +- distributed stripped glusterfs volumes + +GlusterFS does not have a centralized metadata server (unlike HDFS), so no single point of failure. +It uses an elastic hashing algorithm to store files on bricks. + +The GlusterFS volume can be accessed using one of the following methods: +- Native FUSE mount +- NFS (Network File System) +- CIFS (Common Internet File System) + + + +**** Benefits +- supports object, block, filesystem storage +- does not have a metadata server, so no SPOF +- open source, posix compatible, HA by data replication etc + +*** Introduction to storage management for containers + +Containers are ephemeral by nature, so we have to store data outside the container. In a multi-host environment, containers can be scheduled to run on any host. So we need to make sure the volume required by the container is available on the node on which the container is scheduled to run + +We will see how docker uses Docker Volumes to store persistent data. Also, we will look at Docker Volume Plugins to see how it allows vendors to support their storage for docker. + +**** Docker Storage backends + +Docker uses copy-on-write to start containers from images, which means we don't have to copy an image while starting a container. + +Docker supports the following storage backends: +- aufs (another union file system) +- btrfs +- device mapper +- overlay +- vfs (vritual file system) +- zfs + +**** Docker Volumes vs host directory mounts + +You can mount a host directory on a container and store data there. Or you can mount a docker volume container and store data there. They are 2 different approaches. +The first one is not portable, since each host might not have that directory present locally. This is the reason you can't mount host directories in Dockerfiles because dockerfiles are suppose to be portable. + +A better idea is to create a docker volume container and mount that with your container. + +Examples: + +~docker run --volumes-from dbdata -v $(pwd):/backup ubuntu tar cvf /backup/backup.tar /dbdata~ + +Here, we have launched a new container and mounted the volume from ~dbdata~ volume container. We also mounted ~pwd~ which is a local host directory as ~/backup~ in the container. + +Docker volume plugins allow us to use different storage backends as data volume containers - like btrfs etc. Much like k8s allows us to use gluster, ceph etc as volumes. + +**** Docker Volumes +Docker volumes are different from mounted directories on containers. + +A data volume is a specially designated directory within containers that bypasses the union file system. 
+- data volumes can be shared and reused among containers +- changes to data volume are made directly +- changes to data volume won't be included when you update an image + +To create a container with a volume: + +~docker run -d --name web -v /webapp nkhare/webapp~ + +This :top: will create a volume inside the default docker working directory ~/var/lib/docker~ on the host system + +We can create a named volume as well: +~docker volume create --name my-named-volume~ + +This can be later mounted and used + +Mounting a host directory inside a container is simple too +~docker run -d --name web -v /mnt/webapp:/webapp nkhare/webapp~ + +Here, we mount the host's ~/mnt/webapp~ to the ~/webapp~ in the container + +To share persistent data across containers, or share persistent data with a non persistent container, we can use data volume container. + +You can create data volumes: +~docker create -v /data --name dbstore ubuntu~ + +And use them: +~docker run --volumes-from dbstore --name=client1 centos /bin/sh~ +~docker run --volumes-from dbstore --name=client2 centos /bin/sh~ + +Volume plugins supported by docker: +- flocker +- glusterfs +- blockbridge +- emc rex-xray + +If you use gluster volume, you get all the replication, HA etc out of the box + +Let's discuss some of them + +***** Flocker +A flocker docker volume is referred to as a dataset. +Flocker manages docker containers and data volumes together. This makes sure that the volumes follow the containers. + +K8s has this and a lot more out of the box + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-10 23:55:41 +[[file:assets/screenshot_2018-06-10_23-55-41.png]] + + +****** Supported storage options for flocker +- aws ebs +- openstack cinder +- emc scaleio +- vmware vsphere +- netapp ontap + +** DevOps and CI/CD + +In CD, we deploy the entire application/software automatically, provided that all the tests' results and conditions have met the expectations. + +Some of the software used in the CI/CD domain are Jenkins, Drone, Travis and Shippable + + +*** Jenkins +It is one of the most popular tools used for doing any kind of automation. + +It can build freestyle, apache ant, apache maven based projects. +Plugins can be used to extend functionality. Pipelines can be built to implement CD. + +Pipelines can survive jenkins master restarts, are pausable for human approval, are versatile (can fork or join, loop, work in parallel), extensible (can be integrated with other plugins) + +*** Drone +It provides both hosted and on-premise solutions to do CI for projects hosted on Github, BitBucket + +*** Travis CI +It is a hosted, distributed CI solution for projects hosted on Github. 
+ +Configuration is set thru ~.travis.yml~ which defines how our built should be executed step-by-step + +A typical build consists of 2 steps: +- install +- script + - to run the build script + +There are several build options: +- before_install +- install +- before_script +- script +- after_success or after_failure +- before_deploy +- deploy +- after_deploy +- after_script + + +** Tools for cloud infrastructure - configuration management +Configuration Management tools allow us to define the desired state of the systems in an automated way + +*** Ansible +It is by RedHat +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-11 00:16:55 +[[file:assets/screenshot_2018-06-11_00-16-55.png]] + + +The host inventory can be static or dynamic +Ansible Galaxy is a free site for finding, downloading and sharing community developed ansible roles + + +*** Puppet +It runs in a master/slave mode. +We need to install Puppet Agent on each system we want to manage/configure with Puppet. + +Each agent: +- Connects securely to Puppet Master to get the series of instructions in a file referred to as the Catalog File +- Performs operations from the Catalog File to get to the desired state +- Sends back the status to Puppet Master + +Puppet Master can be installed only on *nix systems. It: + +- Compiles the Catalog File for hosts based on the system, configuration, manifest file, etc +- Sends the Catalog File file to agents when they query the master +- Has information about the entire environment, such as host information, metadata like authentication keys, etc +- Gathers the report from each agent and then prepares the overall report + +Centralized reporting needs PuppetDB + + +*** Chef + +It too runs in a client/master model. A client is installed on each host we want to manage. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-11 00:22:33 +[[file:assets/screenshot_2018-06-11_00-22-33.png]] +Apart from chef client and master, we also have chef workstation which is used to: +- develop cookbooks and recipes +- run command line tools +- configure policy, roles etc + + +Of all the above :top:, only Ansible is completely agentless + +** Tools for cloud infrastructure - build and release + +Like we can version control our software, we can codify and version control our infrastructure as well - infrastructure as code + +*** Terraform +It allows us to write infrastructure as code. This allows us to write the same infrastructure everywhere - the code is the different, we have to write it for each provider etc, and we have to make the functionality be the same everywhere, but once we do it, we get the same infrastructure. + +Terraform has providers which understand the underlying VMs, network switches etc as resources +The provider is responsible for exposing the resources which makes terraform agnostic to the underlying platforms. + +A custom provider can be created thru plugins +- IaaS: aws, do, gce, openstack etc +- PaaS: heroku, cloudfoundry +- SaaS: DNSimple + + + +** Tools for cloud infrastructure - key value pair store + +For building any distributed and dynamically scalable environment, we need an endpoint which is a single point of truth. +This means this endpoint must agree on one version of the truth, using consensus etc +Most k-v stores provide rest apis for doing operations like GET, PUT, DELETE etc. 
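As a sketch of what that REST surface looks like from the client's side (the path below mimics etcd's old v2 HTTP API, but take it as an illustrative assumption rather than a reference), plain ~net/http~ is enough:

#+begin_src go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"strings"
)

// Illustrative key-value endpoint; etcd, consul etc. each have their
// own real paths and payload formats.
const key = "http://localhost:2379/v2/keys/feature-flag"

func main() {
	// PUT a value for the key
	req, _ := http.NewRequest(http.MethodPut, key, strings.NewReader("value=on"))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	if _, err := http.DefaultClient.Do(req); err != nil {
		panic(err)
	}

	// GET it back
	resp, err := http.Get(key)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := ioutil.ReadAll(resp.Body)
	fmt.Println(string(body))
}
#+end_src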
+ +Some examples of k-v stores: +- etcd +- consul + + +*** etcd +Etcd is an open source k-v pair storage based on raft consensus algorithm. +It can run in a standalone or cluster mode. It can gracefully handle master election during network partitions, can tolerate machine failures, including the master + +We can also watch on a value of a key, which allows us to do certain operations based on the value changes. + +*** consul +It is distributed, highly-available system which can be used for service discovery, configuration. +Apart from k-v store, it has features like: +- service discovery in conjunction with DNS or HTTP +- health checks for services and nodes +- multi-datacenter support + + +** Tools for cloud infrastructure - image building + +We need an automated way of creating images - either docker images or VM images for different cloud platforms. + +The extremely naive way to do is to create a docker image of a base container, the install the required software and then store the resulting image on disk. This is not scalable. +Or you can use Dockerfile - the docker engine create a container after each command and persists it on the disk + + +We can also use Packer + +*** Packer +Packer is a tool from Hashicorp for creating virtual images for different platforms from configuration files + +Generally, the process of creating virtual images has 3 steps: +- building base image +Has support for aws, do, docker etc + +- provision the base image to do configuration +We need a provisioner like Ansible, chef, puppet, shell etc to do the provisioning + +- post build operations +We can move the image to a central repository etc + +** Tools for cloud infrastructure - debugging, logging etc + +Some of the tools which we use for debugging, logging, monitoring: +- strace +- tcpdump +- gdb +- syslog +- nagios + +Containers have some challenges compared to monitoring/logging/debugging traditional VMs: +- containers are ephemeral +- containers do not have kernel space components + +We want to do MLD from the outside, so as to reduce the footprint of the container. + +Debugging: sysdig +Logging: Docker logging driver +Monitoring: sysdig, cAdvisor (or Heapster which uses cAdvisor underneath), Prometheus, Datadog, new relic + +Docker has commands like ~inspect~, ~logs~ to get insights from containers. +With the docker logging driver, we can forward logs to the corresponding drivers, like syslog, journald, fluentd, awslog, splunk. +Once the logs are saved in a central location, we can use the respective tools to get the insights. + +Docker has monitoring commands like ~docker stats~, ~docker top~ + +*** Sysdig + +It is an open source tool which describes itself as: + +"strace + tcpdump + htop + iftop + lsof + awesome sauce". + +Sysdig inserts a kernel module inside the running linux kernel, this allows it to capture system calls and OS events. + + +*** cAdvisor +It is an open source tool to collect stats from host system and running containers. +It collects, aggregates, processes and exports information about running containers + +You can enable the cAdvisor container like so: + +#+begin_src +sudo docker run --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=8080:8080 --detach=true --name=cadvisor google/cadvisor:latest +#+end_src + +Now you can go to http://host_ip:8080 to view the stats. +cAdvisor also supports exporting the stats to InfluxDB. It also exposes container statistics as prometheus metrics. 
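Because the stats come out in the Prometheus text format, anything that speaks HTTP can consume them. A minimal Go sketch (using the port published by the ~docker run~ above; ~/metrics~ is the standard Prometheus path) that prints the per-container CPU counters:

#+begin_src go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// Scrape cAdvisor's Prometheus endpoint and print the container CPU
// usage counters. In practice Prometheus itself does the scraping.
func main() {
	resp, err := http.Get("http://localhost:8080/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if line := scanner.Text(); strings.HasPrefix(line, "container_cpu_usage_seconds_total") {
			fmt.Println(line)
		}
	}
}
#+end_src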
+ +*** Heapster + +It enables container cluster monitoring and performance analysis. +Heapster collects and interprets various signals, like compute resource usage, lifecycle events, etc., and exports cluster metrics via REST endpoints. + + +*** fluentd +It is an open source data collector + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-11 23:48:34 +[[file:assets/screenshot_2018-06-11_23-48-34.png]] + +It has more than 300 plugins to connect input sources and output sources. It does filtering, buffering, routing as well. + +It is a good replacement for logstash +Fluentd is one of the logging drivers supported by docker + +We can either specify the logging driver for docker daemon or specify it while starting the container + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-11 23:50:28 +[[file:assets/screenshot_2018-06-11_23-50-28.png]] + + + +** Misc notes + +- With the cloud's _pay-as-you-go_ model and _software-defined-everything_ model, starts have a very low barrier to take an enterprise assignment. +- Hybrid model is useful when you want to keep your data on-premise and serve the request from Public clouds. +- you can built docker images using debootstrap, packer, docker built with dockerfile, docker commit + + + diff --git a/design_patterns.org b/design_patterns.org new file mode 100644 index 0000000..acefeca --- /dev/null +++ b/design_patterns.org @@ -0,0 +1,157 @@ +* Design Patterns +Patters allow us to separate the part of the application that changes from the other parts, so that a new change does not break the old stuff. This is done via encapsulation, decoupling, delegation, composition. + +** composition over inheritance +There are some ideas that are widely applicable in software development. One such idea is, favoring composition over inheritance (java style inheritance for example). + +The issue with java style inheritance is, it's rigid; if the parent has a behavior (method), _all_ children of that parent have to have that behavior. + +Consider some class Foo: +#+ATTR_ORG: :width 400 +#+ATTR_HTML: :width 400 +[[file:assets/2022-06-11_12-21-46_screenshot.png]] + + +All the children it has, they must implement both a() and b(). Now, it is possible that even though all the subclasses of the same children, they might have some differences amongst themselves, let’s say some of them have another method c() as well. + +There can be various ways to incorporate that method: + + +*** Using an interface +#+ATTR_ORG: :width 400 +#+ATTR_HTML: :width 400 +[[file:assets/2022-06-11_12-22-02_screenshot.png]] + +This works, but we are losing code reusibility. If there are some multiple subclasses that need a particular type of behaviour, each has to implement it separately. + +*** Adding ~c()~ to parent class - foo +#+ATTR_ORG: :width 400 +#+ATTR_HTML: :width 400 +[[file:assets/2022-06-11_12-22-19_screenshot.png]] + + +We can add that method to parent class, maybe with some default behaviour, and the children can override that with no-op if required. + +This is bad because we broke previously working classes, and also have to remember to overwrite the implementation for all future subclasses. + +*** Using delegation + +We want to separate the behavior that changes from the behavior that remains constant, aka we want to encapsulate the changing behavior into a separate class. + +In this case, we can add an instance variable to the Foo class, that is of the type C. 
+ +Adding an instance variable + +#+ATTR_ORG: :width 400 +#+ATTR_HTML: :width 400 +[[file:assets/2022-06-11_12-23-27_screenshot.png]] + +The type of the instance variable can be an interface. The method c() will just invoke the interface method on the instance variable. To make sure that the instance variable is always set, we can provide 2 constructors, one which accepts the instance variable and the other that sets the default value for it. + +This way, we can even change the behavior of the class at runtime by using getters and setters for that instance variable. + +Here, we are using delegation to delegate the behavior to another class. This gives us code reusibility and also doesn’t break existing code. + +*** Composition + +In instance variable example, we are using composition to compose our parent class with the right behaviors as required. + + +#+begin_src java + +// abstract duck class, with an instance variable +public abstract class Duck { + Flyable fly; + + abstract void squeak(); + + void doFly() { + this.fly.doFly(); + }; +} + + +// interface defining the behavior +public interface Flyable { + void doFly(); +} + + +// slow fly is a kind of a flyable behavior +public class SlowFly implements Flyable { + + @Override + public void doFly() { + System.out.println("made a slow flight"); + } +} + + + +// no fly is another kind of flyable behavior +public class NoFly implements Flyable { + + @Override + public void doFly() { + System.out.println("cannot fly"); + } +} + + +// rubber duck cannot fly, so is composed with nofly +public class RubberDuck extends Duck { + @Override + void squeak() { + System.out.println("rubber duck squeaks"); + } + + public RubberDuck() { + this.fly = new NoFly(); + } + +} + + +// wood duck can fly if it is given a flyable +public class WoodDuck extends Duck { + + public WoodDuck(Flyable f) { + this.fly = f; + } + + public WoodDuck() { + this.fly = new NoFly(); + } + + @Override + void squeak() { + System.out.println("wood duck grrs"); + } +} + + +public class DesignPatterns { + public static void main(String[] args) { + Flyable slowFly = new SlowFly(); + + Duck woodDuck = new WoodDuck(slowFly); // giving the woodduck a flyable behavior + woodDuck.squeak(); + woodDuck.doFly(); + + Duck rubberDuck = new RubberDuck(); // whereas the rubber duck cannot fly + rubberDuck.squeak(); + rubberDuck.doFly(); + } +} + +#+end_src + +Points to note: +- type of the instance variable for the fly behavior is an interface, not a concrete class type - this gives us more flexibility. It is always good to ~program to an interface, not a concrete type~. 
(the interface here can be a superclass too) +- composition allows us to change the fly behavior dynamically, via the getters and setters on the instance variable +- composition is a ~has a~ relationship (rubber duck has a flyable behavior), whereas inheritance is a ~is a~ relationship +- composition allows us to use delegation + - we delegated the logic of the flying to a separate class + - this allows us to add new flying behaviors and assign it to old classes dynamically (decoupling b/w behavior and duck class) +- this is called the ~strategy pattern~, because we have a family of behaviors, as defined by the implementors of the flyable behavior and our strategy is to delegate some of our behavior to those classes + diff --git a/ecs.org b/ecs.org new file mode 100644 index 0000000..a1e4c0c --- /dev/null +++ b/ecs.org @@ -0,0 +1,31 @@ +* Amazon Web Services ECS - EC2 Container Service +- need a docker container +- need to push it to ecr - and set up iam controls for access restriction + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-07 10:37:15 +[[file:assets/screenshot_2018-08-07_10-37-15.png]] + + +- In a single sentence, "ECS is a service to automate the running of containers on a fleet of instances" +- It orchestrates the deployment(placement, self heal) and running of the containers +- ECS has an agent like the kubelet, running on each instance. + +- after creating a cluster, we need to define a task. +- A Task is a json document which defines what to run on the cluster + - it references an image to download, volumes etc + - Tasks are versioned +- use the console to launch a submitted task + +- launching a task is good for a one off script etc +- we could also launch a Service + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-07 10:48:01 +[[file:assets/screenshot_2018-08-07_10-48-01.png]] + + +25*1*0.00001406*12*3600*30 +25*5*0.00000353*12*3600*30 diff --git a/elasticsearch.org b/elasticsearch.org new file mode 100644 index 0000000..4801491 --- /dev/null +++ b/elasticsearch.org @@ -0,0 +1,69 @@ +* Elasticsearch The Definitive Guide +A distributed real-time search and analytics engine + +* Prelude + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-03 18:45:34 +[[file:assets/screenshot_2018-07-03_18-45-34.png]] + +Elasticsearch is built on top of Apache Lucene. It is a "full text search engine" library. +Elasticsearch is a wrapper around Lucene, it hides the internal details of Lucene behind a coherent, REST API. It can be described as: + +- a distributed real-time document store where every field is indexed and search-able. + - "Distributed" - the data is sharded and partitioned. + - Real time - the new documents are index online and are immediately available for query + - Document Store - Elasticsearch accepts json documents + + +Elasticsearch has sensible defaults and is usable out of the box. It is also highly configurable, all the components are configurable and flexible. + +Marvel is a management and monitoring tool used for Elasticsearch. It has an interactive console called Sense. Recall we used this in the early days with Piyush + +Elasticsearch comes as a binary which runs in the JVM. You just need Java installed on your machine to run Elasticsearch + +A node is a running instance of Elasticsearch. 
A cluster is a group of nodes with the same cluster.name that are working together to share data and to provide failover and scale, + +If you don't change the ~cluster.name~, your nodes could join other nodes in a different cluster on the same network. + +API to shutdown Elasticsearch : ~curl -XPOST 'http://localhost:9200/_shutdown'~ +* Part 1 +** You Know, for Search + +Elasticsearch can be reached by 2 means: +1. Java API +When using Java, you can use these + - Node Client +It joins a cluster as a non data node, it doesn't hold the data itself, it knows where the data lives. + - Transport Client +It is lighter weight client and can be used to transport client to a remote cluster. + +Both the clients use a custom elasticsearch transport protocol to talk to the cluster. + +2. RESTful API with JSON over HTTP + +All other languages can communicate with Elasticsearch over port 9200 using a RESTful API + +------ + +Elasticsearch uses JavaScript Object Notation (JSON) as the serialization format for documents. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-03 19:19:39 +[[file:assets/screenshot_2018-07-03_19-19-39.png]] + +JSON allows for rich information storage which can be difficult if we move to a tabular structure. + + +** Life Inside a Cluster +** Data In, Data Out +** Distributed Document Store +** Searching - The Basic Tools +** Mapping and Analysis +** Full Body Search +** Sorting and Relevance +** Distributed Search Execution +** Index Management +** Inside a Shard diff --git a/epbf.org b/epbf.org new file mode 100644 index 0000000..312209d --- /dev/null +++ b/epbf.org @@ -0,0 +1,255 @@ +* eBPF + +Made by Daniel Borkmann +Alexei Starovoitov + + +**** BCC + +**** bpftrace + +**** ply + +**** LLVM + +**** kprobes +Kernel dynamic instrumentation for Linux + +**** uprobes +user level dynamic instrumentation for Linux + +**** tracepoints +Linux tracing + +**** perf + +**** ftrace + +**** bpf + +**** field of dynamic instrumentation +The tools of which include dtrace, systemtap, bcc, bpftrace, and other dynamic tracers. + +**** ltt +the first linux tracer + +**** dprobes +first dynamic instrumentation technology. this let to kprobes which is in use today. + + +**** systemtap + +**** ktap +a high level tracer that helped build support in linux for VM based tracers + + +** Part I - Technologies + +*** Chapter 1 - Introduction +BPF - Berkeley Packet Filter + +Alexei rewrote it and expanded it to eBPF +Daniel Borkmann included it in the Kernel. + +BPF has an instruction set, storage objects and helper functions. +It is like a VM due to its virtual instruction set specification. + +These instructions are executed by a BPF runtine inside the Linux kernel, which includes an interpreter and JIT compiler for turning BPF instructions into native instructions for execution. This is very similar to Python. + +There is a verifier too which makes sure that the ebpf program doesn't crash the kernel or have infinite loops etc. + +3 main uses: +- networking - cilium +- observability - this book, node exporter +- security - falco + + + +**** Tracing +Tracing is event based recording - a type of instrumentation. eg strace records and prints system call events. strace is a type of special purpose tracing tool which only traces system calls. 
+There are some tools that do not trace events, but measure events using fixed statistical counters and then print summaries - eg top + +Programmatic tracers can run small programs on the events to do custom on the fly statistical summaries. + +**** Sampling +These tools take a subset of measurements to paint a coarse picture of the target. Eg ~profile~ can take a sample of running code every 10ms for eg. + +**** BCC, bpftrace +Since bpf has a virtual instruction set, you can either code bpf instructions directly or else use some frontends - the main ones for tracing are BCC and bpftrace. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /var/folders/t4/722gxsyj4qb_lpfwv9pz5dqh0000gn/T/screenshot.png @ 2020-08-23 13:09:51 +[[file:assets/screenshot_2020-08-23_13-09-51.png]] + + +See how libbcc is used to instrument the events with bpf programs. +There must be a compiler to compile the code to instruction set and load it into the kernel. +Also, libbpf is used to read the data? + +BCC - bpf compiler collection +it provides a C programming environment for writing bpc code in python, lua, cpp. + +bpftrace is a newer frontend - it provides a high level language for developing bpf tools. + +There is another bpf frontend that is being developed - ply +It is designed to be lightweight and not need many dependencies. + +iovisor is a LF project that houses the bcc and bpftrace projects + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /var/folders/t4/722gxsyj4qb_lpfwv9pz5dqh0000gn/T/screenshot.png @ 2020-08-23 13:25:57 +[[file:assets/screenshot_2020-08-23_13-25-57.png]] + + +**** kprobers, uprobes +There are both dynamic instrumentation tools. +BPC tracing, tracing in general actually depends on events - the events are either reported directly or there is some post processing done on them. +How do you generate these events? They may be static - always emitted, or dynamic - we can ask the kernel to start emitting them +kprobe and uprobe help us to dynamically start or stop emition of these events - they give us to ability to insert instrumentation points into live software in prod. + +uprobes were added in 2012. kprobes was added in 2004. + +Some kprobe and uprobe examples: + +kprobe:vfs_read --> instrument the beginning of the kernel vfs_read() function +kretprobe:vfs_read --> instrument the return of the kernel vfs_read function +uprobe:/bin/bash:read-line --> instrument the beginning of the readline function in /bin/bash +uretprobe:/bin/bash:read-line --> instrument the return of the readline function in /bin/bash + + +***** static instrumentation +The problem with dynamic instrumentation is that if the function is renamed/removed etc, the bpf tool will break. +The alternative is to use static instrumentation - where the code has hardcoded event names - called tracepoints and USDT (user level statically defined tracing) for user-level static instrumentation + +eg: + +tracepoint:syscalls:sys_enter_open -- instrument the open(2) syscall +usdt:/usr/sbin/mysqld:mysql:query_start -- instrument the query_start probe from /usr/sbin/mysqld + +**** example - bpftrace +Okay, we can use bpftrace to trace the tracepoint for open() syscall. + +bpftrace -e 'tracepoint:syscalls:sys_enter_open { printf("%s %s\n", comm, str(args->filename)); }' + + +BCC tools are more full fledged utilities, bpftrace is just a one liner in comparison. 
+ +*** Chapter 2 - Technology Background + +We will learn about - origins of bpf, frame pointer stack walking, how to read flame graphs, use of kprobes and uprobes, tracepoints, usdt probes, dynamic usdt, PMCs, BTF and other BPF stack walkers. + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /var/folders/t4/722gxsyj4qb_lpfwv9pz5dqh0000gn/T/screenshot.png @ 2020-08-23 15:08:28 +[[file:assets/screenshot_2020-08-23_15-08-28.png]] + +Just like Python, bpf has it's own bytecode that is interpreted by a interpreter. In 2011, a JIT compiler was added to compile the bytecode to native code + +Original BPF had this: +- 32 bit registers +- 2 registers +- scratch memory with 16 memory slots +- program counter + + +ebpf has: +- 64 bit registers +- flexible map storage +- can call some restricted kernel functions + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /var/folders/t4/722gxsyj4qb_lpfwv9pz5dqh0000gn/T/screenshot.png @ 2020-08-23 15:16:46 +[[file:assets/screenshot_2020-08-23_15-16-46.png]] + + +Loading and unloading the bpf programs happens via ~bpf~ syscall +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /var/folders/t4/722gxsyj4qb_lpfwv9pz5dqh0000gn/T/screenshot.png @ 2020-08-23 15:19:56 +[[file:assets/screenshot_2020-08-23_15-19-56.png]] + +bpf program can use helper functions for getting kernel state, bpf maps for storage. bpf program is executed on events, which includes kprobes, uprobes and tracepoints. + +BPF can make things fast by doing the computations on the events in the kernel space itself. Earlier, we had to get the events outside and then do the calculations. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /var/folders/t4/722gxsyj4qb_lpfwv9pz5dqh0000gn/T/screenshot.png @ 2020-08-23 15:24:47 +[[file:assets/screenshot_2020-08-23_15-24-47.png]] + + +bpftool can be used to print the instructions and also see the currently loaded bpf programs. + +**** bpf helper functions + +A bpf function cannot call arbitrary kernel functions, only some helper functions. + +| function | desc | +|--------------------------------------+---------------------------------------------------------------| +| bpf_map_lookup_elem(map, key) | find a key in a map and returns its value (pointer) | +| bpf_map_update_elem(map, key, flags) | update the value of the entry selected by key | +| bpf_map_delete_elem(map, key) | delete the entry selected by key from the map | +| bpf_probe_read(dst, size, src) | safely read size bytes from address src and store in dst | +| bpf_ktime_get_ns | return the time since boot in ns | +| bpf_trace_printfk | debugging helper that writes to tracefs trace | +| bpf_get_current_pid_tgid | return a u64 containing the current tgid (user space pid) etc | + + +You can copy memory from kernel space and user space too. 
+To read random memory, bpf programs must use bpf_probe_read() + +there are some bpf syscall commands too: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /var/folders/t4/722gxsyj4qb_lpfwv9pz5dqh0000gn/T/screenshot.png @ 2020-08-23 17:04:45 +[[file:assets/screenshot_2020-08-23_17-04-45.png]] + +Different types of bpf programs: +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /var/folders/t4/722gxsyj4qb_lpfwv9pz5dqh0000gn/T/screenshot.png @ 2020-08-23 19:10:05 +[[file:assets/screenshot_2020-08-23_19-10-05.png]] +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /var/folders/t4/722gxsyj4qb_lpfwv9pz5dqh0000gn/T/screenshot.png @ 2020-08-23 19:10:46 +[[file:assets/screenshot_2020-08-23_19-10-46.png]] +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /var/folders/t4/722gxsyj4qb_lpfwv9pz5dqh0000gn/T/screenshot.png @ 2020-08-23 19:10:49 +[[file:assets/screenshot_2020-08-23_19-10-49.png]] + +For concurrency control, bpf has locks - spin locks - mainly to guard against parallel update of maps etc. + +There is also atomic add instruction, map in map that can update entire maps atomically. + +bpf introduced commands to expose bpf programs and maps via virtual file system - mounted on /sys/fs/bpf generally. + +This is called pinning. It allows user level programs to interact with a running bpf program - read and write the bpf maps. + +BTF - BPF Type Format +This is the metadata about the bpf program so that we can get more info about source code - like line numbers etc. +BTF is also becoming a general purpose format for describing all kernel data structures. + +**** Stack trace walking + +Stacks are useful to profile where execution time is spent. +bpf provides special map types for recording stack traces and can fetch them using frame pointer based stack walks. + +Techniques: +- Frame pointer based stacks + - Here, we assume that the RBP register has the head of the linked list of stack frames. This won't work on all platforms, but does in x86_64 +- debuginfo + - this works by including metadata files that include line numbers etc. These files however can be large. +- Last Branch Record + - These are intel processor feature to store the stack. It has a limited capacity and bpf does not support this as of now. +- ORC + + + + diff --git a/front-end.org b/front-end.org index c6b5b3a..8a199cf 100644 --- a/front-end.org +++ b/front-end.org @@ -378,10 +378,25 @@ id selector: class selector: .center {//css} -element+class selector: +element.class selector: p.center {//css}

<p class="center large">This paragraph refers to two classes.</p>

+element element selector:
+ div p {//css}
+ // selects all the <p> elements inside a <div> element
+
+element > element selector:
+ div > p {//css}
+ // selects all the <p> elements whose parent is a <div> element
+
+element + element selector:
+ div + p {//css}
+ // selects all the <p> elements placed immediately after <div> elements
+
+element ~ element selector:
+ div ~ p {//css}
+ // selects every <p> element that is preceded by a <div>
element multiple selectors allowed h1, h2, p {//css} diff --git a/golang.org b/golang.org index f14db1a..28f5c54 100644 --- a/golang.org +++ b/golang.org @@ -56,12 +56,16 @@ Eg: reader, marshaler etc Another example: #+begin_src go +package main + +import "fmt" type Office int const ( Boston Office = iota NewYork ) +var officePlace = [2]string{"Boston", "New York"} func (o Office) String() string { return "Google, " + officePlace[o] @@ -2685,6 +2689,299 @@ func main() { } #+end_src +** custom type for []int +In a problem, I had to give a rotated slice of ints. +For eg, given ~1 2 3 4 5~, and rotation of ~4~, I had to output: ~5 1 2 3 4~ + +I created a custom type and attached a method to that type, (alternatively I could have written a function) + +#+begin_src go +package main + +import ( + "bufio" + "fmt" + "os" + "strconv" + "strings" +) + +func readSlice(reader *bufio.Reader) []int { + inp, _ := reader.ReadString('\n') + inp = strings.TrimSuffix(inp, "\n") + nums := strings.Split(inp, " ") + + var a []int + for _, j := range nums { + i, _ := strconv.Atoi(string(j)) + a = append(a, i) + } + return a +} + +type mySlice []int + +func (m *mySlice) rotatedValue(i, n int) int { + l := len(*m) + if (i - n) >= 0 { + return i - n + } + return l + (i - n) +} + +func main() { + reader := bufio.NewReader(os.Stdin) + t := readSlice(reader) + s, n := t[0], t[1] + var inp mySlice + inp = readSlice(reader) + res := make(mySlice, s) + for index, v := range inp { + res[inp.rotatedValue(index, n)] = v + } + fmt.Println(strings.Trim(fmt.Sprint(res), "[]")) +} +#+end_src + +Note, here, the readSlice outputs []int. But, we are assigning output of readSlice to ~var inp mySlice~. This is okay. +However, if you had something like this: + + +#+begin_src go +// Foo is a normal struct +type Foo struct { + a string + b string +} + +// Foo2 is a superstruct for Foo, it has more fields +type Foo2 struct { + Foo + c string +} + +func returnFoo(s Foo) Foo { + return s +} + +func main() { + f := Foo{a: "a", b: "b"} + var b Foo2 + b = returnFoo(f) // error! You cannot assign Foo to Foo2 + fmt.Println(returnFoo(f)) + fmt.Println(b) +} + +#+end_src + +If we make make the functions return Foo2, then this happens: + +#+begin_src go +func returnFoo2(s Foo2) Foo2 { + return s +} + +func main() { + + // f := Foo2{a: "a", b: "b"} --> Cannot use this to create Foo2 + f := Foo2{Foo: Foo{a: "a", b: "b"}, c: "c"} + // f := Foo2{Foo{a: "a", b: "b"},"c"} --> also legal + // f := Foo2{Foo{"a","b"},"c"} --> also legal + + var b Foo + // b = returnFoo2(f) --> error, cannot assign Foo2 to Foo + fmt.Println(returnFoo2(f)) + fmt.Println(b) +} +#+end_src + +However, if the return type of ~returnFoo~ was an interface, we could assign to anything that implements that interface. 
+ +This is also not allowed: + +#+begin_src go +type Foo3 Foo2 + +func main() { + // f := Foo2{a: "a", b: "b"} --> Cannot use this to create Foo2 + f := Foo2{Foo{"a", "b"}, "c"} + var b Foo3 // cannot assign Foo2 to Foo3 variable + b = returnFoo2(f) + fmt.Println(returnFoo2(f)) + fmt.Println(b) +} +#+end_src +We can't even use it for ~type Foo3 string~ + +#+begin_src go +func returnFoo2(s Foo2) string { + return s.c +} + +type Foo3 string + +func main() { + // f := Foo2{a: "a", b: "b"} --> Cannot use this to create Foo2 + f := Foo2{Foo{"a", "b"}, "c"} + var b Foo3 // error -> cannot assign string to Foo3 + b = returnFoo2(f) + fmt.Println(returnFoo2(f)) + fmt.Println(b) +} +#+end_src + +However, this works: + +#+begin_src go +func returnFoo2(s Foo2) []int { + return []int{1} +} + +type Foo3 []int + +func main() { + // f := Foo2{a: "a", b: "b"} --> Cannot use this to create Foo2 + f := Foo2{Foo{"a", "b"}, "c"} + var b Foo3 + b = returnFoo2(f) + fmt.Println(returnFoo2(f)) + fmt.Println(b) +} + +#+end_src +Here, we are able to assign []int to Foo3 +Asked about this here: https://stackoverflow.com/questions/52308617/implicit-type-conversion-in-go + +The answer is mostly that int[] is an ~unnamed type~ whereas ~int~ is a named type. And the spec says ~x is assignable to T~ if ~x's type V and T have identical underlying types and at least one of V or T is not a named type.~ + +Good quote from Tim Heckman on Gophers slack regarding using methods vs function +#+BEGIN_QUOTE +Functions should generally be for things working on inputs, methods should be for things that use state and optionally inputs. So when I see things like `type Foo []int` I wonder if there is a better way. +#+END_QUOTE + +** variables reuse in loops + +Consider this: + +#+begin_src go +package main + +import "fmt" + +func main() { + var a int + for i := 0; i < 10; i++ { + b := i + a = i + fmt.Println(b, a) + } +} +#+end_src + +Here, in each loop iteration, the variable ~i~ is shared. When the loop starts, the compiler created a variable ~i~ and gives it a memory location. On each iteration, it is overwritten. This is why you can't use the index directly in a go routine etc. + +However, in the example above :top: +I use ~b:=i~ in each iteration, so I was confused as to why this is not erroring out, since ~b~ would be created in the first iteration and then next time, ~b~ would error out since we are using ~:=~ which is to initialize new variables. + +It is correct Go code since *each iteration of a for loop is a new scope*. So, when we use ~:=~ in each iteration, it is similar to + +#+begin_src go +package main + +import ( + "fmt" +) + +func main() { + a := 0 + { + a := 1 + fmt.Println(a) // will print 1 + } + fmt.Println(a) // will print 0 + // here, if you want to promote the value of the inner scope to the outer scope, assign the value to a variable declared in the outer scope (use a = 1 in the inner scope for eg) +} +#+end_src + +However, if you aren't using the inner scope variables outside the iteration scope, the compiler will reuse the memory for the next iteration scope. + +In competitive program, this fact can be used to not have to reset variables used in the iteration's scope. Note, this can only be used if you are using ~:=~ in the iteration scope, and not if you are promoting the values from inside to outside. + +When using maps, be careful to check ~val, ok := mymap[mykey]~. ~ok~ will be ~true~ if the key was present, else it will be false. 
The statement will always execute and will return the zero type of the value of ~mymap~. So, if it is ~map[int]int~, it will return ~0~ + + +This fails +#+begin_src go + // code + if k >= len(nums) { + acc := nums[:] + } else { + acc := nums[0:k] + } + + sort.Slice(acc, func(i, j int) bool { return acc[i] > acc[j] }) + // code +#+end_src + +This is because ~acc~ is defined only in the scope of the ~if~ statement. It doesn't exist outside. To access it outside, do predeclare it with, ~var acc []int~, and use it with ~acc = nums[:]~. + +** init() + +The entry point for any go software is main(). However, if that package has an init() (even the main package), it will be called before everything else. +Actually, even before that package's init(), the global vars in that package will be initialized first. +SO: +#+begin_src go +var WhatIsThe = AnswerToLife() + +func AnswerToLife() int { + return 42 +} + +func init() { + WhatIsThe = 0 +} + +func main() { + if WhatIsThe == 0 { + fmt.Println("It's all a lie.") + } +} +#+end_src + +Here, AnswerToLife will be called first. This is to initialize the WhatIsThe variable. +Next, init() will be called. Finally, main(). + +** Make to create maps +Use ~make(map[int]int)~ for eg. +Since map is a reference, you can pass it directly to function and changes will show up. +Channels created using make are references too. + +Strings are immutable, taking a substring mystr[a:b] returns a new string. +You are not allowed to modify a string in place. + +~mystr := "hello world"~ + +:top: this is a string literal, you are defining it in place + +** Arrays +By default, arrays are zero values of their types +array literal ~myArray:=[...]{1, 2, 3}~ + +Array type + length is a type, so ~[3]int~ and ~[4]int~ are different types +Arrays are fixed length + +Slices are a pointer (a reference) to an array, so can be passed around and modified. +Make liberal use of slices (they are cheap references to the same array) + +Use ~make([]int, , )~ to initialize it + + + + +Map is a reference to a hash table. To create it, either use a map literal +~myMap := map[int]int{1:2}~ +or use ~make~: +~myMap := make(map[int]int)~ * How to Write Go Code @@ -3639,3 +3936,6 @@ Errors Panic Recover A web server + + + diff --git a/gophercises.org b/gophercises.org new file mode 100644 index 0000000..b76476a --- /dev/null +++ b/gophercises.org @@ -0,0 +1,79 @@ +* Exercise 1 - Quiz Game + +To read a CSV with 2 columns - Question, Answer. +Present a quiz to the user, note the answers provided. +** Solution: +- two flags - filename should be able to choose the csv file +- read the CSV, store the questions in a ~map[int]Question~ +- iterate thru the map, show the question to the user, record the answer +- in the end, show the score + +* Exercise 2 - UrlShortner + +There are some common names in the http circles for the Go standard libary. + +1. ~Handler interface~ +#+begin_src go +type Handler interface { + ServeHTTP(ResponseWriter, *Request) +} +#+end_src + +A Handler response to the HTTP request. It has the method ServerHTTP, which should write reply headers to the ResponseWriter and then return. + +2. ~type HandlerFunc~ + +HandlerFunc is a type, which is just a function. This function is just the ServeHTTP function. +#+begin_src go +// The HandlerFunc type is an adapter to allow the use of +// ordinary functions as HTTP handlers. If f is a function +// with the appropriate signature, HandlerFunc(f) is a +// Handler that calls f. 
+type HandlerFunc func(ResponseWriter, *Request) + +// ServeHTTP calls f(w, r). +func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request) { + f(w, r) +} +#+end_src + +It's usecase is this. Let's say we want to write a function X that can serve requests. One way is to create a new type and make that type have the ServeHTTP method which calls this function X. Or, we can just type cast that function X with http.HandlerFunc(function X) and just use it + + +3. ServeMux + +#+begin_src go +// ServeMux is an HTTP request multiplexer. +// It matches the URL of each incoming request against a list of registered +// patterns and calls the handler for the pattern that +// most closely matches the URL. +#+end_src + + +** Unmarshal vs Marshall +Unmarshal is taking in JSON and converting it into structs etc. +Marshalling is taking structs and converting them to JSON etc. + +#+BEGIN_QUOTE +Unmarshal parses the JSON-encoded data and stores the result in the value pointed to by v. If v is nil or not a pointer, Unmarshal returns an InvalidUnmarshalError. + +Unmarshal uses the inverse of the encodings that Marshal uses, allocating maps, slices, and pointers as necessary, with the following additional rules: +#+END_QUOTE + + +** converting a file to []byte +Once we’ve used the os.Open function to read our file into memory, we then have to convert it toa byte array using ioutil.ReadAll. + + +** using var f T vs f := T +One reason you would want to define the type of a variable first and not use := is this: + +#+begin_src go +var f func(string) +f := func(s string) { + f(s) +} +#+end_src + +Here, we have recursive definition of ~f~, which won't work when we do ~f:=T~ + diff --git a/grpc.org b/grpc.org new file mode 100644 index 0000000..fe24026 --- /dev/null +++ b/grpc.org @@ -0,0 +1,200 @@ +** RPCs +RPC allows one computer to call a subroutine in another computer. It is a high level model for client-server communication. + +In a microservices, one could use RPCs and not HTTP APIs since RPCs have several advantages over HTTP APIs. + +** History +In the early days, there were protocols that were great for human to application communication, like Email etc, but none for computer-computer application protocols. + +Just like we have procedure calls, (which are function calls), we have remote procedure calls which call the procedures that aren't on the same machine, but are remote. + +The idea was that, since in a local procedure call, the compiler gives us the ability to make the call, the idea was that the compiler could play a role in enabling remote procedure calls as well. + + +** General +When the client sends a RPC, it blocks till the server sends the response back. The server receives the request and starts process execution. +There has to be some code on the client machine that knows this is a RPC and makes the needed network communication and get the answer and present it as the return value of the procedure call. + +The RPC framework knows which server to contact, which port to contact, how to serialize the call, marshal it's arguments etc. On the server side, similar stub should be present to unmarshall the arguments etc. + +** Intro to gRPC - https://www.youtube.com/watch?v=RoXT_Rkg8LA +We know how apps talk to one another. It is SOAP, REST (HTTP + JSON). REST is just a architectural principle, about how to structure your API when you use HTTP+JSON etc. + +REST is not that great, actually. 
It has some advantages:
+- easy to understand - text based protocols
+- great tooling to inspect and modify requests etc
+- loose coupling b/w client and server makes changes relatively easy
+- high quality implementations in every language
+
+It has disadvantages as well:
+- No formal API contract - there is documentation (swagger etc), but no formal contract
+- Streaming is difficult
+- Bidirectional streaming not possible (that's why we had to invent websockets etc)
+- Operations are difficult to model ("restart the computer", should that be a POST, GET call?)
+- Many times your services are just HTTP endpoints, and don't follow the REST principles nicely.
+- Not the most efficient on the wire - text based HTTP+JSON is much heavier than a binary protocol like gRPC
+
+
+gRPC solves all these problems. You first define a contract using the gRPC IDL - the gRPC Interface Definition Language.
+
+#+begin_src
+service Identity {
+  rpc GetPluginInfo(GetPluginInfoRequest)
+    returns (GetPluginInfoResponse) {}
+
+  rpc GetPluginCapabilities(GetPluginCapabilitiesRequest)
+    returns (GetPluginCapabilitiesResponse) {}
+
+  rpc Probe (ProbeRequest)
+    returns (ProbeResponse) {}
+}
+
+message GetPluginInfoResponse {
+  // The name MUST follow reverse domain name notation format
+  // (https://en.wikipedia.org/wiki/Reverse_domain_name_notation).
+  // It SHOULD include the plugin's host company name and the plugin
+  // name, to minimize the possibility of collisions. It MUST be 63
+  // characters or less, beginning and ending with an alphanumeric
+  // character ([a-z0-9A-Z]) with dashes (-), underscores (_),
+  // dots (.), and alphanumerics between. This field is REQUIRED.
+  string name = 1;
+
+  // This field is REQUIRED. Value of this field is opaque to the CO.
+  string vendor_version = 2;
+
+  // This field is OPTIONAL. Values are opaque to the CO.
+  map<string, string> manifest = 3;
+}
+#+end_src
+
+Here, we have defined an ~Identity~ service, which supports 3 procedure calls - ~GetPluginInfo~, ~GetPluginCapabilities~, ~Probe~.
+See how ~GetPluginInfo~ takes in the ~GetPluginInfoRequest~ and returns a ~GetPluginInfoResponse~. Then we defined what the ~GetPluginInfoResponse~ looks like below that.
+
+This is a formal definition, with types. We can run a compiler thru it.
+
+~protoc --proto_path=. --python_out=plugins-grpc:./py calls.proto~
+
+This will generate client side code that can call the RPC.
+Similarly, we can generate the server side code as well.
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-16 13:58:12
+[[file:assets/screenshot_2018-09-16_13-58-12.png]]
+
+
+gRPC is the framework which makes the RPC possible; it is an implementation of the RPC protocol.
+
+gRPC is built on top of HTTP/2 as the transport protocol. The messages that you send and receive are serialized using Protocol Buffers by default - you can plug in other serialization formats too.
+
+Clients open one long lived connection to a gRPC server, and a new HTTP/2 stream is opened for each RPC call. This allows multiple simultaneous in-flight RPC calls, and enables client side and server side streaming.
+
+The compiler can generate stubs in 9 languages. The reference implementation of gRPC is in C. The Ruby, Python etc libraries are bindings to the C core.
+
+gRPC supports pluggable middleware on the client and server side, which can be used to do logging etc.
+
+Swagger solves some of the problems around contracts, in that people can write Swagger IDL-like contracts, but it is still a text based protocol and doesn't solve bidirectional streaming. Also, the Swagger IDL is verbose.
+
+One problem is that you can't call this from the browser.
The fact that it relies on having intimate control over the HTTP/2 connection means you need a shim layer in between.
+
+Many companies have exposed gRPC APIs to the public - like Google Cloud (their pub/sub API, speech recognition API) etc.
+
+** gRPC - https://www.youtube.com/watch?v=OZ_Qmklc4zE
+
+gRPC - gRPC Remote Procedure Calls
+
+It is a recursive full form. It is an open source, high performance "RPC framework".
+
+It is the next generation of Stubby, the RPC framework built and used inside Google.
+
+
+** Getting Started
+- define a service in a ~.proto~ file using the protocol buffers IDL
+- generate the client and server stubs using the protocol buffer compiler
+- extend the generated server class in your code to fill in the business logic
+- invoke it using the client stubs
+
+** An aside: Protocol Buffers
+Google's lingua franca for serializing data - for RPCs and storage. It is binary (so compact), and structures can be extended in backward and forward compatible ways.
+
+It is strongly typed and supports several data types.
+
+
+** Example
+Let's write an example called Route Guide. There are clients traveling around and talking to a central server. Or, it can be 2 friends traveling along 2 different routes and talking to each other.
+
+We have to decide: what types of services do we need to expose? What messages to send?
+
+#+begin_src
+syntax = "proto3";
+
+// Interface exported by the server.
+service RouteGuide {
+  // A simple RPC.
+  //
+  // Obtains the feature at a given position.
+  //
+  // A feature with an empty name is returned if there's no feature at the given
+  // position.
+  rpc GetFeature(Point) returns (Feature) {}
+
+  // A Bidirectional streaming RPC.
+  //
+  // Accepts a stream of RouteNotes sent while a route is being traversed,
+  // while receiving other RouteNotes (e.g. from other users).
+  rpc RouteChat(stream RouteNote) returns (stream RouteNote) {}
+}
+
+// Points are represented as latitude-longitude pairs in the E7 representation
+message Point {
+  int32 latitude = 1;
+  int32 longitude = 2;
+}
+
+// A feature names something at a given point.
+//
+// If a feature could not be named, the name is empty.
+message Feature {
+  // The name of the feature.
+  string name = 1;
+
+  // The point where the feature is detected.
+  Point location = 2;
+}
+
+// A RouteNote is a message sent while at a given point.
+message RouteNote {
+  // The location from which the message is sent.
+  Point location = 1;
+
+  // The message to be sent.
+  string message = 2;
+}
+#+end_src
+
+gRPC supports 4 types of RPCs:
+- unary RPC
+  - client sends a request
+  - server sends a response
+- client streaming RPC
+  - client sends multiple messages
+  - server sends one response
+- server streaming RPC
+  - client sends one request
+  - server sends multiple messages
+- bi-directional streaming RPC
+  - client and server independently send multiple messages to each other
+
+Now running the proto compiler on this will give you the client and server stubs. You just have to implement the business logic using these stubs.
+
+gRPC is extensible:
+- interceptors
+- transports
+- auth and security - pluggable auth mechanisms
+- stats, monitoring, tracing - has Prometheus, Zipkin integrations
+- service discovery - Consul, ZooKeeper integrations
+- supported by proxies - Envoy, nginx, linkerd
+
+gRPC has deadline propagation and cancellation propagation.
+
+The wire protocol used by gRPC is based on HTTP/2 and the specification is well established.
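+
+To make the client side concrete, here is a minimal Go sketch of calling the unary ~GetFeature~ RPC defined above. It assumes the stubs have been generated by ~protoc~ with the Go gRPC plugin into a package imported here as ~pb~ - the import path, server address and coordinates are placeholders, not part of the original example.
+
+#+begin_src go
+package main
+
+import (
+	"context"
+	"log"
+	"time"
+
+	"google.golang.org/grpc"
+
+	// assumed import path for the protoc-generated RouteGuide stubs
+	pb "example.com/routeguide"
+)
+
+func main() {
+	// one long-lived HTTP/2 connection; each RPC below is a new stream on it
+	conn, err := grpc.Dial("localhost:10000", grpc.WithInsecure())
+	if err != nil {
+		log.Fatalf("dial: %v", err)
+	}
+	defer conn.Close()
+
+	client := pb.NewRouteGuideClient(conn)
+
+	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
+	defer cancel()
+
+	// unary RPC: one request (a Point), one response (a Feature)
+	feature, err := client.GetFeature(ctx, &pb.Point{Latitude: 409146138, Longitude: -746188906})
+	if err != nil {
+		log.Fatalf("GetFeature: %v", err)
+	}
+	log.Println("feature:", feature.Name)
+}
+#+end_src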
diff --git a/hackerearth.org b/hackerearth.org new file mode 100644 index 0000000..9dd83ab --- /dev/null +++ b/hackerearth.org @@ -0,0 +1,295 @@ +** reading an int and string from stdin +To read an int, use: +#+begin_src go + var num int + _, err := fmt.Scanf("%d", &num) +#+end_src + +Here, _ is the #bytes read in. + +#+begin_src go +// Scanf scans text read from standard input, storing successive +// space-separated values into successive arguments as determined by +// the format. It returns the number of items successfully scanned. +// If that is less than the number of arguments, err will report why. +// Newlines in the input must match newlines in the format. +// The one exception: the verb %c always scans the next rune in the +// input, even if it is a space (or tab etc.) or newline. +func Scanf(format string, a ...interface{}) (n int, err error) { + return Fscanf(os.Stdin, format, a...) +} +#+end_src + +So, can be used to read in many args at once too. + +Eg: +#+begin_src go +package main + +import ( + "fmt" +) + +func main() { + var num int + var msg string + _, err := fmt.Scanf("%d\n%s", &num, &msg) + if err != nil { + fmt.Println("err", err) + } + // reader := bufio.NewReader(os.Stdin) + // msg, _ := reader.ReadString('\n') + fmt.Println(num * 2) + fmt.Print(msg) +} +#+end_src + +Alt: +#+begin_src go +package main + +import ( + "bufio" + "fmt" + "os" +) + +func main() { + var num int + _, err := fmt.Scanf("%d", &num) + if err != nil { + fmt.Println("err", err) + } + reader := bufio.NewReader(os.Stdin) + msg, _ := reader.ReadString('\n') + fmt.Println(num * 2) + fmt.Print(msg) +} +#+end_src + +** Competitive programming +When solving competitive programming questions, we have, max allowed running times: +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-22 22:15:39 +[[file:assets/screenshot_2018-08-22_22-15-39.png]] + +#+begin_src go +// go program to accept a number and print all prime numbers upto that number +package main + +import ( + "fmt" + "strconv" + "strings" +) + +// isPrime... +func isPrime(n int) bool { + for i := 2; i <= (n+1)/2; i++ { + if n%i == 0 { + return false + } + } + return true +} + +func main() { + var num int + var ans []int + _, err := fmt.Scanf("%d", &num) + if err != nil { + fmt.Println("err", err) + } + for i := 2; i < num; i++ { + if isPrime(i) { + ans = append(ans, i) + } + } + var s []string + + for _, j := range ans { + // fmt.Println(i, " - ", j) + s = append(s, strconv.Itoa(j)) + } + fmt.Printf(strings.Join(s, " ")) +} +#+end_src + +** ways to declare a map +#+begin_src go +res := map[rune]int{} +res := make(map[rune]int) +#+end_src + +Both work :top: + +I wanted to read a string as a line, I did this: +#+begin_src go +reader := bufio.NewReader(os.Stdin) +x, _ := reader.ReadString('\n') +x = strings.TrimSuffix(x, "\n") +#+end_src + +Note, the TrumSuffix which removes the newline character. Splitting the string and parsing the integer fails otherwise. + +A full fledged program to print the transpose of a matrix +#+begin_src go +package main + +import ( + "bufio" + "fmt" + "os" + "strconv" + "strings" +) + +// isPrime... 
+func isPrime(n int) bool { + for i := 2; i <= (n+1)/2; i++ { + if n%i == 0 { + return false + } + } + return true +} + +func main() { + reader := bufio.NewReader(os.Stdin) + var n, m int + _, err := fmt.Scanf("%d %d", &n, &m) + if err != nil { + fmt.Println("err", err) + } + res := make([][]int, 0) + for i := 0; i < n; i++ { + x, _ := reader.ReadString('\n') + x = strings.TrimSuffix(x, "\n") + tres := strings.Split(x, " ") + // fmt.Println("x", x) + // fmt.Println("tres", tres) + resj := make([]int, 0) + for _, c := range tres { + // fmt.Println("c", c) + xint, _ := strconv.Atoi(c) + // fmt.Println("xint", xint) + resj = append(resj, xint) + } + // fmt.Println("resj", resj) + res = append(res, resj) + } + // fmt.Println("res", res) + for i := 0; i < m; i++ { + for j := 0; j < n; j++ { + fmt.Printf("%d ", res[j][i]) + } + fmt.Printf("\n") + } +} +#+end_src + + +** sort the slice + +#+begin_src go +sort.Slice(a, func(i, j int) bool { return a[i] < a[j] }) +sort.Slice(b, func(i, j int) bool { return b[i] < b[j] }) +#+end_src + + + +** function to read slice + +#+begin_src go +func readSlice(reader *bufio.Reader) []int { + inp, _ := reader.ReadString('\n') + inp = strings.TrimSuffix(inp, "\n") + nums := strings.Split(inp, " ") + + var a []int + for _, j := range nums { + i, _ := strconv.Atoi(string(j)) + a = append(a, i) + } + return a +} +func main() { + reader := bufio.NewReader(os.Stdin) + a = readSlice(reader) + b = readSlice(reader) +} +#+end_src + +** a strange thingy, bug? + +#+begin_src go +package main + +import ( + "bufio" + "fmt" + "os" + "sort" + "strconv" + "strings" +) + +func readSlice(reader *bufio.Reader) []int { + inp, _ := reader.ReadString('\n') + inp = strings.TrimSuffix(inp, "\n") + nums := strings.Split(inp, " ") + + var a []int + for _, j := range nums { + i, _ := strconv.Atoi(string(j)) + a = append(a, i) + } + return a +} + +func main() { + var t int + fmt.Scanf("%d", &t) + reader := bufio.NewReader(os.Stdin) + + for i := 0; i < t; i++ { + var n, k int + fmt.Scanf("%d %d", &n, &k) + a := readSlice(reader) + sort.Slice(a, func(p, q int) bool { return a[p] < a[q] }) + if a[0] >= k { + fmt.Println(0) + } else { + fmt.Println(k - a[0]) + } + } +} +#+end_src + +This program in go1.8.3 worked correctly - https://www.hackerearth.com/practice/data-structures/arrays/1-d/practice-problems/algorithm/micro-and-array-update/ + +That is, on each iteration it picked the right values via fmt.Scanf. +However, if we move the ~var n, k int~ outside the ~for~ loop, it doesn't scan the second time and continues with the first scanned value. + +*This could be due to intermixing of Scanf and readSlice* :top: +the thing is, Scanf string we gave was ~fmt.Scanf("%d %d", &n, &k)~, which doesn't accept ~\n~. So, it was kept there and the ~readSlice~ got the wrong input. + + +** iterating a slice +You can iterate thru a slice partially +#+begin_src go +for _, value := range mySlice[5:10] { + // use value +} +#+end_src + +When iterating thru the slice in reverse, start from ~len(mySlice)-1~ and go upto ~i>=0~ +#+begin_src go + for i := len(res) - 1; i >= 0; i-- { + fmt.Print(res[i]) + } + +#+end_src + + diff --git a/high_perf_browser_networking.org b/high_perf_browser_networking.org new file mode 100644 index 0000000..1a12069 --- /dev/null +++ b/high_perf_browser_networking.org @@ -0,0 +1,219 @@ +* High-Performance Browser Networking +- Ilya Grigorik + +Good developers know how things work. 
Great developers know why things work + +This book talks about the myriad of networking protocols it rides on: TCP, TLS, UDP, HTTP etc + +Some of the quick insights: +- tcp is not always the best transport mechanism +- reusing connections is a critical optimization technique (~keep-alive~ helps with this) +- terminating sessions at a server closer to the client leads to lower latency +* Networking 101 +** Primer on Latency and Bandwidth + +Speed is a feature, it is also important to the bottom-line performance of the applications: +- faster sites lead to better user engagement, user retention, conversions. + + + +There are 2 critical components that dictate the performance of all network traffic. +- Latency + - the time from source sending a packet to the destination receiving it +- Bandwidth + - Maximum throughput of a (logical or physical) communication path + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-06 19:40:02 +[[file:assets/screenshot_2018-07-06_19-40-02.png]] + +The latency is how long it takes to travel between the 2 ISPs. The width between the red lines is the bandwidth. + +Latency has several components, even within a typical router: +- propagation delay + + - "The time required for a message to travel from the sender to the receiver" - is a function of distance over speed with which the signal propagates. + - generally around ~c~, the speed of light + +- transmission delay + - "Time required to push the packet's bits into the link" - is a function of packet's length, data rate of the link + - depends on the available data rate of the transmitting link (has nothing to do with distance b/w origin and destination) + - so, if we want to say transfer 10Mb file over 1Mbps and 100Mbps, it will take 10s to put the entire file on the wire over 1mbps, on 100mbps, it will take 0.1s + +- processing delay + - "The required to process the packet header, check for bit level errors, determine the packet's destination" + - this is nowadays done in the hardware, so the delays are very small + +- queuing delay + - "The amount of time the incoming packet is waiting in the queue, waiting to be processed. + - high if the packets are coming in at a rate faster than how the router can process, they are queued in a buffering channel. + + +Total latency is the sum of all the above latencies. :top: + +If our source and destination are far away, it will have a higher propagation delay +If there are a lot of intermediate routers, the transmission and processing delays will be more +If the load of the traffic along the path is high, the queuing delay will be higher + + +Bufferbloat - modern routers have a large packet buffers, (as they don't want to lose any packets) - however, this breaks TCP's congestion avoidance mechanisms and introduces high and variable latency delays into the network + +Often, a significant delay is in the last mile - getting the packet to the ISP's network. This last mile latency can depend on several things; deployed technology, topology of the network, time of the day etc. + +For most websites, latency is the performance bottleneck, not bandwidth. + +Latency can be measured with ~traceroute~ +It sends a packet with "hop limit" (is it the TTL, time to live on the packet?). When the limit is reached, the intermediary returns an ICMO time execcded message, allowing ~traceroute~ to measure the latency of each network hop + + +Optical fibers are light carrying pipes, slightly thicker than a human hair. 
It can carry many wavelengths of light thru a process known as WDM - wavelength division modulation, so the bandwidth is high. The total bandwidth of a fiber link is the multiple of per-channel data rate and the number of multiplexed channels. + +Most of the core data paths of the Internet, specially over large distances, are optical fibers. However, the available capacity at the edges of the network is much, much less, and varies wildly based on deployed technology (dial up, DSL, cable, wireless, fiber to home etc). This limits the bandwidth to the end user - The available bandwidth to the user is a function of the lowest capacity link between the client and the destination server + +** Building Block of TCP + +At the heart of the Internet are 2 protocols; IP and TCP +- IP (Internet Protocol) + - it provides the host-to-host routing and addressing. +- TCP (Transmission Control Protocol) + - provides the abstraction of a reliable network running over an unreliable channel + - It hides most of the complexity of network communication; retransmission of lost data, in order delivery, congestion control/avoidance, data integrity + - TCP guarantees that the bytes received are in the same order that they were sent in. + - TCP is optimized for accurate delivery, not a timely one (unlike UDP?) + +TCP/IP is commonly referred to as the Internet Protocol Suite. +TCP is the protocol of choice for most of the popular applications: world wide web, email, file transfers etc + +Hmm, so it is possible to create a new protocol, either on IP or something new entirely on the network infrastructure that we have right now. It would have to be on the IP though, if we want to use the existing network routers etc. + + +Fun fact: HTTP standard does not specify TCP as the only transport protocol. We could also deliver HTTP via a datagram socket (ie User Datagram Protocol, or UDP) or any other transport protocol of our choice. However, TCP is mostly used. + +*** Three way handshake +All TCP connections start with the 3 way handshake. + +The client and the server must agree on starting packet sequence numbers (among other parameters) +- SYN + - Client picks a random sequence number ~x~ and sends a SYN packet + +- SYN ACK + - server increments ~x~ by 1, picks own random sequence number ~y~ + +- ACK + - client increments both ~x~ and ~y~ by 1 and completes the handshake by dispatching the last ACK package in the handshake + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-06 22:01:53 +[[file:assets/screenshot_2018-07-06_22-01-53.png]] + +After the 3 way handshake is complete, the application data can being to flow between the client and the server. +The client can send a packet immediately after the ACK packet. + +An important and obvious optimization is reusing TCP connections. Note, the delay in the connection is not governed by bandwidth, but the latency between the client and the server + +TCP Fast Open is an optimization that allows data transfers within the SYN packet onwards. + + +*** Congestion Avoidance and Control + +Congestion collapse could affect any network with asymmetric bandwidth capacity between the nodes. If the bandwidth of one node is say, 100Mbps, it will load a very large chunk of packets on the wire and the other node won't be able to process them this fast. + +Consider this: a node is sending packets to another node. 
If the roundtrip time has exceeded the maximum retransmission interval, the sending host will think that the packet has been lost and will retransmit. This will lead to flooding of all the available buffers in the switching nodes with these packets, which will have to be dropped now. The condition will persist, it won't go away on it's own + +The congestion collapse wasn't a problem in ARPANET because the nodes had uniform bandwidth and the backbone had substantial excess capacity. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-06 22:16:25 +[[file:assets/screenshot_2018-07-06_22-16-25.png]] + +To address the issue of congestion, multiple mechanisms were implemented in TCP to govern the rate at which the data can be sent in both directions: flow control, congestion control, congestion avoidance. + +**** Flow control + +It is a mechanism to prevent the sender from overwhelming the receiver with data it may not be able to process - the receiver may be busy, under heavy load etc. + +Each side of the TCP connection advertises its own receive window, (~rwnd~), which communicates the size of the available buffer space to hold the incoming data. + +If the window reaches zero, then it is treated as a signal that no more data should be sent until the existing data in the buffer has been cleared by the application layer. +This continues thruout the lifetime of every TCP connection, each ACK packet carries the latest rwnd value for each side, allowing both sides to dynamically adjust the data flow rate to the capacity and processing speed of the sender and receiver. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-06 22:21:33 +[[file:assets/screenshot_2018-07-06_22-21-33.png]] + +Fun fact: the original TCP specification allocated 16bits for advertising the receive window side, this means the maximum value of the window size was 2^{16} = 64kb. This is not a optimal window size especially for networks that exhibit high bandwidth delay product. +To deal with this, RFC 1323 provided a "TCP window scaling" option, which is communicated during the TCP 3 way handshake and carries a value that represents the number of bits to left shift the 16bit window size field in future ACKs. + +Today, TCP window scaling is enabled by default on all major platforms. However, intermediate nodes, routers, and firewalls can rewrite or even strip this option entirely. If your connection to the server, or the client, is unable to make full use of the available bandwidth, then checking the interaction of your window sizes is always a good place to start. On Linux platforms, the window scaling setting can be checked and enabled via the following commands: +~$ sysctl net.ipv4.tcp_window_scaling~ +~$ sysctl -w net.ipv4.tcp_window_scaling=1~ + +**** Slow start + +Flow control was not sufficient in preventing congestion collapse. The problem was that flow control prevented the sender from overwhelming the receiver, but there was no mechanism to prevent either side from overwhelming the entire network - they don't know the available bandwidth at the beginning of a new connection. Hence, they need a mechanism to estimate it and adapt their speeds to the continuously changing conditions within the network. + +To illustrate one example where such an adaptation is beneficial, imagine you are at home and streaming a large video from a remote server that managed to saturate your downlink to deliver the maximum quality experience. 
Then another user on your home network opens a new connection to download some software updates. All of the sudden, the amount of available downlink bandwidth to the video stream is much less, and the video server must adjust its data rate—otherwise, if it continues at the same rate, the data will simply pile up at some intermediate gateway and packets will be dropped, leading to inefficient use of the network. + +Jacobson and Karels documented 4 algorithms to address these problems: slow start, congestion avoidance, fast retransmit, fast recovery. + +In slow start, the client and the server start with a small congestion window size ~cwnd~. + +*Congestion window size* - the sender side limit on the amount of data the sender can have in flight before receiving an acknowledgment from the client. + +We have a new rule: the max amount of data in flight (not ACKed) between the client and the server is the minimum of rwnd (the receive window size), and cwnd (the congestion window size). + +The algorithm is, the cwnd window size is set to 1 to start with (this was later changed to 4 (rfc 2581), and then to 10 (rfc 6928)). Since the max data in flight for a new tcp connection is the minimum of rwnd and cwnd, the server can send upto 4 (or 1 or 10) network segments and then stop even if the rwnd is higher. Then, for every received ack, the slow start algorithm will increse the cwnd window size by 1 segment. So for every ACKed packet, 2 new packets (1 because 1 ACK was received, and 1 increase in size of cwnd). This phase of the tcp connection is commonly known as the "exponential growth" algorithm as the client and the server are trying to quickly converge on the available bandwidth on the network path between them. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-06 22:45:48 +[[file:assets/screenshot_2018-07-06_22-45-48.png]] + +every TCP connection must go through the slow-start phase—we cannot use the full capacity of the link immediately! + +Also, we can compute a simple formula to find the time taken to reach a cwnd of size N + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-06 22:47:06 +[[file:assets/screenshot_2018-07-06_22-47-06.png]] + +Consider these parameters: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-06 22:47:27 +[[file:assets/screenshot_2018-07-06_22-47-27.png]] + +Putting in the values, we get: +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-06 22:47:40 +[[file:assets/screenshot_2018-07-06_22-47-40.png]] + +So, a new tcp connection will require 224 ms to reach the 64kb rwnd (receiving window size). The fact that the client and server may be capable of transferring at Mbps+ data rates has no effect—that’s slow-start. + + +To decrease the amount of time it takes to grow the congestion window, we can decrease the roundtrip time between the client and server—e.g., move the server geographically closer to the client. Or we can increase the initial congestion window size to the new RFC 6928 value of 10 segments. + +Slow-start is not as big of an issue for large, streaming downloads, as the client and the server will arrive at their maximum window sizes after a few hundred milliseconds and continue to transmit at near maximum speeds—the cost of the slow-start phase is amortized over the lifetime of the larger transfer. 
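+
+As a sanity check, here is a small Go sketch of that arithmetic. The 56 ms roundtrip time, 4-segment initial cwnd, 1,460-byte segment size and 64 KB receive window are assumed example values (they reproduce the 224 ms figure above); plug in your own link's numbers.
+
+#+begin_src go
+package main
+
+import (
+	"fmt"
+	"math"
+	"time"
+)
+
+// slowStartTime estimates how long slow start needs to grow the congestion
+// window from initialCwnd to targetCwnd segments, doubling once per roundtrip:
+// time = RTT * ceil(log2(targetCwnd / initialCwnd))
+func slowStartTime(rtt time.Duration, initialCwnd, targetCwnd float64) time.Duration {
+	rounds := math.Ceil(math.Log2(targetCwnd / initialCwnd))
+	return time.Duration(rounds) * rtt
+}
+
+func main() {
+	const (
+		rtt         = 56 * time.Millisecond // assumed client-server roundtrip time
+		segmentSize = 1460.0                // assumed bytes per segment (MSS)
+		rwnd        = 65535.0               // 64 KB receive window
+	)
+	targetSegments := rwnd / segmentSize // ~45 segments to fill the receive window
+	fmt.Println(slowStartTime(rtt, 4, targetSegments)) // prints 224ms
+}
+#+end_src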
+ +However, for many HTTP connections, which are often short and bursty, it is not un‐ usual for the request to terminate before the maximum window size is reached. As a result, the performance of many web applications is often limited by the roundtrip time between server and client: slow-start limits the available bandwidth throughput, which has an adverse effect on the performance of small transfers. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-07-06 22:50:32 +[[file:assets/screenshot_2018-07-06_22-50-32.png]] + + + + + + + + diff --git a/k&r_c.org b/k&r_c.org index dd8a4d3..876e3d4 100644 --- a/k&r_c.org +++ b/k&r_c.org @@ -288,7 +288,7 @@ we also have signed or unsigned ints signed means the varialbe has sign, can be negative unsigned means only 0 and positive -so, in 1 byte, unsigned can be 0 t0 256 +so, in 1 byte, unsigned can be 0 to 255 and signed can be -128 - 127 the header files limits.h and float.h contain symbolic constants for all these sizes diff --git a/kubernetes.org b/kubernetes.org new file mode 100644 index 0000000..fd675e2 --- /dev/null +++ b/kubernetes.org @@ -0,0 +1,3990 @@ +* Introduction to Kubernetes +LFS158x + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-12 20:59:07 +[[file:assets/screenshot_2018-06-12_20-59-07.png]] +* Introduction + +Kubernetes is an open source system for automating deployment, scaling and management of containerzied applications + +It means helmsman, or "ship pilot" in Greek. The analogy is to think of k8s as a manager for ships loaded with containers + +K8s has new releases every 3 months. The latest is 1.11 + +Some of the lessons put in k8s come from Borg, like: +- api servers +- pods +- ip per pod +- services +- labels + +K8s has features like: +- automatic binpacking +K8s automatically schedules the containers based on resource usage and constraints + +- self healing +Following the declarative paradigm, k8s makes sure that the infra is always what it should be + +- horizontal scaling +- service discovery and load balancing +K8s groups sets of containers and refers to them via a DNS. This DNS is also called k8s service. +K8s can discover these services automatically and load balance requests b/w containers of a given service. + +- automated rollouts and rollbacks without downtime +- secrets and configuration management +- storage orchestration +With k8s and its plugins we can automatically mount local, external and storage solutions to the containers in a seamless manner, based on software defined storage (SDS) + +- batch execution +K8s supports batch execution + +- role based access control + +K8s also abstracts away the hardware and the same application can be run on aws, digital ocean, gcp, bare metal, VMs etc once you have the cluster up (and also given you don't use the cloud native solutions like aws ebs etc) + +K8s also has a very pluggable architecture, which means we can plug in any of our components and use it. The api can be extended as well. We can write custom plugins too + +** Cloud Native Computing Foundation +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-12 22:20:29 +[[file:assets/screenshot_2018-06-12_22-20-29.png]] + +The CNCF is one of the projects hosted by the Linux Foundation. It aims to accelerate the adoption of containers, microservices, cloud native applications. 
+ +Some of the projects under the cncf: +- containerd + - a container runtime - used by docker +- rkt + - another container runtime from coreos +- k8s + - container orchestration engine +- linkerd + - for service mesh +- envoy + - for service mesh +- gRPC + - for remote procedure call (RPC) +- container network interface - CNI + - for networking api +- CoreDNS + - for service discovery +- Rook + - for cloud native storage +- notary + - for security +- The Update Framework - TUF + - for software updates +- prometheus + - for monitoring +- opentracing + - for tracing +- jaeger + - for distributed tracing +- fluentd + - for logging +- vitess + - for storage + +this set of CNCF projects can cover the entire lifecycle of an application, from its execution using container runtimes, to its monitoring and logging + +The cncf helps k8s by: +- neutral home for k8s trademark and enforces proper usage +- offers legal guidance on patent and copyright issues +- community building, training etc + + + +** K8s architecture + +K8s has 3 main components: +- master node +- worker node +- distributed k-v store, like etcd + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-12 22:30:46 +[[file:assets/screenshot_2018-06-12_22-30-46.png]] + +The user contacts the ~api-server~ present in the master node via cli, apis, dashboard etc +The master node also has controller, scheduler etc + +Each of the worker node has: +- kubelet +- kube-proxy +- pods + + +*** Master Node +It is responsible for managing the kubernetes cluster. We can have more than 1 master node in our kubernetes cluster. This will enable HA mode. Only one will be master, others will be followers + +The distributed k-v store, etcd can be a part of the master node, or it can be configured externally. + +**** API server +All the administrative tasks are performed via the api server. The user sends rest commands to the api server which then validates and processes the requests. After executing the requests, the resulting state of the cluster is stored in a distributed k-v store etcd +**** Scheduler +It schedules work on different worker nodes. It has the resource usage information for each worker node. It keeps in mind the constrains that the user might have set on each pod etc. The scheduler takes into account the quality of the service requirements, data locality, affinity, anti-affinity etc + +It schedules pods and services + +**** Controller manager +It manages non-terminating control loops which regulate the state of the kubernetes cluster. +The CM knows about the descried state of the objects it manages and makes sure that the object stays in that state. +In a control loop, it makes sure that the desired state and the current state are in sync + +**** etcd +It is used to store the current state of the cluster. + +*** Worker Node + +It runs applications using Pods and is controlled by the master node. The master node has the necessary tools to connect and manage the pods. +A pod is a scheduling unit in kubernetes. It is a logical collection of one or more containers which are always scheduled together. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-12 22:47:03 +[[file:assets/screenshot_2018-06-12_22-47-03.png]] + +A worker node has the following components: +- container runtime +- kubelet +- kube-proxy + +**** Continer Runtime +To run and manage the container's lifecycle, we need a container runtime on all the worker nodes. 
+Examples include: +- containerd +- rkt +- lxd + +**** kubelet + +It is an agent that runs on each worker node and communicates with the master node. +It receives the pod definition (for eg from api server, can receive from other sources too) and runs the containers associated with the pod, also making sure that the pods are healthy. + +The kublet connects to the container runtime using the CRI - container runtime interface +The CRI consists of protocol buffers, gRPC API, libraries + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-12 23:27:32 +[[file:assets/screenshot_2018-06-12_23-27-32.png]] + +The CRI shim converts the CRI commands into commands the container runtime understands + +The CRI implements 2 services: + +- ImageService +It is responsible for all the image related operations + +- RuntimeService +It is responsible for all the pod and container related operations + +With the CRI, kubernetes can use different container runtimes. Any container runtime that implements CRI can be used by kubernetes to manage pods, containers, container images + +***** CRI shims + +Some examples of CRI shims +- dockershim +With dockershim, containers are cerated using docker engine that is installed on the worker nodes. +The docker engine talks to the containerd and manages the nodes + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-12 23:44:47 +[[file:assets/screenshot_2018-06-12_23-44-47.png]] +***** cri-containerd + +With cri-containerd, we directly talk to containerd by passing docker engine + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-12 23:47:28 +[[file:assets/screenshot_2018-06-12_23-47-28.png]] +***** cri-o + +There is an initiative called OCI - open container initiative that defines a spec for container runtimes. +What cri-o does is, it implements the container runtime interface - CRI with a general purpose shim layer that can talk to all the container runtimes that comply with the OCI. + +This way, we can use any oci compatible runtime with kubernetes (since cri-o will implement the cri) + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-12 23:51:33 +[[file:assets/screenshot_2018-06-12_23-51-33.png]] + +Note here, the cri-o implements the CNI, and also has the image service and the runtime service + +***** Notes +It can get a little messy sometimes, all these things. + +Docker engine is the whole thing, it was a monolith that enabled users to run containers. Then it was broken down into individual components. It was broken down into: +- docker engine +- containerd +- runc + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-11 23:07:13 +[[file:assets/screenshot_2018-08-11_23-07-13.png]] + +runC is the lowest level component that implements the OCI interface. It interacts with the kernel and does the "runs" the container + +containerd does things like take care of setting up the networking, image transfer/storage etc - It takes care of the complete container runtime (which means, it manages and makes life easy for runC, which is the actual container runtime). Unlike the Docker daemon it has a reduced feature set; not supporting image download, for example. + +Docker engine just does some high level things itself like accepting user commands, downloading the images from the docker registry etc. It offloads a lot of it to containerd. 
+ +"the Docker daemon prepares the image as an Open Container Image (OCI) bundle and makes an API call to containerd to start the OCI bundle. containerd then starts the container using runC." + +Note, the runtimes have to be OCI compliant, (like runC is), that is, they have to expose a fixed API to managers like containerd so that they(containerd) can make life easy for them(runC) (and ask them to stop/start containers) + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-11 23:15:15 +[[file:assets/screenshot_2018-08-11_23-15-15.png]] + + +rkt is another container runtime, which does not support OCI yet, but supports the appc specification. But it is a full fledged solution, it manages and makes it's own life easy, so it needs no containerd like daddy. + +So, that's that. Now let's add another component (and another interface) to the mix - Kubernetes + +Kubernetes can run anything that satisfies the CRI - container runtime interface. + +You can run rkt with k8s, as rkt satisfies CRI - container runtime interface. Kubernetes doesn't ask for anything else, it just needs CRI, it doesn't give a FF about how you run your containers, OCI or not. + +containerd does not support CRI, but cri-containerd which is a shim around containerd does. So, if you want to run containerd with Kubernetes, you have to use cri-containerd (this also is the default runtime for Kubernetes). cri-containerd recently got renamed to CRI Plugin. + +If you want to get the docker engine in the mix as well, you can do it. Use dockershim, it will add the CRI shim to the docker engine. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-11 23:27:18 +[[file:assets/screenshot_2018-08-11_23-27-18.png]] + +Now, like containerd can manage and make life easy for runC (the container runtime), it can manage and make life easy for other container runtimes as well - in fact, for every container runtime that supports OCI - like Kata container runtime (known as ~kata-runtime~ - https://github.com/kata-containers/runtime.) - which runs kata containers, Clear Container runtime (by Intel). + +Now we know that rkt satisfies the CRI, cri-containerd (aka CRI Plugin) does it too. + +Note what containerd is doing here. It is not a runtime, it is a manager for runC which is the container runtime. It just manages the image download, storage etc. Heck, it doesn't even satisfy CRI. + +That's why we have CRI-O. It is just like containerd, but it implements CRI. CRI-O needs a container runtime to run images. It will manage and make life easy for that runtime, but it needs a runtime. It will take any runtime that is OCI compliant. So, naturally, ~kata-runtime~ is CRI-O compliant, runC is CRI-O compliant. + +Use with Kubernetes is simple, point Kubernetes to CRI-O as the container runtime. (yes yes, CRI-O, but CRI-O and the actual container runtime IS. And Kubernetes is referring to that happy couple when it says container runtime). + +Like containerd has docker to make it REALLY usable, and to manage and make life easy for containerd, CRI-O needs someone to take care of image management - it has buildah, umochi etc. + +crun is another runtime which is OCI compliant and written in C. It is by RedHat. + +We already discussed, kata-runtime is another runtime which is OCI compliant. So, we can use kata-runtime with CRI-O like we discussed. 
+ +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-11 23:53:04 +[[file:assets/screenshot_2018-08-11_23-53-04.png]] + +Note, here, the kubelet is talking to CRI-O via the CRI. CRI-O is talking to cc-runtime (which is another runtime for Intel's clear containers, yes, OCI compliant), but it could be kata-runtime as well. + +Don't forget containerd, it can manage and make life easy for all OCI complaint runtimes too - runC sure, but also kata-runtime, cc-runtime + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-11 23:55:06 +[[file:assets/screenshot_2018-08-11_23-55-06.png]] + +Here, note just the runtime is moved from runC to kata-runtime. +To do this, in the containerd config, just change runtime to "kata" + +Needless to say, it can run on Kubernetes either by CRI-O, or by cri-containerd (aka CRI Plugin). + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-11 23:56:57 +[[file:assets/screenshot_2018-08-11_23-56-57.png]] + +This is really cool :top: + +Kubernetes, represented here by it's Ambassador, Mr. Kubelet runs anything that satisfies the CRI. +Now, we have several candidates that can. +- Cri-containerd makes containerd do it. +- CRI-O does it natively. +- Dockershim makes the docker engine do it. + +Now, all the 3 guys above, can manage and make life easy for all OCI compliant runtimes - runC, kata-runtime, cc-runtimes. + +We also have frakti, which satisfies CRI, like rkt, but doesn't satisfy OCI, and comes bundled with it's own container runtime. + + +Here we have CRI-O in action managing and making life easy for OCI compliant kata-runtime and runC both + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-12 00:02:16 +[[file:assets/screenshot_2018-08-12_00-02-16.png]] + +We have some more runtimes as well: +- railcar - OCI compliant, written in rust +- Pouch - Alibaba's modified runC +- nvidia runtime - nvidia's fork of runC + +**** kube-proxy + +To connect to the pods, we group them logically, and the use a ~Service~ to connect to them. The service exposes the pods to the external world and load balances across them + +Kube-proxy is responsible for setting the routes in the iptables of the node when a new service is created such that the service is accessible from outside. The apiserver gives the service a IP which the kube-proxy puts in the node's iptables + +The kube-proxy is responsible for "implementing the service abstraction" - in that it is responsible for exposing a load balanced endpoint that can be reached from inside or outside the cluster to reach the pods that define the service. + +Some of the modes in which it operates to achieve that :top: + +1. Proxy-mode - userspace + +In this scheme, it uses a proxy port. + +The kube-proxy does 2 things: +- it opens up a _proxy port_ on each node for each new service that is created +- it sets the iptable rules for each node so that whenever a request is made for the service's ~clusterIP~ and it's port (as specified by the apiserver), the packets come to the _proxy port_ that kube-proxy created. 
The kube-proxy then uses round robin to forward the packets to one of the pods in that service + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-13 00:17:51 +[[file:assets/screenshot_2018-06-13_00-17-51.png]] + +So, let's say the service has 3 pods A, B, C that belong to service S (let's say the apiserver gave it the endpoint 10.0.1.2:44131). Also let's say we have nodes X, Y, Z + +earlier, in the userland scheme, each node got a new port opened, say 30333. +Also, each node's iptables got updated with the endpoints of service S (10.0.1.2:44131) pointing to :30333, :30333, :30333 + +Now, when the request comes to from and node, it goes to :30333 (say) and from there, kube-proxy sends it to the pod A, B or C whichever resides on it. + + + +2. iptables + +Here, there is no central proxy port. For each pod that is there in the service, it updates the iptables of the nodes to point to the backend pod directly. + +Continuing the above example, here each node's iptables would get a separate entry for each of the 3 pods A, B, C that are part of the service S. +So the traffic can be routed to them directly without the involvement of kube-proxy + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-13 00:28:09 +[[file:assets/screenshot_2018-06-13_00-28-09.png]] + +This is faster since there is no involvement of kube-proxy here, everything can operate in the kernelspace. However, the iptables proxier cannot automatically retry another pod if the one it initially selects does not respond. + +So we need a readiness probe to know which pods are healthy and keep the iptables up to date + +3. Proxy-mode: ipvs + +The kernel implements a virtual server that can proxy requests to real server in a load balanced way. +This is better since it operates in the kernelspace and also gives us more loadbalancing options + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-13 00:32:19 +[[file:assets/screenshot_2018-06-13_00-32-19.png]] + + + +**** etcd +Etcd is used for state management. It is the truth store for the present state of the cluster. Since it has very important information, it has to be highly consistent. It uses the raft consensus protocol to cope with machine failures etc. + +Raft allows a collection of machines to work as a coherent group that can survive the failures of some of its members. At any given time, one of the nodes in the group will be the master, and the rest of them will be the followers. Any node can be treated as a master. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-13 00:35:17 +[[file:assets/screenshot_2018-06-13_00-35-17.png]] + +In kubernetes, besides storing the cluster state, it is also used to store configuration details such as subnets, ConfigMaps, Secrets etc + + +**** Network setup challenges +To have a fully functional kubernetes cluster, we need to make sure: +1. a unique ip is assigned to each pod +2. containers in a pod can talk to each other (easy, make them share the same networking namespace ) +3. the pod is able to communicate with other pods in the cluster +4. if configured, the pod is accessible from the external world + + +1. 
Unique IP +For container networking, there are 2 main specifications: + +- Container Network Model - CNM - proposed by docker +- Container Network Interface - CNI - proposed by CoreOS + +kubernetes uses CNI to assign the IP address to each Pod + +The runtime talks to the CNI, the CNI offloads the task of finding IP for the pod to the network plugin + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-13 00:52:15 +[[file:assets/screenshot_2018-06-13_00-52-15.png]] + +2. Containers in a Pod +Simple, make all the containers in a Pod share the same network namespace. This way, they can reach each other via localhost + +3. Pod-to-Pod communication across nodes + +Kubernetes needs that there shouldn't be any NAT - network address translation when doing pod-to-pod communication. This means, that each pod should have it's own ip address and we shouldn't have say, a subnet level distribution of pods on the nodes (this subent lives on this node, and the pods are accessible via NAT) + + +4. Communication between external world and pods +This can be achieved by exposing our services to the external world using kube-proxy + + +** Installing Kubernetes +Kubernetes can be installed in various configurations: +- all-in-one single node installation +Everything on a single node. Good for learning, development and testing. Minikube does this +- single node etcd, single master, multi-worker +- single node etcd, multi master, multi-worker +We have HA +- multi node etcd, multi master, multi-worker +Here, etcd runs outside Kubernetes in a clustered mode. We have HA. This is the recommended mode for production. + + +Kubernetes on-premise +- Kubernetes can be installed on VMs via Ansible, kubeadm etc +- Kubernetes can also be installed on on-premise bare metal, on top of different operating systems, like RHEL, CoreOS, CentOS, Fedora, Ubuntu, etc. Most of the tools used to install VMs can be used with bare metal as well. + +Kubernetes in the cloud +- hosted solutions +Kubernetes is completely managed by the provider. The user just needs to pay hosting and management charges. +Examples: + - GKE + - AKS + - EKS + - openshift dedicated + - IBM Cloud Container Service + +- Turnkey solutions +These allow easy installation of Kubernetes with just a few clicks on underlying IaaS + - Google compute engine + - amazon aws + - tectonic by coreos + +- Kubernetes installation tools +There are some tools which make the installation easy + - kubeadm +This is the recommended way to bootstrap the Kubernetes cluster. It does not support provisioning the machines + + - KubeSpray +It can install HA Kubernetes clusters on AWS, GCE, Azure, OpenStack, bare metal etc. It is based on Ansible and is available for most Linux distributions. It is a Kubernetes incubator project + + - Kops +Helps us create, destroy, upgrade and maintain production grade HA Kubernetes cluster from the command line. It can provision the machines as well. AWS is officially supported + +You can setup Kubernetes manually by following the repo Kubernetes the hard way by Kelsey Hightower. + + +** Minikube + +Prerequisites to run minikube: +- Minikube runs inside a VM on Linux, Mac, Windows. So the use minikube, we need to have the required hypervisor installed first. 
We can also use ~--vm-driver=none~ to start the Kubernetes single node "cluster" on your local machine +- kubectl - it is a binary used to interact with the Kubernetes cluster + + +We know about ~cri-o~, which is a general shim layer implementing CRI (container runtime interface) for all OCI (open containers initiative) compliant container runtimes + +To use cri-o runtime with minikube, we can do: + +~minikube start --container-runtime=cri-o~ +then, docker commands won't work. We have to use: ~sudo runc list~ to list the containers for example + +** Kubernetes dashboard + +We can use the kubectl cli to access Minikube via CLI, Kubernetes dashboard to access it via cli, or curl with the right credentials to access it via APIs + +Kubernetes has an API server, which is the entry point to interact with the Kubernetes cluster - it is used by kubectl, by the gui, and by curl directly as well + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-13 23:48:48 +[[file:assets/screenshot_2018-06-13_23-48-48.png]] + + +The api space :top: is divided into 3 independent groups. +- core group + - ~/api/v1~ + - this includes objects such as pods, services, nodes etc +- named group + - these include objects in ~/apis/$NAME/$VERSION~ format + - The different levels imply different levels of stability and support: + - alpha - it may be dropped anytime without notice, eg: ~/apis/batch/v2alpha1~ + - beta - it is well tested, but the semantics may change in incompatible ways in a subsequent beta or stable release. Eg: ~/apis/certificates.k8s.io/v1beta1~ + - stable - appears in released software for many subsequent versions. Eg ~apis/networking.k8s.io/v1~ +- system wide + - this group consists of system wide API endpoints, like ~/healthz~, ~/logs~, ~/metrics~, ~/ui~ etc + + +Minikube has a dashboard, start it with ~minikube dashboard~ + +You can get a dashboard using the ~kubectl proxy~ command also. It starts a service called ~kubernetes-dashboard~ which runs inside the ~kube-system~ namespace +access the dashboard on ~localhost:8001~ +once ~kubectl proxy~ is configured, we can use curl to localhost on the proxy port - ~curl http://localhost:8001~ + +If we don't use ~kubectl proxy~, we have to get a token from the api server by: + +~$ TOKEN=$(kubectl describe secret $(kubectl get secrets | grep default | cut -f1 -d ' ') | grep -E '^token' | cut -f2 -d':' | tr -d '\t' | tr -d " ")~ + +Also, the api server endpoint: +~$ APISERVER=$(kubectl config view | grep https | cut -f 2- -d ":" | tr -d " ")~ + +Now, it's a matter of a simple curl call: +~$ curl $APISERVER --header "Authorization: Bearer $TOKEN" --insecure~ + +** Kubernetes building blocks +Kubernetes has several objects like Pods, ReplicaSets, Deployments, Namespaces etc +We also have Labels, Selectors which are used to group objects together. + +Kubernetes has a rich object model which is used to represent *persistent entities* +The persistent entities describe: +- what containerized applications we are running, and on which node +- application resource consumption +- different restart/upgrade/fault tolerance policies attached to applications + +With each object, we declare our intent (or desired state) using *spec* field. +The Kubernetes api server always accepts only json input. 
Generally however, we write ~yaml~ files which are converted to json by ~kubectl~ before sending it + +Example of deployment object: +#+begin_src yaml +apiVersion: apps/v1 # the api endpoint we want to connect to +kind: Deployment # the object type +metadata: # as the name implies, some info about deployment object + name: nginx-deployment + labels: + app: nginx +spec: # desired state of the deployment + replicas: 3 + selector: + matchLabels: + app: nginx + template: + metadata: + labels: + app: nginx + spec: # desired state of the Pod + containers: + - name: nginx + image: nginx:1.7.9 + ports: + - containerPort: 80 +#+end_src + +Once this is created, Kubernetes attaches the ~status~ field to the object + +*** Pods + +It is the smallest and simplest Kubernetes object, a unit of deployment in Kubernetes. +It is a logical unit representing an application. +The pod is a logical collection of containers which are deployed on the same host (colocated), share the same network namespace, mount the same external storage volume + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-14 19:31:56 +[[file:assets/screenshot_2018-06-14_19-31-56.png]] + +Pods cannot self heal, so we use them with controllers, which can handle pod's replication, fault tolerance, self heal etc. +Examples of controllers: +- Deployments +- ReplicaSets +- ReplicationControllers + +We attach the pod's spec (specification) to other objects using pods templates like in previous example + +*** Labels + +They are key-value pairs that are attached to any Kubernetes objects (like pods) +They are used to organize and select a subset of objects. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-14 20:02:04 +[[file:assets/screenshot_2018-06-14_20-02-04.png]] + + +**** Label Selectors + +Kubernetes has 2 types of selectors: +- equality based selectors +We can use =, ==, or != operators +- set based selectors +Allows filtering based on a set of values. We can use ~in~, ~notin~, ~exist~ operators. +Eg: ~env in (dev, qa)~ which allows selecting objects where env label is dev or qa + + +*** ReplicationControllers + +A rc is a controller that is part of the master node's controller manager. It makes sure that the specified number of replicas for a Pod is running - no more, no less. +We generally don't deploy pods on their own since they can't self-heal, we almost always use ~ReplicationController~s to deploy and manage them. + +*** ReplicaSets + +~rs~ is the next generation ~ReplicationController~. It has both equality and set based selectors. RCs only support equality based controllers. + +RSs can be used independently, but they are mostly used by Deployments to orchestrate pod creation, deletion and updates. +A deployment automatically creates the ReplicaSets + + +*** Deployments +Deployment objects provide declarative (just describe what you want, not how to get it) updates to Pods and ReplicaSets. +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-14 20:20:37 +[[file:assets/screenshot_2018-06-14_20-20-37.png]] + +Here, :top:, the Deployment creates a ~ReplicaSet A~ which creates 3 pods. In each pod, the container runs ~nginx:1.7.9~ image. + +Now, we can update the nginx to say ~1.9.1~. This will trigger a new ReplicaSet to be created. 
Now, this ReplicaSet will make sure that there are the required number of pods as specified in it's spec (that's what it does) + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-14 20:23:38 +[[file:assets/screenshot_2018-06-14_20-23-38.png]] + + +Once the ReplicaSet B is ready, Deployment starts pointing to it + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-14 20:24:16 +[[file:assets/screenshot_2018-06-14_20-24-16.png]] + +The Deployments provide features like Deployment recording, which allows us to rollback if something goes wrong. + +*** Namespaces + +If we want to partition our Kubernetes cluster into different projects/teams, we can use Namespaces to logically divide the cluster into sub-clusters. + +The names of the resources/objects created inside a namespace are unique, but not across Namespace. + +#+begin_src +$ kubectl get namespaces +NAME STATUS AGE +default Active 11h +kube-public Active 11h +kube-system Active 11h +#+end_src + +The namespace above are: +- default +This is the default namespace +- kube-system +Objects created by the Kubernetes system +- kube-public +It is a special namespace, which is readable by all users and used for special purposes - like bootstrapping a cluster + +We can use Resource Quotas to divide the cluster resources within Namespaces. + + +** Authentication, Authorization, Admission Control + +Each API access request goes thru the following 3 stages: +- authentication +You are who you say you are +- authorization +You are allowed to access this resource +- admission control +Further modify/reject requests based on some additional checks, like Quota. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-14 20:42:57 +[[file:assets/screenshot_2018-06-14_20-42-57.png]] + +Kubernetes does not have an object called _user_, not does it store _usernames_. +There are 2 kinds of users: +- normal users +They are managed outside of Kubernetes cluster via independent services like user/client certificates, a file listing usernames/passwords, google accounts etc. + +- service accounts +With *Service Account* users, in-cluster processes communicate with the API server. Most of the SA users are created automatically via the API server, or can be created manually. The SA users are tied to a given namespace and mount the respective credentials to communicate with the API server as Secrets. + +*** Authenticator Modules +For authentication, Kubernetes uses different authenticator modules. +- client certificates +We can enable client certificate authentication by giving a CA reference to the api server which will validate the client certificates presented to the API server. The flag is ~--client-ca-file=/path/to/file~ + +- static token file +We can have pre-defined bearer tokens in a file which can be used with ~--token-auth-file=/path/to/file~ +the tokens would last indefinitely, and cannot be changed without restarting the api server + +- bootstrap tokens +Can be used for bootstrapping a Kubernetes cluster + +- static password file +Similar to static token file. The plag is: ~--basic-auth-file=/path/to/file~. 
The passwords cannot be changed without restarting the api-server
+
+- service account tokens
+This authenticator uses bearer tokens which are attached to pods using the ~ServiceAccount~ admission controller (which allows the in-cluster processes to talk to the api server)
+
+- OpenID Connect tokens
+OpenID Connect helps us connect with OAuth 2 providers like Google etc to offload authentication to those services
+
+- Webhook Token Authentication
+We can offload verification to a remote service via webhooks
+
+- Keystone password
+- Authenticating Proxy
+Such as nginx. We have this for our logs stack at Draup
+
+
+*** Authorization
+
+After authentication, we need authorization.
+
+Some of the API request attributes that are reviewed by Kubernetes are: user, group, extra, Resource, Namespace etc. They are evaluated against policies. There are several modules that are supported.
+
+- Node Authorizer
+It authorizes API requests made by kubelets (it authorizes the kubelet's read operations for services, endpoints, nodes etc, and write operations for nodes, pods, events etc)
+
+- ABAC authorizer - Attribute based access control
+Here, Kubernetes grants access to API requests which combine policies with attributes. Eg:
+
+#+begin_src
+{
+  "apiVersion": "abac.authorization.kubernetes.io/v1beta1",
+  "kind": "Policy",
+  "spec": {
+    "user": "nkhare",
+    "namespace": "lfs158",
+    "resource": "pods",
+    "readonly": true
+  }
+}
+#+end_src
+
+Here, :top:, ~nkhare~ has only ~read only~ access to pods in namespace ~lfs158~.
+
+To enable this, we have to start the API server with the ~--authorization-mode=ABAC~ option and specify the authorization policy with ~--authorization-policy-file=PolicyFile.json~
+
+- Webhook authorizer
+We can offload authorization decisions to 3rd party services. To use this, start the API server with ~--authorization-webhook-config-file=/path/to/file~ where the file has the configuration of the remote authorization service.
+
+- RBAC authorizer - role based access control
+Kubernetes has different roles that can be attached to subjects like users, service accounts etc.
+While creating the roles, we restrict access to specific operations like ~create, get, update, patch~ etc
+
+There are 2 kinds of roles:
+- Role
+We can grant access to resources within a specific namespace
+
+- ClusterRole
+Can be used to grant the same permissions as a Role, but its scope is cluster-wide.
+
+We will only focus on ~Role~
+
+Example:
+
+#+begin_src
+kind: Role
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  namespace: lfs158
+  name: pod-reader
+rules:
+- apiGroups: [""] # "" indicates the core API group
+  resources: ["pods"]
+  verbs: ["get", "watch", "list"]
+#+end_src
+
+Here, we created a ~pod-reader~ role which can only access pods in the ~lfs158~ namespace.
+Once we create this Role, we can bind users to it with a ~RoleBinding~
+
+There are 2 kinds of ~RoleBinding~s:
+- RoleBinding
+This allows us to bind users to the same namespace as a Role.
+
+- ClusterRoleBinding
+It allows us to grant access to resources at cluster-level and to all namespaces
+
+
+To start the API server with the RBAC option, we use ~--authorization-mode=RBAC~
+We can also configure policies dynamically.
+
+
+*** Admission Control
+It is used to specify granular access control policies which include allowing privileged containers, checking on resource quota etc.
+There are different admission controllers to enforce these, eg: ~ResourceQuota, AlwaysAdmit, DefaultStorageClass~ etc. 
+ +They come into affect only after API requests are authenticated and authorized + +To use them, we must start the api server with the flag ~admission-control~ which takes a comma separated ordered list of controller names. +~--admission-control=NamespaceLifecycle,ResourceQuota,PodSecurityPolicy,DefaultStorageClass~ + + +** Services + +We will learn about services, which are used to group Pods to provide common access points from the external world. +We will learn about ~kube-proxy~ daemon, which runs each on worker node to provide access to services. +Also, we'll talk about *service discovery* and *service types* which decide the access scope of a service. + +*** Connecting users to Pods + +Pods are ephemeral, they can be terminated, rescheduled etc. We cannot connect to them using pod IP directly. Kubernetes provides a higher level abstraction called ~Service~ which logically groups Pods and a policy to access them. +The grouping is achieved with labels and selectors. + +Example consider this: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-16 23:33:07 +[[file:assets/screenshot_2018-06-16_23-33-07.png]] + +Here, we have grouped the pods into 2 logical groups based on the selectors ~frontend~ and ~db~. + +We can assign a name to the logical group, called a Service name eg: ~frontend-svc~ and ~db-svc~. + +Example: + +#+begin_src +kind: Service +apiVersion: v1 +metadata: + name: frontend-svc +spec: + selector: + app: frontend + ports: + - protocol: TCP + port: 80 + targetPort: 5000 +#+end_src + +Here, :top:, we are creating ~frontend-svc~ service. By default each service also gets an IP address which is routable only inside the cluster. +The IP attached to each service is aka as ~ClusterIP~ for that service (eg: ~172.17.0.4~ here in the diagram) + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-16 23:36:44 +[[file:assets/screenshot_2018-06-16_23-36-44.png]] + +The user/client now connects to the IP address which forwards the traffic to the pods attached to it. It does the load balancing, routing etc. + +We also in our service spec, defined a ~targetPort~ as 5000. So the service will route the traffic to port 5000 on the pods. If we don't select it, it will be the same port as the service port (80 in the example above) + +A tuple of Pods, IP addresses, along with the targetPort is referred to as a Service endpoint. In our case, frontend-svc has 3 endpoints: 10.0.1.3:5000, 10.0.1.4:5000, and 10.0.1.5:5000. + + + +All the worker nodes run ~kube-proxy~ which watches the API server for addition and removal of services. +For each new service, the ~kube-proxy~ updates the iptables of all the nodes to route the traffic for its ClusterIP to the service endpoints (node-ip:port tuples). It does the load balancing etc. The ~kube-proxy~ _implements_ the service abstraction. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-16 23:48:17 +[[file:assets/screenshot_2018-06-16_23-48-17.png]] + +*** Service Discovery + +Services are the primary mode of communication in Kubernetes, so we need a way to discover them at runtime. +*Kubernetes supports 2 methods of discovering a service*: + +- Environment Variables +As soon as a pod runs on any worker node, the ~kubelet~ daemon running on that node adds a set of environment variables in the pod for all the active services. 
+Eg: consider a service ~redis-master~, with exposed port ~6379~ and ClusterIP as ~172.17.0.6~ + +This would lead to the following env vars to be declared in the pods: +#+begin_src +REDIS_MASTER_SERVICE_HOST=172.17.0.6 +REDIS_MASTER_SERVICE_PORT=6379 +REDIS_MASTER_PORT=tcp://172.17.0.6:6379 +REDIS_MASTER_PORT_6379_TCP=tcp://172.17.0.6:6379 +REDIS_MASTER_PORT_6379_TCP_PROTO=tcp +REDIS_MASTER_PORT_6379_TCP_PORT=6379 +REDIS_MASTER_PORT_6379_TCP_ADDR=172.17.0.6 +#+end_src + +- DNS +Kubernetes has an add-on for DNS, which creates a DNS record for each Service and its format is ~my-svc.my-namespace.svc.cluster.local~. +Services within the same namespace can reach other service with just their name. For example, if we add a Service redis-master in the my-ns Namespace, then all the Pods in the same Namespace can reach to the redis Service just by using its name, redis-master. Pods from other Namespaces can reach the Service by adding the respective Namespace as a suffix, like redis-master.my-ns. + +This method is recommended. + + +*** ServiceType + +While defining a Service, we can also choose it's scope. We can decide if the Service +- is accessible only within the cluster +- is accessible from within the cluster AND the external world +- maps to an external entity which resides outside to the cluster + +The scope is decided with the ~ServiceType~ declared when creating the service. + +Service types - ClusterIP, NodePort + +**** ClusterIP, NodePort + +ClusterIP is the default ServiceType. A service gets its virtual IP using the ClusterIP. This IP is used for communicating with the service and is accessible only within the cluster + +With the NodePort ServiceType in addition to creating a ClusterIP, a port from the range 30,000-32,767 also gets mapped to the Service from all the worker nodes. +Eg: if the ~frontend-svc~ has the NodePort ~32233~, then when we connect to any worked node on ~32233~, the packets are routed to the assigned ClusterIP ~172.17.0.4~ + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 00:04:39 +[[file:assets/screenshot_2018-06-17_00-04-39.png]] + +NodePort is useful when we want to make our service accessible to the outside world. The end user connects to the worker nodes on the specified port, which forwards the traffic to the applications running inside the cluster. + +To access the service from the outside world, we need to configure a reverse proxy outside the Kubernetes cluster and map the specific endpoint to the respective port on the worked nodes. + +There is another ServiceType: LoadBalancer + +**** LoadBalancer +- With this ServiceType, NodePort and ClusterIP services are automatically created, and the external loadbalancer will route to them + +- The Services are exposed at a static port on each worker node. +- the Service is exposed externally using the underlying cloud provider's load balancer. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 00:09:07 +[[file:assets/screenshot_2018-06-17_00-09-07.png]] + +**** ServiceType: ExternalIP + +The cluster administrator can manually configure the service to be mapped to an external IP also. 
The traffic on the ExternalIP (and the service port) will be routed to one of the service endpoints + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 00:13:05 +[[file:assets/screenshot_2018-06-17_00-13-04.png]] + +**** ServiceType: ExternalName + +It is a special ServiceType that has no selectors, or endpoints. +When accessed within a cluster, it returns a ~CNAME~ record of an externally configured service. + +This is primarily used to make an externally configured service like ~my-db.aws.com~ available inside the cluster using just the name ~my-db~ to other services inside the same namespace. + +*** Deploying a Service + +Example of using NodePort + +#+begin_src +apiVersion: v1 +kind: Service +metadata: + name: web-service + labels: + run: web-service +spec: + type: NodePort + ports: + - port: 80 + protocol: TCP + selector: + app: nginx +#+end_src + +Create it using: + +#+begin_src +$ kubectl create -f webserver-svc.yaml +service "web-service" created + + +$ kubectl get svc +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +kubernetes ClusterIP 10.96.0.1 443/TCP 1d +web-service NodePort 10.110.47.84 80:31074/TCP 12s +#+end_src + +We can access it at: ~$(CLUSTER_IP):31074~ + +This is the port that will route the traffic to the service endpoint's port 80 +(recall again, the Service Endpoint is just the tuples of (node IP:service port), the service port is the ~targetPort~ in the service spec) + +Deploying MongoDB +We need a Deployment and a Service + +#+begin_src +apiVersion: apps/v1 +kind: Deployment +metadata: + name: rsvp-db + labels: + appdb: rsvpdb +spec: + replicas: 1 + selector: + matchLabels: + appdb: rsvpdb + template: + metadata: + labels: + appdb: rsvpdb + spec: + containers: + - name: rsvp-db + image: mongo:3.3 + ports: + - containerPort: 27017 + +$ kubectl create -f rsvp-db.yaml +deployment "rsvp-db" created + + +apiVersion: v1 +kind: Service +metadata: + name: mongodb + labels: + app: rsvpdb +spec: + ports: + - port: 27017 + protocol: TCP + selector: + appdb: rsvpdb + + +$ kubectl create -f rsvp-db-service.yaml +service "mongodb" created +#+end_src + + + +** Liveness and Readiness Probes + +They are used by kubelet to control the health of the application running inside the Pod's container. +Liveness probe is like the health check on AWS' ELB. If the health check fails, the container is restarted + +It can be defined as: +- liveness command +- liveness HTTP request +- TCP liveness probe + +Example: + +#+begin_src +apiVersion: v1 +kind: Pod +metadata: + labels: + test: liveness + name: liveness-exec +spec: + containers: + - name: liveness + image: k8s.gcr.io/busybox + args: + - /bin/sh + - -c + - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600 + livenessProbe: + exec: + command: + - cat + - /tmp/healthy + initialDelaySeconds: 3 + periodSeconds: 5 +#+end_src + +Here, we start a container with a command which creates a new file in ~/tmp~. +Next, we defined the ~livenessProbe~ to be a command which ~cat~s the file. If it exists, the container is healthy we say. 
+ +Deleting this file will trigger a restart + +We can also define a HTTP request as the liveness test: + +#+begin_src +livenessProbe: + httpGet: + path: /healthz + port: 8080 + httpHeaders: + - name: X-Custom-Header + value: Awesome + initialDelaySeconds: 3 + periodSeconds: 3 +#+end_src + +Here, we hit the ~/healthz~ endpoint on port ~8080~ + +We can also do TCP liveness probes +The kubelet attempts to open the TCP socket to the container which is running the application. If it succeeds, the application is considered healthy, otherwise the kubelet marks it as unhealthy and triggers a restart + +#+begin_src +livenessProbe: + tcpSocket: + port: 8080 + initialDelaySeconds: 15 + periodSeconds: 20 +#+end_src + +*** Readiness Probe + +Sometimes, the pod has to do some task before it can serve traffic. This can be loading a file in memory, downloading some assets etc. +We can use Readiness probes to signal that the container (in the context of Kubernetes, containers and Pods are used interchangeably) is ready to receive traffic. + +#+begin_src +readinessProbe: + exec: + command: + - cat + - /tmp/healthy + initialDelaySeconds: 5 + periodSeconds: 5 +#+end_src + +** Kubernetes volume management + +Kubernetes uses Volumes for persistent storage. We'll talk about PersistantVolume and PersistentVolumeClaim which help us attach volumes to Pods + +A Volume is essentially a directory backed by a storage medium. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 11:29:10 +[[file:assets/screenshot_2018-06-17_11-29-10.png]] + +A Volume is attached to a Pod and shared by the containers of that Pod. +The volume has the same lifespan as the Pod and it outlives the containers of the Pod - it allows data to be preserved across container restarts. + +A directory which is mounted inside a Pod is backed by the underlying Volume Type. The Volume Type decides the properties of the directory, like: size, content etc + +There are several volume types: +- emptyDir +An empty Volume is created for the Pod as soon as it is schedules on the worker node. The Volume's life is coupled with the Pod's. When the Pod dies, the contents of the emptyDir Volume are deleted + +- hostPath +We can share a directory from the host to the Pod. If the Pod dies, the contents of the hostPath still exist. Their use is not recommended because not all the hosts would have the same directory structure + +- gcePersistentDisk +We can mount Google Compute Engine's PD (persistent disk) into a Pod + +- awsElasticBlockStore +We can mount AWS EBS into a Pod + +- nfs +We can mount nfs share + +- iscsi +We can mount iSCSI into a Pod. Iscsi stands for (internet small computer systems interface), it is an IP based storage networking standard for linking data storage facilities. + +- secret +With the ~secret~ volume type, we can pass sensitive information such as passwords to pods. + +- persistentVolumeClaim + +We can attach a PersistentVolume to a pod using PersistentVolumeClaim. +PVC is a volume type + +*** PersistentVolumes + +In a typical setup, storage is maintained by the system administrators. The developer just gets instructions to use the storage, and doesn't have to worry about provisioning etc + +Using vanilla Volume Types makes the same model difficult in the Kubernetes. So we have PersistentVolume (PV), which provides APIs for users and administrators to manage and consume storage of the above Volume Types. 
+To manage - PV API resource type +To consume - PVC API resource type + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 11:38:36 +[[file:assets/screenshot_2018-06-17_11-38-36.png]] +PV :top: + +PVs can be dynamically provisioned as well - using the StorageClass resource. A StorageClass contains pre-defined provisioners and parameters to create a PV. +How it works is, the user sends a PVC request and this results in the creation of a PV + +Some of the Volume Types that support managing storage using PV: + +- GCEPersistentDisk +- AWSElasticBlockStore +- AzureFile +- NFS +- iSCSI + +*** PersistentVolumeClaim + +A PVC is a request for storage by the user. User requests PV resources based on size, access modes etc. +Once a suitable PV is found, it is bound to a PVC. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 11:44:02 +[[file:assets/screenshot_2018-06-17_11-44-02.png]] + +The administrator provisions PVs, the user requests them using PVC. Once the suitable PVs are found, they are bound to the PVC and given to the user to use. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 11:44:52 +[[file:assets/screenshot_2018-06-17_11-44-52.png]] + +After use, the PV can be released. The underlying PV can then be reclaimed and used by someone else. + +*** CSI - Container Storage Interface + +Note: Kubernetes interfaces are always CXI - Container X Interface (eg: CNI, CSI etc) + +We have several CO - Container Orchestraters (Kubernetes, Mesos, Cloud Foundry). Each manages volumes in its own way. This lead to a difficult time for the storage vendors as they have to support all the different COs. +Also, the code written by the vendors has to live "in-tree" in the COs and has to be tied to the release cycle of the COs. This is not ideal + +So, the volume interface is standardized now so that a volume plugin using the CSI would work for all COs. + + +** ConfigMaps and Secrets + +While deploying an application, we may need to pass runtime parameters like endpoints, passwords etc. To do this we can use ~ConfigMap API~ resource. + +We can use ConfigMaps to pass key-value pairs, which can be consumed by pods, or any other system components like controllers. +There are 2 ways to create ConfigMaps: + +*** From literal values +Recall literal values are just values defined "in-place" + +#+begin_src +$ kubectl create configmap my-config --from-literal=key1=value1 --from-literal=key2=value2 +configmap "my-config" created +#+end_src + + +*** From files + +#+begin_src +apiVersion: v1 +kind: ConfigMap +metadata: + name: customer1 +data: + TEXT1: Customer1_Company + TEXT2: Welcomes You + COMPANY: Customer1 Company Technology Pct. Ltd. + +$ kubectl create -f customer1-configmap.yaml +configmap "customer1" created +#+end_src + +We can use the ConfigMap values from inside the Pod using: + +#+begin_src +.... + containers: + - name: rsvp-app + image: teamcloudyuga/rsvpapp + env: + - name: MONGODB_HOST + value: mongodb + - name: TEXT1 + valueFrom: + configMapKeyRef: + name: customer1 + key: TEXT1 + - name: TEXT2 + valueFrom: + configMapKeyRef: + name: customer1 + key: TEXT2 + - name: COMPANY + valueFrom: + configMapKeyRef: + name: customer1 + key: COMPANY +.... +#+end_src + +We can also mount a ConfigMap as a Volume inside a Pod. For each key, we will see a file in the mount path and the content of that file becomes the respective key's value. 
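+
+For example, a minimal sketch of mounting the ~customer1~ ConfigMap from above as a volume (the pod name and mount path here are made up for illustration):
+
+#+begin_src yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: config-demo
+spec:
+  containers:
+  - name: app
+    image: busybox
+    command: ["sleep", "3600"]
+    volumeMounts:
+    - name: config-volume
+      mountPath: /etc/config   # TEXT1, TEXT2, COMPANY show up as files here
+  volumes:
+  - name: config-volume
+    configMap:
+      name: customer1
+#+end_src
+
+Inside the container, ~/etc/config/TEXT1~ would then contain ~Customer1_Company~, and similarly for the other keys.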
+ +*** Secrets + +Secrets are similar to ConfigMaps in that they are key-value pairs that can be passed on to Pods etc. The only difference being they deal with sensitive information like passwords, tokens, keys etc + +The Secret data is stored as plain text inside etcd, so the administrators must restrict access to the api server and etcd + +We can create a secret literally +~$ kubectl create secret generic my-password --from-literal=password=mysqlpassword~ + +The above command would create a secret called my-password, which has the value of the password key set to mysqlpassword. + + +Analyzing the get and describe examples below, we can see that they do not reveal the content of the Secret. The type is listed as Opaque. +#+begin_src +$ kubectl get secret my-password +NAME TYPE DATA AGE +my-password Opaque 1 8m + +$ kubectl describe secret my-password +Name: my-password +Namespace: default +Labels: +Annotations: + +Type Opaque + +Data +==== +password.txt: 13 bytes +#+end_src + +We can also create a secret manually using a YAML configuration file. With secrets, each object data must be encoded using ~base64~. + +So: + +#+begin_src +# get the base64 encoding of password +$ echo mysqlpassword | base64 + +bXlzcWxwYXNzd29yZAo= + +# now use it to create a secret +apiVersion: v1 +kind: Secret +metadata: + name: my-password +type: Opaque +data: + password: bXlzcWxwYXNzd29yZAo= +#+end_src + + +Base64 is not encryption of course, so decrypting is easy: +#+begin_src +$ echo "bXlzcWxwYXNzd29yZAo=" | base64 --decode +#+end_src + +Like ConfigMaps, we can use Secrets in Pods using: +- environment variables +#+begin_src +..... + spec: + containers: + - image: wordpress:4.7.3-apache + name: wordpress + env: + - name: WORDPRESS_DB_HOST + value: wordpress-mysql + - name: WORDPRESS_DB_PASSWORD + valueFrom: + secretKeyRef: + name: my-password + key: password +..... +#+end_src + +- mounting secrets as a volume inside a Pod. A file would be created for each key mentioned in the Secret whose content would be the respective value. + + +*** Ingress + +We earlier saw how we can access our deployed containerized application from the external world using Services. We talked about ~LoadBalancer~ ServiceType which gives us a load balancer on the underlying cloud platform. This can get expensive if we use too many Load Balancers. + +We also talked about NodePort which gives us a port on each worker node and we can have a reverse proxy that would route the requests to the (node-ip:service-port) tuples. +However, this can get tricky, as we need to keep track of assigned ports etc. + +Kubernetes has ~Ingress~ which is another method we can use to access our applications from the external world. + +With Services, routing rules are attached to a given Service, they exist as long as the service exists. If we decouple the routing rules from the application, we can then update our application without worrying about its external access. + +The Ingress resource helps us do that. 
+ +According to kubernetes.io +~an ingress is a collection of rules that allow inbound connections to reach the cluster Services~ + +To allow inbound connection to reach the cluster Services, ingress configures a L7 HTTP load balancer for Services and provides the following: +- TLS - transport layer security +- Name based virtual hosting +- Path based routing +- Custom rules + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 12:45:17 +[[file:assets/screenshot_2018-06-17_12-45-17.png]] + +With Ingress, users don't connect directly the a Service. They reach the Ingress endpoint, and from there, the request is forwarded to the respective Service. (note the usage of request, not packets since Ingress is L7 load balancer) + +#+begin_src +apiVersion: extensions/v1beta1 +kind: Ingress +metadata: + name: web-ingress + namespace: default +spec: + rules: + - host: blue.example.com + http: + paths: + - backend: + serviceName: webserver-blue-svc + servicePort: 80 + - host: green.example.com + http: + paths: + - backend: + serviceName: webserver-green-svc + servicePort: 80 +#+end_src + + +The requests for both (blue.example.com and green.example.com) will come to the same Ingress endpoint which will route it to the right Service endpoint + +The example above :top: is an example of name based virtual hosting ingress rule + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 13:08:20 +[[file:assets/screenshot_2018-06-17_13-08-20.png]] + +We can also have fan out ingress rules, in which we send the requests like example.com/blue and example.com/green which would be forwarded to the correct Service + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 13:09:04 +[[file:assets/screenshot_2018-06-17_13-09-04.png]] + +The Ingress resource uses the Ingress Controller which does the request forwarding. + +**** Ingress Controller + +It is an application that watches the master node's API server for changes in the ingress resources and updates the L7 load balancer accordingly. + +Kubernetes has several different Ingress Controllers (eg: Nginx Ingress Controller) and you can write yours too. + +Once the controller is deployed (recall it's a normal application) we can use it with an ingress resource + +#+begin_src +$ kubectl create -f webserver-ingress.yaml +#+end_src + +*** Other Kubernetes topics + +Kubernetes also has features like auto-scaling, rollbacks, quota management etc + + +**** Annotations +We can attach arbitrary non-identifying metadata to any object, in a key-value format + +#+begin_src +"annotations": { + "key1" : "value1", + "key2" : "value2" +} +#+end_src + +They are not used to identify and select objects, but for: +- storing release ids, git branch info etc +- phone/pages numbers +- pointers to logging etc +- descriptions + +Example: + +#+begin_src +apiVersion: extensions/v1beta1 +kind: Deployment +metadata: + name: webserver + annotations: + description: Deployment based PoC dates 2nd June'2017 +.... +.... 
+#+end_src + +Annotations can be looked at by using describe + +~$ kubectl describe deployment webserver~ + +**** Deployments + +If we have recorded our Deployment before doing our update, we can revert back to a know working state if the deployment fails + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 13:55:43 +[[file:assets/screenshot_2018-06-17_13-55-43.png]] + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-06-17 13:55:28 +[[file:assets/screenshot_2018-06-17_13-55-28.png]] + + +Deployments also has features like: +- autoscaling +- proportional scaling +- pausing and resuming + +A deployment automatically creates a ReplicaSet - which makes sure the correct number of Pods are present and pass the liveness probe. + +**** Jobs + +A Job creates 1 or more Pods to perform a given task. The Job object takes the responsibility of Pod failures and makes sure the task is completed successfully. After the task, the Pods are terminated automatically. +We can also have cron jobs etc + +**** Quota Management +In a multi tenant deployment, fair usage is vital. Administrators can use ResourceQuota object to limit resource consumption per Namespace + +We can have the following types of quotas per namespace: +- Compute Resource Quota +We can limit the compute resources (CPU, memory etc) that can be requested in a given namespace + +- Storage Resource Quota +We can limit the storage resources (PVC, requests.storage etc) + +- Object Count Quota +We can restrict the number of objects of a given type (Pods, ConfigMaps, PVC, ReplicationControllers, Services, Secrets etc) + +This is implemented using cgroups under the hood + +**** DaemonSets +If we want a "ghost" Pod(a pod that is running on all nodes at all times), for eg to collect monitoring data from all nodes etc we can use DaemonSet object. + +Whenever a node is added to the cluster, a Pod from a given DaemonSet is created on it. If the DaemonSet is deleted, all Pods are deleted as well. + +**** StatefulSets +The StatefulSet controller is used for applications that require a unique identity such as name, network identifications, strict ordering etc - eg: mysql cluster, etcd cluster + +The StatefulSet controller provides identity and guaranteed ordering of deployment and scaling to Pods. + +**** Kubernetes Federation +We can manage multiple Kubernetes clusters from a single control plane using Kubernetes Federation. We can sync resources across the clusters and have cross-cluster discovery, allowing us to do Deployments across regions and access them using a global DNS record. + +The Federation is very useful when we want to build a hybrid solution, in which we can have one cluster running inside our private datacenter and another one on the public cloud. We can also assign weights for each cluster in the Federation, to distribute the load as per our choice. + +**** Custom Resources +In Kubernetes, a resource is an API endpoint. It stores a collection of API objects. Eg: a Pod resource contains all the Pod objects. + +If the existing Kubernetes resources are not sufficient to fulfill our requirements, we can create new resources using *custom resources* + +To make a resource declarative(like the rest of Kubernetes), we have to write a custom controller - which can interpret the resource structure and perform the required actions. 
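+
+As a rough illustration, a CustomResourceDefinition - the first of the two mechanisms listed next - might look something like this (a hedged sketch; the ~CronTab~ resource and the ~stable.example.com~ group are made-up examples):
+
+#+begin_src yaml
+apiVersion: apiextensions.k8s.io/v1beta1
+kind: CustomResourceDefinition
+metadata:
+  # name must match <plural>.<group>
+  name: crontabs.stable.example.com
+spec:
+  group: stable.example.com
+  version: v1
+  scope: Namespaced
+  names:
+    plural: crontabs
+    singular: crontab
+    kind: CronTab
+#+end_src
+
+Once this is created, the API server starts serving ~/apis/stable.example.com/v1/namespaces/<ns>/crontabs~, and our custom controller watches those objects and acts on them.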
+ +There are 2 ways of adding custom resources: +- CRDs - custom resource definitions +- API aggregation +They are subordinate API servers which sit behind the primary API server and act as proxy. They offer more fine grained control. + +**** Helm +When we deploy an application on Kubernetes, we have to deal with a lot of manifests (the yaml containing the spec) such as Deployments, Services, Volume Claims, Ingress etc. +It can be too much work to deploy them one by one specially for the common use cases like deploying a redis cluster etc. + +We can bundle these manifests after templatizing them into a well-defined format (with some metadata). This becomes a package essentially - we call them Charts. +They can then be served by package managers like Helm. + +Helm is a package manager (analogous to yum and apt) for Kubernetes, which can install/update/delete those Charts in the Kubernetes cluster. + +Helm has two components: +- A client called helm, which runs on your user's workstation +- A server called tiller, which runs inside your Kubernetes cluster. + + +The client helm connects to the server tiller to manage Charts + +**** Monitoring and Logging +2 popular solutions are: +- Heapster +It is a cluster wide aggregator of monitoring and event data which is natively supported on Kubernetes. + +- Prometheus +It can also be used to scrape resource usage from different Kubernetes components and objects. + +We can collect logs from the different components of Kubernetes using fluentd, which is an open source data collector. We can ship the logs to Elasticsearch etc. + + +* Kubernetes Up and Running + +#+BEGIN_QUOTE +Kubernetes would like to thank every sysadmin who has woken up at 3am to restart a process. +#+END_QUOTE + + +Kubernetes intends to radically simplify the task of building, deploying, and maintaining distributed systems. + + +From the first programming languages, to object-oriented programming, to the development of virtualization and cloud infrastructure, the history of computer science is a history of the development of abstractions that hide complexity and empower you to build ever more sophisticated applications. + +** Benefits of Kubernetes include: + +**** Velocity +The speed with which you can deploy new features and components - while keeping the service up reliably. +Kubernetes provides this by providing immutability, declarative configuration, self healing systems + +***** Immutability +Containers and Kubernetes encourage developers to build distributed systems that adhere to the principles of immutable infrastructure. + +With immutable infrastructure, once an artifact is created in the system it does not change via user modifications. + +Traditionally, this was not the case, they were treated as mutable infrastructure. With mutable infrastructure, changes are applied as incremental updates to an existing system. + +A system upgrade via the apt-get update tool is +a good example of an update to a mutable system. Running apt sequentially +downloads any updated binaries, copies them on top of older binaries, and makes incremental updates to configuration files. + +In contrast, in an immutable system, rather than a series of incremental updates and changes, an entirely new, complete image is built, where the update simply replaces the entire image with the newer image in a single operation. There are no incremental changes. 
+
+In Draup, we have the artifact which is the source code, which is replaced by a new git clone, but the pip packages etc may be updated/removed incrementally. Hence, we use mutable infrastructure.
+In services, we have docker compose, and consequently immutable infrastructure.
+
+Consider containers. What would you rather do?
+- You can log in to a container, run a command to download your new software, kill the old server, and start the new one.
+- You can build a new container image, push it to a container registry, kill the existing container, and start a new one.
+
+In the 2nd case, the entire artifact replacement makes it easy to track changes that you made, and also to roll back your changes. Go-Jek's VP's lecture during the recent #go-include meetup comes to mind, where he spoke about "snowflakes" that are created by mutable infrastructure.
+
+***** Declarative Configuration
+
+Everything in Kubernetes is a declarative configuration object that represents the desired state of the system. It is Kubernetes's job to ensure that the actual state of the world matches this desired state.
+
+Declarative configuration is an alternative to imperative configuration, where the state of the world is defined by the execution of a series of instructions rather than a declaration of the desired state of the world.
+
+While imperative commands define actions, declarative configurations define state.
+
+To understand these two approaches, consider the task of producing three replicas of a piece of software. With an imperative approach, the configuration would say: "run A, run B, and run C." The corresponding declarative configuration would be "replicas equals three."
+
+
+***** Self healing systems
+As a concrete example of the self-healing behavior, if you assert a desired state of three replicas to Kubernetes, it does not just create three replicas - it continuously ensures that there are exactly three replicas. If you manually create a fourth replica, Kubernetes will destroy one to bring the number back to three. If you manually destroy a replica, Kubernetes will create one to again return you to the desired state.
+
+**** Scaling (of both software and teams)
+
+Kubernetes achieves and enables scaling by favoring decoupled architectures.
+
+***** Decoupling
+In a decoupled architecture each component is separated from other components by defined APIs and service load balancers.
+
+Decoupling components via load balancers makes it easy to scale the programs that make up your service, because increasing the size (and therefore the capacity) of the program can be done without adjusting or reconfiguring any of the other layers of your service. Each can be scaled independently.
+
+Decoupling servers via APIs makes it easier to scale the development teams because each team can focus on a single, smaller microservice with a comprehensible surface area.
+
+Crisp APIs between microservices (defining an interface b/w services) limit the amount of cross-team communication overhead required to build and deploy software. Hence, teams can be scaled effectively.
+
+***** Easy scaling
+We can have autoscaling at 2 levels (see the sketch after this list):
+- pods
+  - they can be configured to be scaled up or down depending on some predefined condition
+  - this assumes that the nodes have resources to support the new number of pods
+- cluster nodes
+  - since each node is exactly like the previous one, adding a new node to the cluster is trivial and can be done with a few commands or a prebaked image. 
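+
+A minimal sketch of Pod-level autoscaling (assuming a Deployment named ~webserver~ already exists and resource metrics are being collected):
+
+#+begin_src yaml
+apiVersion: autoscaling/v1
+kind: HorizontalPodAutoscaler
+metadata:
+  name: webserver
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: webserver
+  minReplicas: 2
+  maxReplicas: 10
+  targetCPUUtilizationPercentage: 70   # add/remove replicas to stay around 70% CPU
+#+end_src
+
+The same thing can be created imperatively with ~kubectl autoscale deployment webserver --min=2 --max=10 --cpu-percent=70~.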
+
+Also, since Kubernetes allows us to bin pack, we can place containers from different services onto a single server. This reduces statistical noise and allows us to have a more reliable forecast about growth of different services.
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-12 15:12:58
+[[file:assets/screenshot_2018-08-12_15-12-58.png]]
+
+Here, we see each team is decoupled by APIs. The Hardware Ops team has to provide the hardware. The kernel team just needs the hardware and makes sure their kernel provides the system call API. The cluster team needs that API so that they can provision the cluster. The application developers need the kube API to run their apps. Everyone is happy.
+
+**** Abstracting the infrastructure
+When your developers build their applications in terms of container images and deploy them in terms of portable Kubernetes APIs, transferring your application between environments, or even running in hybrid environments, is simply a matter of sending the declarative config to a new cluster.
+
+Kubernetes has a number of plug-ins that can abstract you from a particular cloud. For example, Kubernetes services know how to create load balancers on all major public clouds as well as several different private and physical infrastructures. Likewise, Kubernetes PersistentVolumes and PersistentVolumeClaims can be used to abstract your applications away from specific storage implementations.
+
+
+Container images bundle an application and its dependencies, under a root filesystem, into a single artifact. The most popular container image format is the Docker image format, the primary image format supported by Kubernetes.
+
+Docker images also include additional metadata used by a container runtime to start a running application instance based on the contents of the container image.
+
+Each layer adds, removes, or modifies files from the preceding layer in the filesystem. This is an example of an overlay filesystem. There are a variety of different concrete implementations of such filesystems, including aufs, overlay, and overlay2.
+
+*** Dockerfiles
+
+There are several gotchas that come up when people begin to experiment with container images and that lead to overly large images. The first thing to remember is that files that are removed by subsequent layers in the system are actually still present in the images; they're just inaccessible.
+
+Another pitfall that people fall into revolves around image caching and building. Remember that each layer is an independent delta from the layer below it. Every time you change a layer, it changes every layer that comes after it. Changing the preceding layers means that they need to be rebuilt, repushed, and repulled to deploy your image to development.
+
+In general, you want to order your layers from least likely to change to most likely to change in order to optimize the image size for pushing and pulling (see the Dockerfile sketch at the end of this subsection).
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-12 20:35:27
+[[file:assets/screenshot_2018-08-12_20-35-27.png]]
+
+In the 1st case, every time the server.js file changes, the node package layer has to be pushed and pulled.
+
+Kubernetes relies on the fact that images described in a Pod manifest are available across every machine in the cluster, so that the scheduler can schedule Pods on any node.
+
+Recall we heard this point at the Kubernetes meetup organized by the Redhat guys at Go-Jek. 
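+
+A hedged Dockerfile sketch of the layer-ordering advice above (the node/server.js example mirrors the figure; the file names and base image are illustrative):
+
+#+begin_src dockerfile
+FROM node:8
+WORKDIR /app
+# Dependencies change rarely, so install them in an early layer
+# that stays cached across most builds.
+COPY package.json .
+RUN npm install
+# Application code changes often, so copy it last; editing server.js
+# only invalidates this final layer.
+COPY server.js .
+CMD ["node", "server.js"]
+#+end_src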
+ +Docker provides an API for creating application containers on Linux and Windows systems. Note, docker now has windows containers as well. + +It’s important to note that unless you explicitly delete an image it will live on your system forever, even if you build a new image with an identical name. Building this new image simply moves the tag to the new image; it doesn’t delete or replace the old image. +Consequently, as you iterate while you are creating a new image, you will often create many, many different images that end up taking up unnecessary space on your computer. +To see the images currently on your machine, you can use the docker images command. + + +** kubectl + +kubectl can be used to manage most Kubernetes objects such as pods, ReplicaSets, and services. kubectl can also be used to explore and verify the overall health of the cluster. + + +*** kubectl describe nodes + +This gives us information about the instance OS, memory, harddisk space, docker version, running pods etc + +#+begin_src +Non-terminated Pods: (2 in total) + Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits + --------- ---- ------------ ---------- --------------- ------------- + kube-system aws-node-hl4lc 10m (0%) 0 (0%) 0 (0%) 0 (0%) + kube-system kube-proxy-bgblb 100m (5%) 0 (0%) 0 (0%) 0 (0%) +#+end_src + +Here, note the "requests" and "limits" +Requests are the resources requested by the pod. It is guaranteed to be present. The "limit" is the maximum resources the pod can consume. + +A pod’s limit can be higher than its request, in which case the extra resources are supplied on a best-effort basis. They are not guaranteed to be present on the node. + +** Cluster components +Many of the components of Kubernetes are deployed using Kubernetes itself. All of these components run in the kube-system namespace. + +*** Kubernetes proxy +It implements the "service" abstraction. It is responsible for routing network traffic to load balanced services in the Kubernetes cluster. +kube-proxy is implemented in Kubernetes using the ~DaemonSet~ object. + +*** Kubernetes DNS +Kubernetes also runs a DNS server, which provides naming and discovery for the services that are defined in the cluster. This DNS server also runs as a replicated service on the cluster. + +There is also a Kubernetes service that performs load-balancing for the DNS server + +#+begin_src +$ k8s kubectl get services --namespace=kube-system kube-dns +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +kube-dns ClusterIP 10.100.0.10 53/UDP,53/TCP 1h +#+end_src + +For all containers, the DNS for that pod has been set to point to this internal ip in the ~/etc/resolve.conf~ file for that container. + +*** Kubernetes UI + +Needs to be deployed. + +** Basic ~kubectl~ commands +There are some basic kubectl commands that apply to all Kubernetes objects. + +*** Namespaces +Kubernetes uses namespaces to organize objects in the cluster. By default, the ~default~ namespace is used. If you want to use a different namespace, you can pass kubectl the --namespace flag. + + +*** Contexts +If you want to change the default namespace more permanently, you can use a context. + +A context is like a set of settings. It can either have just a different namespace configuration, or can even point to a whole new cluster. 
+Note: creating and using contexts gets recorded in ~$HOME/.kube/config~
+
+Let's create a different namespace context:
+~kubectl config set-context my-context --namespace=mystuff~
+
+This creates a new context, but it doesn't actually start using it yet. To use this newly created context, you can run:
+
+~$ kubectl config use-context my-context~
+
+*** Viewing Kubernetes resources
+Everything contained in Kubernetes is represented by a RESTful resource.
+
+Each Kubernetes object exists at a unique HTTP path; for example, https://your-k8s.com/api/v1/namespaces/default/pods/my-pod leads to the representation of a pod in the default namespace named my-pod. The kubectl command makes HTTP requests to these URLs to access the Kubernetes objects that reside at these paths. By default, it prunes information so that it fits on a single line. To get more info, use ~-o wide~, ~-o json~, or ~-o yaml~
+
+The most basic command for viewing Kubernetes objects via kubectl is get.
+
+Eg: ~kubectl get <resource-name>~ will get a listing of all resources of that type in the current namespace.
+To get a particular object, ~kubectl get <resource-name> <object-name>~
+
+kubectl uses the JSONPath query language to select fields in the returned object.
+~kubectl get pods my-pod -o jsonpath --template={.status.podIP}~
+
+*** CRUDing Kubernetes objects
+Objects in the Kubernetes API are represented as JSON or YAML files. These files are either returned by the server in response to a query or posted to the server as part of an API request.
+
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-15 14:49:29
+[[file:assets/screenshot_2018-08-15_14-49-29.png]]
+
+To delete, ~kubectl delete -f obj.yaml~
+
+Labels and annotations are tags for your objects.
+You can update the labels and annotations on any Kubernetes object using the annotate and label commands. For example, to add the color=red label to a pod named bar, you can run: ~$ kubectl label pods bar color=red~
+
+*** Debugging commands
+
+To view logs of a container: ~$ kubectl logs <pod-name>~
+To execute a command in a container: ~kubectl exec -it <pod-name> -- bash~
+To copy files to and from a container using the cp command: ~kubectl cp <pod-name>:/path/to/remote/file /path/to/local/file~
+
+
+** Pods
+
+Containers in a Pod can share volumes defined in the Pod.
+#+ATTR_ORG: :width 400
+#+ATTR_ORG: :height 400
+#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-15 15:00:49
+[[file:assets/screenshot_2018-08-15_15-00-49.png]]
+
+Here, the web serving and git containers are part of the same logical group, so they are in the same pod. But they are still in separate containers since we don't want one's memory leak to OOM (out of memory, process terminated) the other.
+
+The name goes with the whale theme of Docker containers, since a Pod is also a group of whales.
+
+Each container within a Pod runs in its own cgroup (which means they have their own limits on resource usage), but they share a number of Linux namespaces (eg: network)
+
+Applications running in the same Pod share the same IP address and port space (network namespace), have the same hostname (UTS namespace), and can communicate using native interprocess communication channels over System V IPC or POSIX message queues (IPC namespace).
+
+However, applications in different Pods are isolated from each other (since they don't share the namespaces); they have different IP addresses, different hostnames, and more. Containers in different Pods running on the same node might as well be on different servers. 
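+
+A minimal sketch of two containers sharing a Pod's network namespace (the image names are just examples):
+
+#+begin_src yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: two-containers
+spec:
+  containers:
+  - name: web
+    image: nginx:1.7.9
+    ports:
+    - containerPort: 80
+  - name: poller
+    image: busybox
+    # same network namespace, so nginx is reachable on localhost
+    command: ["sh", "-c", "while true; do wget -qO- http://localhost:80/; sleep 10; done"]
+#+end_src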
+ + +Before putting your containers in same Pod, think: +- do they have a truly symbiotic relationship? + - as in, can they work if they are on different machines +- do you want to scale them together? + - as in, it doesn't make sense to scale 1st container without also scaling 2nd container + +In general, the right question to ask yourself when designing Pods is, “Will these containers work correctly if they land on different machines?” If the answer is “no,” a Pod is the correct grouping for the containers. If the answer is “yes,” multiple Pods is probably the correct solution. + +In the example of Git pod web server pod, two containers interact via a local filesystem. It would be impossible for them to operate correctly if the containers were scheduled on different machines. + +Pods are described in a Pod manifest. The Pod manifest is just a text-file representation of the Pod Kubernetes API object. + +Declarative configuration in Kubernetes is the basis for all of the self-healing behaviors in Kubernetes that keep applications running without user action. + +The Kubernetes API server accepts and processes Pod manifests before storing them in persistent storage (etcd). +The scheduler also uses the Kubernetes API to find Pods that haven’t been scheduled to a node. Once scheduled to a node, Pods don’t move and must be explicitly destroyed and rescheduled. + +The simplest way to create a Pod is via the imperative ~kubectl run command~. +Eg: ~$ kubectl run kuard --image=gcr.io/kuar-demo/kuard-amd64:1~ + +#+begin_src yaml +apiVersion: v1 +kind: Pod +metadata: + name: kuard +spec: + containers: + - image: gcr.io/kuar-demo/kuard-amd64:1 + name: kuard + ports: + - containerPort: 8080 + name: http + protocol: TCP +#+end_src + +This is equivalent to: + +The Pod manifest will be submitted to the Kubernetes API server. The Kubernetes system will then schedule that Pod to run on a healthy node in the cluster, where it will be monitored by the kubelet daemon process. + +#+begin_src +$ docker run -d --name kuard \ --publish 8080:8080 gcr.io/kuar-demo/kuard-amd64:1 +#+end_src + +Get running pods: ~$ kubectl get pods~ + +Get more info: ~kubectl describe pods kuard~ +Deleting a pod: ~kubectl delete pods/kuard~ or via the file ~kubectl delete -f kuard-pod.yaml~ + +When a Pod is deleted, it is not immediately killed. Instead, if you run kubectl get pods you will see that the Pod is in the Terminating state. All Pods have a +termination grace period. By default, this is 30 seconds. When a Pod is transitioned to Terminating it no longer receives new requests. In a serving +scenario, the grace period is important for reliability because it allows the Pod to finish any active requests that it may be in the middle of processing before it is terminated. +It’s important to note that when you delete a Pod, any data stored in the containers associated with that Pod will be deleted as well. If you want to persist data across multiple instances of a Pod, you need to use PersistentVolumes. + +The data is not deleted when the pod is restarted etc + +*you can port forward directly to localhost using kubectl* +kubectl port-forward kuard 8080:8080 + +On running this :top:, a secure tunnel is created from your local machine, through the Kubernetes master, to the instance of the Pod running on one of the worker nodes. 
+ + +You can run commands on the pod too: +~$ kubectl exec kuard date~, even an interactive one ~$ kubectl exec -it kuard bash~ + +Copying files to and fro is easy: +~$ kubectl cp :/captures/capture3.txt ./capture3.txt~ + +Generally speaking, copying files into a container is an antipattern. You really should treat the contents of a container as immutable. + +When you run your application as a container in Kubernetes, it is automatically kept alive for you using a process health check. This health check simply ensures that the main process of your application is always running. If it isn’t, Kubernetes restarts it. + +However, in most cases, a simple process check is insufficient. For example, if your process has deadlocked and is unable to serve requests, a process health check will still believe that your application is healthy since its process is still running. +To address this, Kubernetes introduced health checks for application liveness. Liveness health checks run application-specific logic (e.g., loading a web page) to verify that the application is not just still running, but is functioning properly. Since these liveness health checks are application-specific, you have to define them in your Pod manifest. + +Liveness probes are defined per container, which means each container inside a Pod is health-checked separately. + +#+begin_src +apiVersion: v1 +kind: Pod +metadata: + name: kuard +spec: + containers: + - image: gcr.io/kuar-demo/kuard-amd64:1 + name: kuard + ports: + - containerPort: 8080 + name: http + protocol: TCP + livenessProbe: + httpGet: + path: /healthy + port: 8080 + initialDelaySeconds: 5 + timeoutSeconds: 1 + periodSeconds: 10 + failureThreshold: 3 +#+end_src + +If the probe fails, the pod is restarted. +Details of the restart can be found with kubectl describe kuard. The “Events” section will have text similar to the +following: + Killing container with id docker://2ac946...:pod "kuard_default(9ee84...)" + container "kuard" is unhealthy, it will be killed and re-created. + + +Kubernetes makes a distinction between liveness and readiness. Liveness determines if an application is running properly. Containers that fail liveness checks are restarted. Readiness describes when a container is ready to serve user requests. Containers that fail readiness checks are removed from service load balancers. +Readiness probes are configured similarly to liveness probes. + +*** Health checks +Kubernetes supports different kinds of healthchecks: +- tcpSocket + - it the tcp connection succeeds, considered healthy. This is for non http applications, like databases etc +- exec + - These execute a script or program in the context of the container. Following typical convention, if this script returns a zero exit code, the probe succeeds; otherwise, it fails. exec scripts are often useful for custom application validation logic that doesn’t fit neatly into an HTTP call. + +Kubernetes allows users to specify two different resource metrics. Resource requests specify the minimum amount of a resource required to run the application. Resource limits specify the maximum amount of a resource that an application can consume. + +#+begin_src +apiVersion: v1 +kind: Pod +metadata: + name: kuard +spec: + containers: + - image: gcr.io/kuar-demo/kuard-amd64:1 + name: kuard +resources: + requests: + cpu: "500m" + memory: "128Mi" +ports: + - containerPort: 8080 + name: http + protocol: TCP +#+end_src + +Here, :top:, we requested a machine with at least half cpu free, and 128mb of memory. 
+ +a Pod is guaranteed to have at least the requested resources when running on the node. Importantly, “request” specifies a minimum. It does not specify a maximum cap on the resources a Pod may use. + +Imagine that we have container whose code attempts to use all available CPU cores. Suppose that we create a Pod with this container that requests 0.5 CPU. + +Kubernetes schedules this Pod onto a machine with a total of 2 CPU cores. +As long as it is the only Pod on the machine, it will consume all 2.0 of the available cores, despite only requesting 0.5 CPU. + +If a second Pod with the same container and the same request of 0.5 CPU lands on the machine, then each Pod will receive 1.0 cores. + +If a third identical Pod is scheduled, each Pod will receive 0.66 cores. Finally, if a fourth identical Pod is scheduled, each Pod will receive the 0.5 core it requested, and the node will be at capacity. + +CPU requests are implemented using the cpu-shares functionality in the Linux kernel. + +With memory, it is a little different. If we put a pod with say 256MB requested and it takes up everything, on adding a new pod, we can't take away half the memory, since it is being used. Here, the pod is killed and restarted, but with less available memory on the machine for the container to consume. + +To cap the max a pod can use, we can set the Limits + +#+begin_src +apiVersion: v1 +kind: Pod +metadata: + name: kuard +spec: + containers: + - image: gcr.io/kuar-demo/kuard-amd64:1 + name: kuard + resources: + requests: + cpu: "500m" + memory: "128Mi" + limits: + cpu: "1000m" + memory: "256Mi" + ports: + - containerPort: 8080 + name: http + protocol: TCP +#+end_src + +A container with a CPU limit of 0.5 cores will only ever get 0.5 cores, even if the CPU is otherwise idle. + +*** Persisting Data with Volumes + +When a Pod is deleted or a container restarts, any and all data in the container’s filesystem is also deleted. + +To persist data beyond the pod, use Volumes. + +There are 2 additions to add volumes to pods: +- spec.volumes + - This array defines all of the volumes that may be accessed by containers in the Pod manifest. Note that not all containers are required to mount all volumes defined in the Pod. +- volumeMounts + - This array defines the volumes that are mounted into a particular container, and the path where each volume should be mounted. Note that two different containers in a Pod can mount the same volume at different mount paths. + + +So, first in spec.volumes, we define what volumes may be used by the containers in the Pod. And, in volumeMounts, we actually use them. + +#+begin_src +apiVersion: v1 +kind: Pod +metadata: + name: kuard +spec: + volumes: + - name: "kuard-data" + hostPath: + path: "/var/lib/kuard" + containers: + - image: gcr.io/kuar-demo/kuard-amd64:1 + name: kuard + volumeMounts: + - mountPath: "/data" + name: "kuard-data" + ports: + - containerPort: 8080 + name: http + protocol: TCP +#+end_src + +Here, we define kuard-data as the volume, and then mount it on the kuard container. + +There are various types of volumes: +- ~emptyDir~ + - Such a volume is scoped to the Pod’s lifespan, but it can be shared between two containers. (in our example above, this forms the basis for communication between our Git sync and web serving containers). 
This survives the pod restart +- ~hostDir~ + - this can mount arbitrary locations on the worker node into the container + - this was used in the example above :top: + - This can be used when the pod wants to direct access to the instance's block storage for eg. But shouldn't be used to store ordinary data since not all the hosts would have the same underlying dir structure. +- network storage + - if you want the data to stay with the Pod even when the pod is moved around, restarted etc, use one of the several options available in the network based storage + - Kubernetes includes support for standard protocols such as NFS and iSCSI as well as cloud provider–based storage APIs for the major cloud providers (both public and private) + +#+begin_src +# Rest of pod definition above here + volumes: + - name: "kuard-data" + nfs: + server: my.nfs.server.local + path: "/exports" +#+end_src + + +Once you’ve submitted the manifest to the API server, the Kubernetes scheduler finds a machine where the Pod can fit and schedules the Pod to that machine(note, it first finds the node to host the Pod). Once scheduled, the kubelet daemon on that machine is responsible for creating +the containers that correspond to the Pod, as well as performing any health checks defined in the Pod manifested. + +We can use an ReplicaSet object to automate the creation of multiple identical Pods and ensure that they are recreated in the event of a node machine failure. + +*** Labels and Annotations + +Labels and annotations let you work in sets of things that map to how you think about your application. You can organize, mark, and cross-index all of your resources to represent the groups that make the most sense for your application. + +Labels are key/value pairs that can be attached to Kubernetes objects such as Pods and ReplicaSets. Both the key and value are represented by strings. Names must also start and end with an alphanumeric character and permit the use of dashes (-), underscores (_), and dots (.) between characters. + +Annotations are key/value pairs designed to hold nonidentifying information that can be leveraged by tools and libraries. + +Labels are for your use, annotations are for use by tools (including Kubernetes) and libraries + +You can apply a label like so: ~kubectl label deployments alpaca-test "canary=true".~ and remove it like so: ~$ kubectl label deployments alpaca-test "canary-".~ + +Label selectors are used to filter Kubernetes objects based on a set of labels. Selectors use a simple Boolean language. They are used both by end users (via tools like kubectl) and by different types of objects (such as how ReplicaSet +relates to its Pods). + +Eg: ~$ kubectl get pods --selector="ver=2" .~ +supports AND - ~$ kubectl get pods --selector="app=bandicoot,ver=2".~ +supports OR - ~$ kubectl get pods --selector="app in (alpaca,bandicoot)".~ + +Each deployment (via a ReplicaSet) creates a set of Pods using the labels specified in the template embedded in the deployment. + + +When a Kubernetes object refers to a set of other Kubernetes objects, a label selector is used. + +#+begin_src +selector: + matchLabels: + app: alpaca + matchExpressions: + - {key: ver, operator: In, values: [1, 2]} +#+end_src + +Annotations provide a place to store additional metadata for Kubernetes objects with the sole purpose of assisting tools and libraries. They can be used for the tool itself or to pass configuration information between external systems. 
+ +There is overlap, and it is a matter of taste as to when to use an annotation or a label. When in doubt, add information to an object as an annotation and promote it to a label if you find yourself wanting to use it in a selector. + +Annotations are used by Kubernetes too: +- Communicate a specialized scheduling policy to a specialized scheduler. +- Enable the Deployment object to keep track of ReplicaSets that it is managing for rollouts. +- Prototype alpha functionality in Kubernetes (instead of creating a first-class API field, the parameters for that functionality are instead encoded in an annotation). +- During rolling deployments, annotations are used to track rollout status and provide the necessary information required to roll back a deployment to a previous state. + +The value component of an annotation is a free-form string field. + +Annotations are defined in the common metadata section in every Kubernetes object: + +#+begin_src + metadata: + annotations: + example.com/icon-url: "https://example.com/icon.png" +#+end_src + +Using labels and annotations properly unlocks the true power of Kubernetes’s flexibility and provides the starting point for building automation tools and deployment workflows. + +*** Service Discovery +Service discovery tools help solve the problem of finding which processes are listening at which addresses for which services. + +A good service discovery tool has these features: +- low latency to requests +- is able to store richer information - like ports the services are running on +- information propagates quickly + +Real service discovery in Kubernetes starts with a Service object. + +Just as the kubectl run command is an easy way to create a Kubernetes deployment(and start pods), we can use kubectl expose to create a service. + +By default, we have the kubernetes service already created for us so that we can find and talk to the Kubernetes API + +The service is assigned a new type of virtual IP called a cluster IP. This is a special IP address the system will load-balance across all of the pods that are identified by the selector. + +Because the cluster IP is virtual it is stable and it is appropriate to give it a DNS address. + +Kubernetes provides a DNS service exposed to Pods running in the cluster. This Kubernetes DNS service was installed as a system component when the cluster was first created. The DNS service is, itself, managed by Kubernetes and is a great example of Kubernetes building on Kubernetes. The Kubernetes DNS service provides DNS names for cluster IPs. + +When we expose a service ~myservice~, it is available in the cluster as: +~myservice.default.svc.cluster.local~. +The syntax is: ~service-name.namespace.svc.cluster.local~ +- ~svc~ is required to allow Kubernetes to expose other types of things as DNS in the future +- ~cluster.local~ can be changed if required to allow unique DNS names across multiple clusters. + +When referring to the service from same namespace, you can use ~myservice~, else, ~myservice.default~ works. Full name works as well. + +To allow outside traffic to come in, we need ~NodePorts~. +In addition to a cluster IP, the system picks a port (or the user can specify one), and every node in the cluster then forwards traffic to that port to the service. + +Finally, if you have support from the cloud that you are running on you can use the LoadBalancer type. +This builds on NodePorts by additionally configuring the cloud to create a new load balancer and direct it at nodes in your cluster. 
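+A declarative sketch of what such a Service object might look like; the name, selector, and ports are illustrative, and ~type~ could equally be ~NodePort~, or be omitted for a plain cluster IP:
+
+#+begin_src yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: alpaca-prod
+spec:
+  type: LoadBalancer        # builds on NodePort and provisions a cloud load balancer
+  selector:
+    app: alpaca             # traffic is load-balanced across pods with these labels
+    env: prod
+  ports:
+    - port: 8080            # port exposed on the cluster IP
+      targetPort: 8080      # port the pods listen on
+      protocol: TCP
+#+end_src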
+ +Under the hood, with each service, Kubernetes creates an object called ~Endpoints~ that contains the IP address for that service. + +To use a service, an advanced application can talk to the Kubernetes API directly to look up endpoints and call them. + +Cluster IPs are stable virtual IPs that load-balance traffic across all of the endpoints in a service. This is performed by a component running on every node in the cluster called the kube-proxy + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-15 22:42:36 +[[file:assets/screenshot_2018-08-15_22-42-36.png]] + +The ~kube-proxy~ looks for new services in the cluster from the apiserver. When a new service comes in, it writes the iptables rules on the node, so that packets can be routed. If the set of endpoints for a service changes (due to pods coming and going or due to a failed readiness check) the set of iptables rules is rewritten. + +Once set, the cluster IP cannot be modified without deleting and recreating the Service object. + +The Kubernetes service address range is configured using the --service-cluster-ip-range flag on the kube-apiserver binary. + +EKS does not support modifying all the flags on the apiserver. + +Earlier, for SD, pods used to define env vars to know the ClusterIP of the service they wanted to access. A problem with the environment variable approach is that it requires resources to be created in a specific order. +The services must be created before the pods that reference them. This can introduce quite a bit of complexity when deploying a set of services that make up a larger application. + + +*** ReplicaSets + +Generally, we want multiple pods to run, not a single instance. This can be for: +- Redundancy (so we can survive failure of a single pod) +- Scale +- Sharding (Different replicas can handle different parts of a computation in parallel.) + +A user managing a replicated set of Pods considers them as a single entity to be defined and managed. This is precisely what a ReplicaSet is. + +A ReplicaSet acts as a cluster-wide Pod manager, ensuring that the right types and number of Pods are running at all times. + +ReplicaSets are the building blocks for self healing infrastructure. +Pods managed by ReplicaSets are automatically rescheduled under certain failure conditions such as node failures and network partitions. + + +The actual act of managing the replicated Pods is an example of a reconciliation loop. Such loops are fundamental to most of the design and implementation of Kubernetes. They are part of the controller binary. + +Kubernetes is completely decoupled. All of the core concepts of Kubernetes are modular with respect to each other and that they are swappable and replaceable with other components. + +Though ReplicaSets create and manage Pods, they do not own the Pods they create. ReplicaSets use label queries to identify the set of Pods they should be managing. This makes ReplicaSets very clean and decoupled. Their code can be used to manage something else also, based on their labels. + +Another example of decoupling, ReplicaSets that create multiple Pods and the services that load-balance to those Pods are also totally separate, decoupled API objects. + +This has several advantages: + +- Adopting Existing Containers + - And because ReplicaSets are decoupled from the Pods they manage, you can simply create a ReplicaSet that will “adopt” the existing Pod, and scale out additional copies of those containers. 
+- Quarantining Containers + - If a pod is misbehaving, you can change the label on it and ReplicaSet will see that one pod is less and create a new one. Now, you can debug the pod. + + + +ReplicaSets are designed to represent a single, scalable microservice inside your architecture. Every Pod that is created by the ReplicaSet controller is entirely homogeneous. +Typically, these Pods are then fronted by a Kubernetes service load balancer, which spreads traffic across the Pods that make up the service. Generally speaking, ReplicaSets are designed for stateles (or nearly stateless) services. + + +**** ReplicaSet spec + +All ReplicaSets must have a unique name (~metadata.name~), the number of Pods (replicas) that should be running cluster-wide at a given time, and a Pod template that describes the Pod to be created when the defined number of replicas is not met. + +#+begin_src +apiVersion: extensions/v1beta1 +kind: ReplicaSet +metadata: + name: kuard +spec: + replicas: 1 + template: + metadata: + labels: + app: kuard + version: "2" + spec: + containers: + - name: kuard + image: "gcr.io/kuar-demo/kuard-amd64:2" +#+end_src + +Note, the ~spec.template.metadata.labels~, these are the labels that are monitored by the rs + +The selector in the ReplicaSet spec should be a proper subset of the labels in the Pod template. + +Sometimes you may wonder if a Pod is being managed by a ReplicaSet, and, if it is, which ReplicaSet. To enable this kind of discovery, the ReplicaSet controller adds an annotation to every Pod that it creates. The key for the annotation is ~kubernetes.io/created-by~. + +Note that such annotations are best-effort; they are only created when the Pod is created by the ReplicaSet, and can be removed by a Kubernetes user at any time. + + +**** Autoscaling +Autoscaling means nothing, there are multiple types of autoscaling: +- horizontal autoscaling + - increasing the number of pods in response to load +- vertical autoscaling + - increasing the resources for each running pod + +To autoscale on CPU or some other metric, you need ~heapster~ installed. +~kubectl autoscale rs kuard --min=2 --max=5 --cpu-percent=80~ + +This command creates an autoscaler that scales between two and five replicas with a CPU threshold of 80%. + +To view, modify, or delete this resource you can use the standard kubectl commands and the horizontalpodautoscalers +resource. horizontalpodautoscalers is quite a bit to type, but it can be shortened to hpa: ~$ kubectl get hpa~ + +It’s a bad idea to combine both autoscaling and imperative or declarative management of the number of replicas, so use either static number of pods, or use hpa + +If you don’t want to delete the Pods that are being managed by the ReplicaSet +you can set the --cascade flag to false to ensure only the ReplicaSet object is deleted and not the Pods: +~$ kubectl delete rs kuard --cascade=false~ + + + +*** DaemonSets + +This is used if you want to schedule a single Pod on every node in the cluster. +DaemonSets are used to deploy system daemons such as log collectors and monitoring agents, which typically must run on every node. +Generally, these pods are expected to be long running services. + +You can exclude any node in DaemonSets if you wish by specifying the ~nodeName~ field in the Pod spec. 
+ +#+begin_src yaml +apiVersion: extensions/v1beta1 +kind: DaemonSet +metadata: + name: fluentd + namespace: kube-system + labels: + app: fluentd +spec: + template: + metadata: + labels: + app: fluentd + spec: + containers: + - name: fluentd + image: fluent/fluentd:v0.14.10 + resources: + limits: + memory: 200Mi + requests: + cpu: 100m + memory: 200Mi + volumeMounts: + - name: varlog + mountPath: /var/log + - name: varlibdockercontainers + mountPath: /var/lib/docker/containers + readOnly: true + terminationGracePeriodSeconds: 30 + volumes: + - name: varlog + hostPath: + path: /var/log + - name: varlibdockercontainers + hostPath: + path: /var/lib/docker/containers +#+end_src + +Each DaemonSet must include a Pod template spec, which will be used to create Pods as needed. + +You can use labels to select nodes on which to run the DaemonSet + +The following DaemonSet configuration limits nginx to running only on nodes with the ssd=true label set +#+begin_src yaml +apiVersion: extensions/v1beta1 +kind: "DaemonSet" +metadata: + labels: + app: nginx + ssd: "true" + name: nginx-fast-storage +spec: + template: + metadata: + labels: +app: nginx + ssd: "true" + spec: + nodeSelector: + ssd: "true" + containers: + - name: nginx + image: nginx:1.10.0 +#+end_src + +Of course, set the labels using: ~kubectl label nodes k0-default-pool-35609c18-z7tb ssd=true~ + +Adding the ssd=true label to additional nodes will case the nginx-fast- storage Pod to be deployed on those nodes. The inverse is also true: if a required label is removed from a node, the Pod will be removed by the DaemonSet controller. + +This is the reconciliatory loop in action. + +Prior to Kubernetes 1.6, the only way for pods in DaemonSets to be updated was to delete them. + + +Now, you can do rolling updates using ~spec.updateStrategy.type~ -> ~RollingUpdate~ etc + +There are 2 parameters: +- ~spec.minReadySeconds~ (how long a Pod must be “ready” before the rolling update proceeds to upgrade subsequent Pods) +- ~spec.updateStrategy.rollingUpdate.maxUnavailable~ (how many Pods may be simultaneously updated by the rolling update) + + + +DaemonSets provide an easy-to-use abstraction for running a set of Pods on every node in a Kubernetes cluster, or if the case requires it, on a subset of nodes based on labels. The DaemonSet provides its own controller and scheduler. + +These DaemonSets aren’t really traditional serving applications, but rather add additional capabilities and features to the Kubernetes cluster itself. + +*** Jobs + +They are short-lived, one-off tasks which run on Kubernetes. + +A Job creates Pods that run until successful termination (i.e., exit with 0). In contrast, a regular Pod will continually restart regardless of its exit code. + +They are represented by the ~Job~ object. The Job object coordinates running a number of pods in parallel. + + +If the Pod fails before a successful termination, the Job controller will create a new Pod based on the Pod template in the Job specification. + +The 2 parameters that are important are: ~completions~, and ~parallelism~ + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-16 21:44:19 +[[file:assets/screenshot_2018-08-16_21-44-19.png]] + +After the Job has completed, the Job object and related Pod are still around. This is so that you can inspect the log output. Note that this Job won’t show up in kubectl get jobs unless you pass the -a flag. Without this flag kubectl hides completed Jobs. 
+ + +#+begin_src yaml +apiVersion: batch/v1 +kind: Job +metadata: + name: oneshot + labels: + chapter: jobs +spec: + template: + metadata: + labels: + chapter: jobs + spec: + containers: + - name: kuard + image: gcr.io/kuar-demo/kuard-amd64:1 + imagePullPolicy: Always + args: + - "--keygen-enable" + - "--keygen-exit-on-complete" + - "--keygen-num-to-gen=10" + restartPolicy: OnFailure +#+end_src + + + +Generating keys can be slow. Let’s start a bunch of workers together to make key generation faster. We’re going to use a combination of the completions and +parallelism parameters. Our goal is to generate 100 keys by having 10 runs of kuard with each run generating 10 keys. But we don’t want to swamp our cluster, so we’ll limit ourselves to only five pods at a time. +This translates to setting completions to 10 and parallelism to 5. + +The spec now has: ~spec.parallelism=5~ and ~spec.completions=10~ + +**** Work Queues +There is a producer that puts jobs on the Work queue, and there are multiple consumers that consume the tasks. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-08-17 13:44:49 +[[file:assets/screenshot_2018-08-17_13-44-49.png]] +The Job abstraction allows you to model batch job patterns ranging from simple one-time tasks to parallel jobs that process many items until work has been exhausted. + +Jobs are a low-level primitive and can be used directly for simple workloads. Or you can make libraries to build on top of them. + +*** ConfigMaps and Secrets + +It is a good practice to make container images as reusable as possible. The same image should be able to be used for development, staging, and production. Which means, the same container should be flexible enough to be used on all the environments. This is where ConfigMaps and Secrets come into the picture. They allow us to specialize the image at runtime. + +ConfigMaps are used to provide configuration information for workloads. This can either be fine-grained information (a short string) or a composite value in the form of a file. +Secrets are similar to ConfigMaps but focused on making sensitive information available to the workload. They can be used for things like credentials or TLS certificates. + + + + + + + + + + + + + + + +* Katakoda +** Launch a single node +*** Minikube + +Start a cluster with ~minikube start~ + +The debugging commands are extremely useful: ~kubectl cluster-info~ + +When you use ~kubectl run~, it creates a deployment. +Eg: ~kubectl run first-deployment --image=katacoda/docker-http-server --port=80~ + +Recall a deployment is just a higher level abstraction over ReplicaSets, which maintain a fixed sets of Pods running. +Now, we can expose this deployment as a service with: +- either creating a service object and sending it to the apiserver +- or running the ~kubectl expose command~ + +~kubectl expose deployment first-deployment --port=80 --type=NodePort~ + +This will create a service and assign a node port to it +The dashboard is created in the kube-system namespace. + +When the dashboard was deployed, it used externalIPs to bind the service to port 8443. +For production, instead of externalIPs, it's recommended to use kubectl proxy to access the dashboard. 
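+For reference, the ~kubectl run~ invocation above corresponds roughly to a Deployment manifest like this sketch (the ~apps/v1~ apiVersion and the ~run~ label are assumptions that depend on the cluster version):
+
+#+begin_src yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: first-deployment
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      run: first-deployment
+  template:
+    metadata:
+      labels:
+        run: first-deployment
+    spec:
+      containers:
+        - name: first-deployment
+          image: katacoda/docker-http-server
+          ports:
+            - containerPort: 80
+#+end_src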
+ +A good debugging command: ~kubectl describe ~ eg: ~kubectl describe pods mypod~ +Earlier we ran the expose command: + +~kubectl expose deployment http --external-ip="172.17.0.92" --port=8000 --target-port=80~ +This :top: exposes the container port 80 to the host port 8000 + +Run and expose in a single command: +~kubectl run httpexposed --image=katacoda/docker-http-server:latest --replicas=1 --port=80 --hostport=8001~ + +The ports are opened on the pod, not on the container. And since all the containers in the pod share the same network namespace, they cannot run on the same port, so it's no problem. + +Scaling the deployment will request Kubernetes to launch additional Pods. These Pods will then automatically be load balanced using the exposed Service. +You can scale a deployment (or a replicaset), with: ~k scale --replicas=10 myreplicaset_or_deployment~ + +After you create a deployment (or a replicaset), you can put a service on it. + +#+begin_src yaml +apiVersion: v1 +kind: Service +metadata: + name: webapp1-svc + labels: + app: webapp1 +spec: + type: NodePort + ports: + - port: 80 + nodePort: 30080 + selector: + app: webapp1 +#+end_src + +Cluster IP is the default approach when creating a Kubernetes Service. The service is allocated an internal IP that other components can use to access the pods. + +* General Notes + +** RBAC + +*** Talk by Eric, CoreOS - https://www.youtube.com/watch?v=WvnXemaYQ50 + +Everyone needs to talk to the api-server + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-02 10:17:16 +[[file:assets/screenshot_2018-09-02_10-17-16.png]] + + +The api-server needs to authenticate everyone who talks to the api-server. After authentication (who is talking to me), we may want the api-server to have authorize powers also, that is we may want different permissions for different components. + +**** Authentication +Users in Kubernetes are just strings which are associated with the request through credentials. +Eg: +#+begin_src +Username=darshanime +Groups=["developer", "gopher"] +#+end_src + +How this information is pulled out can of any of: +- x509 client auth +- password files +- bearer token passwords +- etc + +This is pluggable. You have to decide when you deploy your api-server, you need to pass these flags. +Eg: +#+begin_src +/usr/bin/kube-apiserver --client-ca-file=/etc/kubernetes/ca.pem ... +#+end_src + +The cert that is generated by the api-server and given to the nodes, kubelet etc looks like so: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-02 10:45:29 +[[file:assets/screenshot_2018-09-02_10-45-29.png]] + +The cert should be valid. Also, the cert has: CN (common name), which is thought of as the username for any request using this cert. +We can use the O (organization) field to pull out group information, which can be later used for authorization. + + +Apart from x509, there are some other methods as well: +- token files +4190u4122u101-23123, darshanime,3,"developer,grophers" + +:top:, if I show up with the first field as the Bearer token, I would be authenticated as darshanime with the group "developer,grophers" + +There is webhook too, with this, the api-server sends the Authorization: Bearer to another server to determine what the user is. GKE uses this to authenticate you on their Kubernetes service. + +All of these are completely managed outside Kubernetes. We have to set them when we setup the api-server. 
We have to say, I'll provision the api-server with these flags, these files on disk etc + +**** Service Accounts +Service accounts are just bare token managed by Kubernetes. You can use these thru the api-server to create users as you want. + +#+begin_src go +k create serviceaccount do-the-root +#+end_src + +This will create a serviceaccount, which has some secrets. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-02 10:55:16 +[[file:assets/screenshot_2018-09-02_10-55-16.png]] + + +These secrets are the credentials that the serviceaccount uses to authenticate itself. +If your api-server is using tokens, this will be a token. If it is using x509 certs, it will be a certificate.+ It is a jwt token. + +You can mention the service account for each Kubernetes object (pod, deployment etc). If you do not give it a name, it will be set to the ~default~ service account. +If you specify the serviceaccount, Kubernetes takes care of mounting the secrets onto your pod. + +~Secrets are tmpfs, they don't touch the disk, they are always in memory~ + +You can view it: + +#+begin_src +$ k get serviceaccount default -o yaml +apiVersion: v1 +kind: ServiceAccount +metadata: + creationTimestamp: 2018-08-31T11:08:12Z + name: default + namespace: default + resourceVersion: "431" + selfLink: /api/v1/namespaces/default/serviceaccounts/default + uid: 26b959b9-ad0e-11e8-a2ea-027dc18e9b58 +secrets: +- name: default-token-blvch + + +#+end_src + +Here, we see that the default serviceaccount has a secret attached - default-token-blvch + +We can view it: +#+begin_src +k8s (master) ✗ k get secrets -o yaml default-token-blvch +apiVersion: v1 +data: + ca.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMrVENDQWVHZ0F3SUJBZ0lKQVBXcHR6QkVqTzVSTUEwR0NTcUdTSWIzRFFFQkN3VUFNQkl4RURBT0JnTlYKQkFNTUIydDFZbVV0WTJFd0lCY05NVGd3T0RNeE1UQTFPREEwV2hnUE1qRXhPREE0TURjeE1EVTRNRFJhTUJJeApFREFPQmdOVkJBTU1CMnQxWW1VdFkyRXdnZ0VpTUEwR0NTcUdTSWIzRFFFQkFRVUFBNElCRHdBd2dnRUtBb0lCCkFRREN1eFdlMGo2dllwVTlieEFUYTN0ZVRIT0wxdGlTMStWYlRjbUsyZkVnb3Qwb212bjh6VkNqOEVtT0I1amUKaUNsQVhpNUtrMGRhcGJkbitxM2VzZXZ1dkYrYnEzUVJEdzJ5bnBMdnRpZlRqZ1hSY1l5cUlybjN3cVVQbUZhMQpaekNhTnpnOTluYTgxT2tmT0dPMmM4WEVhdkcyM09MV2hvSlE0SWVuZWZFWlNSTjNOVmdxaWVabzZiSjIzUUMyClFldjZFNS9kWnlsSDMrcUF5UTJOTGRmaHRlc2VkdlJiVzBzeHVtNExncU53bmNaa2R0cEhudXZvQjVpWDNZdFkKdWlZc2JQKzh0SHROSzBUY29WKzNGSjVhcDdYSUQwaTVKWEI4M0dTK0FtQ2RFbXZ2QUVVNlp2MVE5aG9ESllxQgpnbDl5YVNHMTB6azdPSnVhd1lCU29sbHhBZ01CQUFHalVEQk9NQjBHQTFVZERnUVdCQlFCNjRBTENpRXdpam1FCjV4UzYwZThtRFVFODZUQWZCZ05WSFNNRUdEQVdnQlFCNjRBTENpRXdpam1FNXhTNjBlOG1EVUU4NlRBTUJnTlYKSFJNRUJUQURBUUgvTUEwR0NTcUdTSWIzRFFFQkN3VUFBNElCQVFDaFhEaGc0WWx5NDhLY0hQeGU5OFRFTTRTVApYVWtiUFpwSW5aU2s1VTc4S1FWaGNTWm9nNnk2QXoyS3JPUlg4QmwrSDdGandtSVl6REwrdVk1SWtWVVJ3bVdsClhkUVAzdWNUOXpLd1RDNnRRWk95bElvQ1VDOFBxcCtTbC83Ym4rRVdSQ0dmZzM3RDlMem11blloWmFkd0doSjIKZTVSUm8vanIzL2FSSzlXUkduYzloSmRBSThjbnFabkRXbWRZUEFKTDZwVCtKTytYQmNxUXlrUC8vTFVIMlovYgo2NW9QUEdFQlJTdWxldDdLUkw2dHBQWkI2c3lMS1QxazV0TFdqZUZ0ZCtuSVQ1d0p3cUlybGZhcWF0NTEraWdkClZ0YTcxbGc4U1R5WEZ0Wmt5enNFa0VieU44OCswZnh0OGI1emo1RmdmMnMvQ1ZUOG5YOWxBK21WeHdaNQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== + namespace: ZGVmYXVsdA== + token: 
ZXlKaGJHY2lPaUpTVXpJMU5pSXNJbXRwWkNJNklpSjkuZXlKcGMzTWlPaUpyZFdKbGNtNWxkR1Z6TDNObGNuWnBZMlZoWTJOdmRXNTBJaXdpYTNWaVpYSnVaWFJsY3k1cGJ5OXpaWEoyYVdObFlXTmpiM1Z1ZEM5dVlXMWxjM0JoWTJVaU9pSmtaV1poZFd4MElpd2lhM1ZpWlhKdVpYUmxjeTVwYnk5elpYSjJhV05sWVdOamIzVnVkQzl6WldOeVpYUXVibUZ0WlNJNkltUmxabUYxYkhRdGRHOXJaVzR0WW14MlkyZ2lMQ0pyZFdKbGNtNWxkR1Z6TG1sdkwzTmxjblpwWTJWaFkyTnZkVzUwTDNObGNuWnBZMlV0WVdOamIzVnVkQzV1WVcxbElqb2laR1ZtWVhWc2RDSXNJbXQxWW1WeWJtVjBaWE11YVc4dmMyVnlkbWxqWldGalkyOTFiblF2YzJWeWRtbGpaUzFoWTJOdmRXNTBMblZwWkNJNklqSTJZamsxT1dJNUxXRmtNR1V0TVRGbE9DMWhNbVZoTFRBeU4yUmpNVGhsT1dJMU9DSXNJbk4xWWlJNkluTjVjM1JsYlRwelpYSjJhV05sWVdOamIzVnVkRHBrWldaaGRXeDBPbVJsWm1GMWJIUWlmUS5BcTh5VUQtQ0RWc0EzWlFydEQ2UlVmWGItUHVWX3lOMFA5alNVYjRoemEzNWtzXzgtVUNOeUtjenU3X0pxVmZXYnBqQjhLS1Q5d1lmSlMxczJaT3NIRmVLMXFRQ3VNRmdnZnpRWndpa01aMFlsV0ctd05NUkN6eHpUVjZtNTAwRTdDS09yTjZNOGsxZ0hsNjQxb3NPZW9pRWRqbEVJOUR6ZzktUVNzaHVMSkpGNnFTaFV4NDltLTBlMWVFY05BREpBV2p4M3ZGUDVYR3dfVWRPckM5RENxaWFidEtZekJuXy03NGtKTXF4RWh5NVJ3T0NEUFpqT1QzNkhlVkxQUWJGZzNjX0hmOUdzOWhxdUN0THl1UzA1Q0ZNRVZlU1NXWVJDNHJOd19NMEtRX0ZYbjkxUUtULU9ZeGEyLWVPTzhZSEtFMV9rR1RZbk9lWXZaNGVZcTY5VFE= +kind: Secret +metadata: + annotations: + kubernetes.io/service-account.name: default + kubernetes.io/service-account.uid: 26b959b9-ad0e-11e8-a2ea-027dc18e9b58 + creationTimestamp: 2018-08-31T11:08:12Z + name: default-token-blvch + namespace: default + resourceVersion: "429" + selfLink: /api/v1/namespaces/default/secrets/default-token-blvch + uid: 26bb51fb-ad0e-11e8-a2ea-027dc18e9b58 +type: kubernetes.io/service-account-token +#+end_src + +The token is base64 encoded, it can be decoded and seen on jwt.io. It shows information like: + +#+begin_src json +{ + "iss": "kubernetes/serviceaccount", + "kubernetes.io/serviceaccount/namespace": "default", + "kubernetes.io/serviceaccount/secret.name": "default-token-blvch", + "kubernetes.io/serviceaccount/service-account.name": "default", + "kubernetes.io/serviceaccount/service-account.uid": "26b959b9-ad0e-11e8-a2ea-027dc18e9b58", + "sub": "system:serviceaccount:default:default" +} +#+end_src + +We can see the serviceaccount name, the serviceaccount uuid, serviceaccount namespace etc +They can be used outside the cluster. So, if anyone has your serviceaccount tokens, it can act on behalf of your serviceaccount and talk to the api-server + +**** Authorization + +It uses the usernames and group names that we pulled out of the authentication phase and uses that for authorization. + +One of the plugins for authorization is RBAC - role based access control. +It was made by RedHat in OpenShift. + + +Overview: +- default: deny all +- contain a subject, verb, resource, and namespace. + - eg: user A can create pods in namespace B +- cannot + - +refer to a single object in a namespace+ + - +refer to arbitrary fields in a resource+ +- can + - refer to sub resources (eg, node/status) + +The RBAC apigroup has 4 top level objects: +- Role +- RoleBinding +- ClusterRole +- ClusterRoleBinding + +**** Roles vs Bindings +Roles declare a set of powers. +Bindings "bind" users or groups to those powers (roles). +So, you can have a role called "admin" that can do anything. +Then you can give (bind) this power (role) to the user darshanime for eg + +#+begin_src yaml +apiVersion: rbac.authorization.k8s.io/v1alpha1 +kind: ClusterRole +metadata: + name: secret-reader +rules: + - apiGroups: [""] # v1 namespace + resources: ["secrets"] + verbs: ["get", "watch", "list"] +#+end_src + +Here, :top: we defined a role which can read secrets. +Now, we can bind a user to the role. 
+ +#+begin_src yaml +apiVersion: rbac.authorization.k8s.io/v1alpha1 +kind: ClusterRoleBinding +metadata: + name: read-secrets +subjects: + - kind: Group # may be User, Group, or ServiceAccount + name: manager +roleRef: + kind: ClusterRole + name: secret-reader + apiVersion: rbac.authorization.k8s.io/v1alpha1 +#+end_src + +Here, we say that anyone in the "group" manager, has this power. + +**** ClusterRoles vs Roles +Objects in Kubernetes can be either namespaced, or not. Eg, pods are namespaced, nodes are clusterlevel, they are not namespaced. + +Roles can exist either at the namespace level or at a cluster level (using * namespace?) + +Cluster role - manage one set of roles for your entire cluster -> define once, assign to any one in any namespace + +Role: allow "namespace admins" to admin roles just for a namespace -> are local to the namespace. Can't create once, use for every namespace + +Similarly, we have ClusterRoleBindings, vs. RoleBindings +ClusterRoleBindings can refer to only ClusterRoles, they offer cluster wide powers. +RoleBindings grant power within a namespace - they can refer to Roles, ClusterRoles + +Consider: +Let's say I want to give coworker A the power to administer a particular namespace. Steps: +- create a ClusterRole + - that is the namespace admin +- create a RoleBinding + - for that namespace referring to the ClusterRole + +So, we have 3 cases: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-02 11:38:06 +[[file:assets/screenshot_2018-09-02_11-38-06.png]] + +Here, we have a ClusterRole, which has powers thru out the cluster, and we assign them using ClusterRoleBinding + +They now have power thru out the cluster, like delete node etc + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-02 11:39:40 +[[file:assets/screenshot_2018-09-02_11-39-40.png]] + +Here, a central admin is creating ClusterRole, but it is bound only for a particular namespace, so the guy has admin privileges over the namespace. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-02 11:41:01 +[[file:assets/screenshot_2018-09-02_11-41-01.png]] + +This is when I have a rolebinding to create new roles applicable only within the given namespace. + +This separation between ClusterRole and Role has builtin escalation prevention. If I do not have the privilege to read secrets, I cannot give someone a role that can read secrets. + +To bootstrap the cluster, the api-server has a cluster-admin role for core components like kubelet etc + +Apart from RBAC, we have other plugins -> webhook, policy file etc + + + +*** TLS with Go - Eric, https://ericchiang.github.io/post/go-tls/ + +There are several different ideas here + +**** Public and Private key encryption + +Whenever we want to prove our identity, for eg, when sshing etc, we can use the private-public key cryptography. +The public key is public and the private key can be used to prove you are the owner of that public key, or in other words, you are who the public key says you are. + +*Things encrypt with a public key can only be decrypted by its paired private key.* + +So, how we can make use of this :top: fact is, +I can encrypt some text using your public key. It can be decrypted only by someone who has the private key. If you have it, you can decrypt it and if you are able to say the phrase back to me, I can confirm that you are who you say you are. 
+ +*The cool part about this is you can prove you hold a private key without ever showing it to somebody.* + + +**** Digital Signatures + +A free feature of the public private key pair is that we can digitally sign content. This sign can be used to ensure the validity of the document. + +How we do this is, we take the document, hash it (say thru SHA256), then the private key computes a signature of the hashed document. This document can now be sent wherever. + +The public key can then confirm, if its private key combined with a particular hash would have created that signature. + +So, when you send the document, you can also send it's signature. The receiver can take your public key and verify you sent the document and it has not been tempered with. + +This can be useful for a certificate authority, who wants to be able to distribute documents which can't be altered without everyone detecting. + +**** x509 certs + +*Certificates are public keys wit some attached information (like what domains they work for)* + +In order to create a certificate, we need to both specify that information and provide a public key. + +So, we first start with creating a public-private key pair. We can call the private key the rootKey. +Now, we can make a certificate. + +Certificates must be signed by the private key of a parent certificate -> like we talked in Digital Signature + +To create the certificate, we need: +- the information to be put in the certificate, like the Subject, validity dates, etc +- the public key we want to wrap +- the parent certificate +- the parent’s private key + +If we don't have the parent, then we use self's private key - this becomes a self signed certificate + +Remember, the certificate is just a public key, with some extra information is all. + +To prove ownership of a certificate you must have it's private key as well. This is why we need to have the private key inside our servers. + +Now, from this cert (from this cert's private key), we can make more certs. + +If you put this cert on a webserver and make requests to it, the client (browser, curl etc) will reject it because none of the public keys trusted by the client validate the digital signature provided by the certificate from the webserver. + +**** Getting the Client to Trust the Server +Let's mimic a situation where a certificate authority provides a organization with a cert for their website. +For this, we can pretend that the rootCert we created above :top: belongs to a certificate authority, and we'll attempt to create another certificate for our server + +Steps: +- create a public private key pair +- we will create the cert using: + - some information about the expiry etc + - public key generated + - rootKey (the private key of the parent certificate) -> this will sign the certificate + +To have the client trust the certificate, we need to ask it to trust "the public key that can validate the certificate's signature". Since we used the root key-pair to sign the certificate, if we trust the rootCert (it's public key), we'll trust the server's certificate. + +**** Getting the Server to trust the client +This is exactly similar to the server side. We create a public-private key pair for the client, then we create a certificate for the client. On making the tcp connection, during the TLS handshake, we can present our certificate and the server can be configured to complete the handshake and accept the connection only if the certificate is valid. 
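+In Kubernetes terms, this mutual trust is what a kubeconfig entry encodes: the cluster section carries the CA certificate the client should trust, and the user section carries the client's certificate and key. A hedged sketch (names and the base64 payloads are placeholders):
+
+#+begin_src yaml
+apiVersion: v1
+kind: Config
+clusters:
+  - name: my-cluster
+    cluster:
+      server: https://my-cluster.example.com:6443
+      certificate-authority-data: <base64 CA cert the client trusts>
+users:
+  - name: jakub
+    user:
+      client-certificate-data: <base64 client cert the server trusts>
+      client-key-data: <base64 client private key>
+contexts:
+  - name: jakub@my-cluster
+    context:
+      cluster: my-cluster
+      user: jakub
+current-context: jakub@my-cluster
+#+end_src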
+*** RBAC with Joe Beda - https://www.youtube.com/watch?v=slUMVwRXlRo
+
+Our kubectl talks to the api-server via TLS, but the api-server uses a self-signed cert. So, in the kubeconfig file, we have the certificate data that we ask kubectl to trust. When we switch this off, using --insecure-skip-tls-verify, we communicate with the api-server without verifying its certificate. So, if we are attacked, we could be talking to some other api-server without knowing it.
+
+The api-server plays the role of the CA in the cluster. It issues new certs to service accounts etc.
+
+Also, for authenticating, the kubeconfig has ~users~, with the name and client-certificate-data, client-key-data.
+
+In the Heptio AWS quickstart, kubectl creates a TLS connection to a TCP ELB; it can do this because it knows the root cert to trust.
+The QuickStart creates an ELB first and puts the ELB's name in the SAN (subject alternative name), so the ELB can legitimately offer that cert.
+Then we present our kubectl client cert to prove who we are, and that we are allowed to do the action that we are doing.
+
+
+Service Accounts are cool, but they are mostly about something running on Kubernetes talking to the api-server. The authentication methods (like x509, static passwords, static tokens, webhooks, authentication proxy (like we have for logs at draup) etc) are about accessing Kubernetes from outside.
+
+
+We can mint certificates using the CA (the api-server). Other options are: CFSSL (CloudFlare's PKI toolkit), OpenSSL, Hashicorp Vault etc. Then it's up to you how you want to sign them; you can use any of the above options.
+
+For ease, Kubernetes has a built-in CA in the api-server, which is used for automatic bootstrapping and certificate rotation for core components, but it can also mint certs for users.
+
+
+**** Adding a new user to the cluster
+1. Generate a public-private key pair: (here, only the private key is generated. It is easy to go from private key to public key)
+~openssl genrsa -out jakub.pem 2048~
+
+And the CSR - certificate signing request:
+~openssl req -new -key jakub.pem -out jakub.csr -subj "/CN=jakub"~
+We can add an organization:
+~openssl req -new -key jakub.pem -out jakub.csr -subj "/CN=users:jakub/O=cool-people"~
+
+The username lives in the common name.
+
+Now, in the earlier post Eric wrote, we did not have a certificate signing request because we signed the certificate with our own root certificate.
+In real life, we create a certificate signing request using our private key. Here, we provide information like the user, organization etc. The CA will add the expiration etc.
+
+The CA can verify that the info in the CSR is accurate, and sign the certificate using its (the CA's) private key.
+
+In Kubernetes, we will base64 encode the CSR and ~kubectl apply -f~ it.
+
+We can approve the CSR using ~k certificate approve <csr-name>~
+
+Now, we should be able to download the certificate.
+Now, you have the certificate and the private key. We can put this in the kubeconfig and we are good to go on the authentication front.
+
+**** Authorization
+Authorization has several plugins (by plugins, we mean we have an interface that we expose and everyone implements it) - ABAC, webhook, Node authorization, RBAC.
+
+There are 4 different objects:
+- Role
+- ClusterRole
+- RoleBindings
+- ClusterRoleBindings
+
+Resources like nodes, CSRs don't belong to a namespace -> they are cluster wide resources.
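+For completeness, the CSR object from the "Adding a new user" flow above would look roughly like this sketch; the ~certificates.k8s.io/v1beta1~ apiVersion matches that era of the API, and the request payload is a placeholder:
+
+#+begin_src yaml
+apiVersion: certificates.k8s.io/v1beta1
+kind: CertificateSigningRequest
+metadata:
+  name: jakub
+spec:
+  request: <base64-encoded contents of jakub.csr>
+  usages:
+    - digital signature
+    - key encipherment
+    - client auth
+#+end_src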
+ +Since roles belong to namespaces, you can do: +~k get roles --all-namespaces~ + +#+begin_src + k8s (master) ✗ k get roles --all-namespaces +NAMESPACE NAME AGE +kube-public system:controller:bootstrap-signer 1d +kube-system extension-apiserver-authentication-reader 1d +kube-system kubernetes-dashboard-minimal 1d +kube-system system::leader-locking-kube-controller-manager 1d +kube-system system::leader-locking-kube-scheduler 1d +kube-system system:controller:bootstrap-signer 1d +kube-system system:controller:cloud-provider 1d +kube-system system:controller:token-cleaner 1d +#+end_src + +There are also clusterroles, ~k get clusterroles~ + +You can view then in detail: + +#+begin_src yaml +k8s (master) ✗ k get clusterroles system:basic-user -o yaml +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + annotations: + rbac.authorization.kubernetes.io/autoupdate: "true" + creationTimestamp: 2018-08-31T11:06:46Z + labels: + kubernetes.io/bootstrapping: rbac-defaults + name: system:basic-user + resourceVersion: "59" + selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/system%3Abasic-user + uid: f38738a4-ad0d-11e8-a2ea-027dc18e9b58 +rules: +- apiGroups: + - authorization.k8s.io + resources: + - selfsubjectaccessreviews + - selfsubjectrulesreviews + verbs: + - create + +------- + k8s (master) ✗ k get roles system::leader-locking-kube-scheduler -n kube-system -o yaml +apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + annotations: + rbac.authorization.kubernetes.io/autoupdate: "true" + creationTimestamp: 2018-08-31T11:06:48Z + labels: + kubernetes.io/bootstrapping: rbac-defaults + name: system::leader-locking-kube-scheduler + namespace: kube-system + resourceVersion: "178" + selfLink: /apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/roles/system%3A%3Aleader-locking-kube-scheduler + uid: f49fadb3-ad0d-11e8-a2ea-027dc18e9b58 +rules: +- apiGroups: + - "" + resources: + - configmaps + verbs: + - watch +- apiGroups: + - "" + resourceNames: + - kube-scheduler + resources: + - configmaps + verbs: + - get + - update +#+end_src + +They have similar structure. We mention the api group, the resource type and the verb. + +Apart from these roles that Kubernetes ships with, we can create our own roles. + +ClusterRoleBinding - I'm going to let this user do this thingy across every name space in the cluster. + +RoleBinding - I'm going to let this user do this thingy in this ONE name space in the cluster. + +~kubectl~ has a ~--dry-run~ flag, with the ~-o yaml~ to get the yaml. Do this to jump start the yaml if needed. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-02 16:19:40 +[[file:assets/screenshot_2018-09-02_16-19-40.png]] + +Here, :top:, we are doing: + +~kubectl create rolebinding joe --clusterrole=admin --user=users:joe~ +So, we are assigning the clusterrole admin to joe on the default namespace. (default since we aren't explicitly mentioning any namespace) + +Now, if we create a new namespace, and run a pod there, we won't be able to use the user joe to view the pods there. + +**** Service Accounts +This is a robotic identity for the apps running on the cluster. +They operate just like Role and ClusterRoles, but not for users. They are for applications. 
+ +Each namespace has a ~default~ service account + +So, example: +This will give the ~default~ service account admin role for a particular namespace +#+begin_src +kubectl create rolebinding varMyRoleBinding --clusterrole=admin --serviceaccount=varMyNamespace:default --namespace=varMyNamespace +#+end_src + +To give the service account admin access to the whole cluster. + +#+begin_src +kubectl create clusterrolebinding varMyClusterRoleBinding --clusterrole=cluster-admin --serviceaccount=kube-system:default +#+end_src + +*** Configuring RBAC for Helm - https://docs.helm.sh/using_helm/#role-based-access-control + +Helm has 2 components, (v2.10), helm the client, and tiller the server side component. + +We can have various ways to configure rbac for tiller. +We need to create a service account for the tiller component. SAs live in a namespace. + +**** Granting cluster admin to tiller +This is the simplest to accomplish. +We already have a clusterrole called ~cluster-admin~. We can create a service account tiller-sa and just assign it to that sa. + +#+begin_src yaml +api: v1 +kind: ServiceAccount +metadata: + name: tiller + namespace: kube-system +#+end_src + +Now, we can assign the clusterrole ~cluster-admin~ using the clusterrolebinding +#+begin_src yaml +apiVersion: rbac.authorization.k8s.io/v1beta1 +kind: ClusterRoleBinding +metadata: + name: tiller # this is just metadata - the name of this clusterrolebinding, doesn't matter +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: cluster-admin # we way, we want to bind the cluster-admin cluster role +subjects: + - kind: ServiceAccount # to the tiller service account which lives in the kube-system namespace, could be user etc here? + name: tiller + namespace: kube-system +#+end_src + +I followed this and got a tiller pod running + +#+begin_src +$ k get po --all-namespaces +NAMESPACE NAME READY STATUS RESTARTS AGE +kube-system tiller-deploy-78779c79d4-884tk 1/1 Running 0 1m +#+end_src + +Doing a describe on the pod showed a secret +#+begin_src +k describe pods tiller-deploy-78779c79d4-884tk -n kube-system +Priority: 0 +... +Volumes: + tiller-sa-token-62kn4: + Type: Secret (a volume populated by a Secret) + SecretName: tiller-sa-token-62kn4 + Optional: false +QoS Class: BestEffort +... +#+end_src + +We can see the secret: +#+begin_src +k get secrets -o yaml tiller-sa-token-62kn4 -n kube-system +# take the token from the output, base64 --decode it and check on jwt.io to get this: +{ + "iss": "kubernetes/serviceaccount", + "kubernetes.io/serviceaccount/namespace": "kube-system", + "kubernetes.io/serviceaccount/secret.name": "tiller-sa-token-62kn4", + "kubernetes.io/serviceaccount/service-account.name": "tiller-sa", + "kubernetes.io/serviceaccount/service-account.uid": "14d98272-aeb2-11e8-800b-0261adf852f0", + "sub": "system:serviceaccount:kube-system:tiller-sa" +} + +# the output also has the SA's certificate +# the certificate decoder on sslshopper.com shows this: +Certificate Information: +Common Name: kube-ca +Valid From: August 31, 2018 +Valid To: August 7, 2118 +Serial Number: 17701881228292845137 (0xf5a9b730448cee51) +#+end_src + +**** Deploy tiller in a namespace +We can replace ClusterRoleBinding with RoleBinding and restrict tiller to a particular namespace. 
+ +Creating a namespace +#+begin_src yaml +apiVersion: v1 +kind: Namespace +metadata: + name: tiller-world +#+end_src + +Creating the service account +#+begin_src yaml +api: v1 +kind: ServiceAccount +metadata: + name: tiller + namespace: kube-system +#+end_src + +Now, we can create a role that allows tiller to manage all resources in the tiller-world namespace. + +#+begin_src yaml +apiVersion: rbac.authroization.k8s.io/v1beta1 +kind: Role +metadata: + name: tiller-manager # name for our Role + namespace: tiller-world # namespace for which the role exists +rules: +- apiGroups: ["", "batch", "extensions", "apps"] # apis the role gives access to + resources: ["*"] # resources in the api groups that the role gives access to + verbs: ["*"] # verbs the role gives access to +#+end_src + +Finally, we can create a RoleBinding to marry the two +#+begin_src yaml +apiVersion: rbac.authroization.k8s.io/v1beta1 +kind: RoleBinding +metadata: + name: tiller-binding # name for the RoleBinding + namespace: tiller-world +subjects: +- kind: ServiceAccount + name: tiller + namespace: tiller-world # name for the ServiceAccount +roleRef: + kind: Role + name: tiller-manager # name for the Role + apiGroup: rbac.authorization.k8s.io +#+end_src +Now we can init helm with the service account and namespace + +~helm init --service-account tiller --tiller-namespace tiller-world~ + +Run it using: ~$ helm install nginx --tiller-namespace tiller-world --namespace tiller-world~ + +**** Deploy tiller in and restrict deploying to another namespace + +In the example above, we gave Tiller admin access to the namespace it was deployed inside. +Now we will limit Tiller’s scope to deploy resources in a different namespace! + +For example, let’s install Tiller in the namespace myorg-system and allow Tiller to deploy resources in the namespace myorg-users. +Like before, creating the namespace and serviceaccount +#+begin_src +$ kubectl create namespace myorg-system +namespace "myorg-system" created +$ kubectl create serviceaccount tiller --namespace myorg-system +serviceaccount "tiller" created +#+end_src + +Now, defining a role that gives tiller privileges to manage all resources in myorg-users namespace. + +#+begin_src yaml +apiVersion: rbac.authorization.k8s.io/v1beta1 +kind: Role +metadata: + name: tiller-manager + namespace: myorg-users +rules: +- apiGroups: ["", "extension", "apps"] + resources: ["*"] + verbs: ["*"] +#+end_src + +Now we can create a RoleBinding to marry the Role (tiller-manager) with SA (tiller) + +#+begin_src yaml +kind: RoleBinding +apiVersion: rbac.authorization.k8s.io/v1beta1 +metadata: + name: tiller-binding + namespace: myorg-users +subjects: +- kind: ServiceAccount + name: tiller + namespace: myorg-system +roleRef: + kind: Role + name: tiller-manager + apiGroup: rbac.authorization.k8s.io +#+end_src + +One more thing, we need to give tiller access to read configmaps in myorg-system so that it can store release information. 
+ +#+begin_src yaml +kind: Role +apiVersion: rbac.authorization.k8s.io/v1beta1 +metadata: + namespace: myorg-system + name: tiller-manager +rules: +- apiGroups: ["", "extensions", "apps"] + resources: ["configmaps"] + verbs: ["*"] +#+end_src + +#+begin_src yaml +kind: RoleBinding +apiVersion: rbac.authorization.k8s.io/v1beta1 +metadata: + name: tiller-binding + namespace: myorg-system +subjects: +- kind: ServiceAccount + name: tiller + namespace: myorg-system +roleRef: + kind: Role + name: tiller-manager + apiGroup: rbac.authorization.k8s.io +#+end_src + +* Istio + +Istio is a service mesh. When you deploy microservices to cloud platforms, you want to control the interaction between the services. You may want: +- explicit rules like, frontend should talk to backend only, the frontend cannot talk to database etc +- to roll out a new version and test perform A/B testing on it, with say 1% of the traffic. So, you want to route 1% of the network traffic on a service to this new version. +- to mirror the entire traffic to a new version of the service to check out how it performs under production load before deploying it +- to set timeouts, retries on requests so that you get circuit breaking features etc (like Netflix's Hystrix, GoJek's Heimdall) + + +Istio helps with all that :top: + +It basically injects a sidecar container in all your pods (running ~envoy~, which is a proxy server, like we can operate nginx) and hijacks the network. Everything goes thru istio(via envoy) now. This allows it to control the network at a very fine grained level, like setting the network rules, timeouts, retries etc. +All the features are implemented by the envoy proxy, istio makes it accessible and integrated with COs (cloud orchestrators) + +This act of hijacking the network, by having an agent running along all the containers forms a "service mesh" +~The term service mesh is used to describe the network of microservices that make up such applications and the interactions between them~ + +Istio documentation says: + +#+BEGIN_QUOTE +As a service mesh grows in size and complexity, it can become harder to understand and manage. Its requirements can include discovery, load balancing, failure recovery, metrics, and monitoring. A service mesh also often has more complex operational requirements, like A/B testing, canary releases, rate limiting, access control, and end-to-end authentication. +#+END_QUOTE + +We manage and configure Istio using its control plane. +#+BEGIN_QUOTE +Istio is platform-independent and designed to run in a variety of environments, including those spanning Cloud, on-premise, Kubernetes, Mesos, and more. You can deploy Istio on Kubernetes, or on Nomad with Consul. +#+END_QUOTE + +Istio is designed to be customizable, and can be extended, and integrated with existing solutions for ACLs, logging, monitoring etc. + +Istio has a sidecar injector, which looks out for new pods and automatically injects envoy sidecar in the pod. It works by registering the sidecar injector as a admission webhook which allows it to dynamically change the pod configuration before the pod starts. + +** Architecture +The Istio service mesh is logically split into a data plane and control plane + +- data plane + - consists of a set of intelligent proxies deployed as sidecars which mediate and control all network communication b/w microservices. 
+ - it also has Mixer which is a general purpose policy and telemetry hub (so it gets resource usage (cpu, ram) etc) +- control hub + - manages and configures the components in the data plane - the proxy and mixers + + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-10 11:15:33 +[[file:assets/screenshot_2018-09-10_11-15-33.png]] + +Note here, Service A is trying to talk to Service B. The traffic goes thru envoy proxy. It is allowed only after the policy checks are passed by the Mixer in the Control Plane API. + +The control plane also has Pilot, and Citadel + +** Istio Components +*** Envoy +Envoy is an high performance C++ proxy designed to be used as a sidecar container in service meshes. +Istio uses many builtin envoy features like: +- dynamic service discovery +- load balancing +- tls termination +- http/2 and grpc proxies +- circuit breakers +- health checks +- staged rollouts with %based traffic split +- fault injection +- rich metrics + +These features allow istio to extract a wealth of signals about traffic behavior as ~attributes~, which can be used to enforce policy decisions, and monitoring purposed. + +*** Mixer +Istio’s Mixer component is responsible for policy controls and telemetry collection. It provides backend abstraction and intermediation - ie it abstracts away the rest of Istio from the implementation details of individual infrastructure backends. +It takes the attributes provided by envoy and enforces the policy. + +Mixer includes a flexible plugin model. This model enables istio to interface with a variety of host environments and infrastructure backends. This, Mixer abstracts envoy proxy and istio managed services from these details + +*** Pilot +Pilot provides service discovery, traffic management capabilities for intelligent routing (A/B, canary deployments), resiliency etc. +So, Pilot is used as in, it directs the packets across the service mesh. The Mixer says this route is allowed, the Pilot takes over from there. It dictates the timeout, the retries, the circuit breaking etc. + +Pilot consumes high level routing rules provided by the user from the control plane and propagates them to the sidecars at runtime. + +Pilot abstracts platform specific service discovery mechanisms for both operator and proxy (like CRI-O). So now, the operator (cluster administrator) can use the same interface for traffic management across any CO - this is like the operator is given the CRI interface. +If any project wants to be used as a proxy for Istio (like maybe Nginx), they just have to implement the Envoy data plane APIs and istio will be happy to talk to them. + +*** Citadel +It provides service-to-service and end-user authentication with built-in identity and credential management. + +#+BEGIN_QUOTE +You can use Citadel to upgrade unencrypted traffic in the service mesh. Using Citadel, operators can enforce policies based on service identity rather than on network controls. Starting from release 0.5, you can use Istio’s authorization feature to control who can access your services. +#+END_QUOTE + +*** Galley +It validates user authored Istio API configuration. Over time it will be at the forefront of the Control Plane and will be responsible for configuration ingestion, processing and distribution component of Istio. It will be the sole interface to the user accessing the control plane. + + +** Traffic Management + +Istio's traffic management essentially decouples traffic flow and infrastructure scaling. 
This allows us to specify via Pilot the traffic rules which we want, rather than saying which specific pods/VMs should receive traffic. + +#+BEGIN_QUOTE +For example, you can specify via Pilot that you want 5% of traffic for a particular service to go to a canary version irrespective of the size of the canary deployment, or send traffic to a particular version depending on the content of the request. +#+END_QUOTE + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-10 11:58:53 +[[file:assets/screenshot_2018-09-10_11-58-53.png]] + + +*** Pilot and Envoy +#+BEGIN_QUOTE +The core component used for traffic management in Istio is Pilot, which manages and configures all the Envoy proxy instances deployed in a particular Istio service mesh. +Pilot lets you specify which rules you want to use to route traffic between Envoy proxies and configure failure recovery features such as timeouts, retries, and circuit breakers. +It also maintains a canonical model of all the services in the mesh and uses this model to let Envoy instances know about the other Envoy instances in the mesh via its discovery service. +#+END_QUOTE + +Each envoy instance has the load balancing information from Pilot, so it intelligently distributes traffic to the right place (following its specified routing rules), performs health checks etc + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-10 12:02:58 +[[file:assets/screenshot_2018-09-10_12-02-58.png]] + +Pilot abstracts the CO from envoy proxy and the user. To envoy, it exposes the Envoy API and to the user, it employs the Rules API. + +Want to be used by Istio as proxy? Implement the Envoy API +Want to be used by Istio to configure traffic rules in the CO? Contribute to Istio a platform adapter. + +Istio maintains a canonical representation of services in the mesh that is independent of the underlying platform - in the Abstract Model layer. + +For eg, the Kubernetes adapter (eg of a platform adapter) in Pilot implements the necessary controllers to watch the Kubernetes API server for changes to pod registration information, ingress resources, 3rd party resources that store traffic management rules etc. +This data is translated into the canonical representation. An Envoy-specific configuration is then generated based on the canonical representation. + + +You can specify high-level traffic management rules through Pilot’s Rule configuration. These rules are translated into low-level configurations and distributed to Envoy instances via the process outlined above. Once the configuration is set on the envoy instance, it doesn't need to talk to Pilot. So, the Pilot just configures envoy and then it gets working on it's own. + +So, essentially: + +High level rules -> abstract model -> envoy configuration -> envoy + + +Istio in it's canonical model of the services, maintains a version of all the services it has. It isa finer-grained way to subdivide services. You can specify the traffic routing rules based on the service versions to provide additional control over traffic b/w services. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-10 12:14:41 +[[file:assets/screenshot_2018-09-10_12-14-41.png]] + +Note how the user is putting the rule using the Rules API component of Pilot. It is converting that to envoy configuration and sending it over. The envoy starts respecting it immediately since it is a built in feature for it. 
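As a concrete sketch of such a high-level rule (the ~reviews~ service and its ~v1~/~v2~ subsets below are hypothetical names, not something Istio ships with), a weight-based VirtualService plus a DestinationRule is roughly what an operator hands to Pilot, and what Pilot compiles into Envoy route configuration:

#+begin_src yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews-vs
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 95
    - destination:
        host: reviews
        subset: v2        # canary version receives 5% of the traffic
      weight: 5
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-dr
spec:
  host: reviews
  subsets:                # subsets map "versions" to pod labels
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
#+end_src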
+Routing rules allow Envoy to select a version based on conditions such as headers, tags associated with source/destination, and/or by weights assigned to each version. + +So, one way to describe Istio is: ~Istio is just a platform agnostic way to configure Envoy proxy.~ + +Istio does not provide a DNS. Applications can try to resolve the FQDN using the DNS service present in the underlying platform (kube-dns, mesos-dns, etc.). + +*** Ingress and Egress + +Since all inbound and outbound (ingress and egress) traffic is proxied thru the envoy side car, we can add retries, timeout, circuit breaking etc to the traffic and obtain detailed metrics on the connections to these services. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-10 12:54:56 +[[file:assets/screenshot_2018-09-10_12-54-56.png]] + +*** Failure handling +Istio’s traffic management rules allow you to set defaults for failure recovery per service and version that apply to all callers. + +Note, this is why we put the retry defaults in the virtualservice for draup services + +#+begin_src yaml +apiVersion: networking.istio.io/v1alpha3 +kind: VirtualService +metadata: + name: foobar-vs +spec: + hosts: + - "*.foobar.com" + http: + - timeout: 3s + route: + - destination: + host: foobar-se + weight: 100 + retries: + attempts: 3 +#+end_src + +#+BEGIN_QUOTE +However, consumers of a service can also override timeout and retry defaults by providing request-level overrides through special HTTP headers. With the Envoy proxy implementation, the headers are ~x-envoy-upstream-rq-timeout-ms~ and ~x-envoy-max-retries~, respectively. +#+END_QUOTE + +** Policies and Telemetry + +#+BEGIN_QUOTE +Istio provides a flexible model to enforce authorization policies and collect telemetry for the services in a mesh. +They include such things as access control systems, telemetry capturing systems, quota enforcement systems, billing systems, and so forth. Services traditionally directly integrate with these backend systems, creating a hard coupling and baking-in specific semantics and usage options. +#+END_QUOTE + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-10 13:27:02 +[[file:assets/screenshot_2018-09-10_13-27-02.png]] + + + +The Envoy sidecar logically calls Mixer before each request to perform precondition checks, and after each request to report telemetry. + +The sidecar has local caching such that a large percentage of precondition checks can be performed from cache. Additionally, the sidecar buffers outgoing telemetry such that it only calls Mixer infrequently. + +Mixer has 2 components from a high level: +- Backend abstraction + - it abstracts away the backend and provides the rest of istio with a consistent abstraction of the backend +- Intermediation + - Mixer allows operators(cluster admin) to have fine grained control b/w the mesh and infrastructure backends + +Mixer also has reliability and scalability benefits. Policy enforcement and telemetry collection are entirely driven from configuration. + +*** Adapters +#+BEGIN_QUOTE +Mixer is a highly modular and extensible component. One of its key functions is to abstract away the details of different policy and telemetry backend systems, allowing the rest of Istio to be agnostic of those backends. +#+END_QUOTE + +Note, it supports plugins for both policy and telemetry backend systems. 
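To make the adapter model a little more concrete, here is a rough sketch of what a Mixer ~rule~ looked like around Istio 1.0; the handler and instance names are hypothetical, and the exact kinds/fields changed between Istio versions. The rule matches HTTP traffic and tells Mixer to feed a metric instance to a Prometheus handler.

#+begin_src yaml
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: promhttp
  namespace: istio-system
spec:
  match: context.protocol == "http"   # attribute expression evaluated by Mixer
  actions:
  - handler: handler.prometheus       # adapter: the telemetry backend to call
    instances:
    - requestcount.metric             # instance: what to report to it
#+end_src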
+ + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-10 14:23:45 +[[file:assets/screenshot_2018-09-10_14-23-45.png]] + + + +* CSI - Container Storage Interface + +** RPCs +RPC allows one computer to call a subroutine in another computer. It is a high level model for client-server communication. + +In a microservices, one could use RPCs and not HTTP APIs since RPCs have several advantages over HTTP APIs. + +*** History +In the early days, there were protocols that were great for human to application communication, like Email etc, but none for computer-computer application protocols. + +Just like we have procedure calls, (which are function calls), we have remote procedure calls which call the procedures that aren't on the same machine, but are remote. + +The idea was that, since in a local procedure call, the compiler gives us the ability to make the call, the idea was that the compiler could play a role in enabling remote procedure calls as well. + + +*** General +When the client sends a RPC, it blocks till the server sends the response back. The server receives the request and starts process execution. +There has to be some code on the client machine that knows this is a RPC and makes the needed network communication and get the answer and present it as the return value of the procedure call. + +The RPC framework knows which server to contact, which port to contact, how to serialize the call, marshal it's arguments etc. On the server side, similar stub should be present to unmarshall the arguments etc. + +** Intro to gRPC - https://www.youtube.com/watch?v=RoXT_Rkg8LA +We know how apps talk to one another. It is SOAP, REST (HTTP + JSON). REST is just a architectural principle, about how to structure your API when you use HTTP+JSON etc. + +REST is not that great, actually. It has some advantages: +- easy to understand - text based protocols +- great tooling to inspect and modify etc +- loose coupling b/w client and server makes changes relatively easy +- high quality implementation in every language. + +It has disadvantages as well: +- No formal API contract - there is documentation (swagger etc), but no formal contract +- Streaming is difficult +- Bidirectional streaming not possible (that's why we had to invent websockets etc) +- Operations are difficult to model ("restart the computer", should that be a POST, GET call?) +- Many times your services are just HTTP endpoints, and don't follow the REST principles nicely. +- Not the most efficient since we are using gRPC + + +GRPC solves all these problems. You first define a contract using GRPC IDL - GRPC Interface Definition Language. + +#+begin_src +service Identity { + rpc GetPluginInfo(GetPluginInfoRequest) + returns (GetPluginInfoResponse) {} + + rpc GetPluginCapabilities(GetPluginCapabilitiesRequest) + returns (GetPluginCapabilitiesResponse) {} + + rpc Probe (ProbeRequest) + returns (ProbeResponse) {} +} + +message GetPluginInfoResponse { + // The name MUST follow reverse domain name notation format + // (https://en.wikipedia.org/wiki/Reverse_domain_name_notation). + // It SHOULD include the plugin's host company name and the plugin + // name, to minimize the possibility of collisions. It MUST be 63 + // characters or less, beginning and ending with an alphanumeric + // character ([a-z0-9A-Z]) with dashes (-), underscores (_), + // dots (.), and alphanumerics between. This field is REQUIRED. + string name = 1; + + // This field is REQUIRED. Value of this field is opaque to the CO. 
+ string vendor_version = 2; + + // This field is OPTIONAL. Values are opaque to the CO. + map manifest = 3; +} +#+end_src + +Here, we have define a ~Identity~ service, which supports 3 procedure calls - ~GetPluginInfo~, ~GetPluginCapabilities~, ~Probe~. +See how the ~GetPluginInfo~ takes in the ~GetPluginInfoRequest~ and returns a ~GetPluginInfoResponse~. Then we defined what the ~GetPluginInfoResponse~ looks like below that. + +This is a formal definition, with types. We can run a compiler thru it. + +~protoc --proto_path=. --python_out=plugins-grpc:./py calls.proto~ + +This will generate client side code that can call the RPC. +Similarly, we can generate the server side code as well. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-16 13:58:12 +[[file:assets/screenshot_2018-09-16_13-58-12.png]] + + +Grpc is the framework which makes the RPC possible, it is a implementation of the RPC protocol. + +GRPC is a protocol built on top of HTTP/2 as the transport protocol. The messages that you send and receive are serialized using Protocol Buffers - you can use some other. + +Clients open one long lived connection to a grpc server. A new HTTP/2 stream for each RPC call. This allows multiple simultaneous inflight RPC calls. Allows client side and server side streaming. + +The compiler can generate stubs in 9 languages. The reference implementation of grpc is in C. Ruby, Python are all bindings to the C core. + +Gprc supports plugabble middleware, on the client and server side which can be used to do logging etc. + +Swagger solves some of the problems around contract, in that people can make swagger IDL like contracts, but it is still text based protocol, doesn't solve bidirectional streaming. Also, the swagger IDL is verbose. + +One problem is that you can't call this from the browser. The fact that it relies on having an intimate control over the HTTP2 connection means you need to have a shim layer in between. + +Many companies have exposed grpc APIs to the public - like Google Cloud (their pub sub api, speech recognition api) etc. + +** gRPC - https://www.youtube.com/watch?v=OZ_Qmklc4zE + +Grpc - gRPC Remote Procedure Calls. GRPC by itself is agnostic of schema, only when it is used with protocol buffers, it has enforced schema. + +It is a recursive fullform. It is a open source, high performance "RPC framework" + +It is the next generation of Snubby RPC build and used inside Google. + + +*** Getting Started +- defining a service in a ~.proto~ file using protocol buffers IDL +- generate the client and server stub using the protocol buffer compiler +- extend the generated server class in your code to fill in the business logic +- invoke it using the client stubs + +*** An aside: Protocol Buffers +Google's lingua franca for serializing data - RPCs and storage. It is binary (so compact), structures can be extended in backward and forward compatible ways. + +It is strongly typed, supports several data types + + +*** Example +Let's write an example called Route Guide. There are clients traveling around and talking to a central server. Or, it can be 2 friends traveling along 2 different routes and talking to each other. + +We have to decide: what types of services do we need to expose? What messages to send? + +#+begin_src +syntax = "proto3"; + +// Interface exported by the server. +service RouteGuide { + // A simple RPC. + // + // Obtains the feature at a given position. 
+ // + // A feature with an empty name is returned if there's no feature at the given + // position. + rpc GetFeature(Point) returns (Feature) {} + + // A Bidirectional streaming RPC. + // + // Accepts a stream of RouteNotes sent while a route is being traversed, + // while receiving other RouteNotes (e.g. from other users). + rpc RouteChat(stream RouteNote) returns (stream RouteNote) {} +} + +// Points are represented as latitude-longitude pairs in the E7 representation +message Point { + int32 latitude = 1; + int32 longitude = 2; +} + +// A feature names something at a given point. +// +// If a feature could not be named, the name is empty. +message Feature { + // The name of the feature. + string name = 1; + + // The point where the feature is detected. + Point location = 2; +} + +// A RouteNote is a message sent while at a given point. +message RouteNote { + // The location from which the message is sent. + Point location = 1; + + // The message to be sent. + string message = 2; +} +#+end_src + +Grpc supports 2 types of RPCs: +- unary + - client sends a request + - server sends a response +- client streaming rpc + - client sends multiple messages + - server sends one response +- server streaming rpc + - client sends one response + - server sends multiple messages +- bi-directional streaming rpc + - client and server independently send multiple messages to each other + +Now running the proto compiler on this will give you the client and server stubs. You just have to implement the business logic using these stubs. + +Grpc is extensible: +- interceptors +- transports +- auth and security - plugin auth mechanisms +- stats, monitoring, tracing - has promotheseus, zipkin integrations +- service discovery - consul, zookeeper integrations +- supported with proxies - envoy, nginx, linkerd + +Grpc has deadline propagation, cancellation propagation + +The wire protocol used by grpc is based on HTTP/2 and the specification is well established. + + +** Container Storage Interface - https://www.youtube.com/watch?v=ktwY1anKN58 +Kubernetes needs storage. So, initially the started with *in-tree plugins* for various storage provisioners, like aws ebs, google persistent disks etc. This was not ideal because: +- the release cycles of the plugins was tied to release of Kubernetes +- the plugin code ran with same access level as the Kubernetes core, so a bug could take down Kubernetes +- the code of the plugins was in the code, and maintaining it was difficult because the aws folks had to get in and make changes and the Kubernetes core team would have to review it for example. +- source code had to be opensourced, regardless of whether the storage vendors wanted that or not + + +This led to the creation of the *out of tree plugins*. First came flex volumes. These needed binaries to be present on each node(even master), like the CNI binaries and needed root access to the node. This also means, if the master is not accessible (like in many managed Kubernetes services), you can't attach there. + +The latest and greatest out of tree implementation is CSI - Container Storage Interface. It is a spec, which is created by folks from not just Kubernetes but other COs like Mesos. Now, any storage vendor who wants to provide storage, has to just implement the CSI spec and his storage solution should be pluggable into the COs. + + +The In tree volume plugins won't be deprecated because Kubernetes has a very strict deprecation policy. So, the logic will be changed to proxy the calls thru the CSI plugin internally. 
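From the user's side, a CSI driver is still consumed through the usual Kubernetes storage objects. A minimal sketch (the provisioner name ~csi.example.vendor.com~ is hypothetical; each driver publishes its own):

#+begin_src yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-csi
provisioner: csi.example.vendor.com   # the CSI driver that will handle CreateVolume etc.
parameters:
  type: ssd                           # driver-specific parameters
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-csi
  resources:
    requests:
      storage: 10Gi
#+end_src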
+ + +The talk is by Jie Yu, Mesosphere. He is the co-author of the CSI spec. + +*** Motivations and background + +We have several COs and several storage vendors. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-16 16:47:15 +[[file:assets/screenshot_2018-09-16_16-47-15.png]] + +We wanted a common spec so that the vendors could provide volumes for many COs. + +Also, a new standard was created because the existing storage interfaces had one or more of the following issues: +- cli based interface +- lack of idempotency on APIs +- in tree interface +- tightly coupled with an implementation +- too heavyweight + + +*** Goals + +So, the Goals for CSI were: +- interoperable +- vendor neutral +- focus on specification (CSI committee won't implement anything, only Storage Vendors will) +- control plane only +- keep it simple and boring + +First meet on Feb 2017. In Dec 2017, CSI 0.1 released. +Kubernetes 1.9 added alpha support for CSI. + +*** Design Choices + +1. in-tree vs out-of-tree + - in-tree means the volume plugin code will be in the same binary, will be running in the same OS process as CO process + - possible to have in-tree but out of process, but this gets complicated since we then have to define both north bound API (talk to CO), south bound API (talk to drivers) + - in out-of-tree implementations, the code lives outside the CO core codebase and the CO calls the plugin via CLI or service interfaces. Eg would be flex volumes. + - It is possible to be out of tree, but in process via dynamic library loading but it is complicated and won't work if different COs are in different languages. + - *Decision*: The committee went with out of tree, out of process + +2. Service vs. CLI + - cli - the vendor deploys binaries on hosts, CO invokes the binary with args. + - this is followed by CNI, flex volumes + - long running service - vendor deploys services on hosts (via systemd, Kubernetes etc), CO makes requests to the service + - *Decision*: The committee went with service based because: + - services are easier to deploy + - root access is required to install CLI binaries + - deploying CLI binary dependencies is not easy + - fuse based backends required long running processes, so why not one more + + +3. Idempotency + - this is good for failure recovery + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-16 17:07:49 +[[file:assets/screenshot_2018-09-16_17-07-49.png]] + +If CO says create volume A, but the response is lost, the CO will retry and the Storage Plugin will create another volume and the first one will be lost. This is avoided if the API is idempotent. +For this, the call has to give a id for the volume. + +4. Wire protocol: gRPC + +This was chosen because it is language agnostic, easy to write specification, big community with lots of tooling, real production users. +It is also part of the cncf + +5. Async vs. Sync APIs + +Async would mean when the CO requests a volume, the plugin returns immediately and says okay, I will create the volume and send a message when it is ready. + +Sync would mean the CO blocks + +*Decision*: Synchronous APIs +- keep it simple, async is more complex +- async does not solve the long running operation problem - if the operation takes too long, the CO should be able to timeout, this is possible with a sync API +- Plugin implementation can be async, the interface b/w the CO and the plugin has to be synchronous + +6. 
Plugin packaging and deployment +How should the plugin be packaged and deployed? + +*Decision*: Don't dictate +The spec does not dictate that. The only requirement is that it should provide a grpc endpoint over unix socket for now. + +So, possible options for deployment: +- container deployed by CO (eg DaemonSet in Kubernetes) +- systemd services deployed by cluster admin +- can have running services someplace else, the CO connects via grpc (so you can have a company that provides these endpoints) + +7. Controller and Node services + - Identified 2 sets of control plane operations that have different characteristics - Node and Controller services. + - Node services are the services that have to run on the node that is using the volume + - OS mount/unmount, iSCSI initiator + - Controller services can be run on any node + - volume attach/detach (eg EBS), volume creation/deletion + - *Decision*: Created 2 above mentioned sets of APIs + +There are several options for deploying the node services and controller services. + - Option 1: Split Controller and Node services + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-16 17:03:45 +[[file:assets/screenshot_2018-09-16_17-03-45.png]] + +Here, we have the Node Service deployed on all the nodes, and there is a controller service running "somewhere" that the master can call. + + - Option 2: Headless + +Some COs don't have the master node, everything is just a node. In this case, you can bundle both the node and the controller service on a single container and the CO will talk to it. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-16 17:05:43 +[[file:assets/screenshot_2018-09-16_17-05-43.png]] + + + +*** Spec Overview + +There are 3 core grpc services +1. Identity +Gives basic information about the plugin, like the GetPluginInfo, GetPluginCapabilities, Probe (to see if the plugin is healthy or not etc). + +2. Controller + + +- ~CreateVolume~ +- ~DeleteVolume~ +- ~ControllerPublishVolume~ - make the volume available on the node (can be just "attach" on the node, (not mount, attach)) +- ~ControllerUnpublishVolume~ +- ~ValidateVolumeCapabilities~ +- ~ListVolumes~ +- ~GetCapacity~ +- ~ControllerGetCapabilities~ + +Many of the calls are optional. + + +3. Node +This service has to run on the node where the volume will be used + +- ~NodeStageVolume~ - this should be called only once for a volume on a given node (eg ~mount~) +- ~NodeUnstageVolume~ +- ~NodePublishVolume~ +- ~NodeUnpublishVolume~ +- ~NodeGetId~ +- ~NodeGetCapabilities~ + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-16 17:54:10 +[[file:assets/screenshot_2018-09-16_17-54-10.png]] +The lifecycle for the volume. +Note, first the ControllerPublishVolume is called to make the volume "ready". 
Then the NodePublishVolume is called to "mount" etc + +If you want to write a plugin for GCE PD say, +- need both controller and node services +- create the persistent disk with CreateVolume (use the google cloud apis underneath) +- attach the disk in the ControllerPublishVolume +- format and mount the volume in the NodeStageVolume +- perform a bind mount in the NodePublishVolume + +(Reference implementation in: github.com/googlecloudplatform/compute-persistent-disk-csi-driver) + +Can create a plugin for LVM - logical volume manager etc + +*** CO Integrations + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-16 17:58:56 +[[file:assets/screenshot_2018-09-16_17-58-56.png]] + + +We have 2 types of pods (each with a sidecar provided by the Kubernetes team. The side car sits b/w the volume plugin and the apiserver. It understands the CSI spec and the apiserver also. It translates commands between the two) +- controller pod - running the controller service of the plugin +- node pod - running the node service of the plugin + +*** Governance model +- Inclusive and open +- independent of any single CO +- try to avoid a storage vendor war + +*** Future +- topology aware + - the CO needs to know whether a volume can be used within a given zone or not +- snapshot support +- volume resizing +- plugin registration (to tell the CO here the unix socket for the plugin is) + - Kubernetes already has it's own spec for this - called device plugin +- smoke test suite + +** External Provisioner - https://github.com/kubernetes-csi/external-provisioner + +Ref: https://kubernetes-csi.github.io/docs/CSI-Kubernetes.html + +Recall we had sidecar containers running along with the CSI volumes which are there to translate between the CSI speak that the volume speaks and the apiserver. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-09-16 19:28:37 +[[file:assets/screenshot_2018-09-16_19-28-37.png]] + +Here, we see that on a node, there are 2 containers - the side car container, and the CSI driver container. There are 3 sidecars that are needed to manage Kubernetes events and make appropriate calls to the CSI driver. + +They are: external attacher, external provisioner, driver registrar + +*** External Attacher +This implements the ~Controller~ services. It watches the Kubernetes ~VolumeAttachment~ objects and triggers CSI ~ControllerPublish~ and ~ControllerUnpublish~ operations against the driver endpoint. + +*** External Provisioner +It looks out for ~PersistentVolumeClaim~ objects and triggers the ~CreateVolume~ and ~DeleteVolume~ operations against the driver endpoint. + +*** Driver Registrar +It is a sidecar container that registers the CSI driver with the kubelet and adds the drivers custom NodeId to a label on the Kubernetes Node API object. + + +The kubelet runs on every node and is responsible for making the CSI calls ~NodePublish~ and ~NodeUnpublish~. +Very important document - https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/container-storage-interface.md + + +The design docs say that there will be an in-tree csi plugin: +#+BEGIN_QUOTE +To support CSI Compliant Volume plugins, a new in-tree CSI Volume plugin will be introduced in Kubernetes. This new volume plugin will be the mechanism by which Kubernetes users (application developers and cluster admins) interact with external CSI volume drivers. 
+#+END_QUOTE + +However, + +#+BEGIN_QUOTE +Provision/delete and attach/detach must be handled by some external component that monitors the Kubernetes API on behalf of a CSI volume driver and invokes the appropriate CSI RPCs against it. +#+END_QUOTE + +So, we have external provisioner and external attacher respectively. + +** Mount Propagation +ref: https://medium.com/kokster/kubernetes-mount-propagation-5306c36a4a2d +Mount Propagation, was introduced in Kubernetes v1.8 + +The filesystem that we browse is called the VFS - virtual file system. The File system (or more accurately the VFS) hides the complexity of writing to the actual physical location on the disk. + +The VFS has this struct in mount.h + +#+begin_src c +struct vfsmount { + struct list_head mnt_hash; + struct vfsmount *mnt_parent; /* fs we are mounted on */ + struct dentry *mnt_mountpoint; /* dentry of mountpoint */ + struct dentry *mnt_root; /* root of the mounted + tree*/ + struct super_block *mnt_sb; /* pointer to superblock */ + struct list_head mnt_mounts; /* list of children, + anchored here */ + struct list_head mnt_child; /* and going through their + mnt_child */ + atomic_t mnt_count; + int mnt_flags; + char *mnt_devname; /* Name of device e.g. + /dev/dsk/hda1 */ + struct list_head mnt_list; +}; +#+end_src + +Look at ~dentry~ first + +The ~dentry~ struct is used to represent "inode, file name, parent directory, other files in same directory (siblings), and sub-directory" + + +#+begin_src c +struct dentry { + struct inode *d_inode; + struct dentry *d_parent; + struct qstr d_name; + struct list_head d_subdirs; /* sub-directories/files */ + struct list_head d_child; /* sibling directories/files */ + ... +} +#+end_src +In the ~vfsmount~ struct from earlier, ~mnt_mountpoint~ and ~mnt_root~ are of type ~dentry~. +When the OS mounts, a ~vfsmount~ is created with ~mnt_mountpoint~ set to ~/~ + +#+BEGIN_QUOTE +This operation of creating vfsmount entry for a specific dentry is what is commonly referred to as Mounting. +A tree of dentry structs form the file-system. + + root + / \ + tmp usr + / + m1 + / \ + tmp usr + / + m1 +#+END_QUOTE + +~dentry~ struct is used when the ~cd~ is invoked + +So, mount is just the creating of the ~vfsmount~ and ~dentry~ data structures. Since it is only that, all these are valid: +- mount a device at one path - the simplest case +- mount a device multiple times at different paths - different vfsmount, different dentry +- mount a device multiple times at same paths - different vfsmount, same dentry +- mount a directory at another path + diff --git a/operating_systems.org b/operating_systems.org index 568d175..6fb6a90 100644 --- a/operating_systems.org +++ b/operating_systems.org @@ -1477,7 +1477,7 @@ we have different notes for both fellas #+ATTR_ORG: :height 400 [[./assets/ucbOS_32.png]] -this won't work - A leaves a noteA, B leaves a noteA, nobody buys any milk. +this won't work - A leaves a noteA, B leaves a noteB, nobody buys any milk. this reduces the probability of synchronization problem but it can still happen original unix had these a lot diff --git a/ruby.org b/ruby.org new file mode 100644 index 0000000..d53d1be --- /dev/null +++ b/ruby.org @@ -0,0 +1,333 @@ +* Ruby + +Ruby is an interpreted language with dynamic typing used for scripting mostly. + +#+begin_src ruby +puts "hello there" +#+end_src + +is the hello world + +#+begin_src ruby +puts "hello, #{1 + 1}" +# 2 +#+end_src + +You can put expressions inside the `#{}` and they will be evaluated. 
+#+begin_src ruby +days = "Mon Tue Wed Thu Fri Sat Sun" +puts "Here are the days: #{days}" +#+end_src + +Use triple quotes to write multiline strings. + +#+begin_src ruby +fat_cat = """ +I'll do a list: +\t* Cat food +\t* Fishies +\t* Catnip\n\t* Grass +""" +# this will print +# I'll do a list: +# * Cat food +# * Fishies +# * Catnip +# * Grass +#+end_src + +~puts~ is simply ~print~ with a newline + +Get user input: + +#+begin_src ruby +print "how old are you" +age = gets.chomp +puts "okay so you are #{age} years old" +#+end_src + +You can convert the string to integer using +~age = gets.chomp.to_i~, similarly, ~to_f~ makes it a float + +You can accept the arguments from the user + +#+begin_src ruby +first, second, third = ARGV + +puts "Your first variable is: #{first}" # here, first has the 1st argument for eg +#+end_src + +To accept input, one should use ~$stdin.gets.chomp~ + + +We can read a file using: + +#+begin_src ruby +filename = ARGV.first # or file_again = $stdin.gets.chomp +txt = open(filename) +puts txt.read +#+end_src + + +** Methods + +#+begin_src ruby +def foo(x) + x*2 +end +#+end_src + +The last statement is the return type. +Calling is ~foo 2~, or ~foo(2)~ + + +Recall what blocks are, they are a block of code, that can be executed. +All methods accept an implicit block. You can insert it whereever needed with the ~yield~ keyword + +#+begin_src ruby +def surround + puts '{' + yield + puts '}' +end + +puts surround { puts 'hello world' } +#=> { +#=> hello world +#=> } +#+end_src + +The block can also be converted to a ~proc~ object, which can be called with ~.call~ invocation. + +#+begin_src ruby +def guests(&block) + block.class #=> Proc + block.call(4) +end +#+end_src + +The arguments passed to call are given to the block + +#+begin_src ruby +def guests(&block) + block.class #=> Proc + block.call(4) +end + +guests { |n| "You have #{n} guests." } +# => "You have 4 guests." +#+end_src + +By convention, all methods that return booleans end with a question mark. + +By convention, if a method name ends with an exclamation mark, it does something destructive like mutate the receiver. Many methods have a ! version to make a change, and a non-! version to just return a new changed version. +Like we saw in ~reverse!~ + +** Classes + +Classes can be defined with a ~class~ keyword. +There are: +- class variables ~@@foo~, which are shared by all instances of this class. +- instance variables ~@foo~, which can be initialized anywhere, belong to the particular instance of the class + + +Basic setter: +#+begin_src ruby +def name=(name) + @name = name + end +#+end_src + +Basic initializer: +#+begin_src ruby + def initialize(name, age = 0) + # Assign the argument to the 'name' instance variable for the instance. + @name = name + # If no age given, we will fall back to the default in the arguments list. 
+ @age = age + end +#+end_src + +Basic getter method +#+begin_src ruby + def name + @name + end +#+end_src + +Example: +#+begin_src ruby +class Foo + def initialize(name, age=0) + @name = name + end + + def getName + @name + end +end + +f = Foo.new "Foobar" +puts f.getName +# Foobar +#+end_src + + +The getter/setter is such a pattern that there are shortcuts to auto create them: + +~attr_reader :name~ + +So, this works now: + +#+begin_src ruby +class Foo + attr_reader :name + def initialize(name, age=0) + @name = name + end + + def getName + @name + end +end + +f = Foo.new "Foobar" +puts f.getName +puts f.name +# Foobar +# Foobar +#+end_src + +Note, when I call ~f.name = "bar"~, I am actually doing ~f.name=("bar")~ aka ~f.name= "bar"~ +So I better have a method called ~name=~ + +Another example: +#+begin_src ruby +class Person + attr_reader :name + + def initialize(name) + @name = name + end + + def name=(name) + @name = name + end +end + +john = Person.new("John") +john.name = "Jim" +puts john.name # => Jim +#+end_src + +It can also be a single line ~attr_accessor :name~ + +A class method uses self to distinguish from instance methods. + It can only be called on the class, not an instance. +#+begin_src ruby +class Human + def self.say(msg) + puts msg + end +end + +puts Human.say "hello" +# hello +#+end_src + + +Variable's scopes are defined by the way we name them. +Variables that start with $ have global scope. + +#+begin_src ruby +$var = "I'm a global var" +defined? $var #=> "global-variable" +#+end_src + +Variables that start with @ have instance scope. + +#+begin_src ruby +@var = "I'm an instance var" +defined? @var #=> "instance-variable" +#+end_src + +Variables that start with @@ have class scope. +#+begin_src ruby +@@var = "I'm a class var" +defined? @@var #=> "class variable" +#+end_src + +Variables that start with a capital letter are constants. +#+begin_src ruby +Var = "I'm a constant" +defined? Var #=> "constant" +#+end_src + + +** Modules + +Including modules binds their methods to the class instances. + +#+begin_src ruby +class Person + include ModuleExample +end +#+end_src + +Extending modules binds their methods to the class itself. +#+begin_src ruby +class Book + extend ModuleExample +end +#+end_src + + + + + + +* Ruby misc +~2+2~ +The ~+~ symbol is just syntactic sugar for the calling the ~+~ function on the number +So, these are equivalent + +#+begin_src ruby +puts 2+2 +puts 2.+ 2 +#+end_src + +Note, in ruby, the syntax to call functions is: +~fn_name [ ... ]~ + + +For loop are like this elsewhere, + +#+begin_src ruby +for counter in 1..5 + puts "iteration #{counter}" +end +#+end_src + +In ruby, they look like this: +#+begin_src ruby +(1..5).each do |counter| + puts "iteration #{counter}" +end +#+end_src + +This is a block. +The 'each' method of a range runs the block once for each element of the range. +The block is passed a counter as a parameter. 
+ +The lambda nature of blocks is more easily visible by alternative equivalent form: + +~(1..5).each { |counter| puts "iteration #{counter}" }~ + +But the general syntax to follow is: + +#+begin_src ruby +array.each do |foo| + puts foo +end +#+end_src + diff --git a/sicp.org b/sicp.org new file mode 100644 index 0000000..e21da29 --- /dev/null +++ b/sicp.org @@ -0,0 +1,1762 @@ +* Structure And Interpretation of Computer Programs +** Foreword + +The subject matter of this book has focus on 3 phenomena: +- the human mind +- collections of computer programs +- the computer + + +Since it is very difficult to formally prove the correctness of large programs, what we end up doing us having a large number of small programs of whose correctness we have become sure and then learn the art of combining them into larger structures using organization techniques of proven value. + +These techniques of combining these small programs is discussed at length in this book. +We can learn a great deal of this organization technique by studying the programs that convert the code programmers write to "machine" programs, what the hardware understands. + +"It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures." -> echoes Pike's "Bigger the interface, weaker the abstraction" + +** Chapter 1 - Building Abstractions with Procedures + +The acts of mind, on simple ideas are 3: +- taking simple ideas and combining them to form a compound idea +- being able to join 2 compound (or simple) ideas without making them one +- recognizing the joined ideas as separate ideas from other ideas + +We will study the "computational process" - they aren't the unix processes etc, they are the abstract beings that inhabit the computers. + +They manipulate "data", their evolution is governed by pattern of rules called programs - so, programs are just things humans create to direct processes. + +The programs are the spells which create and control these processes - the aatma. + +#+BEGIN_QUOTE +Well-designed computational systems, like well-designed automobiles or nuclear reactors, are designed in a modular manner +#+END_QUOTE + +"Recursive equations" are a kind of logical expressions. They can be used as a model for computation. +Lisp was invented to explore the use of recursive equations for modeling computation. + +Lisp's description of processes, or as Lisp likes to call them, "procedures" can be represented and manipulated as data - rephrasing, it has the ability of handling procedures as data. It blurs the distinction between "passive" data and "active" processes - this enables powerful programming paradigms. + +Since we can treat procedures as data, Lisp is great for writing programs that manipulate other programs as data, which is something that must be done by, compilers and interpreters. + +*** 1.1 The Elements of Programming + +The programming language serves as a framework within which we organize our ideas about processes (the broader process we talked about in the last section) + +Languages provides us means of combining simple ideas to form complex ones. +There are 3 mechanisms for doing that: + +- primitive expressions + - which represent the simplest entities the language is concerned with +- means of combination + - by which compound elements are built from simpler ones +- means of abstraction + - by which compound elements can be *named and manipulated as units* + + +There are 2 kinds of elements, "procedures" and "data". 
+Data is the stuff we want to manipulate, procedures are the things that manipulate it. + +Why Scheme? +#+BEGIN_QUOTE +Scheme, the dialect of Lisp that we use, is an attempt to bring together the power and elegance of Lisp and Algol. From Lisp we take: +- the metalinguistic power that derives from the simple syntax +- the uniform representation of programs as data objects +- the garbage-collected heap-allocated data. + +From Algol we take: +- lexical scoping and block structure +#+END_QUOTE + +**** 1.1.1 Expressions +We can try out some basic expressions. +The interpreter can "evaluate" the expression and return the result of evaluating that expression. + +One kind of expression can be ~55~, on evaluation, it returns the same number back ~55~ + +#+begin_src scheme +55 +;Value: 55 + +3 error> (+ 12 12) + +;Value: 24 + +3 error> (/ 10 5) + +;Value: 2 +#+end_src + +There expressions :top:, formed by delimiting a "list of expressions" within parentheses are called "combinations" + +The leftmost expressions is called "operator", and the remaining elements(expressions) are called "operands" + +The value of the "combination" is obtained by applying the procedure specified by the operator to the arguments that are the values of the operands. + + +The convention of placing the operand on the left is called prefix notation. +This has the advantage of accepting variable number of operands and that each operand can be an expression + +Eg: ~(+ (* 3 (+ (* 2 4) (+ 3 5))) (+ (- 10 7) 6))~ + + +**** 1.1.2 Naming and the Environment + +One important feature of programming languages is that they provide us with the option to refer to computational objects with names + +*The name identifies a _variable_ whose _value_ is the object* + +In scheme, we use ~define~ to name things. + +~(define size 2)~ + +Now, we have a expression with value 2, which can be referred to by the variable ~size~ + +This allows us to do: + +~(* 5 size)~ + +More eg: + +#+begin_src scheme +(define pi 3.14159) +(define radius 10) +(* pi (* radius radius)) +314.159 +(define circumference (* 2 pi radius)) circumference +62.8318 +#+end_src + +#+BEGIN_QUOTE +~Define~ is our language’s simplest means of abstraction, for it allows us to use simple names to refer to the results of compound operations, such as the circumference computed above +#+END_QUOTE + +Complex programs are created by building step-by-step computational objects of increasing complexity. +This leads to incremental development and testing of programs. + +Note, the interpreter needs to maintain this mapping between names and values - this is called the environment. + +**** 1.1.3 Evaluating Combinations + +Evaluating combinations is inherently a recursive operation. To evaluate an expression, the interpreter has to recursively evaluate each operand expression. + +Consider: + +#+begin_src scheme +(* (+ 2 (* 4 6)) + (+ 3 5 7)) +#+end_src + +This can be represented with: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-08 18:49:33 +[[file:assets/screenshot_2018-11-08_18-49-33.png]] + +Here, the combination is represented as a tree. Each combination (that makes up the overall combination --> recall, combinations are just operator and operand expressions) can be represented as a node. The branches of the node are the operator and operands. + +The terminal nodes are the original expressions( --> recall, expressions = operands/operators), and the internal ones are derived expressions. 
+ +Note how the value of the operands percolate upwards, from the terminal nodes to higher levels. + +#+BEGIN_QUOTE +In general, we shall see that recursion is a very powerful technique for dealing with hierarchical, treelike objects +#+END_QUOTE + +The "percolate values upward" form of evaluation is an example of a general kind of precess known as "*tree accumulation*" + +We have to evaluate the expressions recursively. To end somewhere, we make these assumptions for the primitive cases, the terminal nodes: + + +- values of numerals are the numbers they name +- values of built-in operators are the machine instructions sequences that carry out the corresponding operations +- values of other names are the objects associated with those names in the environment + +These are the *General Evaluation Rules* + +"the general notion of the environment as providing a context in which evaluation takes place will play an important role in our understanding of program execution." + +Note, there are some exceptions to the general evaluation rules mentioned above. ~define~ for example, ~(define x 3)~ does not apply ~define~ to 2 arguments ~x~ and ~3~, but does something special of associating the value of ~3~ with variable name ~x~. That is, ~(define x 3)~ is not a combination. + +Such exceptions are "special forms". They have their own evaluation rules. + + +#+BEGIN_QUOTE +The various kinds of expressions (each with its associated evaluation rule) constitute the syntax of the programming language. +#+END_QUOTE + +Lisp has a simple syntax because the evaluation rule for ALL expressions in the language can be described by the above 3 general rules and a small number of special forms. + +**** 1.1.4 Compound Procedures + +Procedure definitions are a more powerful abstraction technique by which *a _compound operation_ can be give a name and then referred to as a unit*. + +Eg: + +#+begin_src scheme +(define (square x) (* x x)) +#+end_src + +Here, we associated the procedure ~square~ with the expression ~(* x x)~, which is a compound operation. + +Here, ~x~ in the compound expression ~(* x x)~ is a local name. + +General syntax: + +#+begin_src scheme +(define ( ) ) +#+end_src + +#+BEGIN_QUOTE +The is a symbol to be associated with the procedure definition in the environment. +The are the names used within the body of the procedure to refer to the corresponding arguments of the procedure. +The is an expression that will yield the value of the procedure application when the formal parameters are replaced by the actual arguments to which the procedure is applied +#+END_QUOTE + +Usage: + +#+begin_src scheme +(square (+ 2 5)) +49 +#+end_src + +One cannot tell by looking at the conditional if it is a compound procedure or built into the interpreter (primitive procedure). + +**** 1.1.5 The Substitution Model for Procedure Application +Evaluation of both primitive and compound procedures is the same for the interpreter. In both cases, recursively evaluate the operands and apply them to the operator. + +Here, the value of the operator = procedure (primitive or compound) +value of the operands = arguments + +How to apply the compound procedure to arguments? 
+- evaluate the body of the procedure with each formal parameter replaced by the corresponding argument (using the general evaluation rules) + +Example: + +#+begin_src scheme +(f 5) ;; defination of (define (f a)) is (sum-of-squares (+ a 1) (* a 2)) +(sum-of-squares (+ 5 1) (* 5 2)) ;; dfination of (define (sum-of-squares x y)) is (+ (square x) (square y)) +(+ (square 6) (square 10)) +(+ (* 6 6) (* 10 10)) +(+ 36 100) +136 +#+end_src + +This :top: model of evaluating a procedure is called *substitution model* +It is a simple model of evaluating which is okay for now. Later we'll study more complex models. The substitution model breaks down for procedures with "mutable data" + +We saw earlier that the interpreter first evaluates the operator and operands and then applies the resulting procedures to the resulting arguments. + +An alternative way can be, first simplify the expressions - both operator and operands by replacing them with their definitions till only primitive operators are left and then perform the evaluation. + +In this case, + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-08 20:38:52 +[[file:assets/screenshot_2018-11-08_20-38-52.png]] +Note: in this approach, we performed ~(+ 5 1)~ twice, same with ~(* 5 2)~ +If we have evaluated (+ 5 1) first, we could substitute to get (square 6), avoiding the double computation + +This, :top: "fully expand then reduce" evaluation model is called "normal-order evaluation" +earlier we had studied "evaluate the arguments and then apply" - which the interpreter actually uses - which is called "applicative order evaluation" + +Lisp uses applicative order evaluation (evaluate the args, then apply) because it is more efficient (see above) and also because normal order evaluation fails when you have procedures that can't be modeled by direct substitution till you get primitive operands. + +**** 1.1.6 Conditional Expressions and Predicates + +Till now, we don't have predicates in our procedures. +Lisp has a special form for this, called ~cond~ + +Eg: + +#+begin_src scheme + (define (abs x) + (cond ((> x 0) x) + ((= x 0) 0) + ((> x 0) (- x)) + ) + ) +#+end_src + +The general form is: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-08 21:52:37 +[[file:assets/screenshot_2018-11-08_21-52-37.png]] + +p_{1} is the predicate and e is the expression to be returned. + +So, syntax is, cond followed by pairs of ~(
⟨p⟩ ⟨e⟩
)~ (called clauses) +The order of the clauses is important, the first predicate to evaluate to true succeeds. + +The word predicate is used for procedures(or expressions) that return true or false + +#+begin_src scheme +(define (abs x) + (cond ((< x 0) (- x)) (else x))) +#+end_src + +Here, ~else~ is a special symbol that can be used in place of ~p~ in the final clause of a cond (only in the final clause, in fact - anything that always evaluates to true can be used) + +Lisp has some more syntactic sugar here: + +#+begin_src scheme +(define (abs x) + (if (< x 0) + (- x) + x)) +#+end_src + +So, syntax for if is: + +~(if )~ + +If the predicate evaluates to true, consequent is returned, else the alternative is returned - (evaluated and returned) + +Apart from <, >, = we have more predicates: +- (and e_{1} ... e_{n}) ;; start l to r, if any e evaluates to false, return false +- (or e_{1} ... e_{n}) ;; start l to r, if any e evaluates to true, return false - don't evaluate the rest of the expressions +- (not e_{1}) + +~and~ and ~or~ are special forms, since not all expressions are necessarily evaluated. ~not~ is an ordinary procedure. + + +#+begin_src scheme +(define (square a) (* a a)) + +(define (ex1.3v2 x y z) + (if (> x y) + (if (> y z) (+ (square x) (square y)) (+ (square x) (square z))) + (if (> x z) (+ (square x) (square y)) (+ (square y) (square z))) + ) + ) + +(ex1.3v2 5 2 3) +;; 34 + + +;; Basically, look for any repetition of code, and make it a procedure + +;; 1.4 +(define (a+|b| a b) + ((if (> b 0) + -) a b)) + +(a+|b| 2 3) + +;; here, we see that we can return a operator/procedure as well + +;; 1.5 +;; In a applicative order evaluation, the intrepreter will be stuck because p is defined recursively as itself - so when the interpreter tries to evaluate the operands, it'll stall. +;; There will be no stack overflow however, since the same frame is popped and put back. +;; In normal order, the procedure will return with 0, since the predicate evaluates to true and the subsequent expression is returned +#+end_src + + +Evaluation rule for special form ~if~: +#+BEGIN_QUOTE +The evaluation rule for the special form if is the same whether the interpreter is using normal or applicative order: The predicate expression is evaluated first, and the result determines whether to evaluate the consequent or the alternative expression. +#+END_QUOTE + +**** 1.1.7 Example: Square Roots by Newton’s Method + +#+BEGIN_QUOTE +There is an important difference between _mathematical functions_ and _computer procedures_. Procedures must be effective. +#+END_QUOTE + +We can define the square root as: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-08 23:04:36 +[[file:assets/screenshot_2018-11-08_23-04-36.png]] + +This allows us to study the properties of the square root etc, but it does not tell us how to find the square root. + +This contrast b/w "functions" and "procedures" is a reflection of a general distinction b/w declarative knowledge and imperative knowledge. + +Mathematics is usually concerned with declarative (what is) descriptions. CS is usually concerned with imperative (how to) descriptions. + +So, how do we compute the roots? One way is using Newton's method. 
+ +Take an initial guess y, and while it is not good enough, update it to be average of y and x/y where x is the number you are trying to find root of + +#+begin_src +;; 1.6 +The procedure enters into an infinite loop because, if the new-if, which is a regular procedure, the interpreter evaluates the operands first and then it enters to new-if (as specified in the applicative order evaluation). However, when the 2nd expression is evaluated, which is defined recursively, it goes in an infinite loop. + +The built-in special form ~if~ solves this problem by first evaluating the predicate and only then evaluating the subsequent OR alternative clauses +#+end_src + + +#+begin_src scheme +;; 1.7 +(define (sqrt x) + (sqrt-iter 1.0 10.0 x) + ) + +(define (sqrt-iter guess old-guess x) + (if (good-enough? guess old-guess) + guess + (sqrt-iter (improve-guess guess x) guess x) + )) + +(define (good-enough? guess old-guess) + (< (abs (- guess old-guess)) 0.001)) + +(define (improve-guess guess x) + (average guess (/ x guess)) + ) + + +(define (average a b) + (/ (+ a b) 2) + ) + +;; 1.8 + +(define (cube-root x) + (cube-root-iter 1.0 10.0 x) + ) + +(define (cube-root-iter guess old-guess x) + (if (good-enough? guess old-guess) + guess + (cube-root-iter (improve-cubic-guess guess x) guess x) + ) + ) + +(define (improve-cubic-guess guess x) + (/ (+ (/ x (* guess guess)) (* 2 guess)) 3) + ) + +#+end_src + +**** 1.1.8 Procedures as Black-Box Abstractions +Appreciate how the entire program can be broken down into simple procedures. + +Note how in the above example, the procedure ~good-enough?~ does not need to worry about the ~sqrt-iter~ procedure that uses it. For it, the ~sqrt-iter~ is not even a real procedure, it is a ~procedure abstraction~ denoting the idea that someone uses it to compute squares. + +This ability of being dividing the program into small pieces and treat each one as a "black box" is very powerful since it leads to more composability. You can use the different pieces independently etc. + +So, given a procedure ~(square x)~, the user should not have to worry about the implementation of how ~square~ is implemented, it could be any of the millions of possible ways. + +*~The arguments that the procedure takes are called the formal parameters of the procedure.~* + +This principle -- that the meaning of a procedure should be independent of the parameter names used by its author -- seems on the surface to be self-evident, but its consequences are profound. + +Formal parameters being local to the procedure allow us to use the procedure as a black box. +The formal parameters of the procedure are *bounded variables*, we say that the procedure binds its *formal parameters* +~If a variable is not bound, we say that it is free.~ The set of expressions for which the binding defines the value of the parameter is called the *scope* of the variable. + +The procedure ~good-enough?~ is not affected by the names of the bounded variables, they can be changed without changing the behavior of ~good-enough?~. However, if you change the names of the *free* variables, the behavior changes. For, eg, it uses ~abs~, it matters what the procedure ~abs~ is. + +If you use the name ~abs~ to refer to a formal parameter of the procedure ~good-enough?~, it is called ~capturing~ the variable. + +Using bound variables is the first solution for the problem of name isolation we have seen. 
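To make the bound/free distinction concrete, here is a small sketch (mine, not from the book): it is the same ~good-enough?~ from exercise 1.7 above, with the 0.001 pulled out into a global ~tolerance~ so that it shows up as a free variable.

#+begin_src scheme
;; guess and old-guess are bound: they are formal parameters of good-enough?,
;; so renaming them consistently changes nothing.
;; abs, < and tolerance are free: the behavior of good-enough? depends on what
;; they mean in the environment where it is defined.
(define tolerance 0.001) ;; illustrative global, not part of the book's version

(define (good-enough? guess old-guess)
  (< (abs (- guess old-guess)) tolerance))

;; Naming a formal parameter abs instead would "capture" abs, and the body
;; would no longer refer to the absolute-value procedure.
#+end_src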
+In ~sqrt~, we can put the various procedures into the defination of ~sqrt~ itself so that we don't pollute the global namespace with internal procedures. + + +#+begin_src scheme +(define (sqrt x) + (define (good-enough? guess x) + (< (abs (- (square guess) x)) 0.001)) + (define (improve guess x) + (average guess (/ x guess))) + (define (sqrt-iter guess x) + (if (good-enough? guess x) + guess + (sqrt-iter (improve guess x) x))) + (sqrt-iter 1.0 x)) +#+end_src + +This nesting of definitions is called ~block structure~. It is the easiest solution to the "name-packaging" problem. Also, we can consider ~x~ to be a free variable inside the ~good-enough?~ defination and avoid passing it to all the internal procedures. This is called ~lexical scoping~ + +Lexical scoping dictates that free variables in a procedure are taken (assumed) to refer to bindings made by enclosing procedure definitions; that is, they are looked up in the environment in which the procedure was defined -> which is the environment of the enclosing procedure. + +#+begin_src scheme +(define (sqrt x) + (define (good-enough? guess) + (< (abs (- (square guess) x)) 0.001)) + (define (improve guess) + (average guess (/ x guess))) + (define (sqrt-iter guess) + (if (good-enough? guess) + guess + (sqrt-iter (improve guess) x))) + (sqrt-iter 1.0)) +#+end_src + +The idea of block structure came first in Algo 60, and it allows us to break large problems into tractable pieces (not a million small pieces) + + +*** 1.2 Procedures and the Processes They Generate + +Being able to visualize how the processes you write will play out is important. + +#+BEGIN_QUOTE +A procedure is a pattern for the ~local evolution~ of a computational process. It specifies how each stage of the process is built upon the previous stage. + +We would like to be able to make statements about the overall, or ~global~, behavior of a process whose local evolution has been specified by a procedure. +#+END_QUOTE + +**** 1.2.1 Linear Recursion and Iteration + +Consider the factorial. + +It is defined as: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-16 16:42:50 +[[file:assets/screenshot_2018-11-16_16-42-50.png]] + +We can directly translate this to a procedure: + +#+begin_src scheme + +(define (fac product n) + (cond ((= n 1) product) + (else (fac (* product n) (- n 1))) + ) + ) + +(define (fact n) + (fac 1 n) + ) + + +;; alternative implementation +(define (factorial n) + (fact-iter 1 1 n)) +(define (fact-iter product counter max-count) + (if (> counter max-count) + product + (fact-iter (* counter product) + (+ counter 1) + max-count))) + +#+end_src + +Here, the process looks like this: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-17 22:51:15 +[[file:assets/screenshot_2018-11-17_22-51-15.png]] + + +This is called an *linear iterative process* + +It can also be written as: + +#+begin_src scheme +(define (fact-rec n) + (if (= n 1) n + (* (fact-rec (- n 1)) n) + ) + ) + +#+end_src + +This creates a series of deferred operations. This type of process, characterized by a chain of deferred operations is called a ~recursive process~ + +The interpreter needs to keep a track of the deferred processes. 
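For example, tracing ~(fact-rec 4)~ with the substitution model shows the chain of deferred multiplications the interpreter has to remember:

#+begin_src scheme
(fact-rec 4)
(* (fact-rec 3) 4)
(* (* (fact-rec 2) 3) 4)
(* (* (* (fact-rec 1) 2) 3) 4)
(* (* (* 1 2) 3) 4)
(* (* 2 3) 4)
(* 6 4)
24
#+end_src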
+#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-17 22:49:49 +[[file:assets/screenshot_2018-11-17_22-49-49.png]] + +This is a *linear recursive process.* + +One more difference between the 2 is that in the iterative case, the program variables provide a complete description of the state of the process at any point. +In the recursive case, there is some hidden information maintained by the interpreter and not contained in the program variables which indicates ‘‘where the process is’’ in negotiating the chain of deferred operations. + +Also, note that a *recursive process* is different from a *recursive procedure*. +- The recursive procedure is just based on the syntactic fact that the procedure definition refers to itself. +- The recursive process is about how the process evolves - weather as a series of growing and contracting operations + +Note here that ~fact-iter~ is a recursive procedure but it is generating a iterative process. + + +#+BEGIN_QUOTE +One reason that the distinction between process and procedure may be confusing is that most implementations of common languages (including Ada, Pascal, and C) are designed in such a way that the interpretation of any recursive procedure consumes an amount of memory that grows with the number of procedure calls, even when the process described is, in principle, iterative. + +As a consequence, these languages can describe iterative processes only by resorting to special-purpose ‘‘looping constructs’’ such as ~do,repeat,until,for~, and ~while~. The implementation of Scheme we shall consider in chapter 5 does not share this defect. *It will execute an iterative process in constant space, even if the iterative process is described by a recursive procedure*. An implementation with this property is called *tail-recursive*. With a tail-recursive implementation, iteration can be expressed using the ordinary procedure call mechanism, so that special iteration constructs are useful only as syntactic sugar. +#+END_QUOTE + +#+begin_src scheme +;; 1.9 +(define (+ a b) + (if (= a 0) + b + (inc (+ (dec a) b)) + ) + ) + +;; this is recursive process since the operations are deferred. +;; it is like taking 1 from ~a~ stack to add later + + +(define (+ a b) + (if (= a b) + b + (+ (dec a) (inc b))) + ) +;; this is an iterative process, but a recursive procedure. +;; it is like moving 1 element from stack ~a~ to stack ~b~ +#+end_src + +**** 1.2.2 Tree Recursion + +Apart from the linearly recursive process, there is also tree recursion. This happens when each operation does not create 1 deferred operation, but multiple. + +We can define it as: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-18 06:44:19 +[[file:assets/screenshot_2018-11-18_06-44-19.png]] + +#+begin_src scheme +(define (fib x) + (cond ((= x 0) 0) + ((= x 1) 1) + (else (+ (fib (- x 1)) (fib(- x 2)))) + )) +#+end_src + +Consider ~(fib 5)~, this leads to a tree: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-18 06:44:59 +[[file:assets/screenshot_2018-11-18_06-44-59.png]] + +This is wasteful since we are doing the same computation multiple times. 
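To get a feel for how much repeated work that is, here is a small sketch (mine, not from the book) that counts the number of calls the tree-recursive ~fib~ makes:

#+begin_src scheme
;; number of times fib is invoked to compute (fib n)
(define (fib-calls n)
  (cond ((= n 0) 1)
        ((= n 1) 1)
        (else (+ 1
                 (fib-calls (- n 1))
                 (fib-calls (- n 2))))))

(fib-calls 5)
;; 15 - fib is called 15 times just for (fib 5), and the count itself grows exponentially
#+end_src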
+Since Fibonacci grows exponentially with ~n~, this process has exponential time complexity ~O(fib(n))~ and linear storage complexity ~O(n)~ because that's the max depth of the tree + +#+begin_src scheme +;; p1 is the leading element of the series, p2 is the trailing element +(define (fib-iter n p1 p2) + (if (= n 0) p1 + (fib-iter (- n 1) (+ p1 p2) p1) + )) + +(define (fib-iterative n) + (fib-iter n 1 0)) +#+end_src + +This is a iterative process (but the procedure is recursive still - this is because Scheme has tail recursion that makes this iterative. Other programming languages need special looping constructs to make it iterative.) + +Tree recursion might not be useful for computing Fib numbers, but it's very powerful when it comes to processes that operate on hierarchically structured data. + +Consider the problem of counting change: + +#+BEGIN_QUOTE +How many different ways can we make change of $ 1.00, given half-dollars, quarters, dimes, nickels, and pennies? More generally, can we write a procedure to compute the number of ways to change any given amount of money? +#+END_QUOTE + +- 1 dollar = 100 cents +- Half-dollars = 50 cents +- Quarters = 25 cents +- dimes = 10 cents +- nickels = 5 cents +- penny = 1 cent + +Now, we can use tree recursion to solve this: + +#+begin_src scheme +(define (break-n n) + (cond ((= n 0) 1) ;; if we managed to get the amount to zero, we found a way + ((< n 0) 0) ;; we are past the 0 mark, not a solution + (else (+ + (break-n (- n 50)) ;; branch out to check with all possible smaller denomiations + (break-n (- n 25)) + (break-n (- n 10)) + (break-n (- n 5)) + (break-n (- n 1)) + )))) +#+end_src + + +This solution has the problem of counting permutations, not combinations. That is, it will count ~5, 1, 1, 1, 1, 1~ and ~1, 1, 1, 1, 1, 5~ as 2 distinct changes. + +Also, it does a lot of work multiple times, like in the naive fib tree recursion algorithm. We can end up computing (break-n 5) many times for eg. + +We can use memoization to lookup already computed values or we can organize our calculations better using dynamic programming ideas + +The key insights is: +#+BEGIN_QUOTE +The number of ways to change amount _a_ using _n_ kinds of coins equals +- the number of ways to change amount _a_ using all but the first kind of coin, plus +- the number of ways to change amount _a - d_ using all _n_ kinds of coins, where _d_ is the denomination of the first kind of coin. +#+END_QUOTE + +This can be translated to code: + +#+begin_src scheme +(define (coin-change amount) + (cc amount 5)) + +(define (cc amount kinds-of-coins) + (cond ((= amount 0) 1) + ((or (< amount 0) (= kinds-of-coins 0)) 0) + (else (+ + (cc amount (- kinds-of-coins 1)) + (cc (- amount (first-denomination kinds-of-coins)) kinds-of-coins))))) + +(define (first-denomination kinds-of-coins) + (cond ((= kinds-of-coins 1) 1) + ((= kinds-of-coins 2) 5) + ((= kinds-of-coins 3) 10) + ((= kinds-of-coins 4) 25) + ((= kinds-of-coins 5) 50))) + +#+end_src + +See how we use the ~first-denomination~ to make do for the lack of the list data structure. +Note, this is still tree recursion, and not very efficient because it does the same computation multiple times. Memoization can help here too. 
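A rough sketch of what that memoization could look like (my own, not from the book): cache results in an association list keyed by the ~(amount kinds-of-coins)~ pair, reusing ~first-denomination~ from above.

#+begin_src scheme
(define (make-memo-cc)
  (let ((table '())) ;; alist of ((amount kinds-of-coins) . result)
    (define (cc amount kinds-of-coins)
      (cond ((= amount 0) 1)
            ((or (< amount 0) (= kinds-of-coins 0)) 0)
            (else
             (let* ((key (list amount kinds-of-coins))
                    (cached (assoc key table)))
               (if cached
                   (cdr cached)
                   (let ((result (+ (cc amount (- kinds-of-coins 1))
                                    (cc (- amount (first-denomination kinds-of-coins))
                                        kinds-of-coins))))
                     (set! table (cons (cons key result) table))
                     result))))))
    cc))

((make-memo-cc) 100 5)
;; 292, same answer as (coin-change 100)
#+end_src

The linear ~assoc~ lookup is not fast; the point is only that each distinct ~(amount kinds-of-coins)~ pair is now computed once. A hash table, or the bottom-up dynamic-programming table mentioned above, would be the next step.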
+ +#+BEGIN_QUOTE +The observation that a tree-recursive process may be highly inefficient but often easy to specify and understand has led people to propose that one could get the best of both worlds by designing a ‘‘smart compiler’’ that could transform tree-recursive procedures into more efficient procedures that compute the same result. +#+END_QUOTE + +#+begin_src scheme +;; ex1.11 +(define (ex1.11 n) + (if (< n 3) + n + (+ (* 1 (ex1.11 (- n 1))) + (* 2 (ex1.11 (- n 2))) + (* 3 (ex1.11 (- n 3)))))) + +;; this is a straightforward in recursive tree process. +#+end_src + +The iterative process would look like this: + +#+begin_src scheme +(define (ex1.11-v2 n) + (if (< n 3) + n + (ex1.11-v2-iter n 3 (+ 2 2 0) 2 1 0)) + ) + +(define (ex1.11-v2-iter n counter a b c d) + (if (= n counter) + a + (ex1.11-v2-iter n + (+ counter 1) + (+ (* 1 a) + (* 2 b) + (* 3 c)) + a + b + c + ))) + +#+end_src + +Here, we are building up to the ~n~th value of the function. We are bootstrapping with this: + +| n | 0 | 1 | 2 | 3 | 4 | 5 | +|------+---+---+---+---+----+----| +| f(n) | 0 | 1 | 2 | 4 | 11 | 25 | +| char | d | c | b | a | | | + +The counter is set to 3 because we have calculated the values till 3. Now, we increment the counter on each iteration and update the values as follows: + +- a_{}_{n+1} \leftarrow a_{n} + 2b_{n} + 3c_{n} +- b_{n+1} \leftarrow a_{n} +- c_{n+1} \leftarrow b_{n} +- d_{n+1} \leftarrow c_{n} + +So, each char variable moves one step ahead. We stop when the counter reacher ~n~ + +One thing to note is that for iterative processes, since we can't defer operations, we have to start from bottom up, or more precisely from the values that we know to values to want to compute. We need some bootstrapping values and such that we can build up the next values from them. + +#+begin_src scheme +;; ex 1.12 +(define (pascals r c) + (cond ((= c 1) 1) ;; handle 1st column always = 1 + ((= c r) 1) ;; handle last column always = 1 + ((= r 1) 0) ;; handle 1st row having 0 for all c, r != 1 + (else (+ (pascals (- r 1) (- c 1)) + (pascals (- r 1) c))))) +#+end_src + +This is a tree recursion process, does the same computation many times, memoization can help. +If we were to do this in an iterative fashion, we could have started from the tip, where the values are known and then proceeded downwards. + +**** 1.2.3 Orders of Growth + +#+BEGIN_QUOTE +Let ~n~ be a parameter that measures the size of the problem, and let R(n) be the amount of resources the process requires for a problem of size n. + +In our previous examples *we took n to be the number for which a given function is to be computed*, but there are other possibilities. For instance: + +- if our goal is to compute an approximation to the square root of a number, we might take n to be the number of digits accuracy required. +- for matrix multiplication we might take n to be the number of rows in the matrices. + +In general there are a number of properties of the problem with respect to which it will be desirable to analyze a given process and n can be any of it. 
+ +Similarly, R(n) might measure the number of internal storage registers used, the number of elementary machine operations performed, and so on +#+END_QUOTE + + +- for a linear order process, doubling the size will double the resources required +- for a exponential order process, incrementing the size will *multiply* the resources required by a constant factor +- for a logarithmic order process, doubling the size will increase the resources required by a constant amount (log_{some-base representing how the problem size reduces on each level - the branching factor perhaps}2) + + + +#+begin_src +;; 1.15 +The order of growth is log(n) since in each iteration, the value gets reduced by 3. So, log(a) +#+end_src + +**** 1.2.4 Exponentiation + +Exponentiation (finding a^{n}) is a great case study. We can quickly come up with 3 solutions: + +***** Linear Recursive process + +#+begin_src scheme +(define (exp-1 b p) + (if (= p 1) b ;; or, if (= p 0) 1 + (* b (exp b (- p 1))))) +#+end_src + +This is a linear recursive process +- time: O(n) +- space: O(n) + + +***** Iterative process + +#+begin_src scheme +(define (exp-2 b p) + (exp-2-iter b p b) + ) +(define (exp-2-iter b p product) + (if (= p 1) + product + (exp-2-iter b (- p 1) (* product b)))) +#+end_src + +Here, we are using a state variable ~product~ to keep track of the state. +- time: O(n) +- space: O(1) + +***** Logarithmic recursive + +We can use the squaring to get to the power we want +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-18 11:42:34 +[[file:assets/screenshot_2018-11-18_11-42-34.png]] + +#+begin_src scheme +;; note, instead of using 2 nested ifs, it's better to use cond +(define (exp-3 b p) + (if (= p 1) + b + (if (even? p) + (square (exp-3 b (/ p 2))) + (* b (exp-3 b (- p 1)))))) +#+end_src + +- time: O(log(n)) +- space: O(log(n)) + + +***** Logarithmic iterative + +#+begin_src scheme +;; eg 1.16 +(define (exp-4 b p) + (if (even? p) + (exp-4-iter b p 1) + (* b (exp-4-iter b (- p 1) 1)))) + +(define (exp-4-iter b p counter) + (if (= counter p) + b + (exp-4-iter (square b) p (* 2 counter)))) +#+end_src + +Here again, we build up to the solution by having a counter that goes up to ~p~ + +We have b^{p} as constant, this is our "invariant" in the iteration. + +- time: O(log(n)) +- space: O(1) + +#+begin_src scheme +;; 1.17 +(define (*1 a b) + (cond ((= b 1) a) + ((even? b) (*1 (double a) (halve b))) + (else (+ a (*1 a (- b 1)))))) + +(define (double a) + (* a 2)) + +(define (halve a) + (/ a 2)) + + +;; 1.18 +(define (*1-v2 a b) + (*1-iter a b 0)) + +(define (*1-iter a b residue) + (cond ((= b 1) (+ a residue)) + ((even? b) (*1-iter (double a) (halve b) residue)) + (else (*1-iter a (- b 1) (+ residue a)))))) + +#+end_src + +Note how we moved the residue sum resulting in deferred operations to a new state variable to get an iterative process. + +**** 1.2.5 Greatest Common Divisors + +GCD, also known as HCF is the largest common factor of two numbers. It is the largest number which divides both the numbers. +The largest step size that can reach both the numbers. + +The naive way to find it is to get the factors of both the numbers and get the largest one. + +Euclid's theorem says, if you divide the 2 nums a, b (a/b) and the remainder is r, then the GCD(a, b) = GCD(b, r). + +This can be understood when thought of as the step size analogy. It's like saying when you try to reach the larger number using the step size equal to the smaller number, you might have some leftover. 
Now, if find a step size that is able to cover both the remainder amount and the step size (b), that should be the GCD/HCF + +#+begin_src scheme +(define (gcd a b) + (if (= b 0) a + (gcd b (remainder a b)))) +#+end_src + +#+begin_src scheme +;; 1.20 +;; In the normal order, since we are subsituiting and deffering the operations the number of remainders is huge + +;; in applicative order, since the operands are evaluated first, we don't have to do the work again for the predicate and so the number of remainder invokations are small +#+end_src + +**** 1.2.6 Example: Testing for Primality + +#+BEGIN_QUOTE +This section describes two methods for checking the primality of an integer n, one with order of growth (sqrt(n)) - which we used earlier, + +and a ‘‘probabilistic’’ algorithm with order of growth (logn). The exercises at the end of this section suggest programming projects based on these algorithms. +#+END_QUOTE + +*** 1.3 Formulating Abstractions with Higher-Order Procedures + +We defined the cube procedure to get cubes ~(cube x)~ +This works for all numbers. If not for the procedures, we would have had to write ~(* x x x)~ where x would be our number + +This would force us to always work at the level of the primitives offered by the language and never build higher level abstractions. + +"Our programs would be able to compute cubes, but our language would lack the ability to express the concept of cubing." + +#+BEGIN_QUOTE +One of the things we should demand from a powerful programming language is the ability to build abstractions by assigning names to common patterns and then to work in terms of the abstractions directly. Procedures provide this ability. + +... + +Often the same programming pattern will be used with a number of different procedures. + +To express such patterns as concepts, we will need to construct procedures that can accept procedures as arguments or return procedures as values. Procedures that manipulate procedures are called _higher-order procedures_. +#+END_QUOTE + +**** 1.3.1 Procedures as Arguments + +Consider these 3 procedures: + +#+begin_src scheme +;; computes: +;; SUMMATION of a, a+1, a+2, ..., b +(define (sum-integers a b) + (if (> a b) + 0 + (+ a (sum-integers (+ a 1) b)))) + + +;; computes: +;; SUMMATION of a3, (a+1)3, ..., b3 +(define (sum-cubes a b) + (if (> a b) + 0 + (+ (cube a) (sum-cubes (+ a 1) b)))) + + +(define (pi-sum a b) + (if (> a b) + 0 + (+ (/ 1.0 (* a (+ a 2))) (pi-sum (+ a 4) b)))) +#+end_src + +The last procedure computes: +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-18 15:22:09 +[[file:assets/screenshot_2018-11-18_15-22-09.png]] + +All these processes are linearly recursive, and share a common pattern. Each iteration is adding one element for the summation. + +All these share a common template, as they represent the idea of SUMMATION. + +#+begin_src scheme +(define ( a b) + ( + if (> a b) + 0 + (+ ( a) ( ( a) b)))) + +#+end_src + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-18 15:25:42 +[[file:assets/screenshot_2018-11-18_15-25-42.png]] + +We already have procedures, which allow us to encode general patterns. +We can pass the ~, , ~ procedures to them as formal parameters. 
+ +#+begin_src scheme +;; (term X) gives you the term for the summation +;; (next X) gives you the next value of the variable X for the next summation iteration +(define (sum term a next b) + (if (> a b) + 0 + (+ (term a) + (sum term (next a) next b)))) +#+end_src + +We can encode all the 3 procedures above with this: + +#+begin_src scheme +;; the first procedure +(define (sum-1 a b) + (sum identity a inc b)) + +(define (identity x) x) +(define (inc c) (+ c 1)) + +;; the 2nd procedure +(define (sum-2 a b) + (sum cube a inc b)) + +;; the 3rd procedure +(define (sum-3 a b) + (sum t-3 a a-3 b)) + +(define (t-3 x) + (/ 1.0 (* x (+ x 2)))) + +(define (a-3 x) (+ x 4)) +#+end_src + +We can use the summation abstraction to define complex summations easily now: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-18 15:51:47 +[[file:assets/screenshot_2018-11-18_15-51-47.png]] + +Here, we have the ~f~ + +#+begin_src scheme +;; f is given by the user +;; (+ a (/ dx 2.0)) is for the term in each iteration +(define (integral f a b dx) + (define (add-dx x) (+ x dx)) + (* (sum f (+ a (/ dx 2.0)) add-dx b) dx)) +#+end_src + +Here, the procedure ~(add-dx)~ would require 2 formal parameters, but then it wouldn't fit in the scheme that we have. Hence, we have put it in the same lexical scope as the ~integral~ function so it has access to ~dx~. Same with the term function here ~(+ a (/ dx 2.0))~ + +This is a good example of how you fit in a function and use the same paradigm for many types of use cases. + +#+begin_src scheme +;; 1.30 +(define (sum-iter term a next b) + (define (iter a result) + (if (> a b) + result + (iter (next a) (+ result (term a))))) + (iter a 0)) + +;; 1.32 +(define (accumulate combiner null-values term a next b) + (if (> a b) + null-values + (combiner (term a) (accumulate combiner null-values term (next a) next b)))) + + +(define (sum-a term a next b) + (accumulate + 0 term a next b)) + +;; iterative version of accumulate +(define (accumulate3 combiner null-values term a next b) + (define (accumulate-iter a result) + (if (> a b) + result + (accumulate-iter (next a) (combiner (term a) result)))) + (accumulate-iter a null-values))) + +(define (sum-a term a next b) + (accumulate3 + 0 term a next b)) + +(define (sum-2-4 a b) + (sum-a cube a inc b)) +#+end_src + +**** 1.3.2 Constructing Procedures Using Lambda +We saw above how we had to define trivial procedures just to use them as values for formal parameters of higher order procedures. We had to name them etc. + +#+BEGIN_QUOTE +it would be more convenient to have a way to directly specify ‘‘the procedure that returns its input incremented by 4’’ and ‘‘the procedure that returns the reciprocal of its input times its input plus 2.’’ +#+END_QUOTE + +We can do this with ~lambda~ function +eg: ~(lambda (x) (+ x 4))~ + +Syntax for lambda: ~(lambda () )~ + +In fact, + +#+begin_src scheme +;; this is what we have been using +(define (plus4 x) (+ x 4)) + +;; this is an EQUIVALENT version +;; the first form is just syntactic sugar, this is the real deal +(define plus4 (lambda (x) (+ x 4))) +#+end_src + +We can even define it in place and use it +~((lambda (x y z) (+ x y z)) 1 2 3)~ + + +When faced with a complex function like this: + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-18 19:51:31 +[[file:assets/screenshot_2018-11-18_19-51-31.png]] + +We can use ~a~ and ~b~ to simplify the definition. 
+ +#+begin_src scheme + +;; we can use an internal helper procedure +(define (f x y) + (define (f-helper a b) + (+ (* (square a) x) + (* y b) + (* a b))) + (f-helper (+ 1 (* x y)) (- 1 y))) + +;; we can use lambda +;; here, the body of f2 is: +;; ( arg1 arg2) +(define (f2 x y) + ((lambda (a b) + (+ (* (square a) x) + (* y b) + (* a b))) + (+ 1 (* x y)) (- 1 y))) + +;; this is very common. There is some syntactic sugar to make this easier +;; note, here we have (let () ) +;; so, the body is within the let expression, as it's 2nd formal parameter +(define (f3 x y) + (let ((a (+ 1 (* x y))) + (b (- 1 y))) + (+ (* x (square a)) + (* y b) + (* a b)))) +#+end_src + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-18 20:04:17 +[[file:assets/screenshot_2018-11-18_20-04-17.png]] + + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-18 20:05:10 +[[file:assets/screenshot_2018-11-18_20-05-10.png]] + +So, ~let~ is just an expression with an inplace ~lambda~ which is declared and used + + +2 points: + +***** Let variables are local variables +And their scope is limited to the body of the let statement. + +#+begin_src scheme +;; x = 5 outside; inside the let body it goes to 3, the other x is still 5 +(+ (let (x 3) (+ x (* x 10))) + x) +#+end_src + +***** The variables are defined outside the let block + +#+begin_src scheme +;; if the value of x is 2 +(let ((x 3) + (y (+ x 2))) ;; here, y still refers to the outside x and so it'll get the value 4 +(* x y)) ;; this is 3*4 which is 12 +#+end_src + + +**** 1.3.3 Procedures as General Methods + +We began by introducing compound procedures, which allowed us to abstract patterns of numerical computation so as to make them independent of the particular numbers involved. + +Later, we saw higher order procedures (eg ~integral~) procedure, that are used to express "general methods" of computation, independent of the individual functions involved. + +There are 2 more examples to discuss: +1. General methods for finding zeros +2. General methods for fixed points of functions + + +***** Finding roots of equations by the half-interval method + +We start with 2 values, one where the function is positive and the 2nd where it is negative. Assuming it is a continuous function, there must be a value between them where function is 0. +Since at each step the search space reduces by half, the running time is O(log(L/T)) where T is the L is the length of the original interval and T is the error toleration + + +#+begin_src scheme +(define (find-roots f a b) + (let ((midpoint (average a b))) + (if (close-enough? a b) + midpoint + (let ((test-value (f midpoint))) + (cond ((positive? test-value) (find-roots f midpoint b)) + ((else? test-value) (find-roots f b midpoint)) + (else midpoint)))))) +#+end_src + +Thing to notice is how this is looking more and more like mainstream programming languages now. Note the use of let to define variables and the scope in the let body +Again these 2 are equivalent: + +#+begin_src scheme +((lambda (midpoint) (* 2 midpoint)) (+ 2 4)) + +(let ((midpoint (+ 2 4))) + (* 2 midpoint))) +#+end_src + +The ~close-enough?~ procedure is simple. + +#+begin_src scheme +(define (close-enough? x y) + (< (abs (- x y)) 0.001)) +#+end_src + +We can have another function that calls search if the values are okay + +#+begin_src scheme +(define (half-interval-method f a b) + (let ((a-value (f a)) + (b-value (f b))) + (cond ((and (negative? a-value) (positive? 
b-value)) (search f a b)) + ((and (negative? a-value) (positive? b-value)) (search f a b)) + (else (error "Values are not opposite signs" a b))))) +#+end_src + +***** Finding fixed points of functions + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-19 08:44:35 +[[file:assets/screenshot_2018-11-19_08-44-35.png]] + +This will converge to a value ~x~ (for some functions) + +Note, not all functions have fixed points, and even in those do, it isn't that they can be found by repeated applications of ~f~. + +Only "attractive fixed points" can be found by repeated applications. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-19 09:00:39 +[[file:assets/screenshot_2018-11-19_09-00-39.png]] + +Cosine has an attractive fixed point. + +#+ATTR_ORG: :width 400 +#+ATTR_ORG: :height 400 +#+DOWNLOADED: /tmp/screenshot.png @ 2018-11-19 09:00:59 +[[file:assets/screenshot_2018-11-19_09-00-59.png]] + +We can formulate finding square root as a fixed point search as well + +Square root is the root of the equation: f(x) = x^{1/2} or y^{2} = x +this can be rewritten as x/y = y. +This can be solved by finding the fixed point of x/y + +We can find the fixed points by applying the same function to itself till the values don't change much. + +#+begin_src scheme +;; fixed point +(define tolerance 0.001) + +(define (fixed-point f first-guess) + (define (close-enough? x y) + (< (abs (- x y)) tolerance)) + ;; note now we use a different name for the inner def + (define (try guess) + (let ((next-value (f guess))) + (if (close-enough? next-value guess) + next-value + (try next-value)))) + (try first-guess)) + +;; we can find solution to y = siny + cosy +(fixed-point (lambda (x) (+ (sin x) (cos x))) 1.0) +;Value: 1.2590038597400248 +#+end_src + +This is similar to the process for finding the square roots; repeatedly improving the guess until the result is close enough. + +We can rewrite ~sqrt~ as a ~fixed-point~ solution, + +#+begin_src scheme +(define (sqrt x) + (fixed-point (lambda (y) (/ x y)) 1.0)) +#+end_src + +This won't converge. +#+BEGIN_QUOTE +Unfortunately, this fixed-point search does not converge. Consider an initial guess y 1 . The next guess is y2 = x/y1 and the next guess is y3 = x/y2 = x/(x/y1 ) = y1 . +#+END_QUOTE + +We can prevent this by damping the oscillations. 
Since our answer lies between y_{t+1} and x/y_{t}, we can have the new value of y_{t+1} as (1/2)*(y_{t+1} + x/y_{t}) + +So, we can find the sqrt as: +#+begin_src scheme +(define (sqrt x) + (fixed-point (lambda (y) (average y (/ x y))) 1.0)) + +;; 1.35 +(define (golden-ration x) + (fixed-point (lambda (y) (average y (+ 1 (/ 1 x)))) 1.0)) + +;; 1.36 +(define (ex1.36-without-damping) + (fixed-point (lambda (x) (/ (log 1000) (log x))) 2)) + +(ex1.36-without-damping) +9.965784284662087 +3.004472209841214 +6.279195757507157 +3.759850702401539 +5.215843784925895 +4.182207192401397 +4.8277650983445906 +4.387593384662677 +4.671250085763899 +4.481403616895052 +4.6053657460929 +4.5230849678718865 +4.577114682047341 +4.541382480151454 +4.564903245230833 +4.549372679303342 +4.559606491913287 +4.552853875788271 +4.557305529748263 +4.554369064436181 +4.556305311532999 +4.555028263573554 +4.555870396702851 +;Value: 4.555870396702851 + + +(define (ex1.36-damping) + (fixed-point (lambda (x) (average x (/ (log 1000) (log x)))) 2)) + +(ex1.36-damping) +5.9828921423310435 +4.922168721308343 +4.628224318195455 +4.568346513136242 +4.5577305909237005 +4.555909809045131 +4.555599411610624 +;Value: 4.555599411610624 + +;; 1/37 +;; recurisve procedure, iterative process +(define (cont-frac n d k) + (define (fn t) (/ (n k) (+ (d k) t))) + (define (cont-frac-iter n d k term) + (if (= k 1) (fn term) + (cont-frac-iter n d (- k 1) (fn term)))) + (cont-frac-iter n d k 0)) + + +;; recurisve procedure, recursive process +(define (cont-frac-recursive n d k) + (define (fn t) (/ (n k) (+ (d k) t))) + (define (cont-frac-re n d k) + (if (= k 1) 0 + (fn (cont-frac-re n d (- k 1))))) + (cont-frac-re n d k)) + +;; note the difference between the iterative process and the recursive one. +;; the iterative one has a state variable which needs to be carried around +;; the recursive one just has keeps "k" with it +#+end_src + +#+begin_src scheme +;; 1.38 +(define (n i) i) +(define (d i) + (cond ((= (remainder (+ i 1) 3) 0) (* 2 (/ (+ 1 i) 3))) + (else 1))) +(cont-frac-recursive (lambda (i) 1.0) d 500) + +;; 1.39 +(define (tan-cf x k) + (define (d1.39 i) + (+ 1.0 (* 2 (- i 1)))) + (define (n1.39 i) + (if (= i 1) x + (square x))) + (define (cont-frac-recursive-1.39 n d k c) + (define (fn t) (/ (n c) (- (d c) t))) + (if (= c k) 0 + (fn (cont-frac-recursive-1.39 n d k (+ c 1))))) + (cont-frac-recursive-1.39 n1.39 d1.39 k 1)) +#+end_src + + +**** 1.3.4 Procedures as Returned Values +This ability to pass procedures as arguments makes the language more expressive. We can get more if we are able to return procedures as values too. We can use it to abstract away average damping from ~fixed-point~ earlier. + +Average damping can be defined as a function (~g(x)~ say) whose value at ~x~ is average of ~x~ and another given function ~f(x)~. + + +Earlier in the ~fixed-point~, we added damping. We can make that a procedure too + +#+begin_src scheme +;; note what we had earlier +;; the fixed-point just repeatedly applied f till values were close enough +;; to add damping we use this: +(define (golden-ration) + (fixed-point (lambda (x) (average x (+ 1 (/ 1 x)))) 1.0)) +;; here, we are passing a anonymous function to fixed-point that takes a value and returns the average of x and f(x) + +;; so, we can write a procedure that does this. 
It should take a function (which accepts a parameter x) and return another function whose value is the average of x and f(x) +(define (average-damp f) + (lambda (x) (average x (f x)))) +;; this x will be provided to the function in the fixed-point procedure. Starting with first-guess +#+end_src + +#+begin_src scheme +;; example raw usage of average-damp +((average-damp square) 10) +55 +#+end_src + +We can rewrite sqrt now: + +#+begin_src scheme +(define (sqrt-w/damping x) + (fixed-point (average-damp (lambda (y) (/ x y))) 2.0)) +#+end_src + +#+BEGIN_QUOTE +Notice how this formulation makes explicit the three ideas in the method: +- fixed-point search, +- average damping, +- and the function y \to x/y. + +Note how much cleaner the idea becomes when expressed in terms of these abstractions. + +In general, there are many ways to formulate a process as a procedure. +Experienced programmers know how to choose procedural formulations that are particularly perspicuous, and where useful elements of the process are exposed as separate entities that can be reused in other applications. + +As a simple example of reuse, notice that the cube root of x is a fixed point of the function y \to x/y^{2} , so we can immediately generalize our square-root procedure to one that extracts cube-roots +#+END_QUOTE + +#+begin_src scheme +(define (cube-root-w/damping x) + (fixed-point (average-damp (lambda (y) (/ x (square y)))) 2.0)) +#+end_src + +***** Abstractions and first-class procedures +We have seen how to use the compute square root using: +- fixed-point search (the first one, without damping) +- Newton's method (which itself can be expressed as a fixed-point process) ((the second one, with damping)) + +Thus, we have seen 2 ways to compute sqrt as fixed-points. We can express this idea of having multiple ways to doing fixed points as an higher level abstraction + +#+begin_src scheme +(define (fixed-point-of-transform g transform guess) + (fixed-point (transform g) guess)) + +;; so, our variant of the average-damp is just one of the transformation that can be applied +;; we can do some other damping also. And if we use this abstraction, once we have a better damping method, +;; that helps converge to solutions faster, we can just start using that +(define (sqrt-fixed-point-of-transform x) + (fixed-point-of-transform (lambda (y) (/ x y)) average-damp 1.0)) +#+end_src + +#+BEGIN_QUOTE +As programmers, we should be alert to opportunities to identify the underlying abstractions in our programs and to build upon them and generalize them to create more powerful abstractions. This is not to say that one should always write programs in the most abstract way possible; expert programmers know how to choose the level of abstraction appropriate to their task. +#+END_QUOTE + + +#+BEGIN_QUOTE +In general, programming languages impose restrictions on the ways in which computational elements can be manipulated. + +Elements with the fewest restrictions are said to have first-class status. Some of the ‘‘rights and privileges’’ of first-class elements are: + +- They may be named by variables. +- They may be passed as arguments to procedures. +- They may be returned as the results of procedures. +- They may be included in data structures. + +Lisp, unlike other common programming languages, awards *procedures full first-class status*. This poses challenges for efficient implementation, but the resulting gain in expressive power is enormous. 
+#+END_QUOTE + +#+begin_src scheme +;; 1.40 +(define (inc x) + (+ x 1)) + +(define (double fn) + (lambda (x) (fn (fn x)))) + + +;; 1.41 +(((double (double double)) inc) 5) +;; (D (D D) inc 5) +;; (D (D(D)) inc 5) +;; (D(D(D(D))) inc 5) +;; (D(D(D(inc inc))) inc 5) +;; (D(D(inc inc inc inc))) inc 5) +;; (D((inc inc inc inc inc inc inc inc)) inc 5) +;; ((inc inc inc inc inc inc inc inc) (inc inc inc inc inc inc inc inc) inc 5) +21 + +;; 1.42 +(define (compose f g) + (lambda (x) (f (g x)))) + +;; 1.43 +(define (repeated f n) + (define (repeated-iter f c) + (if (= c 1) f + (repeated-iter (compose f f) (- c 1)))) + (repeated-iter f n)) + +;; 1.44 +(define (n-smooth n) + (repeated smooth n)) + +(((n-smooth 5) square) 2) +;Value: 4.000010666666668 + +;; 1.46 +(define (n-root n r x) + (fixed-point ((repeated average-damp r) (lambda (y) (/ x (pow y (- n 1))))) 2.0)) + +(define (pow b p) + (if (= p 1) b + (if (even? p) (pow (square b) (/ p 2)) + (* b (pow (square b) (/ (- p 1) 2)))))) + +(n-root 2 2 4) ;; find the square root, use average damp chained 2 times, find root of 4 +2. +;Value: 2. +#+end_src + +#+BEGIN_QUOTE +Several of the numerical methods described in this chapter are instances of an extremely general computational strategy known as iterative improvement. + +Iterative improvement says that, to compute something, we start with an initial guess for the answer, test if the guess is good enough, and otherwise improve the guess and continue the process using the improved guess as the new guess. +#+END_QUOTE + + + +One catch is that in scheme, you cannot define a named procedure and return it directly +eg: + +#+begin_src scheme +;; this works +(define (x-inc x) + (lambda (y) + (+ x y))) + +((x-inc 2) 3) +; 5 + +;; however this doesn't work +(define (x-inc x) + (define (x-inc2 y) + (+ x y))) +((x-inc 2) 3) +;The object #[constant 12 #x2] is not applicable. + +;; if you want to return the named procedure, use this: +(define (x-inc x) + (define (x-inc2 y) + (+ x y)) + (lambda (y) (x-inc2 y))) + +((x-inc 2) 3) +;Value: 5 + +;; 1.46 +(define (iterative-improvement good-enough? improve-guess) + (define (improve guess) + (let ((next-value (improve-guess guess))) + (display next-value) + (newline) + (if (good-enough? guess next-value) + next-value + (improve next-value)))) + (lambda (guess) (improve guess))) + +(define (new-fixed-point f first-guess) + ((iterative-improvement good-enough? f) first-guess)) + +(define (new-golden-ration) + (new-fixed-point (lambda (x) (average x (+ 1 (/ 1 x)))) 1.0)) + +#+end_src + +* Reserved + +2 Building Abstractions with Data + + 2.1 Introduction to Data Abstraction + 2.1.1 Example: Arithmetic Operations for Rational Numbers + 2.1.2 Abstraction Barriers + 2.1.3 What Is Meant by Data? 
+ 2.1.4 Extended Exercise: Interval Arithmetic + 2.2 Hierarchical Data and the Closure Property + 2.2.1 Representing Sequences + 2.2.2 Hierarchical Structures + 2.2.3 Sequences as Conventional Interfaces + 2.2.4 Example: A Picture Language + 2.3 Symbolic Data + 2.3.1 Quotation + 2.3.2 Example: Symbolic Differentiation + 2.3.3 Example: Representing Sets + 2.3.4 Example: Huffman Encoding Trees + 2.4 Multiple Representations for Abstract Data + 2.4.1 Representations for Complex Numbers + 2.4.2 Tagged data + 2.4.3 Data-Directed Programming and Additivity + 2.5 Systems with Generic Operations + 2.5.1 Generic Arithmetic Operations + 2.5.2 Combining Data of Different Types + 2.5.3 Example: Symbolic Algebra + +3 Modularity, Objects, and State + + 3.1 Assignment and Local State + 3.1.1 Local State Variables + 3.1.2 The Benefits of Introducing Assignment + 3.1.3 The Costs of Introducing Assignment + 3.2 The Environment Model of Evaluation + 3.2.1 The Rules for Evaluation + 3.2.2 Applying Simple Procedures + 3.2.3 Frames as the Repository of Local State + 3.2.4 Internal Definitions + 3.3 Modeling with Mutable Data + 3.3.1 Mutable List Structure + 3.3.2 Representing Queues + 3.3.3 Representing Tables + 3.3.4 A Simulator for Digital Circuits + 3.3.5 Propagation of Constraints + 3.4 Concurrency: Time Is of the Essence + 3.4.1 The Nature of Time in Concurrent Systems + 3.4.2 Mechanisms for Controlling Concurrency + 3.5 Streams + 3.5.1 Streams Are Delayed Lists + 3.5.2 Infinite Streams + 3.5.3 Exploiting the Stream Paradigm + 3.5.4 Streams and Delayed Evaluation + 3.5.5 Modularity of Functional Programs and Modularity of Objects + +4 Metalinguistic Abstraction + + 4.1 The Metacircular Evaluator + 4.1.1 The Core of the Evaluator + 4.1.2 Representing Expressions + 4.1.3 Evaluator Data Structures + 4.1.4 Running the Evaluator as a Program + 4.1.5 Data as Programs + 4.1.6 Internal Definitions + 4.1.7 Separating Syntactic Analysis from Execution + 4.2 Variations on a Scheme — Lazy Evaluation + 4.2.1 Normal Order and Applicative Order + 4.2.2 An Interpreter with Lazy Evaluation + 4.2.3 Streams as Lazy Lists + 4.3 Variations on a Scheme — Nondeterministic Computing + 4.3.1 Amb and Search + 4.3.2 Examples of Nondeterministic Programs + 4.3.3 Implementing the Amb Evaluator + 4.4 Logic Programming + 4.4.1 Deductive Information Retrieval + 4.4.2 How the Query System Works + 4.4.3 Is Logic Programming Mathematical Logic? 
+ 4.4.4 Implementing the Query System + 4.4.4.1 The Driver Loop and Instantiation + 4.4.4.2 The Evaluator + 4.4.4.3 Finding Assertions by Pattern Matching + 4.4.4.4 Rules and Unification + 4.4.4.5 Maintaining the Data Base + 4.4.4.6 Stream Operations + 4.4.4.7 Query Syntax Procedures + 4.4.4.8 Frames and Bindings + +5 Computing with Register Machines + + 5.1 Designing Register Machines + 5.1.1 A Language for Describing Register Machines + 5.1.2 Abstraction in Machine Design + 5.1.3 Subroutines + 5.1.4 Using a Stack to Implement Recursion + 5.1.5 Instruction Summary + 5.2 A Register-Machine Simulator + 5.2.1 The Machine Model + 5.2.2 The Assembler + 5.2.3 Generating Execution Procedures for Instructions + 5.2.4 Monitoring Machine Performance + 5.3 Storage Allocation and Garbage Collection + 5.3.1 Memory as Vectors + 5.3.2 Maintaining the Illusion of Infinite Memory + 5.4 The Explicit-Control Evaluator + 5.4.1 The Core of the Explicit-Control Evaluator + 5.4.2 Sequence Evaluation and Tail Recursion + 5.4.3 Conditionals, Assignments, and Definitions + 5.4.4 Running the Evaluator + 5.5 Compilation + 5.5.1 Structure of the Compiler + 5.5.2 Compiling Expressions + 5.5.3 Compiling Combinations + 5.5.4 Combining Instruction Sequences + 5.5.5 An Example of Compiled Code + 5.5.6 Lexical Addressing + 5.5.7 Interfacing Compiled Code to the Evaluator + + + + + +